Host-HA offers investigation, fencing and recovery mechanisms for host that for
any reason are malfunctioning. It uses Activity and Health checks to determine
current host state based on which it may degrade a host or try to recover it. On
failing to recover it, it may try to fence the host.
The core feature is implemented in a hypervisor agnostic way, with two separate
implementations of the driver/provider for Simulator and KVM hypervisors. The
framework also allows for implementation of other hypervisor specific provider
implementation in future.
The Host-HA provider implementation for KVM hypervisor uses the out-of-band
management sub-system to issue IPMI calls to reset (recover) or poweroff (fence)
a host.
The Host-HA provider implementation for Simulator provides a means of testing
and validating the core framework implementation.
Signed-off-by: Abhinandan Prateek <abhinandan.prateek@shapeblue.com>
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This introduces a new certificate authority framework that allows
pluggable CA provider implementations to handle certificate operations
around issuance, revocation and propagation. The framework injects
itself to `NioServer` to handle agent connections securely. The
framework adds assumptions in `NioClient` that a keystore if available
with known name `cloud.jks` will be used for SSL negotiations and
handshake.
This includes a default 'root' CA provider plugin which creates its own
self-signed root certificate authority on first run and uses it for
issuance and provisioning of certificate to CloudStack agents such as
the KVM, CPVM and SSVM agents and also for the management server for
peer clustering.
Additional changes and notes:
- Comma separate list of management server IPs can be set to the 'host'
global setting. Newly provisioned agents (KVM/CPVM/SSVM etc) will get
radomized comma separated list to which they will attempt connection
or reconnection in provided order. This removes need of a TCP LB on
port 8250 (default) of the management server(s).
- All fresh deployment will enforce two-way SSL authentication where
connecting agents will be required to present certificates issued
by the 'root' CA plugin.
- Existing environment on upgrade will continue to use one-way SSL
authentication and connecting agents will not be required to present
certificates.
- A script `keystore-setup` is responsible for initial keystore setup
and CSR generation on the agent/hosts.
- A script `keystore-cert-import` is responsible for import provided
certificate payload to the java keystore file.
- Agent security (keystore, certificates etc) are setup initially using
SSH, and later provisioning is handled via an existing agent connection
using command-answers. The supported clients and agents are limited to
CPVM, SSVM, and KVM agents, and clustered management server (peering).
- Certificate revocation does not revoke an existing agent-mgmt server
connection, however rejects a revoked certificate used during SSL
handshake.
- Older `cloudstackmanagement.keystore` is deprecated and will no longer
be used by mgmt server(s) for SSL negotiations and handshake. New
keystores will be named `cloud.jks`, any additional SSL certificates
should not be imported in it for use with tomcat etc. The `cloud.jks`
keystore is stricly used for agent-server communications.
- Management server keystore are validated and renewed on start up only,
the validity of them are same as the CA certificates.
New APIs:
- listCaProviders: lists all available CA provider plugins
- listCaCertificate: lists the CA certificate(s)
- issueCertificate: issues X509 client certificate with/without a CSR
- provisionCertificate: provisions certificate to a host
- revokeCertificate: revokes a client certificate using its serial
Global settings for the CA framework:
- ca.framework.provider.plugin: The configured CA provider plugin
- ca.framework.cert.keysize: The key size for certificate generation
- ca.framework.cert.signature.algorithm: The certificate signature algorithm
- ca.framework.cert.validity.period: Certificate validity in days
- ca.framework.cert.automatic.renewal: Certificate auto-renewal setting
- ca.framework.background.task.delay: CA background task delay/interval
- ca.framework.cert.expiry.alert.period: Days to check and alert expiring certificates
Global settings for the default 'root' CA provider:
- ca.plugin.root.private.key: (hidden/encrypted) CA private key
- ca.plugin.root.public.key: (hidden/encrypted) CA public key
- ca.plugin.root.ca.certificate: (hidden/encrypted) CA certificate
- ca.plugin.root.issuer.dn: The CA issue distinguished name
- ca.plugin.root.auth.strictness: Are clients required to present certificates
- ca.plugin.root.allow.expired.cert: Are clients with expired certificates allowed
UI changes:
- Button to download/save the CA certificates.
Misc changes:
- Upgrades bountycastle version and uses newer classes
- Refactors SAMLUtil to use new CertUtils
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This fixes issue of enabling dynamic roles based on the global setting
only. This also fixes application of the default role/permissions mapping
on upgrade from 4.8 and previous versions to 4.9+.
Previously, it would make additional check to ensure commands.properties
is not in the classpath however this creates confusion for admins who
may skip/skim through the rn/docs and assume that mere changing the
global settings was not enough.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Change default configuration for router.aggregation.command.each.timeout from 3 to 600 seconds (#2223)
(cherry picked from commit 17bc6afc8228ed2da6e0b09f330e18217483577c)
This fixes some test_nic failures caused due to short aggregation command timeout
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This feature allows changing permission for existing role permissions, as those were static and could not be changed once created. It also provides the ability to change these permissions in the UI using a drop down menu for each permission rule, in which admin can select ‘Allow’ or ‘Deny’ permission.
Changes in the API:
This feature modifies behaviour of updateRolePermission API method:
New optional parameters ‘ruleid’ and ‘permission’ are introduced, they are mutual exclusive to ‘ruleorder’ parameter. This defines two use cases:
Update role permission: ‘ruleid’ and ‘permission’ parameters needed
Update rules order: ‘ruleorder’ parameter needed
Parameter ‘ruleorder’ is now optional
updateRolePermission providing ‘ruleorder’ parameter should be sent via POST
Default value of the account level global config vmsnapshot.expire.interval is -1 that conforms to legacy behaviour. A positive value will expire the VM snapshots for the respective account in that many hours.
Errored and Abandoned Templates should also be displayed on UI so that user has the accessibility to delete the template even before the clean up thread is run. Refer - CLOUDSTACK-9608
ISSUE: Featured Templates/Iso's created by Root/admin user are not visible to Domain Admin users.
STEPS TO REPRODUCE
Mark a template as featured and try to view it from a domain admin user
The issue occurs for both templates and iso's registered before and after upgrade
Templates,ISO's whose owner is ROOT admin, public: Yes, featured: Yes
Log in to UI (as a domain admin, such as an admin of “TEST/TEST1” domain)
Choose “Templates”.
Error message will be shown on UI
CloudStack has several background polling tasks that are spread across
the codebase, the aim of this work is to provide a single manager to
handle submission, execution and handling of background tasks. With
the framework implemented, existing oobm background task has been
refactored to use this manager.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
There is no cpuspeed, cpunumber or memory details in the listUsageRecords output as documented
In DB (cloud_usage table) we have cpu_speed, cpu_cores and ram fileds, but these are not populated for all the VM's. These fields are only populated for the VM's which are deployed with custom service offerings.
Include the timezone in datetime format of snapshot events, to be consistent
with every other events.
"eventDateTime" was added by @chipchilders in commit 14ee684ce3 and was
updated the same day to add the timezone (commit bf967eb622f) except for
Snapshots.
Unable to create service offering with networkrate=0(Unlimited network throttling) with an error "Failed to create service offering xxxxxxx: specify the network rate value more than 0".
Now, Updating the password via UpdateUser API is not allowed via integration port
(cherry picked from commit d206336e1a89d45162c95228ce3486b31d476504)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
capacity_type for local storage in op_host_capacity
is still enabled
(cherry picked from commit e06e3b7cd41787efc4e0f3cbf2d5a3040b4f15c9)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
removed code which nullifies vm_instance_id
Also modified QueryManagerImpl to ignore volume which does not have uuid. This is to avoid duplicate volume listing.
(cherry picked from commit 3cced927c4b1d7e1d8f19bccef46ed8d82e31f41)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
check if acl service provider is configured when network is associated with a acl.
(cherry picked from commit bbff9f15754c06dc8a7a74fdd34ab7968b052c3f)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
registerTemplate and getUploadParamsForTemplate API's
Any string is allowed as hypervisor type from the api.
HypervisorType.getType() tries to validate with the enums and if nothing
matches, sets the type as None.
Added a check to not allow None hypervisor type when registering.
(cherry picked from commit cc06c5189a377fc6bc4b7e8bbed56e9058588863)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
the destination pool is in maintenance mode do not allow a volume to be migrated to
the storage pool. Fixed it for volume migration and vm migration with volume.
(cherry picked from commit 8ef94819dabfc7b027d638473e9454e9f73ac49d)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>