3422 Commits

Author SHA1 Message Date
Vladimir Melnik
c94ee1454d kvm: suspend a VM before snapshot deletion (see PR #3193) (#3194)
To make sure that a qemu2-image won't be corrupted by the snapshot deletion procedure which is being performed after copying the snapshot to a secondary store, I'd propose to put a VM in to suspended state.

Additional reference: https://bugzilla.redhat.com/show_bug.cgi?id=920020#c5

Fixes #3193
2019-06-04 16:04:45 +05:30
Nicolas Vazquez
12c850ed2f
KVM: Improvements on upload direct download certificates (#2995)
* Improvements on upload direct download certificates

* Move upload direct download certificate logic to KVM plugin

* Extend unit test certificate expiration days

* Add marvin tests and command to revoke certificates

* Review comments

* Do not include revoke certificates API
2019-06-04 03:08:31 -03:00
Nicolas Vazquez
c9ce3e2344 router: Persistent DHCP leases file on VRs and cleanup /etc/hosts on VM deletion (#3351)
Since the CloudStack virtual router was redesigned on version 4.6 it has been observed that the DHCP leases file is not persistent across network operations. This causes conflicts on guest VMs static IPs, causing these static IPs to not be renewed by the DHCP server running on isolated and VPC networks' virtual routers (dnsmasq). On stopping or destroying a VM, its dhcp/dns records are not removed from the virtual router causing ghost effects.

Fixes #3272
Fixes #3354

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-06-03 17:04:16 +05:30
Richard Lawley
2484527cae srx: Fix removing static NAT rules with Juniper SRX (#3310)
Fixed the logic for deleting static NAT rules on a Juniper SRX device. Previously the private (trust) rule was not being removed.

Fixes #3309
2019-06-03 16:51:23 +05:30
Rohit Yadav
0929866956
server: ssh-keygen in PEM format and reduce main systemvm patching script (#3333)
On first startup, the management server creates and saves a random
ssh keypair using ssh-keygen in the database. The command does
not specify keys in PEM format which is not the default as generated
by latest ssh-keygen tool.

The systemvmtemplate always needs re-building whenever there is a change
in the cloud-early-config file. This also tries to fix that by introducing a
stage 2 bootstrap.sh where the changes specific to hypervisor detection
etc are refactored/moved. The initial cloud-early-config only patches
before the other scripts are called.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-05-23 18:08:00 +05:30
Rohit Yadav
9ff819da2c
systemvm: new qemu-guest-agent based patching for KVM (#3278)
This introduces a new patching script for patching systemvms on KVM
using qemu-guest-agent that runs inside the systemvm on startup. This
also removes the vport device which was previously used by the legacy
patching script and instead uses the modern and new uniform guest
agent vport for host-guest communication.

Also updates the sytemvmtemplate build config to use the latest Debian
9.9.0 iso.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-05-10 23:42:19 +05:30
Dingane Hlaluku
e56c499fb8 vmware: syncVolumeToRootFolder method to avoid an infite recursive loop (#3105)
The static method syncVolumeToRootFolder() from VmwareStorageLayoutHelper.java:146 has been incorrectly called and leads to an infinite recursive call that ends up in a StackOverflowError. This PR fixes this.
public static void syncVolumeToRootFolder(DatacenterMO dcMo, DatastoreMO ds, String vmdkName, String vmName) throws Exception { syncVolumeToRootFolder(dcMo, ds, vmdkName, null); } -> public static void syncVolumeToRootFolder(DatacenterMO dcMo, DatastoreMO ds, String vmdkName, String vmName) throws Exception { syncVolumeToRootFolder(dcMo, ds, vmdkName, vmName, null); }
2019-01-07 13:59:45 +05:30
Craig Squire
8d53557ba7 api: don't throttle api discovery for listApis command (#2894)
Users reported that they weren't getting all apis listed in cloudmonkey when running a sync. After some debugging, I found that the problem is that the ApiDiscoveryService is calling ApiRateLimitServiceImpl.checkAccess(), so the results of the listApis command are being truncated because Cloudstack believes the user has exceeded their API throttling rate.

I enabled throttling with a 25 request per second limit. I then created a test role with only list* permissions and assigned it to a test user. When this user calls listApis, they will typically receive anywhere from 15-18 results. Checking the logs, you see The given user has reached his/her account api limit, please retry after 218 ms..

I raised the limit to 200 requests per second, restarted the management server and tried again. This time I got 143 results and no log messages about the user being throttled.
2018-12-12 23:55:32 +05:30
Craig Squire
290df5f423 api: Discover tags field on superclass of API responses (#3005)
Updated ApiServiceDiscoveryImpl to check superclasses of API responses for fields.

Fixes: #3002
2018-12-04 13:59:48 +05:30
Rohit Yadav
29b8a9da48
kvm: when untagged vxlan is used, use the default guest/public bridge (#3037)
When vxlan://untagged is used for public (or guest) network, use the
default public/guest bridge device same as how vlan://untagged works.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-11-28 22:22:30 +05:30
Paul Angus
fb80e51307 Updating pom.xml version numbers for release 4.11.3.0-SNAPSHOT
Signed-off-by: Paul Angus <paul.angus@shapeblue.com>
2018-11-20 13:11:52 +00:00
Rohit Yadav
c6e53f6cc6
kvm: reset KVM host on heartbeat failure (#2984)
On actual testing, I could see that kvmheartbeat.sh script fails on NFS
server failure and stops the agent only. Any HA VMs could be launched
in different hosts, and recovery of NFS server could lead to a state
where a HA enabled VM runs on two hosts and can potentially cause
disk corruptions. In most cases, VM disk corruption will be worse than
VM downtime. I've kept the sleep interval between check/rounds but
reduced it to 10s. The change in behaviour was introduced in #2722.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-10-30 15:13:59 +05:30
Nicolas Vazquez
dffb430975 kvm: Fix migrating VM from ISO failures (#2928)
Prevents errors while migrating VM from ISO:

Test 1: Deploy VM from ISO -> Live migrate VM to another host -> ERROR
Test 2: Register ISO using Direct Download on KVM -> Deploy VM from ISO -> Live migrate VM to another host -> ERROR

- Prevent NullPointerException migrating VM from ISO
- Prevent mount secondary storage on ISO direct downloads on KVM
2018-10-29 16:14:20 +05:30
Rohit Yadav
933ee23104
vr: memory and swap optimizations (#2892)
This tries to provide a threshold based fix for #2873 where swappinness of VR is not used until last resort. By limiting swappiness unless actually needed, the VR system degradation can be avoided for most cases. The other change is around not starting baremetal-vr by default on all VRs, according to the spec https://cwiki.apache.org/confluence/display/CLOUDSTACK/Baremetal+Advanced+Networking+Support only vmware VRs need to run it and that too only as the last step of the setup/completion, so we don't need to run it all the time.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-10-16 10:29:48 +05:30
Frank Maximus
a6196b0a60 Fixes: #2881 Improve Exception message (#2889)
Network.Service and Network.Provider were missing a toString() method.
Added this so appending (a list of) them will be understandable.
2018-10-09 15:43:48 +05:30
Rohit Yadav
f430f41edd
ca: Fixes #2877 mgmt server cert should have all addrs of default nic (#2879)
This fixes the default RootCA provider implementation to initiate
and issue certificate for mgmt server on startup for all the IP addresses
on the default nic of that host.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-10-07 21:07:10 +05:30
sureshanaparti
e9003fafcd CLOUDSTACK-8609: [VMware] VM is not accessible after migration across clusters (#2091)
[VMware] VM is not accessible after migration across clusters.

Once a VM is successfully started, don't delete the files associated with the unregistered VM, if the files are in a storage that is being used by the new VM.
Attempt to unregister a VM in another DC, only if there is a host associated with a VM.

This closes #556
2018-08-22 01:06:09 +05:30
Slair1
023dcec5ef CLOUDSTACK-10310 Fix KVM reboot on storage issue (#2722) 2018-08-20 10:28:03 +02:00
Kris Sterckx
71bbbb7718 vmware: Fixes #2759 config drive iso path for Vmware (#2769)
Fix config drive iso path on Vmware. Use constant.
2018-07-31 13:00:35 +05:30
Khosrow Moossavi
67860d9f46 maven: Updating pom.xml version numbers for release 4.11.2.0-SNAPSHOT (#2728)
Fixes the version in pom etc. to be consistent with versioning pattern as X.Y.Z.0-SNAPSHOT after a minor release.

Signed-off-by: Khosrow Moossavi <khos2ow@gmail.com>
2018-07-06 17:27:12 +05:30
Paul Angus
8ba318da19 Updating pom.xml version numbers for release 4.11.2-SNAPSHOT
Signed-off-by: Paul Angus <paul.angus@shapeblue.com>
2018-06-26 17:53:54 +01:00
Paul Angus
2cb2dacbe7 Updating pom.xml version numbers for release 4.11.1.0
Signed-off-by: Paul Angus <paulangus@PA-Ansible-GUI.sblab.local>
2018-06-21 15:52:43 +01:00
dahn
f02e402ebb kvm: send unsupported answer only when applicable (#2714)
Throw specific NPE child when command is known not to be known. Add unit tests.
2018-06-21 11:03:43 +05:30
Rohit Yadav
39471c8c00
configdrive: make fewer mountpoints on hosts (#2716)
This ensure that fewer mount points are made on hosts for either
primary storagepools or secondary storagepools.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-06-20 12:25:16 +05:30
Paul Angus
4afdee9896
Merge pull request #2699 from shapeblue/ldapConfigs
remove old config artifacts from update path
2018-06-11 14:53:38 +01:00
David Passante
6025f25840 Fixes #2685: broken SXM support (#2686) 2018-06-07 21:56:42 +02:00
Daan Hoogland
935ca766dc remove old config artifacts from update path 2018-06-07 07:52:24 +00:00
cl-k-takahashi
d67af8661b kvm: check if storage pool is mounted before creating pool xml (#2696)
Now the KVM agent checks whether a storage pool is mounted or not mounted before calling storagePoolCreateXML().

Signed-off-by: Kai Takahashi <k-takahashi@creationline.com>
2018-06-07 12:23:21 +05:30
Frank Maximus
8798014ca8 CLOUDSTACK-10377: Fix Network restart for Nuage (#2672)
Changes in PR #2508 have caused network restart to fail in a Nuage setup,
as the new VR takes the same IP as the old one, and the old VR is still running.
Nuage doesn't support multiple VM's having the same IP.
We delay provisioning the interfaces in VSD until the old VR interface is released.
2018-06-06 12:17:10 +05:30
瓜@アイソマ:ゆるゆりはいいぞ
caf5857434 Fix two typos (from uanble to unable). (#2676)
Signed-off-by: carrot031 <www.carrotsoft@gmail.com>
2018-05-27 09:54:25 -03:00
Frank Maximus
5221778aa4 CLOUDSTACK-10375: Don't create DefaultNuageVspSharedNetworkOfferingWithSGService (#2667) 2018-05-23 16:15:15 +02:00
Rohit Yadav
acc5fdcdbd
CLOUDSTACK-10290: allow config drives on primary storage for KVM (#2651)
This introduces a new global setting `vm.configdrive.primarypool.enabled` to toggle creation/hosting of config drive iso files on primary storage, the default will be false causing them to be hosted on secondary storage. The current support is limited from hypervisor resource side and in current implementation limited to `KVM` only. The next big change is that config drive is created at a temporary location by management server and shipped to either KVM or SSVM agent via cmd-answer pattern, the data of which is not logged in logs. This saves us from adding genisoimage dependency on cloudstack-agent pkg.

The APIs to reset ssh public key, password and user-data (via update VM API) requires that VM should be shutdown. Therefore, in the refactoring I removed the case of updation of existing ISO. If there are objections I'll re-put the strategy to detach+attach new config iso as a way of updation. In the refactored implementation, the folder name is changed to lower-cased configdrive. And during VM start, migration or shutdown/removal if primary storage is enable for use, the KVM agent will handle cleanup tasks otherwise SSVM agent will handle them.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-05-21 14:27:23 +05:30
Mike Tutkowski
7e6fddb7ab managed-storage: Handle Ceph (#2655)
In 4.11.0, I added the ability to online migrate volumes from NFS to managed storage. This actually works for Ceph to managed storage in a private 4.8 branch, as well. I thought I had brought along all of the necessary code from that private 4.8 branch to make Ceph to managed storage functional in 4.11.0, but missed one piece (which is fixed by this PR).
2018-05-21 12:54:42 +05:30
Nicolas Vazquez
06f7e495dc Host Affinity plugin (#2630)
This implements a new host-affinity plugin.
2018-05-21 12:49:08 +05:30
Rohit Yadav
1b3046e376
CLOUDSTACK-9184: Fixes #2631 VMware dvs portgroup autogrowth (#2634)
* CLOUDSTACK-9184: Fixes #2631 VMware dvs portgroup autogrowth

This deprecates the vmware.ports.per.dvportgroup global setting.

The vSphere Auto Expand feature (introduced in vSphere 5.0) will take
care of dynamically increasing/decreasing the dvPorts when running out
of distributed ports . But in case of vSphere 4.1/4.0 (If used), as this
feature is not there, the new default value (=> 8) have an impact in the
existing deployments. Action item for vSphere 4.1/4.0: Admin should
modify the global configuration setting "vmware.ports.per.dvportgroup"
from 8 to any number based on their environment because the proposal
default value of 8 would be very less without auto expand feature in
general. The current default value of 256 may not need immediate
modification after deployment, but 8 would be very less which means
admin need to update immediately after upgrade.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-05-11 22:16:13 +05:30
Rohit Yadav
4534cefa40
backports for 4.11.1 from master (#2621)
* CLOUDSTACK-10147 Disabled Xenserver Cluster can still deploy VM's. Added code to skip disabled clusters when selecting a host (#2442)

(cherry picked from commit c3488a51db4bce4ec32c09e6fef78193d360cf3f)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>

* CLOUDSTACK-10318: Bug on sorting ACL rules list in chrome (#2478)

(cherry picked from commit 4412563f19ec8b808fe4c79e2baf658507a84873)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>

* CLOUDSTACK-10284:Creating a snapshot from VM Snapshot generates error if hypervisor is not KVM.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>

* CLOUDSTACK-10221: Allow IPv6 when creating a Basic Network (#2397)

Since CloudStack 4.10 Basic Networking supports IPv6 and thus
should be allowed to be specified when creating a network.

Signed-off-by: Wido den Hollander <wido@widodh.nl>
(cherry picked from commit 9733a10ecda5f1af0f2c0fa863fc976a3e710946)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>

* CLOUDSTACK-10214: Unable to remove local primary storage (#2390)

Allow admins to remove primary storage pool.
Cherry-picked from eba2e1d8a1ce4e86b4df144db03e96739da455e5

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>

* dateutil: constistency of tzdate input and output (#2392)

Signed-off-by: Yoan Blanc <yoan.blanc@exoscale.ch>
Signed-off-by: Daan Hoogland <daan.hoogland@shapeblue.com>
(cherry picked from commit 2ad520282319da9a03061b8c744e51a4ffdf94a2)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>

* CLOUDSTACK-10054:Volume download times out in 3600 seconds (#2244)

(cherry picked from commit bb607d07a97476dc4fb934b3d75df6affba47086)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>

* When creating a new account (via domain admin) it is possible to select “root admin” as the role for the new user (#2606)

* create account with domain admin showing 'root admin' role

Domain admins should not be able to assign the role of root admin to new users. Therefore, the role ‘root admin’ (or any other of the same type) should not be visible to domain admins.

* License and formatting

* Break long sentence into multiple lines

* Fix wording of method 'getCurrentAccount'

* fix typo in variable name

* [CLOUDSTACK-10259] Missing float part of secondary storage data in listAccounts

* [CLOUDSTACK-9338] ACS not accounting resources of VMs with custom service offering

ACS is accounting the resources properly when deploying VMs with custom service offerings. However, there are other methods (such as updateResourceCount) that do not execute the resource accounting properly, and these methods update the resource count for an account in the database. Therefore, if a user deploys VMs with custom service offerings, and later this user calls the “updateResourceCount” method, it (the method) will only account for VMs with normal service offerings, and update this as the number of resources used by the account. This will result in a smaller number of resources to be accounted for the given account than the real used value. The problem becomes worse because if the user starts to delete these VMs, it is possible to reach negative values of resources allocated (breaking all of the resource limiting for accounts). This is a very serious attack vector for public cloud providers!

* [CLOUDSTACK-10230] User should not be able to use removed “Guest OS type” (#2404)

* [CLOUDSTACK-10230] User is able to change to “Guest OS type” that has been removed

Users are able to change the OS type of VMs to “Guest OS type” that has been removed. This becomes a security issue when we try to force users to use HVM VMs (Meltdown/Spectre thing). A removed “guest os type” should not be usable by any users in the cloud.

* Remove trailing lines that are breaking build due to checkstyle compliance

* Remove unused imports

* fix classes that were in the wrong folder structure

* Updates to capacity management
2018-05-09 15:20:19 +05:30
Rohit Yadav
bd0959517b
hypervisor: allow Ubuntu 18.04 to be added as KVM host (#2626)
This adds and allows Ubuntu 18.04 to be used as KVM host. In addition,
on the UI when hypervisor version key is missing, this adds and display
the host os and version detail which is useful to show the KVM host
os and version.

When cache mode 'none' is used for empty cdrom drives, systemvms
and guest VMs fail to start on newer libvirtd such as Ubuntu bionic.
The fix is ensure that cachemode is not declared when drives are empty
upon starting of the VM. Similar issue logged at redhat here:
https://bugzilla.redhat.com/show_bug.cgi?id=1342999

The workaround is to ensure that we don't configure cachemode for
cdrom devices at all. This also fixes live VM migration issue.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-05-08 15:56:49 +05:30
Nathan Johnson
a53dcd6aa9 ceph: Fixes #2611 use raw disk type for rdb (#2623)
Fix issue where kvm / ceph cannot create volumes. Fixes #2611
2018-05-08 15:00:44 +05:30
Rohit Yadav
6412e50471 saml2: Fixes #2548 SAML2 cert encoding and decoding
This fixes SAML2 certificate encoding/decoding issue due to refactoring
regression introduced in 7ce54bf7a85d6df72f84c00fadf9b0fd42ab0d99 that
did not account for base64 based encoding/decoding. The changes
effectively restore the same logic as used in previous versions.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-05-04 12:37:23 +05:30
Rohit Yadav
eb75c1eff5 ca: Fixes #2530 have all IPs from KVM host in issued X509 cert
This ensures that certificate setup includes all the IP addresses (v4
and v6) when a (KVM) host is added to CloudStack. This fixes #2530.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-05-04 12:37:23 +05:30
Rohit Yadav
2be45c2186 solidfire: fix potential NPE
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-05-04 12:37:23 +05:30
Rohit Yadav
464551208c
xenserver: Add support for XS 7.3, 7.4 and XCP-ng 7.4 (#2605)
This adds support for XenServer 7.3 and 7.4, and XCP-ng 7.4 version as hypervisor hosts. Fixes #2523.

This also fixes the issue of 4.11 VRs stuck in starting for up-to 10mins, before they come up online.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-04-30 08:19:10 +02:00
Khosrow Moossavi
b6d420bec3 CLOUDSTACK-9677: Adding storage policy support for swift as secondary storage (#2412)
Original-Author: @pdube on PR Fixes #1830.
2018-04-26 00:42:15 +02:00
Olivier Lemasle
9a13227a78 CLOUDSTACK-10327: Do not invalidate the session when an API command is not available (#2498)
CloudStack SSO (using security.singlesignon.key) does not work anymore with CloudStack 4.11, since commit 9988c26, which introduced a regression due to a refactoring: every API request that is not "validated" generates the same error (401 - Unauthorized) and invalidates the session.

However, CloudStack UI executes a call to listConfigurations in method bypassLoginCheck. A non-admin user does not have the permissions to execute this request, which causes an error 401:

{"listconfigurationsresponse":{"uuidList":[],"errorcode":401,"errortext":"unable to verify user credentials and/or request signature"}}
The session (already created by SSO) is then invalidated and the user cannot access to CloudStack UI (error "Session Expired").

Before 9988c26 (up to CloudStack 4.10), an error 432 was returned (and ignored):

{"errorresponse":{"uuidList":[],"errorcode":432,"cserrorcode":9999,"errortext":"The user is not allowed to request the API command or the API command does not exist"}}
Even if the call to listConfigurations was removed, another call to listIdps also lead to an error 401 for user accounts if the SAML plugin is not enabled.

This pull request aims to fix the SSO issue, by restoring errors 432 (instead of 401 + invalidate session) for commands not available. However, if an API command is explicitly denied using ACLs or if the session key is incorrect, it still generates an error 401 and invalidates the session.
2018-04-24 15:01:19 +02:00
Rohit Yadav
8da2462469
CLOUDSTACK-10333: Secure Live VM Migration for KVM (#2505)
This extends securing of KVM hosts to securing of libvirt on KVM
host as well for TLS enabled live VM migration. To simplify implementation
securing of host implies that both host and libvirtd processes are
secured with management server's CA plugin issued certificates.

Based on whether keystore and certificates files are available at
/etc/cloudstack/agent, the KVM agent determines whether to use TLS or
TCP based uris for live VM migration. It is also enforced that a secured
host will allow live VM migration to/from other secured host, and an
unsecured hosts will allow live VM migration to/from other unsecured
host only.

Post upgrade the KVM agent on startup will expose its security state
(secured detail is sent as true or false) to the managements server that
gets saved in host_details for the host. This host detail can be accesed
via the listHosts response, and in the UI unsecured KVM hosts will show
up with the host state of ‘unsecured’. Further, a button has been added
that allows admins to provision/renew certificates to KVM hosts and can
be used to secure any unsecured KVM host.

The `cloudstack-setup-agent` was modified to accept a new flag `-s`
which will reconfigure libvirtd with following settings:

    listen_tcp=0
    listen_tls=1
    tcp_port="16509"
    tls_port="16514"
    auth_tcp="none"
    auth_tls="none"
    key_file = "/etc/pki/libvirt/private/serverkey.pem"
    cert_file = "/etc/pki/libvirt/servercert.pem"
    ca_file = "/etc/pki/CA/cacert.pem"

For a connected KVM host agent, when the certificate are
renewed/provisioned a background task is scheduled that waits until all
of the agent tasks finish after which libvirt process is restarted and
finally the agent is restarted via AgentShell.

There are no API or DB changes.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-04-20 00:36:18 +05:30
dahn
2756d41039
manual mapped ldap fix (#2517)
* translate groovy test for ADLdapUserManagerImpl to java

* fixed by returning the actual result instead of false

* unit test case for manual mapped user in ldap
2018-04-09 17:38:49 +02:00
Khosrow Moossavi
535e6153cc CLOUDSTACK-10232: SystemVMs and VR to run as HVM on XenServer (#2465)
Publishing boot args both to grub and xenstore-data and let
cloud-early-config decides if the VM is in PV or HVM mode
to read from correct source.
2018-03-27 15:48:37 +05:30
Nicolas Vazquez
6a75423779 CLOUDSTACK-10231: Asserted fixes for Direct Download on KVM (#2408)
Several fixes addressed:

- Dettach ISO fails when trying to detach a direct download ISO
- Fix for metalink support on SSVM agents (this closes CLOUDSTACK-10238)
- Reinstall VM from bypassed registered template (this closes CLOUDSTACK-10250)
- Fix upload certificate error message even though operation was successful
- Fix metalink download, checksum retry logic and metalink SSVM downloader
2018-03-20 19:24:46 +05:30
Rohit Yadav
30175d6879
CLOUDSTACK-10132: Extend support for management servers LB for agents (#2469)
The new CA framework introduced basic support for comma-separated
list of management servers for agent, which makes an external LB
unnecessary.

This extends that feature to implement LB sorting algorithms that
sorts the management server list before they are sent to the agents.
This adds a central intelligence in the management server and adds
additional enhancements to Agent class to be algorithm aware and
have a background mechanism to check/fallback to preferred management
server (assumed as the first in the list). This is support for any
indirect agent such as the KVM, CPVM and SSVM agent, and would
provide support for management server host migration during upgrade
(when instead of in-place, new hosts are used to setup new mgmt server).

This FR introduces two new global settings:

- `indirect.agent.lb.algorithm`: The algorithm for the indirect agent LB.
- `indirect.agent.lb.check.interval`: The preferred host check interval
  for the agent's background task that checks and switches to agent's
  preferred host.

The indirect.agent.lb.algorithm supports following algorithm options:

- static: use the list as provided.
- roundrobin: evenly spreads hosts across management servers based on
  host's id.
- shuffle: (pseudo) randomly sorts the list (not recommended for production).

Any changes to the global settings - `indirect.agent.lb.algorithm` and
`host` does not require restarting of the mangement server(s) and the
agents. A message bus based system dynamically reacts to change in these
global settings and propagates them to all connected agents.

Comma-separated management server list is propagated to agents on
following cases:
- Addition of a host (including ssvm, cpvm systevms).
- Connection or reconnection by the agents to a management server.
- After admin changes the 'host' and/or the
  'indirect.agent.lb.algorithm' global settings.

On the agent side, the 'host' setting is saved in its properties file as:
`host=<comma separated addresses>@<algorithm name>`.

First the agent connects to the management server and sends its current
management server list, which is compared by the management server and
in case of failure a new/update list is sent for the agent to persist.

From the agent's perspective, the first address in the propagated list
will be considered the preferred host. A new background task can be
activated by configuring the `indirect.agent.lb.check.interval` which is
a cluster level global setting from CloudStack and admins can also
override this by configuring the 'host.lb.check.interval' in the
`agent.properties` file.

Every time agent gets a ms-host list and the algorithm, the host specific
background check interval is also sent and it dynamically reconfigures
the background task without need to restart agents.

Note: The 'static' and 'roundrobin' algorithms, strictly checks for the
order as expected by them, however, the 'shuffle' algorithm just checks
for content and not the order of the comma separate ms host addresses.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-03-15 16:34:03 +05:30
Nicolas Vazquez
74db647dbb CLOUDSTACK-10321: CPU Cap for KVM (#2482) 2018-03-14 18:21:24 +00:00