40 Commits

Author SHA1 Message Date
Daniel Augusto Veronezi Salvador
349120f7c5
Externalize config to enable manually setting CPU topology on KVM VM (#5273)
* Externalize config to enable manually setting CPU topology on KVM VM

* Change log level

Co-authored-by: GutoVeronezi <daniel@scclouds.com.br>
2021-08-15 23:52:50 -03:00
Daniel Augusto Veronezi Salvador
7b752c3077
Externalize KVM Agent storage's timeout configuration (#5239)
* Externalize KVM Agent storage's timeout configuration

* Address @nvazquez review

* Add empty line at the end of the agent.properties file

Co-authored-by: Daniel Augusto Veronezi Salvador <daniel@scclouds.com.br>
2021-07-28 15:45:27 +02:00
dahn
6f93e5cd08
Revert "Externalize kvm agent storage timeout configuration (#4585)" (#5218)
This reverts commit 05a978c2495d1c93c798d2b845c5030ad78f6a14.
2021-07-20 09:16:43 +02:00
Daniel Augusto Veronezi Salvador
05a978c249
Externalize kvm agent storage timeout configuration (#4585)
* Externalize KVM Agent storage's timeout configuration

Created a class of constant agent's properties available to configure on "agent.properties".
Created a class to provides a facility to read the agent's properties file and get its properties.

* Refactored KVHAMonitor nested thread and changed some logs

* It has been added the timeout's config in the agent.properties file

* Rename classes

* Rename var and remove comment

* Fix typo with word "heartbeat"

* Extract multiple methods call to variables

* Add unit tests to file handler

* Increase info about the property

* Create inner class Property

* Rename method getProperty to getPropertyValue

* Remove copyright

* Remove copyright

* Extract code to createHeartBeatCommand

* Change method access from protected to private

Co-authored-by: Daniel Augusto Veronezi Salvador <daniel@scclouds.com.br>
2021-07-19 09:41:50 +02:00
sureshanaparti
eba186aa40
storage: New Dell EMC PowerFlex Plugin (formerly ScaleIO, VxFlexOS) (#4304)
Added support for PowerFlex/ScaleIO (v3.5 onwards) storage pool as a primary storage in CloudStack (for KVM hypervisor) and enabled VM/Volume operations on that pool (using pool tag).
Please find more details in the FS here:
https://cwiki.apache.org/confluence/x/cDl4CQ

Documentation PR: apache/cloudstack-documentation#169

This enables support for PowerFlex/ScaleIO (v3.5 onwards) storage pool as a primary storage in CloudStack

Other improvements addressed in addition to PowerFlex/ScaleIO support:

- Added support for config drives in host cache for KVM
	=> Changed configuration "vm.configdrive.primarypool.enabled" scope from Global to Zone level
	=> Introduced new zone level configuration "vm.configdrive.force.host.cache.use" (default: false) to force host cache for config drives
	=> Introduced new zone level configuration "vm.configdrive.use.host.cache.on.unsupported.pool" (default: true) to use host cache for config drives when storage pool doesn't support config drive
	=> Added new parameter "host.cache.location" (default: /var/cache/cloud) in KVM agent.properties for specifying the host cache path and create config drives on the "/config" directory on the host cache path
	=> Maintain the config drive location and use it when required on any config drive operation (migrate, delete)

- Detect virtual size from the template URL while registering direct download qcow2 (of KVM hypervisor) templates

- Updated full deployment destination for preparing the network(s) on VM start

- Propagate the direct download certificates uploaded to the newly added KVM hosts

- Discover the template size for direct download templates using any available host from the zones specified on template registration
	=> When zones are not specified while registering template, template size discovery is performed using any available host, which is picked up randomly from one of the available zones

- Release the VM resources when VM is sync-ed to Stopped state on PowerReportMissing (after graceful period)

- Retry VM deployment/start when the host cannot grant access to volume/template

- Mark never-used or downloaded templates as Destroyed on deletion, without sending any DeleteCommand
	=> Do not trigger any DeleteCommand for never-used or downloaded templates as these doesn't exist and cannot be deleted from the datastore

- Check the router filesystem is writable or not, before performing health checks
	=> Introduce a new test "filesystem.writable.test" to check the filesystem is writable or not
	=> The router health checks keeps the config info at "/var/cache/cloud" and updates the monitor results at "/root" for health checks, both are different partitions. So, test at both the locations.
	=> Added new script: "filesystem_writable_check.py" at /opt/cloud/bin/ to check the filesystem is writable or not

- Fixed NPE issue, template is null for DATA disks. Copy template to target storage for ROOT disk (with template id), skip DATA disk(s)

* Addressed some issues for few operations on PowerFlex storage pool.

- Updated migration volume operation to sync the status and wait for migration to complete.

- Updated VM Snapshot naming, for uniqueness in ScaleIO volume name when more than one volume exists in the VM.

- Added sync lock while spooling managed storage template before volume creation from the template (non-direct download).

- Updated resize volume error message string.

- Blocked the below operations on PowerFlex storage pool:
  -> Extract Volume
  -> Create Snapshot for VMSnapshot

* Added the PowerFlex/ScaleIO client connection pool to manage the ScaleIO gateway clients, which uses a single gateway client per Powerflex/ScaleIO storage pool and renews it when the session token expires.

- The token is valid for 8 hours from the time it was created, unless there has been no activity for 10 minutes.
  Reference: https://cpsdocs.dellemc.com/bundle/PF_REST_API_RG/page/GUID-92430F19-9F44-42B6-B898-87D5307AE59B.html

Other fixes included:

- Fail the VM deployment when the host specified in the deployVirtualMachine cmd is not in the right state (i.e. either Resource State is not Enabled or Status is not Up)

- Use the physical file size of the template to check the free space availability on the host, while downloading the direct download templates.

- Perform basic tests (for connectivity and file system) on router before updating the health check config data
	=> Validate the basic tests (connectivity and file system check) on router
	=> Cleanup the health check results when router is destroyed

* Updated PowerFlex/ScaleIO storage plugin version to 4.16.0.0

* UI Changes to support storage plugin for PowerFlex/ScaleIO storage pool.
- PowerFlex pool URL generated from the UI inputs(Gateway, Username, Password, Storage Pool) when adding "PowerFlex" Primary Storage
- Updated protocol to "custom" for PowerFlex provider
- Allow VM Snapshot for stopped VM on KVM hypervisor and PowerFlex/ScaleIO storage pool

and Minor improvements in PowerFlex/ScaleIO storage plugin code

* Added support for PowerFlex/ScaleIO volume migration across different PowerFlex storage instances.

- findStoragePoolsForMigration API returns PowerFlex pool(s) of different instance as suitable pool(s), for volume(s) on PowerFlex storage pool.
- Volume(s) with snapshots are not allowed to migrate to different PowerFlex instance.
- Volume(s) of running VM are not allowed to migrate to other PowerFlex storage pools.
- Volume migration from PowerFlex pool to Non-PowerFlex pool, and vice versa are not supported.

* Fixed change service offering smoke tests in test_service_offerings.py, test_vm_snapshots.py

* Added the PowerFlex/ScaleIO volume/snapshot name to the paths of respective CloudStack resources (Templates, Volumes, Snapshots and VM Snapshots)

* Added new response parameter “supportsStorageSnapshot” (true/false) to volume response, and Updated UI to hide the async backup option while taking snapshot for volume(s) with storage snapshot support.

* Fix to remove the duplicate zone wide pools listed while finding storage pools for migration

* Updated PowerFlex/ScaleIO volume migration checks and rollback migration on failure

* Fixed the PowerFlex/ScaleIO volume name inconsistency issue in the volume path after migration, due to rename failure
2021-02-24 14:58:33 +05:30
Rohit Yadav
05ae3f8d81 Merge remote-tracking branch 'origin/4.13' into 4.14 2020-08-21 15:38:18 +05:30
Sid Kattoju
1da76d27f1
iscsi session cleanup now configurable, filters iscsi partitions (#4219)
Added property to agent.properties that enables or disables the iscsi session clean up feature. #4210
Added a condition to prevent disk partitions from being cleaned up. #4216
2020-08-21 14:38:36 +05:30
Bitworks LLC
750abf3551
FEATURE-3823: kvm agent hooks (#3839) 2020-03-14 09:22:08 +01:00
Nicolas Vazquez
efe00aa7e0
[KVM] Rolling maintenance (#3610) 2020-03-12 16:59:46 +01:00
Nicolas Vazquez
73122fd0a9
[KVM] Direct download agnostic of the storage provider (#3828)
* Remove constraint for NFS storage

* Add new property on agent.properties

* Add free disk space on the host prior template download

* Add unit tests for the free space check

* Fix free space check - retrieve avaiable size in bytes

* Update default location for direct download

* Improve the method to retrieve hosts to retry on depending on the destination pool type and scope

* Verify location for temporary download exists before checking free space

* In progress - refactor and extension

* Refactor and fix

* Last fixes and marvin tests

* Remove unused test file

* Improve logging

* Change default path for direct download

* Fix upload certificate

* Fix ISO failure after retry

* Fix metalink filename mismatch error

* Fix iso direct download

* Fix for direct download ISOs on local storage and shared mount point

* Last fix iso

* Fix VM migration with ISO

* Refactor volume migration to remove secondary storage intermediate

* Fix simulator issue
2020-03-06 19:56:54 +01:00
Rohit Yadav
524b995083
IoT/ARM64 support: allow cloudstack-agent on Raspberry Pi 4 (armv8) to use kvm acceleration (#3644)
KVM is supported on arm64 Linux (https://www.linux-kvm.org/page/Processor_support#ARM:).
For a small (IoT) platform such as the new Raspberry Pi 4 that uses armv8 processor
(cortex-a72) it's possible to run Linux host with `/dev/kvm`
accleration. This adds support for IoT IaaS in CloudStack.

This PR is from a fun weekend project where:
- I set up a Raspberry Pi 4 - 4GB RAM model with 4 CPU cores @ 1.5Ghz, 128GB SD samsung evo plus card
- Installed Ubuntu 19.10 raspi3 base image: http://cdimage.ubuntu.com/releases/19.10/release/ubuntu-19.10-preinstalled-server-arm64+raspi3.img.xz
- Build a custom Linux 5.3 kernel with KVM enabled, deb here: http://dl.rohityadav.cloud/cloudstack-rpi/kernel-19.10/ and install the linux-image and linux-module
- Then install/setup CloudStack on it (fix some issues around jna, by manually installing newer libjna-java to /usr/share/cloudstack-agent/lib)
- Since the host processor is not x86_64, I had to build a new arm64 (or aarch64) systemvmtemplate: http://dl.rohityadav.cloud/cloudstack-rpi/systemvmtemplate/

I could finally get a 4.13 CloudStack + Adv zone/networking to run on it
and deployed a KVM based Ubuntu 19.10 environment and NFS storage.
Deployed a test vm with isolated network, VR works as expected. Console
proxy works as well, for this tested against arm64 openstack Debian 9/10
templates.

I raised the issue of enabling KVM in upstream Ubuntu arm64 build: https://bugs.launchpad.net/ubuntu/+source/linux-raspi2/+bug/1783961
Ubuntu kernel team has come back and future arm64 releases may have 
KVM enabled by default.

Limitation: on my aarch64 env, it did not support IDE, therefore all
default bus type for volumes are SCSI by default. With VIRTIO it fails
sometimes.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-11-11 22:01:05 +05:30
Andrija Panic
df173859d7 agent: add host.reserved.mem.mb parameter documentation (#3016)
Document "host.reserved.mem.mb" parameter in agent.properties.
2018-11-12 11:40:37 +05:30
Nicolas Vazquez
4de4eabd18
Enable DPDK support on KVM (#2839)
* Enable DPDK support on KVM

* Allow DPDK deployments on user VMs only

* Fix port name ordering
2018-11-07 09:29:01 -03:00
Rohit Yadav
30175d6879
CLOUDSTACK-10132: Extend support for management servers LB for agents (#2469)
The new CA framework introduced basic support for comma-separated
list of management servers for agent, which makes an external LB
unnecessary.

This extends that feature to implement LB sorting algorithms that
sorts the management server list before they are sent to the agents.
This adds a central intelligence in the management server and adds
additional enhancements to Agent class to be algorithm aware and
have a background mechanism to check/fallback to preferred management
server (assumed as the first in the list). This is support for any
indirect agent such as the KVM, CPVM and SSVM agent, and would
provide support for management server host migration during upgrade
(when instead of in-place, new hosts are used to setup new mgmt server).

This FR introduces two new global settings:

- `indirect.agent.lb.algorithm`: The algorithm for the indirect agent LB.
- `indirect.agent.lb.check.interval`: The preferred host check interval
  for the agent's background task that checks and switches to agent's
  preferred host.

The indirect.agent.lb.algorithm supports following algorithm options:

- static: use the list as provided.
- roundrobin: evenly spreads hosts across management servers based on
  host's id.
- shuffle: (pseudo) randomly sorts the list (not recommended for production).

Any changes to the global settings - `indirect.agent.lb.algorithm` and
`host` does not require restarting of the mangement server(s) and the
agents. A message bus based system dynamically reacts to change in these
global settings and propagates them to all connected agents.

Comma-separated management server list is propagated to agents on
following cases:
- Addition of a host (including ssvm, cpvm systevms).
- Connection or reconnection by the agents to a management server.
- After admin changes the 'host' and/or the
  'indirect.agent.lb.algorithm' global settings.

On the agent side, the 'host' setting is saved in its properties file as:
`host=<comma separated addresses>@<algorithm name>`.

First the agent connects to the management server and sends its current
management server list, which is compared by the management server and
in case of failure a new/update list is sent for the agent to persist.

From the agent's perspective, the first address in the propagated list
will be considered the preferred host. A new background task can be
activated by configuring the `indirect.agent.lb.check.interval` which is
a cluster level global setting from CloudStack and admins can also
override this by configuring the 'host.lb.check.interval' in the
`agent.properties` file.

Every time agent gets a ms-host list and the algorithm, the host specific
background check interval is also sent and it dynamically reconfigures
the background task without need to restart agents.

Note: The 'static' and 'roundrobin' algorithms, strictly checks for the
order as expected by them, however, the 'shuffle' algorithm just checks
for content and not the order of the comma separate ms host addresses.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-03-15 16:34:03 +05:30
Bitworks Software, Ltd
3381c38cc7 CLOUDSTACK-10073: KVM host RAM overprovisioning (#2266)
Commit enables a new feature for KVM hypervisor which purpose is to increase virtually amount of RAM available beyond the actual limit.
There is a new parameter in agent.properties: host.overcommit.mem.mb which enables adding specified amount of RAM to actually available. It is necessary to utilize KSM and ZSwap features which extend RAM with deduplication and compression.
2017-09-29 11:46:09 +05:30
Wido den Hollander
b130e55088 CLOUDSTACK-9397: Add Watchdog timer to KVM Instance (#1707)
The watchdog timer adds functionality where the Hypervisor can detect if an
instance has crashed or stopped functioning.
The watchdog timer adds functionality where the Hypervisor can detect if an
instance has crashed or stopped functioning.

When the Instance has the 'watchdog' daemon running it will send heartbeats
to the /dev/watchdog device.

If these heartbeats are no longer received by the HV it will reset the Instance.

If the Instance never sends the heartbeats the HV does not take action. It only
takes action if it stops sending heartbeats.

This is supported since Libvirt 0.7.3 and can be defined in the XML format as
described in the docs: https://libvirt.org/formatdomain.html#elementsWatchdog

To the 'devices' section this will be added:

In the agent.properties the action to be taken can be defined:

vm.watchdog.action=reset

The same goes for the model. The Intel i6300esb is however the most commonly used.

vm.watchdog.model=i6300esb

When the Instance has the 'watchdog' daemon running it will send heartbeats
to the /dev/watchdog device.

If these heartbeats are no longer received by the HV it will reset the Instance.

If the Instance never sends the heartbeats the HV does not take action. It only
takes action if it stops sending heartbeats.

This is supported since Libvirt 0.7.3 and can be defined in the XML format as
described in the docs: https://libvirt.org/formatdomain.html#elementsWatchdog

To the 'devices' section this will be added:

  <watchdog model='i6300esb' action='reset'/>

In the agent.properties the action to be taken can be defined:

  vm.watchdog.action=reset

The same goes for the model. The Intel i6300esb is however the most commonly used.

  vm.watchdog.model=i6300esb

Signed-off-by: Wido den Hollander <wido@widodh.nl>
2017-09-28 13:56:15 +05:30
Rohit Yadav
e9f526e221 Merge branch '4.9' into 4.10
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2017-07-28 10:49:34 +02:00
Abhinandan Prateek
7ed3179bd1 CLOUDSTACK-9569: add router.aggregation.command.each.timeout to agent.properties (#1933) 2017-07-27 11:17:20 +02:00
Wido den Hollander
2a5f37c1b1
CLOUDSTACK-8715: Add channel to Instances for Qemu Guest Agent
This commit adds a additional VirtIO channel with the name
'org.qemu.guest_agent.0' to all Instances.

With the Qemu Guest Agent the Hypervisor gains more control over the Instance if
these tools are present inside the Instance, for example:

* Power control
* Flushing filesystems
* Fetching Network information

In the future this should allow safer snapshots on KVM since we can instruct the
Instance to flush the filesystems prior to snapshotting the disk.

More information: http://wiki.qemu.org/Features/QAPI/GuestAgent

Keep in mind that on Ubuntu AppArmor still needs to be disabled since the default
AppArmor profile doesn't allow libvirt to write into /var/lib/libvirt/qemu

This commit does not add any communication methods through API-calls, it merely
adds the channel to the Instances and installs the Guest Agent in the SSVMs.

With the addition of the Qemu Guest Agent channel a second channel appears in /dev
on a SSVM as a VirtIO port.

The order in which the ports are defined in the XML matters for the naming inside
the SSVM VM and by not relying on /dev/vportXX but looking for a static name the
SSVM still boots properly if the order in the XML definition is changed.

A SSVM with both ports attached will have something like this:

  root@v-215-VM:~# ls -l /dev/virtio-ports
  total 0
  lrwxrwxrwx 1 root root 11 May 13 21:41 org.qemu.guest_agent.0 -> ../vport0p2
  lrwxrwxrwx 1 root root 11 May 13 21:41 v-215-VM.vport -> ../vport0p1
  root@v-215-VM:~# ls -l /dev/vport*
  crw------- 1 root root 251, 1 May 13 21:41 /dev/vport0p1
  crw------- 1 root root 251, 2 May 13 21:41 /dev/vport0p2
  root@v-215-VM:~#

In this case the SSVM port points to /dev/vport0p1, but if the order in the XML
is different it might point to /dev/vport0p2

By looking for a portname with a pre-defined pattern in /dev/virtio-ports we
do not rely on the order in the XML definition.

Signed-off-by: Wido den Hollander <wido@widodh.nl>
2016-11-23 16:01:08 +01:00
Wido den Hollander
0beb41b6e7 CLOUDSTACK-9395: Add Virtio RNG device to Instances when configured
By adding a Random Number Generator device to Instances we can prevent
entropy starvation inside guest.

The default source is /dev/random on the host, but this can be configured
to another source when present, for example a hardware RNG.

When enabled it will add the following to the Instance's XML definition:

  <rng model='virtio'>
    <rate period='1000' bytes='2048' />
    <backend model='random'>/dev/random</backend>
  </rng>

If the Instance has the proper support, which most modern distributions have,
it will have a /dev/hwrng device which it can use for gathering entropy.

More information: https://libvirt.org/formatdomain.html#elementsRng
2016-10-04 12:44:55 +02:00
Rohit Yadav
52a98fa6cf CLOUDSTACK-8762: Check to confirm disk activity before starting a VM
Implements a VM volume/disk file activity checker that checks if QCOW2 file
has been changed before starting the VM. This is useful as a pessimistic
approach to save VMs that were running on faulty hosts that CloudStack could
try to launch on other hosts while the host was not cleanly fenced. This is
optional and available only if you enable the settings in agent.properties
file, on per-host basis.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2015-08-28 16:05:30 +05:30
Rohit Yadav
58cc569273 CLOUDSTACK-8424: Add cpu features if guest.cpu.features is set
This improvements checks for "guest.cpu.features" property which is a space
separated list of cpu features that is specific for a host. When added, it
will add  <feature policy='require' name='{{feature-you-listed}}'/> in the
<cpu> section of the generated vm spec xml.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
(cherry picked from commit ea7fd37783cbc7ec78de5a5e84395381b1800a3e)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2015-04-28 13:19:49 +02:00
Kishan Kavala
89854de18d CLOUDSTACK-6931: Set hypervisor.type in agent.properties using cloudstack-setup -t option. Default is kvm. 2014-06-19 11:31:23 +05:30
Marcus Sorensen
f3a0d257b8 CLOUDSTACK-6788: Add agent property to optionally disable kvmclock on guests 2014-05-27 16:16:41 -06:00
Marcus Sorensen
bbaec7bae8 CLOUDSTACK-6203: Correct documentation for KVM migration tuneables 2014-03-05 12:41:09 -07:00
Marcus Sorensen
e5449e29c9 CLOUDSTACK-6203: KVM Migration fixes. Moved migration to a thread
so we can monitor it and potentially take action to make migration
complete if admin has defined such.
2014-03-05 12:24:04 -07:00
Marcus Sorensen
1530c162e5 CLOUDSTACK-5968 create vm.memballoon.disable agent parameter 2014-01-28 10:44:44 -07:00
JijunLiu
2903bb5fd9 Add cpu model for kvm guest.Now all the kvm guest's cpu model is 'QEMU Virtual CPU version xxx'. This will affect the activation of Windows OS and low performance. I add three mode for user to indicate the guest cpu model.add libvirt version check
Signed-off-by: Wei Zhou <w.zhou@leaseweb.com>
2013-08-02 11:56:59 +02:00
Wido den Hollander
322db71eed agent: Only probe running VMs for the current Hypervisor Type
Otherwise we try to probe for LXC VMs on a KVM hypervisor and vise versa.
2013-05-26 11:38:41 +02:00
Nitin Mehta
c11dbad9c9 merge master 2013-05-11 15:28:43 +05:30
Phong Nguyen
aa79ccf985 CLOUDSTACK-922: LXC Support in Cloudstack.
Signed-off-by: Edison Su <sudison@gmail.com>
2013-04-01 15:41:42 -07:00
Marcus Sorensen
e08281838a Summary: Add EOF to agent.properties for proper parsing
Detail: lack of newline at end of file was keeping cloudstack-setup-agent from
properly editing/creating new config.

BUG-ID: CLOUDSTACK-1487
Signed-off-by: Marcus Sorensen <marcus@betterservers.com> 1362191198 -0700
2013-03-01 19:26:38 -07:00
Hugo Trippaers
9f00302ad3 Summary: Polish and shine
Document the used options in agent.properties

Default the bridge driver to something sensible based on the
configuration of the bridge type.
2013-01-15 09:18:24 +01:00
Wido den Hollander
6c015d4f81 agent: Add default path for KVM scripts to sample configuration 2012-08-08 22:31:06 +02:00
Wido den Hollander
5238a83688 agent: Update sample configuration file
It should now contain everything you need to run your agent
2012-08-08 22:31:06 +02:00
Edison Su
7a0a9231c3 Move KVM related code into plugins/hypervisor/kvm, a new jar file is
created: cloud-kvm.jar
2012-07-30 14:55:47 -07:00
David Nalley
c15948a3ef committing Chip Childers patches fixing licensing headers
Applying to the following directories:
* api
* deamonize
* agnet
* agent-simulator
* cloud-cli
2012-06-12 12:32:58 -04:00
Edison Su
cf0a4e0274 bug 14034: add newline around configuration file. status 14034: resolved fixed. Reviewed-by: frank 2012-02-27 15:40:27 -08:00
Edison Su
86ce1b0675 bug 13857: set vm migration speed as the network speed
status 13857: resolved fixed
Reviewed-by: Frank
2012-02-17 15:43:52 -08:00
Manuel Amador (Rudd-O)
05c020e1f6 Source code committed 2010-08-11 09:13:29 -07:00