CLOUDSTACK-8991 - IP address is not removed from VR even after disabling static NAT
This PR fixes the public IP removal from the virtual routers. It also improves the existing test_network.py.
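A minimal sketch of the intended behaviour, assuming an IP transfer object with an add flag (the class and helper names below are illustrative only; the actual change mostly lives in the router's config scripts):
```
import java.util.List;

// Illustrative sketch: IPs flagged with add=false must be removed from the
// interface instead of being skipped, which is what left stale IPs on the VR.
public class IpAssocSketch {
    // Hypothetical stand-in for the IP transfer object handled by the router.
    record PublicIp(String address, String device, boolean add) {}

    static void apply(List<PublicIp> ips) {
        for (PublicIp ip : ips) {
            if (ip.add()) {
                System.out.println("plumb " + ip.address() + " on " + ip.device());
            } else {
                // The missing branch: disabled static-NAT IPs stayed on the VR without it.
                System.out.println("remove " + ip.address() + " from " + ip.device());
            }
        }
    }

    public static void main(String[] args) {
        apply(List.of(new PublicIp("10.1.1.5", "eth2", false)));
    }
}
```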
* pr/989:
CLOUDSTACK-8991 - Process the IPs that have been removed
CLOUDSTACK-8991 - Remove public IP from interface in case add = false
CLOUDSTACK-8991 - Make sure the public IP is removed from the router before checking
Signed-off-by: Remi Bergsma <github@remi.nl>
[master/4.6] CLOUDSTACK-8999: Don't override resource if provided by agent.properties
If a custom resource (kvm/libvirt implementation) is defined in agent.properties, don't override it with the default; only fall back to the default if the resource property is not defined.
A simple if-else fix, cc @remibergsma @wido @wilderrodrigues @borisroman and others
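A minimal sketch of that if-else, assuming a `resource` key in agent.properties (the class and method names below are illustrative, not the actual AgentShell code):
```
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Properties;

public class ResourceSelectionSketch {
    static final String DEFAULT_RESOURCE =
            "com.cloud.hypervisor.kvm.resource.LibvirtComputingResource";

    static String resolveResource(Properties agentProps) {
        String configured = agentProps.getProperty("resource");
        // Don't override a user-supplied value; only fall back when it is absent or blank.
        return (configured == null || configured.trim().isEmpty())
                ? DEFAULT_RESOURCE
                : configured.trim();
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        try (FileInputStream in = new FileInputStream("agent.properties")) {
            props.load(in);
        }
        System.out.println("Using resource: " + resolveResource(props));
    }
}
```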
* pr/991:
CLOUDSTACK-8999: Don't override resource if provided by agent.properties
Signed-off-by: Remi Bergsma <github@remi.nl>
CLOUDSTACK-8941: fix NPE when migrating a VM to other zone-wide pools the second time
This happens because the pod_id is set to NULL the first time the instance is migrated to a zone-wide (not cluster-wide) pool.
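A hedged sketch of the null-safe handling this implies, with hypothetical record and method names (the real fix sits in the migration/allocation path):
```
// After a zone-wide migration the VM's pod_id can legitimately be NULL,
// so later migrations must not dereference it unconditionally.
public class ZoneWidePodSketch {
    record VmRecord(long zoneId, Long podId) {}   // podId nullable after zone-wide migration

    static long podOrFallback(VmRecord vm, long fallbackPodId) {
        // The NPE came from code paths that assumed podId was always set.
        return vm.podId() != null ? vm.podId() : fallbackPodId;
    }

    public static void main(String[] args) {
        System.out.println(podOrFallback(new VmRecord(1L, null), -1L)); // -1
    }
}
```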
* pr/918:
CLOUDSTACK-8941: fix NPE when migrating a VM to other zone-wide pools the second time
Signed-off-by: Remi Bergsma <github@remi.nl>
CLOUDSTACK-8990: start a stopped machine on a specific determinable host on UI
* pr/978:
CLOUDSTACK-8990: start a stopped machine on a specific determinable host on UI
Signed-off-by: Remi Bergsma <github@remi.nl>
CLOUDSTACK-8985: Deleted volume's removed column not updated
I found this issue yesterday when a SolidFire integration test wasn't able to delete primary storage because it claimed there were still volumes using the primary storage in question (this was due to the removed column not being updated appropriately).
I decided to go with a solution where the delete logic passes in a volume ID to ignore when computing the used space of the primary storage in question.
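A minimal sketch of that approach, with hypothetical names for the volume rows and the capacity helper:
```
import java.util.List;

// When deleting a volume, pass its ID into the capacity calculation so a row
// whose "removed" column has not been updated yet is not counted against the pool.
public class UsedSpaceSketch {
    record VolumeRow(long id, long sizeBytes, boolean removed) {}

    static long usedBytes(List<VolumeRow> volumesOnPool, Long volumeIdToIgnore) {
        return volumesOnPool.stream()
                .filter(v -> !v.removed())
                .filter(v -> volumeIdToIgnore == null || v.id() != volumeIdToIgnore)
                .mapToLong(VolumeRow::sizeBytes)
                .sum();
    }

    public static void main(String[] args) {
        List<VolumeRow> rows = List.of(new VolumeRow(42, 10L << 30, false));
        // Ignore the volume currently being deleted when checking whether the pool is empty.
        System.out.println(usedBytes(rows, 42L)); // 0
    }
}
```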
* pr/968:
CLOUDSTACK-8985: Deleted volume's removed column not updated
Signed-off-by: Remi Bergsma <github@remi.nl>
CLOUDSTACK-8911: VM start job got stuck in loop looking for suitable host
The VM instance creation job gets stuck in a loop when the VM requires local storage, one host has reached the max guest limit, and the remaining hosts do not have enough storage available. This happens because hosts that reach the max guest limit were not added to the avoid list, and hence neither was the cluster (see the sketch after the repro steps).
Verified the fix on my local setup.
Repro Steps:
1. Take an environment with a single cluster and 2 hosts.
2. Change the max guest limit for the hypervisor so that one host reaches it.
3. Change thresholds so that the other host does not have enough storage. If required, create a VM with a sufficiently big disk.
4. Now deploy a VM with local storage.
5. The cluster will never be put in the avoid set and the job will keep looking for a suitable host.
6. Once we increase the max guest limit, the VM will deploy, or it will fail if there is a lack of storage.
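A simplified sketch of the intended planner behaviour, using hypothetical types rather than CloudStack's DeploymentPlanner API: any host that fails a capacity check, including the max guest limit, must land in the avoid set so the loop can terminate.
```
import java.util.HashSet;
import java.util.Set;

public class AvoidSetSketch {
    record Host(long id, int runningGuests, int maxGuests, long freeLocalStorage) {}

    static Long pickHost(Iterable<Host> hosts, long requiredLocalStorage, Set<Long> avoid) {
        for (Host h : hosts) {
            if (avoid.contains(h.id())) {
                continue;
            }
            if (h.runningGuests() >= h.maxGuests() || h.freeLocalStorage() < requiredLocalStorage) {
                avoid.add(h.id());   // the missing step: without this the job retries forever
                continue;
            }
            return h.id();
        }
        return null;                 // all hosts avoided -> caller can put the cluster in avoid too
    }

    public static void main(String[] args) {
        Set<Long> avoid = new HashSet<>();
        Host full = new Host(1, 20, 20, 500L << 30);
        Host noSpace = new Host(2, 3, 20, 1L << 30);
        System.out.println(pickHost(java.util.List.of(full, noSpace), 100L << 30, avoid)); // null
        System.out.println(avoid);   // [1, 2]
    }
}
```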
* pr/895:
CLOUDSTACK-8911: VM start job got stuck in loop looking for suitable host
Signed-off-by: Remi Bergsma <github@remi.nl>
If a custom resource (kvm/libvirt implementation) is defined in agent.properties, don't override it with the default; only fall back to the default if the resource property is not defined.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
(cherry picked from commit 536a8b22c8865dc94281bce6267930a63e03ab77)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
CLOUDSTACK-8816: some of the events do not have resource uuids
The key objects in the context map are sometimes a String and sometimes an Object. This causes missing uuids when an entity that was put into the context map with key entity.toString() is queried with key entity.
Testing:
Manually tested by deploying a VM and checked that the events created in RabbitMQ now have uuids.
Events before and after the change are updated at https://issues.apache.org/jira/browse/CLOUDSTACK-8816?focusedCommentId=14805239
unittests
```
$ mvn -pl :cloud-api test -Dtest=CallContextTest
-------------------------------------------------------
T E S T S
-------------------------------------------------------
Running org.apache.cloudstack.context.CallContextTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.152 sec - in org.apache.cloudstack.context.CallContextTest
Results :
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 11.445 s
[INFO] Finished at: 2015-09-18T14:58:53+05:30
[INFO] Final Memory: 55M/448M
[INFO] ------------------------------------------------------------------------
```
* pr/849:
CLOUDSTACK-8816 added missing events
CLOUDSTACK-8816: fixed missing resource uuid in delete network cmd
CLOUDSTACK-8816: fixed missing resource uuid in destroy vm event
Cloudstack-8816: Fixed missing resource uuid in delete snapshot events
CLOUDSTACK-8816: some of the events do not have resource uuids
CLOUDSTACK-8816: some of the events do not have resource uuids
Signed-off-by: Remi Bergsma <github@remi.nl>
CLOUDSTACK-8917: Instance tab takes a long time to load with 12K VMs
Modified the SQL that is used for retrieving the VM count.
In a load test environment, listVirtualMachines takes 8-11 seconds. This environment has around 12K active VMs; the total number of rows is 190K.
The performance bottleneck in the listVirtualMachines command is fetching the count and the distinct VMs.
{noformat}
// search vm details by ids
Pair<List<UserVmJoinVO>, Integer> uniqueVmPair = _userVmJoinDao.searchAndCount(sc, searchFilter);
Integer count = uniqueVmPair.second();
{noformat}
This takes 95% of the total time.
To fetch the count and the distinct VMs we use the SQL queries below.
Query 1:
{noformat}
SELECT DISTINCT(user_vm_view.id) FROM user_vm_view WHERE user_vm_view.account_type != 5 AND user_vm_view.display_vm = 1 AND user_vm_view.removed IS NULL ORDER BY user_vm_view.id ASC LIMIT 0, 20
{noformat}
Query 2:
select count(distinct id) from user_vm_view WHERE user_vm_view.account_type != 5 AND user_vm_view.display_vm = 1 AND user_vm_view.removed IS NULL
Query 2 is the problematic query.
If we rewrite it as below, it becomes ~2x faster.
select count(*) from (select distinct id from user_vm_view WHERE user_vm_view.account_type != 5 AND user_vm_view.display_vm = 1 AND user_vm_view.removed IS NULL) as temp;
MySQL test results:
With 134 active VMs (349 total rows):
mysql> select count(*) from vm_instance;
+----------+
| count(*) |
+----------+
| 349 |
+----------+
1 row in set (0.00 sec)
mysql> select count(*) from user_vm_view;
+----------+
| count(*) |
+----------+
| 135 |
+----------+
1 row in set (0.02 sec)
mysql> select count(distinct id) from user_vm_view WHERE user_vm_view.account_type != 5 AND user_vm_view.display_vm = 1 AND user_vm_view.removed IS NULL;
+--------------------+
| count(distinct id) |
+--------------------+
| 134 |
+--------------------+
1 row in set (0.02 sec)
mysql> select count(*) from (select distinct id from user_vm_view WHERE user_vm_view.account_type != 5 AND user_vm_view.display_vm = 1 AND user_vm_view.removed IS NULL) as temp;
+----------+
| count(*) |
+----------+
| 134 |
+----------+
1 row in set (0.01 sec)
With 14326 active VMs (195660 total rows):
mysql> select count(*) from vm_instance;
+----------+
| count(*) |
+----------+
| 195660 |
+----------+
1 row in set (0.04 sec)
mysql> select count(*) from user_vm_view;
+----------+
| count(*) |
+----------+
| 41313 |
+----------+
1 row in set (4.55 sec)
mysql> select count(distinct id) from user_vm_view WHERE user_vm_view.account_type != 5 AND user_vm_view.display_vm = 1 AND user_vm_view.removed IS NULL;
+--------------------+
| count(distinct id) |
+--------------------+
| 14326 |
+--------------------+
1 row in set (7.39 sec)
mysql> select count(*) from (select distinct id from user_vm_view WHERE user_vm_view.account_type != 5 AND user_vm_view.display_vm = 1 AND user_vm_view.removed IS NULL) as temp;
+----------+
| count(*) |
+----------+
| 14326 |
+----------+
1 row in set (2.08 sec)
UI test results: before/after screenshots (images not included here).
* pr/894:
CLOUDSTACK-8917 : Instance tab takes long time to load with 12K active VM (total vms: 190K)
Signed-off-by: Rajani Karuturi <rajani.karuturi@citrix.com>
CLOUDSTACK-8889: delete volume doesn't decrement primary store resource count
The primary storage count for an account does not decrease when a data disk belonging to the account is deleted, unless the VM to which the volume belonged is destroyed.
The resource counts were updated even before the disk was actually deleted, resulting in the same value.
Moved the resource count update to after the expunge operation, as that is when the disk is actually deleted.
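A minimal ordering sketch, with hypothetical callbacks standing in for the expunge and resource-count logic:
```
import java.util.function.LongPredicate;

// Decrement the account's primary-storage count only after the volume has
// actually been expunged, so the count reflects the real deletion exactly once.
public class DeleteVolumeSketch {
    static boolean deleteVolume(long volumeId, LongPredicate expunge, Runnable decrementCount) {
        // Before the fix the count was updated before this point, i.e. before
        // the disk was really gone.
        boolean expunged = expunge.test(volumeId); // actually deletes the disk
        if (expunged) {
            decrementCount.run();                  // moved to after the expunge
        }
        return expunged;
    }

    public static void main(String[] args) {
        System.out.println(deleteVolume(7L, id -> true, () -> System.out.println("count -1")));
    }
}
```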
Testing:
Earlier, test_create_multiple_volumes in test/integration/component/test_ps_limits.py failed with error AssertionError: Resource count 37 should match with the expected resource count 32
Before
Test create multiple volumes ... === TestName: test_create_multiple_volumes_1_root_domain_admin | Status : FAILED ===
FAIL
Test create multiple volumes ... === TestName: test_create_multiple_volumes_2_child_domain_admin | Status : FAILED ===
FAIL
After the Fix
Test create multiple volumes ... === TestName: test_create_multiple_volumes_1_root_domain_admin | Status : SUCCESS ===
ok
Test create multiple volumes ... === TestName: test_create_multiple_volumes_2_child_domain_admin | Status : SUCCESS ===
ok
----------------------------------------------------------------------
Ran 2 tests in 334.823s
OK
* pr/860:
CLOUDSTACK-8889: delete volume doesn't decrement primary store resource count
Signed-off-by: Remi Bergsma <github@remi.nl>
The primary storage count for an account does not decrease when a data disk belonging to the account is deleted, unless the VM to which the volume belonged is destroyed.
The resource counts were updated even before the disk was actually deleted, resulting in the same value.
Moved the resource count update to after the expunge operation, as that is when the disk is actually deleted.
All the tests in test/integration/component/test_ps_limits.py now pass.
all the tests in test/integration/component/test_ps_limits.py now pass
CLOUDSTACK-8964: Can't create template or volume from snapshot on KVM
* pr/954:
CLOUDSTACK-8964: Can't create template or volume from snapshot
Signed-off-by: Remi Bergsma <github@remi.nl>
FIX: Ovm3 physical network traffic labels to work.
The labeling was broken: only labels assigned at zone creation were used, and changing labels did not work. Tested by changing a label and checking it; labels assigned at zone creation still work.
As a bonus, fixed the consistency of the KVM traffic label in Dutch compared to the other traffic labels in Dutch, and copied in the translated OVM3 label for other languages based on the other traffic labels in those languages.
* pr/964:
FIX: Ovm3 physical network traffic labels to work.
Signed-off-by: Remi Bergsma <github@remi.nl>
The uuid is missing in the first event of VM create because the entity has just been created and was never put into the Context. Added the entity uuid to the context on successful creation.
The key for an entity is sometimes an object and sometimes a String with the value object.toString(), due to serialization and deserialization. Addressed this in the getter of CallContext by also checking for key.toString() if no object is found under key.
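A minimal sketch of that getter fallback, using a plain map rather than the real CallContext class:
```
import java.util.HashMap;
import java.util.Map;

// Entries may have been stored under entity.toString() after a
// serialize/deserialize round trip, so look the key up both ways.
public class ContextLookupSketch {
    private final Map<Object, Object> context = new HashMap<>();

    public void put(Object key, Object value) {
        context.put(key, value);
    }

    public Object get(Object key) {
        Object value = context.get(key);
        if (value == null && key != null) {
            value = context.get(key.toString());   // fallback: the key may have been stringified
        }
        return value;
    }

    public static void main(String[] args) {
        ContextLookupSketch ctx = new ContextLookupSketch();
        Object vmKey = new Object() { @Override public String toString() { return "VM-1234"; } };
        ctx.put(vmKey.toString(), "uuid-abcd");    // stored under the String form
        System.out.println(ctx.get(vmKey));        // still found: uuid-abcd
    }
}
```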
CLOUDSTACK-8964: Can't create volume from snapshot of a removed volume
This issue happens on KVM as well.
This is because the volume info is missing in the CopyCommand once the volume has been removed.
When the KVM agent tries to process the command, it throws an NPE.
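An illustrative guard with hypothetical types; the actual fix may instead ensure the volume info is still populated for removed volumes, but either way the agent must not blindly dereference a missing source volume:
```
public class CopyCommandGuardSketch {
    record VolumeInfo(String path, String uuid) {}
    record CopyCmd(VolumeInfo srcVolume, String snapshotPath) {}

    static String process(CopyCmd cmd) {
        if (cmd.srcVolume() == null) {
            // Before the fix this path dereferenced the missing volume info and threw an NPE.
            return "ERROR: source volume info missing (volume already removed)";
        }
        return "copy from " + cmd.srcVolume().path();
    }

    public static void main(String[] args) {
        System.out.println(process(new CopyCmd(null, "/snapshots/snap-1")));
    }
}
```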
* pr/976:
CLOUDSTACK-8964: Can't create volume from snapshot of a removed volume
Signed-off-by: Remi Bergsma <github@remi.nl>
CLOUDSTACK-8838 Interface pattern check
This closes #812 and #966 as well.
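An illustrative pattern check; the exact regex in LibvirtComputingResource may differ, but the idea is to accept classic ethN/bondN/vlan names plus the CentOS 7 predictable names (ensX, enoX, enpXsY, enxMAC) and teamd devices (team*):
```
import java.util.List;
import java.util.regex.Pattern;

public class InterfacePatternSketch {
    private static final Pattern PHYSICAL_IF = Pattern.compile(
            "^(eth\\d+|bond\\d+|vlan\\d+|em\\d+|ens\\d+|eno\\d+|enp\\d+s\\d+(f\\d+)?(d\\d+)?|enx[0-9a-fA-F]{12}|team\\d+)$");

    static boolean isPhysicalInterface(String name) {
        return PHYSICAL_IF.matcher(name).matches();
    }

    public static void main(String[] args) {
        for (String dev : List.of("eth0", "ens3", "eno1", "enp2s0f1", "enxaabbccddeeff", "team0", "virbr0")) {
            System.out.println(dev + " -> " + isPhysicalInterface(dev));
        }
    }
}
```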
* pr/973:
unit test for interface patterns in libvirt compute resource
Added support for KVM teamd devices to LibvirtComputingResource.java. This will allow users to utilize teamd NIC teaming devices named team*.
CLOUDSTACK-8838: Allow ensX enoX enpX enxX format for nics in CentOS 7
Signed-off-by: Remi Bergsma <github@remi.nl>
smoke/test_internal_lb.py: Fix template not ready error
Added a wait for the template download.
Refactored the template section of services.
Added some extra logging in the setup phase.
* pr/971:
Add wait for template download Refactored template section of services
Signed-off-by: Remi Bergsma <github@remi.nl>
CID 1324349: conditionally return -1 or the dc id for the volume
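A hedged reading of the Coverity fix, with hypothetical accessor names: return -1 when the volume cannot be resolved, otherwise its data center id.
```
public class VolumeZoneSketch {
    record Volume(long dataCenterId) {}

    static long zoneIdFor(Volume volume) {
        // Avoid a possible null dereference flagged by Coverity.
        return volume == null ? -1L : volume.dataCenterId();
    }

    public static void main(String[] args) {
        System.out.println(zoneIdFor(null));           // -1
        System.out.println(zoneIdFor(new Volume(3)));  // 3
    }
}
```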
* pr/822:
CID 1324349: conditionally return -1 or the dc id for the volume
Signed-off-by: Remi Bergsma <github@remi.nl>
Fix error message in test_isolate_network_FW_PF_default_routes
While running test_isolate_network_FW_PF_default_routes it is expected that SSH'ing into a VM does not work immediately. However, when it fails (as expected) with the following error
====Trying SSH Connection: Host:192.168.23.12 User:root
Port:22 RetryCnt:1===
Traceback (most recent call last):
File "/home/jenkins/workspace/mccloud/mct-run-marvin-tests@2/venv/lib/python2.7/site-packages/marvin/sshClient.py", line 121, in createConnection timeout=self.timeout)
File "/home/jenkins/workspace/mccloud/mct-run-marvin-tests@2/venv/lib/python2.7/site-packages/paramiko/client.py", line 251, in connect retry_on_signal(lambda: sock.connect(addr))
File "/home/jenkins/workspace/mccloud/mct-run-marvin-tests@2/venv/lib/python2.7/site-packages/paramiko/util.py", line 270, in retry_on_signal return function()
File "/home/jenkins/workspace/mccloud/mct-run-marvin-tests@2/venv/lib/python2.7/site-packages/paramiko/client.py", line 251, in <lambda> retry_on_signal(lambda: sock.connect(addr))
File "/usr/lib64/python2.7/socket.py", line 224, in meth return getattr(self._sock,name)(*args)
error: [Errno 113] No route to host
it would try to print a message that generates an actual error:
```
DEBUG: ====Trying SSH Connection: Host:192.168.23.12 User:root
Port:22 RetryCnt:0===
test_isolate_network_FW_PF_default_routes
(integration.component.test_routers_network_ops.TestIsolatedNetworks):
CRITICAL: EXCEPTION: test_isolate_network_FW_PF_default_routes:
Traceback (most recent call last):,
File "/usr/lib64/python2.7/unittest/case.py", line 369, in run
testMethod()',
File "/home/jenkins/workspace/mccloud/mct-run-marvin-tests@2/test/integration/component/test_routers_network_ops.py", line 448, in test_isolate_network_FW_PF_default_routes self.fail("Failed to SSH into VM - %s" % (public_ip.ipaddress.ipaddress)),
"AttributeError: 'unicode' object has no attribute 'ipaddress'"
```
* pr/972:
Fix error message in test_isolate_network_FW_PF_default_routes
Signed-off-by: Remi Bergsma <github@remi.nl>
CLOUDSTACK-8935 - Cannot remove [r]VPC networks due to RTNETLINK error
This PR fixes the "sequence item 0: expected string, NoneType found" error in the CsDhcp.py file when attempting to remove a network from a VPC.
* pr/967:
CLOUDSTACK-8935 - Filter the DNS list because it might contain 1 None entry which breaks the code.
CLOUDSTACK-8935 - Clean up network resources in the right order
CLOUDSTACK-8935 - Do not retry 60 times when we expect the SSH to fail
CLOUDSTACK-8935 - Check if the key is available in the dictionary
CLOUDSTACK-8935 - Add a check to avoid exception related to None value
Signed-off-by: Remi Bergsma <github@remi.nl>
This issue happens on KVM as well.
This is because the volume info is missing in the CopyCommand once the volume has been removed.
When the KVM agent tries to process the command, it throws an NPE.
This issue happens on KVM.
Normally the SSVM will process the CopyCommand from snapshot to template.
However, Ovm3HypervisorGuru chooses a KVM hypervisor to process the CopyCommand.
This is obviously wrong.
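A conceptual sketch of the expected behaviour, not the actual HypervisorGuru API: the OVM3 guru should only claim commands destined for OVM3 hosts, so a snapshot-to-template CopyCommand is never handed to a KVM host.
```
public class CommandDelegationSketch {
    enum HypervisorType { KVM, Ovm3, XenServer }
    record Host(long id, HypervisorType type) {}

    static boolean ovm3GuruHandles(Host candidateHost) {
        // Before the fix the candidate host's hypervisor type was not checked here.
        return candidateHost.type() == HypervisorType.Ovm3;
    }

    public static void main(String[] args) {
        System.out.println(ovm3GuruHandles(new Host(1, HypervisorType.KVM)));  // false
        System.out.println(ovm3GuruHandles(new Host(2, HypervisorType.Ovm3))); // true
    }
}
```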
CLOUDSTACK-8981 coded a more obscure host and a clear failure message
The test fails when the port is reachable, so prevent that as much as possible, making sure that all kinds of developer environments can work with it.
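A sketch of the idea behind the change, with a hypothetical unroutable host and port: target an address that should never be reachable in any developer environment, and print a clear message if it unexpectedly is.
```
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class UnreachablePortSketch {
    static boolean portReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException expected) {
            return false;
        }
    }

    public static void main(String[] args) {
        // 192.0.2.0/24 (TEST-NET-1) is reserved for documentation and should never route.
        String host = "192.0.2.1";
        if (portReachable(host, 1, 500)) {
            System.err.println("Test precondition broken: " + host + ":1 is reachable in this environment");
        } else {
            System.out.println("As expected, " + host + ":1 is not reachable");
        }
    }
}
```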
* pr/965:
CLOUDSTACK-8981 coded a more obscure host and clear failure message test fails when port is reachable so prevent it as much as possible making sure that all kinds of weird developers can work with it
Signed-off-by: Remi Bergsma <github@remi.nl>