CLOUDSTACK-9231: Root volume migration from one primary to another primary storage within the same cluster is failing
EXPECTED BEHAVIOUR:
====================
Root Volume migration within cluster should work.
ACTUAL BEHAVIOUR:
==================
Root volume migration within cluster failed.
This situation arises when two management servers access the same database.
When a migration request arrives, the command is forwarded from one management server to the other because the host is owned by the second management server. Serialisation of the map between the two servers then fails.
Fix:
===
This is fixed by converting the maps to lists.
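The idea can be sketched in Python as a rough analogy, not as the actual Java change. It assumes the failing map was keyed by objects rather than strings (the message above does not say so), and the `Volume` and `StoragePool` classes are hypothetical stand-ins:

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Volume:  # hypothetical stand-in for a volume descriptor
    uuid: str


@dataclass(frozen=True)
class StoragePool:  # hypothetical stand-in for a target storage pool
    uuid: str


def to_pair_list(volume_to_pool):
    """Flatten a {Volume: StoragePool} map into a list of plain pairs."""
    return [
        {"volume": asdict(vol), "pool": asdict(pool)}
        for vol, pool in volume_to_pool.items()
    ]


def from_pair_list(pairs):
    """Rebuild the map on the receiving side."""
    return {
        Volume(**pair["volume"]): StoragePool(**pair["pool"])
        for pair in pairs
    }


mapping = {Volume("vol-1"): StoragePool("pool-b")}

# A dict keyed by objects cannot be dumped directly: json.dumps raises TypeError.
try:
    json.dumps(mapping)
    direct_ok = True
except TypeError:
    direct_ok = False

# The list form survives the round trip between the two servers.
wire = json.dumps(to_pair_list(mapping))
restored = from_pair_list(json.loads(wire))
```

A list of pairs carries the same information as the map but needs no special handling of keys by the serialiser.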
* pr/1336:
CLOUDSTACK-9231: Root volume migration from one primary to another primary storage within the same cluster is failing
Signed-off-by: Remi Bergsma <github@remi.nl>
Implement an NSX API request execution counter per thread
The NSX plugin has an execution counter to prevent infinite recursion (and, as a result, a stack overflow exception). However, the thread safety of this counter is not as desired. The counter was implemented with an AtomicInteger, which makes it safe for multiple threads to update and read it, but all threads share a single count. The desired property is a separate counter per thread.
This PR addresses that issue.
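The difference between a shared atomic counter and a per-thread counter can be sketched in Python with `threading.local`. This is an illustration of the concept only; the class and method names are hypothetical and are not taken from the NSX plugin:

```python
import threading


class ExecutionCounter:
    """Per-thread call counter (hypothetical name, for illustration only)."""

    def __init__(self, limit):
        self._limit = limit
        self._local = threading.local()  # each thread sees its own attributes

    @property
    def value(self):
        # Threads that never incremented start at 0.
        return getattr(self._local, "count", 0)

    def increment(self):
        self._local.count = self.value + 1
        return self

    def reset(self):
        self._local.count = 0

    def has_reached_limit(self):
        return self.value >= self._limit


counter = ExecutionCounter(limit=5)
seen = {}


def worker(name, calls):
    for _ in range(calls):
        counter.increment()
    seen[name] = counter.value  # only this thread's increments are visible


threads = [
    threading.Thread(target=worker, args=(f"t{i}", i + 1)) for i in range(3)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each worker observes only its own increments, and the main thread's count stays at zero, which is the property a single shared counter cannot provide.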
* pr/1294:
Fix execution counter to support separate counts per thread
Add test to check that each thread has its own execution counter
Signed-off-by: Remi Bergsma <github@remi.nl>
CLOUDSTACK-9230: Remove unnecessary return statement from cloudStack.js
Removed the unnecessary return statement.
The statement is never reached.
* pr/1335:
CLOUDSTACK-9230: Remove unnecessary return statement from cloudStack.js
Signed-off-by: Remi Bergsma <github@remi.nl>
[4.7] Critical VPCVR issues fixed: CLOUDSTACK-9154; CLOUDSTACK-9187; and CLOUDSTACK-9188
This PR applies the same fixes as PR #1259, but against branch 4.7.
Please refer to PR #1259 for the test results and all the comments already made there.
Issues fixed are:
* CLOUDSTACK-9154: rVPC doesn't recover from cleaning up of network garbage collector
* CLOUDSTACK-9187: rVPC routers in Master/Master due to concurrency problem when writing the keepalived.conf
* CLOUDSTACK-9188: NetworkGarbageCollector is not using gc.interval and gc.wait from settings
Those changes have been covered by 2 new tests added to ```smoke/test_vpc_redundant.py```:
* test_04_rvpc_network_garbage_collector_nics
* test_05_rvpc_multi_tiers
The test ```test_04_rvpc_network_garbage_collector_nics``` depends on the global settings for the network.gc.interval and gc.wait. To make the test run quicker, change the settings (default is 600 seconds for each) and restart the Management Server before running the tests. I suggest setting them to 60 seconds.
In addition, the NetworkGarbageCollector was redefining the settings mentioned above instead of reading their values through ConfigDao. Because of that, the settings were not being applied properly and the test was waiting too long to check the VPC routers.
* pr/1277:
CLOUDSTACK-9154 - Sets the pub interface down when all guest nets are gone
CLOUDSTACK-9187 - Makes code ready for something like ethXXXX, if we ever get that far
CLOUDSTACK-9188 - Reads network GC interval and wait from configDao
CLOUDSTACK-9187 - Fixes interface allocation to VRRP instances
CLOUDSTACK-9187 - Adds test to cover multiple nics and nic removal
CLOUDSTACK-9154 - Adds test to cover nics state after GC
CLOUDSTACK-9154 - Returns the guest interface that is marked as added
Signed-off-by: Remi Bergsma <github@remi.nl>
CLOUDSTACK-9240 remove 40GB filesize limit from SSVM scripts
Both createvolume.sh and createtmplt.sh have a hardcoded 40GB limit on the size of the template that gets created. I could not find any justification for that. I am removing the limits because they caused us a huge headache when we tried to create bigger templates and they failed without any useful error.
This closes #1223 (This PR was against master, made a new one against 4.7)
Thanks Syed <syed1.mushtaq@gmail.com>
* pr/1343:
CLOUDSTACK-9240 remove 40GB filesize limit from SSVM scripts
Signed-off-by: Remi Bergsma <github@remi.nl>
Add Health Check Command to NSX plugin
The NSX plugin does not support the HealthCheckCommand. Instead it fakes a PingCommand as a call to the control cluster status API.
However, we have seen in production that the management server will sometimes find the NSX controller to be behind on ping, and that will trigger a HealthCheckCommand, which returns an unsupported command answer.
Once this happens the controller is put into Alert state and will not recover until the management server is restarted.
In addition, during the investigation, there will be a null pointer exception due to the fact that the NSX controllers do not live in a pod.
This PR tries to address those two issues.
* pr/1293:
Implement CheckHealthCommand for NSX controllers
Fix log message that refers to agent, not host
Prevent NullPointerException when host does not belong to a pod
Signed-off-by: Remi Bergsma <github@remi.nl>
NicProfileHelperImpl NullpointerException when ipVO is null
When a VPC has a private gateway, restarting the VPC with **cleanup** fails.
This PR adds a null check and verifies it with an integration test.
```
test_01_vpc_privategw_acl (integration.smoke.test_privategw_acl.TestPrivateGwACL) ... === TestName: test_01_vpc_privategw_acl | Status : SUCCESS ===
ok
test_02_vpc_privategw_static_routes (integration.smoke.test_privategw_acl.TestPrivateGwACL) ... === TestName: test_02_vpc_privategw_static_routes | Status : SUCCESS ===
ok
test_03_vpc_privategw_restart_vpc_cleanup (integration.smoke.test_privategw_acl.TestPrivateGwACL) ... === TestName: test_03_vpc_privategw_restart_vpc_cleanup | Status : SUCCESS ===
ok
test_04_rvpc_privategw_static_routes (integration.smoke.test_privategw_acl.TestPrivateGwACL) ... === TestName: test_04_rvpc_privategw_static_routes | Status : SUCCESS ===
ok
----------------------------------------------------------------------
Ran 4 tests in 2945.055s
OK
```
* pr/1328:
Add integration test for restartVPC with cleanup, and Private Gateway enabled.
Nullpointer Exception in NicProfileHelperImpl
Signed-off-by: Remi Bergsma <github@remi.nl>
CLOUDSTACK-9222 Prevent cloud.log.1 filling up the disk
Delay Compress results in more disk space usage than needed. Since we have copy truncate, we don't need it.
* pr/1329:
CLOUDSTACK-9222 Prevent cloud.log.1 filling up the disk
Signed-off-by: Remi Bergsma <github@remi.nl>
[4.7] FIX Site2SiteVPN on redundant VPC
This PR:
- fixes the inability to set up more than one Site2Site VPN connection from a VPC
- fixes starting of Site2Site VPN on redundant VPC
- fixes Site2Site VPN state checking on redundant VPC
- improves the vpc_vpn test to allow multiple hypervisors
- adds an integration test for Site2Site VPN on redundant VPC
Tested it on 4.7 single Xen server zone:
command:
```
nosetests --with-marvin --marvin-config=/data/shared/marvin/mct-zone1-xen1.cfg -a tags=advanced,required_hardware=true /tmp/test_vpc_vpn.py
```
results:
```
Test Site 2 Site VPN Across redundant VPCs ... === TestName: test_01_redundant_vpc_site2site_vpn | Status : SUCCESS ===
ok
Test Remote Access VPN in VPC ... === TestName: test_01_vpc_remote_access_vpn | Status : SUCCESS ===
ok
Test Site 2 Site VPN Across VPCs ... === TestName: test_01_vpc_site2site_vpn | Status : SUCCESS ===
ok
----------------------------------------------------------------------
Ran 3 tests in 1490.076s
OK
```
Also performed numerous manual inspections of the state of VPN connections and of connectivity between VPCs.
* pr/1276:
Fix unable to setup more than one Site2Site VPN Connection
FIX S2S VPN rVPC: Check only redundant routers in state MASTER
PEP8 of integration/smoke/test_vpc_vpn
Add S2S VPN test for Redundant VPC
Make integration/smoke/test_vpc_vpn Hypervisor independent
FIX VPN: non-working ipsec commands
Signed-off-by: Remi Bergsma <github@remi.nl>
CLOUDSTACK-9202 Bump ssh timeout for VR commands
It seems the VR needs more time for some of its commands. Until we figure out the root cause, this allows the VRs to start again.
Error seen:
```
2015-12-28 14:35:18,201 ERROR [c.c.n.r.VirtualNetworkApplianceManagerImpl] (Work-Job-Executor-1:ctx-34ff7f80 job-39723/job-39726 ctx-d63de41b) Timed out in waiting SSH execution result
2015-12-28 14:35:18,201 WARN [c.c.n.r.VirtualNetworkApplianceManagerImpl] (Work-Job-Executor-1:ctx-34ff7f80 job-39723/job-39726 ctx-d63de41b) Command: com.cloud.agent.api.Command failed while starting virtual router
2015-12-28 14:35:18,201 INFO [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-1:ctx-34ff7f80 job-39723/job-39726 ctx-d63de41b) The guru did not like the answers so stopping VM[DomainRouter|r-1534-VM]
.Answer":{"result":true,"wait":0}},{"com.cloud.agent.api.Answer":{"result":false,"details":"Timed out in waiting SSH execution result","wait":0}}] }
```
* pr/1291:
CLOUDSTACK-9202 Bump ssh timeout
Signed-off-by: Remi Bergsma <github@remi.nl>
CLOUDSTACK-9181 Prevent syntax error in checkrouter.sh
Added quotes to prevent syntax errors in weird situations.
Error seen in mgt server:
```
2015-12-15 14:30:32,371 DEBUG [c.c.a.m.AgentManagerImpl] (RedundantRouterStatusMonitor-7:ctx-0dd8ef3e) Details from executing class com.cloud.agent.api.CheckRouterCommand: Status: UNKNOWN
/opt/cloud/bin/checkrouter.sh: line 28: [: =: unary operator expected
/opt/cloud/bin/checkrouter.sh: line 31: [: =: unary operator expected
```
Cause:
```
root@r-1191-VM:/opt/cloud/bin# ./checkrouter.sh
./checkrouter.sh: line 28: [: =: unary operator expected
./checkrouter.sh: line 31: [: =: unary operator expected
Status: UNKNOWN
```
Somehow a nic was missing.
After the fix, the script can handle this:
```
root@r-1191-VM:/opt/cloud/bin# ./checkrouter.sh
Status: UNKNOWN
```
The other states are also reported fine:
```
root@r-1191-VM:/opt/cloud/bin# ./checkrouter.sh
Status: MASTER
```
```
root@r-1192-VM:/opt/cloud/bin# ./checkrouter.sh
Status: BACKUP
```
While at it, I also removed the INTERFACES variable/constant as it was only used once and hardcoded the second time. Now both are hardcoded and easier to read.
* pr/1296:
make both check lines consistent
CLOUDSTACK-9181 Prevent syntax error in checkrouter.sh
Signed-off-by: Remi Bergsma <github@remi.nl>
CLOUDSTACK-9204 Do not error when staticroute is already gone
When deleting a static route fails because the route is no longer there (KeyError), the deletion should succeed instead.
Error seen:
```
[INFO] Processing JSON file static_routes.json.1451560145
Traceback (most recent call last):
  File "/opt/cloud/bin/update_config.py", line 140, in <module>
    process_file()
  File "/opt/cloud/bin/update_config.py", line 52, in process_file
    qf.load(None)
  File "/opt/cloud/bin/merge.py", line 258, in load
    proc = updateDataBag(self)
  File "/opt/cloud/bin/merge.py", line 91, in __init__
    self.process()
  File "/opt/cloud/bin/merge.py", line 131, in process
    dbag = self.process_staticroutes(self.db.getDataBag())
  File "/opt/cloud/bin/merge.py", line 179, in process_staticroutes
    return cs_staticroutes.merge(dbag, self.qFile.data)
  File "/opt/cloud/bin/cs_staticroutes.py", line 26, in merge
    del dbag[key]
KeyError: u'192.168.0.3'
```
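A tolerant delete of this kind could look roughly like the sketch below. It is not the actual contents of cs_staticroutes.py: the data bag is assumed to be a plain dict, and the `network` and `revoke` field names are assumptions made for the example.

```python
def merge(dbag, staticroute):
    """Apply one static-route change to the data bag (assumed to be a dict).

    The 'network' and 'revoke' keys are assumed field names, not confirmed
    by the traceback above.
    """
    key = staticroute["network"]
    if staticroute.get("revoke"):
        # pop() with a default is a no-op when the key is already absent,
        # unlike `del dbag[key]`, which raises KeyError.
        dbag.pop(key, None)
    else:
        dbag[key] = staticroute
    return dbag


routes = {"192.168.0.3": {"network": "192.168.0.3", "revoke": False}}
revoke = {"network": "192.168.0.3", "revoke": True}

merge(routes, revoke)  # removes the route
merge(routes, revoke)  # route already gone: succeeds instead of raising
```

Catching `KeyError` around the `del` statement would be an equivalent choice; `pop` with a default simply keeps the delete idempotent in one line.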
* pr/1298:
CLOUDSTACK-9204 Do not error when staticroute is already gone
Signed-off-by: Remi Bergsma <github@remi.nl>
CLOUDSTACK-6485 prevent ip assignment of private gw iface
Prevent the gateway IP address from being assigned to the gateway interface on the VPC router by setting vpcid to null in the network. This was fixed in 4.4 by 1f209ff226a24979cf3a43ce0c02e05c84dd4dc2 and is reimplemented for 4.7.
* pr/1299:
CLOUDSTACK-6485 prevent ip assignment of private gw iface
Signed-off-by: Remi Bergsma <github@remi.nl>
CLOUDSTACK-9192: UpdateVpnCustomerGateway is failing
Reproducible Steps:
================
1. Create a customer gateway for a VPC.
2. Edit it using the UI (the API call is UpdateVpnCustomerGateway).
3. When we try to update the customer VPN gateway while its connection state is not "Error", the API returns an error, but the error is not shown to the user in the UI.
Actual Behaviour:
==============
The API throws an error, but the UI doesn't show it to the user.
Expected Behaviour:
================
The UI should show the error to the user.
Fix:
===
TypeError: json.updatecustomergatewayresponse is undefined
The response name was wrong so corrected it.
It should be json.updatevpncustomergatewayresponse.
Added the error function.
* pr/1300:
CLOUDSTACK-9192: UpdateVpnCustomerGateway is failing
Signed-off-by: Remi Bergsma <github@remi.nl>
CLOUDSTACK-9186: Root admin cannot see VPC created by Domain admin user
Issue:
=====
The root admin cannot see LB rules and Public LB IP addresses created by a domain admin in the UI, and therefore cannot manage them.
Reproducible Steps:
================
Log in as a domain-admin account and create a VPC with the VPC virtual router as the public load balancer provider.
Click on the newly created VPC -> click on the VPC tier -> click Internal LB.
Add an internal LB.
Log off as domain-admin and log in as root admin.
Navigate to the VPC created previously and click Internal LB; the internal LB does not show up.
Follow the same steps for Public LB IP addresses, except select the correct network offering while creating the tier.
Expected Behaviour:
================
Root admin should be able to manage a VPC created by a Domain admin user.
Actual Behaviour:
==============
Root admin cannot see a VPC created by a Domain admin user and hence is not able to manage it.
Fix:
===
Added the parameter listAll=true in case of Internal LB as well as Public LB IP addresses.
* pr/1301:
CLOUDSTACK-9186: Root admin cannot see VPC created by Domain admin user
Signed-off-by: Remi Bergsma <github@remi.nl>
[4.7] ADD Force UDP encapsulation option to Site2Site VPN
This PR adds the option to enable forced UDP encapsulation of ESP packets during the setup of a Site2Site VPN. This option enforces the 'forceencaps' option in the openswan ipsec config:
https://wiki.strongswan.org/projects/strongswan/wiki/ConnSection
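As a rough illustration of what the router-side change amounts to, the sketch below renders a minimal `conn` section with `forceencaps` toggled by a flag. The function name, the connection name and the addresses are hypothetical; this is not the router's actual ipsec config method, and a real connection needs many more parameters.

```python
def render_conn(name, left, right, force_encaps=False):
    """Render a minimal ipsec 'conn' section (illustrative subset only)."""
    lines = [
        f"conn {name}",
        f"  left={left}",
        f"  right={right}",
        # forceencaps takes yes/no; 'yes' forces UDP encapsulation of ESP
        # packets even when no NAT is detected.
        f"  forceencaps={'yes' if force_encaps else 'no'}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical gateway addresses from the documentation ranges.
config = render_conn("vpn-example", "203.0.113.10", "198.51.100.20",
                     force_encaps=True)
```

Keeping the flag as a boolean in the API and DB layers and translating it to `yes`/`no` only when the config is written keeps the stored value independent of the config syntax.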
* pr/1317:
[UI] MADNESS
[DB] Add force_encap field to s2s_customer_gateway table
[ROUTER] Add forceencaps field to python router ipsec config method
[TEST] unittest needs rework
[MARVIN] Add forceencap field to VpnCustomerGateway class in marvin base
[CORE] Add Force UDP Encapsulation option to Site2Site VPN
Signed-off-by: Remi Bergsma <github@remi.nl>
CLOUDSTACK-9221 Allow admin to see user VMs on port forwarding page
In commit a902443708ee10acb9f68fff74af346a6a9fb370, 'listAll=true' was removed. In some places the domainid and accountid were added instead, but not in these. I have added them now.
It's either doing this or re-adding listAll=true. I've seen other folks doing that, so let's see what performs best.
* pr/1325:
Admin cannot see VMs on port forwarding page
Signed-off-by: Remi Bergsma <github@remi.nl>
CLOUDSTACK-9220 Sort list of domains on Domain tab in UI
The list of domains was unsorted and that annoyed me so I sorted it :-)
* pr/1327:
CLOUDSTACK-9220 Sort list of domains on Domain tab in UI
Signed-off-by: Remi Bergsma <github@remi.nl>
Fix mariadb related listCapacity bug (CLOUDSTACK-8966)
Type bigint(20) with type varchar does not work well on MariaDB,
so this forces it to type decimal.
* pr/1314:
Fix mariadb related listCapacity bug (CLOUDSTACK-8966)
Signed-off-by: Remi Bergsma <github@remi.nl>
CLOUDSTACK-9213 - As a user I want to be able to use multiple IPs/CIDRs in an ACL
This PR fixes a problem with iptables when creating ACL items using a comma-separated list of CIDRs. Please refer to the details in the Jira issue.
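Splitting such a list on commas can be sketched as follows. This is a minimal illustration, not the router code from the PR; the function name is hypothetical, and validating each entry with the standard `ipaddress` module is an addition made for the example.

```python
import ipaddress


def split_cidrs(cidr_list):
    """Split a comma-separated CIDR string into normalised CIDR strings."""
    cidrs = []
    for item in cidr_list.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray commas and surrounding whitespace
        # strict=False accepts host bits (e.g. 10.0.0.1/24) and normalises
        # them to the network address; invalid input raises ValueError.
        cidrs.append(str(ipaddress.ip_network(item, strict=False)))
    return cidrs


sources = split_cidrs("10.0.0.0/24, 192.168.1.5/32,")
```

Each resulting entry can then be turned into its own rule, rather than passing the whole comma-separated string through as a single source.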
* pr/1311:
CLOUDSTACK-9213 - Split the ACL rules using comma instead of dash.
CLOUDSTACK-9213 - Formatting the code
Signed-off-by: Remi Bergsma <github@remi.nl>