[BLOCKER] Combined PRs that fix VR issuesTonight I worked with @wilderrodrigues to figure out what is wrong with the virtual router. As we couldn't test single PRs any more (because of other issues with them causing tests to fail) we added all VR related PRs in a separate branch and started testing from there.
We combined the following PRs into this PR:
#836#851#867#870#881#882#842
After that, one issue remains: the VPC does not get a default gateway. Which is strange, because we already solved it in PR #738. When I look back, it was fixed again in PR #784. It could very well be that either one fixed one specific case, but also breaking the other. We need to investigate this, and make sure there will be a fix that works both for VPCs and VRs.
When we manually add the default gateway on the VPC, most tests pass and also spinning up two VPCs with one tier each, having a VM and them using s2s to VPN them together works fine. See for more details the report Wilder sent earlier.
Tomorrow we'll try to figure out how to fix the default gateway and merge this. Then we should have a base to work from again. Any PR that fixes another blocker, should at least then be rebased against the fixed master so we can run the tests against the PR branch. I'm not saying everything is fixed, I'm just saying that we can spin up a cloud that has working VMs.
When, in the mean time, someone has the time to checkout this branch and make the default route work for both VPC and VR that would be awesome. After that we should double check and verify the test results.
Pinging @karuturi to let her know the current status.
Regards,
Wilder / Remi
* pr/887:
Fixing the index out of bounds error in the check_if_link_up() function
small cleanups
Fixing the defaut route for VPC routers
Formatting the get_gateway() method in the CsDatabag.py file
Fixing the dhcpsrvr iptables file
Formatting the router_proxy.sh script
CLOUDSTACK-8881: Fixed Static and PF configuration issue
CLOUDSTACK-8905: Fixed hooking egress rules
CLOUDSTACK-8891: Fixed default iptables rules on VR for guest traffic
Configured dnsmasq to listen on all interfaces so that vpn client gets dns
CLOUDSTACK-8864: Not able to add TCP port forwarding rule in VPN for specific ports
CLOUDSTACK-8863: VM doesn't reconnect to internet post VR RESTART/STOP-START/RECREATE
CLOUDSTACK-8843: Fixed issue in default iptables rules on shared network VR
Signed-off-by: Remi Bergsma <github@remi.nl>
- Instead of changing the router type in a local variable, lets have a dedicated file for the dhcpsrvr routers
- The file is called iptables-dhcpsrvr, just like we have iptables-vpcrouter and iptables-router
This reverts commit 6841ba61da5e407f7a16c4a575d1a4e8c8345970, reversing
changes made to 13b29bac5a1778e295df7e9fb21c502fcf017183.
Master is currently frozen, no merges without RM approval.
http://mail-archives.apache.org/mod_mbox/cloudstack-dev/201509.mbox/browser
It also broke the build:
[INFO] Apache CloudStack Framework - Jobs ................ SUCCESS [3.448s]
[INFO] Apache CloudStack Cloud Engine Internal Components API SUCCESS [2.528s]
[INFO] Apache CloudStack Server .......................... FAILURE [24.769s]
[INFO] Apache CloudStack Usage Server .................... SKIPPED
Use java.io.tmpdir instead of hardcoded /tmpSmall fix to have the tests also work on other platforms
* pr/884:
Use java.io.tmpdir instead of hardcoded /tmp
Signed-off-by: Wido den Hollander <wido@widodh.nl>
CLOUDSTACK-8843: Fixed issue in default iptables rules on shared network VROn basic zone share network VR default iptables rules are not applied correctly. Due to this ssh to VR got failed.
In shared network the VR type is 'dhcpsrvr' not router. So corrected it in the ''del_standard' method to select the correct type.
Testing:
1. VR is deployed correctly.
2. Tested restart, stop, start VR.
3. New VM deployment is success.
4. ssh to VR from the host is successful.
5. iptables rules on the VR came up correctly.
below is the output from the VR:
iptables -L INPUT -nv
Chain INPUT (policy DROP 16 packets, 1056 bytes)
pkts bytes target prot opt in out source destination
0 0 ACCEPT all -- * * 0.0.0.0/0 224.0.0.18
0 0 ACCEPT all -- * * 0.0.0.0/0 225.0.0.50
104 9800 ACCEPT all -- eth0 * 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED
281 36500 ACCEPT all -- eth1 * 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED
0 0 ACCEPT all -- eth2 * 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED
6 504 ACCEPT icmp -- * * 0.0.0.0/0 0.0.0.0/0
0 0 ACCEPT all -- lo * 0.0.0.0/0 0.0.0.0/0
2 656 ACCEPT udp -- eth0 * 0.0.0.0/0 0.0.0.0/0 udp dpt:67
13 780 ACCEPT tcp -- eth1 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:3922 state NEW,ESTABLISHED
0 0 ACCEPT tcp -- eth0 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:80 state NEW
0 0 ACCEPT tcp -- eth0 * 10.147.40.0/23 0.0.0.0/0 state NEW tcp dpt:8080
* pr/842:
CLOUDSTACK-8843: Fixed issue in default iptables rules on shared network VR
Signed-off-by: Remi Bergsma <github@remi.nl>
CLOUDSTACK-8881: Fixed Static and PF configuration issue1. For static nat filter rules are not configured in VR.
2. Corrected vm ip in PF rule.
* pr/882:
CLOUDSTACK-8881: Fixed Static and PF configuration issue
Signed-off-by: Remi Bergsma <github@remi.nl>
Configured dnsmasq to listen on all interfaces so that vpn client gets dns1. Dnsmasq is not listening on the ppp+ interfaces due to this remote access vpn clients dns requests are dropped.
2. Configured the dnsmasq to listen on all the interfaces except public. There is firewall to allow only specific cidr to allow the dns requests.
Tested from windows client nslookup.
* pr/870:
Configured dnsmasq to listen on all interfaces so that vpn client gets dns
Signed-off-by: Remi Bergsma <github@remi.nl>
CLOUDSTACK-8891: Fixed default iptables rules on VR for guest trafficVR default iptables rules in INPUT chain are configured partially.
In CsAddress.py rules are configured while configuring public interface, guest interface post configuration is missed. Fixed to configure guest post configuration so that iptables rules are configured.
Testing:
1. Deployed vm in the network.
2.iptables rules on the VR configured correctly.
3.VM got the dhcp ip address from the VR.
* pr/867:
CLOUDSTACK-8891: Fixed default iptables rules on VR for guest traffic
Signed-off-by: Remi Bergsma <github@remi.nl>
CLOUDSTACK-8864: Not able to add TCP port forwarding rule in VPN for specific ports
Setting port forwarding rules for port 500,1701 and 4500 after enabling VPN, gives the error message "The range specified, xxxx, conflicts with rule xxxx which has xxxx." This happens because the rules added for vpn doesn't have a matching condition to allow port forwarding rules.
Added a unit test to verify the detectRulesConflict function of FirewallManagerImpl.
* pr/851:
CLOUDSTACK-8864: Not able to add TCP port forwarding rule in VPN for specific ports
Signed-off-by: Remi Bergsma <github@remi.nl>
CLOUDSTACK-8863: VM doesn't reconnect to internet post VR RESTART/STOP-START/RECREATE
The ongoing ICMP request reply session is broken when the VR is down, the expectation is that it would resume once the VR is up. Investigations revealed that the ongoing ICMP packets are sent out of eth2 without being NATed post VR stop/start or restart or recreate.
TCPDUMP output from VR post restart/stop-start/recreate on eth2:
root@r-4-VM:~# tcpdump -i eth2 icmp -n -vvv
tcpdump: listening on eth2, link-type EN10MB (Ethernet), capture size 65535 bytes
06:22:52.749770 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto ICMP (1), length 84)
192.168.200.67 > 173.194.33.163: ICMP echo request, id 30996, seq 81, length 64
06:22:53.749782 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto ICMP (1), length 84)
192.168.200.67 > 173.194.33.163: ICMP echo request, id 30996, seq 82, length 64
06:22:54.749771 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto ICMP (1), length 84)
192.168.200.67 > 173.194.33.163: ICMP echo request, id 30996, seq 83, length 64
06:22:55.749775 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto ICMP (1), length 84)
192.168.200.67 > 173.194.33.163: ICMP echo request, id 30996, seq 84, length 64
06:22:56.749765 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto ICMP (1), length 84)
192.168.200.67 > 173.194.33.163: ICMP echo request, id 30996, seq 85, length 64
06:22:57.749776 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto ICMP (1), length 84)
192.168.200.67 > 173.194.33.163: ICMP echo request, id 30996, seq 86, length 64
^C
6 packets captured
6 packets received by filter
0 packets dropped by kernel
root@r-4-VM:~#
root@r-4-VM:~# grep icmp /proc/net/ip_conntrack
icmp 1 29 src=192.168.200.67 dst=173.194.33.163 type=8 code=0 id=30996 [UNREPLIED] src=173.194.33.163 dst=192.168.200.67 type=0 code=0 id=30996 mark=0 use=2
This get fixed after flushing the conntrack table.
Screenshots:
Before fix (ping session doesn't resume, stop and starting the ping works, 120 packets lost):

After fix(ping session resumes, 27 packets lost):

* pr/836:
CLOUDSTACK-8863: VM doesn't reconnect to internet post VR RESTART/STOP-START/RECREATE
Signed-off-by: Remi Bergsma <github@remi.nl>
Fixed box location on vagrant files for devcloud4 (CLOUDSTACK-8898)The centos-6.5 is no longer available
* pr/875:
Added fix to binary installation vagrant files (CLOUDSTACK-8898)
Fixed box location on vagrant files
Signed-off-by: Rajani Karuturi <rajani.karuturi@citrix.com>
[4.6][BLOCKER]CLOUDSTACK-8890: Added isEmpty() check to prevent nullPointerException.Check if the list is empty before trying to get the first entry. If the list is empty, in example when dealing with projects, it will user the caller user id.
Tests to verify working order:
1. Deploy ACS
2. Create project
3. Create resource in project -> Should succeed!
* pr/878:
Added isEmpty() check to prevent nullPointerException.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
[4.6][BLOCKER]CLOUDSTACK-8763: Resolved POD/ZONE deletion failure.Instead of having both checkIfPodIsDeletable() and checkIfZoneIsDeletable have there own SQL query, I've refactored them so they use DAO SQL Queries.
This resolves the SQL Exception thrown by both classes.
Test to confirm working order:
- deploy ACS
- Add zones / pods. -> Try to delete without resources. -> Direct success.
- Add resources to zones / pods. -> Try to delete with resources in the pod / zone. -> Correct exception thrown. (Error message is why it cannot remove the zone / pod. IE: There is still a hosts or vm or .... )
* pr/845:
Added unit tests for checkIfPodIsDeletable() and checkIfZoneIsDeletable().
Updated Dao classes with correct field names.
Refactored checkIfZoneIsDeletable().
Added findByDc(long dcId) to VolumeDao and VolumeDaoImpl.
Added countIPs(long dcId, boolean onlyCountAllocated) to IPAddressDao and IPAddressDaoImpl.
Added countIPs(long dcId, boolean onlyCountAllocated) to DataCenterIpAddressDao and DataCenterIpAddressDaoImpl.
Refactored checkIfPodIsDeletable().
Added findByPodId(Long podId) to HostDao and HostDaoImpl.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
CLOUDSTACK-8726 : Automation for Quickly attaching multiple data disks to a new VMAttach multiple Volumes simultaneously to a Running VM ... === TestName: test_attach_multiple_volumes | Status : SUCCESS ===
ok
----------------------------------------------------------------------
Ran 1 test in 196.931s
OK
* pr/683:
changed the testcase skip code into setup method
Imparting changes mentioned by nitt10prashant
Automation for multiple disk attachments to instance
Signed-off-by: sanjeev <sanjeev@apache.org>
CLOUDSTACK-8893: Fixing script as per the latest functionalityPlease check https://issues.apache.org/jira/browse/CLOUDSTACK-8893 for more details.
* pr/871:
Modified test description
CLOUDSTACK-8893: Fixing script as per the latest functionality
Signed-off-by: sanjeev <sanjeev@apache.org>
[4.6][BLOCKER]CLOUDSTACK-8883: Resolved connect/reconnect issue.Hi!
@wilderrodrigues by implementing Callable you switched a couple of methods and fields. I switched them some more!
The reason why the Agent wouldn't reconnect was due to two facts.
Problem 1: Selector was blocking.
In the while loop at [1] _selector.select(); was blocking when the connection was lost. This means at [2] _isStartup = false; was never excecuted. Therefore at [3] the call to isStartup() always returned true resulting in an infinite loop.
Resolution 1: Move the call to cleanUp() [4] before checking if isStartup() has turned to false. cleanUp() will close() the _selector resulting in _isStartup to be set to false.
Problem 2: Setting _isStartup & _isRunning to true when init() throwed an unchecked exception (ConnectException).
The exception was nicely caught, but only logged. No action was taken! Resulting in _isStartup & _isRunning being set to true. Resulting in the fact the Agent thought it was connected successfully, though it wasn't.
Resolution 2: Adding return to the catch statement [5]. This way _isStartup & _isRunning aren't set to true.
Steps to test:
1. Deploy ACS.
2. Try all combinations of stopping/starting managment server/agent.
[1]b34f86c8d5/utils/src/main/java/com/cloud/utils/nio/NioConnection.java (L128)
[2]b34f86c8d5/utils/src/main/java/com/cloud/utils/nio/NioConnection.java (L176)
[3]b34f86c8d5/agent/src/com/cloud/agent/Agent.java (L404)
[4]b34f86c8d5/agent/src/com/cloud/agent/Agent.java (L399)
[5]b34f86c8d5/utils/src/main/java/com/cloud/utils/nio/NioConnection.java (L91)
* pr/863:
Added return statement to stop start() if there has been an ConnectException.
Call cleanUp() before looping isStartup().
Signed-off-by: Rajani Karuturi <rajani.karuturi@citrix.com>
CLOUDSTACK-8826: XenServer - Use device id passed as part of attach volume API properly
If device id passed as part of API and available then use it otherwise fallback on XS to automatically assign one.
For ISO device id used is 3 and it is processed before any other entry to avoid conflict.
Signed-off-by: Koushik Das <koushik@apache.org>
CLOUDSTACK-8851 Redundant VR getting started in the same cluster or hwe are not populating the deployment destination of the previous rvr in the avoid set of the rvr that is being created. This was resulting in both the rvrs getting deployed to same host sometimes even when it
could have gone to a different one.
Now we are updating the avoids set of the deployment plan to fix this issue.
* pr/839:
CLOUDSTACK-8851 Redundant VR getting started in the same cluster or host even when there are suitable hosts available
Signed-off-by: Remi Bergsma <github@remi.nl>
If device id passed as part of API and available then use it otherwise fallback on XS to automatically assign one.
For ISO device id used is 3 and it is processed before any other entry to avoid conflict.
CLOUDSTACK-8840: Systemd service for the Usage ServerThere already was a uncompleted systemd service file for the Usage
Server.
This new one replaces sysvinit and the old systemd service file.
* pr/820:
CLOUDSTACK-8840: Do not include old systemd wrapper
CLOUDSTACK-8840: Fix the source path of the service file
CLOUDSTACK-8840: Systemd service for the Usage Server
Signed-off-by: Wido den Hollander <wido@widodh.nl>