This feature enables adding of guest ip ranges (public ips) form different subnets.
In order to provide the dhcp service to a different subnet we create an ipalias on the router. This allows the router to listen to the dhcp request from the guest vms and respond accordingly. Every time a vm is deployed in the new subnet we configure an ip alias on the router. Cloudstack uses dnsmasq to provide dhcp service. We need to configure the dnsmasq to issue ips on the new subnets. Added a new class dnsmasqconfigurator which generates the dnsmasq confg file, this file replaces the old config in the router.
The details of the alias ips are stored in db in the nic_ip_alias table. Every time a new subnet is added one of the ip from the subnet is used to configure the ip alias.
I have pushed the code to https://github.com/bvbharatk/cloud-stack/tree/Cloudstack-702 , also rebased the code with master.
I need to test the code for advanced sg enabled network using kvm.
I have added the unit test
Marvin tests are at https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;h=53e4965
Also accomodated some of the changes suggested by koushik.
corrected the import statements. renamed the IpAlias command to createIpAlias command.
This feature supports only ipv4
This signal will force the dnsmasq daemon to reload the configuration directly. This is much faster than restarting the daemon, which result in a much smaller window during which no dns server is available.
Tested by using the replaced version of edithosts.sh on a running vrouter causing dns problems.
The interval between keepalived.ts and keepalived.ts2 should be >= 60 seconds in
normal condition, because every 10 seconds keepalived.ts would be updated, and
at least every 60 seconds, keepalived.ts would be copy to keepalived.ts2.
If the interval is less than 60 seconds, then keepalived process failed to
update keepalived.ts every 10 seconds.
Take some delay of updating into consideration, check_heartbeat.sh would use 30
seconds as a way to tell keepalived process is alive or not.
In the previous version it would take at least 1 seconds for arping, and it
would be big if the VR has more than 30 IPs - our biglock default timeout is 30
seconds.
Fix it by send out two arping immediately, and then sleep 1 second for router to
update arp cache.
commands send to virtual router, instead of keeping silence.
Test:
Before change:
(1) Acquire IP. always in "Allocating" state.
(2) EnableStaticNat, the result is success(it is incorrect).
(3) DisableStaticNat, will get error message.. This is correct.
(4) Add Firewalls. always in "Adding" state.
(5) The AgentManager report statistics every 60 minutes(normally it
should be router.stats.interval=5 minutes).
After change:
(1) Acquire IP, will get error message.
(2) EnableStaticNat, will get error message.
(3) DisableStaticNat, will get error message.
(4) Add Firewalls, will get error message. But the firewall rules are
saved in database.
(5) The AgentManager report statistics every 5 minutes, except the
network with read-only FS virtual router.
If something got wrong with passwd_server_ip script, it would output to
keepalived.log, thus cause other scripts malfunctional.
Also make savepassword.sh using the same lock as serve_password.sh.
The already deleted same hostname is not deleted from /etc/hosts of
vRouter.
vRouter's /etc/hosts format:
$ip $host
This patch fixes deletion logic below.
sed -i /"$host "/d $HOSTS
Signed-off-by: Prasanna Santhanam <tsp@apache.org>
This fix would work because:
1. When booting up the router, there is possible that no ip information have
been set for the interface(CS would do it after confirm router is up), so the
interface isn't associate with any ip, then ifconfig cannot work. We have to use
ifup, this is especially true for the first router become master.
2. After booting up phase, the ip would be associated with interfaces, then we
can use ifconfig to bring them up.
Also added license header for passwd_server_ip
Ported from:
commit 1072ec7ae36911ed794c182a1146025a0e969ea9
Author: Sheng Yang <sheng.yang@citrix.com>
Date: Wed Sep 12 11:15:33 2012 -0700
CS-16318: Update the fix with some tweak
1. The old fix run cloud-passwd-srvr twice because cloud-passwd-srvr is
still in the list of enabled_svcs
2. The lock should be applied on serve_password.sh, which controlled the
accessing to the password. Applied on the MASTER/BACKUP switch is useless, two
instance of serve_password.sh would still able to access the password file at
the same time.
3. Password service is a part of redundant router state transition process
now, so if the service failed to start, then the transition failed.
4. Restart password service should be put before restart dnsmasq, which
would sent out DHCP offer to the user vms. If user VMs got the DHCP offer first
but failed to get password, there would be an issue.
Reviewed-by: Anthony Xu
commit fa94da114099da357df7daa1aad3c327868393ca
Author: Jayapal Reddy <jayapalreddy.uradi@citrix.com>
Date: Wed Sep 12 17:57:03 2012 +0530
Bug:CS-16318 Starting password server on the both IPs in RRVM
Reviewed-by: Abhi
Conflicts:
patches/systemvm/debian/config/opt/cloud/bin/passwd_server
Signed-off-by: Chip Childers <chip.childers@gmail.com>
I've assumed that Gavin's commit is appropriate, based
on an assumption that we will keep these files in the source
tree. If https://issues.apache.org/jira/browse/LEGAL-146
results in a different opionion from the members, then we
will end up having to do something more drastic anyway.
We need to use ifup/ifdown to bring up the interfaces, because ifconfig don't
know the ip of the interface after we modify cloud-early-config to avoid
first start up of public interface.
Reviewed-by: Edison
The routing table with two nics may be messed up, due to we sent same
router(gateway) information from different DHCP server, in order to specify
default gateway. E.g.
Network A: 192.168.1.0/24, gw 192.168.1.1
Network B: 192.168.2.0/24, gw 192.168.2.1
User VM: Nic 1 connect to network A, get ip 192.168.1.10; nic 2 connect to
network B, get ip 192.168.2.10.
Set network A as the default network of user VM.
Currently we would send this information to user VM through DHCP offer:
In network A: dhcp-option:router 192.168.1.1
In network B: dhcp-option:router 192.168.1.1
So both NIC in the guest VM would receive 192.168.1.1 as router(gateway).
But, in CentOS 5.6, dhclient-scripts try to tell if the gateway is reachable
for current subnet.
So when we try to enable nic 2(eth1) of user VM, dhclient would receive:
IP: 192.168.2.10
Mask: 255.255.255.0
Router: 192.168.1.1
Then it would found that the specified gateway(router) is not within its own
subnet(192.168.2.0/24). But since we send out this ip(192.168.1.1) as the
gateway for it, dhclient thought that it should got someway to access the
network through this IP. So it would execute:
ip route add 192.168.1.1 dev eth1
ip route replace default via 192.168.1.1 dev eth1
But it can never reach 192.168.1.1(which is in the eth0's subnet and the
gateway of eth0) by go through eth1 interface. So it is messed up.
We've tested Windows 2008 R2, CentOS 5.3, CentOS 5.6 and Ubuntu 10.04. Windows
and Ubuntu are fine with above policy.
To solve this, we send different dhcp:router option according to the guest OS
type now.
We may need expand this list later, but for now we only know that CentOS and
RHEL would behavior in this way.
status 14042: resolved fixed
Summary of changes:
- Fix the order of source nat ip's : Static Nat IP's will be on top of Router source nat IP's. means Static NAT ip will take higher preference when compare to router ip while picking ip for source nat.
Reviewed-by: Abhi
Summary of changes: Added Hairpin Nat.
- defined Harpin NAT function.
- Called Hairpin NAT while adding/deleting port forwading and Static NAT rules.
- added rules in IPtables config file, this will be iniated during bootup to forward New/established connectons from eth0 to eth0.
Summary of Changes: Using multiple routing tables to send the packets on the public NIC's based on source IP for the following type of connections:
- Inbound connections of Static NAT ip .
- Outbound connections of static-NAT (using static NAT-ip for SNAT).
The problem is remove_first_ip() in ipassoc.sh can't be called more than one.
The call after the first time would result in iptable and ip command failure,
thus result in failure of execution of IpAssocCommand.
Use the same way to detect already disassociated ip address of non-first
IP(remove_an_ip()) to fix the issue.
reviewed-by: Edison Su
status 13606: resolved fixed
Summary of changes :
- Added a new flag -s to ipassoc command to carry if the ip address is
used for SNAT or not.
- SNAT is completly decoupled from the first flag. first flag is used
to decide if the ip address is first ip address of the interface.
- -s and -f are independent, SNAT can be enabled on the non-first ip
also.
Summary of changes:
- Mutiple routing table for each public interface is added (previously there is only one routing table ). when the packet is send out of public interface corresponding per-interface routing table will be used. per-interface routing table will modified when ever ip/interface added/deleted.
- New parameter is added to ipassoc command to include the default gateway for every interface/ip. prevously it is using only one public interface to send out, default gateway is obtained at the boot up time.
- In the DNAT case. In the revese path(from guest vm to outside, or when DNAT packet receives from the eth0) the public ip/source ip will not be available till POSTROUTING. to overcome this, DNAT connection are marked with routing table number at the time of connection creation, in the reverse path the routing table# from DNAT connection is used to detect per-interface routing table.