On redundant VR setups, the primary resolver handed out to instances is the guest_ip (the primary IP of the VR). This can cause problems on failover, at least until the DHCP lease is renewed: the primary resolver is queried first until it times out, but it is gone after failover.
If Global Setting use_ext_dns is true, we don't want the VR to be the primary resolver at all.
CLOUDSTACK-9348: Optimize NioTest and NioConnection main loop
- Reduces SSL handshake timeout to 15s, previously this was only 10s in
commit debfcdef788ce0d51be06db0ef10f6815f9b563b
- Adds an aggressive explicit wakeup to save the Nio main IO loop/handler from
getting blocked
- Fix NioTest to fail/succeed in about 60s, previously this was 300s
- Due to aggressive wakeup usage, NioTest should complete in less than 5s on most
systems. In virtualized environments this may slightly increase due to thread,
CPU burst/scheduling delays.
/cc @swill please review and merge.
Sorry about the previous values, they were not optimized for virtualized environments. The aggressive selector.wakeup will ensure the main IO loop does not get blocked, even by malicious users and regardless of any timeout (SSL handshake etc.).
* pr/1534:
CLOUDSTACK-9348: Optimize NioTest and NioConnection main loop
Signed-off-by: Will Stevens <williamstevens@gmail.com>
- Reduces SSL handshake timeout to 15s, previously this was only 10s in
commit debfcdef788ce0d51be06db0ef10f6815f9b563b
- Adds an aggressive explicit wakeup to save the Nio main IO loop/handler from
getting blocked
- Fix NioTest to fail/succeed in about 60s, previously this was 300s
- Due to aggressive wakeup usage, NioTest should complete in less than 5s on most
systems. In virtualized environments this may slightly increase due to thread,
CPU burst/scheduling delays.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
MySQLdb has been deprecated and is also not supported in Python 3.
mysql.connector is a connector written in Python which talks the
native MySQL protocol without any external code.
https://dev.mysql.com/doc/connector-python/en/
Update L10N resource files with 4.7 strings from Transifex (20160502)
Force "translator" mode with the transifex client.
cc @swill the new PR.
* pr/1527:
Update L10N resource files with 4.7 strings from Transifex (20160502) Force "translator" mode with the transifex client.
Signed-off-by: Will Stevens <williamstevens@gmail.com>
kvm: Acquire lock when running security group Python script
It could happen that when multiple instances are starting at the same
time on a KVM host, the Agent spawns multiple instances of security_group.py
which both try to modify iptables/ebtables rules.
When that happens, one of the two processes fails.
The instance is still started, but it doesn't have any IP connectivity due
to the failed programming of the security groups.
This modification lets the script acquire an exclusive lock on a file so that
only one instance of the script talks to iptables/ebtables at once.
Other instances of the script which start will poll every 500ms if they can
obtain the lock and otherwise execute anyway after 15 seconds.
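The locking behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code from security_group.py: the function name `acquire_lock`, the lock file path in the usage note, and the choice of `fcntl.flock` are assumptions. Only the 500ms poll interval and the 15 second fallback come from the description.

```python
import fcntl
import time


def acquire_lock(lock_path, timeout=15.0, interval=0.5):
    """Poll for an exclusive lock on lock_path (hypothetical helper).

    Returns (lock_file, locked). When the lock cannot be obtained within
    `timeout` seconds, locked is False and the caller proceeds anyway.
    Closing lock_file releases the lock.
    """
    lock_file = open(lock_path, "w")
    deadline = time.monotonic() + timeout
    while True:
        try:
            # LOCK_NB makes flock fail immediately instead of blocking.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return lock_file, True
        except OSError:
            if time.monotonic() >= deadline:
                return lock_file, False
            time.sleep(interval)
```

A caller might run `lock_file, locked = acquire_lock("/tmp/security_group.lock")` (path is hypothetical), program iptables/ebtables regardless of `locked`, and close `lock_file` afterwards.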
* pr/1408:
kvm: Acquire lock when running security group Python script
Signed-off-by: Will Stevens <williamstevens@gmail.com>
CLOUDSTACK-9348: Use non-blocking SSL handshake in NioConnection/Link
- Uses non-blocking socket config in NioClient and NioServer/NioConnection
- Scalable connectivity from agents and peer clustered-management server
- Replaces blocking SSL handshake code with non-blocking code
- Protects from denial-of-service issues that can degrade mgmt server responsiveness
due to an aggressive/malicious client
- Uses separate executor services for handling connect/accept events
Changes are covered by the NioTest so I did not write a new test; please advise how we can improve this. Further, I tried to invest time in writing a benchmark test to reproduce a degraded server but could not make it deterministic (it sometimes fails and sometimes passes). Review, CI testing and feedback requested /cc @swill @jburwell @DaanHoogland @wido @remibergsma @rafaelweingartner @GabrielBrascher
* pr/1493:
CLOUDSTACK-9348: Use non-blocking SSL handshake
CLOUDSTACK-9348: Unit test to demonstrate denial of service attack
Signed-off-by: Will Stevens <williamstevens@gmail.com>
CLOUDSTACK-9336 surround the execution of baremetal-vr.py with condition
* pr/1463:
CLOUDSTACK-9336 surround the execution of baremetal-vr.py with condition
Signed-off-by: Will Stevens <williamstevens@gmail.com>
Quota: consolidated lockable account check to a method. Added unit tests to check lockability of various accounts.
Currently normal user and domain admin accounts are eligible for locking.
* pr/1350:
Quota: consolidated lockable account check to a method. Added unit tests to check lockability of various accounts
Signed-off-by: Will Stevens <williamstevens@gmail.com>
CLOUDSTACK-8302: Removing snapshots on RBD
Snapshot removing implemented if primary datastore is RBD
https://issues.apache.org/jira/browse/CLOUDSTACK-8302
* pr/1230:
CLOUDSTACK-8302 - Cleanup snapshot on KVM with RBD
Snapshot removing implemented on RBD.
1. On management side: when a new snapshot is created, we check whether the primary storage is RBD; if so, we do not remove the record from cloud.snapshot_store_ref that links to the Ceph image via the 'install_path' field.
2. On management side: when removing a snapshot, also send the 'DeleteCommand' command to the agent.
3. On agent side: implemented the method 'public Answer deleteSnapshot(final DeleteCommand cmd)'.
Signed-off-by: Will Stevens <williamstevens@gmail.com>
CLOUDSTACK-9305: Cloudstack Usage Breaks if DB HA enabled
With DB HA enabled in db.properties, the cloudstack-usage service restarts every 10 seconds. Making the suggested change has fixed it for me. Cloudstack 4.8 on Centos7
* pr/1433:
Cloudstack Usage Breaks if DB HA enabled
Signed-off-by: Will Stevens <williamstevens@gmail.com>
Set default networkDomain to empty instead of username
The 10th field of `createUserAccount` is `networkDomain` (See `AccountService.java`) and it is set to a var named `admin`, which is the user name.
So, the first user that is created in a domain that links to LDAP, creates the account within the domain, and sets the `networkDomain` field to the username. All next users are created in the same account.
Then we have the situation that in domain SBP we have a user `rbergsma` that logs in first, gets an account created and then (unless you override) all VMs started in the SBP domain will have network domain `rbergsma`. That is highly confusing and not what it should be.
The `linkDomainToLdap` api call has no `networkDomain` field, so I propose to make this field empty (set it to null). It's a string and null / empty is allowed.
One can also specify the networkDomain when creating a VPC and also there it is allowed to be null.
When the networkDomain is needed (and is not set in the domain and not in the VPC) it is constructed by using `guest.domain.suffix` so there always is a networkDomain to be used.
It makes more sense to manually set it on a domain level, or specify it on the VPC and in the final case end up with something that is clearly generated (like cs342cloud.local) rather than the username of someone else.
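The fallback described above can be sketched roughly as follows. This is an illustration only: the function name, its parameters, the precedence between the VPC-level and domain-level values, and the exact shape of the generated name are assumptions (the shape is modeled on the `cs342cloud.local` example). Only the null default and the use of `guest.domain.suffix` come from the description.

```python
from typing import Optional


def effective_network_domain(
    vpc_domain: Optional[str],
    domain_level: Optional[str],
    token: str,
    guest_domain_suffix: str,
) -> str:
    """Pick an explicitly configured networkDomain, else generate one.

    `token` stands in for whatever identifier the generated name embeds;
    it is a hypothetical parameter, not a CloudStack field.
    """
    # Assumed precedence: VPC value first, then the domain-level value.
    for candidate in (vpc_domain, domain_level):
        if candidate:  # None and "" both count as "not set"
            return candidate
    # Nothing was set, so fall back to a clearly generated name.
    return "cs" + token + guest_domain_suffix
```

With neither value set, `effective_network_domain(None, None, "342", "cloud.local")` yields a name that is recognisably generated rather than someone's username.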
* pr/1485:
Set default networkDomain to empty instead of username
Signed-off-by: Will Stevens <williamstevens@gmail.com>
Fixes JSON deserialization of cmdInfo (the current process fails with
StringIndexOutOfBoundsException when cmdEventType is the last parameter
in the JSON string).
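To illustrate why position should not matter, here is a minimal sketch (not the patch itself) that reads `cmdEventType` through a real JSON parser instead of string offsets. The helper name and the sample keys and values in the usage note are hypothetical.

```python
import json
from typing import Optional


def cmd_event_type(cmd_info: str) -> Optional[str]:
    """Return the cmdEventType value of a cmdInfo JSON string, or None.

    Parsing the whole document means the result does not depend on
    whether the key is first, in the middle, or last.
    """
    return json.loads(cmd_info).get("cmdEventType")
```

For example, `cmd_event_type('{"id": "1", "cmdEventType": "VM.START"}')` and `cmd_event_type('{"cmdEventType": "VM.START", "id": "1"}')` return the same value.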
If the size directive is used, logrotate will ignore the daily, weekly, monthly,
and yearly directives.
remove cloud-cleanup
This script does not do anything because it fails due to the missing /var/log/cloud directory. Logrotate is used for this functionality.
Worked around some possible runtime exceptions introduced by the
changes added in PR 780. At the points where a null pointer exception
could happen, we added safety checks to avoid it. A specific method was
created to do that, and test cases were created for this new method.
- Uses non-blocking socket config in NioClient and NioServer/NioConnection
- Scalable connectivity from agents and peer clustered-management server
- Replaces blocking SSL handshake code with non-blocking code
- Protects from denial-of-service issues that can degrade mgmt server responsiveness
due to an aggressive/malicious client
- Uses separate executor services for handling ssl handshakes
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
The NioConnection uses blocking handlers for various events such as connect,
accept, read, write. In case a client connects to NioServer (used by
agent mgr to service agents on port 8250) but fails to participate in the SSL
handshake or just sits idle, this would block the main IO/selector loop in
NioConnection. Such a client could be either malicious or aggressive.
This unit test demonstrates such a malicious client that can perform a
denial-of-service attack on NioServer that prevents it from serving any other client.
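The general idea of keeping a selector loop responsive while one peer sits idle can be sketched in Python with the standard `selectors` module. This is only an analogue for illustration, not the Java NioConnection code and not an SSL handshake: the `serve` function, its parameters, and the echo behaviour are assumptions.

```python
import selectors
import socket
import time


def serve(listener, idle_timeout=15.0, run_for=1.0):
    """Echo one message per client without blocking on any single peer.

    Clients that send nothing within idle_timeout seconds are dropped.
    Returns the number of clients that were served.
    """
    sel = selectors.DefaultSelector()
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ)
    deadlines = {}  # accepted socket -> monotonic deadline
    served = 0
    stop_at = time.monotonic() + run_for

    def drop(conn):
        sel.unregister(conn)
        conn.close()
        deadlines.pop(conn, None)

    while time.monotonic() < stop_at:
        # The short select timeout lets the loop check deadlines even
        # when no socket is ready, so an idle peer cannot stall it.
        for key, _events in sel.select(timeout=0.05):
            if key.fileobj is listener:
                conn, _addr = listener.accept()
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ)
                deadlines[conn] = time.monotonic() + idle_timeout
            else:
                conn = key.fileobj
                try:
                    payload = conn.recv(1024)
                    if payload:
                        conn.sendall(payload)
                        served += 1
                except OSError:
                    pass  # peer went away; just drop it below
                drop(conn)
        now = time.monotonic()
        for conn in [c for c, d in deadlines.items() if now >= d]:
            drop(conn)

    for conn in list(deadlines):
        drop(conn)
    sel.unregister(listener)
    sel.close()
    return served
```

In this sketch an idle client only occupies a dictionary entry until its deadline passes, while other clients continue to be accepted and answered.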
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Bump ssh retries to prevent false positives of test_loadbalance
No more false positives after this change.
```
[root@cs1 integration]# cat /tmp//MarvinLogs/test_loadbalance_TR5RJD/results.txt
Test to create Load balancing rule with source NAT ... === TestName: test_01_create_lb_rule_src_nat | Status : SUCCESS ===
ok
Test to create Load balancing rule with non source NAT ... === TestName: test_02_create_lb_rule_non_nat | Status : SUCCESS ===
ok
Test for assign & removing load balancing rule ... === TestName: test_assign_and_removal_lb | Status : SUCCESS ===
ok
----------------------------------------------------------------------
Ran 3 tests in 930.418s
OK
```
* pr/1473:
bump ssh retries to prevent false positives of test_loadbalance
Signed-off-by: Will Stevens <williamstevens@gmail.com>
CLOUDSTACK-8847: ListServiceOfferings is returning incompatible tagged offerings when called with VM id
When listServiceOfferings is called with a VM id as parameter, it returns incompatible tagged offerings. It should list only compatible tagged offerings. Compatible means the new service offering contains all the tags of the existing service offering (the existing offering's tags are a subset of the new offering's tags). If that is the case, the offering should be listed in the result and the VM can be upgraded to it.
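The compatibility rule above amounts to a subset test. A minimal sketch follows, assuming tags are stored as a comma-separated string; the helper names and the sample offerings in the usage note are hypothetical and not taken from the CloudStack code.

```python
def parse_tags(tags):
    """Split a comma-separated tag string into a set, ignoring blanks."""
    return {t.strip() for t in (tags or "").split(",") if t.strip()}


def is_compatible(current_tags, candidate_tags):
    """True when every tag of the current offering is on the candidate."""
    return parse_tags(current_tags) <= parse_tags(candidate_tags)


def compatible_offerings(current_tags, offerings):
    """Filter (name, tags) pairs down to the compatible offering names."""
    return [name for name, tags in offerings if is_compatible(current_tags, tags)]
```

For a VM whose current offering is tagged `ssd`, an offering tagged `ssd,fast` would be listed, while one tagged only `fast` would not.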
* pr/1321:
CLOUDSTACK-8847: ListServiceOfferings is returning incompatible tagged offerings when called with VM id
Signed-off-by: Will Stevens <williamstevens@gmail.com>
Installing bzip2 since it is required for extracting templates
If you do not install bzip2, then installing templates that are bzip2 compressed will result in Cloudstack not being able to extract the contents and ending up copying the template in compressed form. This will result in VMs not being able to start.
* pr/1490:
Installing bzip2 since it is required for extracting templates.
Signed-off-by: Will Stevens <williamstevens@gmail.com>
[4.7] vmware: improve support for disks
- Improve disk chain usage while attaching, migrating disks
- Gets root disk controller based diskDeviceBusName from volume's chain info
* pr/1365:
vmware: improve support for disks
Signed-off-by: Will Stevens <williamstevens@gmail.com>
CLOUDSTACK-9283: add pid to java arguments in cloudstack-usage.service
cloudstack-usage fails to start, throwing an Integer exception during PID retrieval, and the service keeps restarting after 10s (as defined in the systemd service definition).
Adding the pid to the java arguments in the systemd service definition stops the restart loop on CentOS 7.
* pr/1409:
CLOUDSTACK-9283: add pid to java arguments in systemd/cloudstack-usage.service
Signed-off-by: Will Stevens <williamstevens@gmail.com>
engine/schema: fix upgrade path to work with MySQL 5.7
Found this issue when using MySQL 5.7 with Ubuntu 16.04. The upgrade path fix removes an invalid `IGNORE` param that is now deprecated; in the upgrade path we run the alter statement to add an index only if it does not exist, so we're good.
For MySQL 5.7, we'll also need to update the docs at some point to include `server-id` along with other parameters. Some of the SQL statements used throughout engine/schema don't adhere to SQL 99 standard which is enforced by default in MySQL 5.7, therefore the following sql-mode (for backward compatibility with mysql 5.6 modes) will be necessary for anyone willing to use MySQL 5.7 (until we fix codebase wide raw and generated sql statements to be SQL99 compliant):
sql-mode="STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION,ERROR_FOR_DIVISION_BY_ZERO,NO_ZERO_DATE,NO_ZERO_IN_DATE,NO_ENGINE_SUBSTITUTION"
server-id = 1
innodb_rollback_on_timeout=1
innodb_lock_wait_timeout=600
max_connections=350
log-bin=mysql-bin
binlog-format = 'ROW'
/cc @swill @jburwell @agneya2001 @wido @DaanHoogland and others
* pr/1517:
engine/schema: fix upgrade path to work with MySQL 5.7
Signed-off-by: Will Stevens <williamstevens@gmail.com>
CLOUDSTACK-9323: Fix cancel host maintenance can
Fix cancel host maintenance so that if maintenance is cancelled the host comes back to normal state gracefully.
Added marvin tests for host maintenance.
* pr/1454:
CLOUDSTACK-9323: Fix Cancel maintenance so that if maintenance is cancelled the host comes back to normal state gracefully. Added marvin tests for host maintenance.
Signed-off-by: Will Stevens <williamstevens@gmail.com>
* 4.8:
Removed sleeps and used validateList as requested.
Added required_hardware="false" attr above test_02_root_volume_attach_detach
Modified test_volumes.py to include a hypervisor test for root attach/detach testing
Let hypervisor type KVM and Simulator detach root volumes. Updated test_volumes.py to include a test for detaching and reattaching a root volume from a vm. I also had to update base.py to allow attach_volume to have the parameter deviceid to be passed as needed.
* 4.7:
Removed sleeps and used validateList as requested.
Added required_hardware="false" attr above test_02_root_volume_attach_detach
Modified test_volumes.py to include a hypervisor test for root attach/detach testing
Let hypervisor type KVM and Simulator detach root volumes. Updated test_volumes.py to include a test for detaching and reattaching a root volume from a vm. I also had to update base.py to allow attach_volume to have the parameter deviceid to be passed as needed.
CLOUDSTACK-9349: Enable root disk detach for KVM with new Marvin tests
This PR addresses detaching/attaching ROOT disks from VMs on KVM (CLOUDSTACK-9349). In short, this allows the KVM hypervisor (and the Simulator, which I added as a valid hypervisor for ease of development and testing of Marvin) to detach a root volume and then reattach it using the deviceid=0 flag of the attachVolume API. I have also written a Marvin integration test that verifies this feature works for both KVM and the Simulator.
Below are the Marvin results files of the full test_volumes.py run. All tests pass, including the new root detach/attach, on our KVM lab running with the patches in this PR.
[test_volumes_KIR4G3.zip](https://github.com/apache/cloudstack/files/223799/test_volumes_KIR4G3.zip)
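As a rough illustration of the reattach call, the query string for attachVolume with deviceid=0 could be built as below. This is a sketch only: the helper name is hypothetical, the `id` and `virtualmachineid` parameter names are assumed to be the usual ones for attachVolume, and request signing and authentication are left out.

```python
from urllib.parse import urlencode


def attach_root_volume_query(volume_id, vm_id):
    """Build an unsigned attachVolume query that targets the root slot."""
    params = {
        "command": "attachVolume",
        "id": volume_id,            # volume to attach (assumed name)
        "virtualmachineid": vm_id,  # target VM (assumed name)
        "deviceid": 0,              # 0 marks the volume as the root disk
        "response": "json",
    }
    return urlencode(params)
```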
* pr/1500:
Removed sleeps and used validateList as requested.
Added required_hardware="false" attr above test_02_root_volume_attach_detach
Modified test_volumes.py to include a hypervisor test for root attach/detach testing
Let hypervisor type KVM and Simulator detach root volumes. Updated test_volumes.py to include a test for detaching and reattaching a root volume from a vm. I also had to update base.py to allow attach_volume to have the parameter deviceid to be passed as needed.
Signed-off-by: Will Stevens <williamstevens@gmail.com>
CLOUDSTACK-9194: Making the console popup window resizable in IE to make sure the focus is not lost
https://issues.apache.org/jira/browse/CLOUDSTACK-9194
To test:
Open any VM console in IE, and try resizing the browser window of console
It should be resizable.
* pr/1270:
CLOUDSTACK-9194: Making the console popup window resizable in IE to make sure the focus is not lost.
Signed-off-by: Will Stevens <williamstevens@gmail.com>
Log asynchronous responses in the api log
Currently API responses for synchronous calls are logged, but asynchronous call responses are not. This pull request makes a minor modification that logs the response including the JobId of all asynchronous requests.
As an example, here is what a stopVirtualMachine request looked like in the logs:
2016-04-27 10:43:11,084 INFO [a.c.c.a.ApiServer] (catalina-exec-3:ctx-37d9f693 ctx-d2368de3) (logid:3a0fad97) (userId=2 accountId=2 sessionId=AF8B1F726ACB5C3A637B8B300AA218A7) 10.103.0.207 -- GET command=stopVirtualMachine&id=f63b6fcc-e0b0-480f-8f7a-cba329634ba1&forced=false&response=json&_=1461771791036 200
After this modification, here is what the logs look like:
2016-04-27 13:37:11,338 INFO [a.c.c.a.ApiServer] (catalina-exec-6:ctx-915b5c84 ctx-a03152fa) (logid:66249df0) (userId=2 accountId=2 sessionId=9EF127EED5CA6E74797DFE487D980FAF) 10.103.0.207 -- GET command=stopVirtualMachine&id=f63b6fcc-e0b0-480f-8f7a-cba329634ba1&forced=false&response=json&_=1461782231194 200 {"stopvirtualmachineresponse":{"jobid":"5b9f4a9b-eabe-4fa4-849d-3d004bb65634"}}
* pr/1522:
Log responses from asynchronous api commands
Signed-off-by: Will Stevens <williamstevens@gmail.com>
4.9 mvn version safeupgradeonly
Upgrades Maven dependency versions that can be safely upgraded without breaking console-proxy/crypto usage.
Bisected changes from: https://github.com/apache/cloudstack/pull/1397
cc @swill @DaanHoogland
* pr/1510:
maven: fix dependency version support by JDK7
further maven dependency updates from Daan
framework/quota: fix checkstyle issue
maven: Upgrade dependency versions
Signed-off-by: Will Stevens <williamstevens@gmail.com>