Make sure you've read Part 1 on how two-way devices impact performance and our approach for determining what to test.
Setting up testing data
The second question we asked was "how do we set up our data?" This question matters more than it might appear: testing 100,000 devices in a single load control group is drastically different from testing 100,000 groups of one device each. Both are valid, interesting tests, and they will produce wildly different results and optimizations. We again worked with our customers and their data to profile not just how their data was set up today but also how they planned to expand in the future. Ultimately we created 7,000 groups with varying numbers of devices based on those profiles.
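As a rough illustration of that approach, here is a minimal sketch of turning a size profile into per-group device counts. The bucket shares and size ranges below are invented for the example; the real profiles came from customer data.

```python
import random

# Hypothetical size profile -- illustrative numbers only.
# Each entry: (share of groups, (min devices, max devices)).
PROFILE = [
    (0.60, (1, 5)),     # most groups are small
    (0.30, (6, 50)),    # a sizable middle tier
    (0.10, (51, 500)),  # a few very large load control groups
]

def build_groups(n_groups, profile, seed=42):
    """Assign each test group a device count drawn from the profile."""
    rng = random.Random(seed)
    sizes = []
    for _ in range(n_groups):
        r = rng.random()
        cumulative = 0.0
        lo, hi = profile[-1][1]  # fall through to the last bucket
        for share, (bucket_lo, bucket_hi) in profile:
            cumulative += share
            if r < cumulative:
                lo, hi = bucket_lo, bucket_hi
                break
        sizes.append(rng.randint(lo, hi))
    return sizes

group_sizes = build_groups(7000, PROFILE)
print(len(group_sizes), min(group_sizes), max(group_sizes))
```

Generating the counts this way keeps the mix of tiny and huge groups proportional to what real deployments look like, rather than testing one artificial extreme.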
Just as testing the demand response event creation alone was not enough, simply creating 100,000 devices is not enough. Using our analysis of operations, we identified the objects in our data model that affect performance. We added over 2,000,000 prospective customers, 750,000 legacy one-way devices, and 5,000 weather readings to our test system. The scripts we created will help us scale in the future to 500,000 - 1,000,000 devices.
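A seeding script of this kind can be reduced to a plan of object counts plus a loader. This is a hypothetical sketch, not the actual scripts: `create_object` stands in for whatever API or bulk loader writes the records.

```python
# Hypothetical seed plan mirroring the counts described above.
SEED_PLAN = {
    "prospective_customer": 2_000_000,
    "legacy_one_way_device": 750_000,
    "weather_reading": 5_000,
}

def seed(plan, create_object):
    """Create every object named in the plan; return the total created."""
    total = 0
    for object_type, count in plan.items():
        for i in range(count):
            create_object(object_type, i)  # placeholder for the real loader
            total += 1
    return total

print(sum(SEED_PLAN.values()))  # total records the full plan would create
```

Keeping the counts in a plan rather than hard-coding them is what makes it cheap to later scale the same scripts up toward the 500,000 - 1,000,000 device range.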
It's worth repeating the importance of setting up the test data correctly. There will always be an easier or simpler way to set up the data, but getting this part wrong can completely invalidate the testing. For example, we saw a significant slowdown when iterating over a hash table of a certain type of device group. This type of group was part of the production system but was not being used in the demand response events, so it would have been easy to ignore. However, because we had set up the test data correctly, we caught it, moved from a hash table to a set, and saw a dramatic improvement in our performance.
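The shape of that change, in miniature (the names here are invented for illustration): when a hash table's values are never used, the structure is really just a membership set, and representing it as one makes iteration leaner and set algebra available directly.

```python
# Before: membership tracked as a hash table (dict) whose values were
# never used -- every pass over it iterated key/value pairs for nothing.
group_members_dict = {f"device-{i}": None for i in range(5)}

# After: the same membership information as a set.
group_members = set(group_members_dict)

event_targets = {"device-1", "device-3", "device-99"}

# Devices in this group targeted by a demand response event, via
# set intersection rather than a manual loop over dict entries:
overlap = group_members & event_targets
print(sorted(overlap))
```

The point of the anecdote stands regardless of the data structure details: only realistic test data surfaced a structure that a simplified setup would have left out entirely.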
Since we don't have 100,000 extra IntelliTEMP DirectLink smart thermostats lying around for our performance testing, we built a software simulator to mimic an individual device and a testing harness that allowed us to easily launch thousands of simulators.
We again looked at the data in our production systems to identify the key functionality to simulate. Certainly connecting to our server was important, but we also knew that devices do not stay connected 100% of the time. So we built in a variable disconnect rate and distributed simulators based on the observed behavior of deployed devices. Receiving and acknowledging messages are also core functionality, but not all devices respond immediately. So we built in a variable lag rate and again distributed simulators based on the observed behavior of deployed devices. Lastly, we built functionality for the simulators to push periodic telemetry information to us, along with demand response status messages, in the same manner as our deployed devices.
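The assignment step can be sketched language-agnostically: each simulator draws its behavior from distributions fitted to the deployed fleet. The rates, weights, and field names below are illustrative stand-ins, not the observed values.

```python
import random

# Hypothetical observed distributions (illustrative values only).
DISCONNECT_RATES = [0.0, 0.01, 0.05, 0.20]    # fraction of time offline
DISCONNECT_WEIGHTS = [0.70, 0.15, 0.10, 0.05]
LAG_SECONDS = [0, 2, 10, 60]                  # delay before acking a message
LAG_WEIGHTS = [0.80, 0.10, 0.07, 0.03]

def make_simulator_profile(rng):
    """Draw one simulator's behavior from the observed distributions."""
    return {
        "disconnect_rate": rng.choices(DISCONNECT_RATES, DISCONNECT_WEIGHTS)[0],
        "ack_lag_s": rng.choices(LAG_SECONDS, LAG_WEIGHTS)[0],
        "telemetry_interval_s": 300,  # periodic telemetry push
    }

rng = random.Random(7)
fleet = [make_simulator_profile(rng) for _ in range(10_000)]
print(len(fleet))
```

Sampling per simulator, rather than giving every simulator average behavior, is what preserves the long tail of flaky or slow devices that stresses the system in production.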
To implement the device simulators we turned to the Erlang VM. Erlang's inherent parallelism and ease of scaling made it an ideal choice for running thousands of messaging-based simulators. We developed a test harness, also in Erlang, to launch and monitor the simulators. The harness ran simulators based on details from a configuration file that was generated automatically from our test data and distribution profiles. This gave us tens of thousands of simulated thermostats and load control switches, each with individual characteristics based on observations from our real-world devices.

*Distribution of simulated device attributes*
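The post doesn't show the generated configuration format, but since the harness is Erlang, one plausible shape is a consult-style terms file with one entry per simulator. Every field name and value here is hypothetical:

```erlang
%% Hypothetical simulator entry (field names and values are illustrative).
{simulator, [
    {device_id, "tstat-00421"},
    {credentials, {key_file, "certs/tstat-00421.pem"}},
    {disconnect_rate, 0.05},          %% fraction of time offline
    {ack_lag_seconds, 2},             %% delay before acknowledging commands
    {telemetry_interval_seconds, 300} %% periodic telemetry push
]}.
```

A terms file like this can be read natively by an Erlang harness, which keeps the generated-config-to-simulator pipeline simple.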
Check in tomorrow for Part 3 where we'll cover how we executed our tests and analyzed the output.