[Me]:- Guruji, What is Stress Testing?
[Guruji]:- A stress test is designed to find problems with the stability and robustness of the product, such as access violations (AVs), resource leaks, deadlocks, and other issues or inconsistencies that appear under stress or load conditions.
[Me]:- How do we run stress tests?
[Guruji]:- Most of the time, stress tests run in combination with other diagnostic utilities such as a debugger, page heap, fault injection tools or instrumentation, a performance monitor, and any other tracing and logging tools deemed useful for debugging and analysis. These provide crucial information for developers and testers when they analyze the outcome of a run. Feature validation is an important part of stress testing, though it is sometimes not done as much as it should be.
[Me]:- So what are the good practices for Stress Testing?
[Guruji]:- As a good practice, every stress run should implement feature validation to ensure the system behaves consistently under load. Stress tests should also focus on negative conditions. Systems tend to work fine when they are tested only with valid conditions and tend to become unstable when the error paths are hit, so it is very important that these error conditions are included in the stress scenarios.
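A minimal sketch of such a run follows. It assumes a hypothetical system under test (the `Counter` class) and hypothetical defaults for worker and iteration counts; none of these names come from a real product. Several threads hammer the counter with a mix of valid and invalid inputs, each rejected input is checked on the error path, and the final state is validated against the expected total.

```python
import random
import threading


class Counter:
    """Hypothetical system under test: a lock-protected counter."""

    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0

    def add(self, n):
        # Error path: invalid input must be rejected, not silently applied.
        if not isinstance(n, int) or n < 0:
            raise ValueError("n must be a non-negative int")
        with self._lock:
            self.value += n
            return self.value


def run_stress(workers=8, iterations=2000, seed=0):
    """Run a mixed valid/invalid load and validate the result."""
    sut = Counter()
    failures = []               # list.append is atomic in CPython
    expected = [0] * workers    # each worker writes only its own slot

    def worker(idx):
        rng = random.Random(seed + idx)  # per-worker RNG for repeatability
        for _ in range(iterations):
            if rng.random() < 0.3:
                # Negative condition: this call must raise ValueError.
                try:
                    sut.add(-1)
                    failures.append("invalid input was accepted")
                except ValueError:
                    pass
            else:
                n = rng.randint(0, 5)
                sut.add(n)
                expected[idx] += n

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Feature validation: state after the load must match what was applied.
    if sut.value != sum(expected):
        failures.append(f"expected {sum(expected)}, got {sut.value}")
    return failures, sut.value, sum(expected)


if __name__ == "__main__":
    failures, actual, expected_total = run_stress()
    print("failures:", failures)
    print("final value:", actual, "expected:", expected_total)
```

In a real run the validation step would check whatever invariant the feature promises; the counter total is only a stand-in.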
A stress test needs to be clear on:
· What is the objective of the test? What is it for, and why do you want to run it?
· What is the key primitive/action being targeted, and which code path does it cover?
· How does it relate to release criteria/customer impact? What is the duration of the runs?
· How stable were the runs, and which issues/bugs occurred during the stress: AVs, leaks, deadlocks, data integrity issues, or any other feature inconsistency?
· Not often done, but it is also a good practice to report the performance patterns during the stress, though the perf numbers obtained under stress are tainted by the instrumentation tools used.
· What are the locations/links to dumps, perf logs, trace files …, or simply the offending call stack…
· What is the exact context in which the test is conducted? Single user or multiple concurrent users?
· Perf counters that record resource utilization during the stress run: CPU, disk I/O, memory consumption, network throughput/usage…
· The environment parameters
· What hardware was used, and in what setting were the results obtained? Large customer data sets/data center?
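One way to keep these points from being skipped is to capture them in a single record per run. The sketch below is only an illustration: the `StressReport` class and all of its field names are hypothetical, chosen to mirror the checklist above. The `missing()` helper flags any context field left empty; issues and artifact links are allowed to be empty, since a clean run produces neither.

```python
from dataclasses import dataclass, field


@dataclass
class StressReport:
    """Hypothetical record of one stress run, mirroring the checklist."""

    objective: str = ""             # what the test is for
    target_action: str = ""         # key primitive/action and code path
    release_criteria: str = ""      # link to release criteria/customer impact
    duration_hours: float = 0.0     # how long the run lasted
    concurrent_users: int = 0       # 1 for single user, more for concurrency
    hardware: str = ""              # machine(s) the results came from
    environment: dict = field(default_factory=dict)     # environment parameters
    perf_counters: dict = field(default_factory=dict)   # CPU, disk, memory, network
    issues: list = field(default_factory=list)          # AVs, leaks, deadlocks...
    artifact_links: list = field(default_factory=list)  # dumps, logs, traces

    # Fields that must be filled in for the report to be interpretable.
    REQUIRED = (
        "objective", "target_action", "release_criteria", "duration_hours",
        "concurrent_users", "hardware", "environment", "perf_counters",
    )

    def missing(self):
        """Return the names of required fields that are still empty."""
        return [name for name in self.REQUIRED if not getattr(self, name)]


if __name__ == "__main__":
    report = StressReport(objective="Find leaks in the upload path",
                          duration_hours=12.0, concurrent_users=50)
    print("still to fill in:", report.missing())
```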