Under Stress
How OPNFV Should Act Beyond Breaking Points
BOTTLENECKS
Contents
• Considerations on Stress Testing
• Highlights of Discussion in Testperf
• Test Cases in Danube Release
• Comparison Results for Stress Ping
• What We Should Do Next
Stress Testing == Load Testing?
• Stress testing is a kind of testing that determines the robustness of software by testing beyond the limits of normal operation
  • It tests the system under unfavorable conditions
• Load testing is testing whether the system meets certain standards or requirements
  • There is a need to tune the system to meet them
• Stress Testing != Load Testing
  • Different purpose
  • Different testing conditions
  • Different testing/analyzing methods
https://wiki.opnfv.org/display/bottlenecks/Sress+Testing+over+OPNFV+Platform
Do We Break the System for Fun?
• Apart from providing a PASS or FAIL result, stress testing can also provide a more detailed result: reliability, the probability that the system will survive a given time interval. Reliability can be another indication of the level of confidence in the system (a minimal calculation sketch follows below).
  • Stress testing provides users with a level of confidence in the system.
• It also allows observation of how the system reacts to failures. An additional purpose behind this madness is to make sure that the system fails and recovers gracefully - a quality known as recoverability.
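Below is a minimal, purely illustrative sketch (not part of the Bottlenecks tooling) of how such a reliability figure could be estimated from repeated stress runs; the observations in the example are made up.

```python
# Illustrative sketch only: estimate empirical reliability R(t), i.e. the
# probability that the system survives a given time interval, from repeated
# stress runs. Not part of the Bottlenecks tooling.
from typing import List, Optional

def empirical_reliability(survival_times: List[Optional[float]], interval_s: float) -> float:
    """Fraction of runs in which the system survived the whole interval.

    survival_times holds, per run, the time (in seconds) at which the system
    first failed, or None if it never failed during the observation window.
    """
    survived = sum(1 for t in survival_times if t is None or t >= interval_s)
    return survived / len(survival_times)

# Made-up observations from 5 stress runs, each watched for 1 hour:
print(empirical_reliability([None, 3400.0, None, None, 2900.0], interval_s=3600.0))
# -> 0.6, i.e. an estimated 60% chance of surviving one hour under this load
```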
Questions when Executing Stress Tests
• What is the first thing to crash, and how and why?
• Does it save its state or does it crash suddenly?
• Does it just hang and freeze or does it fail gracefully?
• Could the system/component recover gracefully?
• On restart, is it able to recover from the last good state?
• Does it print out meaningful error messages to the user, or does it merely
display incomprehensible hex codes?
• Is the security of the system compromised because of unexpected failures?
• The list goes on.
Accelerate the Maturity and Adoption of OPNFV
Stress Testing for Danube – Highlights
• Stress testing principles and test cases discussion in Testperf
• https://etherpad.opnfv.org/p/DanubeStressTest
• Test Requirements
1. The stress test should, from a user perspective, be
   • Easy to understand (what the test does and how the system is being stressed)
   • Easy to run (i.e. "out-of-the-box" ... having deployed the OPNFV platform, it should be possible to run the test with minimal effort)
2. Where possible, use existing proven methods, test cases and tools
3. It should be possible to work with all OPNFV release scenarios
4. For Danube the stress test result is not part of our release criteria; however, for future releases a stress test threshold (metric TBD) should be part of the release criteria
5. It should be possible to increase and decrease the load ("stress") and monitor the effect on the system with a simple performance metric
6. The application running on the SUT must be highly optimized to avoid being the bottleneck
Test Cases in Discussion
[Diagram: mind map of the candidate stress test cases. Load category "Data-Plane Traffic" (determine baseline test cases on a VM1/VM2 pair): TC1 - throughput, TC2 - CPU limit. Load category "Life-Cycle Events" (perform VM pairs/stacks testing): TC3 - ping, TC4 - throughput, TC5 - CPU limit. General goals and open questions around the test cases: availability, robustness, confidence, heavy load, easy-to-run, easy-to-understand, release criteria?, 1 hour max?]
Test Cases in Discussion
Category: Data-plane Traffic for a virtual or bare-metal POD
• TC1 – Determine baseline for throughput
  • Initiate one v/b POD and generate traffic
  • Increase the packet size
  • Measure throughput, latency and PKT loss up to x%
• TC2 – Determine baseline for CPU limit
  • Initiate one v/b POD and generate traffic
  • Decrease the packet size
  • Measure CPU usage up to x%
Category: Life-cycle Events for VM pairs/stacks
• TC3 – Perform life-cycle events for ping (see the sketch after this table)
  • Spawn VM pairs, do pings and destroy VM pairs
  • Increase the number of simultaneous live VMs
  • Measure latency and PKT loss up to x%
  • Record testing time, count of failures, OOM killer?
• TC4 – Perform life-cycle events for throughput
  • Spawn VM pairs, generate traffic and destroy VM pairs
  • Serially or in parallel, increase the packet size
  • Measure max load and PKT loss vs. load
  • Measure throughput, latency and PKT loss up to x% for either pair
• TC5 – Perform life-cycle events for CPU limit
  • Spawn VM pairs, generate traffic and destroy VM pairs
  • Serially or in parallel, decrease the packet size
  • Measure the CPU usage up to x%
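As a rough illustration of the "spawn, ping, destroy" pattern behind TC3 (a sketch only, not the actual Bottlenecks/Yardstick implementation; the three helper functions are placeholders for the real OpenStack/Heat operations):

```python
# Illustrative sketch only: one TC3-style life cycle. The helpers below are
# placeholders; a real test (e.g. driven by Yardstick) would call the
# OpenStack/Heat APIs to create and delete the stacks.

def spawn_vm_pair(pair_id: int) -> str:
    return f"pair-{pair_id}"      # placeholder: create a stack with two VMs

def ping_between(pair: str) -> bool:
    return True                   # placeholder: ping VM1 <-> VM2, check latency/PKT loss <= x%

def destroy_vm_pair(pair: str) -> None:
    pass                          # placeholder: delete the stack

def ping_life_cycle(pair_id: int) -> bool:
    """Spawn a VM pair, ping between the VMs, then destroy the pair."""
    pair = spawn_vm_pair(pair_id)
    try:
        return ping_between(pair)
    finally:
        destroy_vm_pair(pair)     # clean up even if the ping step fails
```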
Test Cases in Danube
TC1 – Determine baseline for throughput
• Load Manager, load category "Data-Plane Traffic"
• Preliminary work from Bottlenecks & Yardstick
Test Cases in Danube
TC3 – Perform life-cycle events for ping
Testing flow (Bottlenecks Load Manager + Yardstick, sketched in code below):
• Start with an initial stress test: the Load Manager has Yardstick run several create -> ping -> destroy life cycles in parallel (Type 1: in parallel).
• Criteria check: if all VMs are successfully built, increase the load, iterate, and hand the results over to visualization.
• The test ends when the time runs out or the criteria check fails.
• The load is stepped up over time (t0, t1, ..., t5, t6?), e.g. 5, 10, 20, 50, 100, 200(?) simultaneous stacks, subject to quotas.
• The exact time needed to reach the expected number of VMs cannot be determined in advance.
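A minimal sketch of this flow (again illustrative only; the real flow is driven by the Bottlenecks Load Manager together with Yardstick), reusing ping_life_cycle() from the previous sketch:

```python
# Illustrative sketch only: step the load up, run the life cycles in parallel,
# check the pass criteria and stop when the criteria fail or time runs out.
# ping_life_cycle() is the placeholder from the previous sketch.
import concurrent.futures
import time

def run_step(num_pairs: int) -> int:
    """Run num_pairs life cycles in parallel; return how many succeeded."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=num_pairs) as pool:
        return sum(pool.map(ping_life_cycle, range(num_pairs)))

def stress_ping(loads=(5, 10, 20, 50, 100, 200), time_budget_s=3600):
    start = time.time()
    results = {}                               # handed over to visualization/reporting
    for load in loads:                         # increase the load step by step
        succeeded = run_step(load)
        results[load] = succeeded
        criteria_met = (succeeded == load)     # all VM pairs built and pinged successfully
        out_of_time = (time.time() - start) > time_budget_s
        if not criteria_met or out_of_time:    # "time ends or fail" -> end the test
            break
    return results
```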
Comparison Results for Stress Ping
• Testing Contents
• Execute the stress test and provide comparison results for different installers (Installer A and Installer B)
• Up to 100 stacks for Installer A (Completes the test)
• Up to 40 stacks for Installer B (System fails to complete the test)
• Testing Steps
• Enter the Bottlenecks repo: cd /home/opnfv/bottlenecks
• Prepare the virtual environment: . pre_virt_env.sh
• Execute the ping test case: bottlenecks testcase run posca_factor_ping
• Clean up the virtual environment: . rm_virt_env.sh (a scripted version of these steps is sketched below)
• Video Demo
• https://youtu.be/TPd4NZr__HI
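For repeated runs (e.g. against both installers), the documented steps above can be chained from a small script; the sketch below is illustrative only and simply wraps the exact commands listed in the testing steps.

```python
# Illustrative sketch only: wrap the documented Bottlenecks CLI steps so the
# stress ping can be repeated from a script. The commands and the repo path
# are the ones listed in the testing steps above.
import subprocess

STEPS = [
    ". pre_virt_env.sh",                           # prepare the virtual environment
    "bottlenecks testcase run posca_factor_ping",  # execute the stress ping test case
    ". rm_virt_env.sh",                            # clean up the virtual environment
]

def run_stress_ping(repo: str = "/home/opnfv/bottlenecks") -> int:
    # Chain the steps in one bash invocation so the sourced environment persists.
    cmd = " && ".join(STEPS)
    return subprocess.run(["bash", "-c", cmd], cwd=repo).returncode

if __name__ == "__main__":
    print("exit code:", run_stress_ping())
```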
Comparison Results for Stress Ping
• Testing for Installer A
  • Up to 100 stacks configured for Installer A
  • 1 stack hits an SSH error when the number of stacks is raised to 50
  • When the stack number reaches 100, most of the errors are Heat response timeouts
  • All 100 stacks are established successfully in the end
• Testing for Installer B
  • Up to 40 stacks configured for Installer B
  • When the stack number reaches 30, the system fails to create all the stacks: 21 stacks either fail to create or remain stuck in creation
  • To verify the system performance, we clean up and run the test again
  • When the stack number reaches 20, the same situation occurs as in the previous run - the system performance degrades
  • Unlike the test for Installer A, we perform this verification step because the system clearly malfunctions
  • Not shown in the demo: after 3 rounds of the stress test, the system fails to create even 5 stacks
What Should We Do Next?
• Stress Test Cases from Testing Projects
  • Collecting requirements
  • Joint effort across the testing community
• Analytic Tools for Breaking Points (Root Causes)
  • Tracking the failures
  • Models for determining the root causes
  • Quantifying the LOC or reliability
• Long-Duration PODs for Stress Testing over OPNFV Releases
  • Feed back bugs, provide LOC or reliability, etc.
Thank You