Almost all application- and business-related risks can be addressed through performance testing, including user satisfaction and the application’s ability to achieve business goals. Generally, the risks that performance testing addresses are categorized in terms of speed, scalability, and stability. Speed is typically an end-user concern, scalability is a business concern, and stability is a technical or maintenance concern. Identifying project-related risks and the associated mitigation strategies where performance testing can be employed is almost universally viewed as a valuable and time-saving practice.
Key types of Performance testing
[Me] : - What are the key types of performance testing we can do on websites?
[Guruji]:- The most common performance concerns related to Web applications are “Will it be fast enough?”, “Will it support all of my clients?”, “What happens if something goes wrong?”, and “What do I need to plan for when I get more customers?”. In casual conversation, most people associate “fast enough” with performance testing, “accommodate the current/expected user base” with load testing, “something going wrong” with stress testing, and “planning for future growth” with capacity testing. Collectively, these risks form the basis for the four key types of performance tests for Web applications.
Term | Purpose | Notes |
|
Performance test | To determine or validate speed, scalability, and/or stability. | · A performance test is a technical investigation done to determine or validate the responsiveness, speed, scalability, and/or stability characteristics of the product under test. |
|
Load test | To verify application behavior under normal and peak load conditions. | · Load testing is conducted to verify that your application can meet your desired performance objectives; these performance objectives are often specified in a service level agreement (SLA). A load test enables you to measure response times, throughput rates, and resource-utilization levels, and to identify your application’s breaking point, assuming that the breaking point occurs below the peak load condition. · Endurance testing is a subset of load testing. An endurance test is a type of performance test focused on determining or validating the performance characteristics of the product under test when subjected to workload models and load volumes anticipated during production operations over an extended period of time. · Endurance testing may be used to calculate Mean Time Between Failure (MTBF), Mean Time To Failure (MTTF), and similar metrics. |
|
Stress test | To determine or validate an application’s behavior when it is pushed beyond normal or peak load conditions. | · The goal of stress testing is to reveal application bugs that surface only under high load conditions. These bugs can include such things as synchronization issues, race conditions, and memory leaks. Stress testing enables you to identify your application’s weak points, and shows how the application behaves under extreme load conditions. · Spike testing is a subset of stress testing. A spike test is a type of performance test focused on determining or validating the performance characteristics of the product under test when subjected to workload models and load volumes that repeatedly increase beyond anticipated production operations for short periods of time. |
|
Capacity test | To determine how many users and/or transactions a given system will support and still meet performance goals. | · Capacity testing is conducted in conjunction with capacity planning, which you use to plan for future growth, such as an increased user base or increased volume of data. For example, to accommodate future loads, you need to know how many additional resources (such as processor capacity, memory usage, disk capacity, or network bandwidth) are necessary to support future usage levels. Capacity testing helps you to identify a scaling strategy in order to determine whether you should scale up or scale out. |
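To make the load test row more concrete, here is a minimal sketch of a harness that measures response times and throughput under a fixed number of concurrent users. Everything in it is an assumption for illustration: `fake_request` is a hypothetical stand-in for a real call to your application, and `run_load` and its result keys are names invented here, not part of any tool.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request() -> None:
    """Hypothetical stand-in for one call to the application under test."""
    time.sleep(0.001)


def run_load(operation, users: int, requests_per_user: int) -> dict:
    """Call `operation` from `users` concurrent workers and summarize timings."""

    def one_user(_user_id):
        # Each simulated user issues its requests one after another.
        timings = []
        for _ in range(requests_per_user):
            start = time.perf_counter()
            operation()
            timings.append(time.perf_counter() - start)
        return timings

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=users) as pool:
        per_user = list(pool.map(one_user, range(users)))
    elapsed = time.perf_counter() - wall_start

    timings = sorted(t for user in per_user for t in user)
    return {
        "requests": len(timings),
        "avg_response_s": sum(timings) / len(timings),
        "max_response_s": timings[-1],
        "throughput_rps": len(timings) / elapsed,
    }


# Normal load might be 5 users; rerun with a higher `users` value for peak load.
print(run_load(fake_request, users=5, requests_per_user=4))
```

Raising `users` well beyond the expected peak turns the same sketch into a rough stress test, and running it for a long time approximates an endurance test. A real project would more likely use a dedicated load tool.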
Baseline? Benchmark?
[Me] : - What’s the difference between Baseline and Benchmarking with respect to performance testing?
[Guruji]:-Creating a baseline is the process of running a set of tests to capture performance metric data for the purpose of evaluating the effectiveness of subsequent performance-improving changes to the system or application. A critical aspect of a baseline is that all characteristics and configuration options except those specifically being varied for comparison must remain invariant.
With respect to Web applications, you can use a baseline to determine whether performance is improving or declining and to find deviations across different builds and versions. For example, you could measure load time, the number of transactions processed per unit of time, the number of Web pages served per unit of time, and resource utilization such as memory usage and processor usage. Some considerations about using baselines include:
A baseline can be created for a system, component, or application. A baseline can also be created for different layers of the application, including a database, Web services, and so on.
A baseline can set the standard for comparison, to track future optimizations or regressions. It is important to validate that the baseline results are repeatable, because considerable fluctuations may occur across test results due to environment and workload characteristics.
Baselines can help identify changes in performance. Baselines can help product teams identify changes in performance that reflect degradation or optimization over the course of the development life cycle. Identifying these changes in comparison to a well-known state or configuration often makes resolving performance issues simpler.
Baseline assets should be reusable. Baselines are most valuable if they are created by using a set of reusable test assets. It is important that such tests accurately simulate repeatable and actionable workload characteristics.
Baselines are metrics. Baseline results can be articulated by using a broad set of key performance indicators, including response time, processor capacity, memory usage, disk capacity, and network bandwidth.
Baselines act as a shared frame of reference. Sharing baseline results allows your team to build a common store of acquired knowledge about the performance characteristics of an application or component.
Avoid over-generalizing your baselines. If your project entails a major reengineering of the application, you need to reestablish the baseline for testing that application. A baseline is application-specific and is most useful for comparing performance across different versions. Sometimes, subsequent versions of an application are so different that previous baselines are no longer valid for comparisons.
Know your application’s behavior. It is a good idea to ensure that you completely understand the behavior of the application at the time a baseline is created. Failure to do so before making changes to the system with a focus on optimization objectives is frequently counterproductive.
Baselines evolve. At times you will have to redefine your baseline because of changes that have been made to the system since the time the baseline was initially captured.
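The idea of using a baseline to spot regressions or optimizations between builds can be sketched as a simple comparison. This is only an illustration under stated assumptions: the function name, the metric names, the sample values, and the 10% tolerance are all hypothetical, and every metric is assumed to be lower-is-better (a throughput metric would need the comparison reversed).

```python
def compare_to_baseline(baseline: dict, current: dict, tolerance: float = 0.10) -> dict:
    """Label each metric 'regression', 'improvement', or 'unchanged'.

    Assumes every metric is lower-is-better (response time, CPU %, memory).
    `tolerance` absorbs normal run-to-run fluctuation (hypothetical default).
    """
    report = {}
    for name, base_value in baseline.items():
        change = (current[name] - base_value) / base_value
        if change > tolerance:
            report[name] = "regression"
        elif change < -tolerance:
            report[name] = "improvement"
        else:
            report[name] = "unchanged"
    return report


# Hypothetical numbers for two builds of the same application.
baseline = {"load_time_s": 2.0, "cpu_percent": 50.0, "memory_mb": 400.0}
new_build = {"load_time_s": 2.5, "cpu_percent": 52.0, "memory_mb": 300.0}
print(compare_to_baseline(baseline, new_build))
```

The tolerance is there because, as noted above, results can fluctuate considerably between runs; choose it only after checking that the baseline itself is repeatable.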
Benchmarking is the process of comparing your system’s performance against a baseline that you have created internally or against an industry standard. In the case of a Web application, you would run a set of tests that comply with the specifications of an industry benchmark in order to capture the performance metrics necessary to determine your application’s benchmark score. You can then compare your application against other systems or applications that also calculated their score for the same benchmark. You may choose to tune your application performance to achieve or surpass a certain benchmark score. Some considerations about benchmarking include:
You need to play by the rules. A benchmark is achieved by working with industry specifications or by porting an existing implementation to meet such standards. Benchmarking entails identifying all of the necessary components that will run together, the market where the product exists, and the specific metrics to be measured.
Because you play by the rules, you can be transparent. Benchmarking results can be published to the outside world. Since comparisons may be produced by your competitors, you will want to employ a strict set of standard approaches for testing and data to ensure reliable results.
You divulge results across various metrics. Performance metrics may involve load time, number of transactions processed per unit of time, Web pages accessed per unit of time, processor usage, memory usage, search times, and so on.
Perf testing terminologies
Capacity | The capacity of a system is the total workload it can handle without violating predetermined key performance acceptance criteria. |
| |
Capacity test | A capacity test complements load testing by determining your server’s ultimate failure point, whereas load testing monitors results at various levels of load and traffic patterns. You perform capacity testing in conjunction with capacity planning, which you use to plan for future growth, such as an increased user base or increased volume of data. For example, to accommodate future loads, you need to know how many additional resources (such as processor capacity, memory usage, disk capacity, or network bandwidth) are necessary to support future usage levels. Capacity testing helps you to identify a scaling strategy in order to determine whether you should scale up or scale out. |
| |
Component test | A component test is any performance test that targets an architectural component of the application. Commonly tested components include servers, databases, networks, firewalls, and storage devices. |
| |
Endurance test | An endurance test is a type of performance test focused on determining or validating performance characteristics of the product under test when subjected to workload models and load volumes anticipated during production operations over an extended period of time. Endurance testing is a subset of load testing. |
| |
Investigation | Investigation is an activity based on collecting information related to the speed, scalability, and/or stability characteristics of the product under test that may have value in determining or improving product quality. Investigation is frequently employed to prove or disprove hypotheses regarding the root cause of one or more observed performance issues. |
| |
Latency | Latency is a measure of responsiveness that represents the time it takes to complete the execution of a request. Latency may also represent the sum of several latencies or subtasks. |
| |
Metrics | Metrics are measurements obtained by running performance tests as expressed on a commonly understood scale. Some metrics commonly obtained through performance tests include processor utilization over time and memory usage by load. |
| |
Performance | Performance refers to information regarding your application’s response times, throughput, and resource utilization levels. |
| |
Performance test | A performance test is a technical investigation done to determine or validate the speed, scalability, and/or stability characteristics of the product under test. Performance testing is the superset containing all other subcategories of performance testing described in this chapter. |
| |
Performance budgets or allocations | Performance budgets (or allocations) are constraints placed on developers regarding allowable resource consumption for their component. |
| |
Performance goals | Performance goals are the criteria that your team wants to meet before product release, although these criteria may be negotiable under certain circumstances. For example, if a response time goal of three seconds is set for a particular transaction but the actual response time is 3.3 seconds, it is likely that the stakeholders will choose to release the application and defer performance tuning of that transaction for a future release. |
| |
Performance objectives | Performance objectives are usually specified in terms of response times, throughput (transactions per second), and resource-utilization levels and typically focus on metrics that can be directly related to user satisfaction. | ||
Performance requirements | Performance requirements are those criteria that are absolutely non-negotiable due to contractual obligations, service level agreements (SLAs), or fixed business needs. Any performance criterion that will not unquestionably lead to a decision to delay a release until the criterion passes is not absolutely required ― and therefore, not a requirement. | ||
Performance targets | Performance targets are the desired values for the metrics identified for your project under a particular set of conditions, usually specified in terms of response time, throughput, and resource-utilization levels. Resource-utilization levels include the amount of processor capacity, memory, disk I/O, and network I/O that your application consumes. Performance targets typically equate to project goals. | ||
Performance testing objectives | Performance testing objectives refer to data collected through the performance-testing process that is anticipated to have value in determining or improving product quality. However, these objectives are not necessarily quantitative or directly related to a performance requirement, goal, or stated quality of service (QoS) specification. | ||
Performance thresholds | Performance thresholds are the maximum acceptable values for the metrics identified for your project, usually specified in terms of response time, throughput (transactions per second), and resource-utilization levels. Resource-utilization levels include the amount of processor capacity, memory, disk I/O, and network I/O that your application consumes. Performance thresholds typically equate to requirements. | ||
Resource utilization | Resource utilization is the cost of the project in terms of system resources. The primary resources are processor, memory, disk I/O, and network I/O. | ||
Response time | Response time is a measure of how responsive an application or subsystem is to a client request. | ||
Saturation | Saturation refers to the point at which a resource has reached full utilization. | ||
Scalability | Scalability refers to an application’s ability to handle additional workload, without adversely affecting performance, by adding resources such as processor, memory, and storage capacity. | ||
Scenarios | In the context of performance testing, a scenario is a sequence of steps in your application. A scenario can represent a use case or a business function such as searching a product catalog, adding an item to a shopping cart, or placing an order. | ||
Smoke test | A smoke test is the initial run of a performance test to see if your application can perform its operations under a normal load. | ||
Spike test | A spike test is a type of performance test focused on determining or validating performance characteristics of the product under test when subjected to workload models and load volumes that repeatedly increase beyond anticipated production operations for short periods of time. Spike testing is a subset of stress testing. | ||
Stability | In the context of performance testing, stability refers to the overall reliability, robustness, functional and data integrity, availability, and/or consistency of responsiveness for your system under a variety of conditions. | ||
Stress test | A stress test is a type of performance test designed to evaluate an application’s behavior when it is pushed beyond normal or peak load conditions. The goal of stress testing is to reveal application bugs that surface only under high load conditions. These bugs can include such things as synchronization issues, race conditions, and memory leaks. Stress testing enables you to identify your application’s weak points, and shows how the application behaves under extreme load conditions. | ||
Throughput | Throughput is the number of units of work that can be handled per unit of time; for instance, requests per second, calls per day, hits per second, reports per year, etc. | ||
Unit test | In the context of performance testing, a unit test is any test that targets a module of code where that module is any logical subset of the entire existing code base of the application, with a focus on performance characteristics. Commonly tested modules include functions, procedures, routines, objects, methods, and classes. Performance unit tests are frequently created and conducted by the developer who wrote the module of code being tested. | ||
Utilization | In the context of performance testing, utilization is the percentage of time that a resource is busy servicing user requests. The remaining percentage of time is considered idle time. | ||
Validation test | A validation test compares the speed, scalability, and/or stability characteristics of the product under test against the expectations that have been set or presumed for that product. | ||
Workload | Workload is the stimulus applied to a system, application, or component to simulate a usage pattern, in regard to concurrency and/or data inputs. The workload includes the total number of users, concurrent active users, data volumes, and transaction volumes, along with the transaction mix. For performance modeling, you associate a workload with an individual scenario. |
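Three of the terms above (throughput, response time, and utilization) can be computed from the same raw data. The sketch below assumes a hypothetical log of (start, end) timestamps for a single resource that serves one request at a time with no queueing, so busy time is just the sum of the request durations. The function name and sample values are invented for illustration.

```python
def summarize(requests: list[tuple[float, float]], window_s: float) -> dict:
    """Summarize (start_s, end_s) request records over an observation window.

    Assumes one resource that handles a single request at a time, so the
    busy time is the sum of the individual request durations.
    """
    busy_s = sum(end - start for start, end in requests)
    return {
        # Units of work handled per unit of time.
        "throughput_rps": len(requests) / window_s,
        # Average time to complete one request.
        "avg_response_s": busy_s / len(requests),
        # Percentage of the window the resource was busy; the rest is idle.
        "utilization_pct": 100.0 * busy_s / window_s,
    }


# Three hypothetical requests observed over a 4-second window.
print(summarize([(0.0, 0.5), (1.0, 1.5), (2.0, 3.0)], window_s=4.0))
```

With these sample values the resource is busy for 2 of the 4 seconds, so utilization is 50% and the remaining 50% is idle time.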
Perf, Load & Stress testing
[Me] : - I think the most confused definition lies in understanding what exactly is performance, load and stress tests. Can you pls help us out?
[Guruji]:-Performance tests are usually described as belonging to one of the following three categories:
Performance testing: This type of testing determines or validates the speed, scalability, and/or stability characteristics of the system or application under test. Performance is concerned with achieving response times, throughput, and resource-utilization levels that meet the performance objectives for the project or product. In this guide, performance testing represents the superset of all of the other subcategories of performance-related testing.
Load testing: This subcategory of performance testing is focused on determining or validating performance characteristics of the system or application under test when subjected to workloads and load volumes anticipated during production operations.
Stress testing: This subcategory of performance testing is focused on determining or validating performance characteristics of the system or application under test when subjected to conditions beyond those anticipated during production operations. Stress tests may also include tests focused on determining or validating performance characteristics of the system or application under test when subjected to other stressful conditions, such as limited memory, insufficient disk space, or server failure. These tests are designed to determine under what conditions an application will fail, how it will fail, and what indicators can be monitored to warn of an impending failure.
Performance testing Activities
Activity 1. Identify the Test Environment. Identify the physical test environment and the production environment as well as the tools and resources available to the test team. The physical environment includes hardware, software, and network configurations. Having a thorough understanding of the entire test environment at the outset enables more efficient test design and planning and helps you identify testing challenges early in the project. In some situations, this process must be revisited periodically throughout the project’s life cycle.
Activity 2. Identify Performance Acceptance Criteria. Identify the response time, throughput, and resource utilization goals and constraints. In general, response time is a user concern, throughput is a business concern, and resource utilization is a system concern. Additionally, identify project success criteria that may not be captured by those goals and constraints; for example, using performance tests to evaluate what combination of configuration settings will result in the most desirable performance characteristics.
Activity 3. Plan and Design Tests. Identify key scenarios, determine variability among representative users and how to simulate that variability, define test data, and establish metrics to be collected. Consolidate this information into one or more models of system usage to be implemented, executed, and analyzed.
Activity 4. Configure the Test Environment. Prepare the test environment, tools, and resources necessary to execute each strategy as features and components become available for test. Ensure that the test environment is instrumented for resource monitoring as necessary.
Activity 5. Implement the Test Design. Develop the performance tests in accordance with the test design.
Activity 6. Execute the Test. Run and monitor your tests. Validate the tests, test data, and results collection. Execute validated tests for analysis while monitoring the test and the test environment.
Activity 7. Analyze Results, Report, and Retest. Consolidate and share results data. Analyze the data both individually and as a cross-functional team. Reprioritize the remaining tests and re-execute them as needed. When all of the metric values are within accepted limits, none of the set thresholds have been violated, and all of the desired information has been collected, you have finished testing that particular scenario on that particular configuration.
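The exit condition in Activity 7 can be sketched as a small check. This is an illustration only: the function name and metric names are hypothetical, and thresholds are assumed to be maximum acceptable values, as in the terminology table. The sample values reuse the 3-second goal and 3.3-second measurement from the performance goals entry.

```python
def scenario_finished(measured: dict, thresholds: dict, required_metrics: list) -> dict:
    """Decide whether a scenario on a given configuration needs another run.

    `thresholds` maps metric name -> maximum acceptable value (assumption).
    `required_metrics` lists everything the team wanted to collect.
    """
    missing = [m for m in required_metrics if m not in measured]
    violated = [
        m for m, limit in thresholds.items()
        if m in measured and measured[m] > limit
    ]
    return {
        "finished": not missing and not violated,
        "missing": missing,
        "violated": violated,
    }


# Hypothetical run: response time is over its limit and throughput was not recorded.
measured = {"response_time_s": 3.3, "cpu_percent": 60.0}
thresholds = {"response_time_s": 3.0, "cpu_percent": 80.0}
required = ["response_time_s", "cpu_percent", "throughput_rps"]
print(scenario_finished(measured, thresholds, required))
```

Whether a violated limit actually blocks a release depends on whether it is a negotiable goal or a non-negotiable requirement; the sketch only reports what was violated or missing.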
Increasing Perf test coverage
[Me] : - Guruji, how do we increase Performance test coverage?
[Guruji]:- The usefulness of the performance tests is directly related to how closely they emulate request streams that the Web services application will encounter once it is deployed in the production environment. In a complex Web services environment, it is essential to choose a systematic approach in order to achieve adequate performance test coverage. Such an approach should include a wide range of use case scenarios that your application may encounter. One such approach is to develop load test categories that describe various aspects of the expected stream of requests. Such categories can describe request types, sequences, and intensities with varying degrees of accuracy.
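As a rough sketch of what a load test category might look like in code, the function below builds a request stream from a mix of request types (what is sent) and an intensity (how fast). The request type names, the weights, and the function itself are hypothetical; a real category would be derived from observed or expected production traffic.

```python
import random


def build_request_stream(mix: dict, requests_per_second: float,
                         duration_s: float, seed: int = 0) -> list:
    """Return (timestamp_s, request_type) pairs for one load test category.

    `mix` maps request type -> relative weight (assumed values).
    Requests are spaced evenly; a fixed seed keeps the stream repeatable.
    """
    rng = random.Random(seed)
    total = int(requests_per_second * duration_s)
    types = list(mix)
    weights = [mix[t] for t in types]
    chosen = rng.choices(types, weights=weights, k=total)
    return [(i / requests_per_second, t) for i, t in enumerate(chosen)]


# Hypothetical category: mostly searches, at 10 requests per second for 2 seconds.
mix = {"search": 0.6, "add_to_cart": 0.3, "place_order": 0.1}
for timestamp, request_type in build_request_stream(mix, 10, 2)[:5]:
    print(timestamp, request_type)
```

Varying the mix, the intensity, or the ordering gives additional categories, which is one way to widen coverage systematically.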
Security Testing
“Can you perform security testing for us?” asked a prospective customer.
“Sure, we are test specialists. Give us your security requirements, and we will test them!” was the answer.
In reality, the situation is not that simple. In this respect, security is like usability: you can seldom expect customers to know all about it and get a free ride on their requirements. You may have to provide your customer with information about what “security” actually is and how it should be tested.
Besides, whereas security requirements may be simple (“no unauthorized access shall be possible”), testing for compliance with them may not be. Security testing is very much about looking for unexpected side effects where no one (except hackers or crackers) expects them, which requires detailed technical knowledge and plenty of “error guessing”. For some reason, all this is called “penetration testing” in a security context.
Will ask Guruji more about this later.
TMMi
[Me] : - Guruji, what’s this TMMi?
[Guruji]:- Just like the CMMI staged representation, the TMMi has a staged architecture for process improvement. It contains levels that an organization passes through as its testing process evolves from one with an ad-hoc and unmanaged nature to a mature and controlled process with defect prevention as its main objective. Achieving each level ensures that adequate improvements have been made as a foundation for the next stage. The internal structure of the TMMi contains testing practices that can be learned and applied systematically to support quality improvement in incremental steps. There are five levels in the TMMi that define a maturity hierarchy and an evolutionary path to test process improvement.
Test Process improvement manifesto
[Me] : - Guruji, if I need to improve my testing process, what is the manifesto that I need to stick to?
[Guruji]:- Like the Agile manifesto, some guys at EuroSTAR have created one for you. Just use it. Basically, in the test process improvement manifesto they have tried to define what makes process improvement work and what does not.
Flexibility over Detailed Processes
In general, having defined processes supports an organization. Only something that is defined can be improved. Defined processes guide new engineers and act as corporate memory. However, building overly rigorous processes takes away the value people bring. You want good testers who have the skills to act based on the context of a problem and who perceive testing to be a challenging job. Supporting processes are needed, but using them should give testers enough flexibility and freedom to think for themselves and find the best way forward. You only need “just enough process”.
Best Practices over Templates
Templates are great, but even better is to provide examples of how they should be used. What provides more support: a test plan template or three test plan best practices? I guess almost everyone working in practice will choose the latter. When doing test process improvement, focus on getting a best practices library set up as soon as possible instead of overspending on defining templates. Don’t worry whether the best practices are the best in the industry. They are the best in your organization, and if something better comes along, replace them. This is what supports our testing and makes process improvement work.
Deployment orientation over Process orientation
Building processes is easy; we have already done this so many times, and there are numerous examples to be found. However, getting them deployed and thereby changing someone’s behavior is the hard part. Process improvement is all about change management. I have seen test improvement plans that focus almost entirely on defining the testing processes. In successful improvement projects at least 70% of the improvement effort is spent on deployment, that is, “getting the job done”. Defining the processes is the easy part and should only account for a small percentage of the effort and focus.
Reviews over Quality Assurance (departments)
Communication and providing feedback are essential to project success. This is exactly what peer reviews, if applied well, do. In principle, quality assurance officers also evaluate documents and provide feedback to engineers. However, once too often I have experienced that quality officers (sorry, no offence to those who do a good job!) are too far away from the testing profession. Their feedback then focuses on conformance to templates and defined processes, which adds little value to most projects, I believe. I have also experienced organizations where every test plan is peer reviewed by one or two fellow test managers who give feedback on the approach and content of the test plan. This is what we want: real feedback that we can use.
Business driven over Model driven
Whatever you do, make sure you know why you are doing it. What is the business problem you are trying to address? What is the test policy supported by management? Just trying to get to TMMi level 2 or 3 without understanding the business context will always fail in the short or long term. When addressing a certain practice from an improvement model, there are most often many different ways to comply. The business problem (poor product quality, long test execution lead time, costs, etc.) will tell you which one to choose. Almost continuously review your process improvement against the business drivers and test policy.