Microservices & Performance Testing

Sudheer Kumar
3 min readSep 1, 2022

Performance testing verifies that an application works properly under different workloads. Load testing is one type of performance testing.

Need and Importance of Perf Testing

In a microservice architecture, your service interacts with several other services to achieve a goal. It is highly important to load test this setup so that you get an idea of how your system behaves under high load. Some of the services might act as a bottleneck for other services and slow down the whole system.

Here I am going to talk about some tools that help with load testing a microservices-based system.

Tools

Available Tools:
- JMeter (open source, Java-based)
- Locust (open source, Python-based)
- LoadRunner, NeoLoad, LoadUI (paid ones)…
I am using JMeter for my tests since it is open source and has a lot of supporting tools and plug-ins available.

I will write a separate article on how to create a test setup; here I will mainly talk about strategies for perf testing using JMeter.

Perf Testing Strategies

The first thing you have to do is create a test script. In this case, we are going to use a browser-based plug-in, BlazeMeter, to record our UI operations. BlazeMeter records each and every HTTP request, and you can export the recording as a JMX file. You can load this JMX file in JMeter and use it for running perf tests.

Now that you have the script, first remove all the unwanted HTTP calls (e.g., requests for static pages) and focus mainly on the service API calls. Then run the script once with a single thread to verify it.

If all APIs run without errors, you are ready to apply more load.

Scenario 1: Increase the number of users to 10 with an infinite loop count

Looking at the sample responses for a number of API calls, you may see one particular API failing 100% of the time, even though it worked fine under lighter loads.
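As a rough illustration of what JMeter's thread groups are doing, here is a minimal Python sketch (not part of the original setup) that fires concurrent requests and tallies response codes per load level. The stand-in backend, its `/orders` endpoint, and its concurrency limit are all invented for the demo; the point is that the same endpoint can be clean at 1 user and fail heavily at 10.

```python
import threading
import time
import urllib.error
import urllib.request
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class Handler(BaseHTTPRequestHandler):
    """Stand-in backend: /orders starts failing once concurrency passes a
    limit, mimicking an API that only breaks under load."""
    active = 0
    lock = threading.Lock()

    def do_GET(self):
        with Handler.lock:
            Handler.active += 1
            overloaded = Handler.active > 3
        time.sleep(0.05)  # simulate some work
        status = 500 if (self.path == "/orders" and overloaded) else 200
        self.send_response(status)
        self.end_headers()
        self.wfile.write(b"ok" if status == 200 else b"error")
        with Handler.lock:
            Handler.active -= 1

    def log_message(self, *args):  # keep the output quiet
        pass

def run_load(url, users, requests_per_user):
    """Fire `users` concurrent workers at `url`; count responses by status."""
    def worker():
        local = Counter()
        for _ in range(requests_per_user):
            try:
                with urllib.request.urlopen(url, timeout=5) as resp:
                    local[resp.status] += 1
            except urllib.error.HTTPError as exc:
                local[exc.code] += 1
        return local

    results = Counter()
    with ThreadPoolExecutor(max_workers=users) as pool:
        futures = [pool.submit(worker) for _ in range(users)]
        for f in futures:
            results.update(f.result())
    return results

if __name__ == "__main__":
    server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = f"http://127.0.0.1:{server.server_address[1]}"
    print("1 user:  ", run_load(f"{base}/orders", users=1, requests_per_user=5))
    print("10 users:", run_load(f"{base}/orders", users=10, requests_per_user=5))
    server.shutdown()
```

Here `users` plays the role of the JMeter thread group's thread count and `requests_per_user` the loop count.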

Now check that API's performance in Postman and see how long it takes. Typically, every API should have a target response time of under 1 second. If it takes longer, you need to optimize that endpoint: look at the traces, identify which block is taking the time, and optimize it.
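When judging that 1-second budget, it helps to look at a high percentile rather than the average, since a few slow calls can hide behind a good mean. A small sketch (the sample latencies and endpoint names are made up):

```python
import math
import statistics

RESPONSE_TIME_BUDGET_MS = 1000  # the sub-second target discussed above

def latency_report(samples_ms):
    """Mean and nearest-rank p95 latency, plus pass/fail against the budget."""
    samples = sorted(samples_ms)
    p95 = samples[math.ceil(0.95 * len(samples)) - 1]  # nearest-rank method
    return {
        "mean_ms": statistics.mean(samples),
        "p95_ms": p95,
        "within_budget": p95 <= RESPONSE_TIME_BUDGET_MS,
    }

# Hypothetical measurements (in milliseconds) for two endpoints
fast_api = [120, 140, 180, 210, 150, 160, 170, 130, 190, 200]
slow_api = [800, 950, 1200, 2100, 900, 1800, 1100, 1400, 1000, 1600]

print(latency_report(fast_api))  # within budget
print(latency_report(slow_api))  # p95 well over the 1 s budget
```

An endpoint whose p95 breaches the budget is a candidate for the trace-level digging described above, even if its average looks acceptable.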

Some of the issues that you might see are:

  • A dependent service it invokes is slow (dig into that service's API method)
  • A database operation is slow (check that proper indexes exist for GET/read queries and keys for SET/write operations)
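The database point can be shown in miniature with SQLite standing in for whatever database the service actually uses (the table and column names here are invented): the query plan reports a full table scan before the index exists and an index search afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 1000, i * 1.5) for i in range(10000)],
)

query = "SELECT * FROM orders WHERE customer_id = ?"

# Without an index, the lookup has to scan every row.
plan_before = conn.execute("EXPLAIN QUERY PLAN " + query, (42,)).fetchone()[-1]

conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = conn.execute("EXPLAIN QUERY PLAN " + query, (42,)).fetchone()[-1]

print(plan_before)  # full table scan
print(plan_after)   # index search
```

The same check, via each database's own `EXPLAIN` facility, is usually the fastest way to confirm whether a slow GET is missing an index.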

Scenario 2: Same 10 users, excluding the endpoints one by one

In this case, exclude the 100%-failing API and run the rest for 10 users. If you find they perform well, with low errors and high throughput, then:

  • Increase the number of users
  • Make sure performance stays good
  • Keep scaling up your backend if required (e.g., Aurora ACUs or ECS service memory), while monitoring it
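As you scale users up like this, it helps to summarize the results file rather than eyeball individual samples. The sketch below parses results in JMeter's default CSV (JTL) layout, such as a file produced by `jmeter -n -t plan.jmx -l results.jtl`; the embedded sample uses a trimmed subset of the default JTL columns, and its rows and endpoint labels are fabricated for illustration.

```python
import csv
import io
from collections import defaultdict

# Tiny fabricated sample in JTL-style CSV (subset of the default columns).
SAMPLE_JTL = """timeStamp,elapsed,label,responseCode,success
1662000000000,120,GET /orders,200,true
1662000000100,2300,POST /checkout,500,false
1662000000200,140,GET /orders,200,true
1662000000300,180,GET /orders,200,true
1662000000400,1900,POST /checkout,500,false
"""

def summarize(jtl_file):
    """Per-label error rate and average elapsed time from a JTL results file."""
    stats = defaultdict(lambda: {"count": 0, "errors": 0, "elapsed": 0})
    for row in csv.DictReader(jtl_file):
        s = stats[row["label"]]
        s["count"] += 1
        s["elapsed"] += int(row["elapsed"])
        if row["success"] != "true":
            s["errors"] += 1
    return {
        label: {
            "error_rate": s["errors"] / s["count"],
            "avg_ms": s["elapsed"] / s["count"],
        }
        for label, s in stats.items()
    }

print(summarize(io.StringIO(SAMPLE_JTL)))
```

A per-label summary like this makes it obvious which endpoint to exclude next and whether the remaining ones hold up as the user count grows.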

Scenario 3: Bring back the optimized endpoints one by one

After optimizing the failing endpoints, bring them back one by one and see how the whole system performs together.

It is an art to keep investigating the system like this until it finally reaches the required level of performance. For a website that needs high scalability, you should target sub-second response times under loads of 200–300 users per second.

Happy Perf Testing!!


Sudheer Kumar

Experienced Cloud Architect, Infrastructure Automation, Technical Mentor