How to Troubleshoot WSO2 Products — 1. How to get Average TPS with API Manager Analytics

Hi Everyone,

I had this thought of writing a set of posts on how to troubleshoot and do quality assurance of WSO2 products.

As the first step, I’m writing about a way to get the average TPS (transactions per second) with API Manager Analytics 2.6.0. Currently, there’s no out-of-the-box (OOTB) method to check the TPS in the analytics dashboard, so I hope this guide will be useful to you.

How Analytics Works

Before getting to TPS, let me give you some understanding of the basic flow. If you already know how APIM Analytics 2.6.0 works, you can skip this section.

As most of you know, the underlying product of WSO2 API Manager Analytics 2.6.0 is WSO2 Stream Processor 4.3.0 [1], a streaming SQL based, high-performance, open source stream processing platform that can be used for real-time analytics.

When API Manager is configured with APIM Analytics as in [2], each and every invocation made on the API Manager end is sent as a ‘wso2event’ to the APIM Analytics end. The Analytics distribution ships a set of Siddhi files with predefined logic to receive these events and process them accordingly.

WSO2 API Manager uses 3 different event streams to send events to analytics: successful requests, faulty requests and throttled requests are each sent to the analytics end on a separate stream. You can check [3] for more information.

In the APIM Analytics distribution, there’s a Siddhi file named ‘APIM_EVENT_RECEIVER.siddhi’ in <APIM-Analytics-Home>/wso2/worker/deployment/siddhi-files, which contains the streams and other logic to process the above events.

Configure Analytics to get the record counts

Hopefully you now have some understanding of how the flow works. So let’s move on to the configuration by following the steps below.

1. As mentioned above, enable analytics with API Manager if you haven’t done so yet.

2. Then go to the <APIM-Analytics-Home>/wso2/worker/deployment/siddhi-files directory, open the ‘APIM_EVENT_RECEIVER.siddhi’ file and add the following lines after ‘@App:description’,

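-- Running event counts, one stream per event type
define stream countRequestStream(count long);
define stream countFaultStream(count long);
define stream countThrottleStream(count long);
-- The log sink prints every event on these streams to carbon.log
@sink(type='log')
define stream requestLogStream(count long);
@sink(type='log')
define stream faultLogStream(count long);
@sink(type='log')
define stream throttleLogStream(count long);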

Here we’re defining 6 streams, 2 for each type of event (request, fault and throttle).

‘requestLogStream’, ‘faultLogStream’ and ‘throttleLogStream’ will be used for logging, which I will explain in the next steps.

3. In the same file, add the following lines at the end,

-- Keep a running count of the events received on each stream
from InComingRequestStream
select count() as count
insert into countRequestStream;

from FaultStream
select count() as count
insert into countFaultStream;

from ThrottledOutStream
select count() as count
insert into countThrottleStream;

-- Forward only the 1st and every Nth count to the logging streams
from countRequestStream[count == 1 or count % 10000 == 0]
select count
insert into requestLogStream;

from countFaultStream[count == 1 or count % 500 == 0]
select count
insert into faultLogStream;

from countThrottleStream[count == 1 or count % 500 == 0]
select count
insert into throttleLogStream;

4. Then save the file. If the worker is already running, the file will be hot deployed in the analytics worker node.

5. If the worker isn’t running yet, start the worker profile by executing the worker startup script in <APIM-Analytics-Home>/bin.

6. After that, start the API Manager instance and invoke an API a few times.

What have we done ? :O

Now let me explain what we did and what we got after following the above steps.

‘InComingRequestStream’, ‘FaultStream’ and ‘ThrottledOutStream’ are the streams defined by default in the ‘APIM_EVENT_RECEIVER.siddhi’ file to receive the events of successful, faulty and throttled requests respectively. Let’s take the section below, which we added in the 3rd step above,

from InComingRequestStream
select count() as count
insert into countRequestStream;

Here we take the events received on ‘InComingRequestStream’, use the count function to count them, and insert the result into ‘countRequestStream’, one of the streams we defined earlier. Similarly, the event count of ‘FaultStream’ goes to ‘countFaultStream’ and the event count of ‘ThrottledOutStream’ goes to ‘countThrottleStream’.

Now consider the latter part of the configuration we added in the 3rd step,

from countRequestStream[count == 1 or count % 10000 == 0]
select count
insert into requestLogStream;

As explained before, ‘countRequestStream’ carries the running event count, so here we have defined a condition that picks the first event received by analytics and every 10000th event after that, and inserts it into ‘requestLogStream’.

As you may remember, we defined ‘requestLogStream’ in the 2nd step with a log sink, so the 1st event and every 10000th event (10000th, 20000th, 30000th and so on) will be printed in the logs. We could log every event received by analytics, but that would fill up the carbon.log file unnecessarily. That’s the reason for having this condition and printing at a defined frequency. If you expect your average TPS to be low, you can change the log frequency by replacing the value ‘10000’ with a more suitable value.

Similarly, ‘faultLogStream’ and ‘throttleLogStream’ will print the 1st and every 500th event of faulty and throttled out requests in the logs, as defined in the 3rd step.

After the invocations, you will see a set of logs similar to the following in carbon.log under <APIM-Analytics-Home>/wso2/worker/logs.
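To make the filter condition a bit more concrete, here is a minimal Python sketch that imitates it outside of Siddhi. It is only an illustration: the function name ‘logged_counts’ and its parameters are my own and are not part of any WSO2 product.

```python
def logged_counts(total_events, every):
    """Return the running counts that would pass the filter
    [count == 1 or count % every == 0]."""
    logged = []
    count = 0
    for _ in range(total_events):
        count += 1  # mirrors count() growing by one per received event
        if count == 1 or count % every == 0:
            logged.append(count)
    return logged


# 25000 successful requests with a frequency of 10000
print(logged_counts(25000, 10000))  # [1, 10000, 20000]
# 1200 faulty requests with a frequency of 500
print(logged_counts(1200, 500))     # [1, 500, 1000]
```

So out of 25000 requests only 3 lines would reach the log, which is why the frequency has to match the load you expect.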

[2020-04-24 00:03:56,695] INFO {} - APIM_EVENT_RECEIVER : throttleLogStream : Event{timestamp=1587666836693, data=[1], isExpired=false}
[2020-04-24 00:03:56,704] INFO {} - APIM_EVENT_RECEIVER : requestLogStream : Event{timestamp=1587666836704, data=[1], isExpired=false}
[2020-04-24 00:03:56,710] INFO {} - APIM_EVENT_RECEIVER : faultLogStream : Event{timestamp=1587666836710, data=[1], isExpired=false}
[2020-04-27 00:03:54,065] INFO {} - APIM_EVENT_RECEIVER : faultLogStream : Event{timestamp=1587945834065, data=[1000], isExpired=false}
[2020-04-27 00:03:55,626] INFO {} - APIM_EVENT_RECEIVER : throttleLogStream : Event{timestamp=1587945835626, data=[2500], isExpired=false}
[2020-04-27 00:04:28,125] INFO {} - APIM_EVENT_RECEIVER : requestLogStream : Event{timestamp=1587945868125, data=[35000000], isExpired=false}

The first event will be the first one received by analytics after you saved the ‘APIM_EVENT_RECEIVER.siddhi’ file with the above changes. You can find the number of events inside the ‘data=[]’ section of each log line, and the time the event was received is logged in milliseconds as ‘timestamp’.
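If you don’t want to pick these values out of carbon.log by hand, something like the following Python sketch could do it. It assumes the log lines look like the sample above; the regular expression and the function name ‘first_and_last’ are my own assumptions, so adjust them if your log format differs.

```python
import re

# Matches the part of a log line written by the log sink, as in the sample logs.
EVENT_RE = re.compile(
    r"APIM_EVENT_RECEIVER : (\w+) : Event\{timestamp=(\d+), data=\[(\d+)\]"
)


def first_and_last(log_text):
    """Map each stream name to ((first_ts, first_count), (last_ts, last_count))."""
    result = {}
    for stream, ts, count in EVENT_RE.findall(log_text):
        entry = (int(ts), int(count))
        # Keep the first entry seen for the stream and overwrite the last one.
        first, _ = result.get(stream, (entry, entry))
        result[stream] = (first, entry)
    return result


sample = (
    "[2020-04-24 00:03:56,704] INFO {} - APIM_EVENT_RECEIVER : requestLogStream : "
    "Event{timestamp=1587666836704, data=[1], isExpired=false}\n"
    "[2020-04-27 00:04:28,125] INFO {} - APIM_EVENT_RECEIVER : requestLogStream : "
    "Event{timestamp=1587945868125, data=[35000000], isExpired=false}\n"
)
print(first_and_last(sample))
```

In practice you would pass in the content of the carbon.log file instead of the ‘sample’ string.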

How to calculate Average TPS

You can run your environment for a few days to collect a sufficient set of log records like the above. The duration will differ from scenario to scenario, as you might need the TPS for peak hours, non-peak hours, weekdays, weekends or some other special period. Here we will consider 3 days of data to explain the scenario.

The average TPS is the total number of events received during the period divided by the average time duration. The total is the sum of successful, faulty and throttled requests, which we can get from the ‘requestLogStream’, ‘faultLogStream’ and ‘throttleLogStream’ log lines above. Please follow the steps below to do the calculation.

1. Check the last log line printed for each of ‘requestLogStream’, ‘faultLogStream’ and ‘throttleLogStream’ and get the value inside the ‘data=[]’ section. Let’s assume the following values,

requestLogStream = 35000000 requests

faultLogStream = 1000 requests

throttleLogStream = 2500 requests

So the total request count would be 35003500.

2. To get the average time duration, take the difference between the timestamp values of the 1st event and the last event for each of the above streams, and then average the three. From the above logs,

Time taken for requestLogStream = (1587945868125 - 1587666836704) milliseconds = 279031421 milliseconds

Time taken for faultLogStream = (1587945834065 - 1587666836710) milliseconds = 278997355 milliseconds

Time taken for throttleLogStream = (1587945835626 - 1587666836693) milliseconds = 278998933 milliseconds

Average total time = (279031421 + 278997355 + 278998933) / 3 = 279009236.33 milliseconds ≈ 279009 seconds

3. So the average TPS = total request count / average total time = 35003500 / 279009 ≈ 125 transactions per second.

Just like this, you can calculate the average TPS within a specific time period (e.g. while a sales promotion runs in your organization during the next long weekend) based on your use case.

Hope you learned something new today. Let me know if you have any questions.
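The three steps above can also be written as a short Python sketch, using the timestamps and counts from the sample logs. The dictionary layout and the function name ‘average_tps’ are just my own choices for illustration.

```python
# (first_timestamp_ms, last_timestamp_ms, last_count) per stream, taken from the sample logs
streams = {
    "requestLogStream": (1587666836704, 1587945868125, 35000000),
    "faultLogStream": (1587666836710, 1587945834065, 1000),
    "throttleLogStream": (1587666836693, 1587945835626, 2500),
}


def average_tps(streams):
    """Total event count divided by the average duration in seconds."""
    total = sum(count for _, _, count in streams.values())
    durations_ms = [last - first for first, last, _ in streams.values()]
    avg_seconds = sum(durations_ms) / len(durations_ms) / 1000
    return total / avg_seconds


print(round(average_tps(streams)))  # 125
```

Keep in mind that the counts come from the last logged line of each stream, so the result is an approximation. A few thousand requests may have arrived after that line was printed.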



