Tuesday, December 20, 2011

Performance Testing (Part 1)

Over the past few years (like many other people in this business) I’ve needed to do performance testing.  Open source software is great but this is one place where you need to do your own legwork.  This conundrum first presented itself in the Asterisk community.  There are literally thousands of variables that can affect the performance of Asterisk, FreeSWITCH, or any other software solution.  In no particular order:

- Configuration.  Which modules do you have loaded?  How are they configured?  If you’re using Kamailio, do you do hundreds of huge, slow, nasty DB queries for each call setup?  How is your logging configured?  Maybe you use Asterisk or FreeSWITCH and do several system calls, DB lookups, Lua scripts, etc?  Even the slightest misstep in configuration (synchronous syslogging with Kamailio, for example) can reduce your performance by 90%.

- Features in use.  Paging groups (unicast) are notorious for destroying performance on standard hardware - every call needs to be set up individually, you need to handle RTP, and some audio mixing is involved.  Hardware that can’t do 10 members in a page group using Asterisk or FreeSWITCH may be capable of hundreds of sessions using Kamailio with no media.

- Standard performance metrics.  “Thousands of calls” you say?  How many calls per second?  Are you transcoding?  Maybe you’re not handling any media at all?  What is the delay in call setup?

- Hardware.  This may seem obvious (MORE HERTZ) but even then there are issues...  If you’re handling RTP, what are you using for timing?  If you have lots of RTP, which network card are you using?  Does it and your kernel support MSI or MSI-X for better interrupt handling?  Can you load balance IRQs across cores?  How efficient (or buggy) is the driver (Realtek I’m looking at you)?!?

- The “guano” effect.  As features are added to the underlying toolkit (Asterisk, FreeSWITCH, etc) and to your configuration, how is performance affected over time?  Add a feature here, and a feature there - and repeat.  Over the months and years (even with faster hardware) you may find that each “little” feature reduced call capacity by 5%.  Or maybe your calls per second went down by two each time.  Not a big deal in isolation, yet assuming no other optimizations your call capacity could be down by as much as 50% after ten “minor” changes.  It adds up - it really does.
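To put rough numbers on that last point, here is a minimal Python sketch of how ten 5% hits play out.  The function name and the two models are my own illustration: "additive" assumes each change costs 5% of the original capacity, "compounded" assumes each change costs 5% of whatever capacity is left at that point.

```python
def remaining_capacity(changes, loss_per_change=0.05):
    """Return the fraction of capacity left under two simple models."""
    # Additive: every change removes a fixed slice of the ORIGINAL capacity.
    additive = max(0.0, 1.0 - changes * loss_per_change)
    # Compounded: every change removes a slice of the CURRENT capacity.
    compounded = (1.0 - loss_per_change) ** changes
    return additive, compounded


additive, compounded = remaining_capacity(10)
print(f"additive: {additive:.0%} left, compounded: {compounded:.0%} left")
```

Either way you look at it, somewhere between 40% and 50% of your call capacity is gone after ten “minor” changes.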

Even after pointing out all of these issues, you’d still be surprised how often one is faced with the question “Well yeah, but how many calls can I handle on my dual core Dell server?”

In almost every case the best answer is “Get your hardware, develop your solution, run sipp against it and see what happens”.  That’s really about as good as we can do.

SIPP is a great example of a typical, high quality open source tool.  In true “Unix philosophy” it does one thing and it does it well: SIP performance testing.  SIPP can be configured to initiate (or receive) just about any conceivable SIP scenario - from simple INVITE call handling to full SIMPLE test cases.  In these tests SIPP will tell you call setup time, messages received, successful dialogs, etc.

SIPP even goes a step further and includes some support for RTP.  SIPP has the ability to echo RTP from the remote end or even replay RTP from a PCAP file you have saved to disk.  This is where SIPP starts to show some deficiencies.  Again, you can’t blame SIPP because SIPP is a SIP performance testing tool - it does that and it does it well.  RTP testing leaves a lot to be desired.  First of all, you’re on your own when it comes to manipulating any of the PCAP parameters.  Length, content, codec, payload types, etc need to be configured separately.  This isn’t a problem, necessarily, as there are various open source tools to help you with some of these tasks.  I won’t get into all of them here but they too leave something to be desired.

What about analyzing the quality of the RTP streams?  SIPP provides mechanisms to measure various SIP “quality” metrics - SIP response times, SIP retransmits, etc.  With RTP you’re on your own.  Once again, sure, you could set up tshark on a SPAN port (or something) to do RTP stream analysis on every stream but this would be tedious and (once again) subject you to some of the harsh realities of processing a tremendous number of small packets in software on commodity hardware.

Let’s face it - for a typical B2BUA handling RTP the numbers add up very quickly - let’s assume 20ms packetization for the following:

Single RTP stream = 50 packets per second (pps)
Bi-directional RTP stream = 100 pps
A-leg bi-directional RTP stream = 100 pps
B-leg bi-directional RTP stream = 100 pps

A leg + B leg = 200 pps PER CALL

What does this look like with 10,000 channels (call legs, at 100 pps each) using G.711u?

952 Mbit/s (close to Gigabit wire speed) in each direction
1,000,000 (total) packets per second
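
The arithmetic behind those figures can be sketched in a few lines of Python.  The framing assumptions are mine: IPv4 without options, untagged Ethernet, and the preamble and inter-frame gap counted as wire overhead (that combination happens to reproduce the 952 Mbit/s figure).  The function and constant names are purely illustrative.

```python
# Per-packet sizes in bytes for G.711 (64 kbit/s) at 20 ms packetization.
PTIME_MS = 20
PAYLOAD = 64_000 // 8 * PTIME_MS // 1000  # 160 bytes of audio per packet
RTP, UDP, IPV4 = 12, 8, 20                # protocol headers, no options
ETH_HEADER, ETH_FCS = 14, 4               # untagged Ethernet frame
ETH_PREAMBLE, ETH_IFG = 8, 12             # assumption: counted as wire overhead


def rtp_load(channels):
    """Return (total packets per second, Mbit/s per direction) for N channels.

    A "channel" here is one call leg carrying a bi-directional RTP stream.
    """
    pps_per_stream = 1000 // PTIME_MS  # 50 pps in one direction
    wire_bytes = (PAYLOAD + RTP + UDP + IPV4
                  + ETH_HEADER + ETH_FCS + ETH_PREAMBLE + ETH_IFG)
    total_pps = channels * pps_per_stream * 2  # both directions
    mbit_per_direction = channels * pps_per_stream * wire_bytes * 8 / 1_000_000
    return total_pps, mbit_per_direction


total_pps, mbit = rtp_load(10_000)
print(f"{total_pps:,} pps total, {mbit:.0f} Mbit/s in each direction")
```

Swap in a different payload size or packetization interval and the numbers shift, but with small packets the per-packet overhead dominates long before the codec bitrate does.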

Open source software is great - it provides us with the tools to (ultimately) build services and businesses.  Many of us choose what to focus on (our core competency).  At Star2Star we provide business grade communication services and we spend a lot of time and energy to build these services because it’s what we do.  We don’t sell, manufacture, or support testing platforms.

At this point some of you may be getting an idea...  Why don’t I build/design an open source testing solution?  It’s a good question and while I don’t want to crush your dreams there are some harsh realities:

1)  This gets insanely complicated, quickly.  Anyone who follows this blog knows SIP itself is complicated enough.
2)  Scaling becomes a concern (as noted above).
3)  Who would use it?

The last question is probably the most serious - who really needs the ability to initiate 10,000 SIP channels at 100 calls per second while monitoring RTP stream quality, etc?  SIP carriers?  SIP equipment manufacturers?  A few SIP software developers?  How large is the market?  What kind of investment would be required to even get the project off the ground?  What does the competition look like?  While I don’t have the answers to most of these questions I can answer the last one.

Commercial SIP testing equipment is available from a few vendors:


...and I’m sure others.  We evaluated a few of these solutions and I’ll be talking more about them in a follow-up post in the near future.
Stay tuned because this series is going to be good!
