Monday, December 22, 2008

Introducing Recqual

I've been waiting to talk about this one for a while.

Several months ago Star2Star was having problems with one of our upstream SIP carriers. We were starting to notice a large increase in the number of one way audio calls our customers were reporting.

When most people think of one way calls their first reaction is to blame SIP. Must be NAT! Must be a firewall! SIP sucks! Etc, etc.

I knew that wasn't the case. I just had to prove it.

I was convinced the problem wasn't SIP/UDP/IP related at all. We had multiple pcaps where we were sending RTP to the appropriate gateway. It just wasn't getting to the PSTN. Where was it going? When was this happening? Which gateways (out of hundreds) were the most problematic? We needed to know and we needed to know quickly.

I came up with and "wrote" recqual over a couple of days. After a few runs we were noticing patterns with problematic RTP endpoint IP addresses. Long story short, once these were identified we worked with the carrier to replace various bits of equipment (DSPs, line cards, etc). The one way audio problem has largely disappeared and we continue to run recqual. If this starts happening again we should know /BEFORE/ our customers do.
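To give a feel for the kind of tallying that surfaces these patterns, here is a minimal Python sketch. This is not the recqual code itself; the record layout (an RTP endpoint IP plus a pass/fail flag per test call), the function name, and the `min_calls` cutoff are all assumptions made up for the example.

```python
from collections import defaultdict


def rank_rtp_endpoints(results, min_calls=5):
    """Rank RTP endpoint IPs by the fraction of test calls with bad audio.

    `results` is an iterable of (rtp_ip, audio_ok) pairs, one per test call.
    This record format is hypothetical, chosen only for illustration.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for rtp_ip, audio_ok in results:
        totals[rtp_ip] += 1
        if not audio_ok:
            failures[rtp_ip] += 1

    # Skip endpoints with too few calls to say anything meaningful.
    ranked = [
        (ip, failures[ip] / totals[ip], totals[ip])
        for ip in totals
        if totals[ip] >= min_calls
    ]
    # Worst failure rate first; ties broken by IP for stable output.
    ranked.sort(key=lambda row: (-row[1], row[0]))
    return ranked
```

Feed it a few runs' worth of results and the gateways worth complaining to the carrier about should float to the top.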

Of course I'm using Asterisk to place the calls. The best part of using Asterisk is its multi-protocol flexibility. You should be able to test just about any combination of voice technologies - G.729a, G.711, GSM, SIP, IAX, PRI, FXS, FXO, gtalk/jabber/jingle, skype, etc. The possibilities boggle the mind.

I've just been too busy to get it together and release this to the community - until now.

Tarball with instructions here.

Questions? Comments? Suggestions? Drop me a line.

Sunday, December 21, 2008

Consulting Time Available

I haven't blogged in a while but there is some good news...

I have made some time available for consulting work!

You might not be as excited about it as I am but this is a good thing. I'm looking for interesting projects, people, and companies to work with.

If you or anyone you know might be interested please contact me. Resume, references, etc available on request.

I'll be offering bonus time, discounts, and a few other potential incentives to anyone that lets me blog about my projects and/or release any work under a liberal (read: FOSS) license.

Between my change in schedule and some (hopefully) fun new projects you can all expect to see much more frequent blogging soon!

Monday, November 17, 2008

SBCs are Killing SIP

Wow... Over a month since my last post! My how time flies.

No time to reminisce or catch up. I've got a rant that needs to get out - NOW.

SBCs (Session Border Controllers) are killing SIP. Breaking SIP. Smothering SIP. Especially when used by "carriers". Carriers and their SBCs I tells ya.

SBCs, technically, are pretty cool devices. While I certainly understand their purpose they tend to be overused, misconfigured, and misunderstood. Many entities deploy SBCs without any idea of the other components (I'm looking at you, proxies) that make up a well designed SIP network.

Why do I hate SBCs so much?

1) SIP is cool because it is end to end and designed with intelligent endpoints in mind (endpoints that can think for themselves).
2) SIP is very flexible, especially with regards to handling media.
3) Ubiquity.

SBCs (especially when misconfigured) break many of these features:

1) SBCs (by design) hide endpoints from one another. Both endpoints support G.722? The SBC doesn't and it's going to rewrite the SDP with its capabilities. Too bad.
2) SBCs (by design) handle media. While this can be good, often times it isn't, and there are other, less drastic ways to ensure quality of media.
3) When the only tool you have is a hammer, every problem starts to look like a nail.
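The G.722 complaint in the first point can be shown with a toy model. This is a sketch under simplifying assumptions: codecs are plain strings, negotiation is "first codec in the offer that the other side supports", and the SBC simply strips anything it can't handle itself. Real SBC behavior varies by vendor and configuration, and the function names are mine.

```python
def negotiate(offer, answerer_codecs):
    """Pick the first codec in the offer that the answerer also supports."""
    for codec in offer:
        if codec in answerer_codecs:
            return codec
    return None


def rewrite_offer(offer, sbc_codecs):
    """Model a media-handling SBC that drops codecs it can't handle itself."""
    return [codec for codec in offer if codec in sbc_codecs]


caller_offer = ["G722", "PCMU"]   # caller prefers wideband
callee_codecs = {"G722", "PCMU"}  # callee supports wideband too
sbc_codecs = {"PCMU", "G729"}     # hypothetical SBC with no G.722

direct = negotiate(caller_offer, callee_codecs)
via_sbc = negotiate(rewrite_offer(caller_offer, sbc_codecs), callee_codecs)
```

End to end, the two endpoints would settle on G.722. With the SBC in the middle, the callee never even sees G.722 in the offer and the call falls back to PCMU.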

My biggest concerns with SBCs relate to the last point. I swear, there are many providers, enterprises, etc that have deployed SIP in some capacity using ONLY SBCs and simple UACs and UASs. They've never heard of a proxy. Or a registrar. Heck, I'd even go for a signalling-only B2BUA and call it a compromise. Chances are they've never heard of that either.

I have dealt with several devices that break down, utterly fall apart when used with a proxy. I've covered it on this blog before. I'm just too mad to look up the link now. Again, $MANUFACTURER designs and markets a SIP device. They only test it against SBCs and they've (apparently) never heard of a proxy. Guess what happens...

Some poor soul like myself tries to deploy said device in what I consider to be a well designed SIP network. Unfortunately for me, this call path might not involve an SBC. Guess what happens? The device doesn't understand traversing proxies (Record-Route, Via, etc) and does something silly like parse the Contact header when trying to send a response. Call failure and all kinds of brokenness ensue.
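For anyone wondering what "the right thing" looks like: responses go back the way the request came, which means the topmost Via, not the Contact. Contact is where new requests within the dialog get sent. Here is a stripped-down Python sketch of that rule. It is a simplification that assumes bare Via header values (no "Via:" prefix) and ignores maddr, IPv6 literals, and transport selection; the function name is mine.

```python
def response_destination(via_headers):
    """Return the (host, port) a response should be sent to.

    `via_headers` is a list of Via header values, topmost first, e.g.
    "SIP/2.0/UDP host:port;branch=...". Simplified: honors `received`
    and `rport`, and defaults to port 5060 when none is given.
    """
    top = via_headers[0]
    # Split "SIP/2.0/UDP" from "host:port;param=value;..."
    _protocol, rest = top.split(None, 1)
    sent_by, *params = [part.strip() for part in rest.split(";")]
    host, _, port = sent_by.partition(":")
    port = int(port) if port else 5060

    for param in params:
        name, _, value = param.partition("=")
        name = name.lower()
        if name == "received" and value:
            host = value       # where the request actually came from
        elif name == "rport" and value:
            port = int(value)  # source port the request actually used
    return host, port
```

With a proxy in the path, the topmost Via is the proxy's, so the response goes back through the proxy. A device that sends the response to the Contact address skips the proxy entirely, and that's where the brokenness starts.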

So... I talk to $MANUFACTURER and get the standard "We've deployed this device thousands of times and never seen this problem before". Let's assume that's true. I don't know what's more depressing: the fact that they skipped over multiple sections of a basic SIP RFC like 3261 or the fact that no one noticed it for this long because (apparently) no one uses proxies anymore. Ugh. Gross.

It's not just device manufacturers. Carriers do this too. Often times the actual issue lies with their SBC. Many carriers (especially those using ACME SBCs, it seems) parse To: instead of the Request-URI. Probably because their customers are using SBCs too and Request-URI and To: match. Not so with a proxy. I don't blame the carrier's use of an SBC. This makes sense. That's what they were designed for. However, please test your device and configuration against something other than another SBC.

What happens if your Request-URI and To: don't match? They send a 404! Yet another RFC3261 violation. Section 8.2.2.1 allows for a UAS to route based off To (although it doesn't sound preferred). However, for the love of God, if you are going to deny a request because of the content of a To header, please send a 403 as specified in the RFC. Your 404s are confusing and ignorant. Was it really not found, or are you just routing based off To instead of the Request-URI? Once again I blame SBCs and a world where it's becoming common for SBCs to talk to each other (and nothing else).
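The distinction isn't hard to get right. Here is a minimal Python sketch of the decision, assuming URIs are compared as plain strings (real SIP URI comparison is more involved) and using a made-up `strict_to` flag to model a UAS that insists on recognizing the To header:

```python
def choose_status(request_uri, to_uri, local_uris, strict_to=False):
    """Pick a status code for an incoming request (heavily simplified).

    `local_uris` is the set of URIs this UAS answers for. `strict_to`
    is a hypothetical switch for a UAS that also checks the To header.
    """
    if request_uri not in local_uris:
        return 404  # Not Found: the Request-URI really isn't here
    if strict_to and to_uri not in local_uris:
        return 403  # Forbidden: rejected because of the To header
    return 200      # otherwise, accept the request
```

A request that has been through a proxy will often have a Request-URI that differs from To. With `strict_to` off it gets a 200. With it on, the sender at least gets a 403 and knows the user wasn't simply missing.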

This is yet another situation where assumptions are made based on the behavior of SBCs. It's bad. Please stop.

Thursday, October 9, 2008

Submit Your SIP

Ever since I've started blogging and talking about SIP people have come out of the woodwork with SIP interop problems.

After giving a talk about SIP at Astricon 2008 I received several e-mails from audience members with specific SIP issues. I LOVE getting these e-mails.

Why? I love working on SIP issues. With all of the devices using SIP there is no shortage of interop problems. Just today a guy on the Asterisk mailing list had a problem with his Cisco AS5300 and Asterisk 1.2. Usually that wouldn't be a problem at all - many people (including myself) use this combination of hardware with great success.

Why was he having problems? His AS5300 was configured for GTD and Asterisk 1.2 (apparently) doesn't handle multipart SIP bodies very well. I was able to find a patch to Asterisk 1.4 to improve multipart body parsing. That was a fun one.

I got to thinking... There should be a place where people can exchange specific SIP interop tips and notes. Otherwise how are we supposed to get anything to work!?!?

I came up with such a place and it's called SubmitYourSip.com. I've started to fill it in a little but hopefully (with time) it will become somewhat of a SIP wiki (with a focus on interop, of course).

I'm just getting started on it but I'll be working on my MediaWiki syntax and going back through my e-mail to dig out some of these examples.

Friday, September 19, 2008

A preview: performance tests

I'm headed out the door for some sushi but I thought I'd drop in to give you an idea of what I'm working on for my next blog post. I'm hungry so let's keep this short and sweet: receive interrupt mitigation and its effects on Linux media applications.

In general I'm a big fan of receive interrupt mitigation. I'll trade some delay for a substantial decrease in system CPU time spent servicing interrupts, resulting in the ability to handle more calls. I didn't just come to this one day; I've done some tests in the past to verify this.
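Some back-of-the-envelope numbers show why this matters for media. The Python sketch below assumes 20 ms packetization (50 RTP packets per second in each direction per call), one receive interrupt per packet when there is no mitigation, and a hypothetical fixed number of packets handled per interrupt when there is. Real NICs and drivers are messier than that, so treat it as an estimate and not a measurement.

```python
def rx_interrupts_per_second(calls, ptime_ms=20, packets_per_interrupt=1):
    """Estimate receive interrupts per second for a given number of calls.

    Assumes one inbound RTP stream per call and a fixed, hypothetical
    number of packets serviced per interrupt.
    """
    packets_per_second = calls * (1000 / ptime_ms)
    return packets_per_second / packets_per_interrupt


unmitigated = rx_interrupts_per_second(1000)
mitigated = rx_interrupts_per_second(1000, packets_per_interrupt=8)
```

At 1000 calls that is 50,000 interrupts per second unmitigated versus 6,250 if eight packets are serviced per interrupt. The price is that a packet may sit in the receive queue a little longer before anyone looks at it.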

However, I've never done a large scale test on regular, server class hardware. Usually just Asterisk on an embedded system. It usually works out well. This is why NAPI is enabled by default in AstLinux for all ethernet adapters that support it in Linux.

The folks at TransNexus spend a fair amount of time testing OpenSER/Kamailio/OpenSIPS performance. Today Jim Dalton posted the results of another test to the Kamailio User's mailing list. I replied to his post with so many questions I figured it might be time for me to lab this up myself and test my theories about interrupt handling (in Linux, specifically).

If those brightly colored rolls of fish weren't so distracting and delicious I'd promise to think about all of this over dinner. Unfortunately it will have to wait until tomorrow...

Friday, September 12, 2008

Distro Wars!

I don't like to get into Distro Wars... Nothing is more pathetic than a bunch of FOSS geeks sitting around getting religious about:

- Distros
- Editors
- Star Wars

Ok, ok I started to embrace the stereotype a little towards the end there but if you've ever seen one of these epic battles with your own eyes you just might relate it to Star Wars too.

There are always far more serious things going on in the world and these software hippies sit around and argue about what software to edit ASCII with. Ridiculous.

However, even I will throw down when people bring up the worst idea for a distro ever:

Fedora

There, I said it. I've officially fired my first shot in a war that has been raging in the Linux community since, oh, 1991 or so.

What's wrong with Fedora, you ask? Of all of the other hundreds of distros, why would I single out Fedora and waste my time writing about it? I'll tell you why:

Much like the iPhone, it's a joke.

I CRINGE. ABSOLUTELY CRINGE when I see someone trying to do something serious with Fedora. At this very minute (3am), I am typing a blog post when I should be sleeping. Why, you ask? Because I just saw a post on Asterisk-Users with yet another poor soul trying to bring up Asterisk on a Fedora system (Fedora 9). I couldn't possibly sleep knowing the abomination that is Fedora+Asterisk continues.

This is a perfect example to illustrate why Fedora is such a joke. One of the fundamental principles of Linux is reliability. One of the fundamental principles of telephony is reliability. By installing Asterisk on Fedora you are flying in the face of 18 years of Linux and 100+ years of telephony.

Fedora is BLEEDING EDGE. It serves as a test bed for Red Hat's next REAL Linux release. How do you feel being a tester for what is sold as a commercial product by a profitable company? Bugs in Fedora are found and fixed quickly...

For six months

You can install Fedora and deal with the usual issues in beta software. Once things finally settle down (after six months or so), you get to upgrade to the next release and start all over again!

Sounds like a great plan for a server or PBX, right?

Hosting companies sell packages based on Fedora. Shame on them. People install PBX systems with Fedora. Double shame on them. The average life span for a PBX is seven years. This means that your Fedora Asterisk system is supported with updates for 7% of the typical PBX lifespan (six months out of seven years).

CentOS/RHEL/Ubuntu LTS and several other Novell/etc offerings are supported for five years or more.

Granted, Fedora has its place. It makes a nice toy, much like the iPhone. If you want to play with cutting edge Linux, Fedora is for you and it might work on a test system, desktop, laptop, etc.

But please. Please. PLEASE do not install it on a server and don't even think of using it for Asterisk. ANYTHING else will do. Seriously, I don't have a problem with any other distro. Pick one.

Monday, August 25, 2008

Free cell phones

Just in case you thought I was selling out to the US wireless industry with my previous post, check this out:

After looking over some phones on the Nokia website, I thought of all of those "free" phones carriers like to give away.

Here's one for comparison. The Nokia 6085 is offered in the US by AT&T wireless. Their website says it retails for $189.99. My gosh! Oh but don't you worry, as long as you sign up for two years of service (and do so online) we'll discount our $190 phone to $39.99 and then give you a $39.99 discount (you're buying online, remember). OMG! Free phone! See how that works?

Funny enough, Nokia offers the phone (just the phone) on their US website for $118. That means that right off the bat, AT&T wireless is jacking the price of the phone up about $72 just for the pleasure of buying it from them.

Some of you might say "Hey, a $72 markup isn't that bad". Yeah right. AT&T wireless is NOT paying $118 for that phone. I wonder how many of them they sell and what kind of special pricing Nokia gives them. Probably not even close. Probably not even half that. I bet AT&T still makes money at the $39.99 price. I also wonder how many they sell at $189.99...

Here's the catch. They are going to give you the same contract and charge you the same price whether you get the "free" phone or bring your own. That's what sucks about the wireless industry in the US. Unlike the rest of the world, Americans can't be bothered to buy their own phone and bring it to the carrier for service. Maybe it's because we've got some different wireless standards and there could be confusion (iDEN, CDMA, GSM, etc). More than likely it's because the wireless carriers know they can make out like bandits otherwise.

The coupling of the phone to the service is inherently wrong and evil. Some carriers (Verizon Wireless) don't even activate phones whose ESNs they don't have in a database. All to "protect the network". Yeah right. If your network is going to go down because I want to use a CDMA cell phone I bought from Sprint a year ago you've got some serious network issues...

At least there is some light at the end of the tunnel. Soon enough cell phone carriers will have to pro-rate early termination fees. It's tough to lock someone into a contract with a $200 early termination fee to cover the cost of the $40 phone you "gave away" almost two years ago.