Wednesday, January 7, 2009

Heads up!

UPDATE: Any developments on this and other SIP/RTP issues can be found here.

Some serious issues for those of you in SIP land:

There is a pretty serious RTP problem with Sonus equipment that has been making the rounds...

Simply put, Sonus equipment will not accept two RTP packets with the same timestamp, even if the sequence number has been properly incremented. According to various RFCs (namely 1889 and 2833) this is perfectly valid and in some cases (like video) desired.

A few slight problems... Many implementations (including Asterisk AND FreeSWITCH) will (or did; more on this later) send out RFC 2833 DTMF events with the same timestamp as the last voice RTP packet. This is perfectly valid according to the RFCs mentioned above.

It appears (after my own testing) that Sonus will actually drop BOTH the voice RTP packet and the event packet. After some testing against Sonus gear it was pretty clear that no audio was being passed for as long as the DTMF event lasted. This makes sense: per RFC 2833, every packet resent for a variable-length DTMF event must keep the same timestamp, increment the sequence number, and increase the duration. DO NOT change the timestamp. Oh Sonus.
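To make the RFC 2833 behavior concrete, here is a rough Python sketch of the packets a sender emits for one long key press. The function names, the payload type of 101 (a common choice, but really negotiated in SDP), and the 160-sample step (20 ms at 8000 Hz) are my own assumptions for illustration. This is not code from Asterisk or FreeSWITCH, and it skips the retransmission of the final packet that real senders do.

```python
import struct


def rtp_header(payload_type, seq, timestamp, ssrc, marker=False):
    """Pack a minimal 12-byte RTP header (version 2, no padding, extension, or CSRCs)."""
    byte0 = 2 << 6  # version = 2
    byte1 = (0x80 if marker else 0) | (payload_type & 0x7F)
    return struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                       timestamp & 0xFFFFFFFF, ssrc)


def dtmf_payload(event, duration, end=False, volume=10):
    """Pack the 4-byte telephone-event payload: event, E bit + volume, duration."""
    flags = (0x80 if end else 0) | (volume & 0x3F)
    return struct.pack("!BBH", event, flags, duration & 0xFFFF)


def dtmf_event_packets(event, start_seq, timestamp, ssrc,
                       updates=3, step=160, payload_type=101):
    """Build the packets for one DTMF event.

    Every packet carries the SAME timestamp (the start of the event).
    Only the sequence number and the duration grow.
    """
    packets = []
    for i in range(updates):
        header = rtp_header(payload_type, start_seq + i, timestamp, ssrc,
                            marker=(i == 0))       # marker flags the first packet
        body = dtmf_payload(event, step * (i + 1),
                            end=(i == updates - 1))  # E bit on the last packet
        packets.append(header + body)
    return packets
```

Calling `dtmf_event_packets(5, 100, 8000, 0x1234)` gives three 16-byte packets with sequence numbers 100 through 102, all stamped 8000. A receiver that refuses repeated timestamps would trip over every one of them after the first.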

Both Asterisk and FreeSWITCH have implemented workarounds to address this. They are similar but there is one key difference. Asterisk now (as of SVN 12/15/2008 or so) will always use a unique timestamp for every RTP packet. I guess that solves that problem. FreeSWITCH is slightly smarter about it (as of SVN about the same time, interestingly enough) but I'm worried...
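I haven't traced exactly how Asterisk does it, so treat the following as a sketch of the general "never repeat a timestamp" idea rather than Asterisk's actual code. The class name and the choice to nudge a repeated timestamp forward by one tick are assumptions of mine.

```python
class UniqueTimestamper:
    """Hypothetical helper: guarantee each outgoing RTP timestamp differs
    from, and is ahead of, the previous one (modulo 2**32)."""

    MASK = 0xFFFFFFFF

    def __init__(self):
        self.last = None  # last timestamp actually sent

    def next(self, proposed):
        proposed &= self.MASK
        if self.last is not None:
            # Wrap-aware comparison: a difference of 0 means "repeat", and a
            # difference in the upper half of the range means "behind".
            diff = (proposed - self.last) & self.MASK
            if diff == 0 or diff >= 0x80000000:
                proposed = (self.last + 1) & self.MASK
        self.last = proposed
        return proposed
```

Feeding it 8000 three times and then 8160 yields 8000, 8001, 8002, 8160. The event packets stop sharing a timestamp, at the cost of no longer being strictly what RFC 2833 describes.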

FreeSWITCH will parse the SDP to find the originator line (o=). If it is equal to "Sonus_UAC" FreeSWITCH activates a specific workaround to always send RTP packets with different timestamps. This seems more elegant but I am worried they will have to expand this hack for other equipment in the future (requiring a code change and recompile).
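Roughly, the check described above could look like this in Python. This is not FreeSWITCH's code: the function names are made up, and I'm assuming the comparison is against the username field (the first token) of the o= line. Passing the set of names in as a parameter is my own twist, meant to show how the list could grow without touching the code.

```python
def sdp_origin_username(sdp):
    """Return the username field of the SDP o= line, or None if there isn't one."""
    for line in sdp.splitlines():
        if line.startswith("o="):
            fields = line[2:].split()
            return fields[0] if fields else None
    return None


def needs_unique_timestamps(sdp, quirky_origins=frozenset({"Sonus_UAC"})):
    """True if the far end identifies itself as gear that rejects repeated
    RTP timestamps. quirky_origins is a hypothetical, configurable list."""
    return sdp_origin_username(sdp) in quirky_origins
```

For an offer containing `o=Sonus_UAC 12345 67890 IN IP4 192.0.2.10` (made-up values), `needs_unique_timestamps` returns True and the sender would switch to unique timestamps for that call only.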

One could argue that Sonus has gotten this far with their current implementation, so its behavior is what people have come to expect. While it is valid (per the RFCs) to use the same timestamp, it is more /compatible/ to always use different timestamps. That appears to be what most equipment does.

This issue is what (apparently) caused so many problems for Teliax a while back when they switched from Asterisk to FreeSWITCH. At least that's what I heard. What doesn't make any sense is that Asterisk had the same behavior as FreeSWITCH: they both sent voice and event RTP packets with identical timestamps.

Also, one would like to think that when you provide voice services (which are pretty important to your customers) you would *test* something like DTMF when you were completely switching platforms. I discovered these issues while testing Star2Star with Level(3), for example. I'm glad I was paying attention. Our customers would have been upset with broken DTMF while we updated all of our Asterisk machines (several hundred).

I'm surprised no one noticed this until mid-December or so. It will be interesting to see what other things pop out of this mess...