A colleague came to me today with a troubling issue. He's using sipsak and Nagios to monitor some SIP endpoints. Pretty standard so far, right? He noticed that when using UDP and checking an endpoint that was completely offline, sipsak would take over 30 seconds to finally return with an error. Meanwhile Nagios would block and wait for sipsak to return...
Without a simple command line option in sipsak that appeared to change this behavior, we had to enter the semi-complicated world of SIP timers. I feared that to change this behavior we'd have to do some things that might not necessarily be RFC compliant...
What's this? For once I'm actually suggesting you do something against the better advice of an RFC?
That's right, I am.
RFC3261 defines multiple timers and timeouts for messages and transactions. It says things like:
"If there is no final response for the original request in 64*T1 seconds"
"The UAC core considers the INVITE transaction completed 64*T1 seconds after the reception of the first 2xx response."
"The 2xx response is passed to the transport with an interval that starts at T1 seconds and doubles for each retransmission until it reaches T2 seconds"
Without even knowing what "T1" is you can start to see that it's a pretty important timing parameter and (more or less) serves as the father of all timeouts in SIP. Let's look at section 17 to find out what T1 is:
"The default value for T1 is 500 ms. T1 is an estimate of the RTT between the client and server transactions. Elements MAY (though it is NOT RECOMMENDED) use smaller values of T1 within closed, private networks that do not permit general Internet connection. T1 MAY be chosen larger, and this is RECOMMENDED if it is known in advance (such as on high latency access links) that the RTT is larger. Whatever the value of T1, the exponential backoffs on retransmissions described in this section MUST be used."
T1 is essentially a variable for RTT between two endpoints that serves as a multiplier for other timeouts. Unless we know better T1 should default to 500ms, which is quite high. Some implementations (such as Asterisk with the SIP peer qualify option) automatically send OPTIONS requests to endpoints in an effort to better determine RTT instead of using the RFC default of 500ms.
In reading through the sipsak source code it appeared to be RFC compliant for timing, using a default T1 value of 500ms and a transaction timeout value of 64*T1. This is why it was taking over 30 seconds (32 seconds to be exact) for sipsak to finally time out and return the status code to Nagios. This comes directly from the RFC:
"For any transport, the client transaction MUST start timer B with a value of 64*T1 seconds (Timer B controls transaction timeouts)."
This is all well and good but what happens when you don't have a way to dynamically determine T1 and you can't wait T1*64 (32s) for your results like my sipsak/nagios check earlier? Simple: you go renegade, throw out the RFC, and hack the sipsak source yourself!
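To put numbers on that, here is a minimal Python sketch (not sipsak code) of the UDP retransmission schedule RFC 3261 describes: the first retransmit fires after T1, the interval doubles each time, and the transaction gives up at 64*T1. The function name and the optional `t2_ms` cap are hypothetical; passing `t2_ms=4000` models the non-INVITE case, where the interval stops doubling once it reaches T2.

```python
def retransmit_schedule(t1_ms=500, t2_ms=None):
    """Return (retransmit_times_ms, timeout_ms) for a request sent over UDP.

    The first retransmit fires T1 after the request, each later interval
    doubles, and the transaction times out at 64*T1. If t2_ms is given,
    the interval is capped at T2 (the non-INVITE behaviour).
    """
    timeout_ms = 64 * t1_ms
    interval = t1_ms
    fire_at = interval
    times = []
    while fire_at < timeout_ms:
        times.append(fire_at)
        interval *= 2
        if t2_ms is not None:
            interval = min(interval, t2_ms)
        fire_at += interval
    return times, timeout_ms


if __name__ == "__main__":
    print(retransmit_schedule(500))   # RFC default T1
    print(retransmit_schedule(150))   # a smaller T1
```

With the default T1 of 500ms the timeout works out to 32000ms, which is the 32 seconds sipsak was taking. With T1 at 150ms it drops to 9600ms.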
So I had three options:
1) Change the default value of T1.
2) Change the transaction timeout (the 64*T1 value) by changing the multiplier or setting a static timeout.
3) Some combination of both.
I decided to go with option #3 (RFC be damned). Why?
1) 500ms is crazy high for most of our endpoints. At a glance 100ms would be fine for ~90% of them. I'll pick 150ms.
2) I don't need that many retransmits. If the latency and/or packet loss is that bad I'm not going to wait (my RTP certainly isn't) and I just want to know about it that much quicker.
So I ended up with a quick easy patch to sipsak:
diff -urN sipsak-0.9.6.orig/sipsak.h sipsak-0.9.6/sipsak.h
--- sipsak-0.9.6.orig/sipsak.h 2006-01-28 16:11:50.000000000 -0500
+++ sipsak-0.9.6/sipsak.h 2010-10-26 18:38:45.000000000 -0400
@@ -102,11 +102,7 @@
# define FQDN_SIZE 100
#endif
-#ifdef HAVE_CONFIG_H
-# define SIP_T1 DEFAULT_TIMEOUT
-#else
-# define SIP_T1 500
-#endif
+#define SIP_T1 150
#define SIP_T2 8*SIP_T1
diff -urN sipsak-0.9.6.orig/transport.c sipsak-0.9.6/transport.c
--- sipsak-0.9.6.orig/transport.c 2006-01-28 16:11:34.000000000 -0500
+++ sipsak-0.9.6/transport.c 2010-10-26 18:38:51.000000000 -0400
@@ -286,7 +286,7 @@
}
}
senddiff = deltaT(&(srt->starttime), &(srt->recvtime));
- if (senddiff > (float)64 * (float)SIP_T1) {
+ if (senddiff > inv_final) {
if (timing == 0) {
if (verbose>0)
printf("*** giving up, no final response after %.3f ms\n", senddiff);
This changes the value of T1 to 150ms (more reasonable for most networks) and makes sipsak honor the timeout multiplier given with -D on the command line, which in turn limits the number of retransmits and the total timeout:
kkmac:sipsak-0.9.6-build kris$ ./sipsak -p 10.16.0.3 -s sip:ext_callqual@asterisk -D1 -v
** timeout after 150 ms**
*** giving up, no final response after 150.334 ms
kkmac:sipsak-0.9.6-build kris$ ./sipsak -p 10.16.0.3 -s sip:ext_callqual@asterisk -D2 -v
** timeout after 150 ms**
** timeout after 300 ms**
*** giving up, no final response after 460.612 ms
kkmac:sipsak-0.9.6-build kris$ ./sipsak -p 10.16.0.3 -s sip:ext_callqual@asterisk -D4 -v
** timeout after 150 ms**
** timeout after 300 ms**
** timeout after 600 ms**
*** giving up, no final response after 1071.137 ms
kkmac:sipsak-0.9.6-build kris$
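The three runs above fit a simple model, sketched here in Python. It assumes that -D multiplies T1 to produce inv_final, that each wait doubles up to SIP_T2 (8*SIP_T1 in sipsak.h), and that the give-up check runs after every timeout. This is a reading of the output, not a port of transport.c, and `give_up_after_ms` is a made-up name.

```python
def give_up_after_ms(multiplier, t1_ms=150):
    """Nominal elapsed time at which the patched check first gives up.

    Assumptions (inferred from the verbose output, not from the source):
    inv_final = multiplier * T1, waits double and are capped at 8 * T1,
    and the check runs after each timeout.
    """
    inv_final = multiplier * t1_ms
    t2_ms = 8 * t1_ms          # mirrors "#define SIP_T2 8*SIP_T1"
    wait = t1_ms
    elapsed = 0
    while True:
        elapsed += wait
        # The C code uses a strict ">" on measured time, which always lands
        # slightly above the nominal value, so ">=" is used on nominal values.
        if elapsed >= inv_final:
            return elapsed
        wait = min(2 * wait, t2_ms)


if __name__ == "__main__":
    for d in (1, 2, 4):
        print(d, give_up_after_ms(d))
```

The measured values (150.334, 460.612 and 1071.137 ms) sit just above the nominal 150, 450 and 1050 ms this model produces.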
Needless to say our monitoring situation is much improved.
I created AstLinux but I write and rant about a lot of other things here. Mostly rants about SIP and the other various technologies I deal with on a daily basis.
Tuesday, October 26, 2010
Friday, May 21, 2010
A ClueCon Preview...
A while back I saw a preview for the new A-Team movie. While the movie itself looks horrible I was reminded of the original TV series with its many interesting characters and catch phrases. Among my personal favorites?
I love it when a plan comes together.
That's exactly how I feel with one of my "pet projects" from the past couple of months. Much like Hannibal and the A-Team I was up against formidable issues in trying to accomplish my task: implementing a flexible (very flexible), reasonably high performance LCR server that could be added to my existing architecture.
First I needed to select an LCR "engine". Multiple possibilities were considered but I left the final recommendation up to the DB and billing teams I work with. They selected mod_lcr from FreeSWITCH. While I was certain droute from OpenSIPS (or something similar) would have higher performance I accepted their recommendation. After playing with mod_lcr a bit I can also see its potential.
So now the question was: can FreeSWITCH respond with the proper SIP signaling (300 Multiple Choices)? Using the redirect application from mod_dptools it could not. I created a bounty to add multiple Contact/300 Multiple Choices functionality to FreeSWITCH. Tony had it implemented that day.
With the ability to respond properly I now had to get the data. Mod_lcr looked nice but it certainly wasn't designed for this application. All of the default syntax, tables, etc showed it being used with FreeSWITCH for FreeSWITCH. The tables and code used several bridge specific syntax examples. I hacked mod_lcr to return data to mod_dptools/redirect properly. I created a JIRA issue with my patch and a couple of days later Rupa had it committed.
So now FreeSWITCH could be a route server. All I needed to do was make sure OpenSIPS could route from what FreeSWITCH returned. Turns out it could not. RFC 3261 (section 21.3.1) states "...the SIP response MAY contain several Contact fields or a list of addresses in a Contact field." The Sofia stack from FreeSWITCH used multiple Contact headers, each with its own URI. OpenSIPS would only parse the first one returned. Sofia couldn't be changed easily so OpenSIPS would need to be changed (it was non-compliant anyway). Without this change there is no ability to handle multiple contacts and only the first would be used. It could be worse but obviously this wasn't good enough.
I contacted Bogdan from OpenSIPS to see what it would take to update the parser to handle multiple Contact headers. He indicated it would take four hours or so. Once he got back to me I had an OpenSIPS system that would handle multiple contact headers and create new branches from a failure route as desired.
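For illustration, the two forms the RFC allows (several Contact headers, or one header carrying a list) can be normalized into a single list with something like the Python sketch below. It is not the OpenSIPS parser: the function names and example hosts are hypothetical, and it ignores header folding and escaped quotes.

```python
def split_contacts(value):
    """Split one Contact header value on commas outside quotes and <...>."""
    parts, buf = [], []
    in_quotes = in_angle = False
    for ch in value:
        if ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes and ch == "<":
            in_angle = True
        elif not in_quotes and ch == ">":
            in_angle = False
        if ch == "," and not in_quotes and not in_angle:
            parts.append("".join(buf).strip())
            buf = []
        else:
            buf.append(ch)
    tail = "".join(buf).strip()
    if tail:
        parts.append(tail)
    return parts


def contacts_from_response(raw):
    """Collect every contact from a raw SIP response, in order of appearance."""
    contacts = []
    for line in raw.split("\r\n"):
        if not line:               # blank line ends the headers
            break
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() in ("contact", "m"):  # "m" is the compact form
            contacts.extend(split_contacts(value))
    return contacts


MULTI = ("SIP/2.0 300 Multiple Choices\r\n"
         "Contact: <sip:1234@gw1.example.com>;q=0.9\r\n"
         "Contact: <sip:1234@gw2.example.com>;q=0.5\r\n\r\n")
LIST = ("SIP/2.0 300 Multiple Choices\r\n"
        "Contact: <sip:1234@gw1.example.com>;q=0.9, <sip:1234@gw2.example.com>;q=0.5\r\n\r\n")

if __name__ == "__main__":
    print(contacts_from_response(MULTI))
    print(contacts_from_response(LIST))
```

Both inputs yield the same two contacts, which is the behavior a proxy needs before it can create one branch per route.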
So how did it all turn out? Well, you have two ways to hear the end of this story:
1) Attend ClueCon at the Trump Hotel in Chicago, IL in early August.
2) Wait until mid-August for an update here.
I'll make sure to post all of my materials - conference presentation, sipp scenarios for testing, OpenSIPS configuration, FreeSWITCH configuration, DB tweaks, etc.
Too late to make it to ClueCon this year? Just make sure to register next year, I'm sure I'll be there.
Wednesday, May 19, 2010
I've said it before but I'll say it again...
FreeSWITCH rocks!
Earlier today I wanted to play with the possibility of using FreeSWITCH as a route/LCR server for another platform. FreeSWITCH already has mod_lcr and redirect. Using these two features FreeSWITCH could be made to respond with a 302 and a single SIP URI in the Contact field.
I wanted more. I wanted a way to respond with multiple routes.
The standard way to do this (using SIP, of course) is to respond to incoming INVITEs with a 300 Multiple Choices. This response should contain a Contact header (or multiple Contact headers) with a list of SIP URIs (along with optional q values, etc) for the original system to route the call to.
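As a rough Python sketch of what such a response carries, the snippet below builds the status line and one Contact header per route, with q values descending so the first route is preferred. The function name, the example URIs and the 0.1 step between q values are all assumptions for illustration; a real response also needs the Via, From, To, Call-ID and CSeq headers copied from the INVITE.

```python
def multiple_choices_contacts(uris):
    """Return the status line plus one Contact header per route URI.

    Routes are assumed to arrive in preference order; q starts at 1.0 and
    drops by 0.1 per route (an arbitrary choice for this sketch).
    """
    lines = ["SIP/2.0 300 Multiple Choices"]
    for i, uri in enumerate(uris):
        q = max(0.0, 1.0 - 0.1 * i)
        lines.append("Contact: <%s>;q=%.1f" % (uri, q))
    return lines


if __name__ == "__main__":
    routes = ["sip:1234@gw1.example.com", "sip:1234@gw2.example.com"]
    print("\r\n".join(multiple_choices_contacts(routes)))
```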
As usual I wrote the FreeSWITCH-Users mailing list to make sure this functionality didn't already exist somewhere. It did not and it was suggested I create a bounty.
Creating a bounty is always tough... I don't deal with the source code of FreeSWITCH all that often. I don't know how much work this is going to take. I don't know how much C programmers make. So I did my best to come up with something that seemed fair: $250.
Less than two hours later the feature was coded, committed to FreeSWITCH, tested by me, and paid for.
Once again, Open Source for the win!
Wednesday, May 12, 2010
Another SIP gotcha: Cisco
Another quick and dirty SIP interop post.
A while back I was tasked to interface a FreeSWITCH server and a Cisco Unified Communications Manager system. Once the SIP trunk was configured on the Call Manager/CUCM side they sent an INVITE over. It didn't have an SDP.
It appeared that we needed to enable 3pcc (third party call control) in FreeSWITCH. No problem. I enabled 3pcc and interop continued.
Problems arose, however, when we needed to send the Cisco ringback. Whether it be a 180 or 183 (with or without SDP for either) this was going to be tough because with 3pcc enabled the dialog looked like so:
<-- Cisco
--> FreeSWITCH
INVITE (without SDP) <--
100 Trying -->
200 OK (with SDP) -->
ACK (with SDP) <--
So... There was no opportunity to signal progress as long as we 200 OKd the call almost immediately. Sure I probably could generate some ringback after the 200 but that would just be wrong!
As I like to say, the internet to the rescue. Not having much experience with CUCM I thought I'd ask on VoiceOps. Within a few minutes a very nice gentleman by the name of Mark Holloway mentioned "Media Termination Point Required" as a CUCM configuration option. These were the magic words. After some research it turned out that was the configuration option I needed*. Thanks Mark!
Once "Media Termination Point Required" was enabled on the Cisco side I disabled 3pcc in FreeSWITCH and all was good. Users even get ringback now!
I also brought the issue up on the FreeSWITCH-Users mailing list and found out this has been bothering people for some time. MC from FreeSWITCH was even nice enough to start a wiki page for me to document all of this there.
Sometimes with SIP it's all about the SIMPLE achievements ;).
* That research also brought up another possibility: enabling PRACK/100rel on the CallManager side instead of "MTP Required". Of course the trouble with PRACK is there are a lot of SIP implementations (Asterisk) that don't support it. FreeSWITCH does but can crash. Many SIP implementations don't support the default CUCM configuration (INVITE w/o SDP). I was looking for the most canonical, compatible configuration possible.
Friday, February 5, 2010
(High Quality) VoIP on the iPhone
(Regular readers will note that my excessive use of parentheses has now spilled into my titles and first sentences)!
Ahhh Apple... Ahh the iPhone. Regardless of how you feel about this company or their product you can't doubt the market impact they've made over the last couple of years (decades perhaps?). Multitouch (NOT multitasking). App Store. iTunes. There are countless other blogs that discuss these topics so I don't need to. As usual I'm here to talk about VoIP.
For the last ten months or so I've been involved (part time) in another local venture. Voalte (pronounced volt) is a startup here in Sarasota, FL founded by Trey Lauderdale. When the Apple iPhone was announced Trey was working in sales for Emergin, a healthcare IT middleware provider. Trey noticed how incredibly arcane the mobile devices used in healthcare are when compared to this new device from Apple. Once the iPhone SDK was announced Trey knew he had to develop an iPhone application for healthcare. This application became Voalte One.
Voalte One is an iPhone application that provides voice, alarms, and text for healthcare point of care providers (that's nurses to you and me). The complete Voalte One solution is comprised of the following parts:
- iPhone
- Voalte Server (XMPP, LDAP, etc)
- Voalte Voice Server (FreeSWITCH using SIP + event socket)
- An overall excellent customer/user experience (also new to healthcare)
Text messaging and alarm integration are cool but as I've already said, I do VoIP. If you'd like to know more about iPhone development, XMPP, LDAP, etc let me know and I can point you in the right direction.
VoIP in our application is interesting. It's a softphone, technically, but unlike any you've ever seen before. As everyone knows the iPhone cannot run multiple applications. It can't background applications. These are just two of the many challenges introduced when developing a user-friendly, always available, reliable non-GSM phone experience for the iPhone. Simply downloading an off the shelf softphone and installing it on the iPhone is not enough.
We're a startup and we get to do cool things. For example, one of the big differences between VoIP/voice with Voalte One on the iPhone is the voice quality. We use G.722 wideband at 16kHz as our standard voice codec. Why? Because one Saturday (after a long night out) Trey and I were having lunch. I asked him if he thought we should set ourselves apart on something as basic as sample rate. After a little explanation on my part we quickly decided - why not?
As cool as G.722 is it introduces some interesting challenges:
- The iPhone. How are we going to get 16kHz audio from the hardware?
- PJSIP (our SIP stack). Does it support G.722? How does it interface with the audio hardware?
- Hospital PBXs. Voalte One interfaces with the hospital PBX as an ordinary extension. Most of them probably don't support G.722. How/where do we resample to the standard 8kHz used in G.711?
After looking through PJSIP and the available audio drivers for the iPhone we decided we needed to write our own. There were legal and technical reasons and I'm glad we did it. Especially because I didn't have to do most of the work! ;) We also confirmed PJSIP supports G.722.
Voalte has an amazing iPhone developer - Robbie Hanson. Robbie, Ben (Voalte CTO), and I were able to look over the available audio frameworks on the iPhone and pick the best. Not only is it the best overall (it supports echo cancellation, etc), it would also provide us the sampling rate of 16kHz we knew we needed.
After working with PJSIP and AudioUnit for a while Robbie was able to write an iPhone audio driver (using AudioUnit, of course) for PJSIP. While working on the audio driver Robbie (along with another contributor) also wrote an Objective C wrapper for PJSIP. These are the raw ingredients of a high quality VoIP experience on the iPhone.
In the months leading up to release we had to deal with a plethora of other issues: push notifications, local ringback, wifi, etc, etc. I won't (and probably can't) describe these issues in detail.
The good news is Voalte has done the right thing and released the core components of this solution as open source.
I'm proud to work with companies that "get it" and are willing to actively participate in the free software ecosystem.
Wednesday, January 20, 2010
Testing with SIPP
A quick one, I promise...
I'd been having some issues testing Asterisk with sipp. It turns out there is a fairly well known issue with sipp when using five digit port numbers for RTP. A quick Google search found a solution pretty quickly.
Just in case that link ever goes dead, here's the diff:
diff -urb sipp.svn_orig/call.cpp sipp.svn_fixed/call.cpp
--- sipp.svn_orig/call.cpp 2008-12-19 13:14:51.000000000 +0300
+++ sipp.svn_fixed/call.cpp 2008-12-19 13:16:34.000000000 +0300
@@ -192,7 +192,7 @@
/* m=audio not found */
return 0;
}
- begin += strlen(pattern) - 1;
+ begin += strlen(pattern);
end = strstr(begin, "\r\n");
if (!end)
ERROR("get_remote_port_media: no CRLF found");
More on sipp later!
Wednesday, March 25, 2009
CANCEL
I don't have a tremendous amount of time so this is going to be a short one.
There is a CANCEL related bug in Asterisk 1.4.23 versions. To be honest I'm not sure when it was introduced but I know when it was fixed:
http://bugs.digium.com/view.php?id=14431
This caused trouble for me because I often use Asterisk in tandem with OpenSIPS and FreeSWITCH. Both platforms were unable to match the CANCEL sent by Asterisk to the original INVITE. As the bug note says, this is because the CANCEL sent by Asterisk had a different branch parameter than the original INVITE. OpenSER/OpenSIPS would fail when checking t_check_trans() as long as method==CANCEL.
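Roughly, the matching that failed looks like this in Python. Per RFC 3261 a CANCEL has to carry the same top Via branch as the INVITE it cancels, and the server finds the transaction by comparing that branch and the sent-by value. The helper names are hypothetical, the Via values are invented, and real stacks do considerably more than this.

```python
def top_via_key(via_value):
    """Extract (branch, sent-by) from a top Via header value.

    Example input: "SIP/2.0/UDP 10.0.0.1:5060;branch=z9hG4bK-abc123;rport"
    """
    sent_part, _, params = via_value.partition(";")
    sent_by = sent_part.split()[-1].lower()
    branch = None
    for param in params.split(";"):
        key, _, val = param.partition("=")
        if key.strip().lower() == "branch":
            branch = val.strip()
    return branch, sent_by


def cancel_matches_invite(cancel_via, invite_via):
    """True when the CANCEL would be matched to the INVITE transaction."""
    key = top_via_key(cancel_via)
    return key[0] is not None and key == top_via_key(invite_via)


if __name__ == "__main__":
    invite = "SIP/2.0/UDP 10.0.0.1:5060;branch=z9hG4bK-abc123"
    broken_cancel = "SIP/2.0/UDP 10.0.0.1:5060;branch=z9hG4bK-zzz999"
    print(cancel_matches_invite(invite, invite))         # same branch
    print(cancel_matches_invite(broken_cancel, invite))  # different branch
```

A CANCEL with a fresh branch, as in the Asterisk bug, falls into the second case and cannot be tied to any existing transaction.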
FreeSWITCH was a *little* easier to diagnose because it would send a 481. I suppose I could have made (and should make) my OpenSIPS configurations do this when using t_check_trans for CANCEL:
if (is_method("CANCEL")) {
    if (!t_check_trans()) {
        # No matching transaction, error and exit
        sl_send_reply("481","Call leg/transaction does not exist");
        exit;
    }
    # Hand it to tm
    t_relay();
    exit;
}
Anyways this has certainly been fixed in Asterisk 1.4.24. I'm looking forward to not dealing with any of this for some time...
Thursday, February 5, 2009
The update you've been waiting for...
UPDATE: Any updates for this and other SIP/RTP issues can be found here.
In my last post over one month ago, I ranted on and on (big surprise, right) about some issues with Sonus equipment we were experiencing. After learning more I should elaborate on "Sonus equipment".
Like many other manufacturers Sonus has multiple products. We'll be talking about their NBS SBC. Many providers use the NBS SBC in conjunction with GSX gateways and PSX route servers. I have no comments about GSX gateways or PSX route servers; this equipment is largely transparent to us "end users". My gripes are with the NBS SBC.
Providers that use Sonus NBS:
- Level(3) (w/ GSX)
- XO (w/ PSX & GSX)
- Global Crossing
- Broadvox
- Many others
If you are using these carriers for SIP services, be aware.
Last time I was talking about timestamps. This time it's far more insidious...
Apparently (as relayed to me from Level(3) engineers) Sonus has a DSP buffer limitation for RTP packet handling. If there is ever more than a 100ms gap in RTP (my experience has shown the threshold to be much less), Sonus will, in technical terms, "freak out".
We have now identified four RTP interop issues with Sonus equipment:
1) Sonus requires all RTP packets (events or voice) to have unique timestamps. The RFCs specifically state that not only is it valid to use the same timestamp for various RTP packets, it is ideal in some cases (like events, for example).
2) The RFC 2833 events generated by Sonus equipment are goofy, to put it lightly. The event duration increments do not match the packetization of the voice stream as stated in RFC 2833 and elaborated on in RFC 4733. Specifically, Sonus equipment increments RFC 2833 duration 80 samples at a time as if the voice stream is 10ms (regardless of what it actually is). I don't know of any other implementations that do this. Even when the audio stream is *clearly* 20 ms (in the SDP, too) Sonus will continue to increment 80 samples at a time.
3) The most recent (and biggest problem) has been caused by the Sonus (seemingly arbitrary) requirement that there never be greater than 100ms gaps in RTP. This is inherently broken behavior for robustness in IP networks.
4) Sonus has yet another issue with RTP timing and sequencing... If a call is brought up with an endpoint that clocks its own RTP stream (IVR server, for example) everything will be fine until the IVR server (or whatever) bridges that channel to another device that also clocks its own RTP. Sonus (probably related to #3 above) will lose sync and drop audio for up to several seconds while it catches up to the new RTP stream. This requires those of us that work with Sonus equipment to rewrite all timestamps and sequence numbers on our equipment, which has the adverse effect of less than optimal jitter buffering (which should ideally be done at each far endpoint).
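To put some numbers on #2 above, here is a small Python sketch of the arithmetic. It assumes the usual 8000 Hz clock for telephone events; the function name is mine and nothing here comes from Sonus or from any real stack.

```python
def event_durations(clock_rate_hz, ptime_ms, packets):
    """Duration field values for successive updates of one event,
    assuming the duration grows by one packet interval per update."""
    step = clock_rate_hz * ptime_ms // 1000  # samples per packet interval
    return [step * n for n in range(1, packets + 1)]


# What a 20 ms voice stream would lead you to expect:
expected = event_durations(8000, 20, 4)   # [160, 320, 480, 640]
# What 80-sample steps look like (as if the stream were 10 ms):
observed = event_durations(8000, 10, 4)   # [80, 160, 240, 320]
print(expected, observed)
```

Same event, same number of packets, half the reported duration. That mismatch is the thing to look for in a capture.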
Asterisk is largely ok with all of these issues, believe it or not. The one that still causes problems is #3. If you are using Asterisk and Sonus gateways, make DAMN SURE that you are using Packet2Packet bridging and that your devices (whatever they may be) implement RFC 2833 the Sonus way. If not...
NO DTMF FOR YOU!
If you are not using Packet2Packet bridging and your events need to traverse the Asterisk core (for features, fixup, or anything else) there will be a variable length RTP gap that often exceeds the Sonus DSP buffer requirement. With gaps in RTP...
NO DTMF FOR YOU!
FreeSWITCH is also ok as long as you avoid #4. FreeSWITCH provides the configuration option to rewrite timestamps and break jitter buffering. If you are using Sonus gateways you should enable it, otherwise...
NO DTMF FOR YOU!
All of this makes me wish I was around back in the old days when there was one telco and all DTMF was inband!
Wednesday, January 7, 2009
Heads up!
UPDATE: Any developments on this and other SIP/RTP issues can be found here.
Some serious issues for all of those of you in SIP land:
There is a pretty serious RTP problem with Sonus equipment that has been making the rounds...
Simply put, Sonus equipment will not accept two RTP packets with the same timestamp, even if the sequence number has been properly incremented. According to various RFCs (namely 1889 and 2833) this is perfectly valid and in some cases (like video) desired.
A few slight problems... Many implementations (including Asterisk AND FreeSWITCH) will (did - more on this later) send out RFC 2833 DTMF events with the same timestamp as the last voice RTP packet. This is perfectly valid according to the RFCs mentioned above.
It appears (after my own testing) that Sonus will actually drop BOTH the voice RTP packet and the event packet. After some testing against Sonus gear it was pretty clear that no audio was being passed as long as the DTMF event occurred. This makes sense because per RFC2833 a variable length DTMF event must use the same timestamp, increment the sequence counter and increase the duration when it is resent - DO NOT change the timestamp. Oh Sonus.
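For anyone who hasn't stared at these packets before, here is a rough Python sketch of what the updates for one DTMF event look like on the wire: same timestamp every time, sequence number going up, duration growing. The payload type (101), volume, step size and function name are assumptions for illustration, and the sketch skips the retransmission of the final packet that real implementations do.

```python
import struct


def dtmf_event_packets(event, start_seq, timestamp, ssrc,
                       step=160, updates=3, payload_type=101, volume=10):
    """Build the RTP packets for one telephone-event (simplified)."""
    packets = []
    for n in range(updates):
        marker = 0x80 if n == 0 else 0x00           # marker bit: start of event
        header = struct.pack("!BBHII",
                             0x80,                   # V=2, no padding/extension/CSRC
                             marker | payload_type,
                             (start_seq + n) & 0xFFFF,  # sequence increments
                             timestamp,              # timestamp does NOT change
                             ssrc)
        end_bit = 0x80 if n == updates - 1 else 0x00  # E bit on the last update
        payload = struct.pack("!BBH", event, end_bit | volume, step * (n + 1))
        packets.append(header + payload)
    return packets


for pkt in dtmf_event_packets(event=5, start_seq=1000, timestamp=48000, ssrc=0x1234):
    _, _, seq, ts, _ = struct.unpack("!BBHII", pkt[:12])
    _, _, duration = struct.unpack("!BBH", pkt[12:])
    print(seq, ts, duration)
```

Every line printed carries the same timestamp, which is exactly the pattern Sonus refuses to accept.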
Both Asterisk and FreeSWITCH have implemented workarounds to address this. They are similar but there is one key difference. Asterisk now (as of SVN 12/15/2008 or so) will always use a unique timestamp for every RTP packet. I guess that solves that problem. FreeSWITCH is slightly smarter about it (as of SVN about the same time, interestingly enough) but I'm worried...
FreeSWITCH will parse the SDP to find the originator line (o=). If it is equal to "Sonus_UAC" FreeSWITCH activates a specific workaround to always send RTP packets with different timestamps. This seems more elegant but I am worried they will have to expand this hack for other equipment in the future (requiring a code change and recompile).
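Here is the general shape of that kind of check as a Python sketch. This is not FreeSWITCH code. I'm assuming the string shows up as the username (first field) of the o= line, and the sample SDP, addresses and function names are invented.

```python
def sdp_origin_username(sdp):
    """Return the username field of the SDP o= line, or None if absent."""
    for line in sdp.splitlines():
        if line.startswith("o="):
            fields = line[2:].split()
            return fields[0] if fields else None
    return None


def needs_unique_timestamps(sdp, quirky_origins=("Sonus_UAC",)):
    """True when the far end identifies itself as a known-quirky device."""
    return sdp_origin_username(sdp) in quirky_origins


# Hypothetical SDP body for illustration.
sample_sdp = (
    "v=0\r\n"
    "o=Sonus_UAC 12345 67890 IN IP4 192.0.2.10\r\n"
    "s=SIP Media Capabilities\r\n"
    "c=IN IP4 192.0.2.10\r\n"
    "t=0 0\r\n"
    "m=audio 20000 RTP/AVP 0 101\r\n"
)
print(needs_unique_timestamps(sample_sdp))
```

Keeping the list of origins as data rather than a hard-coded comparison is one way around the recompile worry: add another name to the list and you're done.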
One could argue that Sonus has gotten this far with their current implementation and expected behavior. While it is valid (per the RFCs) to use the same timestamp, it is more /compatible/ to always use different timestamps. That appears to be what most equipment does.
This issue is what (apparently) caused so many problems for Teliax a while back when they switched from Asterisk to FreeSWITCH. At least that's what I heard. What doesn't make any sense is that Asterisk had the same behavior as FreeSWITCH - they both sent voice and event RTP packets with identical timestamps.
Also, one would like to think that when you provide voice services (which are pretty important to your customers) you would *test* something like DTMF when you were completely switching platforms. I discovered these issues while testing Star2Star with Level(3), for example. I'm glad I was paying attention. Our customers would have been upset with broken DTMF while we updated all of our Asterisk machines (several hundred).
I'm surprised no one noticed this until mid-December or so. It will be interesting to see what other things pop out of this mess...
Monday, December 22, 2008
Introducing Recqual
I've been waiting to talk about this one for a while.
Several months ago Star2Star was having problems with one of our upstream SIP carriers. We were starting to notice a large increase in the number of one way audio calls our customers were reporting.
When most people think of one way calls their first reaction is to blame SIP. Must be NAT! Must be a firewall! SIP sucks! Etc, etc.
I knew that wasn't the case. I just had to prove it.
I was convinced the problem wasn't SIP/UDP/IP related at all. We had multiple pcaps where we were sending RTP to the appropriate gateway. It just wasn't getting to the PSTN. Where was it going? When was this happening? Which gateways (out of hundreds) were the most problematic? We needed to know and we needed to know quickly.
I came up with and "wrote" recqual over a couple of days. After a few runs we were noticing patterns with problematic RTP endpoint IP addresses. Long story short, once these were identified we worked with the carrier to replace various bits of equipment (DSPs, line cards, etc). The one way audio problem has largely disappeared and we continue to run recqual. If this starts happening again we should know /BEFORE/ our customers do.
Of course I'm using Asterisk to place the calls. The best part of using Asterisk is its multi-protocol flexibility. You should be able to test just about any combination of voice technologies - G.729a, G.711, GSM, SIP, IAX, PRI, FXO, FXS, gtalk/jabber/jingle, skype, etc. The possibilities boggle the mind.
I've just been too busy to get it together and release this to the community - until now.
Tarball with instructions here.
Questions? Comments? Suggestions? Drop me a line.
Monday, November 17, 2008
SBCs are Killing SIP
Wow... Over a month since my last post! My, how time flies.
No time to reminisce or catch up. I've got a rant that needs to get out - NOW.
SBCs (Session Border Controllers) are killing SIP. Breaking SIP. Smothering SIP. Especially when used by "carriers". Carriers and their SBCs I tells ya.
SBCs, technically, are pretty cool devices. While I certainly understand their purpose they tend to be overused, misconfigured, and misunderstood. Many entities deploy SBCs without any idea of the other components (I'm looking at you, proxies) that make up a well designed SIP network.
Why do I hate SBCs so much?
1) SIP is cool because it is end to end and designed with intelligent endpoints in mind (endpoints that can think for themselves).
2) SIP is very flexible, especially with regards to handling media.
3) Ubiquity.
SBCs (especially when misconfigured) break many of these features:
1) SBCs (by design) hide endpoints from one another. Both endpoints support G.722? The SBC doesn't and it's going to rewrite the SDP with its capabilities. Too bad.
2) SBCs (by design) handle media. While this can be good, often times it isn't, and there are other, less drastic ways to ensure quality of media.
3) When the only tool you have is a hammer, every problem starts to look like a nail.
My biggest concerns with SBCs relate to the last point. I swear, there are many providers, enterprises, etc that have deployed SIP in some capacity using ONLY SBCs and simple UACs and UASs. They've never heard of a proxy. Or a registrar. Heck, I'd even go for a signalling-only B2BUA and call it a compromise. Chances are they've never heard of that either.
I have dealt with several devices that break down, utterly fall apart when used with a proxy. I've covered it on this blog before. I'm just too mad to look up the link now. Again, $MANUFACTURER designs and markets a SIP device. They only test it against SBCs and they've (apparently) never heard of a proxy. Guess what happens...
Some poor soul like myself tries to deploy said device in what I consider to be a well designed SIP network. Unfortunately for me, this call path might not involve an SBC. Guess what happens? The device doesn't understand traversing proxies (Record-Route, Via, etc) and does something silly like parse the Contact header when trying to send a response. Call failure and all kinds of brokenness ensue.
So... I talk to $MANUFACTURER and get the standard "We've deployed this device thousands of times and never seen this problem before". Let's assume that's true. I don't know what's more depressing: the fact that they skipped over multiple sections of a basic SIP RFC like 3261 or the fact that no one noticed it for this long because (apparently) no one uses proxies anymore. Ugh. Gross.
It's not just device manufacturers. Carriers do this too. Often times the actual issue lies with their SBC. Many carriers (especially those using ACME SBCs, it seems) parse To: instead of the Request-URI. Probably because their customers are using SBCs too and Request-URI and To: match. Not so with a proxy. I don't blame the carrier's use of an SBC. This makes sense. That's what they were designed for. However, please test your device and configuration against something other than another SBC.
What happens if your Request-URI and To: don't match? They send a 404! Yet another RFC3261 violation. Section 8.2.2.1 allows for a UAS to route based off To (although it doesn't sound preferred). However, for the love of God, if you are going to deny a request because of the content of a To header, please send a 403 as specified in the RFC. Your 404s are confusing and ignorant. Was it really not found, or are you just routing based off To instead of the Request-URI? Once again I blame SBCs and a world where it's becoming common for SBCs to talk to each other (and nothing else).
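Spelled out as a tiny Python sketch, the decision I'm asking for looks something like this. It is a toy model of my reading of 8.2.2.1, not anyone's actual SBC logic; the function name, the enforce_to switch and the use of 200 as a stand-in for "keep processing" are all mine.

```python
def response_code(request_uri_user, to_user, known_users, enforce_to=False):
    """Pick a status code for an incoming request (toy model).

    Routing is done on the Request-URI. The To header is only consulted
    when the operator has explicitly decided to be picky about it.
    """
    if request_uri_user not in known_users:
        return 404  # the Request-URI really wasn't found here
    if enforce_to and to_user not in known_users:
        return 403  # rejecting because of To: say so with a 403
    return 200      # stand-in for "continue processing the request"


users = {"2000"}
# Proxy rewrote the Request-URI to a user we serve; To still holds the dialed number.
print(response_code("2000", "18005551212", users))
print(response_code("2000", "18005551212", users, enforce_to=True))
```

With this split, a 404 only ever means the Request-URI wasn't found, and a 403 tells the sender it was the To header you objected to.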
This is yet another situation where assumptions are made based on the behavior of SBCs. It's bad. Please stop.
Thursday, October 9, 2008
Submit Your SIP
Ever since I've started blogging and talking about SIP people have come out of the woodwork with SIP interop problems.
After giving a talk about SIP at Astricon 2008 I received several e-mails from audience members with specific SIP issues. I LOVE getting these e-mails.
Why? I love working on SIP issues. With all of the devices using SIP there is no shortage of interop problems. Just today a guy on the Asterisk mailing list had a problem with his Cisco AS5300 and Asterisk 1.2. Usually that wouldn't be a problem at all - many people (including myself) use this combination of hardware with great success.
Why was he having problems? His AS5300 was configured for GTD and Asterisk 1.2 (apparently) doesn't handle multipart SIP bodies very well. I was able to find a patch to Asterisk 1.4 to improve multipart body parsing. That was a fun one.
I got to thinking... There should be a place where people can exchange specific SIP interop tips and notes. Otherwise how are we supposed to get anything to work!?!?
I came up with such a place and it's called SubmitYourSip.com. I've started to fill it in a little but hopefully (with time) it will become somewhat of a SIP wiki (with a focus on interop, of course).
I'm just getting started on it but I'll be working on my MediaWiki syntax and going back through my e-mail to dig out some of these examples.
Monday, August 25, 2008
Nokia isn't ditching SIP
Someone sent me a link to a blog post about Nokia "turning its back on VoIP". I know I can get pretty emotional from time to time, but at least I'm accurate when I do. At least I feel like I am, and that's all that matters, right?
Anyways, let's look at this post. The author points out that the new N78 and N96 no longer include the Symbian SIP client. This must mean Nokia is finally giving in to pressure from cell carriers in some huge scheme to enslave the mobile phone subscriber and direct all talk time over their network.
Follow up comments clarify the situation a bit. Nokia simply removed the interface to the SIP stack on the N series. It's still in the firmware and available to any third party developers, so the most "threatening" apps (Truphone, Gizmo, etc) will continue to work. People that just want to configure the phone to connect to their corporate/"personal" PBX will be out of luck. I love saying "personal" PBX. I have one. So do many of my friends. How geeky is that?
Why did the N series have a SIP interface in the first place? Anyone who has ever read my iPhone review knows that I believe teenagers and hipsters control a large chunk of the cellphone market which breaks down into three parts:
- Free (throw away) phones for moms, grandparents, and kids. Sign up for a two year contract and they're yours to keep! What a deal.
- Flashy phones for teenagers and hipsters (it's a better MP3 player/camera than phone)
- Serious business phones (Blackberry, Windows Mobile, most Symbian, etc)
The N series is (and always has been) a hipster phone. It's more likely to compete with the iPhone than the Blackberry or Nokia E-Series.
Speaking of the E-Series... The new Nokia E-71 still includes the SIP client. Nokia turning its back on VoIP? I don't think so. Nokia learning a little more about their customers? Much more likely.
Monday, July 7, 2008
An update on SIP DoS mitigation
After spending a couple of hours on the VoIP User's Conference last week I thought I'd keep working on my SIP DoS/DDoS script a bit and get it to the point where I'd like to run it on some of my systems (if only to collect statistics).
The new version includes several new features, the most exciting (and certainly controversial) of which are changes to the string pattern matching for SIP requests. I now block ALL tel: URIs by default. I don't like tel: URIs. I think they're anti-SIP and you shouldn't use them. Now my script won't let you (unless you disable it, of course).
Anyways, I had to do this because (as I've already mentioned), I changed the way pattern matching runs on SIP requests. Two big changes:
1) Support a (configurable) offset for searches into the packet.
2) Update the SIP method matches to match "$METHOD sip:"
First of all, we now (by default) only search the first 65 bytes of the packet. That should be more than enough to search the first line for the SIP method and URI. Speaking of URI, I now match the URI along with the method to prevent false matches. Before we were only matching on the method and it was causing some false positives because of things like the Allow: header (where all of the supported methods are listed).
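To show why the old matching produced false positives, here is a Python stand-in for the two approaches. This is not the script itself, just the same idea in a form that's easy to play with; the function names and the sample packet are invented, and 65 is the default mentioned above.

```python
def matches_method_old(packet, method):
    """Old behaviour: the method name anywhere in the packet counts."""
    return method.encode("ascii") in packet


def matches_method(packet, method, search_limit=65):
    """New behaviour: look for '$METHOD sip:' within the first bytes only."""
    needle = method.encode("ascii") + b" sip:"
    return needle in packet[:search_limit]


# Hypothetical OPTIONS request whose Allow: header lists INVITE.
packet = (
    b"OPTIONS sip:alice@example.com SIP/2.0\r\n"
    b"Via: SIP/2.0/UDP 192.0.2.20:5060;branch=z9hG4bK776asdhds\r\n"
    b"Allow: INVITE, ACK, CANCEL, OPTIONS, BYE\r\n"
    b"\r\n"
)
print(matches_method_old(packet, "INVITE"))  # counted as an INVITE by mistake
print(matches_method(packet, "INVITE"))
print(matches_method(packet, "OPTIONS"))
```

One side effect worth noticing: a request line like "INVITE tel:+15551234567" never matches "INVITE sip:", so tel: requests slip past method matching entirely unless they're dealt with separately.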
We'll see how this goes.
One thing I wanted to bring up in the VoIP User's call last week but failed to do so is the possible use of OpenSER to protect Asterisk (and other systems) from attack. In addition to supporting cool things like SIP message length filtering (msg:len) you can also use the pike module for some basic (but slightly more intelligent) SIP rate limiting. Of course then you need to support an OpenSER config, which a lot of people don't want to do...
What else is new in the script? Basic support for udp and/or tcp, configurable bursting, fixes to the FORWARD support and more. Check it out, it's free!
Wednesday, July 2, 2008
SIP DoS/DDoS Mitigation
An interesting thread came up on Asterisk-Users the other day...
Someone had problems with some miscreant attempting (and apparently succeeding) to bruteforce a SIP account on his Asterisk system.
That's a problem. Asterisk currently has no means to protect against any type of SIP flooding/bruteforce/DoS (other than running out of resources/crashing). That's ok because some people (like myself) would argue that these types of attacks are best handled via other means...
Like the kernel, or preferably the kernel of a completely separate box.
It got me to thinking - maybe I could whip up a script using some of the cool stuff in iptables/netfilter to mitigate these SIP DoS attacks in the kernel. I don't have a ton of time to elaborate on the script but for now here it is:
http://admin.star2star.com/sipdos
It's not great. It's not very SIP-specific. Who knows how accurate/effective/resource intensive it is. It hasn't been tested (much). For all I know, it will just make things worse. However, I think it is a step in the right direction and hopefully one of many tools that can be used to protect Asterisk and other SIP/VoIP systems.
By the way, I call it "SIP DoS Mitigation" because with any large enough DDoS attack, you are toast...
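For anyone who wants to picture what per-source rate limiting with a burst allowance looks like, here is a small token-bucket sketch in Python. To be clear, this is not the script (which does its work with iptables/netfilter in the kernel); the class name, the rate and burst numbers, and the injectable clock are all made up for illustration:

```python
import time


class SourceLimiter:
    """Per-source token bucket: `rate` packets/second with a `burst` allowance.

    Hypothetical helper for illustration; the numbers are placeholders.
    """

    def __init__(self, rate=10.0, burst=20, clock=time.monotonic):
        self.rate = float(rate)
        self.burst = float(burst)
        self.clock = clock
        self.buckets = {}  # src_ip -> (tokens, time of last update)

    def allow(self, src_ip):
        now = self.clock()
        # A source we have never seen starts with a full bucket.
        tokens, last = self.buckets.get(src_ip, (self.burst, now))
        # Refill according to elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self.buckets[src_ip] = (tokens - 1.0, now)
            return True   # accept the packet
        self.buckets[src_ip] = (tokens, now)
        return False      # over the limit: drop it


# Usage with a fake clock so the behaviour is easy to follow.
fake_now = [0.0]
limiter = SourceLimiter(rate=1.0, burst=3, clock=lambda: fake_now[0])
first_five = [limiter.allow("198.51.100.7") for _ in range(5)]
```

A single noisy source burns through its own bucket without affecting anyone else, which is about all this kind of mitigation can promise; a big enough distributed flood still fills the pipe.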
Speaking of other tools... We're all going to celebrate America's birthday this Friday by getting together on a conference call/IRC to talk over some of these issues with the VoIP User's Conference. Hear you there?
Friday, March 21, 2008
Why I (and presumably other people) hate telecom
Q.931.
That's it. Q-freaking-9-3-1. I suppose it's not bad considering it was developed so long ago, but really for me this protocol (and all it stands for) is why so many people (myself included) get frustrated with telecom.
I spent two hours today trying to get a call up (over PRI) from an Asterisk system to a Cisco gateway. I covered all of the basics:
Asterisk - pri_net - check
Cisco - pri_cpe - check
Asterisk - master clocking - check
T1 params - B8ZS, ESF - check
Switchtype - national/NI2 - check
D chan - 24 - check
T1 crossover cable - check
Voila! D channel up in no time (seriously, five minutes). Try to send a call - SURE, everything looks good (including caller id). The call gets set up and after about 1 sec of audio it gets torn down. Hmmm... Granted I am running SIP on the other side of the Cisco gateway (AS5350XM) so I start there. Let's face it, if you're reading this blog you know I deal with SIP quite a bit. I'd much rather look at it than the alternative - Q.931...
Sure enough everything looks good on the SIP side. Perfect, actually. Ok fine, I guess we're going to have to go Q.931. I enable Q.931 debugging:
Asterisk:
asterisk -r
pri debug span 3
Cisco:
debug isdn q931
term mon (to log to my SSH session)
Try the call again. Same thing - one second of audio, disconnect. Here is what I get:
Cisco debug:
Mar 21 23:13:45.307: ISDN Se3/0:23 Q931: Applying typeplan for sw-type 0xD is 0x2 0x1, Calling num 5043223199
Mar 21 23:13:45.311: ISDN Se3/0:23 Q931: Applying typeplan for sw-type 0xD is 0x2 0x1, Called num 19412340001
Mar 21 23:13:45.311: ISDN Se3/0:23 Q931: TX -> SETUP pd = 8 callref = 0x00A3
Bearer Capability i = 0x8090A2
Standard = CCITT
Transfer Capability = Speech
Transfer Mode = Circuit
Transfer Rate = 64 kbit/s
Channel ID i = 0xA18381
Preferred, Channel 1
Display i = 'Pcom2'
Calling Party Number i = 0x2180, '5043223199'
Plan:ISDN, Type:National
Called Party Number i = 0xA1, '19412340001'
Plan:ISDN, Type:National
Mar 21 23:13:45.323: ISDN Se3/0:23 Q931: RX <- CALL_PROC pd = 8 callref = 0x80A3
Channel ID i = 0xA98381
Exclusive, Channel 1
Mar 21 23:13:46.371: ISDN Se3/0:23 Q931: RX <- CONNECT pd = 8 callref = 0x80A3
Channel ID i = 0xA98381
Exclusive, Channel 1
Progress Ind i = 0x8182 - Destination address is non-ISDN
Mar 21 23:13:46.371: ISDN Se3/0:23 Q931: TX -> CONNECT_ACK pd = 8 callref = 0x00A3
Mar 21 23:13:46.379: ISDN Se3/0:23 Q931: RX <- STATUS pd = 8 callref = 0x80A3
Cause i = 0x80E2 - Message not compatible with call state or not implemented
Call State i = 0x0A
Mar 21 23:13:46.379: ISDN Se3/0:23 Q931: TX -> RELEASE pd = 8 callref = 0x00A3
Cause i = 0x80E408 - Invalid information element contents
Mar 21 23:13:46.403: ISDN Se3/0:23 Q931: RX <- RELEASE_COMP pd = 8 callref = 0x80A3
Cause i = 0x8190 - Normal call clearing
Asterisk debug:
< Protocol Discriminator: Q.931 (8) len=50
< Call Ref: len= 2 (reference 164/0xA4) (Originator)
< Message type: SETUP (5)
< [04 03 80 90 a2]
< Bearer Capability (len= 5) [ Ext: 1 Q.931 Std: 0 Info transfer capability: Speech (0)
< Ext: 1 Trans mode/rate: 64kbps, circuit-mode (16)
< Ext: 1 User information layer 1: u-Law (34)
< [18 03 a1 83 81]
< Channel ID (len= 5) [ Ext: 1 IntID: Implicit, PRI Spare: 0, Preferred Dchan: 0
< ChanSel: Reserved
< Ext: 1 Coding: 0 Number Specified Channel Type: 3
< Ext: 1 Channel: 1 ]
< [28 05 50 63 6f 6d 32]
< Display (len= 5) [ Pcom2 ]
< [6c 0c 21 80 35 30 34 33 32 32 33 31 39 39]
< Calling Number (len=14) [ Ext: 0 TON: National Number (2) NPI: ISDN/Telephony Numbering Plan (E.164/E.163) (1)
< Presentation: Presentation permitted, user number not screened (0) '5043223199' ]
< [70 0c a1 31 39 34 31 32 33 34 30 30 30 31]
< Called Number (len=14) [ Ext: 1 TON: National Number (2) NPI: ISDN/Telephony Numbering Plan (E.164/E.163) (1) '19412340001' ]
-- Making new call for cr 164
-- Processing Q.931 Call Setup
-- Processing IE 4 (cs0, Bearer Capability)
-- Processing IE 24 (cs0, Channel Identification)
-- Processing IE 40 (cs0, Display)
-- Processing IE 108 (cs0, Calling Party Number)
-- Processing IE 112 (cs0, Called Party Number)
> Protocol Discriminator: Q.931 (8) len=10
> Call Ref: len= 2 (reference 164/0xA4) (Terminator)
> Message type: CALL PROCEEDING (2)
> [18 03 a9 83 81]
> Channel ID (len= 5) [ Ext: 1 IntID: Implicit, PRI Spare: 0, Exclusive Dchan: 0
> ChanSel: Reserved
> Ext: 1 Coding: 0 Number Specified Channel Type: 3
> Ext: 1 Channel: 1 ]
-- Accepting call from '5043223199' to '19412340001' on channel 0/1, span 3
-- Executing Wait("Zap/49-1", "1") in new stack
-- Executing Answer("Zap/49-1", "") in new stack
> Protocol Discriminator: Q.931 (8) len=14
> Call Ref: len= 2 (reference 164/0xA4) (Terminator)
> Message type: CONNECT (7)
> [18 03 a9 83 81]
> Channel ID (len= 5) [ Ext: 1 IntID: Implicit, PRI Spare: 0, Exclusive Dchan: 0
> ChanSel: Reserved
> Ext: 1 Coding: 0 Number Specified Channel Type: 3
> Ext: 1 Channel: 1 ]
> [1e 02 81 82]
> Progress Indicator (len= 4) [ Ext: 1 Coding: CCITT (ITU) standard (0) 0: 0 Location: Private network serving the local user (1)
> Ext: 1 Progress Description: Called equipment is non-ISDN. (2) ]
-- Executing MusicOnHold("Zap/49-1", "") in new stack
-- Started music on hold, class 'default', on channel 'Zap/49-1'
< Protocol Discriminator: Q.931 (8) len=5
< Call Ref: len= 2 (reference 164/0xA4) (Originator)
< Message type: CONNECT ACKNOWLEDGE (15)
> Protocol Discriminator: Q.931 (8) len=12
> Call Ref: len= 2 (reference 164/0xA4) (Terminator)
> Message type: STATUS (125)
> [08 02 80 e2]
> Cause (len= 4) [ Ext: 1 Coding: CCITT (ITU) standard (0) 0: 0 Location: User (0)
> Ext: 1 Cause: Wrong message (98), class = Protocol Error (6) ]
> [14 01 0a]
> Call State (len= 3) [ Ext: 0 Coding: CCITT (ITU) standard (0) Call state: Active (10)
< Protocol Discriminator: Q.931 (8) len=10
< Call Ref: len= 2 (reference 164/0xA4) (Originator)
< Message type: RELEASE (77)
< [08 03 80 e4 08]
< Cause (len= 5) [ Ext: 1 Coding: CCITT (ITU) standard (0) 0: 0 Location: User (0)
< Ext: 1 Cause: Invalid information element contents (100), class = Protocol Error (6) ]
< Cause data 1: 08 (8)
-- Processing IE 8 (cs0, Cause)
-- Channel 0/1, span 3 got hangup
-- Stopped music on hold on Zap/49-1
== Spawn extension (pri-in, 19412340001, 3) exited non-zero on 'Zap/49-1'
NEW_HANGUP DEBUG: Calling q931_hangup, ourstate Null, peerstate Release Request
> Protocol Discriminator: Q.931 (8) len=9
> Call Ref: len= 2 (reference 164/0xA4) (Terminator)
> Message type: RELEASE COMPLETE (90)
> [08 02 81 90]
> Cause (len= 4) [ Ext: 1 Coding: CCITT (ITU) standard (0) 0: 0 Location: Private network serving the local user (1)
> Ext: 1 Cause: Normal Clearing (16), class = Normal Event (1) ]
NEW_HANGUP DEBUG: Calling q931_hangup, ourstate Null, peerstate Null
NEW_HANGUP DEBUG: Destroying the call, ourstate Null, peerstate Null
-- Hungup 'Zap/49-1'
s2s-srq-co*CLI>
Every time I have to look at Q.931 I just cringe. I MEAN CRINGE. I hate it. Look at it - all of these goofy messages, number plans, and the worst - IEs (Information Elements). Compare this to SIP (there are plenty of examples on this blog). I guess I just take it for granted. I can fire up ngrep on a network interface, turn on interpretation of carriage returns/linefeeds and go to town with something that just makes more sense. Sure to some people it probably still looks like gibberish but anyone that complains certainly hasn't seen much Q.931!
Let's look at my current problem... It appears that once the call is set up, Asterisk sends a CONNECT message, which the Cisco quickly acknowledges with a CONNECT ACKNOWLEDGE (big surprise there). Here's where things get a little strange... The next message is from Asterisk (STATUS) complaining with "Message not compatible with call state or not implemented". That is pretty helpful, I'll give them that. However, what's not compatible about a standard, simple CONNECT ACK?!?!
This is where I start to get angry. I'd like to see what's going on a little bit more. I know I have some flexibility with my IEs, for example. If I turn up the debug high enough in Asterisk I can see which IEs Asterisk identifies. I just can't see the data. From a debug standpoint, Cisco appears to give me even less. Cisco does, however, give me some pretty decent control of newer and less than standard IEs to send (and to a lesser extent, receive). That's gotta have something to do with it, I'm sure. It's even telling me "invalid information element contents". Too bad I can't actually see the IE content... Being severely limited with the tools at hand, I began to cycle through all of my IE, number plan, etc options on both sides. I even got lazy at one point and tried to set my switchtype to dms100! No dice.
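The bracketed hex in the Asterisk debug above is at least something to work with. Below is a minimal Python sketch that pulls apart a Cause IE the way I understand the Q.850 layout (location in the low four bits of octet 3, cause value in the low seven bits of octet 4, diagnostics after that). The function name and the small lookup tables are my own, and the tables only cover the values seen in this trace:

```python
# Hypothetical decoder for the Cause IE hex shown in the Asterisk debug.
LOCATIONS = {
    0: "User",
    1: "Private network serving the local user",
}
CAUSES = {
    16: "Normal call clearing",
    98: "Message not compatible with call state or not implemented",
    100: "Invalid information element contents",
}


def decode_cause_ie(hex_dump):
    """Decode e.g. '08 03 80 e4 08' into its fields."""
    raw = bytes.fromhex(hex_dump)
    ie_id, length = raw[0], raw[1]
    if ie_id != 0x08:
        raise ValueError("not a Cause IE")
    body = raw[2:2 + length]
    if len(body) < 2:
        raise ValueError("Cause IE too short")
    idx = 0
    octet3 = body[idx]
    idx += 1
    if not octet3 & 0x80:
        idx += 1  # extension bit clear: skip the optional octet 3a
    cause = body[idx] & 0x7F
    return {
        "location": LOCATIONS.get(octet3 & 0x0F, "unknown"),
        "cause": cause,
        "cause_class": cause >> 4,
        "cause_text": CAUSES.get(cause, "unknown"),
        "diagnostics": body[idx + 1:],
    }


status = decode_cause_ie("08 02 80 e2")      # STATUS sent by Asterisk
release = decode_cause_ie("08 03 80 e4 08")  # RELEASE sent by the Cisco
```

The interesting bit is the trailing 08 in the RELEASE. If I'm reading Q.850 correctly, the diagnostic for cause 100 is the identifier of the offending information element, and 0x08 is the Cause IE itself. That would suggest the Cisco is objecting to the Cause IE in the STATUS message, but that is my reading of the bytes, not something either debug confirms.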
Want to know the craziest part about all of this? I've done this several times before. I've brought up many a PRI to carriers, with all kinds of switchtypes, b/d channel configs, into all kinds of equipment (including Asterisk and Cisco). I've gone from Asterisk PRI -> Cisco PRI, Cisco PRI -> Cisco PRI, Asterisk PRI -> Asterisk PRI and every other combination you could imagine (although I think I covered all of them). I've never really had problems, although whenever there is a problem it means going through lines and lines of less-than-helpful Q.931 messaging to identify the problem, which gets me back to my point.
That's just the problem with telecom. It's old, slow, and inflexible. If this were a problem with SIP I could grab the packet stream with tcpdump and load it into Wireshark if I got really desperate (I'm rarely that desperate). I could watch it in real time with ngrep, complete with regex matching on payload and BPF syntax. I could try multiple SIP libraries and multiple clients. I could even tap into the WEALTH of information about SIP, including the various RFCs. Sure I know there are a lot of them but hey, at least you know where to look. I could even try different hardware very easily because hey, you don't need a $500 T1 card to play with SIP. Heck you don't even need a network. VMware or even good ol' fashioned lo0 work just fine.
How can I do this with ISDN? Buy a T-Berd? No thanks. So here it is, Friday night, and I'm obsessing over the PRI that kicked my ass today. Anyone have any ideas?
Friday, February 22, 2008
Obscure RFC 3398 Support with OpenSER
I hate SS7. Let me say that again. I HATE SS7. I mean it's cool and everything, but there are parts of it that are unnecessarily confusing...
For instance, this week I got very confused over use of the term "cic" in telco-speak. I had thought cic was "circuit identification code". So, when one of our partners asked me to support embedding the cic in the incoming SIP INVITE, I freaked out. Why is that my responsibility? Why should my proxy ("class 5 switch") have to maintain state of individual DS0s on YOUR media gateways? This sucks! I bitterly started reading RFC 3398 (particularly section 7.2.1.1) and gave up halfway through it, becoming frustrated with the overall language of the section. After all, keeping track of DS0s just isn't my thing, and that's what CICs are about, right?
Turns out I was wrong. Keeping track of CICs is my thing. The problem is, these CICs are "Carrier Identification Codes", not "Circuit Identification Codes". I know everyone in networking and telecom loves to have an acronym for everything, but for GOD'S SAKE PEOPLE don't use the same acronym in the same context. Reading on in RFC 3398, they specifically warn about this confusion:
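..
If the 'cic=' parameter is present in the Request-URI, the
gateway SHOULD consult local policy to make sure that it is
appropriate to transmit this Carrier Identification Code (CIC, not to
be confused with the MTP3 'circuit identification code') in the IAM;
..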
Good point. Why didn't you say something sooner? So, now it turns out all I have to do is append a cic= (CARRIER identification code) parameter to the SIP RURI (request uniform resource identifier) on the outbound INVITE to the SIP <-> SS7 gateway. No problem. Thanks to OpenSER, module textops, and subst_uri, this is all I need to do:
avp_printf("$avp(s:cic)", "+15062"); # Set the cic value in an AVP
if (subst_uri('/^sip:([0-9]+)@(.*)$/sip:\1;cic=$avp(s:cic)@\2/i')) {
    xlog("Added CIC $avp(s:cic) to RURI"); # tell us about it
};
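If the sed-style expression is hard to read, here is the same substitution as a few lines of Python. This is only to show what the regex does to the RURI; the function name is made up, and OpenSER is of course doing this with subst_uri, not Python:

```python
import re

# Same pattern as the subst_uri expression: numeric user part, then host part.
RURI_RE = re.compile(r"^sip:([0-9]+)@(.*)$", re.IGNORECASE)


def add_cic(ruri, cic):
    """Insert ;cic=<cic> after the user part; leave non-matching URIs alone."""
    return RURI_RE.sub(
        lambda m: "sip:{};cic={}@{}".format(m.group(1), cic, m.group(2)),
        ruri,
        count=1,
    )


print(add_cic("sip:14145551212@w.x.y.z:5060", "+15062"))
print(add_cic("sip:alice@example.com", "+15062"))  # no numeric user part: unchanged
```

Note that the pattern only fires on all-numeric user parts, so anything like sip:alice@... goes out untouched.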
Here is the INVITE going out to the gateway (note the cic parameter in the Request-URI):
U a.b.c.d:5060 -> w.x.y.z:5060
INVITE sip:14145551212;cic=+15062@w.x.y.z:5060 SIP/2.0.
Record-Route:.
Via: SIP/2.0/UDP a.b.c.d;branch=z9hG4bKf1e2.8a4d41b6.0.
Via: SIP/2.0/UDP e.f.g.h:5060;branch=z9hG4bK4e28b419;rport=5060.
From: "Polycom 320";tag=as477acc5f.
To:.
Contact:.
Call-ID: 343b661709e271c30bdfafe923a82adb@e.f.g.h.
CSeq: 103 INVITE.
User-Agent: Star2Star StarBox astlinux-s2s-1438-net4801.
Max-Forwards: 69.
Remote-Party-ID: "Polycom 320";privacy=off;screen=no.
Date: Fri, 22 Feb 2008 19:48:44 GMT.
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, SUBSCRIBE, NOTIFY.
Content-Type: application/sdp.
Content-Length: 265.
.
v=0.
o=root 6183 6184 IN IP4 e.f.g.h.
s=session.
c=IN IP4 e.f.g.h.
t=0 0.
m=audio 19770 RTP/AVP 18 0 101.
a=rtpmap:18 G729/8000.
a=fmtp:18 annexb=no.
a=rtpmap:0 PCMU/8000.
a=rtpmap:101 telephone-event/8000.
a=fmtp:101 0-16.
a=silenceSupp:off - - - -.
Yeah, yeah, you like that?!?
I'm still not sure of the actual CIC format... That isn't EXACTLY what is shown in section 8.2.1.1 of RFC 3398 but it does look like what Cisco expects from their docs I've read (from the BTS 10200). After all, this particular SIP <-> SS7 gateway is made by them (PGW 2200).
Obviously you can get fancy with that and do all sorts of conditionals and DB calls to get the CIC, only append it on certain calls, different CICs for different sources, destinations, etc but the point is you can tweak up the RURI and insert it. I still haven't tested this against Cisco but I'm pretty sure it will work...
After I sorted out my "CIC confusion" it took me less than 30 minutes to implement this. Thanks OpenSER. You make SS7 a lot easier to deal with (as long as SIP is involved)!
Wednesday, February 20, 2008
Missing SIP traffic?
SIP can be cool because it resembles HTTP. OpenSER and SER are cool because they are so powerful. For instance, in OpenSER you don't need to authenticate calls. Or you can specify which request URIs should be challenged with an auth (or which method types, like REGISTER). This is usually done like so:
if (is_method("INVITE")) {
    if (uri=~"^sip:1000@") {
        rewritehostport("192.168.1.1:5060"); #set destination
        route(1); #t_relay, etc is in route(1)
    }
    if (!allow_trusted()) {
        if (!proxy_authorize("star2star.com","subscriber")) {
            proxy_challenge("star2star.com","0");
            return;
        } else if (!check_from()) {
            sl_send_reply("403", "Use From=ID");
            return;
        };
        xlog("Creds are good\n");
        consume_credentials();
    };
    if (uri=~"^sip:1[0-9]{10}@") {
        route(5); #goto LCR
        return;
    };
};
So, in this example any SIP endpoint that can reach this proxy can hit RURI:1000 and be forwarded to the voicemail server with no authentication. As we step through this example, the only other URIs that match are after the allow_trusted or proxy_authorize checks. Basically, your source IP address has to be in the trusted table or you have to successfully respond to a 407 Proxy Authentication Required from the proxy.
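The decision logic just described can be modeled as a small Python function. This is a sketch of the control flow only, not OpenSER code: the function name and the returned labels are invented for illustration, and it assumes route(1) finishes processing for the voicemail case.

```python
import re


def route_invite(uri: str, trusted: bool, creds_ok: bool, from_ok: bool = True) -> str:
    """Toy model of the INVITE block above; returns a label for the outcome."""
    # RURI 1000 goes to voicemail before any auth check happens.
    if re.match(r"^sip:1000@", uri):
        return "relay-voicemail"
    # Untrusted sources must pass digest auth, then the From check.
    if not trusted:
        if not creds_ok:
            return "407"  # proxy_challenge()
        if not from_ok:
            return "403"  # check_from() failed
    # Only authenticated or trusted traffic reaches the LCR route.
    if re.match(r"^sip:1[0-9]{10}@", uri):
        return "lcr"
    return "continue"  # falls through to the rest of the script


print(route_invite("sip:1000@e.f.g.h", trusted=False, creds_ok=False))
print(route_invite("sip:18135551212@e.f.g.h", trusted=False, creds_ok=False))
print(route_invite("sip:18135551212@e.f.g.h", trusted=False, creds_ok=True))
```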
I've seen this work perfectly between an Asterisk client and OpenSER hundreds of times. Most of the time it works. MOST OF THE TIME. I've noticed a scenario where it does not work, and I am struggling to figure out why...
Here's the architecture:
User's Phone -> Asterisk --(internet)--> OpenSER -> Misc. other systems
Like I said, normally this works and it looks like this (get ready for some SIP):
#
U a.b.c.d:5060 -> e.f.g.h:5060
INVITE sip:9415551212@e.f.g.h SIP/2.0.
Via: SIP/2.0/UDP a.b.c.d:5060;branch=z9hG4bK2e8b0432;rport.
From: "User Phone";tag=as3474f6c4.
To:.
Contact:.
Call-ID: 2ad4212700f61c3e1294439f1c72c479@a.b.c.d.
CSeq: 102 INVITE.
User-Agent: Asterisk PBX.
Max-Forwards: 70.
Remote-Party-ID: "User Phone";privacy=off;screen=no.
Date: Tue, 05 Feb 2008 18:24:11 GMT.
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, SUBSCRIBE, NOTIFY.
X-s2s-region: 1.
Content-Type: application/sdp.
Content-Length: 216.
.
v=0.
o=root 1429 1429 IN IP4 a.b.c.d.
s=session.
c=IN IP4 a.b.c.d.
t=0 0.
m=audio 19424 RTP/AVP 0 101.
a=rtpmap:0 PCMU/8000.
a=rtpmap:101 telephone-event/8000.
a=fmtp:101 0-16.
a=silenceSupp:off - - - -.
#
U e.f.g.h:5060 -> a.b.c.d:5060
SIP/2.0 100 Trying.
Via: SIP/2.0/UDP a.b.c.d:5060;branch=z9hG4bK2e8b0432;rport=5060.
From: "User Phone";tag=as3474f6c4.
To:.
Call-ID: 2ad4212700f61c3e1294439f1c72c479@a.b.c.d.
CSeq: 102 INVITE.
Server: OpenSer (1.1.0-notls (i386/linux)).
Content-Length: 0.
Warning: 392 e.f.g.h:5060 "Noisy feedback tells: pid=4363 req_src_ip=a.b.c.d req_src_port=5060 in_uri=sip:9415551212@e.f.g.h out_uri=sip:9415551212@e.f.g.h via_cnt==1".
.
#
U e.f.g.h:5060 -> a.b.c.d:5060
SIP/2.0 407 Proxy Authentication Required.
Via: SIP/2.0/UDP a.b.c.d:5060;branch=z9hG4bK2e8b0432;rport=5060.
From: "User Phone";tag=as3474f6c4.
To:;tag=0dd4490c85a9eb8d48ff967a8700cef0.fcb4.
Call-ID: 2ad4212700f61c3e1294439f1c72c479@a.b.c.d.
CSeq: 102 INVITE.
Proxy-Authenticate: Digest realm="star2star.com", nonce="valid_nonce".
Server: OpenSer (1.1.0-notls (i386/linux)).
Content-Length: 0.
Warning: 392 e.f.g.h:5060 "Noisy feedback tells: pid=4363 req_src_ip=a.b.c.d req_src_port=5060 in_uri=sip:9415551212@e.f.g.h out_uri=sip:9415551212@e.f.g.h via_cnt==1".
.
#
U a.b.c.d:5060 -> e.f.g.h:5060
ACK sip:9415551212@e.f.g.h SIP/2.0.
Via: SIP/2.0/UDP a.b.c.d:5060;branch=z9hG4bK2e8b0432;rport.
From: "User Phone";tag=as3474f6c4.
To:;tag=0dd4490c85a9eb8d48ff967a8700cef0.fcb4.
Contact:.
Call-ID: 2ad4212700f61c3e1294439f1c72c479@a.b.c.d.
CSeq: 102 ACK.
User-Agent: Asterisk PBX.
Max-Forwards: 70.
Remote-Party-ID: "User Phone";privacy=off;screen=no.
Content-Length: 0.
.
#
U a.b.c.d:5060 -> e.f.g.h:5060
INVITE sip:9415551212@e.f.g.h SIP/2.0.
Via: SIP/2.0/UDP a.b.c.d:5060;branch=z9hG4bK57af1f44;rport.
From: "User Phone";tag=as3474f6c4.
To:.
Contact:.
Call-ID: 2ad4212700f61c3e1294439f1c72c479@a.b.c.d.
CSeq: 103 INVITE.
User-Agent: Asterisk PBX.
Max-Forwards: 70.
Remote-Party-ID: "User Phone";privacy=off;screen=no.
Proxy-Authorization: Digest username="cpesource", realm="star2star.com", algorithm=MD5, uri="sip:9415551212@e.f.g.h", nonce="valid_nonce", response="valid_response", opaque="".
Date: Tue, 05 Feb 2008 18:24:12 GMT.
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, SUBSCRIBE, NOTIFY.
X-s2s-region: 1.
Content-Type: application/sdp.
Content-Length: 216.
.
v=0.
o=root 1429 1430 IN IP4 a.b.c.d.
s=session.
c=IN IP4 a.b.c.d.
t=0 0.
m=audio 19424 RTP/AVP 0 101.
a=rtpmap:0 PCMU/8000.
a=rtpmap:101 telephone-event/8000.
a=fmtp:101 0-16.
a=silenceSupp:off - - - -.
#
U e.f.g.h:5060 -> a.b.c.d:5060
SIP/2.0 100 Trying.
Via: SIP/2.0/UDP a.b.c.d:5060;branch=z9hG4bK57af1f44;rport=5060.
From: "User Phone";tag=as3474f6c4.
To:.
Call-ID: 2ad4212700f61c3e1294439f1c72c479@a.b.c.d.
CSeq: 103 INVITE.
Server: OpenSer (1.1.0-notls (i386/linux)).
Content-Length: 0.
Warning: 392 e.f.g.h:5060 "Noisy feedback tells: pid=4361 req_src_ip=a.b.c.d req_src_port=5060 in_uri=sip:9415551212@e.f.g.h out_uri=sip:9415551212@e.f.g.h via_cnt==1".
.
#
U e.f.g.h:5060 -> a.b.c.d:5060
SIP/2.0 100 trying -- your call is important to us.
Via: SIP/2.0/UDP a.b.c.d:5060;branch=z9hG4bK57af1f44;rport=5060.
From: "User Phone";tag=as3474f6c4.
To:.
Call-ID: 2ad4212700f61c3e1294439f1c72c479@a.b.c.d.
CSeq: 103 INVITE.
Server: OpenSer (1.1.0-notls (i386/linux)).
Content-Length: 0.
Warning: 392 e.f.g.h:5060 "Noisy feedback tells: pid=4361 req_src_ip=a.b.c.d req_src_port=5060 in_uri=sip:9415551212@e.f.g.h out_uri=sip:18135551212@w.x.y.z:5060;transport=udp via_cnt==1".
.
#
U e.f.g.h:5060 -> a.b.c.d:5060
SIP/2.0 183 Session Progress.
Via: SIP/2.0/UDP a.b.c.d:5060;branch=z9hG4bK57af1f44;rport=5060.
From: "User Phone";tag=as3474f6c4.
To:;tag=F6697AC-1DCD.
Date: Tue, 05 Feb 2008 18:24:12 GMT.
Call-ID: 2ad4212700f61c3e1294439f1c72c479@a.b.c.d.
Server: Cisco-SIPGateway/IOS-12.x.
CSeq: 103 INVITE.
Allow: INVITE, OPTIONS, BYE, CANCEL, ACK, PRACK, COMET, REFER, SUBSCRIBE, NOTIFY, INFO, UPDATE, REGISTER.
Allow-Events: telephone-event.
Contact:.
Record-Route:.
Content-Disposition: session;handling=required.
Content-Type: application/sdp.
Content-Length: 238.
.
v=0.
o=CiscoSystemsSIP-GW-UserAgent 2171 1394 IN IP4 w.x.y.z.
s=SIP Call.
c=IN IP4 w.x.y.z.
t=0 0.
m=audio 18722 RTP/AVP 0 101.
c=IN IP4 w.x.y.z.
a=rtpmap:0 PCMU/8000.
a=rtpmap:101 telephone-event/8000.
a=fmtp:101 0-16.
#
U e.f.g.h:5060 -> a.b.c.d:5060
SIP/2.0 200 OK.
Via: SIP/2.0/UDP a.b.c.d:5060;branch=z9hG4bK57af1f44;rport=5060.
From: "User Phone";tag=as3474f6c4.
To:;tag=F6697AC-1DCD.
Date: Tue, 05 Feb 2008 18:24:12 GMT.
Call-ID: 2ad4212700f61c3e1294439f1c72c479@a.b.c.d.
Server: Cisco-SIPGateway/IOS-12.x.
CSeq: 103 INVITE.
Allow: INVITE, OPTIONS, BYE, CANCEL, ACK, PRACK, COMET, REFER, SUBSCRIBE, NOTIFY, INFO, UPDATE, REGISTER.
Allow-Events: telephone-event.
Contact:.
Record-Route:.
Content-Type: application/sdp.
Content-Length: 238.
.
v=0.
o=CiscoSystemsSIP-GW-UserAgent 2171 1394 IN IP4 w.x.y.z.
s=SIP Call.
c=IN IP4 w.x.y.z.
t=0 0.
m=audio 18722 RTP/AVP 0 101.
c=IN IP4 w.x.y.z.
a=rtpmap:0 PCMU/8000.
a=rtpmap:101 telephone-event/8000.
a=fmtp:101 0-16.
#
U a.b.c.d:5060 -> e.f.g.h:5060
ACK sip:18135551212@w.x.y.z:5060 SIP/2.0.
Via: SIP/2.0/UDP a.b.c.d:5060;branch=z9hG4bK38cd41fe;rport.
Route:.
From: "User Phone";tag=as3474f6c4.
To:;tag=F6697AC-1DCD.
Contact:.
Call-ID: 2ad4212700f61c3e1294439f1c72c479@a.b.c.d.
CSeq: 103 ACK.
User-Agent: Asterisk PBX.
Max-Forwards: 70.
Remote-Party-ID: "User Phone";privacy=off;screen=no.
Content-Length: 0.
.
#
U a.b.c.d:5060 -> e.f.g.h:5060
BYE sip:18135551212@w.x.y.z:5060 SIP/2.0.
Via: SIP/2.0/UDP a.b.c.d:5060;branch=z9hG4bK48cf5abc;rport.
Route:.
From: "User Phone";tag=as3474f6c4.
To:;tag=F6697AC-1DCD.
Call-ID: 2ad4212700f61c3e1294439f1c72c479@a.b.c.d.
CSeq: 104 BYE.
User-Agent: Asterisk PBX.
Max-Forwards: 70.
Remote-Party-ID: "User Phone";privacy=off;screen=no.
Proxy-Authorization: Digest username="cpesource", realm="star2star.com", algorithm=MD5, uri="sip:18135551212@w.x.y.z:5060", nonce="valid_nonce", response="valid_response", opaque="".
Content-Length: 0.
.
#
U e.f.g.h:5060 -> a.b.c.d:5060
SIP/2.0 200 OK.
Via: SIP/2.0/UDP a.b.c.d:5060;branch=z9hG4bK48cf5abc;rport=5060.
From: "User Phone";tag=as3474f6c4.
To:;tag=F6697AC-1DCD.
Date: Tue, 05 Feb 2008 18:24:21 GMT.
Call-ID: 2ad4212700f61c3e1294439f1c72c479@a.b.c.d.
Server: Cisco-SIPGateway/IOS-12.x.
Content-Length: 0.
CSeq: 104 BYE.
.
INVITE comes in, 407 goes out and gets ACKed by the remote Asterisk instance, INVITE comes back, this time with a Proxy-Authorization: header attached. The call gets forwarded, set up, and everything is fine.
HOWEVER - sometimes the 407 doesn't make it back to Asterisk (for whatever reason) and the call dies:
#
U a.b.c.d:5060 -> e.f.g.h:5060
INVITE sip:8135551212@e.f.g.h SIP/2.0.
Via: SIP/2.0/UDP a.b.c.d:5060;branch=z9hG4bK3beb8c07;rport.
From: "User Phone";tag=as664fbdbc.
To:.
Contact:.
Call-ID: 165e2edb63fd81b30e629055245b8b28@a.b.c.d.
CSeq: 102 INVITE.
User-Agent: Asterisk PBX.
Max-Forwards: 70.
Remote-Party-ID: "User Phone";privacy=off;screen=no.
Date: Tue, 05 Feb 2008 18:22:47 GMT.
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, SUBSCRIBE, NOTIFY.
X-s2s-region: 1.
Content-Type: application/sdp.
Content-Length: 216.
.
v=0.
o=root 1306 1306 IN IP4 a.b.c.d.
s=session.
c=IN IP4 a.b.c.d.
t=0 0.
m=audio 19154 RTP/AVP 0 101.
a=rtpmap:0 PCMU/8000.
a=rtpmap:101 telephone-event/8000.
a=fmtp:101 0-16.
a=silenceSupp:off - - - -.
#
U e.f.g.h:5060 -> a.b.c.d:5060
SIP/2.0 100 Trying.
Via: SIP/2.0/UDP a.b.c.d:5060;branch=z9hG4bK3beb8c07;rport=5060.
From: "User Phone";tag=as664fbdbc.
To:.
Call-ID: 165e2edb63fd81b30e629055245b8b28@a.b.c.d.
CSeq: 102 INVITE.
Server: OpenSer (1.1.0-notls (i386/linux)).
Content-Length: 0.
Warning: 392 e.f.g.h:5060 "Noisy feedback tells: pid=4363 req_src_ip=a.b.c.d req_src_port=5060 in_uri=sip:8135551212@e.f.g.h out_uri=sip:8135551212@e.f.g.h via_cnt==1".
.
#
U e.f.g.h:5060 -> a.b.c.d:5060
SIP/2.0 407 Proxy Authentication Required.
Via: SIP/2.0/UDP a.b.c.d:5060;branch=z9hG4bK3beb8c07;rport=5060.
From: "User Phone";tag=as664fbdbc.
To:;tag=0dd4490c85a9eb8d48ff967a8700cef0.d2fe.
Call-ID: 165e2edb63fd81b30e629055245b8b28@a.b.c.d.
CSeq: 102 INVITE.
Proxy-Authenticate: Digest realm="star2star.com", nonce="valid_nonce".
Server: OpenSer (1.1.0-notls (i386/linux)).
Content-Length: 0.
Warning: 392 e.f.g.h:5060 "Noisy feedback tells: pid=4363 req_src_ip=a.b.c.d req_src_port=5060 in_uri=sip:8135551212@e.f.g.h out_uri=sip:8135551212@e.f.g.h via_cnt==1".
.
#
U a.b.c.d:5060 -> e.f.g.h:5060
CANCEL sip:8135551212@e.f.g.h SIP/2.0.
Via: SIP/2.0/UDP a.b.c.d:5060;branch=z9hG4bK3beb8c07;rport.
From: "User Phone";tag=as664fbdbc.
To:.
Call-ID: 165e2edb63fd81b30e629055245b8b28@a.b.c.d.
CSeq: 102 CANCEL.
User-Agent: Asterisk PBX.
Max-Forwards: 70.
Remote-Party-ID: "User Phone";privacy=off;screen=no.
Content-Length: 0.
.
#
U e.f.g.h:5060 -> a.b.c.d:5060
SIP/2.0 483 Too Many Hops.
Via: SIP/2.0/UDP a.b.c.d:5060;branch=z9hG4bK3beb8c07;rport=5060.
From: "User Phone";tag=as664fbdbc.
To:;tag=0dd4490c85a9eb8d48ff967a8700cef0.f74a.
Call-ID: 165e2edb63fd81b30e629055245b8b28@a.b.c.d.
CSeq: 102 CANCEL.
Server: OpenSer (1.1.0-notls (i386/linux)).
Content-Length: 0.
Warning: 392 e.f.g.h:5060 "Noisy feedback tells: pid=4365 req_src_ip=e.f.g.h req_src_port=5060 in_uri=sip:8135551212@e.f.g.h out_uri=sip:8135551212@e.f.g.h via_cnt==71".
.
So, in this scenario the INVITE comes in and OpenSER responds with a 407. The remote endpoint (Asterisk) never receives the 407, gives up on the request, and sends a CANCEL after about 60 seconds. At this point I'm not really sure what happens, but something loops in OpenSER until Max-Forwards: is exceeded. The Warning header on the 483 points the same way: req_src_ip is the proxy's own address (e.f.g.h) and via_cnt is 71.
We can verify from packet captures on the remote Asterisk system that the 407 is not being received and therefore, Asterisk isn't resending the INVITE with auth.
What's going on here? We're not seeing any other messages being lost and we're not seeing packet loss, so why does the 407 go missing?
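For anyone chasing the same thing through a big capture, here is a rough Python sketch that scans ngrep-style text like the dumps above and lists every Call-ID where a 407 went out but no ACK with the same CSeq number ever came back. The function names and the sample are made up for illustration, and the parsing assumes the layout shown in this post: packets separated by "#" lines, with a "." at the end of each line.

```python
import re
from collections import defaultdict

START_RESPONSE = re.compile(r"SIP/2\.0 (\d{3})\b")
START_REQUEST = re.compile(r"([A-Z]+) sips?:")


def parse_block(block):
    """Return (kind, call_id, cseq_number) for one captured packet, or None."""
    kind = None
    headers = {}
    for raw in block.splitlines():
        line = raw.strip()
        if line.endswith("."):
            line = line[:-1]  # drop the "." ngrep shows at the end of each line
        if kind is None:
            m = START_RESPONSE.match(line) or START_REQUEST.match(line)
            if m:
                kind = m.group(1)  # status code or request method
            continue
        if not line:
            break  # blank line: end of headers
        name, sep, value = line.partition(":")
        if sep:
            headers.setdefault(name.strip().lower(), value.strip())
    cseq = headers.get("cseq", "").split()
    if kind is None or "call-id" not in headers or not cseq:
        return None
    return kind, headers["call-id"], cseq[0]


def unacked_407s(capture):
    """Call-IDs that saw a 407 but no ACK for the same CSeq number."""
    seen = defaultdict(set)
    for block in re.split(r"(?m)^#[ \t]*$", capture):
        parsed = parse_block(block)
        if parsed:
            kind, call_id, cseq = parsed
            seen[(call_id, cseq)].add(kind)
    return sorted(call_id for (call_id, _), kinds in seen.items()
                  if "407" in kinds and "ACK" not in kinds)


SAMPLE = """\
#
U e.f.g.h:5060 -> a.b.c.d:5060
SIP/2.0 407 Proxy Authentication Required.
Call-ID: good@a.b.c.d.
CSeq: 102 INVITE.
.
#
U a.b.c.d:5060 -> e.f.g.h:5060
ACK sip:9415551212@e.f.g.h SIP/2.0.
Call-ID: good@a.b.c.d.
CSeq: 102 ACK.
.
#
U e.f.g.h:5060 -> a.b.c.d:5060
SIP/2.0 407 Proxy Authentication Required.
Call-ID: bad@a.b.c.d.
CSeq: 102 INVITE.
.
#
U a.b.c.d:5060 -> e.f.g.h:5060
CANCEL sip:8135551212@e.f.g.h SIP/2.0.
Call-ID: bad@a.b.c.d.
CSeq: 102 CANCEL.
.
"""

print(unacked_407s(SAMPLE))
```

Run against a capture taken on the proxy, this only tells you which calls hit the problem. It says nothing about where the 407 was dropped.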
Monday, July 16, 2007
What's my name?
This is going to be a different kind of post. This post might actually be useful for people trying to solve this problem. Just the facts, ma'am.
One of the things that has repeatedly come up in my line of work is CallerID name delivery in PRI (Primary Rate Interface) ISDN (Integrated Services Digital Network) configurations. I learned more about CallerID name today than I ever wanted to know. Just kidding - I love getting into stuff like this!
PRI is great because call setup is fast and CallerID information is available instantly. Or is it? I always knew that Caller ID name is not carried over the PSTN (usually - in some countries it is). The number is (obviously), but the name is usually looked up in CNAM by the terminating switch, not the originating switch. What I didn't know is that sometimes this lookup isn't finished when the initial Q.931 Setup message comes down the PRI to signal a new call.
Sometimes this CNAM lookup takes a little while (fractions of a second) and the name is sent later in a separate Q.931 Facility message. This is true. Cisco says so (PDF). A Cisco ISDN-SIP gateway can be configured to do this one of two ways:
1) Wait until you receive the Q.931 Facility message with name and shove it into the SIP INVITE using either PAI (P-Asserted-Identity) or RPID (Remote-Party-ID). Send the INVITE to the SIP proxy (or wherever).
2) Send the INVITE ASAP, and then send a SIP INFO packet when the name shows up in the Q.931 Facility message.
The default is #2, which is screwy. Very cool, but still screwy. It is much harder to design a SIP platform that can accept the initial INVITE, begin to process the call, and then append the PAI or RPID information received in the later INFO.
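To see why option 2 is harder on the receiving side, here is a toy Python model of the two behaviors. It is only a sketch: the function name, the message labels, and the idea of a fixed buffer window are assumptions for illustration, not Cisco's documented logic. In particular, what the gateway does when the name never arrives inside the window is a guess.

```python
def sip_messages(name_delay_ms, buffer_ms=None):
    """Return the SIP messages a gateway would send for one inbound call.

    name_delay_ms: how long after the Q.931 Setup the Facility message
                   with the name arrives (None if it never arrives).
    buffer_ms:     None models option 2 (send INVITE immediately);
                   a number models option 1 (hold the INVITE that long).
    """
    if buffer_ms is None:
        # Option 2: INVITE goes out right away, name follows in an INFO.
        msgs = ["INVITE (no name yet)"]
        if name_delay_ms is not None:
            msgs.append("INFO (name)")
        return msgs
    # Option 1: hold the INVITE and fold the name in if it shows up in time.
    if name_delay_ms is not None and name_delay_ms <= buffer_ms:
        return ["INVITE (name)"]
    return ["INVITE (no name)"]  # assumption: give up on the name


print(sip_messages(200))                 # option 2
print(sip_messages(200, buffer_ms=500))  # option 1, name arrives in time
```

With option 1 the downstream proxy only ever has to deal with a single INVITE. With option 2 it has to match a later INFO to a call that is already in progress.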
Thanks to Cisco I now understand more about Q.931 and ISDN. Now I need to get this "thing" to work.
My test setup:
PRI -> Asterisk -> PRI -> AS5350XM -> SIP -> OpenSER -> SIP -> Device
I need to get Caller ID with name delivery through this whole mess, from the first PRI to the last SIP device.
The LEC provided the PRI coming into the Asterisk machine. I provided everything else. I saw several roadblocks:
1) Get the CID Name from the LEC (via PRI)
2) Pass it through Asterisk
3) Get it to the 5350 (via PRI)
4) Get it to OpenSER (via SIP)
Knowing what I now know about Caller ID with name in ISDN I knew just what to do for Asterisk. In zapata.conf, my incoming context is lec-in. Here it is (from extensions.conf):
[lec-in]
exten => NXXNXXXXXX,1,Wait(1)
exten => NXXNXXXXXX,n,DoSomethingElse
Yep, that's right. All you need is to Wait a little to get that second Facility IE. Asterisk doesn't support getting the Facility IE later and it certainly doesn't support sending a subsequent SIP INFO. That's a good thing because as I said the "other" way (SIP INFO) just seems goofy to me.
Now I needed to get the CallerID name to the 5350. It didn't seem to work. I started looking at "pri debug span 3" output to see the Q.931 goodness coming from Asterisk. I fired up "debug isdn q931" on the 5350. No dice. It looked like this bug in libpri was killing me:
http://bugs.digium.com/view.php?id=9651
A fix was committed to libpri SVN about a month ago. I updated libpri from SVN, recompiled Asterisk, and installed the new chan_zap.so. I gave it another shot. It looked like the 5350 was now getting the name over Q.931. Using ngrep I looked at the SIP INVITE coming into OpenSER from the 5350. I had an RPID header, but it looked strange. The name field in the Remote-Party-ID header was "pending". What the heck is that about? "pending" was not what I was seeing in Asterisk!
I opened up ngrep a bit to let me see any SIP INFO messages that might be coming later. Sure enough, shortly after the SIP INVITE came a SIP INFO message with my Caller ID name. Going back to my two configuration choices on the 5350, I knew I preferred option #1 (send everything in one SIP INVITE), even if it meant there was a little delay before the caller got audio. How could I configure the 5350 to wait a little and put it all in one SIP INVITE before the Cisco fired it off to OpenSER?
I dug around on cisco.com for a bit. Nothing - at least nothing obvious. You have to love Cisco configuration and Cisco docs. I decided to look around the internet and see if anyone else had this problem.
I looked on Google and found this:
http://puck.nether.net/pipermail/cisco-voip/2005-June/005485.html
I wondered if Mr. Adam Rothschild ever found the solution to his (my) problem. I open up another tab and write him an e-mail. Three minutes later (literally) he sends me this configuration snippet:
---Begin IOS Configuration---
interface Serial3/0:23
 no ip address
 load-interval 30
 isdn switch-type primary-ni
 isdn incoming-voice modem
 isdn supp-service name calling
 isdn negotiate-bchan
 no isdn outgoing display-ie
 no cdp enable
 exit
gateway
 timer receive-rtp 1200
sip-ua
 disable-early-media 180
 retry invite 3
 retry response 3
 retry bye 3
 retry cancel 3
 timers buffer-invite 500
---End IOS Configuration---
Let's get away from this technical mumbo-jumbo and talk about people for a minute...
Mr. Adam Rothschild got an e-mail from a random stranger across the internet referencing an obscure technical problem that he had over two years ago. In less than three minutes he dug up the solution and wrote me back. I have a SmartNet support contract on this 5350 but I doubt the techs at Cisco could have helped me any better or faster than a nice guy (Adam) helping a stranger (me).
Wipe away your tears, you sentimental fool. We're getting back to configuration. This blog is hardcore. Couldn't you tell?
I applied Adam's config to my AS5350XM running IOS 12.4(15)T. Here is the SIP INVITE from the 5350 to OpenSER:
U 192.168.0.1:61306 -> 192.168.0.10:5060
INVITE sip:9418675309@192.168.0.10:5060 SIP/2.0.
Via: SIP/2.0/UDP 192.168.0.1:5060;x-ds0num="ISDN 3/1:D 3/1:DS1 1:DS0";branch=z9hG4bK901AB9.
Remote-Party-ID: "STAR2STAR COMM";party=calling;screen=no;privacy=off.
From: "STAR2STAR COMM";tag=971D8C-1203.
To:.
Date: Mon, 16 Jul 2007 21:57:24 GMT.
Call-ID: 5E99C7DC-331E11DC-8126E6C7-399CBB13@192.168.0.1.
Supported: 100rel,timer,resource-priority,replaces.
Min-SE: 1800.
Cisco-Guid: 1586976572-857608668-2150694933-1673067056.
User-Agent: Cisco-SIPGateway/IOS-12.x.
Allow: INVITE, OPTIONS, BYE, CANCEL, ACK, PRACK, UPDATE, REFER, SUBSCRIBE, NOTIFY, INFO, REGISTER.
CSeq: 101 INVITE.
Max-Forwards: 70.
Timestamp: 1184623044.
Contact:.
Expires: 300.
Allow-Events: telephone-event.
Content-Type: application/sdp.
Content-Disposition: session;handling=required.
Content-Length: 288.
.
v=0.
o=CiscoSystemsSIP-GW-UserAgent 7275 8957 IN IP4 192.168.0.1.
s=SIP Call.
c=IN IP4 192.168.0.1.
t=0 0.
m=audio 20746 RTP/AVP 18 0 101.
c=IN IP4 192.168.0.1.
a=rtpmap:18 G729/8000.
a=fmtp:18 annexb=no.
a=rtpmap:0 PCMU/8000.
a=rtpmap:101 telephone-event/8000.
a=fmtp:101 0-16
Yeah yeah! Look at that Caller ID name in that Remote-Party-ID header! I feel like that's the best looking SIP INVITE I have ever seen. How does one SIP INVITE look better than any other? If you don't know the answer to that question, you haven't been following along.
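If you want to check this automatically instead of eyeballing ngrep, a short Python sketch like the one below pulls the display name out of a Remote-Party-ID header line and flags the "pending" placeholder. The function names are made up, the regex only handles the quoted-name form seen in this post, and the URIs in the examples are hypothetical stand-ins since the real ones are not shown above.

```python
import re

# Matches: Remote-Party-ID: "Some Name" ...rest of header...
RPID = re.compile(r'^Remote-Party-ID:\s*"(?P<name>[^"]*)"', re.IGNORECASE)


def rpid_name(header_line):
    """Return the quoted display name from an RPID header, or None."""
    m = RPID.match(header_line.strip())
    return m.group("name") if m else None


def name_is_usable(header_line):
    """True if the header carries a real name rather than a placeholder."""
    name = rpid_name(header_line)
    return bool(name) and name.lower() != "pending"


good = 'Remote-Party-ID: "STAR2STAR COMM" <sip:9415550100@192.168.0.1>;party=calling;screen=no;privacy=off'
bad = 'Remote-Party-ID: "pending" <sip:9415550100@192.168.0.1>;party=calling'

print(rpid_name(good), name_is_usable(good))
print(rpid_name(bad), name_is_usable(bad))
```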
I wrote Adam back to let him know how it turned out. He wrote me back again, happy to hear that it worked for me. Wow, just wow.
So many things shine through in this post. In one evening I found (and patched) a bug in libpri. I learned more about Q.931 and Caller ID. I found a guy to help me put it all together. The open source development model worked. The promise of easy access to information via the internet skooled me in ISDN. Social networking proved to be very effective, even while using pre-web 2.0 technology (e-mail). Google worked (a lot).
Now I get to put it all together in this blog post to give back a little. Hopefully the next guy (or girl) trying to get some mixed up mess of SIP and ISDN devices to work together with Caller ID Name delivery will get out of the office just a little bit earlier.
One of the things that has repeatedly come up in my line of work is CallerID name delivery in PRI (Primary Rate Interface) ISDN (Integrated Services Digital Network) configurations. I learned more about CallerID name today than I ever wanted to know. Just kidding - I love getting into stuff like this!
PRI is great because call setup is fast and CallerID information is available instantly. Or is it? I always knew that Caller ID name is not carried over the PSTN (usually - in some countries it is). The number does (obviously), but the name is usually looked up in CNAM by the terminating switch, not the originating switch. What I didn't know is that sometimes this isn't done when the initial Q.931 Setup message comes down the PRI to signal a new call.
Sometimes this CNAM lookup takes a little while (fractions of a second) and the name is sent later in a separate Q.931 Facility message. This is true. Cisco says so (PDF). A Cisco ISDN-SIP gateway can be configured to do this one of two ways:
1) Wait until you receive the Q.931 Facility message with name and shove it into the SIP INVITE using either PAI (P-Asserted-Identity) or RPID (Remote-Party-ID). Send the INVITE to the SIP proxy (or wherever).
2) Send the INVITE ASAP, and then send a SIP INFO packet when the name shows up in the Q.931 Facility message.
The default is #2, which is screwy. Very cool, but still screwy. It is much harder to design a SIP platform that can accept the initial INVITE, begin to process the call, and then append the PAI or RPID information received in the later INFO.
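To make the extra work in option #2 concrete, here is a minimal Python sketch of what a receiving platform has to do: remember each call by its Call-ID so a name that shows up later in an INFO can be attached to it. The class and method names are hypothetical, the caller number is made up, and the "pending" placeholder is the value that shows up in the capture later in this post. It is an illustration of the bookkeeping, not how OpenSER or any real proxy does it.

```python
from dataclasses import dataclass
from typing import Dict, Optional

PLACEHOLDER = "pending"  # placeholder name seen in the INVITE capture later in this post


@dataclass
class Call:
    call_id: str
    caller_number: str
    name: Optional[str] = None  # None until a real name is known


class CallTable:
    """Hypothetical call store keyed by Call-ID so a late name can be attached."""

    def __init__(self) -> None:
        self.calls: Dict[str, Call] = {}

    def on_invite(self, call_id: str, caller_number: str, name: Optional[str]) -> Call:
        # Treat the placeholder as "no name yet" rather than a real name.
        if name is not None and name.strip().lower() == PLACEHOLDER:
            name = None
        call = Call(call_id, caller_number, name)
        self.calls[call_id] = call
        return call

    def on_info(self, call_id: str, name: str) -> Optional[Call]:
        # The INFO is matched to the call by the same Call-ID.
        call = self.calls.get(call_id)
        if call is None:
            return None  # INFO for a call we never saw; ignore it
        call.name = name
        return call


table = CallTable()
table.on_invite("example-call-id@192.168.0.1", "5555550100", "pending")
table.on_info("example-call-id@192.168.0.1", "STAR2STAR COMM")
print(table.calls["example-call-id@192.168.0.1"].name)
```

Even in this toy version the call has to sit in a table with an incomplete name while it is already being processed, which is the part that makes option #2 awkward.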
Thanks to Cisco I now understand more about Q.931 and ISDN. Now I need to get this "thing" to work.
My test setup:
PRI -> Asterisk -> PRI -> AS5350XM -> SIP -> OpenSER -> SIP -> Device
I need to get Caller ID with name delivery through this whole mess, from the first PRI to the last SIP device.
The LEC provided the PRI coming into the Asterisk machine. I provided everything else. I saw several roadblocks:
1) Get the CID Name from the LEC (via PRI)
2) Pass it through Asterisk
3) Get it to the 5350 (via PRI)
4) Get it to OpenSER (via SIP)
Knowing what I now know about Caller ID with name in ISDN I knew just what to do for Asterisk. In zapata.conf, my incoming context is lec-in. Here it is (from extensions.conf):
[lec-in]
exten => NXXNXXXXXX,1,Wait(1)
exten => NXXNXXXXXX,n,DoSomethingElse
Yep, that's right. All you need is to Wait a little to get that second Facility IE. Asterisk doesn't support getting the Facility IE later and it certainly doesn't support sending a subsequent SIP INFO. That's a good thing because as I said the "other" way (SIP INFO) just seems goofy to me.
Now I needed to get the CallerID name to the 5350. It didn't seem to work. I started looking at "pri debug span 3" output to see the Q.931 goodness coming from Asterisk. I fired up "debug isdn q931" on the 5350. No dice. It looked like this bug in libpri was killing me:
http://bugs.digium.com/view.php?id=9651
The fix for it was committed to libpri SVN about a month ago. I update libpri from SVN, recompile Asterisk, and install the new chan_zap.so. I give it another shot. It looks like the 5350 is now getting the name over Q.931. Using ngrep I look at the SIP INVITE coming into OpenSER from the 5350. I have an RPID header, but it looks strange. The name field in the Remote-Party-ID header is "pending". What the heck is that about? "pending" was not what I was seeing in Asterisk!
I opened up ngrep a bit to let me see any SIP INFO messages that might be coming later. Sure enough, shortly after the SIP INVITE comes a SIP INFO message with my Caller ID name. Going back to my two configuration choices on the 5350 I knew I preferred option #1 (send everything in one SIP INVITE), even if it meant there was a little delay before the caller got audio. How could I configure the 5350 to wait a little and put it all in one SIP INVITE before the Cisco fired it off to OpenSER?
I dug around on cisco.com for a bit. Nothing - at least nothing obvious. You have to love Cisco configuration and Cisco docs. I decided to look around the internet and see if anyone else had this problem.
I looked on Google and found this:
http://puck.nether.net/piperma
I wondered if Mr. Adam Rothschild ever found the solution to his (my) problem. I open up another tab and write him an e-mail. Three minutes later (literally) he sends me this configuration snippet:
---Begin IOS Configuration---
interface Serial3/0:23
 no ip address
 load-interval 30
 isdn switch-type primary-ni
 isdn incoming-voice modem
 isdn supp-service name calling
 isdn negotiate-bchan
 no isdn outgoing display-ie
 no cdp enable
 exit
gateway
 timer receive-rtp 1200
sip-ua
 disable-early-media 180
 retry invite 3
 retry response 3
 retry bye 3
 retry cancel 3
 timers buffer-invite 500
---End IOS Configuration---
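The line that matters for the name is the last one, "timers buffer-invite 500". My reading is that the gateway holds the INVITE for up to 500 ms while it waits for the name. Here is a rough Python model of that idea, just to show the logic: wait for the name with a deadline, then send whatever you have. The function names are made up, and falling back to "pending" when the timer expires is an assumption on my part, not documented IOS behavior.

```python
import queue
import threading
from typing import Optional


def wait_for_name(facility_names: "queue.Queue[str]", buffer_ms: int) -> Optional[str]:
    """Hold the INVITE for up to buffer_ms waiting for a name; None on timeout."""
    try:
        return facility_names.get(timeout=buffer_ms / 1000.0)
    except queue.Empty:
        return None


def build_invite_name(name: Optional[str]) -> str:
    # Assumption: fall back to the placeholder when the name never arrived in time.
    return name if name is not None else "pending"


# Simulate a Facility message that arrives 100 ms after the Setup.
names: "queue.Queue[str]" = queue.Queue()
threading.Timer(0.1, names.put, args=("STAR2STAR COMM",)).start()
print(build_invite_name(wait_for_name(names, buffer_ms=500)))

# No Facility message at all: give up once the buffer time runs out.
print(build_invite_name(wait_for_name(queue.Queue(), buffer_ms=50)))
```

The tradeoff is the same one as the Wait(1) in the dialplan: a longer buffer catches slower CNAM lookups but delays every call by up to that much.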
Let's get away from this technical mumbo-jumbo and talk about people for a minute...
Mr. Adam Rothschild got an e-mail from a random stranger across the internet referencing an obscure technical problem that he had over two years ago. In less than three minutes he dug up the solution and wrote me back. I have a SmartNet support contract on this 5350 but I doubt the techs at Cisco could have helped me any better or faster than a nice guy (Adam) helping a stranger (me).
Wipe away your tears, you sentimental fool. We're getting back to configuration. This blog is hardcore. Couldn't you tell?
I applied Adam's config to my AS5350XM running IOS 12.4(15)T. Here is the SIP INVITE from the 5350 to OpenSER:
U 192.168.0.1:61306 -> 192.168.0.10:5060
INVITE sip:9418675309@192.168.0.10:5060 SIP/2.0.
Via: SIP/2.0/UDP 192.168.0.1:5060;x-ds0num="ISDN 3/1:D 3/1:DS1 1:DS0";branch=z9hG4bK901AB9.
Remote-Party-ID: "STAR2STAR COMM"
From: "STAR2STAR COMM"
To:
Date: Mon, 16 Jul 2007 21:57:24 GMT.
Call-ID: 5E99C7DC-331E11DC-8126E6C7-399CBB13@192.168.0.1.
Supported: 100rel,timer,resource-priority,replaces.
Min-SE: 1800.
Cisco-Guid: 1586976572-857608668-2150694933-1673067056.
User-Agent: Cisco-SIPGateway/IOS-12.x.
Allow: INVITE, OPTIONS, BYE, CANCEL, ACK, PRACK, UPDATE, REFER, SUBSCRIBE, NOTIFY, INFO, REGISTER.
CSeq: 101 INVITE.
Max-Forwards: 70.
Timestamp: 1184623044.
Contact:
Expires: 300.
Allow-Events: telephone-event.
Content-Type: application/sdp.
Content-Disposition: session;handling=required.
Content-Length: 288.
.
v=0.
o=CiscoSystemsSIP-GW-UserAgent 7275 8957 IN IP4 192.168.0.1.
s=SIP Call.
c=IN IP4 192.168.0.1.
t=0 0.
m=audio 20746 RTP/AVP 18 0 101.
c=IN IP4 192.168.0.1.
a=rtpmap:18 G729/8000.
a=fmtp:18 annexb=no.
a=rtpmap:0 PCMU/8000.
a=rtpmap:101 telephone-event/8000.
a=fmtp:101 0-16
Yeah yeah! Look at that Caller ID name in that Remote-Party-ID header! I feel like that's the best looking SIP INVITE I have ever seen. How does one SIP INVITE look better than any other? If you don't know the answer to that question, you haven't been following along.
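If you would rather not eyeball ngrep output for every test call, a few lines of Python can pull the quoted name out of the Remote-Party-ID header and tell you whether it is a real name or the "pending" placeholder from the earlier capture. This is a minimal sketch: the function names are hypothetical, it only looks at the quoted display name, and it makes no attempt to be a real SIP parser.

```python
import re
from typing import Optional

# Matches a header line that starts with: Remote-Party-ID: "SOME NAME"
# Only the quoted display name is captured; the rest of the line is ignored.
_RPID = re.compile(r'^Remote-Party-ID:\s*"([^"]*)"', re.IGNORECASE)


def rpid_display_name(capture: str) -> Optional[str]:
    """Return the quoted Remote-Party-ID display name, or None if absent."""
    for line in capture.splitlines():
        match = _RPID.match(line.strip())
        if match:
            return match.group(1)
    return None


def name_is_final(name: Optional[str]) -> bool:
    # An empty name or the "pending" placeholder means the real name is missing.
    if name is None:
        return False
    cleaned = name.strip()
    return cleaned != "" and cleaned.lower() != "pending"


capture = (
    "INVITE sip:9418675309@192.168.0.10:5060 SIP/2.0.\n"
    'Remote-Party-ID: "STAR2STAR COMM"\n'
    'From: "STAR2STAR COMM"\n'
)
name = rpid_display_name(capture)
print(name, name_is_final(name))
```

Feed it the text of a captured INVITE; the trailing dots that ngrep prints at the end of each line do not get in the way because only the quoted part is read.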
I wrote Adam back to let him know how it turned out. He wrote me back again, happy to hear that it worked for me. Wow, just wow.
So many things shine through in this post. In one evening I found (and patched) a bug in libpri. I learned more about Q.931 and Caller ID. I found a guy to help me put it all together. The open source development model worked. The promise of easy access to information via the internet skooled me in ISDN. Social networking proved to be very effective, even while using pre-web 2.0 technology (e-mail). Google worked (a lot).
Now I get to put it all together in this blog post to give back a little. Hopefully the next guy (or girl) trying to get some mixed up mess of SIP and ISDN devices to work together with Caller ID Name delivery will get out of the office just a little bit earlier.