Friday, July 13, 2007

Getting Multihomed - Parts 1,2

WARNING: This is going to be a long post. You probably won't make it to the end. I guess that's what happens when I go this long without having a blog.

Since moving into our colo in Tampa way back when, Star2Star has been getting blended bandwidth from our colo facility (E-Solutions). First they had three providers (Verizon/MCI/UUnet, Global Crossing, Level3). Then two (Global Crossing, Level3).

Starting in February, Global Crossing started having some big problems. Mostly packet loss in a router in Miami. Not only did it happen frequently, it was bad (%50 - %60 loss). You can imagine what that does for VoIP...

We are obsessed with quality, so about four months ago we decided to get multihomed. Seems easy enough, right? Get the right equipment, order some circuits, do the BGP thing.

Let's start with the good equipment. We have been using the awesome Cisco Catalyst 3750 to form our redundant switch stack (two 3750G-24-TS-1U configured with STP to the colo's 4500). My buddy Anton Kapela at Five9s Data suggested them. How I love these switches:

- Good stacking (Cisco StackWise)
- Good performance (65.7mpps - that's over 65 million packets per second across the backplane)
- Good performance, with features. That's right, you can do QoS, ACLs, etc at wire speed, per port (within the limits of the TCAM, obviously).
- 24 +4 port density in 1U (24 GigE copper + four GBIC slots)
- More router-type functionality (with EMI software image - gives BGP, etc)

So with a little configuration I should be able to use one of these (right now I just grabbed a spare) to form our BGP capable router to aggregate all of these circuits. Remember those great services I talked about before? Remember the tcam? Turns out that it can only handle about 8,000 unicast routes before it starts to drop into software forwarding/otherwise start to act up. Not that big of a surprise, with the current full BGP table on the internet pushing 225K+ routes the 128MB of RAM in the 3750 wouldn't have done much good anyways. With our configuration (providers directly connected, aggregated routes only) 8000 unicast routes should be just fine. Sure we lose some end to end visibility, but it's still better than what we've got now.

It might not be the perfect equipment (it's no VXR, that's for sure) but it should get us started. Now I have to order some circuits...

We take a look at the customers we have now, the providers they have, the big providers in the area, all of that good stuff. We determine we would like to get (in no order):

- Cogent
- Verizon
- Time Warner

Cogent! Yes, I know, Cogent. Cogent sticks out on that list. Let's start with them.

Dealing with the sales guy was great. Very responsive. The price (I'm sure you know) was tough to beat. Even better than price, there was another perk...

Remember that huge global internet routing table I was talking about? There are many advertised networks in it, all with varying sizes. Some are a full Class A (/8), some are less than a Class C (/24). Or are they? It turns out that most providers/network admins/BGP snobs filter any announcement smaller than a full Class C (/24). Make sense. That table is out of control! Router memory is expensive! There is old equipment! What are those less-than-a-full-class-C small fries doing messing with BGP anyways?

What is someone with a currently small network supposed to do if they want to multihome? We need BGP to control our own routing and peer with other networks and providers. We don't have enough machines to justify the current ARIN/ISP policy of %25/%50 utilization for IP addresses to get a full class C.

It turns out that ARIN has been thinking of us. That's why there is ARIN policy 2001-2. This policy, in short, says that if you can prove you are multihoming, your ISP can give you a full Class C no questions asked. Out of the three providers mentioned above Cogent was the only provider that had even heard of this and they were more than happy to do the allocation. Thanks Cogent! (Why does everyone hate them so much?)

Don't get me wrong. I am really interested in IPv6. I know global IPv4 address space is shrinking. Hacks like NAT are running out too and it is only a matter of time before we run out of IP address space and the internet comes to a halt. Whatever. At the rate Star2Star is growing we'll need all of those IPs soon enough. When the internet really needs IPv6 all of the really smart internet god types will figure it out. I'm not worried.

So we have one provider. We have a class C. Now we need to get an ASN. Before we do that, we need to make the various contacts that ARIN requires. I started with applications for the OrgID and NocID (I think - can't remember exactly now). Much to my surprise, the turnaround time on both of these was less than one hour even though I didn't start until about 5:30 PM on a Friday. I guess they don't believe in P.O.E.T.S Day over at ARIN!

We get approved for the ASN in a day or so. Fax in the contract. Give them their $500. Now we need to wait until the good 'ol US Postal Service delivers TWO copies of our signed contract. I guess ARIN really wants to hold us to that one. Don't worry ARIN I promise we'll keep up our end of the deal. You've been great so far. Two days later (USPS Florida -> Virginia) our ASN (15092) is approved and will be in WHOIS the next day. Fantastic!

We order two more circuits. One from Time Warner, one from Verizon/MCI. Time Warner installs the circuit fairly easily. They give us a /30 and everything is good. Now we just need BGP.

So far I have filled out the Time Warner BGP request form three times. No response, not even an automated one. I have e-mailed tech support. No response, not even an automated one. I have called tech support. They say they can't do anything until I fill out the form. They say they have no requests from me. What gives?!?!? I'm giving up on them for now. At least until next week Monday. I don't give up easily.

Next we deal with Verizon. This has been interesting. Most people know of at least three Verizon-type companies:

- Verizon Wireless (cell phones)
- Verizon local (the ILEC)
- Verizon Business (used to be MCI, I guess)

So far, I have heard the following business names while trying to order/turn up this circuit:

- Verizon
- Verizon Legacy
- Verizon Core
- Verizon Business
- MCI
- UUNET

That's right: UUNET. Are you kidding me? Having six different names for your company is confusing enough. Using UUNET certainly doesn't help. Last time I heard UUNET it was the nineties and I was in middle school. I had to look it up on Wikipedia just to make sure she wasn't totally confused. Turns out it goes something like this:

UUNET -> MCI -> Verizon Business

So far their name hasn't been the only thing they are confused about. I don't even want to get into it right now. I'll make sure to update everyone as the Verizon/MCI/UUNET saga continues.

What is most surprising about all of this? Cogent. With it's horrible reputation and low cost half the people I talk to still cringe at the mention of the "C word". I can tell you this: they have been (by far) the easiest to deal with. Amazing. We'll see how the service is but as of now I am a happy Cogent customer. Anyone that would like to argue about them can try to deal with some of these other characters. Let me know how it goes!

2 comments:

Anonymous said...

I'm working on getting BGP set up with two of the same players you mention. In my case, Time Warner responded to my BGP request for within one business day.

I too went in circles trying to find the answer for Verizon. I ended up dropping email to help4u @ verizonbusiness.com and got a reply within a couple of hours offering to set up BGP.

Good luck!

Anonymous said...

Fantastic article. I wish I had found it when I first started this saga myself. I am in almost the exact same situation. It has taken me many hours and days of research and working w/ the ISPs to discover the information you lay out here.

Kudos!

-john marquart