Oh well, at least it's

different
Everything here is my opinion. I do not speak for your employer.
March 2011
April 2011

2011-03-28 »

I hope IPv6 never catches on

(Temporal note: this article was written a few days ago and then time-released.)

This year, like every year, will be the year we finally run out of IPv4 addresses. And like every year before it, you won't be affected, and you won't switch to IPv6.

I was first inspired to write about IPv6 after I read an article by Daniel J. Bernstein (the qmail/djbdns/redo guy) called The IPv6 Mess. Now, that article appears to be from 2002 or 2003, if you can trust its HTTP Last-Modified date, so I don't know if he still agrees with it or not. (If you like trolls, check out the recent reddit commentary about djb's article.) But 8 years later, his article still strikes me as exactly right.

Now, djb's commentary, if I may poorly paraphrase, is really about why it's impossible (or perhaps more precisely, uneconomical, in the sense that there's a chicken-and-egg problem preventing adoption) for IPv6 to catch on without someone inventing something fundamentally new. His point boils down to this: if I run an IPv6-only server, people with IPv4 can't connect to it, and at least one valuable customer is surely on IPv4. So if I adopt IPv6 for my server, I do it in addition to IPv4, not in exclusion. Conversely, if I have an IPv6-only client, I can't talk to IPv4-only servers. So for my IPv6 client to be useful, either all servers have to support IPv6 (not likely), or I must get an IPv4 address, perhaps one behind a NAT.

In short, any IPv6 transition plan involves everyone having an IPv4 address, right up until everyone has an IPv6 address, at which point we can start dropping IPv4, which means IPv6 will start being useful. This is a classic chicken-and-egg problem, and it's unsolvable by brute force; it needs some kind of non-obvious insight. djb apparently hadn't seen any such insight by 2002, and I haven't seen much new since then.

(I'd like to meet djb someday. He would probably yell at me. It would be awesome. </groupie>)

Still, djb's article is a bit limiting, because it's all about why IPv6 physically can't become popular any time soon. That kind of argument isn't very convincing on today's modern internet, where people solve impossible problems all day long using the unstoppable power of "not SQL", Ruby on Rails, and Ajax to-do list applications (ones used by breakfast cereal companies!).

No, allow me to expand on djb's argument using modern Internet discussion techniques:

Top 10 reasons I hope IPv6 never catches on

Just kidding. No, we're going to do this apenwarr-style:

What I hate about IPv6

Really, there's only one thing that makes IPv6 undesirable, but it's a doozy: the addresses are just too annoyingly long. 128 bits: that's 16 bytes, four times as long as an IPv4 address. Or put another way, IPv4 contains almost enough addresses to give one to each human on earth; IPv6 has enough addresses to give 39614081257132168796771975168 (that's 2**95) to every human on earth, plus a few extra if you really must.

Of course, you wouldn't really do that; you would waste addresses to make subnetting and routing easier. But here's the horrible ironic part of it: all that stuff about making routing easier... that's from 20 years ago!

Way back in the IETF dark ages when they were inventing IPv6 (you know it was the dark ages, because they invented the awful will-never-be-popular IPsec at the same time), people were worried about the complicated hardware required to decode IPv4 headers and route packets. They wanted to build the fastest routers possible, as cheaply as possible, and IPv4 routing tables are annoyingly complex. It's pretty safe to assume that someday, as the Internet gets more and more crowded, nearly every single /24 subnet in IPv4 will be routed to a different place. That means - hold your breath - an astonishing 2**24 routes in every backbone router's routing table! And those routes might have 4 or 8 or even 16 bytes of information each! Egads! That's... that's... 256 megs of RAM in every single backbone router!

Oh. Well, back in 1995, putting 256 megs of RAM in a router sounded like a big deal. Nowadays, you can get a $99 Sheevaplug with twice that. And let me tell you, the routers used on Internet backbones cost a lot more than $99.

It gets worse. IPv6 is much more than just a change to the address length; they totally rearranged the IPv4 header format (which means you have to rewrite all your NAT and firewall software, mostly from scratch). Why? Again, to try to reduce the cost of making a router. Back then, people were seriously concerned about making IPv6 packets "switchable" in the same way ethernet packets are: that is, using pure hardware to read the first few bytes of the packet, look it up in a minimal routing table, and forward it on. IPv4's variable-length headers and slightly strange option fields made this harder. Some would say impossible. Or rather, they would, if it were still 1995.

Since then, FPGAs and ASICs and DSPs and microcontrollers have gotten a whole lot cheaper and faster. If Moore's Law calls for a doubling of transistor performance every 18 months, then between 1995 and 2011 (16 years), that's 10.7 doublings, or 1663 times more performance for the price. So if your $10,000 router could route 1 gigabit/sec of IPv4 in 1995 - which was probably pretty good for 1995 - then nowadays it should be able to route 1663 gigabits/sec. It probably can't, for various reasons, but you know what? I sincerely doubt that's IPv4's fault.

If it were still 1995 and you had to route, say, 10 gigabits/sec for the same price as your old 1 gigabit/sec router using the same hardware technology, then yeah, making a more hardware-friendly packet format might be your only option. But the router people somehow forgot about Moore's Law, or else they thought (indications are that they did) that IPv6 would catch on much faster than it has.

Well, it's too late now. The hardware-optimized packet format of IPv6 is worth basically zero to us on modern technology. And neither is the simplified routing table. But if we switch to IPv6, we still have to pay the software cost of those things, which is extremely high. (For example, Linux IPv6 iptables rules are totally different from IPv4 iptables rules. So every Linux user would need to totally change their firewall configuration.)

So okay, the longer addresses don't fix anything technologically, but we're still running out of addresses, right? I mean, you can't argue with the fact that 2**32 is less than the number of people on earth. And everybody needs an IP address, right?

Well, no, they don't:

The rise of NAT

NAT is Network Address Translation, sometimes called IP Masquerading. Basically, it means that as a packet goes through your router/firewall, the router transparently changes your IP address from a private one - one reused by many private subnets all over the world and not usable on the "open internet" - to a public one. Because of the way TCP and UDP work, you can safely NAT many, many private addresses onto a single public address.

So no. Not everybody in the world needs a public IP address. In fact, most people don't need one, because most people make only outgoing connections, and you don't need your own public IP address to make an outgoing connection.

(Update 2011/04/02: A lot of people have criticized this article by talking about how nasty it is to require NAT everywhere. If we had too much NAT, the whole world would fall apart, etc. That argument is awfully hard to understand: any user with a home wireless router has a computer behind a NAT. The world hasn't come to an end. What makes you think we can't handle a little more of the same?)

By the way, the existence of NAT (and DHCP) has largely eliminated another big motivation behind IPv6: automatic network renumbering. Network renumbering used to be a big annoying pain in the neck; you'd have to go through every computer on your network, change its IP address, router, DNS server, etc, rewrite your DNS settings, and so on, every time you changed ISPs. When was the last time you heard about that being a problem? A long, long time ago, because once you switch to private IP subnets, you virtually never have to renumber again. And if you use DHCP, even the rare mandatory renumbering (like when you merge with another company and you're both using 192.168.1.0/24) is rolled out automatically from a central server.

Okay, fine, so you don't need more addresses for client-only machines. But every server needs its own public address, right? And with the rise of peer-to-peer networking, everyone will be a server, right?

Well, again, no, not really. Consider this for a moment:

Every HTTP Server on Earth Could Be Sharing a Single IP Address and You Wouldn't Know The Difference

That's because HTTP/1.1 (which is what everyone uses now... speaking of avoiding chicken/egg problems) supports "virtual hosts." You can connect to an IP address on port 80, and you provide a Host: header at the beginning of the connection, telling it which server name you're looking for. The IP you connect to can then decide to route that request anywhere it wants.

In short, HTTP is IP-agnostic. You could run HTTP over IPv4 or IPv6 or IPX or SMS, if you wanted, and you wouldn't need to care which IP address your server had. In a severely constrained world, Linode or Slicehost or Comcast or whoever could simply proxy all the incoming HTTP requests to their network, and forward the requests to the right server.

(See the very end of this article for discussion of how this applies to HTTPS.)

Would it be a pain? Inefficient? A bit expensive? Sure it would. So was setting up NAT on client networks, when it first arrived. But we got used to it. Nowadays we consider it no big deal. The same could happen to servers.

What I'd expect to happen is that as the IPv4 address space gets more crowded, the cost of a static IP address will go up. Thus, fewer and fewer public IP addresses will be dedicated to client machines, since clients won't want to pay extra for something they don't need. That will free up more and more addresses for servers, who will have to pay extra.

It'll be a long time before we reach 4 billion (2**32) server IPs, particularly given the long-term trend toward more and more (infinitely proxyable) HTTP. In fact, you might say that HTTP/1.1 has successfully positioned itself as the winning alternative to IPv6.

So no, we are obviously not going to run out of IPv4 addresses. Obviously. The world will change, as it did when NAT changed from a clever idea to a worldwide necessity (and earlier, when we had to move from static IPs to dynamic IPs) - but it certainly won't grind to a halt.

Next:

It is possible do do peer-to-peer when both peers are behind a NAT.

Another argument against widespread NATting is that you can't run peer-to-peer protocols if both ends are behind a NAT. After all, how would they figure out how to connect to each other? (Let's assume peer-to-peer is a good idea, for purposes of this article. Don't just think about movie piracy; think about generally improved distributed database protocols, peer-to-peer filesystem backups, and such.)

I won't go into this too much, other than to say that there are already various NAT traversal protocols out there, and as NAT gets more and more annoyingly mandatory, those protocols and implementations are going to get much better.

(Update 2011/04/02: Clarification: I'm not claiming here that NAT traversal is currently standardized or very reliable. I'm just saying that it already sort of works - ask any Bittorrent user - and moreover, that it can be improved without having to upgrade/reconfigure every IP stack and every router in the world first. As more and more people end up getting forcibly stuck behind an ISP-controlled IPv4 NAT, you can bet that some huge innovation around NAT traversal will start to materialize. And it will work with multi-layer NAT, too, I guarantee it.)

Note too that NAT traversal protocols don't have a chicken-and-egg problem like IPv6 does, for the same reason that dynamic IP addresses don't, and NAT itself doesn't. The reason is: if one side of the equation uses it, but the other doesn't, you might never know. That, right there, is the one-line description of how you avoid chicken-and-egg adoption problems. And how IPv6 didn't.

IPv6 addresses are as bad as GUIDs

So here's what I really hate about IPv6: 16-byte (32 hex digit) addresses are impossible to memorize. Worse, auto-renumbering of networks, facilitated by IPv6, mean that anything I memorize today might be totally different tomorrow.

IPv6 addresses are like GUIDs (which also got really popular in the 1990s dark ages, notably, although luckily most of us have learned our lessons since then). The problem with GUIDs are now well-known: that is, although they're globally unique, they're also totally unrecognizable.

If GUIDs were a good idea, we would use them instead of URLs. Are URLs perfect? Does anyone love Network Solutions? No, of course not. But it's 1000x better than looking at http://b05d25c8-ad5c-4580-9402-106335d558fe and trying to guess if that's really my bank's web site or not.

The counterargument, of course, is that DNS is supposed to solve this problem. Give each host a GUID IPv6 address, and then just map a name to that address, and you can have the best of both worlds.

Sounds good, but isn't actually. First of all, go look around in the Windows registry sometime, specifically the HKEY_CLASSES_ROOT section. See how super clean and user-friendly it isn't? Barf. But furthermore, DNS [as generally configured on the Internet] is still a steaming pile of hopeless garbage. When I bring my laptop to my friend's house and join his WLAN, why can't he ping it by name? Because DNS [implementations] suck. Why doesn't it show up by name in his router control panel so he knows which box is using his bandwidth? Because DNS [implementations] suck. Why can the Windows server browse list see it by name (sometimes, after a random delay, if you're lucky), even though DNS can't? Because they got sick of DNS and wrote something that works. (mDNS, while based on DNS, is really a very new thing.) Why do we still send co-workers hyperlinks with IP addresses in them instead of hostnames? Because the fascist sysadmin won't add a DNS entry for the server Bob set up on his desktop PC.

DNS [implementations] are, at best, okay. [They] will get better over time, as necessity dictates. All the problems I listed above are mostly solved already, in one form or another, in different DNS, DHCP, and routing products [but no single product gets everything right, and not everybody uses the best DNS server products, so the net result is confusion and mess]. It's certainly not the DNS protocol that's to blame, it's just how people use it.

(Update 2011/04/02: in the two paragraphs above, clarified that I mean existing DNS implementations, not the DNS protocol. Yes, mDNS is lovely; too bad most normal people still don't have a working install of it. I hope someday we all have something like that; it will benefit IPv4 and IPv6 users equally.)

But still, if you had to switch to IPv6, you'd discover that those DNS problems that were a nuisance yesterday are suddenly a giant fork stabbing you in the face today. I'd rather they fixed DNS before making me switch to something where I can't possibly remember my IP addresses anymore, thanks.

(Update 2011/04/02: To be extra clear here: I am saying that DNS is currently not good enough and has many obvious routes to improvement, none of which pose chicken-and-egg adoption problems like IPv6 does. DNS can be improved. People are actively working on it; I love those people! And yes, the fact that DNS implementations suck is annoying with both IPv4 and IPv6. My point in this section is just that with IPv6, you can't work around the annoyance of sucky DNS as easily as with IPv4, because IPv4 addresses can be memorized, while IPv6 ones can't. Furthermore, if we fix DNS, it will help IPv4 and IPv6. So my advice to IPv6 proponents: fix DNS first, then maybe we can talk about IPv6.)

Server-side NAT could actually make the world a better place

So that's my IPv6 rant. I want to leave you with some good news, however: I think the increasing density of IPv4 addresses will actually make the Internet a better place, not a worse one.

Client-side NAT had an unexpected huge benefit: security. NAT is like "newspeak" in Orwell's 1984: we remove nouns and verbs to make certain things inexpressible. For example, it is not possible for a hacker in Outer Gronkstown to even express to his computer the concept of connecting to the Windows File Sharing port on your laptop, because from where he's standing, there is no name that unambiguously identifies that port. There is no packet, IPv4 or IPv6 or otherwise, that he can send that will arrive at that port.

A NAT can be unbelievably simple-minded, and just because of that one limitation, it will vastly, insanely, unreasonably increase your security. As a society of sysadmins, we now understand this. You could give us all the IPv6 addresses in the world, and we'd still put our corporate networks behind a NAT. No contest.

Server-side NAT is another thing that could actually make life better, not worse. First of all, it gives servers the same security benefits as clients - if I accidentally leave a daemon running on my server, it's not automatically a security hole. (I actually get pretty scared about the vhosts I run, just because of those accidental holes.)

But there's something else, which I would be totally thrilled to see fixed. You see, IPv4 addresses aren't really 32-bits. They're actually 48 bits: a 32-bit IP address plus a 16-bit port number. People treat them as separate things, but what NAT teaches us is that they're really two parts of the same whole: the flow identifier, and you can break them up any way you want.

The address of my personal apenwarr.ca server isn't 74.207.252.179; it's 74.207.252.179:80. As a user of my site, you didn't have to type the IP (which was provided by DNS) or the port number (which is a hardcoded default in your web browser), but if I started another server, say on port 8042, then you would have to enter the port. Worse, the port number would be a weird, meaningless, magic number, akin to memorizing an IP address (though mercifully, only half as long).

So here's my proposal to save the Internet from IPv6: let's extend DNS to give out not only addresses, but port numbers. So if I go to www2.apenwarr.ca, it could send me straight to 74.207.252.179:8042. Or if I ask for ssh.apenwarr.ca, I get 74.207.252.179:22.

Someday, when IPv4 addresses get too congested, I might have to share that IP address with five other people, but that'll be fine, because each of us can run our own web server on something other than port 80, and DNS would transparently give out the right port number.

This also solves the problem with HTTPS. Alert readers will have noticed, in my comments above, that HTTPS can't support virtual hosts the same way HTTP does, because of a terrible flaw in its certificate handling. Someday, someone might make a new version of the HTTPS standard without this terrible flaw, but in the meantime, transparently supporting multiple HTTPS servers via port numbers on the same machine eliminates the problem; each port can have its own certificate.

(Update 2011/03/28: zillions of people wrote to remind me about SNI, an HTTPS extension that allows it to work with vhosts. Thanks! Now, some of those people seemed to think this refutes my article somehow, which is not true. In fact, the existence of an HTTPS vhosting standard makes IPv6 even less necessary. Then again, the standard doesn't work with IE6.)

This proposal has very minor chicken-and-egg problems. Yes, you'll have to update every operating system and every web browser before you can safely use it for everything. But for private use - for example, my personal ssh or VPN or testing web server - at least it'll save me remembering stupid port numbers. Like the original DNS, it can be adopted incrementally, and everyone who adopts it sees a benefit. Moreover, it's layered on top of existing standards, and routable over the existing Internet, so enabling it has basically zero admin cost.

Of course, I can't really take credit for this idea. The solution, DNS SRV records, has already been invented and is being used in a few places.

Embrace IPv4. Love it. Appreciate the astonishing long-lasting simplicity and resilience of a protocol that dates back to the 1970s. Don't let people pressure you into stupid, awful, pain-inducing, benefit-free IPv6. Just relax and have fun.

You're going to be enjoying IPv4 for a long, long time.

...

Update 2011/04/02: Sam Stickland adds, "Regarding the problem of 2^24 routes in the global table, the problem isn't the amount of memory - it's whether the routing computation can take place before the next change in the topology. If it can't the network topology might never stabilise. There is lot of evidence to suggest that Moore's Law won't save us from this growth. RFC4984 goes into a lot of detail on this subject.

"Unfortunately IPv6 has exactly the same problem, and its failure to deal with this is, IMNSHO, its greatest failing. Consider, for example, 16 million multi homed organisations! Given how critical the internet is becoming in many parts of our lives this doesn't seem particularly far fetched.

Some people believe the problem is that an IP address represents both a routing location and a host identity, and splitting these will solve these scaling issues. RFC6115 gives an overview of the all the various proposed solutions - what a shame this wasn't realised when IPv6 was being designed."

...

Update 2011/04/02: There are some interesting comments at Ycombinator News. The most redeeming thing about most of them is that at least now I know people don't just blindly agree with everything I say (which was getting to be a scary trend lately). The bad news is that most of the posts are knee-jerk reactions from people who have religiously bought into IPv6 without actually thinking about it. (Check for yourself: how many of my claims are logic or facts vs. just opinion? How many of the commenters got 40+ upvotes with mostly ad-hominem arguments? (Bonus points for swearing to distract from unproven assertions.) How many of them have very low or negative scores just because they made a point against IPv6? See also: What you can't say.)

This article had the most negative reactions - and thus, of course, the most viewers - since my earlier sequence about XML Parsing and Postel's Law. Anyway, rather than replying to what would certainly be a useless flamewar, I've used the news.yc input to add some clarifications above, for the benefit of future readers. (All of them are marked with "Update 2011/04/02" or [] brackets.)

There is one really valid counterpoint that a few people made that I failed to bring up in my article above. That is: you never know how making things easier - in this case, giving every computer a non-NATted address - will encourage innovation and make new things possible that weren't possible before. It's possible that the whole world would be filled with working peer-to-peer software by now if everyone had a static IP and there was no such thing as NAT. I don't actually believe that specific utopian version, but there really is a valid argument that real innovation might have been lost because of NAT. Conversely, though, the need for NAT is also the primary reason every home router has a built-in firewall nowadays (you know someone would sell a super-cheap firewall-less router if they could, and you know a lot of people would buy it). I remember the days before NAT was widespread. The biggest innovation in "peer-to-peer" at that time was an endless series of internet worms and viruses. Maybe in this case, making it too easy to make incoming connections didn't make things better, it made things worse.

...

Update 2011/04/02: One more thing: some people have commented to the effect that "his opinions about IPv6 will change after he joins Google." First of all, the opinons on this blog have always been and will continue to be mine, not my employer's, and I don't randomly change my personal opinions based on who I work for. And secondly, even if Google does use IPv6 internally - I don't know if they do or not - it won't make any difference. They'll still be talking to apenwarr.ca over IPv4, and everything I wrote here will still be just as true.

I'm CEO at Tailscale, where we make network problems disappear.

Why would you follow me on twitter? Use RSS.

apenwarr on gmail.com