IPv6 and Virtual Data Centres
I was tasked with bringing a pair of Data Centres up to date with IPv6, offering customers the same functionality they already had with IPv4. This required setting up my own ‘Virtual Data Centre’, using and editing our code repository, and creating software versions of:
- Network switches
- IPv4/IPv6 routers (more Linux boxes)
- Host servers
- Guest (virtual) servers
In terms of extra third-party software, I accomplished this with:
- Oracle VirtualBox
- A bit of Ruby scripting
Note: I’m using networking terms like CIDR in this article, and there are no screenshots or diagrams at present. If you’re not into networking, then this might not be good reading for you!
Let’s start with a simple question.
Why is IPv6 still not implemented everywhere?
Quite simply, because IPv6 requires a different strategy for routing along the way to an endpoint. IPv6 works in a similar way to IPv4, but the increased address space (2^128 addresses vs 2^32) means you likely have to rethink how things work.
With IPv4 you have ARP (Address Resolution Protocol), and proxy ARP. As you connect servers in a subnet together, discovery works automatically. You simply need to:
- Ensure the ARP request you’re getting is for a server in your subnet.
- Forward the ARP ‘Where is x.x.x.x?’ IP request to the node or endpoint.
- Forward the ARP ‘x.x.x.x is at xx:xx:xx:xx:xx:xx’ MAC location response back.
From that point onwards, you know where the server is, and don’t need to issue ARP requests/responses unless it stops responding to other packet types, like TCP or ICMP.
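As a rough illustration of the three steps above, here is a minimal Python sketch of the decision a proxy makes. It is a model only, not real packet handling: the subnet, the host table, and the function name `handle_arp_request` are all hypothetical, and the dictionary lookup stands in for forwarding the request and relaying the reply.

```python
import ipaddress

# Hypothetical subnet and host table (documentation addresses, made-up MACs).
SUBNET = ipaddress.ip_network("192.0.2.0/24")
KNOWN_HOSTS = {
    ipaddress.ip_address("192.0.2.10"): "52:54:00:aa:bb:01",
    ipaddress.ip_address("192.0.2.11"): "52:54:00:aa:bb:02",
}
ARP_CACHE = {}  # what the proxy has learned so far


def handle_arp_request(target_ip):
    """Return the MAC to answer with, or None if the request is ignored."""
    target = ipaddress.ip_address(target_ip)
    # Step 1: only deal with requests for addresses in our subnet.
    if target not in SUBNET:
        return None
    # Steps 2 and 3: the lookup stands in for forwarding
    # 'Where is x.x.x.x?' and relaying the 'x.x.x.x is at ...' reply.
    mac = KNOWN_HOSTS.get(target)
    if mac is not None:
        ARP_CACHE[target] = mac  # remember it, so we need not ask again
    return mac


print(handle_arp_request("192.0.2.10"))    # a known server in the subnet
print(handle_arp_request("198.51.100.1"))  # outside the subnet, ignored
```

The cache is the important part: once an address has been resolved, nothing more is asked until the server stops responding.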
With Data Centre firewalls which can look after 1,000 servers each, keeping track of all those routes still isn’t too difficult.
Where is Proxy ARP in IPv6?
With IPv6 you don’t have ARP, proxy ARP, or ‘ARP requests’. ARP is simply not part of the IPv6 specifications. Instead, you have NDP (Neighbour Discovery Protocol, defined in RFC 4861), proxy NDP and ‘Neighbour Solicitation’ messages.
Why not use Proxy NDP?
In large-scale Data Centres with complex IPv6 networks, nodes in a route can have tens of thousands of addresses. Proxying NDP should be avoided because:
- With IPv6, endpoints are often allocated an entire subnet. My company owned a ‘/48’ IPv6 prefix. Each customer was allocated a ‘/64’ subnet, and everything for that subnet was routed to them, regardless of which IPv6 addresses they had opted to use. Since we’re talking about potentially tracking 2^64 (roughly 1.8*(10^19)) addresses for each customer, proxy NDP is simply inadequate and would be open to abuse.
- Proxy ARP forwards everything, but proxy NDP does not. For good reason, unless you install a third-party package, most modern versions of Linux do NOT proxy all NDP requests, only those for addresses you specify. That means lots of nasty configuration.
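The arithmetic behind those numbers can be checked with Python’s standard `ipaddress` module. The prefix below is taken from the 2001:db8::/32 documentation range and is only a stand-in for a real ‘/48’ allocation.

```python
import ipaddress

# Stand-in /48 from the documentation range, not a real allocation.
site = ipaddress.ip_network("2001:db8:1234::/48")

# How many /64 customer subnets fit into the /48.
customer_subnets = 2 ** (64 - site.prefixlen)

# How many addresses a single /64 customer subnet contains.
addresses_per_customer = 2 ** (128 - 64)

# The first customer subnet carved out of the site prefix.
first_customer = next(site.subnets(new_prefix=64))

print(customer_subnets)        # 65536
print(addresses_per_customer)  # 18446744073709551616
print(first_customer)          # 2001:db8:1234::/64
```

So one ‘/48’ holds 65,536 customer subnets, and each of those holds about 1.8*(10^19) addresses. A per-address table does not scale to that, whereas one entry per ‘/64’ does.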
What’s the alternative?
Basically, with IPv6, static routes are the only viable alternative for anything besides a small office environment. You route traffic for entire subnets and, as it travels down through the nodes, you split the routes out ‘via’ different Ethernet interfaces.
Since setting up these static routes requires changes at nearly every level of infrastructure, it isn’t exactly easy to do!
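To give a feel for what ‘one route per subnet’ looks like on a Linux router, here is a small Python sketch that only builds `ip -6 route add` command strings. It does not run them. The site prefix, the interface names, and the helper `static_route_commands` are hypothetical, and real infrastructure would need matching routes at every hop.

```python
import ipaddress


def static_route_commands(site_prefix, interfaces):
    """Build one `ip -6 route add` command per customer /64.

    `interfaces` is a hypothetical list of interface names, one per
    customer, in the same order as the /64 subnets of the site prefix.
    """
    site = ipaddress.ip_network(site_prefix)
    commands = []
    # zip() stops at the shorter input, so only as many /64s are
    # generated as there are interfaces.
    for subnet, dev in zip(site.subnets(new_prefix=64), interfaces):
        commands.append(f"ip -6 route add {subnet} dev {dev}")
    return commands


for line in static_route_commands("2001:db8:1234::/48", ["vlan101", "vlan102"]):
    print(line)
```

Each customer costs exactly one routing entry, no matter how many of their 2^64 addresses they actually use.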
Which parts did I do?
- All the ‘ip6tables’ work for the infrastructure, meaning customers could set up IPv6 firewalling just like they can with IPv4.
- All of the IPv6 QoS traffic shaping and hash tabling.
- Solved a problem where link-local addresses caused conflicts within a VLAN, preventing the company from having more than one IPv6 firewall in a network zone.
What is the ‘hash tabling’?
A performance improvement. Dealing with QoS throttling rules on thousands of servers with gigabits of traffic is difficult. I got all the ‘tc’ traffic shaping, originally designed for IPv4, to work properly with IPv6. This meant customer servers could be throttled correctly according to the bandwidth they had purchased, instead of to a fairly arbitrary 100 Mbps.
Each server was allocated at least one /32 IPv4 address, but also could get a single /64 IPv6 prefix (not a /128 address)!
The hash ‘buckets’ were based on the last byte of the address (making 256 in total), which is the part that varies most in most IPv4 subnets. For IPv6 I moved this to the last byte of the second 32-bit chunk, that is, the final byte of the /64 prefix, which again was the part that varied most. Traffic was routed by the /64 prefix anyway, not by address, so we should never look at the last 64 bits when routing. The offsets do change depending on which way traffic is flowing, but either way, the correct thing to do is to throttle based on the server side’s prefix.
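The bucket choice described above can be sketched in a few lines of Python. This models the idea only and is not the ‘tc’ configuration itself. The function names and example addresses are hypothetical, while the header offsets come from the standard IPv4 and IPv6 header layouts (IPv4 source at byte 12 and destination at byte 16, IPv6 source at byte 8 and destination at byte 24).

```python
import ipaddress

# Byte offsets of the address fields within the IP header.
HEADER_OFFSETS = {
    4: {"src": 12, "dst": 16},
    6: {"src": 8, "dst": 24},
}

# Which byte of the address is used as the hash key:
# IPv4 -> last byte of the address, IPv6 -> last byte of the /64 prefix.
KEY_BYTE_INDEX = {4: 3, 6: 7}


def hash_bucket(address):
    """Return the bucket (0-255) for a server-side address."""
    addr = ipaddress.ip_address(address)
    return addr.packed[KEY_BYTE_INDEX[addr.version]]


def key_byte_offset(version, field):
    """Offset of the key byte from the start of the IP header.

    `field` is "dst" for traffic going to the server and "src" for
    traffic coming from it, which is why the offset depends on direction.
    """
    return HEADER_OFFSETS[version][field] + KEY_BYTE_INDEX[version]


print(hash_bucket("192.0.2.77"))             # 77
print(hash_bucket("2001:db8:1234:ab::1"))    # 171 (0xab)
print(hash_bucket("2001:db8:1234:ab::beef")) # 171, same /64 so same bucket
print(key_byte_offset(6, "dst"))             # 31
print(key_byte_offset(6, "src"))             # 15
```

Every address inside a customer’s /64 lands in the same bucket, which is what you want when the bandwidth limit applies to the whole prefix.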
This was proven to work in the ‘Virtual Data Centre’ environment I created at work, bringing IPv6 up to date with IPv4. It was also shown to work in a real environment.