IPv6 and Virtual Data Centres

I was tasked with bringing a pair of Data Centres up to date with IPv6, offering customers the same functionality over IPv6 as they had with IPv4: configurable firewalls, rate limiting, and proper routing. This required setting up my own ‘Virtual Data Centre’, working in our code repository, and creating software versions of:

  • Network switches
  • IPv4/IPv6 routers (more Linux boxes)
  • Host servers
  • Guest (virtual) servers

In terms of extra third-party software, I accomplished this with:

  • Oracle VirtualBox
  • Vagrant
  • A bit of Ruby scripting

Note: I use networking terms like CIDR throughout this article, and there are no screenshots or diagrams at present.

Let’s start with a simple question.
 

Why is IPv6 still not implemented everywhere?

Quite simply, because large-scale IPv6 requires different strategies for:

  • Routing packets to an endpoint
  • Traffic flow classification
  • Optimisation of rules
  • Rate limiting (and deliberately not rate limiting certain traffic)
  • Firewalling
  • Any internal usage of ICMP ‘types’, which are renumbered in ICMPv6
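
To make that last point concrete: ICMPv4 ‘echo request’ is type 8, while the ICMPv6 equivalent is type 128, so firewall rules can’t be ported verbatim. A minimal sketch (the INPUT-chain rules here are illustrative):

  # ICMPv4: allow pings (echo request is type 8)
  iptables -A INPUT -p icmp --icmp-type 8 -j ACCEPT

  # ICMPv6: same intent, but the type is renumbered to 128
  ip6tables -A INPUT -p ipv6-icmp --icmpv6-type 128 -j ACCEPT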

Few engineers want to risk breaking IPv4 for the current, limited benefits of a fancy dual-stack IPv4/IPv6 network. IPv6 works in a broadly similar way to IPv4, but the vastly larger address space (2^128 addresses versus 2^32) means routing has to be done differently.

With IPv4 you have an automatic way of discovering endpoints: ARP (Address Resolution Protocol), plus proxy ARP for when an endpoint sits behind another router. As you connect servers in a subnet together, things just ‘work’. For many engineers, this automatic discovery is the preferred approach. ARP does all this for you:

  1. It checks that the address you’re looking for is in your own subnet (anything else goes via a gateway).
  2. It sends a ‘Who has x.x.x.x? Tell y.y.y.y’ Ethernet broadcast to all systems on the subnet.
  3. A system (usually the owner of the address) replies with ‘x.x.x.x is at xx:xx:xx:xx:xx:xx’, indicating which MAC address owns the IP, or which MAC should relay traffic for it.

From that point onwards, you know where the server is, and further ARP requests/responses are not required unless the endpoint stops responding to other packet types, like TCP or ICMP.
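
You can watch this whole exchange on any Linux box (the interface name eth0 is an assumption):

  # Watch the ARP request/reply traffic described above
  tcpdump -n -i eth0 arp

  # Inspect the resulting neighbour cache; entries are reused until stale
  ip neigh show dev eth0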

Even with Data Centre firewalls, which can look after 1,000 servers each, keeping track of how to route data for each of those isn’t too difficult.
 

Where is Proxy ARP in IPv6?

With IPv6 you don’t have ARP, proxy ARP, or ‘ARP requests’; ARP simply isn’t part of the IPv6 specification. Instead, you have NDP (Neighbour Discovery Protocol, RFC 4861), proxy NDP, and ‘Neighbour Solicitation’ requests.
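
On Linux the moving parts map across fairly directly; a quick sketch (eth0 is an assumption, and the ip6[40] byte match assumes no extension headers):

  # NDP runs over ICMPv6; types 135/136 are Neighbour Solicitation/Advertisement
  tcpdump -n -i eth0 'icmp6 and (ip6[40] == 135 or ip6[40] == 136)'

  # The IPv6 neighbour cache, NDP's equivalent of the ARP table
  ip -6 neigh show dev eth0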
 

Why not use Proxy NDP?

In large-scale Data Centres with complex IPv6 networks, the nodes along a route can be responsible for tens of thousands of addresses. Proxying NDP should be avoided because:

  • With IPv6, endpoints are often allocated an entire subnet so they can ‘pass on’ the ability to create subnets. My company owned a huge ‘/32’ public IPv6 prefix, and customers were allocated a ‘/64’ subnet on each server. A single /64 contains 2^64 (roughly 1.8*(10^19)) addresses, so one Data Centre firewall proxying NDP could in theory need that many entries per customer; proxy NDP is simply inadequate for that and would be open to abuse.
  • Proxy ARP forwards everything. For good reason, and unless you install a third-party package, modern versions of Linux do NOT proxy all NDP requests, only those for addresses you explicitly specify. That means lots of nasty per-address configuration.
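
That nasty configuration looks roughly like this on a modern Linux router (the interface and addresses are placeholders from the 2001:db8::/32 documentation range):

  # Let the kernel answer Neighbour Solicitations on behalf of other hosts
  sysctl -w net.ipv6.conf.eth0.proxy_ndp=1

  # ...but every proxied address must then be listed individually:
  ip -6 neigh add proxy 2001:db8:89ab:cdef::1 dev eth0
  ip -6 neigh add proxy 2001:db8:89ab:cdef::2 dev eth0
  # ...and so on; there is no built-in way to proxy a whole /64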

 

What’s the alternative?

Basically, with IPv6, static routes are the only viable approach for anything bigger than a small office environment. You route traffic for entire subnets and, as you travel down through the nodes, split the routes out ‘via’ different Ethernet interfaces.

Since setting up these static routes requires changes at nearly every level of infrastructure, it isn’t exactly easy to do!
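
As a minimal sketch of the idea (the interface names and the 2001:db8::/32 documentation addresses are placeholders):

  # On the Data Centre router: push the whole customer /64 towards the host
  ip -6 route add 2001:db8:89ab:cdef::/64 via 2001:db8::a dev eth1

  # On the host server: deliver that same /64 to the guest's virtual NIC
  ip -6 route add 2001:db8:89ab:cdef::/64 dev vnet0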
 

Which parts did I do?

I looked after the following:

  • All the ‘ip6tables’ work for the infrastructure, meaning customers could set up IPv6 firewalling just like they can with IPv4 (a flavour of this is sketched after this list).
  • All of the IPv6 QoS traffic shaping and hash tabling.
  • Solved a problem where link-local addresses caused conflicts within a VLAN, preventing the company from having more than one IPv6 firewall in a network zone.
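
To give a flavour of the ip6tables side (the chain name, addresses, and rules here are illustrative rather than the production configuration):

  # Per-customer chain, hanging off FORWARD for the customer's /64
  ip6tables -N CUSTOMER-CDEF
  ip6tables -A FORWARD -d 2001:db8:89ab:cdef::/64 -j CUSTOMER-CDEF

  # ICMPv6 must be allowed, or NDP breaks and nothing resolves
  ip6tables -A CUSTOMER-CDEF -p ipv6-icmp -j ACCEPT
  ip6tables -A CUSTOMER-CDEF -m state --state ESTABLISHED,RELATED -j ACCEPT
  ip6tables -A CUSTOMER-CDEF -p tcp --dport 22 -j ACCEPT
  ip6tables -A CUSTOMER-CDEF -j DROP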

 

What is the ‘Hash Tabling’?

A huge performance improvement. Dealing with QoS throttling rules across thousands of servers carrying gigabits of traffic is difficult. I got all the ‘tc’ traffic shaping, originally designed for IPv4, to work properly with IPv6. This meant customer servers could be throttled correctly, according to the bandwidth they had purchased, rather than to a rather arbitrary 100 Mbps.

Each server was allocated at least one /32 IPv4 address, and could also request a single /64 IPv6 prefix, then multiple /128 IPv6 addresses within that prefix.

In IPv4, the hash ‘buckets’ were keyed on the last byte of the address (making 256 buckets in total), since that is the byte that varies most within a subnet. I moved this to the last byte of the /64 IPv6 prefix, which again is the part that varies most; if the entire prefix routes to the same server, we should never need to look past the prefix when classifying traffic.
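
A minimal sketch of the technique with tc’s u32 classifier (the device name, class rate, and the 2001:db8::/32 documentation prefix are all assumptions). The destination address occupies bytes 24-39 of the IPv6 header, so the last byte of a /64 prefix is byte 31; ‘hashkey mask 0x000000ff at 28’ reads the 32-bit word covering bytes 28-31 and keeps its lowest byte:

  # Root qdisc plus one example class per purchased bandwidth tier
  tc qdisc add dev eth0 root handle 1: htb
  tc class add dev eth0 parent 1: classid 1:10 htb rate 250mbit

  # A 256-bucket u32 hash table (handle 2:) for IPv6 filters
  tc filter add dev eth0 parent 1: prio 5 handle 2: protocol ipv6 u32 divisor 256

  # Hash our whole /32's traffic on the last byte of the /64 prefix
  tc filter add dev eth0 parent 1: prio 5 protocol ipv6 u32 \
      match ip6 dst 2001:db8::/32 \
      hashkey mask 0x000000ff at 28 link 2:

  # Bucket 0xEF then only holds rules for prefixes ending in EF
  tc filter add dev eth0 parent 1: prio 5 protocol ipv6 u32 ht 2:ef: \
      match ip6 dst 2001:db8:89ab:cdef::/64 flowid 1:10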
 

IPv4 usage breakdown

Let’s assume:

  • The Data Centre owns a /20 IPv4 prefix (a 4,096 address range).
  • The entire IPv4 range starts at 192.168.0.0 and ends at 192.168.15.255.
  • Each server has a single IPv4 address, for example, 192.168.0.1.

An IPv4 address might look like this (192.168.0.1 in hexadecimal):

C0A80001

Broken down, this would be:

C0A80: The company’s /20 IPv4 range prefix
001: The server’s unique, 12-bit IPv4 suffix
01: The hash bucket for the server’s IPv4 address
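
For comparison with the IPv6 hashing shown earlier: the IPv4 destination address occupies bytes 16-19 of the IPv4 header, so the bucket byte is the low byte of the word at offset 16 (again a sketch with assumed names):

  # 256 buckets keyed on the last byte of the IPv4 destination address
  tc filter add dev eth0 parent 1: prio 4 handle 3: protocol ip u32 divisor 256
  tc filter add dev eth0 parent 1: prio 4 protocol ip u32 \
      match ip dst 192.168.0.0/20 \
      hashkey mask 0x000000ff at 16 link 3: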

 

IPv6 usage breakdown

Let’s assume:

  • The Data Centre owns a /32 IPv6 prefix (a 79,228,162,514,264,337,593,543,950,336 address range).
  • The entire IPv6 range starts at 2001:DB8:: and ends at 2001:DB8:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF.
  • Each server has a single public /64 IPv6 prefix, such as 2001:DB8:89AB:CDEF::.
  • A server can have a number of IP addresses in this range, such as 2001:DB8:89AB:CDEF:123::1.

An IPv6 address might look like:

2001:0DB8:89AB:CDEF:0123:0000:0000:0001

Broken down, this would be:

2001:0DB8: The company’s /32 IPv6 prefix
89AB:CDEF: Part of the server’s /64 IPv6 prefix
EF: The hash bucket for the server’s IPv6 prefix
0123:0000:0000:0001: One of the server’s IPv6 suffixes

These hashing rules were applied directly to the IPv6 packets encapsulated in the ingress/egress Ethernet frames.

 

The Result

This was proven to work in the ‘Virtual Data Centre’ environment I created at work, bringing the IPv6 offering up to parity with IPv4. It was also shown to work in a real environment.