RFC 2553 bind semantics harms the way to AF independence

Horacio J. Peña

RFC 2553 enforces the IPv4-mapped IPv6 address model for bind(2). This has had some very useful short-term results, but it badly harms the path to AF independence, a goal that in my opinion we should try to reach.

Main premise

This paper is based on the premise that it is better to write AF independent programs than IPv6 centric ones.

The basis for this is that we don't believe IPv6 is a cure-all protocol, and that at some time in the future (probably the far future) there will be a new transition from IPv6 to some other protocol. We should do our best so that, when that happens, those who have to make the transition work can do it as easily as possible.

We're experiencing the IPv4 to IPv6 transition, and it's painful. There's too much work to do, and porting applications is responsible for much of that pain.

Yes, porting a single application is not so hard. But when there are so many of them, there is a problem. How many times have we heard that IPv6 adoption is so slow because there is no support in the clients? How much easier would getting that support have been if no changes to the applications had been needed? I believe the answer is ``lots easier''. Porting the applications to the AF independent way will make future transitions very much easier. I believe that's desirable.

What RFC 2553 says

Note: I'm talking about ``RFC 2553'' but meaning ``RFC 2553 and successors'', so I'll quote the rfc2553bis-03 draft, not the RFC.

Because of the importance of providing IPv4 compatibility in the API, these extensions are explicitly designed to operate on machines that provide complete support for both IPv4 and IPv6. A subset of this API could probably be designed for operation on systems that support only IPv6. However, this is not addressed in this memo.

(from ``2. Design Considerations'')

I.e., RFC 2553 applies to dual stack hosts.

Applications may use PF_INET6 sockets to open TCP connections to IPv4 nodes, or send UDP packets to IPv4 nodes, by simply encoding the destination's IPv4 address as an IPv4-mapped IPv6 address, and passing that address, within a sockaddr_in6 structure, in the connect() or sendto() call. When applications use PF_INET6 sockets to accept TCP connections from IPv4 nodes, or receive UDP packets from IPv4 nodes, the system returns the peer's address to the application in the accept(), recvfrom(), or getpeername() call using a sockaddr_in6 structure encoded this way.

(from ``3.7 Compatibility with IPv4 Nodes'')
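As a minimal sketch of the encoding the quoted section describes, the following C helper builds the IPv4-mapped form (::ffff:a.b.c.d) of an IPv4 address inside a sockaddr_in6. The name make_v4mapped and its parameters are hypothetical and do not come from the RFC; only the byte layout and the standard calls (inet_pton, htons) are taken as given.

```c
#include <assert.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper (not part of RFC 2553): fill *out with the
 * IPv4-mapped IPv6 form of a dotted-quad IPv4 address.
 * Returns 0 on success, -1 if ipv4_text is not a valid IPv4 address. */
int make_v4mapped(const char *ipv4_text, unsigned short port,
                  struct sockaddr_in6 *out)
{
    struct in_addr v4;

    assert(ipv4_text != NULL && out != NULL);

    if (inet_pton(AF_INET, ipv4_text, &v4) != 1)
        return -1;

    memset(out, 0, sizeof(*out));
    out->sin6_family = AF_INET6;
    out->sin6_port = htons(port);

    /* Bytes 0-9 stay zero, bytes 10-11 are 0xff,
     * bytes 12-15 carry the IPv4 address in network byte order. */
    out->sin6_addr.s6_addr[10] = 0xff;
    out->sin6_addr.s6_addr[11] = 0xff;
    memcpy(&out->sin6_addr.s6_addr[12], &v4, sizeof(v4));
    return 0;
}
```

The resulting structure is what an IPv6 centric client would pass to connect() or sendto() to reach an IPv4 node, and the standard macro IN6_IS_ADDR_V4MAPPED() recognizes it on the receiving side.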

5.3 IPV6_V6ONLY option for AF_INET6 Sockets

This socket option restricts AF_INET6 sockets to IPv6 communications only. As stated in section <3.7 Compatibility with IPv4 Nodes>, AF_INET6 sockets may be used for both IPv4 and IPv6 communications. Some applications may want to restrict their use of an AF_INET6 socket to IPv6 communications only. For these applications the IPV6_V6ONLY socket option is defined. When this option is turned on, the socket can be used to send and receive IPv6 packets only. This is an IPPROTO_IPV6 level option. This option takes an int value. This is a boolean option. By default this option is turned off.

This implies that an INET6 socket bound to a port (without specifying an address to bind to) will receive IPv4 requests too, unless the IPV6_V6ONLY option is set.

That is done using IPv4-mapped IPv6 addresses.
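For illustration, this is roughly how an application would turn the option on before calling bind(). It is a sketch under the assumption of a POSIX system whose headers define IPV6_V6ONLY; the helper names set_v6only and get_v6only are hypothetical.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: switch IPV6_V6ONLY on or off for an AF_INET6
 * socket. It should be called before bind().
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
int set_v6only(int fd, int enabled)
{
    int value = enabled ? 1 : 0;   /* boolean option carried in an int */

    return setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &value, sizeof(value));
}

/* Hypothetical helper: read the option back.
 * Returns 1 if set, 0 if clear, -1 on error. */
int get_v6only(int fd)
{
    int value = 0;
    socklen_t len = sizeof(value);

    if (getsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &value, &len) != 0)
        return -1;
    return value != 0;
}
```

With the option set, the socket handles IPv6 traffic only, so a separate AF_INET socket is needed to serve IPv4 clients on the same port.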

How to program a server

Let me digress a bit now. I'll show how a server is programmed in the IPv4 only way, in the IPv6 centric way, and in the AF independent way, so that the rest of this paper can be understood.

IPv4 server

int listenfd, connfd;
struct sockaddr_in cliaddr, servaddr;
socklen_t clilen;

listenfd = socket(AF_INET, SOCK_STREAM, 0);

if(listenfd < 0)
   die();

memset(&servaddr, 0, sizeof(servaddr));
servaddr.sin_family = AF_INET;
servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
servaddr.sin_port = htons(1000);

if(bind(listenfd, (struct sockaddr *) &servaddr, sizeof(servaddr)) != 0)
   die();

if(listen(listenfd, 10) != 0)
   die();

clilen = sizeof(cliaddr); /* accept() reads this as the buffer size */
connfd = accept(listenfd, (struct sockaddr *) &cliaddr, &clilen);

if(connfd < 0)
   die();

/* do something with connfd */

This works only with IPv4 connections: if anyone tries to connect to TCP port 1000 on that box over IPv6, the connection will fail.

IPv6 centric server

int listenfd, connfd;
struct sockaddr_in6 cliaddr, servaddr;
socklen_t clilen;

listenfd = socket(AF_INET6, SOCK_STREAM, 0);

if(listenfd < 0)
   die();

memset(&servaddr, 0, sizeof(servaddr));
servaddr.sin6_family = AF_INET6;
servaddr.sin6_addr = in6addr_any;
servaddr.sin6_port = htons(1000);

if(bind(listenfd, (struct sockaddr *) &servaddr, sizeof(servaddr)) != 0)
   die();

if(listen(listenfd, 10) != 0)
   die();

clilen = sizeof(cliaddr); /* accept() reads this as the buffer size */
connfd = accept(listenfd, (struct sockaddr *) &cliaddr, &clilen);

if(connfd < 0)
   die();

/* do something with connfd */

Almost no changes from the IPv4 only server, and it accepts both IPv4 and IPv6 connections. But it is not going to work on an OS with IPv6 support compiled out (not even for IPv4).

AF independent server

int listenfds[MAX_AF], connfd = -1; /* connfd stays -1 until accept() succeeds */
struct addrinfo hints, *res, *ressave;
struct sockaddr_storage ss;
socklen_t sslen;
int n, i, m;
fd_set fdset;

memset(&hints, 0, sizeof(hints));
hints.ai_flags = AI_PASSIVE;
hints.ai_family = AF_UNSPEC;
hints.ai_socktype = SOCK_STREAM;

if(getaddrinfo(NULL, "1000", &hints, &res) != 0)
   die();

ressave = res;

for(n = 0; (n < MAX_AF) && res; res = res->ai_next) {
        listenfds[n] = socket(res->ai_family, res->ai_socktype,
                    res->ai_protocol);
        if(listenfds[n] < 0)
                continue; /* libc supports protocols that the kernel doesn't */

        if(bind(listenfds[n], res->ai_addr, res->ai_addrlen) != 0)
                die();

        if(listen(listenfds[n], 10) != 0)
                die();
        n++;
        }

freeaddrinfo(ressave);

m = 0;
FD_ZERO(&fdset);
for(i = 0; i < n; i++) {
        FD_SET(listenfds[i], &fdset);
        m = MAX(listenfds[i]+1,m);
        }

if(select(m, &fdset, NULL, NULL, NULL) < 0)
   die();

for(i = 0; i < n; i++) {
        if(FD_ISSET(listenfds[i], &fdset)) {
                sslen = sizeof(ss);
                connfd = accept(listenfds[i], (struct sockaddr*) &ss, &sslen);
                break;
                }
        }

if(connfd < 0)
   die();

/* do something with connfd */

Lots harder...

A comment

But that's not all.

Neither the IPv6 centric nor the AF independent way works as cleanly as I presented them. The IPv6 centric way dies on an OS where IPv6 support is not compiled in, so when programming the IPv6 centric way you should check whether the socket call fails and then fall back to working as a pure IPv4 server (i.e., duplication of code).
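A minimal sketch of that fallback follows. The function names are hypothetical, and the set of errno values treated as ``no IPv6 in the kernel'' is an assumption that may differ between systems.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: bind and listen on an already created socket.
 * Closes the socket and returns -1 on failure, returns fd on success. */
int bind_and_listen(int fd, const struct sockaddr *sa, socklen_t salen)
{
    if (bind(fd, sa, salen) != 0 || listen(fd, 10) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: IPv6 centric listener with an IPv4 only fallback.
 * Returns a listening descriptor, or -1 on error. */
int listen_v6_or_v4(unsigned short port)
{
    struct sockaddr_in6 sa6;
    struct sockaddr_in sa4;
    int fd;

    fd = socket(AF_INET6, SOCK_STREAM, 0);
    if (fd >= 0) {
        memset(&sa6, 0, sizeof(sa6));
        sa6.sin6_family = AF_INET6;
        sa6.sin6_addr = in6addr_any;
        sa6.sin6_port = htons(port);
        return bind_and_listen(fd, (struct sockaddr *) &sa6, sizeof(sa6));
    }

    /* Assumption: a kernel without IPv6 reports one of these errors.
     * Any other failure is treated as fatal. */
    if (errno != EAFNOSUPPORT && errno != EPROTONOSUPPORT)
        return -1;

    /* The duplicated code: the same setup again, this time for IPv4. */
    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    memset(&sa4, 0, sizeof(sa4));
    sa4.sin_family = AF_INET;
    sa4.sin_addr.s_addr = htonl(INADDR_ANY);
    sa4.sin_port = htons(port);
    return bind_and_listen(fd, (struct sockaddr *) &sa4, sizeof(sa4));
}
```

Even with the shared bind_and_listen helper, every address family the program wants to support needs its own branch, which is the duplication mentioned above.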

As for the AF independent way, I'll talk about its problems in the following sections of this paper.

But even if both ways worked as well as the previous sections would make you believe, the AF independent way, while harder to do, is a once-in-a-lifetime change, whereas the IPv6 centric way would have to be modified if at any time in the future you want to handle anything that cannot be mapped to IPv6.

End of the digression. Let's go back to RFC 2553.

RFC 2553 implementations

We classify the RFC 2553 implementations by how they implement the bind semantics.

The non compliant ones

These systems consider IPv4 and IPv6 to be different protocols, so they do not let the IPv4 to IPv6 mapping work.

The AF independent way works great. The IPv6 centric way works only for IPv6 connections: the INET6 sockets never catch an IPv4 connection.

OpenBSD, NetBSD (by default) and the MSR stack for Windows are some of the non compliant implementations.

The buggy ones

Warning: I talk about ``bugginess'' only with respect to the issue at hand. The systems I qualify as ``buggy'' are the best I've worked with, and I like the ``buggy'' stacks better than the ``correct'' ones, where trying to work without depending on the AF is really hard.

Modern IPv4 stacks consider INADDR_ANY as meaning ``every address'', not ``default for addresses not otherwise bound'', so they don't allow binding to a specific address when the wildcard address is already bound on the same port. That is to avoid letting applications ``steal'' connections from other ones. In the same way, it shouldn't be allowed to ``steal'' the IPv4 connections from the IPv6 wildcard. Allowing that should be considered a bug.

These stacks allow exactly that: an IPv4 socket can be bound to a port on which the IPv6 wildcard is already bound. That behaviour lets both the IPv6 centric way and the AF independent way work, but it is a bug.

FreeBSD, NetBSD (optionally) and BSDI have that buggy behaviour. Probably most proprietary implementations do too.

Compliant, non-buggy, but unworkable

Warning: I'm not saying that Linux has no bugs. I'm just talking about bind semantics in this paper.

That's Linux. Linux complies with the RFC by letting IPv6 sockets catch IPv4 connections, and it does not have the bug mentioned before, but it is impossible to work with in an AF independent fashion.

When doing the socket/bind/listen loop, the IPv4 bind call will fail because there is already an IPv6 socket bound to the IPv6 wildcard address. So you would have to ignore bind errors, or croak only if none of the bind calls worked; but if a new protocol that has no mapping to the others exists, that amounts to ignoring its bind errors as well. Ignoring these errors is a Bad Thing.
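The conflict can be observed with a small probe. This is only a sketch: probe_wildcard_conflict is a hypothetical name, and the outcome depends on the stack and on its default for IPV6_V6ONLY, so no particular result is claimed for any given system.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical probe: bind the IPv6 wildcard on a kernel-chosen port,
 * then try to bind the IPv4 wildcard on that same port.
 * Returns 0 if the IPv4 bind succeeded, the errno of the failed IPv4
 * bind otherwise (typically EADDRINUSE), or -1 if the probe could not
 * be set up at all (for example, no IPv6 support). */
int probe_wildcard_conflict(void)
{
    struct sockaddr_in6 sa6;
    struct sockaddr_in sa4;
    socklen_t len = sizeof(sa6);
    int fd6, fd4, result;

    fd6 = socket(AF_INET6, SOCK_STREAM, 0);
    if (fd6 < 0)
        return -1;

    memset(&sa6, 0, sizeof(sa6));
    sa6.sin6_family = AF_INET6;
    sa6.sin6_addr = in6addr_any;
    sa6.sin6_port = 0;              /* let the kernel pick a free port */
    if (bind(fd6, (struct sockaddr *) &sa6, sizeof(sa6)) != 0 ||
        listen(fd6, 10) != 0 ||
        getsockname(fd6, (struct sockaddr *) &sa6, &len) != 0) {
        close(fd6);
        return -1;
    }

    fd4 = socket(AF_INET, SOCK_STREAM, 0);
    if (fd4 < 0) {
        close(fd6);
        return -1;
    }

    memset(&sa4, 0, sizeof(sa4));
    sa4.sin_family = AF_INET;
    sa4.sin_addr.s_addr = htonl(INADDR_ANY);
    sa4.sin_port = sa6.sin6_port;   /* same port, already in network order */

    /* Capture errno immediately, before close() can overwrite it. */
    result = (bind(fd4, (struct sockaddr *) &sa4, sizeof(sa4)) == 0) ? 0 : errno;

    close(fd4);
    close(fd6);
    return result;
}
```

A return value of EADDRINUSE corresponds to the behaviour described above, where the second bind in the AF independent loop fails. A return value of 0 corresponds to a stack that keeps the two wildcards apart, or to one where another setting already restricts the IPv6 socket to IPv6. On a stack with isolated IPv4 and IPv6 port spaces, the chosen port could also be taken by an unrelated IPv4 socket, so a single EADDRINUSE is not conclusive there.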

Conclusion: there are no good implementations of RFC 2553

If my classification is not exhaustive and there is a non-buggy, fully compliant implementation of the RFC 2553 bind semantics that doesn't cause problems for programs written in an AF independent way, I'd be very pleased to hear about it and learn how it has avoided all these problems. But I believe the cause is that the RFC is not very good on this point and needs more work.

Until then I can only suggest several possible ways of solving this problem.

Possible solutions

Any of the following possible solutions is good enough for me. My objective is to be able to write portable, AF-independent programs without having to add special cases for IPv6 or any other protocol (I'm an application programmer; I shouldn't care about what is running the network), something I cannot do now because of the way Linux implements bind.

I believe IPv4 mapped addresses will be deprecated sooner or later because IPv4 itself will be deprecated. And I believe that maybe now is the time to start that deprecation, not by disallowing them right now, but by allowing the existence of systems where the IPv4 and IPv6 stacks are isolated (as in OpenBSD and Windows). That's what I call for, but any of the other possible solutions will be enough for me.

Deprecate IPv4 mapped addresses

Maybe the time for IPv4 mapped addresses is over. Maybe they were a good mechanism to get things rolling, but it's time to grow up and leave them behind.

Itojun has mentioned many other problems that IPv4 mapped addresses have in the ipv6-transition-abuse draft.

But there is too much work already done in the IPv6 centric way, so maybe it isn't prudent to throw it all away at once.

Deprecate IPV6_V6ONLY, add IPV6_ACCEPTV4MAPPED option

Then IPv6 sockets would have to be explicitly allowed to accept IPv4 connections. Programs that use the IPv6 centric way would have to be modified a bit, but the buggy implementations and the unworkable one could be corrected without losing features. Just making IPV6_V6ONLY default to on would have the same result.
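Until such a default exists, an application can get the same effect only by special-casing IPv6 itself, which is precisely what this paper argues an application should not have to do. A minimal sketch, assuming a POSIX system that defines IPV6_V6ONLY; the helper name listen_all and its parameters are hypothetical:

```c
#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: open one listening socket per address returned
 * by getaddrinfo(), forcing IPV6_V6ONLY on the AF_INET6 ones.
 * Returns the number of sockets stored in fds[], or -1 on error. */
int listen_all(const char *service, int *fds, int max_fds)
{
    struct addrinfo hints, *res, *ai;
    int n = 0;
    int on = 1;

    assert(service != NULL && fds != NULL && max_fds > 0);

    memset(&hints, 0, sizeof(hints));
    hints.ai_flags = AI_PASSIVE;
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo(NULL, service, &hints, &res) != 0)
        return -1;

    for (ai = res; ai != NULL && n < max_fds; ai = ai->ai_next) {
        int fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);

        if (fd < 0)
            continue;   /* libc knows a family that the kernel does not */

        /* The IPv6 special case: keep this socket from claiming IPv4,
         * so that the AF_INET bind in the same loop does not collide. */
        if (ai->ai_family == AF_INET6 &&
            setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &on, sizeof(on)) != 0) {
            close(fd);
            goto fail;
        }

        /* Bind errors are never ignored here. */
        if (bind(fd, ai->ai_addr, ai->ai_addrlen) != 0 ||
            listen(fd, 10) != 0) {
            close(fd);
            goto fail;
        }
        fds[n++] = fd;
    }

    freeaddrinfo(res);
    return n;

fail:
    while (n > 0)
        close(fds[--n]);
    freeaddrinfo(res);
    return -1;
}
```

A caller would pass the service name or port string, for example listen_all("1000", fds, MAX_AF), and then select() over the returned descriptors as in the AF independent server shown earlier. If the option defaulted to on, the setsockopt() branch could be dropped and the loop would contain no protocol-specific code.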

More magic to getaddrinfo

Take the Linux approach as the good one (it's the only compliant and non-buggy one; again, I'm talking just about the issue at hand and won't judge the general bugginess of any stack here) and add a bit more (yet more!) magic to getaddrinfo, so that it only returns the INET wildcard sockaddr when the kernel has no support for IPv6. Then the buggy stacks could be corrected with no loss of features and the unworkable one would become workable.

Add a provision for double stack implementations

RFC 2553 targets dual stack systems, where one stack implements both the IPv4 and IPv6 protocols.

If the RFC had a short comment saying that systems with two isolated stacks are allowed, and that the IPv4 to IPv6 mapping may be absent on these systems, the non compliant implementations would become compliant, and we would have some implementations that are compliant, non-buggy and easy to work with AF-independently.


2001-06-21