CRYPTO-BASED HOST IDENTIFIERS

Abstract

Recently there have been many discussions on separating the locator and
identifier functions of IP addresses in order to facilitate scalable
multihoming, mobility and address stability. This document describes a way
to incorporate a 64 bit identifier in IPv6 addresses without getting in
the way of regular IPv6 operation, and the related mechanisms to verify
the authenticity of the identifier used by a correspondent. The intent is
to foster further discussion within the multi6 design team and elsewhere,
not to provide a complete specification.


[Note that this document hasn't been submitted as a draft yet.]

1. Introduction

In many types of interactions across the network it is important to know 
the identity of the correspondent. This is especially true in multihoming 
and mobility, where a correspondent may change its address during a 
session. In [MIPv6] and [NOID] it has been shown to be possible to solve 
mobility and multihoming without introducing a long-lived host or stack 
name identifier. However, this doesn't mean that having such an 
identifier would be without benefits. This document explores the 
possibility of adding a means to identify a host independent of the full 
IPv6 address used by the host and independent of a specific multihoming 
or mobility solution.

This document reflects the author's take on current discussions within the
design team but isn't a design team product. The design team members are,
in reverse alphabetical order: Pekka Savola, Mike O'Dell, Erik Nordmark,
Tony Li, Brian Carpenter and Iljitsch van Beijnum.

2. Overview

There are two types of crypto-based host identifiers: 64 bit and 80 bit. 
The 64 bit type consists of 4 control bits, 48 site key bits and a 12 bit 
host number:

            0     8     16    24    32    40    48    56   63
            +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
            |  site  |C |     site (continued)     |  host  |
            +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

The 64 bit host identifiers are appropriate in cases where the subnet 
bits (bits 48 - 63 in the IPv6 address) are subject to change, for 
instance in a host multihoming or mobility situation. When the subnet 
bits are fixed, which is likely to be the case with site multihoming or 
when no address changes are expected, 80 bit host identifiers that 
include the subnet bits are more appropriate, as these allow 
significantly more hosts to be grouped together in a site. The 80 bit 
host identifier consists of 4 control bits, 44 site key bits, a 16 bit 
host number and a 16 bit subnet number:

0     8     16    24    32    40    48    56    64    72   79
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|   subnet  |  site  |C |   site (continued)    |    host   |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

The control bits are:

- reserved
- 80 bit identifier (1) or 64 bit identifier (0)
- u/l and g bits as outlined below
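
As an illustration, the two layouts can be packed into integers as shown 
below. This is only a sketch: the function names are hypothetical, the 
control bits are treated as an opaque 4 bit value, and the field positions 
are read off the diagrams above on the assumption that bit 0 is the most 
significant bit.

```python
def _check(value: int, bits: int, name: str) -> None:
    """Reject values that do not fit in the given field width."""
    if not 0 <= value < (1 << bits):
        raise ValueError(f"{name} must fit in {bits} bits")


def pack_64bit_identifier(site_key: int, control: int, host: int) -> int:
    """12 site bits, 4 control bits, 36 site bits, 12 host bits."""
    _check(site_key, 48, "site key")
    _check(control, 4, "control")
    _check(host, 12, "host number")
    site_high = site_key >> 36              # first 12 site key bits
    site_low = site_key & ((1 << 36) - 1)   # remaining 36 site key bits
    return (site_high << 52) | (control << 48) | (site_low << 12) | host


def pack_80bit_identifier(subnet: int, site_key: int, control: int, host: int) -> int:
    """16 subnet bits, 12 site bits, 4 control bits, 32 site bits, 16 host bits."""
    _check(subnet, 16, "subnet number")
    _check(site_key, 44, "site key")
    _check(control, 4, "control")
    _check(host, 16, "host number")
    site_high = site_key >> 32              # first 12 site key bits
    site_low = site_key & 0xFFFFFFFF        # remaining 32 site key bits
    return (subnet << 64) | (site_high << 52) | (control << 48) | (site_low << 16) | host


# Arbitrary example values, chosen only to make the fields easy to spot.
print(f"{pack_64bit_identifier(0x123456789ABC, 0x0, 0x7EF):016x}")
print(f"{pack_80bit_identifier(0x0001, 0x123456789AB, 0x0, 0x0042):020x}")
```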

3. EUI-64 Compatibility

Generally, a host will create IPv6 addresses for interfaces based on the 
interface's EUI-64 as outlined in [RFC 2462]. To avoid overlap with 
addresses generated by [RFC 3041] and with regular EUI-64 interface 
identifiers, crypto-based host identifiers have the universal/local bit 
set to "universal" and the group bit set to indicate a group 
(multicast) address. Note that the resulting EUI-64 value is only valid 
for the purposes of generating IPv6 addresses in accordance with [RFC 
2462]. Under no circumstances may such a value be assigned to an 
interface for use as a link address.

4. The Site Identifier

The site identifier is created by generating a key pair using a public 
key crypto algorithm (to be decided). Then the SHA-1 algorithm is used to 
calculate a hash over the public key. If 80 bit host identifiers are to 
be used, the site identifier consists of the first 44 bits of the SHA-1 
hash. If 64 bit host identifiers are to be used, the site identifier 
consists of the last 48 bits of the SHA-1 hash. (The terms "site 
identifier" and "site key bits" are used interchangeably.)
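
The truncation step can be sketched as follows. Since the public key 
algorithm is still to be decided, the public key is represented here as an 
opaque byte string; that encoding and the function name are assumptions 
made for the sake of the example.

```python
import hashlib


def site_identifier(public_key: bytes, identifier_bits: int) -> int:
    """Derive the site identifier from the SHA-1 hash of the site public key."""
    digest = int.from_bytes(hashlib.sha1(public_key).digest(), "big")  # 160 bits
    if identifier_bits == 80:
        return digest >> (160 - 44)       # first 44 bits of the hash
    if identifier_bits == 64:
        return digest & ((1 << 48) - 1)   # last 48 bits of the hash
    raise ValueError("identifier_bits must be 64 or 80")


# Placeholder key material, not a real public key.
example_key = b"example site public key"
print(f"{site_identifier(example_key, 80):011x}")
print(f"{site_identifier(example_key, 64):012x}")
```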

All hosts that hold a host identifier must have a public key pair of their 
own. The host's public key is signed using the site secret key.

Site identifiers, along with the full self-signed public key and other 
pertinent information, are registered publicly to avoid and resolve site 
identifier collisions. When a newly generated site identifier collides 
with an existing one, the new key pair is discarded and a new one is 
generated. This is the only required use of a public registry. All other 
use of such a registry is optional.

Since it is computationally expensive to generate working keys that match 
a specific site identifier, possession of the secret key provides a 
"proof of ownership" of a site identifier that is good enough to fend off 
denial of service attacks and to provide authentication with a strength 
level somewhere between a simple encrypted password and full-out IPsec. 
An important feature is that the site identifier registry doesn't require 
rooted authority: any mechanism that makes a full list of site 
identifiers and public keys along with serial numbers available to anyone 
who wants to do a lookup within a reasonable timeframe after new 
identifiers have been generated is sufficient. A small number of 
repositories that accept new site identifiers and accompanying material 
after checking the signature would work well. Each repository could work 
independently but they could exchange new site identifiers for the sake 
of completeness. Repositories can then make their contents available 
through mirroring and direct querying mechanisms. A good way to allow 
direct queries to the site identifier database would be by publishing a 
copy of an up to date repository in the DNS.

A fully populated 44 or 48 bit range of values is too large to store in 
the DNS without additional hierarchical structure. However, these ranges 
will never be fully populated, both because such a large number of site 
identifiers isn't necessary and because at some point, the chance of 
successive collisions becomes too large to be able to generate a new site 
identifier efficiently. A target for optimum performance would be a 
population somewhere between one in a million (approximately 17.6 million 
and 281 million site identifiers respectively) and one in a thousand 
(approximately 17.6 billion and 281 billion site identifiers). Current 
practice shows that the DNS can handle flat spaces with up to several tens 
of millions of entries, so a modest growth rate (well below Moore's Law) 
maxing out at around one to ten billion sites in 2050 shouldn't be a problem.
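
The population targets follow directly from the size of the 44 and 48 bit 
spaces. The short calculation below takes "one in a million" and "one in a 
thousand" literally; the figures quoted in the text are rounded. The 
function name is hypothetical.

```python
def target_population(site_bits: int, one_in: int) -> int:
    """Number of site identifiers when one in `one_in` values is in use."""
    return (1 << site_bits) // one_in


for site_bits in (44, 48):
    sparse = target_population(site_bits, 1_000_000)
    dense = target_population(site_bits, 1_000)
    print(f"{site_bits} bits: {sparse:,} (one in a million), {dense:,} (one in a thousand)")
```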

5. The Challenge/Response Mechanism

When a host wants to authenticate a correspondent using a crypto-based 
host identifier, it issues a challenge to the correspondent. The layout 
of the challenge and the way it is transmitted to the correspondent are 
to be decided later. The challenge consists of a source correlator and a 
cookie or nonce that may either be sent in the clear or be encrypted 
using the host's secret key. In either case, a correct response tells the host that it is 
dealing with a "live" correspondent rather than some host spewing out 
packets without looking at the return traffic. If the challenge was 
encrypted, the correspondent was also forced to expend a significant 
amount of CPU cycles. An encrypted challenge always contains the data 
found in an unsolicited response, in case the correspondent is not yet in 
possession of the public key.

A response consists of:

- the correlator and cookie values from the challenge (or data determined 
  by the host creating the response if the response is unsolicited)
- the host public key
- the site public key
- the host number
- a site serial number
- the signature over the host key and site serial number
- a signature over the entire response packet

Both challenges and responses allow for additional data. The additional 
data in a challenge is copied into the additional data in the response 
and is included in the signature calculation. This can be used (for 
instance) to copy back a TCP SYN to the host initiating a new TCP 
session. This allows the receiving host to respond with a challenge 
without setting up any state yet. Only after a response to the challenge 
comes back and has been validated does the receiving host proceed to 
process the TCP SYN.
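
This document does not specify how the cookie is constructed. One common 
way to issue a challenge without keeping per-correspondent state, shown 
here purely as an assumption, is to derive the cookie from the correlator 
and the additional data with a keyed hash, so that the echoed values can be 
checked when the response arrives. All names below are hypothetical, and 
the sketch covers only the cookie echo, not the signatures.

```python
import hashlib
import hmac
import os

# Secret known only to the challenging host (assumption: rotated periodically).
LOCAL_SECRET = os.urandom(32)


def _cookie(correlator: bytes, additional_data: bytes) -> bytes:
    # Length-prefix the correlator so the two fields cannot run together.
    message = len(correlator).to_bytes(2, "big") + correlator + additional_data
    return hmac.new(LOCAL_SECRET, message, hashlib.sha1).digest()


def make_challenge(correlator: bytes, additional_data: bytes) -> dict:
    """Build a challenge; nothing is stored on the challenging host."""
    return {
        "correlator": correlator,
        "cookie": _cookie(correlator, additional_data),
        "additional_data": additional_data,   # for instance the original TCP SYN
    }


def echo_is_valid(response: dict) -> bool:
    """Check that the echoed cookie matches the echoed correlator and data."""
    expected = _cookie(response["correlator"], response["additional_data"])
    return hmac.compare_digest(expected, response["cookie"])


challenge = make_challenge(b"\x00\x01", b"copy of the TCP SYN")
print(echo_is_valid(challenge))
```

Only after this check and the signature checks succeed would the receiving 
host process the TCP SYN carried in the additional data.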

Unsolicited responses may be silently dropped at any time.

When checking a response, a host may optionally take advantage of 
information published in the DNS or through other means. This allows the 
host to detect whether it's dealing with the "real" holder of a site 
identifier rather than an impostor that stumbled on a key pair that maps to 
an existing site identifier. It also allows for retiring a compromised 
host key: if the published site serial number is higher than that 
presented by the correspondent, the host key is invalid.
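
The optional check against published information can be sketched as 
follows. The record type and field names are hypothetical; this document 
does not define how the information is represented in the DNS.

```python
from dataclasses import dataclass


@dataclass
class PublishedSiteRecord:
    """Hypothetical view of what is published for a site identifier."""
    site_public_key: bytes
    serial: int


def passes_published_check(response_site_key: bytes,
                           response_serial: int,
                           published: PublishedSiteRecord) -> bool:
    """True if the response matches the published key and is not retired."""
    same_key = response_site_key == published.site_public_key
    # A higher published serial number means the presented host key is invalid.
    serial_current = response_serial >= published.serial
    return same_key and serial_current


record = PublishedSiteRecord(site_public_key=b"site key", serial=7)
print(passes_published_check(b"site key", 7, record))
print(passes_published_check(b"site key", 6, record))
```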

6. Turning the Site Identifier into an Address Range

In certain types of multihoming solutions, such as [ODELL96], the locator 
and identifier functions of the IP address are separated. In these cases, 
the upper layer protocols such as TCP and UDP only see the identifier. In 
[ODELL96] the identifier consists of the lower 64 bits of the IPv6 
address, which is compatible with what is proposed here. However, 
intra-site connectivity using just the lower 64 bits of the IPv6 address 
is problematic. To avoid this problem, and in order to provide a range of 
stable addresses a site may use regardless of its connectivity to the 
Internet, the site identifier may be transformed into a site prefix.

The procedure for transforming an 80 bit host identifier into a site 
prefix is to take the site identifier bits and append them to a 4 
bit prefix assigned by IANA. The resulting 48 bit value is the provider 
independent site prefix. This prefix is combined with an 80 bit host 
identifier to form a complete IPv6 address.

Example of how an 80 bit host identifier is turned into a 48 bit site prefix:

0     8     16    24    32    40    48    56    64    72   79
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|   subnet  |  site  |C |   site (continued)    |    host   |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
             \        \ |                       |
              \        \|                       |
            +--+--+--+--+--+--+--+--+--+--+--+--+
            |IA|  s i t e   i d e n t i f i e r |
            +--+--+--+--+--+--+--+--+--+--+--+--+
            0     8     16    24    32    40   47

(IA = 4 bit prefix assigned by IANA)
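
The same transformation, expressed as a sketch in code. The 4 bit IANA 
prefix has not been allocated, so the value used here is a placeholder, and 
the function name is hypothetical. The field positions follow the diagram 
above, with bit 0 taken as the most significant bit.

```python
import ipaddress

HYPOTHETICAL_IANA_PREFIX_4 = 0xA  # placeholder only, no such allocation exists


def address_from_80bit_identifier(identifier: int,
                                  iana_prefix: int = HYPOTHETICAL_IANA_PREFIX_4
                                  ) -> ipaddress.IPv6Address:
    """Form a full IPv6 address from an 80 bit host identifier."""
    if not 0 <= identifier < (1 << 80):
        raise ValueError("identifier must fit in 80 bits")
    if not 0 <= iana_prefix < (1 << 4):
        raise ValueError("IANA prefix must fit in 4 bits")
    site_high = (identifier >> 52) & 0xFFF        # 12 site bits before the control bits
    site_low = (identifier >> 16) & 0xFFFFFFFF    # 32 site bits after the control bits
    site_identifier = (site_high << 32) | site_low
    site_prefix = (iana_prefix << 44) | site_identifier   # 48 bit site prefix
    return ipaddress.IPv6Address((site_prefix << 80) | identifier)


# subnet 0x0001, site key 0x123456789AB, control 0x0, host 0x0042
print(address_from_80bit_identifier(0x00011230456789AB0042))
```

With these example values the subnet number ends up in bits 48 - 63 of the 
address, directly after the 48 bit site prefix.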

A 64 bit host identifier is turned into a site prefix by concatenating 
the site bits with a 12 bit prefix assigned by IANA. This results in a 60 
bit provider independent prefix. To avoid being limited to a single 
subnet, the top 4 bits of the host number are copied to bits 60 - 63 in 
the IPv6 address. The full 64 bit host identifier is present in the lower 
64 bits to arrive at a full IPv6 address. This allows for 16 subnets with 
256 possible hosts each.

Example of how a 64 bit host identifier is turned into a 60 bit site 
prefix / 64 bit subnet prefix:

      0     8     16    24    32    40    48    56   63
      +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
      |  site  |C |     site (continued)     |  host  |
      +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
       \        \ |                             |
        \        \|                             |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|  IANA  |   s i t e   i d e n t i f i e r   |H |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
0     8     16    24    32    40    48    56   63

(IANA = 12 bit prefix assigned by IANA)
(H = top 4 bits of the host number)
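
A corresponding sketch for the 64 bit case. Again the 12 bit IANA prefix is 
a placeholder, the function name is hypothetical, and the field positions 
are read off the diagram above with bit 0 as the most significant bit.

```python
import ipaddress

HYPOTHETICAL_IANA_PREFIX_12 = 0xABC  # placeholder only, no such allocation exists


def address_from_64bit_identifier(identifier: int,
                                  iana_prefix: int = HYPOTHETICAL_IANA_PREFIX_12
                                  ) -> ipaddress.IPv6Address:
    """Form a full IPv6 address from a 64 bit host identifier."""
    if not 0 <= identifier < (1 << 64):
        raise ValueError("identifier must fit in 64 bits")
    if not 0 <= iana_prefix < (1 << 12):
        raise ValueError("IANA prefix must fit in 12 bits")
    site_high = (identifier >> 52) & 0xFFF             # 12 site bits before the control bits
    site_low = (identifier >> 12) & ((1 << 36) - 1)    # 36 site bits after the control bits
    host = identifier & 0xFFF                          # 12 bit host number
    site_prefix = (iana_prefix << 48) | (site_high << 36) | site_low   # 60 bits
    subnet_bits = host >> 8                            # H: top 4 bits of the host number
    upper_half = (site_prefix << 4) | subnet_bits      # 64 bit subnet prefix
    return ipaddress.IPv6Address((upper_half << 64) | identifier)


# site key 0x123456789ABC, control 0x0, host 0x7EF
print(address_from_64bit_identifier(0x1230456789ABC7EF))
```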

Use of an EUI-64 that isn't a host identifier as outlined in this 
document in combination with one of the above provider independent 
prefixes is undefined and not recommended.

7. Operation

In the following examples host A in site X wants to communicate with host 
B in site Y.

7.1 Address Authentication

- A sends a TCP SYN to B
- B sends a challenge to A, which includes an encrypted cookie, an 
  unsolicited response and the TCP SYN packet as additional data
- B doesn't store any state yet to avoid resource exhaustion attacks
- A receives the challenge, finds the unsolicited response and performs 
  the necessary checks on it:
   * is the signature over the host key (using the site key) valid?
   * does the site public key produce a hash that is equivalent to the 
     site identifier?
   * does the host number equal that in the host identifier?
   * is the signature over the unsolicited response (using the host key) 
     valid?
- After having thus authenticated B's use of its host identifier, A uses 
  B's host public key to decrypt the cookie
- A returns a response for the challenge with the cookie in the clear, 
  along with the additional data
- B receives the response and checks it. In addition to the checks done 
  by A, B also looks up A's site identifier in the DNS to find the full 
  site public key and the current serial number.
   * B checks whether A's site public key in the response is equal to 
     the one in the DNS
   * B checks whether the serial number in the signature over A's host key 
     isn't smaller than the serial number in the DNS
- Subsequent upper layer sessions between the same hosts reuse the 
  previously created state

7.2 Identifierless Multihoming or Mobility

- A and B communicate
- B receives a packet from an unknown address, but the host 
  identifier is A's
- B sends a challenge for A's host identifier to the new address
- B receives a response
   * if the response checks out, B creates mapping state that maps 
     incoming packets from the previously unknown address to the address 
     that A used earlier before the packet is handed off to higher layers
   * if the response doesn't check out, B continues to communicate with A 
     over the known address and an alarm is raised

Note that in this case there is no challenge/response exchange when the 
initial communication is started, but only after there is a change of IP 
address.
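
The mapping state that B keeps in this scenario can be sketched as a simple 
table from newly verified addresses to the address A used earlier. The 
function names are hypothetical and the addresses are documentation 
examples; raising the alarm is left to the caller.

```python
def handle_response(mapping: dict, new_address: str, known_address: str,
                    response_valid: bool) -> bool:
    """Create mapping state only if the response to the challenge checked out."""
    if response_valid:
        mapping[new_address] = known_address
    return response_valid


def address_for_upper_layers(mapping: dict, source_address: str) -> str:
    """Rewrite a verified new address to the one the upper layers already know."""
    return mapping.get(source_address, source_address)


mapping = {}
if not handle_response(mapping, "2001:db8:2::a", "2001:db8:1::a", response_valid=True):
    print("alarm: response did not check out")
print(address_for_upper_layers(mapping, "2001:db8:2::a"))
```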

8. IANA Considerations

IANA is requested to allocate a /4 and a /12 for provider independent 
address ranges derived from crypto-based site identifiers.

9. Security Considerations

Since the hash over the public key is truncated to only 44 or 48 bits, 
there is a significant chance of accidental collisions, even though 
deliberately finding a key for a known hash is extremely difficult. As 
such, this authentication scheme on its own isn't secure enough for use 
with very sensitive applications.

10. Author's Address

Iljitsch van Beijnum
Karel Roosstraat 95
2571 BG  The Hague
Netherlands

Phone: +31-70-3103790

Email: iljitsch@muada.com

11. References

[RFC 2462] S. Thomson and T. Narten, "IPv6 Stateless Address 
           Autoconfiguration", December 1998

[RFC 3041] T. Narten and R. Draves, "Privacy Extensions for Stateless 
           Address Autoconfiguration in IPv6", January 2001

[MIPv6]    "Mobility Support in IPv6", draft-ietf-mobileip-ipv6-24.txt, work 
           in progress

[NOID]     E. Nordmark and T. Li, "Multihoming without IP Identifiers", 
           draft-nordmark-multi6-noid-00.txt, work in progress, October 2003

[M6SEC]    E. Nordmark and T. Li, "Threats relating to IPv6 multihoming 
           solutions", draft-nordmark-multi6-threats-00.txt, work in 
           progress, October 2003

[ODELL96]  M. O'Dell, "8+8 - An Alternate Addressing Architecture
           for IPv6", draft-odell-8+8-00.txt, work in progress, October 1996