Renaming the DNS root: opportunities, pitfalls and a testbed

The introduction of a new naming scheme would entail significant challenges

Tuesday 28 November 2023
Article by: Moritz Müller, Marco Davids

The original blog is in Dutch; this is the English translation.

Authors: Moritz Müller, Marco Davids (SIDN Labs), Willem Toorop, Yorgos Thessalonikefs and Benno Overeinder (NLnet Labs).

In partnership with NLnet Labs, we carried out a study for ICANN to establish what implications 5 alternative naming schemes would have for the root servers of the Domain Name System (DNS). Adoption of a new naming scheme is under consideration because it could, for example, make the root less dependent on the .net top-level domain (TLD). In October, ICANN published our findings in a report that concluded that the introduction of a new naming scheme would entail significant challenges. The results of our research are outlined in this article and detailed in the full study report.

The current DNS root server naming scheme

At the top of the DNS are the root servers. Management of the servers and publication of the root zone (the ultimate reference point for DNS query processing) is the responsibility of 12 organisations around the world. Of those organisations, 11 manage 1 server each, and 1 manages 2 servers, making 13 root servers in total, identified by the letters A to M. Each server is reachable using a domain name of the form <letter>.root-servers.net. So, for example, k.root-servers.net is the domain name of the root server operated by the RIPE NCC.
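As a small illustration of the current naming pattern, the 13 names can be generated from the letters A to M. This is only a sketch: the helper name `root_server_name` is made up for this example.

```python
import string

# The 13 root server letters, a through m.
ROOT_LETTERS = string.ascii_lowercase[:13]


def root_server_name(letter: str, suffix: str = "root-servers.net.") -> str:
    """Return the fully qualified domain name for one root server letter."""
    letter = letter.lower()
    if len(letter) != 1 or letter not in ROOT_LETTERS:
        raise ValueError(f"not a root server letter: {letter!r}")
    return f"{letter}.{suffix}"


names = [root_server_name(letter) for letter in ROOT_LETTERS]
print(len(names), names[0], names[-1])
# prints: 13 a.root-servers.net. m.root-servers.net.
```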

Figure 1 shows the structure of the current naming scheme. The root (the empty circle, far right), .net and the root servers are zones in their own right. The root zone delegates to .net, which in turn delegates to root-servers.net.

Figure 1: Structure of the current scheme (referred to as scheme 5.1 in the RSSAC028 document) and scheme 5.2.

A resolver queries the root servers if, for example, a top-level domain such as .org or .nl is unknown to it (it has no information about the TLD in its cache), or if it has just been started up. A recursive resolver is supplied with a list of root servers and their IPv4 and IPv6 addresses: the 'root hints' file. After starting up, and in most cases daily thereafter, the recursive resolver sends a query, known as a 'priming query', to 1 of the root servers in its root hints file to check that the IP addresses in the file are up to date.
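To make the priming query concrete, the sketch below builds one in DNS wire format using only the Python standard library: a query for the NS records of the root, with an EDNS0 OPT record. The function name, the default UDP payload size of 1232 bytes and the choice to set the DO bit are assumptions made for this example; real resolvers differ in the values they use.

```python
import struct


def build_priming_query(query_id: int = 0, udp_payload_size: int = 1232,
                        dnssec_ok: bool = True) -> bytes:
    """Build a DNS query for the NS records of the root ('. IN NS')."""
    # Header: ID, flags (all zero: standard query, RD not set),
    # QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=1 (the OPT record).
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 1)
    # Question: root name (a single zero byte), QTYPE=NS (2), QCLASS=IN (1).
    question = b"\x00" + struct.pack("!HH", 2, 1)
    # OPT pseudo-record: root owner name, TYPE=41, CLASS carries the UDP
    # payload size, TTL carries extended RCODE, version and flags
    # (DO bit = 0x8000), RDLENGTH=0.
    ttl = 0x00008000 if dnssec_ok else 0
    opt = b"\x00" + struct.pack("!HHIH", 41, udp_payload_size, ttl, 0)
    return header + question + opt


query = build_priming_query(query_id=0x1234)
print(len(query))  # 12-byte header + 5-byte question + 11-byte OPT = 28
```

Sending these bytes over UDP to port 53 of an address from the root hints file would be the network step, which is left out here so that the example has no external dependencies.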

As things stand, that implies that a priming query to k.root-servers.net, for example, also goes to the .net servers. Although the .net TLD is DNSSEC-enabled, root-servers.net is not. Consequently, priming queries are potentially vulnerable to node re-delegation attacks, where queries from a resolver are diverted to fake root servers. An outage affecting .net would also have implications for priming queries.

Alternative naming

In its publication RSSAC028, ICANN's Root Server System Advisory Committee (RSSAC) presented various options for renaming the root servers, so that information about the root servers can be secured using DNSSEC, and so that root servers become independent of .net in some cases.

The RSSAC put forward 5 alternative naming schemes. In this article and in our report, we refer to the 5 schemes by the numbers of the sections of RSSAC028 in which they are described: 5.2 to 5.6, inclusive. Scheme 5.1 is the current naming scheme. Schemes 5.3 and 5.5 have variants, referred to as 5.3.1 and 5.5.1. Each of the schemes is considered briefly below.

Scheme 5.2 is the current naming scheme, but with the root-servers.net zone also signed with DNSSEC (see Figure 1). All the zones in schemes 5.3, 5.4, 5.5 and 5.6 are DNSSEC-enabled as well.
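The difference between the current scheme and scheme 5.2 can be expressed as a check on the delegation chain from the root down to root-servers.net. The sketch below is a simplified model: the `Zone` class and `first_unsigned` helper are invented for this example, and the signed flags simply restate what is described above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Zone:
    name: str
    signed: bool


def first_unsigned(chain: List[Zone]) -> Optional[str]:
    """Return the name of the first unsigned zone in a delegation chain."""
    for zone in chain:
        if not zone.signed:
            return zone.name
    return None


# Current scheme (5.1): the root and .net are signed, root-servers.net is not.
scheme_5_1 = [Zone(".", True), Zone("net.", True),
              Zone("root-servers.net.", False)]
# Scheme 5.2: same names, but root-servers.net is signed as well.
scheme_5_2 = [Zone(".", True), Zone("net.", True),
              Zone("root-servers.net.", True)]

print(first_unsigned(scheme_5_1))  # root-servers.net.
print(first_unsigned(scheme_5_2))  # None
```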

Figure 2: In-zone NS Names (schemes 5.3 and 5.3.1).

Schemes 5.3 and 5.3.1 involve the name 'root-servers' being included in the root zone itself - the names are 'in zone'. Scheme 5.3.1 goes a little further by abbreviating the names of the root servers.

Figure 3: Shared Delegated TLD (scheme 5.4)

Scheme 5.4 goes back to separation of the root-servers zone from the root zone, as at present. However, in contrast to the present situation, there is a new TLD, shared by all root server operators.

Figure 4: Names Delegated to Each Operator (schemes 5.5 and 5.5.1).

Schemes 5.5 and 5.5.1 go further, with each root server having its own TLD.

Figure 5: Single Shared Label for All Operators (schemes 5.6 and 5.6.1).

In schemes 5.6 and 5.6.1, the domain names of the individual root servers would disappear. Instead, all the root servers would be identified by a single, shared domain name with 13 IPv4 addresses and 13 IPv6 addresses.

Purpose of our study

The purpose of the study was to investigate what would happen if each of the root server naming schemes were adopted. We particularly wanted to know whether priming queries sent by existing resolvers would still be successful: in other words, whether resolvers would accept, cache and use the new names.

We also wanted to find out how error-prone resolvers would be: how likely it would be, for example, that a resolver would be unable to validate the DNSSEC signatures of the new records and, as a consequence, would fail priming.

Root name server configurations

We wanted to test the naming schemes in an environment that was as realistic as possible. That implied being able to simulate the behaviour of the existing root name servers and perform tests using various resolver software versions.

And that in turn meant knowing the configurations of the root servers. However, configuration data isn't normally available to the public. We therefore sent a questionnaire to each of the root server operators asking for information such as what name server software they used.

The root server configuration data we compiled is summarised in Table 1. Some of the information shared with us is confidential, such as in cases where the operators run proprietary software. In such cases, the operators did help us to deduce how their software would behave, enabling us to test the naming schemes in a realistic environment. The tests were carried out using an adapted version of NLnet Labs' ldns testns test tool, which allowed us to simulate the behaviour of the proprietary software.

Letters	Operating System	Software	Version
A and J		Confidential
B		Confidential
C	CentOS 7	BIND	9.16
D		NSD	4.1.20
E	FreeBSD	BIND	9.16.x
F	FreeBSD 12.x and 13.x	BIND	9.16.x
F		Confidential
G		BIND	9.16.29-S1
H	Linux	NSD	4.5.0
I		Confidential
K	RedHat Linux derivative	BIND	9.16.x
K	RedHat Linux derivative	Knot DNS	3.1.x
K	RedHat Linux derivative	NSD	4.x.x
L	Ubuntu 18.04	Knot DNS	3.1.8
L	Ubuntu 18.04	NSD	4.6.0
M		Confidential

Table 1: Root server configurations.

Our root server testbed

The configurations of the root servers, as detailed in Table 1, formed the basis of our testbed. With help from Vagrant and using an earlier ICANN testbed as our starting point, we built our own open-source testbed to perform automated simulations, in which all the root servers were tested with each of the 5 naming schemes. Our modifications have since been integrated into ICANN's testbed.

Our testbed design makes it easy to add further resolver software versions. In our study, we ran tests using both the latest version and older versions of a number of popular open-source resolver software packages: BIND, Knot Resolver, PowerDNS Recursor and Unbound.

For each scheme, we checked whether the resolvers (i) were able to resolve a test domain name and (ii) adopted the new naming scheme and used it when responding to subsequent queries.

We also measured a number of parameters, such as the number of queries sent during priming, and whether queries were sent over UDP or TCP.
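The checks and measurements described above can be pictured as a matrix of scheme-resolver combinations. The sketch below is not the testbed's code: the `Outcome` fields, the `run_matrix` function and the stand-in probe are hypothetical, the resolver list omits version numbers, and the outcome values are placeholders rather than measurements.

```python
from itertools import product
from typing import Callable, Dict, NamedTuple, Tuple

# Scheme identifiers as used in this article; resolver families only.
SCHEMES = ["5.1", "5.2", "5.3", "5.3.1", "5.4", "5.5", "5.5.1", "5.6", "5.6.1"]
RESOLVERS = ["bind", "knot-resolver", "pdns-recursor", "unbound"]


class Outcome(NamedTuple):
    resolved_test_name: bool  # check (i)
    adopted_new_names: bool   # check (ii)
    queries_sent: int         # number of queries sent during priming
    used_tcp: bool            # whether any priming query went over TCP


def run_matrix(probe: Callable[[str, str], Outcome]
               ) -> Dict[Tuple[str, str], Outcome]:
    """Run the probe once for every scheme-resolver pair."""
    return {(s, r): probe(s, r) for s, r in product(SCHEMES, RESOLVERS)}


def priming_ok(outcome: Outcome) -> bool:
    """Priming counts as successful only if both checks pass."""
    return outcome.resolved_test_name and outcome.adopted_new_names


def placeholder_probe(scheme: str, resolver: str) -> Outcome:
    # Stand-in for starting a resolver against simulated root servers.
    return Outcome(True, True, queries_sent=1, used_tcp=False)


results = run_matrix(placeholder_probe)
print(len(results))  # 9 schemes x 4 resolver families = 36
```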

Our testbed is available to download from GitHub and described in detail in the study report. The entire testbed is open source and therefore free to use.

Result #1: priming queries are often successful, but not always

Our test results are summarised in Figure 6. The 13 simulated root servers were configured in line with the information provided in response to our questionnaire.

Figure 6: Priming query outcomes for the various scheme-resolver combinations.

As Figure 6 shows, priming was successful in most cases, but a few resolvers exhibited unexpected behaviour. Moreover, even when priming was successful, the resolvers were not necessarily able to interact with all the root servers: there were cases where a resolver had to try multiple servers until it found one whose software and configuration enabled the resolver to get a response to its priming query.

Our report points out, for example, that root servers running BIND do not include IP addresses in the additional section of the DNS response when scheme 5.3, 5.3.1, 5.6 or 5.6.1 is active. A common feature of those schemes is that, in such circumstances, the root zone is immediately treated as authoritative for the root server domain name. As a result, when the schemes in question are active, most resolvers send numerous queries to the root servers that run BIND software in order to get the IP addresses. Only PowerDNS Recursor did not behave in that way, and was therefore unable to prime successfully.
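The effect of a priming response without addresses depends on how a resolver deals with missing address records. The sketch below models three simplified policies, loosely following the 'Address queries' row of Table 2. The policy names, the function and the example names are hypothetical, and real resolver logic is more involved.

```python
from enum import Enum
from typing import Dict, List


class AddressPolicy(Enum):
    NEVER = "never"                # never query for root server addresses
    WHEN_MISSING = "when_missing"  # query only for names without addresses
    ALWAYS = "always"              # always query for root server addresses


def names_needing_lookup(ns_names: List[str],
                         additional: Dict[str, List[str]],
                         policy: AddressPolicy) -> List[str]:
    """Return the NS names for which follow-up address queries are sent.

    ns_names:   names from the answer section of the priming response
    additional: name -> addresses found in the additional section
    """
    if policy is AddressPolicy.NEVER:
        return []
    if policy is AddressPolicy.ALWAYS:
        return list(ns_names)
    return [name for name in ns_names if not additional.get(name)]


# Hypothetical response with an empty additional section.
ns_names = ["a.example.", "b.example."]
print(names_needing_lookup(ns_names, {}, AddressPolicy.WHEN_MISSING))
print(names_needing_lookup(ns_names, {}, AddressPolicy.NEVER))
```

With an empty additional section, the 'when missing' policy produces one follow-up lookup per name, while the 'never' policy leaves the resolver without any usable addresses.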

Result #2: TCP traffic to the root servers increases

Figure 6 also shows that more resolvers send their priming queries over TCP, as one would expect. Many versions of the BIND resolver always send priming queries with a buffer size of 512 bytes anyway, so the response is almost always truncated and a follow-up query is sent over TCP. Scheme 5.3.1 is an exception: with that scheme, the responses are smaller than 512 bytes and therefore do not require TCP transport.

Some root servers always set the TC flag in their response if not all the additional information fits in the packet. As a result, resolvers very often use TCP, which may not be desirable for the root server operators.
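A rough size calculation shows why a 512-byte buffer is tight. The sketch below estimates the size of a complete, unsigned priming response under the current names, assuming full name compression, 13 NS records, 13 A records, 13 AAAA records and an OPT record. The function names are invented for this example, and DNSSEC signatures, which the new schemes add, are not counted.

```python
HEADER = 12
QUESTION = 1 + 2 + 2      # root name, QTYPE, QCLASS
RR_FIXED = 2 + 2 + 4 + 2  # TYPE, CLASS, TTL, RDLENGTH
POINTER = 2               # size of a compression pointer
OPT = 1 + RR_FIXED        # root owner name, fixed fields, empty RDATA


def wire_len(name: str) -> int:
    """Uncompressed wire length of a fully qualified domain name."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    return sum(1 + len(label) for label in labels) + 1


def priming_response_size(first_ns: str = "a.root-servers.net.",
                          servers: int = 13) -> int:
    # NS records are owned by the root (1-byte name). The first target is
    # written in full; the others are a one-letter label plus a pointer.
    first = 1 + RR_FIXED + wire_len(first_ns)
    rest = (servers - 1) * (1 + RR_FIXED + 2 + POINTER)
    a_records = servers * (POINTER + RR_FIXED + 4)
    aaaa_records = servers * (POINTER + RR_FIXED + 16)
    return HEADER + QUESTION + first + rest + a_records + aaaa_records + OPT


def exceeds_buffer(response_size: int, udp_buffer_size: int) -> bool:
    """True if the complete response does not fit in the advertised buffer."""
    return response_size > udp_buffer_size


size = priming_response_size()
print(size)                        # 811
print(exceeds_buffer(size, 512))   # True
print(exceeds_buffer(size, 1232))  # False
```

Whether a response that does not fit leads to a TCP retry depends on the server: some set the TC flag, while others leave records out of the additional section.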

Result #3: DNSSEC errors can affect priming

As indicated above, with the new schemes all responses are secured using DNSSEC. We therefore wondered what would happen if a resolver was unable to validate a signature, and whether there were implications for priming. To test that scenario, we deliberately set the time incorrectly on the machines running the resolvers, so that it would seem to the resolvers that all the signatures had expired.

In most cases, that proved to have no impact on priming. Only certain versions of Knot Resolver and PowerDNS Recursor were no longer able to prime successfully, indicating that those resolvers were attempting to validate the priming responses.
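The clock experiment relies on the fact that a DNSSEC signature is only valid between its inception and expiration times. The sketch below is a simplified version of that check using ordinary datetimes. The dates are made up, and real validators compare 32-bit timestamps using serial number arithmetic.

```python
from datetime import datetime, timedelta, timezone


def signature_valid_at(inception: datetime, expiration: datetime,
                       now: datetime) -> bool:
    """True if 'now' falls inside the signature validity period."""
    return inception <= now <= expiration


# Hypothetical validity period of two weeks.
inception = datetime(2023, 10, 1, tzinfo=timezone.utc)
expiration = inception + timedelta(days=14)

correct_clock = inception + timedelta(days=3)
skewed_clock = expiration + timedelta(days=30)  # clock set too far ahead

print(signature_valid_at(inception, expiration, correct_clock))  # True
print(signature_valid_at(inception, expiration, skewed_clock))   # False
```

A resolver that applies this check to the priming response and treats a failure as fatal would fail to prime with a skewed clock, whereas a resolver that does not validate the priming response would not notice.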

Various more general observations regarding resolver behaviour are summarised in Table 2.

Property	Resolvers	Observed behaviour
Moment of priming	knot-resolver-*	After startup
Moment of priming	pdns-recursor-*, unbound-*	Before it starts working on a query
Moment of priming	bind-*	Just after it starts working on a query
Address queries	pdns-recursor-*	Never query for root server addresses
Address queries	unbound-*, bind-9.10.8 … 9.19.13	Query root server addresses when there were none in the priming response
Address queries	bind-9.9.11, knot-resolver-*	Always query for root server addresses
DNSSEC validation	pdns-recursor-4.7.5 … 4.8.4 knot-resolver-5.5.3 … 5.6.0	Does not resolve with a clock skewed outside of DNSSEC validity period
Truncated responses	unbound-* bind-9.13.7 … 9.14.10	The content of truncated responses is ignored. The content is expected to be acquired in the follow-up query over TCP

Table 2: Summary of resolver properties.

Conclusions

Our root server testbed experiments show that adoption of any of the 5 new naming schemes would entail challenges. With schemes 5.3 and 5.3.1, for example, the volume of queries would increase considerably relative to the current situation. Moreover, PowerDNS Recursor is unable to learn the new names from certain root servers. Schemes 5.6 and 5.6.1 give rise to similar problems.

Scheme 5.4 is currently unsuitable for rollout, because it would render Knot Resolver unable to prime successfully. Even with the current scheme and its adapted version (schemes 5.1 and 5.2), problems are liable to occur if the root zone is modified, e.g. if the IP addresses of root servers change. Only schemes 5.5 and 5.5.1 are not associated with any "deal breakers".

A great deal of additional test data and discussion can be found in our study report.

What next?

The ICANN community will now consider our report and possible follow-up action. In the meantime, anyone interested in the root servers is invited to try out our testbed. Finally, we wish to thank all the root server operators and ICANN for their cooperation and support.
