Fragmentation, truncation, and timeouts: are large DNS messages falling to bits?
Our analysis based on 164 billion DNS queries
The Domain Name System (DNS) provides one of the core services of the Internet. DNS uses both UDP and TCP as transport protocols, and most responses are sent over UDP because it is fast (1 RTT). UDP, however, is not always suitable for delivering large DNS responses: packets can be dropped or fragmented, so there is a risk that clients will not receive the answers, which can lead to unreachability. To determine how serious the problem of large messages in DNS is, we analyze 164 billion DNS queries/responses collected at the authoritative servers of the Netherlands' .nl ccTLD, covering three full months of data (July 2019, July 2020, and October 2020). In this blog, we present the main results of a paper we published at the Passive and Active Measurement Conference (PAM 2021).
Nobody really likes to wait for a page to load on the Internet, and the DNS can be one of the reasons for slow page load times: domain names need to be resolved before pages can be loaded. The fastest responses are obtained with DNS over UDP (DNS/UDP), which requires one round-trip time (RTT). However, because UDP by design provides no delivery guarantees, DNS can also be used over TCP (DNS/TCP), which takes 2 RTTs to retrieve the same response (TCP requires an extra RTT for its handshake).
DNS/UDP is faster than DNS/TCP, but it has a tough time handling large messages: the original DNS specification limited UDP messages to 512 bytes. That was not enough in many cases, so in 1999 EDNS0 was proposed, allowing UDP message sizes of up to 64 KB. With EDNS0, DNS clients (resolvers) can advertise their UDP buffer size to the authoritative servers, which would use that value as an upper limit when sending responses. If, however, a response was larger than the EDNS0 buffer size advertised by the client, the authoritative server would truncate it and mark it by setting the TC bit. The resolver would use that signal to send the query again, this time over DNS/TCP.
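The negotiation described above can be sketched as a small decision function. This is a simplified illustration and not the logic of any particular name server: the name `udp_response_plan` and its parameters are hypothetical, and real servers also apply their own configured maximum response size.

```python
from typing import Optional

CLASSIC_UDP_LIMIT = 512  # bytes, the limit from the original DNS specification


def udp_response_plan(response_size: int, edns0_buffer: Optional[int]) -> str:
    """Return "send" if the response fits the client's UDP limit, else "truncate".

    edns0_buffer is the UDP buffer size advertised by the resolver,
    or None when the query carried no EDNS0 option.
    """
    if edns0_buffer is None:
        limit = CLASSIC_UDP_LIMIT
    else:
        # Advertised sizes below 512 bytes are treated as 512 (RFC 6891).
        limit = max(edns0_buffer, CLASSIC_UDP_LIMIT)
    return "send" if response_size <= limit else "truncate"


print(udp_response_plan(800, None))    # truncate: no EDNS0, so the 512-byte limit applies
print(udp_response_plan(800, 1232))    # send
print(udp_response_plan(1500, 1232))   # truncate: the resolver should retry over TCP
```

A "truncate" outcome corresponds to a response with the TC bit set, which is what the rest of this analysis counts.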
The issue was that all of this was done at the application layer, which is agnostic to the network layer. In other words, these buffer negotiations did not consider the maximum transmission unit (MTU) of the path between client and authoritative server, and the most common MTU in the core of the Internet is 1500 bytes. If a DNS response was larger than the path MTU, its packets would simply be fragmented or discarded along the way. IPv4 fragmentation is nowadays considered fragile and should be avoided. The worst case is when responses are silently discarded and clients never receive a DNS response, which effectively blocks them from reaching their desired URL.
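To make the relation between MTU and DNS message size concrete, here is a minimal sketch of the arithmetic. It assumes IPv4 headers without options (20 bytes), the fixed IPv6 header without extension headers (40 bytes), and an 8-byte UDP header; the function names are ours and purely illustrative.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
IPV6_HEADER = 40  # bytes, fixed header only, no extension headers
UDP_HEADER = 8    # bytes


def max_dns_payload(mtu: int, ipv6: bool = False) -> int:
    """Largest DNS message that fits in one unfragmented UDP packet."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return mtu - ip_header - UDP_HEADER


def needs_fragmentation(dns_size: int, mtu: int, ipv6: bool = False) -> bool:
    """True if a DNS/UDP message of dns_size bytes exceeds the given MTU."""
    return dns_size > max_dns_payload(mtu, ipv6)


print(max_dns_payload(1500))             # 1472
print(max_dns_payload(1280, ipv6=True))  # 1232
print(needs_fragmentation(4096, 1500))   # True
```

The second result uses the 1280-byte minimum MTU of IPv6, which yields the 1232-byte value that appears later in this post.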
While several other works have investigated this issue, we take a different vantage point: two anycast authoritative servers of the Netherlands' ccTLD (.nl). We analyze 164 billion queries, collected with our DNS big data analysis tool ENTRADA, as shown in the table below:
Metric | July 2019 IPv4 | July 2019 IPv6 | July 2020 IPv4 | July 2020 IPv6 | October 2020 IPv4
---|---|---|---|---|---
Queries/responses | 29.79B | 7.80B | 45.38B | 15.87B | 48.58B
UDP | 28.68B | 7.54B | 43.75B | 15.01B | 46.94B
UDP TC off | 27.80B | 7.24B | 42.06B | 13.88B | 45.49B
UDP TC on | 0.87B | 0.31B | 1.69B | 1.14B | 1.44B
UDP TC on ratio (%) | 2.93% | 3.91% | 3.72% | 7.15% | 2.96%
TCP | 1.11B | 0.25B | 1.36B | 0.85B | 0.36B
TCP ratio (%) | 3.72% | 3.32% | 3.59% | 5.37% | 3.17%
Resolvers | | | | |
UDP TC off | 3.09M | 0.35M | 2.99M | 0.67M | 3.12M
UDP TC on | 0.61M | 0.08M | 0.85M | 0.12M | 0.87M
TCP | 0.61M | 0.08M | 0.83M | 0.12M | 0.87M
ASes | | | | |
UDP TC off | 44.8k | 8.3k | 45.6k | 8.5k | 46.4k
UDP TC on | 23.3k | 8.3k | 27.6k | 5.4k | 28.2k
TCP | 23.5k | 4.3k | 27.3k | 5.2k | 27.9k
Table 1: Datasets from the .nl zone.
We collect data from two .nl anycast authoritative servers (NS1 and NS3, run by two different anycast providers) and show them combined in Table 1. We take yearly snapshots (July 2019 and July 2020), plus October 2020, the first month after DNS Flag Day 2020.
At our vantage points, we see that a small fraction of responses is truncated: 2.93 percent to 7.15 percent, depending on the dataset and IP version. This is the starting point of our analysis.
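The truncation ratios can be approximately recomputed from the counts in Table 1. The sketch below assumes that each ratio is the number of truncated UDP responses divided by the total number of queries/responses. That is our reading of the table, not something stated explicitly in it. Because the table values are rounded to two decimals, the recomputed figures differ slightly from the published ones.

```python
# Rounded values from Table 1, in billions:
# (total queries/responses, UDP TC on, published ratio in %)
TABLE_1 = {
    "July 2019 IPv4": (29.79, 0.87, 2.93),
    "July 2019 IPv6": (7.80, 0.31, 3.91),
    "July 2020 IPv4": (45.38, 1.69, 3.72),
    "July 2020 IPv6": (15.87, 1.14, 7.15),
    "October 2020 IPv4": (48.58, 1.44, 2.96),
}


def truncation_ratio(total: float, truncated: float) -> float:
    """Percentage of all responses that were sent with the TC bit set."""
    return 100.0 * truncated / total


for dataset, (total, truncated, published) in TABLE_1.items():
    ratio = truncation_ratio(total, truncated)
    print(f"{dataset}: recomputed {ratio:.2f}%, Table 1 {published:.2f}%")
```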
The first analysis we do is to calculate the distribution of the response sizes our servers sent. We see in Figure 1 that 99.99% of the responses from the .nl servers are smaller than 1232 bytes (vertical dashed line), which is the size proposed by the DNS Flag Day 2020. One could say "well, that's only valid for the .nl zone". But Google Public DNS, the largest public resolver service on the Internet, reports that 99.7% of their traffic is also smaller than 1232 bytes.
Figure 1: response sizes per server/IP version for July 2020.
Contrary to what we expected, the largest responses are for A and AAAA records of the .nl authoritative servers, not DNSSEC records. The size of the responses also differs per server: NS1 is configured to return minimal responses, while NS3 is not. Minimal responses prevent extra records from being added to the additional section, which reduces the response size.
IP fragmentation can happen at the server side and along the way (the latter only for IPv4, since IPv6 forbids in-network fragmentation). We analyze, per server and per IP version, the number of fragmented responses sent by our servers. Figure 2 shows the results. Very few responses are fragmented: fewer than 10k a day, compared with the 1 to 2 billion queries we see daily in total (Table 1). In the paper, we also describe an active measurement with RIPE Atlas to measure in-network fragmentation. We found that 4.4% of queries are fragmented at the network level in the wild over IPv4.
Figure 2: UDP fragmented queries for .nl authoritative servers.
We see in Table 1 that 2.93% to 7.15% of the responses are truncated. Now we investigate why. Figure 3 shows the CDF of both response sizes and EDNS0 buffer sizes for NS1. We see that most truncated DNS/UDP responses are smaller than 512 bytes, independent of the IP version.
Figure 3: NS1: CDF of DNS/UDP TC responses for .nl: July 2020.
We also see that most buffer sizes are equal to 512 bytes (left dashed vertical line), which is rather small. Oddly, the purple line for IPv4 shows that NS1 receives 13% of its queries without the EDNS0 extension. We found that these queries came from two ASes that behave oddly and only query NS1 (sticky resolvers).
When a resolver receives a truncated response, it should send the same query again using DNS/TCP. We found that this happens in 80% of the cases, as shown in Figure 4.
Figure 4: DNS/UDP TC responses followed by TCP queries.
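One way to measure how often a truncated response is followed by a TCP retry is sketched below. This is not the method used in the paper: the `Event` record, the matching on resolver address and query name, and the 5-second `window` are all assumptions made for illustration.

```python
from typing import List, NamedTuple


class Event(NamedTuple):
    time: float      # seconds since the start of the capture
    resolver: str    # source IP address of the resolver
    qname: str       # queried domain name
    transport: str   # "udp" or "tcp"
    truncated: bool  # TC bit set in the response


def tcp_followup_ratio(events: List[Event], window: float = 5.0) -> float:
    """Fraction of truncated UDP responses followed by a matching TCP query."""
    truncated = [e for e in events if e.transport == "udp" and e.truncated]
    if not truncated:
        return 0.0
    followed = 0
    for tc in truncated:
        # Look for a TCP query from the same resolver for the same name,
        # arriving within `window` seconds after the truncated response.
        for e in events:
            if (e.transport == "tcp"
                    and e.resolver == tc.resolver
                    and e.qname == tc.qname
                    and 0 <= e.time - tc.time <= window):
                followed += 1
                break
    return followed / len(truncated)


events = [
    Event(0.0, "192.0.2.1", "example.nl.", "udp", True),
    Event(0.1, "192.0.2.1", "example.nl.", "tcp", False),
    Event(1.0, "192.0.2.2", "example.nl.", "udp", True),
    Event(2.0, "192.0.2.3", "example.nl.", "udp", False),
]
print(tcp_followup_ratio(events))  # 0.5
```

The nested loop keeps the sketch short. At the scale of billions of queries, a join on resolver and query name in a big data platform would be the more realistic choice.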
DNS Flag Day 2020 proposed that resolver operators configure their EDNS0 buffer sizes to 1232 bytes. That, in turn, would reduce the large buffer sizes we see in Figure 5 and avoid both fragmentation and truncation. We use the October 2020 dataset and compare it against the July 2020 dataset to measure the uptake of the Flag Day recommendation: we take all resolvers seen in both datasets and count how many have migrated to an EDNS0 buffer size of 1232 bytes.
Of 1.85 million resolvers (unique IP addresses), only 11,338 had switched to 1232 bytes since July 2020, suggesting that the flag day didn’t cause operators to change their settings immediately.
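The comparison between the two datasets can be sketched as a set operation. The sketch assumes, for simplicity, that each resolver IP address announces a single EDNS0 buffer size per month, which need not hold in practice. The function name and the sample addresses are illustrative.

```python
from typing import Dict, Set

FLAG_DAY_SIZE = 1232  # bytes, the size recommended by DNS Flag Day 2020


def flag_day_adopters(before: Dict[str, int], after: Dict[str, int]) -> Set[str]:
    """Resolvers seen in both datasets that switched to 1232 bytes."""
    common = before.keys() & after.keys()
    return {
        ip for ip in common
        if before[ip] != FLAG_DAY_SIZE and after[ip] == FLAG_DAY_SIZE
    }


# Resolver IP address -> announced EDNS0 buffer size (made-up sample data)
july = {"192.0.2.1": 4096, "192.0.2.2": 512, "192.0.2.3": 1232, "192.0.2.4": 4096}
october = {"192.0.2.1": 1232, "192.0.2.2": 512, "192.0.2.3": 1232, "198.51.100.9": 1232}

adopters = flag_day_adopters(july, october)
print(sorted(adopters))  # ['192.0.2.1']

# Share reported in the text: 11,338 of 1.85 million resolvers
print(f"{100 * 11338 / 1.85e6:.2f}%")  # 0.61%
```

Resolvers that already announced 1232 bytes in July, or that appear in only one of the two months, are deliberately not counted as adopters.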
We also investigated the daily distribution over a 1.5-year period, as shown in Figure 5. By the end of May 2021, 9% of the resolvers announce 1232 bytes, twice as many as one year earlier. However, the majority still announce either 4096 bytes or other values.
Figure 5: Daily EDNS buffer distribution by resolvers (y axis in log-2 scale).
This study complements previous ones on fragmentation and truncation in the DNS. Large DNS responses are rather rare, but they exist, and they can be prevented by wider adoption of smaller buffer sizes. Server-side fragmentation is very rare for both IPv4 and IPv6, but in-network fragmentation is still present (4.4% for IPv4, similar to previous studies). DNS Flag Day 2020 had some impact, but DNS operators have adopted its recommendations only slowly.
This blog is based on a peer-reviewed paper.