LEMMINGS warns .nl registrants if mail traffic may still be going to domain names they have cancelled

New prototype system helps prevent data breaches

The original blog is in Dutch. This is the English translation.

A cancelled domain name that continues to attract mail traffic can lead to a data breach. In recent years, there have been a number of high-profile incidents in which sensitive information got into the wrong hands following the cancellation of a domain name. The incidents showed us that .nl needed a mechanism for helping organisations and individuals avoid cancellation-related security breaches. At SIDN Labs, we have therefore built a prototype system called LEMMINGS, which we're now piloting and evaluating in collaboration with the .nl registrar Argeweb. This blog explains why cancelled domain names are potentially problematic and how LEMMINGS can help. The system's architecture and technical workings will be covered in a future blog.

Are you the former registrant of a cancelled .nl domain name, and have you had an e-mail from us saying that people may still be addressing e-mail to your old domain? If so, please see our FAQs for answers to your questions.

Related coverage: 'Police could have prevented hack' and 'Major data breach at youth services organisation could have been prevented'.

E-mail is still going strong

Mail is one of the oldest internet applications, having been around since the early 1970s. It nevertheless remains hugely popular, despite the emergence of chat programmes, video calling and other rival technologies. Analysis we performed using our crawler revealed that about 73 per cent of all .nl domain names have a mail server (MX) record in the DNS, meaning that they're configured for mail. What's more, data presented on our statistics website shows that, over the last twelve months, our DNS servers handled between 75 and 100 million MX record requests a day. That serves as a crude pointer to the number of e-mails exchanged via .nl domain names (including spam and other junk messages). E-mail has traditionally been an unsecured system, although recently the technical community has been working hard to develop and roll out e-mail security standards, including SPF, DKIM and DMARC. (To check the status of your e-mail address, visit internet.nl.) However, such protocols are designed to make sending and receiving mail more secure; they can't prevent data breaches linked to cancelled domain names.

Why are cancelled domain names a security risk?

A data breach can occur if a registrant cancels a domain name, and the name is then re-registered by someone else. In that situation, mail intended for the old registrant may go to the new registrant, whose motives for registering the domain name may be dishonest. That isn't just a theoretical risk. It's actually happened to the Dutch police and to certain care providers, for example.

Consider the following illustration. Alice is the registrant of the domain name example.nl. She doesn't need it anymore, so she cancels the registration. Example.nl is then quarantined for forty days (just in case the cancellation was a mistake, or Alice has second thoughts). During that time, only Alice can get the name reactivated (retrieved from quarantine). Forty days pass without Alice reactivating example.nl, so the name is made available for re-registration by anyone who's interested. Corinne takes advantage of that opportunity. She re-registers example.nl, and sets up a mail server to receive any mail addressed to the domain.

Bob doesn't know (or forgets) that Alice's e-mail address has changed. So he mails alice@example.nl, and Corinne's server accepts his message. If Bob hasn't encrypted his message to Alice, Corinne can read all its contents, which might include personal or commercially sensitive information. Meanwhile, Alice probably has no idea that anything is amiss, because she has no way of knowing that Bob is sending mail to alice@example.nl. If Bob mails Alice while example.nl is in quarantine, he'll get a 'bounce' message saying that the address he's used doesn't exist. But once Corinne has re-registered the name and set up the mail server, Bob's mail will be delivered, and he'll think that all is well.
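
The timeline in the Alice, Bob and Corinne scenario can be sketched in a few lines of Python. This is an illustration only, not part of LEMMINGS: the function names and state labels are hypothetical, and the sketch assumes that mail also bounces while the name is available but not yet re-registered (the text only states this for the quarantine period).

```python
from datetime import date, timedelta

QUARANTINE_DAYS = 40  # quarantine length given in the text


def domain_status(cancelled_on, today, reregistered_on=None):
    """Return a simplified state for a cancelled domain name on a given day."""
    if today < cancelled_on:
        return "active"
    if reregistered_on is not None and today >= reregistered_on:
        return "re-registered"
    if today < cancelled_on + timedelta(days=QUARANTINE_DAYS):
        return "quarantine"
    return "available"


def mail_outcome(status):
    """What happens to Bob's mail to alice@example.nl in each state."""
    if status == "active":
        return "delivered to Alice"
    if status == "re-registered":
        # Assumes the new registrant (Corinne) has set up a mail server.
        return "delivered to the new registrant"
    # Assumption: no mail server can be found, so the sender gets a bounce.
    return "bounced"


cancelled = date(2020, 1, 1)
print(mail_outcome(domain_status(cancelled, date(2020, 1, 15))))
print(mail_outcome(domain_status(cancelled, date(2020, 3, 1),
                                 reregistered_on=date(2020, 2, 20))))
```

The second call is the risky case: Bob sees a successful delivery, so nothing prompts him to check Alice's address.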

How can SIDN help?

As the organisation with overall control of all 6.2 million .nl domain names, we are able to see when mail is probably still being sent to e-mail addresses linked to cancelled domain names. If, for example, Bob mails alice@example.nl, his mail server uses the Domain Name System (DNS) to look up the mail server for example.nl, even if Alice has cancelled it. DNS lookups for .nl domain names ultimately result in queries going to our DNS servers. So we get a partial picture of the mail traffic flows to .nl domains (as do DNS resolver operators, such as Google Public DNS and ISPs). At SIDN Labs, we've therefore prototyped a system that automatically analyses DNS traffic for cancelled domain names. If the DNS traffic and other information together suggest that legitimate mail is still being addressed to a cancelled domain (e.g. by Bob), the system notifies the former registrant (Alice). The system we've developed is called LEMMINGS, an acronym creatively derived from 'deLetEd doMain MaIl warNinG System'.
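
To make the lookup step concrete, the sketch below builds the kind of DNS question a sending mail server asks for example.nl: an MX query in the standard DNS wire format (RFC 1035). It only constructs the packet and sends nothing. The function name and the query ID are illustrative choices, and the sketch says nothing about how LEMMINGS itself processes traffic.

```python
import struct

MX_TYPE = 15    # DNS record type MX (mail exchanger), RFC 1035
IN_CLASS = 1    # DNS class IN (internet)
FLAG_RD = 0x0100  # "recursion desired", as set by a client asking a resolver


def build_mx_query(domain, query_id=0x1234):
    """Build the wire-format DNS query 'what is the MX record for <domain>?'."""
    # Header: ID, flags, 1 question, 0 answer/authority/additional records.
    header = struct.pack("!HHHHHH", query_id, FLAG_RD, 1, 0, 0, 0)
    # Question name: each label is prefixed by its length, ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in domain.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", MX_TYPE, IN_CLASS)


query = build_mx_query("example.nl")
print(query.hex())
```

Note that such a query carries only the domain name, never the mailbox (alice) or anything about the message, which is why this kind of analysis gives only a partial picture of mail flows.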

How innovative is LEMMINGS?

As far as we're aware, LEMMINGS is the first system in the world that notifies former registrants about mail possibly being sent to cancelled domain names. It complements the staged plan formulated by the Expertise Centre for Cybersecurity in the Care Sector (ZCERT) with a view to minimising the risk of data breaches linked to domain name cancellations. While ZCERT's plan provides general advice, LEMMINGS alerts ex-registrants on a smart, proactive basis.

What datasets does LEMMINGS use?

LEMMINGS uses various datasets that we've created at SIDN Labs, including ENTRADA (60+ terabytes of DNS traffic) and DMAP (a hundred million items of historical security and usage data on .nl domain names). LEMMINGS also draws on external data sources, such as Spamhaus, a widely used anti-spam blacklist. A daily analysis is made of the previous day's DNS traffic for all cancelled domain names. For each domain name, LEMMINGS calculates a number of statistics, such as the daily number of mail-related DNS queries (MX queries, for the experts).
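
As a rough illustration of the daily statistic mentioned above, the sketch below counts MX queries per cancelled domain name for one day of traffic. The record layout (dictionaries with `qname` and `qtype` keys) is a hypothetical stand-in; it is not the ENTRADA schema, and the real analysis runs on a far larger data platform.

```python
from collections import Counter


def daily_mx_counts(queries, cancelled_domains):
    """Count MX queries per cancelled domain in one day's DNS query records."""
    counts = Counter()
    for query in queries:
        # Normalise the name: DNS names are case-insensitive and may end in a dot.
        name = query["qname"].rstrip(".").lower()
        if query["qtype"] == "MX" and name in cancelled_domains:
            counts[name] += 1
    return counts


# Hypothetical sample of yesterday's traffic.
sample = [
    {"qname": "Example.nl.", "qtype": "MX"},
    {"qname": "example.nl", "qtype": "MX"},
    {"qname": "example.nl", "qtype": "A"},
    {"qname": "other.nl", "qtype": "MX"},
]
print(daily_mx_counts(sample, {"example.nl"}))
```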

What are the criteria for sending a notification about a cancelled domain name?

LEMMINGS operates on the principle that we contact the registrant of a cancelled domain name only if we have good reason to believe that legitimate mail is still being sent to the domain. We define 'legitimate mail' as mail deliberately sent to an individual recipient by a person or entity, as opposed to automated campaign mail. Under that definition, spam, marketing mail, mail from social media platforms (Facebook, LinkedIn, Twitter etc) and other forms of bulk mail don't count as legitimate mail.

Filters

We use various filters that we've developed ourselves to automatically exclude DNS traffic linked to spamming and marketing campaigns from our analysis. Any DNS queries left after filtering are likely to be legitimate. Some of our filters consist of static lists of suspect IP addresses or autonomous systems. However, most of the filters used by LEMMINGS are regenerated on a daily basis to ensure that they are up to date. That's done using information from external sources such as Spamhaus. We also filter DNS traffic on the basis of sender, excluding queries from senders who frequently enquire about the name servers for domains that aren't configured for mail. For details of our filters, see our LEMMINGS information page.
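
The filtering idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual LEMMINGS filters: the record fields (`src_ip`, `asn`, `qname`), the function name and the 50 per cent threshold for a 'noisy' sender are all hypothetical.

```python
from collections import defaultdict


def filter_queries(queries, blocked_ips, blocked_asns, mail_domains,
                   max_non_mail_share=0.5):
    """Drop queries from listed sources and from senders that mostly query
    domains that were never configured for mail (assumed threshold)."""
    total = defaultdict(int)
    non_mail = defaultdict(int)
    for q in queries:
        total[q["src_ip"]] += 1
        if q["qname"] not in mail_domains:
            non_mail[q["src_ip"]] += 1
    # Senders whose share of queries for non-mail domains exceeds the threshold.
    noisy = {ip for ip in total if non_mail[ip] / total[ip] > max_non_mail_share}
    return [
        q for q in queries
        if q["src_ip"] not in blocked_ips
        and q["asn"] not in blocked_asns
        and q["src_ip"] not in noisy
    ]


# Hypothetical traffic, using documentation IP ranges and private-use AS numbers.
traffic = [
    {"src_ip": "192.0.2.1", "asn": 64500, "qname": "example.nl"},
    {"src_ip": "192.0.2.2", "asn": 64501, "qname": "example.nl"},
    {"src_ip": "192.0.2.3", "asn": 64500, "qname": "nomail1.nl"},
    {"src_ip": "192.0.2.3", "asn": 64500, "qname": "nomail2.nl"},
    {"src_ip": "192.0.2.3", "asn": 64500, "qname": "example.nl"},
    {"src_ip": "198.51.100.9", "asn": 64500, "qname": "example.nl"},
]
kept = filter_queries(traffic, blocked_ips={"198.51.100.9"},
                      blocked_asns={64501}, mail_domains={"example.nl"})
print(kept)
```

In this sketch the block lists are plain sets passed in by the caller, so they can be regenerated daily from an external source without changing the filtering code.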

High-risk category

LEMMINGS is also designed to estimate the likelihood of a cancelled domain name previously having been used by an organisation for which privacy is especially important, such as a health care provider or a law firm. We developed a relatively simple rule-based algorithm to automatically identify such domain names and categorise them as 'high risk'. For example, domain names that contain keywords from a static list that we maintain -- including the Dutch for 'family doctor' or 'law firm' -- are categorised as high risk. So a domain such as example-solicitors.nl would get labelled 'high risk', for instance. We also consider whether (the last time it was scanned by our crawler before cancellation) a domain name was used for a website with content suggesting an organisation whose activities may well involve sensitive mail, such as a care service provider or government body.
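
A rule-based check of this kind might look like the sketch below. The keyword list and category labels are illustrative examples chosen for this sketch ('huisarts' is Dutch for family doctor, 'advocatenkantoor' for law firm); they are not the list that LEMMINGS maintains, and the `site_category` parameter is a hypothetical stand-in for what the crawler recorded.

```python
# Illustrative keywords only; the real static list is not published here.
HIGH_RISK_KEYWORDS = ("huisarts", "advocatenkantoor", "solicitors")

# Hypothetical labels for the kind of website last seen by the crawler.
HIGH_RISK_CATEGORIES = {"care provider", "government"}


def is_high_risk(domain, site_category=None):
    """Rule 1: keyword in the name. Rule 2: sensitive website category."""
    name = domain.lower().rstrip(".")
    if any(keyword in name for keyword in HIGH_RISK_KEYWORDS):
        return True
    return site_category in HIGH_RISK_CATEGORIES


print(is_high_risk("example-solicitors.nl"))
print(is_high_risk("example.nl", site_category="care provider"))
print(is_high_risk("example.nl"))
```

Simple substring matching is easy to audit but can over-match; a rule set like this trades precision for transparency.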

Students from the University of Amsterdam have also conducted research for us into the risks surrounding mail and non-existent domain names. The results of this research can be found here: https://rp.os3.nl/2019-2020/p19/report.pdf.

When does LEMMINGS send notifications?

On day thirty of the forty-day quarantine period, LEMMINGS assesses whether a notification is in order. That's done using statistical data covering the previous ten days (day twenty to day thirty of the quarantine period). The former registrant then has ten days to get the domain name reinstated, or take other steps to address the risk. LEMMINGS doesn't currently consider the statistics from the first twenty days of the quarantine period, because the DNS traffic pattern is likely to be more erratic then, as mail servers continue attempting to deliver messages from automated systems, for example.
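
The timing rule can be sketched as a small decision function. Two things here are assumptions rather than facts from the text: the window is read as quarantine days 20 to 29 (the ten full days before the day-thirty assessment), and the minimum query count is an invented threshold, since no actual decision criteria are given.

```python
ASSESSMENT_DAY = 30
# Assumption: "the previous ten days" is read as quarantine days 20..29.
WINDOW_DAYS = range(20, 30)


def should_notify(daily_counts, min_queries=10):
    """Decide on day thirty whether to notify the former registrant.

    daily_counts maps a quarantine day number to the number of filtered
    MX queries seen that day. min_queries is a hypothetical threshold.
    """
    total = sum(daily_counts.get(day, 0) for day in WINDOW_DAYS)
    return total >= min_queries


steady = {day: 2 for day in range(1, 31)}   # steady trickle of queries
early_burst = {5: 100, 25: 3}               # mostly retries early in quarantine
print(should_notify(steady))
print(should_notify(early_burst))
```

The second example shows why the first twenty days are ignored: a burst of early retries does not count towards the decision.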

Privacy and ethical considerations

We attach great importance to the privacy of the registrants and users of .nl domain names. Privacy protection is integral to all our projects. We have therefore drawn up a privacy policy for LEMMINGS, as we do for every research project that involves personal data processing. The LEMMINGS privacy policy defines, for example, how we process the data we need to decide whether a cancelled domain name should be classed as high-risk. At the end of the monitoring period, we anonymise the data collected for this purpose by deleting the privacy-sensitive items (e.g. registrants' names and e-mail addresses). The remaining statistical data is kept for up to three years to help us improve our services and for use in scientific and other research. It should be emphasised that our DNS traffic analysis doesn't reveal anything about the mail itself. So, for example, we don't know the sender's name, the recipient's name, the subject, or anything about the content of the mail.

Pilot with Argeweb

On 1 June, we started a pilot in collaboration with the .nl registrar Argeweb to see how our LEMMINGS prototype works in practice. For a period of a month, we're testing the system on all the domain names whose registrations are managed by Argeweb. From the pilot, we hope to learn: (1) how well the system works, and (2) what registrants think of it. Point 2 is important, because we want to gauge the extent to which former registrants understand the information they get and aren't unnecessarily alarmed. We have therefore devoted a lot of time to composing the notifications, knowing that many recipients may find the subject matter complex. The results of the pilot will be used to refine the LEMMINGS prototype and the associated notifications, and as the basis for a publication describing the lessons learned.

The future of LEMMINGS

Unless the pilot reveals major flaws, we expect to roll out LEMMINGS for all cancelled domain names in the .nl zone from July. After that, we'll look at adding more features to LEMMINGS, such as support for a notification opt-out and making measurement and analysis data available to ex-registrants. Once LEMMINGS has been in operation for at least six months, we'll analyse our data to determine what effect the system has had. That will include looking at parameters such as the number of quarantined domain names reactivated after a LEMMINGS notification is issued, and comparing post-LEMMINGS patterns with the pre-LEMMINGS situation.