Logo-based detection of malicious .nl websites in practice

Findings of two large-scale pilots run with the Dutch government and a webshop trust mark operator


The original blog is in Dutch. This is the English translation.

At SIDN Labs, we've developed a system for the logo-based detection of malicious domain names, called LogoMotive. We've since undertaken a large-scale evaluation of LogoMotive with the help of two authoritative bodies: the Dutch national government and the operator of a webshop trust mark scheme called Thuiswinkel Waarborg. The findings confirmed our belief that logo detection can be a valuable addition to the tools we use for continuously reinforcing the security of the .nl zone. In this blog, we summarise the two pilots and their three main findings. We have also written a research paper describing the work in more detail.

What was LogoMotive again?


As explained in a previous blog, LogoMotive is a system that facilitates the detection of malicious .nl domain names by identifying logos on websites. The system automatically visits .nl websites and takes screenshots of their home pages. Next, the screenshots are scanned for logos and the results are uploaded to a web application. Analysts can then go through the detections and decide whether the logos are being used legitimately or not. For the details of the implementation, please read our earlier blog and the research paper.
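For readers who prefer to see the flow as code, the steps above can be sketched roughly as follows. This is an illustrative outline only, not LogoMotive's actual implementation: the names `Detection`, `scan_domains`, `take_screenshot` and `find_logos` are hypothetical, and the screenshot and logo-detection steps are passed in as plain functions so that the sketch does not depend on any particular browser or computer-vision library.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # x, y, width, height of a framed logo


@dataclass
class Detection:
    """One logo hit waiting for an analyst's decision (hypothetical type)."""
    domain: str
    logo: str
    boxes: List[Box]
    verdict: str = "unreviewed"  # an analyst later marks it legitimate or not


def scan_domains(
    domains: Iterable[str],
    take_screenshot: Callable[[str], Optional[bytes]],
    find_logos: Callable[[bytes], Dict[str, List[Box]]],
) -> List[Detection]:
    """Visit each domain, screenshot it, look for logos, queue hits for review."""
    review_queue: List[Detection] = []
    for domain in domains:
        screenshot = take_screenshot(domain)
        if screenshot is None:  # site unreachable or no home page
            continue
        for logo, boxes in find_logos(screenshot).items():
            if boxes:  # only sites that actually show the logo are queued
                review_queue.append(Detection(domain, logo, boxes))
    return review_queue


# Stand-ins for the real crawler and detector, purely for demonstration.
def fake_screenshot(domain: str) -> Optional[bytes]:
    return None if domain == "unreachable.example" else domain.encode()


def fake_find_logos(screenshot: bytes) -> Dict[str, List[Box]]:
    if b"shop" in screenshot:
        return {"trust-mark": [(10, 20, 64, 64)]}
    return {"trust-mark": []}


queue = scan_domains(
    ["shop.example", "blog.example", "unreachable.example"],
    fake_screenshot,
    fake_find_logos,
)
for detection in queue:
    print(detection.domain, detection.logo, detection.verdict)
```

The point of the sketch is the division of labour: the automated part only produces candidates, and the decision about whether a logo is being used legitimately stays with a human analyst.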

Pilots with the government and a trust mark scheme operator

Having created a LogoMotive prototype and gained the initial impression that the approach was feasible, we wanted to find out how well LogoMotive would work in practice. We therefore organised two pilots in which LogoMotive was deployed across the whole .nl zone. One was run with the help of the national government's Public Information and Communications Service (DPC), the other with the operator of a webshop trust mark called Thuiswinkel Waarborg. In the pilots, we investigated possible abuses of the national government logo (Figure 1a) and the Thuiswinkel trust mark (Figure 1b), both of which are well known to the Dutch public. Figure 1 illustrates how the logos are detected by LogoMotive.

Screenshot of a government website in which LogoMotive marks found logos by framing them in green

Figure 1a LogoMotive identifies the national government logo on a website.

Screenshot of a website in which LogoMotive marks found logos from Thuiswinkel.org by framing them in green

Figure 1b LogoMotive identifies the Thuiswinkel trust mark on a website.

In the two pilots, we used LogoMotive to perform repeated scans of all 6.2 million .nl domain names. The government logo pilot also involved monitoring all new registrations over a two-month period (August to October 2021). We detected roughly 11,700 domains making use of the government logo, and about 10,600 .nl sites displaying the Thuiswinkel trust mark. Analysts at the DPC and Thuiswinkel Waarborg manually reviewed the detections, which together numbered more than twenty thousand.

Finding 1: it is possible to detect malicious websites by scanning for logos

We developed LogoMotive because we thought that logo detection could be a useful tool in our efforts to weed out malicious websites. The pilots back up that belief. Our first LogoMotive scan of the .nl zone found three phishing websites displaying the national government's logo. That may not sound like a lot, but the number needs to be put in context. We were continuously monitoring new registrations for two months. In that short time, LogoMotive detected fifty-three domains using the government logo. Three of those domains (more than 5 per cent) proved to host phishing websites. Across the .nl zone as a whole, LogoMotive also detected 208 domain names linked to webshops that displayed the Thuiswinkel trust mark even though they were not affiliated with the scheme and were therefore not entitled to do so.

That might suggest that the Thuiswinkel trust mark is more widely abused than the national government logo. However, the overall number of malicious websites in the zone is relatively small, and the rate of government logo abuse was significantly higher among newly registered domain names than across the zone as a whole. We therefore suspect that the lower prevalence of government logo abuse in the full-zone scan reflects high rates of detection by other means, followed by takedowns. It's also important to recognise the difference between phishing and other forms of brand abuse: we detected a further eighty sites where the government logo was being used for undesirable purposes other than phishing. Those included fraudulent endorsement and satire.
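The proportions behind this comparison are easy to check. The short calculation below uses only figures quoted in this blog. It assumes that the fifty-three domains were all found during the two-month monitoring of new registrations, and that the 208 unaffiliated webshops are a subset of the roughly 10,600 trust mark detections. The variable names are ours.

```python
# Government logo: detections during two months of monitoring new registrations
gov_logo_new_domains = 53
gov_logo_new_phishing = 3
share_new_phishing = gov_logo_new_phishing / gov_logo_new_domains
print(f"Phishing among new government-logo domains: {share_new_phishing:.1%}")

# Thuiswinkel trust mark: whole-zone scan ("about 10,600" is a rounded figure)
trust_mark_sites = 10_600
trust_mark_unaffiliated = 208
share_unaffiliated = trust_mark_unaffiliated / trust_mark_sites
print(f"Unaffiliated among trust-mark sites: {share_unaffiliated:.1%}")
```

Under those assumptions, the shares come out at about 5.7 per cent and about 2.0 per cent respectively. In absolute terms trust mark abuse is far more common, but relative to the number of detections the newly registered government-logo domains were the riskier group.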

Based on the two pilots, our conclusion is that logo detection is an effective tool for the detection of new malicious websites before people fall victim. Half of the phishing sites we discovered had not appeared on any popular block list at the time of detection. We also see logo detection as useful for organisations that currently do relatively little abuse monitoring.

Finding 2: logo detection highlights the latent risk posed by redirects

As well as the phishing sites, LogoMotive detected eighty-two suspect domains that displayed the government logo but used HTTP redirects to take visitors to authentic government websites. The registrants had no connection to the national government, and the concern was that they could easily have changed their redirects to point to scam sites or sent mail purporting to come from the government.

Those fears were reinforced by the domain names in question being 'typosquats': names very much like the names of real government domains. A good example was a domain name incorporating the Dutch words for 'vaccination' and 'appointment', which redirected to coronatest.nl, the official website for arranging coronavirus tests and vaccinations. We observed a reasonable volume of DNS traffic linked to the domain name, suggesting a steady flow of visitors to the site.
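One common way to flag names that closely resemble a real one is to compute the edit distance between them. The sketch below is a generic illustration of that idea, not a description of how the domains in our pilot were identified. The function names, the threshold of two edits and the label list are assumptions. Edit distance only catches near-misspellings: a name that combines ordinary words, like the vaccination example above, would need a keyword-based check instead.

```python
from typing import Iterable, List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character edits to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + (char_a != char_b)  # substitute if different
            ))
        previous = current
    return previous[-1]


def lookalikes(domain: str, official_labels: Iterable[str],
               max_distance: int = 2) -> List[str]:
    """Return official labels that the domain's first label nearly matches.

    Assumes `domain` is a registered name such as "name.nl" (no "www." prefix).
    An exact match (distance 0) is the real name itself, so it is not flagged.
    """
    label = domain.lower().split(".")[0]
    return [official for official in official_labels
            if 0 < edit_distance(label, official) <= max_distance]


# Illustrative list of official labels; "coronatst.nl" is a made-up misspelling.
official = ["coronatest", "rijksoverheid"]
print(lookalikes("coronatst.nl", official))
print(lookalikes("coronatest.nl", official))
```

The threshold is a trade-off: a higher value catches more variants but also flags more unrelated names, which is one reason why a human review step remains useful.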

Two of the redirects we came across led to specialist government websites, including one with a login portal for civil servants. That could have been set up in preparation for a spear-phishing attack targeting specific civil servants. The domain names we discovered could also have been used for sending fraudulent mail. One, for example, had MX records pointing to a dubious mail server. Despite not (yet) having been associated with abuse or featuring on block lists, redirecting domains represent a latent risk. What's more, domains used for spear-phishing don't always get listed, because relatively few people visit them. The DPC was keen to be made aware of such domain names, because they could be used for malicious activities with serious consequences. Logo detection therefore has great potential as a tool for the proactive detection of domain names that pose a latent risk.
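The redirect pattern described in this section can be expressed as a simple rule: the visited domain is not an official one, but the page the visitor ends up on is. The sketch below applies that rule to a pair of URLs. It is a simplified illustration under our own assumptions: the allowlist `OFFICIAL_DOMAINS`, the function names and the category labels are hypothetical, and following the redirect to obtain the final URL is left out.

```python
from urllib.parse import urlsplit

# Illustrative allowlist; a real one would come from the organisation's register.
OFFICIAL_DOMAINS = {"coronatest.nl", "rijksoverheid.nl"}


def host_of(url: str) -> str:
    """Lower-cased host name of a URL, without a trailing dot."""
    return (urlsplit(url).hostname or "").lower().rstrip(".")


def is_official(host: str) -> bool:
    """True if host is an official domain or a subdomain of one."""
    return any(host == domain or host.endswith("." + domain)
               for domain in OFFICIAL_DOMAINS)


def classify_redirect(start_url: str, final_url: str) -> str:
    """Classify where a visit ended up relative to where it started."""
    start, final = host_of(start_url), host_of(final_url)
    if start == final:
        return "no-redirect"
    if is_official(start):
        return "official-source"
    if is_official(final):
        # Third-party name forwarding to a genuine site: the latent-risk case.
        return "third-party-redirect-to-official"
    return "other-redirect"


# "afspraak-voorbeeld.example" is a made-up name used only for illustration.
print(classify_redirect("http://afspraak-voorbeeld.example/",
                        "https://www.coronatest.nl/"))
```

Matching on the full host name or a dot-prefixed suffix matters here: a name that merely ends in the same letters as an official domain should not be treated as official.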

Finding 3: logo detection helps portfolio administrators maintain a comprehensive overview

During the pilot, we found 318 legitimate government domain names that weren't known to the DPC's domain name portfolio administrators. As well as maintaining an internal register of government domain names, the DPC publishes a list of publicly accessible government websites. The register and the public list are used, for example, to monitor whether all government domains conform to security standards, such as DMARC and DNSSEC. The importance of inclusion in the register and list is apparent from the findings of our survey of DNSSEC and DMARC support by government domains. Of the domain names in the DPC register, 98 per cent support DNSSEC and 92 per cent DMARC. The figures for the domain names newly detected by LogoMotive were 74 and 41 per cent, respectively.
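To give an idea of what such a survey involves, the sketch below checks whether a TXT record looks like a DMARC policy and computes an adoption rate over a set of domains. It is a simplified illustration, not the tooling used in our survey. DMARC policies are published as TXT records at `_dmarc.<domain>` and begin with the tag `v=DMARC1`. The sketch assumes the records have already been looked up, and the function names and sample data are ours.

```python
from typing import Dict, Optional


def parse_dmarc(txt: str) -> Optional[Dict[str, str]]:
    """Parse a TXT record into DMARC tags, or return None if it is not DMARC.

    Simplified: only checks that the first tag is v=DMARC1; tag values are
    not validated any further.
    """
    tags: Dict[str, str] = {}
    for part in txt.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip()
    if tags and next(iter(tags)) == "v" and tags["v"] == "DMARC1":
        return tags
    return None


def adoption_rate(txt_by_domain: Dict[str, Optional[str]]) -> float:
    """Share of domains whose _dmarc TXT record parses as a DMARC policy."""
    if not txt_by_domain:
        return 0.0
    supported = sum(1 for txt in txt_by_domain.values()
                    if txt and parse_dmarc(txt))
    return supported / len(txt_by_domain)


# Made-up sample: domain -> TXT record found at _dmarc.<domain> (None if absent)
sample = {
    "a.example": "v=DMARC1; p=reject; rua=mailto:dmarc@a.example",
    "b.example": "v=spf1 -all",  # an SPF record, not DMARC
    "c.example": None,
    "d.example": "v=DMARC1; p=none",
}
print(f"DMARC adoption in sample: {adoption_rate(sample):.0%}")
```

Running the same calculation separately over registered and newly discovered domains is what makes a gap like the one above (92 versus 41 per cent for DMARC) visible.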

LogoMotive also revealed fifty-four domain names belonging to Thuiswinkel trust mark scheme members that the scheme's operators were unaware of. The scheme's operators consequently had no way of ensuring that the associated websites were compliant with the applicable guidelines and requirements. Logo detection therefore has significant potential as a domain name portfolio management tool. Having a comprehensive overview of the organisation's portfolio enables administrators to monitor compliance with requirements such as support for modern internet standards. It also reduces the risk of a domain name being cancelled without the administrators' knowledge, potentially leading to data breaches of the kind suffered by the Dutch police and others. (That problem is addressed by another SIDN Labs project, called LEMMINGS.)

The DPC has since added the government domain names discovered by LogoMotive to its register.
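Conceptually, finding the gaps in a portfolio comes down to a set difference: the domains confirmed as legitimate by analysts, minus the domains already in the register. A minimal sketch, with made-up names and a hypothetical helper `missing_from_register`:

```python
from typing import Iterable, List


def normalise(name: str) -> str:
    """Make names comparable: trim, lower-case, drop any trailing dot."""
    return name.strip().lower().rstrip(".")


def missing_from_register(confirmed_legitimate: Iterable[str],
                          register: Iterable[str]) -> List[str]:
    """Legitimate domains found by logo detection that the register lacks."""
    known = {normalise(name) for name in register}
    found = {normalise(name) for name in confirmed_legitimate}
    return sorted(found - known)


register = ["ministry.example", "agency.example"]
confirmed = ["Agency.example.", "project-site.example", "ministry.example"]
print(missing_from_register(confirmed, register))
```

Normalising the names first avoids reporting a domain as missing merely because it was written with different capitalisation or a trailing dot.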

Conclusion and plans

Our pilots with the national government and the Thuiswinkel trust mark scheme demonstrated that logo detection can indeed help to make .nl more secure. We discovered malicious websites, suspect domain names that could have been used for spear-phishing, and legitimate domain names operating outside the organisations' security and quality monitoring regimes.

We accordingly plan to integrate LogoMotive into SIDN BrandGuard, so that all the service's users can use logo detection to enhance their brand protection activities. We're also making the LogoMotive code available to researchers who want to undertake follow-up studies, and to other registries that want to enable logo detection in their DNS zones. Our research paper on LogoMotive will be presented at the Passive and Active Measurement Conference on 28 March. It is available to read on this website, along with an executive summary.

Acknowledgements

We wish to thank the analysts at the DPC and Thuiswinkel.org for the time and effort they invested in helping us run the LogoMotive pilots. Without their many hours of annotation work, we could not have evaluated the system or made the encouraging findings described in this blog.