We’re proud to announce that we are releasing our ENTRADA platform as an open source project. ENTRADA enables researchers and network engineers to analyse large amounts of network traffic, for instance to spot anomalies and threats. We originally developed ENTRADA to analyse DNS traffic and to further increase the security and stability of SIDN’s services for .nl (hence the acronym: “ENhanced Top-level Domain Resilience through Advanced Data Analysis”).
What is ENTRADA?
In a nutshell, ENTRADA is a “big data” platform designed to ingest and quickly analyse large amounts of network data. More technically, it is in fact a high-performance data-streaming warehouse (DSW).
ENTRADA's ability to deliver such performance is due to two main features:
It employs an optimised columnar file format (Apache Parquet)
It employs a high-performance SQL query engine (Apache Impala)
Please refer to our research paper for a performance evaluation and more details.
Who can use it?
Domain registries that are interested in developing DNS big data applications
Internet measurement researchers who are in need of a high-performance analytics platform
Main features
Performance: analyse the Parquet equivalent of 50 TB of pcap data in under 3.5 minutes, with a small 6-node cluster (4 data processing nodes).
Interface: benefit from easy SQL statements to analyse your data
Scalable: just add more nodes for faster processing
Built-in conversion of DNS/IP/TCP/UDP/ICMP network data to Parquet data
Open-source
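To give a feel for the kind of SQL statement the interface feature refers to, here is a minimal sketch that counts the most frequently queried domain names. It is not taken from ENTRADA itself: the table name dns_queries, its columns (qname, src, rcode) and the sample rows are hypothetical, and Python's built-in sqlite3 module stands in for Impala so that the example runs without a cluster.

```python
import sqlite3

# Hypothetical sample data: (queried name, source address, response code).
# The addresses come from documentation-only ranges.
SAMPLE_ROWS = [
    ("example.nl", "192.0.2.1", 0),
    ("example.nl", "192.0.2.2", 0),
    ("sidn.nl", "192.0.2.1", 0),
    ("missing.nl", "198.51.100.7", 3),
    ("example.nl", "198.51.100.7", 0),
    ("sidn.nl", "192.0.2.3", 0),
]


def top_queried_domains(rows, limit=3):
    """Return the `limit` most queried names as (qname, count) tuples."""
    # An in-memory SQLite database is used here as a stand-in for Impala.
    conn = sqlite3.connect(":memory:")
    try:
        # Assumed schema, for illustration only.
        conn.execute(
            "CREATE TABLE dns_queries (qname TEXT, src TEXT, rcode INTEGER)"
        )
        conn.executemany("INSERT INTO dns_queries VALUES (?, ?, ?)", rows)
        # A plain aggregate query; ties are broken alphabetically by name.
        cursor = conn.execute(
            "SELECT qname, COUNT(*) AS n "
            "FROM dns_queries "
            "GROUP BY qname "
            "ORDER BY n DESC, qname ASC "
            "LIMIT ?",
            (limit,),
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for qname, count in top_queried_domains(SAMPLE_ROWS):
        print(qname, count)
```

On a real deployment the same kind of GROUP BY query would be sent to the SQL engine rather than to SQLite, and the table and column names would follow whatever schema the installation actually uses.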
.nl ENTRADA deployment
We have been using ENTRADA at SIDN Labs for the past year. It currently runs on a relatively small 6-node cluster (with 4 data-nodes), which contains about 2 years of DNS traffic from the .nl name servers. This equates to over 100 billion queries and responses. Every day, the database grows by a further 400 million queries.
We use ENTRADA as an enabling platform to develop applications that aim at increasing the stability and security of the .nl internet zone and the global internet infrastructure. In ENTRADA, we store .nl zone DNS traffic and analyse it to detect anomalies and threats. It has many potential applications: for example, we are working on including algorithms for phishing detection and the identification of botnet traffic (https://irtf.org/raim-2015-slides/ic/moura.pdf).
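The growth figure quoted for the .nl deployment (400 million queries a day) can be turned into a rough capacity estimate. The sketch below is plain arithmetic on that number, assuming the daily rate stays constant, which real traffic will not do exactly; the function names are ours and are not part of ENTRADA.

```python
# Figure taken from the text above; treated as a constant daily rate,
# which is a simplifying assumption.
DAILY_QUERIES = 400_000_000


def queries_per_year(daily_rate=DAILY_QUERIES, days=365):
    """Queries added in one year at a constant daily rate."""
    return daily_rate * days


def days_to_add(target_queries, daily_rate=DAILY_QUERIES):
    """Whole days needed to add `target_queries` at a constant daily rate."""
    # Ceiling division using integers only, to avoid float rounding.
    return -(-target_queries // daily_rate)


if __name__ == "__main__":
    print(queries_per_year())            # 146000000000
    print(days_to_add(100_000_000_000))  # 250
```

At that rate, the database would gain roughly 146 billion queries a year, and another 100 billion queries would take about 250 days to accumulate.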
Open source
We decided to make ENTRADA open source because we believe it might also benefit the larger network analysis community. By releasing ENTRADA as an open source project, we hope to jump-start a community of ENTRADA developers and users. New applications based on ENTRADA can aid in further enhancing the stability and security of the internet as a whole.
If you have any questions about deploying or running ENTRADA at your organisation, send an e-mail to entrada {at} sidnlabs.nl.
For more information, see the ENTRADA project site at http://entrada.sidnlabs.nl/