Projects:2018s1-165 Dual IP Stack Exfiltration - Methods and Defences
Dr Matthew J. Sorell
Dr Olaf M. Maennel
The Dual-Stack IP Exfiltration Attack is an exploit that allows covert data exfiltration over IP. A client performing the attack splits the file to be exfiltrated into packets and manipulates their IP headers so that the packets are sent over the network alternating between the IPv4 and IPv6 protocols. Because IPv6 is by design incompatible with IPv4, data sent between a dual-stack client and a dual-stack server appears to be communication between two entirely different pairs of hosts. From the network’s perspective there is no correlation between the IPv4 traffic and the IPv6 traffic, so the attack can proceed covertly.
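To make the effect on an observer concrete, the following Python sketch models the alternation only: it labels consecutive chunks of a payload with alternating IP versions and shows what a single-protocol view of the traffic would contain. No packets are sent. The function names, the chunk size, and the choice to start with IPv4 are assumptions made for illustration, not details of the original attack.

```python
def alternate_chunks(data: bytes, chunk_size: int):
    """Label consecutive chunks of data with alternating IP versions.

    Returns a list of (sequence_number, family, chunk) tuples.
    Starting with IPv4 is an arbitrary assumption of this sketch.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    plan = []
    for seq, offset in enumerate(range(0, len(data), chunk_size)):
        family = "IPv4" if seq % 2 == 0 else "IPv6"
        plan.append((seq, family, data[offset:offset + chunk_size]))
    return plan


def single_protocol_view(plan, family: str) -> bytes:
    """Concatenate only the chunks an observer of one IP version would see."""
    return b"".join(chunk for _, fam, chunk in plan if fam == family)


def reconciled_view(plan) -> bytes:
    """Concatenate all chunks in sequence order, as a reconciled stream would."""
    return b"".join(chunk for _, _, chunk in sorted(plan, key=lambda p: p[0]))
```

For example, with `alternate_chunks(b"ABCDEFGH", 2)` the IPv4-only view is `b"ABEF"` and the IPv6-only view is `b"CDGH"`; only the reconciled view recovers the original payload, which is why linking the two address families to one host matters for detection.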
An industry dependence on IPv4, coupled with no enforced transition, has led to late adoption of IPv6 standards, leaving many networks exposed to IPv6-based attacks. However, even networks that are secured against IPv4 attacks and IPv6 attacks individually are often still vulnerable to a ‘dual-stack’ attack. A dual-stack attack uses IPv4 and IPv6 together, in this case to exfiltrate data undetected by most network protection tools and network forensic analysis tools (NFATs). This paper focuses primarily on the forensic detection of dual-stack data exfiltration, which leads into an investigation of real-time traffic and packet analysis techniques for detecting and defending against such exploits.
The objective of this paper is to investigate three key questions concerning the detection of the attack:
- How effective is the Lempel-Ziv-Welch algorithm at detecting dual-stack exfiltrations in captured network traffic?
- What is a valid method of detecting a personal device identifying mark (PDID) within the captured network traffic to resolve the IPv4 and IPv6 addresses of the host for the purpose of traffic reconciliation and dual-stack exfiltration detection?
- How effective is timing analysis in the detection of dual-stack exfiltrations in captured network traffic?
The first aim of the project is to follow up on the work of the authors of the previous honours project study. The first step is to recreate the original attack. To achieve this, a system of three virtual machines, acting as a client, a server, and an intermediate router and DNS service provider, will be set up to simulate a typical internet connection. A packet sniffer will be placed in the middle of this setup to monitor network activity. This is shown in Figure 1.
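As a rough illustration of what the packet sniffer in such a setup observes, the Python sketch below classifies a raw Ethernet frame as IPv4 or IPv6 by its EtherType and extracts the source MAC and the IP source and destination addresses. The function name `classify_frame` is hypothetical, and the sketch assumes untagged Ethernet II frames (no VLAN tags); it is not the capture tooling used in the project.

```python
import ipaddress
import struct

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_IPV6 = 0x86DD


def classify_frame(frame: bytes):
    """Return (family, src_mac, src_ip, dst_ip) for an IPv4/IPv6 frame, else None.

    Assumes an untagged Ethernet II frame: 6-byte destination MAC,
    6-byte source MAC, 2-byte EtherType, then the IP header.
    """
    if len(frame) < 14:
        return None
    (ethertype,) = struct.unpack("!H", frame[12:14])
    src_mac = frame[6:12].hex(":")
    if ethertype == ETHERTYPE_IPV4 and len(frame) >= 34:
        # IPv4 source and destination sit at bytes 12-19 of the IP header.
        src = ipaddress.IPv4Address(frame[26:30])
        dst = ipaddress.IPv4Address(frame[30:34])
        return ("IPv4", src_mac, str(src), str(dst))
    if ethertype == ETHERTYPE_IPV6 and len(frame) >= 54:
        # IPv6 source and destination sit at bytes 8-39 of the IP header.
        src = ipaddress.IPv6Address(frame[22:38])
        dst = ipaddress.IPv6Address(frame[38:54])
        return ("IPv6", src_mac, str(src), str(dst))
    return None
```

When the sniffer sits on the same link as the client, IPv4 and IPv6 frames sent from one interface carry the same source MAC address, so even this minimal parse yields one candidate mark for tying the two address families together.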
The second part of the follow-up exercise is to implement an LZW dictionary. The Lempel-Ziv-Welch algorithm compresses data, text data in particular, by building a look-up table (dictionary) of the sequences that occur in that data. The method here is to create a dictionary for each received packet and compare it against the dictionaries of incoming packets of the opposite protocol. The hypothesis is that two sets of packets that are exfiltrating the same file will show a high correlation between their respective LZW dictionaries. Here, the study seeks to confirm and expand upon the findings of the previous paper.
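The dictionary comparison described above can be sketched in Python as follows. The sketch builds the set of phrases that LZW would add to its dictionary while compressing a payload, then scores two payloads by the overlap of those sets. The source does not state which similarity measure is used, so the Jaccard index here is an assumption, as are the function names.

```python
def lzw_dictionary(data: bytes) -> set:
    """Return the multi-byte phrases LZW adds to its dictionary for data.

    The dictionary starts with all 256 single-byte values; only the
    phrases learned from the data itself are returned.
    """
    known = {bytes([i]) for i in range(256)}
    learned = set()
    phrase = b""
    for value in data:
        candidate = phrase + bytes([value])
        if candidate in known:
            phrase = candidate          # keep extending the current match
        else:
            known.add(candidate)        # new phrase: record it
            learned.add(candidate)
            phrase = bytes([value])     # restart from the current byte
    return learned


def dictionary_similarity(a: bytes, b: bytes) -> float:
    """Jaccard index of the learned LZW phrases of two payloads (0.0 to 1.0)."""
    da, db = lzw_dictionary(a), lzw_dictionary(b)
    if not da and not db:
        return 0.0
    return len(da & db) / len(da | db)
```

In use, each IPv4 payload would be scored against recent IPv6 payloads (and vice versa), with pairs above some threshold flagged for inspection. The threshold and the size of the comparison window would need to be chosen empirically.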
PDID and Timing Correlation
The third part of the investigation is the analysis of the network traffic captured during the earlier simulated attacks, looking for identifying marks contained within the packet headers, the network routing tables, Domain Name System (DNS) requests, Dynamic Host Configuration Protocol (DHCP) handshakes, Neighbor Discovery messages, or any other configuration data or messages captured during the network's operation.
The investigation will then use these identifying marks to build up a device profile, and eventually tie that to a unique Persistent Device Identification (PDID) which can identify the host regardless of which IP version is used for transmission. By linking the IPv4 and IPv6 traffic back to the same source host, a more complete data stream can be analysed for exfiltration attempts.
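One concrete example of such an identifying mark, offered as an illustration and not as the project's chosen method: when a host forms its IPv6 address by stateless autoconfiguration using the modified EUI-64 format (RFC 4291), the lower 64 bits of the address are derived from the interface MAC address. A MAC address learned on the IPv4 side, for example from a DHCP handshake, can then be matched against observed IPv6 addresses. The function names below are hypothetical, and the sketch will not match hosts that use randomised or privacy interface identifiers.

```python
import ipaddress

IID_MASK = (1 << 64) - 1  # lower 64 bits of an IPv6 address


def eui64_interface_id(mac: str) -> int:
    """Derive the modified EUI-64 interface identifier from a 48-bit MAC."""
    octets = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Flip the universal/local bit and insert ff:fe in the middle.
    eui = bytes([octets[0] ^ 0x02]) + octets[1:3] + b"\xff\xfe" + octets[3:]
    return int.from_bytes(eui, "big")


def link_addresses(ipv4_by_mac: dict, ipv6_addresses: list) -> dict:
    """Map each IPv4 address to the IPv6 addresses whose interface ID matches its MAC.

    ipv4_by_mac: {mac: ipv4} pairs, e.g. gathered from DHCP traffic (assumed input).
    ipv6_addresses: IPv6 source addresses seen in the capture (assumed input).
    """
    linked = {}
    for mac, ipv4 in ipv4_by_mac.items():
        iid = eui64_interface_id(mac)
        linked[ipv4] = [
            addr for addr in ipv6_addresses
            if int(ipaddress.IPv6Address(addr)) & IID_MASK == iid
        ]
    return linked
```

For instance, the MAC `00:11:22:33:44:55` yields the interface identifier `0211:22ff:fe33:4455`, so an IPv4 lease for that MAC would be linked to an observed address such as `2001:db8::211:22ff:fe33:4455`.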
Another avenue for investigation is the correlation of traffic timing and round-trip times to identify dual-stack hosts. By correlating the request and response times between various network streams and remote servers, we hope to demonstrate that forensic analysis of the network traffic can identify the IPv4 and IPv6 addresses of a single machine, and trace them back to the physical host, with sufficient certainty.
As with PDID identification, the goal is to identify both the IPv4 and IPv6 network streams and unify them into a single contiguous stream, which can then be forensically analysed for the hallmarks of dual-stack IP data exfiltration.
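A minimal sketch of one possible timing comparison is given below in Python. It bins the packet timestamps of an IPv4 flow and an IPv6 flow into fixed-width intervals and computes the Pearson correlation of the two activity series. The choice of Pearson correlation, the bin width, and the function names are all assumptions of this sketch; the source does not specify how timing is to be correlated.

```python
from math import ceil, sqrt


def activity_series(timestamps, start, end, bin_width):
    """Count packets per bin of bin_width seconds over the window [start, end)."""
    bins = max(1, ceil((end - start) / bin_width))
    counts = [0] * bins
    for t in timestamps:
        if start <= t < end:
            index = min(int((t - start) / bin_width), bins - 1)
            counts[index] += 1
    return counts


def pearson(x, y):
    """Pearson correlation of two equal-length series; 0.0 if either is constant."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("series must be non-empty and of equal length")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def timing_correlation(v4_times, v6_times, start, end, bin_width=1.0):
    """Correlate the per-bin activity of an IPv4 flow and an IPv6 flow."""
    return pearson(
        activity_series(v4_times, start, end, bin_width),
        activity_series(v6_times, start, end, bin_width),
    )
```

Flows whose bursts fall in the same intervals score close to 1, while unrelated flows score near zero or below. The bin width needs to be larger than the gap between alternating packets, otherwise strictly alternating traffic could appear anti-correlated.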