New Age Network Detection: Collection and Analysis

As we return to our series on New Age Network Detection, let’s revisit our first post. We argued that we’re living through technology disruption on a scale, and at a velocity, we haven’t seen before. Unfortunately, security has failed to keep pace with attackers. The industry’s response has been to move the goalposts, focusing on shiny new tech widgets every couple of years. We summed it up in that first post:

We have to raise the bar. What we’ve been doing isn’t good enough and hasn’t been for years. We don’t need to throw out our security data. We need to make better use of it. We’ve got to provide visibility into all of the networks (even cloud-based and encrypted ones), minimize false positives, and work through the attackers’ attempts to obfuscate their activity. We need to proactively find the attackers, not wait for them to mess up and trigger an alert.

So that’s the goal – make better use of security data and proactively look for attackers. We even tipped our hat to the ATT&CK framework, which has given us a detailed map of common attacks. But now you have to do something, right? So let’s dig into what that work looks like, and we start first with the raw materials that drive security analytics – data.

Collection

In the olden days – you know, 2012 – life was simpler. If we wanted to capture network telemetry we’d aggregate NetFlow data from routers and switches, supplementing with full packet capture where necessary. All activity was on networks we controlled, so it wasn’t a problem to access that data. But alas, over the past decade several significant changes have shifted how that data can be collected:

Faster Networks: As much as it seems enterprise data centers and networks are relics of yesteryear, many organizations still run big, fast networks on-prem, so collection capabilities need to keep up. It’s not enough to capture traffic at 1 Gbit/sec when your data center network is running at 100 Gbit/sec. Make sure your hardware sensors have enough capacity and throughput to capture the data, and in many modern architectures they’ll need to analyze it in real time as well.
Sensor Placement: You don’t only need to worry about north/south traffic – adversaries aren’t necessarily out there. At some point they’ll compromise a local device, and then you’ll have an insider to deal with, which means you also need to pay attention to east/west (lateral) movement. You’ll need sensors not just at key choke points for external application traffic, but also on network segments which serve internal constituencies.
Public Cloud: Clearly traffic to and from internal applications is no longer entirely on networks you control. These applications now run in the public cloud, so collection needs to encompass cloud networks. You’ll need to rely on IaaS sensors, which may look like virtual devices running in your cloud networks, or you may be able to take advantage of leading cloud providers’ traffic mirroring facilities.
Web/SaaS Traffic & Remote Users: Adoption of SaaS applications has exploded, along with the population of remote employees, and people are now busily arguing over what an office will look like coming out of the pandemic. That means you might never see the traffic from a remote user to your SaaS application unless you backhaul all that traffic to a collection point you control, which is not the most efficient way to network. Collection in this context involves capturing telemetry from web security and SASE (Secure Access Service Edge) providers, who bring network security (including network detection) out to remote users. You’ll also want to rely on partnerships between your network detection vendor and application-specific telemetry sources, such as CASB and PaaS services.
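
To make the north/south versus east/west distinction concrete, here is a minimal sketch of how collected flow metadata might be tagged by direction. The `FlowRecord` fields, the `INTERNAL_NETS` ranges, and the function names are all assumptions for illustration, not part of any particular product; substitute your own address plan.

```python
import ipaddress
from dataclasses import dataclass

# Hypothetical internal ranges (RFC 1918); replace with your own address plan.
INTERNAL_NETS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


@dataclass(frozen=True)
class FlowRecord:
    """A simplified flow-metadata record (illustrative fields only)."""
    src_ip: str
    dst_ip: str
    dst_port: int
    bytes_sent: int


def is_internal(ip: str) -> bool:
    """Return True if the address falls inside one of the internal ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in INTERNAL_NETS)


def flow_direction(flow: FlowRecord) -> str:
    """Label a flow as lateral, crossing the perimeter, or fully external."""
    src_in = is_internal(flow.src_ip)
    dst_in = is_internal(flow.dst_ip)
    if src_in and dst_in:
        return "east-west"
    if src_in or dst_in:
        return "north-south"
    return "external"


if __name__ == "__main__":
    lateral = FlowRecord("10.1.2.3", "10.9.9.9", 445, 1_000)
    print(flow_direction(lateral))
```

A sensor that only sits at the perimeter would never produce records labeled `east-west`, which is one quick way to check whether your placement covers lateral movement.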

We should make some finer points on whether you need full packet capture or only metadata for sufficient granularity and context for detection. We don’t think it’s an either/or proposition. Metadata provides enough depth and detail in most cases, but not all. For instance, if you are looking to understand the payload of an egress session, you need the full packet stream. So make sure you have the option to capture full packets, knowing you will do that sparingly. Embracing more intelligence and automation in network detection enables working off captured metadata routinely, triggering full packet collection on detection of potentially malicious activity or exfiltration.

Be sure to factor in storage costs when determining the most effective collection approach. Metadata is pretty reasonable to store for long periods, but full packets are not. So you’ll want to keep a couple of days or weeks of full captures around when investigating an attack, but might save years of metadata.

Another area that warrants a bit more discussion is cloud network architecture. Using a transit network to centralize inter-account and external (both ingress and egress) traffic facilitates network telemetry collection. All traffic moving between environments in your cloud (and back to the data center) runs through the transit network. But for sensitive applications you’ll want to perform targeted collection within the cloud network to pinpoint any potential compromise or application misuse. Again, though, a secure architecture which leverages isolation makes it harder for attackers to access sensitive data in the public cloud.
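
Returning to the storage trade-off between full packets and metadata, a back-of-the-envelope calculation shows why retention periods differ so much. This is only a sketch: the average link utilization and especially the `metadata_ratio` (metadata size as a fraction of raw traffic) are assumed values for illustration, and real ratios vary widely by sensor and traffic mix.

```python
SECONDS_PER_DAY = 86_400


def daily_capture_bytes(avg_gbit_per_sec: float, ratio: float = 1.0) -> float:
    """Bytes stored per day for a link averaging `avg_gbit_per_sec`.

    `ratio` is the stored fraction of raw traffic: 1.0 for full packet
    capture, something much smaller for metadata (an assumption you must
    measure in your own environment).
    """
    bytes_per_sec = avg_gbit_per_sec * 1e9 / 8
    return bytes_per_sec * SECONDS_PER_DAY * ratio


def retention_tb(avg_gbit_per_sec: float, days: int, ratio: float = 1.0) -> float:
    """Terabytes (10^12 bytes) needed to retain `days` of capture."""
    return daily_capture_bytes(avg_gbit_per_sec, ratio) * days / 1e12


if __name__ == "__main__":
    avg_rate = 1.0          # assumed average utilization in Gbit/sec
    metadata_ratio = 0.01   # assumed: metadata is ~1% of raw traffic volume

    full_week = retention_tb(avg_rate, days=7)
    metadata_year = retention_tb(avg_rate, days=365, ratio=metadata_ratio)
    print(f"7 days of full packets: {full_week:.1f} TB")
    print(f"365 days of metadata:   {metadata_year:.1f} TB")
```

Under these assumed numbers, a week of full packets costs roughly twice as much storage as a full year of metadata, which is why keeping full captures short-lived and metadata long-lived tends to be the practical split.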

Dealing with Encryption

Another complication for broad and effective network telemetry collection is that a significant fraction of network traffic is encrypted. So you can’t access the payloads unless you crack the packets, which was much easier with early versions of SSL and TLS. You used to become a Man-in-the-Middle to users: terminating their encrypted sessions, inspecting their payloads, and then re-encrypting and sending the traffic on its way. Decryption and inspection were resource intensive but effective, especially using service chaining to leverage additional security controls (IPS, email security, DLP, etc.) depending on the result of packet inspection.

But that goose has been cooked since the latest version of TLS (1.3) made perfect forward secrecy mandatory, breaking passive and retrospective inspection. Each encrypted session now gets its own ephemeral keys, so an inspection device holding a copy of the server’s private key can no longer decrypt the traffic it captures.

Since we can no longer rely on seeing into encrypted packets, we need to get smarter – literally. At this point machine learning (or artificial intelligence, if you prefer that term) comes into play. By analyzing headers for insight into source and destination, frequency, and packet sizes, detection vendors can profile what legitimate encrypted network traffic looks like. Network detection functionality can then recognize those patterns and flag sessions which don’t fit them.

Another option for TLS traffic is to analyze the handshake between source and destination. The fingerprint of the handshake, known as JA3 (for client traffic) or JA3S (for server traffic), stays largely the same for a given application. When you detect a new fingerprint you can perform additional analysis of the destination to make sure it’s legitimate. Of course this approach requires some tuning – new connections are not necessarily malicious – but once the system is tuned this approach works well.

You’ll also need to focus on the access control aspect of network security to supplement your detection and analysis. For example, if you allow inbound SSH to instances or servers for management, you should only allow those connections from a small set of trusted IPs (your internal networks and approved remote networks). You can (and should) supplement the security of the inbound connections by requiring MFA as well. What does this have to do with network detection? Nothing really, but we cannot guarantee you’ll be able to see inside packets, so you’ll need some workarounds to ensure data is protected.
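
As a rough illustration of the handshake fingerprinting described above, the sketch below builds a JA3-style client fingerprint: the TLS version, cipher suites, extensions, elliptic curves, and point formats from the ClientHello are written as decimal values, dash-joined within each field, comma-joined across fields, and hashed with MD5. Parsing the ClientHello off the wire is out of scope here, so the inputs are assumed to be already extracted. The `FingerprintTracker` class and the sample values are hypothetical and only show the "have we seen this before?" step.

```python
import hashlib
from typing import Iterable, Set


def is_grease(value: int) -> bool:
    """GREASE placeholders (0x0a0a, 0x1a1a, ... 0xfafa) are excluded from JA3."""
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)


def _join(values: Iterable[int], drop_grease: bool = True) -> str:
    return "-".join(
        str(v) for v in values if not (drop_grease and is_grease(v))
    )


def ja3_string(version, ciphers, extensions, curves, point_formats) -> str:
    """Build the comma-separated JA3 input string from ClientHello fields."""
    return ",".join([
        str(version),
        _join(ciphers),
        _join(extensions),
        _join(curves),
        _join(point_formats, drop_grease=False),
    ])


def ja3_hash(version, ciphers, extensions, curves, point_formats) -> str:
    """MD5 of the JA3 string, as a 32-character hex digest."""
    raw = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(raw.encode("ascii")).hexdigest()


class FingerprintTracker:
    """Hypothetical helper: remembers fingerprints and reports new ones."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()

    def is_new(self, fingerprint: str) -> bool:
        if fingerprint in self._seen:
            return False
        self._seen.add(fingerprint)
        return True


if __name__ == "__main__":
    # Sample ClientHello values chosen for illustration only.
    fp = ja3_hash(771, [4865, 4866, 49195], [0, 10, 11, 43], [29, 23], [0])
    tracker = FingerprintTracker()
    if tracker.is_new(fp):
        print(f"new client fingerprint {fp}: queue destination for review")
```

A new fingerprint is only a prompt for further analysis, not a verdict, which is where the tuning mentioned above comes in.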

Analysis

Once you have the data collected and aggregated, it’s time for some math to figure out what’s going on. That’s easy for us to say, but there are so many methods to analyze data – how can you know what will work best? Let’s work through the different types of techniques used by network detection tools.

Rules and Reputation: First are signature-based controls, the old standard. You know, the type of analysis your old IDS used, back in the day. It’s now a bit more sophisticated thanks to tools like the ATT&CK framework, but you still look for patterns that have been identified as malicious – such as command and control or exfiltration. Patterns may be based on destination, frequency of communication, packet size, or a bunch of other attributes. The key is to know what you are looking for.
Machine Learning: The significant evolution from IDS is the ability to detect malicious activity you haven’t seen before, which requires advanced analytics to define a traffic baseline, and then the ability to analyze data from hundreds of other organizations to develop a model of what particular attacks look like on the network. Once you’ve established what your traffic should look like, your detection engine can look for variation.
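
The two approaches above can be contrasted in a few lines. This sketch pairs a reputation lookup (you know what you are looking for) with a baseline check (you only know what normal looks like). The baseline uses a plain z-score as a deliberately simple stand-in for the models detection vendors actually build; the destination list, the threshold, and the sample byte counts are all assumptions for illustration.

```python
import statistics
from typing import Sequence

# Hypothetical reputation list; 203.0.113.0/24 is a documentation range.
KNOWN_BAD_DESTINATIONS = {"203.0.113.50"}


def rule_match(dst_ip: str) -> bool:
    """Rules and reputation: flag destinations already known to be malicious."""
    return dst_ip in KNOWN_BAD_DESTINATIONS


def is_anomalous(value: float, history: Sequence[float],
                 z_threshold: float = 3.0) -> bool:
    """Baseline check: flag values far from the historical mean.

    `history` needs at least two observations. The threshold of three
    standard deviations is an assumed starting point that needs tuning.
    """
    mean = statistics.fmean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return value != mean
    return abs(value - mean) / stdev > z_threshold


if __name__ == "__main__":
    # Assumed hourly egress volume (MB) for one host over a quiet period.
    egress_history = [100, 110, 95, 105, 90, 100, 102, 98]

    print(rule_match("203.0.113.50"))          # known-bad destination
    print(is_anomalous(104, egress_history))   # within normal variation
    print(is_anomalous(5000, egress_history))  # far outside the baseline
```

The rule catches only what has already been catalogued, while the baseline check can surface a transfer nobody has written a signature for, at the cost of needing tuning to keep false positives down.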

Another consideration is where analysis takes place: either on-premises or in the cloud. From an enterprise standpoint, so long as detection is accurate and timely, you shouldn’t much care, although mining a crap-ton of network telemetry on-premises is likely to require a big (expensive) set of devices. The alternative is to send your security rules and sensitive data to the cloud, and you probably pay to move data there, so you’ll need to weigh your options carefully. It all comes down to coverage of your needs for collection, cost-effectiveness, and accuracy.

You’ll also be able to enrich collected data before analysis, adding important context to help identify attacks. For instance, you could tighten alert thresholds on more sensitive segments. Detection vendors can add detections to their analyses for common traffic patterns (see encrypted traffic, above). Then they leverage insights gleaned from their own research teams and analysis of their other customers’ data.

It’s a timely topic generating high interest, so we should mention that these analysis techniques can apply to a broader dataset than just network traffic. That’s what XDR (eXtended Detection and Response) is poised to do. There will be a lot of noise about XDR, and it is an exciting opportunity to build attack models based on robust multi-faceted data. But let’s not get ahead of ourselves. Detection needs to be use case driven. So we’ll wrap up this series with a focus on how New Age Network Detection can help with use cases like threat hunting, insider threat detection, command and control detection, and everyone’s favorite: ransomware.