Wednesday, February 8, 2017

Linux - Hunting for Beaconing Using System Tap

I've challenged myself to get back into writing consistently after taking a break from it. Hopefully I can keep this blog going longer than one post... so, here goes.

Ever find yourself dealing with a Linux server that's beaconing to a known bad IP address or domain but shows no other symptoms of being compromised? A lot of attacker malware behaves this way, even in the Unix world. When the attacker is not active on the system, the backdoor checks in with the Command and Control (C2) server at a specified interval so the attacker still knows it's available. Once you confirm a system is talking to a known bad IP address, the goal is generally to get it off the network before the attacker becomes operational again. The downside of pulling the network plug so quickly is that it can become harder to pinpoint the process that is making the malicious network connections. Sometimes you just want a little more detail before deciding to bring an important system offline.

The difficult part about detecting beaconing is that each check-in is over incredibly quickly. A script junkie may consider throwing a tool like netstat in a while loop and running diff between each execution. Although this isn't a terrible idea, the C2 beacon can easily start and finish between two netstat executions. Enter System Tap. It's a scripting language and toolset that gives you access to low-level hooks in the operating system; it compiles your script into a kernel module and runs it. Kernel modules that might normally take you days to write can be written in minutes, which is perfect for incident response.
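To put some numbers on the polling problem, here is a small Python simulation. It is only a sketch: the 50 ms connection lifetime and the poll intervals are made-up values, and `polls_that_see` is a hypothetical helper, not part of netstat or System Tap. It lists the poll times at which a snapshot tool would actually catch a short-lived connection.

```python
def polls_that_see(start_ms, end_ms, poll_ms, duration_ms):
    """Return the poll times (in ms) at which a connection that is open
    during [start_ms, end_ms) would show up in a snapshot."""
    return [t for t in range(0, duration_ms + 1, poll_ms)
            if start_ms <= t < end_ms]

# Assumed beacon: the connection opens at 320 ms and is gone 50 ms later.
beacon = (320, 370)

# Polling once per second over 10 seconds never lands inside the window.
print(polls_that_see(*beacon, poll_ms=1000, duration_ms=10000))  # []

# Polling every 100 ms still misses it (samples fall at 300 and 400 ms).
print(polls_that_see(*beacon, poll_ms=100, duration_ms=10000))   # []

# Only a poll interval shorter than the connection lifetime catches it.
print(polls_that_see(*beacon, poll_ms=25, duration_ms=10000))    # [325, 350]
```

An event-driven hook avoids the problem because it fires on every send rather than sampling at intervals.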

The System Tap documentation has a lot of examples of how to trace TCP connections. I've also come into contact with Linux malware that beacons to its C2 using UDP packets, so I'm going to take the focus off TCP and put it on UDP. Here's as basic as it gets if you just want to track UDP sends in real time using System Tap.

#! /usr/bin/env stap
probe udp.sendmsg {
    printf("%15s %15s %15s %5d %5d %15s UDP\n", ctime(gettimeofday_s()), saddr, daddr, sport, dport, execname())
}
For those who have never used System Tap, know that it operates with two primary items: events and handlers. In this case our event is udp.sendmsg, which occurs when... wait for it... UDP messages are sent. Inside the braces is the handler, where we specify what data we want back when this happens. Most of the values (saddr, daddr, sport, dport) are provided automatically, but we call ctime() to make our timestamp pretty and execname() to look up the name of the process associated with the UDP call. We can test our little script by running a ping command against a hostname. Ping itself sends ICMP, but resolving the hostname typically triggers a DNS lookup over UDP, and that is the traffic our probe sees.



Perfect. With hardly any code we can see the UDP traffic our system is sending as well as the process responsible! Let's take this thought a step further. What if, instead of viewing all UDP traffic, we wanted a tool that detects processes beaconing at a steady rate? To do this, we just have to keep a list of the timestamps at which each process sends UDP, along with the IP it contacted. If the delta between consecutive timestamps is consistent, we consider the process suspicious and report it to the user.



If you are familiar with Python, you know that calculating something like this wouldn't be too difficult. You could easily record data in a format like the following.

{procname:[timestamp1,timestamp2,timestamp3,timestamp4] }
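To make that concrete, here is a minimal Python sketch of the idea. It records timestamps in the format above and also performs the delta comparison described in the next paragraph. The names `record` and `steady_beacons`, the `min_repeats` threshold, and the sample data are all assumptions for illustration; this is not how beaconator.stp is implemented.

```python
from collections import defaultdict

# {procname: [timestamp1, timestamp2, ...]}, as in the format above
timestamps = defaultdict(list)

def record(procname, ts):
    """Store the time (in whole seconds) of one UDP send by procname."""
    timestamps[procname].append(ts)

def steady_beacons(min_repeats=3):
    """Return {procname: delta} for every process whose consecutive
    sends were separated by the same delta min_repeats times in a row."""
    found = {}
    for procname, stamps in timestamps.items():
        # Differences between neighbouring timestamps
        deltas = [b - a for a, b in zip(stamps, stamps[1:])]
        run = 1  # length of the current run of identical deltas
        for prev, cur in zip(deltas, deltas[1:]):
            run = run + 1 if cur == prev else 1
            if run >= min_repeats:
                found[procname] = cur
                break
    return found

# Made-up sample data: "backdoor" sends every 4 seconds, "ntpd" irregularly.
for ts in (100, 104, 108, 112, 116):
    record("backdoor", ts)
for ts in (100, 103, 111, 112):
    record("ntpd", ts)

print(steady_beacons())  # {'backdoor': 4}
```

A real beacon may add jitter to its interval, so an exact-match comparison like this one is only a starting point.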

You could then calculate the difference between each timestamp and determine if the beaconing is steady or not. Unfortunately, System Tap won't be quite as user friendly when it comes to scripting this process out: it has no list type, so everything has to be kept in global associative arrays. I've drafted up a small tool titled "beaconator.stp" which prints UDP calls as they happen and, every 30 seconds, checks whether any process is making them at a consistent interval. Since many don't have a Linux UDP-based backdoor handy, we can use ping and sleep inside a bash script to imitate such behavior. I use a four-second beacon here for testing, but a real beacon is likely to be far more spread out.

#!/bin/bash
for i in {1..10}; do ping -c1 google.com; sleep 4; done
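Ping only produces UDP traffic indirectly, through the DNS lookup of the hostname. If you would rather generate UDP datagrams directly, a small Python stand-in along the following lines could be used instead. This is a sketch under assumptions: the `beacon` function, the payload, and the 127.0.0.1:9999 target mentioned below are placeholders I made up, not anything taken from real malware.

```python
import socket
import time

def beacon(host, port, count=10, interval=4.0, payload=b"ping"):
    """Send `count` UDP datagrams to (host, port), sleeping `interval`
    seconds between sends to imitate a fixed-rate check-in."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for i in range(count):
            sock.sendto(payload, (host, port))
            # No need to sleep after the final datagram
            if i < count - 1:
                time.sleep(interval)

# Example (placeholder target, nothing needs to be listening):
#   beacon("127.0.0.1", 9999)
```

With the defaults this mirrors the bash loop above: ten sends, four seconds apart. Each send should show up in the udp.sendmsg probe under the name of the Python interpreter process.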


Check out the code below or on GitHub. I've commented it as best I could. I'd be hesitant to run this on any important servers as it's just a PoC, and I haven't been highly thorough in my cleanup or tidiness, but take the code, modify it yourself, and make it work in your best interest! Also note that this will be far noisier if you're running it on a server of some type.

#! /usr/bin/env stap
/* beaconator */
/* System Tap arrays must all be global */
global procnames         // processes we've seen making UDP calls (used as a set)
global curr_timestamps   // last UDP timestamp seen per process
global process_count     // number of non-zero time deltas recorded per process
global delta_list        /* Look up a time delta by process name and sequence number,
                            e.g. delta_list["ping", 3] = the delta recorded the 3rd time */
global ip_lookup_table   /* Look up an IP address by process name and delta,
                            i.e. ip_lookup_table[process_name, delta] = ip_address */
global beacon_count      // times a process repeated a given delta on consecutive sends
global confirmed_beacons // [process_name, delta] pairs we've already reported as beaconing

/* begin is executed after the kernel module is loaded */
probe begin
{
    println("Tracking UDP Calls...")
}

/* Set up a handler for UDP sends */
probe udp.sendmsg
{
    ts = gettimeofday_s()
    procname = execname()
    procnames[procname] <<< 1
    printf(" %15s %15s %15s %5d %5d %15s UDP\n", ctime(ts), saddr, daddr, sport, dport, procname)

    /* If the process that made the UDP call is already one we're tracking */
    if (procname in curr_timestamps)
    {
        delta = ts - curr_timestamps[procname]
        /* Ignore a delta of 0. Likely a duplicate UDP event */
        if (delta != 0)
        {
            process_count[procname]++
            ip_lookup_table[procname, delta] = daddr
            delta_list[procname, process_count[procname]] = delta
        }
    }
    curr_timestamps[procname] = ts
}

/* beacon_check looks for processes whose UDP sends are evenly spaced */
function beacon_check()
{
    /* Recount from scratch on each pass; otherwise the same deltas are
       counted again every 30 seconds and the totals are inflated */
    delete beacon_count

    // for every process we've seen
    foreach (procname in procnames)
    {
        cached_time_delta = 0
        // deltas are stored at indexes 1..process_count[procname]
        for (i = 1; i <= process_count[procname]; i++)
        {
            current_time_delta = delta_list[procname, i]

            // The first delta has nothing to compare against (cached is 0),
            // so it is only saved for the next pass of the loop
            if (cached_time_delta == current_time_delta)
            {
                // We've seen the same amount of space between two beacons
                beacon_count[procname, current_time_delta]++

                // Report once when the same spacing has repeated more than twice
                if (beacon_count[procname, current_time_delta] > 2
                    && !([procname, current_time_delta] in confirmed_beacons))
                {
                    confirmed_beacons[procname, current_time_delta] = 1
                    printf("%s is beaconing every %d seconds to %s\n", procname,
                           current_time_delta, ip_lookup_table[procname, current_time_delta])
                }
            }
            // save this time delta so we can compare it to the next one
            cached_time_delta = current_time_delta
        }
    }
}

/* Creating a timer that runs our beacon_check every 30 seconds */
probe timer.s(30)
{
    // print it in red
    ansi_set_color(31)
    beacon_check()
    ansi_reset_color()
}

/* end runs right before the kernel module is unloaded (ctrl+c) */
probe end
{
    println("Done!")
    beacon_check()
}



A few things I love about this approach.

  1. It's done 100% from the host level.
  2. We can tie any beacon to the process that's performing it. Sometimes this can be difficult even with a memory dump.
  3. It's faster than dumping memory and performing analysis.
  4. System Tap also allows for cross-instrumentation: under the right circumstances you can compile your script into a kernel module on one system and run that module on a separate system, instead of having to install the dependencies every time. The target generally still needs a matching kernel and the System Tap runtime to load it.