This blog post explores how to set up a simple logging pipeline to detect maliciously downloaded files. The pipeline uses Osquery, Rsyslog, Kafka, Docker, Python3, and VirusTotal. If it detects a malicious file, a Slack alert is triggered.
Abstract
First, Osquery will monitor file system events for newly created files. The Rsyslog client on a macOS endpoint will ship the Osquery logs to an Rsyslog server. The Rsyslog server will forward the logs to Kafka, which will place them into a topic to be consumed by our Dockerized Python application. The Python application will extract the file hash from each Osquery file event and submit it to VirusTotal for analysis. If VirusTotal reports that the file is malicious, a Slack alert will be triggered.
Goals
- Detect malicious downloads with Osquery and VirusTotal
- Write Osquery configs to monitor file events
- Learn to use Kafka with Python
- Learn how to leverage VirusTotal to detect malicious files
- Deploy Kafka and an Rsyslog server on Docker
Assumptions
This blog post is written as a proof of concept and is not a comprehensive guide. It will NOT cover how Osquery, Kafka, or Docker work, and it assumes you already know these technologies. Second, this blog post contains setups and configs that may NOT be production ready. The “future improvements” section discusses various improvements for this implementation.
Background
What is osquery?
Osquery exposes an operating system as a high-performance relational database. This allows you to write SQL-based queries to explore operating system data. With Osquery, SQL tables represent abstract concepts such as running processes, loaded kernel modules, open network connections, browser plugins, hardware events or file hashes.
What is Rsyslog?
Rsyslog is a rocket-fast system for log processing. It offers high performance, great security features and a modular design. While it started as a regular syslogd, rsyslog has evolved into a kind of swiss army knife of logging, being able to accept inputs from a wide variety of sources, transform them, and output the results to diverse destinations.
Rsyslog can deliver over one million messages per second to local destinations when limited processing is applied (based on v7, December 2013). Even with remote destinations and more elaborate processing the performance is usually considered “stunning”.
What is Kafka?
Apache Kafka is a community distributed event streaming platform capable of handling trillions of events a day. Initially conceived as a messaging queue, Kafka is based on an abstraction of a distributed commit log. Since being created and open sourced by LinkedIn in 2011, Kafka has quickly evolved from messaging queue to a full-fledged event streaming platform.
What is VirusTotal?
VirusTotal inspects items with over 70 antivirus scanners and URL/domain blacklisting services, in addition to a myriad of tools to extract signals from the studied content. Any user can select a file from their computer using their browser and send it to VirusTotal. VirusTotal offers a number of file submission methods, including the primary public web interface, desktop uploaders, browser extensions, and a programmatic API. The web interface has the highest scanning priority among the publicly available submission methods. Submissions may be scripted in any programming language using the HTTP-based public API.
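As a rough illustration of the programmatic API mentioned above, the sketch below looks up an existing report for a file hash. It assumes the v2 public API `file/report` endpoint with its `apikey` and `resource` parameters; the function names are hypothetical and are not taken from this post's Python app.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the VirusTotal v2 public API file-report endpoint.
VT_REPORT_URL = "https://www.virustotal.com/vtapi/v2/file/report"


def build_report_url(api_key, file_hash):
    """Return the GET URL that asks VirusTotal for an existing report."""
    query = urllib.parse.urlencode({"apikey": api_key, "resource": file_hash})
    return f"{VT_REPORT_URL}?{query}"


def fetch_report(api_key, file_hash, timeout=10):
    """Fetch and decode the JSON report for a hash (performs a network call)."""
    url = build_report_url(api_key, file_hash)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Looking up a hash avoids uploading the file itself. The public API is rate limited, so a real consumer would likely need to throttle or cache lookups.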
Network diagram
Obtain VirusTotal API key
- Browse to https://www.virustotal.com/#/home/upload and create an account
- Log in to your new account
- Select your profile icon in the top right then select “Settings”
- Select “API key” on the left
- Copy this API key for later use
Setup Kafka + Rsyslog with Docker
Kafka
Create DNS A record for Kafka
Kafka needs to register itself to a static IP address OR a fully qualified domain name (FQDN – test.google.com). Therefore, you will need to create a DNS A record on your local DNS server that points at the Docker host.
This blog post will NOT cover how to set up a DNS A record because each person’s DNS setup is different. However, I have posted a photo below of a DNS record for my Kafka container on my FreeIPA server. Within my home lab environment, I have two domains; hackinglab.local is used for development. I created an FQDN for kafka-test.hackinglab.local pointed at the IP address of 192.168.1.130, which is my Docker host.
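Before starting Kafka, it can help to confirm that the new A record resolves to the Docker host. The following is a minimal sketch using only the Python standard library; the helper names are hypothetical.

```python
import socket


def resolve_ipv4(hostname):
    """Return the sorted, de-duplicated IPv4 addresses a hostname resolves to."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    return sorted({info[4][0] for info in infos})


def points_at(hostname, expected_ip):
    """True if the hostname's A record(s) include the expected address."""
    try:
        return expected_ip in resolve_ipv4(hostname)
    except socket.gaierror:
        # The name does not resolve at all (for example, no A record yet).
        return False
```

With the values used in this post, the check would be `points_at("kafka-test.hackinglab.local", "192.168.1.130")`.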
Configure Kafka
To keep this blog post short and targeted we will setup Kafka using Docker. This post assumes you know what Kafka is and how to operate it. If you are unfamiliar with Kafka, please take a look at these blog posts: Apache Kafka Tutorial — Kafka For Beginners, Thorough Introduction to Apache Kafka, and How to easily run Kafka with Docker for Development.
sed -i '' "s/KAFKA_ADVERTISED_HOST_NAME: kafka-test.hackinglab.local/KAFKA_ADVERTISED_HOST_NAME: <FQDN for Kafka>/g" docker-compose.yml
Rsyslog server
Create DNS A record for Rsyslog server
I have posted a photo below of a DNS record for my Rsyslog server container on my FreeIPA server. This FQDN will allow Rsyslog clients to send their logs to the Rsyslog server. Lastly, the FQDN rsyslog.hackinglab.local is pointed at the IP address of 192.168.1.130, which is my Docker host.
Configure Rsyslog
sed -i '' 's#broker=\["kafka-test.hackinglab.local:9092"\]#broker=["<FQDN of Kafka>:9092"]#g' conf/rsyslog-server/31-kafka-output.conf
Spin up Docker stack
docker-compose up -d
docker stats
Test Kafka
brew install ipython
pip3 install kafka-python
ipython
from json import loads
from kafka import KafkaConsumer
consumer = KafkaConsumer('osquery', bootstrap_servers=['<Kafka FQDN>:9092'], value_deserializer=lambda x: loads(x.decode('utf-8')))
consumer.topics()
- Ignore port 9098
Install/Setup osquery on MacOS
- Open a browser
- Browse to https://osquery.io/downloads/official/3.3.2
- Download the latest osquery installer
- Install osquery
sudo curl https://raw.githubusercontent.com/CptOfEvilMinions/BlogProjects/master/osquery_kafka_rsyslog/conf/osquery/osquery.conf -o /var/osquery/osquery.conf
sudo curl https://raw.githubusercontent.com/CptOfEvilMinions/BlogProjects/master/osquery_kafka_rsyslog/conf/osquery/osquery.flags -o /var/osquery/osquery.flags
sudo cp /var/osquery/com.facebook.osqueryd.plist /Library/LaunchDaemons/com.facebook.osqueryd.plist
sudo launchctl load /Library/LaunchDaemons/com.facebook.osqueryd.plist
Install/Setup Rsyslog client on MacOS
brew install rsyslog
sudo curl https://raw.githubusercontent.com/CptOfEvilMinions/BlogProjects/master/osquery_kafka_rsyslog/conf/rsyslog-client/rsyslog.conf -o /usr/local/etc/rsyslog.conf
sudo mkdir /etc/rsyslog.d
sudo curl https://raw.githubusercontent.com/CptOfEvilMinions/BlogProjects/master/osquery_kafka_rsyslog/conf/rsyslog-client/30-osquery.conf -o /etc/rsyslog.d/30-osquery.conf
sudo sed -i '' 's#Target="rsyslog.hackinglab.local"#Target="<FQDN of Rsyslog server>"#g' /etc/rsyslog.d/30-osquery.conf
sudo brew services start rsyslog
Test osquery config
- Open browser
- Browse to https://www.mozilla.org/en-US/firefox/new/ or any downloading website
- Download a file
tail -f /var/log/osqueryd.results.log
- May take up to 5 minutes
Pull events from Kafka with kafka-python
brew install ipython
pip3 install kafka-python
ipython
from kafka import KafkaConsumer
from json import loads
consumer = KafkaConsumer('osquery', bootstrap_servers=['<FQDN of Kafka>:9092'], value_deserializer=lambda x: loads(x.decode('utf-8')))
for message in consumer:
    message = message.value
    print(message)
- Result
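Each consumed message can then be reduced to the file hash that will be sent to VirusTotal. The sketch below assumes the Kafka message is the Osquery result-log entry itself, with the query name in `name` and the `file_events` columns (including `target_path` and `sha256`) under `columns`. If your Rsyslog template wraps the entry in other fields, the lookup path will differ. The function name is hypothetical.

```python
def extract_file_hash(event):
    """Return (target_path, sha256) for a file_events result, else None."""
    # Assumption: the scheduled query is named "file_events" in osquery.conf.
    if event.get("name") != "file_events":
        return None
    columns = event.get("columns", {})
    sha256 = columns.get("sha256")
    if not sha256:
        # Osquery can leave the hash columns empty, so skip those events.
        return None
    return columns.get("target_path"), sha256
```

Inside the consumer loop above, this would be called as `extract_file_hash(message)` once `message` holds the decoded value.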
Spin up Python app with Docker
Config.yaml
cp app/config/config.yml.example app/config/config.yml
vim app/config/config.yml
and set:
- Kafka
  - hostname: Set to the FQDN of Kafka
  - port: Set to the port of Kafka
  - topic: Can leave as default
- Virustotal
  - api_key: Set API key
  - threshold: When to generate a Slack alert on a file
    - The threshold is from 0-1.0, which is positive hits on file/total scanners
- Slack
  - webhook_url: URL to send messages to
- save and exit
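To make the threshold setting concrete, here is a minimal sketch of the ratio described above. It assumes the `positives` and `total` fields of a VirusTotal v2 file report and assumes an alert fires when the ratio is greater than or equal to the threshold; the actual app may compare differently. The function names are hypothetical.

```python
def detection_ratio(report):
    """Positive hits divided by total scanners, or 0.0 if unavailable."""
    total = report.get("total", 0)
    if not total:
        # No scan data (for example, the hash is unknown to VirusTotal).
        return 0.0
    return report.get("positives", 0) / total


def should_alert(report, threshold):
    """True when the share of scanners flagging the file meets the threshold."""
    return detection_ratio(report) >= threshold
```

For example, a file flagged by 35 of 70 scanners has a ratio of 0.5, so it would alert with a threshold of 0.5 but not with a threshold of 0.6.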
Spin up app
docker-compose -f docker-compose-app.yml up -d
Testing setup
Benign file
- Open browser
- Browse to https://www.mozilla.org/en-US/firefox/new/ or any downloading website
- Download a NON-malicious file
- Kafka-python result
- App result
Malicious file
- Open browser
- Link to DOWNLOAD MALICIOUS FILE
- Kafka-python result
- App result
- Slack result
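For reference, a Slack alert of this kind can be sent by posting a JSON body with a `text` field to an incoming webhook URL. The sketch below uses only the standard library; the function names and message wording are hypothetical and may differ from what the app in this post sends.

```python
import json
import urllib.request


def build_slack_request(webhook_url, text):
    """Build a POST request carrying a Slack incoming-webhook JSON payload."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_slack_alert(webhook_url, text, timeout=10):
    """Send the alert (performs a network call) and return the HTTP status."""
    request = build_slack_request(webhook_url, text)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.status
```

Keeping request construction separate from sending makes the payload easy to inspect without contacting Slack.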
Future improvements
Osquery detection
This proof of concept (PoC) only monitors the user’s Downloads folder. In an enterprise environment, I would recommend also monitoring the user’s e-mail directory for e-mail attachments and potentially the user’s entire home folder. However, this type of monitoring will generate A LOT of noise. I would also recommend generating a list of file types you would like to monitor, such as .dmg, .docx, .pkg, and .xlsx.
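One way to cut that noise in the Python app is to drop events whose file type is not on the watch list before querying VirusTotal. This is a sketch of the idea rather than part of the current app; the set contents come from the paragraph above and the function name is hypothetical.

```python
from pathlib import PurePosixPath

# File types named in the paragraph above; extend to suit your environment.
MONITORED_SUFFIXES = {".dmg", ".docx", ".pkg", ".xlsx"}


def is_monitored(target_path, suffixes=MONITORED_SUFFIXES):
    """True if the path's final extension is on the watch list."""
    # Compare case-insensitively so "Installer.DMG" still matches ".dmg".
    return PurePosixPath(target_path).suffix.lower() in suffixes
```

Filtering on the extension alone is easy to evade by renaming a file, so it is best treated as noise reduction and not as a security control.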
Osquery is detection, not prevention
"schedule": {
  "file_events": {
    "query": "SELECT * FROM file_events WHERE action=='CREATED' AND ( target_path NOT like '/Users/%/Downloads/Unconfirmed%' AND target_path NOT like '/Users/%/Downloads/.com.google%');",
    "removed": false,
    "interval": 300
  }
}
This code segment was taken from osquery.conf. The file_events query is set to run every 300 seconds (once every 5 minutes). This means it will review the kernel’s file events every 300 seconds looking for events that match our query. Because of this, detection of a malicious download can lag by up to 5 minutes. Secondly, Osquery is a detection tool and NOT a prevention tool. Therefore, if this pipeline detects a malicious file, it will NOT delete the file, nor will it stop the user from interacting with it.
Osquery file carving
A great feature to add to this Python app would be a trigger to initiate an Osquery file carving event. File carving is when you instruct Osquery to zip up a file on an endpoint and send the zip to a server. This would save the incident response team from having to retrieve the file manually, and hopefully we can obtain the malicious download before the user or malware deletes it.
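As a starting point, the app could build the carve request as an Osquery SQL statement against the `carves` table. The sketch below only constructs the query string. It assumes the endpoint already has carving enabled with a carve server configured, and that you have some channel for delivering ad hoc queries to the endpoint (for example, distributed queries), neither of which is covered here. The function name is hypothetical.

```python
def build_carve_query(target_path):
    """Return an Osquery SQL statement that requests a carve of one file."""
    # Double any single quotes so the path stays inside the SQL string literal.
    escaped = target_path.replace("'", "''")
    return f"SELECT * FROM carves WHERE path = '{escaped}' AND carve = 1;"
```

The `target_path` would come from the same file event that triggered the Slack alert, so the carve targets exactly the file VirusTotal flagged.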