Detecting malicious downloads with Osquery, Rsyslog, Kafka, Python3, and VirusTotal

This blog post explores how to set up a simple logging pipeline that detects malicious file downloads. The pipeline uses Osquery, Rsyslog, Kafka, Docker, Python3, and VirusTotal. If it detects a malicious file, a Slack alert is triggered.

Abstract

First, Osquery monitors file system events for newly created files. An Rsyslog client on a macOS endpoint ships the logs to an Rsyslog server. The Rsyslog server forwards the logs to Kafka, which places them in a topic consumed by our Dockerized Python application. The Python application extracts the file hash from each Osquery file event and submits it to VirusTotal for analysis. If VirusTotal reports that the file is malicious, a Slack alert is triggered.

Goals

  • Detect malicious downloads with Osquery and VirusTotal
  • Work with Osquery configs
  • Learn to use Kafka with Python
  • Learn how to leverage VirusTotal to detect malicious files
  • Deploy Kafka and an Rsyslog server on Docker

Assumptions

This blog post is written as a proof of concept, not a comprehensive guide. First, it will NOT cover how Osquery, Kafka, or Docker work, and it assumes you already know these technologies. Second, it contains setups and configs that may NOT be production ready. The “future improvements” section discusses various improvements for this implementation.

Background

What is osquery?

Osquery exposes an operating system as a high-performance relational database. This allows you to write SQL-based queries to explore operating system data. With Osquery, SQL tables represent abstract concepts such as running processes, loaded kernel modules, open network connections, browser plugins, hardware events or file hashes.

What is Rsyslog?

Rsyslog is a rocket-fast system for log processing. It offers high-performance, great security features and a modular design. While it started as a regular syslogd, rsyslog has evolved into a kind of swiss army knife of logging, being able to accept inputs from a wide variety of sources, transform them, and output the results to diverse destinations.

Rsyslog can deliver over one million messages per second to local destinations when limited processing is applied (based on v7, December 2013). Even with remote destinations and more elaborate processing the performance is usually considered “stunning”.

What is Kafka?

Apache Kafka is a community distributed event streaming platform capable of handling trillions of events a day. Initially conceived as a messaging queue, Kafka is based on an abstraction of a distributed commit log. Since being created and open sourced by LinkedIn in 2011, Kafka has quickly evolved from messaging queue to a full-fledged event streaming platform.

What is VirusTotal?

VirusTotal inspects items with over 70 antivirus scanners and URL/domain blacklisting services, in addition to a myriad of tools to extract signals from the studied content. Any user can select a file from their computer using their browser and send it to VirusTotal. VirusTotal offers a number of file submission methods, including the primary public web interface, desktop uploaders, browser extensions, and a programmatic API. The web interface has the highest scanning priority among the publicly available submission methods. Submissions may be scripted in any programming language using the HTTP-based public API.
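
As a rough illustration of the programmatic API mentioned above, the sketch below looks up an existing report by file hash instead of uploading the file. It assumes the v2 file report endpoint with its `apikey` and `resource` parameters; `build_report_url` and `fetch_report` are hypothetical helper names for this example and are not taken from the app used later in this post.

```python
import json
import urllib.parse
import urllib.request

# Assumption: VirusTotal public API v2 file report endpoint.
VT_REPORT_URL = "https://www.virustotal.com/vtapi/v2/file/report"


def build_report_url(api_key: str, file_hash: str) -> str:
    """Build the GET URL that asks VirusTotal for an existing report on a hash."""
    query = urllib.parse.urlencode({"apikey": api_key, "resource": file_hash})
    return f"{VT_REPORT_URL}?{query}"


def fetch_report(api_key: str, file_hash: str, timeout: float = 10.0) -> dict:
    """Fetch and decode the JSON report. This performs a network request."""
    url = build_report_url(api_key, file_hash)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Looking up a hash avoids sending the file itself to VirusTotal, which matters when the download might contain sensitive data.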

Network diagram

Obtain VirusTotal API key

  1. Browse to https://www.virustotal.com/#/home/upload and create an account
  2. Log in to your new account
  3. Select your profile icon in the top right then select “Settings”
  4. Select “API key” on the left
    1. Copy this API key for later use

Setup Kafka + Rsyslog with Docker

Kafka

Create DNS A record for Kafka

Kafka needs to register itself to a static IP address OR a fully qualified domain name (FQDN – test.google.com). Therefore, you will need to create a DNS A record on your local DNS server that points at the Docker host.

This blog post will NOT cover how to set up a DNS A record because each person’s DNS setup is different. However, I have posted a photo below of a DNS record for my Kafka container on my FreeIPA server. Within my home lab environment, I have two domains; hackinglab.local is the one used for development. I created the FQDN kafka-test.hackinglab.local and pointed it at 192.168.1.130, the IP address of my Docker host.

Configure Kafka

To keep this blog post short and targeted, we will set up Kafka using Docker. This post assumes you know what Kafka is and how to operate it. If you are unfamiliar with Kafka, please take a look at these blog posts: Apache Kafka Tutorial — Kafka For Beginners, Thorough Introduction to Apache Kafka, and How to easily run Kafka with Docker for Development.

  1. sed -i '' "s/KAFKA_ADVERTISED_HOST_NAME: kafka-test.hackinglab.local/KAFKA_ADVERTISED_HOST_NAME: <FQDN for Kafka>/g" docker-compose.yml

Rsyslog server

Create DNS A record for Rsyslog server

I have posted a photo below of a DNS record for my Rsyslog server container on my FreeIPA server. This FQDN allows Rsyslog clients to send their logs to the Rsyslog server. The FQDN rsyslog.hackinglab.local points at 192.168.1.130, the IP address of my Docker host.

Configure Rsyslog

  1. sed -i '' 's#broker=\["kafka-test.hackinglab.local:9092"\]#broker=["<FQDN of Kafka>:9092"]#g' conf/rsyslog-server/31-kafka-output.conf

Spin up Docker stack

  1. docker-compose up -d
  2. docker stats
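
Once the stack is up, it can help to confirm that the Kafka port is reachable from another machine before moving on. The following is a minimal sketch of such a check; `port_is_open` is a hypothetical helper name, and it only proves that a TCP connection succeeds, not that Kafka itself is healthy.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

For example, `port_is_open("<FQDN for Kafka>", 9092)` should return True if the DNS A record and the container are both working.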

Test Kafka

  1. brew install ipython
  2. pip3 install kafka-python
  3. ipython
    1. from json import loads
    2. from kafka import KafkaConsumer
    3. consumer = KafkaConsumer('osquery', bootstrap_servers=['<Kafka FQDN>:9092'], value_deserializer=lambda x: loads(x.decode('utf-8')))
    4. consumer.topics()
      1. Ignore port 9098

Install/Setup osquery on MacOS

  1. Open a browser
  2. Browse to https://osquery.io/downloads/official/3.3.2
  3. Download the latest osquery installer
  4. Install osquery
  5. sudo curl https://raw.githubusercontent.com/CptOfEvilMinions/BlogProjects/master/osquery_kafka_rsyslog/conf/osquery/osquery.conf -o /var/osquery/osquery.conf
  6. sudo curl https://raw.githubusercontent.com/CptOfEvilMinions/BlogProjects/master/osquery_kafka_rsyslog/conf/osquery/osquery.flags -o /var/osquery/osquery.flags
  7. sudo cp /var/osquery/com.facebook.osqueryd.plist /Library/LaunchDaemons/com.facebook.osqueryd.plist
  8. sudo launchctl load /Library/LaunchDaemons/com.facebook.osqueryd.plist

Install/Setup Rsyslog client on MacOS

  1. brew install rsyslog
  2. sudo curl https://raw.githubusercontent.com/CptOfEvilMinions/BlogProjects/master/osquery_kafka_rsyslog/conf/rsyslog-client/rsyslog.conf -o /usr/local/etc/rsyslog.conf
  3. sudo mkdir /etc/rsyslog.d
  4. sudo curl https://raw.githubusercontent.com/CptOfEvilMinions/BlogProjects/master/osquery_kafka_rsyslog/conf/rsyslog-client/30-osquery.conf -o /etc/rsyslog.d/30-osquery.conf
    1. sudo sed -i '' 's#Target="rsyslog.hackinglab.local"#Target="<FQDN of Rsyslog server>"#g' /etc/rsyslog.d/30-osquery.conf
  5. sudo brew services start rsyslog

Test osquery config

  1. Open browser
  2. Browse to https://www.mozilla.org/en-US/firefox/new/ or any website that offers a file download
  3. Download a file
  4. tail -f /var/log/osqueryd.results.log
    1. May take up to 5 minutes

Pull events from Kafka with kafka-python

  1. brew install ipython
  2. pip3 install kafka-python
  3. ipython
  4. from json import loads
    from kafka import KafkaConsumer

    # Decode each Kafka message value from UTF-8 JSON into a Python object
    consumer = KafkaConsumer('osquery', bootstrap_servers=['<FQDN of Kafka>:9092'], value_deserializer=lambda x: loads(x.decode('utf-8')))

    # Print every message as it arrives on the topic
    for message in consumer:
        print(message.value)

  5. Result
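
With events arriving from Kafka, the next step is pulling the file hash out of each one. The sketch below assumes the deserialized message is a standard Osquery result record whose `columns` object carries the `file_events` fields `target_path` and `sha256`; the exact shape depends on how Rsyslog forwards the log line, so treat the field layout as an assumption. `extract_file_hash` is a hypothetical helper name.

```python
from typing import Optional, Tuple


def extract_file_hash(event: dict) -> Optional[Tuple[str, str]]:
    """Return (target_path, sha256) from an Osquery file_events result, or None."""
    columns = event.get("columns")
    if not isinstance(columns, dict):
        return None
    path = columns.get("target_path")
    sha256 = columns.get("sha256")
    # Skip events that lack a path or carry an empty hash value.
    if not path or not sha256:
        return None
    return path, sha256
```

Returning None for incomplete events lets the consumer loop skip them without raising.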

Spin up Python app with Docker

Config.yaml

  1. cp app/config/config.yml.example app/config/config.yml
  2. vim app/config/config.yml and set:
    1. Kafka
      1. hostname – Set to the FQDN of Kafka
      2. port – Set to the port of Kafka
      3. topic – Can leave as default
    2. VirusTotal
      1. api_key – Set API key
      2. threshold – When to generate a Slack alert on a file
        1. The threshold ranges from 0 to 1.0 and is compared against the ratio of positive hits on the file to the total number of scanners
    3. Slack
      1. webhook_url – URL to send messages to
    4. Save and exit
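
To make the threshold setting concrete, here is a minimal sketch of the comparison it describes. The function names are hypothetical, and whether the app treats a ratio exactly equal to the threshold as an alert is an assumption made here (this sketch alerts on "greater than or equal").

```python
def detection_ratio(positives: int, total: int) -> float:
    """Ratio of scanners that flagged the file to all scanners that ran."""
    if total <= 0:
        return 0.0
    return positives / total


def should_alert(positives: int, total: int, threshold: float) -> bool:
    """Return True when the detection ratio meets or exceeds the threshold."""
    if total <= 0:
        # No scanner results at all, so there is nothing to alert on.
        return False
    return detection_ratio(positives, total) >= threshold
```

For example, with a threshold of 0.5, a file flagged by 35 of 70 scanners would trigger an alert, while a file flagged by 1 of 70 would not.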

Spin up app

  1. docker-compose -f docker-compose-app.yml up -d

Testing setup

Benign file

  1. Open browser
  2. Browse to https://www.mozilla.org/en-US/firefox/new/ or any website that offers a file download
  3. Download a NON-malicious file
  4. Kafka-python result
  5. App result

Malicious file

  1. Open browser
  2. Link to DOWNLOAD MALICIOUS FILE
  3. Kafka-python result
  4. App result
  5. Slack result
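
For readers curious how the Slack alert could be produced, the sketch below posts a JSON body with a `text` field to an incoming webhook URL. The message wording and the helper names `build_slack_payload` and `send_slack_alert` are assumptions for illustration; the app's actual alert format may differ.

```python
import json
import urllib.request


def build_slack_payload(path: str, sha256: str, positives: int, total: int) -> bytes:
    """Build the JSON body for a Slack incoming webhook message."""
    text = (
        f"Malicious download detected: {path} "
        f"(sha256 {sha256}), {positives}/{total} scanners flagged it"
    )
    return json.dumps({"text": text}).encode("utf-8")


def send_slack_alert(webhook_url: str, payload: bytes, timeout: float = 10.0) -> int:
    """POST the payload to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.status
```

Keeping payload construction separate from the network call makes the message format easy to check without contacting Slack.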

Future improvements

Osquery detection

This proof of concept (PoC) only monitors the user’s Downloads folder. In an enterprise environment, I would recommend also monitoring the user’s e-mail directory for e-mail attachments and potentially the user’s entire home folder. However, this type of monitoring will generate A LOT of noise. I would also recommend building a list of the file types you would like to monitor, such as .dmg, .docx, .pkg, and .xlsx.
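
One way to apply such a file type list is on the consumer side, before a hash is submitted to VirusTotal. The following is a small sketch under that assumption; `MONITORED_EXTENSIONS` and `is_monitored` are hypothetical names, and the same filtering could instead be done in the Osquery query itself.

```python
from pathlib import PurePosixPath

# Example list taken from the file types named above; extend as needed.
MONITORED_EXTENSIONS = frozenset({".dmg", ".docx", ".pkg", ".xlsx"})


def is_monitored(target_path: str, extensions=MONITORED_EXTENSIONS) -> bool:
    """Return True if the file's final extension is on the monitored list."""
    return PurePosixPath(target_path).suffix.lower() in extensions
```

Note that only the final suffix is compared, so a name like archive.tar.gz is matched on .gz.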

Osquery is detection, not prevention

"schedule": {
    "file_events": {
      "query": "SELECT * FROM file_events WHERE action=='CREATED' AND ( target_path NOT like '/Users/%/Downloads/Unconfirmed%' AND target_path NOT like '/Users/%/Downloads/.com.google%');",
      "removed": false,
      "interval": 300
    }
}

This code segment was taken from osquery.conf. The file_events query is set to run every 300 seconds (once every 5 minutes). This means it reviews the kernel’s file events every 300 seconds, looking for events that match our query. Because of this, detection of a malicious download can lag by up to 5 minutes. Secondly, Osquery is a detection tool and NOT a prevention tool. Therefore, if this pipeline detects a malicious file, it will NOT delete the file, nor will it stop the user from interacting with it.

Osquery file carving

A great feature to add to this Python app would be a trigger that initiates an Osquery file carving event. File carving is when you instruct Osquery to zip up a file on an endpoint and send the zip to a server. This would save the incident response team from having to collect the file manually, and hopefully lets us obtain the malicious download before the user or the malware deletes it.
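
As a starting point for that idea, the sketch below only builds the SQL that requests a carve through Osquery's `carves` table. It assumes carving is already enabled and configured on the endpoint, and it leaves out how the query reaches the endpoint (for example, through a distributed query). `build_carve_query` is a hypothetical helper name.

```python
def build_carve_query(target_path: str) -> str:
    """Build an Osquery SQL statement that requests a carve of one path."""
    # Double any single quotes so the path stays inside the SQL string literal.
    escaped = target_path.replace("'", "''")
    return f"SELECT * FROM carves WHERE path LIKE '{escaped}' AND carve = 1;"
```

Because the path is matched with LIKE, any % or _ characters in the file name act as wildcards, so the carve may include more than the single file intended.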
