This proof of concept (PoC) demonstrates how to use Osquery to monitor the browser activity of users. In addition to collecting browser activity, it uses VirusTotal to rank each URL and flag malicious activity. The logging pipeline is built from Osquery, Rsyslog, Kafka, Python 3, VirusTotal, Splunk, and Docker. Once this pipeline is in place, your security team will be better equipped to protect your users from today’s most serious threats on the web.
Introduction
In this blog post, we will use Osquery to monitor the browser activity of users. Many organizations monitor the browser activity of their users with a web proxy. However, web proxies are costly [2], they require a certificate on every device for SSL termination [3], certificate pinning prevents SSL inspection [4], and a web proxy only works while devices are on the network [5]. This proof of concept (PoC) uses open-source software, requires only Osquery on each endpoint rather than a certificate, is not affected by certificate pinning, and works on or off the network. In addition to URL collection, this PoC uses VirusTotal to enrich the logs with a ranking of each URL to detect malicious activity.
Goals
- Log user browsing activity with Osquery
- Detect malicious URLs that users are browsing to with VirusTotal
- Implement Kafka and Python together
- Create and deploy a logging pipeline with Kafka, Rsyslog, Python, and Splunk on Docker
Assumptions
This blog post is written as a proof of concept, not a comprehensive guide. It will NOT cover how Osquery, Kafka, Rsyslog, VirusTotal, Splunk, Homebrew for macOS, or Docker work, so it assumes you already know these technologies. Second, this blog post contains setups and configurations that may NOT be production-ready. The “future improvements” section discusses various improvements for this implementation.
Background
What is Osquery?
Osquery exposes an operating system as a high-performance relational database. This allows you to write SQL-based queries to explore operating system data. With Osquery, SQL tables represent abstract concepts such as running processes, loaded kernel modules, open network connections, browser plugins, hardware events or file hashes.
What are Osquery ATC tables?
ATC (automatic table construction) is a method that exposes the contents of a local SQLite database file as an osquery virtual table. ATC was added to osquery by Mitchell Grenier (obelisk) in response to a number of virtual table pull requests that all worked by parsing SQLite databases. Rather than approving each table as a separate pull request, Mitchell took the opportunity to add a native SQLite parsing method to osquery, which allows any number of new virtual tables to be added on a customizable basis.
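To make the idea concrete, here is a minimal Python sketch that mimics what an ATC table does: it runs a fixed query against a SQLite database and exposes the rows. The `urls` table is a simplified stand-in for Chrome's History database, built in memory for illustration, and the helper names (`webkit_to_datetime`, `build_demo_history`, `read_history`) are hypothetical. It is not how osquery implements ATC internally.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Chrome stores timestamps as microseconds since 1601-01-01 UTC.
WEBKIT_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def webkit_to_datetime(microseconds):
    """Convert a Chrome/WebKit timestamp to a timezone-aware datetime."""
    return WEBKIT_EPOCH + timedelta(microseconds=microseconds)


def build_demo_history():
    """Create an in-memory database with a simplified, assumed `urls` schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE urls (id INTEGER PRIMARY KEY, url TEXT, title TEXT, "
        "visit_count INTEGER, last_visit_time INTEGER)"
    )
    conn.execute(
        "INSERT INTO urls (url, title, visit_count, last_visit_time) "
        "VALUES (?, ?, ?, ?)",
        ("https://www.mozilla.org/en-US/firefox/new", "Download Firefox",
         1, 13228000000000000),
    )
    conn.commit()
    return conn


def read_history(conn, limit=10):
    """Return the most recent visits as dictionaries, newest first."""
    rows = conn.execute(
        "SELECT url, title, visit_count, last_visit_time FROM urls "
        "ORDER BY last_visit_time DESC LIMIT ?",
        (limit,),
    )
    return [
        {"url": url, "title": title, "visit_count": count,
         "last_visit": webkit_to_datetime(ts)}
        for url, title, count, ts in rows
    ]


if __name__ == "__main__":
    for row in read_history(build_demo_history()):
        print(row["last_visit"].isoformat(), row["url"])
```

An ATC table does the equivalent declaratively: the configuration names the SQLite file, the query, and the columns, and osquery serves the result as a table.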
What is Rsyslog?
Rsyslog is a rocket-fast system for log processing. It offers high performance, great security features, and a modular design. While it started as a regular syslogd, rsyslog has evolved into a kind of Swiss Army knife of logging, able to accept inputs from a wide variety of sources, transform them, and output the results to diverse destinations.
Rsyslog can deliver over one million messages per second to local destinations when limited processing is applied (based on v7, December 2013). Even with remote destinations and more elaborate processing the performance is usually considered “stunning”.
What is Kafka?
Apache Kafka is a community distributed event streaming platform capable of handling trillions of events a day. Initially conceived as a messaging queue, Kafka is based on an abstraction of a distributed commit log. Since being created and open sourced by LinkedIn in 2011, Kafka has quickly evolved from messaging queue to a full-fledged event streaming platform.
What is VirusTotal?
VirusTotal inspects items with over 70 antivirus scanners and URL/domain blacklisting services, in addition to a myriad of tools to extract signals from the studied content. Any user can select a file from their computer using their browser and send it to VirusTotal. VirusTotal offers a number of file submission methods, including the primary public web interface, desktop uploaders, browser extensions, and a programmatic API. The web interface has the highest scanning priority among the publicly available submission methods. Submissions may be scripted in any programming language using the HTTP-based public API.
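As an illustration of the programmatic API, the sketch below looks up an existing URL report and turns it into a simple score. It assumes the VirusTotal v3 REST API, where a URL is identified by its unpadded URL-safe base64 encoding and the key is sent in the `x-apikey` header. The PoC's own client may use a different API version, and `malicious_ratio` is a hypothetical scoring choice, not VirusTotal's ranking.

```python
import base64
import json
import urllib.request

VT_URL_ENDPOINT = "https://www.virustotal.com/api/v3/urls/"


def vt_url_id(url):
    """Return the v3 URL identifier: URL-safe base64 without '=' padding."""
    return base64.urlsafe_b64encode(url.encode("utf-8")).decode("ascii").rstrip("=")


def build_vt_request(url, api_key):
    """Build (but do not send) the GET request for an existing URL report."""
    return urllib.request.Request(
        VT_URL_ENDPOINT + vt_url_id(url),
        headers={"x-apikey": api_key},
    )


def malicious_ratio(report):
    """Fraction of engines that flagged the URL as malicious or suspicious."""
    stats = report["data"]["attributes"]["last_analysis_stats"]
    total = sum(stats.values())
    flagged = stats.get("malicious", 0) + stats.get("suspicious", 0)
    return flagged / total if total else 0.0


def lookup_url(url, api_key):
    """Fetch the report over the network and return the score (needs a valid key)."""
    with urllib.request.urlopen(build_vt_request(url, api_key), timeout=10) as resp:
        return malicious_ratio(json.load(resp))
```

Calling `lookup_url("https://www.mozilla.org/en-US/firefox/new", "<your API key>")` performs a real request, so it counts against your API quota.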
What is Splunk?
Splunk is an advanced, scalable, and effective technology that indexes and searches log files stored in a system. It analyzes machine-generated data to provide operational intelligence. The main advantage of using Splunk is that it does not need a separate database, because it stores the data in its own indexes. Splunk is software mainly used for searching, monitoring, and examining machine-generated big data through a web-style interface. It captures, indexes, and correlates real-time data in a searchable container, from which it can produce graphs, reports, alerts, dashboards, and visualizations. It aims to make machine-generated data available across an organization and can recognize data patterns, produce metrics, diagnose problems, and provide intelligence for business operations. Splunk is used for application management, security, and compliance, as well as business and web analytics.
Using Osquery ATC tables with osqueryi
Install Osquery on macOS
- Open a browser
- Browse to https://osquery.io/downloads/official/4.0.2
- Download the latest Osquery installer
- Install Osquery
Osqueryi and ATC table config
curl https://raw.githubusercontent.com/CptOfEvilMinions/BlogProjects/master/osquery-url-monitor/conf/osquery/osquery_chrome_atc_table.conf -o /tmp/osquery_chrome_atc_table.conf
osqueryi --verbose --config_path /tmp/osquery_chrome_atc_table.conf
SELECT * FROM chrome_history LIMIT 10;
Network/UML diagram
Install/Setup Osquery + Rsyslog on macOS
Setup Osquery on macOS
sudo curl https://raw.githubusercontent.com/CptOfEvilMinions/BlogProjects/master/osquery-url-monitor/conf/osquery/osquery.conf -o /var/osquery/osquery.conf
sudo curl https://raw.githubusercontent.com/CptOfEvilMinions/BlogProjects/master/osquery-url-monitor/conf/osquery/osquery.flags -o /var/osquery/osquery.flags
sudo cp /var/osquery/com.facebook.osqueryd.plist /Library/LaunchDaemons/com.facebook.osqueryd.plist
sudo launchctl load /Library/LaunchDaemons/com.facebook.osqueryd.plist
Install/Setup Rsyslog client on macOS
brew install rsyslog
sudo mkdir /etc/rsyslog.d
curl https://raw.githubusercontent.com/CptOfEvilMinions/BlogProjects/master/osquery-url-monitor/conf/rsyslog-client/rsyslog.conf -o /usr/local/etc/rsyslog.conf
sudo curl https://raw.githubusercontent.com/CptOfEvilMinions/BlogProjects/master/osquery-url-monitor/conf/rsyslog-client/30-output-osquery-to-rsyslog.conf -o /etc/rsyslog.d/30-osquery.conf
sudo sed -i '' 's#Target="rsyslog.hackinglab.local"#Target="<FQDN/IP addr of Rsyslog server on Docker server>"#g' /etc/rsyslog.d/30-osquery.conf
sudo brew services start rsyslog
Setup/Deploy Kafka + Rsyslog + Python client + Splunk on Docker
Obtain VirusTotal API key
- Browse to https://www.virustotal.com/#/home/upload and create an account
- Log in to your new account
- Select your profile icon in the top right then select “Settings”
- Select “API key” on the left
- Copy this API key for the next section
Configure Python client
git clone https://github.com/CptOfEvilMinions/BlogProjects
cd BlogProjects/osquery-url-monitor
cd app
mv config/config.ini.example config/config.ini
vim config/config.ini
Set vti_api_key:
- Paste VirusTotal key from above here
- Save and exit
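For readers curious how a Python client might pick up this setting, the sketch below reads `vti_api_key` with the standard-library `configparser` module. The `[virustotal]` section name and the `load_api_key` helper are assumptions for illustration; check config.ini.example in the repository for the actual layout.

```python
import configparser


def load_api_key(config_text, section="virustotal", option="vti_api_key"):
    """Return the API key from INI-formatted text, or raise if it is missing.

    The section name is a hypothetical default; only the option name
    `vti_api_key` comes from the post.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    key = parser.get(section, option, fallback="").strip()
    if not key:
        raise ValueError(f"{option} is not set in section [{section}]")
    return key


if __name__ == "__main__":
    example = "[virustotal]\nvti_api_key = 0123456789abcdef\n"
    print(load_api_key(example))
```

Failing fast on an empty key avoids sending unauthenticated requests that VirusTotal would reject anyway.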
Deploy Docker stack
docker-compose up -d
docker stats
Setup Splunk to ingest logs
- Wait for Splunk to finish initializing
- Open a browser to http://<Docker IP address>:8000
- Enter login credentials
- Username: admin
- Password: changeme
- Select “Settings” in the top right then “Data inputs”
- Select “UDP” type under “Local inputs”
- Select “New Local UDP” in the top right
- Select source
- Select “UDP” for protocol
- Enter “1514” into port
- Input settings
- Select “_json” for the source type
- Select “Search and reporting” for “App context”
- Select “IP” for host method
- Under the index section select “Create a new index”
- Enter “osquery” for Index name
- Select “Save”
- Select your newly created index “osquery” for index
- Review
- Review the settings
- Select source
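To check the new UDP input independently of the rest of the pipeline, a small script can send a single JSON event to port 1514. This is a sketch under stated assumptions: the event fields loosely mirror osquery's result-log layout (`name`, `hostIdentifier`, `columns`, `action`), while the enriched events the PoC actually sends may look different. `build_test_event` and `send_test_event` are hypothetical helpers.

```python
import json
import socket
import time


def build_test_event(url, host="macos-test"):
    """Build a JSON test event as bytes; the field names are illustrative."""
    event = {
        "name": "chrome_history",
        "hostIdentifier": host,
        "unixTime": int(time.time()),
        "columns": {"url": url},
        "action": "added",
    }
    return json.dumps(event).encode("utf-8")


def send_test_event(payload, server, port=1514):
    """Send one UDP datagram and return the number of bytes sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (server, port))


if __name__ == "__main__":
    # Replace 127.0.0.1 with the Docker host's IP address.
    payload = build_test_event("https://www.mozilla.org/en-US/firefox/new")
    print(send_test_event(payload, "127.0.0.1"), "bytes sent")
```

UDP gives no delivery confirmation, so success here only means the datagram left the client; confirm arrival by searching the osquery index in Splunk.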
Testing setup
- Return to the macOS client
- Open the Google Chrome browser
- Browse to https://www.mozilla.org/en-US/firefox/new
Final thoughts/Future improvements
Osquery is detection, not prevention
This proof of concept (PoC) only monitors which URLs users visit with Google Chrome. It is also not real-time monitoring: the query that collects browser activity runs on a schedule, so a visit is always recorded some time after the user accesses the URL. Our configuration queries the user’s browser activity every 10 seconds, but this setting is not recommended in an enterprise environment, where we recommend every 15 minutes. In addition, if a user browses to https://malware.com, this PoC will not block the user from accessing the website. We recommend using the APIs of your network firewalls to block current and future connections to that IP address. Lastly, you can extract the domain from the URL and sinkhole it so that all future connections are unsuccessful.
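Extracting the domain for sinkholing can be sketched in a few lines of Python with the standard library. The `sinkhole_entry` helper and the hosts-file style output are assumptions for illustration; a real deployment would more likely feed the domain to your DNS server or firewall API.

```python
from urllib.parse import urlsplit


def domain_from_url(url):
    """Return the lowercase hostname of a URL, without port or credentials."""
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError(f"no hostname in URL: {url!r}")
    return host.rstrip(".")


def sinkhole_entry(url, sink_ip="0.0.0.0"):
    """Format a hosts-file style line pointing the URL's domain at sink_ip."""
    return f"{sink_ip} {domain_from_url(url)}"


if __name__ == "__main__":
    print(sinkhole_entry("https://malware.com/payload.exe"))
```

Using `urlsplit(...).hostname` rather than string slicing handles ports, credentials, and mixed-case hostnames consistently.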
URL whitelisting
This PoC submits every URL that a user browses to. Not only is this approach wasteful of the VirusTotal API quota, but it also leaks internal URLs that users may browse. We recommend creating whitelists for internal URLs/domains. In addition to a whitelist, we also recommend using Alexa’s top million to skip well-known domains such as google.com and further reduce wasteful API calls.
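A whitelist check of this kind could look like the following sketch. The domains in `WHITELIST` are placeholders, and the helper names are hypothetical; the point is to match on whole domain labels so that look-alike hosts such as google.com.evil.example are still submitted.

```python
from urllib.parse import urlsplit

# Placeholder entries: internal domains plus well-known external ones.
WHITELIST = {"corp.example.com", "google.com"}


def is_whitelisted(url, whitelist=WHITELIST):
    """True if the URL's host equals a whitelisted domain or is a subdomain of one."""
    host = (urlsplit(url).hostname or "").rstrip(".")
    return any(host == domain or host.endswith("." + domain) for domain in whitelist)


def urls_to_submit(urls, whitelist=WHITELIST):
    """Drop whitelisted and duplicate URLs, preserving the original order."""
    seen = set()
    result = []
    for url in urls:
        if url in seen or is_whitelisted(url, whitelist):
            continue
        seen.add(url)
        result.append(url)
    return result


if __name__ == "__main__":
    visited = [
        "https://mail.google.com/mail/u/0/",
        "https://wiki.corp.example.com/secret-project",
        "https://malware.com/payload.exe",
        "https://malware.com/payload.exe",
    ]
    print(urls_to_submit(visited))
```

Deduplicating before submission saves API calls in the same way the whitelist does, since users tend to revisit the same URLs.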
No visibility with incognito mode or curl
Chrome does not keep track of URLs when using incognito mode. This means that a user can use incognito mode to evade this PoC which is a loss in visibility. In addition to no visibility in incognito mode, the same applies to applications like curl. If a user or an attacker uses curl to download a malicious payload, we will not have a record of the URL. However, Osquery can be used to monitor user commands for this type of threat.
Osquery does not inspect HTTP payloads
The biggest shortcoming of this PoC is that it cannot inspect HTTP payloads, since this can only be done with a web proxy. Furthermore, as stated above, the operational and hardware costs of a web proxy can be high. However, this PoC can be used to demonstrate to your leadership why they should purchase a web proxy to increase visibility. For example, let’s say this PoC detects malware that is known to steal intellectual property. This PoC will NOT give you the visibility to determine whether the malware stole anything, or what it stole. However, an incident like this would demonstrate the importance of a web proxy in the overall defense strategy to protect users.