Inhale - Malware Inhaler

Inhale is a malware analysis and classification tool that is capable of automating and scaling many static analysis operations.

This is the beta release version, for testing purposes, feedback, and community development.


Inhale started as a series of small scripts that I used when collecting and analyzing a large amount of malware from diverse sources. There are plenty of frameworks and tools for doing similar work, but none of them really matched my workflow of quickly finding, classifying, and storing information about a large number of files. Some also require expensive API keys and other paid services.

I ended up turning these scripts into something that people can quickly set up and use, whether you run from a research server, a laptop, or a low cost computer like a Raspberry Pi.


This tool is built to run on Linux using Python3, ElasticSearch, radare2, yara and binwalk. jq is also needed to pretty-print output from the database. Basic installation instructions follow.

There's a bunch of things in the config.yml file that aren't actually set up yet. Just leave them be unless otherwise stated in this documentation.


Install requirements

python3 -m pip install -r requirements.txt

Installing ElasticSearch (Debian)

A database is not required to use Inhale, but if you would like to set one up, just follow these instructions and set the config.yml option "enable_database" to True.


wget -qO - | sudo apt-key add -
sudo apt-get install apt-transport-https
echo "deb stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-7.x.list
sudo apt-get update && sudo apt-get install elasticsearch
sudo service elasticsearch start

You can also install manually by following this documentation

Additionally you can set up a full ELK stack for visualization and data analysis purposes. It is not necessary for using this tool.
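
Once the service is started, one quick way to confirm that ElasticSearch is answering is to request its root endpoint. The sketch below is not part of Inhale. It assumes the default address http://localhost:9200 (the same one used by the curl command in "Solutions to Issues") and an instance without authentication; the function name es_version is made up for illustration.

```python
import http.client
import json
import urllib.request


def es_version(base_url="http://localhost:9200", timeout=3.0):
    """Return the ElasticSearch version string, or None if it cannot be read."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            info = json.load(resp)
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error, or a non-JSON reply.
        return None
    if not isinstance(info, dict) or not isinstance(info.get("version"), dict):
        return None
    # The root endpoint reports the server version under version.number.
    return info["version"].get("number")


if __name__ == "__main__":
    print("ElasticSearch version:", es_version() or "not reachable")
```

If this prints "not reachable", check that the service is running before setting "enable_database" to True.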

Installing radare2

It's important to install radare2 from the repo, and not your package manager. Package manager versions don't come with all the bells and whistles required for inhale.

git clone
cd radare2

Installing Yara


sudo apt-get install automake libtool make gcc
tar xvzf v3.10.0.tar.gz
cd yara-3.10.0/
sudo make install

If you get any errors about shared objects, try this to fix it.

sudo sh -c 'echo "/usr/local/lib" >> /etc/'
sudo ldconfig

Installing binwalk

It's most likely best to simply install binwalk from the repo.

git clone
cd binwalk
sudo python3 install

More information on installing additional features for binwalk is located here.

Installing telfhash/tlsh

This is a library that hashes sections of ELF files for malware family analysis. telfhash relies on tlsh, so instructions for installing both are as follows:

Set up tlsh

git clone
cd tlsh
cd py_ext/
python3 ./ build
sudo python3 ./ install

Set up telfhash

git clone
cd telfhash
sudo python3 install

Setting up web server

If you want to use a web server to host inhale output to share, set the variables in config.yml to the appropriate paths, and make sure that the directories exist!

  gen_html: False
  webdir: "/var/www/html/" # The actual web directory
  fqdn: "" # Your website
  in_path: "/var/www/html/inhaled/" # The path to inhale output
  ex_path: "/var/www/html/exhaled/" # The path for db query cache output
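
Since html output depends on those directories existing, a small check like the following can save a failed run. This is only a sketch: it takes the settings as a plain dict instead of parsing config.yml, and the helper name missing_web_dirs is hypothetical, not something Inhale provides.

```python
import os


def missing_web_dirs(cfg):
    """Return the configured web directories that do not exist on disk."""
    keys = ("webdir", "in_path", "ex_path")
    return [cfg[k] for k in keys if k in cfg and not os.path.isdir(cfg[k])]


if __name__ == "__main__":
    # Values copied from the config.yml excerpt above.
    cfg = {
        "webdir": "/var/www/html/",
        "in_path": "/var/www/html/inhaled/",
        "ex_path": "/var/www/html/exhaled/",
    }
    for path in missing_web_dirs(cfg):
        print("missing directory:", path)
```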

To use html output, run inhale like this:

python3 [all your args here] --html


Specify the file you are scraping by type:

-f INFILE      Analyze a single file
-d DIRECTORY   Analyze a directory of files
-u URLFILE     Analyze a remote file (url)
-r RDIRECTORY  Analyze a remote directory (url)
-l URLLIST     Analyze a list of URLs in a text file

Other options:

-t TAGS        Add additional tags to the output.
-b             Turn off binwalk signatures
-y YARARULES   Specify custom Yara Rules
-o OUTDIR      Store scraped files in specific output dir (default: ./files//)
-i             Just print info, don't add files to database
--html         Save output as html to the webdir.
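
For readers who want to wrap or script these options, here is one way such a command line could be declared with argparse. It is an illustration of the flags listed above, not Inhale's actual source. The dest names, the program name "inhale-sketch", and the rule that exactly one input source is given per run are all assumptions.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(prog="inhale-sketch")

    # Assumption: exactly one input source is given per run.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("-f", dest="infile", metavar="INFILE",
                        help="Analyze a single file")
    source.add_argument("-d", dest="directory", metavar="DIRECTORY",
                        help="Analyze a directory of files")
    source.add_argument("-u", dest="urlfile", metavar="URLFILE",
                        help="Analyze a remote file (url)")
    source.add_argument("-r", dest="rdirectory", metavar="RDIRECTORY",
                        help="Analyze a remote directory (url)")
    source.add_argument("-l", dest="urllist", metavar="URLLIST",
                        help="Analyze a list of URLs in a text file")

    parser.add_argument("-t", dest="tags", metavar="TAGS",
                        help="Add additional tags to the output")
    parser.add_argument("-b", dest="no_binwalk", action="store_true",
                        help="Turn off binwalk signatures")
    parser.add_argument("-y", dest="yararules", metavar="YARARULES",
                        help="Specify custom Yara rules")
    parser.add_argument("-o", dest="outdir", metavar="OUTDIR",
                        help="Store scraped files in a specific output dir")
    parser.add_argument("-i", dest="info_only", action="store_true",
                        help="Just print info, don't add files to database")
    parser.add_argument("--html", action="store_true",
                        help="Save output as html to the webdir")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["-f", "/bin/ls", "-i"])
    print(args.infile, args.info_only)
```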


Running Inhale will perform all of the analysis on a given file/directory/url and print the results to your terminal.

View info on /bin/ls, but don't add to the database

python3 -f /bin/ls -i 

Add directory 'malwarez' to database

python3 -d malwarez/

Download this file and add to the database

python3 -u

Download everything in this remote directory, tag it all as "phishing":

python3 -r -t phishing

PROTIP: Use this Twitter hashtag search to find interesting open directories that possibly contain malware. Use at your own risk.


You can pass your own yara rules with -y. This is a huge work in progress and almost everything in "YaraRules" is from Shoutout @KevTheHermit

Querying the Database

Use to query (Soon to be a nice script) *something* | jq .
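
Until that script exists, a query can also be built from Python. The sketch below is an assumption-heavy example, not the project's query tool: it targets the "inhaled" index named in the error message under "Solutions to Issues", uses ElasticSearch's standard _search endpoint with a match query on the sha256 field from the data model, and assumes the default address http://localhost:9200. Both function names are hypothetical.

```python
import json
import urllib.request


def sha256_search_request(sha256, base_url="http://localhost:9200", index="inhaled"):
    """Build (but do not send) a search request for one SHA256 hash."""
    body = {"query": {"match": {"sha256": sha256}}}
    return urllib.request.Request(
        url=f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_search(sha256):
    """Send the request and return the decoded JSON response.

    Requires a running ElasticSearch instance with the index populated.
    """
    with urllib.request.urlopen(sha256_search_request(sha256), timeout=5) as resp:
        return json.load(resp)
```

Calling run_search with a hash and printing the result through json.dumps(..., indent=2) gives output comparable to piping through jq.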

Data Model

The following is the current data model used for the elasticsearch database. Not every one of these will be used for every given file. Any r2_* tags are typically reserved for binaries of some sort.

| Name | Description |
| ----------- |-----------------------------|
| filename | The full path of the binary |
| fileext | The file extension |
| filesize | The file size |
| filetype | Filetype based on magic value. Not as reliable as binwalk signatures. |
| md5 | The file's MD5 hash |
| sha1 | The file's SHA1 hash |
| sha256 | The file's SHA256 hash |
| added | The date the file was added |
| r2_arch | Architecture of the binary file |
| r2_baddr | The binary's base address |
| r2_binsz | The size of the program code |
| r2_bits | Architecture bits - 8/16/32/64 etc. |
| r2_canary | Whether or not stack canaries are enabled |
| r2_class | Binary Class |
| r2_compiled | The date that the binary was compiled |
| r2_dbgfile | The debug file of the binary |
| r2_intrp | The interpreter that the binary calls if dynamically linked |
| r2_lang | The language of the source code |
| r2_lsyms | Whether or not there are debug symbols |
| r2_machine | The machine type, usually means the CPU the binary is for |
| r2_os | The OS that the machine is supposed to run on |
| r2_pic | Whether or not there is Position Independent Code |
| r2_relocs | Whether or not there are relocations |
| r2_rpath | The run-time search path - if applicable |
| r2_stripped | Whether or not the binary is stripped |
| r2_subsys | The binary's subsystem |
| r2_format | The binary format |
| r2_iorw | Whether ioctl calls are present |
| r2_type | The binary type, whether or not it's an executable, shared object etc. |
| yara | Contains a list of yara matches |
| binwalk | Contains a list of binwalk signatures and their locations in the binary |
| tags | Any user defined tags passed with the -t flag. |
| url | The origin url if a file was remotely downloaded |
| urls | Any URLs that have been pulled from the binary |
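
To make the non-radare2 fields concrete, here is a minimal sketch of how the filename, fileext, filesize, hash, and added values could be computed for one file using only the standard library. It is not Inhale's implementation. The function name basic_file_record, the extension format (no leading dot), and the ISO 8601 UTC timestamp are assumptions.

```python
import datetime
import hashlib
import os


def basic_file_record(path, chunk_size=65536):
    """Compute the file-level fields of the data model for a single file."""
    digests = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    # Read in chunks so large samples are not loaded into memory at once.
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for digest in digests.values():
                digest.update(chunk)

    record = {
        "filename": os.path.abspath(path),
        "fileext": os.path.splitext(path)[1].lstrip("."),  # assumed format
        "filesize": os.path.getsize(path),
        "added": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    }
    record.update({name: d.hexdigest() for name, d in digests.items()})
    return record


if __name__ == "__main__":
    import json
    import sys

    if len(sys.argv) > 1:
        print(json.dumps(basic_file_record(sys.argv[1]), indent=2))
```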

Solutions to Issues

There are some known issues with this project (mainly to do with versions from package managers), and here I will track anything that has a solution for it.

ElasticSearch index field limit

If you get an error like this:

elasticsearch.exceptions.RequestError: RequestError(400, 'illegal_argument_exception', 'Limit of total fields [1000] in index [inhaled] has been exceeded')

You may have an older version of ElasticSearch. You can upgrade, or you can increase the fields limit with this one-liner.

curl -XPUT 'localhost:9200/inhaled/_settings' -H 'Content-Type: application/json' -d'{ "index" : { "mapping" : { "total_fields" : { "limit" : "100000" }}}}'
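
If curl is not available, the same settings change can be sent from Python. This is a sketch that mirrors the curl command above (same URL, header, and body) using only the standard library. The function names are hypothetical, and apply_field_limit needs a running ElasticSearch instance at the assumed address.

```python
import json
import urllib.request


def field_limit_request(limit=100000, base_url="http://localhost:9200", index="inhaled"):
    """Build (but do not send) the PUT request that raises the field limit."""
    body = {"index": {"mapping": {"total_fields": {"limit": str(limit)}}}}
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def apply_field_limit(limit=100000):
    """Send the request and return ElasticSearch's raw reply as text."""
    with urllib.request.urlopen(field_limit_request(limit), timeout=5) as resp:
        return resp.read().decode("utf-8")
```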

Future Features

  • Re-doing the bot plugin for Discord / Matrix
  • Additional binary analysis features - pulling import/export tables, hashing of specific structures in the header, logging all strings etc. Some implemented in telfhash!
  • Checking if the file is in the database before adding. This feature was removed previously due to specific issues with older versions of ES.
  • Configuration options for requests such as: user agent, timeout, proxy etc.
  • Dockerization of this entire project.


PRs are welcome! If you want to give specific feedback, you can also DM me @netspooky on Twitter.


I'd like to thank everyone who helped to test this tool with me. I'd also like to thank Plazmaz for doing an initial sweep of the code to make it a bit neater.

Greetz to: hermit, plazmaz, nux, x0, dustyfresh, aneilan, sshell, readme, dnz, notdan, rqu, specters, nullcookies, ThugCrowd, and everyone involved with ThreatLand and the TC Safari Zone.
