This is the development tree. For downloads please see:

Welcome to bulk_extractor.

Note: bulk_extractor version 2.0 is now under development. For information, please see the Release 2.0 roadmap in the release-2.0-dev branch.

To build bulk_extractor in Linux or Mac OS:

  1. Make sure required packages have been installed. You can do this by going into the etc/ directory and looking for a script that installs the necessary packages for your platform.

  2. Then run these commands:

make install

For detailed instructions on installing packages and building bulk_extractor, read the wiki page here:

The Windows version of bulk_extractor must be built on Fedora.

To download the Windows installer and/or other releases of bulk_extractor, visit the downloads page here:

For more information on bulk_extractor, visit:

Tested Configurations

This release of bulk_extractor has been tested to compile on the following platforms:

  • Amazon Linux as of 2019-11-09
  • Fedora 32
  • Ubuntu 16.04 LTS
  • Ubuntu 18.04 LTS

To configure your operating system, please run the appropriate scripts in the etc/ directory.


If you are writing a scientific paper and using bulk_extractor, please cite it with:

Garfinkel, Simson, Digital media triage with bulk data analysis and bulk_extractor. Computers and Security 32: 56-72 (2013)

  • Science Direct
  • Bibliometrics
  • Author's website

```
@article{10.5555/2748150.2748581,
  author = {Garfinkel, Simson L.},
  title = {Digital Media Triage with Bulk Data Analysis and Bulk_extractor},
  year = {2013},
  issue_date = {February 2013},
  publisher = {Elsevier Advanced Technology Publications},
  address = {GBR},
  volume = {32},
  number = {C},
  issn = {0167-4048},
  journal = {Comput. Secur.},
  month = feb,
  pages = {56–72},
  numpages = {17},
  keywords = {Digital forensics, Bulk data analysis, bulk_extractor, Stream-based forensics, Windows hibernation files, Parallelized forensic analysis, Optimistic decompression, Forensic path, Margin, EnCase}
}
```


I continue to port bulk_extractor, tcpflow, be13_api and dfxml to modern C++. After surveying the standards I’ve decided to go with C++17 and not C++14, as support for 17 is now widespread. (I probably don’t need 20.) I am sticking with autotools, although there seems to be a strong reason to move to CMake. I am keeping be13_api and dfxml as modules that are included, python-style, rather than making them stand-alone libraries that are linked against. I’m not 100% sure that’s the correct decision, though.

The project is taking longer than anticipated because I am also doing a general code refactoring. Most of the time is going into figuring out how to untangle all of the C++ objects that deal with parser options and configuration.

Given that tcpflow and bulk_extractor both use be13_api, my attention has shifted to using tcpflow to get be13_api operational, as it is a simpler program. I’m about three-quarters of the way through now. I anticipate having something finished before the end of 2020.

--- Simson Garfinkel, October 18, 2020
