

Tabula is a tool for liberating data tables trapped inside PDF files



Is Tabula an active project?

Tabula is, and always has been, a volunteer-run project. We've occasionally had funding for specific features, but it's never been a commercial undertaking. At the moment, none of the original authors have the time to actively work on the project. The end-user application, hosted on this repo, is unlikely to see updates from us in the near future.

sees updates and occasional bug-fix releases from time to time.


Repo Note: The

branch is an in-development version of Tabula. It may differ substantially from the latest releases of Tabula.




Tabula helps you liberate data tables trapped inside PDF files.

© 2012-2020 Manuel Aristarán. Available under MIT License. See

Why Tabula?

If you’ve ever tried to do anything with data provided to you in PDFs, you know how painful this is — you can’t easily copy-and-paste rows of data out of PDF files. Tabula allows you to extract that data in CSV format, through a simple web interface.

Caveat: Tabula only works on text-based PDFs, not scanned documents. If you can click-and-drag to select text in your table in a PDF viewer (even if the output is disorganized trash), then your PDF is text-based and Tabula should work.

Security Concerns?: Tabula is designed with security in mind. Your PDF and the extracted data never touch the net -- when you use Tabula on your local machine, as long as your browser's URL bar says "localhost" or "", all processing takes place on your local machine. Other than to retrieve a few badges and other static assets, there are two calls that are made from your browser to external machines; one fetches the list of latest Tabula versions from GitHub to alert you if Tabula has been updated, the other makes a call to a stats counter that helps us determine how often various versions of Tabula are being used. If this is a problem, the version check can be disabled by adding

to the command line at startup, and the stats counter call can be disabled by adding
. Please note: If you are providing Tabula as a service using a reverse SSL proxy, users may notice a security warning due to our stats counter endpoint being hosted at a non-secure URL, so you may wish to disable the notifications in this scenario.

Using Tabula

First, make sure you have a recent copy of Java installed. You can download Java here. Tabula requires a Java Runtime Environment compatible with Java 7 (i.e. Java 7, 8 or higher). If you have a problem, check Known Issues first, then report an issue.
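The Java requirement above can be checked from a terminal before launching. The following is a minimal sketch and not part of Tabula: the helper names `java_major`, `supports_tabula` and `installed_java_version` are hypothetical, and the parsing assumes the two common version formats (legacy "1.8.0_292" and the newer "11.0.2" style).

```shell
#!/bin/sh
# Sketch: decide whether a Java version string satisfies the documented
# minimum (Java 7). All helper names here are hypothetical.

# Reduce a version string to its major number.
#   "1.7.0_80" -> 7    (legacy "1.x" scheme, used up to Java 8)
#   "11.0.2"   -> 11   (scheme used since Java 9)
java_major() {
  v="${1#1.}"            # drop the legacy "1." prefix if present
  echo "${v%%[._-]*}"    # cut everything from the first '.', '_' or '-'
}

# Succeed when the given version string is Java 7 or newer.
supports_tabula() {
  major="$(java_major "$1")"
  case "$major" in
    ''|*[!0-9]*) return 1 ;;   # empty or not a plain number
  esac
  [ "$major" -ge 7 ]
}

# Print the quoted version reported by `java -version` (written to stderr).
installed_java_version() {
  java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}
```

For example, `supports_tabula "$(installed_java_version)"` returns success only when a Java 7 or newer runtime is on the PATH.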

  • Windows: Download
    from the download site. Unzip the whole thing and open the
    file inside. A browser should automatically open to . If not, open your web browser of choice and visit that link.

To close Tabula, just go back to the console window and press "Control-C" (as if to copy).

  • Mac OS X: Download
    from the download site. Unzip and open the Tabula app inside. A browser should automatically open to . If not, open your web browser of choice and visit that link.

To close Tabula, find the Tabula icon in your dock, right-click (or control-click) on it, and press "Quit".

Note: If you’re running Mac OS X 10.8 or later, you might get an error like "Tabula is damaged and can't be opened." We're working on fixing this, but click here for a workaround.

  • Other platforms (e.g. Linux): Download the archive from the download site and unzip it to the directory of your choice. Open a terminal window and change into the directory you just unzipped. Then run:

java -Dfile.encoding=utf-8 -Xms256M -Xmx1024M -jar tabula.jar

Then manually navigate your browser to (New in Tabula 1.1. To go back to the old behavior that automatically launches your web browser, use the


Tabula binds to port 8080 by default. You can change it with the warbler.port system property; for example, to use port 9999:

java -Dfile.encoding=utf-8 -Xms256M -Xmx1024M -Dwarbler.port=9999 -jar tabula.jar

If the program fails to run, double-check that you have Java installed and then try again.
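The launch commands above differ only in the port, so a small wrapper can build them. This is a sketch under stated assumptions: `tabula_cmd` and `have_java` are hypothetical helper names, and the only Tabula-specific details used are the flags and the 8080 default already shown in this section.

```shell
#!/bin/sh
# Sketch: print the Tabula launch command for a given port.
# Helper names are hypothetical; the flags are copied from the commands above.

# Succeed when a `java` executable is found on the PATH.
have_java() {
  command -v java >/dev/null 2>&1
}

# Print the launch command; the port defaults to the documented 8080.
tabula_cmd() {
  port="${1:-8080}"
  case "$port" in
    ''|*[!0-9]*) echo "invalid port: $port" >&2; return 1 ;;
  esac
  # Valid TCP ports are 1-65535; the length check guards against huge numbers.
  if [ "${#port}" -gt 5 ] || [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
    echo "port out of range: $port" >&2
    return 1
  fi
  echo "java -Dfile.encoding=utf-8 -Xms256M -Xmx1024M -Dwarbler.port=$port -jar tabula.jar"
}
```

Calling `tabula_cmd 9999` prints the same command as the port 9999 example; run the printed line from the directory that contains tabula.jar, after confirming `have_java` succeeds.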

Known issues

There are some bugs that we're aware of that we haven't managed to fix yet. If there's not a solution here or you need more help, please go ahead and report an issue.

  1. Right-click on the Tabula app and select Open from the context menu.
  2. The system will tell you that the application is "from an unidentified developer" and ask you whether you want to open it. Click Open to allow the application to run. The system remembers this choice and won't prompt you again.

(If you continue to have issues, check the OS X Gatekeeper documentation for more information.)

  1. Open a Command Prompt
  2. Type cd followed by the path to the directory that contains tabula.jar, e.g.
    cd C:\Users\Username\Downloads
  3. Change that terminal's codepage to Unicode by typing:
    chcp 65001
  4. Run Tabula by typing

java -Dfile.encoding=utf-8 -Xms256M -Xmx1024M -Dwarbler.port=9999 -jar tabula.jar

Incorporating Tabula into your own project

Tabula is open-source, so we'd love for you to incorporate pieces of Tabula into your own projects. The "guts" of Tabula -- that is, the logic and heuristics that reconstruct tables from PDFs -- is contained in the tabula-java repo. There's a JAR file that you can easily incorporate into JVM languages like Java, Scala or Clojure and it includes a command-line tool for you to automate your extraction tasks. Visit that repo for more information on how to use

on the CLI and on how Tabula exports


Tabula has bindings for JRuby and R. If you end up writing bindings for another language, let us know and we'll add a link here.

Running Tabula from source (for developers)

  1. Download JRuby. You can install it from its website, or using tools like

    . Note that as of Tabula 1.1.0 (7875582becb2799b65586d5680782cafd399bb33), Tabula uses the JRuby 9000 series (i.e. JRuby
  2. Download Tabula and install the Ruby dependencies. (Note: if using

    , ensure that JRuby is being used.
    git clone git://
    cd tabula

    gem install bundler -v 1.17.3
    bundle install
    jruby -S jbundle install

Then, start the development server:

jruby -G -r jbundler -S rackup

(If you get encoding errors, set the

environment variable to

The site instance should now be viewable at .

You can set a couple of options when executing the server in this manner:

TABULA_DATA_DIR="/tmp/tabula" \
jruby -G -r jbundler -S rackup
    TABULA_DATA_DIR controls where uploaded data for Tabula is stored. By default, data is stored in the OS-dependent application data directory for the current user. (similar to:
    on Windows,
    ~/Library/Application Support/Tabula
    on Mac,
    on Linux/UNIX)
    prints out extra status data when PDF files are being processed. (
    by default.)
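When pointing TABULA_DATA_DIR at a custom location, it can help to confirm the directory exists and is writable before starting the server. The following is a sketch only: `prepare_data_dir` is a hypothetical helper, and whether Tabula creates a missing directory on its own is not assumed either way.

```shell
#!/bin/sh
# Sketch: validate TABULA_DATA_DIR before starting the development server.
# prepare_data_dir is a hypothetical helper, not part of Tabula.

prepare_data_dir() {
  dir="${TABULA_DATA_DIR:-}"
  if [ -z "$dir" ]; then
    # Nothing to prepare: Tabula uses its per-user default location.
    echo "TABULA_DATA_DIR is unset; the per-user default will be used" >&2
    return 0
  fi
  # Create the directory if needed, then confirm it is writable.
  if ! mkdir -p "$dir" || [ ! -w "$dir" ]; then
    echo "cannot use data directory: $dir" >&2
    return 1
  fi
  echo "$dir"
}
```

A typical use would be to export TABULA_DATA_DIR, call `prepare_data_dir`, and start the server with the rackup command above only if it succeeds.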

Alternatively, running the server as a JAR file

Testing in this manner will be closer to testing the "packaged application" version of the app.

jruby -G -S rake war
java -Dfile.encoding=utf-8 -Xms256M -Xmx1024M -jar build/tabula.jar

If you intend to develop against an unreleased version of

, you need to install its JAR to your local Maven repository. From the directory that contains

mvn install:install-file -Dfile=target/tabula--SNAPSHOT.jar -DgroupId=technology.tabula -DartifactId=tabula -Dversion=-SNAPSHOT -Dpackaging=jar -DpomFile=pom.xml

Then, adjust the


Building a packaged application version

After performing the above steps ("Running Tabula from source"), you can compile Tabula into a standalone application:

Mac OS X

If you wish to share Tabula with other machines, you will need a codesigning certificate. Our distribution of Tabula uses a self-signed certificate, as noted above. See this section of build.xml for details. If you will only be running Tabula on the machine you are building it on, you may remove this entire block (lines 44-53).

To compile the app:

WEBSERVER_VERSION=9.4.31.v20200723 MAVEN_REPO= rake macosx

This will result in a portable "" archive (inside the

directory) for Mac OS X users.

Note that the Mac version bundles Java with the Tabula app. This results in a 98MB zip file, versus the 30MB zip file for other platforms, but allows users to run Tabula without having to worry about Java version incompatibilities.


Windows

You can build .exe files for the Windows target on any platform.

Download a 3.1.X (beta) copy of Launch4J.

Unzip it into the Tabula repo so that "launch4j" (with subdirectories "bin", etc.) is in the repository root.

If you're building on 64-bit Linux, you may need to install 32-bit libraries. On Ubuntu, for example:

sudo apt-get install lib32z1 lib32ncurses5


WEBSERVER_VERSION=9.4.31.v20200723 MAVEN_REPO= rake windows

This will result in a portable "" archive (inside the

directory) for Windows users.

If you have issues, you can try building manually. (These commands are for OS X/Linux and may need to be adjusted for Windows users.)

# (from the root directory of the repo)
WEBSERVER_VERSION=9.4.31.v20200723 MAVEN_REPO= rake war
cd launch4j
ant -f ../build.xml windows

A "tabula.exe" file will be generated in "build/windows". To run, the exe file needs "tabula.jar" (contained in "build") in the same directory. You can create a .zip archive by doing:

# (from the root directory of the repo)
cd build/windows
mkdir tabula
cp tabula.exe ./tabula/
cp ../tabula.jar ./tabula/
zip -r9 tabula.zip tabula
rm -fr tabula


Interested in helping out? We'd love to have your help!

You can help by:

  • Reporting a bug.
  • Adding or editing documentation.
  • Contributing code via a Pull Request from ideas or bugs listed in the Enhancements section of the issues. see
  • Spreading the word about Tabula to people who might be able to benefit from using it.


You can also support our continued work on Tabula with a one-time or monthly donation on OpenCollective. Organizations that use Tabula can also sponsor the project for acknowledgement on our official site and this README.

Tabula is made possible in part through the generosity of our users and through grants from the Knight Foundation and the Shuttleworth Foundation. Special thanks to all the users and organizations that support Tabula!


More acknowledgments can be found in
