
About the developer

Yuras

Description

A collection of tools for processing PDF files in Haskell

pdf-toolbox


A collection of tools for processing PDF files

Features

  • Written in Haskell
  • Parsing on demand. You don't need to parse or load into memory the entire PDF file just to extract one image
  • Different levels of abstraction. You can inspect the high-level (catalog, page tree, pages) or low-level (xref, trailer, object) structure of a PDF file. You can even switch between levels of detail on the fly.
  • Extremely fast and memory efficient when you need to inspect only part of the document
  • Reasonably fast and memory efficient in the general case
  • Text extraction with exact glyph positions. It can be used, e.g., to implement text selection and copying in a PDF viewer
  • Full support of xref streams and object streams
  • Supports editing of PDF files (incremental updates)
  • Basic support for generating PDF files
  • Encrypted PDF documents are partially supported
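
On-demand parsing is possible because a PDF file ends with a `startxref` keyword followed by the byte offset of the cross-reference data, which in turn maps object numbers to file positions. The sketch below is not pdf-toolbox code; it only illustrates the idea using the standard library. The function name `findStartXref` and the sample trailer are hypothetical.

```haskell
module Main where

import Data.Char (isDigit, isSpace)
import Data.List (stripPrefix, tails)
import Data.Maybe (mapMaybe)

-- | Return the offset that follows the last "startxref" keyword in a
-- chunk taken from the end of a PDF file. A reader can seek straight to
-- this offset instead of parsing the file from the beginning.
findStartXref :: String -> Maybe Int
findStartXref chunk =
  case mapMaybe (stripPrefix "startxref") (tails chunk) of
    [] -> Nothing
    matches ->
      -- Use the last occurrence, then skip whitespace and read the digits.
      let digits = takeWhile isDigit (dropWhile isSpace (last matches))
      in if null digits then Nothing else Just (read digits)

main :: IO ()
main = do
  -- Hypothetical final bytes of a PDF file; a real reader would read
  -- only a small chunk like this from the end of the file.
  let sample = "trailer\n<< /Size 22 /Root 1 0 R >>\nstartxref\n18799\n%%EOF\n"
  print (findStartXref sample)
```

Once the cross-reference data is located this way, a single object such as one image can be loaded by jumping to its recorded offset, which is why inspecting part of a document can stay cheap.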

Still on the TODO list

  • Linearized PDF files
  • Higher-level API for incremental updates and PDF generation

Examples

(Also see the examples and viewer directories.)

Inspect high level structure:

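import Pdf.Document

main = withPdfFile "input.pdf" $ \pdf -> do
  -- assumed completion: query the encryption flag via isEncrypted
  encrypted <- isEncrypted pdf
  print encrypted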
