

signature-based file format identification



Siegfried is a signature-based file format identification tool, implementing:

  • the National Archives UK's PRONOM file format signatures
  • MIME-info file format signatures
  • the Library of Congress's FDD file format signatures (beta).
  • Wikidata (beta).





Command line

sf file.ext
sf DIR


sf -csv file.ext | DIR                     // Output CSV rather than YAML
sf -json file.ext | DIR                    // Output JSON rather than YAML
sf -droid file.ext | DIR                   // Output DROID CSV rather than YAML
sf -nr DIR                                 // Don't scan subdirectories
sf -z file.zip | DIR                       // Decompress and scan zip, tar, gzip, warc, arc
sf -zs gzip,tar file.tar.gz | DIR          // Selectively decompress and scan 
sf -hash md5 file.ext | DIR                // Calculate md5, sha1, sha256, sha512, or crc hash
sf -sig custom.sig file.ext                // Use a custom signature file
sf -                                       // Scan stream piped to stdin
sf -name file.ext -                        // Provide filename when scanning stream 
sf -f myfiles.txt                          // Scan list of files and directories
sf -v | -version                           // Display version information
sf -home c:\junk -sig custom.sig file.ext  // Use a custom home directory
sf -serve hostname:port                    // Server mode
sf -throttle 10ms DIR                      // Pause for duration (e.g. 1s) between file scans
sf -multi 256 DIR                          // Scan multiple (e.g. 256) files in parallel 
sf -log [comma-sep opts] file.ext | DIR    // Log errors etc. to stderr (default) or stdout
sf -log e,w file.ext | DIR                 // Log errors and warnings to stderr
sf -log u,o file.ext | DIR                 // Log unknowns to stdout
sf -log d,s file.ext | DIR                 // Log debugging and slow messages to stderr
sf -log p,t DIR > results.yaml             // Log progress and time while redirecting results
sf -log fmt/1,c DIR > results.yaml         // Log instances of fmt/1 and chart results
sf -replay -log u -csv results.yaml        // Replay results file, convert to csv, log unknowns
sf -setconf -multi 32 -hash sha1           // Save flag defaults in a config file
sf -setconf -serve :5138 -conf srv.conf    // Save/load named config file with '-conf filename' 
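
For scripted use, the -json output can be decoded rather than scraped. The Go sketch below shells out to sf and prints one tab-separated line per match. The JSON field names (files, filename, matches, ns, id, format) are assumptions about the output layout and should be checked against your sf version; parseResults and the struct names are hypothetical helpers, not part of siegfried.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"os/exec"
)

// result mirrors a subset of the JSON that sf -json is assumed to emit.
type result struct {
	Files []file `json:"files"`
}

type file struct {
	Filename string  `json:"filename"`
	Matches  []match `json:"matches"`
}

type match struct {
	NS     string `json:"ns"`     // identifier namespace, e.g. pronom
	ID     string `json:"id"`     // format identifier
	Format string `json:"format"` // human-readable format name
}

// parseResults decodes sf -json output into a result value.
func parseResults(data []byte) (result, error) {
	var r result
	err := json.Unmarshal(data, &r)
	return r, err
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: sfjson file.ext | DIR")
		return
	}
	// Requires sf on the system path.
	out, err := exec.Command("sf", "-json", os.Args[1]).Output()
	if err != nil {
		fmt.Fprintln(os.Stderr, "running sf:", err)
		os.Exit(1)
	}
	res, err := parseResults(out)
	if err != nil {
		fmt.Fprintln(os.Stderr, "decoding JSON:", err)
		os.Exit(1)
	}
	for _, f := range res.Files {
		for _, m := range f.Matches {
			fmt.Printf("%s\t%s\t%s\t%s\n", f.Filename, m.NS, m.ID, m.Format)
		}
	}
}
```

Decoding into a small struct keeps the script tolerant of extra fields, since encoding/json ignores keys that the struct does not declare.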

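To cross-check a value reported with the -hash flag, the same digest can be computed independently. This is a minimal Go sketch using only the standard library; it is not how sf computes hashes internally. Treating "crc" as CRC-32 (IEEE) is an assumption, and newHash and digest are hypothetical helper names.

```go
package main

import (
	"crypto/md5"
	"crypto/sha1"
	"crypto/sha256"
	"crypto/sha512"
	"encoding/hex"
	"fmt"
	"hash"
	"hash/crc32"
	"os"
)

// newHash maps a -hash style name to a standard library hash.
// Mapping "crc" to CRC-32 (IEEE) is an assumption.
func newHash(name string) (hash.Hash, error) {
	switch name {
	case "md5":
		return md5.New(), nil
	case "sha1":
		return sha1.New(), nil
	case "sha256":
		return sha256.New(), nil
	case "sha512":
		return sha512.New(), nil
	case "crc":
		return crc32.NewIEEE(), nil
	}
	return nil, fmt.Errorf("unsupported hash %q", name)
}

// digest returns the hex-encoded digest of data for the named hash.
func digest(name string, data []byte) (string, error) {
	h, err := newHash(name)
	if err != nil {
		return "", err
	}
	h.Write(data) // hash.Hash writes never return an error
	return hex.EncodeToString(h.Sum(nil)), nil
}

func main() {
	if len(os.Args) < 3 {
		fmt.Fprintln(os.Stderr, "usage: digest md5|sha1|sha256|sha512|crc file.ext")
		return
	}
	// Reads the whole file into memory; stream with io.Copy for large files.
	data, err := os.ReadFile(os.Args[2])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	sum, err := digest(os.Args[1], data)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(sum)
}
```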

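When sf runs in server mode (sf -serve hostname:port), identification requests are made over HTTP. The Go sketch below builds a request URL for a file path and prints the response body. The /identify/ route and the format query parameter are assumptions about the server's interface and should be verified against the siegfried documentation before use; identifyURL is a hypothetical helper.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
)

// identifyURL builds a request URL for a server started with sf -serve.
// The /identify/ route and the format parameter are assumptions.
func identifyURL(host, path, format string) string {
	// PathEscape percent-encodes slashes so the whole file path
	// travels as a single URL path segment.
	return "http://" + host + "/identify/" + url.PathEscape(path) +
		"?format=" + url.QueryEscape(format)
}

func main() {
	if len(os.Args) < 3 {
		fmt.Fprintln(os.Stderr, "usage: sfclient hostname:port file.ext")
		return
	}
	resp, err := http.Get(identifyURL(os.Args[1], os.Args[2], "json"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer resp.Body.Close()
	// Copy the server's response to stdout unchanged.
	if _, err := io.Copy(os.Stdout, resp.Body); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```

Percent-encoding the path matters because file names often contain spaces or slashes that would otherwise be read as URL structure.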

Signature files

By default, siegfried uses the latest PRONOM signatures without buffer limits (i.e. it may do full file scans). To use MIME-info or LOC signatures, or to add buffer limits or other customisations, use the roy tool to build your own signature file.


Install

With go installed:

go get

sf -update

Or, without go installed:


Download a pre-built binary from the releases page. Unzip to a location in your system path. Then run:

sf -update

Mac Homebrew (or Linuxbrew):

brew install mistydemeo/digipres/siegfried

Or, for the most recent updates, you can install from this fork:

brew install richardlehane/digipres/siegfried

Ubuntu/Debian (64 bit):

wget -qO - | sudo apt-key add -
echo "deb wheezy main" | sudo tee -a /etc/apt/sources.list
sudo apt-get update && sudo apt-get install siegfried


FreeBSD:

pkg install siegfried

Arch Linux:

git clone
cd siegfried
makepkg -si


v1.9.1 (2020-10-11)

Changed

  • update PRONOM to v97
  • zs flag now activates -z flag

Fixed

  • details text in PRONOM identifier
  • roy panic when building signatures with empty sequences. Reported by Greg Lepore

v1.9.0 (2020-09-22)

Added

  • a new Wikidata identifier, harvesting information from the Wikidata Query Service. Implemented by Ross Spencer.
  • select which archive types (zip, tar, gzip, warc, or arc) are unpacked using the -zs flag (sf -zs tar,zip). Implemented by Ross Spencer.

Changed

  • update LOC signatures to 2020-09-21
  • update tika-mimetypes signatures to v1.24
  • update signatures to v2.0

Fixed

  • incorrect basis for some signatures with multiple patterns. Reported and fixed by Ross Spencer.

v1.8.0 (2020-01-22)

Added

  • utc flag returns file modified dates in UTC, e.g. sf -utc FILE | DIR. Requested by Dragan Espenschied
  • new cost and repetition flags to control segmentation when building signatures

Changed

  • update PRONOM to v96
  • update LOC signatures to 2019-12-18
  • update tika-mimetypes signatures to v1.23
  • update signatures to v1.15

Fixed

  • XML namespaces detected by prefix on root tag, as well as default namespace (for mime-info spec)
  • panic when scanning certain MS-CFB files. Reported separately by Mike Shallcross and Euan Cochrane
  • file with many FF xx sequences grinds to a halt. Reported by Andy Foster

See the CHANGELOG for the full history.


Copyright 2020 Richard Lehane, Ross Spencer

Licensed under the Apache License, Version 2.0


Join the Google Group for updates, signature releases, and help.


Like siegfried and want to get involved in its development? That'd be wonderful! There are some notes on the wiki to get you started, and please get in touch.


Thanks TNA for and

Thanks Ross for and, both are very handy!

Thanks Misty for the brew and ubuntu packaging

Thanks Steffen for the FreeBSD and Arch Linux packaging
