About the developer

mozilla
158 Stars 29 Forks Mozilla Public License 2.0 135 Commits 8 Opened issues

Description

DeepSpeech based forced alignment tool

DSAlign

DeepSpeech based forced alignment tool

Installation

It is recommended to use this tool from within a virtual environment. After cloning the repository and changing to the project root, run the provided script to create one, with all requirements installed, in the git-ignored directory venv:

shell script
$ bin/createenv.sh
$ ls venv
bin  include  lib  lib64  pyvenv.cfg  share

bin/align.sh will automatically use it.

Internally, DSAlign uses the DeepSpeech STT engine, which requires a few files that are specific to the language of the speech data you want to align. For English, there is a helper script that downloads and prepares all required data:

shell script
$ bin/getmodel.sh 
[...]
$ ls models/en/
alphabet.txt  lm.binary  output_graph.pb  output_graph.pbmm  output_graph.tflite  trie
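
Before running the aligner it can be handy to confirm that the model directory is complete. The following is a minimal sketch, not part of DSAlign: check_model_dir is a hypothetical helper, and the file names are simply copied from the listing above, so treating every one of them as required is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: report which expected model files are missing.
# The file list mirrors the "ls models/en/" output shown above; whether
# each file is strictly required by the aligner is an assumption.
check_model_dir() {
    dir=$1
    missing=0
    for f in alphabet.txt lm.binary output_graph.pb output_graph.pbmm output_graph.tflite trie; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            missing=1
        fi
    done
    # Exit status 0 if everything is present, 1 otherwise.
    return "$missing"
}

# Usage: check_model_dir models/en || bin/getmodel.sh
```

The function only tests for the presence of regular files; it does not verify that the downloads are intact.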

Overview and documentation

A typical application of the aligner has three phases:

  1. Preparing the data. Although most of this has to be done case by case, there are some tools for data preparation, statistics and maintenance. All involved file formats are described here.
  2. Aligning the data using the alignment tool and its algorithm.
  3. Exporting aligned data using the data-set exporter.

Quickstart example

Example data

There is a script for downloading and preparing some public domain speech and transcript data. It requires ffmpeg for some sample conversion.

shell script
$ bin/gettestdata.sh
$ ls data
test1  test2
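
Since the test-data script depends on ffmpeg, a quick availability check can save a failed download run. This is a small sketch under the assumption that ffmpeg only needs to be reachable through PATH; have_tool is a hypothetical helper name, not something DSAlign provides.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the named program is on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool ffmpeg; then
    echo "ffmpeg found"
else
    echo "ffmpeg not found; install it before running bin/gettestdata.sh" >&2
fi
```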

Alignment using example data

Now the aligner can be called either "manually" (specifying all involved files directly):

shell script
$ bin/align.sh --audio data/test1/audio.wav --script data/test1/transcript.txt --aligned data/test1/aligned.json --tlog data/test1/transcript.log
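
When several directories share this layout, the manual call above could be wrapped in a small function. The sketch below uses only the four options shown in the command; align_dir and the ALIGN override are hypothetical names introduced for illustration, and it assumes every directory contains audio.wav and transcript.txt as data/test1 does.

```shell
#!/bin/sh
# Hypothetical wrapper: run the aligner on one directory that follows the
# test-data layout (audio.wav and transcript.txt inside the directory).
# ALIGN is an illustrative override so the command line can be previewed.
align_dir() {
    dir=$1
    "${ALIGN:-bin/align.sh}" \
        --audio "$dir/audio.wav" \
        --script "$dir/transcript.txt" \
        --aligned "$dir/aligned.json" \
        --tlog "$dir/transcript.log"
}

# Usage (assuming data/test2 uses the same file names as data/test1):
#   for d in data/test1 data/test2; do align_dir "$d"; done
# Preview the arguments without running the aligner:
#   ALIGN=echo; align_dir data/test1
```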

Or "automatically" by specifying a so-called catalog file that bundles all involved paths:

shell script
$ bin/align.sh --catalog data/test1.catalog
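
To illustrate what such a bundle might look like, the sketch below prints a catalog for a list of directories. The format is an assumption, not confirmed by this text: a JSON array of objects whose keys (audio, tlog, script, aligned) mirror the command-line options above. How relative paths inside a catalog are resolved is not stated here, so check the file-format documentation before relying on it. make_catalog is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: print a catalog for the given directories to stdout.
# Assumed format: a JSON array of objects with the keys audio, tlog,
# script and aligned. Paths containing quotes or backslashes are not escaped.
make_catalog() {
    sep=""
    printf '[\n'
    for dir in "$@"; do
        printf '%s  {"audio": "%s/audio.wav", "tlog": "%s/transcript.log", "script": "%s/transcript.txt", "aligned": "%s/aligned.json"}' \
            "$sep" "$dir" "$dir" "$dir" "$dir"
        # Every entry after the first is preceded by a comma and a newline.
        sep=",
"
    done
    printf '\n]\n'
}

# Usage: make_catalog test1 test2 > my.catalog
```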
