Fast PyTorch-based DSP for audio and 1D signals
Julius contains different Digital Signal Processing algorithms implemented with PyTorch, so that they are differentiable and available on CUDA. Note that all the modules implemented here can be used with TorchScript.
For now, I have implemented:
Along with that, you might find useful utilities in:
julius 0.2.2 released: fixed the normalization of filters in lowpass and resample to avoid leaking very low frequencies. Switched from zero padding to replicate padding (uses the first/last value instead of 0) to avoid discontinuities with strong artifacts.
The julius implementation of resampling is now officially part of Torchaudio.
julius requires Python 3.6. To install:
pip3 install -U julius
See the Julius documentation for the usage of Julius. Hereafter you will find a few examples to get you quickly started:
import julius
import torch

signal = torch.randn(6, 4, 1024)
# Resample from a sample rate of 100 to 70. The old and new sample rate must be integers,
# and resampling will be fast if they form an irreducible fraction with small numerator
# and denominator (here 10 and 7). Any shape is supported, last dim is time.
resampled_signal = julius.resample_frac(signal, 100, 70)

# Low pass filter with a 0.1 * sample_rate cutoff frequency.
low_freqs = julius.lowpass_filter(signal, 0.1)

# Fast convolutions with FFT, useful for large kernels
conv = julius.FFTConv1d(4, 10, 512)
convolved = conv(signal)

# Decomposition over frequency bands in the waveform domain
bands = julius.split_bands(signal, n_bands=10, sample_rate=100)
# Decomposition with n_bands frequency bands evenly spaced in mel space.
# Input shape can be [*, T], output will be [n_bands, *, T].
random_eq = (torch.rand(10, 1, 1, 1) * bands).sum(0)
This is an implementation of the sinc resample algorithm by Julius O. Smith. It is the same algorithm as the one used in resampy, but to run efficiently on GPU it is limited to fractional changes of the sample rate. It will be fast if the old and new sample rates are small after dividing them by their GCD. For instance, going from a sample rate of 2000 to 3000 (2 and 3 after removing the GCD) will be extremely fast, while going from 20001 to 30001 will not. Julius resampling is faster than resampy even on CPU, and when running on GPU it makes resampling a completely negligible part of your pipeline (except of course for weird cases like going from a sample rate of 20001 to 30001).
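To make the GCD rule concrete, here is a small sketch in plain Python. The helper `reduced_rates` is hypothetical and not part of julius; it only shows which pair of integers remains after removing the GCD, which is what determines whether resampling is cheap.

```python
import math
from typing import Tuple


def reduced_rates(old_sr: int, new_sr: int) -> Tuple[int, int]:
    """Divide both sample rates by their GCD (hypothetical helper, not a julius function)."""
    gcd = math.gcd(old_sr, new_sr)
    return old_sr // gcd, new_sr // gcd


print(reduced_rates(2000, 3000))    # (2, 3): small pair, fast case
print(reduced_rates(100, 70))       # (10, 7): the rates used in the usage example
print(reduced_rates(20001, 30001))  # (20001, 30001): GCD is 1, slow case
```

If the reduced pair is large, picking nearby sample rates that share a large common factor may be worth considering.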
Computing convolutions with very large kernels (>= 128) and a stride of 1 can be much faster using FFT. This implements the same API as torch.nn.functional.conv1d but with an FFT backend. Dilation and groups are not supported. FFTConv will be faster on CPU even for relatively small tensors (a few dozen channels, kernel size of 128). On CUDA, due to the higher parallelism, regular convolution can be faster in many cases, but for kernel sizes above 128, for a large number of channels or batch size, FFTConv1d will eventually be faster (basically when you no longer have idle cores that can hide the true complexity of the operation).
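The principle can be illustrated without any dependency. The sketch below is not the julius implementation: it is a dependency-free, single-channel illustration in plain Python, and the names `fft`, `direct_conv1d` and `fft_conv1d` are hypothetical. It computes the stride-1, unpadded cross-correlation (the operation conv1d performs) both directly and through an FFT, and checks that they agree.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def direct_conv1d(signal, kernel):
    """Reference: stride-1 cross-correlation without padding, O(T * K)."""
    k = len(kernel)
    return [sum(signal[t + s] * kernel[s] for s in range(k))
            for t in range(len(signal) - k + 1)]


def fft_conv1d(signal, kernel):
    """Same result through the frequency domain, O(N log N) with N >= T."""
    t, k = len(signal), len(kernel)
    n = 1
    while n < t:  # next power of two, required by the radix-2 FFT above
        n *= 2
    sig_f = fft([complex(v) for v in signal] + [0j] * (n - t))
    ker_f = fft([complex(v) for v in kernel] + [0j] * (n - k))
    # Multiplying by the conjugate of the kernel spectrum gives a circular
    # cross-correlation; its first T - K + 1 samples are free of wrap-around.
    product = [a * b.conjugate() for a, b in zip(sig_f, ker_f)]
    out = fft(product, inverse=True)
    return [v.real / n for v in out[: t - k + 1]]


signal = [1.0, 2.0, 3.0, 4.0, 5.0]
kernel = [1.0, 0.0, -1.0]
print(direct_conv1d(signal, kernel))
print(fft_conv1d(signal, kernel))
```

The direct version costs one multiply per (output sample, kernel tap) pair, while the FFT version's cost barely depends on the kernel size, which is why the advantage grows with large kernels.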
Classical Finite Impulse Response windowed sinc lowpass filter. It will use FFT convolutions automatically if the filter size is large enough.
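As a rough illustration of what a windowed sinc lowpass looks like, here is a minimal sketch in plain Python. It assumes a Hann window and a hand-picked `half_size`; both are assumptions for this example, and the function names are hypothetical rather than julius APIs. As in the usage example, the cutoff is expressed as a fraction of the sample rate, and the taps are normalized to sum to 1 so that constant signals pass through unchanged. The filter is applied with replicate padding, as described in the 0.2.2 release note.

```python
import math


def lowpass_taps(cutoff, half_size):
    """Hann-windowed sinc taps, normalized to unit gain at DC (illustrative only)."""
    taps = []
    for n in range(-half_size, half_size + 1):
        x = 2 * cutoff * n
        sinc = 1.0 if n == 0 else math.sin(math.pi * x) / (math.pi * x)
        window = 0.5 * (1 + math.cos(math.pi * n / half_size))
        taps.append(2 * cutoff * sinc * window)
    total = sum(taps)
    return [tap / total for tap in taps]


def apply_fir(signal, taps):
    """Filter with replicate padding, so the output keeps the input length."""
    half = len(taps) // 2
    padded = [signal[0]] * half + list(signal) + [signal[-1]] * half
    return [sum(padded[i + j] * taps[j] for j in range(len(taps)))
            for i in range(len(signal))]


taps = lowpass_taps(cutoff=0.1, half_size=20)
constant = apply_fir([3.0] * 50, taps)                        # DC: should be preserved
nyquist = apply_fir([(-1.0) ** i for i in range(100)], taps)  # fastest oscillation: should be attenuated
print(constant[25], nyquist[50])
```

With zero padding instead of replicate padding, the constant signal would dip towards 0 near both ends, which is the kind of discontinuity the release note refers to.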
Decomposition of a signal over frequency bands in the waveform domain. This can be useful for instance to perform parametric EQ (see Usage above).
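To give an idea of what "evenly spaced in mel space" means for the band edges, here is a small sketch in plain Python. It assumes the common HTK-style mel formula and a lowest frequency of 0 Hz; julius may use a different formula or different defaults, and `mel_spaced_cutoffs` is a hypothetical helper, not a julius function.

```python
import math


def hz_to_mel(freq):
    """HTK-style Hz to mel conversion (an assumption for this sketch)."""
    return 2595.0 * math.log10(1.0 + freq / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_spaced_cutoffs(n_bands, sample_rate, min_freq=0.0):
    """Return the n_bands - 1 cutoffs (in Hz) separating bands of equal width in mel."""
    low = hz_to_mel(min_freq)
    high = hz_to_mel(sample_rate / 2)  # bands cannot extend past Nyquist
    step = (high - low) / n_bands
    return [mel_to_hz(low + i * step) for i in range(1, n_bands)]


for cutoff in mel_spaced_cutoffs(n_bands=4, sample_rate=16000):
    print(round(cutoff, 1))
```

Because the mel scale is roughly logarithmic at high frequencies, the resulting bands are narrow at low frequencies and wider towards Nyquist.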
You can find speed tests (and comparisons to reference implementations) on the benchmark. The CPU benchmarks are run on a 2020 MacBook Pro with a 2 GHz quad-core Intel CPU. The GPU benchmarks are run on Google Colab Pro (e.g. V100 or P100 NVIDIA GPU). We also compare the validity of our implementations, as compared to reference ones like
Clone this repository, then
pip3 install '.[dev]'
python3 tests.py
To run the benchmarks:
pip3 install '.[dev]'
python3 -m bench.gen
julius is released under the MIT license.
This package is named in honor of Julius O. Smith, whose books and website were a gold mine of information for me to learn about DSP. Go check out his website if you want to learn more about DSP.