Reconstructing faces from voices

Implementation of the paper "Reconstructing faces from voices"

Yandong Wen, Rita Singh, and Bhiksha Raj

Machine Learning for Signal Processing Group

Carnegie Mellon University

Requirements

This implementation is based on Python 3.7 and PyTorch 1.1.

We recommend using conda to install the dependencies. All the requirements are listed in requirements.txt. Run the following command to create a new conda environment with all the dependencies:
$ ./install.sh

After you run the above script, activate the environment in which the packages were installed. The environment is called voice2face and can be activated by:
$ source activate voice2face

NOTE: If you get an error complaining about "webrtcvad" not being found, make sure the pip in your PATH is the one inside your environment. This can happen if you have multiple installations of pip (inside and outside the environment).
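
The pip mismatch described in the note can be checked from inside the activated environment. The following is a minimal sketch, not part of the repository: the helper name pip_matches_interpreter is hypothetical, and it assumes that a correctly set up environment has its pip executable somewhere under the interpreter's sys.prefix.

```python
import os
import shutil
import sys


def pip_matches_interpreter(pip_path, prefix):
    """Return True if pip_path is located under the environment prefix.

    Hypothetical helper: assumes the environment's pip lives below `prefix`.
    """
    if pip_path is None:
        return False
    pip_real = os.path.realpath(pip_path)
    prefix_real = os.path.realpath(prefix)
    try:
        # commonpath raises ValueError for paths on different drives
        return os.path.commonpath([pip_real, prefix_real]) == prefix_real
    except ValueError:
        return False


if __name__ == "__main__":
    pip_path = shutil.which("pip")  # first pip found on PATH, or None
    print("python prefix:", sys.prefix)
    print("pip on PATH  :", pip_path)
    if not pip_matches_interpreter(pip_path, sys.prefix):
        print("The pip on PATH is outside the active environment.")
```

If the check fails, running pip through the interpreter (python -m pip install ...) is one way to make sure packages go into the active environment.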

Processed data

The following are the processed training data we used for this paper. Please feel free to download them.

Voice data (log mel-spectrograms): google drive

Face data (aligned face images): google drive

Once downloaded, update the variables voice_dir and face_dir with the corresponding paths.
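
Before training, it can help to confirm that both paths actually point at the downloaded data. This is a small sketch under assumptions: the two example paths are placeholders for wherever you unpacked the downloads, and missing_dirs is a hypothetical helper, not a function from this repository.

```python
import os

# Placeholder locations (assumptions); replace with your own download paths.
voice_dir = "data/voices"
face_dir = "data/faces"


def missing_dirs(paths):
    """Return the entries of `paths` that are not existing directories."""
    return [p for p in paths if not os.path.isdir(p)]


if __name__ == "__main__":
    for path in missing_dirs([voice_dir, face_dir]):
        print("Directory not found:", path)
```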

Configurations

See config.py for how to change the configuration.

Train

We provide pretrained models, including a voice embedding network and a trained generator, in pretrained_models/. Alternatively, you can train your own generator by running the training script:
$ python gan_train.py
The trained model is saved as models/generator.pth.

Test

We provide some examples of generated faces (in data/example_data/) produced with the model in pretrained_models/. To generate faces for your own voice recordings using the trained model, set the variables test_data (the folder containing the voice recordings) and model_path (the path to the generator) in config.py and run:
$ python gan_test.py

Results are written to the test_data folder. For each voice recording named <name>.wav, we generate a face image named <name>.png.
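
To find the generated image for a given recording, the naming rule above can be expressed as a path mapping. This is only an illustrative sketch: output_image_path is a hypothetical helper, and it assumes the image is written next to the recording with the same base name and a .png extension.

```python
from pathlib import Path


def output_image_path(wav_path):
    """Map a recording path to the assumed image path (same stem, .png)."""
    return Path(wav_path).with_suffix(".png")


if __name__ == "__main__":
    # "test_data/speaker.wav" is a made-up example file name.
    print(output_image_path("test_data/speaker.wav"))
```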

Note: Currently we only support single-channel voice recordings at a 16 kHz sample rate. Voice and face files whose names start with A-E belong to the validation or test set, while those starting with F-Z belong to the training set.
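
Both constraints in the note can be checked before running the test script. The sketch below uses only the Python standard library; the function names and the returned labels are hypothetical, and the letter check assumes names begin with an uppercase ASCII letter.

```python
import os
import wave


def is_supported_wav(path, expected_rate=16000, expected_channels=1):
    """Return True if the WAV file is mono and sampled at 16 kHz."""
    with wave.open(path, "rb") as wav_file:
        return (wav_file.getnchannels() == expected_channels
                and wav_file.getframerate() == expected_rate)


def split_for(filename):
    """Return the data split implied by the first letter of the file name.

    A-E -> "val_or_test", F-Z -> "train", anything else -> None.
    Assumes uppercase initials; the labels are illustrative only.
    """
    first = os.path.basename(filename)[:1]
    if "A" <= first <= "E":
        return "val_or_test"
    if "F" <= first <= "Z":
        return "train"
    return None
```

Recordings that fail is_supported_wav would need to be converted to mono 16 kHz with an audio tool of your choice before being placed in the test_data folder.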

Citation

@article{wen2019reconstructing,
  title={Reconstructing faces from voices},
  author={Wen, Yandong and Singh, Rita and Raj, Bhiksha},
  journal={arXiv preprint arXiv:1905.10604},
  year={2019}
}

Contribution

We welcome contributions from everyone and are always working to make the project better. Please open a pull request or raise an issue, and we will be happy to help.

License

This repository is licensed under GNU GPL-3.0. Please refer to LICENSE.md.
