Mask-Predict

Mask-Predict uses a masked language modeling objective to train a model to predict any subset of the target words, conditioned on both the input text and a partially masked target translation.
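
As a rough sketch of that objective (illustrative only, not this repository's code: decoder, src_out, mask_idx, and pad_idx are assumed stand-ins for the model, the encoder output, and the special-token indices), the masking and loss computation look roughly like this:

import torch
import torch.nn.functional as F

def cmlm_loss(decoder, src_out, tgt_tokens, mask_idx, pad_idx):
    # Sample how many target tokens to mask per sentence: uniform in [1, length].
    lengths = tgt_tokens.ne(pad_idx).sum(1, keepdim=True)
    num_mask = (torch.rand(lengths.shape, device=tgt_tokens.device) * lengths).long() + 1
    # Rank random scores per position; the num_mask lowest-ranked positions get masked.
    # Padding receives an infinite score so it is never selected.
    scores = torch.rand(tgt_tokens.shape, device=tgt_tokens.device)
    scores = scores.masked_fill(tgt_tokens.eq(pad_idx), float("inf"))
    to_mask = scores.argsort(1).argsort(1) < num_mask
    masked_input = tgt_tokens.masked_fill(to_mask, mask_idx)
    # The decoder sees the source encoding and the partially masked target,
    # and is trained with cross-entropy on the masked positions only.
    logits = decoder(masked_input, src_out)
    return F.cross_entropy(logits[to_mask], tgt_tokens[to_mask])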

Download model

Description     Dataset                   Model
MASK-PREDICT    WMT14 English-German      download (.tar.bz2)
MASK-PREDICT    WMT14 German-English      download (.tar.bz2)
MASK-PREDICT    WMT16 English-Romanian    download (.tar.bz2)
MASK-PREDICT    WMT16 Romanian-English    download (.tar.bz2)
MASK-PREDICT    WMT17 English-Chinese     download (.tar.bz2)
MASK-PREDICT    WMT17 Chinese-English     download (.tar.bz2)
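
Each archive presumably unpacks into a maskPredict_${src}_${tgt} directory containing the pre-trained checkpoint and the dict.${src}.txt / dict.${tgt}.txt vocabularies that the preprocessing step below points at.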

Preprocess

text=PATH_YOUR_DATA
output_dir=PATH_YOUR_OUTPUT
src=source_language
tgt=target_language
model_path=PATH_TO_MASKPREDICT_MODEL_DIR

python preprocess.py --source-lang ${src} --target-lang ${tgt} --trainpref ${text}/train --validpref ${text}/valid --testpref ${text}/test --destdir ${output_dir}/data-bin --workers 60 --srcdict ${model_path}/maskPredict_${src}_${tgt}/dict.${src}.txt --tgtdict ${model_path}/maskPredict_${src}_${tgt}/dict.${tgt}.txt
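
preprocess.py follows the usual fairseq convention: given --trainpref ${text}/train, it looks for tokenized, BPE-encoded parallel files train.${src} and train.${tgt} (and likewise valid.* and test.*) and binarizes them into ${output_dir}/data-bin. Because the command reuses the downloaded model's dictionaries, the data must be encoded with the same BPE vocabulary as the pre-trained model.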

Train

model_dir=PLACE_TO_SAVE_YOUR_MODEL

python train.py ${output_dir}/data-bin --arch bert_transformer_seq2seq --share-all-embeddings --criterion label_smoothed_length_cross_entropy --label-smoothing 0.1 --lr 5e-4 --warmup-init-lr 1e-7 --min-lr 1e-9 --lr-scheduler inverse_sqrt --warmup-updates 10000 --optimizer adam --adam-betas '(0.9, 0.999)' --adam-eps 1e-6 --task translation_self --max-tokens 8192 --weight-decay 0.01 --dropout 0.3 --encoder-layers 6 --encoder-embed-dim 512 --decoder-layers 6 --decoder-embed-dim 512 --fp16 --max-source-positions 10000 --max-target-positions 10000 --max-update 300000 --seed 0 --save-dir ${model_dir}
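
Note that evaluation below expects ${model_dir}/checkpoint_best_average.pt, which train.py does not write directly; the paper averages the five best checkpoints. Assuming this fairseq fork includes the standard scripts/average_checkpoints.py utility, a command along the lines of python scripts/average_checkpoints.py --inputs ${model_dir} --num-update-checkpoints 5 --output ${model_dir}/checkpoint_best_average.pt should produce it.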

Evaluation

python generate_cmlm.py ${output_dir}/data-bin --path ${model_dir}/checkpoint_best_average.pt --task translation_self --remove-bpe --max-sentences 20 --decoding-iterations 10 --decoding-strategy mask_predict
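
For intuition about --decoding-iterations and --decoding-strategy mask_predict, here is a minimal sketch of the mask-predict loop from the paper (illustrative only, not generate_cmlm.py; decoder, src_out, and init_tokens are assumed stand-ins): start from a fully masked target, predict every masked token in parallel each iteration, and re-mask the least-confident tokens on a linearly decaying schedule.

import torch

@torch.no_grad()
def mask_predict(decoder, src_out, init_tokens, mask_idx, iterations=10):
    tokens = init_tokens.clone()                      # every position starts as mask_idx
    probs = torch.zeros(tokens.shape, device=tokens.device)
    seq_len = tokens.size(1)
    for t in range(iterations):
        masked = tokens.eq(mask_idx)
        # Predict all positions in parallel; keep per-token confidences.
        new_probs, new_tokens = decoder(tokens, src_out).softmax(-1).max(-1)
        tokens = torch.where(masked, new_tokens, tokens)
        probs = torch.where(masked, new_probs, probs)
        if t == iterations - 1:
            break
        # Linearly decaying schedule: re-mask the n least-confident tokens.
        n = seq_len * (iterations - 1 - t) // iterations
        if n > 0:
            lowest = probs.topk(n, dim=1, largest=False).indices
            tokens.scatter_(1, lowest, mask_idx)
            probs.scatter_(1, lowest, 0.0)
    return tokens

With --decoding-iterations 10 this corresponds to iterations=10; the paper additionally decodes several candidate target lengths in parallel and selects the highest-scoring output, which the sketch omits.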

License

MASK-PREDICT is CC-BY-NC 4.0. The license applies to the pre-trained models as well.

Citation

Please cite as:

@inproceedings{ghazvininejad2019MaskPredict,
  title = {Mask-Predict: Parallel Decoding of Conditional Masked Language Models},
  author = {Marjan Ghazvininejad and Omer Levy and Yinhan Liu and Luke Zettlemoyer},
  booktitle = {Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing},
  year = {2019},
}
