Convert a VCF into a MAF, where each variant is annotated to only one of all possible gene isoforms

To convert a VCF into a MAF, each variant must be mapped to only one of all possible gene transcripts/isoforms that it might affect. But even within a single isoform, a

close enough to a
, can be labeled as either in MAF format, but not as both. This selection of a single effect per variant is often subjective, and that is what this project attempts to standardize. The scripts leave most of that responsibility to Ensembl's VEP, but allow you to override its "canonical" isoforms, or use a custom ExAC VCF for annotation. The most useful feature, though, is the extensive support for parsing the wide range of crappy MAF-like or VCF-like formats we've seen out in the wild.
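
As a rough illustration of what "one effect per variant" means in practice, here is a minimal Python sketch. It is not the project's actual selection logic: the function name, the effect labels, and the ranking table are all hypothetical, chosen only to show how a user-supplied isoform override could take precedence over an annotator's canonical flag.

```python
# Hypothetical severity ranking (lower = reported first). This is an
# assumption for illustration, NOT the ranking used by VEP or by this project.
EFFECT_RANK = {"splice_site": 0, "missense": 1, "synonymous": 2, "intron": 3}


def pick_one_annotation(annotations, preferred_isoforms=frozenset()):
    """Return exactly one annotation dict out of all candidates for a variant."""

    def sort_key(ann):
        return (
            ann["transcript"] not in preferred_isoforms,  # user overrides win
            not ann.get("canonical", False),  # then the annotator's canonical flag
            EFFECT_RANK.get(ann["effect"], len(EFFECT_RANK)),  # then severity
        )

    return min(annotations, key=sort_key)


# Two made-up candidate annotations for the same variant.
candidates = [
    {"transcript": "T1", "canonical": True, "effect": "missense"},
    {"transcript": "T2", "canonical": False, "effect": "splice_site"},
]

default_choice = pick_one_annotation(candidates)
override_choice = pick_one_annotation(candidates, preferred_isoforms={"T2"})
```

With no override the canonical transcript is chosen; passing a preferred isoform switches the choice, which is the kind of control the override mechanism described above is meant to give.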


Quick start

Find the latest stable release, download it, and view the detailed usage manuals for the scripts:

export VCF2MAF_URL=`curl -sL | grep -m1 tarball_url | cut -d\" -f4`
curl -L -o mskcc-vcf2maf.tar.gz $VCF2MAF_URL; tar -zxf mskcc-vcf2maf.tar.gz; cd mskcc-vcf2maf-*
perl --man
perl --man

If you don't have VEP installed, then follow this gist. Of the many annotators out there, VEP is preferred for its large team of active coders and its CLIA-compliant HGVS formats. After installing VEP, test the script like this:
perl --input-vcf tests/test.vcf --output-maf tests/test.vep.maf

To fill columns 16 and 17 of the output MAF with tumor/normal sample IDs, and to parse out genotypes and allele counts from matched genotype columns in the VCF, use options --tumor-id and --normal-id. Skip option --normal-id if you don't have a matched normal:
perl --input-vcf tests/test.vcf --output-maf tests/test.vep.maf --tumor-id WD1309 --normal-id NB1308

VCFs from variant callers like VarScan use hardcoded sample IDs TUMOR/NORMAL to name genotype columns. To have the script correctly locate the columns to parse genotypes, while still printing proper sample IDs in the output MAF, add --vcf-tumor-id and --vcf-normal-id:
perl --input-vcf tests/test_varscan.vcf --output-maf tests/test_varscan.vep.maf --tumor-id WD1309 --normal-id NB1308 --vcf-tumor-id TUMOR --vcf-normal-id NORMAL
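
The distinction between the IDs used to find genotype columns in the VCF and the IDs printed in the MAF can be sketched in Python as follows. This is only an illustration under stated assumptions: the function `locate_genotype_columns` is hypothetical and is not part of the project; it relies only on the standard VCF layout, where nine fixed columns (CHROM through FORMAT) precede the per-sample genotype columns.

```python
VCF_FIXED_COLUMNS = 9  # CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT


def locate_genotype_columns(header_line, vcf_tumor_id, vcf_normal_id=None):
    """Return 0-based column indexes of the tumor and normal genotype columns.

    An index is None when the requested sample ID is absent from the header.
    """
    fields = header_line.rstrip("\n").lstrip("#").split("\t")
    samples = fields[VCF_FIXED_COLUMNS:]

    def index_of(sample_id):
        if sample_id is None or sample_id not in samples:
            return None
        return VCF_FIXED_COLUMNS + samples.index(sample_id)

    return index_of(vcf_tumor_id), index_of(vcf_normal_id)


# Header in the style of a caller that hardcodes TUMOR/NORMAL column names.
header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tNORMAL\tTUMOR"
tumor_col, normal_col = locate_genotype_columns(header, "TUMOR", "NORMAL")

# The IDs written to the MAF are independent of the VCF column names.
maf_ids = {
    "Tumor_Sample_Barcode": "WD1309",
    "Matched_Norm_Sample_Barcode": "NB1308",
}
```

Keeping the two sets of IDs separate is what lets the genotype lookup use TUMOR/NORMAL while the output still carries the real sample names.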

If VEP is installed under /opt/vep and the VEP cache is under /srv/vep, there are options available to tell the script where to find them:
perl --input-vcf tests/test.vcf --output-maf tests/test.vep.maf --vep-path /opt/vep --vep-data /srv/vep

If you want to skip running VEP and need a minimalist MAF-like file listing data from the input VCF only, there is an option for that. If your input VCF already contains VEP annotation, the script will try to extract it. But be warned that the accuracy of your resulting MAF depends on how VEP was operated upstream. In standard operation, the script runs VEP with very specific parameters to make sure everyone produces comparable MAFs. So it is strongly recommended to avoid skipping VEP unless you know what you're doing.


If you have a MAF or a MAF-like file that you want to reannotate, then use

, which simply runs
followed by
perl --input-maf tests/test.maf --output-maf tests/test.vep.maf

After tests on variant lists from many sources, the scripts are quite good at dealing with formatting errors or "MAF-like" files. They even support VCF-style alleles, as long as Start_Position == POS. It's OK if the input format is imperfect: any variants with a reference allele mismatch are kept aside in a separate file for debugging. The bare minimum columns expected as input are:
Chromosome  Start_Position  Reference_Allele  Tumor_Seq_Allele2  Tumor_Sample_Barcode
1           3599659         C                 T                  TCGA-A1-A0SF-01
1           6676836         A                 AGC                TCGA-A1-A0SF-01
1           7886690         G                 A                  TCGA-A1-A0SI-01
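
Before handing a MAF-like file to the scripts, it can help to confirm that the five minimum columns listed above are present. The following Python sketch does that for a tab-delimited file; `read_minimal_maf` is a hypothetical helper written for this example, not something shipped with the project.

```python
import csv
import io

# The minimum input columns named in the table above.
REQUIRED_COLUMNS = [
    "Chromosome",
    "Start_Position",
    "Reference_Allele",
    "Tumor_Seq_Allele2",
    "Tumor_Sample_Barcode",
]


def read_minimal_maf(handle):
    """Read a tab-delimited MAF-like file, keeping only the required columns."""
    reader = csv.DictReader(handle, delimiter="\t")
    present = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in present]
    if missing:
        raise ValueError("missing required columns: " + ", ".join(missing))
    return [{name: row[name] for name in REQUIRED_COLUMNS} for row in reader]


# The three example rows from the table, joined with tabs.
example = "\n".join(
    [
        "\t".join(REQUIRED_COLUMNS),
        "1\t3599659\tC\tT\tTCGA-A1-A0SF-01",
        "1\t6676836\tA\tAGC\tTCGA-A1-A0SF-01",
        "1\t7886690\tG\tA\tTCGA-A1-A0SI-01",
    ]
) + "\n"

rows = read_minimal_maf(io.StringIO(example))
```

The second row is the VCF-style insertion (A to AGC), so a check like this passes it through unchanged and leaves allele normalization to the scripts.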


for a sampler. Addition of
will be used to determine zygosity. Otherwise, it will try to determine zygosity from variant allele fractions, assuming that arguments
are set correctly to the names of columns containing those read counts. Specifying the
with its respective columns containing read-counts, is also strongly recommended. Columns containing normal allele read counts can be specified using argument
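
To make the idea of inferring zygosity from variant allele fractions concrete, here is a small Python sketch. It is an illustration only: the function names are hypothetical and the 0.7 cutoff is an assumed value for this example, not a documented default of the scripts.

```python
# Assumed cutoff for illustration only; not taken from the project.
HOM_VAF_THRESHOLD = 0.7


def variant_allele_fraction(alt_count, depth):
    """Return alt_count / depth, or None when there is no usable depth."""
    if depth <= 0:
        return None
    return alt_count / depth


def call_zygosity(alt_count, depth, hom_threshold=HOM_VAF_THRESHOLD):
    """Label a call from read counts using a single VAF cutoff."""
    vaf = variant_allele_fraction(alt_count, depth)
    if vaf is None:
        return "unknown"  # no reads covering the site
    if vaf == 0:
        return "reference"  # no reads support the variant allele
    return "homozygous" if vaf >= hom_threshold else "heterozygous"
```

For example, 45 variant reads out of 50 gives a fraction of 0.9 and is labeled homozygous under this cutoff, while 20 out of 50 (0.4) is labeled heterozygous. This is why the read-count columns need to be named correctly: without them no fraction can be computed.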


Apache License, Version 2.0 (Apache-2.0)


Cyriac Kandoth. mskcc/vcf2maf: vcf2maf v1.6.19. (2020). doi:10.5281/zenodo.593251
