VCFtools

A set of tools written in Perl and C++ for working with VCF files.

Perl module examples

This page provides usage examples for the Perl modules. All options are described in the full documentation.

Annotating

# Add custom annotations
cat in.vcf | vcf-annotate -a annotations.gz \
   -d key=INFO,ID=ANN,Number=1,Type=Integer,Description='My custom annotation' \
   -c CHROM,FROM,TO,INFO/ANN > out.vcf

# Apply SnpCluster filter
cat in.vcf | vcf-annotate --filter SnpCluster=3,10 > out.vcf

Comparing

vcf-compare A.vcf.gz B.vcf.gz C.vcf.gz
vcf check A.vcf.gz B.vcf.gz

Concatenating

vcf-concat A.vcf.gz B.vcf.gz C.vcf.gz | bgzip -c > out.vcf.gz

Converting

# Convert between VCF versions
zcat file.vcf.gz | vcf-convert -r reference.fa | bgzip -c > out.vcf.gz

# Convert from VCF format to tab-delimited text file
zcat file.vcf.gz | vcf-to-tab > out.tab

Filtering

# Filter by QUAL and minimum depth
cat in.vcf | vcf-annotate --filter Qual=10/MinDP=20 > out.vcf

Intersections, complements

# Include positions which appear in at least two files
vcf-isec -o -n +2 A.vcf.gz B.vcf.gz C.vcf.gz | bgzip -c > out.vcf.gz

# Exclude from A positions which appear in B and/or C
vcf-isec -c A.vcf.gz B.vcf.gz C.vcf.gz | bgzip -c > out.vcf.gz

# Fast htslib implementation
vcf isec -n =2 A.vcf.gz B.vcf.gz

Merging

vcf-merge A.vcf.gz B.vcf.gz | bgzip -c > C.vcf.gz
vcf merge A.vcf.gz B.vcf.gz

Querying

vcf-query file.vcf.gz 1:10327-10330 -c NA0001

Reordering columns

vcf-shuffle-cols -t template.vcf.gz file.vcf.gz > out.vcf

Stats

vcf-stats file.vcf.gz
vcf check file.vcf.gz > file.vchk && plot-vcfcheck file.vchk -p plot/

Stripping columns

vcf-subset -c NA0001,NA0002 file.vcf.gz | bgzip -c > out.vcf.gz

Useful shell one-liners

This section lists some useful one-line commands. Note that the dedicated convenience scripts vcf-sort and vcf-concat do the same but also perform some basic sanity checks. All examples are in Bash.

# Replace VCF header. The file must be compressed by bgzip.
tabix -r header.txt in.vcf.gz > out.vcf.gz

# Sort VCF file keeping the header. The head command limits the header search
# to the first 100 lines for performance, so the header must fit within them.
(zcat file.vcf.gz | head -100 | grep ^#;
zcat file.vcf.gz | grep -v ^# | sort -k1,1d -k2,2n;) \
| bgzip -c > out.vcf.gz
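
The sort one-liner above only finds header lines within the first 100 lines of the file. A minimal sketch of a variant without that limit follows; it scans the whole file twice, so it is slower on large inputs. The function name sort_vcf is hypothetical and not part of VCFtools, and gzip -dc is used in place of zcat on the assumption that the input is bgzip- or gzip-compressed.

```shell
#!/usr/bin/env bash
# sort_vcf is a hypothetical helper name, not part of VCFtools.
# Usage: sort_vcf in.vcf.gz | bgzip -c > out.vcf.gz
sort_vcf() {
    local in="$1"
    # Pass 1: emit every header line, however long the header is.
    gzip -dc "$in" | grep '^#'
    # Pass 2: sort records by chromosome (dictionary order), then by position (numeric).
    gzip -dc "$in" | grep -v '^#' | sort -k1,1d -k2,2n
}
```

Like the one-liner, this performs no sanity checks; vcf-sort remains the safer choice.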

# Merge (that is, concatenate) two VCF files into one, keeping the header
# from first one only.
(zcat A.vcf.gz | head -100 | grep ^#; \
zcat A.vcf.gz | grep -v ^#; \
zcat B.vcf.gz | grep -v ^#; ) \
| bgzip -c > out.vcf.gz
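
The same pattern extends to any number of files. The sketch below takes the header from the first file and the records from all files in the order given. The function name concat_vcf is hypothetical and not part of VCFtools, and the code assumes all inputs are bgzip- or gzip-compressed and share the same columns; it does not verify this.

```shell
#!/usr/bin/env bash
# concat_vcf is a hypothetical helper name, not part of VCFtools.
# Usage: concat_vcf A.vcf.gz B.vcf.gz C.vcf.gz | bgzip -c > out.vcf.gz
concat_vcf() {
    # The header comes from the first file only.
    gzip -dc "$1" | grep '^#'
    # Records come from every file, in the order given on the command line.
    local f
    for f in "$@"; do
        gzip -dc "$f" | grep -v '^#'
    done
}
```

Because nothing here checks that the files are compatible, vcf-concat is preferable whenever its sanity checks matter.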

VCF validation

Both vcftools and Vcf.pm can be used for validation. The former validates VCFv4.0; the latter can validate the older versions as well.

perl -MVcf -e validate example.vcf
perl -I/path/to/the/module/ -MVcf -e validate example.vcf
vcf-validator example.vcf