A set of tools written in Perl and C++ for working with VCF files.
This page provides usage examples for the Perl modules. Extended documentation for all of the options can be found in the full documentation.
# Add custom annotations
cat in.vcf | vcf-annotate -a annotations.gz \
-d key=INFO,ID=ANN,Number=1,Type=Integer,Description='My custom annotation' \
-c CHROM,FROM,TO,INFO/ANN > out.vcf
# Apply SnpCluster filter
cat in.vcf | vcf-annotate --filter SnpCluster=3,10 > out.vcf
vcf-compare A.vcf.gz B.vcf.gz C.vcf.gz
vcf check A.vcf.gz B.vcf.gz
vcf-concat A.vcf.gz B.vcf.gz C.vcf.gz | bgzip -c > out.vcf.gz
# Convert between VCF versions
zcat file.vcf.gz | vcf-convert -r reference.fa | bgzip -c > out.vcf.gz
# Convert from VCF format to tab-delimited text file
zcat file.vcf.gz | vcf-to-tab > out.tab
# Filter by QUAL and minimum depth
vcf-annotate --filter Qual=10/MinDP=20
# Include positions which appear in at least two files
vcf-isec -o -n +2 A.vcf.gz B.vcf.gz C.vcf.gz | bgzip -c > out.vcf.gz
# Exclude from A positions which appear in B and/or C
vcf-isec -c A.vcf.gz B.vcf.gz C.vcf.gz | bgzip -c > out.vcf.gz
# Fast htslib implementation
vcf isec -n =2 A.vcf.gz B.vcf.gz
vcf-merge A.vcf.gz B.vcf.gz | bgzip -c > C.vcf.gz
vcf merge A.vcf.gz B.vcf.gz
vcf-query file.vcf.gz 1:10327-10330 -c NA0001
vcf-shuffle-cols -t template.vcf.gz file.vcf.gz > out.vcf
vcf-stats file.vcf.gz
vcf check file.vcf.gz > file.vchk && plot-vcfcheck file.vchk -p plot/
vcf-subset -c NA0001,NA0002 file.vcf.gz | bgzip -c > out.vcf.gz
This section lists some useful one-line commands. Note that there are also dedicated convenience scripts, vcf-sort and vcf-concat, which do the same but also perform some basic sanity checks. All examples are in Bash.
# Replace VCF header. The file must be compressed by bgzip.
tabix -r header.txt in.vcf.gz > out.vcf.gz
# Sort VCF file keeping the header. The head command is for performance;
# it assumes that the header is no longer than 100 lines.
(zcat file.vcf.gz | head -100 | grep ^#;
zcat file.vcf.gz | grep -v ^# | sort -k1,1d -k2,2n;) \
| bgzip -c > out.vcf.gz
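The header/body split used by the sort one-liner above can be tried on an uncompressed toy file without tabix or bgzip installed. The following is only a sketch: the helper name `vcf_sort_plain` and the toy records are made up for illustration and are not part of VCFtools, and `LC_ALL=C` is an added assumption to keep the sort order independent of the locale.

```shell
# vcf_sort_plain is a hypothetical helper, not a VCFtools command.
vcf_sort_plain() {
    # Header lines (starting with '#') are printed first, in their original order.
    grep '^#' "$1"
    # Records are sorted by chromosome (dictionary order), then by position (numeric).
    grep -v '^#' "$1" | LC_ALL=C sort -k1,1d -k2,2n
}

# Toy input with tab-separated records deliberately out of order.
tmp=$(mktemp)
printf '##fileformat=VCFv4.0\n' > "$tmp"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> "$tmp"
printf '2\t500\t.\tA\tG\t50\tPASS\t.\n' >> "$tmp"
printf '1\t2000\t.\tC\tT\t50\tPASS\t.\n' >> "$tmp"
printf '1\t300\t.\tG\tA\t50\tPASS\t.\n' >> "$tmp"

sorted=$(vcf_sort_plain "$tmp")
rm -f "$tmp"

# Records come out in the order 1:300, 1:2000, 2:500, below the two header lines.
printf '%s\n' "$sorted"
```

Unlike the one-liner, this sketch scans the whole file for header lines instead of stopping after 100 lines, which is slower on large files but does not truncate long headers.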
# Merge (that is, concatenate) two VCF files into one, keeping the header
# from the first one only.
(zcat A.vcf.gz | head -100 | grep ^#; \
zcat A.vcf.gz | grep -v ^#; \
zcat B.vcf.gz | grep -v ^#; ) \
| bgzip -c > out.vcf.gz
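The same concatenation idea, keeping the header from the first file and appending the records of every file, can be sketched for any number of uncompressed inputs. The helper name `vcf_concat_plain` and the toy files are hypothetical and used only for illustration; unlike vcf-concat, this sketch performs no sanity checks (for example, it does not verify that the column headers of the inputs match).

```shell
# vcf_concat_plain is a hypothetical helper, not a VCFtools command.
vcf_concat_plain() {
    # Header lines are taken from the first file only.
    grep '^#' "$1"
    # Records follow from every file, in the order the files were given.
    for f in "$@"; do
        grep -v '^#' "$f"
    done
}

# Two toy inputs with identical headers (tab-separated columns).
a=$(mktemp)
b=$(mktemp)
hdr='##fileformat=VCFv4.0\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
printf "$hdr"'1\t100\t.\tA\tG\t50\tPASS\t.\n1\t200\t.\tC\tT\t50\tPASS\t.\n' > "$a"
printf "$hdr"'2\t50\t.\tG\tA\t50\tPASS\t.\n' > "$b"

merged=$(vcf_concat_plain "$a" "$b")
rm -f "$a" "$b"

# Two header lines followed by three records.
printf '%s\n' "$merged"
```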
Both vcftools and Vcf.pm can be used for validation. The former validates VCFv4.0; the latter can validate the older versions as well.
perl -MVcf -e validate example.vcf
perl -I/path/to/the/module/ -MVcf -e validate example.vcf
vcf-validator example.vcf