VarScan Variant Calling Software: Usage Notes

An example VarScan run:
samtools mpileup -q35 -d8000 -f GBS_A909.fa M008_alignmen_F4.sort.bam >M008_alignmen_F4.sort.mpileup
java -jar VarScan.v2.4.0.jar mpileup2indel M008_alignmen_F4.sort.mpileup --min-var-freq 0 --p-value 0.99 --min-avg-qual 1 --min-coverage 1 --output-vcf 1 > M008_alignmen_F4.sort.mpileup.indel.txt
 
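The format of VarScan's native output is shown below. Because the example command passes --output-vcf 1, the file M008_alignmen_F4.sort.mpileup.indel.txt itself is written as VCF; the native column layout applies when that option is omitted: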

Tab-delimited SNP calls with the following columns:

       Chrom         chromosome name
       Position      position (1-based)
       Ref           reference allele at this position
       Var           variant allele observed
       PoolCall      Cross-sample call using all data (Cons:Cov:Reads1:Reads2:Freq:P-value)
                            Cons - consensus genotype in IUPAC format
                            Cov - total depth of coverage
                            Reads1 - number of reads supporting reference
                            Reads2 - number of reads supporting variant
                            Freq - the variant allele frequency by read count
                            P-value - FET p-value of observed reads vs expected non-variant
       StrandFilt    Information to look for strand bias using all reads, format R1+:R1-:R2+:R2-:pval
                            R1+ = reference supporting reads on forward strand
                            R1- = reference supporting reads on reverse strand
                            R2+ = variant supporting reads on forward strand
                            R2- = variant supporting reads on reverse strand
                            pval = FET p-value for strand distribution, R1 versus R2
       SamplesRef    Number of samples called reference (wildtype)
       SamplesHet    Number of samples called heterozygous-variant
       SamplesHom    Number of samples called homozygous-variant
       SamplesNC     Number of samples not covered / not called
       SampleCalls   The calls for each sample in the mpileup, space-delimited
                            Each sample has six values separated by colons:
                            Cons - consensus genotype in IUPAC format
                            Cov - total depth of coverage
                            Reads1 - number of reads supporting reference
                            Reads2 - number of reads supporting variant
                            Freq - the variant allele frequency by read count
                            P-value - FET p-value of observed reads vs expected non-variant
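
To make the colon-separated PoolCall layout concrete, here is a minimal sketch that splits the fifth column into its parts with awk. The function name split_poolcall is hypothetical, and the sketch assumes native (non-VCF) output whose header row, if present, starts with "Chrom".

```shell
#!/usr/bin/env bash
# Split the PoolCall column (5th tab-delimited field) of VarScan native output
# into its colon-separated parts: Cons:Cov:Reads1:Reads2:Freq:P-value.
# Assumption: a header row, if present, has "Chrom" in its first column.
split_poolcall() {
  awk -F'\t' '
    $1 == "Chrom" { next }            # skip the header row if present
    {
      n = split($5, call, ":")        # call[1]=Cons ... call[6]=P-value
      if (n != 6) next                # ignore rows that do not match the layout
      printf "%s\t%s\t%s\t%s\tcov=%s\tref_reads=%s\tvar_reads=%s\tfreq=%s\n",
             $1, $2, $3, $4, call[2], call[3], call[4], call[5]
    }'
}
```

It reads from standard input, for example `split_poolcall < calls.native.txt`, where calls.native.txt is a placeholder name for a file produced without --output-vcf 1.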

 
The following describes how to use VarScan 2:

Commands

Three VarScan subcommands will invoke the germline variant calling model. These work with single-sample and multi-sample mpileup input:
	
mpileup2snp - calls single nucleotide polymorphisms (SNPs)
mpileup2indel - calls insertions and deletions (indels)
mpileup2cns - calls a consensus genotype (reference, SNP, or indel)

The first two (mpileup2snp and mpileup2indel) report *only* positions at which a variant of the given type (SNP and indel) was called. The third command (mpileup2cns) reports all positions that met the minimum coverage, or (with the -v parameter), all positions at which a SNP or an indel was called. Use the --output-vcf 1 argument to get VCF 4.1 output (that is, the result is written as a VCF-format file).
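
As a rough illustration of running the three germline subcommands on one mpileup file with VCF output, the wrapper below loops over them. The function name call_germline, the DRY_RUN switch, the VARSCAN_JAR variable, and the output naming scheme are all assumptions made for this sketch; only the subcommand names and --output-vcf 1 come from the text above.

```shell
#!/usr/bin/env bash
# Run (or, with DRY_RUN=1, only print) the three germline callers on one mpileup file.
# VARSCAN_JAR and the ".snp.vcf" / ".indel.vcf" / ".cns.vcf" naming are illustrative.
VARSCAN_JAR="${VARSCAN_JAR:-VarScan.v2.4.0.jar}"

call_germline() {
  local mpileup="$1" sub out
  for sub in mpileup2snp mpileup2indel mpileup2cns; do
    out="${mpileup}.${sub#mpileup2}.vcf"    # strip the "mpileup2" prefix: snp, indel, cns
    if [ "${DRY_RUN:-0}" = "1" ]; then
      # Dry run: show the command line without invoking Java
      echo "java -jar $VARSCAN_JAR $sub $mpileup --output-vcf 1 > $out"
    else
      java -jar "$VARSCAN_JAR" "$sub" "$mpileup" --output-vcf 1 > "$out"
    fi
  done
}
```

Setting DRY_RUN=1 before calling `call_germline M008_alignmen_F4.sort.mpileup` prints the three command lines, which is a convenient way to check them before starting a long run.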

The following commands still work, but only with single-sample pileup input, and they do NOT include full VCF output support.
	
pileup2snp - calls single nucleotide polymorphisms (SNPs)
pileup2indel - calls insertions and deletions (indels)
pileup2cns - calls a consensus genotype (reference, SNP, or indel)

The first two (pileup2snp and pileup2indel) report *only* positions at which a variant of the given type (SNP and indel) was called. The third command (pileup2cns) reports all positions that met the minimum coverage, or (with the -v parameter), all positions at which a SNP or an indel was called.
Original post: https://www.cnblogs.com/xiaofeiIDO/p/6857130.html