Counting the reads mapped by TopHat

samtools flagstat /SRA111111/SRR111222/accepted_hits.bam

78406056 + 0 in total (QC-passed reads + QC-failed reads) (1)
0 + 0 duplicates
78406056 + 0 mapped (100.00%:-nan%)  (2)
78406056 + 0 paired in sequencing (3)
39915264 + 0 read1 (4)
38490792 + 0 read2 (5)
68310778 + 0 properly paired (87.12%:-nan%) (6)
73600312 + 0 with itself and mate mapped (7)
4805744 + 0 singletons (6.13%:-nan%) (8)
1208374 + 0 with mate mapped to a different chr (9)
115100 + 0 with mate mapped to a different chr (mapQ>=5) (10)

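(2)=(7)+(8): the mapped count equals the reads with itself and mate mapped plus the singletons.

(3)=(4)+(5): the paired-in-sequencing count equals read1 plus read2.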
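The two identities can be checked mechanically. The following is a minimal sketch, assuming the older flagstat text layout shown above (count in the first column, category name starting at the fourth). `check_flagstat` is a hypothetical helper name, not a samtools command, and the input is the TopHat output from above with the (1)-(10) labels removed.

```shell
# Hypothetical helper: reads flagstat text on stdin and checks
#   mapped = (with itself and mate mapped) + singletons
#   paired in sequencing = read1 + read2
check_flagstat() {
    awk '
        $4 == "mapped"                   { mapped = $1 }
        $4 == "paired" && $5 == "in"     { paired = $1 }
        $4 == "read1"                    { r1 = $1 }
        $4 == "read2"                    { r2 = $1 }
        $4 == "with" && $5 == "itself"   { both = $1 }
        $4 == "singletons"               { single = $1 }
        END {
            printf "mapped=%d both+singletons=%d %s\n", mapped, both + single, ((mapped == both + single) ? "OK" : "MISMATCH")
            printf "paired=%d read1+read2=%d %s\n", paired, r1 + r2, ((paired == r1 + r2) ? "OK" : "MISMATCH")
        }
    '
}

# Numbers copied from the accepted_hits.bam output above.
check_flagstat <<'EOF'
78406056 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
78406056 + 0 mapped (100.00%:-nan%)
78406056 + 0 paired in sequencing
39915264 + 0 read1
38490792 + 0 read2
68310778 + 0 properly paired (87.12%:-nan%)
73600312 + 0 with itself and mate mapped
4805744 + 0 singletons (6.13%:-nan%)
1208374 + 0 with mate mapped to a different chr
115100 + 0 with mate mapped to a different chr (mapQ>=5)
EOF
```

On a real file the same helper could be fed directly, for example `samtools flagstat accepted_hits.bam | check_flagstat`, provided the installed samtools version prints the same line layout.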

Usage: samtools flagstat <in.bam>
   
$ samtools flagstat example.bam
11945742 + 0 in total (QC-passed reads + QC-failed reads) # total number of reads
0 + 0 duplicates
7536364 + 0 mapped (63.09%:-nan%) # number and percentage of reads that mapped
11945742 + 0 paired in sequencing # reads that belong to a pair
5972871 + 0 read1 # reads flagged as read1
5972871 + 0 read2 # reads flagged as read2
6412042 + 0 properly paired (53.68%:-nan%) # properly paired reads: both mates map to the same reference sequence and the distance between them is within the configured threshold
6899708 + 0 with itself and mate mapped # paired reads where both the read and its mate are mapped
636656 + 0 singletons (5.33%:-nan%) # reads that are mapped while their mate is not; adding the previous line gives the total mapped count
469868 + 0 with mate mapped to a different chr # paired reads whose mate maps to a different reference sequence
243047 + 0 with mate mapped to a different chr (mapQ>=5) # same as the previous line, restricted to reads with mapping quality >= 5

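flagstat counts alignment records rather than distinct reads, so a read reported at several locations is counted more than once. To count distinct read names instead (both mates of a pair share one name, so each pair counts once):

samtools view ./accepted_hits.bam | cut -f1 | sort | uniq | wc -l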
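The pipeline above counts distinct names in the first SAM column. As a rough sketch of a variant, the FLAG column (bit 64 for read1, bit 128 for read2) can be used to count distinct mates as well as distinct names. `count_unique_reads` is a hypothetical helper name, and the four records in the demo are made-up toy data, not taken from any real BAM file.

```shell
# Hypothetical helper: reads SAM records on stdin (as printed by `samtools view`)
# and reports distinct read names (fragments) and distinct name/mate pairs (reads).
count_unique_reads() {
    awk -F '\t' '
        /^@/ { next }    # skip header lines if present
        {
            flag = $2
            # bit 64 marks read1, bit 128 marks read2, neither means unpaired
            mate = (int(flag / 64) % 2) ? 1 : ((int(flag / 128) % 2) ? 2 : 0)
            fragments[$1] = 1
            reads[$1 "/" mate] = 1
        }
        END {
            nf = 0; for (k in fragments) nf++
            nr = 0; for (k in reads) nr++
            printf "fragments=%d reads=%d\n", nf, nr
        }
    '
}

# Toy input: r1 has read1, read2 and one extra (secondary) read1 alignment; r2 has read1 only.
printf 'r1\t99\tchr1\t100\nr1\t147\tchr1\t300\nr1\t355\tchr2\t500\nr2\t73\tchr1\t900\n' | count_unique_reads
# prints: fragments=2 reads=3
```

On real data this would be run as `samtools view ./accepted_hits.bam | count_unique_reads`. The awk arrays hold every name in memory, so for very large files the disk-based sort in the original pipeline may be the safer choice.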

REF:

https://www.biostars.org/p/84396/

https://www.biostars.org/p/12475/

http://seqanswers.com/forums/showthread.php?t=16500

http://sourceforge.net/p/samtools/mailman/message/31201762/

http://xushengwang.blogspot.com/2010/09/interpreting-samtools-flagstat-output.html

http://genomespot.blogspot.com/2014/09/data-analysis-step-3-align-paired-end.html

http://seqanswers.com/forums/showthread.php?t=19844

Original article: https://www.cnblogs.com/emanlee/p/4827096.html