Subsampling reads with seqtk

Subsampling reads (FASTQ files) with seqtk

Reads can be subsampled by fraction (for example, keeping a fraction of 0.014 of the reads):

seqtk sample name_1.fq.gz 0.014 > name_new_L1_1.fq

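Reads can also be subsampled by read count. This is recommended for small FASTQ files: with a large number of reads, memory usage also becomes large, so for a FASTQ file with a large amount of data it is better to subsample by fraction.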

seqtk sample name_1.fq.gz 10000 > name_new_1.fq
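
When a large file should be sampled by fraction but the goal is a rough read count, the fraction can be estimated from the total number of reads. The following is a minimal sketch, not part of seqtk: the function name fastq_fraction is hypothetical, and it assumes every FASTQ record is exactly 4 lines (no wrapped sequence lines). Sampling by fraction gives only approximately the target count.

```shell
# Hypothetical helper (not part of seqtk): print the fraction that yields
# roughly TARGET reads from a FASTQ file. Assumes 4-line FASTQ records.
fastq_fraction() {
    file=$1
    target=$2
    # Decompress on the fly when the file name ends in .gz.
    case "$file" in
        *.gz) gzip -dc "$file" ;;
        *)    cat "$file" ;;
    esac | awk -v target="$target" '
        END {
            total = NR / 4                  # 4 lines per read
            if (total == 0) { print "0"; exit }
            frac = target / total
            if (frac > 1) frac = 1          # target exceeds the file: keep everything
            printf "%.4f\n", frac
        }'
}

# Possible usage (file names taken from the commands above):
# frac=$(fastq_fraction name_1.fq.gz 10000)
# seqtk sample name_1.fq.gz "$frac" > name_new_1.fq
```

Counting reads requires one extra pass over the file, which is the cost of keeping the sampling step itself light on memory.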

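The -s option (random seed) keeps read1 and read2 paired: run seqtk sample on both files with the same seed. The seqtk README describes this case as "Subsample 10000 read pairs from two large paired FASTQ files (remember to use the same random seed to keep pairing)". The default value of -s is 11.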

seqtk sample -s100 read1.fq 10000 > sub1.fq
seqtk sample -s100 read2.fq 10000 > sub2.fq
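
After sampling both files, it can be worth confirming that the pairing survived. The following is a minimal sketch, not part of seqtk: the function names read_names and check_pairing are hypothetical, and it assumes 4-line FASTQ records whose read names differ between mates at most by a trailing /1 or /2.

```shell
# Hypothetical helpers (not part of seqtk). Assumes 4-line FASTQ records.

# Print one read name per record: take the header line (1st of every 4),
# drop the leading "@", any description after whitespace, and a trailing
# /1 or /2 mate suffix.
read_names() {
    awk 'NR % 4 == 1 { name = substr($1, 2); sub(/\/[12]$/, "", name); print name }' "$1"
}

# Return 0 only if both files list the same read names in the same order.
check_pairing() {
    a=$(mktemp) && b=$(mktemp) || return 2
    read_names "$1" > "$a"
    read_names "$2" > "$b"
    if cmp -s "$a" "$b"; then status=0; else status=1; fi
    rm -f "$a" "$b"
    return "$status"
}

# Possible usage (file names taken from the commands above):
# check_pairing sub1.fq sub2.fq && echo "pairs intact"
```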

Reference
https://awesomeopensource.com/project/lh3/seqtk

Original article: https://www.cnblogs.com/artesian0526/p/15735521.html