bwa index|amb|ann|bwt|pac|sa

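# Take the first 1000 lines of the assembly as a small test reference
head -n 1000 /xxx/DU-030-17.gapcloser.fa > t1.fa
# Build the index with prefix t1; stdout and stderr both go to the log file
bwa index -a bwtsw -p t1 t1.fa 1>t1.bwa_index.log 2>&1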

#$ ll
#total 292K
#-rw-r--r-- 1 XXX 638 Jul 23 10:55 t1.amb
#-rw-r--r-- 1 XXX 183 Jul 23 10:55 t1.ann
#-rw-r--r-- 1 XXX 98K Jul 23 10:55 t1.bwt
#-rw-r--r-- 1 XXX 99K Jul 23 10:54 t1.fa
#-rw-r--r-- 1 XXX 25K Jul 23 10:55 t1.pac
#-rw-r--r-- 1 XXX 49K Jul 23 10:55 t1.sa
#-rw-r--r-- 1 XXX  70 Jul 23 10:57 t1.bwa_index.log
#-rw-r--r-- 1 XXX   0 Jul 23 10:56 w.sh

If the input FASTA cannot be found (here t2.fa), bwa index stops with:

[bwa_idx_build] fail to open file 't2.fa' : No such file or directory
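The "fail to open file" error above can be caught before bwa is started. The following is a minimal sketch, not part of bwa itself: `require_fasta` is a hypothetical helper name, and the check (file exists and is non-empty) is an assumption about what is worth testing.

```shell
# Hypothetical helper: refuse to start indexing when the FASTA is missing or empty.
require_fasta() {
    fa=$1
    if [ ! -s "$fa" ]; then
        echo "input FASTA '$fa' is missing or empty" >&2
        return 1
    fi
}

# Usage (commented out so the sketch does not need bwa installed):
# require_fasta t1.fa && bwa index -a bwtsw -p t1 t1.fa 1>t1.bwa_index.log 2>&1
```

Because stderr is redirected into the log in the original command, a failed run is easy to miss; checking the input first makes the failure visible on the terminal.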

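Where:

The -a option selects the algorithm used to build the index:

  • bwtsw: for large references (>10 MB)
  • is: for reference sequences smaller than 2 GB (the default, -a is)

-a can also be left out; bwa index then picks a suitable indexing algorithm automatically, based on the genome size.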
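The rule of thumb above can be written down as a small shell sketch. This is only an illustration of the notes, not BWA's internal selection logic: `algo_for_size` and `pick_bwa_algo` are hypothetical names, the FASTA file size in bytes is used as a rough stand-in for the reference size, and the only cut-off applied is the 2 GB limit quoted for `is`.

```shell
# Hypothetical helper: map a reference size in bytes to a value for -a.
# Assumption: "is" below 2 GB, "bwtsw" otherwise (taken from the notes above).
algo_for_size() {
    limit=2147483648   # 2 GB
    if [ "$1" -lt "$limit" ]; then
        echo is
    else
        echo bwtsw
    fi
}

# Hypothetical wrapper: use the FASTA file size as a proxy for the reference size.
pick_bwa_algo() {
    [ -f "$1" ] || return 1
    size_bytes=$(wc -c < "$1")
    algo_for_size "$((size_bytes))"
}

# Usage (commented out so the sketch does not need bwa installed):
# bwa index -a "$(pick_bwa_algo t1.fa)" -p t1 t1.fa
```

In practice, leaving out -a and letting bwa index decide is simpler; the sketch is mainly useful for logging which algorithm a given reference would fall under.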

.amb: text file recording where N (or any other non-ATGC character) occurs in the reference FASTA.
.ann: text file recording the reference sequences: name, length, etc.
.bwt: binary file holding the Burrows-Wheeler transformed sequence.
.pac: binary file holding the packed sequence (four bases are encoded in one byte).
.sa: binary file holding the suffix array index.
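A quick sanity check after indexing is to confirm that all five files exist for the prefix, and that the .pac size is roughly a quarter of the base count (99K of FASTA gave a 25K .pac in the listing above). The sketch below assumes hypothetical helper names `check_bwa_index` and `expected_pac_bytes`; the size estimate is approximate, since the real .pac file may carry a few extra bookkeeping bytes.

```shell
# Hypothetical helper: succeed only if all five index files exist and are non-empty.
check_bwa_index() {
    prefix=$1
    for ext in amb ann bwt pac sa; do
        if [ ! -s "$prefix.$ext" ]; then
            echo "missing or empty: $prefix.$ext" >&2
            return 1
        fi
    done
}

# Hypothetical helper: estimate the packed size at four bases per byte, rounded up.
# Header lines (starting with '>') and line breaks are not counted as bases.
expected_pac_bytes() {
    bases=$(grep -v '^>' "$1" | tr -d '\r\n' | wc -c)
    echo $(( (bases + 3) / 4 ))
}

# Usage:
# check_bwa_index t1 && expected_pac_bytes t1.fa
```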

Original post: https://www.cnblogs.com/yuanjingnan/p/11230549.html