What --recode 12, --recode 01, and --output-missing-genotype do in PLINK

1. Prepare test data: 8 samples, 8 variants

[root@linuxprobe test]# cat outcome.ped
DOR     1       0       0       0       -9      A G     G G     G G     G C     G G     C C     C C     0 0
DOR     2       0       0       0       -9      G G     G G     A G     C C     G G     G C     C C     0 0
DOR     3       0       0       0       -9      G G     G G     A G     G C     C G     C C     G G     0 0
DOR     4       0       0       0       -9      G G     G G     G G     G G     C G     G G     G G     G G
DOR     5       0       0       0       -9      G G     G G     A G     G C     G G     C C     G G     G G
DOR     6       0       0       0       -9      G G     G G     A A     C C     G G     C C     G G     G G
DOR     7       0       0       0       -9      A A     G G     G G     C C     G G     C C     G G     A G
DOR     9       0       0       0       -9      A A     G G     G G     C C     G G     C C     G G     A G
[root@linuxprobe test]# cat outcome.map
1       snp1    0       55910
1       snp2    0       85204
1       snp3    0       122948
1       snp4    0       203750
1       snp5    0       312707
1       snp6    0       356863
1       snp7    0       400518
1       snp8    0       487423
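
Which allele counts as the minor one depends on allele frequencies in the data, so it can help to tally alleles per variant before recoding. The following is a minimal sketch using awk: count_alleles is a hypothetical helper (not a PLINK command), the labels snp1..snp8 are simply positional, and the test data is re-created in a temporary directory so the block runs on its own. PLINK's own --freq report remains the authoritative source.

```shell
# Hypothetical helper (not part of PLINK): count alleles per variant in a .ped file.
# Columns 1-6 are sample metadata; columns 7+ hold two alleles per variant; 0 means missing.
count_alleles() {
  awk '{
    for (i = 7; i <= NF; i++) {
      snp = int((i - 7) / 2) + 1          # 1-based variant index
      if (snp > n) n = snp
      if ($i != "0") count[snp, $i]++     # skip missing alleles
    }
  }
  END {
    split("A C G T", bases, " ")
    for (s = 1; s <= n; s++) {
      line = "snp" s
      for (b = 1; b <= 4; b++)
        if ((s, bases[b]) in count) line = line " " bases[b] "=" count[s, bases[b]]
      print line
    }
  }' "$1"
}

# Re-create the test data from above in a scratch directory.
tmp=$(mktemp -d)
cat > "$tmp/outcome.ped" <<'EOF'
DOR 1 0 0 0 -9 A G G G G G G C G G C C C C 0 0
DOR 2 0 0 0 -9 G G G G A G C C G G G C C C 0 0
DOR 3 0 0 0 -9 G G G G A G G C C G C C G G 0 0
DOR 4 0 0 0 -9 G G G G G G G G C G G G G G G G
DOR 5 0 0 0 -9 G G G G A G G C G G C C G G G G
DOR 6 0 0 0 -9 G G G G A A C C G G C C G G G G
DOR 7 0 0 0 -9 A A G G G G C C G G C C G G A G
DOR 9 0 0 0 -9 A A G G G G C C G G C C G G A G
EOF

counts=$(count_alleles "$tmp/outcome.ped")
printf '%s\n' "$counts"
```

For snp1 this reports A=5 and G=11, so A is the less frequent allele at that variant; snp2 shows only G, and snp8 has fewer counted alleles because three samples are missing there.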

2. --recode 12

plink --file outcome --recode 12 --out test;rm *.log *.nosex  ## --recode 12 recodes the minor allele (A1 by default) as 1 and the major allele (A2) as 2; missing genotypes stay 0
[root@linuxprobe test]# cat test.map
1       snp1    0       55910
1       snp2    0       85204
1       snp3    0       122948
1       snp4    0       203750
1       snp5    0       312707
1       snp6    0       356863
1       snp7    0       400518
1       snp8    0       487423
[root@linuxprobe test]# cat test.ped
DOR 1 0 0 0 -9 1 2 2 2 2 2 1 2 2 2 2 2 1 1 0 0
DOR 2 0 0 0 -9 2 2 2 2 1 2 2 2 2 2 1 2 1 1 0 0
DOR 3 0 0 0 -9 2 2 2 2 1 2 1 2 1 2 2 2 2 2 0 0
DOR 4 0 0 0 -9 2 2 2 2 2 2 1 1 1 2 1 1 2 2 2 2
DOR 5 0 0 0 -9 2 2 2 2 1 2 1 2 2 2 2 2 2 2 2 2
DOR 6 0 0 0 -9 2 2 2 2 1 1 2 2 2 2 2 2 2 2 2 2
DOR 7 0 0 0 -9 1 1 2 2 2 2 2 2 2 2 2 2 2 2 1 2
DOR 9 0 0 0 -9 1 1 2 2 2 2 2 2 2 2 2 2 2 2 1 2

3. --recode 01

[root@linuxprobe test]# plink --file outcome --recode 01 --out test;rm *.log *.nosex  ## used on its own, this fails with an error
PLINK v1.90b6.19 64-bit (16 Sep 2020)          www.cog-genomics.org/plink/1.9/
(C) 2005-2020 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to test.log.
Options in effect:
  --file outcome
  --out test
  --recode 01

23700 MB RAM detected; reserving 11850 MB for main workspace.
.ped scan complete (for binary autoconversion).
Performing single-pass .bed write (8 variants, 8 people).
--file: test-temporary.bed + test-temporary.bim + test-temporary.fam written.
8 variants loaded from .bim file.
8 people (0 males, 0 females, 8 ambiguous) loaded from .fam.
Ambiguous sex IDs written to test.nosex .
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 8 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.953125.
8 variants and 8 people pass filters and QC.
Note: No phenotypes present.
Error: The --recode '01' modifier normally has to be used with a nonzero
--output-missing-genotype setting.

4. --recode 01 + --output-missing-genotype

plink --file outcome --recode 01 --output-missing-genotype 9 --out test;rm *.log *.nosex  ## add the --output-missing-genotype option
[root@linuxprobe test]# cat test.ped  ## the minor allele becomes 0, the major allele becomes 1, and missing genotypes become 9
DOR 1 0 0 0 -9 0 1 1 1 1 1 0 1 1 1 1 1 0 0 9 9
DOR 2 0 0 0 -9 1 1 1 1 0 1 1 1 1 1 0 1 0 0 9 9
DOR 3 0 0 0 -9 1 1 1 1 0 1 0 1 0 1 1 1 1 1 9 9
DOR 4 0 0 0 -9 1 1 1 1 1 1 0 0 0 1 0 0 1 1 1 1
DOR 5 0 0 0 -9 1 1 1 1 0 1 0 1 1 1 1 1 1 1 1 1
DOR 6 0 0 0 -9 1 1 1 1 0 0 1 1 1 1 1 1 1 1 1 1
DOR 7 0 0 0 -9 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 1
DOR 9 0 0 0 -9 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 1
[root@linuxprobe test]# cat test.map
1       snp1    0       55910
1       snp2    0       85204
1       snp3    0       122948
1       snp4    0       203750
1       snp5    0       312707
1       snp6    0       356863
1       snp7    0       400518
1       snp8    0       487423
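
Comparing this output with the --recode 12 output from step 2 suggests that the two encodings differ only by relabeling: 1 becomes 0, 2 becomes 1, and the missing code 0 becomes 9. A small sketch to check this on one line, using only cut and tr (the variable names are illustrative):

```shell
# First line of the --recode 12 output shown in step 2 (sample DOR 1).
line12='DOR 1 0 0 0 -9 1 2 2 2 2 2 1 2 2 2 2 2 1 1 0 0'

# Leave the six metadata columns alone and relabel only the genotype
# columns (7 onward): 1 -> 0, 2 -> 1, 0 -> 9.
meta=$(printf '%s\n' "$line12" | cut -d' ' -f1-6)
geno=$(printf '%s\n' "$line12" | cut -d' ' -f7- | tr '120' '019')
line01="$meta $geno"

printf '%s\n' "$line01"
```

The result is identical to the first line of test.ped above. Restricting the relabeling to columns 7 onward matters, because the metadata columns also contain 0 and would otherwise be rewritten.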

 

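Conclusion: --recode 12: recodes the minor allele (A1 by default) as 1 and the major allele (A2) as 2; missing genotypes stay 0.

         --recode 01: must be combined with a nonzero --output-missing-genotype setting; it recodes the minor allele (A1) as 0 and the major allele (A2) as 1. --output-missing-genotype sets the character written for missing genotypes, which has to differ from 0 here so that missing calls are not confused with A1.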
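
The recoding rule summarized above can be sketched in awk to make the logic explicit. This is an illustration under stated assumptions, not PLINK's implementation: recode_ped is a hypothetical helper, it assumes at most two alleles per variant, it takes the less frequent observed allele as the minor one, and its handling of exact frequency ties is arbitrary. The test data is re-created so the block runs on its own.

```shell
# Hypothetical helper: recode_ped FILE A1_CODE A2_CODE MISSING_CODE
# Reads FILE twice: pass 1 counts alleles, pass 2 writes the recoded lines.
recode_ped() {
  awk -v a1="$2" -v a2="$3" -v miss="$4" '
    NR == FNR {                                   # pass 1: tally alleles per variant
      for (i = 7; i <= NF; i++) {
        if ($i == "0") continue                   # 0 marks a missing allele
        s = int((i - 7) / 2)
        if (!(s in first)) first[s] = $i
        else if ($i != first[s] && !(s in second)) second[s] = $i
        cnt[s, $i]++
      }
      next
    }
    {                                             # pass 2: emit recoded genotypes
      out = $1 " " $2 " " $3 " " $4 " " $5 " " $6
      for (i = 7; i <= NF; i++) {
        s = int((i - 7) / 2)
        if ($i == "0") code = miss
        else {
          minor = ""                              # monomorphic variant: no minor allele
          if (s in second)
            minor = (cnt[s, first[s]] < cnt[s, second[s]]) ? first[s] : second[s]
          code = ($i == minor) ? a1 : a2
        }
        out = out " " code
      }
      print out
    }' "$1" "$1"
}

tmp=$(mktemp -d)
cat > "$tmp/outcome.ped" <<'EOF'
DOR 1 0 0 0 -9 A G G G G G G C G G C C C C 0 0
DOR 2 0 0 0 -9 G G G G A G C C G G G C C C 0 0
DOR 3 0 0 0 -9 G G G G A G G C C G C C G G 0 0
DOR 4 0 0 0 -9 G G G G G G G G C G G G G G G G
DOR 5 0 0 0 -9 G G G G A G G C G G C C G G G G
DOR 6 0 0 0 -9 G G G G A A C C G G C C G G G G
DOR 7 0 0 0 -9 A A G G G G C C G G C C G G A G
DOR 9 0 0 0 -9 A A G G G G C C G G C C G G A G
EOF

out12=$(recode_ped "$tmp/outcome.ped" 1 2 0)   # mimics --recode 12
out01=$(recode_ped "$tmp/outcome.ped" 0 1 9)   # mimics --recode 01 with missing code 9
printf '%s\n' "$out12"
printf '%s\n' "$out01"
```

On this data set the two outputs match the test.ped files shown in steps 2 and 4. Passing 0 as the missing code together with the 0/1 allele codes would make missing calls indistinguishable from A1, which is the situation the PLINK error in step 3 guards against.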

Original source: https://www.cnblogs.com/liujiaxin2018/p/13795230.html