Comparison of long read sequencing technologies in resolving bacteria and fly genomes

Eric S. Tvedte, Mark Gasser, Benjamin C. Sparklin, Jane Michalski, Xuechu Zhao, Robin Bromley, Luke J. Tallon, Lisa Sadzewicz, David A. Rasko, Julie C. Dunning Hotopp
 

ABSTRACT

Background The newest generation of DNA sequencing technology is highlighted by the ability to sequence reads hundreds of kilobases in length, and the increased availability of long read data has democratized the genome sequencing and assembly process. PacBio and Oxford Nanopore Technologies (ONT) have pioneered competitive long read platforms, with more recent work focused on improving sequencing throughput and per-base accuracy. Released in 2019, the PacBio Sequel II platform is advertised as offering substantial enhancements over previous PacBio systems.

Results We used whole-genome sequencing data produced by two PacBio platforms (Sequel II and RS II) and two ONT protocols (Rapid Sequencing and Ligation Sequencing) to compare assemblies of the bacterium Escherichia coli and the fruit fly Drosophila ananassae. Sequel II assemblies had higher contiguity and consensus accuracy than those from the other methods, even after accounting for differences in sequencing throughput. ONT Rapid Sequencing libraries had the fewest chimeric reads and quantified E. coli plasmids better than ligation-based libraries. Assembly quality can be enhanced by adopting hybrid approaches: adding Illumina libraries for bacterial genome assemblies, or combining ONT and Sequel II libraries for eukaryotic genome assemblies. Genome-wide DNA methylation could be detected using both technologies; however, ONT libraries enabled the identification of a broader range of known E. coli methyltransferase recognition motifs, in addition to undocumented D. ananassae motifs.

Conclusions The ideal choice of long read technology may depend on several factors including the question or hypothesis under examination. No single technology outperformed others in all metrics examined.

Competing Interest Statement

The authors have declared no competing interest.

Original source: https://www.cnblogs.com/wangprince2017/p/13755973.html