Efficient Hybrid De Novo Error Correction and Assembly for Long Reads

Abstract:

Long reads produced by Oxford Nanopore sequencing technology, and in particular by its MinION sequencer, have changed the next-generation sequencing landscape. The MinION produces long reads at low cost and with high throughput. However, these reads have a high error rate, which degrades the quality of any downstream analysis. One way to correct long reads is to exploit the high coverage and high accuracy of short reads generated by second-generation sequencing technology. Here, we present MiRCA (MinIon Reads Correction Algorithm), a hybrid approach based on sequence alignment that detects and corrects errors in MinION long reads using preassembled Illumina short reads. With this error correction approach, we were able to perform an effective and fast de novo assembly. Experiments on the Saccharomyces cerevisiae and Escherichia coli genomes show that MiRCA performs much better than the available tools. MiRCA has been tested on Linux platforms and is freely available at https://github.com/Mkchouk/MiRCA.
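
To make the idea of alignment-based hybrid correction concrete, here is a minimal toy sketch. It is not MiRCA's implementation, which the abstract does not detail. It only illustrates the general principle: an accurate contig assembled from short reads is aligned to a noisy long read, and the matched region of the long read is replaced by the contig. The function names (`fit_align`, `correct_read`), the scoring values, and the `min_score_frac` threshold are all assumptions chosen for illustration. Real pipelines rely on dedicated aligners and assemblers rather than a hand-written dynamic program.

```python
def fit_align(contig, read, match=1, mismatch=-1, gap=-1):
    """Toy fit alignment: align ALL of `contig` to the best substring of `read`.

    Returns (score, start, end) so that read[start:end] is the aligned span.
    Scoring values are illustrative assumptions, not taken from the paper.
    """
    n, m = len(contig), len(read)
    # score[i][j]: best score aligning contig[:i] to a read substring ending at j
    score = [[0] * (m + 1) for _ in range(n + 1)]
    # start[i][j]: read position where that substring begins
    start = [list(range(m + 1)) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap  # contig prefix aligned against nothing
        start[i][0] = 0
        for j in range(1, m + 1):
            same = contig[i - 1] == read[j - 1]
            diag = score[i - 1][j - 1] + (match if same else mismatch)
            up = score[i - 1][j] + gap    # contig base missing from the read
            left = score[i][j - 1] + gap  # extra base inserted in the read
            best = max(diag, up, left)
            score[i][j] = best
            if best == diag:
                start[i][j] = start[i - 1][j - 1]
            elif best == up:
                start[i][j] = start[i - 1][j]
            else:
                start[i][j] = start[i][j - 1]
    # The whole contig must be used; the end position in the read is free.
    end = max(range(m + 1), key=lambda j: score[n][j])
    return score[n][end], start[n][end], end


def correct_read(read, contigs, min_score_frac=0.5):
    """Replace each well-aligned span of `read` with the matching contig.

    `min_score_frac` is a hypothetical acceptance threshold: a contig is used
    only if its alignment score reaches this fraction of its length.
    """
    corrected = read
    for contig in contigs:
        score, s, e = fit_align(contig, corrected)
        if score >= min_score_frac * len(contig):
            corrected = corrected[:s] + contig + corrected[e:]
    return corrected


if __name__ == "__main__":
    noisy_read = "ACGTACGGACATGCAGGCT"  # one substitution and one deletion
    contig = "ACGGTCATTGCA"             # accurate short-read contig
    print(correct_read(noisy_read, [contig]))
```

The sketch keeps the uncovered flanks of the long read untouched and corrects only the span that a contig covers, which is why high short-read coverage matters for this family of methods.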


Original article: https://www.cnblogs.com/wangprince2017/p/13756589.html