Easy assembly of Oxford Nanopore or PacBio reads and contig polishing with high-quality Illumina reads
The recently released tutorial for handling long, error-prone reads in QIAGEN CLC Genomics Workbench 20 shows how to perform a hybrid assembly that combines long, error-prone reads with high-quality short reads. It also teaches you how to assess the quality of the assembly against a reference using the Whole Genome Alignment plugin (read our blog post to learn more).
For the supplied dataset, the hybrid assembly follows a simple workflow (Figure 1).
The first step is to run the De Novo Assemble Long Reads (beta) tool. Using only Oxford Nanopore reads, it produces an almost perfect assembly: after alignment to the expected reference, the alignment percentage (AP) is 99.61% and the average nucleotide identity (ANI) is 99.92%. Contig polishing with the supplied Illumina reads, using the Polish with Reads (beta) tool, improves these quality measures to AP 99.81% and ANI 99.93%. A single misassembly is identified along the way (Figure 2), illustrating a case where the reads need to be corrected before assembly.
After read correction and assembly, the misassembly disappears (Figure 3).
The reduction of errors in the reads can be visualized in a read mapping to the reference genome using the Map Long Reads to Reference (beta) tool (Figure 4).
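To make the two quality measures quoted above more concrete, here is a minimal Python sketch of how an alignment percentage and an average nucleotide identity could be computed from a set of aligned blocks. It is an illustration only: the `AlignedBlock` type, the function names, and the simplified definitions (AP as the aligned fraction of the genome, ANI as the fraction of matching bases within aligned regions) are assumptions made for this example, and the Whole Genome Alignment plugin may calculate its values differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AlignedBlock:
    """Hypothetical record for one aligned region between assembly and reference."""
    length: int   # number of aligned bases in the block
    matches: int  # number of identical bases within the block


def alignment_percentage(blocks: List[AlignedBlock], genome_length: int) -> float:
    """Assumed definition: share of the genome covered by aligned blocks, in percent."""
    aligned = sum(b.length for b in blocks)
    return 100.0 * aligned / genome_length


def average_nucleotide_identity(blocks: List[AlignedBlock]) -> float:
    """Assumed definition: share of identical bases within aligned blocks, in percent."""
    aligned = sum(b.length for b in blocks)
    matches = sum(b.matches for b in blocks)
    return 100.0 * matches / aligned


# Toy numbers, not taken from the tutorial dataset.
toy_blocks = [AlignedBlock(length=600, matches=594), AlignedBlock(length=300, matches=300)]
toy_ap = alignment_percentage(toy_blocks, genome_length=1000)
toy_ani = average_nucleotide_identity(toy_blocks)
print(f"AP = {toy_ap:.2f}%, ANI = {toy_ani:.2f}%")
```

Under these assumed definitions, AP drops when parts of the genome fail to align at all, while ANI drops when aligned regions contain mismatches, which is why the two values are reported side by side.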
Found this blog useful? Stay tuned for regular blog posts, full of valuable information!