How to accurately quantify miRNAs and increase fraction of annotated reads

Author:

Leif Schauser

Introduction to the QIAseq miRNA library kit support in the Biomedical Genomics Analysis plugin for CLC Genomics Workbench

The QIAseq miRNA library kit

In an independent comparative study (1), four miRNA NGS library preparation kits from different vendors were assessed. The results showed that the QIAGEN QIAseq miRNA kit was the superior choice on all parameters benchmarked. Here, we discuss bioinformatics support for NGS data generated with this kit through a dedicated miRNA analysis pipeline.

A spike-in kit is available for quality control of library preparation. Its use is optional but highly recommended. When spike-ins are included, the corresponding QC is performed as part of the bioinformatics analysis.

Algorithmic steps

The QIAseq miRNA NGS library preparation kit uses Unique Molecular Identifiers (UMIs), which enable precise quantification of nucleic acids in challenging samples such as FFPE, plasma and serum, where the analyte is often rare and highly degraded. Each captured molecule is tagged with a UMI, so that reads derived from the same original molecule can be recognized and amplification bias and other library preparation artifacts can be kept to a minimum.
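The counting idea behind UMIs can be sketched in a few lines of Python. This is a minimal illustration, not the plugin's implementation: it assumes the UMI and the insert sequence have already been extracted from each read, it ignores sequencing errors in the UMI, and the function name `count_molecules` and the toy UMIs are hypothetical.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count distinct UMIs observed for each insert sequence.

    `reads` is an iterable of (umi, insert_sequence) pairs. Reads that
    share both UMI and insert are treated as PCR copies of one original
    molecule (simplifying assumption: no error correction of UMIs).
    """
    umis_per_insert = defaultdict(set)
    for umi, insert in reads:
        umis_per_insert[insert].add(umi)
    return {insert: len(umis) for insert, umis in umis_per_insert.items()}


# Toy input: five reads, but only three distinct (UMI, insert) combinations.
reads = [
    ("AAACCCGGGTTT", "TGAGGTAGTAGGTTGTATAGTT"),
    ("AAACCCGGGTTT", "TGAGGTAGTAGGTTGTATAGTT"),  # PCR duplicate
    ("AAACCCGGGTTT", "TGAGGTAGTAGGTTGTATAGTT"),  # PCR duplicate
    ("TTTGGGCCCAAA", "TGAGGTAGTAGGTTGTATAGTT"),
    ("ACGTACGTACGT", "TAGCTTATCAGACTGATGTTGA"),
]
molecule_counts = count_molecules(reads)
```

Counting raw reads would report four copies of the first sequence, whereas counting UMIs reports two, which is the number of original molecules under the stated assumptions.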

In this implementation, both Illumina and Ion Torrent reads can be used, in conjunction with a metadata table describing the samples. The QIAseq miRNA Quantification ready-to-use workflow quantifies miRNA expression for each sample. An overview of the miRNAs identified across all samples can be generated with the Create Combined miRNA Report tool. The QIAseq miRNA Differential Expression ready-to-use workflow then estimates which miRNAs are differentially expressed, using the same differential expression tools as the RNA-seq workflows. Options and parameters can be changed freely at every step of the workflows.

Unlike most miRNA bioinformatics analysis pipelines, which first map reads to the reference genome, the CLC Genomics Workbench QIAseq miRNA workflow first creates UMI consensus reads. It then annotates these consensus reads using the Small RNA Reference Data, a data set that contains the miRBase database and the spike-in sequence information. In the next release, the software will add options for including custom databases (for example, piRNABank for additional functional annotation, and rRNA and tRNA sequences to account for sources of non-miRNA reads).
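The annotate-after-consensus order can be illustrated with a small Python sketch. It assumes exact sequence matching against a tiny in-memory reference, which is a simplification; the workflow's actual matching rules are not described here. The `REFERENCE` table, the `annotate` function and the spike-in entry are hypothetical and stand in for the Small RNA Reference Data.

```python
from collections import Counter

# Hypothetical miniature reference: sequence -> (category, name).
# The spike-in sequence is invented for illustration only.
REFERENCE = {
    "TGAGGTAGTAGGTTGTATAGTT": ("miRNA", "hsa-let-7a-5p"),
    "TAGCTTATCAGACTGATGTTGA": ("miRNA", "hsa-miR-21-5p"),
    "ACGTTGCAAGCTTGACCATGGA": ("spike-in", "toy-spike-1"),
}


def annotate(consensus_reads):
    """Tally consensus reads per (category, name) using exact lookup."""
    counts = Counter()
    for sequence in consensus_reads:
        counts[REFERENCE.get(sequence, ("unannotated", None))] += 1
    return counts


# One entry per UMI consensus read (toy data).
consensus_reads = [
    "TGAGGTAGTAGGTTGTATAGTT",
    "TGAGGTAGTAGGTTGTATAGTT",
    "TAGCTTATCAGACTGATGTTGA",
    "ACGTTGCAAGCTTGACCATGGA",
    "GGGGGGGGGGGGGGGGGGGG",  # not in the reference
]
counts = annotate(consensus_reads)
annotated = len(consensus_reads) - counts[("unannotated", None)]
annotated_fraction = annotated / len(consensus_reads)
```

Because annotation runs on consensus reads rather than raw reads, every count in the tally already represents one original molecule, and the annotated fraction can be read directly from it.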

Figure 1. Schematic overview of the bioinformatics workflow for the analysis of NGS data generated using the QIAseq miRNA library preparation kit.

Output

In the Grouped on mature expression table, there is a row for each mature miRNA in the database, which gives insights into which specific miRNA genes are regulated.

miRNA seeds are the active agents regulating biological pathways, no matter which member of a family of similar miRNA genes the seed originates from. The Grouped on seed expression table has a row for each identified miRNA seed. This table supports further analysis in Ingenuity Pathway Analysis.
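Grouping on seed can be sketched as follows. The example assumes the common convention that the seed is nucleotides 2 to 8 of the mature sequence; the exact definition used by the plugin may differ. The function names and the counts are hypothetical.

```python
from collections import defaultdict


def seed_of(mature_sequence):
    """Return nucleotides 2-8 (1-based) of a mature miRNA sequence.

    Assumption: a 7-nt seed at positions 2-8; other conventions exist.
    """
    return mature_sequence[1:8]


def group_on_seed(mature_table):
    """Sum counts of all mature miRNAs that share the same seed."""
    seed_counts = defaultdict(int)
    for sequence, count in mature_table.values():
        seed_counts[seed_of(sequence)] += count
    return dict(seed_counts)


# Toy "grouped on mature" table: name -> (mature sequence, count).
# The counts are invented for illustration.
mature_table = {
    "hsa-let-7a-5p": ("UGAGGUAGUAGGUUGUAUAGUU", 100),
    "hsa-let-7b-5p": ("UGAGGUAGUAGGUUGUGUGGUU", 50),
    "hsa-miR-21-5p": ("UAGCUUAUCAGACUGAUGUUGA", 40),
}
seed_table = group_on_seed(mature_table)
```

The two let-7 family members collapse into a single row for the shared seed, while miR-21-5p keeps its own row, which mirrors the relationship between the two expression tables.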

A Heat Map, Venn Diagram, GO Gene Set Test, and Expression Browser are generated to compare miRNA expression between the groups specified in the metadata.

Conclusion

We have reanalyzed the data from Coenen-Stass et al. (1) and found that the pipeline described here yields accurate quantification of miRNAs. Compared to the findings reported by Coenen-Stass et al., it increases both the fraction of miRNA-annotated reads and the number of detected miRNAs. A reduction in the coefficient of variation was also observed. Tailoring and tuning the bioinformatics tasks to NGS data from a specific QIAseq kit thus increases the quality of the analysis.
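For readers who want to compute the same kind of statistic on their own replicates, the coefficient of variation is the standard deviation divided by the mean. The sketch below uses the sample standard deviation, which is an assumption, since the variant used in the reanalysis is not stated here. The replicate values are invented.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the sample standard deviation divided by the mean."""
    average = mean(values)
    if average == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(values) / average


# Invented counts for one miRNA across three replicates.
tight_replicates = [100, 110, 90]
noisy_replicates = [100, 150, 50]

cv_tight = coefficient_of_variation(tight_replicates)
cv_noisy = coefficient_of_variation(noisy_replicates)
```

Both replicate sets have the same mean, so the lower CV of the first set reflects more reproducible quantification only.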

Try the miRNA NGS analysis using CLC Genomics Workbench tutorial.

Check out the related QIAGEN liquid biopsy resources featuring exosome research resources and cfmiRNA solutions.

Plugin availability

The Biomedical Genomics Analysis plugin is freely available and can be downloaded and installed directly on the CLC Genomics Workbench via the Plugin Manager. Plugin files can also be downloaded from our plugins webpage for installation on the CLC Genomics Workbench or CLC Genomics Server.

References

  1. Coenen-Stass, A. M. L. et al. 2018. Evaluation of methodologies for microRNA biomarker detection by next generation sequencing. RNA Biol 15, 1133–1145.