

Author: QIAGEN Digital Insights

October 10, 2016

Lasting expressions

Evolving best practices for RNA-seq data analysis

In few areas of genomics do best practices evolve as quickly and continuously as in RNA-seq.

As a consequence of the rapid development within RNA-seq, researchers struggle to ensure that their analysis pipelines meet the latest standards. This typically means testing and integrating the best performers among a growing number of analysis solutions. In their daily routine, users often run a mix of different tools, each chosen for the analysis step it performs best, from read mapping through isoform quantification to the detection of differential abundance.

Make lasting expressions with your research – one integrated solution for your RNA-seq analysis

RNA-seq analysis is a declared focus area for us. Users of CLC Genomics Workbench and Biomedical Genomics Workbench rely on us to constantly evaluate emerging bioinformatics approaches and integrate the leading ones into our solutions in a way that follows modern design control and quality assurance criteria.

Here we share some of the recent improvements and underlying methods implemented in our RNA-seq solution.

Our Advanced RNA-Seq solution

Make best use of gene annotations

Reads are simultaneously mapped to the genome and the transcriptome before being combined into a unified picture of expression. This brings two advantages:

The use of the transcriptome ensures that short exons are detected, and allows isoforms to be distinguished based on paired read distances. Mapping to the genome prevents reads originating from intronic regions from being incorrectly mapped to transcripts.

The mapping algorithm we use for RNA-seq is designed with a focus on accuracy in downstream variant calling. It is closely related to the Map Reads to Reference tool, so variants called in DNA and RNA can be compared without bias.

Benefit from a stranded protocol

Stranded library preparation reduces the uncertainty associated with assigning a read to an isoform, and can reveal antisense regulation of a gene. To account for imperfect efficiency of the protocol, reads are mapped to both strands, and those whose highest-scoring alignment is in the incorrect orientation are filtered out.
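The filtering step can be illustrated with a minimal Python sketch. It is not the Workbench implementation: the `Alignment` record, the `filter_by_strand` function and the tie-breaking rule (the first alignment seen wins on equal scores) are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    read_id: str
    strand: str  # "+" or "-", the orientation of this alignment
    score: int   # alignment score, higher is better


def filter_by_strand(alignments, expected_strand):
    """Keep each read's best alignment only if it has the expected orientation."""
    best = {}
    for aln in alignments:
        current = best.get(aln.read_id)
        # Remember the highest-scoring alignment per read (first one wins ties).
        if current is None or aln.score > current.score:
            best[aln.read_id] = aln
    # Discard reads whose best alignment lies on the unexpected strand.
    return [aln for aln in best.values() if aln.strand == expected_strand]


alignments = [
    Alignment("r1", "+", 50), Alignment("r1", "-", 30),
    Alignment("r2", "+", 20), Alignment("r2", "-", 45),
]
kept = filter_by_strand(alignments, expected_strand="+")
print([aln.read_id for aln in kept])
```

In this toy input, read `r2` aligns best in the unexpected orientation and is therefore dropped, while `r1` is kept.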

Accurate isoform quantification

A typical RNA sequencing read could originate from several isoforms of a gene. An Expectation-Maximization algorithm, similar to that of RSEM, is used to estimate which isoforms are actually expressed. The algorithm generalizes the human intuition that if some reads can only originate from isoform A or B, and others can only originate from isoform B or C, then it is most likely that B is expressed.
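The A/B/C intuition can be reproduced with a deliberately simplified Expectation-Maximization sketch in Python. This is an illustration only, not RSEM or the Workbench algorithm: it ignores isoform lengths, alignment quality and fragment size distributions, and the function name `em_isoform_abundance` and its parameters are hypothetical.

```python
def em_isoform_abundance(read_compat, isoforms, n_iter=100):
    """Estimate relative isoform abundances from ambiguous reads.

    read_compat: list of sets, each holding the isoforms a read could come from.
    Returns a dict mapping isoform -> estimated fraction of reads.
    """
    # Start from a uniform guess.
    theta = {iso: 1.0 / len(isoforms) for iso in isoforms}
    for _ in range(n_iter):
        counts = {iso: 0.0 for iso in isoforms}
        # E-step: split each read across its compatible isoforms
        # in proportion to the current abundance estimates.
        for compat in read_compat:
            total = sum(theta[iso] for iso in compat)
            if total == 0:
                continue  # no compatible isoform has support; skip the read
            for iso in compat:
                counts[iso] += theta[iso] / total
        # M-step: re-estimate abundances from the expected read counts.
        n = sum(counts.values())
        theta = {iso: counts[iso] / n for iso in isoforms}
    return theta


# Ten reads compatible with A or B, ten reads compatible with B or C.
reads = [{"A", "B"}] * 10 + [{"B", "C"}] * 10
abundance = em_isoform_abundance(reads, ["A", "B", "C"])
print(abundance)
```

With this input the estimate for B approaches 1 while A and C shrink towards 0, matching the intuition that B alone explains all reads.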

Results on the benchmark of Teng et al. (2016) showing the consistency of isoform quantification (left) and the accuracy of detected fold changes (right) for several open source methods and for QIAGEN. In both plots, the x-axis shows the expression level, and the performance of all methods improves for isoforms with higher expression. Perfect quantification would be the line y=0 for the plot on the left, and y=1 for the plot on the right. Flux Capacitor, used by the GTEx consortium, performs worst on these benchmarks.
The QIAGEN quantification approach overlaps with the best performing open source tool, RSEM. Data for the open source methods were kindly provided by the authors of the benchmark.

Layer metadata on visualizations

Samples can be associated with editable metadata at any stage of analysis. The information in the metadata flows automatically into statistics and visualizations to provide insight into patterns of expression and the presence of confounding factors.

Intuitive analysis of differential expression between samples or sample groups. Results are depicted in tables, 2-D heat maps, or Venn diagrams. The results are linked, so selecting a set of differentially expressed features reveals the corresponding information in other tables or plots. Differential expression of genes or transcripts can be compared among samples in the genome browser view (track list).

The genome browser view makes it easy to visualize genes or transcripts that are differentially expressed between samples. Color coding reveals differences in gene expression. Fold changes in expression are log-transformed and converted into color space.
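How a log-transformed fold change might be turned into a color can be sketched as follows. The blue-white-red scale, the saturation limit `max_abs_log2` and the function name `fold_change_to_rgb` are assumptions for illustration; the actual color scheme of the genome browser view is not described here.

```python
import math


def fold_change_to_rgb(fold_change, max_abs_log2=3.0):
    """Map a fold change to an (R, G, B) tuple on a blue-white-red scale."""
    log_fc = math.log2(fold_change)
    # Clamp to [-1, 1] so extreme fold changes saturate instead of overflowing.
    t = max(-1.0, min(1.0, log_fc / max_abs_log2))
    fade = int(round(255 * (1 - abs(t))))
    if t >= 0:
        return (255, fade, fade)  # up-regulated: white towards red
    return (fade, fade, 255)      # down-regulated: white towards blue


for fc in (0.125, 0.5, 1.0, 2.0, 8.0):
    print(fc, fold_change_to_rgb(fc))
```

Working on the log scale makes the mapping symmetric: a two-fold increase and a two-fold decrease are equally far from white.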

Detect differential expression

Differential expression is detected based on metadata associated with samples. The statistics are based on the fit of a Generalized Linear Model with a negative binomial distribution, similar to edgeR or DESeq2. The model supports paired designs, and can control for batch effects.
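For orientation, the general form of a negative binomial GLM, as commonly used by tools of this kind, can be written as below. The notation is an assumption chosen for illustration and is not necessarily the exact parametrization used in the Workbench: $K_{ij}$ is the read count of feature $i$ in sample $j$, $s_j$ a sample-specific normalization factor, $x_{jr}$ the design matrix built from the metadata, $\beta_{ir}$ the coefficients and $\alpha_i$ the dispersion.

```latex
\begin{aligned}
K_{ij} &\sim \mathrm{NB}(\mu_{ij}, \alpha_i) \\
\log \mu_{ij} &= \log s_j + \sum_{r} x_{jr}\,\beta_{ir} \\
\operatorname{Var}(K_{ij}) &= \mu_{ij} + \alpha_i\,\mu_{ij}^{2}
\end{aligned}
```

In this formulation, a paired design or a batch effect is handled by adding indicator columns for the pair or batch to the design matrix, so that the coefficient of interest is estimated after accounting for them.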

Differential expression can also be detected in exploratory studies where no replicates are available. In this case the algorithm shares data between isoforms with similar expression to estimate technical and biological variability.

We have carried out a comparative study to ensure that results generated with our solutions are in line with results generated with leading alternative methods.

Comparison to the DESeq2 benchmark of Love et al. (2014). This plot is equivalent to two panels of figure 6 of the paper by Love et al. but with QIAGEN added, and edgeR modified to run with quasi-likelihood testing. Data are simulated such that 20% of isoforms have a three-fold change in an experiment with 3 or 5 replicates (left and right plots respectively). The experiment is repeated six times. The sensitivity is the fraction of differentially expressed isoforms that are detected. The false discovery rate (FDR) is the fraction of isoforms called differentially expressed that are not truly differentially expressed. A perfect method would have the highest possible sensitivity while lying on the black line (which is the target FDR, here set to 0.1). On these data DESeq and edgeR lie to the left of the target error rate, meaning they are too conservative. DESeq2 and QIAGEN perform best on this benchmark, with DESeq2 performing favourably with lower numbers of replicates. Differences between methods become smaller as the size of the fold change or the number of replicates increases.
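The two metrics used in this comparison can be computed from a set of calls and a known truth set, as in the short Python sketch below. The function name `sensitivity_and_fdr` and the isoform identifiers are hypothetical; FDR is taken here in its usual sense, false positives divided by all calls.

```python
def sensitivity_and_fdr(called, truly_de):
    """Return (sensitivity, FDR) for a set of calls against a known truth set."""
    called, truly_de = set(called), set(truly_de)
    true_pos = len(called & truly_de)   # correctly called isoforms
    false_pos = len(called - truly_de)  # called, but not truly differential
    sensitivity = true_pos / len(truly_de) if truly_de else 0.0
    fdr = false_pos / len(called) if called else 0.0
    return sensitivity, fdr


truth = {"iso1", "iso2", "iso3", "iso4"}
calls = {"iso1", "iso2", "iso3", "iso9"}
print(sensitivity_and_fdr(calls, truth))
```

In this toy example three of the four truly differential isoforms are found and one of the four calls is wrong, giving a sensitivity of 0.75 and an FDR of 0.25.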

References

Teng, M., Love, M. I., Davis, C. A., Djebali, S., Dobin, A., et al. (2016). A benchmark for RNA-seq quantification pipelines. Genome Biology, 17(1), 74. doi:10.1186/s13059-016-0940-1

Love, M. I., Huber, W., & Anders, S. (2014). Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology, 15(12), 550. doi:10.1186/s13059-014-0550-8
