The big data prescription for cancer drug discovery

The cancer drug discovery landscape is shifting. Research and development are growing in cost and complexity. New treatment strategies, including biomarker-driven therapies and cancer immunotherapies, are proving more effective. Yet, a record number of drugs aren’t making it to clinical trials. This article examines how biopharmaceutical researchers can leverage genomic knowledge bases to bring better drugs to more patients in less time.

A recent report found that of every 10 investigational cancer drugs entering Phase 1 human clinical trials, only one reaches the market, after an average of 14 years and at a cost of $2.7 billion (1). In today’s drug development landscape, late-stage failure is primarily the result of insufficient efficacy (2). To improve the drug discovery process, develop more effective clinical trials, and enhance the treatment of rare cancers, biopharmaceutical companies are turning to big data.

As pharmaceutical research and discovery suffer from both declining success rates and a stagnant pipeline, big data and the analytics that go with it could be a key element of the cure. Across all industries, from healthcare to manufacturing and information technology, mining big data is proving to be today’s most valuable “gold rush”. However, as with gold and precious gems, quality is paramount.

In the pharmaceutical industry, high-quality data can help companies better identify new potential drug candidates, refine clinical trial design and enrollment, and develop approved and reimbursed therapies faster. Relying on “bad” data or insufficient evidence, though, can lead to wrong conclusions, wasted funds, and significant risk. So how can biopharmaceutical companies find the gold among the dross?

Understanding data integrity in cancer drug discovery

In today’s digital gold rush, data integrity is priceless. Data integrity is the assurance that data are accurate, consistent, and traceable. In the biopharmaceutical industry, data include everything from genetic information and protein structures to clinical trial demographics and patient responses.

During the data lifecycle, these data are created, recorded, and often transferred from one system to another before being used. Sources include scientific literature, public and proprietary genomic knowledge bases, international genome consortia, clinical studies, professional guidelines, drug labels, and real-world patient cases. Using these data gives pharma companies deeper insight into whether a cancer drug in development may need additional customization and improvement based on traits of the potential end users. But data integrity requires that every step of the process be reliable, clear, and transparent.
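One way to make "traceable" concrete when records move between systems is to fingerprint each record at creation and re-check the fingerprint after every transfer. The following is a minimal sketch using only the Python standard library. The record fields, the source label, and the helper names (`fingerprint`, `stamp`, `is_intact`) are assumptions made for illustration; they do not describe how any particular knowledge base handles provenance.

```python
import hashlib
import json


def fingerprint(record: dict) -> str:
    """Return a SHA-256 digest of a record's canonical JSON form."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def stamp(record: dict, source: str) -> dict:
    """Wrap a record with its source label and a fingerprint taken at creation."""
    return {"source": source, "sha256": fingerprint(record), "data": record}


def is_intact(stamped: dict) -> bool:
    """Check that the data still match the fingerprint taken at creation."""
    return fingerprint(stamped["data"]) == stamped["sha256"]


# Hypothetical record: field names, values, and source label are invented.
original = stamp(
    {"gene": "PIK3CA", "variant": "H1047R", "tumor_type": "breast"},
    source="example-lab-export",
)

# Simulate a transfer that silently alters one field.
altered = {**original, "data": {**original["data"], "variant": "H1047L"}}

print(is_intact(original))  # the untouched record passes the check
print(is_intact(altered))   # the altered record fails it
```

A check like this only detects change; it does not say whether the original value was accurate, which is where expert curation and review come in.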

Poor-quality data in any area can affect study results, patient safety, business credibility, and the regulatory process. They can also lead to poor decision-making through misleading insights and incorrect models. In fact, data integrity as it relates to patient safety is one of the key concerns of the Food and Drug Administration (FDA), whose top 10 citation types in recent years have been related to data quality (3,4). Therefore, biopharmaceutical companies must leverage data and evidence that they can trust.

The dilemma of public databases

Accessing accurate evidence on the patterns and effectiveness of preventing, diagnosing, and treating cancer in real-world settings is mission critical for pharmaceutical researchers. Yet, having data that are consistent, reliable, and well linked is one of the biggest challenges facing pharmaceutical research and development. Publicly available databases are a considerable asset to cancer drug development, but one of the biggest challenges of effectively using these repositories is the data quality.

Limitations of publicly available databases for cancer drug discovery and development:

  • Data quality and standardization: Publicly available databases may have variations in data quality, as contributions are often made by diverse sources with different experimental protocols and standards. Inconsistencies in data formats, annotations, and experimental methodologies can hinder the reliability and reproducibility of the information.

  • Data heterogeneity: Crowd-sourced databases often accumulate data from various studies and sources, leading to heterogeneity in terms of experimental conditions, patient populations, and data collection methods. Integrating heterogeneous data can be challenging and may introduce noise into the analyses.

  • Validation and reproducibility: The reliability and reproducibility of findings from crowd-sourced databases may be challenging to verify, as there may be limited validation of the data entries. Lack of validation can lead to uncertainties about the robustness of the information.
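The limitations above can be illustrated with a small harmonization step. The sketch below maps records from differently formatted sources onto one schema and reports anything it cannot map rather than guessing. The alias tables, field names, and sample records are all hypothetical; a real project would maintain such mappings against a controlled vocabulary.

```python
# Hypothetical alias tables, invented for this example.
FIELD_ALIASES = {
    "gene": "gene", "gene_symbol": "gene", "Hugo_Symbol": "gene",
    "cancer": "tumor_type", "tumour": "tumor_type", "tumor_type": "tumor_type",
}
TUMOR_SYNONYMS = {
    "breast carcinoma": "breast", "breast": "breast", "brca": "breast",
    "colorectal": "colorectal", "crc": "colorectal",
}


def harmonize(record):
    """Map one raw record onto the common schema; return (clean, problems)."""
    clean, problems = {}, []
    for key, value in record.items():
        field = FIELD_ALIASES.get(key)
        if field is None:
            problems.append(f"unknown field: {key}")
            continue
        clean[field] = value.strip()
    if "gene" in clean:
        clean["gene"] = clean["gene"].upper()
    if "tumor_type" in clean:
        label = clean["tumor_type"].lower()
        if label in TUMOR_SYNONYMS:
            clean["tumor_type"] = TUMOR_SYNONYMS[label]
        else:
            # Keep the original label and flag it for manual review.
            problems.append(f"unmapped tumor type: {label}")
    return clean, problems


# Invented records imitating three differently formatted sources.
raw_records = [
    {"gene_symbol": "pik3ca ", "cancer": "Breast carcinoma"},
    {"Hugo_Symbol": "KRAS", "tumour": "CRC"},
    {"gene": "TP53", "tumor_type": "glioma", "notes": "n/a"},
]

results = [harmonize(r) for r in raw_records]
for clean, problems in results:
    print(clean, problems)
```

Returning the problems alongside the cleaned record keeps unmapped values visible for a human reviewer instead of silently dropping or coercing them.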

To ensure data integrity, biopharmaceutical companies need access to manually curated and reviewed knowledge bases that provide high-quality data on the intricate relationships between drugs, targets, and diseases.

Real-world impact: The value of quality databases

Oncology drug discovery aims to develop therapeutic agents that modulate biological processes to inhibit progression of cancer in the clinical setting. Drug discovery is a resource-intensive process that can broadly be broken down into four steps: target identification and validation, lead identification and optimization, pre-clinical development, and clinical development.

As noted above, bringing a new cancer drug from start to approval typically takes 10–15 years and costs more than $1 billion (1,5). However, over 90% of new oncology agents do not become approved drugs, mainly due to lack of efficacy and/or unmanageable toxicity (1). It is therefore essential to invest sufficient effort in evaluating a novel target before a project progresses into drug discovery.

The intricate progression of cancer and the dynamic interactions between intrinsic tumor effects and the tumor microenvironment pose significant challenges in gathering the comprehensive information needed for a thorough evaluation of a target. In the post-genomics era, the abundance of omics datasets derived from cell lines, animal models, and patient samples facilitates a thorough evaluation of targets. Various sources provide information for assessing tractability, tolerability, efficacy, and clinical positioning. Two valuable, expert-curated resources are the Catalogue of Somatic Mutations in Cancer (COSMIC) and the Human Somatic Mutation Database (HSMD), which, if mined effectively, can contribute significantly to generating essential data.

Use-case: Leveraging COSMIC to refine candidate selection

COSMIC is the most detailed and comprehensive resource for exploring the effect of somatic mutations in human cancer. Developed and maintained by the Wellcome Sanger Institute, the latest release, COSMIC v99 (December 2023), includes over 6 million coding mutations across 1.5 million tumor samples, curated from over 29,000 publications. In addition to coding mutations, COSMIC covers all the genetic mechanisms by which somatic mutations promote cancer, including non-coding mutations, gene fusions, copy-number variants and drug-resistance mutations.

The primary objective of target evaluation is to establish a clinical positioning strategy for a compound designed to target a specific biomolecule. COSMIC, which includes manually curated data from The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC), enables pharmaceutical researchers to integrate patient omics profiling data, which encompasses details like gene mutations or target expression profiles, with disease-related outcomes such as patient survival. This correlation refines the clinical positioning strategy, ultimately supporting the delivery of precision medicine. Moreover, the chosen clinical positioning strategy guides the selection of appropriate model systems for predicting efficacy or conducting compound testing.
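To show what linking a molecular profile to an outcome looks like in its simplest form, the sketch below groups a toy cohort by mutation status and compares median overall survival. The cohort is invented for illustration and is not drawn from COSMIC, TCGA, or ICGC. The naive median also ignores censored follow-up, so a real analysis would more likely use a Kaplan-Meier estimate with an appropriate statistical test.

```python
from statistics import median

# Invented toy cohort: (patient_id, gene_mutated, overall_survival_months).
cohort = [
    ("P01", True, 14.0), ("P02", True, 9.5), ("P03", True, 11.0),
    ("P04", False, 22.0), ("P05", False, 30.5), ("P06", False, 18.0),
]


def median_survival_by_status(rows):
    """Return median survival in months for mutated and wild-type groups."""
    groups = {True: [], False: []}
    for _patient_id, mutated, months in rows:
        groups[mutated].append(months)
    return {
        ("mutated" if status else "wild_type"): median(values)
        for status, values in groups.items()
        if values  # skip a group with no patients
    }


summary = median_survival_by_status(cohort)
print(summary)
```

A difference between the two groups in a table like this is only a starting point for a positioning hypothesis; cohort size, confounders, and treatment history all need to be accounted for before drawing conclusions.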

Use-case: Leveraging HSMD to evaluate drug repurposing

The Human Somatic Mutation Database (HSMD) is a relatively new somatic mutation database from QIAGEN (released in 2019) that combines over two decades of expert curation and data from scientific literature, on- and off-label therapies and clinical trials, and real-world clinical oncology cases. In the latest release, HSMD 3.0 (November 2023), the database contains manually curated, detailed molecular information on over 1.8 million somatic variants, with more than 430,000 observed in real clinical cases, as well as data from over 545,000 real-world clinical oncology cases.

Unique to HSMD is the availability of data from clinically observed variants. When a variant has been “clinically observed,” it means QIAGEN’s professional clinical interpretation service (previously N-of-One) has encountered this alteration in a real-world clinical case. For these variants, QIAGEN assesses the clinical and biological relevance and calculates the gene and variant prevalence across observed tumor types.

HSMD enables the analysis of the frequency and distribution of somatic mutations across various cancer types. The database also provides a catalog of drugs employed in the treatment of patients with specific mutations, along with available response data. Using this information, researchers can explore potential new indications for existing cancer drugs.

For example, in 2019, the FDA granted approval to alpelisib, a PI3Kα inhibitor, in combination with fulvestrant for the treatment of postmenopausal women and men with advanced or metastatic breast cancer that is HR-positive, HER2-negative, and PIK3CA-mutated. Following the demonstrated effectiveness of alpelisib in breast cancer treatment, the question becomes whether it can be used to treat other malignancies that show disruption of the PI3K/AKT/mTOR signaling pathway, or repurposed to treat non-cancer-related human diseases. Biopharmaceutical companies can use HSMD to easily explore the distribution of PIK3CA clinical cases and determine its mutation frequency in various human malignancies. This information can then be used to accelerate indication expansion and drug repurposing.
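The frequency analysis described here amounts to counting, per tumor type, how many cases carry a mutation in the gene of interest. The sketch below shows that calculation on an invented case list. The tumor types and counts are placeholders, not real PIK3CA prevalence figures, and the code does not use any HSMD interface; it assumes the case records have already been exported as simple pairs.

```python
from collections import Counter

# Invented case list: (tumor_type, case_has_mutation_in_gene_of_interest).
cases = [
    ("breast", True), ("breast", False), ("breast", True), ("breast", False),
    ("endometrial", True), ("endometrial", True), ("endometrial", False),
    ("colorectal", False), ("colorectal", True), ("colorectal", False),
    ("colorectal", False), ("colorectal", False),
]


def mutation_frequency(case_rows):
    """Return the fraction of mutated cases for each tumor type."""
    totals, mutated = Counter(), Counter()
    for tumor_type, has_mutation in case_rows:
        totals[tumor_type] += 1
        if has_mutation:
            mutated[tumor_type] += 1
    return {tumor: mutated[tumor] / totals[tumor] for tumor in totals}


def rank_by_frequency(freqs):
    """Sort tumor types from highest to lowest mutation frequency."""
    return sorted(freqs.items(), key=lambda item: item[1], reverse=True)


frequencies = mutation_frequency(cases)
for tumor, freq in rank_by_frequency(frequencies):
    print(f"{tumor}: {freq:.0%}")
```

Tumor types that rank high in such a list are candidates for a closer look, alongside the drug and response data available for those cases.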

The next biopharma blockbuster

In the dynamic landscape of cancer drug discovery and development, databases are not just repositories of information; they are catalysts for transformation and the next “blockbuster”. Many biopharmaceutical companies are leveraging genomic databases to power drug discovery, streamline processes, reduce costs, and accelerate approvals. However, to strike gold, companies must first ensure their data are reliable, consistent, and traceable. The companies that leverage high-quality databases at critical steps of the cancer drug discovery and development pipeline will be the leaders of this decade’s digital gold rush.

At QIAGEN Digital Insights, we offer two expert-curated databases that link sequence-level somatic mutation data to detailed molecular information about functional and clinical impacts, as well as implications for druggability and relevant clinical trials. The two databases, the Catalogue Of Somatic Mutations In Cancer (COSMIC) and the Human Somatic Mutation Database (HSMD), enable biopharmaceutical researchers to avoid pitfalls in early cancer drug discovery, confidently qualify candidate drug targets, and accelerate indication expansion and repurposing of existing cancer therapies.

Want to learn more?

Take a deeper dive into how to use COSMIC and HSMD in the cancer drug discovery and development pipeline.
Request a free trial and personal consultation of COSMIC and HSMD with our biopharma research experts.