Expert curation demonstrates a 126% higher precision score than AI-derived databases

But don’t take our word for it . . .
See what a 2024 independent, peer-reviewed study from the National Institutes of Health (NIH) found when they compared HGMD Professional with three literature mining tools for variant curation.

DOWNLOAD WHITE PAPER
Independent Study

NIH study finds HGMD Professional outperforms Genomenon® Mastermind® in terms of precision

The study, “Comparison of literature mining tools for variant classification: Through the lens of 50 RYR1 variants,” was published in Genetics in Medicine by Wermers et al. of the Center for Precision Health Research at the NIH. Recognizing the importance of accessing relevant literature that informs the American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP) criteria for classifying variant pathogenicity, the investigators set out to determine the efficacy of four literature mining tools in retrieving publications to classify 50 RYR1 variants related to malignant hyperthermia susceptibility. The four tools were QIAGEN’s HGMD Professional, Genomenon® Mastermind®, ClinVar, and LitVar2.

*Findings are from a peer-reviewed study, Wermers et al. Genet Med. 2024;26(7):101161. Mastermind is a product from Genomenon with no affiliation to QIAGEN. All data taken directly from the published study.

Discover why manual, expert-certified curation remains the gold standard for variant assessment in our expert white paper, which presents key findings from the study.
Download Now

Overview of the four literature mining tools evaluated in the study

  • HGMD Professional – Founded nearly 30 years ago and maintained by the Institute of Medical Genetics at Cardiff University, HGMD Professional is an expert-curated, fee-for-service database that collates all known (published) gene lesions responsible for human inherited diseases. While a free version of HGMD is available, it only shows variants that are at least three years old and is inadequate for complete ascertainment of the literature. Therefore, this study evaluated the professional version.

  • Mastermind® – Created by Genomenon® in 2014, the Mastermind Genomics Intelligence Platform® is an automated, text-mined genomics search engine that rapidly indexes medical literature to identify publications associated with a particular gene or mutation, similar to Google Scholar. Genomenon® currently offers a free version and a subscription-based version. This study assessed the subscription-based version, which provides access to all features and publications.

  • ClinVar – An NIH-supported resource, ClinVar is a freely accessible, crowd-sourced database that aims to collect variant classifications submitted by experts in the field, including research laboratories, clinical laboratories, and expert panels.

  • LitVar2 – A literature mining tool from the NIH's National Center for Biotechnology Information (NCBI), LitVar2 mines PubMed, the PubMed Central (PMC) Open Access Subset, dbSNP, and ClinVar to index both abstracts and publications relevant to a specific gene or mutation query.

Methods

As members of the ClinGen Malignant Hyperthermia Susceptibility (MHS) Variant Curation Expert Panel (VCEP), Wermers et al. selected 50 RYR1 variants related to malignant hyperthermia susceptibility for the comparative analysis, including 12 variants of uncertain significance (VUS) and 38 pathogenic and likely pathogenic (P/LP) variants, as assessed according to the ACMG/AMP/ClinGen criteria defined by the ClinGen MHS VCEP.

Assessment of publication content and relevancy

The content of each reference returned by all four literature mining tools was analyzed to determine if the publication was a primary or secondary source:

  • Primary source – Defined as sources that presented novel data specific to the variant in question.

  • Secondary source – Defined as papers that only referenced prior work related to the variant, papers in which the variant was identified in large genomic screens of individuals with unrelated disease, false positives, and other citations not relevant to germline variant pathogenicity assessment.

Assessment of tool precision and sensitivity

The researchers calculated sensitivity and precision for the four literature mining tools with reference to the number of papers they deemed relevant for ACMG/AMP/ClinGen classifications.

  • Sensitivity – Calculated by dividing the number of papers relevant for variant assessment returned by a tool by the number of papers in the union of the relevant papers returned by all tools.

  • Precision – Calculated by dividing the number of relevant references returned by a tool by the number of references (relevant and not relevant) returned by that same tool.
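The two definitions above can be illustrated with a minimal Python sketch. The tool names, paper identifiers, and relevance judgments below are hypothetical toy data chosen only to make the arithmetic visible; they are not taken from the study.

```python
def sensitivity(tool_relevant: set, union_relevant: set) -> float:
    """Relevant papers returned by one tool, divided by the union of
    relevant papers returned by all tools."""
    return len(tool_relevant) / len(union_relevant)


def precision(tool_relevant: set, tool_returned: set) -> float:
    """Relevant papers returned by one tool, divided by everything
    (relevant and not relevant) that the same tool returned."""
    return len(tool_relevant) / len(tool_returned)


# Hypothetical toy data: papers returned by two made-up tools.
returned = {
    "tool_a": {"p1", "p2", "p3", "p4"},
    "tool_b": {"p3", "p5", "p6", "p7", "p8"},
}

# Hypothetical reviewer judgment: papers deemed relevant for classification.
judged_relevant = {"p1", "p2", "p3", "p5"}

# Relevant subset per tool, and the union of those subsets across tools.
relevant_by_tool = {name: papers & judged_relevant for name, papers in returned.items()}
union_relevant = set().union(*relevant_by_tool.values())

results = {
    name: {
        "sensitivity": sensitivity(relevant_by_tool[name], union_relevant),
        "precision": precision(relevant_by_tool[name], returned[name]),
    }
    for name in returned
}

for name, metrics in results.items():
    print(f"{name}: sensitivity={metrics['sensitivity']:.2f}, "
          f"precision={metrics['precision']:.2f}")
```

In this toy case, tool_a returns fewer papers but most of them are relevant, while tool_b returns more papers with a lower share of relevant ones, which is the trade-off the two metrics are meant to separate.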

Results

The total number of publications returned by each literature mining tool is as follows: HGMD Professional returned 194; Mastermind® returned 1108; ClinVar returned 372; and LitVar2 returned 401. The metrics on the number of papers returned, including primary and novel references, precision, and sensitivity for the four literature mining tools are displayed in Table 1.

Table 1. Metrics for the four literature mining tools. Data is presented for each tool, including overall number of publications returned, number of primary references, number of novel references, sensitivity and precision.

Key takeaways

  • HGMD Professional showed the highest rate of relevant novel references returned (82%) compared to the other tools (ClinVar 9%, Mastermind® 24%, LitVar2 9%).

  • HGMD Professional demonstrated the highest level of precision, returning 96% primary references and achieving 95% precision. In contrast, other tools showed significantly lower precision and primary reference return rates: ClinVar had 67% and 68%, Mastermind® had 42% and 45%, and LitVar2 had 41% and 43%. Therefore, HGMD Professional demonstrates a 126% higher precision score than an AI-derived database.

  • Mastermind® and LitVar2 exhibited the lowest precision, with fewer than half of the references they returned being relevant (Figure 1). For Mastermind®, of the 604 novel publications returned, only 145 were relevant.
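As a quick arithmetic check, the Mastermind® counts quoted above (145 relevant out of 604 novel publications) can be turned into a rate. The variable names in this short sketch are illustrative only.

```python
# Counts quoted in the takeaways above for Mastermind.
novel_returned = 604   # novel publications returned
novel_relevant = 145   # of those, publications judged relevant

relevant_novel_rate = novel_relevant / novel_returned
print(f"{relevant_novel_rate:.0%}")  # prints "24%"
```

The result matches the 24% relevant novel reference rate reported for Mastermind® in the first takeaway.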

Figure 1. Comparison of precision between the four tools. HGMD Professional had the highest precision at 95%. In comparison, Mastermind® had a precision rating of 45%.

Why labs trust HGMD Professional

HGMD Professional delivers a high-quality, expert curation approach, the gold standard for relevant variant curation.

Superior curation approach – A curation team at Cardiff University deploys machine learning algorithms and manual review to screen peer-reviewed biomedical literature on an ongoing basis. Articles identified as potential sources of mutation data are assessed by a team of experienced curators (with an average of more than 12 years of curation experience).

Verification of evidence – HGMD Professional’s curation team addresses discrepancies in variant reporting that require additional scrutiny. The curators analyze information reported in the manuscript or refer to supplementary material (chromosomal coordinates, sequence chromatograms, etc.). In some cases, the curators contact the authors directly to address ambiguities.

Frequent updates – HGMD Professional ensures labs have current and relevant evidence. In the latest release, the total number of disease-associated germline mutations in HGMD Professional increased to 527,810 entries. Note that the free version of HGMD is three years behind the Professional version, which highlights the importance of clinical labs using the subscription-based version.

READ THE STUDY

Download an expert white paper providing an overview of the findings

Don't just take our word for it. Download an expert white paper providing an overview of the study, with details on the methods, additional data, and why the NIH trusts HGMD Professional over other commercial and publicly available literature mining tools.

DOWNLOAD NOW

TRY FOR YOURSELF

Request a complimentary 5-day trial of HGMD Professional

Explore, search and test HGMD Professional for free. To demonstrate the quality, flexibility, and superiority of HGMD Professional, QIAGEN Digital Insights offers complimentary, no-obligation trials of the leading database. Start your free trial today!

REQUEST TRIAL

Explore additional resources

Stanford University compares data quality from AVADA database to HGMD Professional
READ WHITE PAPER
HGMD Professional adds over 9,000 germline mutations in the latest release
SEE WHAT'S NEW
How your lab can use HGMD Professional to confidently interpret exome tests
READ CASE STUDY
Sample to Insight