False positives and misannotation of some fusions in fusion detection workflows

Issue description

Three issues have been identified that affect tools for fusion detection, and therefore the workflows delivered by the Biomedical Genomic Analysis 20.0 plugin that contain these tools. The affected workflows are:

Full details of the software, workflows and tools affected are provided in the “Affected software and tools” section below.

These problems were addressed in Biomedical Genomic Analysis 20.0.1 and Biomedical Genomic Analysis Server Plugin 20.0.1 through a combination of bug fixes and improvements.

Issue 1. Incorrect annotations of fusions

Annotate Fusions with Known Fusion Information does not annotate some identified fusions correctly: some known fusions detected in the data are not annotated, while some unknown fusions may be annotated as known fusions.

Issue 2. Fusions with breakpoints in close proximity are reported with the same read count

When two fusions have both their 5′ breakpoints and their 3′ breakpoints within 12 bp of one another, a given read can be counted as supporting both fusions. The two fusions can then be assigned the same read count, as shown in Figure 1. Closer inspection of the read mapping may reveal that one of the fusions has much better support than the other.

Figure 1. Two fusions with identical 5′ breakpoints and 3′ breakpoints within 12 bp of each other are listed here. The read count is reported to be the same for both, but inspection of the read mapping could reveal better support for one of them.
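As an aid to manual review, fusion calls that may be affected by this issue can be flagged outside the software. The following Python sketch is only an illustration and is not part of the plugin: the `FusionCall` record, its field names, and the example coordinates are hypothetical, and the 12 bp threshold is treated as inclusive, which is an assumption.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class FusionCall:
    # Hypothetical record; the field names are not taken from the plugin output.
    name: str
    chrom5: str
    pos5: int
    chrom3: str
    pos3: int
    reads: int


def suspicious_pairs(calls, max_distance=12):
    """Return name pairs of calls that share a read count and whose 5' and 3'
    breakpoints both lie within max_distance bp of each other."""
    flagged = []
    for a, b in combinations(calls, 2):
        same_chroms = a.chrom5 == b.chrom5 and a.chrom3 == b.chrom3
        close = (abs(a.pos5 - b.pos5) <= max_distance
                 and abs(a.pos3 - b.pos3) <= max_distance)
        if same_chroms and close and a.reads == b.reads:
            flagged.append((a.name, b.name))
    return flagged


# Made-up coordinates, for illustration only.
calls = [
    FusionCall("fusion_A", "chr1", 1000, "chr2", 5000, 57),
    FusionCall("fusion_B", "chr1", 1000, "chr2", 5007, 57),
    FusionCall("fusion_C", "chr3", 2000, "chr4", 9000, 12),
]
print(suspicious_pairs(calls))
```

Pairs flagged in this way are candidates for inspecting the read mapping, not confirmed errors: two nearby fusions can have equal read counts by chance.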

Issue 3. Large numbers of false positives being reported for some datasets

Functionality was introduced in the Biomedical Genomic Analysis 20.0 plugin to detect exotic fusions, i.e. fusions where one or both breakpoints are not at an exon boundary (Figure 2). For some datasets, this has led to a large number of such fusions being detected, a large fraction of which are false positives.

Figure 2. Examples of exotic fusions into the middle of an exon and into an intron
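The definition of an exotic fusion given above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the plugin's implementation: the function names are hypothetical, exons are given as 1-based inclusive `(start, end)` pairs, and a breakpoint is taken to be "at an exon boundary" when it equals an exon start or end coordinate.

```python
def classify_breakpoint(pos, exons):
    """Classify a breakpoint position relative to a transcript's exons.

    exons: list of (start, end) pairs, 1-based inclusive (assumed convention).
    """
    exons = sorted(exons)
    for start, end in exons:
        if pos == start or pos == end:
            return "exon boundary"
        if start < pos < end:
            return "mid-exon"
    # Not in any exon: intronic if it falls between the first and last exon.
    if exons and exons[0][0] < pos < exons[-1][1]:
        return "intron"
    return "outside transcript"


def is_exotic(bp5, exons5, bp3, exons3):
    """A fusion is exotic if one or both breakpoints are not at an exon boundary."""
    return (classify_breakpoint(bp5, exons5) != "exon boundary"
            or classify_breakpoint(bp3, exons3) != "exon boundary")


# Made-up exon coordinates, for illustration only.
exons = [(100, 200), (300, 400)]
print(classify_breakpoint(150, exons))    # mid-exon
print(classify_breakpoint(250, exons))    # intron
print(is_exotic(200, exons, 300, exons))  # False
```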

Recommendations

We plan to release an update to the software where these issues will be addressed. In the meantime, possible actions to take when detecting fusions are:

  1. Use older versions of the software. These issues do not affect the tools and workflows of Biomedical Genomics Analysis Plugin 1.2.x. You need CLC Genomics Workbench 12.x to install and use Biomedical Genomics Analysis 1.2.x, and CLC Genomics Server 11.x to install and use Biomedical Genomics Analysis Server Plugin 1.2.x. See the related FAQ, listed below, for information on getting installers for older versions of the software.
  2. If continuing to use the tools and workflows for fusion detection included with Biomedical Genomics Analysis 20.0, the number of false positives, and the impact of these, can be decreased by:
    • Adjusting the following settings in your custom workflows, or when otherwise using the affected tools. Workflows distributed with the software already include these suggestions.
      • Trim reads for homopolymers prior to fusion detection and annotation using the Trim Reads tool. Relevant settings depend on the purpose of the workflow. See Figure 3.
      • Increase the “Breakpoint distance” parameter in the Refine Fusion Genes tool to 25, as shown in Figure 4. (The default value is 10.)
    • Reviewing the evidence for exotic fusions as described in the Interpretation of fusion results section of the manual.
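To illustrate what homopolymer trimming does to a read, the following Python sketch removes homopolymer runs from both ends of a sequence. It is not the Trim Reads tool's algorithm, and the function name and the `min_run` threshold are hypothetical; the settings actually used in the delivered workflows are those shown in Figure 3.

```python
import re


def trim_homopolymer_ends(seq, min_run=10):
    """Remove a homopolymer run of at least min_run bases from each end of a read.

    Assumes an uppercase sequence over A, C, G, T; min_run is a hypothetical
    threshold, not a documented default of any tool.
    """
    extra = min_run - 1  # one base is captured, the rest must repeat it
    seq = re.sub(r"^([ACGT])\1{%d,}" % extra, "", seq)
    seq = re.sub(r"([ACGT])\1{%d,}$" % extra, "", seq)
    return seq


print(trim_homopolymer_ends("AAAAAAGCTTGACCCCCCC", min_run=5))  # GCTTGA
print(trim_homopolymer_ends("ACGTACGT", min_run=5))             # ACGTACGT
```

Runs in the interior of a read are left untouched in this sketch; only runs that reach a read end are removed.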

Figure 3. Homopolymer trimming settings included in workflows delivered by Biomedical Genomic Analysis 20.0

 

Figure 4. Breakpoint distance parameter of the Refine Fusion Genes tool increased to 25 from the default value of 10.

 

Affected software and tools

The issues described on this page were addressed in Biomedical Genomic Analysis 20.0.1 and Biomedical Genomic Analysis Server Plugin 20.0.1 through a combination of bug fixes and improvements.

Affected software:

On affected software versions, the following workflows were affected:

as those workflows contained one or more of the following affected tools: