Biomedical Genomics Workbench is a new product from QIAGEN.
In addition to the existing functionality of CLC Cancer Research Workbench, the following has been added, improved, or fixed:
Major highlights
- The Copy Number Variant (CNV) Detection tool is included as an integrated tool in the Biomedical Genomics Workbench 2.1, and can be found in the ‘Resequencing’ toolbox. The CNV detection tool has existed as a beta plugin to the Cancer Research Workbench since last year, and moves out of beta status with this release. This update to the tool also includes several improvements and small bugfixes.
- New tool: Proportion-based Statistical Analysis has been added to identify differentially expressed genes. The proportions-based tests are applicable in situations where your data samples consist of counts of a number of ‘types’ of data. This could e.g. be in a study where gene expression levels are measured by RNA-Seq or tag profiling.
- To help interpret whether a variant is likely pathogenic due to disruption of interfaces in a multi-subunit complex, the tool “Link Variants to 3D Protein Structure” now provides 3D visualizations of variants in a biomolecule context.
- The filtering option in the Extract Differentially Expressed Genes tool only considered the predicted fold-changes in the positive direction, so features that were reduced in expression were filtered out. This has now been fixed. The change also affects the workflow: “Identify and Annotate Differentially Expressed Genes and Pathways”, as the tool is also included in this workflow.
- Improved visualization and opening of results: The Genome Browser opens with a linked variant table.
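The fold-change filtering fix listed among the highlights above can be illustrated with a minimal sketch. This is not the tool's implementation: the function names, the threshold of 2.0, the gene names, and the signed fold-change convention (negative values for reduced expression) are all assumptions made for illustration.

```python
# Hypothetical illustration of one-sided versus two-sided fold-change filtering.
# Assumes signed fold changes: negative values mean reduced expression.

def positive_only_filter(fold_change, min_fold_change=2.0):
    """Behaviour described as faulty: only increases pass the filter."""
    return fold_change >= min_fold_change


def two_sided_filter(fold_change, min_abs_fold_change=2.0):
    """Behaviour described as fixed: changes in either direction pass."""
    return abs(fold_change) >= min_abs_fold_change


# Made-up example values, not real data.
features = {"GENE_A": 3.5, "GENE_B": -4.2, "GENE_C": 1.1, "GENE_D": -1.3}

kept_before = sorted(n for n, fc in features.items() if positive_only_filter(fc))
kept_after = sorted(n for n, fc in features.items() if two_sided_filter(fc))

print(kept_before)  # ['GENE_A']
print(kept_after)   # ['GENE_A', 'GENE_B']
```

With the one-sided filter, GENE_B is lost even though its expression changes more strongly than that of GENE_A.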
Minor highlights
- Particular annotation types (columns) can now be specified for export in Excel, HTML and tab delimited formats.
- Increased the performance for gzip export.
- Added column to output of “Annotate and Merge Counts” in the “Small RNA Analysis” folder indicating 3′ or 5′ direction when using “grouping on mature” parameter.
- Transcriptomics experiment and sample tables can now be sorted, even with large numbers of rows.
- Improved how nucleotides are drawn inside variant track boxes, making letters smaller when zooming out.
Bug fixes
- Fixed an error that in rare cases would result in a division by zero error message when selecting rows in the Annotation Table view.
- Fixed an error that made it impossible to add an annotation via the Annotation Table view, if the table is empty.
- Fixed rare problem where a track list of reads tracks and graph tracks would break.
- Fixed bug where a left-click quickly followed by right-click was interpreted as double-click on OS X (in: the persistence search result list, in the toolbox tree, and in the workflow editor).
- Fixed an error affecting the “Cut Sequence Before/After Selection” tool in the Cloning editor.
- Fixed the SOLiD NGS importer to correctly import basespace encoded sequences in fastq files. It is still assumed that sequences originate from colorspace.
- It is now possible to filter tables based on content in the ‘Link to 3D Protein Structure’ column.
- Fixed an error that occurred when running the QC for Target Sequencing and requesting quality analysis reporting.
- Fixed an error that prevented the import of adapters from csv format.
- Fixed a rare error that caused the Amino Acid Change tool to crash if a CDS feature was less than 3 bases long.
- Resized Manage Reference Data dialog. Previously some of the text was hidden in the default size.
- Fixed a bug in the probabilistic variant caller that caused it to fail for certain input.
Please note: Biomedical Genomics Workbench was formerly known as CLC Cancer Research Workbench.
CLC Cancer Research Workbench 2.0
- ChIP-seq data analysis is now available. The tools for ChIP-seq analysis are found in the toolbox under “Epigenomics Analysis”.
- A highly sensitive Copy Number Variant Detection tool is available for targeted amplicon and exome sequencing data via the Plugin Manager.
- A new display of mutations at the amino acid level is available in the Genome Browser View.
- It is now possible to find out why a variant is pathogenic or if a variant could be pathogenic. This can be done by visualizing the effect of the mutation on the protein 3D structure using the tool Link Variants to 3D Protein Structure.
New tools
- Merge Read Mappings. This tool can be used to merge two read mappings if you have performed two mappings with the same reference sequences.
- Extract Reads Based on Overlap. This tool can be used to extract subsets of reads based on annotations.
- On Gaussian Data. The Gaussian-based test (t-test and ANOVA) tool has been included.
- Cloning tools. A folder has been added to the toolbox with tools for cloning and restriction site analysis.
- ChIP-Seq Analysis. The tool for analysis of ChIP sequencing experiments is found under “Epigenomics Analysis”. The tool identifies genomic regions with significantly enriched read coverage and a read distribution with a characteristic shape.
- Annotate with Nearby Gene Information. This tool is part of the “Epigenomics Analysis” and can be used to create a copy of the annotation track used as input and add information about nearby genes.
- MA plots, scatter plots and histograms can now accept expression tracks as input
- Increased decimals for numbers when exporting table to CSV, tab delimited text, and Excel.
- Improved reporting of errors related to low disk space.
- Batching: Processes tab and analysis execution logs now display batch names in addition to analysis names for enhanced clarity.
- The External Application Client Plugin is now available directly from the Workbench Plugin Manager.
- Multiple target region tracks for the “Indels and Structural variants” can now be specified.
Bug fixes
- Fixed an error resulting in billions of reads being silently dropped when producing large read mappings against large counts of reference sequences. The error involves a read count overflow and the dropping of at least 2 billion reads per failure instance.
- Fixed display problem in read mappings showing too many hidden insertions (as vertical black lines) in certain overlapping paired reads.
- Fixed problem with links and text in tables that were being cut off when succeeding a link.
- The tool Identify Graph Threshold Areas can now use negative values to define its threshold.
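The read count overflow mentioned in the bug fixes above can be sketched as follows. The release note does not state the counter's width; a signed 32-bit counter is an assumption here, chosen because it is consistent with the figure of "at least 2 billion reads". The helper `to_int32` is hypothetical and only simulates fixed-width wraparound in Python, whose own integers do not overflow.

```python
# Hypothetical simulation of a signed 32-bit counter wrapping around.
# The 32-bit width is an assumption; the release note only says "overflow".

INT32_MAX = 2**31 - 1  # 2,147,483,647


def to_int32(value):
    """Wrap an integer into the signed 32-bit range."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


true_count = INT32_MAX + 10      # more reads than the counter can represent
stored = to_int32(true_count)    # what a 32-bit counter would hold instead

print(true_count)  # 2147483657
print(stored)      # -2147483639
```

Once the stored value no longer matches the true count, any logic relying on it can silently lose reads, which is the class of failure the fix addresses.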
Workflows:
- In the workflow editor the “Reset to default” now always reverts to the right names.
- In the workflow editor the validation is now correctly triggered when changing the configuration of an input element.
- The workflow editor can now open workflows in which the graphical view of the workflow is corrupt.
- Fixed an exception which could occur during workflow migration.
- Data with the same name can now be bundled multiple times in a workflow installer.
- Previously when a plugin contained custom actions and a workflow, the workflow could not be installed. This has been fixed.
- Fixed problem with unlocked output names that previously could not be configured during execution of a workflow.
- A workflow with configured data from a server is now automatically validated when connected to the server (when opened in the editor). Previously the workflow had to be closed and reopened first.
- The original workflow file included in a workflow installer can now be exported directly without having to restart the workbench in advance.
- A problem with saved table settings that sometimes did not work has been fixed. The bug fix includes a more robust/generic way of saving table settings with different columns. To fix this problem, existing saved table settings should first be loaded on an object where it works (i.e. has the same columns as when it was saved); and then the table settings should be saved with the old name to overwrite the settings.
- Fixed an error that could cause batch processing to open all results rather than saving them.
- Fixed problem with import of BED files using external applications.
- SAM/BAM import will no longer fail for alignments with POS = 0, but instead import them as though they were unmapped.
- Fixed problem going back in the wizard for the “Find Binding Site and Create Fragments” tool.
- Fixed error occurring when removing an unsaved reads track from a track list.
- Fixed error when showing protein translations of annotations shorter than 3 bases.
- Fixed a bug in the Mapping Coverage exporter.
- Fixed reads tracks reads-amount indicators (the numbers between the reads track and the box with the tracks name and number of reads) that sometimes wrongly said 0.
- Small RNA Analysis -> Annotate and Merge Counts: When you choose to create a “grouped on mature” output, the small RNAs are grouped by both the 5’ and the 3’ mature sequences separately in the “grouped on mature” output. The column heading has therefore been changed to show “Mature” instead of “Mature 5′”.
- When using the RNA-Seq Analysis tool with the “One reference sequence per transcript” option, the “Maximum number of hits for a read” option was sometimes not taken into account for multi-hit reads. This has been fixed.
- Two problems with the F1 help have been fixed: 1) pressing F1 in a workflow tool wizard opened more than one help window, and 2) problems showing help when pressing the F1 key in tool wizards.
- Add Information about Amino Acid Changes tool: In cases where an mRNA track does not overlap all annotations in the CDS track, “Coding Region Changes” were not added to variants that overlap a CDS but not an mRNA annotation. This has been fixed.
- Variant callers and the “Add Information about Amino Acid Changes” tool: “Coding Region Changes” were not added to variants that overlap an mRNA annotation but not a CDS annotation. This has been fixed.
- Fixed an error that in rare cases would prevent creation of tracks from reference sequences.
- Hypergeometric test on annotations: Fixed a rare error that occurred for some datasets containing annotations of the form: ‘1234 // abc’.
- Fixed a bug for color space reads in the RNA-Seq Analysis tool that caused only exon-exon matches to be reported.
- Fixed an issue where an XSQ file containing both base space and color space versions of the same reads was incorrectly imported into a single sequence list, resulting in each read appearing twice.
- Fixed an issue with mapping of paired-end reads, where these were erroneously reported as broken pairs when the fragment size derived from the alignments of the two ends of the pair was longer than the reference sequence.
- Fixed an issue where the Coverage, Count and Frequency columns did not appear in the output when the option “Keep only selected annotations” was selected in the “Remove information from variants” tool.
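The SAM/BAM import change listed above (alignments with POS = 0 are imported as though unmapped) can be sketched against plain SAM text lines. This is not the Workbench importer; the function name and the example records are made up. It relies only on the SAM format's column order (FLAG in column 2, POS in column 4, with 1-based positions) and on flag bit 0x4 marking an unmapped read.

```python
# Hypothetical sketch: treat SAM records with POS = 0 as unmapped instead of failing.

SAM_FLAG_UNMAPPED = 0x4  # SAM flag bit for "segment unmapped"


def is_effectively_unmapped(sam_line):
    """Return True if the record is flagged unmapped or has POS = 0."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])  # column 2: FLAG
    pos = int(fields[3])   # column 4: POS (1-based; 0 means no position)
    return bool(flag & SAM_FLAG_UNMAPPED) or pos == 0


# Made-up records for illustration only.
records = [
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t0\tchr1\t0\t0\t4M\t*\t0\t0\tACGT\tIIII",
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]

unmapped = [line.split("\t")[0] for line in records if is_effectively_unmapped(line)]
print(unmapped)  # ['read2', 'read3']
```

Here read2 is not flagged as unmapped but has POS = 0, so it is handled like read3 rather than causing a failure.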
CLC Cancer Research Workbench 1.5.6
Release date: August 18, 2015
- Fixed a read mapper bug that caused some reads to be incorrectly reported as unmapped when global alignment was selected.
- Fixed a bug that caused the mapper to enter an infinite loop if a reference of length 0 was used.
- Fixed a rare bug that sometimes made the read mapper halt prematurely when several seeds were identified at the same reference position.
- Fixed a SOLiD NGS importer bug where import of very low quality, colorspace encoded paired-end sequence reads in fastq format could lead to paired sequence lists where the wrong reads are marked as pairs.
- Fixed sort order for paired reads in SAM/BAM exports in high coverage regions.
- The analysis/workflow execution system now handles search algorithms specially so that search results are not modified. This eliminates a host of concurrency issues.
- Minor improvements in persistence.
CLC Cancer Research Workbench 1.5.5
Release date: June 18, 2015
Bug fixes
- Fixed an issue introduced by a fix in the Cancer Research Workbench 1.5.4 restricting the use of the QC for Targeted Sequencing tool on tracks containing a larger number of nucleotides (>2147483647 bp) than could be supported for coverage table output. This check is no longer applied if coverage table output is not requested.
- Fixed bug in which Local Realignment could produce an illegal read mapping. This only happened for RNA-data.
- The variant caller will now fail if it encounters an illegal RNA read mapping. If the variant caller fails with such a message, and if it was run on locally realigned data, then we suggest re-running the local realignment to avoid the error.
- Side panel option to show legends for a plot with more than 10 samples is now enabled.
- Fixed saving different line colors in plots through the side panel.
- Plots inside reports are now shown with their saved side panel settings.
- The automated paired distance estimate can no longer exceed the maximum distance accepted by read mapper (100,000 bp).
- Fixed an error that occurred when hovering the mouse cursor over the edge of a read mapping.
- Read-only folders are no longer offered as potential locations to save data bundled with a Workflow.
CLC Cancer Research Workbench 1.5.4
Release date: April 23, 2015
Bug fixes
- The filtering option in the Extract Differentially Expressed Genes tool only considered the predicted fold-changes in the positive direction, so features that were reduced in expression were filtered out. This has now been fixed. The change also affects the workflow: “Identify and Annotate Differentially Expressed Genes and Pathways”, as the tool is also included in this workflow.
- When using the RNA-Seq Analysis tool with the “One reference sequence per transcript” option, the “Maximum number of hits for a read” option was sometimes not taken into account for multi-hit reads. This has been fixed.
- Fixed an issue with mapping of paired-end reads, where these were erroneously reported as broken pairs when the fragment size derived from the alignments of the two ends of the pair was longer than the reference sequence.
- Fixed an issue where the Coverage, Count and Frequency columns did not appear in the output when the option “Keep only selected annotations” was selected in the “Remove information from variants” tool.
- Fixed a bug in the probabilistic variant caller that caused it to fail for certain input.
CLC Cancer Research Workbench 1.5.3
Release date: February 17, 2015
Bug fixes
- Fixed reads tracks reads-amount indicators (the numbers between the reads track and the box with the tracks name and number of reads) that sometimes wrongly said 0.
- Fixed an error resulting in billions of reads being silently dropped when producing large read mappings against large counts of reference sequences. The error involves a read count overflow and the dropping of at least 2 billion reads per failure instance.
- Small RNA Analysis -> Annotate and Merge Counts: When you choose to create a “grouped on mature” output, the small RNAs are grouped by both the 5’ and the 3’ mature sequences separately in the “grouped on mature” output. The column heading has therefore been changed to show “Mature” instead of “Mature 5′”.
- Two problems with the F1 help have been fixed: 1) pressing F1 in a workflow tool wizard opened more than one help window, and 2) problems showing help when pressing the F1 key in tool wizards.
- Add Information about Amino Acid Changes tool: In cases where an mRNA track does not overlap all annotations in the CDS track, “Coding Region Changes” were not added to variants that overlap a CDS but not an mRNA annotation. This has been fixed.
- The Low Frequency Variant caller could end up in an infinite loop in certain corner cases. This is now fixed.
- Hypergeometric test on annotations: Fixed a rare error that occurred for some datasets containing annotations of the form: ‘1234 // abc’.
- Fixed a bug for color space reads in RNA-Seq Analysis that caused all exon-exon matches to be filtered away.
- Fixed “Export Graphics” default save-as directory.
- Fixed error when removing an unsaved reads track from a Genome Browser View.
- Fixed display problem showing too many hidden insertions in certain overlapping paired reads.
- Fixed a bug in the Mapping Coverage exporter.
- Fixed problem with import of BED files using external applications.
- Fixed problem going back in the wizard for the “Find Binding Site and Create Fragments” tool.
CLC Cancer Research Workbench 1.5.2
Release date: November 12, 2014
- On update of the QIAGEN GeneRead Panel analysis plugin, the plugin would not update the workflow it comes with. This has been fixed.
CLC Cancer Research Workbench 1.5.1
Release date: October 28, 2014
New features and improvements
- RNA-Seq Analysis: The ENSEMBL gene id of each gene, where available, has been added as an additional column to the gene expression track output.
- It is now possible to run a workflow without an optional input.
Bug fixes
- The AAC tool did not annotate variants in 3′ UTR with their DNA-level change using the HGVS c.xxx format. This affects any analysis done with Gx 7.5 or earlier based on ENSEMBL CDS tracks from older versions. The AAC analysis should be redone using Gx 7.5.1 for correct annotation. Important: Please also check the description in the CRWB 7.5 release notes of a bug fix in the translation of CDS annotations to protein sequences that was wrong in cases where the reading frame was not +1 or -1 in CDS annotations imported from ENSEMBL.
- A bug has been fixed in the Set Up Experiment tool. Exon-related expression values can now only be selected when present in the individual samples.
- When creating a subset of a paired experiment, the sub-experiment no longer appeared as being paired. This bug has been fixed and sub-experiments created in previous versions should recover the pairing information when accessed with this version of the workbench.
- Fixed problem importing VCF files using the AO and RO genotype field.
- Fixed problem importing certain VCF files.
- Fixed problem with scrolling to the relevant files when selecting objects as parameters in tool wizards.
- Fixed a bug in the Annotate and Merge Counts tool that in rare cases resulted in incorrect sorting and crash.
- Fixed problem with import of read mappings with supplementary alignments. When importing read mappings with supplementary alignments, supplementary alignments are not imported. Previously import of such read mappings caused import errors.
- Fixed rare problem with coverage that could occur in zoomed out reads tracks containing wrapped paired reads.
- Fixed rare error when sorting experiment tables.
CLC Cancer Research Workbench 1.5
Release date: August 28, 2014
New features and improvements
- New tools:
- It is now possible to analyze RNA-Seq data in CLC Cancer Research Workbench. A new set of ready-to-use workflows has been added to the toolbox under “Ready-to-Use Workflows” -> “Whole Transcriptome Sequencing”:
- Annotate Variants (WTS).
- Compare Variants in DNA and RNA.
- Identify Candidate Variants and Genes from Tumor Normal Pair.
- Identify Differentially Expressed Genes Across Samples.
- Identify Variants and Add Expression values.
- Transcriptomics Analysis tools: A new folder called “Transcriptomics Analysis” has been added to the toolbox under “Tools”. This folder holds a range of different tools that can be used for the analysis of RNA-seq data.
- Add Fold Changes is a new tool found under “Tools” in the folder “Add Information to Variants”…
- Identify Differentially Expressed Gene Groups and Pathways is a new tool found under “Tools” in the folder “Identify Candidate Genes”. The tool can be used to investigate candidate differentially expressed genes for a common functional role.
- The ready-to-use workflows that perform read mapping and local realignment now also include detection of larger insertions and deletions.
- NGS Importers are now enabled for workflows.
- A new folder called “Legacy tools” has been added to the toolbox. The “Probabilistic Variant Detection” and the “Quality Based Variant Detection” tools have been moved to this folder, as the three variant detectors (Basic Variant Detection, Fixed Ploidy Variant Detection, and Low Frequency Variant Detection) have been moved out of beta.
- CLC Cancer Research Workbench can now be connected to a server with a Cancer Research Server Plugin, which is an add-on to the Genomics Server. In addition to the Genomics Server license you need a license for the CLC Cancer Research Server Plugin. This enables cancer related tools on the GxServer.
- The Search Editor is now capable of filtering on “Path”.
- Zoom tools redesign: The “zoom to selection” feature is now also available for sequences, sequence lists, alignments and read mappings.
- The tracks info panel, with track names in the left side of the track, now wraps information instead of showing a scroll bar.
- Saving/applying side-panel settings for tables now works for different tables that share some columns.
- Graph Tracks can now be exported to Wiggle file; the span option is now supported in the Wiggle import.
- SAM/BAM import. It is now possible to choose to ignore unmapped reads when importing SAM/BAM files.
- Fisher’s Exact Test. Added the following options for correction of p-value for multiple testing: Bonferroni correction and False Discovery Rate (FDR).
- Speedup: newly created expression tracks will display the graph faster.
- Copy operations can now be stopped.
- Import of Example Data and imports done through dragging files into the workbench and dropping them in the Navigation Area will no longer block the user interface while executing. Instead, the import happens as a background process that can be monitored and controlled via the Processes tab in the lower left corner.
- CLC workbenches now support high resolution displays, such as Apple Retina displays, for all data shown in the View Area (including tooltips).
- Data Management for References in the ready-to-use workflows has undergone a small change, making it no longer modal, and adding two new statuses for detecting if reference data is inconsistent (e.g. not fully downloaded), or if different workflows use different versions. If you need to delete (suspected inconsistent) data, you can now do that from the Data Management as well. Furthermore, you can now see where new data is available (local and/or server), and it can be downloaded to both locations at the same time.
- Configuring ready-to-use workflows with your own reference data now helps you select data of the right type.
- The Data Management notification to download new references can now be dismissed (until next time something gets updated, at which point you can dismiss it again).
- Advanced filtering on tables now includes the option to filter for space-, comma- or semicolon-delimited lists of terms.
- Improved error messages due to low disk space.
- Improved variant callers:
- The value that the “Read position filter” operates on is now added as an annotation.
- Created a column on variants which contains “Forward/reverse read imbalance significance”.
- The “mark as homopolymer” parameters were removed from the wizard and made into an annotation.
- A “Maximum frequency” parameter has been added to the “Pyro filter”.
- Added a parameter to the “Direction and position filters” parameters in the “Noise filters” wizard step.
- Two new columns called “Read count” and “Read coverage” were added to the variant table. These consider “reads” rather than fragments.
- The way the filters are applied has been changed, so that, in the first iteration they are only applied at half their values, and only after joining are they applied at their full value.
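The two multiple-testing corrections added to Fisher’s Exact Test in the list above can be illustrated with a short sketch. The release note names only "Bonferroni correction" and "False Discovery Rate (FDR)"; using the Benjamini-Hochberg procedure for the FDR is an assumption, and the function names and p-values below are made up for illustration rather than taken from the Workbench.

```python
# Hypothetical sketch of p-value correction for multiple testing.
# Benjamini-Hochberg is assumed for the FDR; the release note does not name a procedure.

def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


p_values = [0.01, 0.04, 0.03, 0.005]  # made-up example values
print([round(p, 4) for p in bonferroni(p_values)])
print([round(p, 4) for p in benjamini_hochberg(p_values)])
```

Bonferroni controls the chance of any false positive and is the stricter of the two; an FDR procedure controls the expected share of false positives among the reported results, so its adjusted values are never larger than the Bonferroni ones.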
Bug fixes
- Translation of CDS annotations to protein sequences was wrong in cases where the reading frame was not +1 or -1 in CDS annotations imported from ENSEMBL. This error affected the Translate to Protein tool, translation functionality in sequence viewers and their context menus, as well as the Amino Acid Consequences (AAC) variant annotation tool. We highly recommend redoing the AAC analysis for correct variant annotation, as CDS tracks typically are created from ENSEMBL data.
- A bug has been fixed in the Local Realignment tool. The bug materializes in extremely rare cases when applying the variant callers on locally realigned RNA-seq mappings with spliced reads. On these mappings, local realignment could generate invalid spliced reads (after local realignment, you could have spliced reads with segments that overlapped).
- Fixed rare error when sorting experiment tables.
Changes
- The start-up background canvas has been updated with the new RNA-seq analysis options.
- The concepts “Secondary Analysis”, “Tertiary Analysis”, and “Combined secondary and tertiary analysis” have been changed to “Data analysis”, “Interpretation”, and “Data analysis and interpretation”, respectively.
CLC Cancer Research Workbench 1.0.1
Release date: June 16, 2014
New features
Bug fixes
- Fixed: When several tracks were available as reference data for a workflow, it was not possible to change the selection after the initial run of the workflow.
CLC Cancer Research Workbench 1.0
Release date: April 7, 2014