CLC Cancer Research Workbench 2.0

Release date: February 24, 2015

Major highlights

ChIP-seq data analysis is now available. The tools for ChIP-seq analysis are found in the toolbox under "Epigenomics Analysis".
A highly sensitive Copy Number Variant Detection tool is available for targeted amplicon and exome sequencing data via the Plugin Manager.
A new display of mutations at the amino acid level is available in the Genome Browser View.
The tool Link Variants to 3D Protein Structure can help you investigate why a variant is, or could be, pathogenic by visualizing the effect of the mutation on the protein 3D structure.

Detailed descriptions

New tools:

Merge Read Mappings. This tool can be used to merge two read mappings if you have performed two mappings with the same reference sequences.
Extract Reads Based on Overlap. This tool can be used to extract subsets of reads based on annotations.
On Gaussian Data. The Gaussian-based test tool (t-test and ANOVA) has been included.
Cloning tools. A folder has been added to the toolbox with tools for cloning and restriction site analysis.
ChIP-Seq Analysis. The tool for analysis of ChIP sequencing experiments is found under "Epigenomics Analysis". The tool identifies genomic regions with significantly enriched read coverage and a read distribution with a characteristic shape.
Annotate with Nearby Gene Information. This tool is part of the "Epigenomics Analysis" and can be used to create a copy of the annotation track used as input and add information about nearby genes.
- The Map Reads to Reference tool now supports both linear gap cost parameters and affine gap cost parameters. The addition of affine gap cost support allows you to get more accurate results for reads with stretches of insertions or deletions.
- The read mapper used in the RNA-Seq Analysis tool has been upgraded to use the affine gap cost read mapper described above. This upgrade enables you to run RNA-seq analysis with as little as 6 GB RAM and at the same time improves your end results. See our blog post for a further review.
- The tool Add Information about Amino Acid Changes has been expanded with an extra output that makes it possible to visualize amino acid changes in track format. The amino acid color schemes can be changed in the Side Panel under "Track layout" and "Amino acids track".
- Chromosome bands/cytogenetic ideograms can now be downloaded to the Workbench via the Data Management. The ideogram can be added to Genome Browser Views to get a better overview of the data.
- Link Variants to 3D Protein Structure makes it possible to visualize amino acid changes on 3D protein structures. After running the tool on a variant table, variants can be visualized on 3D structures. 3D models are automatically built using structural templates from the PDB. The new tool can be found under 'Tools | Add Information to Variants | Link Variants to 3D Protein Structure'.
- The "Identify Known Mutations from Sample Mappings" tool now supports ignoring broken reads and multi-match hits. The change in behavior is that conflicting broken paired reads now contribute to the overall "coverage" of the variant, but are still ignored in the counts.
- A 3D Molecule Viewer is now available for visualization of protein, RNA, DNA, and small molecules.
- Tracks:
  - Consistent output when enriching variant tracks and annotation tracks with extra table columns. Output tracks from these tools now have the same number of added table columns and the columns will always be in the same order. Previously, if an added column had empty values for all variant rows, it would have been removed from the final table, resulting in varying number and relative order of additional columns when multiple samples were processed with the same tools/workflows. All columns are retained now, facilitating downstream processing of exported tables, and providing immediate visual reference as to which enrichment/annotation tools have been applied, even if they did not produce any results for a particular sample.
  - Tables for variant tracks and annotation tracks can now sort and filter columns with cells containing multiple numbers.
  - Improved the track viewer for variant tracks to show the sequence alteration on the rendered variant.
  - Improved performance of creating variant tracks and annotation tracks.
  - Graph tracks now show negative values filled upwards to y=0 (as expected).
  - It is now possible to extract sequences from tracks. The sequence of interest can be selected by dragging the mouse over the region of interest followed by a right click on the reads and a click on Extract from selection.
- Workflows:
  - An extra optional output called "Create coverage graph", that shows the coverage in each position of the targets, has been added to the tool QC for Target Sequencing.
  - When installing a workflow in the workflow manager, the newly installed workflow is automatically selected.
  - When creating a new workflow installer, it is now possible to include reference data without bundling the reference data with the workflow. Instead the reference data can be included by pointing at the CLC_References directory.
  - The "Run" button in the workflow editor does not require a saved workflow anymore to be enabled.
  - In the execution wizard of a workflow the "Reset to default" button is now active.
  - All icons in the workflow editor are now on the left side.
  - Introduction of snippets: Parts of workflows can now be saved as a snippet and reused in other workflows.
  - Installed workflows: It is now possible to create a copy of an installed workflow and open the copy in the view area by clicking once and then right-clicking on the installed workflow in the toolbox. This brings up the option "Open Copy of Workflow".
- MA plots, scatter plots and histograms can now accept expression tracks as input.
- Increased decimals for numbers when exporting table to CSV, tab delimited text, and Excel.
- Improved reporting of errors related to low disk space.
- Batching: Processes tab and analysis execution logs now display batch names in addition to analysis names for enhanced clarity.
- The External Application Client Plugin is now available directly from the Workbench Plugin Manager.
- Multiple target region tracks for the "Indels and Structural variants" can now be specified.
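
The difference between the linear and affine gap cost models supported by Map Reads to Reference (see above) can be sketched in a few lines of Python. This is only an illustration: the function names and cost values are hypothetical and are not the Workbench defaults, and the affine convention used here (one open cost plus one extend cost per gap base) is one of several in common use.

```python
def linear_gap_cost(gap_length: int, per_base_cost: int = 3) -> int:
    """Linear model: every gap base costs the same amount."""
    return per_base_cost * gap_length


def affine_gap_cost(gap_length: int, open_cost: int = 6, extend_cost: int = 1) -> int:
    """Affine model: a one-time cost to open a gap plus a smaller cost per gap base.

    The cost values are illustrative assumptions, not Workbench defaults.
    """
    if gap_length == 0:
        return 0
    return open_cost + extend_cost * gap_length


# One contiguous 6-base deletion versus six scattered 1-base deletions.
one_long_gap_linear = linear_gap_cost(6)
six_short_gaps_linear = 6 * linear_gap_cost(1)
one_long_gap_affine = affine_gap_cost(6)
six_short_gaps_affine = 6 * affine_gap_cost(1)

print(one_long_gap_linear, six_short_gaps_linear)
print(one_long_gap_affine, six_short_gaps_affine)
```

Under the linear model the two alignments score the same, whereas the affine model prefers the single contiguous gap, which is why it tends to suit reads with stretches of insertions or deletions.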

Bug fixes

Fixed an error resulting in billions of reads being silently dropped when producing large read mappings against large counts of reference sequences. The error involves a read count overflow and the dropping of at least 2 billion reads per failure instance.
Fixed display problem in read mappings showing too many hidden insertions (as vertical black lines) in certain overlapping paired reads.
Fixed a problem where links and text in tables were cut off when they followed a link.
The tool Identify Graph Threshold Areas can now use negative values to define its threshold.
Workflows:
- In the workflow editor the "Reset to default" now always reverts to the right names.
- In the workflow editor the validation is now correctly triggered when changing the configuration of an input element.
- The workflow editor can now open workflows in which the graphical view of the workflow is corrupt.
- Fixed an exception which could occur during workflow migration.
- Data with the same name can now be bundled multiple times in a workflow installer.
- Previously when a plugin contained custom actions and a workflow, the workflow could not be installed. This has been fixed.
- Fixed problem with unlocked output names that previously could not be configured during execution of a workflow.
- A workflow with configured data from a server is now automatically validated when connected to the server (when opened in the editor). Previously the workflow had to be closed and reopened first.
- The original workflow file included in a workflow installer can now be exported directly without having to restart the workbench in advance.
A problem with saved table settings that sometimes did not work has been fixed. The bug fix includes a more robust/generic way of saving table settings with different columns. To fix this problem, existing saved table settings should first be loaded on an object where it works (i.e. has the same columns as when it was saved); and then the table settings should be saved with the old name to overwrite the settings.
Fixed an error that could cause batch processing to open all results rather than saving them.
Fixed problem with import of BED files using external applications.
SAM/BAM import will no longer fail for alignments with POS = 0, but instead import them as though they were unmapped.
Fixed problem going back in the wizard for the "Find Binding Site and Create Fragments" tool.
Fixed error occurring when removing an unsaved reads track from a track list.
Fixed error when showing protein translations of annotations shorter than 3 bases.
Fixed a bug in the Mapping Coverage exporter.
Fixed reads track read-count indicators (the numbers between the reads track and the box with the track's name and number of reads) that sometimes wrongly showed 0.
Small RNA Analysis -> Annotate and Merge Counts: When you choose to create a “grouped on mature” output, the small RNAs are grouped by both the 5’ and the 3’ mature sequences separately in the “grouped on mature” output. The column heading has therefore been changed to show "Mature" instead of "Mature 5'".
When using the RNA-Seq Analysis tool with the "One reference sequence per transcript" option, the "Maximum number of hits for a read" option was sometimes not taken into account for multi-hit reads. This has been fixed.
Two problems with the F1 help have been fixed: 1) when pressing F1 in a workflow tool wizard, more than one help window appeared, and 2) problems showing help when pressing the F1 key in tool wizards.
Add Information about Amino Acid Changes tool: In cases where an mRNA track does not overlap all annotations in the CDS track, "Coding Region Changes" were not added to variants that overlap a CDS but not an mRNA annotation. This has been fixed.
Variant callers and the "Add Information about Amino Acid Changes" tool: In cases where variants overlapped an mRNA annotation but not a CDS annotation, "Coding Region Changes" were not added to those variants. This has been fixed.
Fixed an error that in rare cases would prevent creation of tracks from references sequences.
Hypergeometric test on annotations: Fixed a rare error that occurred for some datasets containing annotations of the form: '1234 // abc'.
Fixed a bug for color space reads in the RNA-Seq Analysis tool that caused only exon-exon matches to be reported.
Addressed an issue where an XSQ file containing both base space and color space versions of the same reads was incorrectly imported into a single sequence list, resulting in each read appearing twice.
Fixed an issue with mapping of paired-end reads, where these were erroneously reported as broken pairs when the fragment size derived from the alignments of the two ends of the pair was longer than the reference sequence.
Fixed an issue where, when the option "Keep only selected annotations" in the "Remove information from variants" tool was selected, the Coverage, Count and Frequency columns did not appear in the output.
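
The SAM/BAM import change listed above (alignments with POS = 0 are imported as unmapped) can be illustrated with a minimal Python sketch. In the SAM format, POS is 1-based and the value 0 is used for reads without a coordinate, so treating such records as unmapped is consistent with the format. The type and function names below are hypothetical and do not reflect how the Workbench importer is implemented.

```python
from typing import NamedTuple, Optional


class Alignment(NamedTuple):
    read_name: str
    reference: Optional[str]
    position: Optional[int]  # 1-based leftmost position; None when unmapped
    mapped: bool


def parse_alignment(sam_line: str) -> Alignment:
    """Parse the first four SAM fields and classify the record as mapped or unmapped."""
    fields = sam_line.rstrip("\n").split("\t")
    read_name, flag, reference, position = fields[0], int(fields[1]), fields[2], int(fields[3])

    flagged_unmapped = bool(flag & 0x4)  # SAM flag bit 0x4: segment unmapped
    if flagged_unmapped or position == 0:
        # Assumption for this sketch: keep the read, but discard its placement.
        return Alignment(read_name, None, None, False)
    return Alignment(read_name, reference, position, True)


no_coordinate = parse_alignment("read1\t0\tchr1\t0\t0\t*\t*\t0\t0\tACGT\tIIII")
placed = parse_alignment("read2\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII")
print(no_coordinate.mapped, placed.mapped, placed.position)
```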

CLC Cancer Research Workbench 36

Release date: August 18, 2015

Fixed a read mapper bug that caused some reads to be incorrectly reported as unmapped when global alignment was selected.
Fixed a bug that caused the mapper to enter an infinite loop if a reference of length 0 was used.
Fixed a rare bug that sometimes made the read mapper halt prematurely when several seeds were identified at the same reference position.
Fixed a SOLiD NGS importer bug where import of very low quality, color space encoded paired-end sequence reads in fastq format could lead to paired sequence lists where the wrong reads are marked as pairs.
Fixed sort order for paired reads in SAM/BAM exports in high coverage regions.
The analysis/workflow execution system now handles search algorithms specially so that search results are not modified. This eliminates a host of concurrency issues.
Minor improvements in persistence.

CLC Cancer Research Workbench 1.5.5

Release date: June 18, 2015

Bug fixes

Fixed an issue introduced by a fix in the Cancer Research Workbench 1.5.4 restricting the use of the QC for Targeted Sequencing tool on tracks containing a larger number of nucleotides (>2147483647 bp) than could be supported for coverage table output. This check is no longer applied if coverage table output is not requested.
Fixed bug in which Local Realignment could produce an illegal read mapping. This only happened for RNA-data.
The variant caller will now fail if it encounters an illegal RNA read mapping. If the variant caller fails with such a message and was run on locally realigned data, we suggest re-running the local realignment to avoid the error.
Side panel option to show legends for a plot with more than 10 samples is now enabled.
Fixed saving different line colors in plots through the side panel.
Plots inside reports are now shown with their saved side panel settings.
The automated paired distance estimate can no longer exceed the maximum distance accepted by the read mapper (100,000 bp).
Fixed an error that occurred when hovering the mouse cursor over the edge of a read mapping.
Read-only folders are no longer offered as potential locations to save data bundled with a Workflow.

CLC Cancer Research Workbench 1.5.4

Release date: April 23, 2015

Bug fixes

The filtering option in the Extract Differentially Expressed Genes tool only considered the predicted fold-changes in the positive direction, so features that were reduced in expression were filtered out. This has now been fixed. The change also affects the workflow: "Identify and Annotate Differentially Expressed Genes and Pathways", as the tool is also included in this workflow.
When using the RNA-Seq Analysis tool with the "One reference sequence per transcript" option, the "Maximum number of hits for a read" option was sometimes not taken into account for multi-hit reads. This has been fixed.
Fixed an issue with mapping of paired-end reads, where these were erroneously reported as broken pairs when the fragment size derived from the alignments of the two ends of the pair was longer than the reference sequence.
Fixed an issue where, when the option "Keep only selected annotations" in the "Remove information from variants" tool was selected, the Coverage, Count and Frequency columns did not appear in the output.
Fixed a bug in the probabilistic variant caller that caused it to fail for certain input.
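
The fold-change filtering fix in Extract Differentially Expressed Genes (first item above) comes down to the difference between a one-sided and a two-sided threshold. The Python sketch below assumes a signed fold-change convention in which reduced expression is reported as a negative value (for example, -3.1 for a roughly threefold reduction); the function name, gene names, values and threshold are hypothetical.

```python
def passes_fold_change_filter(fold_change: float, min_abs_fold_change: float = 2.0) -> bool:
    """Two-sided filter: keep features that change strongly in either direction."""
    return abs(fold_change) >= min_abs_fold_change


# Hypothetical signed fold changes; negative values mean reduced expression.
fold_changes = {"GENE_A": 4.2, "GENE_B": -3.1, "GENE_C": 1.3, "GENE_D": -1.1}

# One-sided comparison: down-regulated features are lost.
one_sided = [gene for gene, fc in fold_changes.items() if fc >= 2.0]

# Two-sided comparison: both up- and down-regulated features are kept.
two_sided = [gene for gene, fc in fold_changes.items() if passes_fold_change_filter(fc)]

print(one_sided)
print(two_sided)
```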

CLC Cancer Research Workbench 1.5.3

Release date: February 17, 2015

Bug fixes

Fixed reads track read-count indicators (the numbers between the reads track and the box with the track's name and number of reads) that sometimes wrongly showed 0.
Fixed an error resulting in billions of reads being silently dropped when producing large read mappings against large counts of reference sequences. The error involves a read count overflow and the dropping of at least 2 billion reads per failure instance.
Small RNA Analysis -> Annotate and Merge Counts: When you choose to create a “grouped on mature” output, the small RNAs are grouped by both the 5’ and the 3’ mature sequences separately in the “grouped on mature” output. The column heading has therefore been changed to show "Mature" instead of "Mature 5'".
Two problems with the F1 help have been fixed: 1) when pressing F1 in a workflow tool wizard, more than one help window appeared, and 2) problems showing help when pressing the F1 key in tool wizards.
Add Information about Amino Acid Changes tool: In cases where an mRNA track does not overlap all annotations in the CDS track, "Coding Region Changes" were not added to variants that overlap a CDS but not an mRNA annotation. This has been fixed.
The Low Frequency Variant caller could end up in an infinite loop in certain corner cases. This is now fixed.
Hypergeometric test on annotations: Fixed a rare error that occurred for some datasets containing annotations of the form: '1234 // abc'.
Fixed a bug for color space reads in RNA-Seq Analysis that caused all exon-exon matches to be filtered away.
Fixed "Export Graphics" default save-as directory.
Fixed error when removing an unsaved reads track from a Genome Browser View.
Fixed display problem showing too many hidden insertions in certain overlapping paired reads.
Fixed a bug in the Mapping Coverage exporter.
Fixed problem with import of BED files using external applications.
Fixed problem going back in the wizard for the "Find Binding Site and Create Fragments" tool.
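
The read count overflow fixed in this release (see above) is not described in detail, but the kind of failure can be sketched. The example below assumes, purely for illustration, a signed 32-bit counter, whose largest value is 2,147,483,647; the release notes do not state the counter's actual width. Python integers do not overflow, so the wrap-around is simulated with a hypothetical helper.

```python
INT32_MAX = 2**31 - 1  # 2,147,483,647


def to_int32(value: int) -> int:
    """Wrap an integer the way a signed 32-bit two's complement counter would."""
    return (value + 2**31) % 2**32 - 2**31


# Counting one read past the maximum wraps to a large negative number.
wrapped_by_one = to_int32(INT32_MAX + 1)

# A mapping with three billion reads no longer fits in the counter.
wrapped_three_billion = to_int32(3_000_000_000)

print(wrapped_by_one)
print(wrapped_three_billion)
```

Keeping such counts in a wider integer type (64-bit, for example) avoids the wrap-around for any realistic number of reads.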

CLC Cancer Research Workbench 1.5.2

Release date: November 12, 2014

On update of the QIAGEN GeneRead Panel analysis plugin, the plugin would not update the workflow it comes with. This has been fixed.

CLC Cancer Research Workbench 1.5.1

Release date: October 28, 2014

New features and improvements

RNA-Seq Analysis: The ENSEMBL gene id of each gene, where available, has been added as an additional column to the gene expression track output.
It is now possible to run a workflow without an optional input.

Bug fixes

The AAC tool did not annotate variants in 3' UTR with their DNA-level change using the HGVS c.xxx format. This affects any analysis done with Gx 7.5 or earlier based on ENSEMBL CDS tracks from older versions. The AAC analysis should be redone using Gx 7.5.1 for correct annotation. Important: Please also check the description in the CRWB 7.5 release notes of a bug fix in the translation of CDS annotations to protein sequences that was wrong in cases where the reading frame was not +1 or -1 in CDS annotations imported from ENSEMBL.
A bug has been fixed in the Set Up Experiment tool. Exon-related expression values can now only be selected when present in the individual samples.
When creating a subset of a paired experiment, the sub-experiment no longer appeared as being paired. This bug has been fixed and sub-experiments created in previous versions should recover the pairing information when accessed with this version of the workbench.
Fixed problem importing VCF files using the AO and RO genotype field.
Fixed problem importing certain VCF files.
Fixed problem with scrolling to the relevant files when selecting objects as parameters in tool wizards.
Fixed a bug in the Annotate and Merge Counts tool that in rare cases resulted in incorrect sorting and crash.
Fixed problem with import of read mappings with supplementary alignments. When importing read mappings with supplementary alignments, supplementary alignments are not imported. Previously import of such read mappings caused import errors.
Fixed rare problem with coverage that could occur in zoomed out reads tracks containing wrapped paired reads.
Fixed rare error when sorting experiment tables.

CLC Cancer Research Workbench 1.5

Release date: August 28, 2014

New features and improvements

New tools:

It is now possible to analyze RNA-Seq data in CLC Cancer Research Workbench. A new set of ready-to-use workflows has been added to the toolbox under "Ready-to-Use Workflows" -> "Whole Transcriptome Sequencing":
- Annotate Variants (WTS).
- Compare Variants in DNA and RNA.
- Identify Candidate Variants and Genes from Tumor Normal Pair.
- Identify Differentially Expressed Genes Across Samples.
- Identify Variants and Add Expression values.
Transcriptomics Analysis tools: A new folder called "Transcriptomics Analysis" has been added to the toolbox under "Tools". This folder holds a range of different tools that can be used for the analysis of RNA-seq data.
Add Fold Changes is a new tool found under "Tools" in the folder "Add Information to Variants"...
Identify Differentially Expressed Gene Groups and Pathways is a new tool found under "Tools" in the folder "Identify Candidate Genes". The tool can be used to investigate candidate differentially expressed genes for a common functional role.

The ready-to-use workflows that perform read mapping and local realignment now also include detection of larger insertions and deletions.
NGS Importers are now enabled for workflows.
A new folder called “Legacy tools” has been added to the toolbox. The "Probabilistic Variant Detection" and the "Quality Based Variant Detection" tools have been moved to this folder, as the three variant detectors Basic Variant Detection, Fixed Ploidy Variant Detection, and Low Frequency Variant Detection have been moved out of beta.
CLC Cancer Research Workbench can now be connected to a server with a Cancer Research Server Plugin, which is an add-on to the Genomics Server. In addition to the Genomics Server license you need a license for the CLC Cancer Research Server Plugin. This enables cancer related tools on the GxServer.
The Search Editor is now capable of filtering on "Path".
Zoom tools redesign: The “zoom to selection” feature is now also available for sequences, sequence lists, alignments and read mappings.
The tracks info panel, with track names in the left side of the track, now wraps information instead of showing a scroll bar.
Saving/applying side-panel settings for tables now works for different tables that share some columns.
Graph Tracks can now be exported to Wiggle file; the span option is now supported in the Wiggle import.
SAM/BAM import. It is now possible to choose to ignore unmapped reads when importing SAM/BAM files.
Fisher's Exact Test. Added the following options for correction of p-value for multiple testing: Bonferroni correction and False Discovery Rate (FDR).
Speedup: newly created expression tracks will display the graph faster.
Copy operations can now be stopped.
Import of Example Data and imports done through dragging files into the workbench and dropping them in the Navigation Area will no longer block the user interface while executing. Instead, the import happens as a background process that can be monitored and controlled via the Processes tab in the lower left corner.
CLC workbenches now support high resolution displays, such as Apple Retina displays, for all data shown in the View Area (including tooltips).
Data Management for References in the ready-to-use workflows has undergone a small change, making it no longer modal, and adding two new statuses for detecting if reference data is inconsistent (e.g. not fully downloaded), or if different workflows use different versions. If you need to delete (suspected inconsistent) data, you can now do that from the Data Management as well. Furthermore, you can now see where new data is available (local and/or server), and it can be downloaded to both locations at the same time.
Configuring ready-to-use workflows with your own reference data now helps you select data of the right type.
The Data Management notification to download new references can now be dismissed (until next time something gets updated, at which point you can dismiss it again).
Advanced filtering on tables now includes the option to filter for a space-, comma- or semicolon-delimited list of terms.
Improved error messages due to low disk space.
Improved variant callers:
- The value that the "Read position filter" operates on is now added as an annotation.
- Created a column on variants which contains "Forward/reverse read imbalance significance".
- The "mark as homopolymer" parameters were removed from the wizard and made into an annotation.
- A "Maximum frequency" parameter has been added to the "Pyro filter".
- Added a parameter to the "Direction and position filters" parameters in the "Noise filters" wizard step.
- Two new columns called "Read count" and "Read coverage" were added to the variant table. These consider reads rather than fragments.
- The way the filters are applied has been changed, so that, in the first iteration they are only applied at half their values, and only after joining are they applied at their full value.
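
The two multiple-testing corrections added to Fisher's Exact Test (listed above) can be sketched as follows. The release notes only name "Bonferroni correction" and "False Discovery Rate (FDR)"; the FDR function below uses the Benjamini-Hochberg procedure, which is an assumption about the method, and the function names and p-values are hypothetical.

```python
from typing import List, Sequence


def bonferroni(p_values: Sequence[float]) -> List[float]:
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p-value
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p * m / rank over all equal or larger ranks.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


p_values = [0.01, 0.04, 0.03, 0.005]
print(bonferroni(p_values))
print(benjamini_hochberg(p_values))
```

Bonferroni controls the chance of any false positive and is the more conservative of the two; FDR-style corrections control the expected proportion of false positives among the reported results.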

Bug fixes

Translation of CDS annotations to protein sequences was wrong in cases where the reading frame was not +1 or -1 in CDS annotations imported from ENSEMBL. This error affected the Translate to Protein tool, translation functionality in sequence viewers and their context menus, as well as the Amino Acid Consequences (AAC) variant annotation tool. We highly recommend redoing the AAC analysis for correct variant annotation, as CDS tracks typically are created from ENSEMBL data.
A bug has been fixed in the Local Realignment tool. The bug occurs in extremely rare cases when applying the variant callers on locally realigned RNA-seq mappings with spliced reads. On these mappings, local realignment could generate invalid spliced reads (after local realignment, you could have spliced reads with segments that overlapped).
Fixed rare error when sorting experiment tables.

Changes

The start-up background canvas has been updated with the new RNA-seq analysis options.
The concepts "Secondary Analysis", "Tertiary Analysis", and "Combined secondary and tertiary analysis" have been changed to "Data analysis", "Interpretation", and "Data analysis and interpretation", respectively.

CLC Cancer Research Workbench 1.0.1

Release date: June 16, 2014

New features

A new plugin for use with QIAGEN GeneRead panels is now available. Go to Help -> Plugins to download.

Bug fixes

Fixed: When several tracks were available as reference data for a workflow, it was not possible to change the selection after the initial run of the workflow.

CLC Cancer Research Workbench 1.0

Release date: April 7, 2014
