Latest improvements for Biomedical Genomics Workbench
Biomedical Genomics Workbench 4.1.3
Release date: 2018-12-06
Bug fixes
- Fixed a bug where the “Unaligned end” field provided in the Breakpoint track output of the Indel and Structural Variants tool was left blank when the value should have been “Mixed consensus” on all but one chromosome. The field is now filled for all chromosomes.
- Fixed an issue with the Low Frequency Variant Detection and Fixed Ploidy Variant Detection tools that caused a small minority of variants to go unreported under certain conditions expected to arise rarely.
- Fixed a concurrency bug in the Copy Number Variant Detection tool, which very rarely resulted in the tool reporting all low-coverage targets on one or more chromosomes as false positive deletions.
- Fixed the links to the AmiGO Gene Ontology website used for GO annotations.
- Fixed a bug where in some cases, the Search for Reads in SRA… tool would not fetch the final page of results.
- Fixed an issue where the RNA-Seq Analysis tool would sometimes generate TE tracks that could not be used in downstream tools. The error occurred when the “Calculate expression for genes without transcripts” option was used on a gene track where two genes had the same name, one of the genes contained the other, and neither gene had a transcript.
- Fixed an issue where the RNA-Seq Analysis tool would show an error if the first chromosome or contig contained no transcripts and the “Calculate expression for genes without transcripts” option was used.
Biomedical Genomics Workbench 4.1.2
Release date: 2017-12-05
Improvements and changes
RNA-seq analysis:
- Fixed a bug in the RNA-Seq Analysis tool where, when run in “Genes and transcripts” mode, and using “Total counts” as Expression value, the expression values reported for GE tracks would not include shared exon counts. Downstream analyses based on the Set Up Experiment tool could be affected by this issue. Using affected GE tracks as input to the following tools would *not* affect their results: Differential Expression for RNA-Seq, Create Heat Map for RNA-Seq and PCA for RNA-Seq.
- The behavior of the RNA-Seq Analysis tool has been changed when the option “Genome annotated with genes and transcripts” is used together with the option “Calculate expression for genes without transcripts”.
- The counts of genes without transcripts are calculated. Previously only the TPM and RPKM were calculated.
- For a gene without a corresponding transcript, where that gene is overlapped by the intron of another gene, reads aligning to this region are counted towards the expression of the gene without the transcript. Previously such reads were counted as belonging to the intronic region of the overlapping gene.
- A single-exon transcript for each gene without transcripts is now added to the output TE track.
General:
- Fixed an issue where the number of input samples to the Map Reads to Reference and Map Reads to Contigs tools would be silently limited to 120. Execution is now aborted with a warning message. Each analysis must be started with at most 120 samples.
- Improved the information about what to do when a workflow needs to be updated.
Bug fixes
- Fixed an issue with the mapping tool in the Workbench, which is used in tools involving a mapping stage, such as Map Reads to References and RNA-Seq Analysis, where length and similarity fraction cut-offs in some cases were ignored for reads longer than 500bp.
- Fixed a bug in the Add Information about Amino Acid Changes tool where the CDS reference was used instead of the RNA reference when annotating coding region changes if the RNA and CDS annotations could not be matched. This could result in variants in UTR regions not being reported. The matching has been improved by supporting the ‘parent’ field used by the GFF3 file format to pair CDS and RNA references.
- Fixed an issue where the option to “Highlight reverse paired reads” in the side panel of a reads track would cause paired end reads to be colored incorrectly if the reads completely overlapped, as would happen in the case of adapter read-through.
- Fixed an issue with the InDels and Structural Variants tool where duplicate breakpoints and variants were reported if reads mapping as broken pairs were included in the analysis.
- Fixed an issue with the InDels and Structural Variants tool that caused it to crash if it encountered a particular set of conditions relating to reads with deletions.
- Fixed an issue where the Low Frequency Variant Detection tool could return NaN for the Probability value in rare instances for small datasets.
- Various minor bugfixes
Advanced notice
Support for SOLiD colorspace data will be phased out over the next 12 months. If you are concerned about the proposed change, please contact our Support team (AdvancedGenomicsSupport@qiagen.com).
Biomedical Genomics Workbench 4.1.1
Release date: 2017-06-22
Bug fixes
- Fixed an issue introduced in the Biomedical Genomics Workbench 4.1 where workflows containing the Copy Number Variant Detection tool could not be updated automatically.
Advanced notice
Support for SOLiD colorspace data will be phased out over the next 18 months. If you are concerned about the proposed change, please contact our Support team (AdvancedGenomicsSupport@qiagen.com).
Biomedical Genomics Workbench 4.1.0
Release date: 2017-06-15
New features
- Added keyboard shortcuts to change editor views. Ctrl + Shift + PageUp and Ctrl + Shift + PageDown now change the current view of the currently focused editor.
- New keyboard shortcuts are available for navigation within the workbench:
- Navigate between open tabs with Ctrl + Page Down and Ctrl + Page Up (Windows/Linux/Mac). On laptops without Page Up/Down keys, the shortcuts are Ctrl+fn+arrow up/down.
- Return focus to the navigation area with Alt + Home.
- We have made the following improvements to tab presentation in the View area of the Workbench:
- Tabs show more of the name of the opened object.
- Tabs now open from the top left corner to the right and down.
- Tabs always stay in the same position when another tab is selected or a new tab is opened.
- A new sub menu has been added to the right click menu on tabs to select between the open tabs.
- Anonymous Workbench usage information can now be shared with us to help us improve our products and offerings. Information about what is collected and how to opt out is provided when the updated Workbench is launched. Further details are available in the manual.
Improvements
- New and improved Save View Settings dialog. This new dialog can be used for saving, applying, importing and exporting side panel views.
- When importing tracks, the history of the track now contains the full path name of the imported file.
- Improvement to the PCA plot generated by the PCA for RNA-Seq tool, so that all points are visible with default side panel view settings. Previously the standard view settings could hide points with missing metadata.
- Stability improvements to SRA search:
- Fixed an issue that could cause the SRA search to prematurely time out with the message “java.net.SocketTimeoutException”.
- Fixed an issue that could cause the SRA search view to display an error when trying to show results.
- Improved messaging when installing workflows on a QIAGEN CLC Genomics Server from the Workbench.
Bug fixes
- Fixed an issue with the Basic Variant Detection, Low Frequency Variant Detection and Fixed Ploidy Variant Detection tools that could cause the count and frequency values to be too low for a small subset of those variants that are contained within a larger variant region (e.g. an MNV or deletion). For a variant to be affected by this problem, there needed to be at least two other potential variants nearby that were disregarded during the variant calling process. This circumstance and our testing suggest this is a rare issue.
- Fixed a bug in the Copy Number Variation Detection tool where the target-level output could not be produced unless a gene track was also specified.
- Fixed an issue where switching to the Heat Map view on an Experiment would give an error when no Heat Map existed.
- Fixed an issue in the Copy Number Variation Detection tool where the data in the “Fold-change (adjusted)” and “Fold-change (raw)” columns were reversed in the target-level CNV output.
- Fixed a bug that in some cases would result in incorrect BaseQRankSum values being reported in the outputs of the Basic Variant Detection, Low Frequency Variant Detection and Fixed Ploidy Variant Detection tools.
- Fixed an issue where the GFF3 Exporter could generate invalid GFF3 for features of length 0.
- Fixed an issue where workflows run in batch mode would fail in the case where no results are saved to the Navigation Area and only one file is exported per batch unit.
Advanced notice
- Support for SOLiD colorspace data will be phased out over the next 18 months. If you are concerned about the proposed change, please contact our Support team (AdvancedGenomicsSupport@qiagen.com).
Biomedical Genomics Workbench 4.0
Release date: 2017-03-02
*Tools marked with an asterisk were available to earlier Workbench versions via the Advanced RNA-Seq plugin. They can now be found in the Toolbox in the RNA-Seq Analysis folder.
These tools automatically account for differences due to sequencing depth, removing the need to normalize input data. They work with existing RNA-seq TE and GE tracks. Changes made in this release mean that outputs from the Differential Expression for RNA-Seq tool can now be used as inputs to the Extract Reads Based on Overlap tool.
The three workflows Identify and Annotate Differentially Expressed Genes and Pathways for human, mouse, and rat have been replaced by three new workflows of the same names. The new workflows benefit from the inclusion of new RNA-seq tools. (See the New Tools section.)
RNA-Seq Analysis
- The RNA-Seq Analysis tool now supports RNA spike-ins, such as ERCC and SIRV, for quality control. This makes it possible to validate RNA-Seq experiments by comparing known spike-in concentrations to measured transcript concentrations. Spike-ins can be imported using the new RNA Spike-ins Import tool.
- The RNA-Seq Analysis report has been revised and updated:
- We now show the distribution of the biotypes that the reads mapped to.
- The strand specificity of the mapped reads is now reported. Ready-to-use workflows listed under the “Whole Transcriptome Sequencing” folder of the Workbench Toolbox now support strand-specific RNA-seq protocols by allowing the “Strand Specific” parameter to be set.
- Transcript coverage plots make it possible to detect and visualize 5′ and 3′ coverage bias.
- For paired-end reads, we now detect and warn about potential adapter read-through.
- A biotype column is now available in the Expression Track tables produced by the RNA-Seq Analysis tool, when biotype information is available.
- The Mapping options of the RNA-Seq Analysis tool, “Map to gene regions only” and “Also map to inter-genic regions”, have been removed. The tool now runs by mapping reads to the full reference supplied, which is equivalent to choosing the recommended “Also map to inter-genic regions” option in earlier versions.
- The RNA-Seq Analysis tool now always uses the “Expression level” option “Use EM estimation (recommended)” to quantify expression. This is more accurate than the previous default option. Differences are especially noticeable for Transcript Expression (TE) tracks.
- The RNA-Seq Analysis quantification by EM estimation now runs faster.
- In RNA-Seq analyses, reads that map uniquely to a genome position are now always marked as unique. Previously, a uniquely mapped read would be marked as ambiguous if it mapped to a position with multiple overlapping genes.
- Exon IDs will no longer be included in the ENSEMBL column of transcript expression (TE) tracks generated by the RNA-Seq Analysis tool. Gene and transcript names will continue to be listed and hyperlinked in this column.
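The spike-in comparison mentioned above (known spike-in concentrations against measured transcript concentrations) can be illustrated with a small sketch. It assumes the comparison is summarized as a Pearson correlation of log2-transformed values with a pseudocount of 1. That is one common choice and is not necessarily what the RNA-Seq Analysis tool computes. The class name, method names and input values are hypothetical.

```java
class SpikeInCorrelationSketch {
    // log2(v + 1) for each value; the pseudocount of 1 is an assumption
    // made here so that zero measurements stay finite.
    static double[] log2PlusOne(double[] values) {
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = Math.log(values[i] + 1.0) / Math.log(2.0);
        }
        return out;
    }

    // Pearson correlation coefficient of two equally long arrays.
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two arrays of equal length >= 2");
        }
        int n = x.length;
        double meanX = 0.0, meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;
        double sxy = 0.0, sxx = 0.0, syy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    public static void main(String[] args) {
        // Hypothetical values, for illustration only (not real spike-in data).
        double[] knownConcentration = {0.5, 2.0, 8.0, 32.0, 128.0};
        double[] measuredExpression = {0.4, 2.5, 7.1, 35.0, 120.0};
        double r = pearson(log2PlusOne(knownConcentration), log2PlusOne(measuredExpression));
        System.out.println("Correlation on log2 scale: " + r);
    }
}
```

A value close to 1 would indicate that measured expression tracks the known concentrations well. The report produced by the Workbench remains the authoritative source for the actual quality metrics.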
Import/Export
- A tool to import PacBio data is now available at Import | PacBio.
- Usability aspects of data association using the Import Metadata tool have been improved, including adding a preview of data items to be associated with particular metadata rows.
- Fasta is now the default format the first time the Import | Tracks tool is invoked (was GFF2/GTF/GVF in earlier versions).
- The GFF2/GTF/GVF tracks importer can no longer be used to import GFF3 format files. The new GFF3 tracks importer should be used for this instead.
- The GFF3 importer has been updated with respect to the handling of CDS features. In earlier versions, CDSs with different IDs but the same parent gene would always be merged into the same CDS feature during import. This behavior will still occur in cases where all CDSs in the GFF3 file either have unique IDs or no IDs. For GFF3 files containing any CDSs with identical IDs, only CDSs with the same ID are merged into a single feature.
- The Import | Tracks tool now accepts files with a .fna extension.
- The display of the types of files to import using the Import | Tracks tool has been improved.
- The speed of importing to tracks where the original file contains data relating to many chromosomes has been substantially improved.
- RNA tracks imported from GFF3 format files are now colored according to their biotype.
- The Cosmic option of the Import | Tracks tool is now more flexible with regard to the column headings in the files being imported.
- An exporter has been added to export annotations on sequences or tracks to Generic Feature Format Version 3 (GFF3) format.
- A text exporter has been added.
- An option has been added to create an index file when exporting to BAM format.
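The CDS merging rule described above for the GFF3 importer can be sketched as follows. This is a minimal illustration of the rule as worded in these notes, not the importer's actual code. The class CdsMergeSketch, its CdsRow type and the group key format are hypothetical. The handling of ID-less rows in a file that also contains shared IDs is not specified in the notes, so the sketch simply keeps each such row separate.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CdsMergeSketch {
    // Hypothetical minimal view of one CDS row: ID attribute (may be null),
    // Parent attribute, and coordinates.
    static final class CdsRow {
        final String id;
        final String parent;
        final int start;
        final int end;

        CdsRow(String id, String parent, int start, int end) {
            this.id = id;
            this.parent = parent;
            this.start = start;
            this.end = end;
        }
    }

    // True if at least two rows carry the same non-null ID.
    static boolean anySharedId(List<CdsRow> rows) {
        Set<String> seen = new HashSet<>();
        for (CdsRow row : rows) {
            if (row.id != null && !seen.add(row.id)) {
                return true;
            }
        }
        return false;
    }

    // Groups rows into the features they would be merged into.
    static Map<String, List<CdsRow>> groupForMerge(List<CdsRow> rows) {
        boolean mergeById = anySharedId(rows);
        Map<String, List<CdsRow>> groups = new LinkedHashMap<>();
        int anonymous = 0;
        for (CdsRow row : rows) {
            String key;
            if (!mergeById) {
                key = "parent:" + row.parent;   // all IDs unique or absent: merge by parent
            } else if (row.id != null) {
                key = "id:" + row.id;           // shared IDs present: merge by ID only
            } else {
                key = "row:" + (anonymous++);   // assumption: ID-less rows stay separate
            }
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(row);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<CdsRow> uniqueIds = List.of(
                new CdsRow("cds1", "gene1", 1, 10),
                new CdsRow("cds2", "gene1", 20, 30));
        List<CdsRow> sharedIds = List.of(
                new CdsRow("cds1", "gene1", 1, 10),
                new CdsRow("cds1", "gene1", 20, 30),
                new CdsRow("cds2", "gene1", 5, 30));
        System.out.println(groupForMerge(uniqueIds).keySet());
        System.out.println(groupForMerge(sharedIds).keySet());
    }
}
```

In the first input no ID is repeated, so both rows fall into one group keyed by their parent. In the second input "cds1" appears twice, so rows are grouped strictly by ID.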
New features and improvements
- Two new human reference data sets are available for download from the Reference Data Manager. One is based on Ensembl 86 and the other is based on RefSeq GRCh38.p9.
- The former top-level Toolbox folder “Expression Analysis” has been removed and the expression analysis tools are now in two top-level folders: “RNA-Seq Analysis” and “Microarray and Small RNA Analysis”.
- When working with Gene Sets that refer to Gene Ontology terms, gene annotations are now automatically propagated to parent Gene Ontology terms. This improvement affects the tools: Identify Differentially Expressed Gene Groups and Pathways, Hypergeometric Tests on Annotations and Gene Set Enrichment Analysis (GSEA).
- The mapping tool in the Workbench, which is used in tools involving a mapping stage, such as Map Reads to References, Map Reads to Contigs and RNA-Seq Analysis, has been updated. The update includes improved read mapping quality and speed (especially for longer reads), improved memory performance for the index building stage, and various minor bug fixes. The new mapping tool corresponds to the clc_mapper tool included in Assembly Cell 5.0.3, planned for release in March 2017.
- Fixed an issue where sequence circularity was not reported in the output from the Map Reads to References tool.
- The default value for the parameter “Maximum guidance-variant length” in the Local Realignment tool has been changed to 200 (was 100). This change applies to all ready-to-use workflows and when the tool is launched directly.
- The Basic Variant Detection tool will no longer report N as an alternative allele when there is an ambiguous base at a variant position.
- Default values for two parameters of the InDels and Structural Variants tool have been changed when the tool is run as part of a ready-to-use workflow: “Minimum quality score” has been changed to 20 (was 0), and “Minimum consensus coverage” has been changed to 0.1 (was 0.0). Default values have not been changed in the case where the tool is launched directly.
- The report generated by the tool QC for Target Sequencing now includes a “≥” sign instead of a “>” sign.
- The “Additional Reporting” options in the QC for Sequencing Reads tool, “Quality analysis” and “Over-representation analysis”, have been removed. These outputs are now generated by default.
- A PubMed search option has been added to the Search for Reads in SRA tool. This returns only those runs that are associated with a PubMed abstract or full-text article.
- Support has been added for ‘negative lookahead’ in Java regular expressions used with the Motif Search tool.
- For new or existing sequence lists the sequencing platform can now be specified via the Read Group setting of the Element Info view.
- It is now possible to right-click on a table cell and filter table rows based on the value of that cell by choosing options under the new context menu section called “Table filters”. This change applies to all tables where advanced filtering is available.
- The speed of sorting and loading tracks has been greatly improved. Due to these changes, tracks created with this or later versions of the Workbench cannot be used with older Workbenches. Backwards compatibility has been maintained: tracks created using older versions of the Workbench can continue to be used.
- The speed of searches for data elements with associations to specified metadata, from within a Metadata Table, has been greatly improved. To enable metadata-related searches to work after upgrading to Biomedical Genomics Workbench 4.0, indices for the locations containing the relevant data will need to be rebuilt.
- Tutorial windows are no longer blocked when a wizard is open.
- Less temporary space is now consumed when downloading data via the Reference Data Manager.
- Various minor improvements
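For readers unfamiliar with the ‘negative lookahead’ construct mentioned above for the Motif Search tool, the following sketch shows how it behaves in standard Java regular expressions (java.util.regex). The motif "GAT not immediately followed by C" and the input sequence are hypothetical examples, and the sketch does not show how the Workbench applies patterns internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NegativeLookaheadSketch {
    // (?!C) is a negative lookahead: it succeeds only if the next character
    // is not C, and it consumes no characters itself.
    static final Pattern MOTIF = Pattern.compile("GAT(?!C)");

    // Returns the 0-based start offsets of every match in the sequence.
    static List<Integer> matchStarts(String sequence) {
        List<Integer> starts = new ArrayList<>();
        Matcher matcher = MOTIF.matcher(sequence);
        while (matcher.find()) {
            starts.add(matcher.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        // GAT at offset 0 is followed by C and is skipped;
        // GAT at offsets 4 and 8 is followed by A and T respectively.
        System.out.println(matchStarts("GATCGATAGATT")); // prints [4, 8]
    }
}
```

Because the lookahead consumes nothing, the reported match is still the three characters GAT. A GAT at the very end of a sequence also matches, since no C follows it.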
Bug fixes and changes
- In all Ready-to-Use workflows containing the tool Map Reads to Reference, the default value for the parameter “Cost of insertions and deletions” has been changed to “affine” (it used to be “linear”). Default values have not been changed in the case where the tool is launched directly.
- Fixed an issue where the index building stage of the Map Reads to References tool was not taking into account the maxcores setting in the cpu.properties file, where this had been configured.
- Fixed a bug in the QC for Read Mapping tool, which sometimes reported incorrect read counts for circular sequences.
- Fixed an issue where the Basic Variant Detection, Low Frequency Variant Detection and Fixed Ploidy Variant Detection tools reported homozygous reference insertions in cases where a heterozygous variant was possible but the insertion variant was disregarded during filtering.
- Fixed an issue where the Identify Known Mutations from Sample Mappings tool would fail if it was part of a workflow and received multiple sample mappings as input.
- Fixed an issue with the Annotation Table view of a sequence where it was possible to change the types of annotations displayed at the same time as an annotation was being edited, which could lead to an error being thrown or the wrong annotation being changed.
- Fixed an issue with GenBank and EMBL exports where quoting specifications were not being conformed to.
- Fixed an issue with Primer Tables where an error resulted if either the option “Save Primer(s) Fwd, Rev” or “Save Fragment” was chosen and then the save operation was stopped by clicking on the Cancel button.
- Fixed an issue where in some cases filtering tables for empty values would not produce any results.
- Fixed an issue where advanced filtering did not work when looking for rows with cells containing multiple values using the filtering term “=” (equals).
- Fixed an issue where a workflow containing an export step that failed did not provide any indication that a problem had occurred.
- A sporadic Java issue that led to errors including the text “java.lang.ClassCastException: sun.awt.image.BufImgSurfaceData cannot be cast to sun.java2d.xr.XRSurfaceData” has been addressed through an upgrade to Java. This issue was primarily seen when using the Workbench remotely on Linux systems.
- Fixed a problem with the identification of the correct sequence types from MLST schemes in cases where the schemes contained blank characters. This issue affected workbenches with QIAGEN CLC Microbial Genomics Module installed.
- Various minor bugfixes.
Retirement
- The GFF exporter has been retired and is no longer available. The new GFF3 exporter should be used instead.
- The Probabilistic Variant Detection (legacy) and Quality-based Variant Detection (legacy) tools have been retired and have been removed from the Legacy folder of the Toolbox.
- Tools in the Expression Profiling by Tags folder under the Toolbox | Legacy area have been retired and this folder has been removed. The tools retired are Extract and Count Tags, Create Virtual Tag List and Annotate Tag Experiment.
- The tool Trim Primers of Mapped Reads has been retired and has been removed from the Toolbox. For trimming primers from mapped reads, please use the Trim Primers and their Dimers from Mapping tool, which is distributed with the “QIAGEN GeneRead Panel Analysis Plugin”.
Plugin notes
- The Advanced RNA-Seq plugin has been retired. The tools from this plugin have been integrated into the software. Please see the New Tools for RNA-Seq section for more details.
Other notifications
- An option to opt out of providing anonymous usage information to QIAGEN has been added to the Workbench Preferences. We are not yet collecting any usage information so opting in or out does not have any effect at this time.