Latest improvements for QIAGEN CLC Genomics Server
QIAGEN CLC Genomics Server 23.0.5
Shared with workbenches
Improvements
- Detect and Refine Fusion Genes has a new option allowing fusions of overlapping genes on opposite strands to be reported.
- Previously, when annotation tracks were exported to BED format files, the Score column in the exported file contained only 0 values. Now, if the annotation track contains a Score column, those values are reported in the Score column of the exported file. (This does not affect expression tracks, where the expression value is exported as the score.)
- VCF Import can import VCF files with an unexpected number of values in CLCAD2 or AD. This includes VCF files produced by VarScan2.
- Various minor improvements
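The BED export change listed above can be pictured with a minimal sketch. The `Annotation` record and the `to_bed_line` helper are hypothetical names used only for illustration and are not part of any CLC API. The sketch shows a score being written when one is present, and 0 being written otherwise.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Annotation:
    """Hypothetical annotation record (not a CLC data type)."""
    chrom: str
    start: int                     # 0-based start, as used by BED
    end: int                       # end-exclusive, as used by BED
    name: str
    score: Optional[float] = None  # None models a track without a Score column
    strand: str = "."


def to_bed_line(a: Annotation) -> str:
    """Format one annotation as a 6-column BED line."""
    # Fall back to 0 only when the annotation carries no score.
    score = 0 if a.score is None else a.score
    return "\t".join(
        [a.chrom, str(a.start), str(a.end), a.name, f"{score:g}", a.strand]
    )


print(to_bed_line(Annotation("chr1", 100, 200, "featureA", 960)))
print(to_bed_line(Annotation("chr1", 300, 400, "featureB")))
```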
Bug fixes
- Fixed an issue that could cause Copy Number Variant Detection (CNVs) to give wrong results when targets were overlapping and coverage tables were used as control mappings, see https://digitalinsights.qiagen.com/technical-support/faq/important-clc-notifications/copy-number-variant-detection-cnvs-can-give-wrong-results-when-targets-overlap-and-coverage-tables-are-used-as-controls/.
- Fixed an issue causing Annotate with Nearby Gene Information to report incorrect nearest-gene information for the last gene (3') on a given chromosome.
- Fixed an issue causing Detect and Refine Fusion Genes to fail if the provided mRNA track contained transcripts annotated with priorities and the track was imported using the GFF3 importer.
- Fixed an issue causing Demultiplex Reads to fail with an error if the Edit/Up/Down buttons in the wizard were used when no tag was selected and the Reset button had earlier been pressed.
- Fixed an issue causing SAM/BAM import to fail when the provided reference element contained one or more circular sequences, but these sequences were not marked as circular in the SAM/BAM file and one or more reads mapped with unaligned ends at the beginning of the read.
- Fixed an issue causing Standard Import of GenBank format to import qualifiers' values as annotations surrounded by quotes. The surrounding quotes are now removed.
- Fixed an issue causing GFF3 export to fail when sequence annotations included features with incorrectly formatted frame qualifiers. Now, such frame qualifiers are ignored.
- Fixed an issue that could cause QC for Sequencing Reads to fail when provided with more than one sequence list, and one or more of those sequence lists contained very few sequences.
- Fixed an issue causing Create Sample Report and Combine Reports to fail if an input report was named “report”.
Data related updates
From September 19, 2023, Download Pfam Database downloads Pfam 36.0. This update also affects download using earlier versions of the CLC Genomics Server.
Plugin notes
Import Immune Reference Segments, delivered by Biomedical Genomics Analysis Server Plugin and CLC Single Cell Analysis Server Extension, can now import V segments in IMGT format that end in the conserved amino acid. Previously, these segments were silently ignored.
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 23.0.5.
- QIAGEN CLC Genomics Workbench 23.0.5
- QIAGEN CLC Main Workbench 23.0.5
- QIAGEN CLC Command Line Tools 23.0.5
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 23.0.1, 23.0.2, 23.0.3 and 23.0.4, QIAGEN CLC Main Workbench 23.0.1, 23.0.2, 23.0.3 and 23.0.4, and QIAGEN CLC Command Line Tools 23.0.1, 23.0.2, 23.0.3 and 23.0.4, can also connect to QIAGEN CLC Genomics Server 23.0.5.
CLC Server Command Line Tools
Compatibility
CLC Server Command Line Tools 23.0.5 is the corresponding client for QIAGEN CLC Genomics Server 23.0.5.
CLC Server Command Line Tools 23.0.5 can also act as a client for the QIAGEN CLC Genomics Server 23.0.1, 23.0.2, 23.0.3 and 23.0.4.
QIAGEN CLC Genomics Server 23.0.4
Shared with workbenches
Improvements
- Download BLAST Databases is more resilient to interrupted connections and similar issues when downloading large databases.
Bug fixes
- Fixed an issue where workflows containing a BAM export element could not be launched from CLC Genomics Workbench 23.0.3 to run on a CLC Genomics Server due to an error reported after selecting an export destination in the launch wizard ("The parameter 'Export destination' File not found.")
- Fixed an issue causing workflows to fail if they contained multiple Filter on Custom Criteria elements connected to a single downstream element, and one or more of the Filter on Custom Criteria outputs was empty.
- Fixed an issue causing QC for Read Mapping to report the number of unaligned ends instead of the number of reads with unaligned ends. This could cause “read count” and “% of all mapped reads” to be too high.
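The difference between the two counts in the QC for Read Mapping fix can be illustrated with a small sketch. The representation of a read as a pair of unaligned-end lengths is an assumption made for this example and does not reflect how the tool stores mappings.

```python
from typing import Iterable, Tuple


def count_unaligned(reads: Iterable[Tuple[int, int]]) -> Tuple[int, int]:
    """Each read is (unaligned bases at start, unaligned bases at end).

    Returns (number of unaligned ends, number of reads with unaligned ends).
    """
    ends = 0
    reads_with_ends = 0
    for left, right in reads:
        n = (left > 0) + (right > 0)  # 0, 1 or 2 unaligned ends on this read
        ends += n
        reads_with_ends += n > 0
    return ends, reads_with_ends


# The second read has unaligned ends on both sides, so it contributes
# two ends but only one read.
print(count_unaligned([(5, 0), (3, 7), (0, 0)]))  # (3, 2)
```

A read with unaligned ends on both sides is what made the earlier "read count" and "% of all mapped reads" values too high.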
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 23.0.4.
- QIAGEN CLC Genomics Workbench 23.0.4
- QIAGEN CLC Main Workbench 23.0.4
- QIAGEN CLC Command Line Tools 23.0.4
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 23.0.1, 23.0.2 and 23.0.3, QIAGEN CLC Main Workbench 23.0.1, 23.0.2 and 23.0.3, and QIAGEN CLC Command Line Tools 23.0.1, 23.0.2 and 23.0.3, can also connect to QIAGEN CLC Genomics Server 23.0.4.
CLC Server Command Line Tools
CLC Server Command Line Tools 23.0.4 is the corresponding client for QIAGEN CLC Genomics Server 23.0.4.
Compatibility
CLC Command Line Tools 23.0.4 is the corresponding client for QIAGEN CLC Genomics Server 23.0.4.
CLC Command Line Tools 23.0.4 can also act as a client for the QIAGEN CLC Genomics Server 23.0.1, 23.0.2 and 23.0.3.
QIAGEN CLC Genomics Server 23.0.3
Server specific
Bug fixes
- Fixed an issue affecting external application configurations, where an unlocked parameter in an exporter could not be mapped to a linked post-processing or high-throughput importer parameter.
Shared with workbenches
Improvements
- The SAM and BAM exporters have a new option that is relevant when there are one or more circular reference sequences. The new option, "Export reads spanning the origin of circular chromosomes as unmapped", is checked by default, making the default behavior of these exporters match that of CLC Genomics Server 22.x and earlier. This update changes the default behavior of these exporters relative to CLC Genomics Server 23.0.1 and 23.0.2. In those versions, reads that span the origin are exported as extending beyond the end of the reference. That behavior corresponds to unchecking the new option.
- PacBio SAM/BAM files with Platform Model (PM) set to HIFI are imported as HiFi reads without having to check the "Mark as HiFi reads" option.
- Producing an Amino Acid Track is now optional in Amino Acid Changes.
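As an illustration of the PacBio import change above, the following sketch reads the Platform (PL) and Platform Model (PM) fields from a SAM `@RG` header line. The function names are hypothetical, and the case-insensitive comparison against "HIFI" is an assumption; the notes do not describe how the importer performs this check.

```python
def parse_rg_line(line: str) -> dict:
    """Split a SAM @RG header line into its TAG:VALUE fields."""
    parts = line.rstrip("\n").split("\t")
    if parts[0] != "@RG":
        raise ValueError("not an @RG header line")
    return dict(field.split(":", 1) for field in parts[1:])


def looks_like_hifi(read_group: dict) -> bool:
    """Assumed rule: PM equal to HIFI marks HiFi reads.

    A missing PM field is tolerated and simply means 'not HiFi'.
    """
    return read_group.get("PM", "").upper() == "HIFI"


rg = parse_rg_line("@RG\tID:movie1\tPL:PACBIO\tPM:HIFI")
print(rg["PL"], looks_like_hifi(rg))
```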
Bug fixes
- Fixed an issue affecting the homopolymer trimming options of Trim Reads. When enabled, homopolymers that started with 9 identical bases followed by a different base were not trimmed. Other homopolymers were trimmed as expected. This update may affect the number of reads trimmed in a given dataset, and thus could lead to differences in results from downstream analyses, relative to earlier software versions.
- Fixed an issue causing Detect and Refine Fusion Genes to fail on certain data sets.
- Fixed an issue causing RNA-Seq Analysis to fail when reads mapped to a gene located close to the origin of a circular chromosome.
- Fixed an issue causing SAM/BAM export to fail when reference sequence names contained commas, brackets or other characters not in the set of allowed characters according to the SAM format specification. These characters are now replaced by an underscore in the exported file.
- Fixed an issue causing import of SAM/BAM files to fail when they contained a Platform (PL) but no Platform Model (PM) in the header. This affected the PacBio importer, the Ion Torrent importer and Standard Import of reads from SAM/BAM files.
- Fixed an issue where lines in PDFs containing history information were not wrapped, resulting in the ends of long lines being missing from the exported document.
- Fixed an issue that caused VCF Export to fail when exporting fusions that had two or more filter criteria listed in the Filter column.
- Fixed an issue that caused Low Frequency Variant Detection, Fixed Ploidy Variant Detection, and Basic Variant Detection to fail when the end of a mapped read supported a deletion, and there was support in other reads for a variant at the subsequent position. This issue has only been observed for RNA-Seq data where splicing combined with primer trimming could lead to this situation.
- Fixed an issue causing Extract Reads to not correctly extract reads overlapping annotated regions that cross the origin of circular chromosomes when the type of overlap was set to "Span region" or "No overlap".
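The SAM/BAM export fix above replaces disallowed characters in reference sequence names with an underscore. A minimal sketch of that idea follows, assuming the allowed character set given for reference names in the SAM format specification (printable characters excluding backslash, comma, quotes and brackets, and with `*` and `=` not allowed as the first character). The helper name is hypothetical and the exporter's exact rules may differ.

```python
import re

# Character classes assumed from the SAM specification's pattern for
# reference sequence names; '*' and '=' are only allowed after the
# first character.
_FIRST_OK = r"0-9A-Za-z!#$%&+./:;?@^_|~\-"
_REST_OK = r"0-9A-Za-z!#$%&*+./:;=?@^_|~\-"


def sanitize_reference_name(name: str) -> str:
    """Replace every character outside the allowed set with '_'."""
    if not name:
        raise ValueError("reference name must not be empty")
    first = re.sub(f"[^{_FIRST_OK}]", "_", name[0])
    rest = re.sub(f"[^{_REST_OK}]", "_", name[1:])
    return first + rest


print(sanitize_reference_name("chr1,alt[2]"))  # chr1_alt_2_
```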
Plugin notes
Fixed an issue affecting Immune Repertoire Analysis, delivered by Biomedical Genomics Analysis Server Plugin, and Single Cell V(D)J-Seq Analysis, delivered by CLC Single Cell Analysis Server Extension. The tools failed if there were reads where the region that aligned to a C segment was contained within the region that aligned to a J segment.
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 23.0.3.
- QIAGEN CLC Genomics Workbench 23.0.3
- QIAGEN CLC Main Workbench 23.0.3
- QIAGEN CLC Command Line Tools 23.0.3
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 23.0.1 and 23.0.2, QIAGEN CLC Main Workbench 23.0.1 and 23.0.2, and QIAGEN CLC Command Line Tools 23.0.1 and 23.0.2, can also connect to QIAGEN CLC Genomics Server 23.0.3.
CLC Server Command Line Tools
CLC Server Command Line Tools 23.0.3 is the corresponding client for QIAGEN CLC Genomics Server 23.0.3.
Compatibility
CLC Command Line Tools 23.0.3 is the corresponding client for QIAGEN CLC Genomics Server 23.0.3.
CLC Command Line Tools 23.0.3 can also act as a client for the QIAGEN CLC Genomics Server 23.0.1 and 23.0.2.
QIAGEN CLC Genomics Server 23.0.2
Shared with workbenches
Improvements and bug fixes
- The runtime of Amino Acid Changes has been significantly improved.
- Fixed an issue in the Trim Reads report where the numbers of “Trimmed (broken pairs)” were not reported per sequence list provided as input, but were instead added together incrementally. The number of reported “Trimmed reads” decreased correspondingly. The issue would occur when paired reads from more than one sequence list were trimmed and broken read pairs were produced.
- Fixed a rare issue that could cause Trim Reads to retain a wrong part of a read if the read was both trimmed based on quality scores and adapter read-through.
- Fixed an issue causing the Demultiplex Reads tool to always demultiplex based on a sequence structure of "barcode, sequence". Adjustments to the tag list, such as adding a linker or placing the barcode at the end, were ignored. This issue did not affect the tool when run in a workflow context.
- Fixed an issue that could cause Detect and Refine Fusion Genes to fail on Windows when either the dataset was large or fusion genes with many possible transcripts were detected.
- Fixed an issue that could cause VCF Export to fail when exporting filtered annotation tracks that were empty.
- Fixed an issue causing download of the QIAseq xHYB Viral Panels reference data set to fail on Windows.
- Fixed a rare issue where Rebuild Index could not repair a corrupt search index.
- Various minor bug fixes
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 23.0.2.
- QIAGEN CLC Genomics Workbench 23.0.2
- QIAGEN CLC Main Workbench 23.0.2
- QIAGEN CLC Command Line Tools 23.0.2
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 23.0.1, QIAGEN CLC Main Workbench 23.0.1 and QIAGEN CLC Command Line Tools 23.0.1 can also connect to QIAGEN CLC Genomics Server 23.0.2.
CLC Server Command Line Tools
CLC Command Line Tools 23.0.2 is the corresponding client for QIAGEN CLC Genomics Server 23.0.2.
QIAGEN CLC Genomics Server 23.0.1
Shared with workbenches
Improvements and bug fixes
- Fixed an issue affecting Trim Reads, where the wrong part of a read was retained if the read was both trimmed to a fixed length and also trimmed by another method from the opposite end of the read.
- Fixed an issue affecting Trim Reads when both adapter trimming using a trim adapter list and fixed length trimming were selected. This issue could cause the resulting trimmed reads to be shorter than expected.
- Fixed an issue where fusion plots created by Detect and Refine Fusion Genes were omitted in the report and were not accessible via the fusion track table.
- Fixed an issue where workflows containing a Branch on Coverage element would fail for read mappings with no zero coverage regions when using reports output by QC for Read Mapping.
- Fixed an issue causing Annotate with GFF/GTF/GVF file to fail when the option "Ignore duplicate annotation" was checked.
- Fixed an issue causing Standard Import of GenBank format to stall if qualifier names spanned more than one line.
- Various minor improvements
Please see the release notes for CLC Genomics Workbench 23.0, below, for a full list of changes since the last general release of this software.
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 23.0.1.
- QIAGEN CLC Genomics Workbench 23.0.1
- QIAGEN CLC Main Workbench 23.0.1
- QIAGEN CLC Command Line Tools 23.0.1
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server.
CLC Server Command Line Tools
CLC Command Line Tools 23.0.1 is the corresponding client for QIAGEN CLC Genomics Server 23.0.1.
Compatibility
CLC Command Line Tools 23.0.1 is the corresponding client for QIAGEN CLC Genomics Server 23.0.1.
CLC Command Line Tools 23.0.1 can also act as a client for the QIAGEN CLC Genomics Server 23.0.1.
QIAGEN CLC Genomics Server 23.0
Server specific
New features and improvements
- Contents of import/export directories can be browsed from the web interface in the Browse server import/export directories tab under Element info.
- Contents of AWS S3 buckets accessible using AWS Connections configured in the CLC Server can be browsed from the web interface in the Browse S3 locations tab under Element info. Data can be uploaded to S3, downloaded from S3 and deleted in S3 from this area.
- In External Applications, a static script can be specified using the new parameter type: Included script. A script provided using this option becomes accessible to the external process at runtime. This enables integration scripts or extensive parameter files to be included in the External Application and injected into the execution context, rather than being an external dependency. For containerized External Applications, this may be the injected integration that enables the direct use of a publicly available container.
- Files from AWS S3 can now be selected for the External file parameter type of External Applications.
Other improvements
- Search functionality has been substantially improved. Please see the "Important change to search indexes - action needed" section below about changes to search indexes related to this improvement. Indexes for all CLC Locations can be rebuilt using the "Rebuild all indexes" button under Configuration | Main Configuration | File system locations.
- Admin level access to the audit log can be granted to specified groups. The ability to broaden access beyond admin users to installing and configuring workflows, configuring and enabling external applications, and viewing the CLC Server queue, was introduced with version 22.0.
- The message returned upon successful login to the CLC Server now includes information about the connection (username, the CLC Server description, and encryption status). Previously the return message was "Login successful".
- The full names of graphics exporters are listed when configuring External Applications. Previously, the name "Graphics" was used for each of these.
- A search box has been added to several locations in the web client where long lists are presented, for example, in the Algorithms section under Global Permissions.
- Active CLC File System Locations are listed in alphanumeric order in the web administrative interface. Previously they were listed in the order they were added.
- Apache Tomcat has been updated to version 9.0.65.
- Various minor improvements
Bug fixes
- Fixed an issue where CLC Workbenches could not interact with elements stored on a CLC Server if those elements were created using tools provided by a plugin that was no longer installed on the CLC Server.
- Fixed an issue where text in installer screens was not visible when installing the software in 'dark mode' on Linux.
- Various minor bug fixes
Changes
- AWS account details are now entered into AWS Connections. This term replaces the earlier term: "S3 locations". An AWS region can be specified. When upgrading from an earlier version with AWS account information already configured, the region will be set to the default for the specified AWS partition. For AWS Standard, this is us-east-1. The region can be updated by editing the connection. The region setting is primarily relevant if you plan to submit analyses from a CLC Server with the Cloud Server Plugin installed to run on a CLC Genomics Cloud setup.
- The Core tasks area under the Global Permissions tab has been removed. Standard Data Import is now listed under the Algorithms section with the name "Import Standard Data". The Data Export setting under Core tasks was legacy functionality, only relevant to External Applications with exporters configured in CLC Genomics Server 9.x and earlier. Permissions previously set for both standard import and legacy exporters are retained.
- The Java version bundled with CLC Genomics Server 23.0 is Java 17.0.4, where we use the JRE from the Azul OpenJDK builds.
Important change to search indexes - action needed
- Search functionality has been substantially improved. Associated with this, indexes for all CLC Server data locations must be rebuilt after upgrading to 23.0. If they are not, searches for elements in these locations will not find any results, and data associations to CLC Metadata Tables will not be registered. Indexes built using version 23.0 are placed in a folder called "searchindex2" in the installation area of the CLC Server.
- Old search indexes are not automatically deleted. They can be left in place without detrimental effect, or deleted manually. They are found in the folder "searchindex" in the installation area of the CLC Server.
Functionality retirement
- Boolean compound parameters in External Applications. These were made legacy with version 21.0 and are no longer supported in External Application configurations.
Shared with CLC Workbenches
New tools
- Create K-medoids Clustering for RNA-Seq finds clusters of features, e.g., genes/transcripts/miRNAs etc, whose expressions behave similarly, for example first increasing over time and then decreasing. The tool produces a Clustering Collection which contains a Sankey plot showing how these features move between clusters under different conditions, for example different treatments. A line graph representation of features from individual clusters or pairs of clusters is present as well.
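For readers unfamiliar with k-medoids, the general technique can be sketched as below. This is a simple alternating assign-and-update variant with Euclidean distance and a naive initialization, written only for illustration. The distance measure, initialization and update scheme used by Create K-medoids Clustering for RNA-Seq are not described in these notes, and all names here are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Profile = Sequence[float]


def k_medoids(points: List[Profile], k: int,
              max_iter: int = 100) -> Tuple[List[int], List[int]]:
    """Return (medoid indices, cluster label per point)."""
    medoids = list(range(k))  # naive initialization: the first k points
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest medoid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, points[medoids[c]]))
            for p in points
        ]
        # Update step: the new medoid is the member with the smallest
        # total distance to the other members of its cluster.
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # keep the old medoid for an empty cluster
                new_medoids.append(medoids[c])
                continue
            new_medoids.append(min(
                members,
                key=lambda i: sum(math.dist(points[i], points[j])
                                  for j in members),
            ))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, labels


# Toy expression profiles over three time points: three features peak in
# the middle, two dip in the middle.
profiles = [(1, 5, 1), (1, 6, 1), (5, 1, 5), (6, 1, 6), (1, 5, 2)]
medoids, labels = k_medoids(profiles, k=2)
print(medoids, labels)
```

Unlike k-means, each cluster center is an actual feature from the data, which makes the representative profile of a cluster directly interpretable.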
New tools coming from plugins
- Detect and Refine Fusion Genes - Find fusion genes in RNA-Seq data by identifying potential fusions and then refining that list by evaluation of the evidence for each fusion. This is an updated version of the tool formerly distributed in the Biomedical Genomics Analysis Server Plugin. The updates made are listed in an Improvements section below.
- Target Region Coverage Analysis - Analyze and compare coverage from multiple samples. This tool was formerly distributed in the Biomedical Genomics Analysis Server Plugin.
- Create Consensus Sequences from Variants - Create consensus sequences from a variant track and a reference sequence. This tool was formerly distributed in the Biomedical Genomics Analysis Server Plugin.
- Annotate with GFF/GVF/GTF file - Add annotations from a GFF, GVF or GTF format file onto sequences, individual or in sequence lists. This tool was formerly distributed in the Annotate with GFF file server plugin.
Other new functionality and improvements
RNA-Seq Analysis and miRNA analysis tools
- Substantial speed improvements to RNA-Seq Analysis. Reads that map to multiple transcripts or genes will be distributed differently than earlier due to different choices of random seed in the new implementation. The algorithm is still deterministic.
- Transcripts are no longer renamed in Transcript Expression (TE) output unless renaming is necessary to avoid duplicate names. Previously, transcripts were renamed to the gene name plus a number e.g. "BRCA_1". This change means that TE tracks in this version of the software cannot typically be used together with TE tracks generated using older versions to produce Heat Maps, PCA plots, Expression, etc.
- Reports UMI fragment counts when relevant. UMI counts are included in the Fragment statistics section of the report if the input reads are annotated with UMIs by tools from the Biomedical Genomics Analysis plugin, and if the library type is set to 3' sequencing for RNA-Seq Analysis.
- Venn diagrams support four and five groups. Previously up to 3 were supported. Tooltips indicate which groups are part of a specific intersection.
- Quantify miRNA:
- Handles custom databases containing duplicated names.
- Does not allow custom databases containing sequences longer than 60bp. This avoids misallocation of reads to sequences that are similar to small RNAs.
- When adding multiple inputs to Extract IsomiR Counts, the extracted expression tables contain an entry for the combined set of IsomiRs identified among the samples, making them compatible for analysis in Differential Expression in Two Groups and Differential Expression for RNA-Seq.
Differential Expression for RNA-Seq and Differential Expression in Two Groups
- A new option for creating a subset has been added to the miRNA Statistical Comparison Table produced by Differential Expression for RNA-Seq and Differential Expression in Two Groups.
- It is possible to downweight outliers. This option is disabled by default and recommended only when the results seem enriched for genes that are expressed at anomalously high levels in a small proportion of samples.
- The Max Group Means column of Statistical Comparison Tracks and Tables now shows TPM instead of RPKM. Note that this column is used for filtering data in tools such as Create Heat Map for RNA-Seq and the Pathway Analysis tool of the Ingenuity Pathway Analysis plugin.
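Because the Max Group Means column now reports TPM rather than RPKM, filter thresholds chosen for the old values may need revisiting. The sketch below uses the commonly cited textbook definitions of the two measures to show how they differ. It is not taken from the software, whose exact calculation (for example, which reads contribute to the totals) is not described in these notes.

```python
from typing import List, Sequence


def rpkm(counts: Sequence[float], lengths_bp: Sequence[float]) -> List[float]:
    """Reads per kilobase of transcript per million mapped reads."""
    total_millions = sum(counts) / 1e6
    return [c / (length / 1e3) / total_millions
            for c, length in zip(counts, lengths_bp)]


def tpm(counts: Sequence[float], lengths_bp: Sequence[float]) -> List[float]:
    """Transcripts per million: length-normalize first, then scale to 1e6."""
    rates = [c / (length / 1e3) for c, length in zip(counts, lengths_bp)]
    scale = sum(rates) / 1e6
    return [r / scale for r in rates]


counts = [100, 200, 700]
lengths = [1000, 2000, 1000]
print(rpkm(counts, lengths))
print(tpm(counts, lengths))  # TPM values always sum to one million
```

TPM values sum to the same total in every sample, while RPKM totals vary between samples, so the two measures are not interchangeable as filter cutoffs.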
Detect and Refine Fusion Genes
This is an updated version of Detect and Refine Fusion Genes, formerly distributed in the Biomedical Genomics Analysis Server Plugin. The updates listed here are relative to the version distributed with Biomedical Genomics Analysis Server Plugin 22.2.
- Fusions will not be called for overlapping genes.
- Novel exon boundary improvements:
- Options have been expanded to allow for detecting fusions with a single fusion partner ("Detect with novel exon boundaries") as well as detecting those with 2 fusion partners ("Allow fusions with novel exon boundaries in both genes")
- The "Detect exon skippings" option supports detection of fusions with novel exon boundaries.
- An option has been added to omit non-significant breakpoints from the report.
- A minimum Z-score can now be specified for use when evaluating evidence for a fusion.
- Speed improvements
- The option "Allow fusions with novel exon boundaries in both genes" now defaults to false to reduce the number of false positive fusions. Setting it to true is useful for exhaustive searches of novel fusions.
- Changes to the maximum number of equivalent matches to the reference allowed for a single read to be retained:
- When remapping reads to a fusion chromosome, the maximum number is now 30. Previously it was 10.
- When searching for unaligned ends, the maximum number remains unchanged at 10.
- The option "Maximum number of hits for a read" has been removed. Its value was ignored in previous versions.
- Fusions from mRNA transcripts without an associated gene in the Gene track are not used when detecting fusions. mRNA transcript features must have a gene id in one of the following columns to be matched with the associated gene: "Parent", "gene_id" or "gene_name".
- Fixed an issue where paired end reads were treated as single end reads when the option to "Only use fusion primer reads" was enabled.
- Fixed an issue where unaligned ends could be too long or too short for reads containing insertions and deletions. This change may lead to small differences in results compared to earlier versions, expected to be due to a decrease in the false positives and false negatives reported.
Bisulfite mapping
- Map Reads to Bisulfite Reference speed improvement. This is data dependent, with about a 50% improvement likely for most data sets. This speedup might change the details of results very slightly.
- Call Methylation Level speed improvement. This speedup might, in some cases, change results very slightly.
- Import of read mappings from SAM/BAM now uses methylation information from the optional SAM tags XR for read conversion and XG for reference conversion. The recognized values are "CT" and "GA". Support for these tags is added so that information is not lost if a bisulfite mapping is exported and then re-imported.
- Export of read mappings to SAM/BAM format now includes details on bisulfite conversion. These are specified using the SAM tags XR for read conversion and XG for reference conversion. The possible values of these tags are "CT" and "GA". This is provided for increased compatibility with third party tools.
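To show where the XR and XG tags sit in a SAM record, the sketch below pulls them out of the optional fields of one alignment line. The function name and the example record are made up for illustration, and the string tag type (`Z`) used in the example is an assumption; the notes only state the tag names and their possible values.

```python
def bisulfite_tags(sam_line: str) -> dict:
    """Return the XR/XG conversion tags found on one SAM alignment line."""
    fields = sam_line.rstrip("\n").split("\t")
    tags = {}
    # Optional fields follow the 11 mandatory SAM columns and have the
    # form TAG:TYPE:VALUE.
    for optional in fields[11:]:
        tag, _type, value = optional.split(":", 2)
        if tag in ("XR", "XG"):
            if value not in ("CT", "GA"):
                raise ValueError(f"unexpected {tag} value: {value}")
            tags[tag] = value
    return tags


# Hypothetical record: read conversion C->T, reference conversion G->A.
record = "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tXR:Z:CT\tXG:Z:GA"
print(bisulfite_tags(record))
```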
Import and export
- VCF Import:
- Supports symbolic alleles for inversions (<INV>), insertions (<INS>), deletions (<DEL>) and tandem duplications (<TANDEM:DUP>). Symbolic alleles that do not contain sequence information or are longer than 100,000 base pairs are imported to annotation tracks instead of variant tracks. Previously symbolic alleles were not imported.
- Improved handling of variants with multiple loci encoded in the same VCF record.
- VCF Export supports symbolic allele representation for insertions (<INS>), deletions (<DEL>) and tandem duplications (<TANDEM:DUP>). (Inversions (<INV>) were already supported.) With the exception of deletions, variants in annotation tracks are always exported as symbolic alleles. Deletions in annotation tracks and variants in variant tracks above a specified size are also exported as symbolic alleles. The default size is 1000 bp, which corresponds with the QCI Interpret requirement that InDels > 1000 bp must be represented as symbolic alleles.
- The PacBio importer supports HiFi reads.
- The read length when exporting to FASTQ format files has been increased from 524,288 bp to 16,777,216 bp.
- SAM/BAM Mapping Files importer:
- Performance improvements
- The circular flag of references is now retained.
- Import Tracks from File has been updated to show a warning if the file is not imported.
- GFF3 Export retains the case of attribute headers. Previously, all headers were adjusted to lower case during export.
- The history information of elements imported using Standard Import includes the specific importer used (e.g. "CSV table importer", "Fasta Importer", etc).
- Standard Import can be used to import files from AWS S3 locations.
- When exporting images to bitmap-based formats, the Screen resolution and High resolution options are now bounded so the maximum supported number of pixels will not be exceeded.
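The VCF Export rules for symbolic alleles described earlier in this list can be summarized as a small decision function. This is one reading of the text, not the exporter's implementation. The function and parameter names are hypothetical, and treating "above a specified size" as strictly greater than the threshold is an assumption based on the "InDels > 1000 bp" wording.

```python
def use_symbolic_allele(kind: str, length_bp: int,
                        from_annotation_track: bool,
                        max_explicit_bp: int = 1000) -> bool:
    """Decide whether a variant would be exported as a symbolic allele.

    kind is a free-text label such as 'insertion' or 'deletion'.
    """
    # Variants in annotation tracks are always symbolic, except deletions.
    if from_annotation_track and kind != "deletion":
        return True
    # Deletions in annotation tracks, and any variant in a variant track,
    # become symbolic only above the size threshold.
    return length_bp > max_explicit_bp


print(use_symbolic_allele("insertion", 10, from_annotation_track=True))    # True
print(use_symbolic_allele("deletion", 1000, from_annotation_track=True))   # False
print(use_symbolic_allele("deletion", 1001, from_annotation_track=False))  # True
```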
Various
- Read mapping speed on Apple Silicon processors has been improved. Read mapping results are not affected by this. Tools benefiting from this change include Map Reads to Reference, RNA-Seq Analysis, Map Reads to Contigs and Map Bisulfite Reads to Reference.
- Branch on Coverage - a new workflow control flow element where the downstream processing of read mappings can be controlled based on coverage values within reports.
- Barcodes can be preconfigured in Demultiplex Reads elements in workflows.
- Demultiplex Reads has been updated to:
- Report barcodes without any matched reads.
- Show the barcode names in the history.
- Workflow Export elements can be preconfigured to export to locations on AWS S3.
- When Low Frequency Variant Detection, Fixed Ploidy Variant Detection or Basic Variant Detection was used with a mapping realigned using Local Realignment with a guidance variant track, it was possible for partial insertions to be called. Now, the full insertion must be present within at least one, individual read for it to be reported.
- QC for Targeted Sequencing:
- Can report coverage statistics per gene.
- Supports analysis of read mappings generated by RNA-Seq Analysis.
- Annotate with Exon Numbers:
- Can add exon numbers to elements in annotation, expression and statistical comparison tracks. Previously only variant tracks could be annotated with exon numbers.
- Adds exon numbers when input elements start outside an exon but still overlap the exon.
- Adds all exons when multiple exons overlap a single input element.
- Allows annotation with exons from only one transcript or CDS.
- Filter on Custom Criteria can be used to filter Statistical Comparison Tracks, Statistical Comparison Tables, IsomiR tables, and miRNA Seed Tables.
- Reports from Create Sample Report and Combine Reports generated using RNA-Seq reports now include the percentage of reads mapped to exons in the Fragment counting statistics table.
- In Create Sample Report, the percentage of target region positions with coverage above a set threshold can be used as a QC metric.
- QC for Sequencing Reads processes only the first 100,000 base pairs in long reads. Previously, the tool would fail when provided with very long reads.
- When Annotate with Overlap Information is included more than once in the same workflow, columns with overlap information are now always added in the same order. Previously, concurrency issues could cause column order to be different between different runs.
- Local Realignment no longer realigns reads into regions with no coverage, such as introns in RNA-Seq read mappings.
- Remove Duplicate Mapped Reads uses an improved method to identify duplicate reads when handling paired end reads. In general, this improvement results in slightly more reads being considered duplicates.
- The options for extracting reads according to their location relative to features in an overlap track have been expanded in Extract Reads. Previously reads had to lie fully within an annotated region to be extracted. Now, in addition to that condition, options are provided for extracting any overlapping reads, extracting only reads that fully span annotated regions or extracting all reads except those that overlap with annotations in the overlap track.
- Assemble Sequences to Reference supports alignment of reads that span the origin of a circular reference.
- Secondary Peak Calling has a new option "Peak detection stringency".
- The report from Copy Number Variant Detection (CNVs):
- Includes a table showing the number of genes affected by CNV calls.
- Contains new coverage plots at genome and chromosome levels.
- The Trim Reads report now includes statistics for the number of reads in intact pairs and in broken pairs.
- Updated restriction site database to REBASE 2022-06-30.
- The Identify Known Mutations from Mappings output channel names when used in a workflow have been improved. The elements produced by the tool have not been changed.
- While viewing data, in most situations, tooltips can be suppressed by holding down the Ctrl key. Similarly those tooltips can be displayed immediately, instead of a moment after the mouse cursor stops moving, by holding down the Shift key.
- Various minor improvements
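The four extraction conditions described for Extract Reads in the list above can be illustrated with simple interval tests. This is a minimal sketch, not the tool's implementation: the half-open coordinate convention, the function names and the mode names ("within", "overlapping", "spanning", "non_overlapping") are assumptions made for illustration and are not values of any option in the software.

```python
from typing import List, Tuple

# Assumption of this sketch: intervals are half-open, (start, end).
Interval = Tuple[int, int]


def overlaps(read: Interval, region: Interval) -> bool:
    """True if the read shares at least one position with the region."""
    return read[0] < region[1] and region[0] < read[1]


def within(read: Interval, region: Interval) -> bool:
    """True if the read lies fully inside the region."""
    return region[0] <= read[0] and read[1] <= region[1]


def spans(read: Interval, region: Interval) -> bool:
    """True if the read covers the whole region."""
    return read[0] <= region[0] and region[1] <= read[1]


def select_reads(reads: List[Interval], regions: List[Interval], mode: str) -> List[Interval]:
    """Return the reads that satisfy the chosen condition (hypothetical mode names)."""
    if mode == "non_overlapping":
        # Keep only reads that overlap none of the annotated regions.
        return [r for r in reads if not any(overlaps(r, g) for g in regions)]
    tests = {"within": within, "overlapping": overlaps, "spanning": spans}
    test = tests[mode]
    return [r for r in reads if any(test(r, g) for g in regions)]
```

With one region at (100, 200), a read at (110, 150) satisfies "within" and "overlapping", a read at (80, 220) satisfies "spanning" and "overlapping", and a read at (300, 350) is returned only by "non_overlapping".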
Bug fixes
- Low Frequency Variant Detection, Fixed Ploidy Variant Detection and Basic Variant Detection:
- Fixed an issue that in very rare cases caused insertions to be called twice. Now, the same insertion is always only included once in the variant track.
- Fixed an issue in the remove pyro-error variants filter. Previously, the frequency threshold for removing pyro-error variants was ignored and more variants than intended were removed. The filter is generally only used for Ion Torrent data. This fix may result in a small improvement to the precision of variant detection.
- Fixed a rare issue affecting variant calling in very low coverage regions, where a variant could be reported that was not present in any single read in the mapping.
- Fixed an issue causing Map Reads to Reference to fail if a masking track covering a whole chromosome was provided as input.
- RNA-Seq Analysis
- Fixed an issue where reads were not counted as unique for a transcript in the GE track table, if the read could map in multiple ways to the same transcript, but only to that transcript.
- Fixed an issue that could lead to an IndexOutOfBounds error when the option "Calculate expression for genes without transcripts" was selected, and two or more genes had the same name, and at least one of these has no transcripts, and the Region column of the table view of the gene track contains the text "join", ">", or "<" (i.e., the genes have splice structure, or uncertain end positions).
- Fixed an issue where the gene identifier would be removed from the statistical comparison track and tables produced by the Differential Expression for RNA-Seq tool when it was not recognized to be an Ensembl gene identifier.
- Fixed an issue in Differential Expression in Two Groups and Differential Expression for RNA-Seq that affected the calculation of dispersion estimates that include information from nearby genes. This leads to slightly different p-values being produced by these two tools.
- Fixed an issue affecting Extract Consensus Sequence where annotations transferred from the reference sequence to the consensus sequence could be wrongly positioned if the read mapping had an insertion in a region that was removed due to low coverage.
- Fixed an issue where, if two genes had the same name and overlapped, their transcripts might become assigned to only one of the genes. The fix only applies when the gene and transcript annotations are imported from GFF3.
- Fixed an issue affecting the naming of outputs from Local Realignment when the tool was provided with multiple read mappings as input and not run in batch mode. Each resulting realigned read mapping is now named after the corresponding input. Previously all the realigned read mappings were named after the first read mapping in the set of inputs.
- QC for Sequencing Reads
- Fixed an issue in the report where the graph for R1 nucleotide contributions would be truncated to only show the same number of nucleotides as the R2 plot.
- Fixed an issue where the median read length in the supplementary report could be incorrect when the number of reads was very low. The median reported in the graphical report was correct.
- Amino Acid Changes
- Fixed an issue causing the output from the tool to be named after the reference data instead of the input data.
- Fixed an issue that caused the transcripts and proteins listed in the Coding region change and Amino acid change columns in the annotated variant track output to be inconsistently ordered.
- Fixed an issue in the Trim Reads report, where the number of reads under "No trim" could be incorrect when "Remove fixed number of bases" was enabled.
- Fixed an issue causing Show Enzymes Cutting Inside/Outside Selection to give wrong results when the selection crossed the junction of a circular sequence and a desired number of cut sites outside the selection was not specified.
- Fixed an issue in VCF Export, where specified minimum ploidy was not always enforced for complex variants. The issue would only occur when an allele had first been removed from a locus to adhere to the specified maximum ploidy.
- Fixed an issue where the wrong entry in a trim adapter list would be opened for editing if the list had been sorted or filtered.
- Fixed a rare issue in K-means/medoids clustering where a gene could be output in multiple clusters. This would occur when genes with identical expressions were chosen to be medoids, and so would only happen when K was comparable to the number of genes with unique expressions across samples.
- Fixed issues with Quantify miRNA where:
- It would fail on paired reads if using spike-ins.
- Opening a sequence list to view it would cause this tool to fail if that same sequence list had been used as input.
- In the report from Create Sample Report the value column in the summary table is coloured green or yellow according to whether the threshold is met. Previously, the threshold column was coloured.
- Workflow related
- Fixed an issue affecting the location of outputs generated from a workflow element that was also linked to a Collect and Distribute element. In cases where the output folder name was defined using the {input} or {2} placeholder, these outputs were sometimes all saved to the first folder created, instead of to different folders as intended.
- Fixed an issue where default names were applied to outputs from Output elements attached directly to an Iterate element in workflows, even when naming placeholders had been configured.
- Fixed an issue affecting workflows with nested Iterate elements where results from the outer level of iteration flowed into a Collect and Distribute element. Any output elements generated in the inner iteration, which should have been saved, were lost.
- Fixed an issue where unlocked options for on-the-fly importers in a workflow would be locked if the Input element was re-opened for editing.
- Fixed an issue affecting hyperlinked table entries, where html tags were sometimes included as text in the information exported to Excel or CSV formats.
- Fixed an issue where text in installer screens was not visible when installing the software in 'dark mode' on Linux.
- Various other minor bug fixes
Legacy tools
The following tools are now legacy tools and will be retired in a future version of the software:
- QIAGEN GeneReader importer (Legacy)
Functionality retirement
The following tools have been retired:
- Compare Sample Variant Tracks (Legacy)
- Empirical Analysis of DGE (Legacy)
Plugin notes
Plugin retirements
- Annotate with GFF file server plugin. The tool Annotate with GFF/GVF/GTF file is now available directly in the server.
- Haplotype Calling Server Plugin (beta). Functionality from this plugin is now in the Biomedical Genomics Analysis Server Plugin.
Compatibility
The following are the corresponding client applications for CLC Genomics Server 23.0:
- CLC Genomics Workbench 23.0
- CLC Main Workbench 23.0
- CLC Command Line Tools 23.0
CLC Server Command Line Tools
Please see the CLC Genomics Server 23.0 listings above for the details about the new tools and features listed here.
New tools
New tools and functionality
- create_kmedoids_for_rnaseq
New tools previously included in plugins
- annotate_with_gff (previously distributed in the Annotate with GFF file plugin)
- consensus_from_variants (previously distributed in the Biomedical Genomics Analysis plugin)
- detect_and_refine_fusion_genes (previously distributed in the Biomedical Genomics Analysis plugin)
- target_region_coverage_analysis (previously distributed in the Biomedical Genomics Analysis plugin)
New and updated options for existing tools
- differential_expression_rna_seq
- option added: --downweight-outliers
- differential_expression_two_groups
- option added: --downweight-outliers
- download_sra
- option removed: --aspera-limit
- option removed: --enable-aspera
- extract_overlapping_reads
- option removed: --in-interval
- option added: --overlap-type
- process_tagged_sequences
- option added: --barcode-values
Barcode structure and barcode values are now provided in separate parameters:
--barcode-1 "linker type#MULTIPLEX_BARCODE#fixedLength#3;linker type#MULTIPLEX_SEQUENCE#maxLength#1000" --barcode-2 "linker type#MULTIPLEX_BARCODE#fixedLength#3;linker type#MULTIPLEX_SEQUENCE#maxLength#1000" --barcode-values "a/b#AAA/GGG#ATA/ATA"
Previously, structure and values were provided in the same parameter.
- secondary_peak_calling
- option added: --peak-slope-stringency
- statistics_target_regions
- option added: --create-gene-coverage-track
- option added: --genes
Other updates
Importers
- ngs_import_pacbio
- option added: --hifi-reads
- option removed: --only-sequencing-zmws
- option removed: --read-hq-regions
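When updating existing scripts for this release, it can help to scan for options that these notes list as removed. The following is a minimal sketch under stated assumptions: the function name and the way the command name and argument string are passed in are hypothetical, and only the option and command names are taken from these notes. It does not reproduce the syntax of any CLC command line client.

```python
import shlex
from typing import List

# Options listed as removed in these notes, keyed by command name.
REMOVED_OPTIONS = {
    "download_sra": {"--aspera-limit", "--enable-aspera"},
    "extract_overlapping_reads": {"--in-interval"},
    "ngs_import_pacbio": {"--only-sequencing-zmws", "--read-hq-regions"},
}

# Commands listed as retired in these notes.
RETIRED_COMMANDS = {"compare_sample_variant_tracks", "empirical_analysis_dge"}


def find_removed_options(command: str, argument_string: str) -> List[str]:
    """Return removed options found in an argument string, in order of appearance."""
    removed = REMOVED_OPTIONS.get(command, set())
    hits: List[str] = []
    for token in shlex.split(argument_string):
        # Accept both "--name value" and "--name=value" spellings.
        name = token.split("=", 1)[0]
        if name in removed and name not in hits:
            hits.append(name)
    return hits


def is_retired(command: str) -> bool:
    """True if the command is listed as retired in these notes."""
    return command in RETIRED_COMMANDS
```

A script that still passes --in-interval to extract_overlapping_reads would be flagged by find_removed_options, indicating that the call should be reviewed against the new --overlap-type option.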
Improvements
- Basic data operations such as copying, can be carried out on data elements created using plugins.
Commands retired
- compare_sample_variant_tracks
- empirical_analysis_dge
Bug fixes
- Fixed an issue where text in installer screens was not visible when installing the software in 'dark mode' on Linux.
QIAGEN CLC Genomics Server 22.0.3
Shared with workbenches
Improvements
- When exporting images to bitmap-based formats, the Screen resolution and High resolution options are now bounded so the maximum supported number of pixels will not be exceeded.
- Various minor improvements
Bug fixes
- Fixed an issue causing Map Reads to Reference to fail if a masking track covering a whole chromosome was provided as input.
- Fixed an issue in RNA-Seq Analysis that could lead to an IndexOutOfBounds error when the option “Calculate expression for genes without transcripts” was selected, and two or more genes had the same name, and at least one of these has no transcripts, and the Region column of the table view of the gene track contains the text “join”, “>”, or “<” (i.e., the genes have splice structure, or uncertain end positions).
- Fixed an issue affecting Extract Consensus Sequence where annotations transferred from the reference sequence to the consensus sequence could be wrongly positioned if the read mapping had an insertion in a region that was removed due to low coverage.
- Amino Acid Changes
- Fixed an issue causing the output from the tool to be named after the reference data instead of the input data.
- Fixed an issue that caused the transcripts and proteins listed in the Coding region change and Amino acid change columns in the annotated variant track output to be inconsistently ordered.
- Fixed issues with Quantify miRNA where:
- It would fail on paired reads if using spike-ins.
- Opening a sequence list to view it would cause this tool to fail if that same sequence list had been used as input.
- Fixed an issue causing Standard Import of GenBank format to stall if qualifier names spanned more than one line.
- Fixed an issue where text in installer screens was not visible when installing the software in 'dark mode' on Linux.
Changes
- The Java version bundled with CLC Genomics Server 22.0.3 is 11.0.17, where we use the JRE from the Azul Zulu Builds of OpenJDK.
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 22.0.3.
- QIAGEN CLC Genomics Workbench 22.0.3
- QIAGEN CLC Main Workbench 22.0.3
- QIAGEN CLC Command Line Tools 22.0.3
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 22.0, 22.0.1 and 22.0.2, QIAGEN CLC Main Workbench 22.0, 22.0.1 and 22.0.2, and QIAGEN CLC Command Line Tools 22.0, 22.0.1 and 22.0.2 can also connect to QIAGEN CLC Genomics Server 22.0.3.
CLC Server Command Line Tools
Bug fixes
- Fixed an issue where text in installer screens was not visible when installing the software in 'dark mode' on Linux.
Compatibility
CLC Command Line Tools 22.0.3 is the corresponding client for QIAGEN CLC Genomics Server 22.0.3.
CLC Command Line Tools 22.0.3 can also act as a client for the QIAGEN CLC Genomics Server 22.0, 22.0.1 and 22.0.2. However, we recommend running the corresponding version.
QIAGEN CLC Genomics Server 22.0.2
Shared with workbenches
Bug fixes
- Fixed an issue affecting Local Realignment where false positive insertions and deletions could be introduced in read mappings when a guidance variant was provided that had a sequence similar to the reference sequence immediately after the variant. See Important QIAGEN CLC software notifications for further details.
- Fixed an issue affecting the naming of outputs when Local Realignment was provided with multiple read mappings as input. When the tool was not run in Batch mode, a read mapping was generated per input, as expected, but all the realigned read mappings were named after the first read mapping in the set of inputs. Now each is named after the corresponding input.
- Fixed an issue that could result in an error like "no zstd-jni in java.library.path" when importing files or undertaking other activities involving the compression of CLC data elements.
Plugin notes
After upgrading the CLC Genomics Server:
- Immune Repertoire Analysis will be able to identify clonotypes from data with a varied read structure. This tool is available when the Biomedical Genomics Analysis Server Plugin is installed.
- Single Cell TCR-Seq Analysis will be able to identify clonotypes from data with a varied read structure. This tool is available when the CLC Single Cell Analysis Server Extension is installed with access to a valid CLC Genomics Premium Server Extensions license.
The results from these tools may be slightly altered as a result of the underlying improvement.
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 22.0.2.
- QIAGEN CLC Genomics Workbench 22.0.2
- QIAGEN CLC Main Workbench 22.0.2
- QIAGEN CLC Command Line Tools 22.0.2
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 22.0 and 22.0.1, QIAGEN CLC Main Workbench 22.0 and 22.0.1 and QIAGEN CLC Command Line Tools 22.0 and 22.0.1 can also connect to QIAGEN CLC Genomics Server 22.0.2.
CLC Server Command Line Tools
Compatibility
CLC Command Line Tools 22.0.2 is the corresponding client for QIAGEN CLC Genomics Server 22.0.2.
CLC Command Line Tools 22.0.2 can also act as a client for the QIAGEN CLC Genomics Server 22.0 and 22.0.1. However, we recommend running the corresponding version.
QIAGEN CLC Genomics Server 22.0.1
Server specific
Bug fixes
- Fixed an issue where the presence of a file without a file name extension under an import/export directory caused null errors when browsing that area.
Shared with Workbenches
Improvements
- The RNA-Seq Analysis Report has been updated to include new text in the "Adapter read-through" section. The additional text explains that trimming the start of paired-end reads (5' trim) can lead to spurious detection of adapter read-through.
- When two or more reports produced by the same tool are used as input to Create Sample Report, and they contain conflicting values, the tool now fails. Previously, the results from the first report were used.
- When reports that contain identical tables and values are input to Create Sample Report, the values are included once in the report. Previously, a row per input report was included, even when the values were identical.
- Update Sequence Attributes in Lists now accepts attribute information from comma separated (.csv) and tab separated (.tsv) format files.
- Various minor improvements
Bug fixes
- Fixed an issue in Copy Number Variant Detection (CNVs), where a subset of region CNVs were not correctly calculated on chromosomes with low coverage target regions in the control samples. The issue led to overly large region CNVs. As gene-level CNVs are calculated from region-level CNVs, these were also affected. When only a few target regions had low coverage in control samples, results were likely not affected.
- Fixed an issue in Maximum Likelihood Phylogeny that in rare situations led to tree construction never completing.
- Fixed an issue with Create BLAST Database, which could fail if the underlying native BLAST tool reported warnings.
- Fixed an issue with Split Sequence List when splitting based on attribute values, where an error resulted if the sequence list had more than 1000 distinct values for the selected attribute.
- Fixed an issue where a workflow could fail or produce an incorrect folder structure for outputs when it contained Split Sequence List, with splitting based on attribute values, and the name of at least one of the groups contained a forward slash "/" character.
- Fixed an issue affecting the naming of workflow outputs defined using a naming pattern of the form {input:N} or its equivalent {2:N}, e.g. {input:1} or {2:1}. The intended input name was not used to form the output element names when the workflow included an Iterate element, import was done on the fly, batch units were defined based on the organization of the data, and:
- Each batch unit was a set of paired reads, or
- Each batch unit was a folder.
- Fixed an issue where the median read length in the supplementary report produced by QC for Sequencing Reads could be incorrect when the number of reads was very low. The median reported in the graphical report was correct.
- Fixed an issue with Create Sample Report affecting samples where reads mapped to only one chromosome. For such samples, the QC metric "% reads mapped in target region" was not included in the Quality Control table, even when that option had been selected when launching the tool.
- Various minor bug fixes
Changes
The Java version bundled with CLC Genomics Server 22.0.1 is Java 11.0.14.1, where we use the JRE from the Azul OpenJDK builds.
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 22.0.1.
- QIAGEN CLC Genomics Workbench 22.0.1
- QIAGEN CLC Main Workbench 22.0.1
- QIAGEN CLC Command Line Tools 22.0.1
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 22.0, QIAGEN CLC Main Workbench 22.0, and QIAGEN CLC Command Line Tools 22.0 can also connect to QIAGEN CLC Genomics Server 22.0.1.
CLC Server Command Line Tools
Compatibility
CLC Command Line Tools 22.0.1 is the corresponding client for QIAGEN CLC Genomics Server 22.0.1.
CLC Command Line Tools 22.0.1 can also act as a client for the QIAGEN CLC Genomics Server 22.0. However, we recommend running the corresponding version.
QIAGEN CLC Genomics Server 22.0
Server specific
New features and improvements
Reorganization of web administrative interface
The top level tabs in the CLC Genomics Server web administrative interface have been renamed and the contents under them re-organized. From left to right in the interface:
- Element info - Functionality for working with data stored in CLC Server Locations, including for setting group-level permissions on folders.
- Configuration - Configuration functionality, including for setting up CLC Server Locations, configuring the server setup, configuring settings for external data, configuring user authentication, etc.
- Management - Functionality for managing the CLC Server, including downloading licenses, stopping, restarting and putting the server in maintenance mode, as well as access to the queue, audit log, etc.
- Extensions - Configuration relating to extending the functionality of the CLC Server, such as downloading and managing plugins, and configuring External Applications.
New functionality in the web administrative interface
- Group-level permissions on CLC Server File System Locations and their contents can be configured via the web administrative interface. Previously, configuration could be done only using a CLC Workbench client.
- Recycle bins can be emptied via the web interface. When logged in as an administrative user, any individual recycle bin can be emptied, or all recycle bins in a File System Location can be emptied in a single action.
Other new features
- Access to Amazon S3 and BaseSpace:
- Import data from Amazon S3 or BaseSpace when launching workflows
- Save workflow results to Amazon S3
- Admin level access can be granted to specified groups for installing and configuring workflows, and configuring and enabling external applications via the web administrative interface.
External applications improvements
- External Applications can now be configured with a failure strategy.
- The audit log entries for external applications now include the native process command line and the exit value.
- Improvements to error handling when external applications fail, when run directly as a tool or within a workflow context, including that any results files produced despite the job failure will be posted. In such cases, the std out and error files can contain valuable troubleshooting information.
- For external applications in workflows, the final, substituted command line executed by the external application and the native process exit code are posted to the workflow log. In addition, every error contained in the server result (thus also all failures triggered), is posted to the workflow log. Previously, only the first error thrown to stop the process was posted.
- History information can optionally be added to CLC data elements created by external applications.
Other improvements
- Tomcat has been updated to version 9.0.46. Custom port and SSL settings will need to be reconfigured after this upgrade.
- Improvements to the display of history information for data elements. The History tab is now under the top level Element info tab.
- The contents of tool and workflow logs, generated when an analysis is run, can be viewed directly under the Element info tab.
- The audit log shows the XML serialized version of any exception.
- Various minor improvements
Bug fixes
- Fixed an issue that could cause the Main configuration tab of the web administrative interface to occasionally be blank after logging in.
- Fixed an issue where the maintenance mode banner, expected in the top right hand corner, was sometimes not shown when it should have been.
- Fixed an issue with the maintenance mode banner message where the reason for maintenance mode/restart was not correctly displayed.
- Fixed an issue where Find and Model Structure could not be run on a CLC Genomics Server.
- Various minor bug fixes
Functionality removal
The Genomics Analysis Portal client is no longer provided in the CLC Genomics Server distribution.
Changes
- The Java version bundled with CLC Genomics Server 22.0 is Java 11.0.10, where we use the JRE from AdoptOpenJDK.
- Tomcat has been updated to version 9.0.46. Custom port and SSL settings will need to be reconfigured after this upgrade.
Shared with CLC Workbenches
New features and improvements
Tools for annotating and manipulating sequence lists
The new Sequence Lists folder under Toolbox | Utility Tools contains tools for working with sequence lists. This includes existing tools, with new names and expanded functionality, as well as new tools:
- Split Sequence List - New tool. Splits up nucleotide or peptide sequence lists. The output can be a specified number of lists, lists containing a specified number of sequences, or lists containing sequences with particular attribute values, such as terms in the description.
- Update Sequence Attributes in Lists - New tool. Updates and adds information about the sequences in a list. For example, descriptions can be updated, or new information types can be added based on information provided in an Excel file.
- Create Sequence List - Existing tool. Creates new sequence lists from sequence elements and/or sequence list elements. Previously available only from the File | New menu.
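The three output modes described for Split Sequence List above (a specified number of lists, lists of a specified size, or lists grouped by attribute value) can be sketched as follows. This is an illustration only: the function names, the use of dictionaries to represent sequences with attributes, and the way sequences are distributed across lists are all assumptions, and the tool itself may divide sequences differently.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, TypeVar

T = TypeVar("T")


def split_into_n_lists(items: Sequence[T], n: int) -> List[List[T]]:
    """Split into n lists of near-equal size, preserving the input order."""
    base, extra = divmod(len(items), n)
    result: List[List[T]] = []
    start = 0
    for i in range(n):
        # The first `extra` lists receive one additional item.
        size = base + (1 if i < extra else 0)
        result.append(list(items[start:start + size]))
        start += size
    return result


def split_by_size(items: Sequence[T], size: int) -> List[List[T]]:
    """Split into consecutive lists holding at most `size` items each."""
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def split_by_attribute(records: Sequence[dict], attribute: str) -> Dict[str, List[dict]]:
    """Group records by the value of one attribute (hypothetical record layout)."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    for record in records:
        groups[record.get(attribute, "")].append(record)
    return dict(groups)
```

Grouping by attribute produces one list per distinct value, which is why the number of output lists in that mode depends on the data rather than on a user-specified count.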
Other new functionality
- MGI/BGI importer - An importer for MGI/BGI fastq format files.
- Rename Sequences in Lists - Renames sequences within sequence lists by adding or removing characters, or replacing parts of names, optionally using regular expressions.
- Rename Elements - Renames elements by adding or removing characters, or replacing parts of names, optionally using regular expressions.
- A Heat map graphics exporter has been introduced for exporting heat maps to graphics file formats.
- Files containing tab separated values (.tsv) can be imported as tables using Standard Import.
- Export VDJ tools - Exports T-Cell VDJ repertoire in txt format.
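The three kinds of renaming mentioned for Rename Sequences in Lists and Rename Elements above (adding characters, removing characters, and replacing parts of names using regular expressions) can be illustrated with Python's re module. This is a sketch of the general idea only: the function names and example names are hypothetical, and the regular expression syntax accepted by the software may differ from Python's.

```python
import re


def add_prefix(name: str, prefix: str) -> str:
    """Add characters to the start of a name."""
    return prefix + name


def remove_trailing(name: str, count: int) -> str:
    """Remove a fixed number of characters from the end of a name."""
    return name[:-count] if count > 0 else name


def replace_pattern(name: str, pattern: str, replacement: str) -> str:
    """Replace every part of a name that matches a regular expression."""
    return re.sub(pattern, replacement, name)
```

For example, replace_pattern("sample_01_R1.fastq", r"_R[12]\.fastq$", "") strips a read-number suffix and file extension, leaving "sample_01".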
RNA-Seq and Expression Analysis improvements
- Support for chimeric protocols has been improved in RNA-Seq Analysis.
- RNA-Seq Analysis now reports biotypes with frequency 0 in the "Distribution of biotypes" table.
- When included in workflows, PCA for RNA-Seq and Create Heat Map for RNA-Seq can be run using just one sample as input, thus enabling their use for both multi-sample and single-sample analyses.
- The statistical comparisons generated by Differential Expression for RNA-Seq and Differential Expression in Two Groups include the Biotype when it is available from the expression samples used as input.
- When using Create Expression Browser with miRNA input, the miRBase ID is preserved.
- When performing Differential Expression for RNA-Seq on miRNA "group on mature" expression tables, the miRBase ID is now exposed in the Statistical Comparison Tables.
Demultiplex Reads
- Demultiplex Reads now supports setting barcodes from table elements in addition to importing barcodes from local files.
- The barcode import table format has been extended to support additional columns.
- When multiple elements are provided as input, the information in the Preview pane includes information obtained from across these. Previously, only the first input element was used for this.
BLAST related updates
- BLAST has been upgraded to BLAST+ 2.12.0, which includes a number of improvements and bug fixes. A full list of BLAST+ 2.12.0 changes can be viewed at http://www.ncbi.nlm.nih.gov/books/NBK131777.
- The list of databases available using BLAST at NCBI has been expanded, including the addition of ‘16S ribosomal RNA sequences (Bacteria and Archaea)’ and ‘28S ribosomal RNA sequences from Fungi type and reference material (LSU)’.
- When BLAST at NCBI is used with multiple query sequences, the job will continue even if particular sequences fail due to a problem. Results for successful searches (including those with no hits) are returned. Sequences missing from the results due to problems are recorded in the job log.
- Searches against the Patented protein sequences database using BLAST at NCBI work once again. Previously, these searches always failed, with a dialog message saying only that no hits were found even though an error was returned by the NCBI. For affected searches, the error was reported in the job log.
- Fixed an issue affecting BLAST HSP Tables where the calculation of percent overlap between BLAST hits in the reverse direction and the query sequence was based on a sequence length that was 2 base pairs too short, leading to incorrect values.
- Improvements have been made to make it less likely that a "CPU usage limit was exceeded" error will be returned when running blastp, blastx, tblastn or tblastx using BLAST at NCBI.
Importer and exporter improvements
- Multiple tables can be exported to a single file when using the following exporters: Tab delimited text, Annotation tab delimited text, Table CSV, Annotation CSV.
- A new custom reads option has been added to the Illumina importer. These extended options for fastq file import were added to support 10X data. For example, it is now possible to import three fastq files containing R1, R2 and I1 as paired reads, where I1 is added in front of R1.
- When exporting variant tracks to VCF format, variants that fall under thresholds to be exported can now optionally be excluded entirely from the resulting VCF file.
- When using the VCF export setting for complex variant representation "Reference overlap and depth estimate", complex overlapping reference alleles are now exported with a homozygous reference genotype.
- The list of supported GVF attributes in column 9 has been expanded when importing GVF files using the GFF2/GTF/GVF track importer.
- 1000 Genomes annotations are now better supported by the GFF2/GTF/GVF track importer.
- The Zygosity field is now included when exporting to GVF format.
- A subset of columns to export can be specified when exporting Mapping Coverage data.
Other improvements
- Copy Number Variant Detection (CNVs) can use coverage tables generated by QC for Targeted Sequencing as control mappings. Read mappings can still be used as control mappings.
- Copy Number Variant Detection (CNVs) allows different fold-change thresholds for deletions and amplifications.
- When working with paired reads, Trim Reads allows the trimming of a fixed number of bases to apply to only read 1, only read 2, or both reads of each pair.
- An option has been added to Extract Reads or Create Reads Track from Selection to allow just one member of a pair to be extracted when only one meets the extraction criteria.
- Extract Reads accepts stand alone read mappings in addition to reads tracks as input.
- Create Sample Report can take both the Graphical and the Supplementary Report created by QC for Sequencing Reads as input.
- An option has been added to Amino Acid Changes for using one letter amino acid codes in HGVS annotations.
- Filter on Custom Criteria now accepts expression tracks as input.
- In Quantify miRNA the option to select strand-specific analysis has been removed. The analysis is now always strand-specific.
- Remove Duplicate Mapped Reads now identifies duplicates based on the start position of reads, instead of both the start and end positions. This allows reads that have undergone quality trimming to be recognised as duplicates.
- The distance to consider around an intron-exon boundary when using Predict Splice Site Effect can be specified. Previously a length of 2 was always used.
- Create Mapping Graph can now generate graphs for forward read coverage and reverse read coverage.
- The Sample Reads tool is now named Subsample Sequence List. Peptide sequence lists are now accepted by this tool, in addition to nucleotide sequence lists.
- When a Track List and the tracks it refers to are copied in a single operation, the new copy of the Track List will refer to the new copies of the tracks. Previously, the new Track List continued to refer to the original tracks.
- For workflows with paired read import as part of the workflow run, and when the workflow is launched in batch mode, or contains Iterate elements, paired read handling is now the same as for the relevant NGS importer tools (Illumina, Fasta, Sanger) themselves, irrespective of how batch units are defined or organized. Previously when batch units were based on data organization and all files were in the same folder, each file was treated as a separate batch unit, irrespective of whether the Paired option was checked.
- Memory usage when launching workflows in batch mode has been improved.
- Trim Sequences specifies which version of the UniVec database was used, both in the report and in the history of the trimmed sequences output.
- The few tools that directly manipulate input elements, instead of generating a new element containing the changes as output, now generate a new element as output when used within a workflow. This allows them to be handled like any other tool in a workflow context.
- In addition to sequence elements, Add attB Sites accepts sequence lists with fewer than 10,000 sequences as input.
- Internal compression of CLC data has been improved. Elements created with this version of the software, with compression enabled, can be opened in version 21.0.5 and higher. Data must be exported or saved as uncompressed if sharing data with earlier versions of the software.
- Various minor improvements
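The change to Remove Duplicate Mapped Reads described in the list above, where duplicates are identified by start position rather than by both start and end positions, can be illustrated by comparing two grouping keys. This is a simplified sketch and not the tool's algorithm: the Read record, the key functions and the first-seen-wins policy are assumptions, and details such as paired reads and read sequence content are ignored.

```python
from collections import namedtuple
from typing import Callable, Hashable, List

# Hypothetical minimal representation of a mapped read.
Read = namedtuple("Read", "name chrom strand start end")


def start_end_key(read: Read) -> Hashable:
    """Earlier behaviour: reads must share both start and end positions."""
    return (read.chrom, read.strand, read.start, read.end)


def start_only_key(read: Read) -> Hashable:
    """Behaviour described in these notes: only the start position is compared."""
    return (read.chrom, read.strand, read.start)


def remove_duplicates(reads: List[Read], key: Callable[[Read], Hashable]) -> List[Read]:
    """Keep the first read seen for each key and drop later reads with the same key."""
    seen = set()
    kept = []
    for read in reads:
        k = key(read)
        if k not in seen:
            seen.add(k)
            kept.append(read)
    return kept
```

Two reads that start at the same position but end at different positions, for instance because one was quality trimmed, are kept as distinct under start_end_key but collapse to a single read under start_only_key.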
Bug fixes
- Fixed an issue in Create Box Plot where percentiles reported in the history of a box plot element were off by one. For example, the "25%-ile" value was given the 24th percentile value. The correct values were used in the plots themselves.
- Fixed an issue in Demultiplex Reads where dual barcodes were not allowed to have mismatches in both barcodes.
- Fixed an issue in Demultiplex Reads where dual barcodes could previously be selected in random combinations. Dual barcodes are now handled in pairs.
- When using the "Genome annotated with genes only" option in RNA-Seq Analysis, the range of annotation track types that can be used has been expanded. This includes the use of CDS annotation tracks, among others.
- Fixed an issue in Create Sample Report where, when QC thresholds had been specified for Trim Reads, wrong values from the Trim Reads report were shown in table 1.1 Quality Control of the sample report.
- Fixed an issue that caused Create Sample Report to fail when input reports did not contain values for specified QC thresholds.
- Fixed an issue in Combine Reports and Create Sample Report where the "Mean coverage per target" section would report coverages 10x too high when including a report from QC for Targeted Sequencing.
- Fixed an issue in VCF export where, in rare cases, variants below a specified minimum allele fraction threshold were not removed.
- Fixed an issue affecting Local Realignment where large indels upstream of a target region were sometimes not used when provided as guidance variants.
- Fixed an issue that in rare cases could cause Basic Variant Detection, Low Frequency Variant Detection and Fixed Ploidy Variant Detection to fail on very high coverage samples when the "Remove pyro-errors variants" option was enabled.
- Fixed an issue where Remove Duplicate Mapped Reads did not always de-duplicate paired reads with read-through correctly.
- Fixed an issue where Remove Duplicate Mapped Reads did not always de-duplicate reverse mapping single-end reads correctly.
- Fixed an issue affecting QC for Targeted Sequencing, where it failed with an error when an RNA-Seq read mapping containing paired reads was provided as input.
- Fixed an issue in Filter on Custom Criteria where numeric annotations were sometimes not allowed to be filtered using numerical operators such as "<", ">", "=".
- Fixed an issue in Trio Analysis where, in rare cases, inconsistent zygosity between mother and father could lead to a wrong annotation of inheritance. Trio Analysis now reports inheritance as 'Inconsistent zygosity' if zygosity or the number of alleles is inconsistent between child, mother or father.
- Fixed an issue with VCF files exported from the CLC Genomics Workbench, where fusions that had one breakpoint in common were represented in a way that prevented QIAGEN Clinical Insight Interpret from displaying the counts.
- Fixed an issue causing Quantify miRNA to fail when there were empty entries in the Accession column of miRBase.
- Fixed an issue where the names of outputs from Output elements attached directly to an Iterate element in workflows were not as intended when the metadata ({3}) placeholder was used. We generally recommend that the specific input number(s) to include in output names are specified when configuring workflows that contain control flow elements.
- Fixed an issue where the content of the recycle bin was not shown correctly after the recycle bin had been emptied.
- Various bug fixes
Changes
- The Sample Reads tool is now named Subsample Sequence List and is located under the Utility Tools | Sequence Lists subfolder of the Toolbox. The functionality of this tool has been expanded. See the Improvements listing above, or refer to the manual.
- The Extract Annotations tool is now named Extract Annotated Regions.
- The tool Set Up Experiment is now named Set Up Microarray Experiment.
- The “Number of duplicates distribution” section has been removed from the report produced by Remove Duplicate Mapped Reads.
- When exporting BAM files, file names are limited to a maximum of 254 characters.
- Input modifying tools within workflows generate an output element instead of directly modifying the input provided. Workflows containing these tools may need to be edited.
Legacy tools
The following tools are now legacy tools and will be retired in a future version of the software:
- Batch Rename (legacy). This tool has been replaced by two new tools: Rename Elements, for renaming data elements, and Rename Sequences in Lists, for renaming sequences within sequence lists.
- Empirical Analysis of DGE (legacy)
Functionality retirement
The following tools have been retired:
- Create Track from Experiment (legacy)
- Extract and Count (legacy)
- Download miRBase (legacy)
- Annotate and Merge Counts (legacy)
- Roche 454 NGS import (legacy)
- Create Combined RNA-Seq Report (legacy)
- Remove Reference Variants (legacy)
Compatibility
The following are the corresponding client applications for CLC Genomics Server 22.0:
- CLC Genomics Workbench 22.0
- CLC Main Workbench 22.0
- CLC Command Line Tools 22.0
CLC Genomics Server 22.0 is compatible with GCE version 22.0.
CLC Server Command Line Tools
Please see the CLC Genomics Server 22.0 listings above for the details about the new tools and features listed here.
New tools
Administrative tools
- install_plugin_download_and_restart
- list_plugins_download
- list_installed_plugins
- empty_recycle_bins
Utility tools
- split_sequence_list
- update_seq_attrs_in_list
- rename_seqs_in_seq_list
- rename_elements
Other tools
- ngs_import_mgi_bgi
- history_add
New and updated options for existing tools
Analysis related
- amino_acid_changes
  - option added: --one-letter-codon
- cnv_detection
  - option added: --minimum-fold-change-amplification
  - option added: --minimum-fold-change-deletion
  - option removed: --minimum-fold-change-magnitude
- create_sequence_statistics
  - option added: --extinction-coefficient
- differential_expression_rna_seq
  - option added: --metadata-table-tsv
- extract_overlapping_reads
  - option added: --only-matching-read-pair
- mapping_graph_tracks
  - option added: --forward-read-coverage
  - option added: --reverse-read-coverage
- ngs_import_illumina
  - option added: --reads-options
  - option added: --use-reads-options
- predict_splice_site
  - option added: --splice-window-size
- process_tagged_sequences
  - option added: --barcode-table-element
- quantify_small_rna
  - option removed: --strand-specific
- rna_seq
  - option added: --count-paired-reads-as-two
  - option added: --ignore-broken-pairs
  - option removed: --broken-pair-countingscheme
- changed: sample_reads
  - option added: --shuffle
- trim
  - option added: --first-read-trim
  - option added: --second-read-trim
Other updates
Export options added:
- heatmap_graphics
- immune_rept_vdj_tools
Import option added:
- table_tsv
Improvements
- If options with identical names are found within an element of a workflow, each of the corresponding CLC Server Command Line Tools parameters will have a number appended to ensure uniqueness.
- Job status updates are now more frequent and the time needed for handling finished jobs has been reduced.
Commands removed
Administrative commands removed
- empty_recycle_bin: replaced by the new command empty_recycle_bins.
- list_plugins: see the new administrative tools listed above for commands relating to plugin administration.
Analysis related commands removed
- create_combined_rnaseq_report
- experiment_to_track
- filter_reference_variants
- ngs_import_roche454
- small_rna_annotate
- small_rna_sampling
Bug fixes
- Fixed an issue where find_structure_algo was not listed as an available tool when it should have been.
QIAGEN CLC Genomics Server 21.0.6
Shared with Workbenches
Improvements
- When using the VCF export setting for complex variant representation "Reference overlap and depth estimate", complex overlapping reference alleles are now exported with a homozygous reference genotype.
- When exporting variant tracks to VCF format, variants that fall under thresholds to be exported can now optionally be excluded entirely from the resulting VCF file.
Bug fixes
- Fixed an issue in Copy Number Variant Detection (CNVs), where a subset of region CNVs were not correctly calculated on chromosomes with low coverage target regions in the control samples. The issue led to overly large region CNVs. As gene-level CNVs are calculated from region-level CNVs, these were also affected. When only a few target regions had low coverage in control samples, results were likely not affected.
- Fixed an issue that caused Create Sample Report to fail when input reports did not contain values for specified QC thresholds.
- Fixed an issue in VCF export where, in rare cases, variants below a specified minimum allele fraction threshold were not removed.
- Fixed an issue in Maximum Likelihood Phylogeny that in rare situations led to tree construction never completing.
- Fixed an issue with Create BLAST Database, which could fail if the underlying native BLAST tool reported warnings.
- Improvements have been made to make it less likely that a "CPU usage limit was exceeded" error will be returned when running blastp, blastx, tblastn or tblastx using BLAST at NCBI.
- Fixed an issue affecting the naming of workflow outputs defined using a naming pattern of the form {input:N} or its equivalent {2:N}, e.g. {input:1} or {2:1}. The intended input name may not have been the one used to form the output element names when the workflow included an Iterate element, and the batch units were folders, defined based on the organization of the input data, and import was done on the fly.
- Various minor bug fixes
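The {input:N} naming pattern referred to in the fix above (and its equivalent {2:N}) can be pictured with a short Python sketch. This is an illustration only, not the implementation used by the software: the function name, the sample input names and the assumption that N counts inputs from 1 are all hypothetical.

```python
import re

# Matches {input:N} and its equivalent form {2:N}, capturing N.
_INPUT_PLACEHOLDER = re.compile(r"\{(?:input|2):(\d+)\}")


def expand_input_placeholders(pattern, input_names):
    """Replace each {input:N} / {2:N} with the name of the N-th input.

    Assumption (not confirmed by the release notes): N is 1-based.
    """
    def substitute(match):
        index = int(match.group(1))
        if not 1 <= index <= len(input_names):
            raise ValueError(
                f"pattern refers to input {index}, "
                f"but only {len(input_names)} input(s) were given"
            )
        return input_names[index - 1]

    return _INPUT_PLACEHOLDER.sub(substitute, pattern)


# Hypothetical input names, for illustration only.
inputs = ["sampleA_reads", "targets"]
print(expand_input_placeholders("{input:1}_variants", inputs))  # sampleA_reads_variants
print(expand_input_placeholders("{2:2}_report", inputs))        # targets_report
```

Naming a specific input number in this way makes the output name independent of which input the software would otherwise pick by default.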
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 21.0.6.
- QIAGEN CLC Genomics Workbench 21.0.6
- QIAGEN CLC Main Workbench 21.0.6
- QIAGEN CLC Command Line Tools 21.0.6
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 21.0.1, 21.0.2, 21.0.3, 21.0.4 and 21.0.5, QIAGEN CLC Main Workbench 21.0.1, 21.0.2, 21.0.3, 21.0.4 and 21.0.5, and QIAGEN CLC Command Line Tools 21.0.1, 21.0.2, 21.0.3, 21.0.4 and 21.0.5 can also connect to QIAGEN CLC Genomics Server 21.0.6.
CLC Server Command Line Tools
Compatibility
CLC Command Line Tools 21.0.6 is the corresponding client for QIAGEN CLC Genomics Server 21.0.6.
CLC Command Line Tools 21.0.6 can also act as a client for the QIAGEN CLC Genomics Server 21.0.1, 21.0.2, 21.0.3, 21.0.4 and 21.0.5. However, we recommend running the corresponding version.
QIAGEN CLC Genomics Server 21.0.5
Shared with workbenches
Improvements
- GO Annotation File (GAF) 2.2 files can now be imported.
- Amino Acid Changes provides c. annotations for intronic regions when "Use transcript priorities" is enabled.
- The order of samples in reports generated by Create Sample Report when run inside workflows is now consistent between workflow runs. Previously, when multiple Collect and Distribute elements connected to a Create Sample Report element, the order of the samples in the report could differ between workflow runs.
Bug fixes
- Fixed an issue where housekeeping gene normalization was never used by Differential Expression in Two Groups.
- Fixed an issue where TPM expression values were incorrectly reported when using the RNA-Seq Analysis tool with Library Type set to 3' sequencing. Previously TPM was calculated per million mapped reads, instead of per million exonic reads. This resulted in TPMs that summed to less than 1 million, and made TPMs less comparable between libraries that had different proportions of intronic and/or intergenic fragments. Additional notes:
  - This issue did not affect results of downstream analyses using tools in the RNA-Seq and Small RNA Analysis folder, i.e. PCA, Heat Map, Venn diagrams, and Gene Set testing.
  - This issue did affect the TPM values of the Quantify QIAseq UPX 3' workflow, delivered by the Biomedical Genomics Analysis plugin.
- Fixed an issue causing Copy Number Variant Detection (CNVs) to fail if the order of the chromosomes in the gene and target region annotation tracks differed.
- Fixed an issue in Extract Consensus Sequence so that insertions are no longer added in low-coverage regions.
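The effect of the TPM fix for 3' sequencing described above can be illustrated with a small worked example. This is a simplified sketch, not the calculation performed by RNA-Seq Analysis: it assumes that each gene's value is just its exonic read count scaled per million, and the gene names and read counts are invented for illustration.

```python
def per_million(counts, denominator):
    """Scale each gene's read count to a per-million value."""
    return {gene: count / denominator * 1_000_000 for gene, count in counts.items()}


# Hypothetical exonic read counts per gene (illustrative numbers only).
exonic_counts = {"geneA": 600, "geneB": 300, "geneC": 100}
# Hypothetical mapped reads that fall in introns or between genes.
intronic_and_intergenic = 250

total_exonic = sum(exonic_counts.values())
total_mapped = total_exonic + intronic_and_intergenic

# Previous behavior: per million mapped reads.
old_style = per_million(exonic_counts, total_mapped)
# Corrected behavior: per million exonic reads.
new_style = per_million(exonic_counts, total_exonic)

print(round(sum(old_style.values())))  # 800000
print(round(sum(new_style.values())))  # 1000000
```

With the old denominator the values sum to less than 1 million, and the shortfall grows with the proportion of intronic and intergenic reads, which is why values were less comparable between libraries.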
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 21.0.5.
- QIAGEN CLC Genomics Workbench 21.0.5
- QIAGEN CLC Main Workbench 21.0.5
- QIAGEN CLC Command Line Tools 21.0.5
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 21.0.1, 21.0.2, 21.0.3 and 21.0.4, QIAGEN CLC Main Workbench 21.0.1, 21.0.2, 21.0.3 and 21.0.4, and QIAGEN CLC Command Line Tools 21.0.1, 21.0.2, 21.0.3 and 21.0.4 can also connect to QIAGEN CLC Genomics Server 21.0.5.
Advanced notice
We will not be distributing the Genomics Analysis Portal functionality from the CLC Genomics Server distribution with the next major release.
The following will be removed in a future release of the software:
- Create Combined RNA-Seq Report (legacy)
- Create Track from Experiment (legacy)
- Remove Reference Variants (legacy)
- Reverse Sequence (legacy)
- Roche 454 NGS import (legacy)
- Tools under the Small RNA Analysis (legacy) folder:
  - Extract and Count (legacy)
  - Download miRBase (legacy)
  - Annotate and Merge Counts (legacy)
CLC Server Command Line Tools
Bug fixes
- The -O <filename> option can again be used to write out the locations of results to a text file. In earlier 21.x releases, commands including this option failed with a message suggesting the tool being launched was not supported on the CLC Server or was not available to the person running the command.
Compatibility
CLC Command Line Tools 21.0.5 is the corresponding client for QIAGEN CLC Genomics Server 21.0.5.
CLC Command Line Tools 21.0.5 can also act as a client for the QIAGEN CLC Genomics Server 21.0.1, 21.0.2, 21.0.3 and 21.0.4. However, we recommend running the corresponding version.
QIAGEN CLC Genomics Server 21.0.4
Server specific
Bug fixes
- Fixed an issue that could cause jobs submitted to a grid setup to fail with exit code 78, "a plugin was not correctly distributed to the grid". This happened to jobs submitted after the Manage Plugins tab in the web administrative interface was opened and an update to an installed plugin was detected as available, and before the CLC Genomics Server was restarted. This issue did not affect jobs already in the queue or running at the point when the availability of a plugin update was detected.
- On job node systems where direct data transfer from client systems is not allowed, administrative activities relating to plugins (installation, updating, etc.) are now explicitly not allowed. Previously, actions could be taken, but changes were applied only to the master server, with actions on the job nodes hanging indefinitely.
- Fixed an issue that caused workflows with a De Novo Assembly element or a Transcript Discovery element to fail with an error if launched using the Genomics Analysis Portal.
- Fixed an issue that in rare instances could cause Create Sample Report to fail when run on a server with job nodes after updating plugins on the server.
Shared with workbenches
Improvements
- The speed of RNA-Seq Analysis jobs has been improved. Workflows containing an RNA-Seq Analysis element will need to be updated.
- The stability of SRA downloads has been improved, providing better support of large downloads, particularly on systems with less stable network connections.
- InDels and Structural Variants now discards reads that are longer than 5000 bp. Long reads could previously cause the tool to fail and are of minimal value for structural variant detection based on unaligned ends. Workflows containing an InDels and Structural Variants element will need to be updated.
- Various minor improvements
Bug fixes
- Fixed an issue where InDels and Structural Variants calculated variant ratios incorrectly in cases where multiple breakpoints supported the variant.
- Fixed an issue that caused Annotate with Repeat and Homopolymer Information to fail when variants were within 10 base pairs of chromosome ends.
- Fixed an issue that caused Download Blast Databases to occasionally fail when downloading a subset of databases from NCBI.
- Fixed an issue affecting naming patterns in export tools, and in workflow output and export elements, where upper case text within curly brackets in these patterns was translated to lower case when naming the outputs.
- Fixed an issue affecting workflows containing Iterate elements when the names of input data elements contained characters considered special by the operating system (e.g. on Windows : < > | ). When affected by this issue, no outputs would be produced or just a Workflow Result Metadata table would be produced.
- Fixed an issue that caused workflows with a Demultiplex Reads element to fail if the tag list options in that element were unlocked and no other tools in the workflow had unlocked parameters.
- Fixed an issue affecting metadata tables where the paths to data elements that had been moved were not updated to reflect the new location.
- Fixed an issue where Import Tracks from File did not retain COSMIC links in variant tracks imported from VCF.
- Fixed an issue affecting Import Tracks from File where importing COSMIC variation database did not support the QIAGEN reference set Homo_sapiens_sequence_hg38_no_alt_analysis_set.
Plugin notes
Changes have been made that improve the speed of jobs on the CLC Genomics Cloud Engine (GCE) that involve large CLC data elements. This improvement affects systems where the Cloud Server Plugin is installed and GCE-related Cloud Plugin settings have been configured.
Changes
The location of reference data available for download from QIAGEN via the CLC Genomics Server (e.g. QIAGEN reference sets, protein and resistance databases) is changing. For networks that use a whitelist approach, the sites to allow when configuring firewall settings are:
- reference.clcbio.com
- reference.clcbio.com.s3-website.eu-central-1.amazonaws.com
- genomics-cloud-reference-data-eu-central-1.s3-website.eu-central-1.amazonaws.com
No configuration changes are needed in the CLC Server Workbench itself. The full list of sites the software accesses is available in our FAQ entry: Which internet addresses does CLC software need access to?
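After updating firewall rules, it can be useful to confirm that the sites listed above are reachable from the machine running the server. The following Python sketch is one way to do that. It is not part of the CLC software; the function name is hypothetical, and the use of port 80 is an assumption that should be adjusted to match your network setup.

```python
import socket

# Hosts taken from the list in these release notes.
REFERENCE_DATA_HOSTS = [
    "reference.clcbio.com",
    "reference.clcbio.com.s3-website.eu-central-1.amazonaws.com",
    "genomics-cloud-reference-data-eu-central-1.s3-website.eu-central-1.amazonaws.com",
]


def check_hosts(hosts, port=80, timeout=5.0, connect=socket.create_connection):
    """Return {host: bool}, True if a TCP connection could be opened.

    Port 80 is an assumption. The connect argument exists so the
    function can be exercised without network access.
    """
    results = {}
    for host in hosts:
        try:
            connection = connect((host, port), timeout)
            connection.close()
            results[host] = True
        except OSError:
            # Covers DNS failures, refused connections and timeouts.
            results[host] = False
    return results
```

Calling check_hosts(REFERENCE_DATA_HOSTS) on the server machine returns a dictionary in which any host mapped to False is either blocked or otherwise unreachable.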
Advanced Notice
We are considering retiring the recently introduced Genomics Analysis Portal functionality. If this would affect your work, please get in touch with us about this by emailing us at ts-bioinformatics@qiagen.com.
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 21.0.4.
- QIAGEN CLC Genomics Workbench 21.0.4
- QIAGEN CLC Main Workbench 21.0.4
- QIAGEN CLC Command Line Tools 21.0.4
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 21.0.1, 21.0.2 and 21.0.3, QIAGEN CLC Main Workbench 21.0.1, 21.0.2 and 21.0.3, and QIAGEN CLC Command Line Tools 21.0.1, 21.0.2 and 21.0.3 can also connect to QIAGEN CLC Genomics Server 21.0.4.
CLC Server Command Line Tools
Compatibility
CLC Command Line Tools 21.0.4 is the corresponding client for QIAGEN CLC Genomics Server 21.0.4.
CLC Command Line Tools 21.0.4 can also act as a client for the QIAGEN CLC Genomics Server 21.0.1, 21.0.2 and 21.0.3. However, we recommend running the corresponding version.
QIAGEN CLC Genomics Server 21.0.3
Shared with workbenches
Bug fixes
- Fixed an issue that caused metadata layers to be displayed incorrectly on heatmaps produced by Create Heat Map for RNA-Seq. This issue affects analyses run using CLC Genomics Server 21.0.1 or 21.0.2, whether the tool is run independently, or included in workflows. We recommend deleting Heat Maps produced by affected software, and re-running the analyses. Please see the notification about this issue, which includes details about how to check if your results are affected.
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 21.0.3.
- QIAGEN CLC Genomics Workbench 21.0.3
- QIAGEN CLC Main Workbench 21.0.3
- QIAGEN CLC Command Line Tools 21.0.3
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 21.0.1 and 21.0.2, QIAGEN CLC Main Workbench 21.0.1 and 21.0.2, and QIAGEN CLC Command Line Tools 21.0.1 and 21.0.2 can also connect to QIAGEN CLC Genomics Server 21.0.3.
CLC Server Command Line Tools
Compatibility
CLC Command Line Tools 21.0.3 is the corresponding client for QIAGEN CLC Genomics Server 21.0.3.
CLC Command Line Tools 21.0.3 can also act as a client for the QIAGEN CLC Genomics Server 21.0.1 and 21.0.2. However, we recommend running the corresponding version.
QIAGEN CLC Genomics Server 21.0.2
Shared with workbenches
Improvements and bug fixes
- Fixed an issue where Demultiplex Reads run within a workflow context could not be run on paired reads where the barcode was defined on the mate.
- Various minor improvements
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 21.0.2.
- QIAGEN CLC Genomics Workbench 21.0.2
- QIAGEN CLC Main Workbench 21.0.2
- QIAGEN CLC Command Line Tools 21.0.2
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 21.0.1, QIAGEN CLC Main Workbench 21.0.1, and QIAGEN CLC Command Line Tools 21.0.1 can also connect to QIAGEN CLC Genomics Server 21.0.2.
CLC Server Command Line Tools
Bug fixes
- Fixed an issue where using the -I argument could result in an error if it was not the last argument provided in the command.
Compatibility
CLC Command Line Tools 21.0.2 is the corresponding client for QIAGEN CLC Genomics Server 21.0.2. It can also act as a client for the QIAGEN CLC Genomics Server 21.0.1. We recommend running corresponding versions of the CLC Command Line Tools and the CLC Genomics Server.
QIAGEN CLC Genomics Server 21.0.1
This is a compatibility release for the corresponding client software, QIAGEN CLC Genomics Workbench 21.0.1 and QIAGEN CLC Main Workbench 21.0.1.
Please see the release notes for CLC Genomics Server 21.0, below, for a full list of changes since the last general release of this software.
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 21.0.1.
- QIAGEN CLC Genomics Workbench 21.0.1
- QIAGEN CLC Main Workbench 21.0.1
- QIAGEN CLC Command Line Tools 21.0.1
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 21.0, QIAGEN CLC Main Workbench 21.0, and QIAGEN CLC Command Line Tools 21.0 can also connect to QIAGEN CLC Genomics Server 21.0.1.
CLC Server Command Line Tools 21.0.1
This is a compatibility release, as the corresponding client for the QIAGEN CLC Genomics Server 21.0.1.
Please see the release notes for CLC Server Command Line Tools 21.0, below, for a full list of changes since the last general release of this software.
Compatibility
CLC Command Line Tools 21.0.1 is the corresponding client for QIAGEN CLC Genomics Server 21.0.1. It can also act as a client for the QIAGEN CLC Genomics Server 21.0. We recommend running corresponding versions of the CLC Command Line Tools and the CLC Genomics Server.
QIAGEN CLC Genomics Server 21.0
Server specific
New features
Plugins and licenses can be downloaded via the web administrative interface
Software licenses and server plugins can be downloaded and updated directly via the web administrative interface, under the new Extensions tab. In association with this, plugin management has been moved to the new Extensions tab, and the display of information about installed plugins has been improved.
Genomics Analysis Portal
The Genomics Analysis Portal is a web browser based, graphical client for the CLC Genomics Server. Workflows can be submitted for analysis on the server from a web browser, analysis progress and status of analyses can be monitored, and results can be downloaded.
Enabling the Genomics Analysis Portal is described in the CLC Server Admin manual, while configuration and use is described in the Genomics Analysis Portal manual.
"Containerized" External applications
Portable, "containerized" external applications using Docker are supported on Linux-based CLC Server setups. Containers can be local or in a repository, such as Amazon Elastic Container Repository (AWS ECR) or Docker Hub. Like standard external applications, they can be run as individual tools or included in workflows.
User folders for user-level permissions
One or more File system locations can be configured to be used for "user home" folders, which are top-level folders within that location. A user is granted write access only to the folder with a name matching their username. These "user home" folders can be created automatically on user login, or created manually for select users.
Improvements
External applications
- Support has been added for automatic update of external applications.
- Tooltips, visible in the launch wizard, can be added for external applications parameters.
- The order that parameters are displayed in the Workbench launch wizard reflects the order in the external application configuration. Previously the order in the launch wizard reflected the time the parameters were added.
- The order of parameters in an external application can be rearranged directly in the configuration editor. Previously, the parameter needed to be deleted and then reinserted and reconfigured.
- The names of "standard out" and "standard error" to be shown in workflows, as well as the names of the files saved, can be configured.
- Improved presentation of parameter settings in the General configuration tab.
- The folder-level organization of external applications is reflected in the Export, Delete and Publish dialogs.
General
- The location where intermediate workflow results are stored can now be configured. They can be stored in a temporary directory of the system the workflow is executed on, or in a subfolder of the location where final results will be stored. The default is the latter, which is also the behavior of earlier CLC Genomics Server versions.
Bug fixes
- Fixed an issue where output naming patterns referring to specific inputs (such as {input:2}) produced default output names instead of the expected substituted ones. This issue arose for some workflows containing Iterate control flow elements when run on a CLC Genomics Server configured to "Submit tasks in each workflow block to a single node" (Workflow queuing option).
Changes
- The default setting for "Workflow queuing options" has been changed to "Submit tasks in each workflow block to a single node".
Shared with CLC Workbenches
New features and improvements
Full workflow support for Sanger sequence analysis
New features have been introduced, and improvements made, to support automated analyses of Sanger trace data using workflows.
Trim Sequences
- Trim Sequences is available on the CLC Genomics Server.
- Trim Sequences can be used in workflows.
- A new sequence element containing the trimmed sequences is output. Previously, the input was modified and saved.
- A report can be generated containing a summary of the number of reads trimmed and the reasons for the trimming. This report is supported by the Combine Reports tool.
- The UniVec database used in this tool has been updated to version 10.0 of UniVec_Core.
Other improvements supporting trace data analysis in workflows
- Trace data can be imported using on-the-fly import in workflows.
- Improved output naming by the Assemble Sequences to Reference and Assemble Sequences tools: The sample name is included in the file name and the sequence names in the output.
- Metadata-based naming is supported in workflows run in batch mode or with Iterate control flow elements through the use of new placeholders: {metadata} and {metadata:<columnname>}.
- The Secondary Peak Calling tool no longer modifies the input data element, but instead produces new elements as output. Note: This change requires that workflows containing this tool that were created in older versions of the software be manually updated. The old workflow element must be replaced by a new one. The recommended upgrade path for installed workflows containing the Secondary Peak Calling tool is to save a copy of the workflow in the Navigation Area using a version 20.x Workbench, and then open and manually update that workflow in the CLC Genomics Workbench 21.0 or CLC Main Workbench 21.0. The new workflow can then be installed, if desired.
New tools
- Create Sample Report creates a summary report of selected information from multiple reports relating to a single sample. Specific types of information can be specified for inclusion in the Quality Control section.
- Extract IsomiR Counts extracts information from the read mappings of each miRNA or other custom-added database type (e.g. piRNA), and collects the information across all mappings in a table that can be exported.
- Annotate with Repeat and Homopolymer Information adds annotations to variants by appending two new columns with information about repeat and homopolymer status.
- Merge Variant Tracks merges multiple variant tracks into a single track. Options are available for appending annotations from overlapping variants.
Extract IsomiR Counts, Annotate with Repeat and Homopolymer Information and Merge Variant Tracks were previously available via the Biomedical Genomics Analysis Server Plugin.
Workflow related
- When a workflow with Export elements is run in batch mode, the exported files from each batch run can be saved to separate folders.
- BED and VCF format files can be imported on-the-fly in workflows.
- On-the-fly import can be used without metadata when running workflows in batch mode, and when running workflows containing a single Iterate element.
- Name placeholders for output elements and export elements have been updated, and the naming of outputs of workflows run in batch mode can be more finely controlled.
- Improvements for Workflow Input elements
  - Workflow Input elements can be configured to limit the data input method to either selection of data elements from the Navigation Area or selection of files to be imported using on-the-fly import. The default is to allow the input method to be chosen when launching the workflow.
  - Workflow Input elements can be configured to limit the on-the-fly import types available when launching the workflow. Parameters of selected importers can also be locked or unlocked, as desired, defining whether the setting is configurable when launching the workflow.
- Additional configuration options for Iterate and Collect and Distribute workflow elements are available.
- When a workflow with Iterate elements is run with the "Batch" checkbox checked, the "Batch identifier" column in the Workflow Result Metadata table will contain the combined batch identifier, reflecting all levels of batching and iterations.
- The following tools are available to be included in workflows:
Performance improvements
- Create Tree is significantly faster when creating large trees when using the Jukes-Cantor distance measure.
- Performance has been improved when tools generating a large number of sequences (for example, Trim Reads) are run on a system with many threads.
- Substantial speed improvements have been made to the Basic Variant Detection, Fixed Ploidy Variant Detection and Low Frequency Variant Detection tools.
- The speed of Demultiplex Reads has been improved when it is run on machines with many cores.
- The speed of Copy Number Variant Detection (CNVs) has been improved.
- When running Map Reads to Reference using a reference that has been downloaded through the Download Genomes functionality, it is now faster to determine whether an already cached reference index can be re-used.
- Performance improvements have been made for the calculation of generalized linear models in Differential Expression for RNA-Seq and Differential Expression in Two Groups. This can lead to slightly different results, with changes typically smaller than one part in ten thousand.
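When comparing results produced before and after this change, a relative-tolerance comparison is one way to separate the expected small numerical differences from larger ones. The following is a minimal sketch only: the function name, the example values, and the use of 1e-4 as the tolerance (taken from the "one part in ten thousand" figure above) are assumptions for illustration, not part of the software.

```python
import math

# Relative tolerance corresponding to "one part in ten thousand" (illustrative assumption).
REL_TOL = 1e-4

def within_expected_change(old_value: float, new_value: float, rel_tol: float = REL_TOL) -> bool:
    """Return True if the two values agree within rel_tol, relative to the larger magnitude."""
    return math.isclose(old_value, new_value, rel_tol=rel_tol)

# Hypothetical values from two runs of the same analysis.
old_results = {"geneA": 2.00000, "geneB": -1.50000}
new_results = {"geneA": 2.00010, "geneB": -1.50200}

# Features whose values changed by more than the expected amount.
changed = [g for g in old_results if not within_expected_change(old_results[g], new_results[g])]
print(changed)
```

Values that are exactly zero never compare as close under a purely relative tolerance, so an absolute tolerance may also be needed in practice.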
Export
- Exported files can be saved into subfolders of the selected output area by using a forward slash character / at the start of the custom file name definition.
- Graphics export of Tracks, Track lists, Sequences, Alignments and Read mappings is supported as a standard export, which can be embedded into workflows and executed on a CLC Genomics Server. This feature is intended for high-throughput applications.
- The naming pattern for files exported using the fastq exporter has been updated to be in line with the naming format the Illumina importer expects. The exported file names now end with "_R1.fastq" and "_R2.fastq". Previously, the extension ".R1.fastq" was used when exporting a single file and, if pairs were exported to two files, the second file had the extension ".R2.fastq". (The first "." in the original naming has been replaced by an "_".)
- Export VCF has been updated:
- It supports the export of CNV and fusion data.
- If multiple elements have been selected for export, there is an option for exporting them to a single file.
- It uses the value "." to represent missing variant annotations.
- Special characters in variant annotations are exported using percent encoding, as specified in VCF 4.3.
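The last two points can be illustrated with a short sketch. The character table below follows the list of characters with special meaning in the VCF 4.3 specification; the function name and the choice to treat an empty string as a missing value are assumptions made for this example, and the exporter itself may handle more cases.

```python
# Characters with special meaning in VCF 4.3 and their percent encodings.
VCF_SPECIAL = {
    "%": "%25",
    ":": "%3A",
    ";": "%3B",
    "=": "%3D",
    ",": "%2C",
    "\r": "%0D",
    "\n": "%0A",
    "\t": "%09",
}

def encode_annotation(value):
    """Encode one annotation value for a VCF field.

    None or an empty string is written as '.', the missing-value marker
    (treating the empty string as missing is an assumption of this sketch).
    """
    if value is None or value == "":
        return "."
    # Each character is looked up once, so an encoded '%' is never encoded again.
    return "".join(VCF_SPECIAL.get(ch, ch) for ch in value)

print(encode_annotation("p.Gly12Asp;score=0.9"))
print(encode_annotation(None))
```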
Illumina importer
- The "Paired reads" option is enabled by default.
- Improved validation when the "Paired reads" option is enabled. The names of the pairs of files are validated as follows:
- If the file names follow the Illumina naming format, the two files are required to have the same sample name and lane.
- If the file names do not follow the Illumina naming format, but _R1/_R2 is detected in the names, the first file must contain _R1 and the second file must contain _R2.
- If the "Join reads from different lanes" option is enabled, the detected lane, in the format _L001, must be the same for both files.
- If a pair of files does not meet the requirements above, a message is printed in the log and the pair of files is skipped.
- Improved naming of the imported elements:
- If the imported files follow the Illumina naming format, the imported elements no longer contain the _R1_001 suffix.
- Otherwise, if _R1 / _R2 is detected in the names of the files, it is removed from the name of the imported elements.
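The validation and naming rules above can be sketched roughly as follows. This is not the importer's implementation: the regular expression is an assumed rendering of the Illumina naming format (for example Sample1_S1_L001_R1_001.fastq.gz), the function names are hypothetical, the removal of the file extension is an assumption, and the lane rule for "Join reads from different lanes" is left out.

```python
import re

# Assumed pattern for the Illumina naming format, e.g. Sample1_S1_L001_R1_001.fastq.gz
ILLUMINA = re.compile(
    r"^(?P<sample>.+)_S\d+_L(?P<lane>\d{3})_R(?P<read>[12])_001\.fastq(\.gz)?$"
)

def valid_pair(first: str, second: str) -> bool:
    """Check a pair of file names against the rules described above."""
    m1, m2 = ILLUMINA.match(first), ILLUMINA.match(second)
    if m1 and m2:
        # Illumina format: same sample name and same lane are required.
        return m1["sample"] == m2["sample"] and m1["lane"] == m2["lane"]
    if any(tag in name for name in (first, second) for tag in ("_R1", "_R2")):
        # Not Illumina format, but _R1/_R2 detected: order must be R1 then R2.
        return "_R1" in first and "_R2" in second
    # No naming convention detected: nothing to validate in this sketch (assumption).
    return True

def element_name(file_name: str) -> str:
    """Derive an element name from the first file of a pair."""
    # Assumption: the file extension is not part of the element name.
    stem = re.sub(r"\.fastq(\.gz)?$", "", file_name)
    if ILLUMINA.match(file_name):
        return re.sub(r"_R[12]_001$", "", stem)  # drop the _R1_001 suffix
    return re.sub(r"_R[12]", "", stem, count=1)  # drop a detected _R1 / _R2

print(valid_pair("Sample1_S1_L001_R1_001.fastq.gz", "Sample1_S1_L001_R2_001.fastq.gz"))
print(element_name("Sample1_S1_L001_R1_001.fastq.gz"))
```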
Local Realignment
- A restriction has been removed from Local Realignment that prevented paired reads from being realigned when that realignment would change which read was left-most on the reference. The overall effect of this change is to increase the likelihood of detecting insertions in rare cases.
- Improvements have been made when realigning large insertions at the beginning of reads.
- The "Allow guidance insertion mismatches" and "Maximum guidance-variant length" options are now enabled only when a guidance-variant track is provided.
- Fixed an issue that caused reads with unaligned ends stretching over a chromosome boundary to be removed from the mapping.
- Local Realignment respects the CPU limit defined via the web administrative interface, if a limit has been set.
QC for Targeted Sequencing
- A new option in QC for Targeted Sequencing allows a custom list of coverage levels to be specified.
- The report includes the complete set of chromosomes in the "Targeted region overview" section when using references with up to 200 chromosomes. Previously the limit was 100 chromosomes. This change means the hg38_no_alt_analysis_set reference data set, available from the Reference Data Manager, is now supported.
- The report has been extended with values reporting the number and percentage of base positions in target regions with coverage above or equal to the minimum threshold.
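As an illustration of the values described above, the number and percentage of target positions at or above each coverage level could be computed as in the sketch below. The function name, the input representation (one coverage value per target base position), and the example data are assumptions; the tool's own calculation is not shown in these notes.

```python
def coverage_summary(per_base_coverage, levels):
    """For each coverage level, return (count, percentage) of positions with coverage >= level."""
    total = len(per_base_coverage)
    summary = {}
    for level in levels:
        count = sum(1 for c in per_base_coverage if c >= level)
        percentage = 100.0 * count / total if total else 0.0
        summary[level] = (count, percentage)
    return summary

# Hypothetical coverage at eight target positions and a custom list of coverage levels.
coverage = [0, 5, 12, 30, 30, 45, 100, 8]
summary = coverage_summary(coverage, [1, 10, 30])
for level, (count, percentage) in summary.items():
    print(f">= {level}x: {count} positions ({percentage:.1f}%)")
```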
Other improvements
- Improved the alignment quality for read mappings by removing aligned ends with an alignment score of zero. As a result, some alignments will be shorter and may be filtered away because they no longer pass the minimum length fraction criterion. Tools benefiting from this change include Map Reads to Reference, RNA-Seq Analysis, Map Reads to Contigs and Map Bisulfite Reads to Reference.
- Option names and other information in the wizards for the Trim Reads tool and the corresponding workflow element have been updated for clarity and consistency.
- De Novo Assembly reports can be used as inputs to the Combine Reports tool.
- A new option, "Filter on average expression for FDR correction", is available in Differential Expression for RNA-Seq and Differential Expression in Two Groups. When checked, automatic, independent filtering prior to FDR correction is carried out, with the aim of increasing power.
- Plots and tables generated by QC for Sequencing Reads have better usability, especially when working with long reads. Tables with more than 500 data points now show the first 100 entries and then bin remaining data points, based on range. In graphs, end positions with a coverage below 0.005% across the reads are not included.
- QC for Sequencing Reads reports the QC metrics separately for different read types found in the input data: unpaired reads, R1 reads and R2 reads.
- In Quantify miRNA, the minimum value for the setting "Minimum sequence length", used for seed counting, has been changed to 8. (The seed is a 7 nucleotide sequence from positions 2-8 on the mature miRNA.)
- The Quantify miRNA outputs, the "Grouped on mature" and "Grouped on seed" tables, contain links to miRBase.
- A new section has been added to the Call Methylation Levels report containing details of read conversion and direction.
- Remove Duplicate Mapped Reads outputs reads in a deterministic order.
- The "Reads trimmed (%)" column in the "Trim summary" section of the Combine Reports output has been removed as it was a duplicate of the "Reads after trim (%)" column.
- Custom attributes can be configured in a data location such that attribute values are not copied when copying data elements.
- Annotate with Overlap Information and Filter Based on Overlap now count insertions and zero-length annotations as overlapping a region when they overlap either border. E.g. when an insertion is right on the border of a gene, we say that the insertion overlaps the gene.
- The SRA toolkit has been updated to version 2.10.7.
- Various minor improvements
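The border rule for Annotate with Overlap Information and Filter Based on Overlap, described above, can be pictured with a small sketch. The coordinate convention (0-based, half-open intervals, with an insertion written as a zero-length interval at the position where it occurs) and the function name are assumptions chosen for this example, as is the treatment of annotations that are not zero-length.

```python
def overlaps(a_start: int, a_end: int, r_start: int, r_end: int) -> bool:
    """Decide whether annotation [a_start, a_end) overlaps region [r_start, r_end).

    Zero-length annotations (such as insertions) count as overlapping when they
    sit on either border of the region. Other annotations use the usual
    half-open interval test (assumption of this sketch).
    """
    if a_start == a_end:
        return r_start <= a_start <= r_end
    return a_start < r_end and r_start < a_end

# A gene spanning positions 100..199 (half-open end 200).
print(overlaps(100, 100, 100, 200))  # insertion on the left border
print(overlaps(200, 200, 100, 200))  # insertion on the right border
print(overlaps(201, 201, 100, 200))  # insertion one position past the gene
```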
Bug fixes
- Fixed an issue affecting Filter on Custom Criteria when included in a workflow with the filtering step option unlocked. If criteria were updated, added, or removed in the launch wizard, the updated criteria were not used in the first run of the workflow with these updated values. Instead, the old criteria were used in that run. In subsequent runs, the updated values were used.
- Fixed an issue affecting read mappings where a short deletion was preferred to a mismatch for equal scoring alignments. Tools benefiting from this change include Map Reads to Reference, RNA-Seq Analysis, Map Reads to Contigs and Map Bisulfite Reads to Reference.
- Fixed an issue in Trim Reads where length filters were applied before automatic read-through adapter trimming was done, if it was enabled. This could result in reads shorter than minimum length settings being included in the output.
- Fixed an issue affecting Basic Variant Detection, Fixed Ploidy Variant Detection and Low Frequency Variant Detection, where forward coverage or reverse coverage could be reported as being higher than it was when looking for very low frequency variants with very low minimum count values.
- Fixed a bug where IonTorrent SAM files with special characters in the sample name could not be imported in separate folders.
- Fixed an issue where Map Reads to Reference could occasionally ignore reads when encountering a read with an unaligned end that wraps twice around a chromosome.
- Fixed an issue in Quantify miRNA where the isomiRs associated with a reference mir-rna were not all consistently named using the miRBase isomiR nomenclature (http://www.mirbase.org/help/nomenclature.shtml).
- Fixed an issue in Create Heat Map for RNA-Seq affecting the "Fixed number of features" option, where one member of the set of most variable genes or transcripts was missing from those used in the analysis, with a slightly less variable feature included instead.
- Fixed an issue in Create Heat Map for RNA-Seq, where the "Filter by statistics" option could not be used with miRNA expression data.
- Fixed an issue in Create Heat Map for RNA-Seq, where the history of heat maps did not include the name or the version of the tool used.
- Fixed an issue where RNA-Seq Analysis failed if a read mapped across 2 exons of a gene, where those 2 exons spanned the origin of a chromosome.
- Fixed an issue where RNA-Seq Analysis failed if a gene or mRNA spanned the origin of a chromosome and that chromosome was marked as linear. We now ignore these mRNAs.
- Fixed an extremely rare issue where RNA-Seq Analysis could fail when the positions of genes (or transcripts) were defined with respect to a sequence that was not part of the genome. An example of this kind of annotation is the remote entry identifier allowed by GenBank flat file format (see http://www.insdc.org/files/feature_table.html#3.4). These genes and transcripts are now filtered away prior to the tool being run.
- Fixed an issue that caused Combine Reports to occasionally fail when combining reports with summary information shown as plots.
- Fixed an issue with Combine Reports where, when combining RNA-Seq reports, warning messages for the "Distribution of biotypes" section could be present when they should not have been.
- Fixed an issue where a wrongly formatted VCF file could make the VCF importer terminate instead of writing the error to the log.
- Fixed an issue where Transcription Factor ChIP-Seq would exit with an error when given a read mapping with a circular reference sequence with coverage across all bases.
- Fixed an issue affecting the Basic Variant Detection, Fixed Ploidy Variant Detection and Low Frequency Variant Detection tools, where complex indels were reported in regions where the reference had a sequence of Ns. This error was introduced in CLC Genomics Server 20.0.2.
- Fixed an issue that could cause De Novo Assembly to occasionally fail when assembling paired data with both the "Auto detect paired distances" and "Map reads back to contigs (slow)" options enabled.
- Fixed an issue with links to HGNC in gene tracks imported from GFF3 files using "Import Tracks from File" and in some Refseq gene tracks provided via the Reference Data Manager.
- Fixed an issue in Excel importer, where the presence of certain formulas would previously prevent successful import.
- Fixed an issue where, if BLAST at NCBI failed with an error, no error would be shown and instead no hits were returned.
- Fixed a bug where some workflows using a Collect and Distribute element with multiple output channels did not pass the correct inputs to a tool after the Collect and Distribute element.
- Various minor bug fixes
Changes
- The Java version bundled with CLC Genomics Server 21.0 is Java 11.0.8, where we use the JRE from AdoptOpenJDK.
- The read mapping tool used by various tools in the CLC Genomics Server (e.g. Map Reads to Reference, RNA-Seq Analysis, Map Reads to Contigs and Map Bisulfite Reads to Reference) has been updated for this release and corresponds to the version in CLC Assembly Cell 5.2.1. Other binaries are unchanged and continue to correspond to the versions in CLC Assembly Cell 5.1.1.
- The default base name for the element being exported is designated using the placeholder {name}, instead of {input}. The numeric equivalent, {1}, is unchanged. The default export naming pattern has correspondingly been changed to {name}.{extension}. This change also applies to exports configured in External Applications.
- The default expect value (e-value) for BLAST at NCBI is 0.05 and the maximum number of hits is 5000, aligning with the defaults used at the NCBI.
- Changes have been made to the handling of sequence identifiers when using Create BLAST Database. This change allows continued flexibility in the naming of sequences used for making these databases, avoiding direct exposure to limitations present in the underlying BLAST+ program, makeblastdb, such as not allowing long or duplicate sequence names. Further details are provided in our FAQ area.
- The option "Reports originate from a single sample" has been removed from the Combine Reports tool. For generation of a single sample combined report, please use the new Create Sample Report tool.
- The "Chromosome M name" option in Trio Analysis has been renamed to "Chromosome MT name", with default value "MT" instead of "M".
- The creation of Workflow Result Metadata tables is optional when running workflows on the CLC Genomics Server.
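To illustrate the export naming change listed above, the sketch below expands the {name}, {1} and {extension} placeholders in a naming pattern. It is only an illustration of how such a pattern resolves: the function name is hypothetical, other placeholders supported by the software are not covered, and unknown placeholders are simply left untouched here.

```python
import re

def expand_pattern(pattern: str, name: str, extension: str) -> str:
    """Expand {name}, its numeric equivalent {1}, and {extension} in a naming pattern."""
    values = {"name": name, "1": name, "extension": extension}
    # A single pass over the pattern, so inserted values are never expanded again.
    return re.sub(r"\{(\w+)\}", lambda m: values.get(m.group(1), m.group(0)), pattern)

# The default export naming pattern, applied to a hypothetical element name.
print(expand_pattern("{name}.{extension}", "sample1 variants", "vcf"))
print(expand_pattern("{1}.{extension}", "sample1 variants", "vcf"))
```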
Functionality retirement
Tools
- Reverse Sequence
Compatibility
The following are the corresponding client applications for CLC Genomics Server 21.0:
- CLC Genomics Workbench 21.0
- CLC Main Workbench 21.0
- CLC Command Line Tools 21.0
CLC Genomics Server 21.0 is compatible with GCE version 21.0.
Plugin notes
- The Ingenuity Variant Analysis Server Plugin has been retired. The Ingenuity Variant Analysis service has been replaced by QCI Interpret Translational. Please email ts-bioinformatics@qiagen.com if you need further information about this.
- The Advanced Structural Variant Detection Server Plugin (beta) has been retired. An improved tool, Structural Variant Caller, is available in the Biomedical Genomics Analysis Server Plugin.
Advanced notice
The following will be removed in a future release of the software:
- Create Combined RNA-Seq Report (legacy)
- Create Track from Experiment (legacy)
- Remove Reference Variants (legacy)
- Reverse Sequence (legacy)
- Roche 454 NGS import (legacy)
- Tools under the Small RNA Analysis (legacy) folder:
- Extract and Count (legacy)
- Download miRBase (legacy)
- Annotate and Merge Counts (legacy)
CLC Server Command Line Tools 21.0
New features
- -Y Include this option in a command to run it asynchronously.
- -I Get information about particular processes, or list all processes submitted by the user running the command.
- -R Get the results of finished processes or cancel processes.
New tools
Analysis related
- anno_with_repeat_and_homopoly_info
- create_sample_report
- extract_isomir_counts
- merge_variant_tracks
- trim_sequences
New exporters
- alignment_graphics
- mapping_graphics
- sequence_graphics
- track_graphics
- track_list_graphics
- trim_sequences
New importers
- trace_files_import
Utility tools
- request_install_server_license
- user_home_read_settings
- user_home_write_setting
New and updated options for existing tools
Analysis related
- process_tagged_sequences - new options added
--aux-values
--barcode-columns
--barcode-table-file
--mapping-source-type
--name-column
- statistics_target_regions - new options added
--coverage-levels replaces the old option --report-type, which has been removed
--custom-coverage-levels
- For export of vcf
--onefile <Boolean>
Utility tools
Options added
- rm - new option added
--direct
Options removed
- combine_reports
--single-sample (The new create_sample_report tool caters for this situation)
Changes
- ngs_import_illumina - The default for the --paired-reads option is now true. It was previously false.
Commands removed
- reverse_sequence
Bugfixes
- Fixed an issue that caused information about libraries intended for the LICENSE and NOTICE files to be omitted.
Other changes
- The Java version bundled with CLC Server Command Line Tools 21.0 is Java 11.0.8, where we use the JRE from AdoptOpenJDK.
QIAGEN CLC Genomics Server 20.0.5
Shared with workbenches
Improvements and Changes
- The QC for Targeted Sequencing report now includes the complete set of chromosomes in the "Targeted region overview" section when using references with up to 200 chromosomes. Previously the limit was 100. This change means the hg38_no_alt_analysis_set reference data set, available from the Reference Data Manager, is now supported.
- The SRA toolkit has been updated to version 2.10.7.
- The maximum number of hits returned by BLAST at NCBI is now 5000, aligning with the defaults used at the NCBI.
- The speed of Copy Number Variant Detection (CNVs) has been improved.
- Various minor improvements
Bug fixes
- Fixed an issue where Map Reads to Reference could occasionally ignore reads when encountering a read with an unaligned end that wraps twice around a chromosome.
- Fixed an issue in Local Realignment that caused reads with unaligned ends stretching over a chromosome boundary to be removed from the mapping.
- Fixed an issue in Create Heat Map for RNA-Seq affecting the "Fixed number of features" option, where one member of the set of most variable genes or transcripts was missing from those used in the analysis, with a slightly less variable feature included instead.
- Fixed an issue in Create Heat Map for RNA-Seq, where the "Filter by statistics" option could not be used with miRNA expression data.
- Fixed an issue where the history of heat maps produced by Create Heat Map for RNA-Seq did not include the name or the version of the tool.
- Fixed an issue where, if BLAST at NCBI failed with an error, no error would be shown and instead no hits were returned.
- Fixed an issue with Combine Reports where, when combining RNA-Seq reports, warning messages for the "Distribution of biotypes" section could be present when they should not have been.
- Fixed a bug where some workflows using a Collect and Distribute element with multiple output channels did not pass the correct inputs to a tool after the Collect and Distribute element.
- Various minor bugfixes
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 20.0.5.
- QIAGEN CLC Genomics Workbench 20.0.5
- QIAGEN CLC Main Workbench 20.0.5
- QIAGEN CLC Command Line Tools 20.0.5
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 20.0, 20.0.1, 20.0.2, 20.0.3 and 20.0.4, QIAGEN CLC Main Workbench 20.0, 20.0.1, 20.0.2, 20.0.3 and 20.0.4, and QIAGEN CLC Command Line Tools 20.0, 20.0.1, 20.0.2, 20.0.3 and 20.0.4 can also connect to QIAGEN CLC Genomics Server 20.0.5. Tools that have changed between versions cannot be launched when using compatible, but not corresponding, client-server combinations.
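The distinction between corresponding and compatible clients can be expressed as a small check. This sketch encodes only what the paragraph above states for workbench and command line clients connecting to a server (same version is corresponding; an earlier version in the same release line is compatible). The function names and the "not listed" label are assumptions, and the sketch does not cover the separate statement about CLC Command Line Tools acting as a client for earlier servers.

```python
def parse_version(version: str) -> tuple:
    """Turn '20.0' or '20.0.5' into a 3-part tuple, padding missing parts with 0."""
    parts = [int(p) for p in version.split(".")]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)

def client_status(client: str, server: str) -> str:
    """Classify a client version relative to a server version."""
    c, s = parse_version(client), parse_version(server)
    if c == s:
        return "corresponding"
    if c[:2] == s[:2] and c < s:
        # Earlier client in the same release line: can connect, but tools that
        # changed between versions cannot be launched.
        return "compatible"
    return "not listed"

for client in ("20.0.5", "20.0", "20.0.4", "21.0"):
    print(client, client_status(client, "20.0.5"))
```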
CLC Server Command Line Tools
Bug fixes
Fixed an issue that caused information about libraries intended for the LICENSE and NOTICE files to be omitted.
Compatibility
CLC Command Line Tools 20.0.5 is the corresponding client for QIAGEN CLC Genomics Server 20.0.5.
CLC Command Line Tools 20.0.5 can also act as a client for the QIAGEN CLC Genomics Server 20.0, 20.0.1, 20.0.2, 20.0.3 and 20.0.4. However, we recommend running the corresponding version.
QIAGEN CLC Genomics Server 20.0.4
Shared with Workbenches
Improvements
- The NCBI nucleotide database "Betacoronavirus", focused on SARS-CoV-2, has been added to BLAST at NCBI.
- Names of extracted consensus sequences from the Extract Consensus Sequence tool now have the pattern: "<input name> <reference name> consensus".
- The COSMIC track importer has been updated to support version 91 of the COSMIC Mutation Data format. Due to insufficient information in version 90, we are not able to support that particular version. Older versions are still supported.
- The SRA Toolkit has been updated to version 2.10.5 on Linux and Mac OS X. This improves stability in environments with unreliable network connections.
Bug fixes
- Fixed an issue where options of the Extract Annotations tool were not configurable in workflows.
- Fixed an issue with Filter on Custom Criteria where filtering on the Regions column using operators >=, <=, and != would remove all variants or annotations.
- Fixed an issue that could occasionally cause the InDels and Structural Variants tool to fail with an error message about a breakpoint missing at a particular location.
- Fixed an issue where unnecessary, empty output folders could be generated by analyses run in batch mode. This happened when the "Create subfolders per batch unit" option was enabled, and the "Include" or "Exclude" field had been used to specify elements within each batch unit to analyze, such that some batch units were empty.
- Fixed an issue where the Standard Importer for Fasta Alignments would fail when importing multiple files.
- Fixed an issue where, when importing a Clone Manager file (.cm5), a sequence length annotation would be ignored and the full sequence imported.
- Fixed an issue that could occasionally cause the export to PDF format of some reports to fail with an error.
- Various minor bugfixes
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 20.0.4.
- QIAGEN CLC Genomics Workbench 20.0.4
- QIAGEN CLC Main Workbench 20.0.4
- QIAGEN CLC Command Line Tools 20.0.4
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 20.0, 20.0.1, 20.0.2, and 20.0.3, QIAGEN CLC Main Workbench 20.0, 20.0.1, 20.0.2, and 20.0.3, and QIAGEN CLC Command Line Tools 20.0, 20.0.1, 20.0.2, and 20.0.3 can also connect to QIAGEN CLC Genomics Server 20.0.4. Tools that have changed between versions cannot be launched when using compatible, but not corresponding, client-server combinations.
Advanced notice
The following tools will be removed in a future release of the software:
- Create Combined RNA-Seq Report (legacy)
- Create Track from Experiment (legacy)
- Remove Reference Variants (legacy)
- Reverse Sequence (legacy)
- Roche 454 NGS import (legacy)
- Extract and Count (legacy)
- Download miRBase (legacy)
- Annotate and Merge Counts (legacy)
The following functionality will be removed in a future release:
- The graphical representation of installed workflows and the Test button, which appear under the Workflows tab of the web administrative interface, are scheduled for removal.
- Information about installed workflows, including validation issues, will continue to be provided under the Workflows tab. Testing installed workflows will continue to be possible by launching from a CLC Workbench or the CLC Server Command Line Tools.
- The "Run in Batch Mode" functionality used for launching workflows with multiple inputs from a CLC Workbench is now legacy. Such workflows can now be launched in batch mode from a CLC Workbench by checking the "Batch" checkbox when selecting the input data.
If you are concerned about these proposed changes, please contact our Support team by emailing ts-bioinformatics@qiagen.com.
CLC Server Command Line Tools
Compatibility
CLC Command Line Tools 20.0.4 is the corresponding client for QIAGEN CLC Genomics Server 20.0.4.
CLC Command Line Tools 20.0.4 can also act as a client for the QIAGEN CLC Genomics Server 20.0, 20.0.1, 20.0.2, and 20.0.3. However, we recommend running the corresponding version.
QIAGEN CLC Genomics Server 20.0.3
Server specific
Improvements
- A minimal set of fonts is now included with the CLC Genomics Server. Operations requiring fonts, such as PDF export, can now be successfully carried out without installing the 'fontconfig' software package and at least one other font on the server.
Shared with workbenches
Bug fixes
- Fixed an issue with Create BLAST Database, where BLAST databases could not be created if the sequence set included entries with identifiers longer than 50 characters or of a form similar to PDB identifiers. This was due to new requirements introduced with NCBI BLAST+ 2.8.1. To address this, we have replaced the BLAST+ 2.9.0 makeblastdb tool, used by Create BLAST Database, with the version from BLAST+ 2.6.0. This is the same version used in CLC Genomics Workbench 12.x. This change does not affect the searching of BLAST databases using this or earlier supported versions.
- Fixed an issue where the output naming pattern of a workflow was not properly respected if output channels of one or more elements were connected to both a Collect and Distribute element and to an Output element.
- Fixed an issue where, after installing a workflow with bundled data and restarting the Workbench, the installed workflow could not be run, with a message in the wizard incorrectly reporting that the bundled data "is missing".
- Fixed an issue that could prevent the export of graphics in PDF format, if an older version of the software had previously been used to export graphics.
- Fixed an issue that prevented printing to certain types of printers on Windows 10 systems.
- Fixed an issue where the overlap of a small minority of annotations or variants was incorrectly determined. Known manifestations, addressed by this fix, occurred in tools delivered by plugins:
- Biomedical Genomics Analysis when installed on CLC Genomics Workbench 20.0, 20.0.1 or 20.0.2
- Results of Annotate RNA Variants, a tool included in the Perform QIAseq Multimodal Analysis (Illumina) and Perform QIAseq RNA Fusion XP Analysis workflows. The problem could affect the following annotations on variants called at a small number of specific positions: "Matches known intron", "Possible splice signatures" and "Conserved splice signature". Further details are available, including information relating to the expected (very small) magnitude of the problem.
- Transcript Discovery when installed on CLC Genomics Workbench 20.0.2 or earlier versions
- The Transcript Discovery tool occasionally identified an incorrect exon boundary. Due to the expected level of sensitivity and precision of this tool, we expect this to have very little impact in practice.
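For the Create BLAST Database fix described in the bug fixes above, a quick way to see whether a sequence set contains identifiers over the 50 character limit mentioned there is sketched below. This is a hypothetical helper, not part of the software: it takes the identifier to be the first whitespace-delimited token of each FASTA header line, and it does not attempt to detect identifiers that resemble PDB identifiers.

```python
def long_identifiers(fasta_lines, max_len=50):
    """Return identifiers longer than max_len characters found in FASTA header lines."""
    flagged = []
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            # Assumption: the identifier is the first token after '>'.
            identifier = fields[0] if fields else ""
            if len(identifier) > max_len:
                flagged.append(identifier)
    return flagged

# Hypothetical FASTA content with one short and one over-long identifier.
example = [
    ">seq1 short identifier",
    "ACGTACGT",
    ">" + "contig_" + "x" * 60 + " long identifier",
    "ACGT",
]
print(long_identifiers(example))
```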
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 20.0.3.
- QIAGEN CLC Genomics Workbench 20.0.3
- QIAGEN CLC Main Workbench 20.0.3
- QIAGEN CLC Command Line Tools 20.0.3
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 20.0, 20.0.1 and 20.0.2, QIAGEN CLC Main Workbench 20.0, 20.0.1 and 20.0.2, and QIAGEN CLC Command Line Tools 20.0, 20.0.1 and 20.0.2 can also connect to QIAGEN CLC Genomics Server 20.0.3. Tools that have changed between versions cannot be launched when using compatible, but not corresponding, client-server combinations.
Advanced notice
- The following tools will be removed in a future release of the software:
- Create Combined RNA-Seq Report (legacy)
- Create Track from Experiment (legacy)
- Remove Reference Variants (legacy)
- Reverse Sequence (legacy)
- Roche 454 NGS import (legacy)
- Extract and Count (legacy)
- Download miRBase (legacy)
- Annotate and Merge Counts (legacy)
- The "Run in Batch Mode" functionality used for launching workflows with multiple inputs from a CLC Workbench is now legacy. Such workflows can now be launched in batch mode from a CLC Workbench by checking the "Batch" checkbox when selecting the input data.
If you are concerned about these proposed changes, please contact our Support team by emailing ts-bioinformatics@qiagen.com.
CLC Server Command Line Tools
Bug fixes
- Fixed an issue where workflows with locked inputs to Input elements would fail to run with the message "Workflow needs to be updated to run on this version of the software", or would incorrectly offer import command options. If you have experienced this problem, the affected workflow will need to be edited in CLC Genomics Workbench 20.0.3 or CLC Main Workbench 20.0.3, removing any Input elements with locked inputs and then adding them again. This new workflow version can then be installed on your CLC Genomics Server.
Compatibility
CLC Command Line Tools 20.0.3 is the corresponding client for QIAGEN CLC Genomics Server 20.0.3.
CLC Command Line Tools 20.0.3 can also act as a client for the QIAGEN CLC Genomics Server 20.0, 20.0.1 and 20.0.2. However, we recommend running the corresponding version.
QIAGEN CLC Genomics Server 20.0.2
Server specific
Shared with workbenches
Improvements
- When the "3' sequencing" Library type setting of RNA-Seq Analysis is selected when analyzing reads that have been annotated with UMIs by tools of the Biomedical Genomics Analysis plugin, expression values in the GE track are based on the number of distinct UMIs for each gene, rather than the number of reads.
- The running time of Fixed Ploidy Variant Detection and Low Frequency Variant Detection has been substantially improved for some data sets with large numbers of differences from the reference in localized regions of the read mapping.
- Metadata associations are preserved when importing .zip files containing both metadata tables and associated data.
- Various minor improvements
Bug fixes
- Fixed a bug in Create Heat Map for RNA-Seq affecting the "Fixed number of features" option, where one member of the set of most variable genes or transcripts was missing from those used in the analysis, with a slightly less variable feature included instead.
General information
NCBI plans to change their BLAST database folder structure in early February 2020. When that happens, the Download BLAST Databases tool of CLC Genomics Server 20.0.2, and future updates, will list the new, dbV5 BLAST databases for download. The Create BLAST Database tool will continue to create dbV4 databases. Databases of either version can be searched using the BLAST programs included in the QIAGEN CLC Genomics Server 20.x release line.
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 20.0.2.
- QIAGEN CLC Genomics Workbench 20.0.2
- QIAGEN CLC Main Workbench 20.0.2
- QIAGEN CLC Command Line Tools 20.0.2
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 20.0 and 20.0.1, QIAGEN CLC Main Workbench 20.0 and 20.0.1, and QIAGEN CLC Command Line Tools 20.0 and 20.0.1 can also connect to QIAGEN CLC Genomics Server 20.0.2. Tools that have changed between versions cannot be launched when using compatible, but not corresponding, client-server combinations.
Advanced notice
The following tools will be removed in a future release of the software:
- Create Combined RNA-Seq Report (legacy)
- Create Track from Experiment (legacy)
- Remove Reference Variants (legacy)
- Reverse Sequence (legacy)
- Roche 454 NGS import (legacy)
- Extract and Count (legacy)
- Download miRBase (legacy)
- Annotate and Merge Counts (legacy)
The "Run in Batch Mode" functionality used for launching workflows with multiple inputs from a CLC Workbench is now legacy. Such workflows can now be launched in batch mode from a CLC Workbench by checking the "Batch" checkbox when selecting the input data.
If you are concerned about these proposed changes, please contact our Support team by emailing ts-bioinformatics@qiagen.com.
CLC Server Command Line Tools
This is a compatibility release of the corresponding client for QIAGEN CLC Genomics Server 20.0.2.
Compatibility
CLC Command Line Tools 20.0.2 is the corresponding client for QIAGEN CLC Genomics Server 20.0.2.
CLC Command Line Tools 20.0.2 can also act as a client for the QIAGEN CLC Genomics Server 20.0 and 20.0.1. However, we recommend running the corresponding version.
QIAGEN CLC Genomics Server 20.0.1
Server specific
- Where direct data transfer from client systems has been enabled, temporary files created during data import from an Import/Export location to the CLC Genomics Server are now created in a hidden directory in that Import/Export location. Previously, these temporary files were placed within a subfolder called CLC_Uploads and were visible via the CLC Workbench import wizard.
- Classes other than the standard posixAccount and posixGroup classes can now be configured for LDAP setups using the new options "User object class" and "Group object class".
Shared with workbenches
Improvements
- Reports generated by Call Methylation Levels can now be used as inputs to the Combine Reports tool.
- Import of Gene Ontology Annotation files now supports files that include a byte order mark (BOM).
- The SRA toolkit has been updated from version 2.8.0 to 2.9.6.
- The hmmsearch program, used by Pfam Domain Search, is now 64-bit on all platforms. Previously a 32-bit version was distributed for use on macOS.
Bug fixes
- When exporting Oxford Nanopore alignments in SAM or BAM format, the platform specification is now exported as "ONT". Previously it was exported as "NANOPORE".
- Fixed an issue in the element history of a result generated by Extract Annotations, where the reference sequence track could appear to have been used for extracting annotations, even when it was not used.
- Fixed a bug in workflows using control flow elements, where some outputs were not saved if the same output channel was connected directly to both an output element and to a Collect and Distribute element. This problem occurred if the combined lengths of the input filenames for a given run exceeded 250 characters, and batch units were defined on the basis of the organization of the input data.
Compatibility
The following are the corresponding clients for the CLC Genomics Server 20.0.1.
- CLC Genomics Workbench 20.0.1
- CLC Main Workbench 20.0.1
- CLC Command Line Tools 20.0.1
We recommend running the corresponding versions of clients for CLC Genomics Server. However, CLC Genomics Workbench 20.0, CLC Main Workbench 20.0, and CLC Command Line Tools 20.0 can also connect to CLC Genomics Server 20.0.1. Tools that have changed between versions cannot be launched when using compatible, but not corresponding, client-server combinations.
Advanced notice
The following tools will be removed in a future release of the software:
- Create Combined RNA-Seq Report (legacy)
- Create Track from Experiment (legacy)
- Remove Reference Variants (legacy)
- Reverse Sequence (legacy)
- Roche 454 NGS import (legacy)
- Extract and Count (legacy)
- Download miRBase (legacy)
- Annotate and Merge Counts (legacy)
The "Run in Batch Mode" functionality used for launching workflows with multiple inputs from a CLC Workbench is now legacy. Such workflows can now be launched in batch mode from a CLC Workbench by checking the "Batch" checkbox when selecting the input data.
If you are concerned about these proposed changes, please contact our Support team by emailing ts-bioinformatics@qiagen.com.
CLC Server Command Line Tools
Bug fixes
- Fixed an issue where running a workflow from the command line failed when the input was a local file to be imported on the fly.
Compatibility
CLC Command Line Tools 20.0.1 is the corresponding client for CLC Genomics Server 20.0.1.
CLC Command Line Tools 20.0.1 can also act as a client for the CLC Genomics Server 20.0. However, we recommend running the version of the CLC Command Line Tools that corresponds to your CLC Genomics Server.
QIAGEN CLC Genomics Server 20.0
Server specific
New features
Workflows
- A new workflow queuing option has been introduced and queuing option labels have been updated for clarity.
- The new option, "Submit tasks in each workflow block to a single node", results in a CLC Grid Worker being launched for each iteration of a block and a CLC Grid Worker being launched for each additional block outside the iteration block(s). For workflows consisting of just one block, this option behaves just like the "Submit all tasks to a single node" option.
- The option previously called "Classic" is now labeled "Submit individual tasks to any available node"
- The option previously called "Single entity" is now labeled "Submit all tasks to a single node".
- The section of the web administrative interface where these options appear, previously called "Job queuing options", has been renamed "Workflow queuing options".
Improvements
External Applications
The administrative interface for External Applications has been reworked, providing better support for administration, configuration and collaborative development. Changes include:
- The External applications tab of the web administrative interface groups configurations according to the subfolder of the Toolbox the external application is configured to appear under. For each configuration, the version information, who last updated it, and when it was updated are shown.
- External applications can be dragged between the Toolbox folder groupings in the web administrative interface, making it easy to put them into the appropriate subfolder as seen in the Workbench Toolbox. Configurations can also be dragged into the Drafts area, which changes the status of that configuration and makes it no longer available for use by client software.
- The External Application configuration editor is now organized into tabs and includes a new tab for options related to Management.
- There is now a new Drafts area, for storing external application configurations that should not be available for use by client software. Previously, availability was managed using checkboxes beside each external application configuration name.
- When an external application configuration is saved, just the configuration for that one application is updated. Previously, the save action would save all configurations.
- A button for deleting external application configurations has been added, allowing individual or multiple external applications to be selected and then deleted in a single action. The option to create a backup copy is offered before the deletion is carried out.
General
- The memory limit for CLC Grid Workers can be configured in grid presets via the web administrative interface.
- There is finer control over the memory values that can be set when installing the QIAGEN CLC Genomics Server using the graphical installer.
- Workflow logs now include information about tools in a workflow that failed to run to completion.
- Copying files from a Workbench CLC_References location to the QIAGEN CLC Genomics Server CLC_References location is now faster.
- Minor improvements to the organization of options under the Job distribution tab of the web administrative interface.
- LDAP/AD groups containing characters that are illegal in XML 1.0 (e.g. control characters) are now omitted from the list of groups read from LDAP/AD. If this happens, error information is printed to the server log.
Bug fixes
- Fixed a bug where every primary output from a workflow element was produced, even when output elements were not connected to them. This fix means that workflows now produce the same set of outputs, whether run on a CLC Server or a CLC Workbench.
- Fixed a bug where Expression Browsers could not be exported if they contained GO annotation values with parentheses but no database reference.
- Fixed an issue where the "Job category" setting in a grid preset was not sent to the underlying grid system.
- Fixed an issue affecting external applications with parameters configured as "User selected files (Import/export directory)" with a default file specified. When such external applications are included in a workflow, a file specified in the workflow design, or, for unlocked parameters, a file selected when launching the workflow, will now take precedence over this default. Previously the file specified in the external application configuration was always used.
- Fixed an issue where, when installing the software using the command line or setting the memory limit using a vmoptions file, the memory limit actually set was different from that specified. For most systems, this discrepancy is small relative to the amount of memory on the machine. (For example, on a machine with 96 GB of memory, we expect the maximum discrepancy to be in the order of 2 GB.)
Changes
- The previous default option in the Direct data transfer from client systems area, "Files uploaded via temporary file location on server system(s) (legacy)", has been retired. When upgrading on a system with that option selected, the new default, "Not allowed" is applied in the updated version. The option "Files uploaded via Import/Export location" must be selected to enable data transfer from client systems to your QIAGEN CLC Genomics Server. This option can be configured at any time and that configuration will be preserved when you upgrade. Please see the manual for further details, including what types of transfers are affected by this setting.
- A new section called Cloud Settings is available under the Execution node settings of the web administrative interface. This section is present to support functionality associated with the Cloud Server Plugin, due for release early in 2020.
Shared with CLC Workbenches
New features
Workflows
- NGS sequence data can be imported on the fly, as an initial action when a workflow is run, avoiding the need to import the data prior to launching the workflow.
- When launching workflows, batch units can now be defined using metadata, supplied either as a CLC metadata table or by selecting an Excel format file containing information about the data.
- Workflows with multiple inputs, where those inputs should be matched with each other, can now be launched in batch mode, making use of the ability to define batch units based on metadata. For example, a workflow where sets of reads should be mapped to different reference sequences can now be launched in batch mode.
- Two new workflow elements have been introduced, Iterate and Collect and Distribute, which allow workflows to be designed where the execution of different parts of a workflow can be finely controlled. For example, using these elements, a single workflow can contain an RNA-Seq analysis step, typically run once per sample, as well as a Differential Expression for RNA-Seq step, typically run once for a set of samples. Similarly, a single workflow can be designed to run batches of trio analyses, producing cohort-level reports as outputs.
- Workflows now produce a Workflow Result Metadata table, which contains one row per output, with the relevant data element associated with that row. When launched in batch mode, the batch the row relates to is clearly indicated.
Epigenomics analysis
- Tools for detecting peaks in sequencing data are now available from a new 'Advanced Peak Shape Tools' folder found in the Epigenomics Analysis folder of the Toolbox:
- Learn Peak Shape Filter
- Apply Peak Shape Filter
- Score Regions
These tools were formerly available via the Advanced Peak Shape Tools Server Plugin (beta).
- A tool for detecting evidence for histone acetylation marks in genes or other predefined genomic regions is now in the Epigenomics Analysis folder of the Toolbox:
- Histone ChIP-Seq
This tool was formerly available via the Histone ChIP-Seq Server Plugin.
miRNA analysis (small RNA)
- Tools for analyzing miRNA data are now available:
- Quantify miRNA
- Annotate with RNAcentral Accession Numbers
- Create Combined miRNA Report
These tools were formerly available via the Biomedical Genomics Analysis Server Plugin.
Import and export
- Reports can now be exported in JSON format and in PDF format.
- A new option in the Illumina importer, "Join reads from different lanes", when enabled, merges fastq files from the same sequencing run but from different lanes into a single sequence list.
- Create Expression Browser can now use tables imported from CSV or Excel format files as an annotation resource. Using such tables, sorting and filtering can be done according to numeric annotation values as well as textual annotations.
- When exporting to PDF, there is now an option to export the history of the report.
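The idea behind "Join reads from different lanes" can be illustrated with a minimal sketch. It assumes files follow the common Illumina naming pattern <sample>_L<lane>_<R1|R2>_<chunk>.fastq(.gz); how the importer itself decides that files belong to the same sequencing run is not described here, so the pattern and the function name group_files_across_lanes are assumptions made for illustration only.

```python
import re
from collections import defaultdict

# Assumed Illumina-style file name: <sample>_L<lane>_<R1|R2>_<chunk>.fastq(.gz)
LANE_PATTERN = re.compile(
    r"^(?P<sample>.+)_L(?P<lane>\d{3})_(?P<read>R[12])_(?P<chunk>\d{3})\.fastq(?:\.gz)?$"
)

def group_files_across_lanes(filenames):
    """Group fastq file names that differ only by lane (hypothetical helper).

    Files that do not match the assumed pattern form their own group.
    """
    groups = defaultdict(list)
    for name in sorted(filenames):
        match = LANE_PATTERN.match(name)
        key = (match.group("sample"), match.group("read")) if match else (name, None)
        groups[key].append(name)
    return dict(groups)


files = [
    "A_S1_L001_R1_001.fastq.gz",
    "A_S1_L002_R1_001.fastq.gz",
    "B_S2_L001_R1_001.fastq.gz",
]
for key, members in group_files_across_lanes(files).items():
    print(key, members)
```

Each group corresponds to what would become a single sequence list when lanes are joined.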
Other new tools
- Combine Reports: Summarizes information from multiple reports and produces a single report. It can be used for combining different report types for a single sample, or combining reports for a set of samples.
- Create Variant Track Statistics Report: Creates a summary report for different types of variants in variant tracks.
RNA-seq Analysis
- A new option, "Library type setting", in the RNA-Seq Analysis tool offers the selection of "Bulk", for analysis of samples where reads are expected to be uniformly distributed across the full length of transcripts, or "3' sequencing", which tailors the output and report quality control for samples generated using low input 3' sequencing applications. "Bulk" is the default, and corresponds to the behavior of the tool in previous software versions.
- The definition of "Maximum number of hits for a read" in the RNA-Seq Analysis tool has been simplified. It now refers to the number of distinct places on the reference that a read maps best to. Previously, a more complex definition was used, involving checking for matches against genes and then against intergenic regions, with rules applied to the results.
- The report generated by RNA-Seq Analysis now includes the percentage of reads mapped to transcripts of particular length ranges, aiding the interpretation of the "Coverage along normalized transcript length" graph.
Trim Reads
- Sequences can now be trimmed to a fixed length from either the 3' or the 5' end.
- New options have been added to allow homopolymer trimming to be finely tuned.
- If "Trim ambiguous nucleotides" is enabled, ambiguous characters (e.g. N) at the end of sequences are removed, even if the number of these characters is lower than the limit set. Previously, such characters were left in place if their number was lower than the limit.
- When included in a workflow, Trim Reads now always produces an output when an output element is connected to it. This includes the following situations:
- Where no reads have been trimmed (either because all trimming options were deselected, or because none of the trim options matched any of the reads). In this case, the "Trimmed sequences" output will contain all input reads, "Discarded sequences" will be empty, and "Percentage trimmed" will be 100% in the report.
- Where all reads have been trimmed. In this case, the "Discarded sequences" output will contain all input reads, "Trimmed sequences" will be empty, and "Percentage trimmed" will be 0%.
BLAST
- A new option for the BLAST tool, "Filter out redundant results", when enabled, culls HSPs on a per subject sequence basis by removing HSPs that are completely enveloped by another HSP.
- The NCBI blast executables have been updated to version 2.9.0.
- The option "Choose filter to mask low complexity regions" has been renamed to "Mask low complexity regions".
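The effect of "Filter out redundant results" can be sketched as follows. This is an illustrative approximation, not the BLAST implementation: it assumes envelopment is judged on the query coordinates of each HSP, and it keeps both HSPs when two cover exactly the same range. The Hsp tuple and the function name are hypothetical.

```python
from collections import defaultdict, namedtuple

# Hypothetical minimal HSP record: subject name plus query start/end (inclusive).
Hsp = namedtuple("Hsp", ["subject", "start", "end"])

def filter_enveloped_hsps(hsps):
    """Drop HSPs that lie completely inside another HSP on the same subject."""
    by_subject = defaultdict(list)
    for hsp in hsps:
        by_subject[hsp.subject].append(hsp)

    kept = []
    for subject_hsps in by_subject.values():
        for hsp in subject_hsps:
            enveloped = any(
                other.start <= hsp.start
                and hsp.end <= other.end
                and (other.start, other.end) != (hsp.start, hsp.end)
                for other in subject_hsps
            )
            if not enveloped:
                kept.append(hsp)
    return kept


hits = [
    Hsp("s1", 1, 100),
    Hsp("s1", 10, 50),    # inside 1-100 on the same subject: removed
    Hsp("s1", 90, 120),   # overlaps but is not enveloped: kept
    Hsp("s2", 10, 50),    # different subject: kept
]
print(filter_enveloped_hsps(hits))
```

Because the filtering is done per subject sequence, an HSP is never removed on account of an HSP against a different subject.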
New options in analysis tools
- A new option for Local Realignment called "Allow guidance insertion mismatches" allows reads to be realigned using guidance insertions that have mismatches relative to the read sequences. This option is enabled by default.
- The creation of reads tracks (mappings) is now optional in the RNA-Seq Analysis tool.
- A new option in Copy Number Variant Detection (CNVs) called "Merge overlapping targets" allows overlapping target regions to be merged into one larger target region. CNV calls are made on this larger region.
- A new option called "Report unmethylated cytosines" is available for the Call Methylation Levels tool. When enabled, methylation levels are reported for all sites with read coverage, rather than only for sites with methylated cytosines.
- Two new options in Create Mapping Graph are available for generating coverage tracks for reads that mapped best to a single location on the reference sequence: "Specific read coverage" and "Paired read specific coverage".
Improvements
Workflows
- Placeholder-based naming of outputs in workflows can now be configured at a finer level: the {input} or {2} placeholder is now replaced by the name of the first workflow input by default. This can then be further configured to use the names of other inputs by specifying them by number after a colon in the placeholder. For example: {2:1,3} would be replaced by the names of workflow inputs number 1 and 3. Previously, a workflow output configured as {2} was replaced by a concatenation of all the workflow input names.
- The "Export to PDF" tool can now be used in workflows to export reports in PDF format.
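The placeholder behavior described above can be sketched in a few lines. This is an illustration only: the function name expand_input_placeholder is hypothetical, and the separator used when several input names are combined is an assumption, as the notes do not state it.

```python
import re

# Matches {input} or {2}, optionally followed by ":1,3"-style input numbers.
PLACEHOLDER = re.compile(r"\{(?:input|2)(?::([0-9]+(?:,[0-9]+)*))?\}")

def expand_input_placeholder(pattern, input_names, separator="-"):
    """Replace {input}/{2} placeholders with workflow input names.

    input_names is ordered and numbered from 1, as in the example {2:1,3}.
    The separator joining several names is an assumption.
    """
    def substitute(match):
        selection = match.group(1)
        if selection is None:
            # Default: only the name of the first workflow input is used.
            return input_names[0]
        indices = [int(number) for number in selection.split(",")]
        return separator.join(input_names[i - 1] for i in indices)

    return PLACEHOLDER.sub(substitute, pattern)


names = ["reads_A", "reference", "targets"]
print(expand_input_placeholder("{2} mapping", names))
print(expand_input_placeholder("{2:1,3} report", names))
```

With the earlier behavior, {2} would instead have expanded to a concatenation of all three input names.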
Performance improvements:
- Mapping of NGS reads on multicore systems is now approximately 25% faster. Tools benefiting from this improvement include Map Reads to References, Map Reads to Contigs and Map Bisulfite Reads to Reference.
- Saving analysis results to an SSD is now considerably faster.
- The import of ZIP files has been improved: temporary objects are cleaned up during the import process, reducing the required disk space.
- Moving and deleting many elements at once is now faster.
- Emptying the Recycle bin now takes place in the background.
- Basic Variant Detection, Fixed Ploidy Variant Detection, and Low Frequency Variant Detection have been optimized to work on machines with lower memory. The changes are most noticeable in situations where coverage is high or where many variants are called.
- Improved memory handling when working with read mappings with very high coverage.
- There are general performance improvements in the following areas:
- BLAST and Add attB Sites tools when using large sequence lists
- Making BLAST databases where most sequences have the same name.
Demultiplex Reads
- Sequences with a single mismatch to a barcode and that can be grouped unambiguously can now be demultiplexed.
- Demultiplex Reads is now multithreaded for faster execution.
- The percentage of reads in each group is now reported to one decimal place in the report.
- The percentage of reads not grouped is no longer included in the "Reads per barcode" plot in the report.
- Various other minor improvements
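Demultiplexing with a single permitted mismatch can be sketched as below. This is a simplified illustration, assuming barcodes of equal length compared position by position (Hamming distance); the function names are hypothetical and the tool's actual matching rules may differ.

```python
def hamming_distance(first, second):
    """Number of positions at which two equal-length strings differ."""
    return sum(a != b for a, b in zip(first, second))

def assign_barcode(read_tag, barcodes):
    """Return the barcode name for a read tag, or None if not grouped.

    barcodes maps barcode names to sequences. An exact match wins; otherwise
    the tag is grouped only if exactly one barcode is a single mismatch away.
    """
    for name, sequence in barcodes.items():
        if sequence == read_tag:
            return name
    near = [
        name
        for name, sequence in barcodes.items()
        if len(sequence) == len(read_tag) and hamming_distance(sequence, read_tag) == 1
    ]
    return near[0] if len(near) == 1 else None


barcodes = {"bc1": "ACGT", "bc2": "ACGA", "bc3": "TTTT"}
for tag in ["ACGT", "TTTA", "ACGC", "GGGG"]:
    print(tag, assign_barcode(tag, barcodes))
```

In this example, ACGC is one mismatch away from both bc1 and bc2, so it cannot be grouped unambiguously and is left ungrouped.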
QC reports
- Plots in the "Per-base analysis" section of the graphical report produced by QC for Sequencing Reads no longer include a value for base position 0. Values at position 0 in these plots previously were not meaningful.
- Base position numbering now starts with 1 in the coverage table of the supplementary report produced by the QC for Sequencing Reads tool. Previously the base position numbers started at 0.
- The reports produced by the QC for Read Mapping and QC for Targeted Sequencing tools now also include the median coverage.
- The coverage report generated by QC for Targeted Sequencing now includes the total length of target region positions with coverage below the specified level.
- QC for Targeted Sequencing has been updated:
- Reads mapping across the origin of circular references are now counted. (These were previously ignored.)
- Relevant warning messages are now written to the report when a target region track contains regions that overlap or that span the origin of circular references.
- An issue has been addressed where the number of mapped bases reported in the Target Region Overview section was not correct for tracks containing overlapping regions.
Import and export
- When importing BED files using the Import Tracks tool, only the first three columns (chromosome, start and end positions) are now required to match UCSC specifications for the BED format. Remaining columns that do not match these requirements will be imported as Var1, Var2, etc.
- The CSV importer has been updated:
- Values no longer need to be enclosed in quotation marks in the CSV file to be successfully imported.
- Data values starting with a numeric character but also containing non-numeric characters are now interpreted as text. Such values were previously converted to numbers and then only imported up to the first non-numeric character.
- The import of Nexus files has been updated to more closely match the format specifications.
- When selecting files to import from an import/export directory via a CLC Workbench, right-clicking on a folder name now brings up a menu with the options: "Add the content of a directory" or "Add the full content (recursively) of a directory".
- The "Excel 2010" and "Excel 97-2007" exporters now export NaN and +/-Infinity values to #N/A.
- When importing multiple files using the Standard Import, the process ends with an error if at least one of the files failed to import. The details of which file failed and why can be seen in the log.
- The GenBank exporter now replaces any spaces in annotation names with underscores.
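The change in how the CSV importer interprets mixed values can be illustrated with a small sketch. Both functions are hypothetical approximations written for this example: interpret_value mirrors the updated rule (a value is numeric only if the whole value parses as a number), and interpret_value_legacy approximates the earlier truncating behavior.

```python
import re

LEADING_NUMBER = re.compile(r"[0-9]+(?:\.[0-9]+)?")

def strip_quotes(raw):
    """Remove surrounding whitespace and one pair of enclosing double quotes."""
    text = raw.strip()
    if len(text) >= 2 and text[0] == text[-1] == '"':
        return text[1:-1]
    return text

def interpret_value(raw):
    """Updated rule: numeric only if the entire value parses as a number."""
    text = strip_quotes(raw)
    try:
        return float(text)
    except ValueError:
        return text

def interpret_value_legacy(raw):
    """Approximation of the earlier rule: keep only the leading numeric part."""
    text = strip_quotes(raw)
    match = LEADING_NUMBER.match(text)
    return float(match.group()) if match else text


for value in ["42", '"7.5"', "12abc", "chr1"]:
    print(repr(value), interpret_value(value), interpret_value_legacy(value))
```

The value 12abc shows the difference: it is kept as text under the updated rule, where the earlier rule would have imported it as the number 12.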
Metadata related
- Metadata tables can be moved to a new File System Location while maintaining metadata associations.
- Three matching schemes are now available for associating data with metadata, based on matching data element names with values in the metadata key column: Exact, Prefix and Suffix. Suffix is a new option, where matches are sought starting from the end of data element names. Prefix, previously named Partial, looks for matches from the beginning of data element names. Exact, as in earlier versions, seeks exact matches between data element names and metadata key column values.
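The three matching schemes can be sketched as follows. This is an illustration under stated assumptions: a key matches under Prefix when the data element name starts with it and under Suffix when the name ends with it, and where several keys match, the longest is preferred. That tie-breaking rule and the function name match_metadata_key are assumptions, not documented behavior.

```python
def match_metadata_key(element_name, keys, scheme):
    """Return the metadata key matching a data element name, or None.

    scheme is one of "exact", "prefix" or "suffix".
    """
    if scheme == "exact":
        candidates = [key for key in keys if element_name == key]
    elif scheme == "prefix":
        candidates = [key for key in keys if element_name.startswith(key)]
    elif scheme == "suffix":
        candidates = [key for key in keys if element_name.endswith(key)]
    else:
        raise ValueError(f"Unknown matching scheme: {scheme}")
    # Assumption: when several keys match, the longest one is preferred.
    return max(candidates, key=len, default=None)


keys = ["S1", "S10"]
print(match_metadata_key("S1", keys, "exact"))
print(match_metadata_key("S10_R1 (paired)", keys, "prefix"))
print(match_metadata_key("trimmed_S1", keys, "suffix"))
```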
Create Box Plot
- Create Box Plot now calculates the median and percentile values in the same way as the "quantile" method in R. This aligns with the way these values are calculated by other tools in the QIAGEN CLC Genomics Workbench.
- Whiskers of box plots now range from the lowest data point within 1.5 times the interquartile range (IQR) of the lower quartile to the highest data point within 1.5 times the IQR of the upper quartile. Previously, they extended 1.5 times the length of the box (IQR).
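The box plot calculation described above can be sketched as follows. The sketch assumes the default interpolation of R's quantile function (type 7); the notes do not state which quantile type is used, so this choice and the function names are assumptions.

```python
import math

def quantile_type7(sorted_values, probability):
    """Linear-interpolation quantile, as in R's default (type 7); an assumption."""
    position = (len(sorted_values) - 1) * probability
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])

def box_plot_summary(values):
    """Quartiles plus whiskers at the most extreme points within 1.5 * IQR."""
    data = sorted(values)
    q1 = quantile_type7(data, 0.25)
    median = quantile_type7(data, 0.50)
    q3 = quantile_type7(data, 0.75)
    iqr = q3 - q1
    # Whiskers end at actual data points, not at the fences themselves.
    low_whisker = min(v for v in data if v >= q1 - 1.5 * iqr)
    high_whisker = max(v for v in data if v <= q3 + 1.5 * iqr)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "low_whisker": low_whisker,
        "high_whisker": high_whisker,
    }


print(box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100]))
```

In this example the value 100 lies beyond the upper fence, so the upper whisker ends at 9 rather than extending a fixed 1.5 times the IQR above the box.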
Improvements to other analysis tools
- The algorithm used to auto-detect paired distances when mapping NGS reads has been improved. Tools benefiting from this include Map Reads to References, RNA-Seq Analysis, Map Reads to Contigs and Map Bisulfite Reads to Reference.
- Improvements to SRA download:
- The temporary disk space needed to download data has been reduced significantly.
- Technical reads are now discarded.
- Orphan reads are now put into a separate output for paired data.
- When importing multiple files containing sequencing reads (QIAGEN GeneReader, Illumina, PacBio, Fasta Read, Ion Torrent) or when importing SAM format files, a single problematic file does not stop the import. The import process now continues with the next file if it encounters a file that could not be imported.
- The "Chromosome coverages" section in the results report produced by the Copy Number Variant Detection (CNVs) tool is now a table.
- The "CPM" expression option in the side panel setting of the Expression Browser has been renamed "CPM (TMM-adjusted)" to reflect how it is calculated.
- The TMM Normalization used in the Expression Browser, in Create Heat Map for RNA-Seq, PCA for RNA-Seq, Differential Expression in Two Groups, and Differential Expression for RNA-Seq, has been changed. The change involves how a reference column is selected for TMM Normalization, and is unlikely to lead to noticeable differences in results. Changes are most likely to occur in situations where the majority of transcripts/genes have zero expression.
- The "All group pairs" and "Across groups (ANOVA-like)" comparisons in Differential Expression for RNA-Seq now compare expressions in the same direction. Previously, the fold changes reported by these 2 tests for the same data, entered in an identical order, had opposite signs.
- The long form of the HGVS nomenclature for DNA is now used by the Amino Acid Changes tool for annotating coding region changes: the bases of deletions and duplications not longer than 50 nucleotides are included, and repeated sequences are reported using the insertion form.
- Exon information added by Annotate with Exon Numbers now includes a blank entry if a variant is located in an intronic region. For locations with multiple isoforms annotated, this gives a one to one relationship between the number of exon annotations and the number of isoforms.
- InDels and Structural Variants now consistently assigns a count of 1 for a paired read, leading to improved statistics. Previously, regions where the R1 and R2 reads overlapped were assigned a count of 2.
- Filter Variants on Custom Criteria now prints a message to the log if any columns specified in the criteria are not present in the data.
- The QC for Targeted Sequencing tool now sets the direction of each read in a pair independently, which can lead to more accurate forward and reverse coverage values in some situations.
- Identify Shared Variants now reports homozygous sample frequency, heterozygous sample frequency and mean allele frequency.
Other improvements
- Outputs of tools provided by plugins now include the plugin name and version in the element history.
- Tool and workflow logs now include an "Elapsed time" column.
- CLC URLs have been made more compact.
Bug fixes
- Fixed a rare issue that could cause some jobs to fail when multiple instances of Filter Variants on Custom Criteria were run simultaneously.
- Fixed a bug where failing import of Illumina .fastq files could leave files in the temporary files directory.
- Fixed an issue in the "Duplicated sequences" section of the QC for Sequencing Reads graphical report, where the relative sequence count for the duplicate count of 100 was incorrectly reported in the field for the duplicate count of 99.
- Fixed a bug where Expression Browsers could not be exported if they contained GO annotation values that included parentheses but no database reference.
- Fixed an issue where moving a folder within a server location occasionally caused the contents of the folder to become corrupt, until the persistence was reindexed.
- Fixed an issue in RNA-Seq Analysis where an error message was produced if the value entered for "Minimum read count fusion gene table" was 1 and no fusions were found.
- Fixed an issue in Copy Number Variant Detection (CNVs) algorithm reports, where values in the "Start BIC" and "End BIC" columns in section 3.1 were truncated to a maximum of 4 digits in the integer part. The underlying calculations were not affected.
- Fixed an issue where the Gene Set Test tool did not exclude relevant GO terms as computationally inferred if there were parentheses in the GO annotation description.
- Fixed an issue in the Basic Variant Detection, Fixed Ploidy Variant Detection and Low Frequency Variant Detection tools that in rare cases could result in the QUAL value reported being slightly different between runs.
- Fixed an issue in Basic Variant Detection, Fixed Ploidy Variant Detection and Low Frequency Variant Detection where the tool could continue to use the CPU and write to disk even after a job was cancelled.
- Fixed an issue in the GFF2 importer with different representation of stop codons in the CDS regions due to differences in input formats.
- Fixed an issue in Map Reads to Reference where the summary statistics table in the report did not include paired read statistics for mappings with paired end reads if no reads were mapped in intact pairs.
- Various minor bug fixes
Changes
- The Java version bundled with QIAGEN CLC Genomics Server 20.0 is Java 11, where we use the JRE from AdoptOpenJDK.
- On Linux systems, the 'fontconfig' software package and at least one font package (e.g. dejavu-sans-font) are required, as the Java JRE no longer includes default fonts. The Linux installer for the QIAGEN CLC Genomics Server no longer includes the 'fontconfig' package, which must therefore be installed separately before running the QIAGEN CLC Genomics Server installer.
- The {input} or {2} placeholder used when naming outputs from workflows is now replaced by the name of the first workflow input by default. Previously, it was replaced by a concatenation of all the workflow input names.
- The tool Remove Orphan Reference Variants is now called Remove Homozygous Reference Variants.
- The naming of some outputs from some tools has been updated:
- Demultiplex Reads
- Grouped reads Now: <sample name> <Barcode name> Previously: <Barcode name>
- Ungrouped reads Now: <sample name> Not grouped Previously: Not grouped
- Report Now: <sample name> Demultiplex Reads report Previously: Demultiplex Reads report
- Where multiple sequence lists are provided as input, the name of the first selected sample is used as the sample name.
- Trim Reads
- Trimmed, paired sequences Now: <sample name> (paired, trimmed pairs) Previously: <sample name> (paired) trimmed (paired)
- Trimmed, broken pairs Now: <sample name> (paired, trimmed orphans) Previously: <sample name> (paired, trimmed orphans)
- Discarded sequences Now: <sample name> (discarded) Previously: <sample name> (discarded)
- Report Now: <sample name> report Previously: <sample name>(trim report)
- Where multiple sequence lists are provided as input to Trim Reads, the name of the first selected sample is used in the output names.
- RNA-Seq analysis tool and Map Reads to Reference
- Output names have been shortened: the content of the last set of parentheses of the input name is replaced in the output name by a new tag denoting the specific type of output. Previously, tags were added to the input names when forming the output name.
- The word "un-mapped" has been replaced with "unmapped" in output names.
- When unmapped reads outputs are added to metadata tables the inputs are associated with, they are now assigned the metadata role "Unmapped reads".
- The following are legacy tools. "(legacy)" has been appended to their names and they will be removed in a future version of the software.
- Create Combined RNA-Seq Report: The new "Combine Reports" tool includes this functionality, and should be used to combine RNA-Seq reports.
- Create Track from Experiment
- Reverse Sequence
- Small RNA Analysis
- Extract and Count
- Download miRBase
- Annotate and Merge Counts
- Remove Reference Variants: The functionality of this tool can be replicated using "Filter Variants on Custom Criteria" with relevant criteria. To remove reference variants where the alternate allele has already been filtered away, use the new tool Remove Homozygous Reference Variants.
Functionality retirement
The following tools have been retired:
- Identify Differentially Expressed Gene Groups and Pathways (legacy)
- Add Fold Changes (legacy)
- Add Information from Overlapping Genes (legacy)
- Create Fold Change Track (legacy)
- Download Reference Genome Data (legacy)
The import of the following formats is no longer supported:
- qseq
- scarf
Compatibility
The following are the corresponding client applications for QIAGEN CLC Genomics Server 20.0:
- QIAGEN CLC Genomics Workbench 20.0
- QIAGEN CLC Main Workbench 20.0
- CLC Command Line Tools 20.0
Plugin notes
Plugin retirements
Functionality of the following plugins has been integrated into the QIAGEN CLC Genomics Server:
- Histone ChIP-Seq Server Plugin
- Advanced Peak Shape Tools Server Plugin
Advanced notice
The following tools will be removed in a future release of the software:
- Create Combined RNA-Seq Report (legacy)
- Create Track from Experiment (legacy)
- Remove Reference Variants (legacy)
- Reverse Sequence (legacy)
- Roche 454 NGS import (legacy)
- Extract and Count (legacy)
- Download miRBase (legacy)
- Annotate and Merge Counts (legacy)
The "Run in Batch Mode" functionality used for launching workflows with multiple inputs from a CLC Workbench is now legacy. Such workflows can now be launched in batch mode from a CLC Workbench by checking the "Batch" checkbox when selecting the input data.
If you are concerned about these proposed changes, please contact our Support team by emailing ts-bioinformatics@qiagen.com.
CLC Server Command Line Tools 20.0
Please see the QIAGEN CLC Genomics Server 20.0 listings above for the details about the new tools and features listed here.
New tools
Epigenomics analysis
Tools for detecting peaks in sequencing data are now available and can be called using the following commands:
- peak_shape_apply_filter Apply Peak Shape Filter
- peak_shape_learn_filter Learn Peak Shape Filter
- peak_shape_score_regions Score Regions
- histone_chip_seq Histone ChIP-Seq
The first three of these commands were formerly available via the Advanced Peak Shape Tools Server Plugin (beta), and the fourth via the Histone ChIP-Seq Server Plugin.
miRNA analysis (small RNA)
Tools for analyzing miRNA data are now available and can be called using the following commands:
- create_combined_mirna_report Create Combined miRNA Report
- ann_w_rna_central_accession_numbers Annotate with RNAcentral Accession Numbers
- quantify_small_rna Quantify miRNA
These commands were formerly available via the Biomedical Genomics Analysis Server Plugin.
Import and export
- export_pdf Export report to PDF format
- export_report_json Export report to JSON format
Other new tools
- combine_reports Combine Reports
- variant_statistics_report Create Variant Track Statistics Report
New features for existing tools
- trim - new options added
- --fixed-length-trimming-end
- --fixed-length-trimming-max-length
- --fixed-length-trimming-trim
- --poly-a-trim
- --poly-c-trim
- --poly-g-trim
- --poly-t-trim
- --trim-3-prime
- --trim-5-prime
- ngs_import_illumina - new option added
- --join-lanes
- rna_seq - new options added
- --create-reads-track
- --library-type
- cnv_detection - new option added
- --merge-overlapping-targets
- mapping_graph_tracks - new options added
- --paired-end-specific-coverage
- --single-match-coverage
- blast - new options added
- --blastn-use-best-hit
- --blastp-use-best-hit
- --blastx-use-best-hit
- --tblastn-use-best-hit
- --tblastx-use-best-hit
Changes
- The Java version bundled with the CLC Server Command Line Tools has been updated to Java 11, and the JRE now comes from AdoptOpenJDK.
- remove_orphan_reference_variants should now be called using remove_homozygous_reference_variants.
- A new option, -L, has been added to support functionality associated with the Cloud Server Plugin, due for release early in 2020.
Commands changed
- ngs_import_iontorrent Removed options: --linker-sequence, --max-distance, --min-distance, --paired-reads, --read-orientation
- ngs_import_genereader Removed options: --discard-failed-reads, --illumina-trim, --max-distance, --miseq-demultiplexing, --paired-reads, --quality-score, --read-orientation
Commands removed
- add_fold_changes_to_variant
- add_info_overlapping_genes
- create_fc_from_expr_tracks_algo
- download_genome
- go_analysis_expression_change
Advanced Notice
- The remove_orphan_reference_variants command will be retired in a future version of the product. The remove_homozygous_reference_variants command runs the same tool, but using a new name.
If you are concerned about these proposed changes, please contact our Support team by emailing ts-bioinformatics@qiagen.com.
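When upgrading existing scripts to the 20.0 command line tools, it can help to scan them for the renamed and removed commands listed above. The following is a minimal sketch, not part of the product: the function name `audit_script` and the advice strings are hypothetical, and only the command names are taken from these notes.

```python
import re

# Command names taken from the "Changes" and "Commands removed" lists above.
RENAMED = {
    "remove_orphan_reference_variants": "remove_homozygous_reference_variants",
}
REMOVED = {
    "add_fold_changes_to_variant",
    "add_info_overlapping_genes",
    "create_fc_from_expr_tracks_algo",
    "download_genome",
    "go_analysis_expression_change",
}


def audit_script(text):
    """Return (line_number, command, advice) tuples for affected commands.

    Hypothetical helper: it only looks for the command names as whole
    tokens and knows nothing about how the script invokes them.
    """
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for token in re.findall(r"[A-Za-z0-9_]+", line):
            if token in RENAMED:
                findings.append((number, token, "use " + RENAMED[token]))
            elif token in REMOVED:
                findings.append((number, token, "removed in 20.0"))
    return findings
```

Calling `audit_script(open("my_pipeline.sh").read())` on a script of your own (the file name here is only an example) returns one entry per affected command, which can then be reviewed by hand.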
QIAGEN CLC Genomics Server 11.0.4
Server specific
Various minor improvements
Shared with workbenches
Improvements
- Download Blast Databases now retrieves databases from the dedicated version 4 archive ("v4" folder) at the NCBI. From February 4, 2020, this tool in older versions of the software retrieves version 5 (dbV5) blast databases, which cannot be searched using the BLAST search tool in this release line or earlier lines. (QIAGEN CLC Genomics Server 20.0 and higher can be used to download and search dbV5 blast databases.)
- The hmmsearch program, used by Pfam Domain Search, is now 64-bit on all platforms. Previously a 32-bit version was distributed for use on macOS.
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 11.0.4.
- QIAGEN CLC Genomics Workbench 12.0.4
- QIAGEN CLC Main Workbench 8.1.4
- QIAGEN CLC Command Line Tools 6.0.4
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 12.0, 12.0.1, 12.0.2 and 12.0.3, QIAGEN CLC Main Workbench 8.1, 8.1.1, 8.1.2 and 8.1.3, and QIAGEN CLC Command Line Tools 6.0, 6.0.1, 6.0.2 and 6.0.3 can also connect to QIAGEN CLC Genomics Server 11.0.4. Tools that have changed between versions cannot be launched when using compatible, but not corresponding, client-server combinations.
CLC Server Command Line Tools 6.0.4
Compatibility
QIAGEN CLC Command Line Tools 6.0.4 is the corresponding client for QIAGEN CLC Genomics Server 11.0.4.
QIAGEN CLC Command Line Tools 6.0.4 can also act as a client for the QIAGEN CLC Genomics Server 11.0, 11.0.1, 11.0.2 and 11.0.3. However, we recommend running the corresponding versions of client and server software.
QIAGEN CLC Genomics Server 11.0.3
Server specific
Improvements
- A new flag, -Dskip_lazytmp_cleanup=true, has been introduced to support grid workers sharing a temporary file location that does not have global file locks enabled.
Shared with workbenches
Improvements
- Improved the stability of workflow execution when the data is placed on a Network File System (NFS).
- The Basic Variant Detection, the Fixed Ploidy Variant Detection and the Low Frequency Variant Detection tools have been updated: Variants extending up to 50 nucleotides beyond either end of a target region are now reported in full, while variants extending even further will include only the first 50 nucleotides beyond the target region. Insertions at the right hand border of a target region are now considered to be a variant within the target region.
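The 50-nucleotide rule for variants that extend beyond a target region can be pictured with a small sketch. This is an illustration only, not the variant callers' implementation: the function name, the 0-based half-open coordinates, and the handling of non-overlapping variants are assumptions, and zero-length insertions at the region border are not modelled.

```python
MAX_OVERHANG = 50  # nucleotides reported beyond the target region, per the note above


def reported_span(variant_start, variant_end, target_start, target_end):
    """Clip a variant's span to at most MAX_OVERHANG nt beyond the target.

    Coordinates are 0-based, half-open intervals (an assumption made for
    this sketch). Returns None when the variant does not overlap the target.
    """
    if variant_end <= target_start or variant_start >= target_end:
        return None
    start = max(variant_start, target_start - MAX_OVERHANG)
    end = min(variant_end, target_end + MAX_OVERHANG)
    return (start, end)
```

For a target region spanning 100 to 120, a variant spanning 90 to 130 is kept in full, while a variant spanning 0 to 300 is reduced to 50 to 170.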
Bug fixes
- Fixed an issue causing the Bonferroni and FDR multiple testing corrections of the Differential Expression for RNA-Seq and Differential Expression in Two Groups tools to be calculated using a greater number of tests than were actually performed, resulting in the corrections being too strict. Further details...
- Fixed an issue where a large nucleotide sequence could be detected as a protein sequence when being extracted from a BLAST database.
- Fixed an issue in the Local Realignment tool that could affect the re-alignment of mappings created using the Create UMI Reads tool of the Biomedical Genomics Analysis plugin. This issue would sometimes lead a small minority of reads to be re-aligned differently in different runs. Further details...
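The effect of the multiple testing fix listed above can be illustrated with a short sketch showing how an inflated test count makes corrected p-values larger. This is not the tools' code. It assumes that "FDR" refers to the Benjamini-Hochberg procedure, which these notes do not state, and the function names and the `n_tests` parameter are hypothetical.

```python
def bonferroni(p_values, n_tests=None):
    """Bonferroni-corrected p-values; n_tests defaults to len(p_values)."""
    n = len(p_values) if n_tests is None else n_tests
    return [min(1.0, p * n) for p in p_values]


def benjamini_hochberg(p_values, n_tests=None):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure).

    Passing n_tests larger than len(p_values) mimics counting tests that
    were never performed, which is equivalent to adding tests with p = 1.
    """
    n = len(p_values) if n_tests is None else n_tests
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    adjusted = [0.0] * len(p_values)
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(len(order), 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted
```

With three p-values and `n_tests=6` instead of the true 3, every corrected value doubles (until capped at 1), which is the "too strict" behavior described in the fix.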
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 11.0.3.
- QIAGEN CLC Genomics Workbench 12.0.3
- QIAGEN CLC Main Workbench 8.1.3
- QIAGEN CLC Command Line Tools 6.0.3
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 12.0, 12.0.1, and 12.0.2, QIAGEN CLC Main Workbench 8.1, 8.1.1, and 8.1.2, and QIAGEN CLC Command Line Tools 6.0, 6.0.1, and 6.0.2 can also connect to QIAGEN CLC Genomics Server 11.0.3. Tools that have changed between versions cannot be launched when using compatible, but not corresponding, client-server combinations.
Advanced notice
Support for paired-end reads in the Ion Torrent importer will be retired and will not be available in the next major release of the software.
The following tools will be removed in a future release of the software:
- Roche 454 NGS import
- Compare Sample Variant Tracks
- Create Track from Experiment
- Identify Differentially Expressed Gene Groups and Pathways
- Add Fold Changes
- Add Information from Overlapping Genes
- Create Fold Change Track
- Download Reference Genome Data (The functionality via the Reference Data Manager is unaffected by this.)
If you are concerned about these proposed changes, please contact our Support team by emailing ts-bioinformatics@qiagen.com.
CLC Server Command Line Tools
Compatibility
CLC Command Line Tools 6.0.3 is the corresponding client for QIAGEN CLC Genomics Server 11.0.3.
CLC Command Line Tools 6.0.3 can also act as a client for the QIAGEN CLC Genomics Server 11.0, 11.0.1, and 11.0.2. However, we recommend running the corresponding version of the CLC Command Line Tools and QIAGEN CLC Genomics Server.
Advanced notice
Some tools previously only available for use with Biomedical-enabled QIAGEN CLC Genomics Servers are now available for any QIAGEN CLC Genomics Server 11.0 and higher, but are legacy tools and will be retired in a future release of the software. These tools are placed within the Legacy folder of QIAGEN CLC Genomics Workbench 12.x. The relevant QIAGEN CLC Server Command Line Tools commands are:
- add_fold_changes_to_variant
- add_info_overlapping_genes
- create_fc_from_expr_tracks_algo
- go_analysis_expression_change
QIAGEN CLC Genomics Server 11.0.2
Please also refer to the Latest Improvements listing for QIAGEN CLC Genomics Server 11.0.1 below to see all the changes that have taken place since QIAGEN CLC Genomics Server 11.0.
Shared with workbenches
Bug fixes
- Fixed an issue where an error could occasionally arise during workflow validation, for example when installing a workflow.
- Fixed an issue where analyses with a read mapping step could occasionally fail with an error if different references were being used at the same time, and the reference cache (the temporary disk space used for reference data structure files) exceeded the configured size limit.
Improvements
The maximum amount of temporary disk space for the read mapping reference cache (the temporary disk space used for reference data structure files) has been increased to 16 GB. It was previously set to 8 GB.
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 11.0.2.
- QIAGEN CLC Genomics Workbench 12.0.2
- QIAGEN CLC Main Workbench 8.1.2
- QIAGEN CLC Command Line Tools 6.0.2
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 12.0 and 12.0.1, QIAGEN CLC Main Workbench 8.1 and 8.1.1, and CLC Command Line Tools 6.0 and 6.0.1 can also connect to QIAGEN CLC Genomics Server 11.0.2. Tools that have changed between versions cannot be launched when using compatible, but not corresponding, client-server combinations.
Advanced notice
The following functionalities of NGS importers will be retired and will not be available in the next major release of the software:
- Support for paired-end reads in the Ion Torrent importer
- Import of SCARF and QSEQ format files in the Illumina importer
The following tools will be removed in a future release of the software:
- Roche 454 NGS import
- Compare Sample Variant Tracks
- Create Track from Experiment
- Identify Differentially Expressed Gene Groups and Pathways
- Add Fold Changes
- Add Information from Overlapping Genes
- Create Fold Change Track
- Download Reference Genome Data (The functionality via the Reference Data Manager is unaffected by this.)
If you are concerned about these proposed changes, please contact our Support team by emailing ts-bioinformatics@qiagen.com.
CLC Server Command Line Tools
This is a compatibility release to supply the corresponding client for QIAGEN CLC Genomics Server 11.0.2.
Advanced Notice
Some tools previously only available for use with Biomedical-enabled QIAGEN CLC Genomics Servers are now available for any QIAGEN CLC Genomics Server 11.0 and higher, but are legacy tools and will be retired in a future release of the software. These tools are placed within the Legacy folder of QIAGEN CLC Genomics Workbench 12.x. The relevant CLC Server Command Line Tools commands are:
- add_fold_changes_to_variant
- add_info_overlapping_genes
- create_fc_from_expr_tracks_algo
- go_analysis_expression_change
Compatibility
CLC Command Line Tools 6.0.2 is the corresponding client for QIAGEN CLC Genomics Server 11.0.2.
CLC Command Line Tools 6.0.2 can also act as a client for the QIAGEN CLC Genomics Server 11.0 and 11.0.1. However, we recommend running the corresponding version of the CLC Command Line Tools and QIAGEN CLC Genomics Server.
QIAGEN CLC Genomics Server 11.0.1
Bug fixes
- Fixed an issue caused by a bug in Java 10 where files resulting from an analysis were occasionally corrupted. Often this affected analysis log files, but could affect other outputs. This affected GPFS file systems, and could affect other distributed file systems. The QIAGEN CLC Genomics Server release line uses Java 10 and while we have worked around this issue so that analysis results files should no longer be affected by it, we recommend that the software itself not be installed directly on a GPFS file system.
Shared with Workbenches
Improvements
- Identify Shared Variants can now run on a single variant track, so that, when included in a workflow, it no longer requests more than one workflow connection (variant track) in order to compare variants shared across samples.
- Improved the run time of the Basic Variant Detection, Fixed Ploidy Variant Detection, and Low Frequency Variant Detection tools when running on empty non-circular chromosomes.
- Import of VCF files with reference overlap representation and filtered variants present is now supported. In QIAGEN CLC Genomics Workbench 12.0, filtered variant positions could interfere with reference overlap processing when importing VCF files.
Bug fixes
- Fixed an issue where variants of exactly the same size and location as a target region would not be called. This issue affected variant calling when the "Restrict calling to target regions" parameter was used in Basic Variant Detection, Fixed Ploidy Variant Detection, or Low Frequency Variant Detection. For example, a SNV would not be called in a target region of size 1 that covered the SNV, but would be called in a target region of size >1.
- Reference variants without exact matching non-reference variants are now retained if they partially overlap non-reference variants. This may affect users of the following tools: Annotate with Flanking Sequence, Annotate with Conservation Score, Annotate with Exon Numbers, Remove Variants Present in Control Reads, Remove Marginal Variants, Remove Orphan Reference Variants, Filter against Known Variants, Filter Based on Overlap, GO Enrichment Analysis, Link Variants to 3D Protein Structure, Predict Splice Site Effect, TRIO Analysis, Identify Shared Variants, Add Information from Overlapping Genes (legacy), Compare Simple Variant Tracks (legacy) and Remove Variants Found in Allele Frequency Community (from the Ingenuity Variant Analysis Server Plugin). Further details.
- Fixed a bug where the Local Realignment tool would fail when run on multiple inputs in a non-batching mode.
- Fixed an issue with the Amino Acid Changes tool where it would fail if there were circular chromosomes and the gene flanking option was enabled. Flanking checks have now been disabled for any exons/CDSs that span a circular chromosome origin.
- When enabling prioritization in Amino Acid Changes, the tool now annotates the highest prioritized transcript of all genes at a variant's position. Previously it only annotated the highest prioritized transcript for one gene on each strand.
- Fixed an issue where the RNA-Seq Analysis tool could fail when run using genes and transcripts from the Transcript Discovery tool if these contained a "biotype" column.
- Fixed an issue where, when used on miRNA data, the tools Differential Expression for RNA-Seq and Differential Expression in Two Groups would report two "FDR p-value" columns. The second column is now correctly labeled "Bonferroni corrected p-value".
- Fixed a rare issue where RNA-Seq Analysis would fail if a read that mapped to the start of a circular chromosome had an unaligned region at the start long enough that, had it aligned, it would have wrapped around the origin of the circular chromosome.
- Fixed a bug where, when running the Identify Graph Threshold Areas tool with a window size greater than 1, the last interval found was one nucleotide shorter than expected.
- Fixed a bug where the Download BLAST Databases tool sometimes failed with an error during download.
- Fixed a concurrency bug in the Copy Number Variant Detection tool, which very rarely resulted in the tool reporting all low-coverage targets on one or more chromosomes as false positive deletions.
- Fixed an issue where the Excel and PDF export of reports failed for reports that contained empty tables.
- Fixed an issue where text files did not have the expected ".txt" extension after a "Tab delimited text" export with the "Output as a single file" option selected.
- Fixed an issue where the PDF export of reports did not contain column headers if the first header was empty.
- The BED Export tool will now export empty values in the "name" column as dots "." and the BED Import tool now interprets dots "." in the "name" column as empty values.
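The dot convention for empty BED names described in the last item can be sketched as a pair of conversion helpers. This is an illustration of the convention only. The function names are hypothetical and do not correspond to anything in the BED Export or BED Import tools.

```python
def name_to_bed(name):
    """Write an empty name as "." so the BED name column is never blank."""
    return name if name else "."


def name_from_bed(field):
    """Read "." in the BED name column back as an empty name."""
    return "" if field == "." else field
```

Applying `name_from_bed(name_to_bed(x))` returns the original value for both empty and ordinary names, so an export followed by an import preserves empty names.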
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 11.0.1.
- QIAGEN CLC Genomics Workbench 12.0.1
- QIAGEN CLC Main Workbench 8.1.1
- QIAGEN CLC Command Line Tools 6.0.1
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 12.0, QIAGEN CLC Main Workbench 8.1, and CLC Command Line Tools 6.0 can also connect to QIAGEN CLC Genomics Server 11.0.1. Tools that have changed between versions cannot be launched when using compatible, but not corresponding, client-server combinations.
Advanced notice
The following functionalities of NGS importers will be retired and will not be available in the next major release of the software:
- Support for paired-end reads in the Ion Torrent importer
- Import of SCARF and QSEQ format files in the Illumina importer
The following tools will be removed in a future release of the software:
- Roche 454 NGS import
- Compare Sample Variant Tracks
- Create Track from Experiment
- Identify Differentially Expressed Gene Groups and Pathways
- Add Fold Changes
- Add Information from Overlapping Genes
- Create Fold Change Track
- Download Reference Genome Data (The functionality via the Reference Data Manager is unaffected by this.)
If you are concerned about these proposed changes, please contact our Support team by emailing ts-bioinformatics@qiagen.com.
CLC Server Command Line Tools
Changes
Compatibility
CLC Command Line Tools 6.0.1 is the corresponding client for QIAGEN CLC Genomics Server 11.0.1.
CLC Command Line Tools 6.0.1 can also act as a client for the QIAGEN CLC Genomics Server 11.0. However, we recommend running the corresponding version of the CLC Command Line Tools and QIAGEN CLC Genomics Server.
QIAGEN CLC Genomics Server 11.0
Server specific
Improvements
- The ability to log into the QIAGEN CLC Genomics Server can now be restricted to members of specified groups.
- Data created in the QIAGEN CLC Genomics Server 11.0 and stored in a file server location is internally compressed by default. Internal compression is also the default in the corresponding client Workbench software. It can be disabled using the admin interface for the server (under Main configuration | Data compression), and corresponding functionality is available in the client Workbenches. Data that is internally compressed can be exported to CLC or ZIP format without this compression using options provided in the client software.
- The tools delivered with a server plugin can now be listed by clicking on the name of an installed plugin under the Plugins tab in the web administrative interface.
- When cancelling or re-queuing jobs listed in the Queue tab of the web administrative interface, confirmation is now sought before any action is taken.
- The size of a file in a server Import/export location can now be seen within the import wizards of QIAGEN CLC Workbench client software by hovering the mouse cursor over the filename.
- Various minor improvements
Bug fixes
- Fixed an issue when using launchd on macOS, where the Restart option under the Status and Management tab of the web administrative interface would stop the server but not restart it, which could affect restarting the service when the system was rebooted and when the service was started up automatically at the end of installing an upgraded version.
- Various minor bugfixes
Changes
- Licenses for QIAGEN CLC Genomics Server modules on a grid node or job node setup are now only needed on the master server. Previously, licenses were also needed for the nodes.
- The license formerly required to run tools and workflows delivered as part of the Biomedical Genomics Server solution is no longer needed. To get access to tools and workflows relevant to biomedical genomics and QIAseq panel data analysis, a valid QIAGEN CLC Genomics Server license is needed and the Biomedical Genomics Analysis Server Plugin must be installed.
- For systems enabled to connect to an SQL database,
- JDBC drivers need to be installed before configuring a database location. Previously this was only the case for MySQL and Oracle databases.
- The H2 database is no longer supported and is not included with the product.
Important notes related to this release
A flag in the CLCGenomicsServer.vmoptions file must be removed when upgrading in place to the QIAGEN CLC Genomics Server 11.0 on macOS. Please delete "-d64" from the CLCGenomicsServer.vmoptions file, which can be found in the QIAGEN CLC Genomics Server installation area, and then restart the QIAGEN CLC Genomics Server service. The -d64 option is not supported by recent versions of Java. Its inclusion in the vmoptions file on macOS systems will stop the QIAGEN CLC Genomics Server from starting up.
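For administrators who prefer to script the removal of the flag, the following is a minimal sketch. It assumes that "-d64" sits on a line of its own in the vmoptions file, which should be checked by opening the file first. The function name is hypothetical, and the path to the file must be supplied by the caller because it depends on the installation.

```python
from pathlib import Path


def strip_d64(vmoptions_path):
    """Remove lines consisting solely of "-d64" from a vmoptions file.

    Assumes one option per line (an assumption of this sketch). Returns
    the number of lines removed; the file is rewritten only when needed.
    """
    path = Path(vmoptions_path)
    lines = path.read_text().splitlines()
    kept = [line for line in lines if line.strip() != "-d64"]
    removed = len(lines) - len(kept)
    if removed:
        path.write_text("\n".join(kept) + "\n")
    return removed
```

It is worth keeping a backup copy of the file before running anything like this. The server service still needs to be restarted afterwards, as described above.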
Shared with workbenches
New features
Bisulfite Sequencing Analysis
- Three tools for analyzing cytosine methylation data are now available: Map Bisulfite Reads to Reference, Call Methylation Levels, and Create RRBS-fragment Track. These tools reveal methylated cytosines genome wide and at single base level resolution, support statistical comparison between samples accommodating different experimental designs, and support reduced representation sequencing. These tools were formerly available via the Bisulfite Sequencing Server Plugin, but are now integrated into the server software.
- The Map Bisulfite Reads to Reference offers the option to enable global alignments to produce read mappings with no unaligned ends, which was not formerly possible.
- The default "cost of insertions and deletions" in the Map Bisulfite Reads to Reference tool is now "affine". This improves results on internal benchmarks because it breaks a symmetry in the default "linear" scoring for reads ending in homopolymers (which are abundant in bisulfite mapping due to in-silico conversion of the reads and references to a 3 letter alphabet). This symmetry meant that either a mismatch or an insertion could be introduced at the ends of some of these reads without changing the mapping score. In practice the mismatch is more plausible, and this is favored by the affine penalties.
New tools
- Import Primer Pairs for importing primer pair locations from a generic text format file or from a QIAGEN gene panel primer file. This tool was formerly only available in the Biomedical-enabled QIAGEN CLC Genomics Servers.
- Import QIAGEN GeneReader for importing QIAGEN GeneReader data.
- Copy Number Variation Detection (CNVs), for detecting copy number variations (CNVs) from targeted resequencing experiments. Using read mappings and target regions as input, it produces amplification and deletion annotations. This tool was formerly only available in the Biomedical-enabled QIAGEN CLC Genomics Servers.
- Remove Information from Variants for removing annotations on variants. This tool was formerly only available in the Biomedical-enabled QIAGEN CLC Genomics Servers.
- Remove Orphan Reference Variants for removing reference allele variants that lack a corresponding non-reference variant allele.
- Differential Expression in Two Groups to be used instead of the more general Differential Expression for RNA-Seq tool for testing differential expression between a single treatment group and a control group. Both these tools take the same input, but Differential Expression in Two Groups does not require a metadata table to describe the experimental design.
Improvements
Overall improvements
- Variant tracks now include Forward coverage and Reverse coverage annotations.
- The following tools have changed name:
- Coverage Analysis to Whole Genome Coverage Analysis
- Filter Marginal Variant Calls to Remove Marginal Variants
- Filter Reference Variants to Remove Reference Variants
- Create Detailed Mapping Report to QC for Read Mapping
- Create Statistics for Target Regions to QC for Targeted Sequencing
- Identify Candidate Variants to Filter Variants on Custom Criteria
- The Extract Reads Based on Overlap tool has been renamed to Extract Reads. In Extract Reads, the "Overlap tracks" parameter is now optional, so all reads in a mapping can be easily extracted if desired. The Extract Reads tool can also generate either reads tracks or sequence lists as output.
RNA-Seq Analysis tool improvements
- The RNA-Seq Analysis tool supports the alignment and quantification of reads that wrap around the ends of circular chromosomes.
- The tool caches the data structure used by the read mapper to map reads to known mRNA annotations. This reduces run time by up to 3 minutes per sample, with the greatest benefits being observed when using large numbers of mRNA annotations on systems with few cpu cores.
- A new row has been added to the "Strand specificity" section of the report produced by the tool. The row contains the number of "Reads with known strand", which is used in determining the percentage of reads ignored due to being on the wrong strand.
- The "Detected transcripts" column has been renamed to "Uniquely identified transcripts" for both the gene-level and transcript-level expression tracks.
- The "Reference Sequence" section of the report now lists the number and length of all chromosomes used during read mapping. Previously it reported only the length and number of chromosomes with at least one gene or transcript.
- The RNA-Seq Analysis and Map Reads to Reference tools can now share cached copies of the read mapper indexes. This means that the average run time over many samples will be reduced if both tools are frequently used.
- The RNA-Seq Analysis tool is now sometimes able to avoid writing the reference to disk. The changes are most noticeable when batch processing many samples against the same large reference.
- Read mappings produced by the RNA-Seq Analysis tool previously ignored deletions and insertions at exon-intron boundaries. This meant that such deletions/insertions would not be detectable in downstream variant calling. The tool has been updated to keep the deletions and insertions in the mapping, implicitly favoring the hypothesis of a deletion/insertion over a novel splice junction. This change does not affect expression levels.
Amino Acid Changes tool improvements
- The Amino Acid Changes tool previously used square brackets to describe coding region and amino acid changes when a single variant affected multiple transcripts or proteins, e.g., NM_207170.3:c.[140C>T]; NM_015484.4:c.[266C>T]. These brackets have now been removed (e.g., NM_207170.3:c.140C>T; NM_015484.4:c.266C>T) to comply with the HGVS standards, which reserve the brackets for the reporting of alleles. These changes are also reported by the variant callers when run on a standalone read-mapping with CDS annotations.
- The tool describes replacements in the compact format preferred by HGVS (112_117delinsTG). Previously the description included the reference sequence (112_117delAGGTCAinsTG). These changes are also reported by the variant callers when run on a standalone read-mapping with CDS annotations.
- The tool now applies the HGVS 3' rule to c. annotations of variants. For p. annotations (protein-level HGVS), insertions that are in fact duplications are similarly annotated as duplications.
- The tool uses all positions covered by a variant when describing coding region changes, in accordance with HGVS recommendations. Previously the tool restricted its change descriptions to positions within a transcript (if supplied) or CDS. This fix will therefore mainly affect the descriptions of deletions that partially overlap a transcript. These changes are also reported by the variant callers when run on a standalone read-mapping with CDS annotations.
- An option can add c. annotations (HGVS DNA-level) for variants that are within a certain distance from the transcript boundaries. The distance can be configured but defaults are set to 5 kb upstream and 3 kb downstream.
- An option in the Amino Acid Changes tool allows users to output an HGVS-compliant variant track.
- An option allows the prioritization of a single transcript when several annotations are available for one variant.
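The two notation changes described above (removal of square brackets and the compact delins format) can be illustrated by converting old-style descriptions into the new style. This sketch is plain string post-processing under the assumption that the input follows the old examples quoted above. It is not how the Amino Acid Changes tool derives its annotations, and the function names are hypothetical.

```python
import re


def strip_allele_brackets(description):
    """Drop square brackets, e.g. 'c.[140C>T]' becomes 'c.140C>T'."""
    return re.sub(r"\[([^\]]*)\]", r"\1", description)


def compact_delins(description):
    """Drop the deleted reference bases, e.g. 'delAGGTCAinsTG' becomes 'delinsTG'."""
    return re.sub(r"del[ACGT]+ins", "delins", description)
```

Applied to the examples in these notes, `strip_allele_brackets("NM_207170.3:c.[140C>T]")` gives `NM_207170.3:c.140C>T` and `compact_delins("112_117delAGGTCAinsTG")` gives `112_117delinsTG`.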
VCF importer and exporter improvements
- The VCF exporter and importer have been improved and now support VCF v4.2.
- The VCF Export "Enforce diploid" option has been replaced with an improved and more general "Enforce ploidy" option, set by default to 2. This option gives more control over the exported genotype and better compatibility with external applications such as Ingenuity Variant Analysis.
- Four complex variant representations can now be handled by the VCF importer and exporter, including the common reference overlap representation.
- The VCF exporter has an option to write variant annotations as INFO fields.
- In the VCF importer, we fixed an issue with the import of INFO IDs that contained non-alphabetical characters.
BED importer and exporter improvements
- The BED exporter now replaces spaces in feature names with underscores, since white space is not allowed in the BED feature names.
- The BED file exporter now always exports to BED12 format.
- The BED importer limit for name lengths has been raised from 80 to 256 characters.
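To make the BED export behavior above concrete, here is a sketch that writes a single-block feature as a 12-column BED line with spaces in the name replaced by underscores. The column order follows the public BED format description. The function name, the default score and strand, and the choice of "0" for itemRgb are assumptions of this sketch and do not describe the exporter's actual output for any given track.

```python
def bed12_line(chrom, start, end, name, score=0, strand="."):
    """Format one single-block feature as a tab-separated BED12 line.

    Spaces in the name are replaced with underscores, as described in the
    note above. thickStart/thickEnd mirror the feature bounds, and the
    feature is written as one block covering its full length.
    """
    safe_name = name.replace(" ", "_")
    fields = [
        chrom, start, end, safe_name, score, strand,
        start, end,          # thickStart, thickEnd
        "0",                 # itemRgb
        1,                   # blockCount
        f"{end - start},",   # blockSizes
        "0,",                # blockStarts
    ]
    return "\t".join(str(field) for field in fields)
```

For example, `bed12_line("chr1", 100, 200, "my gene")` yields a line whose fourth column is `my_gene` and which has exactly 12 tab-separated columns.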
Various tools improvements
- The Local Realignment tool has been optimized to run more quickly.
- The De Novo Assembly tool has been updated to use the same version of the read mapper as the one used by the Map Reads to Contigs tool. This typically leads to more accurate mappings. For larger assemblies the run time is expected to decrease on average, but for small assemblies run time is likely to increase.
- Filter Against Known Variants no longer adds duplicate annotations from known variants tracks. In addition, Overlap, Exact match and Partial MNV match annotations are now always added to the output variant track.
- The Import Ion Torrent and Import PacBio tools support import of reads from SAM or BAM format files. Mapping information is discarded during this import. To import a read mapping from SAM or BAM format files, use the existing Import | SAM/BAM Mapping Files... tool.
- Handling of RNA-Seq reads by the InDels and Structural Variants tool has been improved. This change affects breakpoint p-values and as a result, affects the number of breakpoints and variants reported. In addition, we have improved the calculations of the values reported for the "perfectly mapped" and "not perfectly mapped" breakpoint annotations.
- Improvements in the Differential Expression for RNA-Seq tool
- It now accepts RNA-Seq panel samples (including QIAseq panel samples) as input and offers additional normalization options.
- It now outputs statistical comparison tables in addition to the statistical comparison tracks. Tables offer the same functionality as the tracks, except for the track view.
- Improvements to the Annotate with Overlap Information tool:
- It now adds "Fold Change" annotations if you annotate with a Statistical Comparison track.
- It now has an option to "Keep only one copy of duplicate annotations".
- The Download BLAST Databases tool now requires less disk space when downloading and installing BLAST databases.
- The history information associated with results from the BLAST and BLAST at NCBI tools now includes the version of the BLAST software used for the search.
- The Reverse Sequence tool now names the output sequence using the input sequence name followed by "-R".
- For the Gene Set Test tool, the columns "Occurrences in all genes", "Genes (universe)", "Occurrences in subset" and "Genes (subset)" have been renamed to "Detected Genes", "Detected Genes (Names)", "DE Genes" and "DE Genes (Names)", respectively.
- For the GO Enrichment Analysis tool, the columns "Occurrences in all genes", "Genes (universe)", "Occurrences in sample" and "Genes (overlap)" have been renamed to "Matched Genes", "Matched Genes (Names)", "Genes with Variations" and "Genes with Variations (Names)", respectively.
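Scripts that read exported Gene Set Test or GO Enrichment Analysis tables by column header may need adjusting after the renames listed above. The following is a minimal Python sketch of a header translation. The function name update_header and the idea of handling the header as a list of strings are illustrative assumptions, and the old-to-new pairing assumes the names are listed in matching order in the notes.

```python
# Old -> new column names, taken from the two rename items above.
# Assumption: the old and new names are paired in the order they are listed.
GENE_SET_TEST_RENAMES = {
    "Occurrences in all genes": "Detected Genes",
    "Genes (universe)": "Detected Genes (Names)",
    "Occurrences in subset": "DE Genes",
    "Genes (subset)": "DE Genes (Names)",
}

GO_ENRICHMENT_RENAMES = {
    "Occurrences in all genes": "Matched Genes",
    "Genes (universe)": "Matched Genes (Names)",
    "Occurrences in sample": "Genes with Variations",
    "Genes (overlap)": "Genes with Variations (Names)",
}


def update_header(header, renames):
    """Return a copy of `header` with old names replaced.

    Names that are not in `renames` are kept unchanged, so the function
    is safe to apply to tables that already use the new names.
    """
    return [renames.get(name, name) for name in header]
```

Applying update_header to a header that already uses the new names returns it unchanged, so the same script can read tables produced before and after the rename.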
Bug fixes
- Fixed an issue affecting the Map Reads to Reference tool when it was included in a workflow, where if the References parameter was connected to an input, and a masking track was configured, an error was reported stating that the masking track was incompatible with the reference genome, whether or not it was compatible.
- Fixed an issue with the Low Frequency Variant Detection and Fixed Ploidy Variant Detection tools that caused a small minority of variants to go unreported under certain conditions expected to arise rarely.
- Fixed a bug in the Identify Candidate Variants tool (now called Filter Variants on Custom Criteria), where no results were returned when one or more criteria used a comparison operator with more than one term (e.g. ">=", "abs value <").
- Fixed a bug in the Import Tracks tool where one-nucleotide exons would be skipped during import of GTF files. As a consequence of this fix, the import of UCSC SNPs typed as exons is no longer supported.
- Fixed a bug where the "Unaligned end" field provided in the Breakpoint track output of the InDels and Structural Variants tool was left blank when the value should have been "Mixed consensus" on all but one chromosome. The field is now filled for all chromosomes.
- Fixed a bug that caused import of empty text files to stall.
- Fixed an issue found in the History of a result generated by the Extract Annotations tool, that would incorrectly show that a reference sequence track was used when it was not.
- Fixed an issue where specifying a reference cache size greater than 2GB was not possible when using a readmapper.properties file.
- The mapping tool used in tools involving a mapping stage, such as Map Reads to Reference, Map Reads to Contigs and RNA-Seq Analysis, has been updated:
- Fixed an issue that led to some deletions being reported as multiple, separate deletions instead of a single, larger deletion when affine gap costs were used.
- Fixed a very rare bug in the read mapper, where an alignment with a leading unaligned end could get a wrong score.
- On Windows 10 and Windows Server 2016, it now runs with 'below normal' as the priority. Previously, it ran with 'normal' priority.
- On Windows 10 and Windows Server 2016, the underlying program launched when running the Sample Reads tool now runs with 'below normal' as the priority. Previously, it ran with 'normal' priority.
Changes
- The underlying read mapper and de novo binaries included in the QIAGEN CLC Genomics Server 11.0 are from QIAGEN CLC Assembly Cell 5.1.1.
- The SOLiD Importer has been retired. It was previously in Legacy Tools. As a consequence:
- The tools Map Reads to Reference, Map Reads to Contigs, Trim Reads, De Novo Assembly, Extract and Count, and Annotate and Merge no longer have special handling of SOLiD colorspace data. They will continue to work as expected for SOLiD data, but will not make use of color information to correct for phase shifts.
- Import | SAM/BAM Mapping Files and Standard Import | Reads from SAM/BAM files no longer allow import of data where colorspace information is provided in the form of CS flags and sequence data is omitted (SEQ = "*").
- Export | SAM, Export | BAM, and Export | Fastq no longer have special handling of SOLiD colorspace data. They will continue to work as expected for SOLiD data, but will not make use of color information to correct for phase shifts.
- The *.cas importer found in Import -> Standard Import no longer allows the import of read mappings where SOLiD color information has been used as part of the mapping algorithm.
- The Import Tracks tool no longer supports the import of files in Complete Genomics master VAR file format. To import such files, it is necessary to first convert them to VCF using the tools provided by Complete Genomics.
- The column "Ignored reads (wrong strand)" has been removed from the "Strand specificity" section of the report produced by the Create Combined RNA-Seq Report tool. The column has been removed to better fit the report's purpose of only providing high-level relevant information.
Plugin notes
New plugins
- Biomedical Genomics Analysis Server Plugin 1.0: Installing this plugin on a QIAGEN CLC Genomics Server provides the functionality formerly available by installing a Biomedical Genomics Server Extension license on a QIAGEN CLC Genomics Server and installing the now-retired QIAseq Targeted Panel Analysis Server Plugin.
Plugin retirements
- Bisulfite Sequencing Server Plugin: The tools delivered by this plugin have been integrated into the QIAGEN CLC Genomics Server and can be launched from the QIAGEN CLC Genomics Workbench or CLC Command Line Tools client software.
- QIAseq Targeted Panel Analysis Server Plugin and QIAGEN GeneRead Panel Analysis Plugin: These plugins were formerly available for use on Biomedical-enabled QIAGEN CLC Genomics Servers. Their functionality is now available via the Biomedical Genomics Analysis Server Plugin when installed on a QIAGEN CLC Genomics Server.
Advanced notice
The following tools will be removed in a future release of the software:
- Roche 454 NGS import
- Compare Sample Variant Tracks
- Create Track from Experiment
- Identify Differentially Expressed Gene Groups and Pathways
- Add Fold Changes
- Add Information from Overlapping Genes
- Create Fold Change Track
- Download Reference Genome Data (The functionality via the Reference Data Manager is unaffected by this.)
If you are concerned about these proposed changes, please contact our Support team by emailing ts-bioinformatics@qiagen.com.
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 11.0
- QIAGEN CLC Genomics Workbench 12.0
- QIAGEN CLC Main Workbench 8.1
- CLC Command Line Tools 6.0
CLC Server Command Line Tools
Please see the QIAGEN CLC Genomics Server 12.0 listing for the details about the new tools and features listed below.
New tools
- Bisulfite Sequencing Analysis: Three tools for analyzing cytosine methylation data are now available and can be called using the following commands:
- bisulfite_call_methylation_levels
- bisulfite_create_rrbs_track
- bisulfite_read_mapping
These tools were formerly available via the Bisulfite Sequencing Server Plugin, but are now integrated into the server software.
- cnv_detection launches the Copy Number Variation Detection (CNVs) tool for detecting copy number variations (CNVs) from targeted resequencing experiments. This tool was formerly only available when running commands on a Biomedical-enabled QIAGEN CLC Genomics Server.
- differential_expression_two_groups launches the Differential Expression in Two Groups tool, which can be used instead of the more general Differential Expression for RNA-Seq tool for testing differential expression between a single treatment group and a control group.
- remove_information_from_variants is a tool for removing annotations on variants.
- remove_orphan_reference_variants removes reference allele variants that lack a corresponding non-reference variant allele.
- primer_pair_import launches the Import Primer Pairs tool, which imports primer pair locations from a generic text format file or from a QIAGEN gene panel primer file. This tool was formerly only available when running commands on a Biomedical-enabled QIAGEN CLC Genomics Server.
- ngs_import_genereader is an import tool for QIAGEN GeneReader data.
New features for existing tools
- differential_expression_rna_seq - new options added
- --choose-subset
- --custom-housekeeping-genes
- --housekeeping-genes-category
- --normalization-method
- --technology
- amino_acid_changes - new options added
- --downstream-flanking-bases
- --mrna-prioritized
- --output-hgvs-compliant-variant-track
- --upstream-flanking-bases
- download_sra - new option added
- --ncbi-api-key
- annotate_overlapping - new option added
- --no-duplicate-annotations
- extract_overlapping_reads - new option added
- --output-mode
- export -e clc - new option added
- --maxcompat When set to true, data is exported without internal data compression, allowing the data to be imported into older CLC software. Set to false by default.
- export -e zip - new option added
- --maxcompat When set to true, data is exported without internal data compression, allowing the data to be imported into older CLC software. Set to false by default.
- ls - new option added
- -e When running -A ls -e with a data element specified using the -t option, the export formats supported for that data element are listed.
Other improvements
- The ngs_import_iontorrent and ngs_import_pacbio tools now support the import of reads from SAM or BAM format files. Mapping information is discarded during such imports. To import a read mapping from SAM or BAM format files, use the ngs_import_sam tool.
- The clcserver command can now be run with just the -V option, which will return the version of the CLC Command Line Tools being used. By running the clcserver command with both the -V and -S options, the version of the QIAGEN CLC Genomics Server indicated will also be returned.
- The human readable URLs (ClcUrl Simple) returned when using the "-A ls" command are now of the form: <host:port form here - see comments> , that is, with the host and port of the QIAGEN CLC Genomics Server specified explicitly. Earlier, the generic URL form <generic form here - see comments> was used.
- Various minor improvements
Bug fixes
- Fixed an issue where the commands "list_workflows" and "uninstall_plugin_and_restart" would not work for installed workflows that had an associated icon.
Changes
- SOLiD colorspace data is no longer supported. Please see the QIAGEN CLC Genomics Server 11.0 listings for details. For the CLC Server Command Line Tools, this has resulted in the following changes:
Command removed: ngs_import_solid
Commands changed:
contig_read_mapping Removed options: --color-error-cost, --color-space
denovo_assembly Removed options: --long-reads-color-error-cost, --long-reads-color-space
read_mapping Removed options: --color-error-cost, --color-space
rna_seq Removed options: --color-error-cost, --color-space
small_rna_annotate Removed options: --color-space
small_rna_sampling Removed options: --color-space
trim Removed options: --color-space
Advanced Notice
Some tools previously only available for use with Biomedical-enabled QIAGEN CLC Genomics Servers are now available for any QIAGEN CLC Genomics Server 12.0, but are legacy tools and will be retired in a future release of the software. These tools are placed within the Legacy folder of QIAGEN CLC Genomics Workbench 12.0. The relevant CLC Server Command Line Tools commands are:
- add_fold_changes_to_variant
- add_info_overlapping_genes
- create_fc_from_expr_tracks_algo
- go_analysis_expression_change
QIAGEN CLC Genomics Server 10.0.2
Shared with workbenches
Bug fixes
- Workbench response times after logging into a QIAGEN CLC Genomics Server have been improved in the situation where many server jobs, submitted from the Workbench, had completed since the last login.
- Fixed a bug where the "Unaligned end" field provided in the Breakpoint track output of the InDels and Structural Variants tool was left blank when the value should have been "Mixed consensus" on all but one chromosome. The field is now filled for all chromosomes.
- Fixed an issue with the Low Frequency Variant Detection and Fixed Ploidy Variant Detection tools that caused a small minority of variants to go unreported under certain conditions expected to arise rarely.
- Fixed a bug in the Identify Candidate Variants tool where no results were returned when one or more criteria used a comparison operator with more than one term (e.g. ">=", "abs value <").
- Fixed an issue where specifying a reference cache size greater than 2GB was not possible when using a readmapper.properties file.
- Fixed a concurrency bug in the Copy Number Variant Detection tool, which very rarely resulted in the tool reporting all low-coverage targets on one or more chromosomes as false positive deletions. (Biomedical enabled Genomics Server only)
- Various minor bugfixes.
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 10.0.2.
- QIAGEN CLC Genomics Workbench 11.0.2
- Biomedical Genomics Workbench 5.0.2
- CLC Command Line Tools 5.0.2
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 11.0 and 11.0.1, Biomedical Genomics Workbench 5.0 and 5.0.1, and CLC Command Line Tools 5.0 and 5.0.1 can also connect to QIAGEN CLC Genomics Server 10.0.2. Tools that have changed between versions cannot be launched when using compatible, but not corresponding, client-server combinations.
Advanced notice
- SOLiD colorspace data support, including import, is not available in the next major release line of the software.
- Roche 454 NGS import is now a legacy tool. We have retained it in the next major release line of the software, but it may be retired in a future release.
If you are concerned about these changes, please contact the QIAGEN Bioinformatics team (ts-bioinformatics@qiagen.com).
CLC Server Command Line Tools
This is a compatibility release to supply the corresponding client for QIAGEN CLC Genomics Server 10.0.2.
Compatibility
CLC Command Line Tools 5.0.2 is the corresponding client for QIAGEN CLC Genomics Server 10.0.2.
CLC Command Line Tools 5.0.2 can also act as a client for the QIAGEN CLC Genomics Server 10.0 and 10.0.1. However, we recommend running the corresponding version of the CLC Command Line Tools for the QIAGEN CLC Genomics Server.
QIAGEN CLC Genomics Server 10.0.1
Server specific
- The option for exporting a history csv file when configuring the User-selected Input data (CLC data location) value is now listed as "History CSV (.csv)". It was shown as "Table Comma Separate Values (.csv)" in QIAGEN CLC Genomics Server 10.0.
- Fixed a bug preventing the Genomics server from (re)starting on a Spanish Windows installation.
Shared with workbenches
- Implemented the 3' HGVS compliance rule for c. annotation of variants:
- When doing c. annotations (DNA-level HGVS) we annotate insertions that really are duplications as such.
- For c. annotations we furthermore fulfill the 3' rule for insertions, deletions and duplications.
- When determining amino acid changes, the 3' rule is applied to the DNA change first. This may shift a variant in or out of the coding region, and that will affect whether or not we consider it as an amino acid change.
The 3' rule for p. annotations was previously fulfilled and is not affected by this fix.
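The 3' rule described above can be illustrated with a short Python sketch. It is not the implementation used by the software; it only shows the general idea that an insertion or deletion inside a repeat is moved to the most 3' position that yields the same sequence, and that an insertion copying the bases immediately 5' of it is described as a duplication. The function names, 0-based coordinates and the forward-strand assumption are all illustrative.

```python
def shift_deletion_3prime(ref, start, length):
    """Shift a deletion of `length` bases at 0-based `start` as far 3' as possible.

    The deletion can move one base to the right whenever the first deleted
    base equals the base just after the deleted window, because the
    resulting sequence is identical.
    """
    while start + length < len(ref) and ref[start] == ref[start + length]:
        start += 1
    return start


def shift_insertion_3prime(ref, pos, inserted):
    """Shift an insertion placed before 0-based `pos` as far 3' as possible.

    Each step rotates the inserted sequence by one base so that the
    resulting sequence stays the same.
    """
    while pos < len(ref) and ref[pos] == inserted[0]:
        inserted = inserted[1:] + inserted[0]
        pos += 1
    return pos, inserted


def is_duplication(ref, pos, inserted):
    """True if `inserted`, placed before `pos`, copies the bases just 5' of it."""
    return ref[max(0, pos - len(inserted)):pos] == inserted
```

For example, deleting one T from the run of four Ts in "GATTTTC" is reported at the last T, and inserting "CA" into the "CACA" repeat of "GCACAT" ends up after the repeat, where it qualifies as a duplication. Because the shift can move a variant across an exon boundary, this also shows why applying the rule first can change whether a variant is considered an amino acid change.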
- Fixed an issue where workflows with a VCF export element could not be run from a workbench on the QIAGEN CLC Genomics Server.
- Fixed a bug in the VCF (Variant Call Format) exporter that affected the QUAL score of the variant. Previously, the variant QUAL score was set to the maximum QUAL score of all alleles, regardless of whether an allele was a reference allele or not. In some instances, e.g., when there are two alleles and one has a poor QUAL score, this choice was suboptimal. The variant QUAL score is now the maximum QUAL score among all non-reference alleles.
- Fixed an issue where the RNA-Seq Analysis tool would show an error if the first chromosome or contig contained no transcripts and the "Calculate expression for genes without transcripts" option was used.
- Fixed an issue where the RNA-Seq Analysis tool would sometimes generate TE tracks that could not be used in downstream tools. The error occurred when the "Calculate expression for genes without transcripts" option was used on a gene track where two genes had the same name, one of the genes contained the other, and neither gene had a transcript.
- Fixed an issue with the Trim Reads tool used in a workflow with multiple Trim adapter lists as input: all but the first list input were previously silently ignored, but the workflow now gives users a warning message.
- Fixed an issue where a Trim Adapter List containing an adapter set to "Discard the read (end matches at 3')" was imported incorrectly.
- Fixed an issue that could cause some third party plugins to fail trying to retrieve the fastq exporter.
- Fixed an issue where domain annotations added by the Pfam Domain Search tool started one amino acid later than expected. The corresponding start position in the table produced by the tool was correct.
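The change to the VCF exporter's QUAL score described in this list can be sketched in a few lines of Python. The data layout (a list of (is_reference, qual) tuples per record) and the function names are assumptions made for illustration. How the exporter treats a record without any non-reference allele is not stated in these notes, so the sketch simply returns None in that case.

```python
def variant_qual_previous(alleles):
    """Previous behavior: maximum QUAL over all alleles, reference included."""
    return max(qual for _, qual in alleles)


def variant_qual_current(alleles):
    """Current behavior: maximum QUAL over non-reference alleles only.

    `alleles` is a list of (is_reference, qual) tuples for one record.
    Returns None when there is no non-reference allele (an assumption,
    the notes do not describe this case).
    """
    non_reference = [qual for is_reference, qual in alleles if not is_reference]
    return max(non_reference) if non_reference else None
```

With a confident reference allele (QUAL 200.0) and a weak variant allele (QUAL 12.5), the previous rule would report 200.0 for the record while the current rule reports 12.5.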
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 10.0.1
- QIAGEN CLC Genomics Workbench 11.0.1
- Biomedical Genomics Workbench 5.0.1
- CLC Command Line Tools 5.0.1
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 11.0, Biomedical Genomics Workbench 5.0, and CLC Command Line Tools 5.0 can also connect to QIAGEN CLC Genomics Server 10.0.1. Tools that have changed between versions cannot be launched when using compatible, but not corresponding, client-server combinations.
Advanced notice
- SOLiD colorspace data support, including import, will be retired and will not be available in the next major release of the software.
- Roche 454 NGS import is now a legacy tool. We plan to retain it in the next major release of the software, but it may be retired in a future release.
If you are concerned about these changes, please contact our Support team (AdvancedGenomicsSupport@qiagen.com).
CLC Server Command Line Tools
This is a compatibility release to supply the corresponding client for QIAGEN CLC Genomics Server 10.0.1.
Compatibility
CLC Command Line Tools 5.0.1 is the corresponding client for QIAGEN CLC Genomics Server 10.0.1.
CLC Command Line Tools 5.0.1 can also act as a client for the QIAGEN CLC Genomics Server 10.0. However, we recommend running the corresponding version of the CLC Command Line Tools for the QIAGEN CLC Genomics Server.
Advanced notice
- With a release planned for late 2018, only 64 bit versions of the CLC Server Command Line Tools will be made available. The 32 bit version will be discontinued from that time.
- The NGS import tool ngs_import_solid is now a legacy tool. It will not be available from the next major release of this software.
- The NGS import tool ngs_import_roche454 is now a legacy tool. We plan to retain it in the next major release of the software, but it may be retired in a future release.
If you are concerned about these changes, please contact our Support team (AdvancedGenomicsSupport@qiagen.com).
QIAGEN CLC Genomics Server 10.0.0
Server specific
Improvements and new features
- "Output file from CL" in External Applications (EA) configurations now supports the configuration of parameters for exporters. Existing EA configurations will continue to work using the older configuration, exactly as on older versions of the QIAGEN CLC Genomics Server. Only if a change is made, and saved, to an existing configuration of an "Output file from CL" entry, will the exporter be updated to support the configuration of parameters.
- The report generated by the "check setup" functionality now lists information about the installed licenses.
- External Applications can now be organized into subfolders of the External Applications area of the Workbench Toolbox.
- The time needed to complete the "Saving results" step of a workflow running on a QIAGEN CLC Genomics Server has been shortened.
- The column headings in the table containing statistics for each mapping, optionally produced by the QC for Read Mapping Tool (Bx-enabled servers only) and the Create Detailed Mapping Report tool, have been made more descriptive.
- Fixed an issue where the removal of terminated sub batch-processes from the Workbench Processes tab before the master process finished would produce an error dialog.
- Improved performance for several tools when handling genomes with many chromosomes. Examples include Annotate with Overlap Information, the BED Exporter, Filter Annotations On Name, and Motif Search and for Bx-enabled Servers, Add Fold Changes and Add Information from Overlapping Variants.
Bug fixes
- An issue was fixed that caused a Workbench restart to be necessary for running an External Application if it had been changed on the QIAGEN CLC Genomics Server since the time the Workbench user had logged into that server.
- Fixed an issue where only one post-processing step in an External Application configuration was listed next to an "Output file from CL" parameter, even when multiple post-processing steps had been linked to it.
- Fixed an issue where export to CLC or zip format led to an error when permissions were set on both the source QIAGEN CLC Genomics Server file location and also on the Import/Export location the data was to be exported to.
- Fixed an issue affecting unknown users trying to log into the QIAGEN CLC Genomics Server where the server was incorrectly configured with LDAP with Bind DN while pointing towards an Active Directory (AD) backend. Such logins will now fail. Previously an anonymous login would result.
Shared with workbenches
Improvements and new features
- Trim Reads:
- The Trim Sequences tool has been renamed to Trim Reads.
- A new option has been added to the Trim Reads tool: "Automatic read-through adapter trimming". This option makes it possible to automatically identify overlap in paired reads and will trim the region that is not part of that overlap. This option is turned on by default. This new default affects workflows that include Trim Reads (or by its former name: Trim Sequences); the parameter will be turned on and locked by default. For a Biomedical-enabled server, this change also affects the inbuilt workflow Prepare Raw Data.
- Trimming adapter:
- The New Trim Adapter List dialog has been updated to a new and more user-friendly interface.
- It is now possible to reverse complement an adapter sequence with a "Reverse Complement" button to the right of the sequence field.
- It is now possible to specify whether the trim should be performed on all reads, or only on the first or second read of a pair.
- A visual shows the adapter and the sequence being trimmed in relation to the rest of the sequence depending on the option chosen when an adapter is found.
- RNA-Seq Analysis:
- RPKM is now always calculated when running the RNA-Seq Analysis tool with the options "Genome annotated with genes only" and "One reference sequence per transcript".
- The default for the reference type parameter is now "Genome annotated with genes and transcripts".
- In the RNA-Seq Analysis tool, the option "Calculate RPKM for genes without transcripts" has been renamed to "Calculate expression for genes without transcripts".
- The behavior of the RNA-Seq Analysis tool has changed when the option "Genome annotated with genes and transcripts" is used together with the option "Calculate expression for genes without transcripts":
- The counts of genes without transcripts are calculated. Previously only the TPM and RPKM were calculated.
- For a gene without a corresponding transcript, where that gene is overlapped by the intron of another gene, reads aligning to this region are counted towards the expression of the gene without the transcript. Previously such reads were counted as belonging to the intronic region of the overlapping gene.
- A single-exon transcript for each gene without transcripts is now added to the output TE track.
- Paired sequence lists can now be exported to 2 fastq formatted files, one file containing the first member of each pair, the other containing the second member. This is now the default for Fastq Export when exporting paired data.
- The history of a data element can now be exported as a CSV format file.
- An option to include reads that partially overlap variants has been added to the Identify Known Mutations from Sample Mappings tool, enabling detection of variants that are longer than the reads.
- The Identify Known Mutations from Sample Mappings tool has been made slightly more strict when handling insertions and replacements, requiring reads to overlap adjacent reference positions to be counted as fully covering the variant.
- The speed of the Illumina High-Throughput Sequencing Import has been substantially improved. The largest gains are seen on paired read files compressed by gzip with speed improvements of up to 30%.
- The Download Pfam Database tool now downloads version 31. Updates can now be made independently of the release of the QIAGEN CLC Genomics Server, so the version available for download could change over time from the one recorded here.
- Clicking "Select genes in other views" in a Volcano Plot with an empty selection no longer gives an error message.
- When exporting files to SAM or BAM format files, information is now entered into the optional fields NM (edit distance) and MD (mismatch string).
- Importing a GO annotation file with the Standard Import tool, specifying the format "Generic annotation file for expression data", now fails with an informative warning if any of the GO annotations are truncated.
- Warnings are now reported if truncated GO annotations are found when opening data created by the Create Expression Browser tool.
- The NCBI BLAST executables have been upgraded to version 2.6.0.
- The Download Reference Genome Data tool now downloads genome annotations from GFF3 files. Previously, GTF files were used. Genome annotations for Homo sapiens versions hg18 and hg19 are still downloaded as GTF files, as these are not available as GFF3 files.
- HTML formatting tags are now removed during export of data to Excel .xlsx or .xls format. This change does not affect the export of hyperlinks.
- The history information for data generated using the Identify Candidate Variants tool now includes a match criteria field, recording if the option 'match all' or 'match any' was used.
- Parameters for the Trim Sequences tool are now shown in the same order when running the tool from the Toolbox or within a workflow.
- Map Reads to Reference now outputs an empty read mapping and report when the input contains 0 reads.
- A warning message is now presented when the tool Extract Sequences is run with the "Extract to single sequences" option selected and more than 100 sequences would result.
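For readers unfamiliar with the NM and MD fields now written on SAM/BAM export (see the item above), the following Python sketch derives both values from a gapped pairwise alignment, following the definitions in the SAM specification. It is an illustration only and not the exporter's code. The function name and the input convention (two equal-length strings with "-" marking gaps) are assumptions.

```python
def nm_and_md(ref_aln, read_aln):
    """Compute (NM, MD) for an aligned reference/read pair.

    Both strings must have equal length, with "-" marking a gap.
    NM counts mismatches plus inserted and deleted bases.
    MD lists run lengths of matches, mismatched reference bases, and
    deleted reference bases prefixed by "^". Insertions do not appear in MD.
    """
    if len(ref_aln) != len(read_aln):
        raise ValueError("aligned strings must have the same length")
    nm = 0
    matches = 0
    md = []
    in_deletion = False
    for ref_base, read_base in zip(ref_aln.upper(), read_aln.upper()):
        if ref_base == "-":          # insertion in the read
            nm += 1
            in_deletion = False
            continue
        if read_base == "-":         # deletion from the reference
            nm += 1
            if not in_deletion:
                md.append(str(matches) + "^")
                matches = 0
                in_deletion = True
            md.append(ref_base)
            continue
        in_deletion = False
        if ref_base == read_base:
            matches += 1
        else:                        # mismatch: record the reference base
            nm += 1
            md.append(str(matches) + ref_base)
            matches = 0
    md.append(str(matches))
    return nm, "".join(md)
```

For the alignment of reference "ACGTACGT" against read "ACGAAC-T" this gives NM 2 and MD "3T2^G1": three matches, a mismatch at a reference T, two matches, a deleted G and one final match.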
Changes
- The Roche 454 and SOLiD Import tools have been moved to the Legacy folder of the Workbench Toolbox.
- The option "Search on both strands" has been removed in the Trim Reads tool (formerly named Trim Sequences) and the Extract and Count tool.
- The Create Mapping Graph tool has been modified so that the coverage of overlapping paired end reads is now only counted as one in the overlapping region, instead of two as done previously.
- Removed the line "Total consensus length" from Detailed Mapping Report when using a Read Mapping Track as input, as these tracks do not contain consensus information.
- The SAM and BAM Mapping Files importer now fails if there are reads with more than one primary alignment where both are marked as being the first in a pair or both are marked as being second in a pair.
- The GCG sequence exporter has been removed. The GCG alignment exporter is unaffected by these changes.
- The underlying read mapper and de novo binaries included in the QIAGEN CLC Genomics Server 10.0 are from QIAGEN CLC Assembly Cell 5.0.5.
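The Create Mapping Graph change listed above, where the overlapping region of a read pair counts once instead of twice, can be sketched as follows. This is a simplified Python illustration under stated assumptions: reads are half-open (start, end) intervals on a single reference, and the function name and the count_overlap_once switch are hypothetical, added only to contrast the old and new behavior.

```python
def mapping_graph_coverage(length, pairs, count_overlap_once=True):
    """Per-position coverage over a reference of `length` bases.

    `pairs` is a list of ((start1, end1), (start2, end2)) tuples, one per
    read pair, using 0-based half-open coordinates.
    With count_overlap_once=True, positions covered by both reads of a
    pair add 1 to the coverage (new behavior). With False they add 2
    (previous behavior).
    """
    coverage = [0] * length
    for (start1, end1), (start2, end2) in pairs:
        if count_overlap_once:
            positions = set(range(start1, end1)) | set(range(start2, end2))
        else:
            positions = list(range(start1, end1)) + list(range(start2, end2))
        for pos in positions:
            coverage[pos] += 1
    return coverage
```

For a pair covering positions 0 to 5 and 4 to 9 on a 10 bp reference, the new behavior yields a coverage of 1 at every position, whereas the previous behavior yields 2 at positions 4 and 5.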
Bug fixes
- Fixed an issue where paired distances were calculated incorrectly for paired reads in Forward-Reverse orientation where there is adapter read-through. Paired distances can be seen in the report from the Map Reads to Reference tool and the RNA-Seq Analysis tool. The paired distance calculation is also used by the "auto-detect paired distances" option in these tools, although this issue is unlikely to affect the inferred distances.
- Fixed a bug where the Amino Acid Changes tool would in some cases use the CDS reference instead of the RNA reference for annotating coding region changes. This would happen if the RNA and CDS annotations could not be matched, and it could cause variants in UTR regions to not be reported. The matching has now been improved by supporting the 'parent' field used by the GFF3 file format to pair CDS and RNA references.
- Fixed a bug in the RNA-Seq Analysis tool where, when run in "Genes and transcripts" mode, and using "Total counts" as Expression value, the expression values reported for GE tracks would not include shared exon counts. Downstream analyses based on the Set Up Experiment tool could be affected by this issue. Using affected GE tracks as input to the following tools would *not* affect their results: Differential Expression for RNA-Seq, Create Heat Map for RNA-Seq and PCA for RNA-Seq.
- Fixed an issue where the option to run the Differential Expression for RNA-Seq tool in batch mode was made available, leading to an error if it was selected.
- Fixed an issue where the number of input samples to the Map Reads to Reference and Map Reads to Contigs tools would be silently limited to 120. The execution is now aborted with a warning message. Each analysis can be started with at most 120 samples.
- Fixed an issue with the mapping tool in the QIAGEN CLC Genomics Server, which is used in tools involving a mapping stage, such as Map Reads to Reference, Map Reads to Contigs and RNA-Seq Analysis, where length and similarity fraction cut-offs in some cases were ignored for reads longer than 500bp.
- Fixed an issue with the InDels and Structural Variants tool that caused it to crash if it encountered a particular set of conditions relating to reads with deletions.
- Fixed an issue with the InDels and Structural Variants tool where duplicate breakpoints and variants were reported if reads mapping as broken pairs were included in the analysis.
- An issue has been fixed so that it is now possible to export in BAM format reads that contain synonyms, for instance 'X' as synonym for 'N'.
- Fixed a bug that caused the fasta exporter to fail when exporting read mappings where one or more reference sequences had no reads mapped to them.
- Fixed an issue that could cause exports of reports with line graphs to fail.
- Fixed an issue where resetting the default parameter values when configuring the Identify Candidate Variants tool did not work.
- Fixed an issue that would prevent the Trim Sequences tool being run with certain length filter settings.
- Fixed a bug where a cell containing multiple hyperlinked URLs caused export to Excel 2010 or Excel 97-2007 format to fail. Such cell contents are now written in plain text.
- Contigs with Gap annotations covering regions longer than 10 bp can now be successfully exported to AGP format. Sequences containing such gaps will be split into separate contigs on export. This issue will be particularly of interest to those using the Join Contigs tool of the QIAGEN CLC Genome Finishing Module.
- Fixed an issue where the Low Frequency Variant Detection tool could return NaN for the Probability value in rare instances for small datasets.
- Fixed an issue with the QC for Target Sequencing tool (Bx-enabled servers only) and with the Create Statistics for Target Region tool, where "GC %" was reported as a ratio. It is now reported as a percentage.
- Fixed an issue with the Add Information about Amino Acid Changes tool (Bx-enabled servers only) and the Amino Acid Changes tool, when used with a circular sequence with a CDS annotation placed across the origin. Variants outside such a wrapped annotation could previously be incorrectly annotated with coding region changes.
- Fixed an issue with the Add Information about Amino Acid Changes tool (Bx-enabled servers only) and the Amino Acid Changes tool, when used with a circular sequence with an intron across the origin. Previously, nearby variants were not annotated with coding region changes. Now, variants that are in such introns and within 2 nucleotides of the nearest exon will be annotated with coding region changes, if such changes are identified.
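The two fixes above concern annotations that wrap across the origin of a circular sequence. As a minimal sketch of that situation, the Python below tests whether a position lies inside a region that may cross the origin. The function name `in_circular_region` and the 1-based inclusive coordinates are assumptions made for this example; it is not the tools' implementation.

```python
def in_circular_region(pos, start, end, seq_length):
    """Return True if 1-based position `pos` lies in the region start..end.

    On a circular sequence a region may wrap across the origin, in which
    case start > end (for example 9500..200 on a 10,000 bp sequence).
    """
    if not (1 <= pos <= seq_length):
        raise ValueError("position outside sequence")
    if start <= end:
        # Ordinary region that does not cross the origin.
        return start <= pos <= end
    # Wrapped region: covers start..seq_length and 1..end.
    return pos >= start or pos <= end
```

For a region 9500..200 on a 10,000 bp sequence, position 5000 is outside the region. A check that treated the same region as the linear span 200..9500 would wrongly include it, which is one way a variant outside a wrapped annotation could end up annotated as a coding change. The notes do not state whether this was the actual cause.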
Plugin Notes for Biomedical-enabled server
- A new plugin, QIAseq Targeted Panel Analysis 1.0, unifies the three QIAseq Targeted Panel plugins that were previously available: QIAseq DNA V3 Panel Analysis, QIAseq Targeted RNA Panel Analysis and QIAseq Targeted RNAscan Panel Analysis. The new plugin covers Targeted DNA for variant calling, Targeted RNA for differential expression and Targeted RNAscan for fusion gene detection with improvements resulting in more accurate variant calling and fusion gene detection.
Compatibility
The following software are the corresponding clients for the QIAGEN CLC Genomics Server 10.0:
- QIAGEN CLC Genomics Workbench 11.0
- Biomedical Genomics Workbench 5.0 (when the server has the Biomedical Extension)
- CLC Command Line Tools 5.0
Advanced notice
- SOLiD colorspace data support, including import, will be retired and will not be available in the next major release of the software.
- Roche 454 NGS import will be removed in a future release, but will still be available in the next major release of the software.
If you are concerned about the proposed changes, please contact our Support team (AdvancedGenomicsSupport@qiagen.com).
CLC Server Command Line Tools
Changes to existing tools
export -e fastq
- The default behavior of this tool has changed. To maintain current behavior, scripts using this exporter must have "--export-paired-reads-to-two-files false --one-file true" added to the command.
- --export-paired-reads-to-two-files has been added. This parameter is set to true by default.
- The default value of the --onefile parameter has been changed from true to false. Note that the --onefile parameter cannot be set to true when the --export-paired-reads-to-two-files parameter is also set to true. Having both set to true will cause the tool to fail with an error.
- -e gcg has been removed. Export to gcg format is no longer available.
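Scripts that build FASTQ export commands may want to catch the invalid option combination described above before submitting a job. The following Python sketch shows one way to do that. The function `fastq_export_options` is hypothetical; only the two option names are taken from these notes (using the --onefile spelling from the bullet describing its default), and the rest of the command line is deliberately left out.

```python
def fastq_export_options(onefile=False, paired_to_two_files=True):
    """Build the two FASTQ export options discussed above as an argument list.

    The defaults mirror the new defaults given in these notes. Only the
    option names come from the notes; everything else is illustrative.
    """
    if onefile and paired_to_two_files:
        # The notes state that the tool fails when both are set to true.
        raise ValueError(
            "--onefile and --export-paired-reads-to-two-files "
            "cannot both be true"
        )
    return [
        "--export-paired-reads-to-two-files", str(paired_to_two_files).lower(),
        "--onefile", str(onefile).lower(),
    ]
```

Calling `fastq_export_options(onefile=True, paired_to_two_files=False)` yields the pair of settings that the notes describe as preserving the earlier behavior.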
mutation_tester_tool
- --include-partially-covering-reads has been added. When set to true, reads that partially cover variants will be included when calculating the results, enabling detection of variants longer than the reads. Set to false by default.
small_rna_sampling
- The default behavior of this tool has changed. To maintain current behavior, scripts must have "--readthrough-trimming false" added to the command.
- --readthrough-trimming has been added. Detects overlaps in paired reads and trims the non-overlapping part away.
- --reverse-strand has been removed. This option is no longer used nor recognized. The default value in earlier versions was false.
trim
- The default behavior of this tool has changed. To maintain current behavior, scripts must have "--readthrough-trimming false" added to the command.
- --readthrough-trimming has been added and the default is set to true.
- --reverse-strand has been removed. This option is no longer used nor recognized. The default value in earlier versions was false.
New features
export
- -e history_csv Export of history information to csv format has been added.
Advanced Notice
- With a release planned for late 2018, only 64 bit versions of the CLC Server Command Line Tools will be made available. The 32 bit version will be discontinued from that time.
- The NGS import tool ngs_import_solid is now a legacy tool and will be retired and will not be available in the next major release of the software.
- The NGS import tool ngs_import_roche454 is now a legacy tool and will be removed in a future release, but will be available in the next major release of the software.
QIAGEN CLC Genomics Server 9.1.3
Shared with workbenches
Bug fixes
- Fixed a bug where the "Unaligned end" field provided in the Breakpoint track output of the Indel and Structural Variants tool was left blank when the value should have been "Mixed consensus" on all but one chromosome. The field is now filled for all chromosomes.
- Fixed an issue with the Low Frequency Variant Detection and Fixed Ploidy Variant Detection tools that caused a small minority of variants to go unreported under certain conditions expected to arise rarely.
- Fixed an issue where domain annotations added by the Pfam Domain Search tool started one amino acid later than expected. The corresponding start position in the table produced by the tool was correct.
- Fixed an issue where the RNA-Seq Analysis tool would sometimes generate TE tracks that could not be used in downstream tools. The error occurred when the "Calculate expression for genes without transcripts" option was used on a gene track where two genes had the same name, one of the genes contained the other, and neither gene had a transcript.
- Fixed an issue where the RNA-Seq Analysis tool would show an error if the first chromosome or contig contained no transcripts and the "Calculate expression for genes without transcripts" option was used.
- Fixed a concurrency bug in the Copy Number Variant Detection tool, which very rarely resulted in the tool reporting all low-coverage targets on one or more chromosomes as false positive deletions. (Biomedical enabled Genomics Server only)
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 9.1.3:
- QIAGEN CLC Genomics Workbench 10.1.3
- Biomedical Genomics Workbench 4.1.3
- QIAGEN CLC Command Line Tools 4.1.3
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 10.0.0, 10.0.1, 10.1, 10.1.1 and 10.1.2, Biomedical Genomics Workbench 4.0, 4.1, 4.1.1 and 4.1.2, and CLC Command Line Tools 4.0, 4.1, 4.1.1 and 4.1.2 can also connect to QIAGEN CLC Genomics Server 9.1.3. Tools that have changed between versions cannot be launched when using compatible, but not corresponding, client-server combinations.
Advanced notice
SOLiD colorspace data support, including import, will not be available from QIAGEN CLC Genomics Server 11.0 onwards.
CLC Server Command Line Tools
This is a compatibility release to supply the corresponding client for QIAGEN CLC Genomics Server 9.1.3.
Compatibility
CLC Command Line Tools 4.1.3 is the corresponding client for QIAGEN CLC Genomics Server 9.1.3.
CLC Command Line Tools 4.1.3 can also act as a client for the QIAGEN CLC Genomics Server 9.1.2, 9.1.1, 9.1 and 9.0. However, we recommend running the corresponding version of the CLC Command Line Tools for QIAGEN CLC Genomics Server.
QIAGEN CLC Genomics Server 9.1.2
Shared with workbenches
Changes and bug fixes
- Fixed an issue with the mapping tool in the QIAGEN CLC Genomics Server, which is used in tools involving a mapping stage, such as Map Reads to References, Map Reads to Contigs and RNA-Seq Analysis, where length and similarity fraction cut-offs in some cases were ignored for reads longer than 500bp.
- Fixed a bug in the Amino Acid Changes tool, and in Bx-enabled servers, the Add Information about Amino Acid Changes tool, where the CDS reference was used instead of the RNA reference when annotating coding region changes if the RNA and CDS annotations could not be matched. This could result in variants in UTR regions not being reported. The matching has been improved by supporting the 'parent' field used by the GFF3 file format to pair CDS and RNA references.
- Fixed an issue where the number of input samples to the Map Reads to Reference and Map Reads to Contigs tools would be silently limited to 120. The execution is now aborted with a warning message. Each analysis must be started with at most 120 samples.
- Fixed a bug in the RNA-Seq Analysis tool where, when run in "Genes and transcripts" mode, and using "Total counts" as Expression value, the expression values reported for GE tracks would not include shared exon counts. Downstream analyses based on the Set Up Experiment tool could be affected by this issue. Using affected GE tracks as input to the following tools would *not* affect their results: Differential Expression for RNA-Seq, Create Heat Map for RNA-Seq and PCA for RNA-Seq.
- The behavior of the RNA-Seq Analysis tool has been changed when the option "Genome annotated with genes and transcripts" is used together with the option "Calculate expression for genes without transcripts":
- The counts of genes without transcripts are calculated. Previously only the TPM and RPKM were calculated.
- For a gene without a corresponding transcript, where that gene is overlapped by the intron of another gene, reads aligning to this region are counted towards the expression of the gene without the transcript. Previously such reads were counted as belonging to the intronic region of the overlapping gene.
- A single-exon transcript for each gene without transcripts is now added to the output TE track.
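Because Map Reads to Reference and Map Reads to Contigs now abort when given more than 120 samples (see the fix above), larger sample sets have to be divided before launching. A minimal Python sketch of such a split follows. The helper `split_into_batches` is hypothetical and says nothing about how jobs are actually submitted to the server.

```python
MAX_SAMPLES_PER_RUN = 120  # limit stated in these release notes


def split_into_batches(samples, batch_size=MAX_SAMPLES_PER_RUN):
    """Split a list of samples into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        samples[i:i + batch_size]
        for i in range(0, len(samples), batch_size)
    ]
```

With 250 samples this produces three batches of 120, 120 and 10 samples, each of which can be started as a separate analysis.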
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 9.1.2
- QIAGEN CLC Genomics Workbench 10.1.2
- Biomedical Genomics Workbench 4.1.2
- CLC Command Line Tools 4.1.2
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 10.0.0, 10.0.1, 10.1 and 10.1.1, Biomedical Genomics Workbench 4.0, 4.1 and 4.1.1, and CLC Command Line Tools 4.0, 4.1 and 4.1.1 can also connect to QIAGEN CLC Genomics Server 9.1.2. Tools that have changed between versions cannot be launched when using compatible, but not corresponding, client-server combinations.
Advanced notice
Support for SOLiD colorspace data will be phased out over the next 12 months. If you are concerned about the proposed change, please contact our Support team (AdvancedGenomicsSupport@qiagen.com).
CLC Server Command Line Tools
This is a compatibility release to supply the corresponding client for QIAGEN CLC Genomics Server 9.1.2.
Compatibility
CLC Command Line Tools 4.1.2 is the corresponding client for QIAGEN CLC Genomics Server 9.1.2.
CLC Command Line Tools 4.1.2 can also act as a client for the QIAGEN CLC Genomics Server 9.1.1, 9.1 and 9.0. However, we recommend running the corresponding version of the CLC Command Line Tools for QIAGEN CLC Genomics Server.
QIAGEN CLC Genomics Server 9.1.1
Bug fixes
- Fixed an issue introduced in QIAGEN CLC Genomics Server 8.5 causing the Merge Annotation Tracks tool to fail when used on tracks with more than 6 chromosomes.
- Fixed an issue introduced in the Biomedical enabled QIAGEN CLC Genomics Server 9.1 where workflows containing the Copy Number Variant Detection tool could not be updated automatically. (Biomedical-enabled servers only)
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 9.1.1
- QIAGEN CLC Genomics Workbench 10.1.1
- Biomedical Genomics Workbench 4.1.1
- CLC Command Line Tools 4.1.1
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 10.0.0, 10.0.1 and 10.1, Biomedical Genomics Workbench 4.0 and 4.1, and QIAGEN CLC Command Line Tools 4.0 and 4.1 can connect to QIAGEN CLC Genomics Server 9.1.1. Tools that have changed between versions cannot be launched when using compatible, but not corresponding, client-server combinations.
Advanced notice
- Support for SOLiD colorspace data will be phased out over the next 18 months. If you are concerned about the proposed change, please contact our Support team (AdvancedGenomicsSupport@qiagen.com).
QIAGEN CLC Server Command Line Tools
This is a compatibility release to supply the corresponding client for QIAGEN CLC Genomics Server 9.1.1.
Compatibility
CLC Command Line Tools 4.1.1 is the corresponding client for QIAGEN CLC Genomics Server 9.1.1.
CLC Command Line Tools 4.1.1 can also act as a client for the QIAGEN CLC Genomics Server 9.1 and 9.0. However, we recommend running the corresponding version of the CLC Command Line Tools for QIAGEN CLC Genomics Server.
QIAGEN CLC Genomics Server 9.1.0
Server specific
- Improved messaging in the web administrative interface when workflows installed on the server have unresolved dependencies, such as missing plugins, licenses or data.
- More detailed feedback is now presented in the Plugins tab of the web administrative client when there are issues with installed plugins, such as missing or expired licenses or version incompatibilities.
- Fixed a connection issue with LDAP authentication over SSL when the "Disable SSL certificate check" setting was used. This problem was introduced in QIAGEN CLC Genomics Server 9.0.
Shared with workbenches
Improvements
- When importing tracks, the history of the track now contains the full path name of the imported file.
Bug fixes
- Fixed an issue with the Basic Variant Detection, Low Frequency Variant Detection and Fixed Ploidy Variant Detection tools that could cause the count and frequency values to be too low for a small subset of those variants that are contained within a larger variant region (e.g. an MNV or deletion). For a variant to be affected by this problem, there needed to be at least two other potential variants nearby that were disregarded during the variant calling process. This circumstance and our testing suggest this is a rare issue.
- Fixed a bug that in some cases would result in incorrect BaseQRankSum values being reported in the outputs of the Basic Variant Detection, Low Frequency Variant Detection and Fixed Ploidy Variant Detection tools.
- Fixed an issue where the GFF3 Exporter could generate invalid GFF3 for features of length 0.
Shared with Biomedical enabled Genomics Server only
- Fixed a bug in the Copy Number Variation Detection tool where the target-level output could not be produced unless a gene track was also specified.
- Fixed an issue in the Copy Number Variation Detection tool where the data in the "Fold-change (adjusted)" and "Fold-change (raw)" columns were reversed in the target-level CNV output.
Compatibility
The following are the corresponding clients for the QIAGEN CLC Genomics Server 9.1.0
- QIAGEN CLC Genomics Workbench 10.1.0
- Biomedical Genomics Workbench 4.1.0
- CLC Command Line Tools 4.1.0
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 10.0.0 and 10.0.1, Biomedical Genomics Workbench 4.0, and CLC Command Line Tools 4.0 can connect to QIAGEN CLC Genomics Server 9.1.0. Tools that have changed between versions cannot be launched when using compatible, but not corresponding, client-server combinations.
Advanced notice
- Support for SOLiD colorspace data will be phased out over the next 18 months. If you are concerned about the proposed change, please contact our Support team (AdvancedGenomicsSupport@qiagen.com).
QIAGEN CLC Server Command Line Tools
This is a compatibility release to supply the corresponding client for QIAGEN CLC Genomics Server 9.1.0.
Compatibility
CLC Command Line Tools 4.1.0 is the corresponding client for QIAGEN CLC Genomics Server 9.1.0.
CLC Command Line Tools 4.1.0 can also act as a client for the QIAGEN CLC Genomics Server 9.0. However, we recommend running the corresponding version of the CLC Command Line Tools for QIAGEN CLC Genomics Server.
QIAGEN CLC Genomics Server 9.0
Server specific
New features and improvements
- The tabs "Import", "Export" and "Sequence Text" have been removed from the web administration interface. Viewing, import and export functionality are available via Workbench clients, and import and export functionality is also available using the CLC Command Line Tools client.
- The mapping tool in the Workbench, which is used in tools involving a mapping stage, such as Map Reads to References, Map Reads to Contigs and RNA-Seq Analysis has been updated. The update includes improved read mapping quality and speed (especially for longer reads), improved memory performance for the index building stage, and various minor bug fixes. The new mapping tool corresponds to the clc_mapper tool included in Assembly Cell 5.0.3, planned for release in March, 2017.
- Temporary *.cpw files are now deleted immediately after installation of server workflows from the workbench. Previously these were deleted later, when the finished server process for the installation was removed.
- Various minor improvements
Bug fixes
- Fixed a problem where permission changes were not applied as expected when using the "Apply to all subfolders" option when setting group permissions on a folder in server data locations.
- Fixed an issue where the QIAGEN CLC Genomics Server would wait indefinitely when there was a stalled connection to the LDAP server.
- Fixed an issue that caused the import of Wiggle and UCSC chromosome band files to fail on QIAGEN CLC Genomics Server setups.
- Fixed an issue where batch jobs submitted to the QIAGEN CLC Genomics Server would not always display all the sub-processes in the Processes tab in a Workbench.
- Various minor bugfixes.
Shared with workbenches
New tools for RNA-Seq
- Create Combined RNA-Seq Report - makes it possible to join multiple reports generated by the RNA-Seq Analysis tool into one combined overview report.
- PCA for RNA-Seq* - clusters samples in 2D or 3D. Known metadata about each sample is added as an overlay.
- Differential Expression for RNA-Seq* - uses multi-factorial statistics based on a negative binomial GLM.
- Create Heat Map for RNA-Seq* - simultaneously clusters samples and features. Known metadata about each sample is added as an overlay.
- Create Expression Browser* - allows expression values, statistical results, and gene annotations to be viewed together.
- Create Venn Diagram for RNA-Seq* - shows differentially expressed genes shared between experimental conditions.
- Gene Set Test - tests the output from the Differential Expression for RNA-Seq tool for overrepresented gene sets (such as Gene Ontology terms) using a hypergeometric test.
- Import | RNA Spike-ins - for importing RNA spike-in sequences and concentration data.
*Tools marked with an asterisk were available to earlier Workbench versions via the Advanced RNA-Seq plugin.
These tools automatically account for differences due to sequencing depth, removing the need to normalize input data. They work with existing RNA-seq TE and GE tracks. Changes made in this release mean that outputs from the Differential Expression for RNA-Seq tool can now be used as inputs to the Extract Annotations (for QIAGEN CLC Genomics Server) and Extract Reads Based on Overlap tools.
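The Gene Set Test listed above is described as using a hypergeometric test for overrepresented gene sets. As a rough illustration of that statistic, the Python sketch below computes the upper-tail probability of seeing at least a given overlap between a list of selected genes and a gene set. The function and parameter names are assumptions for this example, and the notes do not say which tail, background or multiple-testing correction the tool itself applies.

```python
from math import comb


def hypergeom_enrichment_pvalue(population, set_size, selected, overlap):
    """Return P(X >= overlap) for a hypergeometric draw.

    population: total number of genes considered
    set_size:   number of those genes that belong to the gene set
    selected:   number of genes drawn (e.g. differentially expressed genes)
    overlap:    number of drawn genes that belong to the gene set
    """
    upper = min(set_size, selected)
    total = comb(population, selected)
    # Sum the probabilities of every overlap at least as large as observed.
    tail = sum(
        comb(set_size, k) * comb(population - set_size, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total
```

For example, with 10 genes in total, a gene set of 3, and 4 selected genes of which 2 are in the set, the tail probability is 70/210, or one third.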
RNA-Seq Analysis
- The RNA-Seq Analysis tool now supports RNA spike-ins, such as ERCC and SIRV, for quality control. This makes it possible to validate RNA-Seq experiments by comparing known spike-in concentrations to measured transcript concentrations. Spike-ins can be imported using the new RNA Spike-ins Import tool.
- The RNA-Seq Analysis report has been revised and updated:
- We now show the distribution of the biotypes that the reads mapped to.
- The strand specificity of the mapped reads is now reported.
- Transcript coverage plots make it possible to detect and visualize 5' and 3' coverage bias.
- For paired-end reads, we now detect and warn about potential adapter read-through.
- A biotype column is now available in the Expression Track tables produced by the RNA-Seq Analysis tool, when biotype information is available.
- The Mapping options of the RNA-Seq Analysis tool, "Map to gene regions only" and "Also map to inter-genic regions", have been removed. The tool now runs by mapping reads to the full reference supplied, which is equivalent to choosing the recommended "Also map to inter-genic regions" option in earlier versions.
- The RNA-Seq Analysis tool now always uses the "Expression level" option "Use EM estimation (recommended)" to quantify expression. This is more accurate than the previous default option. Differences are especially noticeable for Transcript Expression (TE) tracks.
- The RNA-Seq Analysis quantification by EM estimation now runs faster.
- In RNA-Seq analyses, reads that map uniquely to a genome position are now always marked as unique. Previously, a uniquely mapped read would be marked as ambiguous if it mapped to a position with multiple overlapping genes.
- Exon IDs will no longer be included in the ENSEMBL column of transcript expression (TE) tracks generated by the RNA-Seq Analysis tool. Gene and transcript names will continue to be listed in this column.
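Several items above refer to quantifying expression by EM estimation. For readers unfamiliar with the idea, the Python sketch below shows a generic expectation-maximization loop that shares ambiguous reads among the transcripts they are compatible with. It is a textbook-style simplification that ignores transcript length and many other factors, with hypothetical names throughout; it is not the algorithm used by the RNA-Seq Analysis tool.

```python
def em_transcript_abundance(read_compat, transcripts, iterations=50):
    """Estimate relative transcript abundances from ambiguous reads.

    read_compat: list of tuples; each tuple names the transcripts that one
                 read is compatible with.
    transcripts: list of all transcript names.
    Transcript length is ignored in this sketch.
    """
    # Start from a uniform guess.
    theta = {t: 1.0 / len(transcripts) for t in transcripts}
    for _ in range(iterations):
        counts = {t: 0.0 for t in transcripts}
        # E-step: split each read among its compatible transcripts in
        # proportion to the current abundance estimates.
        for compat in read_compat:
            norm = sum(theta[t] for t in compat)
            if norm == 0:
                continue
            for t in compat:
                counts[t] += theta[t] / norm
        # M-step: re-estimate abundances from the fractional counts.
        total = sum(counts.values())
        if total == 0:
            break
        theta = {t: counts[t] / total for t in transcripts}
    return theta
```

With 8 reads unique to one transcript, 2 unique to another and 4 shared by both, the estimates settle near 0.8 and 0.2, so the shared reads are divided according to the evidence from the unique reads rather than split evenly.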
Import/Export
- A tool to import PacBio data is now available. It is located at Import | PacBio in Workbenches.
- The GFF2/GTF/GVF tracks importer can no longer be used to import GFF3 format files. The new GFF3 tracks importer should be used for this instead.
- The GFF3 importer has been updated with respect to the handling of CDS features. In earlier versions, CDSs with different IDs but the same parent gene would always be merged into the same CDS feature during import. This behavior will still occur in cases where all CDSs in the GFF3 file either have unique IDs or no IDs. For GFF3 files where there are any CDSs with identical IDs, then only CDSs with the same ID are merged into a single feature.
- The Import | Tracks tool now accepts files with a .fna extension.
- The speed of importing to tracks where the original file contains data relating to many chromosomes has been substantially improved.
- The Cosmic option of the Import | Tracks tool is now more flexible with regards to the column headings in the files being imported.
- An exporter has been added to export annotations on sequences or tracks to Generic Feature Format Version 3 (GFF3) format.
- An option has been added to create an index file when exporting to BAM format.
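The GFF3 importer change above describes a rule for deciding which CDS rows are merged into one feature. The Python sketch below restates that rule as grouping logic. The function name, the row dictionaries and the grouping keys are hypothetical, and the handling of rows without an ID in a file where other IDs are shared is an assumption, since the notes do not cover that case.

```python
from collections import Counter, defaultdict


def group_cds_rows(cds_rows):
    """Group CDS rows into merged features following the rule described above.

    cds_rows: list of dicts with keys "id" (str or None) and "parent" (str).
    Returns a dict mapping a grouping key to the list of rows merged together.
    """
    id_counts = Counter(row["id"] for row in cds_rows if row["id"] is not None)
    any_shared_id = any(n > 1 for n in id_counts.values())

    groups = defaultdict(list)
    for index, row in enumerate(cds_rows):
        if not any_shared_id:
            # All IDs unique or absent: merge by parent, as in earlier versions.
            key = ("parent", row["parent"])
        elif row["id"] is not None:
            # Some IDs are shared: merge only rows that share an ID.
            key = ("id", row["id"])
        else:
            # Assumption: a row without an ID stays on its own in this case.
            key = ("row", index)
        groups[key].append(row)
    return dict(groups)
```

Two rows with IDs cds1 and cds2 under the same parent end up in one group, whereas adding a second row with ID cds1 switches the whole file to grouping by ID.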
New features and improvements
- Toolbox rearrangement: the expression analysis tools are now in two top-level folders: "RNA-Seq Analysis" and "Microarray and Small RNA Analysis". The former top level Toolbox folder Transcriptomics Analysis has been removed.
- When working with Gene Sets that refer to Gene Ontology terms, gene annotations are now automatically propagated to parent Gene Ontology terms. This improvement affects the tools Hypergeometric Tests on Annotations and Gene Set Enrichment Analysis (GSEA).
- The mapping tool used as part of Map Reads to References and the Map Reads to Contigs tools has been updated. The update includes improved read mapping quality for longer reads, improved memory performance for the index building stage, as well as various minor bug fixes. The new mapping tool corresponds to the clc_mapper tool included in Assembly Cell 5.0.3, planned for release in March, 2017.
- The default value for the parameter "Maximum guidance-variant length" in the Local Realignment tool has been changed to 200 (was 100). This change applies to all ready-to-use workflows and when the tool is launched directly.
- The Basic Variant Detection tool will no longer report N as an alternative allele when there is an ambiguous base at a variant position.
- The report generated by the tool Create Statistics for Target Regions now includes a "≥" sign instead of a ">" sign.
- The "Additional Reporting" options in the Create Sequencing QC Report tool, "Quality analysis" and "Over-representation analysis" have been removed. These outputs are now generated by default.
- Options to search the full text or abstracts available in PubMed have been added to the Search for Reads in SRA tool.
- Support has been added for 'negative lookahead' in Java regular expressions used with the Motif Search tool.
- For new or existing sequence lists the sequencing platform can now be specified via the Read Group setting of the Element Info view.
- The speed of searches for data elements with associations to specified metadata, from within a Metadata Table, has been greatly improved. To enable metadata related searches to work after upgrading to the QIAGEN CLC Genomics Server 9.0, indices for the locations containing the relevant data will need to be rebuilt.
- Various minor improvements
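The negative lookahead support mentioned above for the Motif Search tool refers to the `(?!...)` construct, which matches a position only if the given pattern does not follow. The construct has the same syntax in Java and Python regular expressions, so the sketch below uses Python's `re` module to show the behavior. The helper name and the example motif are illustrative and are not taken from the tool.

```python
import re


def find_motif_not_followed_by(sequence, motif, excluded_suffix):
    """Return 0-based start positions of `motif` that are not immediately
    followed by `excluded_suffix`, using a negative lookahead (?!...)."""
    pattern = re.compile(
        re.escape(motif) + "(?!" + re.escape(excluded_suffix) + ")"
    )
    return [m.start() for m in pattern.finditer(sequence)]
```

Searching the sequence ATGTAAATGCCC for ATG not followed by TAA reports only the second ATG, at 0-based position 6. The equivalent pattern written out directly is `ATG(?!TAA)`.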
Bug fixes and changes
- Fixed an issue where the index building stage of the Map Reads to References and the Map Reads to Contigs tools was not taking into account the maxcores setting in the cpu.properties file, where this had been configured.
- Fixed an issue where sequence circularity was not reported in the output from the Map Reads to References tool.
- Fixed a bug in the Create Detailed Mapping Report, which sometimes reported incorrect read counts for circular sequences.
- Fixed an issue where the Basic Variant Detection, Low Frequency Variant Detection and Fixed Ploidy Variant Detection tools reported homozygous reference insertions in cases where a heterozygous variant was possible but the insertion variant was disregarded during filtering.
- Fixed an issue where the Identify Known Mutations from Sample Mappings tool would fail if it was part of a workflow and received multiple sample mappings as input.
- Fixed an issue with GenBank and EMBL exports where quoting specifications were not being conformed to.
- Fixed an issue where a workflow containing an export step that failed did not provide any indication that a problem had occurred.
- The speed of sorting and loading tracks in the Workbenches has been greatly improved. Due to these changes, tracks created with this version of the QIAGEN CLC Genomics Server and later ones cannot be used in older Workbenches or Servers. Backwards compatibility has been maintained: tracks created using older versions of the Workbench or QIAGEN CLC Genomics Server can continue to be used.
- Various minor bugfixes.
Shared with Biomedical enabled Genomics Server only
- Two new human reference data sets are available for download from the Reference Data Manager. One is based on Ensembl 86 and the other is based on RefSeq GRCh38.p9.
- The three workflows Identify and Annotate Differentially Expressed Genes and Pathways for human, mouse, and rat have been replaced by three new workflows of the same names. The new workflows benefit from the inclusion of new RNA-seq tools.
- The Ready-to-use workflows listed under the "Whole Transcriptome Sequencing" folder of the Workbench Toolbox now support strand-specific RNA-seq protocols by allowing the "Strand Specific" parameter to be set.
- In all Ready-to-Use workflows containing the tool Map Reads to Reference, the default value for the parameter "Cost of insertions and deletions" has been changed to "affine" (it used to be "linear"). Default values have not been changed in the case where the tool is launched directly.
- Less temporary space is now consumed when downloading data via the Reference Data Manager.
- When working with Gene Sets that refer to Gene Ontology terms, gene annotations are now automatically propagated to parent Gene Ontology terms. This improvement affects the tool Identify Differentially Expressed Gene Groups and Pathways.
- Fixed a bug in the Create Detailed Mapping Report (or QC for Read Mapping tool in Biomedical enabled Genomics Server), which sometimes reported incorrect read counts for circular sequences.
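One item above notes that gene annotations are now propagated to parent Gene Ontology terms. As a minimal sketch of what such propagation means, the Python below walks up a term hierarchy and adds every ancestor of each directly annotated term. The function name, the data structures and the placeholder term names in the usage note are assumptions for illustration; this is not the tools' implementation.

```python
def propagate_to_ancestors(gene_to_terms, term_parents):
    """Add every ancestor term of each directly annotated term.

    gene_to_terms: dict mapping gene -> set of directly annotated term IDs.
    term_parents:  dict mapping term ID -> set of parent term IDs.
    Returns a new dict mapping gene -> set of direct and inherited terms.
    """
    propagated = {}
    for gene, terms in gene_to_terms.items():
        seen = set()
        stack = list(terms)
        # Depth-first walk up the hierarchy, visiting each term once.
        while stack:
            term = stack.pop()
            if term in seen:
                continue
            seen.add(term)
            stack.extend(term_parents.get(term, ()))
        propagated[gene] = seen
    return propagated
```

If a gene is annotated only with a term "child" whose parent is "mid", which in turn has the parent "root", the gene is reported for all three terms after propagation. A test on the "root" gene set therefore also counts genes annotated only at more specific levels.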
Retirement
- The GFF exporter has been retired and is no longer available. The new GFF3 exporter should be used instead.
- The Probabilistic Variant Detection (legacy) and Quality-based Variant Detection (legacy) tools have been retired.
- The tool Trim Primers of Mapped Reads has been retired. For trimming primers from mapped reads, please use the Trim Primers and their Dimers from Mapping tool, which is distributed with the QIAGEN GeneRead Panel Analysis Server Plugin.
Compatibility
The following software are the corresponding clients for the QIAGEN CLC Genomics Server 9.0:
- QIAGEN CLC Genomics Workbench 10.0
- Biomedical Genomics Workbench 4.0 (when the server has the Biomedical Extension)
- CLC Command Line Tools 4.0
Plugin notes
- The Advanced RNA-Seq plugin has been retired. The tools from this plugin have been integrated into the software. Please see the New Tools section for more details.
QIAGEN CLC Server Command Line Tools
All QIAGEN CLC Genomics Servers
New Tools
- create_combined_rnaseq_report Create Combined RNA-Seq Report
- create_expression_browser Create Expression Browser
- create_heatmap_for_rnaseq Create Heat Map for RNA-Seq
- create_venn_diagram_for_rnaseq Create Venn Diagram for RNA-Seq
- differential_expression_rna_seq Differential Expression for RNA-Seq
- gene_set_test Gene Set Test
- principal_component_for_rna_seq PCA for RNA-Seq
- spikein_control_import RNA Spike-ins
Other tools
- -e gff3 export to gff3 format
Improvements
- Help for the CLC Command Line Tool is no longer printed to the console when errors are returned.
Changes
Commands removed
- probabilistic_variant_detection Probabilistic Variant Detection. Legacy tool, now retired.
- quality-based_variant_detection Quality based Variant Detection. Legacy tool, now retired.
- -e gff export to gff. Use the new gff3 exporter instead.
Options added to commands
rna_seq
- --spike-in-settings Map reads to spike-in controls (default: NONE)
- --spikein-controls Select spike-in controls
exporting to bam format (-A export -e bam)
- --index <Boolean> Create an index file (.bai) (default: false)
Options removed from commands
rna_seq
- --em-enabled
- --mapping-type
The tool now runs with EM enabled by default.
sequencing_qc_report
- --include-overrepresentation-analysis
- --include-quality-score-analysis
The tool now generates these outputs as part of the report by default.
Biomedical Enabled QIAGEN CLC Genomics Servers only
Changes in addition to those listed for all QIAGEN CLC Genomics Servers
Options removed from commands
qc_for_sequencing_reads
- --include-overrepresentation-analysis
- --include-quality-score-analysis
QIAGEN CLC Genomics Server 8.5.5
Server specific
- Fixed an issue that caused the import of Wiggle and UCSC chromosome band files to fail on QIAGEN CLC Genomics Server setups.
- More detailed feedback is now presented in the Plugins tab of the web administrative client when there are issues with installed plugins, such as missing or expired licenses or version incompatibility.
Shared with workbenches
Improvements
- When importing tracks, the history of the track now contains the full path name of the imported file.
Bug fixes
- Fixed an issue with the Basic Variant Detection, Low Frequency Variant Detection and Fixed Ploidy Variant Detection tools that could cause the count and frequency values to be too low for a small subset of those variants that are contained within a larger variant region (e.g. an MNV or deletion). For a variant to be affected by this problem, there needed to be at least two other potential variants nearby that were disregarded during the variant calling process. This circumstance and our testing suggest this is a rare issue.
- Fixed a bug that in some cases would result in incorrect BaseQRankSum values being reported in the outputs of the Basic Variant Detection, Low Frequency Variant Detection and Fixed Ploidy Variant Detection tools.
Compatibility
- QIAGEN CLC Genomics Workbench 9.5.5 is the corresponding client for QIAGEN CLC Genomics Server 8.5.5.
- Biomedical Genomics Workbench 3.5.5 is the corresponding client for QIAGEN CLC Genomics Server 8.5.5.
- CLC Command Line Tools 3.5.5 is the corresponding client for QIAGEN CLC Genomics Server 8.5.5.
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 9.5.4, 9.5.3, 9.5.2, 9.5.1, 9.5, 9.0.1 and 9.0, Biomedical Genomics Workbench 3.5.4, 3.5.3, 3.5.2, 3.5.1, 3.5, 3.0.1 and 3.0, and CLC Command Line Tools 3.5.4, 3.5.3, 3.5.2, 3.5.1, 3.5, 3.0.1 and 3.0 can connect to QIAGEN CLC Genomics Server 8.5.5. In addition, QIAGEN CLC Genomics Workbench 9.5.x, Biomedical Genomics Workbench 3.5.x and CLC Command Line Tools 3.5.x can connect to a QIAGEN CLC Genomics Server 8.0.x. Tools that have changed between versions cannot be launched when using compatible, but not corresponding, client-server combinations.
QIAGEN CLC Server Command Line Tools
This is a compatibility release to supply the corresponding client for QIAGEN CLC Genomics Server 8.5.5.
Compatibility
CLC Command Line Tools 3.5.5 is the corresponding client for QIAGEN CLC Genomics Server 8.5.5.
CLC Command Line Tools 3.5.5 can also act as a client for the QIAGEN CLC Genomics Server 8.5.4, 8.5.3, 8.5.2, 8.5.1, 8.5, 8.0.1 and 8.0. However, we recommend running the corresponding version of the CLC Command Line Tools for the QIAGEN CLC Genomics Server.
QIAGEN CLC Genomics Server 8.5.4
Bug fixes
- A timeout value that would lead a job to fail after 24 hours, which was introduced as part of optimizations to run on multiple threads in the QIAGEN CLC Genomics Server 8.5, has been extended to 7 weeks. The tools affected are Annotate from Known Variants, Filter against Known Variants, Filter against Control Reads, Annotate with Exon Number, Annotate with Flanking Sequences, Filter Marginal Variant Calls, Compare Sample Variant Tracks, Trio Analysis, GO enrichment Analysis, Amino Acid Changes, Annotate with Conservation Score, Predict Splice Site Effect, Link Variants to 3D Protein Structure, Merge Annotation Tracks, Create Statistics for Target Regions, Fisher Exact Test, Annotate with Overlap Information, Filter Based on Overlap, Filter Reference Variants, Identify Candidate Variants, Coverage Analysis, and InDels and Structural Variants.
- Fixed an issue in the RNA-Seq Analysis tool where running in EM mode, with a "Strand specific" setting of "Forward" or "Reverse" would result in the second read of a pair mapped as a broken pair being counted incorrectly if that read was mapped outside a region annotated as a transcript.
- Fixed an issue where an error arose when using the RNA-Seq Analysis tool with the EM option and a strand specific setting of "Forward" or "Reverse" in cases where the second read of a mapped broken pair mapped to the opposite strand of the strand specific setting.
- Fixed an issue with the Basic Variant Detection, Low Frequency Variant Detection and Fixed Ploidy Variant Detection tools where the forward and/or reverse count for a longer variant, supported by paired reads with both children having the same direction, could be too low. The forward count and reverse count are now reported correctly.
- Fixed an issue with the InDels and Structural Variants tool where an incorrect insertion could be called when the optimal alignment of a read's unaligned end around the breakpoint included a gap in the insertion sequence.
- Fixed an issue in the InDels and Structural Variants tool that would, in a fraction of cases, terminate analysis of large read mappings prematurely.
- Fixed an issue with the Basic Variant Detection, Low Frequency Variant Detection and Fixed Ploidy Variant Detection tools where the count and read count could be reported as marginally higher than they actually were in a small minority of cases. For the affected variants, this could then also result in variant frequencies being reported that were slightly higher than they should have been, in some cases above 100%. Variants affected by this issue are a small subset of variants where the variant affected overlapped another potential variant and where only the affected variant was then reported. This change could lead to a small decrease in the number of variants reported compared to earlier versions of the CLC software, due to a variant no longer passing the count or read count filtering constraints. The impact of this change is expected to be low. For example, in our tests, for a particular analysis that reported 250,000 variants, 30 fewer were reported with the same parameters and filters applied after this fix was implemented.
- Fixed an issue where the Basic Variant Detection, Fixed Ploidy Variant Detection, Low Frequency Variant Detection and Local Realignment tools could fail if a deletion was encountered at the end of a match between a read and the reference in the mapping used as input.
- Fixed an issue in the Basic Variant Detection, Fixed Ploidy Variant Detection and Low Frequency Variant Detection tools where the tools could stop with an error. The problem arose when a read split up within a mapping (e.g. to map to separate exons) was split into 4 or more parts, and at least 4 of those parts would map within a region of adjacent variants being considered as a possible multiple nucleotide variant (MNV). This infrequent problem was most likely to occur when using high coverage RNA-Seq mappings and looking for variants occurring at low frequency. It was introduced in the previous bugfix release of the QIAGEN CLC Genomics Server, version 8.5.3.
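The effect of the count and read count fix above can be illustrated with a small calculation. This is only a sketch: it assumes that frequency is the variant count divided by the coverage at the variant position, expressed as a percentage, and the function names and filter thresholds are hypothetical rather than taken from the CLC software.

```python
def variant_frequency(count: int, coverage: int) -> float:
    """Return the variant frequency in percent, assumed here to be count / coverage * 100."""
    if coverage <= 0:
        raise ValueError("coverage must be positive")
    return 100.0 * count / coverage


def passes_filters(count: int, coverage: int, min_count: int, min_frequency: float) -> bool:
    """Hypothetical filter: keep a variant only if count and frequency reach the minimums."""
    return count >= min_count and variant_frequency(count, coverage) >= min_frequency


# A count that is one read too high pushes the frequency above 100%.
print(variant_frequency(20, 20))  # correct count
print(variant_frequency(21, 20))  # inflated count

# With a minimum count of 2, a variant whose true count is 1 passed only while inflated.
print(passes_filters(2, 50, min_count=2, min_frequency=1.0))
print(passes_filters(1, 50, min_count=2, min_frequency=1.0))
```

Under these assumptions, correcting an inflated count can move a borderline variant below a count threshold, which is consistent with the small drop in reported variants described above.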
Specific to Biomedical enabled Genomics Server
- A timeout value that would lead a job to fail after 24 hours, which was introduced as part of optimizations to run on multiple threads in the QIAGEN CLC Genomics Server 8.5, has been extended to 7 weeks. The tools affected are Add Information from Variant Databases, Remove Variants Found in External Database, Remove Germline Variants, Add Exon Number, Add Flanking Region, Remove False Positives, Identify highly Mutated Gene Groups and Pathways, Add Information About Amino Acid Changes, Add Conservation Scores, Identify Variants with Effect on Splicing, Link Variants to 3D Protein Structure, QC for Targeted Sequencing, Identify Enriched Variants in Case vs Control Samples, Add Information from Overlapping Genes, Add information from Genomic Regions, Add Information from Overlapping Variants, Remove Variants Outside Genome Regions, Remove Variants Outside Target Regions, Remove Variants Inside Genome Regions, Identify Mutated Genes, Remove Reference Variants, and Whole Genome Coverage Analysis.
Compatibility
- QIAGEN CLC Genomics Workbench 9.5.4 is the corresponding client for QIAGEN CLC Genomics Server 8.5.4.
- Biomedical Genomics Workbench 3.5.4 is the corresponding client for QIAGEN CLC Genomics Server 8.5.4.
- CLC Command Line Tools 3.5.4 is the corresponding client for QIAGEN CLC Genomics Server 8.5.4.
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 9.5.3, 9.5.2, 9.5.1, 9.5, 9.0.1 and 9.0, Biomedical Genomics Workbench 3.5.3, 3.5.2, 3.5.1, 3.5, 3.0.1 and 3.0, and CLC Command Line Tools 3.5.3, 3.5.2, 3.5.1, 3.5, 3.0.1 and 3.0 can connect to QIAGEN CLC Genomics Server 8.5.4. In addition, QIAGEN CLC Genomics Workbench 9.5.x and Biomedical Genomics Workbench 3.5.x and CLC Command Line Tools 3.5.x can connect to a QIAGEN CLC Genomics Server 8.0.x. Tools that have changed between versions cannot be launched when using compatible, but not corresponding, client-server combinations.
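The distinction between corresponding and compatible clients can be expressed as a simple lookup. The sketch below transcribes the versions listed above for QIAGEN CLC Genomics Server 8.5.4; the function and table names are hypothetical, and this is not how the software itself performs the check.

```python
# Versions transcribed from the compatibility notes for server 8.5.4.
CORRESPONDING = {
    "QIAGEN CLC Genomics Workbench": "9.5.4",
    "Biomedical Genomics Workbench": "3.5.4",
    "CLC Command Line Tools": "3.5.4",
}

COMPATIBLE = {
    "QIAGEN CLC Genomics Workbench": {"9.5.3", "9.5.2", "9.5.1", "9.5", "9.0.1", "9.0"},
    "Biomedical Genomics Workbench": {"3.5.3", "3.5.2", "3.5.1", "3.5", "3.0.1", "3.0"},
    "CLC Command Line Tools": {"3.5.3", "3.5.2", "3.5.1", "3.5", "3.0.1", "3.0"},
}


def client_status(product: str, version: str) -> str:
    """Classify a client version relative to QIAGEN CLC Genomics Server 8.5.4."""
    if CORRESPONDING.get(product) == version:
        return "corresponding"
    if version in COMPATIBLE.get(product, set()):
        # Can connect, but tools that changed between versions cannot be launched.
        return "compatible"
    return "not listed"


print(client_status("QIAGEN CLC Genomics Workbench", "9.5.4"))
print(client_status("CLC Command Line Tools", "3.0.1"))
print(client_status("Biomedical Genomics Workbench", "2.1"))
```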
Advance notice
- The Probabilistic Variant Detection (legacy) and Quality-based Variant Detection (legacy) tools will be removed from the Server and Workbenches in March, 2017.
- The Expression Profiling by Tags tools (Extract and Count Tags, Create Virtual Tag List, and Annotate Tag Experiment) are scheduled to be removed from the Server and Workbench in March, 2017.
- Support for some older operating systems (OS), listed below, will be discontinued in March, 2017. Software released at that time and later may still run without issue, but problems experienced due to using an unsupported OS will not be addressed. If you are concerned about the proposed change, please contact our Support team (AdvancedGenomicsSupport@qiagen.com), letting them know the OS being used and the products you are running on that OS.
- Windows: Windows Vista and Windows Server 2008
- Mac: Mac OS X 10.7 and 10.8
- Linux: Red Hat Enterprise Linux 5, SUSE Linux Enterprise Server 10 and 11 and Fedora 6 through 21
QIAGEN CLC Server Command Line Tools
This is a compatibility release to supply the corresponding client for QIAGEN CLC Genomics Server 8.5.4.
Compatibility
CLC Command Line Tools 3.5.4 is the corresponding client for QIAGEN CLC Genomics Server 8.5.4.
CLC Command Line Tools 3.5.4 can also act as a client for the QIAGEN CLC Genomics Server 8.5.3, 8.5.2, 8.5.1, 8.5, 8.0.1 and 8.0. However, we recommend running the corresponding version of the CLC Command Line Tools for the QIAGEN CLC Genomics Server.
QIAGEN CLC Genomics Server 8.5.3
Bugfixes and Improvements
For the Basic Variant Detection, Fixed Ploidy Variant Detection and Low Frequency Variant Detection tools, the following have been addressed:
- Fixed an issue where the coverage of a longer variant that contained another variant was reported for both the longer variant and the contained variant. The coverage for the contained variant is now reported correctly.
- Fixed an issue affecting coverage calculation for SNVs without immediately adjacent variants when using paired read data: if the second read of a pair containing the variant did not meet the requirements of the quality filter, neither the first nor second read of that pair contributed to the coverage calculated for the variant.
- Fixed an issue where, for an SNV without immediately adjacent variants, overlapping reads of a pair that had conflicting base calls for that variant position contributed to the values calculated for coverage, read coverage, and read count of that variant.
- Fixed a bug where count, read count, and forward- and reverse read count could be incorrect for variants found in overlapping regions of a pair of reads and where the variant was originally identified as being adjacent to one or more other variants.
The above issues, including information on the products affected, are described on the public notification page: Coverage and count reporting for variants in certain circumstances are incorrect
For the Identify Known Mutations from Sample Mappings tool, the following issues have been addressed:
- Fixed an issue with the Identify Known Mutations from Sample Mappings tool where reads in a sample mapping were not identified as supporting the presence of a known variant in cases where the first position of the variant region in the mapped read contained a gap.
- Fixed an issue with the Identify Known Mutations from Sample Mappings tool where a read containing a variant longer than a known variant being tested for was counted as supporting the known variant in cases where the first part of the read’s variant sequence is identical to that of the known variant.
- Fixed an issue in the Identify Known Mutations from Sample Mappings tool where overlapping reads of a pair having conflicting base calls for a variant position could contribute to the coverage calculated for that variant.
Compatibility
- QIAGEN CLC Genomics Workbench 9.5.3 is the corresponding client for QIAGEN CLC Genomics Server 8.5.3.
- Biomedical Genomics Workbench 3.5.3 is the corresponding client for QIAGEN CLC Genomics Server 8.5.3.
- CLC Command Line Tools 3.5.3 is the corresponding client for QIAGEN CLC Genomics Server 8.5.3.
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 9.5.2, 9.5.1, 9.5, 9.0.1 and 9.0, Biomedical Genomics Workbench 3.5.2, 3.5.1, 3.5, 3.0.1 and 3.0, and CLC Command Line Tools 3.5.2, 3.5.1, 3.5, 3.0.1 and 3.0 can connect to QIAGEN CLC Genomics Server 8.5.3. In addition, QIAGEN CLC Genomics Workbench 9.5.x and Biomedical Genomics Workbench 3.5.x and CLC Command Line Tools 3.5.x can connect to a QIAGEN CLC Genomics Server 8.0.x. Tools that have changed between versions cannot be launched when using compatible, but not corresponding, client-server combinations.
Advance notice
- The Probabilistic Variant Detection (legacy) and Quality-based Variant Detection (legacy) tools will be removed from the Server and Workbenches in early 2017.
- The Expression Profiling by Tags tools (Extract and Count Tags, Create Virtual Tag List, and Annotate Tag Experiment) are scheduled to be removed from the Server and Workbench in spring, 2017.
- Support for some older operating systems (OS), listed below, will be discontinued in early 2017. Software released at that time and later may still run without issue, but problems experienced due to using an unsupported OS will not be addressed. If you are concerned about the proposed change, please contact our Support team (AdvancedGenomicsSupport@qiagen.com), letting them know the OS being used and the products you are running on that OS.
- Windows: Windows Vista and Windows Server 2008
- Mac: Mac OS X 10.7 and 10.8
- Linux: Red Hat Enterprise Linux 5, SUSE Linux Enterprise Server 10 and 11 and Fedora 6 through 21
QIAGEN CLC Server Command Line Tools
This is a compatibility release to supply the corresponding client for QIAGEN CLC Genomics Server 8.5.3.
Compatibility
CLC Command Line Tools 3.5.3 is the corresponding client for QIAGEN CLC Genomics Server 8.5.3.
CLC Command Line Tools 3.5.3 can also act as a client for the QIAGEN CLC Genomics Server 8.5.2, 8.5.1, 8.5, 8.0.1 and 8.0. However, we recommend running the corresponding version of the CLC Command Line Tools for the QIAGEN CLC Genomics Server.
QIAGEN CLC Genomics Server 8.5.2
Improvements
- SRA download functionality has been updated to support the upcoming NCBI transition to HTTPS.
Bug fixes
- Fixed an issue with running BLAST on macOS Sierra.
- Updated PFAM links reported by the Pfam Domain Search tool.
Compatibility
- QIAGEN CLC Genomics Workbench 9.5.2 is the corresponding client for QIAGEN CLC Genomics Server 8.5.2.
- Biomedical Genomics Workbench 3.5.2 is the corresponding client for QIAGEN CLC Genomics Server 8.5.2.
- CLC Command Line Tools 3.5.2 is the corresponding client for QIAGEN CLC Genomics Server 8.5.2.
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 9.5.1, 9.5, 9.0.1 and 9.0, Biomedical Genomics Workbench 3.5.1, 3.5, 3.0.1 and 3.0, and CLC Command Line Tools 3.5.1, 3.5, 3.0.1 and 3.0 can connect to QIAGEN CLC Genomics Server 8.5.2. In addition, QIAGEN CLC Genomics Workbench 9.5.x and Biomedical Genomics Workbench 3.5.x and CLC Command Line Tools 3.5.x can connect to a QIAGEN CLC Genomics Server 8.0.x. Tools that have changed between versions cannot be launched when using compatible, but not corresponding, client-server combinations.
Advance Notice
- The Probabilistic Variant Detection (legacy) and Quality-based Variant Detection (legacy) tools will be removed from the Server and Workbenches in early 2017.
- The Expression Profiling by Tags tools (Extract and Count Tags, Create Virtual Tag List, and Annotate Tag Experiment) are scheduled to be removed from the Server and Workbench in spring, 2017.
- Support for some older operating systems (OS), listed below, will be discontinued in early 2017. Software released at that time and later may still run without issue, but problems experienced due to using an unsupported OS will not be addressed. If you are concerned about the proposed change, please contact our Support team (AdvancedGenomicsSupport@qiagen.com), letting them know the OS being used and the products you are running on that OS.
- Windows: Windows Vista and Windows Server 2008
- Mac: Mac OS X 10.7 and 10.8
- Linux: Red Hat Enterprise Linux 5, SUSE Linux Enterprise Server 10 and 11 and Fedora 6 through 21
CLC Server Command Line Tools
This is a compatibility release to supply the corresponding client for QIAGEN CLC Genomics Server 8.5.2.
Compatibility
CLC Command Line Tools 3.5.2 is the corresponding client for QIAGEN CLC Genomics Server 8.5.2.
CLC Command Line Tools 3.5.2 can also act as a client for the QIAGEN CLC Genomics Server 8.5.1, 8.5, 8.0.1 and 8.0. However, we recommend running the corresponding version of the CLC Command Line Tools for the QIAGEN CLC Genomics Server.
QIAGEN CLC Genomics Server 8.5.1
Genomics Server
Bug fixes
- Fixed a serious issue that could arise when using Import | Illumina or Import | Ion Torrent to import gzip or bzip2 compressed files using QIAGEN CLC Genomics Server 8.5.
- Fixed a problem where launching a tool from the Quick Launch window after sorting led to the wrong tool being started.
Compatibility
- QIAGEN CLC Genomics Workbench 9.5.1 is the corresponding client for QIAGEN CLC Genomics Server 8.5.1.
- Biomedical Genomics Workbench 3.5.1 is the corresponding client for QIAGEN CLC Genomics Server 8.5.1.
- CLC Command Line Tools 3.5.1 is the corresponding client for QIAGEN CLC Genomics Server 8.5.1.
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 9.5, 9.0 and 9.0.1, Biomedical Genomics Workbench 3.5, 3.0 and 3.0.1, and CLC Command Line Tools 3.5, 3.0 and 3.0.1 can connect to QIAGEN CLC Genomics Server 8.5.1. In addition, QIAGEN CLC Genomics Workbench 9.5.1, Biomedical Genomics Workbench 3.5.1 and CLC Command Line Tools 3.5.1 can connect to a QIAGEN CLC Genomics Server 8.0.x. Tools that have changed between versions cannot be launched when using compatible, but not corresponding, client-server combinations.
CLC Server Command Line Tools
This is a compatibility release to supply the corresponding client for the QIAGEN CLC Genomics Server 8.5.1.
Compatibility
CLC Command Line Tools 3.5.1 is the corresponding client for QIAGEN CLC Genomics Server 8.5.1.
CLC Command Line Tools 3.5.1 can act as a client for the QIAGEN CLC Genomics Server 8.5, 8.0 and 8.0.1. However, we recommend running the corresponding version of the CLC Command Line Tools for the QIAGEN CLC Genomics Server.
QIAGEN CLC Genomics Server 8.5
Server specific
Improvements
General
- Support for special characters in AD passwords has been added.
- Added workflow grid job dependency support for Univa Grid Engine.
- When unlocked parameters of a workflow installed on a server are edited, the name of the person to last change that workflow and the date and time of the change are now presented above the design in the thin client.
- To ensure that correct server information is available, grid jobs can no longer be launched on setups where the host name and port are not set in the "Job distribution" area of the thin client on the master server. A check for these settings has been added to the "check setup" functionality of the server.
External Applications
- When changes are made to an External Applications configuration, the name of the person who made that change and the date and time the change was made are available in a tooltip when the cursor is placed over the name of that External Application in the thin client.
Bug fixes
- Fixed an issue where the server web administration interface failed to accept certain passwords containing non-ASCII characters.
Shared with Workbenches
New tools and features
- "Search for Reads in SRA ..." allows search and download of reads from the SRA database.
- "Identify Known Mutations from Sample Mappings" can be used to look up known genomic variants in read mappings.
- "Identify Candidate Variants" can be used to identify and extract variants that fulfill certain criteria.
- A new GFF3 importer is available as an option in the Import -> Tracks tool.
- A new option "Use EM estimation (recommended)" was added to the RNA-Seq Analysis tool. This enables the use of an expectation-maximization algorithm to distribute ambiguous reads between isoform/genes.
- A new option in the Sample Reads tool makes it possible to choose whether sampling should be deterministic or random.
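As background for the "Use EM estimation (recommended)" option listed above, the following is a minimal, generic sketch of how an expectation-maximization scheme can distribute ambiguous reads between transcripts. It is a textbook-style illustration only: it ignores transcript length and other factors, the function name and inputs are hypothetical, and it is not the algorithm implemented in the RNA-Seq Analysis tool.

```python
def em_abundances(read_compatibility, n_transcripts, n_iter=50):
    """Estimate relative transcript abundances from reads with ambiguous assignments.

    read_compatibility: one set of transcript indices per read, listing the
    transcripts that read could have come from.
    """
    # Start from equal abundances.
    theta = [1.0 / n_transcripts] * n_transcripts
    for _ in range(n_iter):
        counts = [0.0] * n_transcripts
        # E-step: split each read between its compatible transcripts in
        # proportion to the current abundance estimates.
        for compatible in read_compatibility:
            total = sum(theta[t] for t in compatible)
            for t in compatible:
                counts[t] += theta[t] / total
        # M-step: abundances become the normalized fractional counts.
        n_reads = sum(counts)
        theta = [c / n_reads for c in counts]
    return theta


# Three reads unique to transcript 0, one unique to transcript 1, two ambiguous.
reads = [{0}, {0}, {0}, {1}, {0, 1}, {0, 1}]
print(em_abundances(reads, n_transcripts=2))
```

In this toy example the ambiguous reads end up shared in roughly the same 3:1 ratio as the uniquely mapped reads, rather than being split evenly.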
Improvements
General
- The Local Realignment tool has a new option that can allow the use of guidance variants longer than 100bp.
- The InDels and Structural Variants tool now offers the option to include reads mapped as broken pairs in the analysis.
- The InDels and Structural Variants tool now offers the option for consensus calculation to ignore reads if their relative coverage or quality scores are too low.
- The Identify Graph Thresholds tool can now be run using only a lower or upper threshold limit, rather than having to specify both.
- The Identify Graph Thresholds tool can now be configured to work on specified regions only.
- The Trim Sequences tool now handles ambiguity codes in the adapter/primer sequences.
- All NCBI server communication is now encrypted. (NCBI will be moving all web services to the HTTPS protocol on September 30, 2016).
- Standard deviations in reports are now being calculated with a different algorithm than previously. This will have no noticeable effect in the overwhelming majority of cases.
- Improved performance of a number of tools when run on systems with multiple cores: Annotate from Known Variants, Filter against Known Variants, Filter against Control Reads, Annotate with Exon Number, Annotate with Flanking Sequences, Filter Marginal Variant Calls, Compare Sample Variant Tracks, Trio Analysis, GO enrichment Analysis, Amino Acid Changes, Annotate with Conservation Score, Predict Splice Site Effect, Link Variants to 3D Protein Structure, Merge Annotation Tracks, Create Statistics for Target Regions, Fisher Exact Test, Annotate with Overlap Information, Filter Based on Overlap, Identify Candidate Variants, and Filter Reference Variants.
- It is now possible to export expression tracks in BED format. The expression value will be exported as the score.
- The COSMIC importer has been updated to support the latest version of the COSMIC database, release v77.
- The tools "Filter Based On Overlap" and "Annotate with Overlap Information" now work with the Statistical Comparison Tracks produced by the Advanced RNA-Seq plugin.
- When importing metadata from a spreadsheet with formulas in it, the result of the evaluation of the formula (as displayed in Excel) is now imported rather than the formula itself.
- GenBank import now also allows for file names with 'GBFF' extension.
- Improved the progress reporting for the import of large, gzip compressed Illumina and Ion-Torrent files.
- The Extract Consensus Sequence tool now outputs a sequence list for all results. Previously, when running this tool directly, if the result was a single sequence, it would output a sequence, not a sequence list. (Nothing has changed when this tool is run as part of a workflow, where sequence lists were always generated.)
- General speed and usability improvements.
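To make the BED export item above concrete, here is a sketch of what a BED line carrying an expression value in the score column could look like. The helper name, the input coordinate convention (1-based, inclusive) and the number formatting are assumptions for illustration; only the column order follows the standard BED layout (chrom, start, end, name, score, strand).

```python
def bed_line(chrom, start_1based, end_inclusive, name, expression, strand="."):
    """Format one feature as a tab-separated, six-column BED line.

    BED uses 0-based, half-open coordinates, so the assumed 1-based inclusive
    start is shifted by one and the end is kept as is.
    """
    fields = [
        chrom,
        str(start_1based - 1),
        str(end_inclusive),
        name,
        f"{expression:g}",  # expression value written into the score column
        strand,
    ]
    return "\t".join(fields)


print(bed_line("chr1", 100, 200, "geneA", 12.5, "+"))
```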
Biomedical-enabled Servers only
- A RefSeq reference data set is now available in the reference data manager.
- Improved performance of a number of tools when run on systems with multiple cores: Add Information from Variant Databases, Remove Variants Found in External Database, Remove Germline Variants, Add Exon Number, Add Flanking Region, Remove False Positives, Identify highly Mutated Gene Groups and Pathways, Add Information About Amino Acid Changes, Add Conservation Scores, Identify Variants with Effect on Splicing, Link Variants to 3D Protein Structure, QC for Targeted Sequencing, Identify Enriched Variants in Case vs Control Samples, Add Information from Overlapping Genes, Add information from Genomic Regions, Add Information from Overlapping Variants, Remove Variants Outside Genome Regions, Remove Variants Outside Target Regions, Remove Variants Inside Genome Regions, Identify Mutated Genes, and Remove Reference Variants.
Bug fixes
- Fixed an issue with the tools "Extract from Selection" and "Extract Reads Based on Overlap" so that they now correctly extract mapped reads that extend over the (arbitrarily chosen) ends of the 1D representation of a circular genome.
- Fixed an issue where the Motif Search tool was incorrectly reporting all match accuracies as either 0% or 100%.
- Fixed a bug where exporting to Wiggle on systems with specific system locales would produce files that could not be re-imported.
- Fixed an issue that caused characters in sequence names to be rendered incorrectly when a report was exported to Excel.
- The Find Binding Sites and Create Fragments tools now properly display mismatches when the primer input is in lower-case.
- Fixed a memory leak in the Extract Consensus Sequence tool.
- Fixed an issue where sequences of length zero would cause the Create BLAST Database tool to throw an error. Such sequences are now skipped and will not be included in the final database.
- The Illumina High-Throughput Sequencing Import tool now correctly warns that zip files with multiple entries are not supported.
- Various minor bugfixes.
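The circular genome fix in the list above concerns reads that run past the end of the linear representation and continue at its start. A minimal sketch of an overlap test that handles this case is shown below; the 0-based half-open coordinates and the function names are assumptions for illustration and do not reflect the tools' internals.

```python
def linear_segments(start, length, genome_length):
    """Split a read on a circular genome into segments on its linear representation.

    Assumes 0 <= start < genome_length and length <= genome_length.
    Segments are 0-based, half-open (start, end) pairs.
    """
    end = start + length
    if end <= genome_length:
        return [(start, end)]
    # The read wraps around the arbitrarily chosen origin.
    return [(start, genome_length), (0, end - genome_length)]


def read_overlaps_region(start, length, region_start, region_end, genome_length):
    """Return True if any linear segment of the read overlaps the region."""
    return any(
        seg_start < region_end and region_start < seg_end
        for seg_start, seg_end in linear_segments(start, length, genome_length)
    )


# A 10 bp read starting at position 95 of a 100 bp circular genome wraps to 0..5.
print(linear_segments(95, 10, 100))
print(read_overlaps_region(95, 10, 0, 3, 100))
print(read_overlaps_region(95, 10, 10, 20, 100))
```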
Plugin updates and retirements
All QIAGEN CLC Genomics Servers
- An update to the Advanced RNA-Seq Server Plugin is available.
- An update to the CLC Microbial Genomics Server Extension is available.
- Annotate with GFF File Server Plugin: Now supports spaces in annotation names.
- The RNA-Seq Legacy plugin has been retired.
Compatibility
- QIAGEN CLC Genomics Workbench 9.5 is the corresponding client for the QIAGEN CLC Genomics Server 8.5.
- Biomedical Genomics Workbench 3.5 is the corresponding client for the QIAGEN CLC Genomics Server 8.5.
- CLC Command Line Tools 3.5 is the corresponding client for the QIAGEN CLC Genomics Server 8.5.
We recommend running the corresponding versions of clients for the QIAGEN CLC Genomics Server. However, QIAGEN CLC Genomics Workbench 9.0 and 9.0.1, Biomedical Genomics Workbench 3.0 and 3.0.1, and CLC Command Line Tools 3.0 and 3.0.1 can connect to the QIAGEN CLC Genomics Server 8.5. In addition, the QIAGEN CLC Genomics Workbench 9.5, Biomedical Genomics Workbench 3.5 and CLC Command Line Tools 3.5 can connect to a QIAGEN CLC Genomics Server 8.0.x. Tools that have changed between versions cannot be launched when using compatible, but not corresponding, client-server combinations.
Notice
From now on, only 64 bit versions of the QIAGEN CLC Genomics Server, QIAGEN CLC Genomics Workbench, Biomedical Genomics Workbench, CLC Bioinformatics Database and QIAGEN CLC Assembly Cell will be made available. 32 bit versions of these are discontinued.
Advance Notice
- The Probabilistic Variant Detection (legacy) and Quality-based Variant Detection (legacy) tools are scheduled to be removed from the Server and QIAGEN CLC Genomics Workbench in spring, 2017.
- The Expression Profiling by Tags tools (Extract and Count Tags, Create Virtual Tag List, and Annotate Tag Experiment) are scheduled to be removed from the Server and QIAGEN CLC Genomics Workbench in spring, 2017.
CLC Server Command Line Tools
New tools
Analysis related tools
- download_sra SRA Download
- identify_candidate_variants_new Identify Candidate Variants (Existing tool for Bx-enabled servers now enabled on all servers)
- mutation_tester_tool Identify Known Mutations from Sample Mappings (Existing tool for Bx-enabled servers now enabled on all servers)
Tools for server maintenance
- disable_maintenance_mode Disable server maintenance mode
- enable_maintenance_mode Enable server maintenance mode
- install_plugin_and_restart Install plugins and restart the server
- install_workflow Install a workflow
- list_plugins Lists installed plugins
- list_workflows Lists installed workflows
- restart_server Restart server and any attached job nodes
- shutdown_server Shut down the server
- uninstall_plugin_and_restart Uninstall plugin(s) and restart the server
- uninstall_workflow Uninstall workflow
Improvements
- Workflow outputs can now be configured so that subfolders to contain the outputs are created.
- New placeholders are available when defining the names of exporter outputs: {user}, {host}, and for elements of the timestamp of the output object, {year}, {month}, {day}, {hour}, {minute}, {second}.
- Placeholders within export output names that were previously available only as digits can now be specified using written names: {input} is a synonym for {1}, {extension} is a synonym for {2} and {counter} is a synonym for {3}.
- Added support for running workflows that exposed custom parameter names containing underscores.
- When using the {2} placeholder for custom naming in workflow output elements, only unlocked inputs will be included in the generated name.
- Improved validation of the VCF exporter such that more meaningful error messages are presented.
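The exporter naming placeholders described in the list above can be illustrated with a small expansion routine. This is a sketch only: the function name, the zero-padded timestamp formatting and the handling of unknown placeholders are assumptions, and the code is not the exporter's actual implementation. The synonym mapping ({input} for {1}, {extension} for {2}, {counter} for {3}) is taken from the notes above.

```python
import re
from datetime import datetime

# Written names that are synonyms for the numbered placeholders.
SYNONYMS = {"input": "1", "extension": "2", "counter": "3"}


def expand_export_name(pattern, numbered, user, host, timestamp):
    """Expand placeholders such as {input}, {2}, {user} or {year} in an export name.

    numbered: dict with the values for the "1", "2" and "3" placeholders.
    Unknown placeholders are left unchanged (an assumption).
    """
    fields = dict(numbered)
    for name, digit in SYNONYMS.items():
        fields[name] = numbered[digit]
    fields.update(
        user=user,
        host=host,
        year=f"{timestamp.year:04d}",
        month=f"{timestamp.month:02d}",
        day=f"{timestamp.day:02d}",
        hour=f"{timestamp.hour:02d}",
        minute=f"{timestamp.minute:02d}",
        second=f"{timestamp.second:02d}",
    )
    return re.sub(r"\{(\w+)\}", lambda m: fields.get(m.group(1), m.group(0)), pattern)


values = {"1": "sample", "2": "vcf", "3": "1"}
stamp = datetime(2016, 9, 30, 14, 5, 9)
print(expand_export_name("{input}-{user}-{year}{month}{day}.{extension}", values, "alice", "node1", stamp))
```

With these inputs, a pattern written with {1} and {2} expands to the same name as one written with {input} and {extension}.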
Tools with added parameters
- graph_threshold: added options --masking-mode, --region-track, --use-lower-threshold and --use-upper-threshold
- local_realignment: added option --max-extrinsic-variant-length
- rna_seq: added option --em-enabled
- sample_reads: added option --sample-type
- structural_variant_detection: added options --broken-pairs, --min-consensus-coverage and --min-qscore
Bugfixes
- Fixed an issue so that the download_pfam_database tool can now be run using the CLC Command Line Tools. Associated with this is the removal of the -i, --input parameter for this tool.
Compatibility
CLC Command Line Tools 3.5 is the corresponding client for QIAGEN CLC Genomics Server 8.5.
CLC Command Line Tools 3.5 can act as a client for the QIAGEN CLC Genomics Server 8.0 and 8.0.1. However, we recommend running the corresponding version of the CLC Command Line Tools for the QIAGEN CLC Genomics Server.
Notice
From now on, only 64 bit versions of the QIAGEN CLC Genomics Server, QIAGEN CLC Genomics Workbench, Biomedical Genomics Workbench, CLC Bioinformatics Database and QIAGEN CLC Assembly Cell will be made available. 32 bit versions of these are discontinued.
QIAGEN CLC Genomics Server 8.0.1
Improvements
- When running a workflow on a QIAGEN CLC Genomics Server with grid nodes and using the classic job queuing option, the name of individual subjobs contains the name of the workflow element being run. Previously, "Grid Executer" was the name reported for each subjob.
Bug fixes
- Fixed an issue with the RNA-Seq Analysis tool that could arise when the "Genomes annotated with genes and transcripts" option was chosen: If two or more genes had the same name, and a transcript could be assigned to each from the mRNA track, then the value in the "Transcripts annotated" column in the GE track and in the TE track was 0. Furthermore, all counts for such genes were reported as zero, even when there were reads mapping to them.
- Fixed an issue that prevented workflows containing an input modifying element but no outgoing connection from being run on the server.
- Fixed an issue where the Motif Search tool incorrectly reported all match accuracies as either 0% or 100%.
Compatibility
- QIAGEN CLC Genomics Workbench: QIAGEN CLC Genomics Workbench 9.0.1 and 9.0 can connect to the QIAGEN CLC Genomics Server 8.0.1. We generally recommend running the corresponding version of the Workbench for the CLC Server. Here, QIAGEN CLC Genomics Workbench 9.0.1 with QIAGEN CLC Genomics Server 8.0.1.
- Biomedical Genomics Workbench: Biomedical Genomics Workbench 3.0.1 and 3.0 can connect to a QIAGEN CLC Genomics Server 8.0.1 that has a Biomedical Genomics Server Extension license. We generally recommend running the corresponding version of the Workbench for the CLC Server. Here, Biomedical Genomics Workbench 3.0.1 with QIAGEN CLC Genomics Server 8.0.1.
- CLC Command Line Tools: CLC Command Line Tools 3.0.1 and 3.0 can act as clients for the QIAGEN CLC Genomics Server 8.0.1. We generally recommend running the corresponding version of the CLC Command Line Tools for the QIAGEN CLC Genomics Server. Here, CLC Command Line Tools 3.0.1 with QIAGEN CLC Genomics Server 8.0.1.
Advance notice
- From the autumn 2016 release, only 64 bit versions of the QIAGEN CLC Genomics Server, QIAGEN CLC Genomics Workbench, Biomedical Genomics Workbench, CLC Bioinformatics Database and QIAGEN CLC Assembly Cell will be made available. 32 bit versions of these will be discontinued from that time.
- The Probabilistic Variant Detection (legacy) and Quality-based Variant Detection (legacy) tools will be removed from the Server and Workbenches in early 2017.
CLC Server Command Line Tools 3.0.1
Compatibility release
CLC Command Line Tools 3.0.1 is the corresponding client for QIAGEN CLC Genomics Server 8.0.1.
CLC Command Line Tools 3.0.1 and 3.0 can act as clients for the QIAGEN CLC Genomics Server 8.0.1. However, we generally recommend running the corresponding version of the CLC Command Line Tools for the QIAGEN CLC Genomics Server.
QIAGEN CLC Genomics Server 8.0
Server specific
New features
- The Server can be configured such that all steps of a Workflow will be executed on the same node in a job node or grid node setup.
- The health and throughput of the CLC Server can now be monitored using any monitoring software that supports the JMX standard.
- Options are now available for how direct data transfer from client systems is handled, including the ability to disable transfers. This affects where temporary data is held when transferring client-side data into Server locations and enables administrative control of Server data import initiated from either Workbench or CLC Command Line Tools.
- Multiple post-processing steps can now be configured for External Applications, allowing the re-import or other post-processing of multiple outputs from third party tools.
- The Create Scatter Plot tool can be run on the QIAGEN CLC Genomics Server.
- The Create MA Plot tool can be run on the QIAGEN CLC Genomics Server.
Improvements
General
- Audit log presentation and searching has been improved. Improvements include the ability to search for failed tasks and for entries between specified time points.
- Host addresses and "Canonical host name" are now presented as suggestions for the master node host in the Server setup area of the web administrative interface.
- Reporting of system level problems has been added to the Status and Maintenance area of the web administrative interface.
- Workflows are now sorted alphabetically by name in the web administrative interface.
- It is now possible to choose which LDAP bind is used for selected LDAP operations.
- Improved the wording of the status messages on the server maintenance pages.
External Applications
- Post process algorithm parameters are made available for user configuration by unlocking them via the "Edit and map parameters" editor.
- Linking post process algorithm parameters to "End user parameters" is now done via the "Edit and map parameters" editor.
- "End user parameters for post processing only" are no longer necessary.
- When available, the name for inputs used by a post-processing tool is presented in the "Edit and map parameters" editor, rather than the earlier generic input name ("Input data (common for all algorithms)").
Changes
- The options formerly in the job queuing options group under the Job distribution area are now in an area called job running options.
- The parameter graph view of External Applications has been removed.
Bug fixes
- Fixed an issue observed on job node setups where consecutive tasks of a workflow executed on the same job node could result in a cached data object being used as input rather than fetching a new copy, which could lead to errors being reported.
- Fixed an issue with built-in authentication on job node setups where any new user added while a job node was down would not be persisted through a Genomics Server restart.
- Fixed an issue where if a job node addition failed, that same job node could not be added within the same Server session.
- Fixed an issue where files on the server could not be moved or copied if the plugin that created them was not installed on the server.
- Fixed an issue where a Workbench could attempt to retrieve data from a Server before the Server login process had completed.
- Fixed a bug where doing automatic association using a metadata table stored on a CLC Server would fail.
Shared with Workbenches
Improvements
RNA-seq related
- The RNA-Seq Analysis tool now computes Transcripts Per Million (TPM) values.
- Faster analysis of multiple samples in the RNA-Seq Analysis tool due to caching of reference index files.
- Performance improvements for Expression Tracks in RNA-Seq.
- Expression tracks now contain links to external databases when available.
- Transcript level expression tracks now contain the gene name for each transcript.
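As an illustration of the TPM values mentioned in the list above, here is a minimal sketch of the standard TPM definition: counts are first divided by transcript length, then scaled so that all values sum to one million. The function name `tpm` and the example numbers are hypothetical, and the release notes do not state exactly how the RNA-Seq Analysis tool computes lengths, so this should be read as the textbook formula rather than the tool's implementation.

```python
def tpm(counts, lengths_bp):
    """Convert raw read counts to Transcripts Per Million (textbook definition).

    counts and lengths_bp are parallel sequences with one entry per transcript.
    """
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    # Step 1: length-normalise each count (reads per base).
    rates = [c / l for c, l in zip(counts, lengths_bp)]
    total = sum(rates)
    if total == 0:
        return [0.0 for _ in rates]
    # Step 2: scale so that the values sum to one million.
    return [r / total * 1_000_000 for r in rates]


# Hypothetical example: three transcripts with made-up counts and lengths.
example = tpm([100, 200, 300], [1000, 2000, 1000])
```

Because of the final scaling step, TPM values within one sample always sum to one million, which makes relative abundances easier to compare across samples than raw counts.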
Mapping related
- Match score can now be specified in the Map Reads to Reference and Map Reads to Contigs tools.
- Map Reads to Reference now outputs an empty read mapping and report when nothing mapped, and empty unmapped reads if everything mapped.
- Fixed threads being leaked in Map Reads to Reference when caching of indexed reference sequences was used.
Track related
- The tool Create Mapping Graph can now create a coverage graph over the start positions of reads in a read mapping.
- Improved error messaging when trying to import malformed fasta files into tracks.
Metadata related
- The use of partial or exact matching schemes can be chosen when associating data with metadata using the Associate Data Automatically option.
General
- Fixed an issue with the VCF exporter resulting in inconsistent information being output to the exported VCF. The metadata field "##reference" now contains a human-readable string representation identifying the reference genome the exported variants are based on. The metadata field "##fileOrigin" was added to contain a human-readable string representation identifying the exported variant track.
- Performance optimization for sizing phylogenetic trees by metadata.
- The 3D Protein Structure Database has been updated.
- The Download Pfam Database tool has been updated to download version 29.
- Substantial speed improvements to BAM export.
- All Excel sheets in a document are now imported and each sheet has a table created for its contents.
- The CSV, HTML and Excel table/tabular exporter now use "Inf" and "NaN" values to replace the ambiguous "?".
- In the wizard for exporting a table in CSV format, when not exporting all columns, it is now possible to cancel or go back to the previous step while selected columns are loading.
- SAM records with CIGAR strings with no aligned residues can now be handled when importing SAM/BAM files.
- GFF Track Import now supports spaces in annotation names.
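To make the SAM/BAM import item in the list above more concrete, the following sketch shows what "a CIGAR string with no aligned residues" means in terms of the SAM specification: a record whose CIGAR contains no M, = or X operations. The helper name `aligned_length` is hypothetical, and this is not the importer's code, only an illustration of the condition.

```python
import re

# Operations that align a read base to a reference base (SAM specification).
ALIGNED_OPS = {"M", "=", "X"}
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def aligned_length(cigar):
    """Return the number of read bases aligned to the reference."""
    if cigar == "*":  # CIGAR unavailable
        return 0
    parts = CIGAR_RE.findall(cigar)
    # Reject strings containing characters the pattern did not consume.
    if "".join(n + op for n, op in parts) != cigar:
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    return sum(int(n) for n, op in parts if op in ALIGNED_OPS)
```

A record such as `10S5I10S` consists only of soft clips and an insertion, so `aligned_length` returns 0 for it, while `5S20M3D10M2S` has 30 aligned bases.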
Changes
Workflow related
- The Create Scatter Plot tool is now Workflow enabled.
- The Create MA Plot tool is now Workflow enabled.
General
- The naming rules for the outputs of several tools have been changed to align with those applied by most other tools. The tools affected by these changes are: Local Realignment, Low Frequency Variant Detection, Fixed Ploidy Variant Detection, Basic Variant Detection as well as the legacy variant detection tools: Probabilistic Variant Detection and Quality-based Variant Detection.
- The sign of the BaseQRankSum value for variants has changed. The value is now negative when the qualities for the variant are below those for the reference allele, and positive when the qualities for the variant are above those for the reference allele.
- Export to clc format now truncates very long filenames.
- Versions of individual tools are now reported in the history of output objects.
- For the NGS importers, the paired reads minimum and maximum default interval has been updated to 1 - 1000.
- Plots without any data points will now be skipped when rendering reports.
- The annotations "Known variation", "Validated by other experiment", "Ancestral allele", and "Phenotype related", created by variant track import are not used and have therefore been removed from variant tracks.
- The Detailed Mapping Report statistics table and the QC for Read Mapping statistics table now show previously missing values for regions with partial coverage. For fully covered regions these values cannot be calculated, and empty strings are replaced with coverage minimum, average and standard deviation. Numeric sorting is retained by inserting NaN values instead of empty strings, where calculations cannot be made.
- RPM package installers for Linux are no longer available.
- Associate Data Automatically accepts data elements (not folders) as input.
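The BaseQRankSum sign convention described in the list above can be illustrated with a generic two-sample rank-sum z-score. This is a sketch under stated assumptions: it uses the usual normal approximation with average ranks for ties and no tie correction of the variance, and the function name `rank_sum_z` is hypothetical. The release notes do not describe how the variant callers compute BaseQRankSum, so only the direction of the sign is meant to carry over.

```python
import math


def rank_sum_z(variant_quals, reference_quals):
    """Normal-approximation z-score for a two-sample rank-sum test.

    Negative when variant qualities tend to rank below reference qualities.
    Ties get average ranks; no tie correction is applied to the variance.
    """
    n1, n2 = len(variant_quals), len(reference_quals)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups need at least one value")
    # Label 0 marks variant-supporting bases, label 1 reference-supporting bases.
    pooled = sorted(
        [(q, 0) for q in variant_quals] + [(q, 1) for q in reference_quals]
    )
    rank_sum_variant = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        n_variant = sum(1 for k in range(i, j) if pooled[k][1] == 0)
        rank_sum_variant += avg_rank * n_variant
        i = j
    u = rank_sum_variant - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mean_u) / sd_u
```

With this convention, variant qualities of `[10, 12]` against reference qualities of `[30, 32, 35]` give a negative score, and swapping the two groups flips the sign.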
Bug fixes
- BED Export: when exporting block list entries (such as connected exons from mRNA tracks), positions were absolute, but are now relative to the 'chromStart' position.
- Fixed a frame offset bug that occurred when translating reverse complemented CDS regions into protein sequences.
- Fixed an off-by-one error for read start positions in the 'Find Broken Pair Mates...' output table.
- Fixed a bug that caused the Excel importer to use column names as cell values of the first row.
- Fixed an issue where an error was reported if the local realignment tool detected an insertion followed by a deletion in the original mapping. Such positions are now ignored.
- Fixed an issue where Workflows were not able to remove intermediate data from permission enabled locations unless the top folder was writable.
- Fixed a bug that led to the creation of an empty folder for each excluded batch unit.
- Added missing percentage signs for identities and gaps in Blast text exports.
- When the InDels and Structural Variants tool is added to the workflow the "P-value Threshold" parameter did not show up in the Select settings wizard step under "Significance of unaligned ends breakpoints". This has been fixed.
- Fixed an issue that could lead to an error when a job status description changed while a full description was being generated.
- Fixed an issue with handling dates when importing metadata from Excel format files using the Metadata Table Editor.
- The "Extract and Count" tool in Small RNA analysis now only accepts sequences and sequence lists. Previously, it incorrectly accepted standalone read mappings or small RNA samples as well.
- A bug was fixed where no BaseQRankSum was calculated for insertions of length 1.
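The BED Export fix in the list above concerns the block fields of the BED format, where block start positions are offsets from 'chromStart' rather than absolute positions. The sketch below shows the conversion for a set of exons given as absolute, 0-based half-open intervals. The function name `bed12_blocks` and the coordinates are hypothetical, and this is an illustration of the format rule, not the exporter's code.

```python
def bed12_blocks(exons):
    """Build BED block fields from absolute, 0-based half-open exon intervals.

    Returns (chromStart, chromEnd, blockCount, blockSizes, blockStarts).
    """
    if not exons:
        raise ValueError("at least one exon is required")
    exons = sorted(exons)
    chrom_start = exons[0][0]
    chrom_end = max(end for _, end in exons)
    sizes = [end - start for start, end in exons]
    # blockStarts are offsets from chromStart, not genomic positions.
    starts = [start - chrom_start for start, _ in exons]
    return (
        chrom_start,
        chrom_end,
        len(exons),
        ",".join(map(str, sizes)),
        ",".join(map(str, starts)),
    )


# Hypothetical transcript with three exons.
fields = bed12_blocks([(1000, 1100), (1500, 1650), (2000, 2200)])
```

For this example the block starts come out as `0,500,1000`, whereas writing absolute positions would have produced `1000,1500,2000`.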
Plugin updates and retirements
All QIAGEN CLC Genomics Servers
- A new RNA-Seq analysis plugin is now available: Advanced RNA-Seq
- Annotate with GFF: Now supports spaces in annotation names.
- The RNA-Seq Legacy plugin has been retired.
Biomedical-enabled Servers only
- Ingenuity Variant Analysis: Enforce diploid export has been exposed in the wizard and switched on as the default option. In the drop-down options for “Analysis pipeline name”, “Personal genome” has been renamed to “Single sample”. Fixed an issue where the IVA plugin failed when it was given a genome with a non-circular MT chromosome.
- QIAGEN GeneRead Panel Analysis Plugin: Adjusted a number of parameter settings in the workflow and moved the workflow to the subfolder "Somatic cancer (TAS)" under "Targeted Amplicon Sequencing".
Compatibility
- QIAGEN CLC Genomics Workbench 9.0. The QIAGEN CLC Genomics Workbench 9.0 connects to the QIAGEN CLC Genomics Server 8.0.
- Biomedical Genomics Workbench 3.0. The Biomedical Genomics Workbench 3.0 connects to the QIAGEN CLC Genomics Server 8.0 with a Biomedical Genomics Server Extension.
- CLC Command Line Tools. The CLC Command Line Tools 3.0 connects to the QIAGEN CLC Genomics Server 8.0.
Advanced notice
From the autumn 2016 release, only 64 bit versions of the QIAGEN CLC Genomics Server, QIAGEN CLC Genomics Workbench, Biomedical Genomics Workbench, CLC Bioinformatics Database and QIAGEN CLC Assembly Cell will be made available. 32 bit versions of these will be discontinued from that time.
CLC Server Command Line Tools
New features and improvements
New Tools
- ma_plot Create MA Plot
- xy_scatter_plot Create Scatter Plot
Tools that have changed parameters
All Genomics Servers
- associate_metadata Associate Metadata
Added option: --match-scheme
- contig_read_mapping
Added option: --match-cost
- mapping_graph_tracks Create Mapping Graph Tracks
Added option: --reads-start-coverage
- read_mapping Map Reads to Reference
Added option: --match-cost
Bug fixes - Commands replaced
These replacements apply to Biomedical Genomics Server Extension users only
- add_link_to_structure Please use link_to_structure to run the Link Variants to 3D Protein Structure tool.
- download_3d_structure_information_db Please use download_sequence_to_structure_db to run the Download 3D Protein Structure Database tool.
Commands removed
- chip_seq ChIP-Seq Analysis (legacy)
- identify_candidate_variants Identify Candidate Variants (legacy)
QIAGEN CLC Genomics Server 7.5.4
All changes in this release have also been fixed on the QIAGEN CLC Genomics Server 9.x and 8.5.x lines at time of writing.
Improvements
- All NCBI server communication is now encrypted (uses HTTPS).
- Updated BLAST executables to be compatible with macOS Sierra. This change only affects Mac users.
Bug fixes
- For the Basic Variant Detection, Low Frequency Variant Detection and Fixed Ploidy Variant Detection tools:
- Fixed an issue where the count and read count could be reported as marginally higher than they actually were in a small minority of cases. For the affected variants, this could then also result in variant frequencies being reported that were slightly higher than they should have been, in some cases above 100%. Variants affected by this issue are a small subset of variants where the affected variant overlapped another potential variant and where only the affected variant was then reported. This change could lead to a small decrease in the number of variants reported compared to earlier versions of the CLC software, due to a variant no longer passing the count or read count filtering constraints. The impact of this change is expected to be low. For example, in our tests, for a particular analysis that reported 250,000 variants, 30 fewer were reported with the same parameters and filters applied after this fix was implemented.
- Fixed an issue where the coverage of a longer variant that contained another variant was reported for both the longer variant and the contained variant. The coverage for the contained variant is now reported correctly.
- Fixed a bug where count, read count, and forward- and reverse read count could be incorrect for variants found in overlapping regions of a pair of reads and where the variant was originally identified as being adjacent to one or more other variants.
- Fixed an issue affecting coverage calculation for SNVs without immediately adjacent variants when using paired read data: if the second read of a pair containing the variant did not meet the requirements of the quality filter, neither the first nor second read of that pair contributed to the coverage calculated for the variant.
- Fixed an issue where for a SNV without immediate neighboring variants, overlapping reads of a pair that had conflicting base calls for that variant position contributed to the values calculated for coverage, read coverage, and read count of that variant.
- Fixed an issue where the forward and/or reverse count for a longer variant, supported by paired reads with both children having the same direction, could be too low. The forward count and reverse count are now reported correctly.
- Fixed an issue with the InDels and Structural Variants tool where an incorrect insertion could be called when the optimal alignment of a read's unaligned end around the breakpoint included a gap in the insertion sequence.
- For the Identify Known Mutations from Sample Mappings tool:
- Fixed an issue where reads in a sample mapping were not identified as supporting the presence of a known variant in cases where the first position of the variant region in the mapped read contained a gap.
- Fixed an issue where a read containing a variant longer than a known variant being tested for was counted as supporting the known variant in cases where the first part of the read’s variant sequence is identical to that of the known variant.
- Fixed an issue where overlapping reads of a pair having conflicting base calls for a variant position could contribute to the coverage calculated for that variant.
Compatibility
We recommend running the corresponding versions of clients for QIAGEN CLC Genomics Server. The following are the corresponding client software for the QIAGEN CLC Genomics Server 7.5.4:
- QIAGEN CLC Genomics Workbench 8.5.4
- Biomedical Genomics Workbench 2.5.4
- CLC Command Line Tools 2.5.4
The following client software and versions are compatible with the QIAGEN CLC Genomics Server 7.5.4:
- QIAGEN CLC Genomics Workbench 8.5.3, 8.5.2, 8.5.1, 8.5, 8.0.3, 8.0.2, 8.0.1 and 8.0
- Biomedical Genomics Workbench 2.5.3, 2.5.2, 2.5.1, 2.5, 2.1.2, 2.1.1 and 2.1
- CLC Command Line Tools 2.5.3, 2.5.2, 2.5.1, 2.5, 2.0.3, 2.0.2, 2.0.1 and 2.0
Compatible client versions that are not the corresponding version can connect to QIAGEN CLC Genomics Server 7.5.4. Jobs can be launched on the server from such client software, with the exception of jobs where the tool being run has changed between the client software version and the server software version.
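The distinction between "corresponding" and "compatible" client versions for QIAGEN CLC Genomics Server 7.5.4 can be expressed as a simple lookup. The version lists below are copied from the compatibility lists above; the function name `client_status` and the returned labels are hypothetical, and the sketch is only a convenience for checking a client against those lists, not part of any CLC software.

```python
# Versions copied from the Server 7.5.4 compatibility lists above.
CORRESPONDING = {
    "QIAGEN CLC Genomics Workbench": "8.5.4",
    "Biomedical Genomics Workbench": "2.5.4",
    "CLC Command Line Tools": "2.5.4",
}
COMPATIBLE = {
    "QIAGEN CLC Genomics Workbench": {
        "8.5.3", "8.5.2", "8.5.1", "8.5", "8.0.3", "8.0.2", "8.0.1", "8.0",
    },
    "Biomedical Genomics Workbench": {
        "2.5.3", "2.5.2", "2.5.1", "2.5", "2.1.2", "2.1.1", "2.1",
    },
    "CLC Command Line Tools": {
        "2.5.3", "2.5.2", "2.5.1", "2.5", "2.0.3", "2.0.2", "2.0.1", "2.0",
    },
}


def client_status(product, version):
    """Classify a client version against QIAGEN CLC Genomics Server 7.5.4."""
    if product not in CORRESPONDING:
        raise KeyError(f"unknown client product: {product}")
    if version == CORRESPONDING[product]:
        return "corresponding"
    if version in COMPATIBLE[product]:
        return "compatible"
    return "not listed"
```

A "compatible" result still carries the caveat stated above: jobs for tools that changed between the client and server versions cannot be launched from that client.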
QIAGEN CLC Server Command Line Tools
Changes
- Fixed an issue so that download_pfam_database can now be run using the CLC Command Line Tools. This issue was also fixed in the QIAGEN CLC Server Command Line Tools 3.x line and above.
QIAGEN CLC Genomics Server 7.5.3
Improvements
- When running a workflow on a QIAGEN CLC Genomics Server with grid nodes and using the classic job queuing option, the name of individual subjobs contains the name of the workflow element being run. Previously, "Grid Executer" was the name reported for each subjob.
Bug fixes
- Fixed an issue with the RNA-Seq Analysis tool that could arise when the "Genomes annotated with genes and transcripts" option was chosen: If two or more genes had the same name, and a transcript could be assigned to each from the mRNA track, then the value in the "Transcripts annotated" column in the GE track and in the TE track was 0. Furthermore, all counts for such genes were reported as zero, even when there were reads mapping to them.
- Fixed an issue where the Motif Search tool incorrectly reported all match accuracies as either 0% or 100%.
- Fixed an issue that prevented workflows containing an input modifying element but no outgoing connection from being run on the Server.
Compatibility
- It is possible to use the QIAGEN CLC Genomics Workbench 8.5.3, 8.5.2, 8.5.1, 8.5, 8.0.3, 8.0.2, 8.0.1 and 8.0 to connect to the QIAGEN CLC Genomics Server 7.5.3. We generally recommend running the corresponding version of the Workbench for the CLC Server. Here, that would be a QIAGEN CLC Genomics Workbench 8.5.3 connecting to the QIAGEN CLC Genomics Server 7.5.3.
- It is possible to use the Biomedical Genomics Workbench 2.5.3, 2.5.2, 2.5.1, 2.5, 2.1.2, 2.1.1 and 2.1 to connect to the QIAGEN CLC Genomics Server 7.5.3. We generally recommend running the corresponding version of the Workbench for the CLC Server. Here, that would be a Biomedical Genomics Workbench 2.5.3 connecting to QIAGEN CLC Genomics Server 7.5.3.
Advanced notice
- From the autumn 2016 release, only 64 bit versions of the QIAGEN CLC Genomics Server, QIAGEN CLC Genomics Workbench, Biomedical Genomics Workbench, CLC Bioinformatics Database and QIAGEN CLC Assembly Cell will be made available. 32 bit versions of these will be discontinued from that time.
- The Probabilistic Variant Detection (legacy) and Quality-based Variant Detection (legacy) tools will be removed from the Server and Workbenches in early 2017.
CLC Server Command Line Tools 2.5.3
This version of the CLC Server Command Line Tools is the corresponding client version for the QIAGEN CLC Genomics Server 7.5.3.
This version of the CLC Server Command Line Tools can be used to connect to the QIAGEN CLC Genomics Server 7.5.3, 7.5.2, 7.5.1, 7.5, 7.0.3, 7.0.2, 7.0.1 and 7.0. However, we generally recommend running the corresponding client for the server version.
QIAGEN CLC Genomics Server 7.5.2
Retirements
- The Biobase Genome Trax Download Server Extension has been retired because it is dependent on the Biobase GenomeTrax product, which will stop operating in 2016.
Bug fixes
- The Download Pfam Database tool has been updated to download version 29.
- Fixed a frame offset bug that occurred when translating reverse complemented CDS regions into protein sequences.
- When the InDels and Structural Variants tool is added to the workflow the "P-value Threshold" parameter did not show up in the Select settings wizard step under "Significance of unaligned ends breakpoints". This has been fixed.
- BED Export: when exporting block list entries (such as connected exons from mRNA tracks), positions were absolute. This has been fixed: positions are now relative to the 'chromStart' position.
- Fixed an issue where Map Reads to Reference could, in rare circumstances, report a persistence error.
Compatibility
- It is possible to use the QIAGEN CLC Genomics Workbench 8.5.2, 8.5.1, 8.5, 8.0.3, 8.0.2, 8.0.1 and 8.0 to connect to the QIAGEN CLC Genomics Server 7.5.2. We generally recommend running the corresponding version of the Workbench for the CLC Server. Here, that would be a QIAGEN CLC Genomics Workbench 8.5.2 with the QIAGEN CLC Genomics Server 7.5.2.
- It is possible to use the Biomedical Genomics Workbench 2.5.2, 2.5.1, 2.5, 2.1.2, 2.1.1 and 2.1 to connect to the QIAGEN CLC Genomics Server 7.5.2. We generally recommend running the corresponding version of the Workbench for the CLC Server. Here, that would be a Biomedical Genomics Workbench 2.5.2 with the QIAGEN CLC Genomics Server 7.5.2.
CLC Server Command Line Tools 2.5.2
Improvements
- Fixed an issue on Windows where some temporary files named MIMEXXX.tmp were not removed on exit.
Compatibility
This version of the CLC Server Command Line Tools is the corresponding client version for the QIAGEN CLC Genomics Server 7.5.2.
This version of the CLC Server Command Line Tools can be used to connect to the QIAGEN CLC Genomics Server 7.5.2, 7.5.1, 7.5, 7.0.3, 7.0.2, 7.0.1 and 7.0. However, we generally recommend running the corresponding client for the version of the QIAGEN CLC Genomics Server being connected to.
QIAGEN CLC Genomics Server 7.5.1
Bug fixes
- Fixed an issue leading to an error during VCF export where the data involved had originally been imported from VCF files and the values in the QUAL field were integers.
- Export of floating-point (decimal) numbers to VCF format was previously dependent on the specified locale. This has been fixed so that the decimal separator is now always a point.
- Improved user feedback when configuring LDAP/AD authentication.
- Fixed a problem where after import of a large volume of data, using the "Show results" option in the process tab resulted in an error.
Compatibility
- It is possible to use the QIAGEN CLC Genomics Workbench 8.5.1, 8.5, 8.0.3, 8.0.2, 8.0.1 and 8.0 to connect to the QIAGEN CLC Genomics Server 7.5.1. We generally recommend running the corresponding version of the Workbench for the CLC Server. Here, that would be a QIAGEN CLC Genomics Workbench 8.5.1 with the QIAGEN CLC Genomics Server 7.5.1.
- It is possible to use the Biomedical Genomics Workbench 2.5.1, 2.5, 2.1.2, 2.1.1 and 2.1 to connect to the QIAGEN CLC Genomics Server 7.5.1. We generally recommend running the corresponding version of the Workbench for the CLC Server. Here, that would be a Biomedical Genomics Workbench 2.5.1 with the QIAGEN CLC Genomics Server 7.5.1.
QIAGEN CLC Genomics Server 7.5
New server-specific features and improvements
- Tools configured for use via the External Applications functionality can now be included within Workflows.
- Metadata functionality has been added, allowing users to create tables of metadata, import metadata, and associate data elements with particular metadata.
- LDAP and Active Directory authentication now supports LDAP over SSL (ldaps://) and Start TLS for encrypting LDAP communication.
- The performance of the "Link Variants to 3D Structure" tool has been significantly improved when running on a grid or on a remote server.
Improvements shared with QIAGEN CLC Genomics Workbench
- Improved use of multiple cores when running the Create Detailed Mapping Report.
- Some applications require VCF formatted variant data to contain quality score values. We now calculate QUAL and annotate variants with the resulting value.
- Improved memory management when handling large report elements.
- Batching on selected elements is now possible: it used to be restricted to selected folders.
- One can now select "EST" as database when using the Search for Sequences at NCBI tool.
- The output of the Reverse Complement Sequence tool now gets the suffix -RC attached to the name of the input, instead of -1 as before.
- The Hierarchical Clustering of Samples tool can now be executed as part of workflows and on the server.
Bug fixes
- Fixed an issue where if a job node addition failed, that same job node could not be added within the same Server session.
- The "Save in separate folders" option is now available for tools being configured to run via the Server.
- Fixed an issue with the Map Reads to Contigs tool that could be extremely slow when included in workflows with multiple inputs.
- Fixed an issue with the Basic Variant Detection, Fixed Ploidy Variant Detection and Low Frequency Variant Detection tools that occurred for certain genomes when using masking tracks.
- Fixed an issue where some filtering operations, such as "doesn't contain" did not act correctly when filtering table cells that contained multiple pieces of information.
- Improved memory management when handling large report elements.
- Fixed a rare error in the Create Statistics for Target Regions tool. The error resulted in a failure when a target region only included the very last nucleotide of a chromosome.
- Fixed an issue whereby Create Box Plot and Principal Component Analysis could sometimes be run with illegal arguments, leading to an error message.
- Fixed a bug in the Predict Secondary Structure tool when the option to calculate the partition function was selected for long molecules (>1000 nucleotides).
- Fixed an issue where the count of objects in the recycle bin of an SQL data area could be inaccurate if some of the items were not found. (Only relevant for Servers with a Bioinformatics Database.)
- Fixed an issue that required the installation of both mySQL and Oracle JDBC drivers, even if only one of these drivers is needed. (Only relevant for Servers with a Bioinformatics Database.)
Compatibility
- QIAGEN CLC Genomics Workbench 8.5. It is possible to use the QIAGEN CLC Genomics Workbench 8.0.3, 8.0.2, 8.0.1 and 8.0 to connect, but we recommend upgrading the Workbench to use the corresponding version for the Server.
- Biomedical Genomics Workbench 2.5. It is possible to use the Biomedical Genomics Workbench 2.1.2, 2.1.1 and 2.1 and the CLC Cancer Research Workbench 2.0 to connect, but we recommend upgrading the Workbench to use the corresponding version for the Server.
QIAGEN CLC Genomics Server 7.0.3
Bug fixes - QIAGEN CLC Genomics Server specific
- Resolved an issue with the grid integration where certain workflows run with a large number of inputs would result in the grid worker giving up waiting for input data.
- The analysis/workflow execution system now handles search algorithms specially so that search results are not modified. This eliminates a host of concurrency issues.
- Fixed an issue associated with submitting CLC Server jobs as the root user when using AD/LDAP authentication that could result in a job node stalling if the AD/LDAP server was down.
- For Servers that include a CLC Bioinformatics Database only: Postgresql and H2 database locations can now be configured without the need to install the MySQL and Oracle database connector.
Bug fixes shared with QIAGEN CLC Genomics Workbench and Biomedical Genomics Workbench
- Fixed a read mapper bug that caused some reads to be incorrectly reported as unmapped when global alignment was selected.
- Fixed a SOLiD NGS importer bug where import of very low quality, colorspace encoded paired-end sequence reads in fastq format could lead to paired sequence lists where the wrong reads were marked as pairs.
- Fixed an issue with the sort order for paired reads in SAM/BAM exports in high coverage regions.
- Fixed an issue where the Local Realignment tool when run with RNA-seq mapping could occasionally report a match that did not meet internal requirements as a valid match. This had a downstream effect when variant calling tools were run, and then failed upon encountering such a position. This issue has also been addressed in this release.
- Fixed an issue where the Basic Variant Detection, Fixed Ploidy Variant Detection and Low Frequency Variant Detection tools would stop with an error when encountering a place in a read mapping containing a match that did not meet internal requirements of a valid match.
- Fixed a bug that caused the mapper to enter an infinite loop if a reference of length 0 was used.
- Fixed a rare bug that sometimes made the read mapper halt prematurely when several seeds were identified at the same reference position.
- For Biomedical enabled QIAGEN CLC Genomics Servers only: In case of very low sample coverage, the Copy Number Variant Detection algorithm will now terminate the analysis early and an explanation is given in the reports.
Corresponding client software
- QIAGEN CLC Genomics Workbench 8.0.3, 8.0.2, 8.0.1 and 8.0
- Biomedical Genomics Workbench 2.1.2, 2.1.1 and 2.1
- CLC Cancer Research Workbench 2.0.
QIAGEN CLC Genomics Server 7.0.2
Bug fixes specific to Genomics Server
- Improved memory management for long-running server instances.
- Fixed a rare error in the Create Statistics for Target Regions tool. The error resulted in a failure when a target region only included the very last nucleotide of a chromosome.
- Decreased intensity of data integrity check to improve speed when adding new data areas or generating server setup reports.
- Fixed an issue when copying data between areas on a Server where the data was generated by a Workbench plugin and the Server did not have that plugin installed.
Bug fixes shared with QIAGEN CLC Genomics Workbench
- Fixed an issue with the BLAST at NCBI tool where an NCBI-generated error about their CPU usage limit being exceeded was not being reported transparently and a result of "no hits" was being reported instead.
- Fixed an issue with master server - job node communication that has been observed sporadically on systems with large numbers of job nodes (>20).
- Fixed a bug in which Local Realignment could produce an illegal read mapping. This only happened for RNA data.
- The variant caller will now fail if it encounters an illegal RNA read mapping. If the variant caller fails with such a message, and if it was run on locally realigned data, then we suggest re-running the local realignment to avoid the error.
- The BED importer now truncates names to 80 characters.
Corresponding client software
- QIAGEN CLC Genomics Workbench 8.0.2, 8.0.1 and 8.0
- Biomedical Genomics Workbench 2.1.1 and 2.1
- CLC Cancer Research Workbench 2.0.
QIAGEN CLC Genomics Server 7.0.1
New server-specific features and improvements
- Former plugin "Duplicate Mapped Reads Removal" is now integrated under the name "Remove Duplicate Mapped Reads" and can be found in the NGS Core toolbox. Please uninstall this plugin as part of your Server upgrade process.
Improvements shared with QIAGEN CLC Genomics Workbench
- BLAST has been upgraded to BLAST+ 2.2.30 that includes a number of improvements and bug fixes. A full list of BLAST+ 2.2.30 changes can be viewed at http://www.ncbi.nlm.nih.gov/books/NBK131777
- Particular annotation types (columns) can now be specified for export in Excel, HTML and tab delimited formats.
- Added column to output of "Annotate and Merge Counts" indicating 3' or 5' direction when using "grouping on mature" parameter.
- Increased the performance for gzip export.
Biomedical-enabled Genomics Servers only
- For users holding a Biomedical Genomics Server Solution (previously called Cancer Research Server Solution), the Copy Number Variant (CNV) Detection plugin has been integrated and can be found in the 'Resequencing' toolbox. Please uninstall this plugin as part of your Server upgrade process.
Bug fixes
- Fixed an issue with running blast searches at the NCBI where an NCBI-generated error about their CPU usage limit being exceeded was not being reported transparently and a result of "no hits" was being reported instead.
- Fixed an error with the administrative interface that did not reload the webpage after restarting the Server.
- Fixed the SOLiD NGS importer to correctly import basespace encoded sequences in fastq files. It is still assumed that sequences originate from colorspace.
- Fixed an error that occurred when running the Create Sequencing QC Report tool and requesting quality analysis reporting.
- Fixed a rare error that caused the Amino Acid Change tool to crash if a CDS feature was less than 3 bases long.
- Fixes and updates for automated genome downloads (Zea mays, C. elegans).
- Fixed a bug in the probabilistic variant caller that caused it to fail for certain input.
Corresponding client software
- QIAGEN CLC Genomics Workbench 8.0.1
- Biomedical Genomics Workbench 2.1 (Servers with Biomedical extension)
- CLC Server Command Line Tools 2.4.1.
QIAGEN CLC Genomics Server 7.0
New server-specific features and improvements
- The layout of the process queue for jobs submitted to the server has been improved.
- Genomics Server now supports IPv6.
- It is now possible to set an upper limit to the number of jobs that can run concurrently on a given job node or a single server.
- Access to grid presets on grid node setups can be restricted to particular groups of users.
- The job node list in the server configuration interface now sorts job nodes by name and host name rather than the order they were added.
- A database location can now be added using a custom connection string.
Improvements shared with QIAGEN CLC Genomics Workbench
- New tools:
- Create Track from Experiment. This tool makes it possible to convert Experiments to Tracks. In the Experiment, the results of the statistical analysis are annotated on the experiment as additional columns. It can be advantageous to visualize the results of the statistical analysis as tracks.
- Link Variants to 3D Protein Structure makes it possible to visualize amino acid changes on 3D protein structures. After running the tool on a variant table, variants can be visualized on 3D structures.
- The Map Reads to Reference tool now supports both linear gap cost parameters and affine gap cost parameters. The addition of affine gap cost support allows you to get more accurate results for reads with stretches of insertions or deletions.
- The read mapper used in the RNA-Seq Analysis tool has been upgraded to use the new read mapper described above. This upgrade enables you to run RNA-seq Analysis with as little as 6 GB RAM and at the same time improves your end results. However, you cannot yet use affine gap cost parameters in your RNA-Seq analysis.
- MA plots, scatter plots and histograms can now accept expression tracks as input.
- Performance of the Merge Read Mappings tool has been improved, especially in situations where the number of reference sequences is very large, such as when merging reads mapped against de novo assembly results.
- The tool Amino Acid Changes has been expanded with an extra output that makes it possible to visualize amino acid changes in track format.
- Improved PDB import of water molecules, DNA/RNA, and saccharides.
- When importing PDB files, the resulting Molecule Project now contains citation information (PDB ID and primary reference), which can be found in the 'Show History' view.
- Batching: Processes tab and analysis execution logs now display batch names in addition to analysis names for enhanced clarity.
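The difference between the linear and affine gap cost models mentioned for Map Reads to Reference above can be illustrated with a small sketch. The function names, the cost values, and the convention that an affine gap of length n costs one opening plus n - 1 extensions are assumptions made for illustration; they are not taken from the tool or its defaults.

```python
def linear_gap_cost(length, cost_per_position):
    """Linear model: every gap position costs the same amount."""
    return length * cost_per_position


def affine_gap_cost(length, open_cost, extend_cost):
    """Affine model (one common convention): the first gap position pays
    the opening cost, each further position pays the extension cost."""
    if length == 0:
        return 0
    return open_cost + (length - 1) * extend_cost


# Illustrative values only; these are not the tool's default parameters.
for gap_length in (1, 3, 10):
    print(
        gap_length,
        linear_gap_cost(gap_length, 3),
        affine_gap_cost(gap_length, 6, 1),
    )
```

With these example values, a single long gap is cheaper under the affine model than under the linear one, which is why affine costs tend to favor one contiguous insertion or deletion over several scattered ones.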
Bug fixes
- Fixed an error resulting in billions of reads being silently dropped when producing large read mappings against large counts of reference sequences. The error involves a read count overflow and the dropping of at least 2 billion reads per failure instance.
- The tool Identify Graph Threshold Areas can now use negative values to define its threshold.
- Fixed problem with import of BED files using external applications.
- SAM/BAM import will no longer fail for alignments with POS = 0, but instead import them as though they were unmapped.
- Fixed an issue that previously blocked the ability to run certain non-exclusive jobs on a single server or job node.
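As a rough illustration of the SAM/BAM import change listed above, the following sketch treats a SAM record as unmapped if either the unmapped flag bit (0x4) is set or the POS field is 0. The function name and the example records are hypothetical, and this is not the importer's actual implementation.

```python
def is_treated_as_unmapped(sam_line):
    """Return True if a SAM alignment line should be handled as unmapped.

    Columns follow the SAM layout: FLAG is the 2nd field and the 1-based
    POS is the 4th field. POS = 0 carries no usable coordinate.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    pos = int(fields[3])
    return bool(flag & 0x4) or pos == 0


# Hypothetical records for illustration.
no_position = "read1\t0\tchr1\t0\t255\t4M\t*\t0\t0\tACGT\tIIII"
mapped = "read2\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII"

print(is_treated_as_unmapped(no_position))
print(is_treated_as_unmapped(mapped))
```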
Changes
- We now recommend restarting the server after installing a server plugin. This can be done via the server administrator interface. A restart will cause any running or scheduled jobs to fail. To avoid disruptions during plugin installation, we have added a maintenance mode. Entering maintenance mode allows such jobs to run and complete, while restricting submission of new jobs.
Compatibility
- This release can be used with CLC Server Command Line Tools 2.4.
Cancer-enabled Genomics Servers only
- Plugins
- New: Copy Number Variant Detection Plugin (beta)
CLC Server Command Line Tools
New Tools
- link_to_structure
- download_sequence_to_structure_db
- extract_diff_exp_genes
Tools that have changed parameters
All Genomics Servers
- add_amino_acid_info: added option --filter-empty-cds
- amino_acid_changes: added option --filter-empty-cds
- contig_read_mapping: added options --deletion-extend-cost, --deletion-open-cost, --indel-mode, --insertion-extend-cost, --insertion-open-cost
- download_genome: added option --download-chromosome-band
- duplicate_mapped_reads_removal: added option --create-report
- qc_target_sequencing: added option --create-coverage-graph
- read_mapping: added options --deletion-extend-cost, --deletion-open-cost, --indel-mode, --insertion-extend-cost, --insertion-open-cost
- statistics_target_regions: added option --create-coverage-graph
- structural_variant_detection: added option --masking-track
Cancer-enabled Genomics Servers only
- add_link_to_structure
- download_3d_structure_information_db
- mutation_tester_tool: added options --ignore-broken-pairs, --ignore-nonspecific-matches
QIAGEN CLC Genomics Server 6.5.6
Bug fixes shared with QIAGEN CLC Genomics Workbench
- Fixed a bug that caused the mapper to enter an infinite loop if a reference of length 0 was used.
- Fixed a rare bug that sometimes made the read mapper halt prematurely when several seeds were identified at the same reference position.
- Fixed sort order for paired reads in SAM/BAM exports in high coverage regions.
- The analysis/workflow execution system now handles search algorithms specially so that search results are not modified. This eliminates a host of concurrency issues.
- Minor improvements in persistence.
Corresponding Client Software
- QIAGEN CLC Genomics Workbench 7.5.X
- Cancer Research Workbench 1.5.X (Servers with Cancer extension)
- CLC Server Command Line Tools 2.4.X.
QIAGEN CLC Genomics Server 6.5.5
Bug fixes server-specific
- Integrity checks on Server data areas have been made more conservative than previously, yielding substantial speed benefits when starting up or reconfiguring Servers with very large amounts of data.
- Fixed issue with copying data held in a Server location where the data type is specific to a tool in a Workbench plugin that is not installed on the Server.
- Fixed an issue where certain temporary files were written directly into the system temporary area rather than into the CLCTmp directory.
Bug fixes shared with QIAGEN CLC Genomics Workbench
- Read-only folders are no longer offered as potential locations to save data bundled with a Workflow.
- Fixed bug in which Local Realignment could produce an illegal read mapping. This only happened for RNA-data.
- The variant caller will now fail if it encounters an illegal RNA read mapping. If the variant caller fails with such a message, and if it was run on locally realigned data, we suggest re-running the local realignment to avoid the error.
Corresponding client software
- QIAGEN CLC Genomics Workbench 7.5.4
- CLC Cancer Research Workbench 1.5.5 (Servers with Cancer extension)
- CLC Server Command Line Tools 2.2.2.
QIAGEN CLC Genomics Server 6.5.4
Bug fixes server-specific
- Fixed an issue with master server - job node communication that has been observed sporadically on systems with large numbers of job nodes (>20).
- Improved memory management for long-running server instances.
Bug fixes shared with QIAGEN CLC Genomics Workbench
- The filtering option in the Create Track from Experiment tool only considered the predicted fold-changes in the positive direction, so features that were reduced in expression were filtered out. This has now been fixed.
- Fixed an issue with running blast searches at the NCBI where an NCBI-generated error about their CPU usage limit being exceeded was not being reported transparently and a result of "no hits" was being reported instead.
- Fixed an issue with mapping of paired-end reads, where these were erroneously reported as broken pairs when the fragment size derived from the alignments of the two ends of the pair was longer than the reference sequence.
- Improved memory management for long-running server instances
- Fixed a bug in the probabilistic variant caller that caused it to fail for certain input.
- When using the RNA-Seq Analysis tool with the "One reference sequence per transcript" option, the "Maximum number of hits for a read" option was sometimes not taken into account for multi-hit reads. This has been fixed.
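The fold-change filtering fix described above for Create Track from Experiment can be pictured with the sketch below. It assumes signed fold changes, where a negative value means reduced expression, and a hypothetical function and parameter set; it is not the tool's code.

```python
def filter_by_fold_change(features, threshold, both_directions=True):
    """Keep features whose fold change passes the threshold.

    features is a list of (name, signed_fold_change) pairs. With
    both_directions=False only positive changes are compared, which
    mimics the behavior described before the fix.
    """
    kept = []
    for name, fold_change in features:
        magnitude = abs(fold_change) if both_directions else fold_change
        if magnitude >= threshold:
            kept.append(name)
    return kept


# Hypothetical example data.
example = [("geneA", 3.0), ("geneB", -4.0), ("geneC", 1.2)]
print(filter_by_fold_change(example, 2.0, both_directions=False))
print(filter_by_fold_change(example, 2.0))
```

Comparing the absolute value keeps down-regulated features such as the hypothetical geneB, which a positive-only comparison would filter out.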
Cancer Research-enabled Genomics Servers only
- The filtering option in the Extract Differentially Expressed Genes tool only considered the predicted fold-changes in the positive direction, so features that were reduced in expression were filtered out. This has now been fixed. The change also affects the workflow: "Identify and Annotate Differentially Expressed Genes and Pathways", as the tool is also included in this workflow.
- Fixed an issue where the Coverage, Count and Frequency columns did not appear in the output when the option "Keep only selected annotations" in the "Remove information from variants" tool was selected.
Corresponding client software
- QIAGEN CLC Genomics Workbench 7.5.3
- Cancer Research Workbench 1.5.4 (Servers with Cancer extension)
- CLC Server Command Line Tools 2.4.1.
QIAGEN CLC Genomics Server 6.5.3
Bug fixes shared with QIAGEN CLC Genomics Workbench
- Fixed an error resulting in billions of reads being silently dropped when producing large read mappings against large counts of reference sequences. The error involves a read count overflow and the dropping of at least 2 billion reads per failure instance.
- Amino Acid Change tool: In cases where an mRNA track does not overlap all annotations in the CDS track, "Coding Region Changes" were not added to variants that overlap a CDS but not an mRNA annotation. This has been fixed.
- Small RNA Analysis -> Annotate and Merge Counts: When you choose to create a “grouped on mature” output, the small RNAs are grouped by both the 5’ and the 3’ mature sequences separately in the “grouped on mature” output. The column heading has therefore been changed to show "Mature" instead of "Mature 5'".
- The Low Frequency Variant caller could end up in an infinite loop in certain corner cases. This is now fixed.
- Fixed problem with import of BED files using external applications.
- Fixed a bug in the QC report creation step of the ChIP-seq analysis.
- Fixed a bug for color space reads in RNA-Seq Analysis that caused all exon-exon matches to be filtered away.
- Fixed a bug that in some cases caused an error when annotating read sequence lists with the GFF/GTF/GVF annotation tool.
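The read count overflow mentioned in the first fix above is consistent with a counter that wraps at the limit of a 32-bit signed integer (2,147,483,647). The release note does not state which counter type was involved, so the sketch below is only an assumption-based illustration of how such a wrap-around loses counts.

```python
INT32_MAX = 2**31 - 1


def wrap_int32(value):
    """Wrap an integer the way a 32-bit signed counter would (assumed type)."""
    return (value + 2**31) % 2**32 - 2**31


# Hypothetical true read count that exceeds the 32-bit signed range.
true_count = 3_000_000_000
stored_count = wrap_int32(true_count)

print(INT32_MAX)
print(stored_count)
```

Under this assumption, any count above INT32_MAX is stored as a different, smaller value, so a consumer of the stored count would see far fewer reads than were actually produced.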
Compatibility
- This release can be used with CLC Server Command Line Tools 2.2.2
CLC Server Command Line Tools - Bug fixes
- Fixed problem running Workflows using the Command Line Tools, where unusual characters, such as quotes, were included in unlocked parameters.
QIAGEN CLC Genomics Server 6.5.2
This release contains fixes to issues seen sporadically on CLC Servers with job nodes, where the main symptom is the occasional occurrence of irreproducible errors related to locating data when you run a job. If you are running a single server, or a master server with grid nodes, or if your job node setup is running without problems, then you do not need to upgrade to this version.
Bug fixes
- Fixed an issue seen sporadically on CLC Servers with job node setups where jobs failed due to problems locating results files even when those files were in place and intact.
- Fixed a problem that arose when validating parameters in Workflows on the Server where more than one Workflow being validated referred to the same underlying data.
- Fixed an issue related to the licensing of job nodes.
- License conditions on job nodes changed to better support multi-job processing.
Compatibility
The QIAGEN CLC Genomics Server 6.5.2 is compatible with the QIAGEN CLC Genomics Workbench 6.5.1 and the CLC Command Line Tools 2.2.1.
The QIAGEN CLC Genomics Server 6.5.2 with Cancer Research Add-on is compatible with the CLC Cancer Research Workbench 1.5.2 and the CLC Command Line Tools 2.2.1.
QIAGEN CLC Genomics Server 6.5.1
New features and improvements
- "Filter Annotations on Name" can now insert names to filter on from significantly bigger files. The file size limit has been increased from 10KB to 20MB.
- RNA-Seq Analysis: The ENSEMBL gene id of each gene, where available, has been added as an additional column to the gene expression track output.
- Improved performance of the ChIP-seq Analysis tool for genomes with a large number of chromosomes.
- It is now possible to run a workflow without an optional input.
Bug fixes
- Fixed a problem that prevented BLAST operations when choosing to run these on the CLC Server.
- The AAC tool did not annotate variants in 3' UTR with their DNA-level change using the HGVS c.xxx format. This affects any analysis done with Gx 7.5 or earlier based on ENSEMBL CDS tracks from older versions. The AAC analysis should be redone using Gx 7.5.1 for correct annotation. Important: Please also check the description in the MWB 7.5 release notes of a bug fix in the translation of CDS annotations to protein sequences that was wrong in cases where the reading frame was not +1 or -1 in CDS annotations imported from ENSEMBL.
- A bug has been fixed in the Set Up Experiment tool. Exon-related expression values can now only be selected when present in the individual samples.
- When creating a subset of a paired experiment, the sub-experiment no longer appeared as being paired. This bug has been fixed and sub-experiments created in previous versions should recover the pairing information when accessed with this version of the workbench.
- Pfam filtering bug fixed. Previously, Pfam only reported the first domain of each type in a query and as a consequence many domains were missed. We recommend that users whose research depends on Pfam annotations re-run the tool on their data.
- Fixed problem importing VCF files using the AO and RO genotype field.
- Fixed problem importing certain VCF files.
- Fixed a bug in the 'Maximum Likelihood Phylogeny' tool that failed when generating bootstrap values for certain input alignments.
- The Blast text results have been improved so they show the correct query and subject positions regardless of strand.
- Fixed problem with import of read mappings with supplementary alignments. When importing read mappings with supplementary alignments, supplementary alignments are not imported. Previously import of such read mappings caused import errors.
- Fixed a bug in the Annotate and Merge Counts tool that in rare cases resulted in incorrect sorting and crash.
CLC Server Command Line Tools - Changes
- The Command Line Tool no longer waits for child processes of the processes it starts to complete. This is a technical change that does not affect performance.
QIAGEN CLC Genomics Server 6.5
New server-specific features and improvements
- In CLC Genomics Server 6.0.5 a more efficient way of submitting workflows to grid was introduced. This feature was optional and had to be enabled explicitly. This mechanism is now the default.
- A single thread is now used to poll the status of all grid jobs.
- "Create sequence statistics" is now server enabled.
- We now allow master servers to only have an SSL port, which was not previously allowed.
- Parallel task execution on same job node: More than one task can now be executed concurrently on a single job node or on a single server.
Improvements shared with CLC Genomics Workbench
- New tools:
- New variant callers (Resequencing analysis):
- Three new tools for detecting variants are available in the "Variant Detectors" toolbox under "Resequencing Analysis": Basic Variant Detection, Fixed Ploidy Variant Detection and Low Frequency Variant Detection. Basic Variant Detection and Fixed Ploidy Variant Detection are complete reimplementations of the Quality-based and Probabilistic Variant Detection tools, with improved options for filtering. The Low Frequency Variant Detection tool is a new statistics-based tool for detecting low frequency variants e.g. in mixed tissue cancer or mixed population samples. The Quality-based and Probabilistic Variant Detection tools have been moved to the "Legacy tools" folder in the toolbox, and will eventually be retired.
- Improved read mapper and a tool for downsampling (NGS Core Tools):
- Memory usage reduced for the read mapper, enabling mapping against human genomes on a modern notebook.
- Caching of reference index files improves the speed when the same reference is used repeatedly for read mapping.
- The new "Sample Reads" tool can be used to downsample large sets of reads for all types of NGS analysis.
- New ChIP-Seq tools (Epigenomics Analysis):
- The Chip-Seq Analysis tool found in the toolbox under "Epigenomics Analysis" has been replaced with the plugin "Peak Shape ChIP-Seq Analysis" (that has been renamed to "ChIP-Seq Analysis"). The old "Chip-Seq Analysis" tool has been renamed to "ChIP-Seq Analysis (legacy)" and moved to the new "Legacy tools" folder in the toolbox. The new ChIP-Seq Analysis tool uses a new approach to identify genomic regions with significantly enriched read coverage and a read distribution with a characteristic shape. The parametrization of the algorithm is done automatically by learning the characteristic shape of the signal from the data, making the algorithm intuitive and easily understandable.
- The "Annotate with Nearby Gene Information" tool can be used to annotate ChIP-seq peaks with the nearest gene upstream and downstream, based on the start position of the gene. The resulting annotations are provided in the same format as in the legacy ChIP-seq Analysis.
- Workflows
- The execution of a workflow in the Workbench and on the Server has been unified to have the same behavior regarding logs, intermediate results and output naming.
- New workflow-enabled tools:
- Create sequence statistics.
- Protein Analysis, Pfam domain search:
- Pfam Domain Search now uses HMMER3 and the latest Pfam database, which can be downloaded using the new tool "Download Pfam Database".
- Searching multiple sequences is significantly faster.
- New filters are available in the improved Pfam Domain Search tool to enable generation of the same results as the online tool.
- Small improvements of the de novo assembler speed.
- Improved error messages due to low disk space.
Bug fixes
- A bug in the Fisher Exact Test tool that in some cases caused incorrect counting has been fixed. The Fisher Exact Test algorithm now checks if a case variant also exists in a control variant as a different type (e.g. an SNV variant can exist as part of an MNV variant). Note that variants only found in the control tracks are no longer included in the output.
- Several issues with the validation display in the workflow editor have been fixed.
- Fixed problem where the "space" key did not trigger "Find Conflict" in the stand-alone read mapping editor.
- Fixed stand-alone read mappings not showing mismatches and insertions in the overflow graph.
- Fixed a bug in the de novo assembler and legacy read mapper which could cause a crash due to a collision of temporary file names.
- Fixed a bug which caused the de novo assembler to crash in rare cases on systems running Windows. Tools depending on read mapping might also have been affected by this.
- NGS import tools now work when run via CLC Server.
Changes
- To improve the stability of workflows: If a variant caller finds no variants, an empty track is produced, rather than no output.
- Due to the upgrade to Java 7, which Oracle does not support on Windows Server 2003 or OS X 10.5.8 and 10.6, these platforms are no longer supported. The system requirements have therefore been updated to the following: Linux, Windows Vista, Windows 7, Windows 8 or Windows Server 2008, or Mac OS X 10.7 or later.
- As of June 2014, COSMIC download requires registration. This means that COSMIC is no longer part of the resources that can be downloaded with the Download Genome Data Tool. You can still register at the COSMIC website, download the file to your computer, and use the Import Tracks tool to import the data.
Compatibility
- This release can be used with CLC Server Command Line Tools 2.2.
- This release is using the read mapper and de novo assembler that corresponds to CLC Assembly Cell 4.3.
Plugins
New
- Advanced Peak Shape Tools Plugin (beta).
Retired
- Peak Shape ChIP-Seq Analysis Plugin.
CLC Server Command Line Tools
New features and improvements
- Command Line Tools will now give an error if an argument without a preceding option key is detected.
- The command line tool can now execute identically named workflows authored by different organizations.
New Tools
- add_fold_changes_to_variant
- boxplot
- create_fc_from_expr_tracks_algo
- create_sequence_statistics
- download_pfam_database
- experiment_to_track
- fixed_ploidy_variant_detection
- go_analysis_expression_change
- basic_variant_detection
- mv
- peak_shape_chip_seq_analysis
- pfam_domain_search
- principal_components
- low_frequency_variant_detection
- sample_reads
Tools that have been removed
- constant_value_algo (no longer needed as input to workflows can be defined inside the workflow)
- workflow_output
Tools that are deprecated
- probabilistic_variant_detection is deprecated and is replaced by fixed_ploidy_variant_detection. The old tool continues to exist, but we recommend that customers switch to the new tool as soon as possible.
- quality-based_variant_detection is deprecated and is replaced by basic_variant_detection. The old tool continues to exist, but we recommend that customers switch to the new tool as soon as possible.
Tools that have changed parameters
- assemble_sanger_sequences: removed options --consensus, --full-contigs
- denovo_assembly: removed options --annotate-conflicts, --conflict-resolution-mode, --map-as-long-reads, --map-as-single-reads, --maximum-alignment-count, --score-limit-offset, --strand-specific
- download_genome: removed option --protocol
- fisher_exact_test: added option --correct-p-value
- identify_enriched_variants: added option --correct-p-value
- ngs_import_sam: added option --save-unmapped
- reference_assemble_sanger_sequences: removed options --install-consensus, --reference-given
QIAGEN CLC Genomics Server 6.0.5
Changes
An option to change the way workflows are submitted to the grid has been introduced. In the previous version, and in this version if the property is unset or set to false, each individual job in a workflow is submitted when its dependencies have been executed. As a consequence, jobs from different workflows are interleaved in execution. If many workflows are executed at the same time, the total run time of a workflow becomes very unpredictable.
A new behavior can be switched on by creating a properties file in the server installation directory: /settings/grid.properties
containing one line:
com.clcbio.server.configuration.grid.lsfPreserveWorkflowOrder = true
and restarting the server. With this new behavior, all jobs of the workflow are submitted at once, supplying the grid with the dependency information. As a consequence, all jobs belonging to one workflow are placed next to each other in the execution queue on the grid. In general, the workflow submitted first will therefore be completed before jobs from the next workflow are run. Note that if the grid has sufficient resources and all elements of the first workflow are waiting for the execution of an element, elements from the next workflow will be executed.
For this release, this behavior is only supported for LSF grids.
One drawback of submitting the entire workflow at once is that a large number of grid jobs are created (#elements-per-workflow x #workflows). The server creates a new thread for each job on the grid, which it uses to poll for status at 5-second intervals. This does not work well if there are many grid jobs. Adding the line:
com.clcbio.server.configuration.grid.groupedPolling = true
to the /settings/grid.properties file and restarting the server changes this so that a single thread is used to poll the status of all grid jobs. This both reduces the number of threads used and allows for more efficient polling techniques against the DRMAA library.
The behavior described above is expected to be the default behavior with the next major release of the QIAGEN CLC Genomics Server (check latest improvements notes for confirmation).
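As a sketch of the configuration described above, the following script writes a grid.properties file containing the two property lines quoted in these notes, and computes the number of grid jobs created when whole workflows are submitted at once. The temporary directory stands in for the server installation directory, and the function names and example numbers are hypothetical; the server still has to be restarted for the properties to take effect.

```python
import tempfile
from pathlib import Path

# Property names and values as quoted in the notes above.
PROPERTIES = {
    "com.clcbio.server.configuration.grid.lsfPreserveWorkflowOrder": "true",
    "com.clcbio.server.configuration.grid.groupedPolling": "true",
}


def write_grid_properties(install_dir):
    """Write settings/grid.properties below the given directory."""
    settings_dir = Path(install_dir) / "settings"
    settings_dir.mkdir(parents=True, exist_ok=True)
    target = settings_dir / "grid.properties"
    lines = [f"{key} = {value}" for key, value in PROPERTIES.items()]
    target.write_text("\n".join(lines) + "\n")
    return target


def grid_job_count(elements_per_workflow, workflows):
    """Grid jobs created when every workflow is submitted in full."""
    return elements_per_workflow * workflows


# A temporary directory is used here instead of a real installation.
with tempfile.TemporaryDirectory() as demo_dir:
    written = write_grid_properties(demo_dir)
    file_content = written.read_text()
    print(file_content)

# Hypothetical load: 50 workflows with 12 elements each.
print(grid_job_count(12, 50))
```

With one polling thread per grid job, the hypothetical load above would need 600 threads, which is the situation the grouped polling property is meant to avoid.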
QIAGEN CLC Genomics Server 6.0.4
Bug fixes
- Fixed a bug in RNA-Seq Analysis regarding the calculation of RPKM. This error was introduced with the new RNA-Seq tool in QIAGEN CLC Genomics Server 6.0. When calculating RPKM, the total number of gene reads was used instead of total exon reads. This only has a significant impact if many intron reads are mapped to the gene. With this release we have fixed the bug, and we recommend that all users who base their analysis on RPKM values re-run all RNA-Seq analyses conducted with QIAGEN CLC Genomics Server 6.0 - 6.0.3. Please note that the legacy RNA-Seq plugin is not affected by this bug.
- Fixed a bug in the Filter against Control Reads tool which meant that variants that are of type "Replacement" and which also introduce an insertion were not properly removed by the filter, even if there were reads supporting them. We recommend that all customers who have relied on this tool for processing data in QIAGEN CLC Genomics Server 6.0.X run the tool again in version 6.0.4.
- Fixed bug that sometimes caused the workbench to crash when running "Local Realignment" on mappings generated with other mappers and imported as BAM files.
- Fixed a problem where some parts of a workflow were not executed if the workflow contained multiple branches.
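The effect of the RPKM fix listed above can be illustrated with the commonly used RPKM definition (reads per kilobase of feature per million mapped reads). The sketch assumes that the bug affected the per-gene read count, so that intron reads were counted alongside exon reads; the function name and all numbers are hypothetical and are not taken from the tool.

```python
def rpkm(read_count, feature_length_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads."""
    return read_count * 1e9 / (feature_length_bp * total_mapped_reads)


# Hypothetical gene: 800 exon reads, 200 intron reads, 2000 bp of exons.
exon_reads = 800
intron_reads = 200
exon_length_bp = 2000
total_mapped_reads = 10_000_000

rpkm_exon_reads = rpkm(exon_reads, exon_length_bp, total_mapped_reads)
rpkm_gene_reads = rpkm(exon_reads + intron_reads, exon_length_bp, total_mapped_reads)

print(rpkm_exon_reads)
print(rpkm_gene_reads)
```

In this example the value based on all gene reads is 25% higher than the value based on exon reads, and the gap grows with the share of intron reads, in line with the note that the impact is only significant for genes with many intron reads.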
Changes
- Users running RNA-seq analyses with only gene annotations can now choose whether to calculate the RPKM for these genes (i.e. genes without transcripts) or not.
QIAGEN CLC Genomics Server 6.0.3
Changes
Adding support for CLC Cancer Research Workbench: If you have purchased a QIAGEN CLC Genomics Server with added support for Cancer Research, you will receive an additional license that will allow Cancer Research-specific tools to be executed on the QIAGEN CLC Genomics Server.
QIAGEN CLC Genomics Server 6.0.2
Bug fixes
- Fixed a problem with the Amino Acid Changes tool that reported all variants within coding regions as non-synonymous. This error was introduced with Genomics Workbench 7.0.2.
QIAGEN CLC Genomics Server 6.0.1
New features and improvements
- Improved parameter specification for RNA-seq Analysis
- It is now possible to perform both batched and non-batched import of VCF files without genotype information
- Statistical Analysis: Improved reporting of invalid input to the tools "On Gaussian Data" and "On Proportions"
- Fasta export:
- Fasta export with trimming is now much faster and consumes less memory
- Fasta export now reports progress while executing
- When the "Remove trimmed regions" option is set, the Fasta export will ignore sequences in which all nucleotides are covered by a Trim annotation
- Translate to Protein (Batch Process):
- There are now options for specifying whether to translate the coding regions or extract translations from the annotations
- The log has been made more detailed and informative
- If the result is just a single protein sequence, the output will be just that, otherwise all sequences are output as a list
- If the tool estimates that the number of protein sequences to be produced is greater than 1.000.000, it will create protein sequences without history, and it will not copy the common name, latin name, and taxonomy fields
Changes
- When importing a VCF file, if multiple count tags are present in the file, the VCF tags are prioritized in the following order: 1) CLCAD2, 2) AD, 3) AO
- In the "Amino Acid Changes" tool, the description of coding region changes at the DNA level now complies with HGVS recommended nomenclature with regard to variants in untranslated regions. Examples: "c.-4A>C" describes a SNV four bases upstream of the start codon, while "c.*4A>C" describes a SNV four bases downstream of the stop codon
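The count-tag priority for VCF import described above (CLCAD2, then AD, then AO) amounts to picking the first tag in that order that is present. The sketch below shows the idea on a FORMAT column split into keys; the function name is hypothetical and this is not the importer's code.

```python
# Priority order as stated in the notes above.
COUNT_TAG_PRIORITY = ("CLCAD2", "AD", "AO")


def pick_count_tag(format_keys):
    """Return the highest-priority count tag present, or None."""
    for tag in COUNT_TAG_PRIORITY:
        if tag in format_keys:
            return tag
    return None


# Hypothetical FORMAT columns from VCF records.
print(pick_count_tag("GT:AD:AO".split(":")))
print(pick_count_tag("GT:AO:CLCAD2".split(":")))
print(pick_count_tag("GT:DP".split(":")))
```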
Bug fixes
- After annotating variants with the tool "Annotate from Known Variants" a small fraction of the MNVs disappeared. This has now been fixed
- Fixed problem where "InDels and Structural Variation" crashed for certain data
- Fixed "Filter Based on Overlap" accepting expression tracks as inputs but not knowing how to handle them
- Fixed error in mapping long reads as part of de-novo assembly, Read Mapping Legacy plugin, RNA-Seq Legacy plugin, and Transcript Discovery plugin
- A rare error has been fixed in the Secondary Peak Calling tool
- Fixed a bug that in certain cases made the De Novo Assembly fail
- Fixed a bug that in certain cases made the RNA-Seq Analysis fail
- Fixed a bug that made access to data impossible because of a failed rename operation
- A problem importing Ensembl version 75 files has been addressed. If you have previously imported Ensembl version 75 files, please see the FAQ entry for full details of what to do
QIAGEN CLC Genomics Server 6.0
New server-specific features and improvements
- New recycle bin concept with individual recycle bins and automatic clean-up.
- Each user has an individual recycle bin to avoid problems when deleting data where permissions are applied
- No other users have access to the recycle bin (except server administrators)
- The server administrator can access and empty all recycle bins
- All recycle bins on the server can be configured to be automatically emptied when the data is older than 100 days. This can be set in the server administration user interface
- Special note for customers with database locations: when starting the new server version, the user connecting to the database must have permissions to create tables and indexes in the database in order to perform an automatic upgrade of the data location to the new recycle bin concept. This permission is only needed the first time the server starts up. If it is not desirable to grant these permissions to this user, it is possible to upgrade the database using the CLC Bioinformatics Database Tool.
- Gateway cloning tools now available on the server
- Statistical analysis tools now available on the server
- Create track list now available on the server
- Motif search now available on the server
- Signal Peptide prediction is now available on the server
- Node setup
- It's easier to populate the fields in the job distributions part of the administration interface.
- The display name of the server is shown in the top graphics of the administration interface
- An option to Resync Job Nodes has been added to help in cases where job nodes get out of sync with the master node
- Queue management with job nodes has been simplified and better tailored to executing workflows
- Workflow jobs are handled in a special way on servers using job nodes. Once a workflow has started (after having made it to the top of the queue), the sub-processes that are part of the workflow are automatically placed at the top of the queue. This will ensure that the workflow is not punished for being split into several parts. Previously, workflows could end up being effectively blocked in the queue because the sub-processes would always start from the bottom of the queue.
- The buttons to move jobs up and down in the queue have been removed since this would conflict with the way workflow jobs are handled
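The queue behaviour described above can be illustrated with a small sketch. This is not the server's implementation: the names `Job`, `JobQueue` and `start_workflow` are hypothetical, and the only rule modelled is that sub-processes of an already started workflow are placed at the top of the queue while all other jobs are appended at the bottom.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    name: str
    workflow: Optional[str] = None  # hypothetical: name of the parent workflow, if any


class JobQueue:
    """Toy queue: sub-processes of a started workflow go to the front."""

    def __init__(self):
        self._jobs = deque()
        self._started_workflows = set()

    def start_workflow(self, name):
        # Called once the workflow itself has reached the top of the queue.
        self._started_workflows.add(name)

    def submit(self, job):
        if job.workflow in self._started_workflows:
            self._jobs.appendleft(job)  # sub-process of a running workflow
        else:
            self._jobs.append(job)      # ordinary job: bottom of the queue

    def next_job(self):
        return self._jobs.popleft()

    def names(self):
        return [job.name for job in self._jobs]


queue = JobQueue()
queue.submit(Job("mapping-A"))
queue.submit(Job("mapping-B"))
queue.start_workflow("wf1")
queue.submit(Job("wf1-step2", workflow="wf1"))
print(queue.names())
```

In this sketch the workflow step overtakes the two waiting jobs, which is the effect the notes describe. How the real server orders several sub-processes relative to each other is not stated here.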
- Improved performance when setting permissions on folders, especially important on systems linked to directories with large numbers of users.
- Zip export is now available on all server products
Improvements shared with QIAGEN CLC Genomics Workbench
- Copying data in the Navigation Area runs much faster and uses less memory than before. This is a great improvement which also kicks in when moving data between a QIAGEN CLC Genomics Server and a Workbench.
- RNA-Seq on tracks: A substantial update of the popular RNA-Seq Analysis tool together with new statistical tools for analysis of differential expression form a great improvement for all users working with RNA-Seq.
- The output of the RNA-Seq Analysis is based on tracks and includes tracks with the read mapping, expression values and fusion genes.
- The gene-level and transcript-level expression results are now output as two different tracks. Downstream analysis can be performed on either.
- A new column "Relative RPKM" on the transcript-level expression track can be used to see the relative expression of alternative transcripts for a gene.
- Experiments based on the new expression tracks can be used for browsing the track list with read mappings and annotations.
- It is now possible to map the reads against the full genome as well as gene regions.
- The new read mapping algorithm introduced with QIAGEN CLC Genomics Server 5.5 is now also used for RNA-Seq. This means that mapping is faster, but for some data sets it will also require more memory. For a human data set using the latest annotation sets (obtained through the Download Reference Genome Data), a minimum of 16 GB of RAM is required and we recommend 24 GB of RAM. If this causes problems, it is still possible to make use of the old RNA-Seq Analysis tool, which is available as a plugin.
- The parameters have been changed and updated to make use of tracks and include a more explicit way of controlling what reference annotations should be used (if any).
- The fusion genes table has been changed into an annotation track.
- Variant tracks can be annotated with expression values from expression tracks.
- New statistical test based on EdgeR:
- The tools available for statistical analysis of differential expression have been extended to also include the 'Exact Test' (developed by Robinson and Smyth and implemented in the EdgeR Bioconductor package). The test is applicable to comparisons of pairs of groups and implicitly performs TMM normalization.
- New functionality for phylogenetic trees (was previously part of a beta plugin)
- Tool to reconstruct phylogenetic trees based on k-mers. This approach avoids the computationally intensive step of constructing a multiple alignment of the input sequences. The k-mer based reconstruction tool is especially useful for whole genome phylogenetic reconstruction where the genomes are closely related.
- Tool performing a statistical evaluation of different substitution models to be used with maximum likelihood tree construction. The output of this tool is a report that lists the recommended settings to be used when constructing phylogenetic trees based on maximum likelihood.
- Added an option for using the Kimura 80 substitution model when creating trees with distance based methods.
- Distance-based tree reconstruction methods can now reconstruct trees from protein alignments using the Jukes-Cantor substitution model or the Kimura protein ML distance estimate.
- A user defined start tree can now be supplied to the ML inference tool.
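As an illustration of how a k-mer based comparison avoids a multiple alignment, the sketch below computes a pairwise distance directly from k-mer counts. The distance measure used by the tool is not stated in these notes; the Euclidean distance between normalised k-mer frequencies is an assumption chosen for simplicity, and the function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def kmer_profile(seq, k):
    """Count every overlapping k-mer in seq."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_distance(a, b, k=3):
    """Euclidean distance between normalised k-mer frequency vectors (assumed measure)."""
    pa, pb = kmer_profile(a, k), kmer_profile(b, k)
    na, nb = sum(pa.values()), sum(pb.values())
    if na == 0 or nb == 0:
        raise ValueError("sequences must be at least k bases long")
    keys = set(pa) | set(pb)
    # Counter returns 0 for k-mers that are absent from one of the sequences.
    return sqrt(sum((pa[x] / na - pb[x] / nb) ** 2 for x in keys))


print(kmer_distance("ACGTACGTAC", "ACGTACGTAC"))
print(kmer_distance("ACGTACGTAC", "ACGTTCGTAC"))
```

A matrix of such pairwise distances could then be passed to a distance-based tree construction method. No alignment of the input sequences is needed at any point.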
- Tracks:
- The speed of the Annotate with known variants and Filter against Known Variants tools has been greatly improved when using a large reference database like dbSNP.
- Table filtering of tracks: it is now possible to use "overlaps" and "doesn't overlap" when filtering on the region column. This allows for quicker inspection if any of the variants or annotations overlap a particular position.
- Tooltips on variant tracks in track lists now include the number of variants in the track.
- The Identify Graph Threshold Areas tool is now capable of identifying intervals with higher-than-average reads. This is obtained by setting a “window-size” parameter in the "Identify Graph Threshold Areas" wizard that specifies the width of the window around every position that is used to calculate an average value for that position.
- Previously, when importing variants from VCF files and from UCSC, a small number of variants were ignored: they were not proper replacements or MNVs because they contained reference bases at the ends. These variants are now trimmed and properly imported. This also affects the Download Reference Genome Data tool.
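The window-based averaging described for Identify Graph Threshold Areas can be sketched as follows. This is a minimal illustration, not the tool's algorithm: the handling of positions near the ends of the graph, the centring of the window and the function names are assumptions.

```python
def windowed_average(values, window_size):
    """Mean of the values in a window of about window_size positions centred on each index."""
    half = window_size // 2
    averages = []
    for i in range(len(values)):
        lo = max(0, i - half)                # window is clipped at the ends (assumption)
        hi = min(len(values), i + half + 1)
        averages.append(sum(values[lo:hi]) / (hi - lo))
    return averages


def threshold_areas(values, lower, upper, window_size=1):
    """Half-open (start, end) index intervals whose windowed average lies in [lower, upper]."""
    smoothed = windowed_average(values, window_size)
    areas, start = [], None
    for i, value in enumerate(smoothed):
        inside = lower <= value <= upper
        if inside and start is None:
            start = i
        elif not inside and start is not None:
            areas.append((start, i))
            start = None
    if start is not None:
        areas.append((start, len(smoothed)))
    return areas


coverage = [0, 0, 10, 10, 10, 0, 0]
print(threshold_areas(coverage, lower=3, upper=100, window_size=1))
print(threshold_areas(coverage, lower=3, upper=100, window_size=3))
```

With a window of 1 only the raw values are tested. A wider window smooths the graph, so the reported area extends to neighbouring positions whose local average still passes the threshold.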
- Workflows:
- Possibility to have bulk configuration of elements. This makes it possible to set the same reference data for multiple elements at once.
- Workflows can be added inside a workflow. The inner workflow is "unfolded" into the single elements.
- Parameters can now be renamed in the editor by the creator during configuration of the elements.
- Workflows with invalid/unknown elements are laid out more neatly and consistently.
- The sidepanel now has an option to display rulers in the editor to better indicate the size of a workflow (particularly when exporting)
- Fit Width now fits the entire workflow in the editor by zooming out.
- The sidepanel has a new section "Minimap" which shows an outline of the whole workflow. It can be used to navigate the workflow in the view and also supports zooming
- The design of the workflow editor can be changed via the sidepanel (the old designs in the preferences have been removed)
- Better validation when configuring parameters in workflows
- If a tool receives inputs from at least two tools, the inputs can now be ordered via the context menu on the connections or the input part of the target element.
- The name of an output in the workflow can be set by configuring the output element
- Parameters of a workflow run can now be exported to various formats via the wizard
- It is now possible to reset a reference parameter. Previously, this was only possible by removing the whole element and adding it again.
- In the workbench the installed workflows are now sorted alphabetically.
- The graphics export of a workflow now knows about the scale and one can now export the whole workflow or only the current view.
- A cpw file can now be dragged into the workflow manager and will be installed.
- Further speed improvements on working with larger workflows in the editor
- New tools that are now workflow-enabled:
- Create Track List. (With the requirement that all tracks must also be a workflow output.)
- Annotate with Flanking Sequences
- Convert from Tracks
- All tools in Statistical Tests
- Amino acid changes:
- There are two new columns reporting amino acid changes for the longest transcript. Previously, amino acid changes would be reported for all transcripts, and this information is still available, but many users prefer just to use the longest transcript, and this information is now available in two new columns: one for the change on the protein level, and one for the change on the coding DNA level.
- Variants up- and downstream of the coding regions are now annotated with a coding DNA position as long as they are inside the transcript. In order for this to be reported, the amino acid changes tool has to be supplied with an mRNA track, which will be used to determine whether the variant is included in the transcript.
- Extract consensus sequence is now able to copy annotations from both existing consensus sequence and the reference sequence.
- When extracting consensus sequence from a mapping, conflict and low coverage annotations now include the position on the reference.
- Read mappings can now be exported to a tabular file including detailed per-base information on coverage and nucleotide composition including insertions and deletions.
- Trim annotations can be used to trim off sequences when exporting to fasta.
- Secondary peak calling has been improved: it now only detects peaks that have a distinct peak shape, and only peaks that fall within the same interval as the top peak are called. In addition, trim annotations are taken into account so that no peaks are called within trimmed regions. This greatly reduces false positive calls. Finally, the annotations now include information about the secondary peak's fraction of the maximum peak height.
- Limitations on export of Excel 2010 files (xlsx) are removed:
- Multiple tables can be exported to one xlsx file
- Reports can be exported to xlsx
- Hyperlinks are preserved in xlsx files
- SignalP prediction has been updated to be server-, batch- and workflow enabled.
- Assemble Sequences tools now accept sequence lists as input.
- REBASE restriction enzyme list updated to version 310.
Bug fixes
- The log of a server executed workflow now states when the workflow has been cancelled.
- A workflow with elements which provide additional inputs could not be batched.
- Various bugs in the extract consensus sequence tool have been fixed.
- Tracks with many "chromosomes" took up extra disk space. These are now more compressed.
- Fixed crash when creating a detailed mapping report.
- When translating to protein, ambiguous nucleotides potentially resulting in stop codons were not translated properly, and only the codons resulting in an amino acid were represented in the protein. Now the stop codons are also represented by an X in the protein sequence.
- Prevented multiple error dialogs from being shown on top of each other.
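The translation fix for ambiguous nucleotides can be illustrated with a small sketch. It assumes the standard genetic code and DNA input using IUPAC ambiguity codes, and it reports X whenever the possible codons disagree (for example, when some translate to an amino acid and others to a stop). Whether the product applies exactly this rule in every case is an assumption; the function names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate_codon(codon):
    """Translate one codon; return X unless every possible expansion agrees."""
    expansions = product(*(IUPAC[base] for base in codon.upper()))
    residues = {CODON_TABLE["".join(e)] for e in expansions}
    return residues.pop() if len(residues) == 1 else "X"


def translate(seq):
    """Translate complete codons only; trailing bases are ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(translate_codon(seq[i:i + 3]) for i in range(0, usable, 3))


print(translate("ATGGCNTAN"))
```

Here GCN is unambiguous at the protein level (all four codons encode alanine), while TAN covers both tyrosine and stop codons and is therefore written as X rather than being silently resolved to the amino acid.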
Changes
- The De novo assembly legacy plugin has been discontinued and is no longer available for this release.
Compatibility
- This release can be used with CLC Server Command Line Tools 2.1
- This release is using the read mapping and de novo assembler that corresponds to QIAGEN CLC Assembly Cell 4.2.1
CLC Server Command Line Tools
New tools
- add_att_b_sites
- bp_reaction
- create_track_list
- empirical_analysis_dge
- gaussian_statistical_analysis
- kmer_tree_construction
- lr_reaction
- model_testing
- motif_search
- proportion_based_statistical_analysis
Tools that have added parameters
- amino_acid_changes
- --mrna-track
- consensus_sequence_extraction
- --keep-consensus-annotations
- --transfer-reference-annotations
- graph_threshold
- --window-size
Tools that have changed parameters
- import: some importers have new names to comply with conventions
- com.clcbio.baseimplementation.importplugins.TrimLinkerLibraryImportPlugin is now trim_adapter_list
- com.clcbio.genomics.base.algo.rnaseq.smallrna.MirBaseImporter is now mir_base
- rna_seq: parameters removed
- --colorspace
- --downstream
- --exon-discovery
- --exon-discovery-min-coverage
- --exon-discovery-min-length
- --exon-discovery-min-reads
- --expression-level
- --len-fraction
- --max-distance
- --min-distance
- --min-similarity
- --mismatch
- --organism
- --strand-specific
- --upstream
- --use-annotations
- rna_seq: parameters added
- --auto-detect-paired-distances
- --color-error-cost
- --color-space
- --deletion-cost
- --genes
- --global-alignment
- --insertion-cost
- --length-fraction
- --mapping-type
- --mismatch-cost
- --mrna
- --reference-type
- --similarity-fraction
QIAGEN CLC Genomics Server 5.5.2
Bug fixes
- Fixed: The automatic Grid Worker deployment during start up of the server deleted license.properties and vmoptions files.
QIAGEN CLC Genomics Server 5.5.1
New features
- VCF export allows you to enforce diploid reporting of the variants. This will enable the VCF files to be parsed with other software relying on each line to report two alleles. As part of this, the CLCAD field is replaced with CLCAD2 (read more in the user manual).
Changes
- Variant comparison tools are workflow-enabled
- When importing Genbank nucleotide sequences, the Server will determine whether it is DNA or RNA based on the sequence rather than the description in the file.
Bug fixes
- Fixed: An important issue with the interpretation of ensembl-style gtf files when using the Download Genomes functionality or the Import Tracks functionality. This issue only affects version 5.5 of the Genomics Server. If you have downloaded gene annotations using Download Genomes or have chosen to import ensembl-style gtf annotation files using the tool Import | Tracks using version 5.5 of the Genomics Server, then we highly recommend that you delete the annotation tracks you have generated, and perform the download or import again. Annotations from earlier versions of the Server are not affected by this issue.
- Fixed inconsistencies when importing variant files from UCSC, affecting variants on the negative strand where the allele sequence is longer than one base. This affects dbSNP tracks downloaded using the Download Genome tool, and we highly recommend that you delete any variant tracks imported or downloaded from UCSC, and perform the import or download again.
- Fixed: Filter Against Control Reads was using only the first control reads track, if multiple ones were selected. The issue affected both 5.0 and 5.5 versions. If you used multiple control read tracks simultaneously to filter variants, we strongly recommend that you redo the analysis.
- Fixed: Unresponsive administration interface when connected to Active Directory systems with many users
- It is now possible to export to a local folder from CLC Server Command Line Tools.
- Export is now possible on Grid systems.
- SAM/BAM import: reads mapped to reference sequences that were not provided during import are no longer included in the list of unmapped reads. They are not imported at all, and a note in the history of the imported data records how many reads were ignored.
- Improvements and fixes to the Indel and Structural Variation tool:
- Improved the detection of insertions and deletions from self-mapping evidence particularly relevant for amplicon data
- Fixed: a bug which caused some variants to be called as 'replacements' that should be called as 'insertions' or 'deletions'
- Fixed: a bug which caused structural variations to go undetected for long unaligned ends
- Fixed: in Trio Analysis, homozygous variants on chr Y and MT and male X were wrongly marked as de novo mutations when not found in the father. The parameters for Trio Analysis have been changed as part of this.
- Fixed: SAM and BAM export now supports direct gzip and zip compression of the files.
- Fixed: Local Realignment fails on certain data sets
- Fixed: out of memory error when performing bootstrapping with ML tree construction methods.
QIAGEN CLC Genomics Server 5.5
New server-specific features
- New features for configuring job nodes
- The CPU limit can now be specified directly from the administrator web interface under "Job distribution". For the master (or single server) and for each job node, a drop down list allows specification of the CPU limit.
- An “information” button has been added next to the “host” field for master and job nodes. A click on this button gives access to information about the server.
- A CLC Server in a Job Node setup now executes all Server Commands by default. A subset of the available Server Commands can be selected from a command selection dialog that is now searchable. Server Command names are listed alphabetically and followed by the command type in parentheses.
- New server-enabled tools:
- Classical Sequence Analysis, Alignments and Trees
New features shared with CLC Genomics Workbench
- Variant detection:
- New tool for adjusting read mappings through local realignment. The Local Realignment tool has the option to realign unaligned ends, realignment with a guidance variant track (e.g. obtained from external resources such as dbSNP, through the Indels and Structural Variants tool described below or from analysis of other read mappings) and allows for realignment of multiple samples. Has previously been available as a beta plugin.
- New tool for detecting structural variants (detects insertions and deletions, intra-chromosomal translocations, tandem duplications and inversions) working on "unaligned ends (soft clippings)". Has previously been available as a beta plugin.
- Important changes to variant reporting: adjacent variants are now reported as one variant instead of linked variants.
- A new variant filter has been added to both “Probabilistic Variant Detection” and “Quality-based Variant Detection”: “Ignore variants in non-specific regions”. This new filter ensures that variants in regions covered by just a few non-specific reads are ignored.
- Probabilistic Variant Detection: A new threshold filter, “Required variant count”, has been added to the wizard. This filter ensures that only variants present in a number of reads that exceeds the specified threshold are called.
- Quality-based Variant Detection: Addition of a new column that reports hyper-allelic status of variants. This is based on the specified threshold “Maximum expected allele” in the “Set genome information” wizard under “Ploidy”. The output in the table is “Yes” or “No” with respect to whether the threshold has been exceeded.
- A new column has been added to the variant track table that describes the length of the insertions, deletions, and replacements. This makes it possible to filter on the length of e.g. insertions/deletions.
- VCF export is now using genotype fields. The tag CLCAD is used for count of a variant, and PL is used for coverage. In this version, one variant track will result in one VCF file.
- Variant annotation:
- New tool for comparing variants between two samples
- Filter against known variants: An SNV in the parameter track can be annotated to an MNV in a sample track as partial match, if it is part of this MNV.
- Filter against known variants: There is a new option to let MNVs be annotated as an exact match if several SNVs can be joined to represent the full MNV allele sequence in the parameter track.
- When running the “Annotate with overlap information” tool using an annotation track as input and a variant track as parameter track, the column describing the specific variant in the Track Table now shows the position and description of the variants. The variant description also appears in the track tooltips when holding the mouse over the variants.
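The partial-match and joined-match rules for SNVs and MNVs described above can be sketched with a deliberately simplified variant model. Each variant is reduced to a (position, allele) pair on a single chromosome; reference alleles, zygosity and the tool's actual data structures are left out, and the function names are hypothetical.

```python
def is_partial_match(snv, mnv):
    """True if the SNV allele equals the base at the same position inside the MNV allele."""
    snv_pos, base = snv
    mnv_start, allele = mnv
    offset = snv_pos - mnv_start
    return 0 <= offset < len(allele) and allele[offset] == base


def joined_exact_match(snvs, mnv):
    """True if the given SNVs, taken together, spell out the full MNV allele."""
    mnv_start, allele = mnv
    by_position = dict(snvs)  # position -> allele base
    return all(by_position.get(mnv_start + i) == base for i, base in enumerate(allele))


sample_mnv = (100, "AGT")
known_snvs = [(100, "A"), (101, "G"), (102, "T")]
print(is_partial_match((101, "G"), sample_mnv))
print(joined_exact_match(known_snvs, sample_mnv))
print(joined_exact_match(known_snvs[:2], sample_mnv))
```

A single known SNV inside the MNV yields only a partial match. The exact match requires every position of the MNV allele to be covered by an agreeing SNV.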
- Workflows:
- Automatic update of tools in workflows. Tools in existing workflows will automatically be updated when required after pressing a button labeled "Migrate Workflow" that only is available when a workflow needs to be updated. If new parameters have been added to the updated version of a tool, these will be used with their default settings. In addition to the updated workflow (which keeps the original name), a copy will be created of the workflow in its original form with the original name extended with "backup (disabled)".
- Export can now be part of workflows.
- Previously, when running the “ChIP-Seq Analysis” tool, the result would be a copy of the read mapping with annotations added. Now the annotations are added to the read mapping used as input. Workflows using the "ChIP-Seq Analysis" tool must be manually updated.
- New tools that are now workflow-enabled:
- Classical Sequence Analysis, Alignments and Trees
- Classical Sequence Analysis, General Sequence Analysis
- Classical Sequence Analysis, Nucleotide Analysis
- Molecular Biology tools, Sequencing Data Analysis
- Track Tools, Annotate and Filter
- Track Tools, Graphs
- Resequencing Analysis, Compare Variants
- Transcriptomics Analysis, General Plots
- De Novo Sequencing
- De novo assembly:
- New tool: Map Reads to Contigs. This tool allows mapping of reads to contigs. This can be relevant in situations where contigs have been imported from an external source, the output from a de novo assembly is contigs with no read mapping, or if you wish to map a new set of reads or a subset of reads to the contigs.
- Scaffolds can be exported in AGP format: scaffolded contigs are exported as individual contigs and not as a single scaffold with N's inserted in between contigs. This allows for submission-ready data.
- Great performance improvement when updating the contig sequence based on reads that are mapped back to contigs.
- Tracks: Several new features have been added:
- A new tool has been included: “Identify Graph Threshold Areas”. This tool uses graph tracks as input to identify graph regions that fall within certain limits (thresholds that have been specified by the user).
- Extract annotations from track. This tool makes it very easy to extract parts of a sequence (or several sequences) based on its annotations.
- The create histogram tool now also accepts graph tracks as input.
- The Coverage analysis tool is a new tool that can find regions in a read mapping where the coverage is suddenly dropping or rising.
- The "Assemble Sequences" and "Assemble Sequences to Reference" tools are now batch, server and workflow enabled.
- Assemble Sequences: Trimming is no longer integrated with the “Assemble Sequences” tool. This means that trimming must be done separately with the “Trim Sequences” tool.
- Export framework redesigned
- Export of multiple files: you can export several files in one go. The naming of the file will default to the name used in the Navigation Area of the Workbench, but the user can specify a naming pattern to use instead.
- Export can be integrated into workflows
- Support for direct compression of exported files in zip and gzip.
- Previously, VCF export required the user to know that both a variant track and a sequence track should be selected before exporting. This has changed, so that the user only has to select the variant track as input, and the sequence track is supplied as a parameter. This means it is more obvious that it should be selected, and it also means that the choice of sequence track will be remembered for the next vcf export.
- SOLiD import now accepts XSQ files
- The following Plug-ins are now fully integrated in the Server
- InDels and Structural Variation (old plugin name: "Structural Variation")
- Local Realignment
- Extract Annotations
- The tomato genome, Solanum lycopersicum SL2.40.18, available in the Download Genome tool.
- Phylogenetic trees:
- Create Tree now supports the Kimura 2-parameter substitution model for DNA sequences and Kimura's distance estimate for protein sequences (Kimura 1983).
- It is now possible to construct Maximum Likelihood phylogenies from protein sequences.
Improvements
- Subprocesses: When submitting a batch job to the server, the subprocesses are now shown in the Workbench. When running workflows, only the master process is shown.
- Extract consensus sequence tool:
- It is now possible to use the quality scores when resolving conflicts or disagreements between reads with “Insert ambiguity codes”. Previously, “Use quality scores” could only be selected when using the “Vote” option for conflict resolution.
- Low coverage regions are now annotated in the consensus sequence produced.
- When using the “Translate to protein” tool, the max limit has been raised to 1GB.
- The alignment tool is now more memory efficient.
- Read mapping: The speed of running a read mapping against a masked reference has been improved significantly. When mapping reads to a reference sequence, it is possible to map reads to only selected annotated regions of the reference (= masking). Previously masking of a reference was performed by replacing the masked out nucleotides with N's. The new masking method discards the masked out nucleotides by splitting the reference into separate sequences. Hence, the masked out sequences are completely ignored in the analysis. The remaining sequence fragments are positioned according to the original unmasked reference sequence.
- BLAST has been upgraded to BLAST+ 2.2.28 that includes a number of improvements and bug fixes. A full list of BLAST+ 2.2.28 changes can be viewed at http://www.ncbi.nlm.nih.gov/books/NBK131777.
- Phylogenetic trees:
- Bootstrapping with the "Maximum Likelihood Phylogeny" is now possible.
- Bootstrap values are now displayed in percent instead of absolute numbers.
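The masking approach described in the read mapping item above (splitting the reference into separate sequences while keeping their original positions) can be sketched as follows. The exact-match lookup stands in for the real read mapper, and all function names are hypothetical; the sketch only illustrates the coordinate bookkeeping.

```python
def split_reference(reference, regions):
    """Keep only the selected regions as separate fragments.

    regions are half-open, 0-based (start, end) intervals, assumed sorted and
    non-overlapping. Each fragment is returned with its offset on the
    original, unmasked reference.
    """
    return [(start, reference[start:end]) for start, end in regions]


def map_read(read, fragments):
    """Naive exact-match lookup; returns the position on the original reference or None."""
    for offset, fragment in fragments:
        hit = fragment.find(read)
        if hit != -1:
            return offset + hit  # translate back to unmasked coordinates
    return None


reference = "AAAACCCCGGGGTTTT"
fragments = split_reference(reference, [(4, 8), (12, 16)])
print(fragments)
print(map_read("TTT", fragments))
print(map_read("GGG", fragments))
```

Because the masked-out stretches are not part of any fragment, a read that matches only masked sequence finds no hit at all, instead of being compared against a run of N's.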
Bug fixes
- When setting up user authentication, a help button was missing in the “Server User and Group Management” dialog. This has now been fixed.
- Numbering of amino acids when calculating amino acid changes was wrong for coding regions spanning the starting point of circular chromosomes. We recommend running amino acid calculation again. Please note that the actual amino acid change is called correctly, only the numbering is affected.
- PDF export of the history of a result did not include the name and version number of the Workbench that produced the result.
- Phylogenetic trees:
- The Jukes-Cantor distance estimate now ignores all positions containing gaps in pairwise alignments.
- Disabled substitution rate estimation when the corresponding option is deselected by the user in the Maximum Likelihood Phylogeny tool.
- Fixed a bug that caused branch lengths to be estimated incorrectly for ML trees.
Changes
- Option to calculate RPKM values for genes without associated transcripts has been added to RNA-seq analysis.
- System requirements for Linux have changed. From this release, SuSE is supported from version 10.2. This was previously version 10.0.
- Secondary Peak Calling: The parameter “Fraction of max peak height for calling”, in the “Secondary Peak Calling” wizard, has been changed to use the interval 0-1 with 0.2 as default setting. Previously the interval was 0 – 100 with 20 as default setting.
- The internal Audit Log is no longer archived automatically to prevent loss of disk space.
CLC Server Command Line Tools
This release of the CLC Genomics Server 5.5 is compatible with CLC Server Command Line Tools 2.0
- Export framework means changing syntax of exporters:
- When specifying data to be exported, use -i instead of -s
- Some exporters have changed names
- Added possibility to compress exported files directly
- New server tools:
- assemble_sanger_sequences
- compare_sample_variant_tracks
- contig_read_mapping
- coverage_analysis
- create_histogram
- extract_annotations
- extract_overlapping_reads
- graph_threshold
- local_realignment
- ml_phylogeny
- reference_assemble_sanger_sequences
- reverse_complement_sequence
- reverse_sequence
- structural_variant_detection
- tree_construction
- Parameter changes:
- annotate_from_known_variants: option added
- --auto-join
- consensus_sequence_extraction: option added
- --ambiguity-noise-minimum
- filter_against_known_variants: option removed
- --keep-group
- filter_against_known_variants: option added
- --auto-join
- mapping_graph_tracks: options added
- --negative-strand-coverage
- --positive-strand-coverage
- --stranded-coverage
- probabilistic_variant_detection: option removed
- --discard-coalesced-snvs
- probabilistic_variant_detection: options added
- --ignore-variants-in-non-specific-
- --required-read-count
- quality-based_variant_detection: option added
- --ignore-variants-in-non-specific-
- secondary_peak_calling: options removed
- --addfeatures
- --alwaysn
- --fraction
- secondary_peak_calling: options added
- --add-annotations
- --ambiguous-mode
- --fraction-remove-this
QIAGEN CLC Genomics Server 5.0.6
Bug fixes
- Fixed problem listing user groups from Active Directory
QIAGEN CLC Genomics Server 5.0.5
Bug fixes
- Fixed problems downloading and importing COSMIC variation data introduced in QIAGEN CLC Genomics Server 5.0.4: Sex chromosomes and mitochondrial genome were not annotated. We recommend everybody having downloaded or imported COSMIC variations with QIAGEN CLC Genomics Server 5.0.4 to re-do the download or import and re-run all analysis where this COSMIC variant track has been used.
- Various minor bug fixes.
QIAGEN CLC Genomics Server 5.0.4
Bug fixes
- Fixed problem in workflows introduced in QIAGEN CLC Genomics Server 5.0.3: only part of the workflow was executed in workflows that branch right after the input element.
- Fixed issue with automated association of chromosome names during import of track data for some non-human organisms.
QIAGEN CLC Genomics Server 5.0.3
Bug fixes
- Fixed: The Create Statistics for Target Regions tool began counting the reference positions at 0 rather than at 1. This caused a discrepancy with the reference position reported in other tools.
- Fixed errors of line breaks in annotation notes
- ChIP-Seq annotations were not added when running ChIP-Seq on the Genomics Server. The fix means that workflows using ChIP-Seq will be broken and need to be re-configured by deleting the ChIP-Seq element and adding it again.
- Create mapping graph tracks caused problems when part of workflows
QIAGEN CLC Genomics Server 5.0.2
Improvements
- An update to the de novo assembly algorithm means that it will only include Ns in the contigs when doing scaffolding, or if the reads themselves contain Ns. Previously, ambiguities in the graph behind the assembly resulted in regions of Ns, but these have turned out to be problematic for customers submitting their results to NCBI, so the algorithm is now taking extra care to avoid this.
- VCF export: the header now mentions the name and version of the software producing the VCF file, and the identifier of the origin variant track is also encoded as a CLC URL in the header. The installer of the Workbench will by default associate the CLC URL with the Workbench, so that it can directly open the file. Alternatively, the id can be pasted into the search field in the Workbench to retrieve it.
- GVF can now be exported on the server.
- Adapter trim list can now be imported on the server.
Bug fixes
- If the local hostname of the server computer did not resolve to an IP-address, it was not possible to connect to the Server from a Workbench. This has now been fixed.
- Import or download of UCSC variant tracks was only done partially with no warning to the user. Only variants on chr1 were annotated. This has now been fixed, but we strongly recommend all users downloading or importing variant data from UCSC using Genomics Workbench 6.0 to re-run the import/download using the new version.
- Trio analysis tool did not report a reference allele as a de novo mutation, even if both mother and father only had variant alleles at this position. This has now been fixed so that reference alleles are not considered special when analyzing the inheritance.
- The RNA-Seq Analysis produced only single reads in the unmapped reads list. This has now been fixed, and we encourage customers using paired reads as input and performing downstream analysis of the unmapped reads to rerun the RNA-Seq Analysis.
- In the GO Enrichment Analysis tool for variant data, some columns were missing. This has now been fixed.
- When trimming paired data, section 4 in the report did not show the right number of reads used as input.
- Several errors related to workflow configuration and execution have been fixed.
- An error occurring when using variant tracks from old versions in the Compare Variants tool has been fixed.
- Annotations were added by the Find Open Reading Frames tool, even though the option to add annotations was not selected. This is now fixed.
- Fixed an out-of-memory problem in the Create Alignment tool.
- The result of the Target Regions Statistics tool is now named after the input file.
- Various bug-fixes
QIAGEN CLC Genomics Server 5.0.1
Improvements
- The RNA-Seq tool supports strand-specific mapping of paired reads.
Bug fixes
- Workflows including variant detection need to be upgraded. The variant detection elements need to be re-created and connected.
- New versions of the Chrome browser caused web interface to flicker
- Fixed error in probabilistic variant detection that caused it to crash.
- Fixed an error in the trim report: When several trim methods were chosen, the numbers did not accurately reflect the number of sequences trimmed in each step.
- Fixed an error in the figure showing the paired distance in the RNA-Seq results report
- Fixed an error when translating DNA to protein. When more than 10 sequences were produced, the resulting protein sequence included X instead of * as stop symbol. We advise customers to re-run any analyses with the translation tool when using more than 10 sequences as input.
- Fixed error in target region statistics when some regions were 0 bases long.
- Links to reference sequences were missing from the history of mapping results. This is now fixed.
- Unmapped reads from de novo assemblies were not passed on to the next element in a workflow. This is now fixed.
- Various minor bug-fixes
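The translation fix above concerns only the symbol used for stop codons. As a minimal sketch (not the server's implementation), the following Python translates DNA with the standard genetic code, emitting `*` for stop codons and reserving `X` for codons that cannot be resolved. The function name `translate` and the decision to ignore trailing partial codons are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), listed in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA in reading frame 1.

    '*' marks a stop codon; 'X' marks a codon that cannot be resolved
    (for example one containing N).
    """
    dna = dna.upper()
    protein = []
    # Assumption: trailing bases that do not form a full codon are ignored.
    for i in range(0, len(dna) - len(dna) % 3, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)
```

With this sketch, `translate("ATGGCCTAA")` gives a protein ending in `*`, which is the behavior the fix restores for runs producing more than 10 sequences.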
QIAGEN CLC Genomics Server 5.0
New server-specific features
- Support for LDAP authentication using GSSAPI/Kerberos
- Trio Analysis
- Find Open Reading Frames
- Translate to Protein
- Convert DNA to RNA
- Convert RNA to DNA
New features shared with QIAGEN CLC Genomics Workbench
- Workflow: there are several important new features for workflows
- It is possible to control which parameters should be locked or unlocked. This means that the creator of the workflow can decide which parameters should be left open for adjustment when the workflow is executed.
- Several tools are now workflow-enabled:
- Workflow compatibility: with this release, all of the tools in the Resequencing folder and the Trim tool have changed. This is mainly due to the change in the variant format (explained below). Workflows using these tools need to be updated by deleting the tool, adding it again and restoring the connections and parameters that have been modified. When you open the workflow editor, the workflow elements that need to be updated are highlighted in red. For installed workflows, this needs to be done in the original workflow design, and the installer needs to be rebuilt and installed again. We are sorry for the inconvenience caused by this, and we are working on making the upgrade mechanism for the next release much smoother.
- Variant detection and resequencing
- New variant data format. We recommend all users of the variant detection tools to read the change notes in the manual for this release. The main features are:
- Variants are reported with one entry per allele. This means that heterozygous variants are represented as two lines, including one line for the reference allele.
- Variants were previously joined to form MNVs. The MNV concept has been replaced by linkage groups, which mark that two variants have been observed together and ensure that tools like Amino Acid Changes produce correct results.
- As a consequence, the variant types have been updated.
- As a consequence of the new data format, the Filter against Variant Database tool has been updated:
- The auto-link feature is now obsolete
- There are now three modes of filtering (learn more here). The filter for exact matches replaces the Haplotype Comparison tool which has been removed from this release
- New tool for annotating variants with flanking sequence from the reference
- New tool for removing reference allele variants
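To illustrate the one-entry-per-allele representation described above, here is a minimal Python sketch. The `AlleleEntry` fields and the `expand_genotype` helper are hypothetical names chosen for illustration and do not reflect the actual variant track schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlleleEntry:
    """One row of a per-allele variant table (hypothetical schema)."""
    chromosome: str
    position: int
    reference: str
    allele: str
    is_reference_allele: bool


def expand_genotype(chromosome, position, reference, alleles):
    """Return one entry per distinct allele observed at a site.

    A heterozygous site with a reference allele and a variant allele
    yields two entries, one of which is flagged as the reference allele.
    """
    entries = []
    # dict.fromkeys removes duplicate alleles while keeping their order.
    for allele in dict.fromkeys(alleles):
        entries.append(
            AlleleEntry(chromosome, position, reference, allele, allele == reference)
        )
    return entries
```

For a heterozygous A/G site on a reference A, `expand_genotype("chr1", 100, "A", ["A", "G"])` returns two rows, the first flagged as the reference allele. A tool that removes reference allele variants would drop the rows where `is_reference_allele` is true.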
- De novo assembly
- Automatic paired distance estimation is now part of the de novo assembly
- Guidance only option is now able to use single reads as well as paired reads
- The number of Ns deriving from ambiguities in the graph data structure built by the assembler is reduced. Note that this does not refer to Ns inserted as part of scaffolding.
- Fixed problem causing scaffold annotations to be removed when updating contig sequences based on mapping
- Improved the scaffolding accuracy for overlapping contigs.
- Mapping reads to circular chromosomes is now fully supported
- All algorithms and exporters support circular mappings
- When downloading genomes using the Download Genome tool, circular chromosomes are marked as circular. If this information is important for the further analysis, please download or import a new copy of the reference genome, since this information is not part of existing tracks. Circular and linear versions of the same chromosome can
- New tool for extracting consensus sequence from a read mapping or BLAST result:
- A number of options for handling low-coverage regions, including putting in Ns or splitting the consensus sequence
- Ability to decide for ambiguity or voting scheme taking quality scores into account when dealing with conflicts. A noise threshold can be added for the ambiguity option.
- Consensus sequences are annotated with important events (low-coverage regions and conflicts).
- Ability to run in batch and be part of workflows
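The two conflict-resolution options for consensus extraction listed above can be sketched as follows. This is an illustrative Python sketch, not the tool's algorithm: it assumes a conflict column is a list of `(base, quality)` pairs containing only A, C, G and T, that voting sums quality scores per base, and that the noise threshold is the minimum fraction of reads a base needs before it contributes to the ambiguity code. All function names are hypothetical.

```python
from collections import Counter, defaultdict

# IUPAC nucleotide codes keyed by the set of bases they represent.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M", frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def vote(column):
    """Voting scheme: pick the base with the highest summed quality score."""
    if not column:
        return "N"
    totals = defaultdict(int)
    for base, quality in column:
        totals[base] += quality
    # Ties are broken alphabetically because the candidates are sorted first.
    return max(sorted(totals), key=totals.get)


def ambiguity(column, noise_threshold=0.1):
    """Ambiguity scheme: combine all bases at or above the noise threshold."""
    counts = Counter(base for base, _ in column)
    depth = sum(counts.values())
    kept = {base for base, n in counts.items() if n / depth >= noise_threshold} if depth else set()
    # If nothing passes the threshold, fall back to N.
    return IUPAC[frozenset(kept)] if kept else "N"
```

For a column with two A reads and two higher-quality G reads, `vote` returns `G` while `ambiguity` returns `R`. A single low-frequency read is ignored by `ambiguity` once it falls below the noise threshold.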
- New tool for merging overlapping pairs
- Tracks
- VCF export of variant tracks: Please note that you have to input both the variant track and the reference genome sequence track for Export.
- Trim:
- Runs on multiple cores. This will greatly speed up trim on computers with multiple cores.
- The definition of adapters for adapter trim has moved from the preferences to its own file in the Navigation Area. This makes it easier to manage large sets of adapters, it solves some usability problems related to the old dialog, and it makes it possible to work with adapter trim from the QIAGEN CLC Genomics Server Command Line Tools. Adapters can be imported directly using the standard import framework, or they can be created from scratch by adding them manually in the adapter list editor.
- Target region statistics:
- The minimum coverage value is used throughout the coverage report and tracks for defining low coverage thresholds
- Additional table and plot in the report showing how many target regions have a certain percentage of the region above the low coverage threshold.
- Additional information in the track: median coverage and fraction of fragment covered by the minimum coverage
- New output type: per-base coverage table can now be created
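As a rough illustration of the per-region values mentioned above (median coverage and the fraction of a region covered by at least the minimum coverage), consider the following Python sketch. It assumes a region is given as a list of per-base coverage values; the function name and the returned keys are hypothetical, and the handling of zero-length regions is an assumption.

```python
from statistics import median


def region_coverage_summary(per_base_coverage, minimum_coverage):
    """Summarize coverage for one target region.

    per_base_coverage: one coverage value per base in the region.
    minimum_coverage: threshold below which a base counts as low coverage.
    """
    length = len(per_base_coverage)
    if length == 0:
        # Assumption: a zero-length region reports zeros instead of failing.
        return {"length": 0, "median_coverage": 0, "fraction_at_minimum": 0.0}
    covered = sum(1 for depth in per_base_coverage if depth >= minimum_coverage)
    return {
        "length": length,
        "median_coverage": median(per_base_coverage),
        "fraction_at_minimum": covered / length,
    }
```

For a four-base region with coverage `[0, 5, 10, 20]` and a minimum coverage of 10, half of the region meets the threshold and the median coverage is 7.5.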
- Detailed mapping report includes more information:
- The tables for non-specific and non-perfect matches display the fraction of all mapped reads in addition to the number of reads
- Overview plot of lengths of insertions and deletions in the read alignments
- Tables and plots showing differences between reads and reference for each base.
- Information about quality score distribution for matches and mismatches
- Distribution of mismatches on read position
- Information about number of reads with unaligned ends and distribution of lengths of the unaligned ends
- RNA-Seq: fusion gene table has been changed to list broken pairs rather than gene combinations. The pairs can be extracted to a sequence list for further investigation.
- Import of tabular mapping files is no longer supported. This format was produced by the early Illumina pipelines (with Eland) and this is no longer relevant. The SAM format has taken the place as the de facto standard for mapping data.
- Alignments: The performance of the algorithm for running multiple alignments has been improved and now runs on multiple cores.
- Find Open Reading Frames can be run in batch and workflows
- Translate to protein can be run in batch and workflows
- Restriction map: Excel export now creates a sheet for both the cut sites table and the restriction map.
- Alignments can be used as input for finding primer binding sites.
- BLAST results and 3D structures can be exported as text.
- Export to fastq now supports sequences up to 32k in length
- Naming of output from de novo assembly and read mapping made consistent
Bug fixes
- Fixed a number of mapper errors causing the mapper to crash.
- Fixed a problem in the read mapper when estimating paired distances. This led to very few reads mapping as pairs.
- Fixed problem of not correctly formatting qualifiers in EMBL export.
- Test on proportions: Fixed an error caused by the wrong group being used as reference, which means that the positive values should have been negative and vice versa.
- Various bug fixes.
CLC Server Command Line Tools
This release of the QIAGEN CLC Genomics Server 5.0 is compatible with CLC Server Command Line Tools 1.7
New tools
- alignment
- annotate_variant_flank
- consensus_sequence_extraction
- convert_to_dna
- convert_to_rna
- filter_reference_variants
- find_open_reading_frames
- merge_overlapping_pairs
- trio_analysis
Tools that have changed names
- annotate_against_database_track has been renamed to annotate_from_known_variants
- filter_against_variant_database has been renamed to filter_against_known_variants
- merge_tracks has been renamed to merge_annotation_tracks
Tools that have been removed
- ngs_import_tabular is no longer relevant because it was directed at the very first Illumina pipelines
- filter_haplotype_comparison has been replaced by filter_against_known_variants
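Scripts that call the Command Line Tools by name may need updating after these renames and removals. The lookup below is a minimal Python sketch built only from the names listed above; the helper `migrate_tool_name` is hypothetical, and it does not check whether the parameters of a replacement tool match those of the old one.

```python
# Old tool name -> new tool name, as listed in these release notes.
RENAMED = {
    "annotate_against_database_track": "annotate_from_known_variants",
    "filter_against_variant_database": "filter_against_known_variants",
    "merge_tracks": "merge_annotation_tracks",
}

# Removed tools and their replacement, or None when there is no replacement.
REMOVED = {
    "ngs_import_tabular": None,
    "filter_haplotype_comparison": "filter_against_known_variants",
}


def migrate_tool_name(name):
    """Return the tool name to use with Command Line Tools 1.7."""
    if name in RENAMED:
        return RENAMED[name]
    if name in REMOVED:
        replacement = REMOVED[name]
        if replacement is None:
            raise ValueError(f"{name} has been removed and has no replacement")
        return replacement
    # Names not mentioned in the notes are returned unchanged.
    return name
```

Because `filter_haplotype_comparison` was replaced rather than renamed, any arguments passed to it should be reviewed by hand after the name is swapped.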
Tools that have changed parameters
- amino_acid_changes
- --CDSTrack changed into --cds-track for consistency reasons
- compare_variants_within_group
- --reference removed (no longer needed)
- fisher_exact_test
- --reference removed (no longer needed)
- statistics_target_regions
- --create-coverage-table (new output option)
- filter_marginal_variants
- --track removed (no longer needed)
- trim
- --adapters and min-count replaced by --trim-adapter-list
- small_rna_sampling
- --adapters and min-count replaced by --trim-adapter-list
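The parameter changes above can be applied to existing argument lists with a small helper. The following Python sketch makes several assumptions: every option is followed by exactly one value, `min-count` is written `--min-count` on the command line, and the adapter options cannot be converted automatically because the new `--trim-adapter-list` refers to an adapter list element. The function name `migrate_arguments` is hypothetical.

```python
# Options renamed per tool (from the list above).
RENAMED_OPTIONS = {"amino_acid_changes": {"--CDSTrack": "--cds-track"}}

# Options removed per tool because they are no longer needed.
REMOVED_OPTIONS = {
    "compare_variants_within_group": {"--reference"},
    "fisher_exact_test": {"--reference"},
    "filter_marginal_variants": {"--track"},
}

# Options that need manual replacement by --trim-adapter-list.
# Assumption: "min-count" is spelled "--min-count" on the command line.
MANUAL_OPTIONS = {
    "trim": {"--adapters", "--min-count"},
    "small_rna_sampling": {"--adapters", "--min-count"},
}


def migrate_arguments(tool, args):
    """Return (migrated_args, notes) for a flat [option, value, ...] list."""
    migrated, notes = [], []
    # Assumption: options and values strictly alternate.
    for option, value in zip(args[0::2], args[1::2]):
        if option in REMOVED_OPTIONS.get(tool, ()):
            notes.append(f"dropped {option} (no longer needed)")
        elif option in MANUAL_OPTIONS.get(tool, ()):
            notes.append(f"{option} must be replaced by --trim-adapter-list")
        else:
            new_name = RENAMED_OPTIONS.get(tool, {}).get(option, option)
            migrated += [new_name, value]
    return migrated, notes
```

The notes returned alongside the migrated arguments make it easy to log what was dropped and what still needs manual attention.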
QIAGEN CLC Genomics Server 4.5.2
Bug fixes
- Fixed a number of mapper errors causing the mapper to crash.
- Fixed a problem in the read mapper when estimating paired distances. This led to very few reads mapping as pairs.
- Fixed problem of not correctly formatting qualifiers in EMBL export.
- Various bug fixes.
QIAGEN CLC Genomics Server 4.5.1
New features
Important: In Genomics Workbench 5.5, the Process Tagged Sequences tool would sometimes switch the sample names of the results. We strongly recommend everybody to update to the new version, and re-run all analyses made with this tool in Genomics Workbench 5.5.
QIAGEN CLC Genomics Server 4.5
New features
- Re-sequencing tools
- New variant caller: Probabilistic variant detection.
- This is based on a probabilistic model in contrast to the quality-based variant caller that is based on quality analysis and cut-offs.
- Supports genomes with a ploidy of 1, 2, 3 or 4.
- Pre-filtering for non-specific matches and intact pairs
- Post-filtering of homopolymer regions and forward/reverse reads balance
- The old SNP and DIP detection tools are merged into one: Quality-based Variant Detection.
- Pre-filtering for non-specific matches and intact pairs
- Post-filtering of homopolymer regions and forward/reverse reads balance
- Target regions statistics (previously a plug-in) is now integrated into the Workbench
- A new parameter: Minimum coverage that will report the fraction of each region that is covered by at least this number of reads
- Works on tracks: the regions of interest are defined in a track and the resulting per-region table is reported as a track
- Annotation and filtering tools for variants
- Annotate and filter against database variants (dbSNP, 1000 genomes or other databases that can be downloaded or imported)
- Filtering of marginal variant calls based on average base quality, forward/reverse reads balance and frequency
- Annotating variants with exon numbers
- Variant comparison
- Compare variants within group: Find variants that are shared between a number of samples
- Fisher exact test: Compare variants between case and control groups to find variants that are more common in the case than in the control
- Trio analysis: Compare child-father-mother variants to enable studies of inherited and de novo mutations
- Filter against control reads: Compare a variant track against a control sample to remove variants that are also present in the control
- Filter on haplotype comparison: Identifies variants that have the same haplotype in two samples.
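For readers unfamiliar with the statistic behind the Fisher exact test comparison listed above, here is a self-contained Python sketch of a one-sided Fisher exact test on a 2x2 table of samples with and without a variant. Whether the tool uses a one-sided or two-sided test is not stated in these notes, so the choice of the "greater" alternative is an assumption, as is the function name.

```python
from math import comb


def fisher_exact_greater(case_with, case_without, control_with, control_without):
    """One-sided Fisher exact test p-value.

    Tests whether the variant is more common among cases than controls,
    given counts of samples with and without the variant in each group.
    """
    cases = case_with + case_without
    controls = control_with + control_without
    carriers = case_with + control_with
    total = cases + controls
    denominator = comb(total, carriers)
    # Sum hypergeometric probabilities for tables at least as extreme
    # as the observed one (case_with or more carriers among the cases).
    upper = min(cases, carriers)
    numerator = sum(
        comb(cases, k) * comb(controls, carriers - k)
        for k in range(case_with, upper + 1)
    )
    return numerator / denominator
```

With 3 of 4 cases and 1 of 4 controls carrying a variant, the p-value is 17/70, roughly 0.24, so such a small study would not support calling the variant more common in cases.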
- Functional consequences of variants
- GO enrichment analysis. This tool can be used to investigate the effect of candidate variants by analyzing the affected genes for a common functional role.
- Amino acid changes: Classify synonymous and non-synonymous variants and see the effect on the protein.
- Annotate with conservation scores: Annotate a variant with a score from conservation tracks that can be imported into the Workbench.
- Predict splice site effect: A simple investigation to see if the variant is within two bases of an intron-exon boundary
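The splice site check described above (is the variant within two bases of an intron-exon boundary?) can be sketched in a few lines of Python. This sketch assumes exons are given as 1-based inclusive `(start, end)` pairs for a single transcript, that a boundary lies between the last exonic base and the first intronic base, and that the outer ends of the transcript are not intron-exon boundaries. The function name and the exact distance convention are assumptions, not the tool's definition.

```python
def near_splice_site(position, exons, window=2):
    """Return True if position lies within `window` bases of an
    intron-exon boundary of the transcript described by `exons`."""
    exons = sorted(exons)
    last = len(exons) - 1
    for index, (start, end) in enumerate(exons):
        # Boundary at the exon start (an intron precedes this exon).
        if index > 0 and start - window <= position <= start + window - 1:
            return True
        # Boundary at the exon end (an intron follows this exon).
        if index < last and end - window + 1 <= position <= end + window:
            return True
    return False
```

For exons `(100, 200)` and `(300, 400)`, positions 199 to 202 and 298 to 301 are reported, while positions 100 and 400 are not because they sit at the transcript ends.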
- Download of reference genome and annotations
- Integrated download of reference genome sequences and annotations for selected organisms
- Example: for human hg19, you can directly download sequences, genes and transcripts, variants from 1000 genomes, Hapmap, COSMIC, and dbSNP (incl. common SNPs).
- Tracks
- Genomic information for re-sequencing analysis can now be stored as tracks.
- Great power for comparison and visualization because different kinds of data (reads, variants, genes etc) are not bundled into one static file but are separated into one file per data type. This means that different data sources can be compared and visualized in a flexible way.
- All tools for re-sequencing have options to create and use tracks (e.g. read mapping, variant detection etc). More tools will be re-designed to work with tracks later.
- Tools for converting between standard sequences and mappings and tracks:
- Convert tracks to sequences, mappings etc
- Convert sequences, mappings and annotations to tracks
- Tools for filtering, annotating and merging tracks
- Support for importing files as tracks from a number of new formats:
- Fasta
- VCF
- BED
- Wiggle
- UCSC table format
- GFF / GTF and GVF
- Complete Genomics masterVar files
- Workflow
- Workflows can be built in the Workbench to combine various tools from the Toolbox into one analysis, connecting the output from one tool to the input from another
- Workflows can be distributed and installed either in the Workbench or in the QIAGEN CLC Genomics Server
- The creator of the workflow can configure parameters for the workflow and these will be fixed when the workflow is distributed and installed
- The creator of the workflow decides which outputs from the tools should be saved and which should be discarded
- Workflows can be run in batch, making them a powerful tool for processing large numbers of samples through the same pipeline.
- New read mapper
- Great improvement of speed for mapping (white paper to be released soon)
- Support for complex genomes with many repeats
- Re-design of wizard for read mapping to make it simpler and easier to use. Options to control consensus sequence building and annotating with conflict annotations have been removed, since they have very little relevance for the amounts of data created by NGS platforms today
- Color space mapping is still performed with the old mapper
- Automatic calculation of paired distance (only for base space data)
- Report includes percentage of reads instead of only counts
- Changed strategy for placement of gaps: previous versions tried to cluster gaps into as few units as possible. This would sometimes cause problems for variant calling because this would in some situations place the gaps differently from read to read.
- Please note that the memory requirements are different than for the old mapper. The memory requirements depend largely on the size of the reference genome. We will soon update our system requirements page to reflect this.
- Sequencing QC report : Create summary statistics for sequencing data in various ways:
- General statistics on read length etc
- Quality statistics on quality scores
- Over-representation analysis of subsequences
- Analysis of duplicated reads
Special notes for customers already using the Genomics Gateway plug-in
- Download tool for downloading genomic data replaces Ensembl download tool
- Unlimited number of chromosomes in tracks
- More streamlined conversion tools:
- Convert tracks to sequences, mappings etc
- Convert sequences, mappings and annotations to tracks
- Export tracks to gff, vcf, sam
- New tool for filtering marginal variant calls
- New tool for annotating against database variants
CLC Server Command Line Tools
This release of the QIAGEN CLC Genomics Server 4.5 is compatible with CLC Server Command Line Tools 1.6
New tools
- amino_acid_changes
- annotate_against_database_track
- annotate_conservation_score
- annotate_exon_numbers
- annotate_overlapping
- blast_delete_index
- blast_list_index_files
- blast_ncbi
- blast_set_db_locations
- compare_variants_within_group
- convert_from_tracks
- convert_to_tracks
- download_genome
- extract_sequences
- filter_against_control_reads
- filter_against_variant_database
- filter_annotation_names
- filter_haplotype_comparison
- filter_marginal_variants
- filter_overlapping
- fisher_exact_test
- gc_contents_graph_track
- go_enrichment_variants
- import_tracks
- mapping_graph_tracks
- merge_tracks
- predict_splice_site
- probabilistic_variant_detection
- sequencing_qc_report
- statistics_target_regions
Tools that have changed names
- dip_detection and snp_detection have been merged into quality-based_variant_detection
- epcr has changed name to find_primer_binding_sites in order to reflect the name used elsewhere for this tool
- merge_clusters has been renamed to merge_mappings to reflect the terminology used elsewhere
Tools that have changed parameters
- chip_seq
- --window-size changed into --window-shifted
- detailed_mapping_report
- --reference-or-contig-count removed
- --separate-mapping-statistics removed
- --create-table added
- ngs_import_roche454
- --flx-or-titanium-linker changed into --linker-sequence
- read_mapping
- options removed
- --annotate-conflicts
- --conflict-resolution-mode
- --create-sequence-list
- --do-split
- --long-reads-alignment-mode
- --long-reads-color-error-cost
- --long-reads-color-space
- --long-reads-deletion-cost
- --long-reads-insertion-cost
- --long-reads-length-fraction
- --long-reads-match-cost
- --long-reads-similarity-fraction
- --map-as-long-reads
- --map-as-single-reads
- --mask-reference
- --mask-reference-type
- --match-mode
- --maximum-alignment-count
- --maximum-distance
- --minimum-distance
- --override-distance
- --read-settings
- --score-limit
- --score-limit-offset
- --short-reads-alignment-mode
- --short-reads-color-error-cost
- --short-reads-color-space
- --short-reads-deletion-cost
- --short-reads-insertion-cost
- --short-reads-mismatch-cost
- --short-reads-ungapped
- --split-count
- --split-index
- --strand-specific
- options added
- --auto-detect-paired-distances
- --collect-unmapped
- --color-error-cost
- --color-space
- --deletion-cost
- --global-alignment
- --insertion-cost
- --length-fraction
- --masking-mode
- --masking-track
- --mismatch-cost
- --non-specific-match-handling
- --output-mode
- --similarity-fraction
QIAGEN CLC Genomics Server 4.1
New features
Ion Torrent paired protocols are now supported for both fastq and sff files.
QIAGEN CLC Genomics Server 4.0.1
Bug fixes
- Fixed: For grid set-ups: certain setups with OGE would run on a maximum of one core. (Refer to Genomics Server documentation for updated configuration information).
- Fixed: Calculation of cDNA-level changes in variant detection failed in some situations.
- Updated: Small RNA tools: the annotation tools now support recent changes to miRBase where mature and mature* nomenclature has been replaced with 3′ and 5′ mature regions.
QIAGEN CLC Genomics Server 4.0
New plug-ins and plug-in updates
- Genomics Gateway plug-in updated
- New tools for analyzing variants in groups of samples, enabling systematic analysis of genetic variants for whole genome, exome or targeted approaches.
- Find Common Variations in Group. This can be used to find common variants in a group of variant tracks.
- Fisher Exact Test. This compares two groups of variant tracks (e.g. for case-control studies). Using the Fisher Exact test, you can see which variants are more common in the case group than in the control group.
- Filter against Control Reads. This can be used to compare a single case variant track against a negative control from the same sample. It will check whether a certain number of the reads in the control sample have the same allele present as in the case variant.
- New tools for functional annotation of variants
- GO Enrichment Analysis for identifying significant gene ontology terms that are annotated to genes having at least one variation.
- Annotation with Conservation Scores. By importing a conservation score track (e.g. PhyloP Scores), variants can be annotated with a conservation score. Variants with a high score are assumed to alter functionally important regions.
- New data structure.
- All tracks are now saved as single files, and you can create a Track List to visualize them together.
- A tool is available for data conversion from track sets to single tracks
- New organization of the “Tool box” to provide a better overview
- Support for batching and running tools on a Genomics Server
- The Track List view supports drag and drop for adding and re-arranging tracks
- Several Graph tracks can be created and displayed
- Read the updated manual here.
- Probabilistic Variant Detection Plug-in updated
- The probability used as threshold for the algorithm is now reported in the output
- Variants are reported with cDNA-level numbering and variant information compatible with www.hgvs.org/
Core server features and improvements
- Possibility to import data from the Workbench on the Server. This was previously only possible with NGS data import.
- BLAST database management: It is now possible to delete BLAST databases in the Web interface.
- The blue dot indicating that a tool could be run on the server has been removed in order to provide a simpler user interface for new users.
- External Applications: parameters for maximum number of cores and the name of the user logged in can be automatically provided by the server when the algorithm is executed.
- Support for redirecting non-SSL port to SSL port when accessing the server’s web interface.
- Command Line Tools: it is now possible to provide CLC URLs with the full path including the .clc file extension.
Algorithm features and improvements
- New de novo assembler.
- Scaffolding is integrated into the assembly. This means better resolution of contigs and insertion of Ns when two contigs cannot be joined in sequence but there is pair information that connects them.
- New extended report for the assembly with information about nucleotide distribution, contig length measurements and scaffolding regions.
- New parameter for specifying the maximum bubble size. There is a default value which is automatically calculated based on the input data.
- New white paper with benchmarks and results from quality control.
- The old de novo assembler is available as a plug-in. At the end of 2012, the plug-in will be discontinued, so it should only be used for backwards compatibility with results from older runs or if the new assembler fails.
- SNP and DIP detection results include cDNA-level numbering and variant information compatible with www.hgvs.org/
- SAM files exported from the Server now include basic information about read groups. Furthermore, read orientation for paired reads is now preserved when exporting to SAM and BAM files.
- Improved exploitation of multi-core machines in read-mapping, RNA-Seq, and de-novo assembly.
- Improved performance and memory management for high-throughput analyses in general.
Bug fixes
- Export of BLAST results was not possible on the server.
QIAGEN CLC Genomics Server 3.6
New plug-ins and plug-in updates
- New plug-in released: Ab Initio Transcript Discovery
- The Large Gap Mapper plug-in is renamed and now includes a tool for transcript discovery. Based on gapped alignments of RNA-Seq data, the plug-in identifies new transcripts and creates or extends annotations on the reference sequence that can be used for measuring gene expression using the RNA-Seq Analysis tool of the Genomics Workbench. The plug-in provides functionality similar to Cufflinks/TopHat.
- Genomics Gateway plug-in updated
- New refiner: variant frequency. This allows you to filter a variation track, so that only the variants that have a frequency above a user-defined threshold remain. Note that the filter only applies to the frequency of non-reference alleles.
- Performance improvements when visualizing read tracks
- Fixed: CDS annotations from Ensembl did not include start codons
- Fixed: Some variation tracks were not always recognized as variations. This means that the variation-specific refiners could not be used.
- Fixed: Table view of annotation tracks could have a very large number of columns that are now combined into one column.
- Fixed: There was an error when closing a view without saving changes. This could lead to subsequent errors when trying to rename tracks.
- Structural variation plug-in updated
- Only detection of insertions, deletions and interchromosomal variations is now supported.
- The plug-in has a problem with repeats. The best way to work around this is to ignore non-specific matches when doing the mapping, to run the structural variant detection with a very stringent p-value cutoff and to filter repeats out afterwards if possible (this could be by refinement with the microsatellite track from Biobase or another repeat track using the Genomics Gateway).
- Integration of exporter to export results in circos format.
See a list of all plug-ins here. The pdf documentation of all plug-ins has been updated as well.
Core server features and improvements
- The grid integration is now released out of beta
- Core management: you can restrict the maximum number of cores that the Grid Worker is allowed to use. This is useful when the execution node is shared with other jobs and the CLC Grid Worker needs to respect an assignment of the number of cores to use. This is mainly an issue for the De novo assembly and Read Mapping algorithms, but the restriction applies to all algorithms that use several cores.
- For Oracle databases: There are now two ways of connecting to an Oracle database. One uses the traditional SID style and the other uses a thin-style service name. Existing installations do not need to be changed.
Algorithm features and improvements
- BLAST is now available on the server
- Creation of BLAST databases
- Running BLAST jobs on the server against either databases on the server file system or temporary “on-the-fly” databases.
- Process tagged sequences
- A summary report is now available with an overview of the number of reads per bar code.
- You can search for barcodes (MIDs) on both strands, supporting new 454 protocol.
- Find Binding Sites and Create Fragments improved:
- Can now be run on the server
- If your template sequence contains ambiguity nucleotides (like N, Y etc), these will no longer count as mismatches when checking your primers. Note that the primer base still needs to be covered by the ambiguity symbol (e.g. a T would still be a mismatch if the template sequence has an R, which means either A or G).
- Fixed: When using multiple template sequences, the choices to open or annotate a fragment from the fragment table did not work properly. They always applied to the first sequence although the fragment was located on another sequence (as indicated in the table).
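The ambiguity rule for primer checking described above can be illustrated with a short Python sketch: a primer base matches a template symbol when the base is one of the nucleotides the IUPAC symbol stands for. The function name is hypothetical, and the sketch assumes the primer contains only A, C, G and T and is already given in the same orientation as the template site.

```python
# Nucleotides represented by each IUPAC symbol.
IUPAC_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_mismatches(primer, template_site):
    """Count primer bases not covered by the template symbol at that position."""
    if len(primer) != len(template_site):
        raise ValueError("primer and template site must have the same length")
    return sum(
        1
        for primer_base, template_symbol in zip(primer.upper(), template_site.upper())
        if primer_base not in IUPAC_BASES[template_symbol]
    )
```

This reproduces the example from the note: `count_mismatches("T", "R")` is 1 because R covers only A and G, while `count_mismatches("A", "R")` is 0.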
- Exporting fastq format no longer includes redundant name of the read in the quality score line. Now the name only appears once per read.
- Enhancing the nomenclature of reporting amino acid changes in variant detection:
- p. prefix included
- ? used for unknown (rather than non-standard “Unknown”)
- = used to denote an allele which agrees with the reference sequence (rather than missing entries or entries like Ala45Ala)
- [...] used around ,-separated lists of changes, each change coming from a different CDS annotation
- [...];[...] scheme used to separate multiple alleles at same site
Bug fixes
- Fixed: Import of SOLiD data failed when multiple sets of paired data were selected.
- Fixed: Calculation of consensus sequence in read mappings: Sometimes a majority of gaps would be ignored and a base erroneously introduced in the consensus sequence. It occurs when 1) there is no coverage in an initial segment of the reference sequence, and 2) a gap is encountered in the global read alignment. From that point onwards, gap counts are included in the consensus vote, but they are taken from the start of the mapping (where they are all 0), so they are out of sync with associated base counts. High gap counts would then kick in further downstream, possibly making the consensus a gap where it should not be. We recommend checking your mapping results manually if you rely on using the consensus sequence for further analysis.
QIAGEN CLC Genomics Server 3.5
New plug-ins and plug-in updates
- New scaffolding de novo assembler released as a beta plug-in.
- New read mapper released as a beta plug-in. First version without color space support.
- New probabilistic variant detection released as a beta plug-in.
- Genomics Gateway beta plug-in updated:
- Direct download of annotations from Ensembl through the Workbench.
- Support for importing zipped data
- Multiple files can be imported in one go
- Conservation track from UCSC can now be imported
- Common SNP track from UCSC can now be imported
- Tools to merge and copy tracks
- New refiner to extract a subset of genes from a gene track (look for the Name filter refiner)
- SpliceSite refiner to annotate variations that affect exon/intron boundaries
- Various bugs fixed
- Check the updated manual
- Structural variation beta plug-in released in version 2:
- Now support for inter-chromosomal structural variations
- Works with mappings created using the Large Gap Mapper beta plug-in
- Check the updated manual
Core server features and improvements
- Permission control on file system. Previously, this feature was only available for customers with a Bioinformatics Database but now all server customers will be able to specify which users should have access to data on the server. This is done by logging in as admin through the Workbench and setting read and write permissions on folders.
- SSL. The Server now supports secure connections from clients (either Workbench, Command Line Tools or the web interface)
- Status icon in Workbench is now showing info on current server connection. Clicking the icon when not logged in will display the log-in dialog.
- Permissions on server commands: server administrators can decide if certain groups should not be allowed to run certain analyses. This can be controlled also for each external application configuration.
- Permissions on import/export directories: server administrators can decide if certain groups should not be allowed to access import/export directories
- Web interface import and export from Import/Export directories on server file system
- Support for attachments on Grid jobs. This means that you can run Next-Generation Sequencing data imports with data from the client file system when running on a grid. Previously you could only import data from the import/export directories.
- Command Line Tools include a tool for setting permissions on folders on the server.
Algorithm features and improvements
- De novo assembly improvements:
- Word size can now be manually adjusted
- When update contigs is not selected, the resulting mapping table will also include contigs where no reads map back. This means that the number of rows in the table will be identical to the number of “Simple contigs” produced by the de novo assembler. Previously, contigs with fewer than two reads mapped back would be omitted from the table.
- Merge Mapping Results will produce a mapping table when mapping tables are provided as input
- Mapping tables now include a row for reference sequences where no reads map. This is done to provide consistency of results. Opening such an entry in the table will just open the reference sequence in the table.
- SNP detection no longer ignores ambiguity bases in the reads. Each ambiguity code is treated as a separate variant; no merging of the possible variants covered by each ambiguity code is attempted (this typically only has an effect when using Sanger sequencing data since standard NGS platforms do not use ambiguity base calls).
- SAM import and export format is now described in detail in the user manual.
Bug fixes
- Fixed: Orientation of SOLiD mate-pair data was not set correctly on import. This meant that the reads were marked as broken pairs after mapping. We strongly recommend that all users of SOLiD mate-pair data re-run the import.
- Fixed: Experiments tables can now be exported in Excel and csv formats
- Fixed: If a combination of trim options is used, like quality trim or length trim in addition to adapter trimming on both strands, the reads could end up reverse complemented.
- Fixed: Import of paired data generated by Illumina Casava 1.8 did not match the pairs correctly. Users are advised to re-import and re-analyze all data imported from Casava 1.8.
- Fixed: De novo assembly sometimes failed on Mac OS 10.7 Lion.
- Fixed: Read mappings could fail with an error containing the text “premature end of .cas file”. This was only a problem on Windows.
QIAGEN CLC Genomics Server 3.2.2
Bug fixes
- Fixed: A cache-related bug which would sometimes result in errors when running large jobs.
- Fixed: A problem with interpretation of broken pairs on re-import from SAM format files.
- Added missing Java VM-options to the CLC Grid Worker
QIAGEN CLC Genomics Server 3.2.1
Bug fixes
- De novo assembly produced empty results
- Paired distances for read mapping were not recorded correctly in history
- Adapter trim with Command Line Tools: if multiple adapters were provided, only the first was used
- Various minor bug-fixes
QIAGEN CLC Genomics Server 3.2
New and improved features
- External applications can now run on grids
- Secondary peak calling functionality is now available on the server
- Import of GFF files is now available on the server
- The High-throughput Sequencing Data Import Location has been redefined as a more general Import/Export location
- CLC Server Command Line Tools is now released in a final version
- Mapping
- New mapping data format supports multiple alignments and allows for import and full visualization of Complete Genomics evidence files in SAM format
- New plug-in for gapped read mapping of e.g. cDNA to genomes
- New plugin to detect Structural variation
- Action to detect structural genomic variation from paired read information
- Action to detect copy-number variation (CNV) from coverage information
- New and more flexible data structures to store information about paired data
- All history entries will from now on include the version number of the software
- The previous limit of 2 billion reads in one analysis has been removed.
- Reporting of amino acid changes in SNP and DIP detection now follows recommended nomenclature more closely w.r.t. changes that affect start codons and changes that cause indels at the amino acid level.
- Performance of Excel 2010 exporter improved in terms of speed and memory requirements
- Export of trace data in scf format.
QIAGEN CLC Genomics Server 3.1.1
Bug fixes
- RNA-Seq would crash when selecting prokaryote as organism type
QIAGEN CLC Genomics Server 3.1
New and improved features
- Support for PBS Pro in addition to Oracle Grid Engine. Read more.
- Import of Ion Torrent data. A special importer has been made for Ion Torrent data in fastq or sff format. Read more.
- Grid integration redesigned to be more stable and easier to deploy. Existing users of the grid integration please contact support@clcbio.com for upgrade instructions.
- Import through Command Line Tools now works on grid set-ups. Read more.
- Reporting merged SNPs is now optional. Read more.
- SNP detection: When minimum paired coverage is set, reads from broken pairs will be completely ignored. Read more.
- RNA-Seq: the transcript-level sample includes a column for the ratio of unique to total transcript reads. Note that this means that results generated with this version cannot be used in older versions. Read more.
- Better support for color space SAM/BAM files.
- Export in color space fastq format. When data is marked as color space, exporting in fastq format will produce a file with color encoding rather than bases.
- Error reports from grid workers now include log files
- Audit log files are archived to files every three months
- Faster submission of bug report archives
- List of grid presets in the Workbench dialog is now sorted, and the last selection of preset is preserved so that the first step can be skipped
Bug fixes
- Fixed: The Gridworker would run out of memory on computers with large amounts of memory
- Fixed: Import of csfasta paired data crashed when one read had a dot in the beginning of the sequence.
- Import of paired qseq files: the read pairs are now joined correctly when importing paired qseq files
- Fixed: Import of GO annotation files did not work
- When processing tagged paired data sets, the resulting files were not marked as paired. This meant that subsequent analyses did not make use of the paired information.
- Various minor bug fixes
Please note that you need to upgrade the Workbench plug-in in order to connect to the new server and you will also need to update the Command Line Tools client
QIAGEN CLC Genomics Server 3.0.1
Bug fixes
- ChIP-seq analysis adjusted for the use of the gapped aligner. ChIP-seq analyses run with the previous version should be redone
- Improved support for Mac OS X systems with Japanese language
QIAGEN CLC Genomics Server 3.0
New features
- New way of running analyses on the server. Previously each analysis tool was duplicated in the Toolbox, but now the first step in the analysis wizard is about where the analysis should run.
- Command Line Tools (will be available for beta testing with selected customers in mid-January 2011)
- A command line-based alternative to the Workbench as client
- Enables using the server tools in a scripting environment
- Note that this is not a stand-alone command line program – it is a client to the server
- Oracle Grid Engine support (will be available for beta testing with selected customers in mid-January 2011)
- Jobs can be run on Oracle Grid Engine (formerly known as Sun Grid Engine)
- Job status can be monitored from the Workbench
- Seamless integration in the Workbench clients
- External Applications re-design. Read more.
- Check set-up tool for diagnostics of server setup. Read more.
- Batching functionality of all high-throughput sequencing tools. It is now possible to start batch runs, e.g. running 12 samples through RNA-Seq Analysis in one go. Read more.
- RNA-seq: transcript-level expression values and support for paired data
- Included option to use paired information in RNA-seq. Read more.
- Expression values can now be stratified into transcript level expression values, both for single and paired reads. Read more.
- SOLiD data: new algorithm for mapping reads allows much higher fraction of reads to be mapped. Rather than a score limit, you now specify the stringency of the mapping using length and similarity fractions. Read more.
- Similarity fraction for mapping of long reads is now available as a user-specified option (this was previously automatically set). Read more.
- Simple reporting of putative gene fusions when using paired data. Read more.
- Note about compatibility: Results from earlier versions should not be compared with results from this version.
- Multiplexing: Process tagged sequencing data
- The de-multiplexing tool now runs on the server. Read more.
- It is now possible to import and use a file with bar codes and sample names. This makes it easier to process data with a high number of multiplexed samples. Read more.
- You can specify separate output folders for each sample, making it convenient to batch process the subsequent analyses.
- High-throughput Sequencing Import includes an option to place data into sub-folders (useful for batching subsequent analyses)
- SNP detection reports adjacent SNPs within the same codon as one SNP. Read more.
- De novo assembly: post-processing options when mapping reads back to contig sequences have been expanded. It is now possible to preserve the original contig sequences from the assembler (they used to be replaced by the consensus sequence from the mapping). Read more.
- Support for exporting tables as tab-delimited files.
- Memory allocation: the default memory allocation for the Server has changed from 75% to 50% of available physical memory, with a maximum of 50 GB.
- New way of getting a license based on Order ID and automatic download of a license file. This makes it much easier to set up the server.
- New licensing model replacing the Small, Medium and Large business editions.
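The default memory allocation rule in the list above (50% of available physical memory, with a 50 GB maximum) amounts to a simple capped fraction. A minimal sketch follows; the function and parameter names are hypothetical, and treating "GB" as 1024^3 bytes is an assumption, since the notes do not say which unit convention is used.

```python
GIB = 1024 ** 3  # assumption: "GB" in the release note is read as 2**30 bytes


def default_server_memory(physical_bytes: int,
                          fraction: float = 0.5,
                          cap_bytes: int = 50 * GIB) -> int:
    """Return a fraction of physical memory, limited by an upper cap."""
    return min(int(physical_bytes * fraction), cap_bytes)


if __name__ == "__main__":
    for physical_gib in (16, 64, 256):
        allocated = default_server_memory(physical_gib * GIB)
        print(f"{physical_gib} GiB physical -> {allocated / GIB:.0f} GiB default")
```

Under these assumptions the cap only takes effect on machines with more than 100 GB of physical memory.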
Bug fixes
- SNP detection bug with corrupt complementary CDS annotations.
- SNP detection: color correction errors now count when filtering SNPs (this has become important with the new mapping algorithm for SOLiD data).
- Time out in the communication between Workbench and Server is now recovered in most situations.
- Various bug-fixes
QIAGEN CLC Genomics Server 2.6
Improvements
- Create detailed mapping report now available on server
Bug fixes
- SNP and DIP detection previously ignored overlapping pairs. Now they count (as one read) if they fulfill the quality criteria (SNP detection). In cases where the two parts of the pair disagree, the pair does not count. We recommend re-running all SNP and DIP detections based on data sets with overlapping pairs (this is the case if the minimum distance when mapping the reads is lower than two times the read length). There is no need to re-run mappings, only the SNP/DIP detection.
- ChIP-Seq: the reported “nearest gene” was not always correct. This was the case for the last peak on each chromosome and also in cases where the order of the gene annotations in the reference file did not correspond to the order of the annotations on the actual sequence. We recommend running all ChIP-Seq analyses again to get the correct reporting of nearest genes. There is no need to re-run the mappings.
- The color space check box was not checked by default when running color space data. We recommend checking the history of mappings based on color space data. If the history shows “Color space alignment = No”, you should re-run the mapping and subsequent analyses.
- Improved import of SAM/BAM files:
- Better support for files from SOLiD Bioscope
- Preliminary support for Complete Genomics files (The actual alignment is not represented completely: insertions that relate to a consensus sequence will be represented as unaligned ends in the imported mappings. This should be taken into account when looking for variations.)
- Going from step 2 to 3 in SNP detection wizard took a long time when using data with many references/contigs
- Various minor bug fixes
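The overlapping-pairs fix in this release states two rules: pairs can overlap when the minimum paired distance is lower than two times the read length, and an overlapping pair counts as one read only when its two parts agree. The sketch below restates those two rules in code. Both function names are hypothetical, the quality criteria mentioned in the note are left out, and this is not the Server's implementation.

```python
from typing import Optional


def pairs_can_overlap(min_distance: int, read_length: int) -> bool:
    """True when the minimum paired distance is below twice the read length."""
    return min_distance < 2 * read_length


def pair_call(base_first: str, base_second: str) -> Optional[str]:
    """Combine the two calls of an overlapping pair at a single position.

    Returns the shared base (to be counted once) when both parts agree,
    or None when they disagree and the pair should not be counted.
    Quality filtering is deliberately omitted from this sketch.
    """
    return base_first if base_first == base_second else None
```

A check such as `pairs_can_overlap(100, 75)` can help decide whether an existing data set falls under the recommendation to re-run SNP and DIP detection.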
QIAGEN CLC Genomics Server 2.5.3
Bug fixes
- Fixed error when importing 454 SFF files
- Fixed error when importing SOLiD data with quality scores when the reads had “.”
- Fixed error mapping large data sets on Windows 64-bit systems
- Genbank export: annotations on the negative strand were not exported in the right order
- Fixed memory and performance issues related to import of many sequences, e.g. from ACE files.
- Fixed problem with SNP detection on large data sets suddenly running very slowly.
- Better support for html when exporting tables to Excel.
- Various minor bug fixes
QIAGEN CLC Genomics Server 2.5.2
Bug-fixes
- Fixed a problem resulting in a “node communication error” when using built-in authentication and job nodes
- Fixed a problem using initial bind credentials with LDAP authentication
QIAGEN CLC Genomics Server 2.5.1
Bug-fixes
- Resolves problem with SAM/BAM import
- Resolves problem with import of tabular mapping files
- Missing FastQ export included
- Scalability improvements in mapping and de-novo assembly with drastic improvements in performance
QIAGEN CLC Genomics Server 2.5
New features
- New de novo assembly algorithm. Read more
- Small RNA Analysis
- Brand new tool for analyzing small RNA (including miRNA) data sets
- Adapter trimming
- Counting of tags
- Annotation using miRBase and other resources
- Visualization of miRNA variants
- Expression analysis
- Server-side import of High-throughput Sequencing Data. Read more
- Trim sequences is now available on the server. Read more
- Renaming and redefining concepts
- Reference assembly -> Read mapping. We adjust to the common term used today for aligning sequencing reads to a reference sequence.
- Contig -> Read mapping. The result of read mapping was previously called a contig (i.e. the alignment of reads to a reference sequence). Now, the term “contig” is used exclusively for results from de novo assembly. The result of mapping reads is called a “read mapping”.
- Paired-end -> Paired. We now distinguish during import between Paired-end and Mate pair data. Once imported, there is no difference, and they are both called “Paired”.
- Improved SAM/BAM import:
- BAM format now supported, both import and export
- More robust implementation
- Better performance
- Preview panel making it easier to match reference and SAM/BAM file
- Reference sequence name spaces automatically converted to underscores when comparing with SAM/BAM file
- High-throughput Sequence Data Import
- Gzip support
- SOLiD fastq format supported (when downloading SOLiD data from Sequence Read Archive, SRA). Read more
- 454 paired data: Support for both FLX and Titanium linkers (also the possibility to add custom non-palindromic linkers). Read more
- Improved support for SOLiD paired-end data. Read more
- Support for data from Illumina Pipeline 1.5. Read more
- Import of tabular alignment files: it is now possible to specify a read name from the file to be imported with the read. Read more
- Better compression of reference sequences (lower memory footprint and disk space usage)
- Performance improvement of read mapping algorithm
- Improved memory management in general: lower memory footprint and shorter management overhead pauses.
- Improved memory handling of large tabular data sets.
- RNA-Seq:
- Directional RNA-Seq. Read more
- Exon-intron reads are now counted under Total exon reads. When comparing new and old samples, please re-run the analysis on the old samples to ensure consistency. Read more.
- SNP and DIP detection:
- Dialog usability improved by adding an advanced panel for advanced users
- Minimum counts have been made clearer by introducing a Minimum count and a Sufficient count
- Performance of ACE export improved, especially for long reference sequences or read mapping tables.
- It is now possible to pause and restart processes involving read mapping and de novo assembly (except the accelerated mapping part of the analyses). Read more
- When searching data from the Workbench, the results previously did not list the custom attributes of the data. Read more.
- Copy operations on server locations are now performed completely on the server side, eliminating client processing and network traffic
- A number of import and export formats have been included on the server, including csv, ace and excel.
- Progress when downloading files from the server using the web interface is shown in browser
- External applications:
- Performance of export of data via the Workbench from a server location has been improved
- Index server status added under User Statistics in the Admin tab of the web interface.
- Possibility to report server-side bugs from both web interface and Workbench, including log and configuration files. Read more
- Improved usability of user interface for adding locations in the web interface. Read more
- Server home path has been removed from configuration panel in the web interface (it now resides in a properties file in the settings directory). This makes it easier to deploy a job node installation in a mixed environment.
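One of the SAM/BAM import improvements listed above is that spaces in reference sequence names are converted to underscores when the names are compared with those in the SAM/BAM file. A minimal sketch of that kind of comparison follows; the function names are hypothetical, and the actual importer may normalize names in additional ways not described in these notes.

```python
def normalize_reference_name(name: str) -> str:
    """Replace spaces with underscores, as described for SAM/BAM name matching."""
    return name.replace(" ", "_")


def names_match(reference_name: str, sam_name: str) -> bool:
    """Compare a reference sequence name with a name taken from a SAM/BAM header."""
    return normalize_reference_name(reference_name) == sam_name


if __name__ == "__main__":
    print(names_match("chr 1 assembly", "chr_1_assembly"))  # True
```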
Bug-fixes
- Added support for viewing data attributes in web browser for enzyme lists.
- Job nodes failed to start when master server was not already running.
- Display of folder structure was not right in special cases involving differentiated permission on folders
- Changed order of custom attributes was not reflected in search drop-down menu
- Searching for data entered as custom attributes required specification of which attribute to search in
- Read mapping: fixed windows errors on large data sets, fixed color space errors
- RNA-Seq: the maximum number of mismatches when running color space data could be set to three in the dialog, but this did not take effect. Now the limit of 2 is enforced in the dialog.
- Genbank import: sequence name (LOCUS) was truncated to 18 characters
QIAGEN CLC Genomics Server 2.0.1
Bug fixes
- A few import formats were missing on the server (when importing using the API and the web interface)
- RNA-seq: reads that extend over more than two exons are now shown correctly
- Results from reference assemblies are now named according to the input data
- Various bug fixes
QIAGEN CLC Genomics Server 2.0
New features
- Support for Job Nodes. Parallelized Job Execution (on Command level) with flexible/scalable multi job node setup. Advanced configuration of job nodes on command level.
- 3 tier data communication. Data communication/management is now based on communication through the Server middleware.
- Option for File locations on server.
- Index Server for stability/scalability.
- Command Line Integration tool on server side for invocation through user friendly UI on Workbench. Includes example (ClustalW)
- All High throughput sequencing data importers are now on the Server. That is “Roche 454”, “Illumina”, “SOLiD”, “Fasta/Helicos”, “Sanger”, “SAM Assembly Files”, “Tabular Assembly Files (ELAND format)”
- RNA-Sequencing can now be run on the Server. Read more…
- The analysis, including mapping of reads, distributing non-specific matches and calculating expression values, is done on the server
- Visualization is done in the Workbench
- Statistical analysis of the results is done in the Workbench (these are not heavy calculations)
- Administrators can close user sessions. Read more…
- Option to automatically log in to the server when the Workbench starts Read more…
- The order of attributes can now be changed. Read more…
- Finished server processes are removed when closing down the Workbench. Running processes will be shown next time the Workbench connects to the server.
- Global alignment for long reads when running reference assembly algorithm
- Gapped color-space alignment when running reference assembly
- Significantly improved speed of all operations with large data sets
- The unassembled reads from an assembly now preserve their paired-end status (this also means that you can get two lists: one with pairs and one with the remainder of the broken pairs)
- SNP detection output table now reports if multiple non-synonymous SNPs exist in same codon
- SNP detection dialog: Quality filtering is no longer disabled when quality scores are missing. Due to performance issues it is not possible to check if quality scores are present. The SNP detection will just omit the quality score filtering if quality scores are not present.
- SNP detection: possible to detect variants with frequency less than 1 percent.
- General import and export
- Export tables and reports in Excel format.
- The import section of the user manual has been re-structured to provide a better overview. Read more… Expression data importers are now described in technical detail in a separate section. Read more…
- You can now export multiple sequence lists in fasta format
- Forced import of zip files is now supported (it will force import the contents of the zip file)
- The standard import now accepts gzip and tar files as well as zip
- Genbank importer now makes several attempts at naming genes that do not have a gene name. It will iteratively try the following qualifiers: “product”, “locus_tag”, “protein_id” and “transcript_id”
- When importing genbank files where the length stated does not match the actual sequence, a warning is shown but the sequence is accepted.
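The Genbank importer change listed above describes a fallback order for naming genes that lack a gene name: “product”, “locus_tag”, “protein_id” and “transcript_id”, tried in turn. The sketch below shows that fallback on a plain dictionary of qualifiers. The function name and the dictionary representation are assumptions for illustration, not the importer's actual data structures.

```python
from typing import Mapping, Optional

# Fallback order as given in the release note.
FALLBACK_QUALIFIERS = ("product", "locus_tag", "protein_id", "transcript_id")


def gene_name(qualifiers: Mapping[str, str]) -> Optional[str]:
    """Pick a name for a gene feature, falling back through other qualifiers."""
    name = qualifiers.get("gene")
    if name:
        return name
    for key in FALLBACK_QUALIFIERS:
        value = qualifiers.get(key)
        if value:  # skip missing or empty qualifiers
            return value
    return None  # no usable qualifier found


if __name__ == "__main__":
    print(gene_name({"locus_tag": "b0001", "protein_id": "NP_414542.1"}))  # b0001
```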
Bug-fixes:
- Fixed an error opening external files from a server location
QIAGEN CLC Genomics Server 1.6.1
New features
- Export of annotations in GFF format (joined regions not supported)
- Export of sequence data in fastq format
Bug-fixes:
- Fixed problems importing expression annotation files
- Various bug-fixes
This update is recommended for all users.
QIAGEN CLC Genomics Server 1.6
New features
- ChIP-Seq analysis is now able to (optionally) use a control sample. Read more…
- Reference assembly of short reads: user can now choose between local and global alignment Read more…
- Reference and de novo assembly output options have been changed so that you no longer need to decide whether you want a contig table or single contigs. Whenever more than one contig is produced, the Workbench automatically creates a contig table Read more…
Bug fixes
- Assembly against many reference sequences could run out of memory. This has been significantly improved.
- Integration with the Genomics Server: fixed an error when selecting contigs from a contig table for analysis. This is no longer possible (i.e. you have to save the contig first).
- Various bug fixes
QIAGEN CLC Genomics Server 1.5
Data formats:
- Data generated with version 3.5 cannot be read in earlier versions
New features:
- ChIP Sequencing
- DIP Detection
- Wizard-based database initiation tool released
- Peak detection and filtering on ChIP-seq data
- New filter options in SNP and DIP detection.
- SNP and DIP detection: as supplement to minimum variant frequency in percent, you can also specify a minimum variant count.
- SNP detection: just as DIP detection there is a maximum coverage filter
- SNP detection: there is now a “ploidy” setting just as for DIPs. This is used to mark SNPs as “complex”. The “Genetic code” drop-down box has been moved to step 3.
- SNP and DIP detection can now be performed directly on RNA-seq output contig tables
- Much improved memory performance and processing time of NGS data
- You can now specify minimum length of contigs to be reported in de novo assembly
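The filter options listed above (a minimum variant frequency in percent, a minimum variant count, and a maximum coverage) can be pictured as independent checks on each candidate position. The sketch below assumes that a variant must pass all three checks; how the tool actually combines them is not stated in these notes, and the function and parameter names are hypothetical.

```python
def passes_filters(variant_count: int,
                   coverage: int,
                   min_frequency_percent: float,
                   min_count: int,
                   max_coverage: int) -> bool:
    """Return True when a candidate variant passes all three filters.

    Assumption: the filters are combined with a logical AND.
    """
    if coverage <= 0 or coverage > max_coverage:
        return False  # no reads, or coverage above the maximum coverage filter
    frequency_percent = 100.0 * variant_count / coverage
    return (variant_count >= min_count
            and frequency_percent >= min_frequency_percent)


if __name__ == "__main__":
    # 4 of 10 reads carry the variant: 40% frequency, count 4, coverage 10.
    print(passes_filters(4, 10, min_frequency_percent=35.0,
                         min_count=2, max_coverage=100))  # True
```

A count threshold complements the percentage threshold at low coverage, where a single read can otherwise exceed the frequency cut-off on its own.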
Bug fixes:
- Fixed stability issues regarding database connection.
- Fixed issue that would make client unresponsive if network connection disappeared