An issue has been identified in Copy Number Variant Detection (CNV) (CLC Genomics Workbench) and CNV and LOH Detection (Biomedical Genomics Analysis plugin) that can affect Region-level CNV and Gene-level CNV results for chromosomes where there are targets with low coverage in the control sample. Target-level CNV results are not affected.
Regions are calculated by splitting the chromosome at positions where scores calculated for a sliding window change abruptly between two targets, for example, when targets change from having a positive fold change to a negative fold change. These scores should be computed for all targets, so that the chromosome can be split into regions between any two targets. Instead, we count the targets on each chromosome that exceed the low coverage minimum threshold, call this count N, and then calculate the scores for only the first N targets on the chromosome. As a result, after the first N targets we do not attempt to split the remaining stretch of the chromosome into smaller regions. For example, if 10% of targets have coverage below the threshold, we only consider region break points for the first 90% of targets along the chromosome.
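The effect described above can be illustrated with a small sketch. This is not the actual implementation in the Workbench or the plugin; the function name, parameters, and coverage values are hypothetical and serve only to show how counting the targets above the cutoff, and then scoring that many targets from the start of the chromosome, leaves the end of the chromosome unscored regardless of where the low coverage targets are located.

```python
from typing import List


def scored_target_indices(
    control_coverage: List[float],
    low_cov_cutoff: float,
    affected_version: bool,
) -> List[int]:
    """Return the indices of targets that receive sliding-window scores.

    Hypothetical illustration only: the names and the exact comparison
    against the cutoff are assumptions, not the product's code.
    """
    total = len(control_coverage)
    if not affected_version:
        # Intended behavior: every target on the chromosome is scored.
        return list(range(total))
    # Affected behavior: N is the number of targets that are not low coverage...
    n = sum(1 for cov in control_coverage if cov >= low_cov_cutoff)
    # ...but the scores are computed for the FIRST N targets by position.
    return list(range(n))


# Ten targets on one chromosome; only the target at index 1 has low coverage.
coverage = [50, 5, 60, 55, 70, 65, 80, 75, 90, 85]

affected = scored_target_indices(coverage, low_cov_cutoff=10, affected_version=True)
fixed = scored_target_indices(coverage, low_cov_cutoff=10, affected_version=False)

# Targets that are never scored, so no break point is considered around them.
unscored = sorted(set(fixed) - set(affected))
print(unscored)
```

In this example the single low coverage target sits near the start of the chromosome, yet the target that goes unscored is the last one. The more low coverage targets there are, the longer the unscored stretch at the end of the chromosome becomes.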
Where only a few targets in the control sample(s) have coverage below the specified low coverage cutoff, the risk due to this issue is quite low. As the number of targets below the specified low coverage cutoff increases, so too does the chance that regions of interest are overlooked or misreported. When the issue occurs, too few Region-level CNV results and too many Gene-level CNVs will be produced.
Region-level CNV and Gene-level CNV results are not affected by this issue if no low coverage target regions are identified in the control read mapping(s).
This issue was fixed in CLC Genomics Workbench 21.0.6 and 22.0.1, CLC Genomics Server 21.0.6 and 22.0.1, and Biomedical Genomics Analysis plugin 21.2.1 and 22.0.1.