NovaSeq X software update v1.4 improves data quality and customer usability

The NovaSeq X Series is Illumina’s most powerful sequencing platform, with XLEAP-SBS chemistry offering increased flow cell density, unprecedented sequencing throughput, and ultimately, lower costs. Software update v1.4 continues the evolution of the NovaSeq X platform by enabling significant improvements in usability and operational efficiency, while extending prior advancements in data quality and instrument robustness.

In this article, we highlight enhancements to customer usability and operational efficiency:

·      Greater flexibility in scheduling sequencing runs, including the ability to start a second flow cell while the first is sequencing, using a staggered-start configuration

·      A reduction in demux errors and delays, achieved through the automatic detection of sample indices present on a flow cell

·      New flow cell types and increased sequencing lengths, with the introduction of the 5B flow cell, along with 2 × 300 cycle sequencing on 1.5B flow cells

And we illustrate improvements to data quality:

·      Higher sequencing data quality, including significant improvements in base quality

·      Improved robustness in applications that contain a low diversity of base types throughout the sequencing run

·      Enhanced NGS secondary analysis, including an upgrade to DRAGEN v4.4, with improved variant-calling accuracy

Enhanced customer usability

Flexible scheduling of sequencing runs

Prior releases of the NovaSeq X Series software allowed users to start two flow cells concurrently in a dual flow cell configuration (Figure 1).

Figure 1. Dual flow cell configurations enabled with prior NovaSeq X Series releases

This concurrent-start restriction limited operational efficiency, because a second run could not begin until a prior single-sided run had finished. With software update v1.4, users can now start a second flow cell after a first flow cell has already begun sequencing, as illustrated in Figure 2.

Figure 2. Staggered-start flow cell configuration enabled with software update v1.4

As a result, users can adapt to real-time sample arrivals and begin sequencing when samples are ready, allowing projects to move forward without delay. The near independence of flow cell sides A and B maximizes instrument uptime and boosts sequencing capacity, enabling users to fully leverage the scale offered by the NovaSeq X Plus platform.

Figure 3. Example staggered-start operation enabled with software update v1.4

Automatic detection of sample indices reduces demux errors and delays

A common challenge with any sequencer is the prevalence of errors in the sample-demultiplexing process. These errors are introduced when the configuration of a sample sheet is inconsistent with libraries loaded on a flow cell. The impact can be significant; sample reprocessing is often required, resulting in longer turnaround times and increased storage costs.

Software update v1.4 addresses this issue with the introduction of automatic detection of sample indices. The feature identifies missing sample indices in a sample sheet, and it can optionally correct common mistakes, such as an incorrect orientation of the second (i5) sequencing index. Functional on the NovaSeq X Series, in the cloud (BSSH/ICA), or on a local server, index autodetection works with both Illumina and non-Illumina library preparation kits. In short, the feature helps to get BCL conversion to FASTQ right on the first try, eliminating unnecessary requeues.

Autodetection of indices includes two basic modes of operation:

·      Blind detection

·      Guided detection

As illustrated in Figure 4, blind detection works without an input sample sheet:

Figure 4. Blind mode index detection available with software update v1.4

Detected indices are available immediately after Index 2 in the form of a generated sample sheet, along with HTML and JSON files that highlight the detection criteria and decision process. Included in the post-sequencing output are detailed, per-sample FastQC statistical reports in HTML format. Optionally, FASTQ files can be produced automatically on the instrument after sequencing, based on the generated sample sheet.

In the most common use of blind detection, users compare the generated sample sheet to expectations after Index 2, identify any discrepancies (for example, missing samples), and provide a corrected sample sheet to a BCL Convert run executed through offline processing. With this enhanced workflow, the desired output is available from a single execution of BCL Convert, eliminating delays to turnaround time and additional storage costs.
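The comparison step in this workflow can be sketched in a few lines of Python. This is a minimal illustration only: it assumes the expected samples and the detected indices have already been read into plain Python structures, and the function name, sample IDs, and index sequences are hypothetical rather than part of any Illumina tool.

```python
def find_discrepancies(expected, detected_pairs):
    """Compare expected samples against detected (i7, i5) index pairs.

    expected: dict mapping a sample ID to its expected (i7, i5) pair.
    detected_pairs: iterable of (i7, i5) pairs from the generated sample sheet.
    Returns (missing_samples, unexpected_pairs), both sorted.
    """
    detected = set(detected_pairs)
    expected_pairs = set(expected.values())
    # Samples whose expected index pair was not detected on the flow cell
    missing_samples = sorted(
        sample_id for sample_id, pair in expected.items() if pair not in detected
    )
    # Detected index pairs that no expected sample accounts for
    unexpected_pairs = sorted(detected - expected_pairs)
    return missing_samples, unexpected_pairs


# Hypothetical example data
expected = {
    "S1": ("ATTACTCG", "TATAGCCT"),
    "S2": ("TCCGGAGA", "ATAGAGGC"),
    "S3": ("CGCTCATT", "CCTATCCT"),
}
detected = [
    ("ATTACTCG", "TATAGCCT"),
    ("TCCGGAGA", "ATAGAGGC"),
    ("GAGATTCC", "GGCTCTGA"),
]

missing_samples, unexpected_pairs = find_discrepancies(expected, detected)
print("Expected but not detected:", missing_samples)
print("Detected but not expected:", unexpected_pairs)
```

Either list being non-empty is a prompt to review the sample sheet before BCL Convert is run.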

In contrast, guided detection mode requires an input sample sheet, as illustrated in Figure 5:

Figure 5. Guided mode index detection available with software update v1.4

Otherwise, guided mode is similar to blind detection mode: detected indices are available immediately after Index 2, and per-sample FastQC reports and FASTQ files are generated after sequencing.

The benefit of guided mode is that it retains the characteristics (for example, Sample IDs, and barcode mismatch rates) from the original sample sheet, while providing optional configurations to:

·      Correct samples by reverse complementing Index 2, and/or

·      Add samples by extracting indices with high read counts from the undetermined data.

The reverse-complementing process is straightforward. If the percentage of demultiplexed samples in the original sample sheet is less than 5%, then Index 2 in the sample sheet is reverse complemented. If the reverse-complemented sample sheet yields a demultiplexing percentage greater than 80%, then the reverse-complemented sample sheet is accepted. In software update v1.4, the reverse-complement operation occurs on a per-flow-cell basis; in future updates, the reverse-complement process will occur on a per-lane basis.
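The two-threshold rule described above can be expressed as a short Python sketch. It is illustrative only and is not the instrument's implementation: the function names are hypothetical, the demultiplexing percentages are assumed to be supplied by the caller, and only the 5% and 80% thresholds come from the description above.

```python
# Translation table for complementing DNA bases (N stays N)
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(index_seq):
    """Return the reverse complement of an index sequence."""
    return index_seq.upper().translate(_COMPLEMENT)[::-1]


def accept_reverse_complement(original_pct, rc_pct, low=5.0, high=80.0):
    """Decide whether the reverse-complemented sample sheet is accepted.

    original_pct: demultiplexing percentage with the original sample sheet.
    rc_pct: demultiplexing percentage after reverse complementing Index 2.
    """
    if original_pct >= low:
        # The original orientation demultiplexes well enough; no retry is made
        return False
    # Accept only if the reverse-complemented sheet clearly demultiplexes
    return rc_pct > high


print(reverse_complement("TATAGCCT"))
print(accept_reverse_complement(original_pct=2.0, rc_pct=93.0))
```

The gap between the two thresholds means a sample sheet that demultiplexes poorly in both orientations is left unchanged rather than silently swapped.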

The most common use case of guided mode includes the generation of FASTQ files.

Furthermore, guided mode can be configured to emulate blind detection while retaining a specific parameterization of the sample sheet (for example, barcode mismatch tolerances).

Figure 6 highlights the accuracy of the autodetection feature.

Figure 6. Blind mode index detection accuracy

Statistics are generated for missed-detection and false-detection rates. The algorithms are tuned to reduce the missed-detection rate, because any unintended, added samples can easily be discarded by the user. In general, autodetection works well on Illumina-generated sequencing data across multiple library preparations and applications, and the algorithms will be improved in future releases based on user feedback. Because the detection process occurs after Index 2, execution times are relatively short. Post-sequencing statistics typically complete within the wash cycle, and FASTQ generation typically completes within the clustering interval of the next run, thus enabling continuous 24/7 sequencing, as shown in Figures 1 and 2.
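As a rough illustration of the two statistics, the Python sketch below computes them from a set of indices known to be in the pool and a set of detected indices. The definitions are assumptions made for this example, since the article does not define them formally: the missed-detection rate is taken as the fraction of true indices that were not detected, and the false-detection rate as the fraction of detected indices that were not in the pool. The function name and index labels are hypothetical.

```python
def detection_rates(true_indices, detected_indices):
    """Return (missed_detection_rate, false_detection_rate) as fractions.

    Assumed definitions (not taken from the article):
      missed-detection rate = true indices not detected / all true indices
      false-detection rate  = detected indices not in pool / all detected indices
    """
    truth = set(true_indices)
    detected = set(detected_indices)
    if not truth:
        raise ValueError("at least one true index is required")
    missed_rate = len(truth - detected) / len(truth)
    # With nothing detected there can be no false detections
    false_rate = len(detected - truth) / len(detected) if detected else 0.0
    return missed_rate, false_rate


# Hypothetical index labels
truth = {"IDX01", "IDX02", "IDX03", "IDX04"}
detected = {"IDX01", "IDX02", "IDX03", "IDX99"}

missed_rate, false_rate = detection_rates(truth, detected)
print(f"missed-detection rate: {missed_rate:.2f}")
print(f"false-detection rate: {false_rate:.2f}")
```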

Importantly, users need not be concerned about software automatically deciding what is or is not in a library pool: the user is always in control of the autodetection process. If desired, parameters, such as the minimum number of reads required for a candidate index to be accepted, can be customized to suit specific applications. Ultimately, autodetection of indices does not make decisions; rather, it provides information to help inform customer decisions. If autodetection identifies an additional unintended sample, it can easily be discarded.
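To make the role of a minimum-read parameter concrete, the following Python sketch counts index sequences among undetermined reads and keeps only those that reach a threshold. It is a simplified illustration under stated assumptions: the input is assumed to be one index string per read, and the function name, the `min_reads` parameter name, and the example values are hypothetical rather than the software's actual settings.

```python
from collections import Counter


def candidate_indices(undetermined_indices, min_reads=1000):
    """Return (index, read_count) pairs with at least min_reads reads.

    undetermined_indices: iterable with one index string per undetermined read.
    Results are ordered from the most to the least abundant index.
    """
    counts = Counter(undetermined_indices)
    return [(index, n) for index, n in counts.most_common() if n >= min_reads]


# Hypothetical, tiny example; real runs involve far larger read counts
reads = ["ACGT+TTGA"] * 5 + ["GGCC+AATT"] * 3 + ["NNNN+NNNN"]
candidates = candidate_indices(reads, min_reads=3)
print(candidates)
```

Raising the threshold makes the list of candidates more conservative, while lowering it surfaces rarer indices for the user to review and, if unintended, discard.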


New flow cell types and sequencing lengths enable optimized batching and new applications

Software update v1.4 introduces the 5B flow cell, a new mid-range throughput offering on the NovaSeq X Series that improves operational efficiency. The 5B flow cell is available for use with 100-, 200-, and 300-cycle kits. These kits are well suited for pilot Illumina Protein Prep studies or for users seeking a more practical batching size that aligns with typical sample arrival intervals.

In addition, software update v1.4 expands sequencing options with the introduction of the 1.5B 600‑cycle flow cell. This new configuration unlocks the high-throughput tier for applications such as shotgun metagenomics, immune repertoire profiling, and amplicon sequencing, enabling deeper coverage and longer read lengths within a single run.

Customer installable software

Prior software updates to the NovaSeq X Series required an Illumina Field Service Engineer to update the instrument to the latest software revision. With the release of software update v1.4, the upgrade process for future software versions can be performed without Illumina assistance by using the Illumina Software Update Manager (ISUM). ISUM not only simplifies installation, but it also enables the latest sequencer enhancements to be adopted as soon as a software release becomes available.

Higher sequencing data quality

Improved base quality

With each software update to the NovaSeq X Series, Illumina demonstrates its commitment to continuous improvements in data quality. These improvements are reflected in an increased percentage of bases exceeding Q30, with a corresponding improvement in variant-calling accuracy. Gains in data quality are driven by three areas: optimizations to the sequencing recipe (the set of events that govern the sequencing process), enhancements to clustering (the process by which DNA library fragments attach to and are amplified on a flow cell), and improvements to primary analysis (the signal processing algorithms used to extract base calls from the sequencer).

With software update v1.4, further improvements were achieved by fine-tuning the rates and dwell times of the thermal and fluidic steps within the sequencing recipe. Clustering steps were optimized to promote the growth of more genomically pure and robust clusters. These improvements increase signal-to-noise ratios, resulting in higher quality scores. In addition, enhancements to signal processing in primary analysis have delivered more robust performance under challenging conditions, particularly those with variable base diversity.

As a result of these changes, quality improvements are evident in the primary analysis data. Figure 7 demonstrates that the average quality score in the highest quality bin has increased to Q41, up from Q40 in software update v1.3. In addition, most bases exceed the Q41 average. This shift to higher quality scores is observed across 10B and 25B flow cells.

Figure 7. Shift in the average quality of high-bin bases to Q41 with software update v1.4

In addition to the shift to Q41, the percentage of bases that exceed Q30 has also increased in software update v1.4. Figure 8 shows the improvement for Phi X and TruSeq PCR-Free libraries.

Figure 8. Improvements in the percentage of bases ≥ Q30 with software update v1.4
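For readers less familiar with quality scores, the Python sketch below shows the standard Phred relationship between a quality score Q and the estimated base-calling error probability, p = 10^(-Q/10), along with a simple percent-of-bases-at-or-above-Q30 calculation. The function names and the example quality values are illustrative and are not taken from the figures.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score to an estimated error probability."""
    return 10 ** (-q / 10)


def percent_at_or_above(quals, threshold=30):
    """Return the percentage of quality scores at or above the threshold."""
    if not quals:
        raise ValueError("quals must not be empty")
    return 100.0 * sum(q >= threshold for q in quals) / len(quals)


for q in (30, 40, 41):
    print(f"Q{q}: about 1 error in {round(1 / phred_to_error_prob(q)):,} bases")

# Hypothetical per-base quality scores
example_quals = [12, 30, 41, 41]
print(f"%>=Q30: {percent_at_or_above(example_quals):.1f}")
```

On this scale, Q30 corresponds to an estimated 1 error in 1,000 bases and Q40 to 1 in 10,000, so a shift from Q40 to Q41 lowers the estimated error probability by roughly 20%.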

Low diversity enhancements

In some sequencing libraries, not all four base types are sufficiently abundant in the pool of sequenced clusters at a given cycle. Low-diversity conditions occur when one or more of the four base types are significantly underrepresented in the data. Maintaining high data quality and base calling accuracy under low-diversity conditions is challenging for real-time base calling, particularly when the diversity of bases changes from cycle to cycle.

To improve robustness in low-diversity applications, a Phi X spike-in is recommended to ensure a minimum level of base diversity in each cycle. Enhancements in software update v1.3 enabled a reduction of the Phi X spike-in from 15% to 5%, resulting in a 10-percentage-point increase in usable data. With software update v1.4, additional enhancements further improve performance under low-diversity conditions, particularly for applications where base diversity varies from cycle to cycle.
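The effect of the spike-in reduction on usable data can be checked with simple arithmetic. The Python sketch below assumes, for illustration only, that usable data is the fraction of reads not consumed by the Phi X spike-in and ignores every other source of loss; the function name is hypothetical.

```python
def usable_fraction(spike_in_fraction):
    """Fraction of reads left for the library, assuming only spike-in loss."""
    return 1.0 - spike_in_fraction


before = usable_fraction(0.15)  # 15% Phi X spike-in
after = usable_fraction(0.05)   # 5% Phi X spike-in

point_gain = (after - before) * 100              # gain in percentage points
relative_gain = (after - before) / before * 100  # gain relative to the 15% case

print(f"Usable data: {before:.0%} -> {after:.0%}")
print(f"Gain: {point_gain:.1f} percentage points ({relative_gain:.1f}% relative)")
```

Under that assumption, usable data rises from 85% to 95% of reads, which is 10 percentage points or roughly 11.8% more library data per run.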

Figure 9A shows the improved performance with software update v1.4 relative to v1.3, with a higher percentage of bases greater than Q30 and with fewer negative dips in base quality. Figure 9B illustrates an application in which base diversity transitions from high diversity to low diversity and then back to high diversity. Under these conditions, software update v1.4 maintains consistently high data quality, as measured by the Q30 metric. These improvements were achieved without impacting performance in more typical high-diversity use cases.

Figure 9A. Sequencing run with predominantly low-diversity cycles
Figure 9B. Sequencing run with transitions between low- and high-diversity cycles

Enhanced secondary analysis

Accuracy improvements

The improvements in primary analysis data quality, combined with an upgrade to DRAGEN v4.4, translate to enhanced secondary analysis performance. The HG002 sample was processed using the TruSeq PCR-Free 450 library on 10B and 25B flow cells. Variant-calling results were assessed against a T2T truth set, with the results shown in Figure 10. Relative to software update v1.3, software update v1.4 demonstrates a significant reduction in both false positive (FP) and false negative (FN) rates.

Figure 10. Improvements in SNP and indel precision and recall with software update v1.4
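The link between FP and FN counts and the precision and recall metrics follows from their standard definitions: precision = TP / (TP + FP) and recall = TP / (TP + FN), where TP is the number of true positive calls. The Python sketch below applies those definitions; the function name is illustrative and the counts are hypothetical, not values from Figure 10.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true positive, FP, and FN counts."""
    if tp + fp == 0 or tp + fn == 0:
        raise ValueError("precision or recall is undefined for these counts")
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall


# Hypothetical counts: the second call set has fewer FPs and FNs
older = precision_recall(tp=9_900, fp=100, fn=100)
newer = precision_recall(tp=9_950, fp=50, fn=50)

print(f"older: precision={older[0]:.4f}, recall={older[1]:.4f}")
print(f"newer: precision={newer[0]:.4f}, recall={newer[1]:.4f}")
```

Because FP appears only in the precision denominator and FN only in the recall denominator, a reduction in both error types raises both metrics.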

Upgrade to DRAGEN v4.4

Software update v1.4 includes performance enhancements to secondary analysis enabled by the update to DRAGEN v4.4. On the sequencer, all previously available pipelines have been upgraded, including BCL Convert, DRAGEN Germline (for whole-genome sequencing), DRAGEN Somatic (tumor only, for whole-genome sequencing), DRAGEN RNA, DRAGEN Enrichment (germline and somatic tumor only modes), and DRAGEN Methylation.

In addition to autodetection, BCL Convert now supports custom fields in the data section, enabling improved sample traceability and tracking. Furthermore, the DRAGEN Somatic pipeline now supports both heme and solid tumor options. As in prior releases, multiple DRAGEN versions can be used on a single flow cell, and prior DRAGEN versions will run on the latest software update v1.4.

Cloud-based pipelines include those enabled on the sequencer, with additional offerings such as DRAGEN Protein Quantification and DRAGEN Single-Cell RNA. Further details are available in the NovaSeq X Series software installers and release notes.

Coming soon

Looking ahead, the following capabilities are planned for upcoming releases:

File-based LIMS integration
Illumina will introduce run automation after first loading consumables, to minimize operator error and to enforce traceability for clinical specialty research and biopharma customers.

Enhanced autodetection
Illumina will extend autodetection capabilities based on user feedback, adding per-lane index correction, species detection, and per-sample coverage metrics.

Improved data quality
Illumina will introduce Q70 quality-score technology, enabling next-generation oncology applications with unmatched accuracy.

Increased output
Illumina will increase the NovaSeq X output from 25B to 35B (a 40% increase), while the 10B configuration will expand to 14B, enabling larger and more complex studies on the same instrument.

Improved speed
Illumina will deliver faster turnaround times, with 14B output in 20–22 hours, representing an average 30% improvement in whole-genome sequencing (WGS) workflows.

Conclusion

Software update v1.4 represents a broad step forward for the NovaSeq X Series, advancing the platform across three dimensions: operational flexibility, ease of use, and data quality. With the introduction of staggered flow-cell starts, users can better optimize flow cell scheduling by starting a second run on-demand, whenever samples are ready, provided a flow cell side is available. The new 5B flow cell provides a flexible mid-range throughput option that improves operational efficiency, while 600-cycle runs on the 1.5B flow cell unlock a wider range of applications.

Upgrades to future software versions can be performed without Illumina assistance with the Illumina Software Update Manager, which enables the latest sequencer enhancements to be adopted as soon as a software release becomes available. Together, these advancements deliver unprecedented flexibility and enable substantial gains in operational efficiency.

In addition, software update v1.4 reinforces Illumina’s commitment to improving data quality on shipping platforms long after initial launch. With this release, a significant increase in data output per flow cell is achieved while maintaining uncompromised data quality. Notably, the percentage of bases exceeding Q30 is elevated relative to prior software versions. These quality improvements translate directly to secondary analysis, with measurable gains in precision and recall for both SNPs and indels.

Software update v1.4 represents an early milestone in the NovaSeq X Series innovation roadmap, delivering gains in flexibility, efficiency, and data quality while establishing the foundation for the continued advances in output, speed, and accuracy that the roadmap outlines.