Using the Ion Library and DIA Results apps in the OneOmics suite
Alexandra Antonoplis1, Nick Morrice2 and Christie Hunter1
1SCIEX, USA; 2SCIEX, UK, Redwood City, CA, USA
The OneOmics suite is a unified platform that provides researchers with tools for multi-omics data management, compound identification and quantification, statistical analysis, and pathway analysis to streamline biomarker discovery studies. This cloud-powered solution also enables rapid and secure sharing of results with collaborators. Here, a complete processing workflow for proteomics analysis is presented using Zeno data-dependent acquisition (DDA, or information dependent acquisition) and Zeno SWATH DIA data acquired on the ZenoTOF 7600 system. In this workflow, a spectral library was generated using the Ion Library app in OneOmics. Zeno SWATH DIA data were subsequently processed using this ion library in DIA-NN and visualized in OneOmics.
The combination of omics disciplines has proven to be more powerful than individual disciplines. However, multi-omics data analysis workflows can be time-consuming, as consolidating and interpreting various result outputs, especially when different scoring schemas and criteria are used, is often challenging. The OneOmics suite was previously demonstrated to provide researchers with tools for multi-omics data management, compound identification and quantification, statistical analysis, and pathway analysis to streamline biomarker discovery studies.1-3
To further facilitate data processing for high-throughput proteomics workflows, the OneOmics suite extends its SWATH data-independent acquisition (DIA) data processing by supporting visualization and statistical interpretation of DIA-NN software results. DIA-NN software is a widely used proteomics processing platform that leverages neural networks and powerful quantification and inference algorithms to achieve confident protein and peptide identifications.4
Here, a processing workflow for proteomics analysis is presented using Zeno DDA (data-dependent acquisition) and SWATH DIA data. In this workflow, a spectral library was generated from Zeno DDA data using the Ion Library app in OneOmics suite. SWATH DIA data were subsequently processed using this ion library in DIA-NN software and evaluated in the DIA Results app (Figure 1). Results generated using this workflow can be rapidly and securely shared with collaborators.5
Sample preparation: As previously described, a 100 µg sample of K562 cell lysate was fractionated using high-pH RP-HPLC.6 For Zeno SWATH DIA acquisition on the ZenoTOF 7600 system, a sample of K562 digest (SWATH acquisition performance kit) was prepared in water with 1% formic acid and analyzed at loading amounts ranging from 12.5–200 ng. A data set consisting of SWATH DIA results corresponding to 6 human cell lines was also evaluated.3
Chromatography: Microflow analysis of the K562 fractions and six cell lysates was performed as previously described.3,6 For Zeno SWATH DIA experiments, a 45-minute microflow gradient from 5% to 30% mobile phase B on the Waters ACQUITY UPLC M-Class system was implemented in trap-elute mode with a Phenomenex C18 micro trap (10 x 0.3 mm) and a flow rate of 5 µL/min.
Mass spectrometry: A ZenoTOF 7600 system equipped with the OptiFlow Turbo V ion source using a low microflow probe and electrode was used for Zeno DDA and Zeno SWATH DIA data acquisition. For generation of the ion library, DDA parameters were implemented as previously described.6 For Zeno SWATH acquisition, an 80 variable window method was used with an MS/MS accumulation time of 25 ms and dynamic collision energy. The 6 cell lysates were analyzed as previously described on the TripleTOF 6600 system.3
Data processing: DDA data files were uploaded to the OneOmics suite using CloudConnect in PeakView software, version 2.2. Data were then searched using the multi-file option in the Ion Library app using a human FASTA file from UniProt. This app creates ion libraries that are compatible with DIA-NN software. Search results were visualized using the Analytics and Browser apps in the ProteinPilot app to assess the quality of the protein identification results.
The generated ion library was then used to process SWATH DIA data in DIA-NN software, version 1.8.1. In DIA-NN software, the robust LC, high precision workflow setting was selected along with match between runs (MBR). The advanced command --report-lib-info was also added, as this command is required to create an output file that is compatible with the DIA Results app in the OneOmics suite.7 The results output from DIA-NN software consists of several *.tsv data files. The overall final *.tsv report for each search (the largest *.tsv file in size) was uploaded to OneOmics suite using CloudConnect in PeakView software for further evaluation.
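The two practical steps above, adding the required command and locating the final report, can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' procedure: the executable name `diann` and the file names are hypothetical, the argument list is only assembled and never executed, and the flags for the robust LC, high precision setting are deliberately left out because they are normally chosen in the DIA-NN interface.

```python
from pathlib import Path


def build_diann_args(raw_files, library, out_report):
    """Assemble a DIA-NN argument list (built only, not executed here)."""
    args = ["diann"]  # hypothetical executable name; adjust to the local install
    for raw in raw_files:
        args += ["--f", str(raw)]
    args += [
        "--lib", str(library),     # ion library exported from the Ion Library app
        "--out", str(out_report),  # main report path
        "--reanalyse",             # match between runs (MBR)
        "--report-lib-info",       # needed for DIA Results app compatibility
    ]
    return args


def largest_tsv(output_dir):
    """Return the largest *.tsv file in output_dir, or None if there is none."""
    candidates = list(Path(output_dir).glob("*.tsv"))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_size)
```

Selecting by file size mirrors the rule of thumb given in the text; checking the file name against the report path passed to `--out` is a safer choice when it is known.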
High-pH fractions of K562 cell lysate digest were analyzed using Zeno DDA and processed using the Ion Library app in the OneOmics suite. The Ion Library app is an extension of ProteinPilot app and uses both the Paragon Algorithm and Pro Group Algorithm to infer peptides and proteins from DDA data. The app creates ion libraries in the *.txt file format for DIA data processing that are compatible with DIA-NN software. Additionally, *.groupexport files are created that can be explored in the ProteinPilot app to assess overall protein and peptide identifications and underlying data quality. The final K562 digest library contained 8,373 proteins at <1% global FDR and 211,150 peptides at <1% global FDR (Figure 2). For each spectral library created in OneOmics suite, a record of analysis settings is saved to help users reproduce their processing results.
A series of K562 loading amounts ranging from 12.5–200 ng was analyzed using Zeno SWATH DIA on the ZenoTOF 7600 system. The resulting files were processed in DIA-NN software using library-based processing. The resulting *.tsv output files were uploaded to OneOmics suite and imported into the DIA Results app to evaluate identifications. The DIA Results app creates *.dexport files upon transforming the *.tsv results output from DIA-NN software. The *.dexport files can then be further evaluated.
In the DIA Results app, peptide identifications at 1% FDR and critical %CV thresholds are visualized for a quick data quality assessment. In the data set explored here, each K562 load was analyzed in triplicate and evaluated for protein and peptide identifications (Figure 3). The FDR Metrics plot, which indicates peptides passing a 1% FDR threshold in each sample analyzed, illustrated consistent yields within each set of replicates. The highest loading amount (200 ng) yielded the greatest number of peptides passing a 1% FDR threshold and exhibited the highest frequency of proteins with %CV <20% in the area variance metrics plot. For each experiment, the application also records metadata used during data processing for future reference. An overview of proteins and peptides reported is provided in the sample grouping summary.
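The two quality metrics described above can be approximated directly from a DIA-NN report, which may help when sanity-checking what the app displays. The sketch below is not the DIA Results app's implementation. It assumes the column names `Run`, `Protein.Group`, `Precursor.Id`, `Q.Value` and `PG.MaxLFQ` from the DIA-NN main report, treats every run in the input as one replicate group, and uses the sample standard deviation for %CV.

```python
import csv
import io
from collections import defaultdict
from statistics import mean, stdev


def summarize_report(tsv_text, q_cutoff=0.01, cv_cutoff=20.0):
    """Count precursors passing the FDR cutoff per run and compute protein %CV."""
    precursors = defaultdict(set)  # run -> precursor IDs passing the cutoff
    quant = defaultdict(dict)      # protein group -> {run: quantity}
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        if float(row["Q.Value"]) > q_cutoff:
            continue
        precursors[row["Run"]].add(row["Precursor.Id"])
        if row["PG.MaxLFQ"]:  # quantity can be empty for some rows
            quant[row["Protein.Group"]][row["Run"]] = float(row["PG.MaxLFQ"])

    ids_per_run = {run: len(ids) for run, ids in precursors.items()}

    cvs = {}
    for protein, by_run in quant.items():
        values = list(by_run.values())
        if len(values) >= 2 and mean(values) > 0:
            cvs[protein] = 100.0 * stdev(values) / mean(values)

    n_below = sum(1 for cv in cvs.values() if cv < cv_cutoff)
    return ids_per_run, cvs, n_below
```

For a study with several loading amounts, the function would be called once per replicate set, for example once on the rows belonging to the three 200 ng injections.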
The DIA Results app enables exploration of differential proteins and their associated fold changes in the form of volcano plots, heat maps and ontologies. Here, a SWATH DIA data set from the TripleTOF 6600+ system consisting of six different cell lines was processed in DIA-NN software. Results were imported into OneOmics suite to browse protein fold changes relative to a specified control. Heat maps enabled rapid visualization of differential proteins and their associated fold changes (Figure 4) and can be custom-filtered by fold-change confidence and reproducibility metrics. Ontology plots are also provided to correlate differential proteins to biological processes, molecular functions, and cellular components. The dot beside each ontology term illustrates the direction and ratio of the fold change for the ontology term as determined from the associated proteins. Additionally, PCA scores plots are provided to illustrate sample clustering pre- and post-normalization, along with metrics illustrating ion library coverage (Figure 5). The visualizations provided in the DIA Results app facilitate interpretation of biological studies processed using DIA-NN software.
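As a rough illustration of what a fold change relative to a specified control means, the sketch below computes a log2 ratio of mean protein abundances for each sample group against the control group. This is an assumption for illustration only: how the DIA Results app calculates fold changes and their confidence is not described here, and the group and protein names in the usage note are hypothetical.

```python
import math
from statistics import mean


def log2_fold_changes(abundances, control):
    """abundances maps group -> {protein: [replicate values]}.

    Returns group -> {protein: log2(mean(group) / mean(control))} for every
    non-control group, skipping proteins missing from the control or with
    non-positive mean abundance.
    """
    ctrl = abundances[control]
    result = {}
    for group, proteins in abundances.items():
        if group == control:
            continue
        fold_changes = {}
        for protein, values in proteins.items():
            if not values or not ctrl.get(protein):
                continue
            numerator, denominator = mean(values), mean(ctrl[protein])
            if numerator > 0 and denominator > 0:
                fold_changes[protein] = math.log2(numerator / denominator)
        result[group] = fold_changes
    return result
```

Calling `log2_fold_changes(data, "control")` with a second group named, say, `"cell_line_A"` returns positive values for proteins that are more abundant in that group than in the control and negative values for proteins that are less abundant.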
In addition to supporting library-based processing, DIA-NN software also supports library-free searching of DIA data. For library-free processing, FASTA sequences are digested in silico using user-specified settings and the resulting output is used for analyzing SWATH DIA data.8,9 Results files created by library-free searching in DIA-NN software can be imported into OneOmics for statistical analysis and visualization using the same workflow as described here for library-based DIA-NN software search results.
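To make the in silico digestion step concrete, the following sketch cleaves a protein sequence with a simple trypsin rule. It is not DIA-NN's implementation. The rule (cleave after K or R unless the next residue is P) and the default values for missed cleavages and peptide length are assumptions standing in for the user-specified settings mentioned above.

```python
import re


def digest(sequence, missed_cleavages=1, min_len=7, max_len=30):
    """Return tryptic peptides of a protein sequence, in order of start position."""
    # Zero-width split after K or R, except when the next residue is P.
    fragments = [f for f in re.split(r"(?<=[KR])(?!P)", sequence) if f]
    peptides = []
    for i in range(len(fragments)):
        # Join up to `missed_cleavages` additional consecutive fragments.
        last = min(i + missed_cleavages + 1, len(fragments))
        for j in range(i, last):
            peptide = "".join(fragments[i:j + 1])
            if min_len <= len(peptide) <= max_len:
                peptides.append(peptide)
    return peptides
```

In a library-free search, each resulting peptide would then receive predicted fragment intensities and retention times before being matched against the SWATH DIA data; that prediction step is outside the scope of this sketch.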
Here, a data processing workflow using DDA and SWATH DIA data is demonstrated using DIA-NN software and OneOmics suite. Zeno DDA data were processed in the Ion Library app to generate spectral libraries. The spectral libraries were then downloaded and used in DIA-NN software to process Zeno SWATH DIA data from the ZenoTOF 7600 system. Upon import into the DIA Results app in OneOmics suite, critical protein and peptide identifications were evaluated.
The demonstrated workflow also works for SWATH DIA data collected on SCIEX X500 series systems and TripleTOF systems, providing a cloud processing pipeline for all SCIEX accurate mass spectrometers. Using the tools available in the OneOmics suite, this workflow could be extended for the exploration of differential proteins and putative biomarkers by comparing differential proteins across biological samples.