Enabling Systems Biology Driven Proteome Wide Quantitation of Mycobacterium Tuberculosis
SWATH® Acquisition on the TripleTOF® 5600+ System
Samuel L. Bader, Robert L. Moritz
Institute for Systems Biology, USA
Systems biology focuses on elucidating the complex interactions within biological systems and studies how networks perform and manage their many biological functions. The ability to quantitatively monitor the majority of components of protein networks provides insight into the dynamic changes within the network at the molecular level, which helps in understanding the system and how it is perturbed in disease. SWATH Acquisition is a key workflow for systems biology research because it can uniquely provide comprehensive, high-quality quantitation on a very large number of peptides and their inferred proteins.1 Because the assays are predefined, reproducibility is high, and the workflow can provide a very detailed view of the proteome dynamics in the biological system under investigation.
Tuberculosis is caused by the bacterium Mycobacterium tuberculosis (Mtb), which induces a chronic infection with high morbidity and mortality worldwide. In this work, Mtb is under investigation because new drugs and clinic-free diagnostics are desperately needed to address the disease caused by this highly prevalent pathogen. The pattern of proteome expression across the Mtb life cycle will provide fundamental information for studying the organism's life cycle and possible modes of disease progression and intervention.
A comprehensive spectral ion library has been developed for the M. tuberculosis proteome (www.srmatlas.org/mtb/swath.php) and is used for interrogating the SWATH Acquisition data.2,3
Sample preparation: M. tuberculosis was grown under BSL3 laboratory conditions and harvested at three time points: early exponential phase, late exponential phase and stationary phase (Figure 4). Each sample was analyzed in technical triplicate.
Chromatography: The samples were analyzed using the NanoLC™ 425 System (SCIEX) coupled to a 15 cm analytical column (75 µm ID, packed in-house with Magic C18 AQ 5 µm diameter, 200 Å pore size, Michrom BioResources). Peptides were separated using a linear gradient from 2 – 35% B over 180 min at a flow rate of 300 nL/min (A – 2% acetonitrile in 0.1% formic acid, B – 98% acetonitrile in 0.1% formic acid).
Mass spectrometry: Eluent from the column was analyzed using the Nanospray® Ion Source on a TripleTOF® 5600+ System (SCIEX). The acquisition method was a SWATH® Acquisition method, where Q1 was stepped across a mass range of 400-1200 m/z and MS/MS was acquired from 100-2000 m/z. Q1 transmission windows were 25 Da wide and 32 steps were used with a 100 msec accumulation time on each, resulting in a total cycle time of 3.5 sec.
Data processing: Using a comprehensive spectral ion library generated for Mtb (Figure 2), the SWATH Acquisition data was processed using the SWATH Acquisition MicroApp 2.0 beta in PeakView® Software. Peptide and protein peak areas were exported to MarkerView™ Software for results interrogation. Principal component analysis (PCA) and principal component variable grouping (PCVG) were used to analyze differences.
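The window arithmetic in the acquisition method can be checked with a short sketch. It assumes contiguous, non-overlapping Q1 windows (any window overlap used on the instrument is not stated here), and the constant and function names are illustrative only. The difference between the reported cycle time and the summed MS/MS accumulation time is simply computed; what fills that remainder (for example a survey scan or instrument overhead) is an assumption and is not specified in the method description.

```python
# Sketch of the Q1 window scheme described above.
# Assumption: windows are contiguous and non-overlapping.
Q1_START_MZ = 400.0       # lower edge of the Q1 range (from the method)
Q1_END_MZ = 1200.0        # upper edge of the Q1 range (from the method)
WINDOW_WIDTH_DA = 25.0    # Q1 transmission window width (from the method)
ACCUMULATION_S = 0.100    # accumulation time per window (from the method)
REPORTED_CYCLE_S = 3.5    # total cycle time reported in the method


def swath_windows(start, end, width):
    """Return a list of (low, high) Q1 window edges covering start..end."""
    n_windows = round((end - start) / width)
    return [(start + i * width, start + (i + 1) * width)
            for i in range(n_windows)]


windows = swath_windows(Q1_START_MZ, Q1_END_MZ, WINDOW_WIDTH_DA)

# Time spent on MS/MS accumulation per cycle.
msms_time_s = len(windows) * ACCUMULATION_S

# Remainder of the reported cycle time not accounted for by MS/MS steps.
remaining_s = REPORTED_CYCLE_S - msms_time_s

print(len(windows), "windows;", round(msms_time_s, 3), "s of MS/MS per cycle;",
      round(remaining_s, 3), "s remaining")
```

Under these assumptions the 800 m/z range divides into the 32 steps stated in the method, and the MS/MS steps account for most of the reported cycle time.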
M. tuberculosis Spectral Library
The analysis of SWATH Acquisition data requires precise knowledge of the peptide fragmentation and chromatographic coordinates. These coordinates can be extracted from spectral libraries that consist of high-quality peptide MS/MS spectra obtained by consolidating multiple fragmentation spectra from many data dependent analyses. These spectral libraries contain both fragment intensity information for selected fragment ions and a standardized retention time (Figure 2) that is used when interrogating the SWATH Acquisition data. Such a complete proteome spectral library is available for M. tuberculosis.2 In short, this library was generated using 13,007 unique synthetic proteotypic peptides mapping to 3,838 proteins (96% of the annotated proteome) covering all functional categories (Figure 3). Typically, six fragment ions per peptide are used when interrogating the SWATH Acquisition data, and in this case all peptides per protein contained in the library were used.
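To make the library coordinates concrete, the following is a minimal sketch of how one library entry might be represented and how the six most intense fragment ions could be picked for extraction. The class names, field names and all numeric values are hypothetical placeholders; they do not reflect the actual library file format or real measurements, and ranking by library intensity is an assumed selection rule.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Fragment:
    ion: str          # fragment label, e.g. "y7" (illustrative)
    mz: float         # fragment m/z
    intensity: float  # relative intensity stored in the library


@dataclass
class LibraryEntry:
    peptide: str              # peptide sequence
    precursor_mz: float       # precursor m/z
    standardized_rt: float    # standardized retention time
    fragments: List[Fragment]


def top_fragments(entry, n=6):
    """Return the n most intense fragments (assumed selection rule)."""
    ranked = sorted(entry.fragments, key=lambda f: f.intensity, reverse=True)
    return ranked[:n]


# Placeholder entry: every value below is invented for illustration only.
entry = LibraryEntry(
    peptide="PEPTIDER",
    precursor_mz=500.0,
    standardized_rt=42.0,
    fragments=[
        Fragment("y1", 175.1, 5.0),
        Fragment("y2", 290.1, 40.0),
        Fragment("y3", 403.2, 80.0),
        Fragment("y4", 504.3, 100.0),
        Fragment("y5", 601.3, 60.0),
        Fragment("y6", 730.4, 30.0),
        Fragment("b2", 227.1, 20.0),
        Fragment("b3", 324.2, 10.0),
    ],
)

selected = top_fragments(entry)
print([f.ion for f in selected])
```

Each selected fragment m/z, together with the standardized retention time, defines where an extracted ion chromatogram would be drawn from the SWATH Acquisition data.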
Changes in protein expression at different points in growth curve
To show that SWATH Acquisition allows quantitative profiling of an entire biological system, we monitored the protein abundance changes between three discrete time points along the growth curve. Samples were collected at early exponential, late exponential and stationary growth phase, lysed, digested with trypsin and then analyzed in three technical replicates. From the SWATH Acquisition data, targeted and reproducible chromatographic traces were extracted for over 5,100 peptides covering in total 2,805 proteins (73% of the proteome) (Figure 4).
The growth of Mtb is well established and proteins such as those from the dosR regulon are known to change in abundance in stationary phase. Verification of the peptides derived from the transcription factor DosR (Rv3133c), which is known to increase in abundance in stationary phase, showed that we could reproduce these well-studied protein changes (Figure 5, bottom). As a control, five house-keeping proteins (DnaK, Enolase, Glycerol-3-phosphate dehydrogenase, GroEL and the DNA-directed RNA polymerase subunit beta) that were quantified in the dataset were analyzed and the abundance of each of these proteins remained unchanged as expected. To illustrate this result, the SWATH Acquisition data for a peptide from the control protein DnaK (DAGQIAGLNVLR, top) and a peptide for DosR (TQAAVFATELKR, bottom) are shown in Figure 5.
To investigate the quantitative changes of the M. tuberculosis proteome along the growth curve, we exported the data from SWATH Acquisition MicroApp 2.0 beta to the MarkerView™ Software for statistical analysis and data visualization. This can be done at both the peptide and protein levels, using the sum of the fragment ion XIC peak areas for quantitation. In this case, the protein level data, inferred from the peptide level data, was analyzed because changes in overall protein expression were of most interest for the current investigation.
Principal component analysis (PCA) of the protein intensities clearly separated the three growth conditions, and the close proximity of the technical replicates showed good reproducibility (Scores plot not shown). Principal component variable grouping (PCVG) was then used to group together the proteins that show similar behavior (Figure 6, top).
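The roll-up from fragment ion areas to peptide and protein values can be sketched as follows. Summing fragment XIC peak areas per peptide follows the description above; summing peptide areas to obtain a protein value is an assumption made for illustration, since the exact protein inference used by the software is not described here. The function name and all input values are hypothetical.

```python
from collections import defaultdict


def roll_up(fragment_areas, peptide_to_protein):
    """Sum fragment XIC areas per peptide, then peptide areas per protein.

    fragment_areas: dict mapping (peptide, fragment_ion) -> XIC peak area
    peptide_to_protein: dict mapping peptide -> protein identifier
    Assumption: a protein value is the plain sum of its peptide areas.
    """
    peptide_area = defaultdict(float)
    for (peptide, _ion), area in fragment_areas.items():
        peptide_area[peptide] += area

    protein_area = defaultdict(float)
    for peptide, area in peptide_area.items():
        protein_area[peptide_to_protein[peptide]] += area

    return dict(peptide_area), dict(protein_area)


# Placeholder areas, invented for illustration only.
example_fragment_areas = {
    ("PEPTIDEA", "y5"): 100.0,
    ("PEPTIDEA", "y6"): 50.0,
    ("PEPTIDEB", "y4"): 25.0,
    ("PEPTIDEC", "b3"): 10.0,
}
example_mapping = {
    "PEPTIDEA": "ProteinX",
    "PEPTIDEB": "ProteinX",
    "PEPTIDEC": "ProteinY",
}

peptide_totals, protein_totals = roll_up(example_fragment_areas, example_mapping)
print(peptide_totals)
print(protein_totals)
```

Keeping the peptide-level totals alongside the protein-level totals mirrors the option described above of analyzing the data at either level.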
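As a rough illustration of the PCA step only, the sketch below computes the first principal component of a small samples-by-proteins table using power iteration, with no external dependencies. It is not the MarkerView™ implementation, it applies mean-centering but no scaling or weighting, and it does not attempt PCVG. The table values are invented placeholders arranged as three growth phases with three replicates each.

```python
import math


def center_columns(rows):
    """Subtract each column (protein) mean from its values."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def first_principal_component(rows, iterations=200):
    """Return (loading vector, sample scores) for PC1 via power iteration."""
    centered = center_columns(rows)
    n_samples = len(centered)
    n_vars = len(centered[0])
    # Sample covariance matrix between variables (proteins).
    cov = [[sum(r[i] * r[j] for r in centered) / (n_samples - 1)
            for j in range(n_vars)] for i in range(n_vars)]
    # Non-uniform start vector to reduce the chance of starting orthogonal
    # to the dominant direction.
    v = [float(i + 1) for i in range(n_vars)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(n_vars)) for i in range(n_vars)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(x * l for x, l in zip(row, v)) for row in centered]
    return v, scores


# Placeholder protein areas (rows: samples, columns: three proteins).
example_rows = [
    [10.0, 50.0, 30.0], [10.5, 49.5, 30.2], [9.5, 50.5, 29.8],   # early
    [20.0, 40.0, 30.1], [20.5, 39.5, 29.9], [19.5, 40.5, 30.0],  # late
    [60.0, 10.0, 30.0], [60.5, 9.5, 30.1], [59.5, 10.5, 29.9],   # stationary
]

loading, scores = first_principal_component(example_rows)
for label, start in (("early", 0), ("late", 3), ("stationary", 6)):
    print(label, [round(s, 2) for s in scores[start:start + 3]])
```

In this toy table, replicates of the same phase receive nearly identical PC1 scores while the phases are well separated, which is the pattern described for the scores plot. The sign of the component is arbitrary.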
As a major human pathogen, Mtb has evolved survival mechanisms that allow it to evade the immune response and persist in a host in a dormant condition. The dormancy survival regulon (DosR regulon) consists of 49 genes that are believed to have a major role in dormancy. The DosR transcription factor (Rv3133c) targets this regulon and has been well characterized at the transcriptome level, as it is thought to be involved in the dormant stage during latent tuberculosis.
As mentioned, the DosR transcription factor was quantified and confirmed to be up-regulated in stationary phase (blue group 1, Figure 6). There were 14 other proteins found in group 1 that showed this very similar up-regulation and are known targets of DosR (Figure 7).
Of the 49 genes in the DosR regulon, the proteins encoded by 33 were quantified in this study (Figure 8). As expected, the quantitative profiles of the majority of the DosR targets show a higher abundance in the stationary phase than at the other points in the growth curve. Some proteins, however, were found to have different expression patterns, as shown in Figure 8.
After PCVG analysis, the similarly clustered proteins appear on the Loadings plot with the same color (Figure 6, top). This simplifies exploration of the various protein behaviors; a representative protein from each color group was selected (bottom). The blue protein selected is the transcription factor DosR (Rv3133c), found up-regulated in stationary phase. In contrast, the green protein inositol 1-phosphate synthetase (Rv0046c) was found to decrease in abundance towards stationary phase. The purple and yellow clusters exhibit opposite expression patterns to each other, with yellow showing increased expression in late exponential phase and purple showing decreased expression.
SWATH Acquisition, a data independent acquisition strategy, provides a comprehensive analysis of complex proteomes, providing reproducible, high-quality quantitation on large numbers of peptides and proteins. Using a proteome-wide spectral library for M. tuberculosis with SWATH Acquisition enabled us to quantify the effects of the DosR transcription factor on its target proteins at the protein level, as well as across the whole proteome, in a single injection analysis.
- Data is interrogated using qualified spectral ion libraries, which are libraries of peptide MS/MS spectra and known retention times. Peptides are detected and extracted ion chromatograms of the fragment ions are generated for quantitation.
- Publicly available, well-characterized whole organism spectral libraries are becoming increasingly available to provide very deep analysis of the proteome with SWATH Acquisition.3
- The high-quality quantitative information obtained is both specific and reproducible from sample to sample, enabling detailed information about protein abundance to be measured and understanding of protein regulation to be expanded.