Methods
Cell Preparation and RNA/Protein Isolation: Undifferentiated hESCs (UCSF4 line) were cultured in mTeSR Medium (StemCell Technologies). NPCs were derived using a previously established method (Swistowski et al., 2009; Robinson et al., 2016) and grown in Neurobasal Medium (Gibco) containing NEAA, L-Glutamine, penicillin/streptomycin, B27, FGF2, and LIF. Cells were cultured at 37 °C in 5% CO2 and 8% O2. Total RNA was obtained using the Qiagen RNeasy Micro Plus RNA Isolation Kit. RNA quality was examined using the Agilent RNA 6000 Nano LabChip Kit and Bioanalyzer 2100 system (RIN > 9). For protein, cells were collected in 1% SDS and 50 mM ammonium bicarbonate and quantified using the Pierce BCA Kit. All samples were stored at -80 °C before further processing.
Transcriptomic analysis: We isolated RNA from hESCs and NPCs and evaluated relative expression using the Affymetrix Human Gene 2.0 ST microarray platform. Hybridization and array scanning were performed at the UCSF Gladstone (NHLBI) Genomics Core Facility. The fluorescent signal intensity images produced during Affymetrix GeneChip hybridizations were read using the Affymetrix Model 3000 Scanner and converted into GeneChip probe results files (CEL) using Command and Expression Console software (Affymetrix). CEL files were RMA normalized, and differential gene expression between the two cell populations was determined.
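RMA combines background correction, quantile normalization, and median-polish summarization. As an illustration of the quantile normalization step only, a minimal sketch is given below. It is not the Expression Console implementation; the function name and the input layout (one list of log2 intensities per array) are assumptions, and ties are simply broken by position.

```python
def quantile_normalize(arrays):
    """Force every array to share the same intensity distribution.

    `arrays` is a list of equal-length lists, one per microarray
    (hypothetical layout; real CEL data would first be parsed by a
    dedicated package).
    """
    n_probes = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean of the k-th smallest value across arrays.
    reference = [
        sum(sa[k] for sa in sorted_arrays) / len(arrays)
        for k in range(n_probes)
    ]
    normalized = []
    for a in arrays:
        # Probe indices ordered from lowest to highest intensity.
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, probe in enumerate(order):
            out[probe] = reference[rank]
        normalized.append(out)
    return normalized
```

After this step every array contains the same set of values, so remaining differences between arrays reflect probe rank rather than overall signal scale.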
Protein sample preparation: Protein isolated from the hESCs and NPCs was reduced in 5 mM Tris(2-carboxyethyl)phosphine hydrochloride (TCEP), alkylated with 10 mM iodoacetic acid (IAA), trypsinized (Promega, Sequencing Grade), and processed using Pierce Detergent Removal Spin Columns according to the manufacturer's specifications. Collections were vacuum-dried and re-suspended at a concentration of 1 µg/µL in 98% HPLC-grade water, 2% HPLC-grade acetonitrile, and 0.1% formic acid for MS analyses. For library generation, a pool of all the digested samples was fractionated into 8 fractions using SCX Spin Tips (Protea) according to the manufacturer's protocol.
Chromatography: Separations of the tryptic peptides from digested samples and sample fractions were performed on a NanoLC™ 425 System (SCIEX) in trap elute mode, using a 75 µm x 150 mm column and a 0.35 x 0.5 mm trap (both ChromXP™ C18CL, 5 µm, 120 Å phase - SCIEX). A linear gradient of 5-35% over 90 min with a flow rate of 300 nL/min was used and the column was maintained at 35 °C. Mobile phase A was 100% water with 0.1% formic acid. Mobile phase B was 100% acetonitrile with 0.1% formic acid.
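As a worked illustration of the gradient above, the sketch below computes the nominal solvent composition at a given time and the total volume delivered. It assumes that the 5-35% figures refer to mobile phase B and that the ramp is perfectly linear starting at t = 0; the function names are hypothetical and do not correspond to any instrument setting.

```python
def percent_b(t_min, start_b=5.0, end_b=35.0, duration_min=90.0):
    """Nominal %B at time `t_min` for a linear ramp (assumed to start at 0 min)."""
    # Clamp so that times outside the ramp return the endpoint compositions.
    t = min(max(t_min, 0.0), duration_min)
    return start_b + (end_b - start_b) * t / duration_min


def volume_delivered_ul(duration_min=90.0, flow_nl_per_min=300.0):
    """Total volume pumped over the gradient, in microlitres."""
    return duration_min * flow_nl_per_min / 1000.0
```

Under these assumptions, the composition reaches 20% B at the midpoint (45 min), and 27 µL of mobile phase is delivered over the 90 min gradient at 300 nL/min.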
Mass spectrometry: MS analyses were performed using either data-dependent acquisition (DDA) or SWATH® Acquisition on a TripleTOF® 6600 System equipped with a NanoSpray® Source (SCIEX). Variable Q1 window SWATH Acquisition methods (100 windows) [3] were built in high sensitivity MS/MS mode with Analyst® TF Software 1.7.1.
Data processing: DDA data were processed with ProteinPilot™ Software and the group file was used as the spectral ion library. Library and SWATH acquisition data were uploaded to the SCIEX cloud and processed using the OneOmics Suite of tools (Figure 2). SWATH acquisition data extraction was followed by most likely ratio (MLR) normalization and fold change (FC) calculations [2]. Protein quantitation data were compared to transcriptomic data using comparison tools in the OneOmics Suite and iPathwayGuide (Advaita).
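The exact MLR algorithm used in the OneOmics Suite is not described here. To illustrate the general idea of normalizing on the most frequently observed ratio, a simplified stand-in is sketched below: it bins the log2 ratios between a sample and a reference, takes the most populated bin as the most likely ratio, and rescales the sample accordingly. All function names, the bin width, and the simple fold change calculation are assumptions for illustration.

```python
import math
from collections import Counter


def most_likely_log2_ratio(sample, reference, bin_width=0.1):
    """Centre of the most populated bin of log2(sample / reference)."""
    log_ratios = [
        math.log2(s / r)
        for s, r in zip(sample, reference)
        if s > 0 and r > 0  # skip missing or zero peak areas
    ]
    if not log_ratios:
        raise ValueError("no positive paired values to compare")
    counts = Counter(math.floor(x / bin_width) for x in log_ratios)
    best_bin = max(counts, key=counts.get)
    return (best_bin + 0.5) * bin_width


def normalize_to_reference(sample, reference, bin_width=0.1):
    """Rescale `sample` so its most likely ratio to `reference` is close to 1."""
    factor = 2.0 ** most_likely_log2_ratio(sample, reference, bin_width)
    return [s / factor for s in sample]


def log2_fold_change(group_a, group_b):
    """log2 of the ratio of group means (no confidence estimate)."""
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    return math.log2(mean_a / mean_b)
```

Because the scaling factor comes from the mode of the ratio distribution rather than its mean, a small number of strongly changing proteins has little influence on the normalization.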
Quality assessment of SWATH Acquisition data using the Analytics App
Figure 3 shows a few of the many tools for assessing MS data quality provided in the Analytics application in Workspaces in the OneOmics Suite. Good discrimination was observed in the false discovery rate (FDR) analysis, as shown by the peak group score distributions of the forward and decoy peptides. The reproducibility between the technical/biological replicates in this experiment centered around 10% CV, as seen in the %CV vs frequency plots. MLR normalization is performed during the Assembler data processing, and the samples showed good alignment after normalization, as seen by comparing the ratio distributions before and after normalization. Many more figures are provided to allow the MS operator to confirm the quality of the SWATH Acquisition data.
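The two quality metrics discussed above can be sketched as follows. This is a minimal illustration, not the calculation performed by the Analytics application: the function names are hypothetical, and the FDR estimate assumes a simple target-decoy approach with equally sized forward and decoy sets.

```python
from statistics import mean, stdev


def percent_cv(replicate_areas):
    """Coefficient of variation across replicates, in percent."""
    return 100.0 * stdev(replicate_areas) / mean(replicate_areas)


def decoy_fdr(forward_scores, decoy_scores, threshold):
    """Estimated FDR at a score threshold: accepted decoys / accepted forwards."""
    forward_hits = sum(score >= threshold for score in forward_scores)
    decoy_hits = sum(score >= threshold for score in decoy_scores)
    return decoy_hits / forward_hits if forward_hits else 0.0
```

For example, replicate peak areas of 90, 100, and 110 give a CV of 10%, the value around which the replicates in this experiment were centered.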
Visualizing the protein expression changes
Using a library generated from a pool of NPCs and hESCs, 2278 proteins were reliably quantified using SWATH Acquisition across the 2 sample types (3 biological replicates of each). The Browser application in Workspaces can be used to explore the protein expression data; after filtering (2 or more peptides per protein, protein fold change confidence > 70%), 280 proteins were found to be differentially expressed. Figure 4 (right) shows the heat map for these 280 proteins. Example data for a protein up-regulated in the NPCs (Dihydropyrimidinase-related protein 5, DPYL5) is shown (top left); 8 peptides were measured for this protein. DNA methyltransferase 3 beta (DNMT3B) is down-regulated, with 4 peptides quantified. Both proteins are believed to be involved in neuronal development.
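The filtering criteria described above (2 or more peptides per protein, fold change confidence > 70%) can be expressed as a short sketch. The record layout and field names are hypothetical and do not reflect the Browser application's export format.

```python
from dataclasses import dataclass


@dataclass
class ProteinResult:
    accession: str
    n_peptides: int       # peptides quantified for the protein
    log2_fc: float        # NPC vs hESC fold change, log2 scale
    fc_confidence: float  # fold change confidence as a fraction (0 to 1)


def differentially_expressed(results, min_peptides=2, min_confidence=0.70):
    """Keep proteins with enough peptides and confidence above the cutoff."""
    return [
        r for r in results
        if r.n_peptides >= min_peptides and r.fc_confidence > min_confidence
    ]
```

Note that the confidence comparison is strict, so a protein at exactly 70% confidence is excluded, consistent with the "> 70%" criterion stated above.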
Once proteins of interest are identified, more information on each can easily be obtained. Information from UniProt is pulled into the Browser session, including protein sequence and ontology information. As an example, another protein (Cadherin 2) was found to be up-regulated in NPCs in both the protein and RNA data. This protein is involved in neurogenesis, playing a role in development of the nervous system and formation of cartilage and bone. Three peptides were quantified for this protein, and they spanned both the cytoplasmic and extracellular domains (Figure 5). When available, information on potential post-translational modification sites and many other sequence features is displayed in this view.
Finding trends in the protein expression data
The protein expression data can also be mined with multivariate statistics using the MarkerView™ Software in Workspaces. Here, sets of proteins showing large or small up- or down-regulation were found, as seen on the Loadings plot (Figure 6, top) from the PCA-PCVG analysis. The orange PCVG group contains a set of proteins showing larger up-regulation in the NPCs vs the hESCs. Protein area data for six proteins from this group are shown (Figure 6, bottom left). Conversely, the brown group contains a set of proteins showing a small decrease in expression in NPCs vs hESCs, and five proteins from this group are shown (bottom right).
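To illustrate how loadings relate proteins to the separation between sample groups, the sketch below computes the first principal component of a small samples-by-proteins matrix by power iteration. It is not the MarkerView implementation and does not reproduce the PCVG grouping step; the function name, the mean-centering without scaling, and the starting vector are assumptions.

```python
import math


def first_principal_component(rows, n_iter=200):
    """Return (loadings, scores) for PC1 of `rows` (samples x proteins)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix between proteins (p x p).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]
    # Uneven start vector, to reduce the chance of starting orthogonal to PC1.
    v = [1.0 / (j + 1) for j in range(p)]
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance in the data
            break
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
    return v, scores
```

Proteins with large-magnitude loadings drive the separation between groups along PC1, and proteins whose loadings have opposite signs change in opposite directions. The overall sign of a principal component is arbitrary.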
Understanding the biology
To begin to dig into the underlying biology, the perturbed ontologies can be visualized using the ontology wheels in the Browser application (Figure 7). The ontologies are first retrieved for each protein from UniProt. Data are then filtered using protein fold change confidence and the number of proteins per ontology. In comparing the differentially expressed proteins between NPCs and hESCs, it was observed that the nervous system development ontology was strongly perturbed. There were 13 proteins measured in this particular ontology, and 12 of these were up-regulated in NPCs. This observation is expected, as these cells are being actively differentiated from stem cells into neuronal cells.
Comparing protein and RNA expression data
The comparison of the protein expression results to the previously obtained RNA expression data was performed using iPathwayGuide (Advaita). The protein and RNA data showing significant expression changes were loaded and aligned. After alignment, it was found that 144 proteins/RNAs were significant in both datasets (Figure 8, top). Of these 144, 37 mapped to biological processes involving neuronal development (Figure 8, bottom). Very good correlation of expression was seen between these proteins and genes.
There were an additional 197 proteins that had significant differential expression at the protein level but not at the RNA level; 26 of these also mapped to neuronal processes (Figure 8, bottom). These data are also plotted in Figure 1, which further highlights the good correlation observed between the protein and RNA data.
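The alignment and comparison described in this section can be sketched as set operations on the two lists of significant entries, followed by a correlation of fold changes for the shared entries. This is a minimal illustration, not the iPathwayGuide method; the function names and the input layout (a dictionary of gene symbol to log2 fold change for each dataset) are assumptions.

```python
import math


def split_by_overlap(protein_fc, rna_fc):
    """Return (shared, protein_only) gene symbols from two significance lists."""
    shared = sorted(set(protein_fc) & set(rna_fc))
    protein_only = sorted(set(protein_fc) - set(rna_fc))
    return shared, protein_only


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fold_change_correlation(protein_fc, rna_fc):
    """Correlate protein and RNA log2 fold changes over the shared entries."""
    shared, _ = split_by_overlap(protein_fc, rna_fc)
    return pearson([protein_fc[g] for g in shared], [rna_fc[g] for g in shared])
```

In this layout, the shared list corresponds to the entries significant in both datasets, and the protein-only list corresponds to the entries significant at the protein level but not at the RNA level.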
Conclusions
A project was performed to evaluate the changes observed at the RNA and protein level between human embryonic stem cells and the differentiated neural progenitor cells. The cloud-based tools for multi-omics analysis in the OneOmics Suite were used to explore the dataset.
- Identified significant differential expression of 280 proteins with 2 or more peptides from SWATH Acquisition data
  - Filtered at 70% fold change confidence
- Significant correlation between changes in protein and RNA levels was found between NPCs and hESCs
  - Many of these are involved in neurogenesis
- 26 proteins that mapped to neuronal development processes were differentially expressed in the protein data but not in the RNA data, highlighting the importance of measuring changes at the protein level.