Methods
Sample preparation: Figure 2 shows a schematic of the sample preparation for this study. Briefly, 250 μg of etanercept was incubated with 40 units/μL of FabALACTICA (IgdE from Genovis) at 37°C overnight. One half of the resulting solution was treated with SialEXO (Genovis) at 37°C for 4 hours to remove sialic acid (SA), while the other half was incubated with SialEXO and OglyZOR (Genovis) under the same conditions to remove SA and O-glycans simultaneously. Subsequently, one portion of each of these two samples was incubated with N-glycanase (PNGase F from Agilent Technologies) at 37°C overnight to remove N-glycans. All four samples were further treated with 7 M guanidine-HCl in 50 mM Tris-HCl and dithiothreitol (DTT) to reduce disulfide bonds. The final solutions contained ~60 μg of etanercept subunits (~0.5 μg/μL). For LC-MS analysis, 2-4 μL (1-2 μg) of etanercept subunits was injected.
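The loading arithmetic above can be checked with a short sketch. The function names are hypothetical, and the ~120 μL final volume is not stated in the text; it is only inferred from the quoted ~60 μg and ~0.5 μg/μL.

```python
# Minimal sketch of the sample-loading arithmetic quoted in the text.
# Function names are illustrative, not part of any vendor software.

def injected_mass_ug(volume_ul: float, conc_ug_per_ul: float = 0.5) -> float:
    """Mass on column (ug) for a given injection volume (uL)."""
    return volume_ul * conc_ug_per_ul


def final_volume_ul(total_mass_ug: float = 60.0, conc_ug_per_ul: float = 0.5) -> float:
    """Final sample volume (uL) implied by total mass and concentration."""
    return total_mass_ug / conc_ug_per_ul


if __name__ == "__main__":
    for volume in (2.0, 4.0):
        print(f"{volume} uL injection -> {injected_mass_ug(volume)} ug on column")
    print(f"Implied final volume: {final_volume_ul()} uL")
```

With the quoted values, injections of 2 and 4 μL correspond to 1 and 2 μg on column, which matches the stated 1-2 μg range.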
Chromatography: Etanercept subunits were separated using an ExionLC system fitted with an ACQUITY UPLC Protein BEH C4 column (300 Å, 1.7 μm, 2.1 x 50 mm, Waters), which was kept at 60°C. Chromatographic separation was performed at a flow rate of 0.3 mL/min using an 8-minute linear gradient (Table 1). Mobile phases A and B consisted of 0.1% formic acid (FA) in water and 0.1% FA in acetonitrile, respectively.
Mass spectrometry: TOF-MS data were acquired using a SCIEX ZenoTOF 7600 system with intact protein mode enabled. Key MS settings used in this study are listed in Table 2.
Data processing: All data were analyzed using the Intact Protein workflow template in the new Biologics Explorer software from SCIEX.¹ The default processing parameters, optimized for SCIEX TOF MS data, were used except for the settings listed in Table 3, which required manual input.
Overview of subunit LC-MS workflow
Etanercept is a fusion protein consisting of two tumor necrosis factor receptor (TNFR)-Fc chains with six N-glycosylation and 26 O-glycosylation sites that are partially sialylated (Figure 3).
Structural elucidation of fully intact etanercept is challenging due to the complexity and heterogeneity introduced by O-glycosylation and sialylation. To gain insight into the structure of etanercept, the fusion protein was cleaved into subunits by IgdE, followed by removal of sialic acid using SialEXO and reduction with DTT (Figure 2), which reduced sample heterogeneity and data complexity. The monomeric TNFR, Fc, and TNFR-Fc subunits with or without N- and O-glycans were analyzed using the high-performance ZenoTOF 7600 system. While the data on subunits without glycosylation provided information on the integrity of the IgdE fragments, the results from subunits with N- and/or O-glycans enhanced the understanding of the glycosylation present in the etanercept sample used.
The high-quality LC-MS data were processed using the intuitive Biologics Explorer software, which provides powerful visualization tools for data review (Figure 4), optimized workflow templates, and proven algorithms for reliable spectrum processing and deconvolution.
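As background on what deconvolution does, the sketch below shows the textbook relationship between a neutral protein mass and the m/z values of its charge states, and how two adjacent charge-state peaks give back the mass. It is an illustration only and is not the algorithm used by Biologics Explorer. The function names, the 51,100 Da mass (rounded from the 51.1 kDa TNFR-Fc monomer) and the 30+/31+ charge states are assumptions.

```python
# Illustrative charge-state arithmetic for positive-mode ESI (protonated ions).
# This is NOT the vendor deconvolution algorithm; it only shows the principle.

PROTON_MASS = 1.007276  # Da


def mz_for_charge(neutral_mass: float, z: int) -> float:
    """m/z of the [M + zH]z+ ion for a given neutral mass."""
    return (neutral_mass + z * PROTON_MASS) / z


def neutral_mass_from_adjacent_peaks(mz_lower_charge: float, mz_higher_charge: float):
    """Infer charge and neutral mass from two adjacent charge-state peaks.

    mz_lower_charge belongs to charge z, mz_higher_charge to charge z + 1,
    so mz_lower_charge > mz_higher_charge.
    """
    z = round((mz_higher_charge - PROTON_MASS) / (mz_lower_charge - mz_higher_charge))
    neutral_mass = z * (mz_lower_charge - PROTON_MASS)
    return z, neutral_mass


if __name__ == "__main__":
    assumed_mass = 51100.0  # Da, hypothetical round value for illustration
    peak_30 = mz_for_charge(assumed_mass, 30)
    peak_31 = mz_for_charge(assumed_mass, 31)
    charge, mass = neutral_mass_from_adjacent_peaks(peak_30, peak_31)
    print(f"z = {charge}, neutral mass = {mass:.1f} Da")
```

Real spectra contain many overlapping envelopes from different glycoforms, which is why dedicated deconvolution software is used rather than a pairwise calculation like this one.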
Aglycosylated etanercept subunits
The main species detected from the fully aglycosylated and reduced etanercept (sample 4 in Figure 2) were monomeric TNFR-Fc without C-terminal lysine (Lys, 51.1 kDa) and the subunits TNFR1-187 (20.3 kDa) and TNFR1-223 (24.1 kDa), which correspond to IgdE cleavage above the hinge region at the C-terminal side of M187 and M223, respectively (Figure 4).
Additional subunits from TNFR (1-190 and 1-213) and Fc (253-467) were present at much lower abundance (Figure 5). Interestingly, species carrying up to two additional HexNAc residues were observed for TNFR1-223 and TNFR-Fc, but not for TNFR1-187. This could be attributed to differences in structure and susceptibility to PNGase F between TNFR1-187 and its longer counterparts.
The co-existence of TNFR and TNFR-Fc subunits in the same sample allowed for characterization of N- and O-glycans in different regions of etanercept, as described below.
Glycosylated TNFR subunits
Figures 6 and 7 show the aglycosylated and glycosylated forms of TNFR1-187 and TNFR1-223 observed in the four samples. The major N-glycan forms observed for the two subunits are G2+G2F and 2G2F (Figures 6B and 7B), confirming the occupation of the two available N-glycosylation sites in the TNFR domain. Minor N-glycoforms with two Man5 were also detected (Figures 6B and 7B). Interestingly, drastic differences were observed between TNFR1-187 and TNFR1-223 in the samples without sialic acid and N-glycans (Figures 6C and 7C). While one O-glycan (core 1) was observed for TNFR1-187 (Figure 6C), up to nine O-glycan moieties were detected for TNFR1-223 (Figure 7C). These results indicate that most O-glycosylation sites are located in the TNFR187-223 region, which contains ten Ser/Thr residues. Species containing both N- and O-glycans were observed in the samples treated only with sialidase (Figures 6D and 7D). The dominant forms are G2+G2F plus one O-glycan for TNFR1-187 (Figure 6D) and G2+G2F plus eight or nine O-glycans for TNFR1-223 (Figure 7D). Taken together, these results not only confirm the dominant N-glycan forms, but also pinpoint the region of O-glycosylation in etanercept.
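The glycoform assignments above rest on mass shifts relative to the aglycosylated subunit. The sketch below computes those shifts from standard average monosaccharide residue masses. The compositions assigned to G2, G2F, Man5 and the asialo core 1 O-glycan (HexHexNAc) follow common glycan nomenclature and are assumptions here, since the text does not list them. The function names are hypothetical.

```python
# Sketch: expected average mass shifts for the glycoforms named in the text.
# Residue masses are standard average values (monosaccharide minus water), in Da.
RESIDUE_MASS = {"Hex": 162.1406, "HexNAc": 203.1925, "dHex": 146.1412}

# Assumed compositions (common nomenclature, not taken from the source).
COMPOSITION = {
    "G2": {"HexNAc": 4, "Hex": 5},
    "G2F": {"HexNAc": 4, "Hex": 5, "dHex": 1},
    "Man5": {"HexNAc": 2, "Hex": 5},
    "core1": {"HexNAc": 1, "Hex": 1},  # asialo core 1 O-glycan (HexHexNAc)
}


def glycan_mass(name: str) -> float:
    """Average mass added to the protein by one glycan of the given type."""
    return sum(RESIDUE_MASS[res] * n for res, n in COMPOSITION[name].items())


def glycoform_shift(glycans: list) -> float:
    """Total mass shift for a glycoform given as a list of glycan names."""
    return sum(glycan_mass(name) for name in glycans)


if __name__ == "__main__":
    glycoforms = {
        "G2+G2F": ["G2", "G2F"],
        "2G2F": ["G2F", "G2F"],
        "2Man5": ["Man5", "Man5"],
        "G2+G2F + 1 O-glycan": ["G2", "G2F", "core1"],
        "9 O-glycans": ["core1"] * 9,
    }
    for label, glycans in glycoforms.items():
        print(f"{label}: +{glycoform_shift(glycans):.2f} Da")
```

Adding a shift to the mass of the corresponding aglycosylated subunit gives the expected deconvoluted mass of that glycoform, which can then be compared with the measured peaks.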
Glycosylated TNFR-Fc monomers
The aglycosylated or glycosylated TNFR-Fc monomers without C-terminal Lys were among the dominant species in all samples due to incomplete cleavage by IgdE. The presence of TNFR-Fc monomers, however, allowed for the simultaneous characterization of N- and O-glycosylation (Figure 8), providing results complementary to those described above for the TNFR fragments (Figures 6 and 7).
The monomeric TNFR-Fc carried one additional N-linked G0F or G1F in the Fc domain (Figure 8B) compared to TNFR1-187 and TNFR1-223 (Figure 6B and 7B). The dominant O-glycans observed for TNFR-Fc contain 9 and 10 HexHexNAc structures (Figure 8C and 8D). Compared to the results from TNFR1-223, 1-2 more O-glycans were observed for TNFR-Fc, indicating additional O-glycosylation site(s) close to or in the hinge region.
In summary, the subunit data of etanercept revealed that this therapeutic protein carries G2+G2F or 2G2F as the dominant N-glycan forms in the TNFR domain and G0F or G1F in the Fc domain. One O-glycan was detected in the sequence of TNFR1-187, with the majority of O-glycans (8 out of 13) located between amino acids 187 and 223 above the hinge region (Figure 9).
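The localization of O-glycans follows from differences between nested fragments: each longer fragment contains the shorter one, so the extra O-glycans must sit in the added stretch of sequence. The sketch below uses the maximum O-glycan counts reported in this note (1 for TNFR1-187, 9 for TNFR1-223, 10 for TNFR-Fc). Using only the maxima is a simplification, and the function name and region labels are hypothetical.

```python
# Sketch: assign O-glycans to sequence regions by subtracting counts
# observed on nested fragments (shortest fragment first).

def o_glycans_per_region(nested_fragments):
    """Return {region_label: count} from cumulative counts on nested fragments.

    nested_fragments is a list of (region_label, max_count) tuples, where
    max_count is the maximum number of O-glycans observed on the fragment
    that ends with that region.
    """
    per_region = {}
    previous_count = 0
    for region_label, max_count in nested_fragments:
        per_region[region_label] = max_count - previous_count
        previous_count = max_count
    return per_region


if __name__ == "__main__":
    fragments = [
        ("residues 1-187", 1),       # TNFR1-187
        ("residues 188-223", 9),     # TNFR1-223
        ("beyond residue 223", 10),  # TNFR-Fc monomer
    ]
    for region, count in o_glycans_per_region(fragments).items():
        print(f"{region}: {count} O-glycan(s)")
```

With these inputs the differences are one O-glycan in residues 1-187, eight between residues 188 and 223, and one beyond residue 223, in line with the eight O-glycans assigned to the 187-223 region in the summary.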