Chicken egg yolk cytoplasmic proteome, mined via combinatorial peptide ligand libraries


Journal of Chromatography A, 1216 (2009) 1241–1252


Journal of Chromatography A journal homepage: www.elsevier.com/locate/chroma

Alessia Farinazzo a, Umberto Restuccia b, Angela Bachi b, Luc Guerrier c, Frederic Fortis c, Egisto Boschetti c, Elisa Fasoli a, Attilio Citterio a, Pier Giorgio Righetti a,∗

a Department of Chemistry, Materials and Chemical Engineering “Giulio Natta”, Politecnico di Milano, Via Mancinelli 7, 20131 Milan, Italy
b San Raffaele Scientific Institute, 20132 Milan, Italy
c Bio-Rad Laboratories, C/o CEA-Saclay-DSV, iBiTec-S, 91191 Gif-sur-Yvette, France

Article info

Article history: Available online 27 November 2008

Keywords: Egg yolk; Peptide libraries; Hexapeptide ligands; Low-abundance proteome; Mass spectrometry

Abstract

The use of combinatorial peptide ligand libraries (CPLLs), containing hexapeptides terminating with a primary amine, or modified with a terminal carboxyl group, or with a terminal tertiary amine, allowed discovering and identifying a large number of previously unreported egg yolk proteins. Whereas the most comprehensive list up to date [K. Mann, M. Mann, Proteomics, 8 (2008) 178–191] tabulated about 115 unique gene products in the yolk plasma, our findings have more than doubled this value to 255 unique protein species. From the initial non-treated egg yolk it was possible to find 49 protein species; the difference was generated thanks to the use of the three combined CPLLs. The aberrant behaviour of some proteins, upon treatment via the CPLL method, such as proteins that do not interact with the library, is discussed and evaluated. Simplified elution protocols from the CPLL beads are taken into consideration, of which direct elution in a single step via sodium dodecyl sulphate desorption seems to be quite promising. Alternative methods are suggested. The list of egg yolk components here reported is by far the most comprehensive at present and could serve as a starting point for isolation and functional characterization of proteins possibly having novel pharmaceutical and biomedical applications. © 2008 Elsevier B.V. All rights reserved.

1. Introduction The egg yolk is the part of an egg which serves as the food source for the developing embryo inside. Prior to fertilization the yolk together with the germinal disc is a single cell. Mammalian embryos live off their yolk until they implant on the wall of the uterus. The egg yolk is suspended in the egg white (known popularly as albumen) by one or two spiral bands of tissue called the chalazae. As a food source, yolks are a major reservoir of vitamins and minerals. They contain all of the egg’s fat and cholesterol, and almost half of the protein. In chickens the yolk makes up about 33% of the liquid weight of the egg; it contains approximately 60 calories, three times the caloric content of the egg white. All of the fat soluble vitamins (A, D, E and K) are found in the egg yolk, including vitamin D, which is a rare item to be found in natural foods. Yolk is quite rich in unsaturated (oleic acid, 47%; linoleic acid, 16%; palmitoleic acid, 5%; linolenic acid, 2%) as well as saturated (palmitic acid, 23%; stearic acid, 4%; myristic acid, 1%) fatty acids [1,2]. Egg yolk is a source of lecithin, an emulsifier; its yellow color is caused by lutein and zeaxanthin, carotenoids known as xanthophylls.

∗ Corresponding author. Fax: +39 02 239 930 80. E-mail address: [email protected] (P.G. Righetti). 0021-9673/$ – see front matter © 2008 Elsevier B.V. All rights reserved. doi:10.1016/j.chroma.2008.11.051

Interestingly, artists have been using egg proteins for several millennia as binders for their pigments. The ancient Greeks and Egyptians painted with a medium known as tempera, which was based on a proteinaceous binder to which the pigments were added. Egg yolk was the most popular binder, but casein (from milk), gelatine (from collagen extracted from animal bones and cartilage) and albumin were also used. In the 16th century, the tempera technique began to be superseded by oil painting and suffered a sharp decrease in popularity, although it remains a recognized practice today. Perhaps the first mini-proteomic study on eggs came from Tokarski et al. [3], who analysed micro-scrapings from Renaissance paintings by nanoLC/nano-electrospray ionization (nanoESI)/quadrupole time-of-flight (Qq-TOF) MS/MS and detected ovotransferrin, ovalbumin and vitellogenin. In recent times, there has been a burst of investigations on the egg proteome. Although a first report by Guérin-Dubiard et al. [4] on egg white detected only 16 gene products by two-dimensional (2D) mapping and mass spectrometry (MS), a more comprehensive study increased the total count to 78 unique species [5], to be superseded most recently by D’Ambrosio et al. [6] who, in the same system, were able to count up to 148 unique protein species. The situation has been rapidly evolving also in the case of egg yolk. Mann and Mann [7], via an in-depth analysis by FT-ICR-MS, found a total of 119 unique protein species in egg yolk, of which 100 were detected in the plasma (soluble) fraction and 89 in the granular (or insoluble) fraction, with 70 proteins in common. In a further study by this group [8], 528 proteins in the decalcified eggshell organic matrix were also recognized. Moreover, in the eggshell soluble matrix 39 phosphoproteins containing more than 150 different phosphorylation sites were identified, of which 22 had not been recognized as phosphoproteins previously. In yet another report, Mann [9], in the analysis of chicken egg vitelline membrane, detected a total of 137 proteins, only 13 of which had been recognized previously as components of this membrane. The extraordinary catch in total protein count of egg white by D’Ambrosio et al. [6] was made possible by applying the now well-ingrained combinatorial peptide ligand library (CPLL) composed of hexapeptides, as originally described by Thulasiraman et al. [10] (for reviews, see refs. [11–15]). With the understanding that the total protein count of Mann and Mann [7], albeit impressive, might not quite represent the full proteome of egg yolk, we have applied our CPLL technique to this biological sample. The results here reported are quite unique in that they have more than doubled the list of species identified previously.

2. Materials and methods 2.1. Materials The solid-phase combinatorial peptide library (ProteoMiner or Library-1), its carboxylated version (Library-2), another variant with a tertiary, terminal amino group (Library-3), as well as materials for electrophoresis such as gel plates and reagents and ProteinChip arrays were from Bio-Rad Labs (Hercules, CA, USA). Libraries 2 and 3 are not commercially available; they were prepared for the purpose of the present study. N-Ethylmaleimide, urea, thiourea, 3-(3-cholamidopropyl dimethylammonio)-1-propanesulfonate (CHAPS), isopropanol, acetonitrile, trifluoroacetic acid and sodium dodecyl sulphate (SDS) were all from Sigma–Aldrich (St. Louis, MO, USA).
Complete protease inhibitor cocktail tablets were from Roche Diagnostics (Basel, Switzerland). Sequencing grade bovine trypsin was from Promega (Madison, WI, USA). All other chemicals were also from Aldrich and were of analytical grade. 2.2. Sample collection and ProteoMiner treatment Unfertilized chicken eggs, freshly laid, were used. For preparing the yolk plasma proteins, the method of Mann and Mann [7] (in turn taken from Ahn et al. [16]) was followed. Briefly, the yolk, very carefully separated from the egg white to prevent contamination, was extensively washed with distilled water, the membrane punctured with a blunt pipette and the yolk removed by suction. Nine eggs were treated this way. After diluting in distilled water (1:6, v/v, in the presence of two tablets of complete protease inhibitor cocktail) and adjusting the pH to 5, the sample was centrifuged at 10,000 × g for 30 min at 4 °C. The supernatant, extensively dialysed at pH 5 (obtained by addition of acetic acid) and lyophilized, was the yolk plasma protein fraction. The pellet (granular or globular fraction) was not treated any further, since it was not amenable to ProteoMiner treatment. The lyophilized powder (1 g total protein) was dissolved in 50 mL of PBS (25 mM phosphate buffer, pH 7.2, containing 150 mM NaCl) under gentle stirring until all proteins were solubilized. The entire solution was put in contact with 0.5 mL of Library-1 (ProteoMiner) and shaken for 3 h at room temperature. Library-1 was washed twice to remove excess soluble proteins and the captured proteins were eluted with three sequential eluants, as follows: TUC solution (2 M thiourea, 7 M urea, 2% CHAPS), UCA solution (9 M urea, citric acid up to pH 3.3 and 2% CHAPS) and a hydro-organic solution (OS) composed of 6% (v/v) acetonitrile, 12% (v/v) isopropanol, 10% (v/v) of 20% ammonia and 72% (v/v) water. The supernatant from Library-1 was then mixed with 0.5 mL of Library-2 and again gently agitated for 3 h. After filtration, Library-2 was washed twice with PBS to remove non-adsorbed proteins and the captured species were eluted according to the same procedure used for Library-1. Finally, the supernatant from Library-2 was contacted with 0.5 mL of Library-3 and the column eluted as in the previous steps, after washing in PBS. The nine eluates were immediately neutralized, submitted to protein content analysis by the Bradford-Lowry standard spectrophotometric method, desalted by dialysis at 4 °C against a 10 mM ammonium carbonate solution (dialysis membrane cut-off was 1000 Da) and lyophilized. 2.3. One-dimensional electrophoresis A 10-µL volume of each sample (corresponding to ca. 30 µg protein) was mixed with 10 µL of Laemmli buffer [17] (4% SDS, 20% glycerol, 10% 2-mercaptoethanol, 0.004% bromophenol blue and 0.125 M Tris–HCl, pH approx. 6.8). The mixture was heated in boiling water for 5 min and immediately loaded in the gel. The SDS-polyacrylamide gel electrophoresis (PAGE) slab was composed of a stacking gel (125 mM Tris–HCl, pH 6.8, 0.1% SDS) with a large-pore polyacrylamide (4%) cast over the resolving gel (8–18% acrylamide gradient in 375 mM Tris–HCl, pH 8.8, 0.1% SDS buffer). The cathodic and anodic compartments were filled with Tris–glycine buffer, pH 8.3, containing 0.1% SDS. Electrophoresis was at 100 V until the dye front reached the bottom of the gel. Staining and destaining were performed with Colloidal Coomassie Blue [18] and 7% acetic acid in water, respectively. The SDS-PAGE gels were scanned with a VersaDoc image system (Bio-Rad). For MS analysis, 60–120 µg of total proteins were loaded per track; at the end of the run, 14 slices were excised that covered the whole gel resolving region. 2.4.
Surface-enhanced laser desorption ionization-mass spectrometry Protein fractions at appropriate concentration, i.e. 0.02 µg/µL, were deposited upon IMAC ProteinChip Array surfaces, using a Bioprocessor device. IMAC (immobilized metal ion affinity chromatography) chip surfaces were loaded with copper ions or with other transition metal ions such as Fe3+, Ga3+ and Zr4+. Different types of binding buffers were tested: 100 mM Tris–HCl, pH 7.5, containing 250 mM NaCl; 100 mM sodium acetate, pH 4, containing 250 mM NaCl; and 100 mM sodium acetate, pH 4, containing 50 mM phosphate and 250 mM NaCl (competition buffer). Each array contained eight distinct spots over which the adsorption of protein could be performed. After applying the samples (starting material and the first two eluates from each combinatorial peptide ligand library), chip surfaces were washed to remove non-associated proteins, dried and prepared for the analysis by application of 1 µL of energy-absorbing matrix solution composed of a saturated solution of sinapinic acid in 50% acetonitrile and 0.5% trifluoroacetic acid. All arrays were then analyzed with a PCS 4000 ProteinChip Reader. The instrument was used in positive ion mode, with an ion acceleration potential of 20 kV and a detector gain voltage of 2 kV. The mass range investigated was from 3 to 20 kDa. Laser intensity was set between 200 and 250 units according to the sample tested. The instrument was mass calibrated with a kit of standard masses (“All-in-one protein standard”). 2.5. 2D-PAGE analysis The desired volume of each non-treated sample and eluate was solubilized in the “2D sample buffer” (7 M urea, 2 M thiourea, 3% CHAPS, 40 mM Tris) to a final concentration of 2 mg/mL protein and the disulphide bridge reduction allowed to proceed at room temperature for 60 min by addition of TCEP [tris(2-carboxyethyl)phosphine hydrochloride] at a final concentration of 5 mM. For alkylating reduced –SH groups, 150 mM De-Streak [bis(2-hydroxyethyl)disulphide, (HOCH2CH2)2S2] (diluted directly from the 8.175 M stock, Sigma–Aldrich) was added to the solution, followed by 0.5% Ampholine (diluted directly from the 40% stock solution) and a trace amount of bromophenol blue. Seven-cm long IPG strips (Bio-Rad), pH 3–10 L, were rehydrated with 150 µL of protein solution for 4 h. Isoelectric focusing (IEF) was carried out with a Protean IEF Cell (Bio-Rad) in a linear voltage gradient from 100 to 1000 V for 5 h, 1000 V for 4 h, followed by an exponential gradient up to 5000 V, for a total of 25 kVh. For the second dimension, the IPG strips were equilibrated for 25 min in a solution containing 6 M urea, 2% SDS, 20% glycerol, 375 mM Tris–HCl (pH 8.8) under gentle shaking. The IPG strips were then laid on a 7.5–18% or on a 7.5–22% acrylamide gradient SDS-PAGE gel slab with 0.5% agarose in the cathodic buffer (192 mM glycine, 0.1% SDS and Tris–HCl to pH 8.3) [19]. The electrophoretic run was at 5 mA/gel for 1 h, followed by 10 mA/gel for 1 h and 15 mA/gel until the dye front reached the gel bottom. Gels were incubated in a colloidal Coomassie Blue solution and destaining was performed in 7% acetic acid until the background was clear, followed by a rinse in pure water. The 2-DE gels were scanned with a Versa-Doc image system (Bio-Rad), by fixing the acquisition time at 10 s; the relative gel images were captured via the PDQuest software (Bio-Rad). After filtering the gel images to remove the background, spots were automatically detected, manually edited and then counted. 2.6.
Protein identification by nanoLC MS/MS The various sample lanes of SDS-PAGE gels were cut in 14 pieces of about 0.5 cm along the migration path, and proteins were reduced by 10 mM dithiothreitol (DTT) and alkylated by 55 mM iodoacetamide. The gel pieces were shrunk in acetonitrile and dried under vacuum; proteins were digested overnight with bovine trypsin as described elsewhere [20]. The tryptic mixtures were acidified with formic acid up to a final concentration of 10%. Five µL of tryptic digest for each band were injected in a capillary chromatographic system (EasyLC, Proxeon Biosystems, Odense, Denmark). Peptide separations occurred on a homemade reversed-phase spraying fused-silica capillary column (10 cm × 75 µm I.D.), packed with 3-µm ReproSil 100C18 (Dr. Maisch GmbH, Germany). A gradient of eluents A (H2O with 2% (v/v) ACN, 0.1% (v/v) formic acid) and B (ACN with 2% (v/v) H2O and 0.1% (v/v) formic acid) was used to achieve separation, from 8% B (at 0 min, 0.2 µL/min flow rate) to 50% B (at 80 min, 0.2 µL/min flow rate). The LC system was connected to an LTQ-Orbitrap mass spectrometer (ThermoScientific, Bremen, Germany) equipped with a nano-electrospray ion source (Proxeon Biosystems). Full scan mass spectra were acquired in the LTQ Orbitrap mass spectrometer with the resolution set to 60,000. For accurate mass measurements the lock-mass option was used [21]. The acquisition mass range for each sample was from m/z 350 to 1500 and the analyses were made in duplicate. The four most intense doubly and triply charged ions were automatically selected and fragmented in the ion trap. Target ions already selected for MS/MS were dynamically excluded for 60 s. 2.7. Data analysis Tandem mass spectra were extracted by Raw2MSM, version 1.5 2007.02.22. All MS/MS samples were analyzed using Mascot (Matrix Science, London, UK; version 2.1.04) and X! Tandem (www.thegpm.org; version 2007.01.01.1).
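The data-dependent acquisition rule stated in Section 2.6 (the four most intense doubly and triply charged ions per survey scan, with a 60 s dynamic exclusion) can be sketched in Python as follows. This is only an illustration of the selection logic, not the instrument's control software: the names `Peak` and `select_precursors` are hypothetical, and the m/z matching window `mz_tol` used to recognise an already-selected ion is an assumption, since the exclusion width is not given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    mz: float         # mass-to-charge ratio of the precursor
    intensity: float  # signal intensity in the survey scan
    charge: int       # assigned charge state


def select_precursors(peaks, now_s, excluded, top_n=4,
                      allowed_charges=(2, 3), exclusion_s=60.0,
                      mz_tol=0.01):
    """Pick up to top_n precursors for MS/MS from one survey scan.

    excluded maps an m/z value to the time (s) at which it was last
    selected. mz_tol is an assumed matching window, not a value from
    the paper. Returns (chosen peaks, updated exclusion map).
    """
    # Keep only exclusion entries that are still within the 60 s window.
    active = {mz: t for mz, t in excluded.items() if now_s - t < exclusion_s}
    # Candidates: allowed charge states, not currently excluded.
    candidates = [
        p for p in peaks
        if p.charge in allowed_charges
        and not any(abs(p.mz - mz) <= mz_tol for mz in active)
    ]
    # Most intense first, then truncate to top_n.
    candidates.sort(key=lambda p: p.intensity, reverse=True)
    chosen = candidates[:top_n]
    # Newly selected ions enter the exclusion list at the current time.
    for p in chosen:
        active[p.mz] = now_s
    return chosen, active
```

Calling the function once per survey scan and passing the returned exclusion map into the next call reproduces the behaviour described: an abundant ion is fragmented once and then ignored for 60 s, which frees instrument time for less intense precursors.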
The searches were done against the International Protein Index (IPI) chicken database [22], assuming trypsin as the digestion enzyme. X! Tandem was searched with a fragment ion mass tolerance of 0.100 Da and a parent ion tolerance of 10.0 ppm. Mascot was searched with a fragment ion mass tolerance of 0.80 Da and a parent ion tolerance of 10.0 ppm. The iodoacetamide derivative of cysteine was specified as a fixed modification, and oxidation of methionine as a variable modification. Scaffold (version Scaffold-2 00 01, Proteome Software, Portland, OR, USA) was used to validate MS/MS based peptide and protein identifications. All Mascot searches were pooled and analyzed with Scaffold. Peptide identifications were accepted if they could be established at greater than 95.0% probability as specified by the Peptide Prophet algorithm [23]. Protein identifications were accepted if they could be established at greater than 99.0% probability and contained at least two identified unique peptides. Protein probabilities were assigned by the Protein Prophet algorithm [24]. Proteins that contained similar peptides and could not be differentiated based on MS/MS analysis alone were grouped to satisfy the principles of parsimony. Comparison of different data sets was performed with ProteinCenter Software from Proxeon Bioinformatics (Odense, Denmark), after clustering to 96% homology, to remove the redundancy of protein sequences including splice variants and truncated forms.

3. Results and discussion 3.1. Data from the present study In a previous report we presented and discussed analytical data on the egg white proteome before and after treatment with combinatorial peptide ligand libraries as a means of revealing low-abundance species [6]. In the present work, however, three libraries were used instead: a primary amino terminal hexapeptide library (Library-1) followed by its carboxylated form (Library-2) and then a third library with a tertiary amine terminal derivative (Library-3). Fig.
1 shows the SDS-PAGE profiling of the starting material (YP), the flow-through (FT) and all eluates from the three hexapeptide libraries (E1 TUC, UCA, OS for Library-1; E2 TUC, UCA, OS for Library-2 and E3 TUC and UCA for Library-3). Two major proteins present in the control material with masses close to 45 and 60 kDa, respectively, were strongly reduced in concentration in all library eluates. Conversely, all eluates from each library showed an increased number of low-abundance species, especially in the 10–30 kDa region. One can also notice that, whereas the OS eluate from Library-1 (E1/OS) was still very rich in protein bands, the corresponding eluate from Library-2 (E2/OS) bore a decreased amount of proteins. The OS eluate from Library-3 did not show detectable proteins; this phenomenon could be interpreted as a depletion effect of proteins of a similar category (probably hydrophobic) or as a sign that proteins with both hydrophobic and anionic properties, capable in principle of interacting with the third library, are no longer present in the flow-through of the second column. It is interesting to notice that several proteins eluted by means of the OS solution appear quite different in the SDS-PAGE pattern, with different mobility and hence different masses compared to proteins from previous elutions. Due to the large excess of proteins loaded on the column sequence, the flow-through is practically indistinguishable from the non-treated extract, as expected from this technological approach [10]. Fig. 2 shows five 2D maps of selected CPLL eluates as compared with the control (Ctrl, uppermost panel). The samples analyzed from the first two libraries are the TUC and UCA eluates, respectively, from where most of the proteins were found. A large part of the proteins captured by Library-1 seem to focus in the pH 5–8 range, with a mass distribution covering the 10–250 kDa region. However, the eluate from Library-2 was somewhat enriched in alkaline proteins, reaching pI values of 10.
In addition, there seems to be enrichment in low-Mr species (20–50 kDa) as compared to the controls, thus confirming the data from SDS-PAGE. Two-dimensional PAGE shows a large number of isoforms not only in the control experiment, but also in the fractions generated by CPLL treatments. Some protein isoforms that were not detectable prior to library treatment became clearly visible in various protein eluates.

Fig. 1. SDS-PAGE analysis of egg yolk. The samples are: YP = egg yolk plasma before library treatment (control); FT = flow-through after Libraries-1 and -2; E1/TUC = eluate from Library-1 with TUC; E2/TUC = eluate from Library-2 with TUC; E1/UCA = eluate from Library-1 with 9 M urea, 50 mM citric acid and 2% CHAPS; E2/UCA = eluate from Library-2 with 9 M urea, 50 mM citric acid and 2% CHAPS; E1/OS = eluate from Library-1 with 6% (v/v) acetonitrile, 12% (v/v) isopropanol, 10% (v/v) of 20% NH3 and 72% water; E2/OS = eluate from Library-2 with 6% (v/v) acetonitrile, 12% (v/v) isopropanol, 10% (v/v) of 20% NH3 and 72% water; E3/TUC = eluate from Library-3 with TUC; E3/UCA = eluate from Library-3 with UCA. All samples loaded at 50 µg/lane.

Fig. 2. 2D maps of various fractions obtained from library treatments. The upper map is the initial untreated egg yolk extract (Ctrl). TUC and UCA eluates from Library-1 and Library-2 are shown in the series below the Ctrl, respectively. Spot staining with Colloidal Coomassie Blue. First dimension: 7-cm long IPG strips, pH 3–10 L (linear); second dimension: 8–18% acrylamide gradient SDS-PAGE.

When observing the protein patterns from various eluates it can be assessed that the behaviour of the libraries used is in full agreement with previous findings [6,11–14]: whereas the standard ProteoMiner binds the vast majority of proteins present in any proteome deriving from cell lysates or biological fluids, its carboxylated variant binds approximately an additional 20% of the proteome, non-redundant with Library-1, that would otherwise have been lost via the use of only a single library. This leads us to conclude that, whereas most proteins bind to their respective hexapeptide bait, having a complementary sequence able to elicit a variety of bonds (hydrophobic, hydrogen or ionic bonds or mixed-type interactions), the terminal group nevertheless plays an important role in modulating the binding event. It is tempting to speculate (and it has been partly verified in our 2D maps) that the amino terminus, having a full positive charge under the adsorption buffer conditions, enhances the binding of negatively charged proteins (and, for that matter, at physiological pH, two thirds of the proteins bear a net negative charge, since they have pI values below pH 7.0) [25]. Conversely the carboxyl terminus, bearing a full negative charge at the same binding pH, tends to enhance the binding of positively charged proteins (here too it is worth recalling that one third of all proteins in any proteome have pI values above pH 7.0) [25]. These statements have to be taken with care, of course, considering that there is a common catch by which, independently of their respective pI values, 80% of the proteins are captured simultaneously and are shared by both libraries.
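The electrostatic argument above can be made concrete with a small Python sketch. It assumes binding in the PBS of Section 2.2 (pH 7.2) and reduces each protein to the sign of its net charge, taken as negative when pI is below the buffer pH and positive when above; the function names and any pI values passed to them are hypothetical. As cautioned in the text, this is at best a first-order heuristic, since about 80% of the proteins are captured by both libraries regardless of pI.

```python
# Sketch only: net-charge sign from pI versus buffer pH, and the bait
# terminus that a purely electrostatic view would favour.
BINDING_PH = 7.2  # PBS pH used for library adsorption (Section 2.2)


def net_charge_sign(pi: float, ph: float = BINDING_PH) -> int:
    """Return -1, 0 or +1 for the sign of the net charge at the given pH."""
    if pi < ph:
        return -1  # above its pI a protein carries a net negative charge
    if pi > ph:
        return 1   # below its pI a protein carries a net positive charge
    return 0


def electrostatically_favoured_terminus(pi: float,
                                        ph: float = BINDING_PH) -> str:
    """Name the bait terminus whose charge is opposite to the protein's."""
    sign = net_charge_sign(pi, ph)
    if sign < 0:
        return "amino"     # positively charged terminus (Library-1, Library-3)
    if sign > 0:
        return "carboxyl"  # negatively charged terminus (Library-2)
    return "none"
```

For instance, a hypothetical protein with pI 5.0 would be assigned to the amino-terminated libraries and one with pI 9.5 to the carboxylated library, in line with the enrichment of alkaline species seen in the Library-2 eluates.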
The behaviour of the tertiary amino terminus library (Library-3) is also illuminating: this library captures essentially the same proteins adsorbed by Library-1 (primary amino terminus) and in the same relative abundances. This reinforces the notion that the modulation is primarily driven by the charge of the terminus, and not by the types of residues bound to it (non-bulky substituents have been used here!). Since the primary or tertiary amino groups would have essentially the same charge during the binding events, there is practically no difference in how the proteins are hooked up by their partner bait. In order to identify the species collected from the libraries as compared to the initial sample, the nine eluates and the control were subjected first to SDS-PAGE, by loading 60–120 µg of total proteins per track. At the end of the run, fourteen slices were excised, which covered the whole gel resolving region; these gel portions were treated with trypsin, extracted for peptides and subjected to analysis via nanoLC MS/MS, by running each eluted gel slice in duplicate. A total data reproducibility of no less than 97% and the consistency criteria reported in the experimental section ensured proper reliability in protein identification. The total number of identified species in each eluate, together with assigned spectra, total number of spectra and percentage of identified spectra are listed in Table 1.

Table 1. Number of unique gene products found in the various eluates from the three libraries.

Analyzed sample | Identified proteins | Assigned spectra | Total number of spectra | % of identified spectra
Initial extract | 49 | 1710 | 7326 | 23.3
Library-1 TUC eluate | 93 | 6049 | 16506 | 34.5
Library-1 UCA eluate | 97 | 3505 | 13268 | 26.4
Library-1 OS eluate | 69 | 1290 | 8638 | 14.9
Library-2 TUC eluate | 109 | 4738 | 15040 | 31.5
Library-2 UCA eluate | 106 | 3741 | 13990 | 26.7
Library-2 OS eluate | 68 | 1077 | 8668 | 12.4
Library-3 TUC eluate | 86 | 5041 | 22018 | 22.8
Library-3 UCA eluate | 110 | 10128 | 28530 | 35.4

When eliminating all redundant identifications in the various fractions, a grand total of 255 unique gene products could be identified in egg yolk, a significant increment as compared to the best proteomic report available today (see Table 2). On the contrary, in the starting egg yolk (control) 49 gene products could be identified. The increment in detection, thus, as compared with the starting material, is by a factor of 5. All proteins found in the initial material were also found after peptide library treatment except one, corresponding to IPI00600048 (similar to complement C7). The enhancing properties of the libraries were confirmed by the number of peptides identified in eluates from the three libraries, compared to the starting material. While the two main libraries contributed more than 96%, the third library contributed only six exclusive gene products. Regardless of the type of library, the TUC and UCA eluting agents were of similar efficiency (similar number of proteins desorbed and similar exclusive contribution), whereas the third eluent (hydro-organic mixture) desorbed fewer than 10 proteins (data not shown). One of the main and most visible effects of CPLL treatment of protein extracts is the concomitant decrease in the concentration of high-abundance proteins and increase in the concentration of low and very low-abundance species, rendering the latter amenable to further analysis via e.g. various MS tools. This last effect is obviously more pronounced the larger the initial sample load, since the baits can only sequester what is present in solution, considering that PCR for proteins simply does not exist. It is now evident, from the present data, that a few exceptions exist. In particular, from the point of view of drastically reducing the concentration of high-abundance species, there are “well-behaved” and “aberrant” proteins. This phenomenon was demonstrated for several high-abundance proteins present in other biological extracts. One of them is soluble proteins from red blood cells where haemoglobin, that is initially present at a level of ca.
98%, severely masking the “minority” proteome, is so dramatically reduced that not only the strong red color of the adsorbed sample fully disappears, but also the signal of α- and β-chains in the Coomassie stained gels becomes quite faint [26]. Another classical example is albumin from human serum, whose initial highly dominant concentration is so reduced that it becomes almost undetectable after library treatment [27]. A similar behaviour is also attributed to human serum transferrin. There are, however, counter-examples of extreme increase of proteins that are of medium abundance, such as apolipoprotein A1: the latter possesses a large number of sites for a variety of ligands and thus it could probably recognize several hexapeptide ligands in our CPLL library, leading to an “extraordinary” capture and leaving it, even after CPLL treatment, as one of the most abundant species. In the present work an interesting example is illustrated by apolipoprotein B (Mr in excess of 500 kDa, see the tracings in Fig. 1), which is quite abundant in the control, untreated sample, and even more abundant after CPLL treatment (see SDS-PAGE in Fig. 1 and Table 2). Interestingly, it behaves quite similarly to human Apo A1, being concentrated to an abnormal extent [28]. Another class of misbehaved proteins is surely mucins, which are present in large excess in a number of biological fluids. If not properly removed from the sample, they massively coagulate on the surface of the CPLL beads, preventing proper binding of all other proteins [6]. It is not known if this is due to their particular composition or to their large size, which would permit binding to several baits across different beads, thus leading to an excessive capture. Glycoproteins too, if extensively glycosylated, might behave abnormally, in that the sugar residues could form extensive hydrogen bonds with the hexapeptide baits, independently of their respective sequences, thus resulting in over-adsorption onto the beads.
Most generally, losses of 5 and up to 15% proteins [11–15] which were normally found in the control and became undetectable after CPLL treatment, were lamented in the past. We speculate that such proteins might be unable to find a suitable partner hexapeptide, as strange as it may sound, considering that 1 mL of beads should


Table 2. Comprehensive list of the unique proteins identified in the experiment. The results for each kind of bead, for each elution type and for the total lysate before treatment have been merged in a unique file. For each protein the IPI accession number, the molecular mass and the number of unique peptides identified are reported. Scaffold was used to validate MS/MS based peptide and protein identifications. Peptide identifications were accepted if they could be established at greater than 95.0% probability. Protein identifications were accepted if they could be established at greater than 99.0% probability and contained at least two identified peptides. Proteins previously identified by Mann and Mann are marked by an asterisk.

Protein name | Accession numbers | Protein molecular mass (u) | Identified unique peptides
APOB Apolipoprotein B* | IPI00775749 | 523 356.2 | 312
ALB Serum albumin precursor* | IPI00574195 | 69 900.7 | 64
VIT2 205 kDa protein* | IPI00591299 | 205 132.6 | 67
VYGIII 184 kDa protein | IPI00818934, IPI00823141 | 184 440.4 | 30
IGLL1 24 kDa protein | IPI00685019 | 24 378.2 | 10
APOVLDLII Apovitellenin-1 precursor* | IPI00580509 | 11 948.3 | 14
LTF Ovotransferrin precursor* | IPI00683271 | 77 759 | 57
LOC396449 Riboflavin-binding protein precursor* | IPI00594746 | 27 193.3 | 18
C3 Complement C3 precursor* | IPI00581158 | 184 067.7 | 58
LOC772169 13 kDa protein | IPI00586713 | 13 126.4 | 9
TPRXL Vitellogenin-1 precursor* | IPI00591843, IPI00820086 | 210 613.1 | 21
PLG similar to plasminogen | IPI00574250 | 90 745.9 | 54
LOC416235 52 kDa protein* | IPI00681776 | 51 966.1 | 35
PIT 54 PIT 54* | IPI00684336 | 50 801.2 | 19
APOH 39 kDa protein* | IPI00599712, IPI00822244 | 38 978.8 | 19
CFH similar to complement regulator factor H* | IPI00577650 | 139 658.5 | 38
LOC418892 similar to complement C4-1* | IPI00681096 | 172 021.2 | 43
- 154 kDa protein | IPI00589043 | 153 537.2 | 38
RCJMB04 1h13 Actin, cytoplasmic type 5* | IPI00572084, IPI00655503, IPI00820892, IPI00837485 | 41 818.9 | 14
- VH1 protein (Fragment) | IPI00829373 | 12 551.6 | 4
AHSG similar to fetuin* | IPI00589759 | 37 181.7 | 8
- 13 kDa protein | IPI00684262 | 12 998.4 | 4
- 10 kDa protein* | IPI00574703 | 10 049.9 | 5
FGG 47 kDa protein* | IPI00592072, IPI00685084 | 47 077.2 | 18
LOC396058 Ovalbumin* | IPI00583974 | 42 863.7 | 13
PROS1 hypothetical protein* | IPI00584625 | 74 083.8 | 18
AMBP similar to Alpha 1 microglobulin/bikunin | IPI00577903 | 38 440.2 | 8
LOC420039 similar to Krt42 protein | IPI00578649 | 63 992.7 | 2
APOA1 Apolipoprotein A-I precursor* | IPI00580765 | 30 663 | 15
FGB similar to Chain B, Crystal Structure Of Native Chicken Fibrinogen* | IPI00588322 | 54 564.6 | 22
C4 MHC-linked complement C4 precursor | IPI00807214 | 178 867.5 | 22
LOC429440 similar to Ig α | IPI00596662 | 14 602.5 | 5
LOC430014 similar to 24 kDa protein; domains: 1 complete and 1 partial IGv/13kDa protein | IPI00600667 | 15 273.3 | 3
SERPINF2 similar to alpha-2-plasmin inhibitor | IPI00680258 | 56 957 | 12
VTN 52 kDa protein* | IPI00590719, IPI00818472 | 52 122.8 | 13
- 13 kDa protein* | IPI00600609 | 12 610.2 | 2
- 7 kDa protein* | IPI00594704 | 7 158.8 | 2
LOC769305 similar to immunoglobulin lambda chain | IPI00811463 | 12 308.4 | 2
LOC769283 similar to immunoglobulin lambda chain | IPI00810959 | 12 760.8 | 2
SERPIND1 similar to heparin cofactor II | IPI00583396 | 55 799 | 11
LOC769305 12 kDa protein* | IPI00683984 | 11 755.6 | 3
LYZ Lysozyme C precursor* | IPI00600859 | 16 221 | 10
- 10 kDa protein | IPI00683203 | 10 124.8 | 3
LOC772174 13 kDa protein | IPI00585276 | 12 499.3 | 3
- 15 kDa protein | IPI00819112 | 14 871.6 | 3
TTR Transthyretin precursor* | IPI00591282, IPI00820372 | 16 291 | 7
GSN Gelsolin precursor* | IPI00582056 | 85 814.3 | 16
FN1 similar to fibronectin 1 isoform 1 preproprotein* | IPI00590535 | 282 107.8 | 23
LOC426220 similar to avidin, partial* | IPI00579589 | 14 691.5 | 5
FGA Isoform 2 of Fibrinogen alpha chain precursor* | IPI00571323, IPI00589899 | 56 155.7 | 5
VIT2 Vitellogenin-2 precursor | IPI00602226 | 204 792.2 | 2
UTRN 419 kDa protein | IPI00601322 | 419 172.1 | 3
GNS similar to glucosamine (N-acetyl)-6-sulfatase precursor* | IPI00596266 | 60 619.7 | 14
WFDC8 similar to putative porin | IPI00597095 | 59 455.9 | 11
SERPINF1 similar to SDF3* | IPI00587267 | 47 065 | 17
LOC776522 similar to Inter-alpha-trypsin inhibitor heavy chain H2 precursor (ITI heavy chain H2) (Inter-alpha-inhibitor heavy chain 2) (Inter-alpha-trypsin inhibitor complex component II) (Serum-derived hyaluronan-associated protein) (SHAP), partial | IPI00577279 | 99 276.4 | 11
KRT79 similar to Keratin 5 | IPI00814313 | 72 098.3 | 2
F2 Thrombin* | IPI00573007 | 69 093.4 | 7
LOC428593 similar to Plasminogen precursor | IPI00580408 | 93 389.7 | 22
KRT19 Keratin, type I cytoskeletal 19 | IPI00597243 | 46 066.1 | 4
KRT15 Type I alpha-keratin 15 | IPI00589873 | 48 073.7 | 2
EEF1A2 eukaryotic translation elongation factor 1 alpha 2 | IPI00581988, IPI00589985 | 50 480.9 | 4
LOC776783 similar to Ig-like fold | IPI00812005 | 15 203.9 | 2
CPB2 similar to thrombin-activatable fibrinolysis inhibitor | IPI00581798 | 57 124.9 | 10


Table 2 (Continued ) TFRC transferrin receptor LOC427881 Histone H2A-III*

LOC427886 Histone H2B 8* APOB 31 kDa protein SYNE1 similar to Nesprin-1 KRT12 similar to K12 keratin HGFAC similar to hepatocyte growth factor activator* ITIH3 similar to inter-alpha (globulin) inhibitor H3* SERPINA4 similar to Serpina1d-prov protein* CLEC3B Tetranectin* C6 similar to complement component 6 TENP Tenp* - 22 kDa protein KRT4 hypothetical protein ALDH2 hypothetical protein SERPINB11 similar to Ovalbumin-related protein Y LOC770005; LOC770142; LOC417950; LOC770079; LOC427884; LOC417946 Histone H4* F9 Coagulation factor IX precursor LOC421091 similar to transthyretin* APCS C-reactive protein MFGE8 similar to Milk fat globule-EGF factor 8 protein RBP4 Plasma retinol-binding protein precursor CST3 Cystatin precursor* LOC395772 58 kDa protein CP similar to Ceruloplasmin precursor ACTA2 Actin, aortic smooth muscle EFEMP1 similar to EGF-containing fibulin-like extracellular matrix protein 1 COMP similar to cartilage oligomeric matrix protein LTF 105 kDa protein AGT similar to angiotensinogen* IGFALS similar to insulin-like growth factor binding protein complex acid-labile subunit FYB 92 kDa protein C5 188 kDa protein* FBLN1 Isoform D of Fibulin-1 precursor DNAH9 Protein ZC3H13 similar to KIAA0853 protein CL2 Ribonuclease CL2 HPX hypothetical protein, partial CFI hypothetical protein APOB 51 kDa protein GC Vitamin-D binding protein* C8B similar to Complement component C8 beta chain precursor TOP1 DNA topoisomerase I RCJMB04 10m24 Putative uncharacterized protein CPN1 hypothetical protein isoform 1 C2 similar to complement component C2 CDH20 Isoform 1 of Cadherin-20 precursor CTSA Putative uncharacterized protein* CYP21 Complement C4* TENP 44 kDa protein RIMS2 similar to RIM2-5B HDLBP Vigilin CRP similar to C-reactive protein, partial LOC431660 similar to avidin CHD1 Chromo-helicase-DNA-binding on the Z chromosome protein LOC396059 Neuron-glia cell adhesion molecule (Ng-CAM) precursor LOC416605 similar to PI-3-kinase-related kinase SMG-1 LOC428066 hypothetical 
protein DNAH5 similar to axonemal dynein heavy chain DNAH5 MCF2L2 similar to guanine nucleotide exchange factor OSTIII UBR4 similar to retinoblastoma-associated factor 600 BAZ1B bromodomain adjacent to zinc finger domain, 1B PIP4K2B similar to phosphatidylinositol-4-phosphate 5-kinase type II beta ROBO2 similar to KIAA1568 protein TTN Connectin (Fragment) SVEP1 similar to polydom protein KIAA0196 hypothetical protein GP1BA similar to glycoprotein Ib* - 70 kDa protein

IPI00575515,IPI00820859 IPI00580239, IPI00588663, IPI00651485, IPI00683104, IPI00683848, IPI00821171, IPI00822332, IPI00822816 IPI00584144, IPI00593187, IPI00600992, IPI00604087, IPI00818650, IPI00822431 IPI00821713 IPI00595507 IPI00599012 IPI00598803 IPI00588436 IPI00596449 IPI00583488 IPI00588843 IPI00598229 IPI00822052 IPI00822585 IPI00589575 IPI00585021 IPI00572919, IPI00576977, IPI00598581, IPI00819077 IPI00574534,IPI00684997 IPI00601238 IPI00601419 IPI00588573 IPI00571731 IPI00576782 IPI00822243 IPI00819050 IPI00578082,IPI00596848,IPI00821989 IPI00603974

86 060.4 13 949

12 2

13 946.7

5

30 733.7 1 011 871.9 54 094.6 65 872.1 99 847.8 47 720.2 22 155 105 116.8 47 417.6 22 131.7 58 914.8 63 944.7 43 611 11 349.7

2 3 2 6 5 6 6 15 7 5 5 2 4 2

51 786.2 16 065.2 25 642.5 53 128.2 22 497.7 15 268.7 57 762.3 123 196.2 41 978.1 66 305.5

2 5 4 5 5 4 2 8 2 10

IPI00601647 IPI00821645 IPI00585118 IPI00576764

83 362.6 105 394.6 51 330.4 65 484.3

6 3 6 2

IPI00598899,IPI00680516 IPI00591736,IPI00819959 IPI00570704,IPI00575205 IPI00682950,IPI00818397,IPI00818931 IPI00572738 IPI00821735 IPI00573473 IPI00577678 IPI00819590 IPI00573327 IPI00595351

92 074.6 188 221.3 78 119 499 497.5 197 900.1 13 414.9 15 898.6 67 195 50 803.9 53 669.3 66 376.3

2 8 6 3 4 4 4 3 2 6 6

IPI00596648 IPI00593413,IPI00600280 IPI00599900 IPI00594508 IPI00572849 IPI00601528 IPI00823324 IPI00819823 IPI00595302 IPI00583171,IPI00820163 IPI00813394 IPI00573379 IPI00591777

90 782.6 84 003.6 51 188.3 18 127.4 89 140.5 53 138 173 856.3 43 628.4 185 504.1 142 204.8 22 830.9 16 117.5 208 389.7

3 2 6 2 2 2 2 2 3 2 2 3 4

IPI00593745

138 413.7

2

IPI00577570 IPI00602726 IPI00588314 IPI00578674 IPI00573705 IPI00820657 IPI00576293

411 249.4 213 694.3 542 656.7 127 058.6 575 907.4 178 598.9 42 159.6

2 2 4 3 3 2 2

IPI00588304 IPI00596489 IPI00575250 IPI00582542 IPI00571680 IPI00680010

153 768.2 904 779.7 389 346.2 134 039.5 75 616.4 69 557.9

2 5 2 2 2 2


Table 2 (Continued ) Protein name

Accession numbers

Protein molecular mass (u)

Identified unique peptides

- 222 kDa protein NIPBL similar to delangin CPD similar to carboxypeptidase D C8A similar to Complement component 8, alpha polypeptide OBSCN similar to obscurin, cytoskeletal calmodulin and titin-interacting RhoGEF GOLGA4 similar to trans-Golgi p230 ACTBL2 similar to Actin, alpha 2, smooth muscle, aorta isoform 2 - 65 kDa protein LOC417848 similar to nothepsin C7 similar to complement protein C7 LOC768553 similar to Ran-binding protein 2 PRICKLE2 similar to Prickle2 protein SERPINB3 Ovalbumin-related protein Y* KIAA1109 similar to fragile site-associated protein LOC417967 hypothetical protein ACO2 Putative uncharacterized protein TMPRSS13 similar to Transmembrane protease, serine 13 LAMC1 laminin, gamma 1 LOC771634 similar to progestin induced protein, partial CSNK1G1 Putative uncharacterized protein PLK3 similar to Polo-like kinase 3 WTIP hypothetical protein COL6A3 Collagen alpha-3(VI) chain precursor USP38 hypothetical protein DCUN1D2 similar to Rp42 homolog ATAD2 similar to two AAA domain containing protein GAPDH Glyceraldehyde-3-phosphate dehydrogenase* PKD1 475 kDa protein TRRAP 437 kDa protein MLL3 similar to myeloid/lymphoid or mixed-lineage leukemia 3 HSPG Basement membrane-specific heparan sulfate proteoglycan core protein precursor FAT2 similar to protocadherin Fat 2 LOC416379 hypothetical protein ALS2CR13 hypothetical protein - 87 kDa protein IGSF10 similar to bone specific CMF608 CEP250 similar to centrosomal protein 2 - 294 kDa protein ARHGEF12 similar to guanine nucleotide exchange factor SPAG17 similar to PF6 TTF2 127 kDa protein - 139 kDa protein DOCK5 similar to dedicator of cytokinesis 5 STMN1 Stathmin - 132 kDa protein PIK3C2A similar to Phosphoinositide-3-kinase, class 2, alpha polypeptide CKAP5 similar to colonic and hepatic tumor over-expressed protein isoform 2 AKAP8L hypothetical protein ASPM similar to abnormal spindle-like PCNT similar to kendrin LOC421884 similar to bullous pemphigoid antigen 1, 230/240kDa LOC771315 similar to cytokine 
receptor common beta chain precursor – human SBNO2 similar to KIAA0963 protein WDR65 hypothetical protein HSPA8 71 kDa protein* TANC2 similar to KIAA1636 protein LOC771232 24 kDa protein TMEM98 similar to MGC80088 protein, partial ZNF294 similar to Zinc finger protein 294 AKAP13 similar to A-kinase anchor protein 13 LOC776237 hypothetical protein, partial FYCO1 167 kDa protein - similar to Titin DEPDC7 similar to dJ85M6.4 LRP1 Isoform 1 of Low-density lipoprotein receptor-related protein 1 precursor CENTD1 similar to ARAP2 ANAPC1 similar to Tsg24 protein

IPI00600743 IPI00600291 IPI00576308 IPI00584072 IPI00577658

221 891 315 844.3 150 756.8 67 708.1 750 964.9

4 4 3 3 2

IPI00579194,IPI00818295 IPI00579460

257 046.8 41 997.1

2 2

IPI00820826 IPI00684448 IPI00600048 IPI00813144 IPI00583953 IPI00573738 IPI00582188 IPI00595914 IPI00576187 IPI00602177 IPI00596936 IPI00575408 IPI00599728 IPI00598602 IPI00574034 IPI00575769 IPI00603752 IPI00583709 IPI00588479 IPI00594653 IPI00588585,IPI00597057 IPI00575358,IPI00819561 IPI00593571

65 411.7 46 733.2 92 736.8 180 851.9 97 887.3 43 755.7 558 667.9 62 036.3 85 634.7 41 809.8 173 534.5 307 728.3 52 177.4 71 990.5 72 368.8 339 579.7 114 494 30 302 151 486.1 35 685.7 474 498.9 436 810.2 540 007.8

2 2 3 3 3 2 2 2 2 2 3 3 2 2 2 2 4 3 3 3 2 2 2

IPI00597846

432 808.5

2

IPI00582865 IPI00580385 IPI00583148 IPI00573900 IPI00597899 IPI00570644 IPI00583871 IPI00604374 IPI00597587 IPI00602086,IPI00821328 IPI00573527 IPI00582607,IPI00812971 IPI00603631 IPI00596501 IPI00589884

482 428.2 291 772.2 59 940.4 86 944.3 199 155 288 579.2 293 679.7 178 831.8 121 566.3 126 658.7 138 846.8 216 184.5 17 064.8 131 786.3 190 743.7

2 2 2 2 2 2 2 2 4 4 3 3 3 2 2

IPI00575181,IPI00812473

225 339.2

2

IPI00584667 IPI00602741 IPI00584594 IPI00573263

80 393.9 397 087.1 409 234.8 882 613.7

2 2 2 2

IPI00810644,IPI00811706

131 482.2

2

IPI00598713 IPI00589155 IPI00818933 IPI00576847 IPI00822087 IPI00590305 IPI00583645 IPI00581922 IPI00811609 IPI00818700 IPI00812160 IPI00602518 IPI00582419,IPI00593211

157 702.2 126 524.2 70 983 225 558.8 24 315.8 20 857.5 199 991.4 158 397.8 18 723.2 166 937.7 388 561.5 59 408.9 507 103.2

2 2 2 2 2 2 2 3 3 3 3 3 2

IPI00600935 IPI00589622

193 720.1 216 542.2

2 2


Table 2 (Continued ). RCJMB04 6c17 69 kDa protein ZCCHC11 similar to zinc finger, CCHC domain containing 11 LAMA3 similar to laminin alpha 3 splice variant b1 SERPINA1 similar to alpha-1-antitrypsin* SNX9 similar to Sorting nexin 9 LOC427942 similar to alpha-2-macroglobulin DNAH1 similar to KIAA1410 protein NIN similar to ninein COL3A1 procollagen, type III, alpha 1 PRPF6 similar to PRP6 pre-mRNA splicing factor 6 homolog PKP2 similar to plakophilin 2a CHD2 similar to chromodomain helicase DNA binding protein 2 PRR11 similar to Proline rich 11 SMNDC1 similar to 30kDa splicing factor; SPF 30 MEI1 similar to meiosis defective 1 BRCA2 Breast cancer susceptibility protein HMGB2 High mobility group protein B2 FXR1 Putative uncharacterized protein SMC4 148 kDa protein LOC416793 similar to autism-related protein 1 RCJMB04 4o17 Putative uncharacterized protein SPATA6 Putative uncharacterized protein LOC423605 similar to VLLH2748 TOMM70A hypothetical protein SMC3 142 kDa protein - 28 kDa protein COL9A1 Collagen alpha-1(IX) chain precursor PDE1C similar to PDE1C protein TMPRSS9 similar to Transmembrane protease, serine 9 LOC776767 hypothetical protein, partial KLHDC7A hypothetical protein USP42 similar to ubiquitin specific protease 42 ANK2 similar to ankyrin B C20orf4 Putative uncharacterized protein TXNRD2 thioredoxin reductase 2 F11R Junctional adhesion protein A TDRD1 similar to tudor domain containing 1 LOC428693 hypothetical protein LOC423126 similar to scavenger receptor cysteine-rich type 1 protein CD163c-alpha COG1 Putative uncharacterized protein LOC426173 similar to F-box only protein 43 PLCD1 87 kDa protein - similar to Titin SUSD3 hypothetical protein PTPN12 similar to protein tyrosine phosphatase, non-receptor type 12 CWF19L2 similar to CWF19L2 protein LOC417584 similar to p95-APP1 GIMAP1 similar to human immunity associated protein 1 ARFGEF2 similar to ARFGEF2 LRRC16A similar to leucine rich repeat containing 16 MKL1 similar to KIAA1438 protein ATXN2 similar 
to ataxin-2 TTC6 hypothetical protein SIN3B similar to KIAA0700 protein DDX50 similar to Gu protein - 19 kDa protein

IPI00680994 IPI00602900 IPI00597868 IPI00591427 IPI00601634 IPI00599918 IPI00602995,IPI00819024 IPI00571271 IPI00589264 IPI00589966 IPI00599792 IPI00575702 IPI00684856 IPI00582553 IPI00578531 IPI00572046,IPI00821209 IPI00579203 IPI00581672,IPI00585649,IPI00819933 IPI00573837,IPI00684483,IPI00819477 IPI00581453 IPI00585905,IPI00683469,IPI00821614 IPI00577335 IPI00589296 IPI00596994 IPI00598955,IPI00684346 IPI00822887 IPI00819026,IPI00820229,IPI00820643 IPI00581313 IPI00588294 IPI00812468 IPI00580729 IPI00603557 IPI00600976 IPI00572493,IPI00820233 IPI00890657 IPI00829364 IPI00578444,IPI00589668 IPI00580159 IPI00578391,IPI00811498

69 470.1 182 158.3 370 645.9 48 698.1 67 076.8 186 791.1 485 440.1 248 501.3 139 281.5 106 789.5 104 215.3 212 819.6 41 036.7 26 636.4 144 105.5 378 101.2 23 810.5 72 793.6 148 426.8 150 013.8 66 145.4 23 149.5 62 895 65 059.7 141 711.7 28 101.3 91 458.4 68 576.7 118 102 10 288 80 635.2 148 181.1 407 438.5 42 669.3 56 241.1 31 841.9 117 499.5 41 966.9 99 337.4

2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

IPI00581219 IPI00814348 IPI00578926,IPI00581980 IPI00812248 IPI00585173 IPI00603739

105 415 30 449 87 334.3 528 721 29 458 131 522.2

2 2 2 2 2 2

IPI00572017 IPI00583258 IPI00584396 IPI00578132 IPI00601113 IPI00580411 IPI00583208 IPI00579573 IPI00600088 IPI00582179 IPI00847038

114 255.5 38 656.5 62 983 202 948.1 145 933.2 117 879.8 118 062.4 185 868.9 146 272.3 79 168.2 19 003.9

2 2 2 2 2 2 2 2 2 2 2

contain at least 3.5 million different baits, i.e. a very large excess over the potential protein content of any proteome. A more plausible explanation could be that the partner hexapeptide is there, but the affinity of such proteins for it is so low that complex formation is highly unfavourable. A large part of the proteins not captured by ProteoMiner are detected after treatment with a second (carboxylated) library, which supports this hypothesis. At the opposite extreme of the scale, i.e. in the class of low-abundance proteins, there appears to be some anomalous behaviour as well. For instance, with a second library more proteins are discovered that were detected neither in the initial sample nor after treatment with the primary amino-terminal library. They would have been lost had only the first library been used. As described for the non-recognized high-abundance proteins, this observation could be of the same nature: the affinity of the partner peptide is too low for the capture of these

proteins, on the one hand, and, on the other hand, their concentration is in some cases so low that the thermodynamic requirements for binding are not met. Thus, when using the ProteoMiner technology and interpreting the data obtained, such anomalous protein behaviour should be taken into account. Nevertheless, even with these limitations, the ProteoMiner treatment is in general very successful, considering that the detection of low-abundance species after CPLL treatment is typically 2–5 times higher than in the untreated control sample. No other technique, to our knowledge, has been able to achieve that performance. Since all identifications are made en masse (after digesting the proteins excised from SDS-PAGE gels, see Section 2), the identification of a gene product does not necessarily mean that the protein is intact. This phenomenon has already been described in previous papers. Interestingly, as has been repeatedly reported


Fig. 3. SELDI-TOF-MS spectra, obtained on NTA-IMAC surfaces, of polypeptides eluted with the TUC desorption solution from ProteoMiner (panel A, TUC eluates from Library-1) or from carboxylated ProteoMiner (panel B, TUC eluates from Library-2). The top spectrum is a control obtained using an IMAC surface with no complexed metal ions. The second spectrum from the top was obtained using copper ions as the chelated metal. The other spectra show polypeptides captured using gallium, zirconium or iron ions, which are capable of interacting with phosphate residues. The “a” spectra were obtained under normal buffer conditions; the “b” spectra were obtained in the presence of phosphate ions, so as to challenge the metal ion–phosphate interaction. Arrows indicate signals that were absent from the initial control and from chelated Cu ions (non-specific for phosphate), but that were present using the Ga, Zr and Fe ions that were then challenged with phosphate. The buffer used was 100 mM acetate, pH 4, containing 250 mM sodium chloride. Competition with phosphate ions was performed by adding 50 mM NaH2PO4 to the acetate buffer. m/z = mass over charge.

for a number of biological samples [28,29], the most pronounced increase in the number of species appeared in the mass range between a few thousand Da and 30 kDa. Since this phenomenon was also observed with egg yolk proteins, we tried to establish the nature of these polypeptides, considering that egg yolk may contain a relatively large number of phosphorylated species. To this end, an analytical investigation was performed by surface-enhanced laser desorption ionization (SELDI) MS (Fig. 3) on the two TUC eluates, from the standard ProteoMiner (Library-1) and from its carboxylated derivative (Library-2), exploiting IMAC surfaces in order to see whether the preferential capture and enrichment of some phosphoproteins could be demonstrated. Two negative controls were run: a surface without any metal ion and a surface with copper ions. Copper ions are not described as being specific for phosphate groups on polypeptides, but rather as interacting more generically with histidine residues. In contrast, transition metal ions such as Fe3+, Ga3+, Ti4+ and Zr4+ are known to interact with phosphopolypeptides [30,31]. In Fig. 3A and B, the “a” spectra were obtained under normal buffer conditions (see below), whereas the “b” spectra were acquired in the presence of phosphate ions in order to challenge the metal ion–phosphate interaction. The arrows indicate signals of phosphopolypeptides that were absent from the initial control and from chelated copper ions (non-specific for phosphate), but

that were present using gallium, zirconium and iron ions and again absent when challenged with phosphate ions. The buffer used was 100 mM acetate, pH 4, containing 250 mM sodium chloride (an acidic pH and a high salt molarity were chosen in order to suppress non-specific binding). Competition with phosphate ions was performed by adding 50 mM NaH2PO4 to the acetate buffer. By varying the type of chelated metal ion, it was found that phosphopolypeptides could be detected. Specific signals were clearly detected when zirconium, iron and gallium, transition metal ions described as forming specific complexes with phosphate groups at acidic pH [30,31], were adopted. The same signals were detected neither in the presence of copper ions, used as a first control (the interaction in this case mainly involves histidine groups), nor when the IMAC chip surface was used without metal ions (second control). Moreover, the signal was inhibited in the presence of competing phosphate ions. Although these polypeptides were not identified, they constitute an investigation field of interest (work in progress).

Considering the large list of proteins found, an additional analysis of the data was performed regarding the function of the discovered species. Fig. 4 represents the molecular function analysis of the proteins found before and after treatment with the libraries. The protein categories that are significantly increased are those having binding properties, such as protein binding, nucleotide binding and metal ion binding. However, other categories are also amplified, such as proteins with catalytic activity and transcription regulator activity. At the same time, there are proteins whose relative abundance decreased upon library treatment: they are categorized under enzyme regulator activity and transporter activity.
When the proteins found using the CPLLs were analysed by category of molecular function and compared with those from the non-treated egg yolk, the proteins having interaction properties turned out to be the most increased. This is not completely surprising, since the treatment is based on the formation of complexes between the egg yolk proteins and the peptides of the libraries. This quite rational behaviour has been observed with other biological extracts, such as platelets and red blood cells [24]. For the possible physiological role and biological significance of the various categories of proteins identified, the reader is referred to the excellent excursus in Mann and Mann [7].

Whereas it goes without saying that the present data represent an exploration of the egg yolk plasma proteome to an unprecedented depth (considering that the best report that has appeared so far lists about 100 plasma proteins, versus 255 in the present case), a few words should nevertheless be spent on the behaviour of the CPLL libraries, on the one hand, and of the proteins meant to bind to them, on the other hand.

3.2. Comparison of the present results with literature data

The most exhaustive published protein list from egg yolk plasma [7] reports around one hundred unique gene products. An in-depth analysis of that list indicates that the exact number is 115 gene products. This number was obtained from the entire list after removing the proteins found only in egg yolk granules (not considered in the present work) and after eliminating all IgY fragments. Fig. 5 illustrates the comparison between the previously published data and the gene products detected using the present protocol. The two protein lists were compared using the ProteinCenter software from Proxeon Bioinformatics.
After clustering to 96% homology, to remove the redundancy of protein sequences including splice variants and truncated forms, 54 proteins were found to be common to the two data sets. As a consequence, 61 and 201 gene products were found exclusively in the published data and in the present work, respectively. Apart from the fact that our work increased the egg yolk protein list by about 170% (from 115 to 316 in total), there were 61 gene products that were not detected using the presently described method.
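The overlap arithmetic quoted above can be checked with a minimal sketch. The function name `venn_counts` and its arguments are illustrative assumptions, not part of the ProteinCenter software; only the three input counts (115, 255 and 54) are taken from the text.

```python
def venn_counts(total_published, total_present, shared):
    """Return (only_published, only_present, union) for two overlapping lists.

    Hypothetical helper for illustration; the inputs are list sizes,
    not the protein lists themselves.
    """
    only_published = total_published - shared  # exclusive to the literature list [7]
    only_present = total_present - shared      # exclusive to the present work
    union = only_published + shared + only_present
    return only_published, only_present, union


# Counts stated in the text: 115 published, 255 present, 54 in common.
only_pub, only_new, union = venn_counts(115, 255, 54)
increase_pct = 100 * (union - 115) / 115  # relative growth of the combined list

print(only_pub, only_new, union, round(increase_pct))
```

With these inputs the sketch returns 61, 201 and 316, matching the values in the text; the relative increase evaluates to roughly 175%, in line with the "about 170%" quoted above.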


Fig. 4. Ontology analysis (classification by molecular function) of proteins found in the initial egg yolk (light grey bars) and after treatment with peptide libraries (dark grey bars).

This difference may be attributed to a variety of reasons, such as the washing of the egg yolk to eliminate residual egg white proteins, or differences in the performance of the mass spectrometry systems used, added to the reliability of the MS instruments themselves. Although the difference in reliability between mass spectrometers could be estimated at around 10 to 15%, our data come from duplicate experiments, and protein identifications were accepted only under stringent criteria, rendering the results accurate. Nonetheless, egg yolk comprises proteins that were not revealed by the described principle based on peptide ligand library capture. This is the case, for instance, of cathepsin D, aspartyl protease inhibitor [32] and cobalamin-binding proteins [33], found by methods other than mass spectrometry. Regarding the question of the elimination of residual proteins from egg white, the list of proteins previously reported [7] shows, for instance, the presence of ovomucoid and ovoglycoprotein, classical protein species normally found in egg white, as clearly discussed by the authors. A recent paper [9], belonging to the series of proteomic explorations of chicken eggs, has highlighted a phenomenon that bears some resemblance to the activity of CPLLs. In an effort to perform a full proteomic analysis of the chicken egg vitelline membrane, the authors extensively washed this membrane either with distilled water or with 0.5 M NaCl. Apparently, the water-washed membrane bore more proteins, especially in the lower Mr (10–20 kDa) region, than the salt-stripped membrane, as shown via SDS-PAGE profiling (see Fig. 1 in ref. [9]). Yet, upon MS analysis of all proteins along each electrophoretic track, the water-rinsed membrane gave

Fig. 5. Overlapping Venn diagram of proteins found in the present study compared to what has been reported from the literature [7]. Proteins found are the sum of all non-redundant gene products identified from the initial extract as well as from all eluates of the three libraries used.

only 97 proteins, whereas the salt-stripped vitelline membrane enabled the detection of 123 unique gene products, a substantial and significant increment. This result is only apparently aberrant. The strong suppression of the signal of three proteins of very high abundance (VMO-1, ca. 20 kDa; lysozyme C, ca. 14 kDa; and another strong band at 10 kDa), via stripping with 0.5 M NaCl, allows the detection of the signal of low-abundance proteins migrating in the 10 to 20 kDa region, whose presence would otherwise have been masked by the overwhelming presence of these three proteins. Thus, paradoxically, this salt-stripping performed part of the function of CPLLs, namely to drastically reduce the signals of some highly abundant proteins, bringing to the limelight those of low abundance that would have been masked underneath in the SDS track. Of course, the other, most important function so typical of CPLLs, namely that of simultaneously concentrating the low-abundance species, could not be performed by this simple salt-stripping. The result was nevertheless quite remarkable, although it fully confirmed the notion that the best treatment would have been the CPLL methodology, if applicable. In practice, the CPLL methodology, being based on solid-phase adsorption, cannot be used in conjunction with particulate material. This peculiar limitation adds to the impossibility of quantifying, in the original biological material, the novel entities revealed after CPLL treatment. Nevertheless, relative quantitation versus a control sample remains possible, as described [26].

4. Conclusions and future perspectives

The present report, by exploiting CPLL technology, has more than doubled the number of proteins (up to 255) found in egg yolk plasma, as compared to the most recent and comprehensive list reported by Mann and Mann [7]. The biological significance and role of these new species are not known at present.
Clearly, more work will be necessary in order to trace the origin of these novel egg yolk components and to determine their possible function. The treatment with CPLLs might appear quite cumbersome, since we use at least two libraries with three sequential elution steps, resulting in six eluates to be further processed and analyzed; we are therefore investigating simplified protocols that would render the technique more user-friendly. One of them could be the elution en bloc of all adsorbed proteins in boiling SDS. Such a single step, besides greatly simplifying the present protocol, would have at least two extra benefits: first of all, it would


allow direct analysis of the eluate via SDS-PAGE; secondly, it would permit a near-quantitative recovery of all material adsorbed onto the beads. Indeed, although not shown in the present report, even after the three sequential elution steps, the three types of beads, further treated with boiling SDS, released traces of proteins that could represent about 2 to 5% of the total catch of each type of library. Due to the paucity of the eluted material, it was not possible in the present investigation to assess whether these proteins represented a left-over of species of relatively high abundance, already eluted in previous steps, or whether they represented novel proteins tenaciously bound to the beads and thus lost completely to the analysis (work in progress).

Acknowledgements

The authors would like to thank Dr. Alexander Podtelejnikov (Proxeon Bioinformatics, Odense, Denmark) for comparative data analysis. We also thank Dr. K.H. Mann for sharing his data with us. P.G.R. was supported by Fondazione Cariplo (Milan, Italy), by PRIN 2006 (MURST, Rome, Italy) and by the Bilateral Project “Novel Methods for Top-down Analysis of Macromolecular and Nanosized Samples of Biotechnological and Environmental Interest” within the VIII Executive Programme of Scientific and Technological Cooperation between Italy and Korea for the years 2007–2009. A.B. is partially supported by Fondazione Cariplo (Progetto Nobel: Guard).

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.chroma.2008.11.051.

References

[1] R.W. Burley, D.V. Vadehra, The Avian Egg: Chemistry and Biology, Wiley, New York, 1989.
[2] Y. Mine, World's Poult. Sci. J. 58 (2002) 31.
[3] C. Tokarski, E. Martin, C. Rolando, C. Cren-Olivé, Anal. Chem. 78 (2006) 1494.
[4] C. Guérin-Dubiard, M. Pasco, D. Mollé, C. Désert, T. Croguennec, F. Nau, J. Agric. Food Chem. 54 (2006) 3901.
[5] K. Mann, Proteomics 7 (2007) 3558.
[6] C. D’Ambrosio, S. Arena, A. Scaloni, L. Guerrier, E. Boschetti, M.E. Mendieta, A. Citterio, P.G. Righetti, J. Proteome Res. 7 (2008) 3461.
[7] K. Mann, M. Mann, Proteomics 8 (2008) 178.
[8] K. Mann, J.V. Olsen, B. Macek, F. Gnad, M. Mann, World's Poult. Sci. J. 64 (2008) 209.
[9] K. Mann, Proteomics 8 (2008) 2322.
[10] V. Thulasiraman, S. Lin, L. Gheorghiu, J. Lathrop, L. Lomas, D. Hammond, E. Boschetti, Electrophoresis 26 (2005) 3561.
[11] P.G. Righetti, E. Boschetti, L. Lomas, A. Citterio, Proteomics 6 (2006) 3980.
[12] P.G. Righetti, E. Boschetti, FEBS J. 274 (2007) 897.
[13] E. Boschetti, L. Lomas, A. Citterio, P.G. Righetti, J. Chromatogr. A 1153 (2007) 277.
[14] E. Boschetti, P.G. Righetti, BioTechniques 44 (2008) 663.
[15] L. Guerrier, P.G. Righetti, E. Boschetti, Nat. Protoc. 3 (2008) 883.
[16] D.U. Ahn, S.H. Lee, H. Singam, E.J. Lee, J.C. Kim, Food Sci. Biotechnol. 15 (2006) 189.
[17] U. Laemmli, Nature 227 (1970) 680.
[18] G. Candiano, M. Bruschi, L. Musante, L. Santucci, G.M. Ghiggeri, B. Carnemolla, P. Orecchia, L. Zardi, P.G. Righetti, Electrophoresis 25 (2004) 1327.
[19] M. Hamdan, P.G. Righetti, Proteomics Today, Wiley, Hoboken, 2005, pp. 346–348.
[20] M. Shevchenko, O. Wilm, M. Vorm, M. Mann, Anal. Chem. 68 (1996) 850.
[21] J.V. Olsen, L.M. de Godoy, G. Li, B. Macek, P. Mortensen, R. Pesch, A. Makarov, O. Lange, S. Horning, M. Mann, Mol. Cell. Proteomics 12 (2005) 2010.
[22] P.J. Kersey, J. Duarte, A. Williams, Y. Karavidopoulou, E. Birney, R. Apweiler, Proteomics 4 (2004) 1985.
[23] A. Keller, A.I. Nesvizhskii, E. Kolker, R. Aebersold, Anal. Chem. 74 (2002) 5383.
[24] A.I. Nesvizhskii, Anal. Chem. 75 (2003) 4646.
[25] E. Gianazza, P.G. Righetti, J. Chromatogr. 193 (1980) 1.
[26] F. Roux-Dalvai, A. Gonzalez de Peredo, O. Burlet-Schiltz, C. Simó, L. Guerrier, D. Bouyssié, A. Zanella, A. Citterio, O. Burlet-Schiltz, E. Boschetti, P.G. Righetti, B. Monsarrat, Mol. Cell. Proteomics 7 (2008) 2254.
[27] L. Sennels, M. Salek, L. Lomas, E. Boschetti, P.G. Righetti, J. Rappsilber, J. Proteome Res. 6 (2007) 4055.
[28] P.G. Righetti, E. Boschetti, Mass Spectr. Rev. 27 (2008) 596.
[29] P.G. Righetti, E. Boschetti, Proteomics, in press.
[30] S. Feng, M. Ye, H. Zhou, X. Jiang, H. Zou, B. Gong, Mol. Cell. Proteomics 6 (2007) 1656.
[31] Y. Li, X. Xu, D. Qi, C. Deng, P. Yang, X. Zhang, J. Proteome Res. 7 (2008) 2526.
[32] E. Fialho, A. Nakamura, L. Juliano, H. Masuda, M.A. Silva-Neto, Arch. Biochem. Biophys. 436 (2005) 246.
[33] A. Del Corral, R. Carmel, Gastroenterology 98 (1990) 1460.
