Shotgun proteomics

Shotgun proteomics refers to the use of bottom-up proteomics techniques to identify proteins in complex mixtures using high-performance liquid chromatography combined with mass spectrometry.^[1]^[2]^[3]^[4]^[5]^[6] The name is derived from shotgun sequencing of DNA, which is itself named after the rapidly expanding, quasi-random firing pattern of a shotgun. In the most common method of shotgun proteomics, the proteins in the mixture are digested and the resulting peptides are separated by liquid chromatography. Tandem mass spectrometry is then used to identify the peptides.

Targeted proteomics using selected reaction monitoring (SRM) and data-independent acquisition methods are often considered alternatives to shotgun proteomics within bottom-up proteomics. While shotgun proteomics uses data-dependent selection of precursor ions to generate fragment ion scans, these alternative methods acquire fragment ion scans deterministically.

History

Shotgun proteomics arose from the difficulties of using previous technologies to separate complex mixtures. In 1975, two-dimensional polyacrylamide gel electrophoresis (2D-PAGE), which can resolve complex protein mixtures, was described by O’Farrell and Klose.^[7]^[8] The development of matrix-assisted laser desorption ionization (MALDI), electrospray ionization (ESI), and database searching continued to grow the field of proteomics. However, these methods still had difficulty identifying and separating low-abundance proteins, aberrant proteins, and membrane proteins. Shotgun proteomics emerged as a method that could resolve even these proteins.^[5]

Advantages

Shotgun proteomics allows global protein identification as well as systematic profiling of dynamic proteomes.^[9] It also avoids the modest separation efficiency and poor mass spectral sensitivity associated with intact protein analysis.^[1]

Disadvantages

The dynamic exclusion filtering that is often used in shotgun proteomics maximizes the number of identified proteins at the expense of random sampling.^[10] This problem may be exacerbated by the undersampling inherent in shotgun proteomics.^[11]
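The trade-off described above can be illustrated with a toy simulation of data-dependent precursor selection. This is only a sketch under simplifying assumptions: the function name `select_precursors`, the parameters `top_n` and `exclusion_scans`, and the representation of each survey scan as a dictionary of m/z to intensity are all hypothetical, and real instruments match precursors within an m/z tolerance and exclude them for a time window rather than a fixed number of scans.

```python
def select_precursors(survey_scans, top_n=1, exclusion_scans=2):
    """Pick the top_n most intense eligible precursors from each survey scan.

    survey_scans: list of dicts mapping precursor m/z -> intensity (toy data).
    A selected precursor is skipped for the next `exclusion_scans` scans.
    Exact m/z equality is used here purely for simplicity.
    """
    excluded_until = {}  # m/z -> first scan index at which it is eligible again
    selections = []
    for scan_index, scan in enumerate(survey_scans):
        eligible = [mz for mz in scan if excluded_until.get(mz, 0) <= scan_index]
        chosen = sorted(eligible, key=lambda mz: scan[mz], reverse=True)[:top_n]
        for mz in chosen:
            excluded_until[mz] = scan_index + 1 + exclusion_scans
        selections.append(chosen)
    return selections


# Three identical survey scans containing three co-eluting precursors.
scans = [{500.3: 100.0, 600.4: 50.0, 700.5: 10.0}] * 3

with_exclusion = select_precursors(scans, top_n=1, exclusion_scans=2)
without_exclusion = select_precursors(scans, top_n=1, exclusion_scans=0)
print(with_exclusion)
print(without_exclusion)
```

With exclusion enabled, each of the three precursors is fragmented once; without it, the most intense precursor is fragmented in every scan. The first behaviour broadens coverage, but the set of sampled ions is no longer a random sample weighted by abundance.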

Figure: Agilent 1200 HPLC.

Figure: Quadrupole time-of-flight tandem mass spectrometer (Q-TOF).

Workflow

Cells containing the desired protein complement are grown. Proteins are then extracted and digested with a protease to produce a peptide mixture.^[9] The peptide mixture is then loaded directly onto a microcapillary column and the peptides are separated by hydrophobicity and charge. As the peptides elute from the column, they are ionized and separated by m/z in the first stage of tandem mass spectrometry. The selected ions undergo collision-induced dissociation or another process to induce fragmentation. The charged fragments are separated in the second stage of tandem mass spectrometry.
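The digestion step can be mimicked in software to predict which peptides a protein would yield. The sketch below assumes trypsin as the protease (the text above does not name one) and uses the commonly quoted rule that trypsin cleaves after lysine (K) or arginine (R) unless the next residue is proline (P). The function name `tryptic_digest` and the example sequence are hypothetical.

```python
def tryptic_digest(sequence, missed_cleavages=0):
    """Return peptides from an in silico tryptic digest of a protein sequence.

    Cleaves after K or R unless the following residue is P. Peptides that
    span up to `missed_cleavages` uncut sites are also returned.
    """
    fragments, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # C-terminal remainder

    peptides = []
    for i in range(len(fragments)):
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides


print(tryptic_digest("MKWVTFISLLKRPAGR"))
# ['MK', 'WVTFISLLK', 'RPAGR']
```

Note that the K at position 11 is cleaved while the following R is not, because that R is followed by P.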

The "fingerprint" of each peptide's fragmentation mass spectrum is used to identify the protein from which the peptide derives by searching against a sequence database with commercially available software (e.g. Sequest or Mascot).^[9] Examples of sequence databases are the Genpept database and the PIR database.^[12] After the database search, each peptide-spectrum match (PSM) needs to be evaluated for validity.^[13] This analysis allows researchers to profile various biological systems.^[9]
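The idea behind matching a fragmentation spectrum to a candidate peptide can be sketched as follows. This is not the scoring used by Sequest or Mascot; it is a deliberately simple shared-peak count, assuming singly charged b- and y-ions, unmodified residues, and standard monoisotopic residue masses. The names `fragment_mz`, `shared_peak_count`, and `best_match`, and the tolerance of 0.02 m/z, are illustrative choices.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def fragment_mz(peptide):
    """Theoretical m/z of singly charged b- and y-ions for a peptide."""
    masses = [MONO[residue] for residue in peptide]
    b_ions = [sum(masses[:i]) + PROTON for i in range(1, len(peptide))]
    y_ions = [sum(masses[i:]) + WATER + PROTON for i in range(1, len(peptide))]
    return b_ions + y_ions


def shared_peak_count(observed_mz, peptide, tol=0.02):
    """Number of theoretical fragments with an observed peak within tol."""
    return sum(
        any(abs(obs - theo) <= tol for obs in observed_mz)
        for theo in fragment_mz(peptide)
    )


def best_match(observed_mz, candidates, tol=0.02):
    """Candidate peptide that explains the most observed fragment peaks."""
    return max(candidates, key=lambda pep: shared_peak_count(observed_mz, pep, tol))


# Toy "observed" spectrum: the ideal fragments of PEPTIDE, with no noise.
observed = fragment_mz("PEPTIDE")
print(best_match(observed, ["PEPTIDA", "EDITPEP", "PEPTIDE"]))
```

A real search engine would also restrict candidates by precursor mass, account for peak intensities and multiple charge states, and attach a statistical score to each PSM, which is why the validity of each match still has to be evaluated afterwards.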

Challenges with peptide identification

Peptides that are degenerate (shared by two or more proteins in the database) make it difficult to unambiguously identify the protein to which they belong. Additionally, some proteome samples of vertebrates have a large number of paralogs, and alternative splicing in higher eukaryotes can result in many identical protein subsequences.^[1] Moreover, many proteins are modified, either naturally (co- or post-translationally) or artificially (sample preparation artefacts). This further challenges the identification of the peptide sequence by conventional database matching approaches. Together with peptide fragmentation spectra of poor quality or high complexity (due to co-isolation or sensitivity limitations), this leaves many sequencing spectra unidentified in a conventional shotgun proteomics experiment.^[14]^[15]^[16]^[17]
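The degenerate-peptide problem can be made concrete with a small lookup that separates peptides pointing to a single protein from those shared by several. The sketch assumes each database protein has already been digested into a list of peptides; the function names and the protein identifiers `ProtA`, `ProtA_paralog`, and `ProtB` are hypothetical.

```python
from collections import defaultdict


def peptide_to_proteins(digests):
    """Invert {protein: [peptides]} into {peptide: set of proteins}."""
    index = defaultdict(set)
    for protein, peptides in digests.items():
        for peptide in peptides:
            index[peptide].add(protein)
    return index


def split_unique_and_shared(identified, index):
    """Separate identified peptides into unique and degenerate ones."""
    unique, shared = {}, {}
    for peptide in identified:
        proteins = index.get(peptide, set())
        if len(proteins) == 1:
            unique[peptide] = next(iter(proteins))
        elif len(proteins) > 1:
            shared[peptide] = sorted(proteins)
    return unique, shared


# Toy database: a protein, its paralog, and an unrelated protein.
digests = {
    "ProtA": ["LSEPAK", "VTDGR", "AAFTK"],
    "ProtA_paralog": ["LSEPAK", "VTDGR", "AAYTK"],
    "ProtB": ["GGLWR"],
}
index = peptide_to_proteins(digests)
unique, shared = split_unique_and_shared(["LSEPAK", "AAFTK", "GGLWR"], index)
print(unique)
print(shared)
```

Here `AAFTK` and `GGLWR` each identify one protein, whereas `LSEPAK` is consistent with both `ProtA` and its paralog, so it cannot on its own decide which of them is present.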

Practical applications

With the human genome sequenced, the next step is the verification and functional annotation of all predicted genes and their protein products.^[4] Shotgun proteomics can be used for functional classification or comparative analysis of these protein products. It can be used in projects ranging from large-scale whole-proteome analyses to studies focused on a single protein family. It can be done in research labs or commercially.

Large-scale analysis

One example is a study by Washburn, Wolters, and Yates, who applied shotgun proteomics to the proteome of a Saccharomyces cerevisiae strain grown to mid-log phase. They detected and identified 1,484 proteins, including proteins rarely seen in proteome analysis, such as low-abundance transcription factors and protein kinases. They also identified 131 proteins with three or more predicted transmembrane domains.^[2]

Protein family

Vaisar et al. used shotgun proteomics to implicate protease inhibition and complement activation in the antiinflammatory properties of high-density lipoprotein.^[18] In a study by Lee et al., higher expression levels of hnRNP A2/B1 and Hsp90 were observed in human hepatoma HepG2 cells than in wild type cells. This led to a search for reported functional roles mediated in concert by both of these multifunctional cellular chaperones.^[19]

References

[1] Alves P, Arnold RJ, Novotny MV, Radivojac P, Reilly JP, Tang H (2007). "Advancement in protein inference from shotgun proteomics using peptide detectability". Pacific Symposium on Biocomputing: 409–20. PMID 17990506.
[2] Washburn MP, Wolters D, Yates JR (March 2001). "Large-scale analysis of the yeast proteome by multidimensional protein identification technology". Nature Biotechnology. 19 (3): 242–7. doi:10.1038/85686. PMID 11231557. S2CID 16796135.
[3] Wolters DA, Washburn MP, Yates JR (December 2001). "An automated multidimensional protein identification technology for shotgun proteomics". Analytical Chemistry. 73 (23): 5683–90. doi:10.1021/ac010617e. PMID 11774908.
[4] Hu L, Ye M, Jiang X, Feng S, Zou H (August 2007). "Advances in hyphenated analytical techniques for shotgun proteome and peptidome analysis--a review". Analytica Chimica Acta. 598 (2): 193–204. doi:10.1016/j.aca.2007.07.046. PMID 17719892.
[5] Fournier ML, Gilmore JM, Martin-Brown SA, Washburn MP (August 2007). "Multidimensional separations-based shotgun proteomics". Chemical Reviews. 107 (8): 3654–86. doi:10.1021/cr068279a. PMID 17649983.
[6] Nesvizhskii AI (2007). "Protein identification by tandem mass spectrometry and sequence database searching". Mass Spectrometry Data Analysis in Proteomics. Methods Mol. Biol. Vol. 367. pp. 87–119. doi:10.1385/1-59745-275-0:87. ISBN 978-1-59745-275-5. PMID 17185772.
[7] O'Farrell PH (May 1975). "High resolution two-dimensional electrophoresis of proteins". The Journal of Biological Chemistry. 250 (10): 4007–21. doi:10.1016/S0021-9258(19)41496-8. PMC 2874754. PMID 236308.
[8] Klose J (1975). "Protein mapping by combined isoelectric focusing and electrophoresis of mouse tissues. A novel approach to testing for induced point mutations in mammals". Humangenetik. 26 (3): 231–43. doi:10.1007/bf00281458. PMID 1093965. S2CID 30981877.
[9] Wu CC, MacCoss MJ (June 2002). "Shotgun proteomics: tools for the analysis of complex biological systems". Current Opinion in Molecular Therapeutics. 4 (3): 242–50. PMID 12139310.
[10] Zhang B, VerBerkmoes NC, Langston MA, Uberbacher E, Hettich RL, Samatova NF (November 2006). "Detecting differential and correlated protein expression in label-free shotgun proteomics". Journal of Proteome Research. 5 (11): 2909–18. doi:10.1021/pr0600273. PMID 17081042. S2CID 22254554.
[11] Tolmachev AV, Monroe ME, Purvine SO, Moore RJ, Jaitly N, Adkins JN, et al. (November 2008). "Characterization of strategies for obtaining confident identifications in bottom-up proteomics measurements using hybrid FTMS instruments". Analytical Chemistry. 80 (22): 8514–25. doi:10.1021/ac801376g. PMC 2692492. PMID 18855412.
[12] Eng JK, McCormack AL, Yates JR (November 1994). "An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database". Journal of the American Society for Mass Spectrometry. 5 (11): 976–89. CiteSeerX 10.1.1.377.3188. doi:10.1016/1044-0305(94)80016-2. PMID 24226387. S2CID 18413192.
[13] Cerqueira FR, Ferreira RS, Oliveira AP, Gomes AP, Ramos HJ, Graber A, Baumgartner C (1 January 2012). "MUMAL: multivariate analysis in shotgun proteomics using machine learning techniques". BMC Genomics. 13 (Suppl 5): S4. doi:10.1186/1471-2164-13-S5-S4. PMC 3477001. PMID 23095859.
[14] Griss J, Perez-Riverol Y, Lewis S, Tabb DL, Dianes JA, Del-Toro N, et al. (August 2016). "Recognizing millions of consistently unidentified spectra across hundreds of shotgun proteomics datasets". Nature Methods. 13 (8): 651–656. doi:10.1038/nmeth.3902. PMC 4968634. PMID 27493588.
[15] den Ridder M, Daran-Lapujade P, Pabst M (February 2020). "Shot-gun proteomics: why thousands of unidentified signals matter". FEMS Yeast Research. 20 (1). doi:10.1093/femsyr/foz088. PMID 31860055.
[16] Michalski A, Cox J, Mann M (April 2011). "More than 100,000 detectable peptide species elute in single shotgun proteomics runs but the majority is inaccessible to data-dependent LC-MS/MS". Journal of Proteome Research. 10 (4): 1785–93. doi:10.1021/pr101060v. PMID 21309581.
[17] Devabhaktuni A, Lin S, Zhang L, Swaminathan K, Gonzalez CG, Olsson N, et al. (April 2019). "TagGraph reveals vast protein modification landscapes from large tandem mass spectrometry datasets". Nature Biotechnology. 37 (4): 469–479. doi:10.1038/s41587-019-0067-5. PMC 6447449. PMID 30936560.
[18] Vaisar T, Pennathur S, Green PS, Gharib SA, Hoofnagle AN, Cheung MC, et al. (March 2007). "Shotgun proteomics implicates protease inhibition and complement activation in the antiinflammatory properties of HDL". The Journal of Clinical Investigation. 117 (3): 746–56. doi:10.1172/JCI26206. PMC 1804352. PMID 17332893.
[19] Lee CL, Hsiao HH, Lin CW, Wu SP, Huang SY, Wu CY, et al. (December 2003). "Strategic shotgun proteomics approach for efficient construction of an expression map of targeted protein families in hepatoma cell lines". Proteomics. 3 (12): 2472–86. doi:10.1002/pmic.200300586. PMID 14673797. S2CID 24518852.
