Acknowledgements
J.A.V., K.J., D.L.T. and M.W. have been funded by the EU Horizon 2020 grant ‘EPIC-XS’ [grant number 823839]. D.L.T. was also supported by the EU Horizon 2020 research and innovation programme [grant number 829157]. J.A.V. was also supported by EMBL core funding. The authors would also like to thank Deepti J. Kundu for providing the statistics on public TD proteomics datasets in PRIDE, and the whole PRIDE team for their ongoing support in keeping data FAIR.
Bibliography
[1] Smith, L.M., Kelleher, N.L., Consortium for Top Down Proteomics, Proteoform: a single term describing protein complexity. Nat. Methods 2013, 10, 186–187.
[2] Perez-Riverol, Y., Bai, J., Bandla, C., García-Seisdedos, D., et al., The PRIDE database resources in 2022: a hub for mass spectrometry-based proteomics evidences. Nucleic Acids Res. 2022, 50, D543–D552.
[3] Deutsch, E.W., Bandeira, N., Perez-Riverol, Y., Sharma, V., et al., The ProteomeXchange consortium at 10 years: 2023 update. Nucleic Acids Res. 2023, 51, D1539–D1548.
[4] Wilkinson, M.D., Dumontier, M., Aalbersberg, I.J.J., Appleton, G., et al., The FAIR Guiding Principles for scientific data management and stewardship. Sci. Data 2016, 3, 160018.
[5] Neely, B.A., Dorfer, V., Martens, L., Bludau, I., et al., Toward an integrated machine learning model of a proteomics experiment. J. Proteome Res. 2023, 22, 681–696.
[6] Deutsch, E.W., Orchard, S., Binz, P.-A., Bittremieux, W., et al., Proteomics standards initiative: fifteen years of progress and future work. J. Proteome Res. 2017, 16, 4288–4298.
[7] Martens, L., Chambers, M., Sturm, M., Kessner, D., et al., mzML–a community standard for mass spectrometry data. Mol. Cell. Proteomics 2011, 10, R110.000133.
[8] Hoffmann, N., Rein, J., Sachsenberg, T., Hartler, J., et al., mzTab-M: A Data Standard for Sharing Quantitative Results in Mass Spectrometry Metabolomics. Anal. Chem. 2019, 91, 3302–3310.
[9] Griss, J., Jones, A.R., Sachsenberg, T., Walzer, M., et al., The mzTab data exchange format: communicating mass-spectrometry-based proteomics and metabolomics experimental results to a wider audience. Mol. Cell. Proteomics 2014, 13, 2765–2775.
[10] Perez-Riverol, Y., Moreno, P., Scalable data analysis in proteomics and metabolomics using biocontainers and workflows engines. Proteomics 2020, 20, e1900147.
[11] Di Tommaso, P., Chatzou, M., Floden, E.W., Barja, P.P., et al., Nextflow enables reproducible computational workflows. Nat. Biotechnol. 2017, 35, 316–319.
[12] Hulstaert, N., Shofstahl, J., Sachsenberg, T., Walzer, M., et al., ThermoRawFileParser: Modular, Scalable, and Cross-Platform RAW File Conversion. J. Proteome Res. 2020, 19, 537–542.
[13] Jeong, K., Kim, J., Gaikwad, M., Hidayah, S.N., et al., FLASHDeconv: Ultrafast, High-Quality Feature Deconvolution for Top-Down Proteomics. Cell Syst. 2020, 10, 213–218.e6.
[14] Kou, Q., Xun, L., Liu, X., TopPIC: a software tool for top-down mass spectrometry-based proteoform identification and characterization. Bioinformatics 2016, 32, 3495–3497.
[15] Panel v0.14.4 n.d.
[16] Levitsky, L.I., Klein, J.A., Ivanov, M.V., Gorshkov, M.V., Pyteomics 4.0: five years of development of a python proteomics framework. J. Proteome Res. 2019, 18, 709–714.
[17] Kurtzer, G.M., Sochat, V., Bauer, M.W., Singularity: Scientific containers for mobility of compute. PLoS ONE 2017, 12, e0177459.
[18] Jeong, K., Babović, M., Gorshkov, V., Kim, J., et al., FLASHIda enables intelligent data acquisition for top-down proteomics to boost proteoform identification counts. Nat. Commun. 2022, 13, 4407.
[19] Jeong, K., Kaulich, P.T., Jung, W., Kim, J., et al., Precursor deconvolution error estimation: the missing puzzle piece in false discovery rate in top-down proteomics. Authorea, Inc. 2023.
[20] Tabb, D., Jeong, K., Druart, K., Gant, M., et al., Comparing Top-Down Proteoform Identification: Deconvolution, PrSM Overlap, and PTM Detection 2022.
[21] Kou, Q., Wu, S., Tolic, N., Paša-Tolic, L., et al., A mass graph-based approach for the identification of modified proteoforms using top-down tandem mass spectra. Bioinformatics 2017, 33, 1309–1316.
[22] Toby, T.K., Fornelli, L., Srzentić, K., DeHart, C.J., et al., A comprehensive pipeline for translational top-down proteomics from a single blood draw. Nat. Protoc. 2019, 14, 119–152.
[23] Choi, I.K., Abeysinghe, E., Coulter, E., Marru, S., et al., TopPIC Gateway: A Web Gateway for Top-Down Mass Spectrometry Data Interpretation. PEARC20 (2020) 2020, 2020, 461–464.
[24] Park, J., Piehowski, P.D., Wilkins, C., Zhou, M., et al., Informed-Proteomics: open-source software package for top-down proteomics. Nat. Methods 2017, 14, 909–914.
[25] LeDuc, R.D., Deutsch, E.W., Binz, P.-A., Fellers, R.T., et al., Proteomics standards initiative’s ProForma 2.0: unifying the encoding of proteoforms and peptidoforms. J. Proteome Res. 2022, 21, 1189–1195.
[26] UniProt Consortium, UniProt: the universal protein knowledgebase in 2023. Nucleic Acids Res. 2023, 51, D523–D531.
[27] Hollas, M.A.R., Robey, M.T., Fellers, R.T., LeDuc, R.D., et al., The Human Proteoform Atlas: a FAIR community resource for experimentally derived proteoforms. Nucleic Acids Res. 2022, 50, D526–D533.
[28] Smith, L.M., Agar, J.N., Chamot-Rooke, J., Danis, P.O., et al., The Human Proteoform Project: Defining the human proteome. Sci. Adv. 2021, 7, eabk0734.