Acknowledgements
J.A.V., K.J., D.L.T. and M.W. have been funded by the EU Horizon 2020
grant ‘EPIC-XS’ [grant number 823839]. D.L.T. was also supported by
the EU Horizon 2020 research and innovation programme under [grant
number 829157]. J.A.V. was also supported by EMBL core funding. The
authors would also like to thank Deepti J. Kundu for providing
statistics on public TD proteomics datasets in PRIDE, and the whole
PRIDE team for their ongoing support in keeping data FAIR.