Discussion
Pathway or network analysis is often viewed as a one-size-fits-all approach that can be applied universally to any dataset. However, as our review demonstrates, network analysis encompasses a broad range of approaches with unique data requirements and diverse PKN sources. Any new project or program incorporating network analysis should carefully define the task at hand, explore the available prior information sources, and consider the integration and scalability challenges associated with each resource.
Networks and network-based methods are invaluable tools for the analysis of omics data. It is widely recognized that the choice of prior knowledge network (PKN) can influence the outcome of an analysis; selecting an appropriate PKN is therefore key to producing reliable results. With such an enormous suite of network resources available, selecting an appropriate model can become overwhelming. To address this challenge, we present a framework for classifying PKNs and network-based methods. This framework characterizes PKNs in terms of their scope, mechanistic detail, and ability to inform causal predictions. We also outline some common computational tasks to describe the aim of network-based analyses. To contextualize the framework, we sampled a handful of published network-based methods and discussed their PKN selection, the tasks they aim to accomplish, their approach to analysis, and their real-world applications. While this sampling is not exhaustive, it offers readers a practical glimpse into the application of the framework.
Looking ahead, we expect network analysis to gain even greater prominence and to shift towards more detailed approaches, for two reasons. First, rapid advances in multi-modal, spatial, and single-cell technologies have enabled the measurement of subcellular protein localization changes, post-translational modifications (PTMs), and molecular complexes at a single-cell scale using imaging modalities45. This wealth of information primarily resides in level 4 networks and, to a lesser extent, in level 3 networks. Effectively harnessing these rich datasets will require more detailed PKNs. Second, recent breakthroughs in large language models46 have significantly enhanced our ability to extract knowledge from the literature. Combining this capability with crowd-sourcing47 and human-in-the-loop systems48 holds the potential to reduce curation costs by two orders of magnitude47, enabling near-complete curation of the entire biomedical literature on biological molecular processes. The increased completeness of PKNs, along with improved and larger datasets, will unlock extensive application areas for increasingly sophisticated network models.