Discussion
Pathway or network analysis is often viewed as a one-size-fits-all
approach that can be applied universally to any dataset. However, as our
review demonstrates, network analysis encompasses a broad range of
approaches with unique data requirements and diverse PKN sources. Any
new project or program incorporating network analysis should carefully
define the task at hand, explore the available prior information
sources, and consider the integration and scalability challenges
associated with each resource.
Networks and network-based methods are invaluable tools for the analysis
of omics data. It is widely recognized that the choice of prior
knowledge network (PKN) can influence the outcome of an analysis;
selecting an appropriate PKN is therefore key to producing reliable
results. With such an enormous suite of network resources available, it
can be overwhelming to select an appropriate model. To address this challenge,
we present a framework for classifying PKNs and network-based methods.
This framework characterizes PKNs in terms of their scope, mechanistic
detail, and ability to inform causal predictions. We also outline some
common computational tasks to describe the aim of network-based
analyses. To contextualize the framework, we sampled a handful of
published network-based methods and discussed their PKN selection, the
tasks they aim to accomplish, their approach to analysis and their
real-world applications. While this sampling is not exhaustive, it
offers readers a practical glimpse into the application of the
framework.
Looking ahead, we expect network analysis to gain even greater
prominence and to shift towards more detailed approaches, for two reasons.
First, rapid advances in multi-modal, spatial, and single-cell
technologies have enabled the measurement of subcellular protein
localization changes, post-translational modifications (PTMs), and
molecular complexes at single-cell scale using imaging
modalities [45]. This wealth of information is primarily represented
in level 4 networks and, to a lesser extent, in level 3
networks. Harnessing these rich datasets effectively will require
more detailed PKNs. Second, recent breakthroughs in
large language models [46] have significantly enhanced
our ability to extract knowledge from the literature. Combining this
capability with crowd-sourcing [47] and human-in-the-loop
systems [48] holds the potential to reduce curation costs
by two orders of magnitude [47], enabling near-complete
curation of the entire biomedical literature on biological molecular
processes. The increased completeness of PKNs, along with improved and
larger datasets, will unlock extensive application areas for
increasingly sophisticated network models.