Computational Tasks
Networks can be combined with -omics data to achieve a wide range of computational tasks. Below we define some broad categories that describe these tasks. The categories are not mutually exclusive, as many computational methods can perform multiple tasks or hybrids of them. For example, methods that "upscale networks", meaning they output a higher-level network from a lower-level PKN, typically perform both network inference and explanation extraction: they select a small subset of the input PKN that can explain the correlations in the data and then modify it to infer a new, higher-level network. It is also common to use explanation extraction or network inference as a precursor to phenotype prediction, especially in clinical applications.
Explanation extraction aims to interpret patterns found within an omics profile and contextualize them using prior information about the system. It addresses hypotheses around system changes, such as differential expression or altered interaction strengths, to elucidate the mechanisms involved21. Common examples of explanation extraction tasks include enrichment analysis and algorithms that produce a relevant subgraph of a larger network. Explanation extraction can also be thought of as emulating the literature search of a molecular biologist to explain the data at hand. As molecular biologists read the literature, they ask: "Is this information fragment compatible with my data? Does it explain it or contradict it? Is this applicable to my experiment's context?". Explanation extraction methods interrogate the same questions, but in a quantitative manner that scales to high-throughput data. Explanation extraction tools generate valuable conjectures that can, for example, guide the selection of subsequent perturbing agents, or recognize parallel mechanisms that unify multiple datasets in a novel way12,19.
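As a concrete illustration of the enrichment-analysis flavor of explanation extraction, the sketch below computes a one-sided hypergeometric p-value for the overlap between a differentially expressed gene list and a pathway gene set. The function name and example numbers are illustrative, not drawn from any specific tool.

```python
from math import comb

def enrichment_p_value(n_hits, n_drawn, n_category, n_background):
    """One-sided hypergeometric p-value: the probability of observing at
    least n_hits genes from a category of size n_category when drawing
    n_drawn genes from a background of n_background genes."""
    total = comb(n_background, n_drawn)
    p = 0.0
    for k in range(n_hits, min(n_drawn, n_category) + 1):
        p += comb(n_category, k) * comb(n_background - n_category, n_drawn - k) / total
    return p

# Illustrative numbers: 8 of 50 differentially expressed genes fall in a
# 100-gene pathway, out of a 10,000-gene background.
p = enrichment_p_value(8, 50, 100, 10_000)
```

A small p-value here suggests the pathway is over-represented in the gene list, which is the kind of quantitative "is this compatible with my data?" judgment described above.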
Network inference tasks produce a network model based on the input -omics data. This can be achieved by integrating prior networks or can be done de novo. Due to the combinatorial complexity of the model space and the inherent stochasticity of biological systems, inference is always an underdetermined problem, and the coherence between inferred networks and actual biological reality may be low, independent of the performance of the model. Constraining inference to at least partially conform with known biology can help by "anchoring" inferred networks. Another option is to use a large number of biological models in an ensemble learning strategy to reduce bias.
Some network inference approaches construct an entirely new model while others expand on established networks; in either case, the goal is to generate new mechanistic hypotheses. Upscaling algorithms are a common example of network inference. These approaches infer a higher-level representation (e.g. Activity Flow) from a lower-level prior network (e.g. protein-protein interactions) using -omics profiles. Upscaling can also be used to assign weights, directions, signs and rate constants to the edges of a graph.
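A toy version of upscaling, under the simplifying assumption that agreement in differential-expression sign can stand in for regulatory sign, converts undirected protein-protein interaction edges into signed activity-flow-like edges. Both the heuristic and the names below are illustrative, not a real upscaling algorithm.

```python
def upscale_to_activity_flow(ppi_edges, logfc):
    """Assign a sign to each undirected PPI edge: +1 (activation-like) if
    the two proteins change in the same direction in the -omics profile,
    -1 (inhibition-like) otherwise."""
    signed = {}
    for a, b in ppi_edges:
        signed[(a, b)] = 1 if logfc[a] * logfc[b] > 0 else -1
    return signed

# Illustrative log fold-changes for four proteins.
logfc = {"EGFR": 1.2, "ERK": 0.8, "TP53": -0.5, "MDM2": 0.9}
signed = upscale_to_activity_flow([("EGFR", "ERK"), ("TP53", "MDM2")], logfc)
```

The output graph lives at a higher level of abstraction than the input PPI: its edges now carry a hypothesized regulatory sign rather than mere physical association.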
Phenotype prediction aims to predict how an organism or system responds to disease states and perturbations. These methods may be applied at the cellular level to project signaling events and transformations as well as broad phenomena like cell proliferation and survival, but they can also be extended to a network medicine approach, where predictions are made at the patient level to inform diagnosis, prognosis, or treatment response22,23.
Effective phenotype prediction is arguably more difficult than the prior two tasks. Phenotype is a function of the whole system, which often contains feedback loops and other non-linear response circuitry. Prediction is also inherently multimodal: at minimum, it requires one omics measurement modality and one phenotype measurement modality, e.g. IC50, GR50 or disease-free survival. Each of these factors can confound phenotype prediction tools.