Network Reference

API Documentation for the network submodule

Centrality

Submodule containing functions for finding centrality of nodes in a network

metworkpy.network.centrality.betweenness_centrality_bipartite_subset(G: Graph | DiGraph, node_partition: Iterable[Hashable], targets: Iterable[Hashable] | None = None, normalized=True, weight=None)

Compute betweenness centrality for a subset of nodes on a bipartite network, where the node subset comes from one of the partitions and nodes in the other partition are treated as edges.

\[c_B(v) =\sum_{s,t \in T} \frac{\sigma(s, t|v)}{\sigma(s, t)}\]

where $T$ is the set of targets, $\sigma(s, t)$ is the number of shortest $(s, t)$-paths, and $\sigma(s, t|v)$ is the number of those paths passing through some node $v$ other than $s, t$. If $s = t$, $\sigma(s, t) = 1$, and if $v \in \{s, t\}$, $\sigma(s, t|v) = 0$ [2].

The betweenness can also be further normalized to the number of possible pairs of s and t.

Parameters:

G (graph) – A NetworkX graph, should be a bipartite graph (this condition is not checked).
node_partition (Iterable[Hashable]) – One of the two sets of nodes in the bipartite graph, specifically the set which contains all the targets
targets (list of nodes, optional) – Nodes to use as sources/targets for shortest paths in betweenness, all of these should fall into a single partition of the bipartite graph (this condition is not checked). If None, uses all nodes in the node_partition
normalized (bool, optional) – If True the betweenness values are normalized by $2/((n-1)(n-2))$ for graphs, and $1/((n-1)(n-2))$ for directed graphs where $n$ is the number of nodes in targets.

Returns:

nodes – Dictionary of nodes with betweenness centrality as the value. This includes betweenness values for all the nodes in the Graph (in both sets of the partition).

Return type:

dictionary

Notes

The basic algorithm is from [1].

The total number of paths between source and target is counted differently for directed and undirected graphs. Directed paths are easy to count. Undirected paths are tricky: should a path from “u” to “v” count as 1 undirected path or as 2 directed paths?

For betweenness_centrality we report the number of undirected paths when G is undirected.

For betweenness_centrality_subset the reporting is different. If the source and target subsets are the same, then we want to count undirected paths. But if the source and target subsets differ – for example, if sources is {0} and targets is {1}, then we are only counting the paths in one direction. They are undirected paths but we are counting them in a directed way. To count them as undirected paths, each should count as half a path.

References

metworkpy.network.centrality.betweenness_centrality_subset(G: Graph | DiGraph, targets: Iterable[Hashable] | None = None, normalized=True, weight=None)

Compute betweenness centrality for a subset of nodes.

\[c_B(v) =\sum_{s,t \in T} \frac{\sigma(s, t|v)}{\sigma(s, t)}\]

where $T$ is the set of targets, $\sigma(s, t)$ is the number of shortest $(s, t)$-paths, and $\sigma(s, t|v)$ is the number of those paths passing through some node $v$ other than $s, t$. If $s = t$, $\sigma(s, t) = 1$, and if $v \in \{s, t\}$, $\sigma(s, t|v) = 0$ [2].

The normalization is slightly different from NetworkX, as it normalizes only to the possible (s,t) pairs in targets, rather than to all possible (s,t) pairs in the network.

Parameters:

G (graph) – A NetworkX graph.
targets (list of nodes) – Nodes to use as sources/targets for shortest paths in betweenness
normalized (bool, optional) – If True the betweenness values are normalized by $2/((n-1)(n-2))$ for graphs, and $1/((n-1)(n-2))$ for directed graphs where $n$ is the number of nodes in targets.
weight (None or string, optional (default=None)) – If None, all edge weights are considered equal. Otherwise holds the name of the edge attribute used as weight. Weights are used to calculate weighted shortest paths, so they are interpreted as distances.

Returns:

nodes – Dictionary of nodes with betweenness centrality as the value.

Return type:

dictionary

Notes

The basic algorithm is from [1].

For weighted graphs the edge weights must be greater than zero. Zero edge weights can produce an infinite number of equal length paths between pairs of nodes.

The normalization might seem a little strange but it is designed to make betweenness_centrality(G) be the same as betweenness_centrality_subset(G,sources=G.nodes(),targets=G.nodes()).

The total number of paths between source and target is counted differently for directed and undirected graphs. Directed paths are easy to count. Undirected paths are tricky: should a path from “u” to “v” count as 1 undirected path or as 2 directed paths?
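As a rough illustration of the sum defined above, the following pure-Python sketch counts shortest paths with breadth-first search on a small unweighted, undirected graph. It is not the metworkpy implementation: the helper names (shortest_path_counts, subset_betweenness) and the adjacency-dict input are assumptions made for this example, each unordered pair of targets is counted once, and no normalization is applied.

```python
from collections import deque
from itertools import combinations


def shortest_path_counts(adj, source):
    """BFS from source: distance and number of shortest paths to each reachable node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:  # first time w is reached
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:  # u lies on a shortest path to w
                sigma[w] += sigma[u]
    return dist, sigma


def subset_betweenness(adj, targets):
    """Unnormalized subset betweenness (hypothetical helper, undirected graphs only)."""
    info = {node: shortest_path_counts(adj, node) for node in adj}
    scores = {node: 0.0 for node in adj}
    for s, t in combinations(targets, 2):
        dist_s, sigma_s = info[s]
        dist_t, sigma_t = info[t]
        if t not in dist_s:  # s and t are disconnected
            continue
        for v in adj:
            if v in (s, t) or v not in dist_s or v not in dist_t:
                continue
            # v is on a shortest (s, t)-path iff the distances add up
            if dist_s[v] + dist_t[v] == dist_s[t]:
                scores[v] += sigma_s[v] * sigma_t[v] / sigma_s[t]
    return scores


# Diamond a-{b,c}-d with a tail d-e; two shortest paths run from a to e
graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
scores = subset_betweenness(graph, ["a", "e"])
# b and c each carry half of the paths (0.5), d carries all of them (1.0)
```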

References

Compute closeness centrality for nodes, considering only paths to a subset of other nodes.

Subset closeness centrality, based on closeness centrality [1], of a node u is the reciprocal of the average shortest path distance to u over all n-1 reachable nodes which are in targets

\[C(u) = \frac{n - 1}{\sum_{v=1}^{n-1} d(v, u)},\]

where d(v, u) is the shortest-path distance between v and u, where v is in targets, and n-1 is the number of targets reachable from u. Notice that the closeness distance function computes the incoming distance to u for directed graphs. To use outward distance, act on G.reverse().

Notice that higher values of closeness indicate higher centrality.

Wasserman and Faust propose an improved formula for graphs with more than one connected component. The result is “a ratio of the fraction of actors in the group who are reachable, to the average distance” from the reachable actors [2]. You might think this scale factor is inverted but it is not. As is, nodes from small components receive a smaller closeness value. Letting N denote the number of nodes in the graph,

\[C_{WF}(u) = \frac{n-1}{N-1} \frac{n - 1}{\sum_{v=1}^{n-1} d(v, u)},\]

Parameters:

G (graph) – A NetworkX graph
targets (list of nodes, optional) – The nodes to use as targets for the shortest paths in closeness
u (node, optional) – Return only the value for node u
distance (edge attribute key, optional (default=None)) – Use the specified edge attribute as the edge distance in shortest path calculations. If None (the default) all edges have a distance of 1. Absent edge attributes are assigned a distance of 1. Note that no check is performed to ensure that edges have the provided attribute.
wf_improved (bool, optional (default=True)) – If True, scale by the fraction of nodes reachable. This gives the Wasserman and Faust improved formula. For single component graphs it is the same as the original formula.

Returns:

nodes – Dictionary of nodes with closeness centrality as the value.

Return type:

dictionary

Notes

This function is the closeness_centrality function from NetworkX modified to only compute distances to a subset of the nodes in the graph. NetworkX is licensed under a BSD-3-Clause license.

The closeness centrality is normalized to (n-1)/(|T|-1) where n is the number of targets in the connected part of the graph containing the node, and |T| is the total number of targets. If the graph is not completely connected, this algorithm computes the closeness centrality for each connected part separately, scaled by the number of targets in that part.

If the ‘distance’ keyword is set to an edge attribute key then the shortest-path length will be computed using Dijkstra’s algorithm with that edge attribute as the edge weight.

The closeness centrality uses inward distance to a node, not outward. If you want to use outward distances, apply the function to G.reverse().
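The unscaled formula C(u) can be sketched in a few lines of pure Python. This is only an illustration under stated assumptions: the graph is unweighted and undirected (so inward and outward distances coincide), the helper names are hypothetical, the count in the numerator excludes u itself, and the Wasserman and Faust scaling is left out.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def subset_closeness(adj, targets, u):
    """Reachable targets (other than u) divided by the sum of their distances to u."""
    dist = bfs_distances(adj, u)
    reachable = [t for t in targets if t != u and t in dist]
    total = sum(dist[t] for t in reachable)
    return len(reachable) / total if total > 0 else 0.0


# Path graph a - b - c - d with targets {a, d}
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
closeness_b = subset_closeness(path, {"a", "d"}, "b")  # distances 1 and 2, so 2/3
```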

References

Components

Functions for analyzing the variable components of the optimal growth solutions

metworkpy.network.components.find_variable_components(model: Model, network: Graph | DiGraph | None = None, tolerance: float = 1e-07, directed: bool = False, strongly_connected: bool = False, **kwargs) → list[set[Hashable]]

Identify the variable components in the metabolic network, that is, the components of the network which can vary at the optimum solution

Parameters:

model (cobra.Model) – Model to find the variable components in
network (nx.Graph or nx.DiGraph, optional) – A metabolic network graph constructed from model, used to find the connected components after removing reactions which can’t vary under the optimal solution
tolerance (float, default=1e-7) – The tolerance, reactions which have minimum and maximum fluxes less than this value will be considered constant
directed (bool, default=False) – If network is not passed, this decides if the constructed network is directed or not
strongly_connected (bool, default=False) – Whether to find the strongly connected components of the graph (only used if the provided network is directed)
kwargs – Keyword arguments are passed to cobra.flux_analysis.flux_variability_analysis

Returns:

List of sets of nodes in the metabolic network, each set represents a variable component of the model at optimum

Return type:

list of set of nodes

Notes

Uses the cobra Model to find the reactions which are constant across optimal solutions, and then identifies the connected groups of variable reactions and associated metabolites
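The two steps in this note (drop the constant reactions, then take connected components) can be sketched without cobra or NetworkX. The function name, the (minimum, maximum) tuple input, and the adjacency-dict graph are all assumptions for illustration. In particular, a reaction is treated here as constant when the width of its flux range is below tolerance, which is one reading of the tolerance parameter and may not match the library's exact test.

```python
def variable_components(fva_ranges, adjacency, tolerance=1e-7):
    """Connected components left after removing constant reactions (hypothetical helper)."""
    # Assumption: constant means the FVA range (max - min) is narrower than tolerance
    constant = {
        rxn for rxn, (low, high) in fva_ranges.items() if high - low < tolerance
    }
    remaining = set(adjacency) - constant
    components, seen = [], set()
    for start in remaining:
        if start in seen:
            continue
        # Depth-first search restricted to the remaining nodes
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(n for n in adjacency[node] if n in remaining)
        seen |= component
        components.append(component)
    return components


# Bipartite reaction/metabolite chain: R1 - m1 - R2 - m2 - R3 - m3 - R4
adjacency = {
    "R1": ["m1"], "m1": ["R1", "R2"], "R2": ["m1", "m2"], "m2": ["R2", "R3"],
    "R3": ["m2", "m3"], "m3": ["R3", "R4"], "R4": ["m3"],
}
# R2 carries a fixed flux at the optimum, so it splits the chain in two
fva_ranges = {"R1": (0.0, 5.0), "R2": (2.0, 2.0), "R3": (0.0, 1.0), "R4": (-1.0, 1.0)}
components = variable_components(fva_ranges, adjacency)
```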

Density

Module for finding the density of targets on a graph.

metworkpy.network.density.find_dense_clusters(network: Graph | DiGraph, targets: list[Hashable] | dict[Hashable, float | int] | Series, radius: int = 3, top_quantile_cutoff: float = 0.2, target_type: Literal['genes', 'nodes'] = 'nodes', **kwargs) → DataFrame

Find the clusters within a network with high target density

Parameters:

network (nx.Graph | nx.DiGraph) – Network to find clusters from
targets (list | dict | pd.Series) – Targets to find density of. Can be a list of nodes or genes, in which case all targets will have equal weight, or a dict or Series keyed by nodes/genes in the network which can specify a target weight. If a dict or Series, values should be ints or floats.
radius (int) – Radius to use for finding density. Specifies how far out from a given node targets are counted towards density. A radius of 0 only counts the single node, and so will just return the targets values back unchanged. Default value of 3.
top_quantile_cutoff (float) – Quantile cutoff for defining high density; the nodes within the top 100*top_quantile_cutoff% of target density are considered high density. So a top_quantile_cutoff of 0.2 means that the top 20% of most dense nodes will be defined as high density. Must be between 0 and 1.
target_type ({'genes', 'nodes'}, default='nodes') – The type of targets, with ‘genes’ indicating the targets are genes (which will require that a COBRApy model is provided as a kwarg, i.e. model=model), and so gene target density will be used. If ‘nodes’, then the targets should be nodes in the network.
kwargs – Passed to node_target_density, or gene_target_density functions depending on target_type

Returns:

A dataframe indexed by node id, with columns for density and cluster. The clusters are assigned integers starting from 0 to differentiate them. The clusters are not ordered, and so multiple calls to this method can result in different labels for the clusters.

Return type:

pd.DataFrame

Notes

This method finds the target density of the metabolic graph, and then identifies nodes with a high target density in their neighborhoods. Nodes without a high target density are dropped from the graph, and the connected components of the remaining graph are used as the high density clusters.
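A minimal sketch of the clustering step, assuming the per-node densities have already been computed. The function name and inputs are hypothetical, and keeping the ceil(cutoff * n) densest nodes is a simplification of a quantile threshold; the library may select the cutoff differently.

```python
import math


def dense_clusters(adj, density, top_quantile_cutoff):
    """Group the densest nodes into connected clusters (hypothetical helper)."""
    # Simplification: keep the ceil(cutoff * n) nodes with the highest density
    n_keep = math.ceil(top_quantile_cutoff * len(density))
    dense = set(sorted(density, key=density.get, reverse=True)[:n_keep])
    clusters, seen = [], set()
    for start in dense:
        if start in seen:
            continue
        # Connected component of start within the dense subgraph
        cluster, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in cluster:
                continue
            cluster.add(node)
            stack.extend(n for n in adj[node] if n in dense)
        seen |= cluster
        clusters.append(cluster)
    return clusters


# Path graph a - b - ... - h with high density at both ends
nodes = "abcdefgh"
adj = {n: [m for m in (nodes[i - 1:i] + nodes[i + 1:i + 2])] for i, n in enumerate(nodes)}
density = {"a": 0.9, "b": 0.8, "c": 0.1, "d": 0.2, "e": 0.1, "f": 0.3, "g": 0.7, "h": 0.95}
clusters = dense_clusters(adj, density, top_quantile_cutoff=0.5)
# Two clusters: {a, b} and {g, h}; their order in the list is arbitrary
```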

metworkpy.network.density.gene_target_density(metabolic_network: Graph | DiGraph, metabolic_model: Model, gene_targets: Series | list | dict, nodes: Iterable[Hashable] | None = None, radius: int = 3, essential: bool = False, processes: int | None = None) → Series

Determine the density of gene targets in the neighborhood of nodes within a metabolic network

Parameters:

metabolic_network (nx.Graph or nx.DiGraph) – Metabolic network in the form of a reaction network, can be directed or undirected, but directed graphs will be converted to undirected.
metabolic_model (cobra.Model) – Metabolic model from which the metabolic network was constructed
gene_targets (pd.Series or list or dict) – Targets/counts of targets for genes associated with reactions in the metabolic network. If a list each value should be a gene id, and will have equal weight. If a dict, should be keyed by gene id, with values corresponding to weight. If a pd.Series, should be indexed by gene id, with values corresponding to weight.
nodes (iterable of hashable, optional) – Subset of nodes to find the density for, if not provided defaults to all of the nodes in the network
radius (int, default=3) – The radius to use for finding density, specifies how far out from a given node targets are counted towards density. A radius of 0 only counts the genes associated with the single node.
essential (bool) – Whether a gene must be essential for at least 1 reaction in a neighborhood to be counted as in that neighborhood. If False, all genes associated with reactions within the radius are counted as in the neighborhood. If True, only genes which are required for at least 1 reaction within the radius are counted as in the neighborhood.
processes (int, optional) – Number of processes to use

Returns:

target_density – Pandas series with index corresponding to reactions in the network, and values corresponding to the density of gene targets in the neighborhood of that reaction node

Return type:

pd.Series

metworkpy.network.density.gene_target_enrichment(metabolic_network: Graph | DiGraph, metabolic_model: Model, gene_targets: set[str] | list[str], nodes: Iterable[Hashable] | None = None, metric: Literal['odds-ratio', 'p-value'] = 'p-value', alternative: Literal['two-sided', 'less', 'greater'] = 'greater', radius: int = 3, essential: bool = False, processes: int | None = None) → Series

Determine the enrichment of gene targets in the neighborhood of a reaction within a metabolic network

Parameters:

metabolic_network (nx.Graph or nx.DiGraph) – Metabolic network in the form of a reaction network, can be directed or undirected, but directed graphs will be converted to undirected.
metabolic_model (cobra.Model) – Metabolic model from which the metabolic network was constructed
gene_targets (list or set of str) – Targeted genes associated with reactions in the metabolic network. Result will be the enrichment in these targeted genes in a neighborhood of each reaction in the network
nodes (iterable of hashable, optional) – Subset of nodes to find the enrichment for, if not provided defaults to all of the nodes in the network
metric ("odds-ratio" or "p-value", default="p-value") – The enrichment metric to return in the Series, either the odds-ratio or the p-value (default) of the Fisher’s exact test used to evaluate enrichment
alternative ("two-sided", "less", or "greater") – The alternative hypothesis for the Fisher’s exact test used to evaluate the enrichment
radius (int, default=3) – The radius to use for defining a neighborhood around the reaction for finding enrichment, specifies how far out from a given node targets are counted towards enrichment. A radius of 0 only counts the genes associated with the single node.
essential (bool) – Whether a gene must be essential for at least 1 reaction in a neighborhood to be counted as in that neighborhood. If False, all genes associated with reactions within the radius are counted as in the neighborhood. If True, only genes which are required for at least 1 reaction within the radius are counted as in the neighborhood.
processes (int, optional) – Number of processes to use

Returns:

target_enrichment – Pandas series with index corresponding to reactions in the network, and values corresponding to either the odds-ratio or the enrichment p-value (depending on the value of metric)

Return type:

pd.Series

Find the target density for different nodes in the graph. See note for details.

Parameters:

network (nx.DiGraph | nx.Graph) – Networkx network (directed or undirected) to find the target density of. Directed graphs are converted to undirected, and edge weights are currently ignored.
targets (list | dict | pd.Series) – Targets to find density of. Can be a list of nodes in the network, in which case all targeted nodes will be treated equally, or a dict or Series keyed by nodes in the network which can specify a target weight (such as multiple targets for a single node). If a dict or Series, values should be ints or floats.
nodes (iterable of hashable, optional) – Subset of nodes to find the density for, if not provided defaults to all of the nodes in the network
radius (int) – Radius to use for finding density. Specifies how far out from a given node targets are counted towards density. A radius of 0 only counts the single node, and so will just return the targets values back unchanged. Default value of 3.
node_filter (Callable of node id to bool, or set of node id, optional) – Filter nodes in the network to consider when calculating density. If a Callable, should take node ids as the only argument and return a bool, if True the node will be considered in the density, if False it will not be. If a set, only nodes in the set will be considered when calculating density. Note that the density is still calculated for all nodes, but nodes that are not in the filter won’t count towards the size of the neighborhoods, and won’t be checked for being in the target set.
processes (int, optional) – Number of processes to use for finding the density

Returns:

The target density for the nodes in the network

Return type:

pd.Series

Notes

For each node in a network, neighboring nodes up to a distance of radius away are checked for targets. The total number of targets, or the sum of the targets found (in the case of dict or Series input) divided by the number of nodes within that radius is the density for a particular node.
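The calculation described in this note can be sketched as follows. This is a pure-Python illustration with hypothetical helper names; it assumes an unweighted, undirected adjacency dict and a dict of target weights, and it ignores the node_filter and processes options.

```python
from collections import deque


def neighborhood(adj, node, radius):
    """All nodes within `radius` hops of node, including node itself."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        u = queue.popleft()
        if dist[u] == radius:  # do not expand past the radius
            continue
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return set(dist)


def target_density(adj, targets, radius):
    """Sum of target weights in each neighborhood divided by the neighborhood size."""
    density = {}
    for node in adj:
        nbhd = neighborhood(adj, node, radius)
        density[node] = sum(targets.get(m, 0) for m in nbhd) / len(nbhd)
    return density


# Path graph a - b - c - d - e, with one target on a and two on c
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
targets = {"a": 1, "c": 2}
density = target_density(path, targets, radius=1)
# b sees {a, b, c}: (1 + 0 + 2) / 3 = 1.0; e sees {d, e}: 0 / 2 = 0.0
```

With radius=0 each neighborhood is the node alone, so the sketch returns the target weights unchanged, matching the behaviour described for the radius parameter.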

Fuzzy Reaction Sets

Sub-module for finding fuzzy sets of reactions

class metworkpy.network.fuzzy.FuzzyMembershipFunction(*args, **kwargs)

Bases: Protocol

Protocol for fuzzy membership functions, which should take a reaction, a network, a set of target genes, and a reaction to gene set mapping, and return a float between 0 and 1. This method can also take in additional parameters as kwargs, which will be passed through from the calling functions.

metworkpy.network.fuzzy.fuzzy_reaction_intersection(gene_sets: Iterable[Iterable[str]], metabolic_network: Graph | DiGraph, metabolic_model: Model, intersection_fn: Callable[[DataFrame], Series] | Literal['mean', 'min', 'max', 'geom', 'rra'], intersection_fn_kwargs=typing.Optional[dict[str, typing.Any]], rank_method: Literal['average', 'min', 'max', 'first', 'dense'] = 'max', **kwargs) → Series

Converts gene_sets into fuzzy reaction sets, and finds their intersection using intersection_fn

Parameters:

gene_sets (iterable of iterable of str) – Sets of genes to find the fuzzy reaction set intersection for
metabolic_network (nx.Graph or nx.DiGraph) – Metabolic reaction network represented by a networkx Graph or DiGraph. DiGraphs will be converted to Graphs before processing.
metabolic_model (cobra.Model) – Metabolic model from which the metabolic network was constructed (used for translating reactions to genes)
intersection_fn ({"mean", "min", "max", "geom", "rra"} or Callable[[pd.DataFrame], pd.Series]) – Either a str specifying an intersection function (see notes), or a Callable which takes a DataFrame, where each column is a fuzzy reaction set and returns a Series which is a new fuzzy reaction set representing the intersection of the input fuzzy reaction sets.
intersection_fn_kwargs (dict of str to Any) – kwargs passed to the intersection function
rank_method ({"average", "min", "max", "first", "dense"}) – If the intersection_fn is ‘rra’, how are ties in the membership values handled when performing ranking
kwargs – Keyword arguments are passed to fuzzy_reaction_set

Returns:

intersection – A pandas Series representing a fuzzy reaction set constructed by intersecting the fuzzy reaction sets derived from the gene_sets.

Return type:

pd.Series

Notes

The possible methods for the intersection are:

mean: Take the arithmetic mean of the membership values
min: Take the minimum of the membership values
max: Take the max of the membership values
geom: Take the geometric mean of the membership values
rra: Perform robust rank aggregation on the membership values, and then subtract the resulting rho-score from 1.0
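The four elementwise methods can be illustrated with plain dictionaries in place of the DataFrame and Series the function actually works with. The function name and data layout below are assumptions for this sketch, and robust rank aggregation ('rra') is deliberately left out.

```python
import math


def fuzzy_intersection(memberships, method):
    """Combine per-reaction membership values from several fuzzy sets (hypothetical helper).

    memberships maps a reaction id to a list holding one membership value per fuzzy set.
    """
    combine = {
        "mean": lambda values: sum(values) / len(values),
        "min": min,
        "max": max,
        # Geometric mean: n-th root of the product of the n values
        "geom": lambda values: math.prod(values) ** (1 / len(values)),
    }[method]
    return {rxn: combine(values) for rxn, values in memberships.items()}


memberships = {"R1": [0.25, 1.0], "R2": [0.0, 0.5]}
by_mean = fuzzy_intersection(memberships, "mean")  # R1: 0.625, R2: 0.25
by_min = fuzzy_intersection(memberships, "min")    # R1: 0.25,  R2: 0.0
by_geom = fuzzy_intersection(memberships, "geom")  # R1: 0.5,   R2: 0.0
```

Note that 'min' and 'geom' both return 0 for a reaction as soon as one set gives it zero membership, while 'mean' and 'max' do not.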

metworkpy.network.fuzzy.fuzzy_reaction_set(metabolic_network: Graph | DiGraph, metabolic_model: Model, gene_set: Iterable[str], membership_fn: str | FuzzyMembershipFunction = 'simple gene density', scale: bool | float | None = None, essential: bool = False, processes: int | None = None, **kwargs) → Series

Convert from a gene set to a fuzzy reaction set

Parameters:

metabolic_network (nx.Graph or nx.DiGraph) – Metabolic reaction network represented by a networkx Graph or DiGraph. DiGraphs will be converted to Graphs before processing.
metabolic_model (cobra.Model) – Metabolic model from which the metabolic network was constructed (used for translating reactions to genes)
gene_set (Iterable of str) – Set of genes to convert into a fuzzy reaction set
membership_fn (str or FuzzyMembershipFunction) – The membership function to use, can be a string giving the functions name, or the function itself which must match the signature of FuzzyMembershipFunction
scale (bool or float, optional) – Whether to scale the results of the membership values. If False or None, no scaling will be applied. If True, will be scaled to be between 0 and 1 using a min-max scaler. If a float, the scaling will use a min-max scaler, but treat scale as the max.
essential (bool) – Whether, when translating from reactions to genes, only genes required for a reaction to function should be associated with a particular reaction.
processes (int, optional) – Number of processes to use for parallel processing
kwargs – Additional keyword arguments are passed to the membership_fn

Returns:

reaction_set – The fuzzy reaction set, described by a pandas series. The index is the reaction id, and the values are the set membership.

Return type:

pd.Series

Notes

The options for membership functions to be selected by name (i.e. str arg to membership_fn) are

‘simple gene density’
‘simple reaction density’
‘weighted gene density’
‘weighted reaction density’
‘knn gene density’
‘knn reaction density’
‘gene enrichment’

The difference between the gene and reaction density functions is how multiple genes associated with a single reaction are counted. For the gene type, multiple genes will all count towards the membership, whereas with the reaction type, reactions are counted only once regardless of how many genes associated with them are in the gene set.
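The counting difference can be made concrete with a small sketch. The function name and inputs are hypothetical, and only the raw counts are shown; how the library turns such counts into a membership value (for example, what it divides by) is not covered here.

```python
def count_targets(neighborhood, reaction_genes, gene_set):
    """Return (gene-style count, reaction-style count) for one neighborhood.

    Assumption: the gene-style count is the number of distinct target genes found,
    and the reaction-style count is the number of reactions with at least one.
    """
    gene_set = set(gene_set)
    genes_hit = set()
    reactions_hit = 0
    for rxn in neighborhood:
        hits = reaction_genes.get(rxn, set()) & gene_set
        genes_hit |= hits
        if hits:
            reactions_hit += 1
    return len(genes_hit), reactions_hit


reaction_genes = {"R1": {"g1", "g2", "g3"}, "R2": {"g4"}}
gene_count, reaction_count = count_targets(
    ["R1", "R2"], reaction_genes, gene_set={"g1", "g2", "g3"}
)
# Three target genes sit on R1, so the gene count is 3 while the reaction count is 1
```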

Neighborhoods

Functions for finding and working with neighborhoods in metabolic networks

metworkpy.network.neighborhoods.get_graph_neighborhood(network: Graph | DiGraph, radius: int, node: Hashable) → set[Hashable]

Get the neighborhood around a node in the network

Parameters:

network (nx.Graph or nx.DiGraph) – The network to find the neighborhood in
radius (int) – The radius of the neighborhood
node (Hashable) – The node to find the neighborhood around

Returns:

neighborhood – The neighborhood around node in network

Return type:

set of Hashable

metworkpy.network.neighborhoods.get_graph_neighborhood_group(network: Graph | DiGraph, radius: int, nodes: set[Hashable]) → set[Hashable]

Get the neighborhood of a group of nodes, that is all nodes reachable within a distance of radius from a node in nodes

Parameters:

network (nx.Graph or nx.DiGraph) – The network to find the neighborhood in
radius (int) – The radius of the neighborhood
nodes (set of Hashable) – The group of nodes to find the neighborhood for

Returns:

neighborhood – The neighborhood around the nodes in network

Return type:

set of Hashable

metworkpy.network.neighborhoods.graph_gene_neighborhood_iter(network: Graph, model: Model, radius: int, essential: bool = False)

Iterator over gene neighborhoods in a graph

Parameters:

network (nx.Graph) – The network whose neighborhoods will be iterated over
model (cobra.Model) – The cobra model associated with the metabolic network
radius (int) – The radius determining the size of the neighborhood
essential (bool) – Whether to only include genes essential for reactions in the neighborhood

Yields:

tuple of Hashable and set of str – Tuple of node and gene ids in neighborhood

metworkpy.network.neighborhoods.graph_gene_neighborhoods(network: Graph, model: Model, radius: int) → dict[Hashable, set[str]]

Find the gene neighborhoods of a graph

Parameters:

network (nx.Graph) – The network whose neighborhoods will be identified
model (cobra.Model) – The cobra model associated with the metabolic network
radius (int) – The radius determining the sizes of the neighborhoods

Returns:

neighborhoods – Dict describing the nodes in the graph, keyed by node with values of sets of gene ids in the neighborhood of the node

Return type:

dict of nodes to sets of gene ids

metworkpy.network.neighborhoods.graph_neighborhood_iter(network: Graph, radius: int) → Iterator[tuple[Hashable, set[Hashable]]]

Iterator over neighborhoods in a graph

Parameters:

network (nx.Graph) – The network whose neighborhoods will be iterated over
radius (int) – The radius determining the size of the neighborhood

Yields:

tuple of Hashable and set of Hashable – Tuple of node and neighborhood

metworkpy.network.neighborhoods.graph_neighborhoods(network: Graph, radius: int) → dict[Hashable, set[Hashable]]

Find the neighborhoods of a graph

Parameters:

network (nx.Graph) – The network whose neighborhoods will be identified
radius (int) – The radius determining the sizes of the neighborhoods

Returns:

neighborhoods – Dict describing the nodes in the graph, keyed by node with values of sets of nodes in the neighborhood of the node (including the node itself)

Return type:

dict of nodes to sets of nodes
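For readers who want to see the shape of the returned mapping, here is a minimal sketch that builds the same kind of dict with a radius-limited breadth-first search. It uses an adjacency dict instead of an nx.Graph, and the function name is chosen for this example only.

```python
from collections import deque


def neighborhoods_sketch(adj, radius):
    """Map each node to the set of nodes within `radius` hops (the node included)."""
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            if dist[u] == radius:  # stop expanding at the radius
                continue
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        result[source] = set(dist)
    return result


# Path graph a - b - c - d
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
hoods = neighborhoods_sketch(path, radius=1)
# hoods["a"] is {"a", "b"} and hoods["b"] is {"a", "b", "c"}
```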

Network Construction

Functions for constructing networks based on genome scale metabolic models

metworkpy.network.network_construction.create_adjacency_matrix(model: Model, weighted: bool, directed: bool, weight_by: Literal['stoichiometry', 'fva', 'pfba'] = 'stoichiometry', threshold: float = 0.0, **kwargs) → DataFrame

Create an adjacency matrix representing the metabolic network of a provided cobra Model

Parameters:

model (cobra.Model) – Cobra Model to create the network from
weighted (bool) – Whether the network should be weighted
directed (bool) – Whether the network should be directed
weight_by ({'fva', 'pfba', 'stoichiometry'}, default='stoichiometry') – String indicating if the network should be weighted by ‘stoichiometry’, ‘fva’, ‘pfba’ (see notes for more information). Ignored if weighted = False
threshold (float) – Threshold below which the absolute value of a bound/flux is considered to be 0
kwargs – Passed to cobra’s flux_variability_analysis function if the weight_by is ‘fva’, or cobra’s pfba function if the weight_by is ‘pfba’

Returns:

The adjacency matrix

Return type:

pd.DataFrame

Notes

When creating a weighted network, the options are to weight the edges based on flux, or stoichiometry. If stoichiometry is chosen the edge weight will correspond to the stoichiometric coefficient of the metabolite, in a given reaction.

For ‘fva’ weighting, first flux variability analysis is performed. The edge weight is determined by the maximum flux through a reaction in a particular direction (forward if the metabolite is a product of the reaction, reverse if the metabolite is a substrate) multiplied by the metabolite stoichiometry. If the network is unweighted, the maximum of the absolute value of the forward and the reverse flux is used instead.

For ‘pfba’ weighting, first parsimonious flux balance analysis (pFBA) is performed. The edge weight between a reaction and metabolite is determined by the stoichiometric coefficient of the metabolite multiplied by the flux of the reaction in the pFBA solution.
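The ‘pfba’ rule is a simple product, sketched below with plain dictionaries. The function name and the nested-dict stoichiometry are assumptions for illustration; the sketch keeps the sign of the product and does not apply the threshold or decide edge direction, both of which the library may handle differently.

```python
def pfba_edge_weights(stoichiometry, fluxes):
    """Weight each (reaction, metabolite) pair by coefficient * flux (hypothetical helper)."""
    return {
        (rxn, met): coeff * fluxes[rxn]
        for rxn, metabolites in stoichiometry.items()
        for met, coeff in metabolites.items()
    }


# R1: A -> 2 B carrying a flux of 1.5; R2: B -> C carrying no flux
stoichiometry = {"R1": {"A": -1.0, "B": 2.0}, "R2": {"B": -1.0, "C": 1.0}}
fluxes = {"R1": 1.5, "R2": 0.0}
weights = pfba_edge_weights(stoichiometry, fluxes)
# ("R1", "A") -> -1.5 and ("R1", "B") -> 3.0; both R2 edges get a weight of zero
```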

metworkpy.network.network_construction.create_gene_network(model: Model, directed: bool, nodes_to_remove: list[str] | None, essential: bool) → Graph | DiGraph

Create a gene connectivity network from the metabolic model, see notes for details

Parameters:

model (cobra.Model) – Cobra Model to create the network from
directed (bool) – Whether the network should be directed. If True, the direction of the network’s edges will be decided by the directionality of the reaction network, and multiple genes associated with a single reaction will have two (reciprocal) edges connecting them.
nodes_to_remove (list[str] or None) – List of any metabolites or reactions to remove from the metabolic network prior to projecting it onto the reactions and constructing the gene network. Each metabolite/reaction to remove should be the string id associated with them in the cobra Model
essential (bool) – Whether a gene should be required for a reaction to function in order for that reaction to be used in assigning the gene edges

Returns:

gene_network – Network connecting genes which are neighboring in the reaction network together

Return type:

nx.Graph or nx.DiGraph

Notes

The gene network includes nodes for each gene associated with a reaction in the network (whether or not essential is True). Edges are added by connecting each gene associated with a reaction to genes associated with all the neighboring reactions. If the graph is directed, then gene nodes are connected to genes associated with successor reactions. Genes associated with the same reaction are given edges between them (going both directions in the case of directed graphs).

The essential parameter is to decide which genes are associated with which reactions in order to determine which genes are neighbors in the gene network. If True, genes will only be associated with a reaction, when adding edges to the network, if they are required for that reaction to function. All genes associated with reactions in the network will still be added as nodes even if they are not essential for any reactions in the network.
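The edge rules in these notes can be sketched for the undirected case as follows. The function name and the dict inputs (a reaction adjacency and a reaction to gene set mapping) are assumptions for this example; the directed case and the essential filtering are not shown.

```python
from itertools import combinations, product


def gene_edges(reaction_adj, reaction_genes):
    """Undirected gene-gene edges implied by a reaction network (hypothetical helper)."""
    edges = set()
    for rxn, genes in reaction_genes.items():
        # Genes sharing a reaction are connected to each other
        for g1, g2 in combinations(sorted(genes), 2):
            edges.add(frozenset((g1, g2)))
        # Genes are connected to the genes of every neighboring reaction
        for neighbor in reaction_adj.get(rxn, ()):
            for g1, g2 in product(genes, reaction_genes.get(neighbor, ())):
                if g1 != g2:
                    edges.add(frozenset((g1, g2)))
    return edges


# Reaction chain R1 - R2 - R3
reaction_adj = {"R1": ["R2"], "R2": ["R1", "R3"], "R3": ["R2"]}
reaction_genes = {"R1": {"g1", "g2"}, "R2": {"g3"}, "R3": {"g4"}}
edges = gene_edges(reaction_adj, reaction_genes)
# g1-g2 share R1; g1-g3, g2-g3 and g3-g4 come from neighboring reactions
```

No edge joins g1 and g4, because R1 and R3 are not direct neighbors in the reaction network.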

metworkpy.network.network_construction.create_group_distance_adjacency_matrix(network: Graph | DiGraph, groups: dict[Hashable, Iterable[Hashable]], weight: str | None = None, linkage: Literal['mean', 'min', 'max'] = 'mean', directed: bool = False) → DataFrame

Create an adjacency matrix for the distances between the groups

Parameters:

network (nx.Graph or nx.DiGraph) – Network to use when finding distances between nodes in the groups. Edge weights are ignored.
groups (dict of Hashable to Iterable of Hashable) – Group definitions, must be a map between group names (which will be used as index/columns in the matrix), and an iterable of group members (which should be nodes in the network)
weight (str, optional) – Edge attribute to use for weight, if None all edges have weight 1
linkage ({'mean', 'min', 'max'}) – Method to use when combining pairwise distances between groups
directed (bool) – Whether the adjacency matrix should be directed or not, ignored unless the input network is a nx.DiGraph

Returns:

adjacency_matrix – DataFrame representing the adjacency matrix of the distances between the groups on the network. Index and columns are the keys of the groups dict, with values representing the distances between the groups.

Return type:

pd.DataFrame

Notes

Constructs the adjacency matrix using the pairwise distances between groups. For each pair of groups, finds the distances between their nodes and finds the distance between the two groups by aggregating these distances, either using the mean, minimum, or maximum of the set of pairwise distances between two groups of nodes.
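The aggregation step can be sketched for a single pair of groups. This is an illustration under assumptions: hop-count distances on an undirected adjacency dict, hypothetical helper names, and unreachable pairs simply skipped, which may differ from how the library treats disconnected groups.

```python
from collections import deque
from statistics import fmean


def bfs_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def group_distance(adj, group_a, group_b, linkage="mean"):
    """Aggregate all pairwise distances between two groups of nodes."""
    aggregate = {"mean": fmean, "min": min, "max": max}[linkage]
    distances = []
    for a in group_a:
        dist = bfs_distances(adj, a)
        distances.extend(dist[b] for b in group_b if b in dist)
    return aggregate(distances)


# Path graph a - b - c - d - e with groups X = {a, b} and Y = {d, e}
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
groups = {"X": ["a", "b"], "Y": ["d", "e"]}
mean_xy = group_distance(path, groups["X"], groups["Y"], "mean")  # (3 + 4 + 2 + 3) / 4 = 3.0
```

With the same groups, 'min' gives 2 (b to d) and 'max' gives 4 (a to e), which shows how the choice of linkage changes the resulting matrix entry.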

metworkpy.network.network_construction.create_group_distance_network(network: Graph | DiGraph, groups: dict[Hashable, Iterable[Hashable]], weight: str | None = None, linkage: Literal['mean', 'min', 'max'] = 'mean', directed: bool = False) → Graph | DiGraph

Create a network for the distances between the groups

Parameters:

network (nx.Graph or nx.DiGraph) – Network to use when finding distances between nodes in the groups. Edge weights are ignored unless weight is provided.
groups (dict of Hashable to Iterable of Hashable) – Group definitions, must be a map between group names (which will be used as nodes in the resulting network), and an iterable of group members (which should be nodes in the input network)
weight (str, optional) – Edge attribute to use for weight, if None all edges have weight 1
linkage ({'mean', 'min', 'max'}) – Method to use when combining pairwise distances between groups
directed (bool) – Whether the resulting network should be directed or not, ignored unless the input network is a nx.DiGraph

Returns:

Network with a node for each group, and edges weighted by the distances between the groups on the network.

Return type:

nx.Graph or nx.DiGraph

Notes

Constructs the network using the pairwise distances between groups. For each pair of groups, finds the distances between their nodes and finds the distance between the two groups by aggregating these distances, either using the mean, minimum, or maximum of the set of pairwise distances between two groups of nodes.

metworkpy.network.network_construction.create_group_neighborhood_network(network: Graph | DiGraph, groups: dict[Hashable, Iterable[Hashable]], max_distance: int = 1, weighted: Literal['count', 'proportion', 'enrichment'] | None = None, directed: bool = False) → Graph | DiGraph

Create a group connectivity network, see notes for details

Parameters:

network (nx.Graph or nx.DiGraph) – Network to use when finding neighbors. Edge weights will be ignored.
groups (dict of Hashable to Iterable of Hashable) – Group definitions, must be a map between group names (which will be used as nodes in the network), and an iterable of group members (which should be nodes in the network)
max_distance (int, default=1) – Max distance for nodes to be considered neighbors. A value of 0 will only connect groups with direct overlaps, while a value of 1 will connect groups which have members that are direct neighbors in the network.
weighted ({'count', 'proportion', 'enrichment'}, optional) – Whether to weight the graph based on the number of connections between the groups. If None (default) no weights are added. If ‘count’, the edge weight is the count of connections between the two groups. If ‘proportion’, the edge weight is normalized by the maximum possible overlap. If ‘enrichment’, node attributes are added called pvalue, odds_ratio, and significance. The pvalue and odds_ratio are the results of performing a Fisher’s exact test on the enrichment of one group in the neighborhood of the other (in the undirected case, the minimum p-value and maximum odds_ratio found across the two directions are used). The significance is the -log10 of the p-value. Note that the odds_ratio can be infinite.
directed (bool, default=False) – Whether the resulting connectivity graph should be directed, ignored unless the input network is directed.

Returns:

group_neighborhood_network – The group connectivity graph, which includes a node for every group defined in groups, with edges connecting groups which are connected in network, and optional edge weights. Will be a nx.Graph unless the input network is a DiGraph and directed is True.

Return type:

nx.Graph or nx.DiGraph

Notes

The group connectivity graph is a graph with a node for each group in groups, and edges connecting groups which include neighbors on the network.

For example, take a graph with:

Nodes: {a, b, c, d, e, f, g}

Edges: {(a, b), (c,d), (e,f), (a,g)}

then the groups {group1: {a,c}, group2: {d,e}, group3: {b,f}, group4: {g}} produce the following group connectivity graph (with the max_distance parameter set to 1):

Nodes: {group1, group2, group3, group4}

Edges: {(group1, group2), (group1, group3), (group1, group4), (group2, group3)}

The number of connections between two groups is determined by finding the total neighborhood of one of the groups (that is, the set of all nodes within max_distance of a node in that group), and counting the number of nodes from the other group which fall within that neighborhood.
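The worked example above can be reproduced with a short, dependency-free sketch. It is not the library's implementation: the helpers neighborhood and group_edges are hypothetical, and the count is taken from the neighborhood of the first group in each pair, which is an assumption of this sketch.

```python
from itertools import combinations

edges = [("a", "b"), ("c", "d"), ("e", "f"), ("a", "g")]
groups = {
    "group1": {"a", "c"},
    "group2": {"d", "e"},
    "group3": {"b", "f"},
    "group4": {"g"},
}

# Undirected adjacency built from the edge list
adjacency = {}
for u, v in edges:
    adjacency.setdefault(u, set()).add(v)
    adjacency.setdefault(v, set()).add(u)


def neighborhood(members, max_distance):
    """All nodes within max_distance steps of any member (members included)."""
    reached = set(members)
    frontier = set(members)
    for _ in range(max_distance):
        frontier = {n for node in frontier for n in adjacency.get(node, ())} - reached
        reached |= frontier
    return reached


def group_edges(max_distance=1):
    """Map each connected pair of groups to a connection count."""
    result = {}
    for first, second in combinations(sorted(groups), 2):
        # Count members of the second group inside the first group's neighborhood
        count = len(neighborhood(groups[first], max_distance) & groups[second])
        if count > 0:
            result[(first, second)] = count
    return result


print(group_edges(max_distance=1))
```

With max_distance=1 this yields the four edges listed above, each with a count of 1. With max_distance=0 no edges are produced, because none of the example groups share a member.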

metworkpy.network.network_construction.create_metabolic_network(model: Model, weighted: bool, directed: bool, weight_by: Literal['stoichiometry', 'fva', 'pfba'] = 'stoichiometry', nodes_to_remove: list[str] | None = None, reciprocal_weights: bool = False, threshold: float = 0.0, **kwargs) → Graph | DiGraph

Create a metabolic network from a cobrapy Model

Parameters:

model (cobra.Model) – Cobra Model to create the network from
weighted (bool) – Whether the network should be weighted
directed (bool) – Whether the network should be directed
weight_by ({'fva', 'pfba', 'stoichiometry'}, default='stoichiometry') – String indicating if the network should be weighted by ‘stoichiometry’, ‘fva’, ‘pfba’ (see notes for more information). Ignored if weighted = False
nodes_to_remove (list[str] | None) – List of any metabolites or reactions that should be removed from the final network. This can be used to remove metabolites that participate in a large number of reactions, but are not desired in downstream analysis such as water, or ATP, or pseudo reactions like biomass. Each metabolite/reaction should be the string ID associated with them in the cobra model.
reciprocal_weights (bool) – Whether to use the reciprocal of the weights, useful if higher flux should equate with lower weights in the final network (for use with graph algorithms)
threshold (float) – Threshold, below which to consider a bound to be 0
kwargs – Keyword arguments are passed to the cobra flux_variability_analysis method when weight_by is ‘fva’

Returns:

A network representing the metabolic network from the provided cobrapy model

Return type:

nx.Graph | nx.DiGraph

Notes

When creating a weighted network, the edges can be weighted by stoichiometry or by flux (from FVA or pFBA). If ‘stoichiometry’ is chosen, the edge weight corresponds to the stoichiometric coefficient of the metabolite in a given reaction.

For ‘fva’ weighting, first flux variability analysis is performed. The edge weight is determined by the maximum flux through a reaction in a particular direction (forward if the metabolite is a product of the reaction, reverse if the metabolite is a substrate) multiplied by the metabolite stoichiometry. If the network is undirected, the maximum of the absolute value of the forward and the reverse flux is used instead.

For ‘pfba’ weighting, first parsimonious flux balance analysis (pFBA) is performed. The edge weight between a reaction and a metabolite is determined by the stoichiometric coefficient of the metabolite multiplied by the flux of the reaction in the pFBA solution.
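As a rough illustration of the ‘pfba’ arithmetic and of the reciprocal_weights option, the sketch below computes the weight of a single reaction-metabolite edge. The helper pfba_edge_weight is hypothetical; taking the magnitude of the product and returning 0.0 for a zero-flux edge are assumptions of the sketch, not documented behavior.

```python
def pfba_edge_weight(coefficient, flux, reciprocal=False):
    """Hypothetical helper: weight for one reaction-metabolite edge."""
    # Stoichiometric coefficient times pFBA flux; using the magnitude is an assumption
    weight = abs(coefficient * flux)
    if weight == 0.0:
        return 0.0  # zero-flux handling is an assumption of this sketch
    # Reciprocal weights make high-flux edges "short" for shortest-path algorithms
    return 1.0 / weight if reciprocal else weight


# A metabolite with coefficient 2 in a reaction carrying a flux of 5.0
print(pfba_edge_weight(2, 5.0))
print(pfba_edge_weight(2, 5.0, reciprocal=True))
```

The first call gives 10.0 and the second 0.1, so a high-flux edge becomes a low-cost edge when reciprocal weights are used.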

metworkpy.network.network_construction.create_metabolite_network(model: Model, weighted: bool, directed: bool, weight_by: Literal['stoichiometry', 'fva', 'pfba'] = 'stoichiometry', nodes_to_remove: list[str] | None = None, reciprocal_weights: bool = False, threshold: float = 0.0, projection_weight: str | Callable[[float, float], float] | None = None, projection_weight_combine: Callable[[list[float]], float] | None = None, **kwargs)

Create a metabolite connectivity network from the metabolic model

Parameters:

model (cobra.Model) – Cobra Model to create the network from
weighted (bool) – Whether the network should be weighted
directed (bool) – Whether the network should be directed
weight_by ({'fva', 'pfba', 'stoichiometry'}, default='stoichiometry') – String indicating if the network should be weighted by ‘stoichiometry’, ‘fva’, ‘pfba’ (see notes for more information). Ignored if weighted = False
nodes_to_remove (list[str] | None) – List of any metabolites or reactions that should be removed from the final network. This can be used to remove metabolites that participate in a large number of reactions, but are not desired in downstream analysis such as water, or ATP, or pseudo reactions like biomass. Each metabolite/reaction should be the string ID associated with them in the cobra model.
reciprocal_weights (bool) – Whether to use the reciprocal of the weights, useful if higher flux should equate with lower weights in the final network (for use with graph algorithms)
threshold (float) – Threshold, below which to consider a bound to be 0
projection_weight (str | Callable[[float, float], float] | None) – How to weight the projected graph. If None, the projected graph will not be weighted. If “ratio”, the edges will be weighted based on the ratio between actual shared neighbors and maximum possible shared neighbors. If “count”, the edges will be weighted by the number of shared neighbors. A function can also be provided, which takes two float arguments (the weights of two edges), and returns a float.
projection_weight_combine (Callable[[list[float]], float], optional) – How to combine multiple projected edges. If two nodes in the set being projected onto, share multiple neighbors in the other node set, they can have multiple possible edge weights. This function takes in a list of possible weights, and returns a single final weight. Python builtin max and min can be used for this. If not provided, max is used.
kwargs – Keyword arguments are passed to the cobra flux_variability_analysis method when weight_by is ‘fva’

metworkpy.network.network_construction.create_mutual_information_network(model: Model | None = None, flux_samples: DataFrame | ndarray | None = None, reaction_names: Iterable[str] | None = None, cutoff_significance: float | None = None, n_samples: int = 10000, reciprocal_weights: bool = False, processes: int = 1, **kwargs) → Graph

Create a mutual information network from the provided metabolic model

Parameters:

model (Optional[cobra.Model]) – Metabolic model to construct the mutual information network from. Only required if the flux_samples parameter is None
flux_samples (Optional[pd.DataFrame|np.ndarray]) – Flux samples used to calculate mutual information between reactions. If None, the passed model will be sampled to generate these flux samples.
reaction_names (Optional[Iterable[str]]) – Names for the reactions
cutoff_significance (float, optional) – Upper bound for the significance of the mutual information, any mutual information values with p-values above this cutoff will have their mutual information set to 0. Will calculate this p-value using permutation testing, see mi_pairwise for more information.
n_samples (int) – Number of samples to take if flux_samples is None (ignored if flux_samples is not None)
reciprocal_weights (bool) – Whether the non-zero weights in the network should be the reciprocal of mutual information.
processes (int) – Number of processes to use during the flux sampling and mutual information calculation
kwargs – Keyword arguments passed to the mi_pairwise function

Returns:

A networkx Graph, with nodes representing different reactions and edge weights corresponding to estimated mutual information

Return type:

nx.Graph
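The effect of cutoff_significance and reciprocal_weights on the edge weights can be sketched as follows. The function filter_mutual_information and the toy matrices are hypothetical; the p-values would normally come from permutation testing, which is not reproduced here.

```python
def filter_mutual_information(mi, pvalues, cutoff_significance=None,
                              reciprocal_weights=False):
    """Return {(i, j): weight} for the upper triangle of a symmetric MI matrix."""
    n = len(mi)
    weights = {}
    for i in range(n):
        for j in range(i + 1, n):
            value = mi[i][j]
            if cutoff_significance is not None and pvalues[i][j] > cutoff_significance:
                value = 0.0  # not significant: mutual information set to 0
            if value != 0.0 and reciprocal_weights:
                value = 1.0 / value  # only non-zero weights are inverted
            weights[(i, j)] = value
    return weights


# Toy symmetric matrices for three reactions (illustrative values only)
mi = [[0.0, 0.5, 0.2], [0.5, 0.0, 0.25], [0.2, 0.25, 0.0]]
pvalues = [[1.0, 0.01, 0.3], [0.01, 1.0, 0.04], [0.3, 0.04, 1.0]]

weights = filter_mutual_information(
    mi, pvalues, cutoff_significance=0.05, reciprocal_weights=True
)
print(weights)
```

In this toy case the pair (0, 2) has a p-value of 0.3, above the 0.05 cutoff, so its weight becomes 0, while the two significant pairs are replaced by their reciprocals.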

metworkpy.network.network_construction.create_reaction_network(model: Model, weighted: bool, directed: bool, weight_by: Literal['stoichiometry', 'fva', 'pfba'] = 'stoichiometry', nodes_to_remove: list[str] | None = None, reciprocal_weights: bool = False, threshold: float = 0.0, projection_weight: str | Callable[[float, float], float] | None = None, projection_weight_combine: Callable[[list[float]], float] | None = None, **kwargs)

Create a reaction connectivity network from the metabolic model

Parameters:

model (cobra.Model) – Cobra Model to create the network from
weighted (bool) – Whether the network should be weighted
directed (bool) – Whether the network should be directed
weight_by ({'fva', 'pfba', 'stoichiometry'}, default='stoichiometry') – String indicating if the network should be weighted by ‘stoichiometry’, ‘fva’, ‘pfba’ (see notes for more information). Ignored if weighted = False
nodes_to_remove (list[str] | None) – List of any metabolites or reactions that should be removed from the final network. This can be used to remove metabolites that participate in a large number of reactions, but are not desired in downstream analysis such as water, or ATP, or pseudo reactions like biomass. Each metabolite/reaction should be the string ID associated with them in the cobra model.
reciprocal_weights (bool) – Whether to use the reciprocal of the weights, useful if higher flux should equate with lower weights in the final network (for use with graph algorithms)
threshold (float) – Threshold, below which to consider a bound to be 0
projection_weight (str | Callable[[float, float], float] | None) – How to weight the projected graph. If None, the projected graph will not be weighted. If “ratio”, the edges will be weighted based on the ratio between actual shared neighbors and maximum possible shared neighbors. If “count”, the edges will be weighted by the number of shared neighbors. A function can also be provided, which takes two float arguments (the weights of two edges), and returns a float.
projection_weight_combine (Callable[[list[float]], float], optional) – How to combine multiple projected edges. If two nodes in the set being projected onto, share multiple neighbors in the other node set, they can have multiple possible edge weights. This function takes in a list of possible weights, and returns a single final weight. Python builtin max and min can be used for this. If not provided, max is used.
kwargs – Keyword arguments are passed to the cobra flux_variability_analysis method when weight_by is ‘fva’

metworkpy.network.network_construction.get_top_metabolite_pairs(model: Model, n: int, ignore_top: int = 0) → list[tuple[str, str]]

Get a list including tuples of the most frequent metabolite pairs in the model

Parameters:

model (cobra.Model) – The model to find the top metabolite pairs for
n (int) – The number of top metabolite pairs to find
ignore_top (int) – Before finding pairwise frequency of metabolites, remove the top ignore_top number of metabolites

Returns:

A list of the most common metabolite pairs in the form of a list of tuples, each containing a pair of metabolite ids

Return type:

list of tuples of str,str
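One way to picture the pair counting is the sketch below, which uses a plain dict in place of a cobra Model. It assumes that a "pair" means two metabolites taking part in the same reaction, and that ignore_top removes the metabolites appearing in the most reactions before pairs are counted; the function top_metabolite_pairs and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Toy stand-in for a model: reaction id -> metabolite ids taking part in it
reactions = {
    "R1": ["atp", "adp", "glc", "g6p"],
    "R2": ["atp", "adp", "f6p", "fdp"],
    "R3": ["g6p", "f6p"],
    "R4": ["atp", "adp", "pyr", "pep"],
}


def top_metabolite_pairs(reactions, n, ignore_top=0):
    """Most frequent co-occurring metabolite pairs, as a list of id tuples."""
    # Number of reactions each metabolite takes part in
    frequency = Counter(m for mets in reactions.values() for m in set(mets))
    ignored = {m for m, _ in frequency.most_common(ignore_top)}
    pair_counts = Counter()
    for mets in reactions.values():
        kept = sorted(set(mets) - ignored)  # sorted so each pair has one canonical order
        pair_counts.update(combinations(kept, 2))
    return [pair for pair, _ in pair_counts.most_common(n)]


print(top_metabolite_pairs(reactions, n=1))
print(top_metabolite_pairs(reactions, n=3, ignore_top=2))
```

Without ignore_top, the currency pair (adp, atp) dominates. Setting ignore_top=2 drops both of those metabolites first, so the remaining pairs reflect the less ubiquitous metabolites.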

metworkpy.network.network_construction.get_top_metabolites(model: Model, n: int, type: Literal['substrate', 'reactant', 'product'] = 'substrate') → list[str]

Get a list of the n metabolites involved in the most reactions in the model

Parameters:

model (cobra.Model) – The model to find the top metabolites for
n (int) – The number of top metabolites to find

Returns:

A list of the ids of the top n metabolites in the model

Return type:

list of str

Projection

Module to project bipartite graphs onto one of their node sets

metworkpy.network.projection.bipartite_project(network: Graph | DiGraph, node_set: Iterable, directed: bool | None = None, weight: str | Callable[[float, float], float] | None = None, weight_combine: Callable[[list[float]], float] | None = None, weight_attribute: str = 'weight', reciprocal: bool = False) → Graph | DiGraph

Function to project a bipartite graph onto the specified set of nodes

Parameters:

network (nx.Graph | nx.DiGraph) – Network to project
node_set (Iterable) – Nodes to project the graph onto
directed (bool | None) – Whether the projected graph should be directed. If the network argument is not directed this is ignored. A value of None will have the directedness of the output match the directedness of the input network.
weight (str | Callable[[float, float], float], optional) – How to weight the projected graph. If None, the projected graph will not be weighted. If “ratio”, the edges will be weighted based on the ratio between actual shared neighbors and maximum possible shared neighbors. If “count”, the edges will be weighted by the number of shared neighbors. A function can also be provided, which takes two float arguments (the weights of two edges), and returns a float.
weight_combine (Callable[[list[float]], float], optional) – How to combine multiple projected edges. If two nodes in the set being projected onto, share multiple neighbors in the other node set, they can have multiple possible edge weights. This function takes in a list of possible weights, and returns a single final weight. Python builtin max and min can be used for this. If not provided, max is used.
weight_attribute (str) – Which edge attribute in the original network to use for weighting. Default is ‘weight’.
reciprocal (bool, default=False) – If converting from a directed graph to an undirected one, whether to only keep edges that appear in both directions in the original directed network.

Returns:

Projected network

Return type:

nx.Graph | nx.DiGraph
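The ‘count’ and ‘ratio’ weighting options can be illustrated with a small undirected sketch in plain Python. It is not the library's implementation: the function project is hypothetical, and the "maximum possible shared neighbors" is assumed here to be the size of the other node set.

```python
from itertools import combinations

# Toy bipartite graph: reactions on one side, metabolites on the other
neighbors = {
    "r1": {"m1", "m2"},
    "r2": {"m2", "m3"},
    "r3": {"m1", "m2", "m3"},
}
other_side = set().union(*neighbors.values())  # the metabolite node set


def project(neighbors, weight=None):
    """Project onto the keys of neighbors; returns {(u, v): weight or None}."""
    edges = {}
    for u, v in combinations(sorted(neighbors), 2):
        shared = neighbors[u] & neighbors[v]
        if not shared:
            continue  # no common neighbor, so no projected edge
        if weight == "count":
            edges[(u, v)] = len(shared)
        elif weight == "ratio":
            # Assumed denominator: size of the node set being projected away
            edges[(u, v)] = len(shared) / len(other_side)
        else:
            edges[(u, v)] = None  # unweighted projection
    return edges


print(project(neighbors, weight="count"))
print(project(neighbors, weight="ratio"))
```

Here r1 and r3 share two of the three metabolites, so their projected edge has a count of 2 and a ratio of 2/3.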