napistu.network.net_create_utils
Functions
create_graph_hierarchy_df | Create a DataFrame representing the graph hierarchy for a given wiring approach.
format_tiered_reaction_species | Create a Napistu graph from a reaction and its species.
wire_reaction_species | Convert reaction species data into network edges using specified wiring approach.
- napistu.network.net_create_utils._find_sbo_duos(reaction_species: DataFrame, target_sbo_term: str = 'SBO:0000336') list[str]
Find r_ids that have exactly 2 rows with the specified sbo_term and no other sbo_terms.
- Parameters:
reaction_species (pd.DataFrame) – DataFrame with columns: sbo_term, sc_id, stoichiometry, r_id
target_sbo_term (str) – The sbo_term to match (e.g., “SBO:0000336” aka “interactor”)
- Returns:
List of r_ids that meet the criteria
- Return type:
list
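The selection rule above can be illustrated with a minimal sketch. It uses plain dictionaries in place of DataFrame rows so that it runs without napistu or pandas; `find_sbo_duos` and the sample rows are hypothetical stand-ins, not the library's implementation.

```python
from collections import defaultdict

INTERACTOR = "SBO:0000336"


def find_sbo_duos(rows, target_sbo_term=INTERACTOR):
    """Return r_ids that have exactly two rows, all carrying target_sbo_term."""
    # Collect the SBO terms seen for each reaction (hypothetical helper).
    terms_by_reaction = defaultdict(list)
    for row in rows:
        terms_by_reaction[row["r_id"]].append(row["sbo_term"])
    return [
        r_id
        for r_id, terms in terms_by_reaction.items()
        if len(terms) == 2 and all(term == target_sbo_term for term in terms)
    ]


rows = [
    # R1: two interactors and nothing else, so it qualifies
    {"r_id": "R1", "sc_id": "A", "stoichiometry": 0, "sbo_term": INTERACTOR},
    {"r_id": "R1", "sc_id": "B", "stoichiometry": 0, "sbo_term": INTERACTOR},
    # R2: a reactant and a product, so it does not qualify
    {"r_id": "R2", "sc_id": "C", "stoichiometry": -1, "sbo_term": "SBO:0000010"},
    {"r_id": "R2", "sc_id": "D", "stoichiometry": 1, "sbo_term": "SBO:0000011"},
    # R3: three interactors, so it is not a duo
    {"r_id": "R3", "sc_id": "E", "stoichiometry": 0, "sbo_term": INTERACTOR},
    {"r_id": "R3", "sc_id": "F", "stoichiometry": 0, "sbo_term": INTERACTOR},
    {"r_id": "R3", "sc_id": "G", "stoichiometry": 0, "sbo_term": INTERACTOR},
]

duos = find_sbo_duos(rows)
```

Only R1 passes both conditions here: it has exactly two rows, and both carry the target term.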
- napistu.network.net_create_utils._format_cross_tier_edges(entities_ordered_by_tier: DataFrame, r_id: str, drop_reactions_when: str = 'same_tier')
Format edges for reactions where participants are on different tiers of a wiring hierarchy.
- Parameters:
entities_ordered_by_tier (pd.DataFrame) – DataFrame of entities ordered by tier.
r_id (str) – Reaction ID.
drop_reactions_when (str, optional) – The condition under which to drop reactions as a network vertex. Default is ‘same_tier’.
- Returns:
DataFrame of formatted edges for cross-tier reactions.
- Return type:
pd.DataFrame
- napistu.network.net_create_utils._format_same_tier_edges(rxn_species: DataFrame, r_id: str) DataFrame
Format edges for reactions where all participants are on the same tier of a wiring hierarchy.
- Parameters:
rxn_species (pd.DataFrame) – DataFrame of reaction species for the reaction.
r_id (str) – Reaction ID.
- Returns:
DataFrame of formatted edges for same-tier reactions.
- Return type:
pd.DataFrame
- Raises:
ValueError – If reaction has multiple distinct metadata.
- napistu.network.net_create_utils._format_tier_combo(upstream_tier: DataFrame, downstream_tier: DataFrame) DataFrame
Create all edges between two tiers of a tiered reaction graph.
This function generates a set of edges by performing an all-vs-all combination between entities in the upstream and downstream tiers. Tiers represent an ordering along the molecular entities in a reaction, plus a tier for the reaction itself. Attributes such as stoichiometry and sbo_term are assigned from both the upstream and downstream tiers, providing complete information about both endpoints of each edge. Reaction entities have neither a stoichiometry nor sbo_term annotation, so these attributes will be missing (None/NaN) when a tier is a reaction.
- Parameters:
upstream_tier (pd.DataFrame) – DataFrame containing upstream entities in a reaction (e.g., regulators or substrates).
downstream_tier (pd.DataFrame) – DataFrame containing downstream entities in a reaction (e.g., products or targets).
- Returns:
DataFrame of edges, each with columns: ‘from’, ‘to’, ‘stoichiometry_upstream’, ‘stoichiometry_downstream’, ‘sbo_term_upstream’, ‘sbo_term_downstream’, and ‘r_id’. The number of edges is the product of the number of entities in the upstream tier and the number in the downstream tier. Attributes will be missing (None/NaN) if the corresponding tier is a reaction.
- Return type:
pd.DataFrame
Notes
This function is used to build the edge list for tiered graphs, where each tier represents a functional group (e.g., substrates, products, modifiers, reaction).
- Both upstream and downstream attributes are included when available from the respective tiers.
- Reaction entities themselves do not contribute stoichiometry or sbo_term attributes.
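The all-vs-all combination described above can be sketched with `itertools.product`. This is an illustration on plain dictionaries rather than DataFrames; the `entity_id` key, the explicit `r_id` argument, and the function name `format_tier_combo` are assumptions made for the example and do not mirror the library's signature.

```python
from itertools import product


def format_tier_combo(upstream_tier, downstream_tier, r_id):
    """Pair every upstream entity with every downstream entity."""
    edges = []
    for up, down in product(upstream_tier, downstream_tier):
        edges.append(
            {
                "from": up["entity_id"],
                "to": down["entity_id"],
                # .get() yields None when a tier lacks the attribute,
                # which is the case for the reaction tier.
                "stoichiometry_upstream": up.get("stoichiometry"),
                "stoichiometry_downstream": down.get("stoichiometry"),
                "sbo_term_upstream": up.get("sbo_term"),
                "sbo_term_downstream": down.get("sbo_term"),
                "r_id": r_id,
            }
        )
    return edges


# Two substrates feeding into the reaction vertex itself.
substrates = [
    {"entity_id": "A", "stoichiometry": -1, "sbo_term": "SBO:0000010"},
    {"entity_id": "B", "stoichiometry": -1, "sbo_term": "SBO:0000010"},
]
reaction = [{"entity_id": "R1"}]

edges = format_tier_combo(substrates, reaction, r_id="R1")
```

With two upstream entities and one downstream entity the sketch yields 2 x 1 = 2 edges, and the downstream attributes are missing because the downstream tier is the reaction.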
- napistu.network.net_create_utils._interactor_duos_to_wide(interactor_duos: DataFrame)
Convert paired long format to wide format with ‘from’ and ‘to’ columns.
- Parameters:
interactor_duos (pd.DataFrame) – DataFrame with exactly 2 rows per r_id, containing sc_id and stoichiometry
- Returns:
Wide format with from_sc_id, from_stoichiometry, to_sc_id, to_stoichiometry columns
- Return type:
pd.DataFrame
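A minimal sketch of the long-to-wide reshaping, again on plain dictionaries. It assumes the first row of each pair becomes the 'from' endpoint and the second the 'to' endpoint; how the library actually orders the pair is not stated here, and `interactor_duos_to_wide` below is a hypothetical stand-in.

```python
from collections import defaultdict


def interactor_duos_to_wide(rows):
    """Collapse two long-format rows per r_id into one wide record."""
    by_reaction = defaultdict(list)
    for row in rows:
        by_reaction[row["r_id"]].append(row)

    wide = []
    for r_id, pair in by_reaction.items():
        if len(pair) != 2:
            raise ValueError(f"{r_id} has {len(pair)} rows; expected exactly 2")
        # Assumption: row order decides which member is 'from' and which is 'to'.
        first, second = pair
        wide.append(
            {
                "r_id": r_id,
                "from_sc_id": first["sc_id"],
                "from_stoichiometry": first["stoichiometry"],
                "to_sc_id": second["sc_id"],
                "to_stoichiometry": second["stoichiometry"],
            }
        )
    return wide


long_rows = [
    {"r_id": "R2", "sc_id": "C", "stoichiometry": 0},
    {"r_id": "R2", "sc_id": "D", "stoichiometry": 0},
]

wide_rows = interactor_duos_to_wide(long_rows)
```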
- napistu.network.net_create_utils._log_pathological_same_tier(distinct_metadata: DataFrame, r_id: str) None
Log a warning if a reaction has multiple distinct metadata.
- napistu.network.net_create_utils._reaction_species_to_tiers(rxn_species: DataFrame, graph_hierarchy_df: DataFrame, r_id: str) DataFrame
Map reaction species to tiers based on the graph hierarchy.
- Parameters:
rxn_species (pd.DataFrame) – DataFrame of reaction species.
graph_hierarchy_df (pd.DataFrame) – DataFrame defining the graph hierarchy.
r_id (str) – Reaction ID.
- Returns:
DataFrame of entities ordered by tier.
- Return type:
pd.DataFrame
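The mapping can be pictured as a lookup from each species' sbo_term to a tier, followed by a sort. The sketch below is an assumption-laden illustration on plain dictionaries: the three-tier hierarchy, the `entity_id` key, and the convention that the reaction appears in the hierarchy under the name "reaction" with no sbo_term are all made up for the example.

```python
def reaction_species_to_tiers(rxn_species, hierarchy, r_id):
    """Attach a tier to each species, add the reaction, and sort by tier."""
    tier_by_term = {
        h["sbo_term"]: h["tier"] for h in hierarchy if h["sbo_term"] is not None
    }
    # Assumption: the reaction has its own row in the hierarchy.
    reaction_tier = next(h["tier"] for h in hierarchy if h["sbo_name"] == "reaction")

    entities = [
        {
            "entity_id": s["sc_id"],
            "tier": tier_by_term[s["sbo_term"]],
            "sbo_term": s["sbo_term"],
            "stoichiometry": s["stoichiometry"],
        }
        for s in rxn_species
    ]
    # The reaction vertex carries neither stoichiometry nor sbo_term.
    entities.append(
        {"entity_id": r_id, "tier": reaction_tier, "sbo_term": None, "stoichiometry": None}
    )
    return sorted(entities, key=lambda e: e["tier"])


# Toy hierarchy: reactant -> reaction -> product.
hierarchy = [
    {"sbo_name": "reactant", "tier": 0, "sbo_term": "SBO:0000010"},
    {"sbo_name": "reaction", "tier": 1, "sbo_term": None},
    {"sbo_name": "product", "tier": 2, "sbo_term": "SBO:0000011"},
]
rxn_species = [
    {"sc_id": "B", "sbo_term": "SBO:0000011", "stoichiometry": 1},
    {"sc_id": "A", "sbo_term": "SBO:0000010", "stoichiometry": -1},
]

ordered = reaction_species_to_tiers(rxn_species, hierarchy, r_id="R1")
```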
- napistu.network.net_create_utils._should_drop_reaction(entities_ordered_by_tier: DataFrame, drop_reactions_when: str = 'same_tier')
Determine if a reaction should be dropped based on regulatory relationships and stringency.
- Parameters:
entities_ordered_by_tier (pd.DataFrame) – The entities ordered by tier.
drop_reactions_when (str, optional) – The desired stringency for dropping reactions. Default is ‘same_tier’.
- Returns:
True if the reaction should be dropped, False otherwise.
- Return type:
bool
Notes
Reactions are always dropped when all of their participants are on the same tier. This greatly decreases the number of vertices in a graph constructed from relatively dense interaction networks like STRING.
- Raises:
ValueError – If drop_reactions_when is not a valid value.
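One possible reading of this decision is sketched below on plain dictionaries. The option names 'always', 'edgelist', and 'same_tier' come from the wire_reaction_species documentation; the exact conditions, the `entity_id` and `tier` keys, and the extra `r_id` argument are assumptions for illustration and may differ from the library's logic.

```python
VALID_DROP_OPTIONS = ("always", "edgelist", "same_tier")


def should_drop_reaction(entities, r_id, drop_reactions_when="same_tier"):
    """Decide whether the reaction vertex should be left out of the graph."""
    if drop_reactions_when not in VALID_DROP_OPTIONS:
        raise ValueError(f"Invalid drop_reactions_when: {drop_reactions_when!r}")

    # Ignore the reaction's own row when inspecting its participants.
    species = [e for e in entities if e["entity_id"] != r_id]
    all_same_tier = len({e["tier"] for e in species}) == 1

    if drop_reactions_when == "always":
        return True
    if drop_reactions_when == "edgelist":
        # Assumed reading: a same-tier reaction or a two-participant reaction.
        return all_same_tier or len(species) == 2
    # 'same_tier': drop only when every participant shares one tier.
    return all_same_tier


interactor_pair = [
    {"entity_id": "C", "tier": 0},
    {"entity_id": "D", "tier": 0},
]
reactant_product = [
    {"entity_id": "A", "tier": 0},
    {"entity_id": "R1", "tier": 1},
    {"entity_id": "B", "tier": 2},
]

drop_pair = should_drop_reaction(interactor_pair, "R2")
drop_conversion = should_drop_reaction(reactant_product, "R1")
drop_conversion_edgelist = should_drop_reaction(reactant_product, "R1", "edgelist")
```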
- napistu.network.net_create_utils._validate_interactor_duos(interactor_duos: DataFrame)
Log cases where a pair of interactors has non-zero stoichiometry.
- napistu.network.net_create_utils._validate_sbo_indexed_rsc_stoi(rxn_species: DataFrame) None
Validate that rxn_species is a DataFrame with correct index and columns.
- Parameters:
rxn_species (pd.DataFrame) – DataFrame of reaction species, indexed by SBO_TERM.
- Return type:
None
- Raises:
TypeError – If rxn_species is not a pandas DataFrame.
ValueError – If index or columns are not as expected.
- napistu.network.net_create_utils.create_graph_hierarchy_df(wiring_approach: str) DataFrame
Create a DataFrame representing the graph hierarchy for a given wiring approach.
- Parameters:
wiring_approach (str) – The type of tiered graph to work with. Each type has its own specification in constants.py.
- Returns:
DataFrame with sbo_name, tier, and sbo_term.
- Return type:
pd.DataFrame
- Raises:
ValueError – If wiring_approach is not valid.
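As a rough illustration, a hierarchy can be stored as an ordered list of tiers and flattened into rows with sbo_name, tier, and sbo_term. The `HIERARCHIES` and `SBO_TERMS` tables below are simplified stand-ins, not the specifications in constants.py; the 'regulatory' ordering follows the description given for wire_reaction_species, and the helper name is hypothetical.

```python
# Simplified lookup tables for illustration only.
SBO_TERMS = {
    "modifier": "SBO:0000019",
    "catalyst": "SBO:0000013",
    "reactant": "SBO:0000010",
    "product": "SBO:0000011",
}
HIERARCHIES = {
    "regulatory": [["modifier"], ["catalyst"], ["reactant"], ["reaction"], ["product"]],
}


def create_graph_hierarchy(wiring_approach):
    """Flatten an ordered list of tiers into one record per SBO name."""
    if wiring_approach not in HIERARCHIES:
        raise ValueError(f"Invalid wiring_approach: {wiring_approach!r}")
    rows = []
    for tier, names in enumerate(HIERARCHIES[wiring_approach]):
        for name in names:
            # The reaction tier has no SBO term, so .get() returns None.
            rows.append({"sbo_name": name, "tier": tier, "sbo_term": SBO_TERMS.get(name)})
    return rows


hierarchy_rows = create_graph_hierarchy("regulatory")
```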
- napistu.network.net_create_utils.format_tiered_reaction_species(rxn_species: DataFrame, r_id: str, graph_hierarchy_df: DataFrame, drop_reactions_when: str = 'same_tier') DataFrame
Create a Napistu graph from a reaction and its species.
- Parameters:
rxn_species (pd.DataFrame) – The reaction’s participants. Should be indexed by sbo_term.
r_id (str) – The ID of the reaction.
graph_hierarchy_df (pd.DataFrame) – The graph hierarchy.
drop_reactions_when (str, optional) – The condition under which to drop reactions as a network vertex. Default is ‘same_tier’.
- Returns:
The edges of the Napistu graph for a single reaction.
- Return type:
pd.DataFrame
- napistu.network.net_create_utils.wire_reaction_species(reaction_species: DataFrame, wiring_approach: str, drop_reactions_when: str) DataFrame
Convert reaction species data into network edges using specified wiring approach.
This function processes reaction species data to create network edges that represent the relationships between molecular entities in a biological network. It handles both interactor pairs (processed en-masse) and other reaction species (processed using tiered algorithms based on the wiring approach).
- Parameters:
reaction_species (pd.DataFrame) – DataFrame containing reaction species data with columns:
- r_id : str
Reaction identifier
- sc_id : str
Compartmentalized species identifier
- stoichiometry : float
Stoichiometric coefficient (negative for reactants, positive for products, 0 for modifiers)
- sbo_term : str
Systems Biology Ontology term defining the role of the species in the reaction (e.g., ‘SBO:0000010’ for reactant, ‘SBO:0000011’ for product, ‘SBO:0000336’ for interactor)
wiring_approach (str) – The wiring approach to use for creating the network. Must be one of:
- ‘bipartite’ : Creates bipartite network with molecules connected to reactions
- ‘regulatory’ : Creates regulatory hierarchy (modifiers -> catalysts -> reactants -> reactions -> products)
- ‘surrogate’ : Alternative layout with enzymes downstream of substrates
drop_reactions_when (str) – Condition under which to drop reactions as network vertices. Must be one of:
- ‘always’ : Always drop reaction vertices
- ‘edgelist’ : Drop if there are exactly 2 participants
- ‘same_tier’ : Drop if there are 2 participants which are both “interactor”
- Returns:
DataFrame containing network edges with columns:
- from : str
Source node identifier (species or reaction ID)
- to : str
Target node identifier (species or reaction ID)
- stoichiometry : float
Stoichiometric coefficient for the edge
- sbo_term : str
SBO term defining the relationship type
- r_id : str
Reaction identifier associated with the edge
- Return type:
pd.DataFrame
Notes
The function processes reaction species in two phases:
- Interactor Processing: Pairs of interactors (SBO:0000336) are processed en-masse and converted to wide format edges.
- Tiered Processing: Non-interactor species are processed using tiered algorithms based on the wiring approach hierarchy. This creates edges between entities at different tiers in the hierarchy.
Reactions with ≤1 species are automatically dropped as they represent underspecified reactions (e.g., autoregulation or reactions with removed cofactors).
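The partitioning implied by these notes can be sketched as follows. The sketch works on plain dictionaries and only shows how reactions might be routed to each phase; `split_reaction_species` is a hypothetical helper and the actual edge construction is omitted.

```python
from collections import defaultdict

INTERACTOR = "SBO:0000336"


def split_reaction_species(rows, interactor_term=INTERACTOR):
    """Route each reaction to interactor processing, tiered processing, or drop it."""
    by_reaction = defaultdict(list)
    for row in rows:
        by_reaction[row["r_id"]].append(row)

    duos, tiered, dropped = {}, {}, []
    for r_id, members in by_reaction.items():
        if len(members) <= 1:
            # Underspecified reaction: nothing to connect.
            dropped.append(r_id)
        elif len(members) == 2 and all(m["sbo_term"] == interactor_term for m in members):
            duos[r_id] = members
        else:
            tiered[r_id] = members
    return duos, tiered, dropped


rows = [
    {"r_id": "R1", "sc_id": "A", "stoichiometry": -1, "sbo_term": "SBO:0000010"},
    {"r_id": "R1", "sc_id": "B", "stoichiometry": 1, "sbo_term": "SBO:0000011"},
    {"r_id": "R2", "sc_id": "C", "stoichiometry": 0, "sbo_term": INTERACTOR},
    {"r_id": "R2", "sc_id": "D", "stoichiometry": 0, "sbo_term": INTERACTOR},
    {"r_id": "R3", "sc_id": "E", "stoichiometry": -1, "sbo_term": "SBO:0000010"},
]

duos, tiered, dropped = split_reaction_species(rows)
```

In this toy input R2 goes to the interactor phase, R1 to the tiered phase, and R3 is dropped because it has a single species.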
Examples
>>> from napistu.network import net_create_utils
>>> from napistu.constants import SBML_DFS, MINI_SBO_FROM_NAME, SBOTERM_NAMES
>>> import pandas as pd
>>>
>>> # Create sample reaction species data
>>> reaction_species = pd.DataFrame({
...     SBML_DFS.R_ID: ['R1', 'R1', 'R2', 'R2'],
...     SBML_DFS.SC_ID: ['A', 'B', 'C', 'D'],
...     SBML_DFS.STOICHIOMETRY: [-1, 1, 0, 0],
...     SBML_DFS.SBO_TERM: [
...         MINI_SBO_FROM_NAME[SBOTERM_NAMES.REACTANT],
...         MINI_SBO_FROM_NAME[SBOTERM_NAMES.PRODUCT],
...         MINI_SBO_FROM_NAME[SBOTERM_NAMES.INTERACTOR],
...         MINI_SBO_FROM_NAME[SBOTERM_NAMES.INTERACTOR]
...     ]
... })
>>>
>>> # Wire the reaction species using regulatory approach
>>> edges = net_create_utils.wire_reaction_species(
...     reaction_species,
...     wiring_approach='regulatory',
...     drop_reactions_when='same_tier'
... )
- Raises:
ValueError – If wiring_approach is not a valid value, if drop_reactions_when is not a valid value, or if reaction species have unusable SBO terms.
See also
format_tiered_reaction_species : Process individual reactions with tiered algorithms
create_graph_hierarchy_df : Create hierarchy DataFrame for wiring approach