napistu.network.net_create_utils

Functions

create_graph_hierarchy_df(wiring_approach)

Create a DataFrame representing the graph hierarchy for a given wiring approach.

format_tiered_reaction_species(rxn_species, ...)

Create a Napistu graph from a reaction and its species.

wire_reaction_species(reaction_species, ...)

Convert reaction species data into network edges using specified wiring approach.

napistu.network.net_create_utils._find_sbo_duos(reaction_species: DataFrame, target_sbo_term: str = 'SBO:0000336') → list[str]

Find r_ids that have exactly 2 rows with the specified sbo_term and no other sbo_terms.

Parameters:
  • reaction_species (pd.DataFrame) – DataFrame with columns: sbo_term, sc_id, stoichiometry, r_id

  • target_sbo_term (str) – The sbo_term to match (e.g., “SBO:0000336” aka “interactor”)

Returns:

List of r_ids that meet the criteria

Return type:

list[str]
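
The selection rule above can be sketched with plain pandas. This is an illustrative reimplementation, not the library's code: the function name find_sbo_duos_sketch is hypothetical, and the column names are taken from the parameter description.

```python
import pandas as pd


def find_sbo_duos_sketch(reaction_species, target_sbo_term="SBO:0000336"):
    """Return r_ids with exactly two rows, both carrying target_sbo_term."""
    grouped = reaction_species.groupby("r_id")["sbo_term"]
    n_rows = grouped.size()
    n_target = grouped.apply(lambda terms: (terms == target_sbo_term).sum())
    # Two rows in total and two target rows means no other sbo_terms are present
    is_duo = (n_rows == 2) & (n_target == 2)
    return is_duo[is_duo].index.tolist()


reaction_species = pd.DataFrame({
    "r_id": ["R1", "R1", "R2", "R2", "R3", "R3", "R3"],
    "sc_id": ["A", "B", "C", "D", "E", "F", "G"],
    "stoichiometry": [0, 0, -1, 1, 0, 0, 0],
    "sbo_term": [
        "SBO:0000336", "SBO:0000336",  # R1: an interactor pair
        "SBO:0000010", "SBO:0000011",  # R2: reactant and product
        "SBO:0000336", "SBO:0000336", "SBO:0000336",  # R3: three interactors
    ],
})

duos = find_sbo_duos_sketch(reaction_species)
```

Only R1 qualifies here: R2 has two rows but neither is an interactor, and R3 has three interactor rows.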

napistu.network.net_create_utils._format_cross_tier_edges(entities_ordered_by_tier: DataFrame, r_id: str, drop_reactions_when: str = 'same_tier')

Format edges for reactions where participants are on different tiers of a wiring hierarchy.

Parameters:
  • entities_ordered_by_tier (pd.DataFrame) – DataFrame of entities ordered by tier.

  • r_id (str) – Reaction ID.

  • drop_reactions_when (str, optional) – The condition under which to drop reactions as a network vertex. Default is ‘same_tier’.

Returns:

DataFrame of formatted edges for cross-tier reactions.

Return type:

pd.DataFrame

napistu.network.net_create_utils._format_same_tier_edges(rxn_species: DataFrame, r_id: str) → DataFrame

Format edges for reactions where all participants are on the same tier of a wiring hierarchy.

Parameters:
  • rxn_species (pd.DataFrame) – DataFrame of reaction species for the reaction.

  • r_id (str) – Reaction ID.

Returns:

DataFrame of formatted edges for same-tier reactions.

Return type:

pd.DataFrame

Raises:

ValueError – If the reaction has more than one distinct set of metadata.

napistu.network.net_create_utils._format_tier_combo(upstream_tier: DataFrame, downstream_tier: DataFrame) → DataFrame

Create all edges between two tiers of a tiered reaction graph.

This function generates a set of edges by performing an all-vs-all combination between entities in the upstream and downstream tiers. Tiers represent an ordering along the molecular entities in a reaction, plus a tier for the reaction itself. Attributes such as stoichiometry and sbo_term are assigned from both the upstream and downstream tiers, providing complete information about both endpoints of each edge. Reaction entities have neither a stoichiometry nor sbo_term annotation, so these attributes will be missing (None/NaN) when a tier is a reaction.

Parameters:
  • upstream_tier (pd.DataFrame) – DataFrame containing upstream entities in a reaction (e.g., regulators or substrates).

  • downstream_tier (pd.DataFrame) – DataFrame containing downstream entities in a reaction (e.g., products or targets).

Returns:

DataFrame of edges, each with columns: ‘from’, ‘to’, ‘stoichiometry_upstream’, ‘stoichiometry_downstream’, ‘sbo_term_upstream’, ‘sbo_term_downstream’, and ‘r_id’. The number of edges is the product of the number of entities in the upstream tier and the number in the downstream tier. Attributes will be missing (None/NaN) if the corresponding tier is a reaction.

Return type:

pd.DataFrame

Notes

  • This function is used to build the edge list for tiered graphs, where each tier represents a functional group (e.g., substrates, products, modifiers, reaction).

  • Both upstream and downstream attributes are included when available from the respective tiers.

  • Reaction entities themselves do not contribute stoichiometry or sbo_term attributes.
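
The all-vs-all combination described above amounts to a cross join. The sketch below is a minimal illustration under stated assumptions: tier_combo_sketch and the "entity_id" column are hypothetical names, and r_id is passed as an argument here for simplicity, whereas the documented function takes only the two tiers.

```python
import pandas as pd


def tier_combo_sketch(upstream_tier, downstream_tier, r_id):
    """All-vs-all edges between two tiers (illustrative, not the library code)."""
    # "entity_id" is a hypothetical column name for the vertex identifier
    upstream = upstream_tier.rename(columns={
        "entity_id": "from",
        "stoichiometry": "stoichiometry_upstream",
        "sbo_term": "sbo_term_upstream",
    })
    downstream = downstream_tier.rename(columns={
        "entity_id": "to",
        "stoichiometry": "stoichiometry_downstream",
        "sbo_term": "sbo_term_downstream",
    })
    # A cross join pairs every upstream row with every downstream row
    edges = upstream.merge(downstream, how="cross")
    edges["r_id"] = r_id
    return edges


substrates = pd.DataFrame({
    "entity_id": ["A", "B"],
    "stoichiometry": [-1.0, -1.0],
    "sbo_term": ["SBO:0000010", "SBO:0000010"],
})
# The reaction tier carries no stoichiometry or sbo_term
reaction = pd.DataFrame({
    "entity_id": ["R1"],
    "stoichiometry": [None],
    "sbo_term": [None],
})

edges = tier_combo_sketch(substrates, reaction, r_id="R1")
```

With two substrates and one reaction vertex the result has 2 × 1 = 2 edges, and the downstream attributes are missing because the downstream tier is the reaction.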

napistu.network.net_create_utils._interactor_duos_to_wide(interactor_duos: DataFrame)

Convert paired long format to wide format with ‘from’ and ‘to’ columns.

Parameters:

interactor_duos (pd.DataFrame) – DataFrame with exactly 2 rows per r_id, containing sc_id and stoichiometry

Returns:

Wide format with from_sc_id, from_stoichiometry, to_sc_id, to_stoichiometry columns

Return type:

pd.DataFrame
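
One way to picture the long-to-wide conversion is sketched below. It is not the library's implementation: duos_to_wide_sketch is a hypothetical name, and treating the first row of each pair as "from" and the second as "to" is an assumption made for illustration.

```python
import pandas as pd


def duos_to_wide_sketch(interactor_duos):
    """Pivot two rows per r_id into one row with from_* and to_* columns."""
    # Position of each row within its reaction: 0 becomes "from", 1 becomes "to"
    position = interactor_duos.groupby("r_id").cumcount()
    value_columns = ["sc_id", "stoichiometry"]
    from_rows = (
        interactor_duos[position == 0]
        .set_index("r_id")[value_columns]
        .add_prefix("from_")
    )
    to_rows = (
        interactor_duos[position == 1]
        .set_index("r_id")[value_columns]
        .add_prefix("to_")
    )
    return from_rows.join(to_rows).reset_index()


interactor_duos = pd.DataFrame({
    "r_id": ["R1", "R1", "R2", "R2"],
    "sc_id": ["A", "B", "C", "D"],
    "stoichiometry": [0, 0, 0, 0],
})

wide = duos_to_wide_sketch(interactor_duos)
```

Each reaction collapses to a single row, so the four input rows become two edges (A to B and C to D).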

napistu.network.net_create_utils._log_pathological_same_tier(distinct_metadata: DataFrame, r_id: str) → None

Log a warning if a reaction has more than one distinct set of metadata.

napistu.network.net_create_utils._reaction_species_to_tiers(rxn_species: DataFrame, graph_hierarchy_df: DataFrame, r_id: str) → DataFrame

Map reaction species to tiers based on the graph hierarchy.

Parameters:
  • rxn_species (pd.DataFrame) – DataFrame of reaction species.

  • graph_hierarchy_df (pd.DataFrame) – DataFrame defining the graph hierarchy.

  • r_id (str) – Reaction ID.

Returns:

DataFrame of entities ordered by tier.

Return type:

pd.DataFrame
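
The mapping can be pictured as a lookup of each species' tier followed by a sort, with the reaction itself added as its own entity. The sketch below is an assumption-laden illustration: species_to_tiers_sketch, the "entity_id" column, the tier numbers, and the example hierarchy are all hypothetical, and the catalyst and modifier SBO identifiers come from the Systems Biology Ontology rather than from this page.

```python
import pandas as pd


def species_to_tiers_sketch(rxn_species, graph_hierarchy_df, r_id):
    """Order a reaction's species and the reaction vertex by tier."""
    # Tiers for species roles only; the reaction row has no sbo_term
    species_tiers = graph_hierarchy_df.dropna(subset=["sbo_term"])[["sbo_term", "tier"]]
    species = (
        rxn_species.reset_index()  # sbo_term moves from the index to a column
        .merge(species_tiers, on="sbo_term", how="left")
        .rename(columns={"sc_id": "entity_id"})
    )
    is_reaction = graph_hierarchy_df["sbo_name"] == "reaction"
    reaction_tier = int(graph_hierarchy_df.loc[is_reaction, "tier"].iloc[0])

    columns = ["entity_id", "sbo_term", "stoichiometry", "tier"]
    records = species[columns].to_dict("records")
    # The reaction is an entity too, without stoichiometry or sbo_term
    records.append({
        "entity_id": r_id,
        "sbo_term": None,
        "stoichiometry": None,
        "tier": reaction_tier,
    })
    ordered = pd.DataFrame(records).sort_values("tier", kind="stable")
    return ordered.reset_index(drop=True)


# Hypothetical hierarchy: modifier -> catalyst -> reactant -> reaction -> product
graph_hierarchy_df = pd.DataFrame({
    "sbo_name": ["modifier", "catalyst", "reactant", "reaction", "product"],
    "tier": [0, 1, 2, 3, 4],
    "sbo_term": ["SBO:0000019", "SBO:0000013", "SBO:0000010", None, "SBO:0000011"],
})
rxn_species = pd.DataFrame(
    {"sc_id": ["P", "S", "E"], "stoichiometry": [1.0, -1.0, 0.0]},
    index=pd.Index(["SBO:0000011", "SBO:0000010", "SBO:0000013"], name="sbo_term"),
)

ordered = species_to_tiers_sketch(rxn_species, graph_hierarchy_df, r_id="R1")
```

In this example the enzyme E sorts first, then the substrate S, the reaction R1, and finally the product P.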

napistu.network.net_create_utils._should_drop_reaction(entities_ordered_by_tier: DataFrame, drop_reactions_when: str = 'same_tier')

Determine if a reaction should be dropped based on regulatory relationships and stringency.

Parameters:
  • entities_ordered_by_tier (pd.DataFrame) – The entities ordered by tier.

  • drop_reactions_when (str, optional) – The desired stringency for dropping reactions. Default is ‘same_tier’.

Returns:

True if the reaction should be dropped, False otherwise.

Return type:

bool

Notes

Reactions are always dropped if all of their participants are on the same tier. This greatly decreases the number of vertices in a graph constructed from relatively dense interaction networks like STRING.

Raises:

ValueError – If drop_reactions_when is not a valid value.

napistu.network.net_create_utils._validate_interactor_duos(interactor_duos: DataFrame)

Log cases where a pair of interactors has non-zero stoichiometry.

napistu.network.net_create_utils._validate_sbo_indexed_rsc_stoi(rxn_species: DataFrame) → None

Validate that rxn_species is a DataFrame with correct index and columns.

Parameters:

rxn_species (pd.DataFrame) – DataFrame of reaction species, indexed by SBO_TERM.

Return type:

None

Raises:
  • TypeError – If rxn_species is not a pandas DataFrame.

  • ValueError – If index or columns are not as expected.
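
A validator with this contract might look like the sketch below. The function name validate_rxn_species_sketch is hypothetical, and the required column set is an assumption: the page states only that the index is the SBO term and does not list the expected columns.

```python
import pandas as pd

EXPECTED_INDEX_NAME = "sbo_term"
# Assumed column set for illustration; the documentation does not list it
EXPECTED_COLUMNS = {"sc_id", "stoichiometry"}


def validate_rxn_species_sketch(rxn_species):
    """Raise TypeError or ValueError if rxn_species has the wrong shape."""
    if not isinstance(rxn_species, pd.DataFrame):
        raise TypeError(
            f"rxn_species must be a pandas DataFrame, got {type(rxn_species).__name__}"
        )
    if rxn_species.index.name != EXPECTED_INDEX_NAME:
        raise ValueError(
            f"rxn_species index must be named {EXPECTED_INDEX_NAME!r}, "
            f"got {rxn_species.index.name!r}"
        )
    missing = EXPECTED_COLUMNS - set(rxn_species.columns)
    if missing:
        raise ValueError(f"rxn_species is missing columns: {sorted(missing)}")


valid = pd.DataFrame(
    {"sc_id": ["A", "B"], "stoichiometry": [-1.0, 1.0]},
    index=pd.Index(["SBO:0000010", "SBO:0000011"], name="sbo_term"),
)
validate_rxn_species_sketch(valid)  # passes silently
```

Checking the type before the index and columns keeps the error messages specific: a list or dict fails with TypeError before any attribute access is attempted.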

napistu.network.net_create_utils.create_graph_hierarchy_df(wiring_approach: str) → DataFrame

Create a DataFrame representing the graph hierarchy for a given wiring approach.

Parameters:

wiring_approach (str) – The type of tiered graph to work with. Each type has its own specification in constants.py.

Returns:

DataFrame with sbo_name, tier, and sbo_term.

Return type:

pd.DataFrame

Raises:

ValueError – If wiring_approach is not valid.
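
To make the shape of the returned table concrete, the sketch below builds a hierarchy from an ordered list of tiers. Everything in it is an assumption for illustration: the specification format, the names WIRING_SPECS_SKETCH and hierarchy_df_sketch, the zero-based tier numbering, and the single role per tier. The tier order follows the 'regulatory' description given for wire_reaction_species, and the catalyst and modifier SBO identifiers come from the Systems Biology Ontology rather than from this page.

```python
import pandas as pd

# Hypothetical specification: one list of role names per tier, upstream first
WIRING_SPECS_SKETCH = {
    "regulatory": [["modifier"], ["catalyst"], ["reactant"], ["reaction"], ["product"]],
}
# The reaction vertex has no SBO term, so it is absent from this lookup
SBO_TERMS_SKETCH = {
    "modifier": "SBO:0000019",
    "catalyst": "SBO:0000013",
    "reactant": "SBO:0000010",
    "product": "SBO:0000011",
}


def hierarchy_df_sketch(wiring_approach):
    """Return one row per role with its tier and SBO term."""
    if wiring_approach not in WIRING_SPECS_SKETCH:
        raise ValueError(
            f"{wiring_approach!r} is not a valid wiring approach; "
            f"expected one of {sorted(WIRING_SPECS_SKETCH)}"
        )
    rows = [
        {"sbo_name": name, "tier": tier, "sbo_term": SBO_TERMS_SKETCH.get(name)}
        for tier, names in enumerate(WIRING_SPECS_SKETCH[wiring_approach])
        for name in names
    ]
    return pd.DataFrame(rows)


hierarchy = hierarchy_df_sketch("regulatory")
```

The resulting table has the three documented columns (sbo_name, tier, sbo_term), with a missing sbo_term on the reaction row.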

napistu.network.net_create_utils.format_tiered_reaction_species(rxn_species: DataFrame, r_id: str, graph_hierarchy_df: DataFrame, drop_reactions_when: str = 'same_tier') → DataFrame

Create a Napistu graph from a reaction and its species.

Parameters:
  • rxn_species (pd.DataFrame) – The reaction’s participants. Should be indexed by sbo_term.

  • r_id (str) – The ID of the reaction.

  • graph_hierarchy_df (pd.DataFrame) – The graph hierarchy.

  • drop_reactions_when (str, optional) – The condition under which to drop reactions as a network vertex. Default is ‘same_tier’.

Returns:

The edges of the Napistu graph for a single reaction.

Return type:

pd.DataFrame

napistu.network.net_create_utils.wire_reaction_species(reaction_species: DataFrame, wiring_approach: str, drop_reactions_when: str) → DataFrame

Convert reaction species data into network edges using specified wiring approach.

This function processes reaction species data to create network edges that represent the relationships between molecular entities in a biological network. It handles both interactor pairs (processed en-masse) and other reaction species (processed using tiered algorithms based on the wiring approach).

Parameters:
  • reaction_species (pd.DataFrame) –

    DataFrame containing reaction species data with columns:

    • r_id : str

      Reaction identifier

    • sc_id : str

      Compartmentalized species identifier

    • stoichiometry : float

      Stoichiometric coefficient (negative for reactants, positive for products, 0 for modifiers)

    • sbo_term : str

      Systems Biology Ontology term defining the role of the species in the reaction (e.g., ‘SBO:0000010’ for reactant, ‘SBO:0000011’ for product, ‘SBO:0000336’ for interactor)

  • wiring_approach (str) – The wiring approach to use for creating the network. Must be one of:

    • ‘bipartite’ : Creates bipartite network with molecules connected to reactions

    • ‘regulatory’ : Creates regulatory hierarchy (modifiers -> catalysts -> reactants -> reactions -> products)

    • ‘surrogate’ : Alternative layout with enzymes downstream of substrates

  • drop_reactions_when (str) – Condition under which to drop reactions as network vertices. Must be one of:

    • ‘always’ : Always drop reaction vertices

    • ‘edgelist’ : Drop if there are exactly 2 participants

    • ‘same_tier’ : Drop if there are 2 participants which are both “interactor”

Returns:

DataFrame containing network edges with columns:

  • from : str

    Source node identifier (species or reaction ID)

  • to : str

    Target node identifier (species or reaction ID)

  • stoichiometry : float

    Stoichiometric coefficient for the edge

  • sbo_term : str

    SBO term defining the relationship type

  • r_id : str

    Reaction identifier associated with the edge

Return type:

pd.DataFrame

Notes

The function processes reaction species in two phases:

  1. Interactor Processing: Pairs of interactors (SBO:0000336) are processed en-masse and converted to wide format edges.

  2. Tiered Processing: Non-interactor species are processed using tiered algorithms based on the wiring approach hierarchy. This creates edges between entities at different tiers in the hierarchy.

Reactions with ≤1 species are automatically dropped as they represent underspecified reactions (e.g., autoregulation or reactions with removed cofactors).
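
The filtering of underspecified reactions can be sketched as a count per r_id. This is an illustration only: drop_underspecified_sketch is a hypothetical name, and counting participant rows (rather than, say, unique species) is an assumption about how "≤1 species" is measured.

```python
import pandas as pd


def drop_underspecified_sketch(reaction_species):
    """Keep only reactions with more than one participant row."""
    # Attach each row's reaction size so rows can be filtered directly
    rows_per_reaction = reaction_species["r_id"].map(
        reaction_species["r_id"].value_counts()
    )
    return reaction_species[rows_per_reaction > 1]


reaction_species = pd.DataFrame({
    "r_id": ["R1", "R1", "R2"],
    "sc_id": ["A", "B", "C"],
    "stoichiometry": [-1, 1, 0],
    "sbo_term": ["SBO:0000010", "SBO:0000011", "SBO:0000336"],
})

kept = drop_underspecified_sketch(reaction_species)
```

R2 has a single participant, so it cannot yield an edge between two species and is removed; both rows of R1 are kept.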

Examples

>>> from napistu.network import net_create_utils
>>> from napistu.constants import SBML_DFS, MINI_SBO_FROM_NAME, SBOTERM_NAMES
>>> import pandas as pd
>>>
>>> # Create sample reaction species data
>>> reaction_species = pd.DataFrame({
...     SBML_DFS.R_ID: ['R1', 'R1', 'R2', 'R2'],
...     SBML_DFS.SC_ID: ['A', 'B', 'C', 'D'],
...     SBML_DFS.STOICHIOMETRY: [-1, 1, 0, 0],
...     SBML_DFS.SBO_TERM: [
...         MINI_SBO_FROM_NAME[SBOTERM_NAMES.REACTANT],
...         MINI_SBO_FROM_NAME[SBOTERM_NAMES.PRODUCT],
...         MINI_SBO_FROM_NAME[SBOTERM_NAMES.INTERACTOR],
...         MINI_SBO_FROM_NAME[SBOTERM_NAMES.INTERACTOR]
...     ]
... })
>>>
>>> # Wire the reaction species using regulatory approach
>>> edges = net_create_utils.wire_reaction_species(
...     reaction_species,
...     wiring_approach='regulatory',
...     drop_reactions_when='same_tier'
... )

Raises:

ValueError – If wiring_approach or drop_reactions_when is not a valid value, or if reaction species have unusable SBO terms.

See also

format_tiered_reaction_species

Process individual reactions with tiered algorithms

create_graph_hierarchy_df

Create hierarchy DataFrame for wiring approach