napistu.network.ng_utils

Utilities specific to NapistuGraph objects and the wider Napistu ecosystem.

This module contains utilities that are specific to NapistuGraph subclasses and require knowledge of the Napistu data model (SBML_dfs objects, etc.).

Functions

apply_weight_transformations(edges_df, ...)

Apply Weight Transformations to edge attributes.

compartmentalize_species(sbml_dfs, species)

Compartmentalize Species

compartmentalize_species_pairs(sbml_dfs, ...)

Compartmentalize Shortest Paths

create_entity_attrs_from_data_tables(...[, ...])

Create entity_attrs configuration from data tables.

format_napistu_graph_summary(data)

Format NapistuGraph summary data into a clean summary table for Jupyter display

get_minimal_sources_edges(vertices, sbml_dfs)

Assign edges to a set of sources.

get_sbml_dfs_vertex_summaries(sbml_dfs[, ...])

Prepare species and reaction ontology and/or source occurrence summaries which are ready to be merged with NapistuGraph vertices.

pluck_data(data_tables, entity_attrs)

Pluck data from a dictionary of DataFrames based on specified attributes.

pluck_entity_data(sbml_dfs, entity_attrs, ...)

Pluck Entity Attributes from an sbml_dfs based on a set of tables and variables to look for.

prepare_entity_data_extraction(graph, ...[, ...])

Prepare entity data extraction by validating inputs and determining which attributes to extract.

read_graph_attrs_spec(graph_attrs_spec_uri)

Read a YAML file containing the specification for adding reaction- and/or species-attributes to a napistu_graph.

separate_entity_attrs_by_source(...[, ...])

Separate entity attributes by data source (SBML vs side-loaded).

validate_assets(sbml_dfs[, napistu_graph, ...])

Validate Assets

class napistu.network.ng_utils._EntityAttrValidator(*, table: str, variable: str, trans: str | None = 'identity')

Bases: BaseModel

_abc_impl = <_abc._abc_data object>
model_config = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

table: str
trans: str | None
variable: str
napistu.network.ng_utils._validate_assets_graph_dist(napistu_graph: NapistuGraph, precomputed_distances: pd.DataFrame) None

Check a NapistuGraph and precomputed distances table for inconsistencies.

Parameters:
  • napistu_graph ("NapistuGraph") – The network representation (subclass of igraph.Graph).

  • precomputed_distances (pandas.DataFrame) – Precomputed distances between vertices in the network.

Return type:

None

Warns:

If edge weights are inconsistent between the graph and precomputed distances.

napistu.network.ng_utils._validate_assets_sbml_graph(sbml_dfs: sbml_dfs_core.SBML_dfs, napistu_graph: 'NapistuGraph' | ig.Graph) None

Check an sbml_dfs model and NapistuGraph for inconsistencies.

Parameters:
  • sbml_dfs (sbml_dfs_core.SBML_dfs) – The pathway representation.

  • napistu_graph ("NapistuGraph") – The network representation (subclass of igraph.Graph).

Return type:

None

Raises:

ValueError – If species names do not match between sbml_dfs and napistu_graph.

napistu.network.ng_utils._validate_entity_attrs(entity_attrs: dict, validate_transformations: bool = True, custom_transformations: dict | None = None) None

Validate that graph attributes are in a valid format.

Parameters:
  • entity_attrs (dict) –

    Dictionary of entity attributes to validate. The structure should be:

        {
            "attr_name": {
                "table": "table_name",
                "variable": "variable_name",
                "trans": "transformation_name"
            }
        }

    where “table” is the name of the table in the sbml_dfs to look for the variable, “variable” is the name of the variable in the table, and “trans” (optional) is the name of the transformation to apply to the variable.

  • validate_transformations (bool, optional) – Whether to validate transformation names, by default True.

  • custom_transformations (dict, optional) – Dictionary of custom transformation functions, by default None. Keys are transformation names, values are transformation functions.

Return type:

None

Raises:
  • AssertionError – If entity_attrs is not a dictionary.

  • ValueError – If a transformation is not found in DEFINED_WEIGHT_TRANSFORMATION or custom_transformations.
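The checks described above can be sketched in plain Python. This is an illustration only, not napistu's implementation: `check_entity_attrs` and `KNOWN_TRANSFORMATIONS` are hypothetical names, the set of built-in transformation names is assumed, and the fallback to "identity" when "trans" is absent is inferred from the default shown for _EntityAttrValidator.

```python
# Hypothetical stand-in for the library's registry of built-in transformation names.
KNOWN_TRANSFORMATIONS = {"identity", "string", "string_inv"}


def check_entity_attrs(entity_attrs, validate_transformations=True, custom_transformations=None):
    """Check that entity_attrs maps names to {"table", "variable"[, "trans"]} dicts."""
    assert isinstance(entity_attrs, dict), "entity_attrs must be a dictionary"
    allowed = KNOWN_TRANSFORMATIONS | set(custom_transformations or {})
    for name, spec in entity_attrs.items():
        missing = {"table", "variable"} - set(spec)
        if missing:
            raise ValueError(f"{name} is missing required keys: {sorted(missing)}")
        # "trans" is optional; assume it falls back to "identity" when absent or None.
        trans = spec.get("trans") or "identity"
        if validate_transformations and trans not in allowed:
            raise ValueError(f"{name}: unknown transformation {trans!r}")


# Passes silently: the structure is valid and "string" is an allowed name in this sketch.
check_entity_attrs(
    {"weight": {"table": "string", "variable": "combined_score", "trans": "string"}}
)
```

Passing a custom_transformations dict widens the set of accepted names, which is why a custom name such as "square" would pass here only when it is supplied.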

napistu.network.ng_utils._validate_side_loaded_attributes(side_loaded_attributes: dict[str, DataFrame]) None

Validate that side_loaded_attributes is a dict of DataFrames with consistent index names.

This function ensures that all DataFrames in the dictionary can be concatenated by checking that they have the same index structure (single or multi-index).

Parameters:

side_loaded_attributes (dict[str, pd.DataFrame]) – Dictionary mapping table names to DataFrames

Raises:
  • TypeError – If side_loaded_attributes is not a dict or contains non-DataFrame values

  • ValueError – If DataFrames have inconsistent index names or structures

napistu.network.ng_utils._wt_transformation_identity(x)

Identity transformation for weights.

Parameters:

x (any) – Input value.

Returns:

The input value unchanged.

Return type:

any

napistu.network.ng_utils._wt_transformation_string(x)

Map STRING scores to a similar scale as topology weights.

Parameters:

x (float) – STRING score.

Returns:

Transformed STRING score.

Return type:

float

napistu.network.ng_utils._wt_transformation_string_inv(x)

Map STRING scores so they work with source weights.

Parameters:

x (float) – STRING score.

Returns:

Inverse transformed STRING score.

Return type:

float

napistu.network.ng_utils.apply_weight_transformations(edges_df: DataFrame, reaction_attrs: dict, custom_transformations: dict = None)

Apply Weight Transformations to edge attributes.

Parameters:
  • edges_df (pd.DataFrame) – A table of edges and their attributes extracted from a NapistuGraph.

  • reaction_attrs (dict) – A dictionary of attributes identifying weighting attributes within an sbml_dfs’ reactions_data, how they will be named in edges_df (the keys), and how they should be transformed (the “trans” aliases).

  • custom_transformations (dict, optional) – A dictionary mapping transformation names to functions. If provided, these will be checked before built-in transformations.

Returns:

edges_df with weight variables transformed.

Return type:

pd.DataFrame

Raises:

ValueError – If a weighting variable is missing or transformation is not found.
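The lookup order described above (custom transformations first, then built-ins) can be sketched without pandas. This is a simplified illustration under stated assumptions: `transform_columns` and `BUILT_IN` are hypothetical names, a dict of lists stands in for edges_df, and the "neg_log" transformation is invented for the example.

```python
import math

# Hypothetical built-ins; the real registry lives inside napistu.
BUILT_IN = {"identity": lambda x: x}


def transform_columns(columns, reaction_attrs, custom_transformations=None):
    """Return a copy of `columns` with each weighting variable transformed.

    `columns` maps column names to lists of values (a stand-in for edges_df).
    """
    custom = custom_transformations or {}
    out = dict(columns)
    for name, spec in reaction_attrs.items():
        if name not in columns:
            raise ValueError(f"weighting variable {name!r} is missing")
        trans_name = spec.get("trans") or "identity"
        # Custom transformations are consulted before the built-in ones.
        if trans_name in custom:
            fxn = custom[trans_name]
        elif trans_name in BUILT_IN:
            fxn = BUILT_IN[trans_name]
        else:
            raise ValueError(f"transformation {trans_name!r} was not found")
        out[name] = [fxn(value) for value in columns[name]]
    return out


edges = {"from": ["a", "b"], "to": ["b", "c"], "string_wt": [400.0, 900.0]}
attrs = {"string_wt": {"table": "string", "variable": "combined_score", "trans": "neg_log"}}
result = transform_columns(edges, attrs, {"neg_log": lambda x: -math.log(x / 1000)})
```

Only the columns named as keys of reaction_attrs are touched; the other columns are carried over unchanged.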

napistu.network.ng_utils.compartmentalize_species(sbml_dfs: SBML_dfs, species: str | list[str]) DataFrame

Compartmentalize Species

Returns the compartmentalized species IDs (sc_ids) corresponding to a list of species (s_ids)

Parameters:
  • sbml_dfs (SBML_dfs) – A model formed by aggregating pathways

  • species (str | list[str]) – One or more species IDs (s_ids)

Return type:

pd.DataFrame containing the s_id and sc_id pairs
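As a rough illustration of the s_id to sc_id lookup, the sketch below uses a list of toy records in place of the model's compartmentalized species table. `lookup_sc_ids` and the IDs are hypothetical; the real function works on an SBML_dfs object and returns a DataFrame.

```python
# Toy (sc_id, s_id) records standing in for the compartmentalized species table.
COMPARTMENTALIZED_SPECIES = [
    ("SC1", "S1"),  # S1 in one compartment
    ("SC2", "S1"),  # S1 in a second compartment
    ("SC3", "S2"),
]


def lookup_sc_ids(records, species):
    """Return (s_id, sc_id) pairs for a single species ID or a list of them."""
    if isinstance(species, str):
        species = [species]
    wanted = set(species)
    return [(s_id, sc_id) for sc_id, s_id in records if s_id in wanted]


pairs = lookup_sc_ids(COMPARTMENTALIZED_SPECIES, "S1")
```

A species present in several compartments yields one row per compartment, so the result can be longer than the input.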

napistu.network.ng_utils.compartmentalize_species_pairs(sbml_dfs: SBML_dfs, origin_species: str | list[str], dest_species: str | list[str]) DataFrame

Compartmentalize Shortest Paths

For a set of origin and destination species pairs, consider each species in every compartment it operates in, separately.

Parameters:
  • sbml_dfs (SBML_dfs) – A model formed by aggregating pathways

  • origin_species (str | list[str]) – Species IDs as starting points

  • dest_species (str | list[str]) – Species IDs as ending points

Return type:

pd.DataFrame containing pairs of origin and destination compartmentalized species
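One way to picture the expansion is a cross product over compartmentalized IDs. The sketch below assumes every compartmentalized origin is paired with every compartmentalized destination; `expand_pairs` and the toy mapping are hypothetical and stand in for the SBML_dfs lookup.

```python
from itertools import product

# Toy mapping from species ID to the compartmentalized species IDs it appears as.
SC_IDS_BY_SPECIES = {"S1": ["SC1", "SC2"], "S2": ["SC3"], "S3": ["SC4", "SC5"]}


def expand_pairs(sc_ids_by_species, origin_species, dest_species):
    """Cross every compartmentalized origin with every compartmentalized destination."""

    def expand(species):
        species = [species] if isinstance(species, str) else species
        return [sc for s in species for sc in sc_ids_by_species.get(s, [])]

    return list(product(expand(origin_species), expand(dest_species)))


# Two compartmentalized origins crossed with three compartmentalized destinations.
pairs = expand_pairs(SC_IDS_BY_SPECIES, "S1", ["S2", "S3"])
```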

napistu.network.ng_utils.create_entity_attrs_from_data_tables(entity_data_dict: dict[str, DataFrame], table_names: list[str] | None = None, add_name_prefixes: bool = True) dict[str, dict[str, str]]

Create entity_attrs configuration from data tables.

This utility converts a dictionary of data tables into the entity_attrs format expected by NapistuGraph methods, automatically generating attribute configurations for all columns in the specified tables.

Parameters:
  • entity_data_dict (dict[str, pd.DataFrame]) – Dictionary mapping table names to DataFrames (e.g., sbml_dfs.species_data)

  • table_names (Optional[list[str]], default=None) – Specific table names to include. If None, includes all available tables.

  • add_name_prefixes (bool, default=True) – Whether to prefix attribute names with table name (e.g., “table_name_column_name”)

Returns:

Entity attributes configuration dictionary in the format:

    {
        "attr_name": {
            "table": "table_name",
            "variable": "column_name"
        }
    }

Return type:

dict[str, dict[str, str]]

Raises:

ValueError – If requested table names don’t exist in entity_data_dict

Examples

Create attrs from all species data tables:

>>> entity_attrs = create_entity_attrs_from_data_tables(sbml_dfs.species_data)

Create attrs from specific tables:

>>> entity_attrs = create_entity_attrs_from_data_tables(
...     sbml_dfs.reactions_data,
...     table_names=["kinetics", "literature"]
... )

Create attrs without table name prefixes:

>>> entity_attrs = create_entity_attrs_from_data_tables(
...     sbml_dfs.species_data,
...     add_name_prefixes=False
... )
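To make the generated structure concrete, here is a minimal pure-Python sketch of the same idea. It is not the library's code: `build_entity_attrs` is a hypothetical name, and a dict of column-name lists stands in for the dict of DataFrames.

```python
def build_entity_attrs(table_columns, table_names=None, add_name_prefixes=True):
    """Build an entity_attrs-style dict from {table_name: [column, ...]}.

    `table_columns` is a stand-in for a dict of DataFrames; only column names matter here.
    """
    if table_names is None:
        table_names = list(table_columns)
    missing = [t for t in table_names if t not in table_columns]
    if missing:
        raise ValueError(f"requested tables not found: {missing}")

    entity_attrs = {}
    for table in table_names:
        for column in table_columns[table]:
            # Prefixing with the table name keeps attribute names unique across tables.
            attr_name = f"{table}_{column}" if add_name_prefixes else column
            entity_attrs[attr_name] = {"table": table, "variable": column}
    return entity_attrs


attrs = build_entity_attrs({"kinetics": ["km", "vmax"], "literature": ["n_papers"]})
```

In this sketch, turning prefixes off means a column name shared by two tables is silently overwritten by the later table; how the real function handles that case is not stated in this reference.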

napistu.network.ng_utils.format_napistu_graph_summary(data)

Format NapistuGraph summary data into a clean summary table for Jupyter display

napistu.network.ng_utils.get_minimal_sources_edges(vertices: DataFrame, sbml_dfs: SBML_dfs, min_pw_size: int = 3, source_total_counts: Series | DataFrame | None = None, verbose: bool = False) DataFrame | None

Assign edges to a set of sources.

Parameters:
  • vertices (pd.DataFrame) – A table of vertices.

  • sbml_dfs (sbml_dfs_core.SBML_dfs) – A pathway model

  • min_pw_size (int) – the minimum size of a pathway to be considered

  • source_total_counts (pd.Series | pd.DataFrame) – A series of the total counts of each source or a pd.DataFrame with two columns: pathway_id and total_counts.

  • verbose (bool) – Whether to print verbose output

Returns:

reaction_sources – A table of reactions and the sources they are assigned to.

Return type:

pd.DataFrame

napistu.network.ng_utils.get_sbml_dfs_vertex_summaries(sbml_dfs, summary_types=['sources', 'ontologies'], priority_pathways=None, stratify_by_bqb=True, characteristic_only=False, dogmatic=False, add_name_prefixes=False, binarize=False, has_reactions=True) DataFrame

Prepare species and reaction ontology and/or source occurrence summaries which are ready to be merged with NapistuGraph vertices.

Parameters:
  • sbml_dfs (SBML_dfs) – A pathway model

  • summary_types (list) – The summary types to get

  • priority_pathways (list) – The priority pathways to get

  • stratify_by_bqb (bool) – Whether to stratify by BQB

  • characteristic_only (bool) – Whether to only get characteristic ontologies

  • dogmatic (bool) – Whether to use dogmatic ontologies

  • add_name_prefixes (bool, default False) – If True, add prefixes to column names: ‘source_’ for source data and ‘ontology_’ for ontology data

  • binarize (bool, optional) – Whether to convert the summary to binary values (0 vs 1+). Default is False.

  • has_reactions (bool, default True) – Whether the graph has reaction vertices. If False, reaction-specific summaries will be skipped.

napistu.network.ng_utils.pluck_data(data_tables: dict[str, DataFrame], entity_attrs: dict[str, dict]) DataFrame | None

Pluck data from a dictionary of DataFrames based on specified attributes.

Parameters:
  • data_tables (dict[str, pd.DataFrame]) – A dictionary mapping table names to pandas DataFrames.

  • entity_attrs (dict[str, dict]) –

    A dictionary containing the attributes to pull out. Of the form:

        {
            "to_be_created_column_name": {
                "table": "table name in data_tables",
                "variable": "column name in the specified table"
            }
        }

Returns:

A table where all extracted attributes are merged based on a common index or None if no attributes were extracted. If the attribute dict is empty, returns None.

Return type:

pd.DataFrame or None

Raises:

ValueError – If requested tables/variables are missing.
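The plucking logic can be sketched with nested dicts in place of DataFrames. This is a simplified stand-in, not the real implementation: `pluck` is a hypothetical name, each table is modeled as {column: {index: value}}, and the merge on a common index is reduced to collecting columns keyed by the same index labels.

```python
def pluck(data_tables, entity_attrs):
    """Collect requested columns into {new_name: {index: value}}, or None if nothing was requested."""
    if not entity_attrs:
        return None
    plucked = {}
    for new_name, spec in entity_attrs.items():
        table, variable = spec["table"], spec["variable"]
        if table not in data_tables:
            raise ValueError(f"table {table!r} is missing from data_tables")
        if variable not in data_tables[table]:
            raise ValueError(f"variable {variable!r} is missing from table {table!r}")
        # Copy the column so the caller's tables are not mutated later.
        plucked[new_name] = dict(data_tables[table][variable])
    return plucked


tables = {
    "expression": {"tpm": {"S1": 12.5, "S2": 0.0}},
    "scores": {"rank": {"S1": 1, "S2": 2}},
}
result = pluck(
    tables,
    {
        "expr": {"table": "expression", "variable": "tpm"},
        "rank": {"table": "scores", "variable": "rank"},
    },
)
```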

napistu.network.ng_utils.pluck_entity_data(sbml_dfs: SBML_dfs, entity_attrs: dict[str, list[dict]] | list[dict], data_type: str, custom_transformations: dict[str, callable] | None = None, transform: bool = True) DataFrame | None

Pluck Entity Attributes from an sbml_dfs based on a set of tables and variables to look for.

Parameters:
  • sbml_dfs (sbml_dfs_core.SBML_dfs) – A mechanistic model.

  • entity_attrs (dict[str, list[dict]] | list[dict]) –

    A list of dicts containing the species/reaction attributes to pull out. Of the form:

        [
            "to_be_created_graph_attr_name": {
                "table": "species/reactions data table",
                "variable": "variable in the data table",
                "trans": "optionally, a transformation to apply to the variable (where applicable)"
            }
        ]

    This can also be a dict of the following form, but passing one results in a deprecation warning:

        {
            "species": << entity attributes list >>,
            "reactions": << entity attributes list >>
        }

  • data_type (str) – “species” or “reactions” to pull out species_data or reactions_data.

  • custom_transformations (dict[str, callable], optional) –

    A dictionary mapping transformation names to functions. If provided, these will be checked before built-in transformations. Example:

    custom_transformations = {"square": lambda x: x**2}

  • transform (bool, default=True) – Whether to apply transformations to the extracted data. In a future version, this function will not support transformations by default.

Returns:

A table where all extracted attributes are merged based on a common index, or None if no attributes were extracted. If the requested data_type is not present in entity_attrs, or if the attribute dict is empty, returns None. This is intended to allow optional annotation blocks.

Return type:

pd.DataFrame or None

Raises:

ValueError – If data_type is not valid or if requested tables/variables are missing.

napistu.network.ng_utils.prepare_entity_data_extraction(graph, entity_type: str, target_entity: str, mode: str = 'fresh', overwrite: bool = False) tuple[dict, set] | None

Prepare entity data extraction by validating inputs and determining which attributes to extract.

This utility captures the logic from _add_entity_data up to the pluck_entity_data call, handling validation, conflict checking, and attribute filtering.

Parameters:
  • graph (NapistuGraph) – The graph object containing entity attributes metadata

  • entity_type (str) – Either “reactions” or “species”

  • target_entity (str) – Either “edges” or “vertices” - determines where attributes will be added

  • mode (str, default="fresh") – Either “fresh” (replace existing) or “extend” (add new attributes only)

  • overwrite (bool, default=False) – Whether to allow overwriting existing attributes when conflicts arise

Returns:

If successful: entity_attrs_to_extract, a dictionary of entity attributes to extract, drawn from the vertex/edge attributes metadata.

If failed: None

Return type:

entity_attrs_to_extract | None

Raises:

ValueError – If target_entity is invalid, mode is invalid, or conflicts exist without overwrite

napistu.network.ng_utils.read_graph_attrs_spec(graph_attrs_spec_uri: str) dict

Read a YAML file containing the specification for adding reaction- and/or species-attributes to a napistu_graph.

napistu.network.ng_utils.separate_entity_attrs_by_source(entity_attrs: dict[str, dict], entity_type: str, sbml_dfs: Any | None = None, side_loaded_attributes: dict[str, DataFrame] | None = None) tuple[dict[str, dict], dict[str, dict]]

Separate entity attributes by data source (SBML vs side-loaded).

This function categorizes entity attributes based on their table content:

  • SBML attributes: table names that exist in the sbml_dfs object for the specified entity_type

  • Side-loaded attributes: table names that exist in the side_loaded_attributes dict

Both types use the same structure: table, variable, transformation

Parameters:
  • entity_attrs (dict[str, dict]) – Dictionary of entity attributes to separate

  • entity_type (str) – Either “reactions” or “species” - determines which SBML tables to check

  • sbml_dfs (SBML_dfs, optional) – SBML_dfs object to check for valid table names

  • side_loaded_attributes (dict[str, pd.DataFrame], optional) – Dictionary mapping table names to DataFrames for side-loaded data

Returns:

(sbml_attrs, side_loaded_attrs) - two dictionaries containing separated attributes

Return type:

tuple[dict[str, dict], dict[str, dict]]

Raises:

ValueError – If an attribute has an invalid structure, if entity_type is invalid, if both sbml_dfs and side_loaded_attributes have overlapping table names, if neither data source is provided, or if required table names are missing from both sources
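The categorization and its error cases can be sketched with plain sets of table names. This is an illustration under assumptions: `split_by_source` is a hypothetical name, the two sources are represented only by their table names, and the entity_type check is omitted.

```python
def split_by_source(entity_attrs, sbml_tables=None, side_loaded_tables=None):
    """Split entity_attrs into (sbml_attrs, side_loaded_attrs) by table name."""
    if sbml_tables is None and side_loaded_tables is None:
        raise ValueError("provide at least one data source")
    sbml_tables = set(sbml_tables or ())
    side_loaded_tables = set(side_loaded_tables or ())
    overlap = sbml_tables & side_loaded_tables
    if overlap:
        raise ValueError(f"table names appear in both sources: {sorted(overlap)}")

    sbml_attrs, side_loaded_attrs = {}, {}
    for name, spec in entity_attrs.items():
        if "table" not in spec or "variable" not in spec:
            raise ValueError(f"{name} must define 'table' and 'variable'")
        if spec["table"] in sbml_tables:
            sbml_attrs[name] = spec
        elif spec["table"] in side_loaded_tables:
            side_loaded_attrs[name] = spec
        else:
            raise ValueError(f"table {spec['table']!r} is missing from both sources")
    return sbml_attrs, side_loaded_attrs


attrs = {
    "string_wt": {"table": "string", "variable": "combined_score"},
    "abundance": {"table": "proteomics", "variable": "intensity"},
}
sbml_attrs, side_attrs = split_by_source(
    attrs, sbml_tables={"string"}, side_loaded_tables={"proteomics"}
)
```

Rejecting overlapping table names up front keeps the assignment unambiguous: each attribute can belong to exactly one source.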

napistu.network.ng_utils.validate_assets(sbml_dfs: sbml_dfs_core.SBML_dfs, napistu_graph: 'NapistuGraph' | ig.Graph | None = None, precomputed_distances: pd.DataFrame | None = None, identifiers_df: pd.DataFrame | None = None) None

Validate Assets

Perform a few quick checks of inputs to catch inconsistencies.

Parameters:
  • sbml_dfs (sbml_dfs_core.SBML_dfs) – A pathway representation. (Required)

  • napistu_graph ("NapistuGraph", optional) – A network-based representation of sbml_dfs. NapistuGraph is a subclass of igraph.Graph.

  • precomputed_distances (pandas.DataFrame, optional) – Precomputed distances between vertices in napistu_graph.

  • identifiers_df (pandas.DataFrame, optional) – A table of systematic identifiers for compartmentalized species in sbml_dfs.

Return type:

None

Warns:

If only sbml_dfs is provided and no other assets are given, a warning is logged.

Raises:

ValueError – If precomputed_distances is provided but napistu_graph is not.
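The documented argument rules (a warning when only sbml_dfs is given, a ValueError when distances arrive without a graph) can be sketched as follows. `check_asset_arguments` is a hypothetical helper, not part of napistu; unlike validate_assets, which returns None, it returns the names of the optional assets that were supplied so the behavior is easy to inspect.

```python
import logging

logger = logging.getLogger(__name__)


def check_asset_arguments(sbml_dfs, napistu_graph=None, precomputed_distances=None, identifiers_df=None):
    """Mirror the documented argument checks; return the names of the optional assets provided."""
    # Distances describe paths in the graph, so they cannot be checked without it.
    if precomputed_distances is not None and napistu_graph is None:
        raise ValueError("precomputed_distances requires napistu_graph")
    provided = {
        "napistu_graph": napistu_graph,
        "precomputed_distances": precomputed_distances,
        "identifiers_df": identifiers_df,
    }
    to_check = [name for name, asset in provided.items() if asset is not None]
    if not to_check:
        logger.warning("only sbml_dfs was provided; nothing to validate against")
    return to_check


# object() is a placeholder for real assets; only presence or absence matters here.
assets = check_asset_arguments(object(), napistu_graph=object(), identifiers_df=object())
```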