napistu.network.ng_utils

Utilities specific to NapistuGraph objects and the wider Napistu ecosystem.

This module contains utilities that are specific to NapistuGraph subclasses and require knowledge of the Napistu data model (SBML_dfs objects, etc.).

Functions

`apply_weight_transformations`(edges_df, ...)	Apply Weight Transformations to edge attributes.
`compartmentalize_species`(sbml_dfs, species)	Compartmentalize Species
`compartmentalize_species_pairs`(sbml_dfs, ...)	Compartmentalize Shortest Paths
`create_entity_attrs_from_data_tables`(...[, ...])	Create entity_attrs configuration from data tables.
`format_napistu_graph_summary`(data)	Format NapistuGraph summary data into a clean summary table for Jupyter display
`get_minimal_sources_edges`(vertices, sbml_dfs)	Assign edges to a set of sources.
`get_sbml_dfs_vertex_summaries`(sbml_dfs[, ...])	Prepare species and reaction ontology and/or source occurrence summaries which are ready to be merged with NapistuGraph vertices.
`pluck_data`(data_tables, entity_attrs)	Pluck data from a dictionary of DataFrames based on specified attributes.
`pluck_entity_data`(sbml_dfs, entity_attrs, ...)	Pluck Entity Attributes from an sbml_dfs based on a set of tables and variables to look for.
`prepare_entity_data_extraction`(graph, ...[, ...])	Prepare entity data extraction by validating inputs and determining which attributes to extract.
`read_graph_attrs_spec`(graph_attrs_spec_uri)	Read a YAML file containing the specification for adding reaction- and/or species-attributes to a napistu_graph.
`separate_entity_attrs_by_source`(...[, ...])	Separate entity attributes by data source (SBML vs side-loaded).
`validate_assets`(sbml_dfs[, napistu_graph, ...])	Validate Assets

class napistu.network.ng_utils._EntityAttrValidator(*, table: str, variable: str, trans: str | None = 'identity')

Bases: BaseModel

model_config = {}: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

table: str

trans: str | None

variable: str

napistu.network.ng_utils._validate_assets_graph_dist(napistu_graph: NapistuGraph, precomputed_distances: pd.DataFrame) → None

Check a NapistuGraph and precomputed distances table for inconsistencies.

Parameters:

napistu_graph ("NapistuGraph") – The network representation (subclass of igraph.Graph).
precomputed_distances (pandas.DataFrame) – Precomputed distances between vertices in the network.

Return type:

None

Warns:

If edge weights are inconsistent between the graph and precomputed distances.

napistu.network.ng_utils._validate_assets_sbml_graph(sbml_dfs: sbml_dfs_core.SBML_dfs, napistu_graph: 'NapistuGraph' | ig.Graph) → None

Check an sbml_dfs model and NapistuGraph for inconsistencies.

Parameters:

sbml_dfs (sbml_dfs_core.SBML_dfs) – The pathway representation.
napistu_graph ("NapistuGraph") – The network representation (subclass of igraph.Graph).

Return type:

None

Raises:

ValueError – If species names do not match between sbml_dfs and napistu_graph.

napistu.network.ng_utils._validate_entity_attrs(entity_attrs: dict, validate_transformations: bool = True, custom_transformations: dict | None = None) → None

Validate that graph attributes are in a valid format.

Parameters:

entity_attrs (dict) –
Dictionary of entity attributes to validate. The structure should be:

{
    "attr_name": {
        "table": "table_name",
        "variable": "variable_name",
        "trans": "transformation_name"
    }
}

where “table” is the name of the table in the sbml_dfs to look for the variable, “variable” is the name of the variable in the table, and “trans” (optional) is the name of the transformation to apply to the variable.
validate_transformations (bool, optional) – Whether to validate transformation names, by default True.
custom_transformations (dict, optional) – Dictionary of custom transformation functions, by default None. Keys are transformation names, values are transformation functions.

Return type:

None

Raises:

AssertionError – If entity_attrs is not a dictionary.
ValueError – If a transformation is not found in DEFINED_WEIGHT_TRANSFORMATION or custom_transformations.
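
The checks described above can be sketched in plain Python. This is a minimal illustration and not the library's implementation: `check_entity_attrs` and `BUILT_IN_TRANSFORMATIONS` are hypothetical names, and the registry holds only an identity function as a stand-in for the real built-in transformations.

```python
from typing import Callable, Optional

# Hypothetical stand-in for the library's registry of built-in transformations.
BUILT_IN_TRANSFORMATIONS = {"identity": lambda x: x}


def check_entity_attrs(
    entity_attrs: dict,
    validate_transformations: bool = True,
    custom_transformations: Optional[dict[str, Callable]] = None,
) -> None:
    """Validate the {"attr_name": {"table", "variable", "trans"}} layout."""
    assert isinstance(entity_attrs, dict), "entity_attrs must be a dictionary"
    known = set(BUILT_IN_TRANSFORMATIONS) | set(custom_transformations or {})

    for attr_name, spec in entity_attrs.items():
        # "table" and "variable" are required; "trans" is optional.
        missing = {"table", "variable"} - set(spec)
        if missing:
            raise ValueError(f"{attr_name} is missing keys: {sorted(missing)}")
        # A missing or None "trans" falls back to the identity transformation.
        trans = spec.get("trans") or "identity"
        if validate_transformations and trans not in known:
            raise ValueError(f"{attr_name}: unknown transformation {trans!r}")
```

Passing `custom_transformations={"square": lambda x: x**2}` would make “square” an accepted value for “trans” in this sketch.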

napistu.network.ng_utils._validate_side_loaded_attributes(side_loaded_attributes: dict[str, DataFrame]) → None

Validate that side_loaded_attributes is a dict of DataFrames with consistent index names.

This function ensures that all DataFrames in the dictionary can be concatenated by checking that they have the same index structure (single or multi-index).

Parameters:

side_loaded_attributes (dict[str, pd.DataFrame]) – Dictionary mapping table names to DataFrames

Raises:

TypeError – If side_loaded_attributes is not a dict or contains non-DataFrame values
ValueError – If DataFrames have inconsistent index names or structures

napistu.network.ng_utils._wt_transformation_identity(x)

Identity transformation for weights.

Parameters: x (any) – Input value.
Returns: The input value unchanged.
Return type: any

napistu.network.ng_utils._wt_transformation_string(x)

Map STRING scores to a similar scale as topology weights.

Parameters: x (float) – STRING score.
Returns: Transformed STRING score.
Return type: float

napistu.network.ng_utils._wt_transformation_string_inv(x)

Map STRING scores so they work with source weights.

Parameters: x (float) – STRING score.
Returns: Inverse transformed STRING score.
Return type: float

napistu.network.ng_utils.apply_weight_transformations(edges_df: DataFrame, reaction_attrs: dict, custom_transformations: dict = None)

Apply Weight Transformations to edge attributes.

Parameters:

edges_df (pd.DataFrame) – A table of edges and their attributes extracted from a NapistuGraph.
reaction_attrs (dict) – A dictionary of attributes identifying weighting attributes within an sbml_dfs’ reactions_data, how they will be named in edges_df (the keys), and how they should be transformed (the “trans” aliases).
custom_transformations (dict, optional) – A dictionary mapping transformation names to functions. If provided, these will be checked before built-in transformations.

Returns:

edges_df with weight variables transformed.

Return type:

pd.DataFrame

Raises:

ValueError – If a weighting variable is missing or transformation is not found.
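
The lookup order described for custom_transformations (custom functions checked before built-ins) can be sketched as follows. This is an illustration under stated assumptions: `resolve_transformation`, `transform_columns`, and `BUILT_INS` are hypothetical names, the built-in registry contains only an identity function, and plain lists stand in for DataFrame columns.

```python
from typing import Callable, Optional

# Hypothetical built-in registry; the real transformations are defined by the library.
BUILT_INS: dict[str, Callable] = {"identity": lambda x: x}


def resolve_transformation(name: str, custom: Optional[dict[str, Callable]] = None) -> Callable:
    """Return the function for `name`, checking custom transformations first."""
    if custom and name in custom:
        return custom[name]
    if name in BUILT_INS:
        return BUILT_INS[name]
    raise ValueError(f"Transformation {name!r} was not found")


def transform_columns(columns: dict, reaction_attrs: dict, custom=None) -> dict:
    """Apply each attribute's "trans" to the matching column (a list of values)."""
    out = dict(columns)
    for attr_name, spec in reaction_attrs.items():
        if attr_name not in columns:
            raise ValueError(f"Weighting variable {attr_name!r} is missing")
        func = resolve_transformation(spec.get("trans") or "identity", custom)
        out[attr_name] = [func(value) for value in columns[attr_name]]
    return out


edges = {"from": ["a", "b"], "score": [2.0, 3.0]}
attrs = {"score": {"table": "scores", "variable": "score", "trans": "square"}}
transformed = transform_columns(edges, attrs, custom={"square": lambda x: x**2})
```

Columns not named in reaction_attrs (here, “from”) pass through untouched, and a custom function registered under a built-in name would take precedence in this sketch.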

napistu.network.ng_utils.compartmentalize_species(sbml_dfs: SBML_dfs, species: str | list[str]) → DataFrame

Compartmentalize Species

Returns the compartmentalized species IDs (sc_ids) corresponding to a list of species (s_ids)

Parameters:

sbml_dfs (SBML_dfs) – A model formed by aggregating pathways
species (list) – Species IDs

Return type:

pd.DataFrame containing the s_id and sc_id pairs

napistu.network.ng_utils.compartmentalize_species_pairs(sbml_dfs: SBML_dfs, origin_species: str | list[str], dest_species: str | list[str]) → DataFrame

Compartmentalize Shortest Paths

For a set of origin and destination species pairs, consider each species in every compartment it operates in, separately.

Parameters:

sbml_dfs (SBML_dfs) – A model formed by aggregating pathways
origin_species (list) – Species IDs as starting points
dest_species (list) – Species IDs as ending points

Return type:

pd.DataFrame containing pairs of origin and destination compartmentalized species
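
One way to picture the pairing is as a cross product of the compartmentalized origins and destinations. The sketch below assumes that interpretation; the lookup table, the function name `compartmentalized_pairs`, and the output keys `sc_id_origin` and `sc_id_dest` are all hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical lookup of species ID -> compartmentalized species IDs.
SC_IDS_BY_SPECIES = {
    "S1": ["SC1_cytosol", "SC1_nucleus"],
    "S2": ["SC2_cytosol"],
}


def compartmentalized_pairs(origin_species, dest_species, lookup=SC_IDS_BY_SPECIES):
    """Pair every compartmentalized origin with every compartmentalized destination."""
    # Accept a single ID or a list, mirroring the `str | list[str]` signature.
    if isinstance(origin_species, str):
        origin_species = [origin_species]
    if isinstance(dest_species, str):
        dest_species = [dest_species]

    origins = [sc for s in origin_species for sc in lookup[s]]
    dests = [sc for s in dest_species for sc in lookup[s]]
    return [{"sc_id_origin": o, "sc_id_dest": d} for o, d in product(origins, dests)]


pairs = compartmentalized_pairs("S1", ["S2"])
```

Here species S1 occurs in two compartments, so a single origin species yields two origin-destination rows.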

napistu.network.ng_utils.create_entity_attrs_from_data_tables(entity_data_dict: dict[str, DataFrame], table_names: list[str] | None = None, add_name_prefixes: bool = True) → dict[str, dict[str, str]]

Create entity_attrs configuration from data tables.

This utility converts a dictionary of data tables into the entity_attrs format expected by NapistuGraph methods, automatically generating attribute configurations for all columns in the specified tables.

Parameters:

entity_data_dict (dict[str, pd.DataFrame]) – Dictionary mapping table names to DataFrames (e.g., sbml_dfs.species_data)
table_names (Optional[list[str]], default=None) – Specific table names to include. If None, includes all available tables.
add_name_prefixes (bool, default=True) – Whether to prefix attribute names with table name (e.g., “table_name_column_name”)

Returns:

Entity attributes configuration dictionary in the format:

{
    "attr_name": {
        "table": "table_name",
        "variable": "column_name"
    }
}

Return type:

dict[str, dict[str, str]]

Raises:

ValueError – If requested table names don’t exist in entity_data_dict

Examples

Create attrs from all species data tables:

>>> entity_attrs = create_entity_attrs_from_data_tables(sbml_dfs.species_data)

Create attrs from specific tables:

>>> entity_attrs = create_entity_attrs_from_data_tables(
...     sbml_dfs.reactions_data,
...     table_names=["kinetics", "literature"]
... )

Create attrs without table name prefixes:

>>> entity_attrs = create_entity_attrs_from_data_tables(
...     sbml_dfs.species_data,
...     add_name_prefixes=False
... )
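
To show the shape of the generated configuration without needing an sbml_dfs object, the sketch below rebuilds the documented behavior over plain column lists. `build_entity_attrs` is a hypothetical stand-in, not the library function, and the table and column names are made up; only the “table_name_column_name” prefix pattern and the output layout come from the description above.

```python
from typing import Optional


def build_entity_attrs(
    table_columns: dict[str, list[str]],
    table_names: Optional[list[str]] = None,
    add_name_prefixes: bool = True,
) -> dict[str, dict[str, str]]:
    """Generate one {"table", "variable"} entry per column of each selected table."""
    selected = list(table_columns) if table_names is None else table_names
    missing = [name for name in selected if name not in table_columns]
    if missing:
        raise ValueError(f"Unknown table names: {missing}")

    entity_attrs = {}
    for table in selected:
        for column in table_columns[table]:
            # Prefixing keeps same-named columns from different tables distinct.
            attr_name = f"{table}_{column}" if add_name_prefixes else column
            entity_attrs[attr_name] = {"table": table, "variable": column}
    return entity_attrs


attrs = build_entity_attrs({"kinetics": ["km", "vmax"]})
```

With prefixes disabled, two tables sharing a column name would collide on the same key in this sketch, which is one reason to leave add_name_prefixes at its default.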

napistu.network.ng_utils.format_napistu_graph_summary(data): Format NapistuGraph summary data into a clean summary table for Jupyter display

napistu.network.ng_utils.get_minimal_sources_edges(vertices: DataFrame, sbml_dfs: SBML_dfs, min_pw_size: int = 3, source_total_counts: Series | DataFrame | None = None, verbose: bool = False) → DataFrame | None

Assign edges to a set of sources.

Parameters:

vertices (pd.DataFrame) – A table of vertices.
sbml_dfs (sbml_dfs_core.SBML_dfs) – A pathway model
min_pw_size (int) – the minimum size of a pathway to be considered
source_total_counts (pd.Series | pd.DataFrame) – A series of the total counts of each source or a pd.DataFrame with two columns: pathway_id and total_counts.
verbose (bool) – Whether to print verbose output

Returns:

reaction_sources – A table of reactions and the sources they are assigned to.

Return type:

pd.DataFrame

napistu.network.ng_utils.get_sbml_dfs_vertex_summaries(sbml_dfs, summary_types=['sources', 'ontologies'], priority_pathways=None, stratify_by_bqb=True, characteristic_only=False, dogmatic=False, add_name_prefixes=False, binarize=False, has_reactions=True) → DataFrame

Prepare species and reaction ontology and/or source occurrence summaries which are ready to be merged with NapistuGraph vertices.

Parameters:

sbml_dfs (SBML_dfs) – A pathway model
summary_types (list) – The summary types to get
priority_pathways (list) – The priority pathways to get
stratify_by_bqb (bool) – Whether to stratify by BQB
characteristic_only (bool) – Whether to only get characteristic ontologies
dogmatic (bool) – Whether to use dogmatic ontologies
add_name_prefixes (bool, default False) – If True, add prefixes to column names: ‘source_’ for source data and ‘ontology_’ for ontology data
binarize (bool, optional) – Whether to convert the summary to binary values (0 vs 1+). Default is False.
has_reactions (bool, default True) – Whether the graph has reaction vertices. If False, reaction-specific summaries will be skipped.
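
As a small illustration of the binarize option (0 vs 1+), the sketch below collapses occurrence counts to presence flags. The nested dict stands in for the summary table, and the vertex and column names are invented for the example.

```python
def binarize_counts(rows: dict[str, dict[str, int]]) -> dict[str, dict[str, int]]:
    """Map every count to 0 (absent) or 1 (present at least once)."""
    return {
        vertex: {column: int(count > 0) for column, count in counts.items()}
        for vertex, counts in rows.items()
    }


# Hypothetical occurrence counts per vertex and summary column.
summary = {"S1": {"reactome": 3, "go": 0}, "S2": {"reactome": 0, "go": 1}}
binary = binarize_counts(summary)
```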

napistu.network.ng_utils.pluck_data(data_tables: dict[str, DataFrame], entity_attrs: dict[str, dict]) → DataFrame | None

Pluck data from a dictionary of DataFrames based on specified attributes.

Parameters:

data_tables (dict[str, pd.DataFrame]) – A dictionary mapping table names to pandas DataFrames.
entity_attrs (dict[str, dict]) –
A dictionary containing the attributes to pull out. Of the form:

{
    "to_be_created_column_name": {
        "table": "table name in data_tables",
        "variable": "column name in the specified table"
    }
}

Returns:

A table where all extracted attributes are merged based on a common index or None if no attributes were extracted. If the attribute dict is empty, returns None.

Return type:

pd.DataFrame or None

Raises:

ValueError – If requested tables/variables are missing.
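
The extraction can be sketched with nested dicts standing in for DataFrames. This is an approximation under assumptions: `pluck` is a hypothetical name, each column is modeled as an {index: value} mapping, and the merge is treated as an outer join on the index, which the description above does not specify.

```python
from typing import Optional

# Each table maps a column name to {index: value}; a stand-in for a DataFrame.
Table = dict[str, dict[str, float]]


def pluck(data_tables: dict[str, Table], entity_attrs: dict[str, dict]) -> Optional[dict]:
    """Collect the requested columns and align them on their shared index."""
    if not entity_attrs:
        return None  # nothing requested

    plucked = {}
    for new_name, spec in entity_attrs.items():
        table, variable = spec["table"], spec["variable"]
        if table not in data_tables:
            raise ValueError(f"Table {table!r} is missing")
        if variable not in data_tables[table]:
            raise ValueError(f"Variable {variable!r} is missing from {table!r}")
        plucked[new_name] = data_tables[table][variable]

    # Assumed outer-join style merge: one row per index value seen in any column.
    index = sorted(set().union(*(column.keys() for column in plucked.values())))
    return {i: {name: col.get(i) for name, col in plucked.items()} for i in index}


tables = {
    "expr": {"abundance": {"S1": 1.5, "S2": 2.0}},
    "meta": {"score": {"S1": 0.9}},
}
wanted = {
    "abund": {"table": "expr", "variable": "abundance"},
    "sc": {"table": "meta", "variable": "score"},
}
result = pluck(tables, wanted)
```

The keys of `wanted` become the output column names, so the same source column can be exposed under whatever attribute name the graph should carry.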

napistu.network.ng_utils.pluck_entity_data(sbml_dfs: SBML_dfs, entity_attrs: dict[str, list[dict]] | list[dict], data_type: str, custom_transformations: dict[str, callable] | None = None, transform: bool = True) → DataFrame | None

Pluck Entity Attributes from an sbml_dfs based on a set of tables and variables to look for.

Parameters:

sbml_dfs (sbml_dfs_core.SBML_dfs) – A mechanistic model.
entity_attrs (dict[str, list[dict]] | list[dict]) –
A list of dicts containing the species/reaction attributes to pull out. Of the form:

[
    "to_be_created_graph_attr_name": {
        "table": "species/reactions data table",
        "variable": "variable in the data table",
        "trans": "optionally, a transformation to apply to the variable (where applicable)"
    }
]

This can also be a dict of the following form, but this will result in a deprecation warning:

{
    "species": << entity attributes list >>,
    "reactions": << entity attributes list >>
}
data_type (str) – “species” or “reactions” to pull out species_data or reactions_data.
custom_transformations (dict[str, callable], optional) –
A dictionary mapping transformation names to functions. If provided, these will be checked before built-in transformations. Example:

custom_transformations = {"square": lambda x: x**2}
transform (bool, default=True) – Whether to apply transformations to the extracted data. In a future version, this function will not support transformations by default.

Returns:

A table where all extracted attributes are merged based on a common index or None if no attributes were extracted. If the requested data_type is not present in graph_attrs, or if the attribute dict is empty, returns None. This is intended to allow optional annotation blocks.

Return type:

pd.DataFrame or None

Raises:

ValueError – If data_type is not valid or if requested tables/variables are missing.

napistu.network.ng_utils.prepare_entity_data_extraction(graph, entity_type: str, target_entity: str, mode: str = 'fresh', overwrite: bool = False) → tuple[dict, set] | None

Prepare entity data extraction by validating inputs and determining which attributes to extract.

This utility captures the logic from _add_entity_data up to the pluck_entity_data call, handling validation, conflict checking, and attribute filtering.

Parameters:

graph (NapistuGraph) – The graph object containing entity attributes metadata
entity_type (str) – Either “reactions” or “species”
target_entity (str) – Either “edges” or “vertices” - determines where attributes will be added
mode (str, default="fresh") – Either “fresh” (replace existing) or “extend” (add new attributes only)
overwrite (bool, default=False) – Whether to allow overwriting existing attributes when conflicts arise

Returns:

If successful: entity_attrs_to_extract, a dictionary of entity attributes to extract, drawn from the vertex/edge attributes metadata.
If failed: None.

Return type:

entity_attrs_to_extract | None

Raises:

ValueError – If target_entity is invalid, mode is invalid, or conflicts exist without overwrite

napistu.network.ng_utils.read_graph_attrs_spec(graph_attrs_spec_uri: str) → dict: Read a YAML file containing the specification for adding reaction- and/or species-attributes to a napistu_graph.

napistu.network.ng_utils.separate_entity_attrs_by_source(entity_attrs: dict[str, dict], entity_type: str, sbml_dfs: Any | None = None, side_loaded_attributes: dict[str, DataFrame] | None = None) → tuple[dict[str, dict], dict[str, dict]]

Separate entity attributes by data source (SBML vs side-loaded).

This function categorizes entity attributes based on their table content:

- SBML attributes: table names that exist in the sbml_dfs object for the specified entity_type
- Side-loaded attributes: table names that exist in the side_loaded_attributes dict

Both types use the same structure: table, variable, transformation

Parameters:

entity_attrs (dict[str, dict]) – Dictionary of entity attributes to separate
entity_type (str) – Either “reactions” or “species” - determines which SBML tables to check
sbml_dfs (SBML_dfs, optional) – SBML_dfs object to check for valid table names
side_loaded_attributes (dict[str, pd.DataFrame], optional) – Dictionary mapping table names to DataFrames for side-loaded data

Returns:

(sbml_attrs, side_loaded_attrs) - two dictionaries containing separated attributes

Return type:

tuple[dict[str, dict], dict[str, dict]]

Raises:

ValueError – If an attribute has an invalid structure, if entity_type is invalid, if both sbml_dfs and side_loaded_attributes have overlapping table names, if neither data source is provided, or if required table names are missing from both sources
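
The separation rule can be sketched as a membership test on table names. In this illustration `split_attrs_by_source` is a hypothetical name, and the two sets of table names stand in for the tables found in an sbml_dfs object and in the side_loaded_attributes dict.

```python
def split_attrs_by_source(entity_attrs: dict, sbml_tables, side_loaded_tables):
    """Route each attribute to the source whose tables contain its "table" name."""
    sbml_tables, side_loaded_tables = set(sbml_tables), set(side_loaded_tables)
    if not sbml_tables and not side_loaded_tables:
        raise ValueError("At least one data source must be provided")
    overlap = sbml_tables & side_loaded_tables
    if overlap:
        raise ValueError(f"Table names appear in both sources: {sorted(overlap)}")

    sbml_attrs, side_loaded_attrs = {}, {}
    for name, spec in entity_attrs.items():
        table = spec["table"]
        if table in sbml_tables:
            sbml_attrs[name] = spec
        elif table in side_loaded_tables:
            side_loaded_attrs[name] = spec
        else:
            raise ValueError(f"Table {table!r} was not found in either source")
    return sbml_attrs, side_loaded_attrs


requested = {
    "weight": {"table": "string", "variable": "combined_score"},
    "fold_change": {"table": "rnaseq", "variable": "log2fc"},
}
sbml_attrs, side_loaded_attrs = split_attrs_by_source(
    requested, sbml_tables={"string"}, side_loaded_tables={"rnaseq"}
)
```

Rejecting overlapping table names up front keeps the routing unambiguous, since a shared name could otherwise be resolved against either source.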

napistu.network.ng_utils.validate_assets(sbml_dfs: sbml_dfs_core.SBML_dfs, napistu_graph: 'NapistuGraph' | ig.Graph | None = None, precomputed_distances: pd.DataFrame | None = None, identifiers_df: pd.DataFrame | None = None) → None

Validate Assets

Perform a few quick checks of inputs to catch inconsistencies.

Parameters:

sbml_dfs (sbml_dfs_core.SBML_dfs) – A pathway representation. (Required)
napistu_graph ("NapistuGraph", optional) – A network-based representation of sbml_dfs. NapistuGraph is a subclass of igraph.Graph.
precomputed_distances (pandas.DataFrame, optional) – Precomputed distances between vertices in napistu_graph.
identifiers_df (pandas.DataFrame, optional) – A table of systematic identifiers for compartmentalized species in sbml_dfs.

Return type:

None

Warns:

If only sbml_dfs is provided and no other assets are given, a warning is logged.

Raises:

ValueError – If precomputed_distances is provided but napistu_graph is not.