napistu.network.ig_utils

General utilities for working with igraph.Graph objects.

This module contains utilities that can be broadly applied to any igraph.Graph object, not specific to NapistuGraph subclasses.

Public Functions

create_induced_subgraph(graph: Graph, vertices: Optional[list[str]] = None, n_vertices: int = 5000) -> Graph:

Create a subgraph from an igraph including a set of vertices and their connections.

define_graph_universe(graph: Graph, vertex_names: Optional[Union[List[str], pd.Series]] = None, edgelist: Optional[pd.DataFrame] = None, observed_only: bool = False, edge_filter_logic: str = 'and', include_self_edges: bool = False) -> Graph:

Define a graph universe for enrichment-style analyses.

filter_to_largest_subgraph(graph: Graph) -> Graph:

Filter an igraph to its largest weakly connected component.

filter_to_largest_subgraphs(graph: Graph, top_k: int) -> list[Graph]:

Filter an igraph to its largest weakly connected components.

get_graph_summary(graph: Graph) -> dict[str, Any]:

Calculate common summary statistics for an igraph network.

get_merge_keys(graph: Graph, merge_by: str = IGRAPH_DEFS.NAME) -> tuple[str, str, str]:

Get the merge keys for a graph based on the merge_by parameter.

graph_to_pandas_dfs(graph: Graph) -> tuple[pd.DataFrame, pd.DataFrame]:

Convert an igraph to Pandas DataFrames for vertices and edges.

validate_edge_attributes(graph: Graph, edge_attributes: list[str]) -> None:

Validate that all required edge attributes exist in an igraph.

validate_vertex_attributes(graph: Graph, vertex_attributes: list[str]) -> None:

Validate that all required vertex attributes exist in an igraph.

Functions

create_induced_subgraph(graph[, vertices, ...])

Create a subgraph from an igraph including a set of vertices and their connections.

define_graph_universe(graph[, vertex_names, ...])

Create a graph defining the search space for enrichment-style analyses.

filter_to_largest_subgraph(graph)

Filter an igraph to its largest weakly connected component.

filter_to_largest_subgraphs(graph, top_k)

Filter an igraph to its largest weakly connected components.

get_graph_summary(graph)

Calculate common summary statistics for an igraph network.

get_merge_keys(graph[, merge_by])

Get the merge keys for a graph based on the merge_by parameter.

graph_to_pandas_dfs(graph)

Convert an igraph to Pandas DataFrames for vertices and edges.

validate_edge_attributes(graph, edge_attributes)

Validate that all required edge attributes exist in an igraph.

validate_vertex_attributes(graph, ...)

Validate that all required vertex attributes exist in an igraph.

napistu.network.ig_utils._create_universe_edgelist(edge_filters: List[DataFrame], edge_filter_logic: str, selected_names: List[str], is_directed: bool) DataFrame

Create edgelist for universe from filters or complete graph.

Parameters:
  • edge_filters (List[pd.DataFrame]) – List of edgelist DataFrames to combine

  • edge_filter_logic (str) – ‘and’ for intersection, ‘or’ for union

  • selected_names (List[str]) – Vertex names in the universe

  • is_directed (bool) – Whether graph is directed

Returns:

Final edgelist with ‘source’ and ‘target’ columns

Return type:

pd.DataFrame
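
The filter-combination logic described above can be sketched with plain Python sets. This is a minimal illustration, not napistu's implementation: the name `combine_edge_filters` is hypothetical, edges are (source, target) tuples rather than DataFrame rows, and three behaviors are assumptions made for the sketch: undirected edges are compared without regard to orientation, edges touching vertices outside `selected_names` are dropped, and an empty filter list yields a complete graph without self-edges.

```python
from itertools import combinations, permutations


def combine_edge_filters(edge_filters, edge_filter_logic, selected_names, is_directed):
    """Hypothetical stand-in: combine edge filters given as lists of (source, target) tuples."""
    names = set(selected_names)

    def normalize(edge):
        source, target = edge
        # Assumption: an undirected edge a-b equals b-a, so store it in sorted order.
        return (source, target) if is_directed else tuple(sorted((source, target)))

    if not edge_filters:
        # No filters: every pair of distinct vertices is a possible edge.
        if is_directed:
            pairs = permutations(selected_names, 2)
        else:
            pairs = combinations(selected_names, 2)
        return sorted({normalize(pair) for pair in pairs})

    # Keep only edges whose endpoints are both in the universe.
    edge_sets = [
        {normalize(e) for e in edges if e[0] in names and e[1] in names}
        for edges in edge_filters
    ]
    if edge_filter_logic == "and":
        combined = set.intersection(*edge_sets)
    elif edge_filter_logic == "or":
        combined = set.union(*edge_sets)
    else:
        raise ValueError(f"Unknown edge_filter_logic: {edge_filter_logic!r}")
    return sorted(combined)


user_edges = [("a", "b"), ("b", "c")]
observed_edges = [("b", "a"), ("c", "d")]
names = ["a", "b", "c"]
both = combine_edge_filters([user_edges, observed_edges], "and", names, is_directed=False)
either = combine_edge_filters([user_edges, observed_edges], "or", names, is_directed=False)
```

In the undirected case `("b", "a")` matches `("a", "b")`, so the intersection keeps that one edge. With `is_directed=True` the same inputs share no edge.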

napistu.network.ig_utils._ensure_valid_attribute(graph: Graph, attribute: str, non_negative: bool = True)

Ensure a vertex attribute is present, numeric, finite, and optionally non-negative for all vertices.

This utility checks that the specified vertex attribute exists, is numeric, and (optionally) non-negative for all vertices in the graph. Missing or None values are treated as 0. Raises ValueError if the attribute is missing for all vertices, if all values are zero, or if any value is negative (if non_negative=True).

Parameters:
  • graph (NapistuGraph or Graph) – The input graph (NapistuGraph or igraph.Graph).

  • attribute (str) – The name of the vertex attribute to check.

  • non_negative (bool, default True) – Whether to require all values to be non-negative.

Returns:

Array of attribute values (with missing/None replaced by 0).

Return type:

np.ndarray

Raises:

ValueError – If the attribute is missing for all vertices, all values are zero, or any value is negative (if non_negative=True).
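
The checks listed above can be sketched in plain Python. This is an illustration under stated assumptions rather than the library's code: `ensure_valid_values` is a hypothetical name, it takes a list of raw attribute values instead of a graph, it returns a list of floats instead of an np.ndarray, and raising ValueError for non-numeric or non-finite values is an assumption of the sketch.

```python
import math
from numbers import Real


def ensure_valid_values(values, non_negative=True):
    """Hypothetical stand-in operating on a list of per-vertex values."""
    if all(v is None for v in values):
        raise ValueError("Attribute is missing for all vertices")

    # Missing values are treated as 0.
    cleaned = [0.0 if v is None else v for v in values]

    for v in cleaned:
        # Assumption: non-numeric and non-finite values are rejected with ValueError.
        if not isinstance(v, Real):
            raise ValueError(f"Non-numeric value: {v!r}")
        if not math.isfinite(v):
            raise ValueError(f"Non-finite value: {v!r}")

    if non_negative and any(v < 0 for v in cleaned):
        raise ValueError("Attribute contains negative values")
    if all(v == 0 for v in cleaned):
        raise ValueError("All attribute values are zero")

    return [float(v) for v in cleaned]


weights = ensure_valid_values([1, None, 2.5])
```

Here `None` is replaced by 0 before the remaining checks run, so a vector such as `[0, None]` fails the all-zero check rather than the missing-attribute check.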

napistu.network.ig_utils._get_attribute_masks(graph: Graph, mask_specs: Dict[str, str | ndarray | List | None]) Dict[str, ndarray]

Generate boolean masks for each attribute based on specifications.

Parameters:
  • graph (Graph) – Input graph.

  • mask_specs (Dict[str, Union[str, np.ndarray, List, None]]) – Dictionary mapping each attribute to its mask specification.

Returns:

Dictionary mapping each attribute to its boolean mask array.

Return type:

Dict[str, np.ndarray]

napistu.network.ig_utils._get_top_n_component_stats(graph: Graph, components, component_sizes: Sequence[int], n: int = 10, ascending: bool = False) list[dict[str, Any]]

Summarize the top N components’ network properties.

Parameters:
  • graph (Graph) – The network.

  • components (list) – List of components (as lists of vertex indices).

  • component_sizes (Sequence[int]) – Sizes of each component.

  • n (int, optional) – Number of top components to return. Default is 10.

  • ascending (bool, optional) – If True, return smallest components; otherwise, largest. Default is False.

Returns:

Each dict contains:
  • ‘n’ – size of the component

  • ‘examples’ – up to 10 example vertex attribute dicts from the component

Return type:

list of dict

napistu.network.ig_utils._get_top_n_idx(arr: Sequence, n: int, ascending: bool = False) Sequence[int]

Return the indices of the top n values in an array.

Parameters:
  • arr (Sequence) – An array of values

  • n (int) – The number of top values to return

  • ascending (bool, optional) – If True, return the indices of the n smallest values; otherwise, the n largest. Defaults to False.

Returns:

The indices of the top n values

Return type:

Sequence[int]
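
One way to picture the selection is a sort over positions. The sketch below is a plain-Python illustration with a hypothetical name (`top_n_idx`). It is not taken from napistu, and the tie-breaking order (earlier positions first) is an assumption of the sketch.

```python
def top_n_idx(arr, n, ascending=False):
    """Hypothetical stand-in: positions of the n largest (or smallest) values."""
    # Sort positions by their value; reverse=True puts the largest values first.
    order = sorted(range(len(arr)), key=lambda i: arr[i], reverse=not ascending)
    return order[:n]


scores = [3, 9, 1, 7]
largest_two = top_n_idx(scores, 2)
smallest_two = top_n_idx(scores, 2, ascending=True)
```

If `n` exceeds the length of the input, the slice simply returns every position.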

napistu.network.ig_utils._get_top_n_nodes(graph: Graph, vals: Sequence, val_name: str, n: int = 10, ascending: bool = False) list[dict[str, Any]]

Get the top N nodes by a node attribute.

Parameters:
  • graph (Graph) – The network.

  • vals (Sequence) – Sequence of node attribute values.

  • val_name (str) – Name of the attribute.

  • n (int, optional) – Number of top nodes to return. Default is 10.

  • ascending (bool, optional) – If True, return nodes with smallest values; otherwise, largest. Default is False.

Returns:

Each dict contains the value and the node’s attributes.

Return type:

list of dict
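
The shape of the returned records can be sketched as follows. This is an illustration only: `top_n_nodes` is a hypothetical name, and a list of attribute dicts stands in for the vertices of an igraph.Graph.

```python
def top_n_nodes(node_attrs, vals, val_name, n=10, ascending=False):
    """Hypothetical stand-in: pair each top-ranked value with its node's attributes."""
    # Rank node positions by value, largest first unless ascending is True.
    order = sorted(range(len(vals)), key=lambda i: vals[i], reverse=not ascending)
    return [{val_name: vals[i], **node_attrs[i]} for i in order[:n]]


nodes = [{"name": "a"}, {"name": "b"}, {"name": "c"}]
betweenness = [0.2, 0.9, 0.5]
top_two = top_n_nodes(nodes, betweenness, "betweenness", n=2)
```

Each record merges the ranking value, stored under `val_name`, with the attributes of the corresponding node.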

napistu.network.ig_utils._get_top_n_objects(object_vals: Sequence, objects: Sequence, n: int = 10, ascending: bool = False) list

Get the top N objects based on a ranking measure.

napistu.network.ig_utils._get_universe_degrees(universe: Graph, directed: bool = False) Tuple[ndarray, ndarray]

Extract degree information from universe graph.

Always returns (out_degrees, in_degrees). For undirected graphs, out_degrees == in_degrees == total_degree.

Parameters:
  • universe (igraph.Graph) – Universe graph

  • directed (bool) – Whether to compute directed degrees. If True, computes separate out and in degrees. If False, returns total degree for both out and in.

Returns:

(out_degrees, in_degrees), where each array has shape (n_vertices,). For undirected graphs, both arrays are identical.

Return type:

Tuple[np.ndarray, np.ndarray]

Notes

For undirected graphs, out_degree == in_degree == total_degree. This allows calling code to avoid branching logic based on directedness.

Examples

>>> # Works the same for both directed and undirected
>>> out_deg, in_deg = _get_universe_degrees(universe, directed=False)
>>> out_deg, in_deg = _get_universe_degrees(universe, directed=True)

napistu.network.ig_utils._get_universe_edge_filters(graph: Graph, edgelist: DataFrame | None = None, observed_only: bool = False) List[DataFrame]

Build list of edge filters from user inputs.

Parameters:
  • graph (Graph) – Source graph

  • edgelist (pd.DataFrame, optional) – User-provided edgelist with ‘source’ and ‘target’ columns

  • observed_only (bool) – If True, extract observed edges from graph

Returns:

List of edgelist DataFrames to filter by

Return type:

List[pd.DataFrame]

napistu.network.ig_utils._get_universe_vertex_names(graph: Graph, vertex_names: List[str] | Series | None = None) List[str]

Get and validate vertex names for the universe.

Parameters:
  • graph (Graph) – Source graph

  • vertex_names (list of str or pd.Series, optional) – Vertex names to include. If None, includes all vertices.

Returns:

List of vertex names to include in universe

Return type:

List[str]

Raises:

ValueError – If any vertex names are not found in the graph

napistu.network.ig_utils._parse_mask_input(mask_input: str | ndarray | List | Dict | None, attributes: List[str], verbose: bool = False) Dict[str, str | ndarray | List | None]

Parse mask input and convert to attribute-specific mask specifications.

Parameters:
  • mask_input (str, np.ndarray, List, Dict, or None) – Mask specification that can be:
      - None: use all nodes for all attributes
      - “attr”: use each attribute as its own mask
      - np.ndarray/List: use same mask for all attributes
      - Dict: attribute-specific mask specifications

  • attributes (List[str]) – List of attribute names.

  • verbose (bool, optional) – Whether to print the mask input parsing result. Default is False.

Returns:

Dictionary mapping each attribute to its mask specification.

Return type:

Dict[str, Union[str, np.ndarray, List, None]]
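
The four documented input forms can be normalized into one dictionary, roughly as sketched below. This is not napistu's code: `parse_mask_spec` is a hypothetical name, mapping each attribute to its own name for the "attr" case is an assumption about what "its own mask" means, and rejecting other strings or dicts with missing attributes is a choice made only for this sketch.

```python
def parse_mask_spec(mask_input, attributes):
    """Hypothetical stand-in: expand a mask specification to one entry per attribute."""
    if mask_input is None:
        # No mask: every node is used for every attribute.
        return {attr: None for attr in attributes}
    if isinstance(mask_input, str):
        if mask_input != "attr":
            # Assumption: this sketch only handles the documented "attr" keyword.
            raise ValueError(f"Unsupported string mask: {mask_input!r}")
        # Assumption: each attribute names itself as the source of its mask.
        return {attr: attr for attr in attributes}
    if isinstance(mask_input, dict):
        missing = [attr for attr in attributes if attr not in mask_input]
        if missing:
            # Assumption: every attribute needs an explicit entry.
            raise ValueError(f"No mask specified for: {missing}")
        return {attr: mask_input[attr] for attr in attributes}
    # Lists and arrays: the same mask is shared by all attributes.
    return {attr: mask_input for attr in attributes}


specs = parse_mask_spec([True, False, True], ["weight", "score"])
```

Normalizing up front means later steps only need to handle the per-attribute dictionary form.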

napistu.network.ig_utils._print_mask_input_result(masks)

Print a readable summary of the result of _parse_mask_input(mask_input, attributes). Shows each attribute and its corresponding mask specification.

napistu.network.ig_utils.create_induced_subgraph(graph: Graph, vertices: list[str] | None = None, n_vertices: int = 5000) Graph

Create a subgraph from an igraph including a set of vertices and their connections.

Parameters:
  • graph (igraph.Graph) – The input network.

  • vertices (list, optional) – List of vertex names to include. If None, a random sample is selected.

  • n_vertices (int, optional) – Number of vertices to sample if vertices is None. Default is 5000.

Returns:

The induced subgraph.

Return type:

Graph

napistu.network.ig_utils.define_graph_universe(graph: Graph, vertex_names: List[str] | Series | None = None, edgelist: DataFrame | None = None, observed_only: bool = False, edge_filter_logic: str = 'and', include_self_edges: bool = False) Graph

Create a graph defining the search space for enrichment-style analyses.

The graph represents all possible edges for the null model. By default (no filters), creates a COMPLETE graph on all vertices.

Parameters:
  • graph (Graph) – Source graph (used for vertex names and directionality)

  • vertex_names (list of str or pd.Series, optional) – Vertex names to include in universe (matching graph vertex ‘name’ attribute). If None, includes all vertices.

  • edgelist (pd.DataFrame, optional) – Two-column DataFrame with columns ‘source’ and ‘target’ containing vertex names. Specifies edges to include in universe. If None and observed_only=False, creates complete graph.

  • observed_only (bool) – If True, extract edgelist from original graph where ‘observed’ attribute is True.

  • edge_filter_logic (str) – How to combine edgelist and observed_only filters:
      - ‘and’: Keep edges in BOTH edgelists (intersection)
      - ‘or’: Keep edges in EITHER edgelist (union)

  • include_self_edges (bool) – If True, include self-edges (i -> i) in universe. Default is False.

Returns:

Universe graph with same directionality as source. Vertex indices match the filtered vertex set.

Return type:

Graph

napistu.network.ig_utils.filter_to_largest_subgraph(graph: Graph) Graph

Filter an igraph to its largest weakly connected component.

Parameters:

graph (Graph) – The input network.

Returns:

The largest weakly connected component.

Return type:

Graph

napistu.network.ig_utils.filter_to_largest_subgraphs(graph: Graph, top_k: int) list[Graph]

Filter an igraph to its largest weakly connected components.

Parameters:
  • graph (Graph) – The input network.

  • top_k (int) – The number of largest components to return.

Returns:

A list of the top K largest components as graphs.

Return type:

list[Graph]
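
"Weakly connected" means edge direction is ignored when grouping vertices. The sketch below illustrates that idea in plain Python. The function names are hypothetical, vertices and edges are plain lists rather than an igraph.Graph, and the result is lists of vertex names rather than subgraphs.

```python
from collections import defaultdict, deque


def weak_components(vertices, edges):
    """Hypothetical stand-in: group vertices into weakly connected components."""
    neighbors = defaultdict(set)
    for source, target in edges:
        # Ignore direction: each edge links both endpoints to each other.
        neighbors[source].add(target)
        neighbors[target].add(source)

    seen = set()
    components = []
    for start in vertices:
        if start in seen:
            continue
        # Breadth-first search from each unvisited vertex.
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            current = queue.popleft()
            component.append(current)
            for nb in neighbors[current]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        components.append(sorted(component))
    return components


def largest_components(vertices, edges, top_k):
    """Return the top_k components, largest first."""
    return sorted(weak_components(vertices, edges), key=len, reverse=True)[:top_k]


vertices = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("c", "b"), ("d", "e")]
top_two = largest_components(vertices, edges, top_k=2)
```

In this example there is no directed path between "a" and "c", yet both point to "b", so all three fall in the same weakly connected component.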

napistu.network.ig_utils.get_graph_summary(graph: Graph) dict[str, Any]

Calculate common summary statistics for an igraph network.

Parameters:

graph (Graph) – The input network.

Returns:

A dictionary of summary statistics with the following keys:
  • n_edges (int) – number of edges

  • n_vertices (int) – number of vertices

  • n_components (int) – number of weakly connected components

  • stats_component_sizes (dict) – summary statistics for the component sizes

  • top10_large_components (list[dict]) – the top 10 largest components with 10 example vertices

  • top10_smallest_components (list[dict]) – the top 10 smallest components with 10 example vertices

  • average_path_length (float) – the average shortest path length between all vertices

  • top10_betweenness (list[dict]) – the top 10 vertices by betweenness centrality

  • top10_harmonic_centrality (list[dict]) – the top 10 vertices by harmonic centrality

Return type:

dict

napistu.network.ig_utils.get_merge_keys(graph: Graph, merge_by: str = 'name') tuple[str, str, str]

Get the merge keys for a graph based on the merge_by parameter.

Parameters:
  • graph (Graph) – The graph to get the merge keys for.

  • merge_by (str) – The attribute to merge by. Must be one of IGRAPH_DEFS.NAME or IGRAPH_DEFS.INDEX.

Returns:

The merge keys.

Return type:

tuple[str, str, str]

Raises:

ValueError – If merge_by is not one of IGRAPH_DEFS.NAME or IGRAPH_DEFS.INDEX, if the vertex attribute is not unique, or if the expected attributes are not present in the graph.

napistu.network.ig_utils.graph_to_pandas_dfs(graph: Graph) tuple[DataFrame, DataFrame]

Convert an igraph to Pandas DataFrames for vertices and edges.

Parameters:

graph (Graph) – An igraph network.

Returns:

  • vertices (pandas.DataFrame) – A table with one row per vertex.

  • edges (pandas.DataFrame) – A table with one row per edge.

napistu.network.ig_utils.validate_edge_attributes(graph: Graph, edge_attributes: list[str]) None

Validate that all required edge attributes exist in an igraph.

Parameters:
  • graph (Graph) – The network.

  • edge_attributes (list of str) – List of edge attribute names to check.

Return type:

None

Raises:
  • TypeError – If “edge_attributes” is not a list or str.

  • ValueError – If any required edge attribute is missing from the graph.

napistu.network.ig_utils.validate_vertex_attributes(graph: Graph, vertex_attributes: list[str]) None

Validate that all required vertex attributes exist in an igraph.

Parameters:
  • graph (Graph) – The network.

  • vertex_attributes (list of str) – List of vertex attribute names to check.

Return type:

None

Raises:
  • TypeError – If “vertex_attributes” is not a list or str.

  • ValueError – If any required vertex attribute is missing from the graph.
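
The contract shared by the two validators (accept a list or a single string, raise TypeError for other types, raise ValueError when something is missing) can be sketched as below. `validate_attributes` is a hypothetical helper, not part of napistu. It takes the available attribute names directly. With python-igraph those names could come from, for example, `graph.vs.attributes()` or `graph.es.attributes()`.

```python
def validate_attributes(available, required, kind="vertex"):
    """Hypothetical stand-in: check that every required attribute name is available."""
    if isinstance(required, str):
        # A single name is treated as a one-element list.
        required = [required]
    elif not isinstance(required, list):
        raise TypeError(f"{kind}_attributes must be a list or str, got {type(required).__name__}")

    missing = [attr for attr in required if attr not in available]
    if missing:
        raise ValueError(f"Missing {kind} attributes: {missing}")


available_vertex_attrs = ["name", "node_type"]
validate_attributes(available_vertex_attrs, ["name"])     # passes silently
validate_attributes(available_vertex_attrs, "node_type")  # a single str is accepted
```

Returning None on success and raising on failure lets callers use the check as a guard at the top of a function.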