napistu.identifiers

Systematic identifiers for species, reactions, compartments, etc.

Classes

Identifiers: Identifiers for a single entity or relationship.

Public Functions

construct_cspecies_identifiers: Construct compartmentalized species identifiers by adding sc_id to species_identifiers.
df_to_identifiers: Convert a DataFrame of identifier information to a Series of Identifiers objects.

class napistu.identifiers.Identifiers(id_list: list, verbose: bool = False)

Bases: object

Identifiers for a single entity or relationship.

df

A DataFrame of identifiers with the columns ontology, identifier, url, and bqb.

Type: pd.DataFrame

Properties

ids

(Deprecated) A list of identifiers, each a dict containing an ontology and an identifier.

Type: list

Public Methods

get_all_bqbs: Returns a set of all BQB entries

get_all_ontologies: Returns a set of all ontology entries

has_ontology(ontologies): Returns a bool of whether 1+ of the ontologies was represented

hoist(ontology): Returns value(s) from an ontology

print: Print a table of identifiers

classmethod merge(identifier_series: Series) → Identifiers

Merge multiple Identifiers objects into a single Identifiers object.

Parameters: identifier_series (pd.Series) – Series of Identifiers objects to merge
Returns: New Identifiers object containing all unique identifiers
Return type: Identifiers
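
The merge described above can be pictured as a union of identifier lists in which each unique entry is kept once. The following is a minimal sketch on plain lists of dicts, not napistu's implementation: the helper name `merge_id_lists`, the uniqueness key (ontology, identifier, bqb), and the sample values are all assumptions made for illustration.

```python
def merge_id_lists(id_lists):
    """Combine several identifier lists, keeping each unique entry once."""
    seen = set()
    merged = []
    for id_list in id_lists:
        for entry in id_list:
            # Hypothetical uniqueness key; the real method may compare other fields.
            key = (entry["ontology"], entry["identifier"], entry.get("bqb"))
            if key not in seen:
                seen.add(key)
                merged.append(entry)
    return merged


# Illustrative entries only.
first = [{"ontology": "uniprot", "identifier": "P04637", "bqb": "BQB_IS"}]
second = [
    {"ontology": "uniprot", "identifier": "P04637", "bqb": "BQB_IS"},
    {"ontology": "ensembl_gene", "identifier": "ENSG00000141510", "bqb": "BQB_IS"},
]
merged = merge_id_lists([first, second])
```

In this sketch the entry shared by both lists appears once in `merged`, and first-seen order is preserved.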

__init__(id_list: list, verbose: bool = False) → None

Tracks a set of identifiers and the ontologies they belong to.

Parameters:

id_list (list) – a list of identifier dictionaries containing ontology, identifier, and optionally url
verbose (bool) – extra reporting, defaults to False

Return type:

None.
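
As a rough illustration of the id_list shape, the sketch below builds a list of identifier dictionaries and checks it in plain Python. The required keys are taken from the `_IdentifierValidator` signature on this page; whether `__init__` enforces exactly these checks is an assumption, and `check_id_list` and the sample values are hypothetical.

```python
# Required keys, taken from the _IdentifierValidator signature on this page.
REQUIRED_KEYS = ("ontology", "identifier", "bqb")


def check_id_list(id_list):
    """Raise ValueError if an entry lacks a required key or holds a non-string value."""
    for position, entry in enumerate(id_list):
        missing = [key for key in REQUIRED_KEYS if key not in entry]
        if missing:
            raise ValueError(f"entry {position} is missing keys: {missing}")
        for key in REQUIRED_KEYS:
            if not isinstance(entry[key], str):
                raise ValueError(f"entry {position}: {key!r} must be a string")
        # url is optional, but must be a string when it is provided.
        url = entry.get("url")
        if url is not None and not isinstance(url, str):
            raise ValueError(f"entry {position}: 'url' must be a string or None")


# Illustrative entries only; the second one omits the optional url.
id_list = [
    {
        "ontology": "uniprot",
        "identifier": "P04637",
        "bqb": "BQB_IS",
        "url": "https://example.org/P04637",
    },
    {"ontology": "chebi", "identifier": "15422", "bqb": "BQB_IS"},
]
check_id_list(id_list)
# With napistu installed, a list like this would be passed as Identifiers(id_list).
```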

get_all_bqbs() → set[str]

Returns a set of all BQB entries

Returns: A set containing all unique BQB values from the identifiers
Return type: set[str]

get_all_ontologies(bqb_terms: list[str] = None) → set[str]

Returns a set of all ontology entries

Returns: A set containing all unique ontology names from the identifiers
Return type: set[str]

has_ontology(ontologies: str | list[str]) → bool

Check if specified ontologies are present in the identifiers.

Parameters: ontologies (str or list of str) – Ontology name(s) to search for
Returns: True if any specified ontologies are present
Return type: bool
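
The "any of" semantics and the str-or-list argument can be sketched on a plain list of dicts. This is an illustration of the documented behaviour, not the method's source; `has_any_ontology` and the sample entries are hypothetical.

```python
def has_any_ontology(id_list, ontologies):
    """Return True if at least one requested ontology occurs in id_list."""
    # Accept a single name as well as a list of names.
    if isinstance(ontologies, str):
        ontologies = [ontologies]
    present = {entry["ontology"] for entry in id_list}
    return any(name in present for name in ontologies)


# Illustrative entries only.
id_list = [
    {"ontology": "uniprot", "identifier": "P04637", "bqb": "BQB_IS"},
    {"ontology": "chebi", "identifier": "15422", "bqb": "BQB_IS"},
]
single_hit = has_any_ontology(id_list, "chebi")
partial_hit = has_any_ontology(id_list, ["reactome", "uniprot"])
no_hit = has_any_ontology(id_list, ["reactome"])
```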

hoist(ontology: str, squeeze: bool = True) → str | list[str] | None

Returns value(s) from an ontology

Parameters:

ontology (str) – the ontology of interest
squeeze (bool) – if True, return a single value if possible

Returns:

the value(s) of an ontology of interest

Return type:

str, list[str], or None
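
One plausible reading of the squeeze parameter, inferred only from the signature and the parameter description, is sketched below on a plain list of dicts. Returning None when nothing matches and a bare string for a single match are assumptions; `hoist_sketch` and the sample entries are hypothetical.

```python
def hoist_sketch(id_list, ontology, squeeze=True):
    """Return the identifier(s) recorded for one ontology."""
    matches = [entry["identifier"] for entry in id_list if entry["ontology"] == ontology]
    if not matches:
        # Assumed behaviour for an ontology with no entries.
        return None
    if squeeze and len(matches) == 1:
        # A single match is unwrapped to a plain string.
        return matches[0]
    return matches


# Illustrative entries only.
id_list = [
    {"ontology": "uniprot", "identifier": "P04637", "bqb": "BQB_IS"},
    {"ontology": "chebi", "identifier": "15422", "bqb": "BQB_IS"},
    {"ontology": "chebi", "identifier": "30616", "bqb": "BQB_IS"},
]
one_value = hoist_sketch(id_list, "uniprot")
as_list = hoist_sketch(id_list, "uniprot", squeeze=False)
many_values = hoist_sketch(id_list, "chebi")
nothing = hoist_sketch(id_list, "reactome")
```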

print(): Print a table of identifiers

property ids: list[dict]

class napistu.identifiers._IdentifierValidator(*, ontology: str, identifier: str, bqb: str, url: str | None = None)

Bases: BaseModel

bqb: str

identifier: str

model_config = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

ontology: str

url: str | None

class napistu.identifiers._IdentifiersValidator(*, id_list: list[_IdentifierValidator])

Bases: BaseModel

id_list: list[_IdentifierValidator]

model_config = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

napistu.identifiers._check_species_identifiers_table(species_identifiers: DataFrame, required_vars: set = {'bqb', 'identifier', 'ontology', 's_id', 's_name'})

napistu.identifiers._deduplicate_identifiers_by_priority(df: DataFrame, group_cols: list[str]) → DataFrame

Deduplicate identifiers by prioritizing BQB terms and URL presence.

Parameters:

df (pd.DataFrame) – DataFrame containing identifier information with BQB and URL columns
group_cols (list[str]) – Columns to group by for deduplication (e.g., [ontology, identifier] or [pk, ontology, identifier])

Returns:

Deduplicated DataFrame with highest priority entries retained

Return type:

pd.DataFrame
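
A common way to implement this kind of priority-based deduplication in pandas is to rank rows, sort, and keep the first row per group. The sketch below shows that pattern only; the `BQB_PRIORITY` ranking, the helper name `dedupe_by_priority`, and the sample data are assumptions and do not reflect napistu's actual ordering.

```python
import pandas as pd

# Hypothetical ranking: lower number means higher priority.
BQB_PRIORITY = {"BQB_IS": 0, "BQB_HAS_PART": 1}


def dedupe_by_priority(df, group_cols):
    """Keep one row per group, preferring high-priority BQB terms, then rows with a URL."""
    ranked = df.assign(
        # Unknown BQB terms are ranked last.
        _bqb_rank=df["bqb"].map(BQB_PRIORITY).fillna(len(BQB_PRIORITY)),
        # False sorts before True, so rows with a URL come first.
        _no_url=df["url"].isna(),
    )
    ranked = ranked.sort_values(["_bqb_rank", "_no_url"])
    deduped = ranked.drop_duplicates(subset=group_cols, keep="first")
    return deduped.drop(columns=["_bqb_rank", "_no_url"]).sort_index()


# Illustrative data only.
df = pd.DataFrame(
    {
        "ontology": ["uniprot", "uniprot", "chebi"],
        "identifier": ["P04637", "P04637", "15422"],
        "bqb": ["BQB_HAS_PART", "BQB_IS", "BQB_IS"],
        "url": [None, "https://example.org/P04637", None],
    }
)
deduped = dedupe_by_priority(df, ["ontology", "identifier"])
```

Here the two uniprot rows collapse into the one with the higher-ranked BQB term, while the single chebi row is kept as-is.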

napistu.identifiers._prepare_species_identifiers(sbml_dfs: SBML_dfs, dogmatic: bool = False, species_identifiers: pd.DataFrame | None = None) → pd.DataFrame

Accepts and validates species_identifiers, or extracts a fresh table if None.

napistu.identifiers._validate_assets_sbml_ids(sbml_dfs: SBML_dfs, identifiers_df: pd.DataFrame) → None

Check an sbml_dfs object and an identifiers table for inconsistencies.

Parameters:

sbml_dfs (sbml_dfs_core.SBML_dfs) – The sbml_dfs object to check
identifiers_df (pd.DataFrame) – The identifiers table to check

Return type:

None

Raises:

ValueError – If there are inconsistencies between the sbml_dfs and identifiers_df

napistu.identifiers.construct_cspecies_identifiers(species_identifiers: pd.DataFrame, cspecies_references: 'SBML_dfs' | pd.DataFrame) → pd.DataFrame

Construct compartmentalized species identifiers by adding sc_id to species_identifiers.

This function merges compartmentalized species IDs (sc_id) into a species_identifiers table, allowing you to work with compartmentalized species without loading the full sbml_dfs object.

Parameters:

species_identifiers (pd.DataFrame) – A species identifiers table with columns including s_id, ontology, identifier. Must satisfy SPECIES_IDENTIFIERS_REQUIRED_VARS.
cspecies_references (Union[sbml_dfs_core.SBML_dfs, pd.DataFrame]) – Either an sbml_dfs object from which compartmentalized_species will be extracted, or a 2-column DataFrame with s_id and sc_id columns.

Returns:

The species_identifiers table with an additional sc_id column. Each row in the original table will be expanded to include all corresponding sc_ids for that s_id.

Return type:

pd.DataFrame
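
The row expansion described in the return value is what a pandas merge on s_id produces when one species maps to several compartmentalized species. The sketch below uses plain DataFrames rather than calling napistu; the join type (inner) and all sample values are assumptions.

```python
import pandas as pd

# Illustrative species-level identifiers, one row per species.
species_identifiers = pd.DataFrame(
    {
        "s_id": ["S1", "S2"],
        "s_name": ["glucose", "ATP"],
        "ontology": ["chebi", "chebi"],
        "identifier": ["17234", "15422"],
        "bqb": ["BQB_IS", "BQB_IS"],
    }
)

# A 2-column lookup from species to compartmentalized species.
cspecies_references = pd.DataFrame(
    {
        "s_id": ["S1", "S1", "S2"],
        "sc_id": ["SC1", "SC2", "SC3"],
    }
)

# Each identifier row is repeated once per sc_id that shares its s_id.
cspecies_identifiers = species_identifiers.merge(
    cspecies_references, on="s_id", how="inner"
)
```

In this toy example S1 exists in two compartments, so its single identifier row becomes two rows, one for SC1 and one for SC2.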

napistu.identifiers.df_to_identifiers(df: DataFrame) → Series

Convert a DataFrame of identifier information to a Series of Identifiers objects.

Parameters: df (pd.DataFrame) – DataFrame containing identifier information with required columns: ontology, identifier, url, bqb
Returns: Series indexed by index_col containing Identifiers objects
Return type: pd.Series
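
The grouping step behind such a conversion can be sketched with pandas alone. The example assumes that the DataFrame's index identifies each entity, which this page does not state, and it stops at plain lists of dicts instead of constructing Identifiers objects; `df_to_id_lists` and the sample data are hypothetical.

```python
import pandas as pd

ID_COLS = ["ontology", "identifier", "url", "bqb"]


def df_to_id_lists(df):
    """Group rows by index value and collect each group's identifiers as a list of dicts."""
    return pd.Series(
        {
            key: group[ID_COLS].to_dict("records")
            for key, group in df.groupby(level=0)
        }
    )


# Illustrative data only; the index is assumed to hold the entity key.
df = pd.DataFrame(
    {
        "ontology": ["uniprot", "ensembl_gene", "chebi"],
        "identifier": ["P04637", "ENSG00000141510", "15422"],
        "url": [None, None, None],
        "bqb": ["BQB_IS", "BQB_IS", "BQB_IS"],
    },
    index=["S1", "S1", "S2"],
)
id_lists = df_to_id_lists(df)
# With napistu installed, each list could then be wrapped as Identifiers(id_list).
```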