napistu.identifiers
Systematic identifiers for species, reactions, compartments, etc.
Classes
- Identifiers
Identifiers for a single entity or relationship.
Public Functions
- construct_cspecies_identifiers
Construct compartmentalized species identifiers by adding sc_id to species_identifiers.
- df_to_identifiers
Convert a DataFrame of identifier information to a Series of Identifiers objects.
- class napistu.identifiers.Identifiers(id_list: list, verbose: bool = False)
Bases:
object
Identifiers for a single entity or relationship.
- df
a DataFrame of identifiers with columns ontology, identifier, url, bqb
- Type:
pd.DataFrame
- Properties
- ids
(deprecated) a list of identifiers, each a dict containing an ontology and an identifier
- Type:
list
- Public Methods
- get_all_bqbs
Returns a set of all BQB entries
- get_all_ontologies
Returns a set of all ontology entries
- has_ontology(ontologies)
Returns True if at least one of the given ontologies is represented
- hoist(ontology)
Returns value(s) from an ontology
- print
Print a table of identifiers
- classmethod merge(identifier_series: Series) Identifiers
Merge multiple Identifiers objects into a single Identifiers object.
- Parameters:
identifier_series (pd.Series) – Series of Identifiers objects to merge
- Returns:
New Identifiers object containing all unique identifiers
- Return type:
Identifiers
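The idea of combining several identifier sets while keeping only unique entries can be sketched in plain Python. The helper merge_id_lists below is a hypothetical stand-in that operates on lists of dicts rather than on Identifiers objects, and the choice of (ontology, identifier, bqb) as the uniqueness key is an assumption, not napistu's documented behavior.

```python
def merge_id_lists(id_lists):
    """Hypothetical stand-in: union several id_lists, dropping repeated entries."""
    seen = set()
    merged = []
    for id_list in id_lists:
        for entry in id_list:
            # Assumed uniqueness key; the real class may compare other fields.
            key = (entry["ontology"], entry["identifier"], entry["bqb"])
            if key not in seen:
                seen.add(key)
                merged.append(entry)
    return merged


# Illustrative entries only.
first = [{"ontology": "uniprot", "identifier": "P04637", "bqb": "BQB_IS"}]
second = [
    {"ontology": "uniprot", "identifier": "P04637", "bqb": "BQB_IS"},
    {"ontology": "ensembl_gene", "identifier": "ENSG00000141510", "bqb": "BQB_IS"},
]
merged = merge_id_lists([first, second])
```

Here merged holds two entries, because the UniProt entry appears in both inputs and is kept once.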
- __init__(id_list: list, verbose: bool = False) None
Tracks a set of identifiers and the ontologies they belong to.
- Parameters:
id_list (list) – a list of identifier dictionaries containing ontology, identifier, and optionally url
verbose (bool) – extra reporting, defaults to False
- Return type:
None.
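To illustrate the expected shape of id_list, the sketch below builds a small list and checks each entry for the fields listed on _IdentifierValidator (ontology, identifier, bqb, and an optional url). The helper check_id_list and the example values are hypothetical and written in plain Python; this is not napistu's validation code.

```python
REQUIRED_KEYS = {"ontology", "identifier", "bqb"}  # assumed from the validator's fields
OPTIONAL_KEYS = {"url"}


def check_id_list(id_list):
    """Hypothetical stand-in: verify each entry is a dict with the expected keys."""
    for i, entry in enumerate(id_list):
        if not isinstance(entry, dict):
            raise TypeError(f"entry {i} is not a dict")
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            raise ValueError(f"entry {i} is missing keys: {sorted(missing)}")
        unknown = entry.keys() - REQUIRED_KEYS - OPTIONAL_KEYS
        if unknown:
            raise ValueError(f"entry {i} has unexpected keys: {sorted(unknown)}")
    return True


# Illustrative entries only; the BQB labels are assumed spellings.
id_list = [
    {"ontology": "uniprot", "identifier": "P04637", "bqb": "BQB_IS"},
    {
        "ontology": "ensembl_gene",
        "identifier": "ENSG00000141510",
        "bqb": "BQB_IS_ENCODED_BY",
        "url": None,
    },
]
check_id_list(id_list)
```

If the check passes, a list of this shape is what would be handed to Identifiers(id_list).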
- get_all_bqbs() set[str]
Returns a set of all BQB entries
- Returns:
A set containing all unique BQB values from the identifiers
- Return type:
set[str]
- get_all_ontologies(bqb_terms: list[str] = None) set[str]
Returns a set of all ontology entries
- Returns:
A set containing all unique ontology names from the identifiers
- Return type:
set[str]
- has_ontology(ontologies: str | list[str]) bool
Check if specified ontologies are present in the identifiers.
- Parameters:
ontologies (str or list of str) – Ontology name(s) to search for
- Returns:
True if any specified ontologies are present
- Return type:
bool
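The "any of" semantics described here can be sketched as follows. The function is a hypothetical stand-in that works on a plain list of identifier dicts rather than on an Identifiers object, and it normalizes a single string to a one-element list as the signature suggests.

```python
def has_ontology(ids, ontologies):
    """Hypothetical stand-in: True if at least one requested ontology is present."""
    if isinstance(ontologies, str):
        ontologies = [ontologies]  # accept a single name as well as a list
    present = {entry["ontology"] for entry in ids}
    return any(name in present for name in ontologies)


# Illustrative entries only.
ids = [
    {"ontology": "uniprot", "identifier": "P04637"},
    {"ontology": "ensembl_gene", "identifier": "ENSG00000141510"},
]
single = has_ontology(ids, "uniprot")
mixed = has_ontology(ids, ["chebi", "ensembl_gene"])
absent = has_ontology(ids, ["chebi"])
```

In this sketch single and mixed are True and absent is False, since one match among the requested names is enough.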
- hoist(ontology: str, squeeze: bool = True) str | list[str] | None
Returns value(s) from an ontology
- Parameters:
ontology (str) – the ontology of interest
squeeze (bool) – if True, return a single value if possible
- Returns:
the value(s) of an ontology of interest
- Return type:
str, list[str], or None
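One reading of the squeeze parameter and the str | list[str] | None return annotation is sketched below. The function is a hypothetical stand-in over a plain list of dicts; returning None when nothing matches is an assumption inferred from the signature, not confirmed behavior.

```python
def hoist(ids, ontology, squeeze=True):
    """Hypothetical stand-in: pull the identifier value(s) for one ontology."""
    matches = [e["identifier"] for e in ids if e["ontology"] == ontology]
    if not matches:
        return None  # assumed: no identifiers for this ontology
    if squeeze and len(matches) == 1:
        return matches[0]  # a single value is unwrapped from its list
    return matches


# Illustrative entries only.
ids = [
    {"ontology": "uniprot", "identifier": "P04637"},
    {"ontology": "ensembl_gene", "identifier": "ENSG00000141510"},
]
one = hoist(ids, "uniprot")
as_list = hoist(ids, "uniprot", squeeze=False)
none = hoist(ids, "chebi")
```

With squeeze left at True, a caller should be prepared to receive either a string or a list, depending on how many identifiers the ontology has.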
- print()
Print a table of identifiers
- property ids: list[dict]
- class napistu.identifiers._IdentifierValidator(*, ontology: str, identifier: str, bqb: str, url: str | None = None)
Bases:
BaseModel
- _abc_impl = <_abc._abc_data object>
- bqb: str
- identifier: str
- model_config = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- ontology: str
- url: str | None
- class napistu.identifiers._IdentifiersValidator(*, id_list: list[_IdentifierValidator])
Bases:
BaseModel
- _abc_impl = <_abc._abc_data object>
- id_list: list[_IdentifierValidator]
- model_config = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- napistu.identifiers._check_species_identifiers_table(species_identifiers: DataFrame, required_vars: set = {'bqb', 'identifier', 'ontology', 's_id', 's_name'})
- napistu.identifiers._deduplicate_identifiers_by_priority(df: DataFrame, group_cols: list[str]) DataFrame
Deduplicate identifiers by prioritizing BQB terms and URL presence.
- Parameters:
df (pd.DataFrame) – DataFrame containing identifier information with BQB and URL columns
group_cols (list[str]) – Columns to group by for deduplication (e.g., [ontology, identifier] or [pk, ontology, identifier])
- Returns:
Deduplicated DataFrame with highest priority entries retained
- Return type:
pd.DataFrame
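The priority-based deduplication described above can be sketched in plain Python over a list of row dicts. The ranking in BQB_PRIORITY is hypothetical (the actual ordering of BQB terms is not given here), and the function is a stand-in rather than napistu's implementation: within each group it keeps the row with the best-ranked BQB term and, on ties, prefers a row that has a URL.

```python
# Hypothetical ranking; lower numbers win. The real ordering is not documented here.
BQB_PRIORITY = {"BQB_IS": 0, "BQB_HAS_PART": 1}


def deduplicate_by_priority(rows, group_cols):
    """Hypothetical stand-in: keep one row per group, best BQB first, then URL present."""
    best = {}
    for row in rows:
        key = tuple(row[col] for col in group_cols)
        rank = (
            BQB_PRIORITY.get(row["bqb"], len(BQB_PRIORITY)),  # unknown terms rank last
            row.get("url") is None,  # False sorts before True, so rows with a URL win
        )
        if key not in best or rank < best[key][0]:
            best[key] = (rank, row)
    return [row for _, row in best.values()]


# Illustrative rows only.
rows = [
    {"ontology": "uniprot", "identifier": "P04637", "bqb": "BQB_HAS_PART", "url": "u1"},
    {"ontology": "uniprot", "identifier": "P04637", "bqb": "BQB_IS", "url": None},
    {"ontology": "uniprot", "identifier": "P04637", "bqb": "BQB_IS", "url": "u2"},
]
deduped = deduplicate_by_priority(rows, ["ontology", "identifier"])
```

All three rows share one (ontology, identifier) group, so a single row survives: the BQB_IS entry that carries a URL.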
- napistu.identifiers._prepare_species_identifiers(sbml_dfs: SBML_dfs, dogmatic: bool = False, species_identifiers: pd.DataFrame | None = None) pd.DataFrame
Accepts and validates species_identifiers, or extracts a fresh table if None.
- napistu.identifiers._validate_assets_sbml_ids(sbml_dfs: SBML_dfs, identifiers_df: pd.DataFrame) None
Check an sbml_dfs file and identifiers table for inconsistencies.
- Parameters:
sbml_dfs (sbml_dfs_core.SBML_dfs) – The sbml_dfs object to check
identifiers_df (pd.DataFrame) – The identifiers table to check
- Return type:
None
- Raises:
ValueError – If there are inconsistencies between the sbml_dfs and identifiers_df
- napistu.identifiers.construct_cspecies_identifiers(species_identifiers: pd.DataFrame, cspecies_references: 'SBML_dfs' | pd.DataFrame) pd.DataFrame
Construct compartmentalized species identifiers by adding sc_id to species_identifiers.
This function merges compartmentalized species IDs (sc_id) into a species_identifiers table, allowing you to work with compartmentalized species without loading the full sbml_dfs object.
- Parameters:
species_identifiers (pd.DataFrame) – A species identifiers table with columns including s_id, ontology, identifier. Must satisfy SPECIES_IDENTIFIERS_REQUIRED_VARS.
cspecies_references (Union[sbml_dfs_core.SBML_dfs, pd.DataFrame]) – Either an sbml_dfs object from which compartmentalized_species will be extracted, or a 2-column DataFrame with s_id and sc_id columns.
- Returns:
The species_identifiers table with an additional sc_id column. Each row in the original table will be expanded to include all corresponding sc_ids for that s_id.
- Return type:
pd.DataFrame
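The row expansion described in the return value is conceptually a one-to-many join on s_id. The sketch below shows that idea over plain lists of dicts; add_sc_ids is a hypothetical stand-in, and dropping rows whose s_id has no sc_id is an assumption about the join type, not documented behavior.

```python
def add_sc_ids(species_identifiers, cspecies_references):
    """Hypothetical stand-in: repeat each identifier row once per matching sc_id."""
    sc_by_s = {}
    for ref in cspecies_references:
        sc_by_s.setdefault(ref["s_id"], []).append(ref["sc_id"])

    expanded = []
    for row in species_identifiers:
        # Assumed inner-join behavior: rows with no sc_id are omitted.
        for sc_id in sc_by_s.get(row["s_id"], []):
            expanded.append({**row, "sc_id": sc_id})
    return expanded


# Illustrative tables only; the IDs are made up for the example.
species_identifiers = [
    {"s_id": "S1", "ontology": "uniprot", "identifier": "P04637"},
    {"s_id": "S2", "ontology": "chebi", "identifier": "15422"},
]
cspecies_references = [
    {"s_id": "S1", "sc_id": "SC1"},
    {"s_id": "S1", "sc_id": "SC2"},
    {"s_id": "S2", "sc_id": "SC3"},
]
expanded = add_sc_ids(species_identifiers, cspecies_references)
```

The species S1 exists in two compartments, so its single identifier row becomes two rows, one per sc_id, while S2 yields one row.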
- napistu.identifiers.df_to_identifiers(df: DataFrame) Series
Convert a DataFrame of identifier information to a Series of Identifiers objects.
- Parameters:
df (pd.DataFrame) – DataFrame containing identifier information with required columns: ontology, identifier, url, bqb
- Returns:
Series indexed by index_col containing Identifiers objects
- Return type:
pd.Series