napistu.ontologies.id_tables
Functions
filter_id_table | Filter an identifier table by identifiers, ontologies, and BQB terms for a given entity type.
- napistu.ontologies.id_tables._sanitize_id_table_bqbs(bqbs: str | list | set, id_table: DataFrame) set
Sanitize and validate BQBs against the id_table.
- Parameters:
bqbs (str, list, or set) – BQB terms to validate.
id_table (pd.DataFrame) – DataFrame containing BQB reference data.
- Returns:
Set of validated BQB terms.
- Return type:
set
- napistu.ontologies.id_tables._sanitize_id_table_identifiers(identifiers: str | list | set, id_table: DataFrame) set
Sanitize and validate identifiers against the id_table.
- Parameters:
identifiers (str, list, or set) – Identifier values to validate.
id_table (pd.DataFrame) – DataFrame containing identifier reference data.
- Returns:
Set of validated identifiers.
- Return type:
set
- napistu.ontologies.id_tables._sanitize_id_table_ontologies(ontologies: str | list | set, id_table: DataFrame) set
Sanitize and validate ontologies against the id_table.
- Parameters:
ontologies (str, list, or set) – Ontology names to validate.
id_table (pd.DataFrame) – DataFrame containing ontology reference data.
- Returns:
Set of validated ontology names.
- Return type:
set
- napistu.ontologies.id_tables._sanitize_id_table_values(values: str | list | set, id_table: DataFrame, column_name: str, valid_values: Set[str] | None = None, value_type_name: str = None) set
Generic function to sanitize and validate values against an id_table column.
- Parameters:
values (str, list, or set) – Values to sanitize and validate. Can be a single string, list of strings, or set of strings.
id_table (pd.DataFrame) – DataFrame containing the reference data to validate against.
column_name (str) – Name of the column in id_table to check values against.
valid_values (set of str, optional) – Optional set of globally valid values for additional validation (e.g., VALID_BQB_TERMS). If provided, values must be a subset of this set.
value_type_name (str, optional) – Human-readable name for the value type used in error messages. If None, defaults to column_name.
- Returns:
Set of sanitized and validated values.
- Return type:
set
- Raises:
ValueError – If values is not a string, list, or set; if any values are not in valid_values (when provided); or if none of the requested values are present in the id_table.
Warning
Logs a warning if some (but not all) requested values are missing from id_table.
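The validation steps described above can be sketched roughly as follows. This is a minimal illustration written from the docstring, not napistu's actual implementation: the name `sanitize_values_sketch` and the `column_values` argument (standing in for `id_table[column_name]`) are assumptions, and whether the real function returns all requested values or only those found in the table is not stated here.

```python
import logging

logger = logging.getLogger(__name__)


def sanitize_values_sketch(values, column_values, valid_values=None, value_type_name="values"):
    """Hypothetical stand-in for the documented behavior; not napistu's code."""
    # Coerce a single string, a list, or a set into a set
    if isinstance(values, str):
        values = {values}
    elif isinstance(values, (list, set)):
        values = set(values)
    else:
        raise ValueError(
            f"{value_type_name} must be a str, list, or set; got {type(values).__name__}"
        )

    # Optional check against a globally valid vocabulary (e.g., valid BQB terms)
    if valid_values is not None:
        invalid = values - set(valid_values)
        if invalid:
            raise ValueError(f"Invalid {value_type_name}: {sorted(invalid)}")

    # Compare the request with what the table column actually contains
    present = set(column_values)
    found = values & present
    if not found:
        raise ValueError(f"None of the requested {value_type_name} are present in the table")

    missing = values - present
    if missing:
        logger.warning("Some %s are missing from the table: %s", value_type_name, sorted(missing))

    # Assumption: the full requested set is returned, including values that were only warned about
    return values


# A pandas column such as id_table["bqb"] could be passed as column_values
result = sanitize_values_sketch("BQB_IS", ["BQB_IS", "BQB_HAS_PART"])  # returns {"BQB_IS"}
```

The three specific wrappers (`_sanitize_id_table_bqbs`, `_sanitize_id_table_identifiers`, `_sanitize_id_table_ontologies`) presumably call the generic function with a different column and, for BQB terms, a fixed vocabulary, but that wiring is an inference from their signatures.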
- napistu.ontologies.id_tables._validate_id_table(id_table: DataFrame, entity_type: str) None
Validate that the id_table contains the required columns and matches the schema for the given entity_type.
- Parameters:
id_table (pd.DataFrame) – DataFrame containing identifier mappings for a given entity type.
entity_type (str) – The type of entity (e.g., 'species', 'reactions') to validate against the schema.
- Return type:
None
- Raises:
ValueError – If entity_type is not present in the schema, or if required columns are missing in id_table.
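As a rough sketch of the two checks listed above (unknown entity type, missing columns), the example below uses a made-up schema. `SCHEMA_SKETCH`, its column names, and `validate_id_table_sketch` are all hypothetical; the real schema is defined inside napistu and is not listed on this page.

```python
import pandas as pd

# Hypothetical schema: entity type -> required columns (illustrative names only)
SCHEMA_SKETCH = {
    "species": {"s_id", "ontology", "identifier", "bqb"},
    "reactions": {"r_id", "ontology", "identifier", "bqb"},
}


def validate_id_table_sketch(id_table: pd.DataFrame, entity_type: str) -> None:
    """Raise ValueError if the entity type is unknown or required columns are absent."""
    if entity_type not in SCHEMA_SKETCH:
        raise ValueError(
            f"Unknown entity_type {entity_type!r}; expected one of {sorted(SCHEMA_SKETCH)}"
        )
    missing = SCHEMA_SKETCH[entity_type] - set(id_table.columns)
    if missing:
        raise ValueError(f"id_table is missing required columns: {sorted(missing)}")


# Made-up example row
table = pd.DataFrame(
    {"s_id": ["S1"], "ontology": ["uniprot"], "identifier": ["P12345"], "bqb": ["BQB_IS"]}
)
validate_id_table_sketch(table, "species")  # passes silently and returns None
```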
- napistu.ontologies.id_tables.filter_id_table(id_table: DataFrame, identifiers: str | list | set | None = None, ontologies: str | list | set | None = None, bqbs: str | list | set | None = ['BQB_IS', 'BQB_IS_HOMOLOG_TO', 'BQB_IS_ENCODED_BY', 'BQB_ENCODES', 'BQB_HAS_PART']) DataFrame
Filter an identifier table by identifiers, ontologies, and BQB terms for a given entity type.
- Parameters:
id_table (pd.DataFrame) – DataFrame containing identifier mappings to be filtered.
identifiers (str, list, set, or None, optional) – Identifiers to filter by. If None, no filtering is applied on identifiers.
ontologies (str, list, set, or None, optional) – Ontologies to filter by. If None, no filtering is applied on ontologies.
bqbs (str, list, set, or None, optional) – BQB terms to filter by. If None, no filtering is applied on BQB terms. Default, per the signature, is ['BQB_IS', 'BQB_IS_HOMOLOG_TO', 'BQB_IS_ENCODED_BY', 'BQB_ENCODES', 'BQB_HAS_PART'].
- Returns:
Filtered DataFrame containing only rows matching the specified criteria.
- Return type:
pd.DataFrame
- Raises:
ValueError – If the id_table or filter values are invalid, or required columns are missing.
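The filtering semantics described for `filter_id_table` (each non-None argument narrows the rows, None skips that filter) can be approximated with plain pandas boolean masks. The sketch below is an assumption-laden illustration, not the library's code: the column names `identifier`, `ontology`, and `bqb`, the function `filter_id_table_sketch`, and the example rows are all invented for the example, and the validation performed by the real function is omitted.

```python
import pandas as pd

# Default BQB terms copied from the filter_id_table signature
DEFAULT_BQBS = ("BQB_IS", "BQB_IS_HOMOLOG_TO", "BQB_IS_ENCODED_BY", "BQB_ENCODES", "BQB_HAS_PART")


def _as_set(values):
    # A single string becomes a one-element set; other iterables are converted directly
    return {values} if isinstance(values, str) else set(values)


def filter_id_table_sketch(id_table, identifiers=None, ontologies=None, bqbs=DEFAULT_BQBS):
    """Keep rows matching every filter that is not None (hypothetical column names)."""
    mask = pd.Series(True, index=id_table.index)
    for column, wanted in (("identifier", identifiers), ("ontology", ontologies), ("bqb", bqbs)):
        if wanted is not None:
            mask &= id_table[column].isin(_as_set(wanted))
    return id_table[mask]


# Made-up rows for illustration
table = pd.DataFrame(
    {
        "s_id": ["S1", "S1", "S2", "S3"],
        "ontology": ["uniprot", "ensembl_gene", "uniprot", "chebi"],
        "identifier": ["P12345", "ENSG00000000001", "Q67890", "15377"],
        "bqb": ["BQB_IS", "BQB_IS_ENCODED_BY", "BQB_IS", "BQB_HAS_VERSION"],
    }
)

uniprot_rows = filter_id_table_sketch(table, ontologies="uniprot")  # rows for S1 and S2
all_rows = filter_id_table_sketch(table, bqbs=None)  # no filters applied, all four rows
```

Note that with the default `bqbs`, the row tagged `BQB_HAS_VERSION` is dropped even when no other filter is given; passing `bqbs=None` is what disables BQB filtering.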