napistu.ingestion.organismal_species

This module contains the OrganismalSpeciesValidator class, which is used to validate and convert between common and Latin species names.

Classes

OrganismalSpeciesValidator:

A class for validating and converting between common and Latin species names.

class napistu.ingestion.organismal_species.OrganismalSpeciesValidator(species_input: str)

Bases: object

A class for validating and converting between common and Latin species names.

Accepts either common names (e.g., ‘human’) or Latin names (e.g., ‘Homo sapiens’) and provides access to both forms through attributes.

Parameters:

species_input (str) – Either a common species name (e.g., ‘human’) or a Latin species name (e.g., ‘Homo sapiens’). Matching is case-insensitive.

common_name

The common species name (e.g., ‘human’).

Type:

str

latin_name

The Latin species name (e.g., ‘Homo sapiens’).

Type:

str

Public Methods

assert_supported

Assert that this species is supported, raising an exception if not.

ensure

Ensure that organismal_species is an OrganismalSpeciesValidator object.

get_available_species

Return a dictionary of all available species names.

lookup_custom_value

Look up a custom value for this species from a provided table.

validate_against_supported

Validate that this species is supported by a specific function or analysis.

Private Methods

_validate_and_set_species

Validate input and set both Latin and common names.

Raises:

ValueError – If the provided species_input is not recognized or is not a string.

Examples

>>> species = OrganismalSpeciesValidator("Homo sapiens")
>>> species.common_name
'human'
>>> species.latin_name
'Homo sapiens'
>>> species = OrganismalSpeciesValidator("mouse")
>>> species.latin_name
'Mus musculus'
>>> species = OrganismalSpeciesValidator("HUMAN")  # case-insensitive
>>> species.latin_name
'Homo sapiens'
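
The case-insensitive, two-way lookup in the examples above can be pictured as a pair of lowercase-keyed dictionaries. The following is a minimal sketch, not napistu's implementation: the `_SPECIES` table and the `resolve_species` helper are hypothetical, and the table holds only the four species named in this page's examples.

```python
from typing import Tuple

# Hypothetical table: Latin name -> common name (only species named in these docs).
_SPECIES = {
    "Homo sapiens": "human",
    "Mus musculus": "mouse",
    "Drosophila melanogaster": "fly",
    "Saccharomyces cerevisiae": "yeast",
}

# Lowercased indexes so that lookups ignore case.
_BY_LATIN = {latin.lower(): (latin, common) for latin, common in _SPECIES.items()}
_BY_COMMON = {common.lower(): (latin, common) for latin, common in _SPECIES.items()}


def resolve_species(species_input: str) -> Tuple[str, str]:
    """Return (latin_name, common_name) for a common or Latin name (hypothetical helper)."""
    if not isinstance(species_input, str):
        raise ValueError(
            f"species_input must be a string, got {type(species_input).__name__}"
        )
    key = species_input.lower()
    match = _BY_LATIN.get(key) or _BY_COMMON.get(key)
    if match is None:
        raise ValueError(f"Unrecognized species: {species_input!r}")
    return match


print(resolve_species("HUMAN"))
print(resolve_species("mus musculus"))
```

Building both indexes once keeps each lookup to two dictionary probes, and returning the canonical spelling from the table means callers never see the casing the user typed.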

classmethod ensure(organismal_species: str | OrganismalSpeciesValidator) → OrganismalSpeciesValidator

Ensure that organismal_species is an OrganismalSpeciesValidator object.

A string is converted to a new OrganismalSpeciesValidator. An existing OrganismalSpeciesValidator is returned as-is (the same object, not a copy).

Parameters:

organismal_species (Union[str, OrganismalSpeciesValidator]) – Either a string species name or an OrganismalSpeciesValidator object

Returns:

The OrganismalSpeciesValidator object

Return type:

OrganismalSpeciesValidator

Raises:

ValueError – If organismal_species is neither a string nor an OrganismalSpeciesValidator

Examples

>>> validator = OrganismalSpeciesValidator.ensure("human")
>>> isinstance(validator, OrganismalSpeciesValidator)
True
>>> validator.latin_name
'Homo sapiens'
>>> existing_validator = OrganismalSpeciesValidator("mouse")
>>> validator = OrganismalSpeciesValidator.ensure(existing_validator)
>>> validator is existing_validator
True
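
The coercion pattern behind `ensure` lets a function accept either a plain string or an already-validated object and normalize it once at the top. The sketch below illustrates the pattern with a hypothetical stand-in class, `SpeciesSketch`, and a hypothetical caller, `describe`. Neither is part of napistu, and the stand-in skips species validation entirely.

```python
from typing import Union


class SpeciesSketch:
    """Hypothetical stand-in for OrganismalSpeciesValidator (not the real class)."""

    def __init__(self, species_input: str) -> None:
        if not isinstance(species_input, str):
            raise ValueError("species_input must be a string")
        self.name = species_input

    @classmethod
    def ensure(cls, value: Union[str, "SpeciesSketch"]) -> "SpeciesSketch":
        # Already the right type: hand back the same object, no copy.
        if isinstance(value, cls):
            return value
        # A string: build a new instance from it.
        if isinstance(value, str):
            return cls(value)
        raise ValueError(
            f"Expected str or {cls.__name__}, got {type(value).__name__}"
        )


def describe(organismal_species: Union[str, SpeciesSketch]) -> str:
    """Hypothetical caller that accepts either form and normalizes once."""
    species = SpeciesSketch.ensure(organismal_species)
    return f"analysis for {species.name}"


print(describe("human"))
print(describe(SpeciesSketch("mouse")))
```

Returning the existing object unchanged makes the call idempotent, so nested helpers can each call `ensure` without re-validating or allocating.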

classmethod get_available_species() → Dict[str, list]

Return a dictionary of all available species names.

Returns:

Dictionary with keys ‘latin_names’ and ‘common_names’, each containing a list of available species names.

Return type:

Dict[str, list]

Examples

>>> available = OrganismalSpeciesValidator.get_available_species()
>>> available['latin_names']
['Homo sapiens', 'Mus musculus', ...]
>>> available['common_names']
['human', 'mouse', ...]

__init__(species_input: str) → None

Initialize with either a common or a Latin species name.

Parameters:

species_input (str) – Either a common name (e.g., ‘human’) or a Latin species name (e.g., ‘Homo sapiens’). Matching is case-insensitive.

Raises:

ValueError – If species_input is not recognized or is not a string.

_validate_and_set_species(species_input: str) → None

Validate input and set both Latin and common names.

Parameters:

species_input (str) – The species name to validate and normalize.

Raises:

ValueError – If species_input is not a string or not found in known species.

assert_supported(supported_species: List[str] | Set[str], context: str = '') → None

Assert that this species is supported, raising an exception if not.

Parameters:
  • supported_species (Union[List[str], Set[str]]) – Collection of supported species names. May contain common names, Latin names, or a mix of both. Matching is case-insensitive.

  • context (str, optional) – Additional context for the error message (e.g., function name).

Raises:

ValueError – If this species is not in the supported list.

Examples

>>> species = OrganismalSpeciesValidator("human")
>>> species.assert_supported(["human", "mouse"], "my_analysis_function")
# No exception raised
>>> species = OrganismalSpeciesValidator("fly")
>>> species.assert_supported(["human", "mouse"], "my_analysis_function")
ValueError: Species 'fly (Drosophila melanogaster)' not supported by my_analysis_function...
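
Matching a species against a mixed list of common and Latin names amounts to lowercasing the list and testing both of the species' names against it. The following is a minimal sketch under that assumption. `is_supported` and `raise_if_unsupported` are hypothetical free functions, not napistu methods, and the error text only imitates the visible start of the message above. The elided remainder is not reproduced.

```python
from typing import Iterable


def is_supported(
    latin_name: str, common_name: str, supported_species: Iterable[str]
) -> bool:
    """True if either name appears in supported_species, ignoring case."""
    supported = {name.lower() for name in supported_species}
    return latin_name.lower() in supported or common_name.lower() in supported


def raise_if_unsupported(
    latin_name: str,
    common_name: str,
    supported_species: Iterable[str],
    context: str = "",
) -> None:
    """Raise ValueError when the species is absent from supported_species."""
    if not is_supported(latin_name, common_name, supported_species):
        # Only mention the caller when a context string was supplied.
        where = f" by {context}" if context else ""
        raise ValueError(
            f"Species '{common_name} ({latin_name})' not supported{where}"
        )


# A Latin-name entry in a different case still matches.
print(is_supported("Homo sapiens", "human", ["HOMO SAPIENS", "mouse"]))

try:
    raise_if_unsupported(
        "Drosophila melanogaster", "fly", ["human", "mouse"], "my_analysis_function"
    )
except ValueError as err:
    print(err)
```

Splitting the boolean check from the raising wrapper mirrors the relationship between `validate_against_supported` and `assert_supported`: the first suits branching, the second suits guard clauses at the top of a function.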

lookup_custom_value(custom_table: Dict[str, Any], is_latin: bool = True) → Any

Look up a custom value for this species from a provided table.

Parameters:
  • custom_table (Dict[str, Any]) – Dictionary mapping species names to custom values. Keys should be either Latin names (if is_latin=True) or common names (if is_latin=False).

  • is_latin (bool, default True) – If True, treats the keys in custom_table as Latin species names. If False, treats the keys as common species names.

Returns:

The value associated with this species in the custom table.

Return type:

Any

Raises:

ValueError – If this species is not found in the custom table.

Examples

>>> # Custom table with Latin names as keys
>>> psi_table = {
...     "Homo sapiens": "human",
...     "Mus musculus": "mouse",
...     "Saccharomyces cerevisiae": "yeast"
... }
>>> species = OrganismalSpeciesValidator("human")
>>> species.lookup_custom_value(psi_table, is_latin=True)
'human'
>>> # Custom table with common names as keys
>>> custom_ids = {
...     "human": "HUMAN_001",
...     "mouse": "MOUSE_001"
... }
>>> species = OrganismalSpeciesValidator("Homo sapiens")
>>> species.lookup_custom_value(custom_ids, is_latin=False)
'HUMAN_001'
>>> # Species not in table raises error
>>> species = OrganismalSpeciesValidator("fly")
>>> species.lookup_custom_value(psi_table, is_latin=True)
ValueError: Species 'fly (Drosophila melanogaster)' not found in custom table...
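
The `is_latin` flag only decides which of the species' two names is used as the dictionary key. Here is a minimal sketch of that selection, written as a hypothetical free function, `lookup_sketch`, rather than the real method. It assumes keys must match the canonical spelling exactly, since this page does not say whether table keys are matched case-insensitively.

```python
from typing import Any, Dict


def lookup_sketch(
    latin_name: str,
    common_name: str,
    custom_table: Dict[str, Any],
    is_latin: bool = True,
) -> Any:
    """Return custom_table's value for the species (hypothetical helper)."""
    # Choose the name form that the table is keyed by.
    key = latin_name if is_latin else common_name
    if key not in custom_table:
        raise ValueError(
            f"Species '{common_name} ({latin_name})' not found in custom table"
        )
    return custom_table[key]


# Example tables; the values are illustrative only.
by_latin = {"Homo sapiens": "human", "Mus musculus": "mouse"}
by_common = {"human": "HUMAN_001", "mouse": "MOUSE_001"}

print(lookup_sketch("Homo sapiens", "human", by_latin, is_latin=True))
print(lookup_sketch("Homo sapiens", "human", by_common, is_latin=False))
```

Checking membership before indexing lets the error name the species in both forms, which is more useful to a caller than a bare `KeyError`.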

validate_against_supported(supported_species: List[str] | Set[str]) → bool

Validate that this species is supported by a specific function or analysis.

Parameters:

supported_species (Union[List[str], Set[str]]) – Collection of supported species names. May contain common names, Latin names, or a mix of both. Matching is case-insensitive.

Returns:

True if this species is in the supported list, False otherwise.

Return type:

bool

Examples

>>> species = OrganismalSpeciesValidator("human")
>>> species.validate_against_supported(["human", "mouse", "rat"])
True
>>> species = OrganismalSpeciesValidator("Homo sapiens")
>>> species.validate_against_supported(["Homo sapiens", "Mus musculus"])
True
>>> species = OrganismalSpeciesValidator("fly")
>>> species.validate_against_supported(["human", "mouse"])
False
property common_name: str

Get the common species name.

Returns:

The common species name (e.g., ‘human’).

Return type:

str

property latin_name: str

Get the Latin species name.

Returns:

The Latin species name (e.g., ‘Homo sapiens’).

Return type:

str