napistu.ingestion.string
Functions
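convert_string_to_sbml_dfs | Ingests STRING into an SBML_dfs object
download_string | Downloads STRING interactions to the target URI
download_string_aliases | Downloads STRING aliases to the target URI
get_string_species_url | Constructs URLs for downloading specific STRING tables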
- napistu.ingestion.string._build_interactor_edgelist(edgelist: DataFrame, upstream_col_name: str = 'protein1', downstream_col_name: str = 'protein2', add_reverse_interactions: bool = False) DataFrame
Format STRING interactions as reactions.
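The `add_reverse_interactions` flag suggests that each interaction can also be emitted in the opposite direction. The sketch below illustrates that idea with plain dictionaries instead of a DataFrame. The helper name `add_reverse_edges`, the placeholder protein IDs, and the score are all hypothetical, and the real function may order or deduplicate rows differently.

```python
def add_reverse_edges(edges, upstream="protein1", downstream="protein2"):
    """Return the original edges followed by a flipped copy of each one."""
    reversed_edges = [
        # Swap the upstream and downstream values; keep every other field.
        {**edge, upstream: edge[downstream], downstream: edge[upstream]}
        for edge in edges
    ]
    return edges + reversed_edges


# Placeholder row, not real STRING data.
edges = [
    {"protein1": "9606.ENSP_A", "protein2": "9606.ENSP_B", "combined_score": 900},
]
both_directions = add_reverse_edges(edges)
```

With one input edge, `both_directions` holds two rows: the original and a copy with `protein1` and `protein2` swapped.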
- napistu.ingestion.string._build_species_df(edgelist: DataFrame, aliases: DataFrame, alias_to_identifier: dict, source_col: str = 'protein1', target_col: str = 'protein2') DataFrame
Builds the species dataframe from the edgelist and aliases
- Parameters:
edgelist (pd.DataFrame) – STRING interaction edgelist
aliases (pd.DataFrame) – STRING alias table
alias_to_identifier (dict[str, tuple[str, str]]) – map from an alias source to an ontology and a qualifier
source_col (str) – source column name
target_col (str) – target column name
- Returns:
species dataframe
- Return type:
pd.DataFrame
- napistu.ingestion.string._build_string_reaction_name(from_col: Series, to_col: Series) Series
Helper to build the reaction name for STRING reactions
- Parameters:
from_col (pd.Series) – from species
to_col (pd.Series) – to species
- Returns:
new name column
- Return type:
pd.Series
- napistu.ingestion.string._get_identifiers(row: DataFrame, alias_to_identifier: dict[str, tuple[str, str]], dat_alias: DataFrame) Identifiers
Helper function to get identifiers from a row of the STRING alias file
- Parameters:
row (pd.DataFrame) – grouped dataframe
alias_to_identifier (dict[str, tuple[str, str]]) – map from an alias source to an ontology and a qualifier
dat_alias (pd.DataFrame) – Helper dataframe with index=string_protein_id and columns=source (the source name), alias (the identifier)
- Returns:
An Identifiers object containing all identifiers
- Return type:
Identifiers
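To make the role of `alias_to_identifier` concrete, here is a minimal sketch that turns alias rows into identifier records using only the standard library. The mapping values, the placeholder IDs, the record keys (`ontology`, `identifier`, `bqb`), and the helper `collect_identifiers` are assumptions for illustration. The real function returns an `Identifiers` object whose construction is not shown here.

```python
from collections import defaultdict

# Assumed mapping: alias source -> (ontology, qualifier). Values are illustrative.
ALIAS_TO_IDENTIFIER = {
    "Ensembl_gene": ("ensembl_gene", "BQB_IS_ENCODED_BY"),
    "UniProt_AC": ("uniprot", "BQB_IS"),
}

# Rows shaped like the alias table: (string_protein_id, alias, source).
# All values are placeholders, not real STRING data.
alias_rows = [
    ("9606.ENSP_A", "ENSG_A", "Ensembl_gene"),
    ("9606.ENSP_A", "P_A", "UniProt_AC"),
    ("9606.ENSP_A", "some synonym", "Unmapped_source"),
]


def collect_identifiers(rows, alias_to_identifier):
    """Group aliases by protein, keeping only sources present in the mapping."""
    identifiers = defaultdict(list)
    for protein_id, alias, source in rows:
        if source not in alias_to_identifier:
            continue  # Sources without a mapping are skipped.
        ontology, qualifier = alias_to_identifier[source]
        identifiers[protein_id].append(
            {"ontology": ontology, "identifier": alias, "bqb": qualifier}
        )
    return dict(identifiers)


identifiers = collect_identifiers(alias_rows, ALIAS_TO_IDENTIFIER)
```

Only the two rows whose source appears in the mapping survive, so the placeholder protein ends up with two identifier records.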
- napistu.ingestion.string._read_string(string_uri: str) DataFrame
Reads STRING interactions from uri (supports plain text or .gz).
- Parameters:
string_uri (str) – URI of the STRING interactions file
- Returns:
STRING edgelist
- Return type:
pd.DataFrame
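Supporting both plain text and `.gz` input usually comes down to choosing the file opener from the extension. The sketch below shows that pattern with the standard library, assuming a space-delimited file with a `protein1 protein2 combined_score` header as in STRING's public links files. The helper `read_space_delimited` is hypothetical, handles local paths only, and returns a list of dictionaries, whereas the real function returns a DataFrame.

```python
import csv
import gzip
import os
import tempfile


def read_space_delimited(path):
    """Read a space-delimited table with a header row from a plain or .gz file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, mode="rt", newline="") as handle:
        return list(csv.DictReader(handle, delimiter=" "))


# Placeholder content, not real STRING data.
content = "protein1 protein2 combined_score\n9606.ENSP_A 9606.ENSP_B 900\n"

with tempfile.TemporaryDirectory() as tmp:
    plain_path = os.path.join(tmp, "links.txt")
    gz_path = os.path.join(tmp, "links.txt.gz")
    with open(plain_path, "w") as handle:
        handle.write(content)
    with gzip.open(gz_path, "wt") as handle:
        handle.write(content)
    plain_rows = read_space_delimited(plain_path)
    gz_rows = read_space_delimited(gz_path)
```

Both calls yield the same rows. Note that `csv.DictReader` keeps every value as a string, so a score column would still need a numeric conversion.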
- napistu.ingestion.string._read_string_aliases(string_aliases_uri: str) DataFrame
Reads STRING aliases from uri (supports plain text or .gz).
- Parameters:
string_aliases_uri (str) – URI of the STRING aliases file
- Returns:
STRING aliases
- Return type:
pd.DataFrame
- napistu.ingestion.string.convert_string_to_sbml_dfs(string_uri: str, string_aliases_uri: str, organismal_species: str | OrganismalSpeciesValidator) SBML_dfs
Ingests STRING into an SBML_dfs object
- Parameters:
string_uri (str) – URI for the STRING interactions file
string_aliases_uri (str) – URI for the STRING aliases file
organismal_species (str | OrganismalSpeciesValidator) – A species name: e.g., Homo sapiens
- Returns:
A STRING pathway representation as an SBML_dfs object
- Return type:
SBML_dfs
- napistu.ingestion.string.download_string(target_uri: str, organismal_species: str | OrganismalSpeciesValidator) None
Downloads STRING interactions to the target URI
- Parameters:
target_uri (str) – target URI
organismal_species (str | OrganismalSpeciesValidator) – A species name: e.g., Homo sapiens
- Return type:
None
- napistu.ingestion.string.download_string_aliases(target_uri: str, organismal_species: str | OrganismalSpeciesValidator) None
Downloads STRING aliases to the target URI
- Parameters:
target_uri (str) – target URI
organismal_species (str | OrganismalSpeciesValidator) – A species name: e.g., Homo sapiens
- Return type:
None
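The two download functions and `convert_string_to_sbml_dfs` take matching arguments, so they can plausibly be chained as sketched below. The wrapper `build_string_sbml_dfs` and the two file names are hypothetical, and whether the download functions expect this kind of local path has not been verified here. Running it requires napistu to be installed and network access.

```python
import os


def build_string_sbml_dfs(work_dir, organismal_species="Homo sapiens"):
    """Download both STRING tables into work_dir, then convert them."""
    # Imported lazily so this sketch can be loaded without napistu installed.
    from napistu.ingestion import string

    # Hypothetical file names inside the working directory.
    interactions_uri = os.path.join(work_dir, "string_interactions.txt.gz")
    aliases_uri = os.path.join(work_dir, "string_aliases.txt.gz")

    string.download_string(interactions_uri, organismal_species)
    string.download_string_aliases(aliases_uri, organismal_species)

    return string.convert_string_to_sbml_dfs(
        interactions_uri, aliases_uri, organismal_species
    )
```

Passing the same `organismal_species` value to all three calls keeps the interactions, the aliases, and the resulting model consistent.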
- napistu.ingestion.string.get_string_species_url(organismal_species: str | OrganismalSpeciesValidator, asset: str, version: float = 11.5) str
STRING Species URL
Constructs URLs for downloading specific STRING tables.
- Parameters:
organismal_species (str | OrganismalSpeciesValidator) – A species name: e.g., Homo sapiens.
asset (str) – The type of table to be downloaded. Currently “interactions” or “aliases”.
version (float) – The version of STRING to work with.
- Returns:
The download URL
- Return type:
str
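As an illustration of how a species, an asset, and a version could combine into a download URL, the sketch below follows the layout of STRING's public download server. The host, the table names (`protein.links`, `protein.aliases`), the taxonomy lookup, and the helper `sketch_string_url` are assumptions; the real function may resolve species differently or target other tables.

```python
# NCBI taxonomy IDs for two common species (illustrative lookup only).
TAXON_IDS = {"Homo sapiens": 9606, "Mus musculus": 10090}

# Assumed mapping from the documented asset names to STRING table names.
ASSET_TABLES = {"interactions": "protein.links", "aliases": "protein.aliases"}


def sketch_string_url(organismal_species, asset, version=11.5):
    """Build a download URL following an assumed STRING URL pattern."""
    if asset not in ASSET_TABLES:
        raise ValueError(f"asset must be one of {sorted(ASSET_TABLES)}, got {asset!r}")
    if organismal_species not in TAXON_IDS:
        raise ValueError(f"unknown species: {organismal_species!r}")
    table = ASSET_TABLES[asset]
    taxon = TAXON_IDS[organismal_species]
    return (
        f"https://stringdb-static.org/download/{table}.v{version}/"
        f"{taxon}.{table}.v{version}.txt.gz"
    )


url = sketch_string_url("Homo sapiens", "interactions")
```

Under these assumptions, `url` ends in `9606.protein.links.v11.5.txt.gz`, and an unsupported asset name raises a `ValueError` rather than producing a malformed URL.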