napistu.ingestion.string

Functions

convert_string_to_sbml_dfs(string_uri, ...)

Ingests STRING data into an SBML_dfs object

download_string(target_uri, organismal_species)

Downloads STRING interactions to the target URI

download_string_aliases(target_uri, ...)

Downloads STRING aliases to the target URI

get_string_species_url(organismal_species, asset)

STRING Species URL

napistu.ingestion.string._build_interactor_edgelist(edgelist: DataFrame, upstream_col_name: str = 'protein1', downstream_col_name: str = 'protein2', add_reverse_interactions: bool = False) DataFrame

Format STRING interactions as reactions.
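The add_reverse_interactions flag suggests that each upstream/downstream pair can also be emitted in the opposite direction. The following is a minimal standard-library sketch of that idea, not the library's implementation: the helper name add_reverse_edges, the duplicate handling, and the placeholder protein IDs are all assumptions made for illustration.

```python
def add_reverse_edges(edges):
    """Return each (upstream, downstream) pair plus its swapped copy.

    Hypothetical helper: duplicates are dropped so an edge that is
    already present in both directions is not emitted twice.
    """
    seen = set()
    out = []
    for upstream, downstream in edges:
        for pair in ((upstream, downstream), (downstream, upstream)):
            if pair not in seen:
                seen.add(pair)
                out.append(pair)
    return out


# Placeholder identifiers in the "<taxid>.<protein>" style, not real proteins.
edges = [
    ("9606.ENSP0001", "9606.ENSP0002"),
    ("9606.ENSP0002", "9606.ENSP0003"),
]
both_directions = add_reverse_edges(edges)
```

With the two input edges above, both_directions holds four pairs: the originals and their swapped counterparts.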

napistu.ingestion.string._build_species_df(edgelist: DataFrame, aliases: DataFrame, alias_to_identifier: dict, source_col: str = 'protein1', target_col: str = 'protein2') DataFrame

Builds the species dataframe from the edgelist and aliases

Parameters:
  • edgelist (pd.DataFrame) – STRING interaction edgelist

  • aliases (pd.DataFrame) – STRING aliases table

  • alias_to_identifier (dict[str, tuple[str, str]]) – map from an alias source to an ontology and a qualifier

  • source_col (str) – name of the source column in the edgelist

  • target_col (str) – name of the target column in the edgelist

Returns:

species dataframe

Return type:

pd.DataFrame

napistu.ingestion.string._build_string_reaction_name(from_col: Series, to_col: Series) Series

Helper to build the reaction name for string reactions

Parameters:
  • from_col (pd.Series) – from species

  • to_col (pd.Series) – to species

Returns:

new name column

Return type:

pd.Series

napistu.ingestion.string._get_identifiers(row: DataFrame, alias_to_identifier: dict[str, tuple[str, str]], dat_alias: DataFrame) Identifiers

Helper function to get identifiers from a row of the string alias file

Parameters:
  • row (pd.DataFrame) – grouped dataframe

  • alias_to_identifier (dict[str, tuple[str, str]]) – map from an alias source to an ontology and a qualifier

  • dat_alias (pd.DataFrame) – Helper dataframe with index=string_protein_id and columns=source (the source name), alias (the identifier)

Returns:

An Identifiers object containing all identifiers

Return type:

Identifiers
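To illustrate how an alias_to_identifier map of the documented shape (alias source mapped to an ontology and a qualifier) could be applied, here is a rough sketch that groups alias rows by protein using plain Python. The source names, ontology names, qualifier strings, and protein IDs are hypothetical placeholders, and the output is a list of dicts rather than the library's Identifiers object.

```python
from collections import defaultdict

# Hypothetical mapping: alias source -> (ontology, qualifier).
ALIAS_TO_IDENTIFIER = {
    "Ensembl_gene": ("ensembl_gene", "BQB_IS_ENCODED_BY"),
    "UniProt_AC": ("uniprot", "BQB_IS"),
}

# Placeholder alias rows: (string_protein_id, alias, source).
alias_rows = [
    ("9606.ENSP0001", "ENSG0001", "Ensembl_gene"),
    ("9606.ENSP0001", "P00001", "UniProt_AC"),
    ("9606.ENSP0001", "some symbol", "Unmapped_source"),
    ("9606.ENSP0002", "P00002", "UniProt_AC"),
]


def collect_identifiers(rows, alias_to_identifier):
    """Group aliases by protein, keeping only sources present in the map."""
    by_protein = defaultdict(list)
    for protein_id, alias, source in rows:
        if source not in alias_to_identifier:
            continue  # sources without a mapping are skipped
        ontology, qualifier = alias_to_identifier[source]
        by_protein[protein_id].append(
            {"ontology": ontology, "identifier": alias, "bqb": qualifier}
        )
    return dict(by_protein)


identifiers = collect_identifiers(alias_rows, ALIAS_TO_IDENTIFIER)
```

Rows whose source is absent from the map are dropped, so the first protein ends up with two identifiers and the second with one.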

napistu.ingestion.string._read_string(string_uri: str) DataFrame

Reads STRING interactions from uri (supports plain text or .gz).

Parameters:

string_uri (str) – URI of the STRING interactions file

Returns:

STRING edgelist

Return type:

pd.DataFrame
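Supporting both plain text and .gz input usually comes down to choosing the right opener before parsing. The sketch below shows one way to do that with the standard library only. It assumes a local file path (not an arbitrary URI), a space-delimited file with a protein1 / protein2 / combined_score header, and a hypothetical helper name; it returns a list of dicts rather than a DataFrame.

```python
import csv
import gzip
import os
import tempfile


def read_space_delimited(path):
    """Read a space-delimited table from a plain-text or gzipped file."""
    # Pick the opener from the extension; both are opened in text mode.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle, delimiter=" "))


# Write the same placeholder content to a plain and a gzipped file.
content = (
    "protein1 protein2 combined_score\n"
    "9606.ENSP0001 9606.ENSP0002 900\n"
)
with tempfile.TemporaryDirectory() as tmp:
    plain_path = os.path.join(tmp, "links.txt")
    gz_path = os.path.join(tmp, "links.txt.gz")
    with open(plain_path, "wt", encoding="utf-8") as f:
        f.write(content)
    with gzip.open(gz_path, "wt", encoding="utf-8") as f:
        f.write(content)
    plain_rows = read_space_delimited(plain_path)
    gz_rows = read_space_delimited(gz_path)
```

Both calls yield the same rows, so downstream code does not need to know whether the source was compressed.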

napistu.ingestion.string._read_string_aliases(string_aliases_uri: str) DataFrame

Reads STRING aliases from uri (supports plain text or .gz).

Parameters:

string_aliases_uri (str) – URI of the STRING aliases file

Returns:

STRING aliases

Return type:

pd.DataFrame

napistu.ingestion.string.convert_string_to_sbml_dfs(string_uri: str, string_aliases_uri: str, organismal_species: str | OrganismalSpeciesValidator) SBML_dfs

Ingests STRING data into an SBML_dfs object

Parameters:
  • string_uri (str) – URI for the string interactions file

  • string_aliases_uri (str) – URI for the string aliases file

  • organismal_species (str | OrganismalSpeciesValidator) – A species name: e.g., Homo sapiens

Returns:

A STRING pathway representation as an SBML_dfs object

Return type:

SBML_dfs

napistu.ingestion.string.download_string(target_uri: str, organismal_species: str | OrganismalSpeciesValidator) None

Downloads STRING interactions to the target URI

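Parameters:
  • target_uri (str) – URI to download the STRING interactions file to

  • organismal_species (str | OrganismalSpeciesValidator) – A species name: e.g., Homo sapiens

Return type: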

None

napistu.ingestion.string.download_string_aliases(target_uri: str, organismal_species: str | OrganismalSpeciesValidator) None

Downloads STRING aliases to the target URI

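Parameters:
  • target_uri (str) – URI to download the STRING aliases file to

  • organismal_species (str | OrganismalSpeciesValidator) – A species name: e.g., Homo sapiens

Return type: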

None

napistu.ingestion.string.get_string_species_url(organismal_species: str | OrganismalSpeciesValidator, asset: str, version: float = 11.5) str

STRING Species URL

Construct urls for downloading specific STRING tables

Parameters:
  • organismal_species (str | OrganismalSpeciesValidator) – A species name: e.g., Homo sapiens.

  • asset (str) – The type of table to be downloaded. Currently “interactions” or “aliases”.

  • version (float) – The version of STRING to work with.

Returns:

The download URL

Return type:

str
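As a rough illustration of how a species, an asset type, and a version could be combined into a download URL, consider the sketch below. It is not the library's code: the function name, the taxonomy lookup table, and the URL templates are assumptions modeled on STRING's public per-species download layout, so check them against the STRING download page before relying on them.

```python
# Assumed lookup from species name to NCBI taxonomy ID (9606 is human).
ASSUMED_TAXIDS = {"Homo sapiens": 9606}

# Assumed URL templates, one per asset type; verify before use.
ASSUMED_URL_TEMPLATES = {
    "interactions": (
        "https://stringdb-static.org/download/"
        "protein.links.v{version}/{taxid}.protein.links.v{version}.txt.gz"
    ),
    "aliases": (
        "https://stringdb-static.org/download/"
        "protein.aliases.v{version}/{taxid}.protein.aliases.v{version}.txt.gz"
    ),
}


def sketch_string_url(species, asset, version=11.5):
    """Build a download URL for one species, asset type, and version."""
    if asset not in ASSUMED_URL_TEMPLATES:
        valid = ", ".join(sorted(ASSUMED_URL_TEMPLATES))
        raise ValueError(f"asset must be one of: {valid}")
    if species not in ASSUMED_TAXIDS:
        raise ValueError(f"no taxonomy ID known for {species!r}")
    return ASSUMED_URL_TEMPLATES[asset].format(
        taxid=ASSUMED_TAXIDS[species], version=version
    )


interactions_url = sketch_string_url("Homo sapiens", "interactions")
aliases_url = sketch_string_url("Homo sapiens", "aliases")
```

Validating the asset up front mirrors the documented restriction to "interactions" or "aliases" and gives a clear error for anything else.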