napistu.gcs.downloads

Module for downloading and loading Napistu public assets from GCS.

Functions

download_public_napistu_asset(asset, ...[, ...])

Download Public Napistu Asset

load_public_napistu_asset(asset, data_dir[, ...])

Load Public Napistu Asset

napistu.gcs.downloads._get_gcs_asset_path(asset: str, subasset: str | None, gcs_assets: GCSAssets) → str

Get the GCS path for a given asset and subasset.

Parameters:
  • asset (str) – The name of the asset.

  • subasset (Optional[str]) – The name of the subasset.

  • gcs_assets (GCSAssets) – GCS assets configuration.

Returns:

The GCS path for the asset or subasset.

Return type:

str

napistu.gcs.downloads._remove_asset_files_if_needed(asset: str, data_dir: str, gcs_assets: GCSAssets | None = None) → List[str]

Remove asset archive and any extracted directory from data_dir.

Parameters:
  • asset (str) – The asset key (e.g., ‘test_pathway’).

  • data_dir (str) – The directory where assets are stored.

  • gcs_assets (GCSAssets | None) – GCS assets configuration. If None (default), uses constants.GCS_ASSETS via from_dict.

Returns:

A list of the paths of the removed files.

Return type:

List[str]
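
The cleanup described above can be illustrated with a minimal, self-contained sketch. This is not the library's implementation: the function name remove_asset_files, the assumption that the archive ends in .tar.gz, and the assumption that the extracted directory shares the archive's base name are all hypothetical.

```python
import os
import shutil
import tempfile
from typing import List


def remove_asset_files(archive_name: str, data_dir: str) -> List[str]:
    """Hypothetical stand-in: delete an archive and its extracted directory."""
    candidates = [os.path.join(data_dir, archive_name)]
    # Assumption: the extracted directory is the archive name without ".tar.gz"
    if archive_name.endswith(".tar.gz"):
        candidates.append(os.path.join(data_dir, archive_name[: -len(".tar.gz")]))

    removed = []
    for path in candidates:
        if os.path.isdir(path):
            shutil.rmtree(path)
            removed.append(path)
        elif os.path.isfile(path):
            os.remove(path)
            removed.append(path)
    # Only paths that actually existed and were deleted are reported
    return removed


# Demonstrate on a throwaway directory so nothing real is touched
with tempfile.TemporaryDirectory() as data_dir:
    open(os.path.join(data_dir, "test_pathway.tar.gz"), "wb").close()
    os.makedirs(os.path.join(data_dir, "test_pathway"))
    removed = remove_asset_files("test_pathway.tar.gz", data_dir)
    remaining = os.listdir(data_dir)
```

Returning the removed paths, rather than None, lets a caller log exactly what was deleted or confirm that nothing needed removing.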

napistu.gcs.downloads._validate_gcs_asset(asset: str, gcs_assets: GCSAssets) → None

Validate a GCS asset by name.

napistu.gcs.downloads._validate_gcs_asset_version(asset: str, version: str | None, gcs_assets: GCSAssets) → None

Validate a GCS asset version if specified.

napistu.gcs.downloads._validate_gcs_subasset(asset: str, subasset: str | None, gcs_assets: GCSAssets) → None

Validate a subasset as belonging to a given asset.
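
The three validators above are documented only by their one-line summaries, so the following is a rough sketch of how such checks might fit together, not the library's code. The registry layout, the "example_subasset" and "v1" values, the function names, and the choice of ValueError are all assumptions made for illustration; the real GCSAssets structure is not shown in this reference.

```python
from typing import Dict, List, Optional

# Hypothetical registry; "example_subasset" and "v1" are made-up values
EXAMPLE_REGISTRY: Dict[str, Dict[str, List[str]]] = {
    "test_pathway": {
        "subassets": ["sbml_dfs", "example_subasset"],
        "versions": ["v1"],
    },
}


def validate_asset(asset: str, registry: Dict[str, Dict[str, List[str]]]) -> None:
    """Raise if the asset name is not in the registry."""
    if asset not in registry:
        raise ValueError(f"Unknown asset {asset!r}; expected one of {sorted(registry)}")


def validate_subasset(asset: str, subasset: Optional[str], registry) -> None:
    """Raise if a subasset is given but does not belong to the asset."""
    if subasset is None:
        return  # nothing to check when no subasset was requested
    if subasset not in registry[asset]["subassets"]:
        raise ValueError(f"{subasset!r} is not a subasset of {asset!r}")


def validate_version(asset: str, version: Optional[str], registry) -> None:
    """Raise if a version is given but is not listed for the asset."""
    if version is None:
        return  # None means no specific version was requested
    if version not in registry[asset]["versions"]:
        raise ValueError(f"{version!r} is not a known version of {asset!r}")


def validate_request(asset, subasset, version, registry) -> None:
    # Check the asset first: the other two lookups index into its entry
    validate_asset(asset, registry)
    validate_subasset(asset, subasset, registry)
    validate_version(asset, version, registry)


validate_request("test_pathway", "sbml_dfs", None, EXAMPLE_REGISTRY)  # passes silently
try:
    validate_request("test_pathway", "missing", None, EXAMPLE_REGISTRY)
except ValueError as err:
    message = str(err)
```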

napistu.gcs.downloads.download_public_napistu_asset(asset: str, version: str | None, out_path: str, gcs_assets: GCSAssets | None = None) → None

Download Public Napistu Asset

Parameters:
  • asset (str) – The name of a Napistu public asset stored in Google Cloud Storage (GCS).

  • version (str | None) – The version of the asset to download.

  • out_path (str) – Local location where the file should be saved.

  • gcs_assets (GCSAssets | None) – GCS assets configuration. If None (default), uses constants.GCS_ASSETS via from_dict. Can be overridden to use custom asset configurations.

Return type:

None

Examples

>>> from napistu.gcs import downloads
>>> from napistu.gcs.constants import GCS_ASSETS_NAMES
>>> downloads.download_public_napistu_asset(
...     asset=GCS_ASSETS_NAMES.TEST_PATHWAY,
...     version=None,
...     out_path="/tmp/test_pathway.tar.gz"
... )

napistu.gcs.downloads.load_public_napistu_asset(asset: str, data_dir: str, subasset: str | None = None, version: str | None = None, init_msg: str = 'The `data_dir` {data_dir} does not exist.', overwrite: bool = False, gcs_assets: GCSAssets | None = None) → str

Load Public Napistu Asset

Download asset to data_dir if it does not already exist there, and return the local path to it.

Parameters:
  • asset (str) – The asset to download (it will be unpacked if it is a .tar.gz archive)

  • data_dir (str) – The local directory where assets should be stored

  • subasset (str | None) – The name of a subasset to load from within the asset bundle

  • version (str | None) – The version of the asset to load (if None, the latest version will be used)

  • init_msg (str) – Message to display if data_dir does not exist

  • overwrite (bool) – If True, always download the asset and re-extract it, even if it already exists

  • gcs_assets (GCSAssets | None) – GCS assets configuration. If None (default), uses constants.GCS_ASSETS via from_dict. Can be overridden to use custom asset configurations.

Returns:

asset_path: the path to a local file

Return type:

str

Examples

>>> from napistu.gcs import downloads
>>> from napistu.gcs.constants import GCS_ASSETS_NAMES, GCS_SUBASSET_NAMES
>>> path = downloads.load_public_napistu_asset(
...     asset=GCS_ASSETS_NAMES.TEST_PATHWAY,
...     data_dir="/tmp/napistu_data",
...     subasset=GCS_SUBASSET_NAMES.SBML_DFS
... )
>>> print(path)
/tmp/napistu_data/test_pathway/sbml_dfs.pkl
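
The download-if-missing behaviour and the overwrite flag can be sketched without network access as follows. This is an illustration under stated assumptions, not the library's implementation: load_asset, fake_fetch, the .tar.gz naming, and the extraction directory layout are all hypothetical, and fake_fetch builds a small local archive in place of a real GCS download.

```python
import os
import tarfile
import tempfile
from typing import Callable, List

calls: List[str] = []  # records every simulated download


def fake_fetch(out_path: str) -> None:
    """Hypothetical downloader: writes a tiny .tar.gz instead of contacting GCS."""
    with tempfile.TemporaryDirectory() as src:
        member = os.path.join(src, "sbml_dfs.pkl")
        with open(member, "wb") as f:
            f.write(b"placeholder")
        with tarfile.open(out_path, "w:gz") as tar:
            tar.add(member, arcname="sbml_dfs.pkl")
    calls.append(out_path)


def load_asset(
    archive_name: str,
    data_dir: str,
    fetch: Callable[[str], None],
    overwrite: bool = False,
) -> str:
    """Fetch and extract an archive unless it is already extracted."""
    if not os.path.isdir(data_dir):
        raise FileNotFoundError(f"The `data_dir` {data_dir} does not exist.")
    archive_path = os.path.join(data_dir, archive_name)
    # Assumption: archives end in ".tar.gz" and extract to a same-named directory
    extracted_dir = archive_path[: -len(".tar.gz")]
    if overwrite or not os.path.isdir(extracted_dir):
        fetch(archive_path)
        with tarfile.open(archive_path, "r:gz") as tar:
            tar.extractall(extracted_dir)
    return extracted_dir


with tempfile.TemporaryDirectory() as data_dir:
    first = load_asset("test_pathway.tar.gz", data_dir, fake_fetch)   # downloads
    second = load_asset("test_pathway.tar.gz", data_dir, fake_fetch)  # reuses cache
    third = load_asset("test_pathway.tar.gz", data_dir, fake_fetch, overwrite=True)
    n_fetches = len(calls)
    has_member = os.path.isfile(os.path.join(third, "sbml_dfs.pkl"))
```

In this sketch the second call returns immediately because the extracted directory already exists, while overwrite=True forces a fresh download and re-extraction.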