napistu.gcs.downloads
Module for downloading and loading Napistu public assets from GCS.
Functions
download_public_napistu_asset – Download Public Napistu Asset
load_public_napistu_asset – Load Public Napistu Asset
- napistu.gcs.downloads._get_gcs_asset_path(asset: str, subasset: str | None, gcs_assets: GCSAssets) -> str
Get the GCS path for a given asset and subasset.
- Parameters:
asset (str) – The name of the asset.
subasset (Optional[str]) – The name of the subasset.
gcs_assets (GCSAssets) – GCS assets configuration.
- Returns:
The GCS path for the asset or subasset.
- Return type:
str
- napistu.gcs.downloads._remove_asset_files_if_needed(asset: str, data_dir: str, gcs_assets: GCSAssets | None = None) -> List[str]
Remove asset archive and any extracted directory from data_dir.
- Parameters:
asset (str) – The asset key (e.g., ‘test_pathway’).
data_dir (str) – The directory where assets are stored.
gcs_assets (GCSAssets | None) – GCS assets configuration. If None (default), uses constants.GCS_ASSETS via from_dict.
- Returns:
A list of the paths of the removed files.
- Return type:
List[str]
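The cleanup described above can be sketched as follows. This is a minimal illustration, not Napistu's implementation: the helper name `remove_asset_files`, the `archive_name` argument, and the assumption that an archive extracts to a directory named after the asset are all hypothetical.

```python
import os
import shutil
from typing import List


def remove_asset_files(asset: str, data_dir: str, archive_name: str) -> List[str]:
    """Hypothetical helper: delete an asset's archive and its extracted directory.

    Assumes the archive lives at data_dir/archive_name and extracts to
    data_dir/asset. Neither layout is confirmed by the documentation above.
    """
    removed: List[str] = []

    # Remove the downloaded archive, if present.
    archive_path = os.path.join(data_dir, archive_name)
    if os.path.isfile(archive_path):
        os.remove(archive_path)
        removed.append(archive_path)

    # Remove the directory the archive was unpacked into, if present.
    extracted_dir = os.path.join(data_dir, asset)
    if os.path.isdir(extracted_dir):
        shutil.rmtree(extracted_dir)
        removed.append(extracted_dir)

    return removed
```

Returning the removed paths, as the documented function does, lets a caller log exactly what was deleted. Calling the helper when nothing exists returns an empty list.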
- napistu.gcs.downloads._validate_gcs_asset(asset: str, gcs_assets: GCSAssets) -> None
Validate a GCS asset by name.
- napistu.gcs.downloads._validate_gcs_asset_version(asset: str, version: str | None, gcs_assets: GCSAssets) -> None
Validate a GCS asset version if specified.
- napistu.gcs.downloads._validate_gcs_subasset(asset: str, subasset: str | None, gcs_assets: GCSAssets) -> None
Validate a subasset as belonging to a given asset.
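The three validators above share a pattern: check a name against the configuration and fail early if it is unknown. The sketch below illustrates that pattern with a plain dictionary standing in for `GCSAssets`. The dictionary layout, the function names, the subasset name `example_subasset`, and the choice of `ValueError` are assumptions made for illustration; the real structure and exception types are not documented here.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for a GCS asset configuration: asset name -> subasset names.
EXAMPLE_ASSETS: Dict[str, List[str]] = {
    "test_pathway": ["sbml_dfs", "example_subasset"],
}


def validate_asset(asset: str, assets: Dict[str, List[str]]) -> None:
    """Raise ValueError (an assumed exception type) if the asset is unknown."""
    if asset not in assets:
        valid = ", ".join(sorted(assets))
        raise ValueError(f"'{asset}' is not a known asset; valid assets: {valid}")


def validate_subasset(
    asset: str, subasset: Optional[str], assets: Dict[str, List[str]]
) -> None:
    """Check that a subasset, if given, belongs to the asset."""
    validate_asset(asset, assets)
    if subasset is None:
        # Nothing to check when no subasset is requested.
        return
    if subasset not in assets[asset]:
        valid = ", ".join(sorted(assets[asset]))
        raise ValueError(
            f"'{subasset}' is not a subasset of '{asset}'; valid subassets: {valid}"
        )
```

Listing the valid names in the error message makes a mistyped asset or subasset easy to correct.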
- napistu.gcs.downloads.download_public_napistu_asset(asset: str, version: str | None, out_path: str, gcs_assets: GCSAssets | None = None) -> None
Download Public Napistu Asset
- Parameters:
asset (str) – The name of a Napistu public asset stored in Google Cloud Storage (GCS)
version (str | None) – The version of the asset to download
out_path (str) – Local location where the file should be saved.
gcs_assets (GCSAssets | None) – GCS assets configuration. If None (default), uses constants.GCS_ASSETS via from_dict. Can be overridden to use custom asset configurations.
- Return type:
None
Examples
>>> from napistu.gcs import downloads
>>> from napistu.gcs.constants import GCS_ASSETS_NAMES
>>> downloads.download_public_napistu_asset(
...     asset=GCS_ASSETS_NAMES.TEST_PATHWAY,
...     version=None,
...     out_path="/tmp/test_pathway.tar.gz"
... )
- napistu.gcs.downloads.load_public_napistu_asset(asset: str, data_dir: str, subasset: str | None = None, version: str | None = None, init_msg: str = 'The `data_dir` {data_dir} does not exist.', overwrite: bool = False, gcs_assets: GCSAssets | None = None) -> str
Load Public Napistu Asset
Download the asset to data_dir if it does not already exist there, and return the path to the local file.
- Parameters:
asset (str) – The file to download (which will be unpacked if it is a .tar.gz)
data_dir (str) – The local directory where assets should be stored
subasset (str | None) – The name of a subasset to load from within the asset bundle
version (str | None) – The version of the asset to load (if None, the latest version will be used)
init_msg (str) – Message to display if data_dir does not exist
overwrite (bool) – If True, always download the asset and re-extract it, even if it already exists
gcs_assets (GCSAssets | None) – GCS assets configuration. If None (default), uses constants.GCS_ASSETS via from_dict. Can be overridden to use custom asset configurations.
- Returns:
asset_path: the path to a local file
- Return type:
str
Examples
>>> from napistu.gcs import downloads
>>> from napistu.gcs.constants import GCS_ASSETS_NAMES, GCS_SUBASSET_NAMES
>>> path = downloads.load_public_napistu_asset(
...     asset=GCS_ASSETS_NAMES.TEST_PATHWAY,
...     data_dir="/tmp/napistu_data",
...     subasset=GCS_SUBASSET_NAMES.SBML_DFS
... )
>>> print(path)
/tmp/napistu_data/test_pathway/sbml_dfs.pkl
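The download-if-missing behaviour, including the effect of `overwrite`, can be illustrated with a small standalone sketch. It is not Napistu's implementation: `load_asset` and its `fetch` callable are hypothetical, the archive is assumed to be named `<asset>.tar.gz` and to extract to a directory named after the asset, and creating a missing `data_dir` is an assumption. Subassets and versions are left out for brevity.

```python
import os
import shutil
import tarfile
from typing import Callable


def load_asset(
    asset: str,
    data_dir: str,
    fetch: Callable[[str], None],
    overwrite: bool = False,
) -> str:
    """Hypothetical sketch of a download-if-missing loader.

    `fetch(out_path)` stands in for the real download step and must write a
    .tar.gz archive to out_path. The archive is assumed to unpack into
    data_dir/asset.
    """
    # Assumption: a missing data_dir is created rather than treated as an error.
    os.makedirs(data_dir, exist_ok=True)

    archive_path = os.path.join(data_dir, f"{asset}.tar.gz")
    extracted_dir = os.path.join(data_dir, asset)

    if overwrite and os.path.isdir(extracted_dir):
        # Start from a clean slate so stale files do not survive re-extraction.
        shutil.rmtree(extracted_dir)

    if not os.path.isdir(extracted_dir):
        # Only download and unpack when no local copy is available.
        fetch(archive_path)
        with tarfile.open(archive_path, "r:gz") as tar:
            tar.extractall(data_dir)

    return extracted_dir
```

Passing the download step in as a callable keeps the sketch runnable without network access: a test can supply a function that copies a local archive into place.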