hydromt.data_catalog.uri_resolvers.AzureBlobResolver

pydantic model hydromt.data_catalog.uri_resolvers.AzureBlobResolver

URIResolver for Azure Blob Storage and ADLS Gen2 URIs.

Handles three URI styles:

  • abfs://container/path/to/data.zarr (ADLS Gen2 / fsspec)

  • https://<account>.blob.core.windows.net/<container>/path (HTTPS Blob)

  • azureml://subscriptions/<sub>/resourcegroups/<rg>/workspaces/<ws>/datastores/<ds>/paths/<path> (AzureML datastore)

All styles are normalised to abfs:// internally and passed on to adlfs.AzureBlobFileSystem (fsspec-compatible). For azureml:// URIs the azure-ai-ml package is required.
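As a rough illustration of the normalisation step, the sketch below maps an HTTPS blob URL onto the abfs://container/path form. The helper name to_abfs is hypothetical and is not part of the hydromt API; azureml:// URIs are left out because resolving them needs a workspace lookup through azure-ai-ml, and any query string on the HTTPS URL is dropped.

```python
from typing import Optional, Tuple
from urllib.parse import urlparse

BLOB_HOST_SUFFIX = ".blob.core.windows.net"


def to_abfs(uri: str) -> Tuple[str, Optional[str]]:
    """Return (abfs_uri, account_name) for an abfs:// or HTTPS blob URI.

    Hypothetical helper for illustration only. The account name is None
    when it cannot be derived from the URI itself.
    """
    parsed = urlparse(uri)
    if parsed.scheme == "abfs":
        # Already in the target form; the account must come from options.
        return uri, None
    if parsed.scheme == "https" and parsed.netloc.endswith(BLOB_HOST_SUFFIX):
        account = parsed.netloc[: -len(BLOB_HOST_SUFFIX)]
        # The first path segment is the container, the rest is the blob path.
        container_and_path = parsed.path.lstrip("/")
        return f"abfs://{container_and_path}", account
    raise ValueError(f"Not a recognised Azure URI: {uri}")
```

For example, to_abfs("https://acct.blob.core.windows.net/cont/a/b.tif") yields the abfs:// URI together with the account name acct, which would then have to be passed on as account_name.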

Parameters:
  • options (dict, optional) – Key/value options forwarded to adlfs.AzureBlobFileSystem (e.g. account_name, account_key, sas_token, connection_string, tenant_id, client_id, client_secret). The special key sas_token_url points to a URL that returns JSON with a "token" key (e.g. the Planetary Computer SAS API). When set and no explicit sas_token is provided, a fresh token is fetched automatically before each resolve() call.

  • filesystem (FSSpecFileSystem, optional) – Pre-constructed fsspec filesystem. When supplied it is used as-is and options are ignored for filesystem construction.
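The sas_token_url behaviour described above can be sketched as follows. This is a minimal illustration and not the resolver's actual code: the function name effective_sas_token and the injectable fetch argument are assumptions made so the logic can be exercised without network access.

```python
import json
from typing import Callable, Optional
from urllib.request import urlopen


def _http_get(url: str) -> str:
    """Fetch a URL and return the response body as text."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")


def effective_sas_token(
    options: dict, fetch: Callable[[str], str] = _http_get
) -> Optional[str]:
    """Pick the SAS token to use for one resolve call (hypothetical helper)."""
    # An explicit sas_token always wins; no request is made in that case.
    if options.get("sas_token"):
        return options["sas_token"]
    url = options.get("sas_token_url")
    if not url:
        return None
    # The endpoint is expected to return JSON with a "token" key.
    payload = json.loads(fetch(url))
    return payload["token"]
```

Calling this helper once per resolve() call, rather than caching the result, would match the "fresh token before each call" behaviour described for sas_token_url.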

Notes

Authentication

Credentials are resolved in the following order of precedence:

  1. Explicit values in the catalog YAML (account_name, account_key, sas_token, connection_string).

  2. Environment variables: AZURE_STORAGE_ACCOUNT_NAME, AZURE_STORAGE_ACCOUNT_KEY, AZURE_STORAGE_SAS_TOKEN, AZURE_STORAGE_CONNECTION_STRING.

  3. azure.identity.DefaultAzureCredential (Managed Identity, Azure CLI, VS Code, environment-variable service principals, etc.).
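The precedence order above could look roughly like the sketch below. It assumes precedence is applied per key, which the documentation does not state explicitly; the function name resolve_credentials and the use_default_credential flag are hypothetical and only signal that step 3 would apply.

```python
import os
from typing import Mapping, Optional

# Option key -> environment variable, as listed in the documentation.
ENV_VARS = {
    "account_name": "AZURE_STORAGE_ACCOUNT_NAME",
    "account_key": "AZURE_STORAGE_ACCOUNT_KEY",
    "sas_token": "AZURE_STORAGE_SAS_TOKEN",
    "connection_string": "AZURE_STORAGE_CONNECTION_STRING",
}
SECRET_KEYS = ("account_key", "sas_token", "connection_string")


def resolve_credentials(
    options: Mapping[str, str], env: Optional[Mapping[str, str]] = None
) -> dict:
    """Merge catalog options and environment variables (illustrative only)."""
    env = os.environ if env is None else env
    resolved = {}
    for key, env_name in ENV_VARS.items():
        # 1. explicit catalog value, 2. environment variable.
        value = options.get(key) or env.get(env_name)
        if value:
            resolved[key] = value
    # 3. With no key, token, or connection string, fall back to
    #    azure.identity.DefaultAzureCredential (represented by a flag here).
    resolved["use_default_credential"] = not any(k in resolved for k in SECRET_KEYS)
    return resolved
```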

Time-templated URIs

Placeholders {year}, {month}, {day}, {variable} are expanded using the same _expand_uri_placeholders utility as ConventionResolver.
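To make the expansion concrete, here is a simplified stand-in for that utility. It is not the real _expand_uri_placeholders: the name expand_template, the zero-padded two-digit month, and the choice to include every calendar month touched by the time range are all assumptions, and {day} is not handled.

```python
from datetime import datetime
from itertools import product
from typing import List, Optional


def expand_template(
    template: str,
    start: datetime,
    end: datetime,
    variables: Optional[List[str]] = None,
) -> List[str]:
    """Expand {year}, {month} and {variable} in a URI template (sketch)."""
    # Collect every (year, month) pair between start and end, inclusive.
    months = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        months.append((year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1

    names = variables or [None]
    uris = []
    for (y, m), name in product(months, names):
        uri = template.replace("{year}", str(y)).replace("{month}", f"{m:02d}")
        if name is not None:
            uri = uri.replace("{variable}", name)
        uris.append(uri)
    # Drop duplicates (e.g. a template without {variable}) but keep order.
    return list(dict.fromkeys(uris))
```

With the template abfs://hydrodata/rainfall/{year}/{month}/precip.nc and a range from mid-November 2020 to mid-January 2021, this sketch produces three URIs, one each for 2020/11, 2020/12, and 2021/01.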

ABFS to HTTPS conversion

rasterio and GDAL do not natively understand the abfs:// scheme. When a SAS token is available, the resolver converts resolved abfs:// URIs to HTTPS blob URLs (https://<account>.blob.core.windows.net/<container>/<path>?<sas_token>) so that rasterio / GDAL can open the data through GDAL's built-in HTTPS handler (/vsicurl/). For drivers that go through fsspec (e.g. xarray with zarr), abfs:// URIs work as-is via adlfs.
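The conversion follows directly from the URL pattern above. A minimal sketch, assuming the plain abfs://container/path form and a known account name; the helper name abfs_to_https is hypothetical, and the account-qualified form (abfs://container@account.dfs.core.windows.net/path) is not handled here.

```python
def abfs_to_https(uri: str, account_name: str, sas_token: str) -> str:
    """Build an HTTPS blob URL with a SAS query string (illustrative only)."""
    prefix = "abfs://"
    if not uri.startswith(prefix):
        raise ValueError(f"Expected an abfs:// URI, got: {uri}")
    # Everything after the scheme is "<container>/<path>".
    container_and_path = uri[len(prefix):]
    # SAS tokens are sometimes stored with a leading "?"; avoid doubling it.
    token = sas_token.lstrip("?")
    return (
        f"https://{account_name}.blob.core.windows.net/"
        f"{container_and_path}?{token}"
    )
```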

Examples

Minimal catalog entry (anonymous public container):

uk_coastal_dem:
  data_type: RasterDataset
  driver: rasterio
  uri_resolver:
    name: azure_blob
  uri: "abfs://public-data/dem/uk_2m.tif"

With explicit SAS token and time-templated path:

rainfall_ensemble:
  data_type: RasterDataset
  driver: raster_xarray
  uri_resolver:
    name: azure_blob
    options:
      account_name: mystorageaccount
      sas_token: "sv=2022-11-02&ss=b&..."
  uri: "abfs://hydrodata/rainfall/{year}/{month}/precip.nc"

With automatic SAS token fetching (Planetary Computer):

cop_dem:
  data_type: RasterDataset
  driver: rasterio
  uri_resolver:
    name: azure_blob
    options:
      account_name: elevationeuwest
      sas_token_url: "https://planetarycomputer.microsoft.com/api/sas/v1/token/cop-dem-glo-30"
  uri: "abfs://copernicus-dem/COP30_hh/{variable}.tif"

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

name: ClassVar[str] = 'azure_blob'
resolve(uri: str, *, time_range: TimeRange | None = None, zoom: int | Tuple[float, str] | None = None, mask: GeoDataFrame | None = None, variables: list[str] | None = None, metadata: SourceMetadata | None = None, handle_nodata: NoDataStrategy = NoDataStrategy.RAISE) → list[str]

Resolve an Azure Blob / ADLS Gen2 URI into a list of concrete paths.

Parameters:
  • uri (str) –

    URI to resolve. Accepted forms:

    • abfs://container/path/to/file.tif

    • abfs://container/path/{year}/{month}/file.nc (time template)

    • https://<account>.blob.core.windows.net/<container>/path

    • azureml://subscriptions/…/datastores/…/paths/…

  • time_range (TimeRange | None, optional) – Left-inclusive start/end time of the data, by default None. Required when uri contains time placeholders such as {year} or {month}.

  • zoom (Zoom | None, optional) – Ignored; included only for interface compatibility, by default None.

  • mask (gpd.GeoDataFrame | None, optional) – Ignored; included only for interface compatibility, by default None.

  • variables (list[str] | None, optional) – Variable names used to expand {variable} placeholders, by default None.

  • metadata (SourceMetadata | None, optional) – DataSource metadata, by default None.

  • handle_nodata (NoDataStrategy, optional) – How to react when no data is found, by default NoDataStrategy.RAISE.

Returns:

Concrete URIs that downstream drivers can open. When a SAS token is available, abfs:// paths are converted to HTTPS blob URLs so that rasterio / GDAL can read them directly (see Notes in the class docstring).

Return type:

list[str]

Raises:
  • ValueError – If the URI scheme is not recognised as an Azure path.

  • NoDataException – When no data is found and handle_nodata is NoDataStrategy.RAISE.