veriflow.datasources.zarr
Read Zarr stores from local disk or S3.
Classes
Zarr | A datasource for reading Zarr stores compatible with the internal datamodel.
ZarrConfig | A Zarr config element.
S3AuthConfig | Get S3 credentials and connection info safely from environment variables.
- class veriflow.datasources.zarr.Zarr(config)
A datasource for reading Zarr stores compatible with the internal datamodel.
Wraps xarray.open_zarr() and supports both local filesystem paths and remote URLs (currently only s3:// is exercised). For S3 stores, credentials are taken from an S3AuthConfig instance loaded from environment variables prefixed with S3_. Additional storage_options configured on the ZarrConfig are merged on top and forwarded to xr.open_zarr.
Note
The dataset must carry a data_type attribute that matches one of the supported data types; if the attribute is missing, it is set from the configuration.
- Parameters:
config (ZarrConfig)
- config_class
alias of ZarrConfig
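The data_type rule for datasets read by Zarr can be illustrated with a minimal sketch. The helper name ensure_data_type, the example type names, and the choice to raise ValueError for an unsupported value are assumptions made for illustration; this page does not document how veriflow implements or reports the check.

```python
from typing import Any, Mapping


def ensure_data_type(
    attrs: Mapping[str, Any], configured: str, supported: frozenset
) -> dict:
    """Return a copy of attrs whose data_type is present and supported."""
    result = dict(attrs)
    # Missing attribute: fall back to the value from the configuration.
    result.setdefault("data_type", configured)
    # Assumption: an unsupported value is treated as an error.
    if result["data_type"] not in supported:
        raise ValueError(f"unsupported data_type: {result['data_type']!r}")
    return result


# Placeholder names, not veriflow's actual DataType members.
SUPPORTED = frozenset({"forecast", "observation"})

print(ensure_data_type({}, "forecast", SUPPORTED))
print(ensure_data_type({"data_type": "observation"}, "forecast", SUPPORTED))
```

An attribute already present on the dataset is left untouched; the configured value is only used as a fallback.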
- class veriflow.datasources.zarr.ZarrConfig(*, import_adapter, source, data_type, general, id_mapping=None, path, auth_config=None, storage_options=None, consolidated=None, **extra_data)
A Zarr config element.
Reads a single Zarr store via xarray.open_zarr(). The path may point to a local directory or a remote location (e.g. s3://bucket/key/store.zarr). When the path uses an s3:// URL, credentials and connection details are taken from auth_config (an S3AuthConfig), which is populated from environment variables prefixed with S3_. Additional storage_options are merged on top of the ones derived from auth_config and forwarded to xr.open_zarr.
- Parameters:
import_adapter (Literal[DataSourceKind.ZARR])
source (Annotated[str, StringConstraints(strip_whitespace=None, to_upper=None, to_lower=None, strict=None, min_length=1, max_length=None, pattern=^[A-Za-z][A-Za-z0-9_]*$, ascii_only=None), MinLen(min_length=1)])
data_type (DataType)
general (Annotated[GeneralInfoConfig, SkipJsonSchema()])
id_mapping (Annotated[IdMappingConfig, SkipJsonSchema()] | None)
auth_config (S3AuthConfig | None)
consolidated (bool | None)
extra_data (Any)
- path: str (required, minimum length 1)
Path to a single Zarr store. Local filesystem path (absolute or relative) or a remote URL such as 's3://bucket/key/store.zarr'.
- auth_config: S3AuthConfig | None (optional, default None)
Authentication configuration for remote stores. Only consulted when 'path' points to an 's3://' location. Credentials are loaded from S3_-prefixed environment variables; instantiate as 'auth_config: {}' in YAML to enable env-based loading.
- storage_options: dict[str, str] | None (optional, default None)
Additional storage_options forwarded to xr.open_zarr. Merged on top of the options derived from 'auth_config'. Use this for advanced fsspec / s3fs settings not exposed by S3AuthConfig.
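The merge order described for storage_options can be sketched in plain Python. The function name resolve_storage_options is hypothetical, and several details are assumptions rather than documented behaviour: that explicit storage_options are also forwarded for local paths, that None is returned when nothing is set, and that the auth-derived options use s3fs-style keys such as key and endpoint_url.

```python
from typing import Dict, Optional
from urllib.parse import urlparse


def resolve_storage_options(
    path: str,
    auth_options: Optional[Dict[str, str]] = None,
    storage_options: Optional[Dict[str, str]] = None,
) -> Optional[Dict[str, str]]:
    """Combine auth-derived and explicit options for a Zarr store path."""
    merged: Dict[str, str] = {}
    # Auth-derived options are only consulted for s3:// locations.
    if urlparse(path).scheme == "s3" and auth_options:
        merged.update(auth_options)
    # Explicit storage_options are merged on top, so they win on collisions.
    if storage_options:
        merged.update(storage_options)
    # Assumption: pass None rather than an empty dict when nothing is set.
    return merged or None


derived = {"key": "example-id", "endpoint_url": "http://localhost:9000"}
extra = {"endpoint_url": "http://minio.internal:9000"}

print(resolve_storage_options("s3://bucket/key/store.zarr", derived, extra))
print(resolve_storage_options("/data/store.zarr", derived))
```

Because the explicit options are applied last, a key set in both places takes the value from storage_options, which is what "merged on top" implies.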
- class veriflow.datasources.zarr.S3AuthConfig(_case_sensitive=None, _nested_model_default_partial_update=None, _env_prefix=None, _env_prefix_target=None, _env_file=PosixPath('.'), _env_file_encoding=None, _env_ignore_empty=None, _env_nested_delimiter=None, _env_nested_max_split=None, _env_parse_none_str=None, _env_parse_enums=None, _cli_prog_name=None, _cli_parse_args=None, _cli_settings_source=None, _cli_parse_none_str=None, _cli_hide_none_type=None, _cli_avoid_json=None, _cli_enforce_required=None, _cli_use_class_docs_for_groups=None, _cli_exit_on_error=None, _cli_prefix=None, _cli_flag_prefix_char=None, _cli_implicit_flags=None, _cli_ignore_unknown_args=None, _cli_kebab_case=None, _cli_shortcuts=None, _secrets_dir=None, _build_sources=None, *, endpoint_url=None, region_name=None, access_key_id=None, secret_access_key=None, session_token=None, anon=False)
Get S3 credentials and connection info safely from environment variables.
This config class inherits from pydantic_settings.BaseSettings, which tries to infer field values from environment variables.
Environment variables (all optional):
- S3_ENDPOINT_URL: Custom S3 endpoint (e.g. for MinIO or non-AWS S3).
- S3_REGION_NAME: AWS region.
- S3_ACCESS_KEY_ID: Access key id.
- S3_SECRET_ACCESS_KEY: Secret access key.
- S3_SESSION_TOKEN: Session token (for temporary credentials).
- S3_ANON: Set to true for anonymous access to public buckets.
Fields default to None (or False for anon) so that callers can rely on s3fs/botocore falling back to standard AWS credential discovery (e.g. ~/.aws/credentials, instance metadata) when a value is not explicitly set.
See: https://docs.pydantic.dev/latest/concepts/pydantic_settings/#usage
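The mapping from S3_-prefixed environment variables to fields can be approximated with the standard library alone. This is a rough sketch, not the class itself: S3EnvSettings and load_s3_settings are hypothetical names, values are kept as plain strings instead of AnyUrl and SecretStr, variable names are matched case-sensitively, and the set of strings accepted as true for S3_ANON is a simplification of what pydantic accepts.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class S3EnvSettings:
    """Hypothetical stand-in that mirrors the S3AuthConfig field names."""

    endpoint_url: Optional[str] = None
    region_name: Optional[str] = None
    access_key_id: Optional[str] = None
    secret_access_key: Optional[str] = None
    session_token: Optional[str] = None
    anon: bool = False


def load_s3_settings(env: Optional[Mapping[str, str]] = None) -> S3EnvSettings:
    """Read S3_-prefixed variables; unset variables stay at their defaults."""
    source = os.environ if env is None else env
    anon_raw = source.get("S3_ANON", "")
    return S3EnvSettings(
        endpoint_url=source.get("S3_ENDPOINT_URL"),
        region_name=source.get("S3_REGION_NAME"),
        access_key_id=source.get("S3_ACCESS_KEY_ID"),
        secret_access_key=source.get("S3_SECRET_ACCESS_KEY"),
        session_token=source.get("S3_SESSION_TOKEN"),
        # Simplified boolean parsing (assumption, narrower than pydantic's).
        anon=anon_raw.strip().lower() in {"1", "true", "yes", "on"},
    )


# Passing an explicit mapping keeps the example independent of the real
# process environment.
settings = load_s3_settings(
    {"S3_ENDPOINT_URL": "http://localhost:9000", "S3_ANON": "true"}
)
print(settings.endpoint_url, settings.anon, settings.access_key_id)
```

Fields that stay None are the ones for which s3fs/botocore would be left to apply their own credential discovery.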
- Parameters:
_case_sensitive (bool | None)
_nested_model_default_partial_update (bool | None)
_env_prefix (str | None)
_env_prefix_target (EnvPrefixTarget | None)
_env_file (DotenvType | None)
_env_file_encoding (str | None)
_env_ignore_empty (bool | None)
_env_nested_delimiter (str | None)
_env_nested_max_split (int | None)
_env_parse_none_str (str | None)
_env_parse_enums (bool | None)
_cli_prog_name (str | None)
_cli_settings_source (CliSettingsSource[Any] | None)
_cli_parse_none_str (str | None)
_cli_hide_none_type (bool | None)
_cli_avoid_json (bool | None)
_cli_enforce_required (bool | None)
_cli_use_class_docs_for_groups (bool | None)
_cli_exit_on_error (bool | None)
_cli_prefix (str | None)
_cli_flag_prefix_char (str | None)
_cli_implicit_flags (bool | Literal['dual', 'toggle'] | None)
_cli_ignore_unknown_args (bool | None)
_cli_kebab_case (bool | Literal['all', 'no_enums'] | None)
_secrets_dir (PathType | None)
_build_sources (tuple[tuple[PydanticBaseSettingsSource, ...], dict[str, Any]] | None)
endpoint_url (AnyUrl | None)
region_name (str | None)
access_key_id (SecretStr | None)
secret_access_key (SecretStr | None)
session_token (SecretStr | None)
anon (bool)