Migrating the Data Catalog
Overview
The data catalog structure has been refactored to introduce a more modular design and
clearer separation of responsibilities across several new classes (DataSource, Driver, URIResolver, and DataAdapter):
- URIResolver is in charge of parsing the path or URI of the file (e.g. if you are using keywords like {year} or {month} in your paths, or if you want to read tiled rasters).
- Driver is in charge of reading the data from the source (e.g. reading a netCDF file from a local disk or from the cloud).
- DataAdapter is in charge of harmonizing the data to standard HydroMT data structures (e.g. renaming variables, setting attributes, converting units, etc.).
- DataSource is the main class that ties everything together and is used by the DataCatalog to load data.
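The division of responsibilities described above can be illustrated with a small, self-contained toy model. This is only a sketch: the classes below (ToyURIResolver, ToyDriver, ToyDataAdapter, ToyDataSource) are hypothetical stand-ins and not the HydroMT classes or their API, and the "files" are entries in an in-memory dictionary rather than real data on disk.

```python
from dataclasses import dataclass
from typing import Dict, List


class ToyURIResolver:
    """Hypothetical resolver: expands a {year} placeholder into concrete URIs."""

    def resolve(self, uri: str, years: List[int]) -> List[str]:
        if "{year}" not in uri:
            return [uri]
        return [uri.replace("{year}", str(year)) for year in years]


class ToyDriver:
    """Hypothetical driver: 'reads' each URI from an in-memory store."""

    def __init__(self, storage: Dict[str, Dict[str, float]]):
        self.storage = storage

    def read(self, uris: List[str]) -> List[Dict[str, float]]:
        return [dict(self.storage[uri]) for uri in uris]


class ToyDataAdapter:
    """Hypothetical adapter: renames variables, then applies unit multipliers."""

    def __init__(self, rename: Dict[str, str], unit_mult: Dict[str, float]):
        self.rename = rename
        self.unit_mult = unit_mult

    def transform(self, records: List[Dict[str, float]]) -> List[Dict[str, float]]:
        harmonized = []
        for record in records:
            new_record = {}
            for key, value in record.items():
                name = self.rename.get(key, key)
                # In this toy model the multiplier is looked up by the new name.
                new_record[name] = value * self.unit_mult.get(name, 1.0)
            harmonized.append(new_record)
        return harmonized


@dataclass
class ToyDataSource:
    """Hypothetical source: ties resolver, driver and adapter together."""

    uri: str
    uri_resolver: ToyURIResolver
    driver: ToyDriver
    data_adapter: ToyDataAdapter

    def read_data(self, years: List[int]) -> List[Dict[str, float]]:
        uris = self.uri_resolver.resolve(self.uri, years)  # 1. resolve the URI
        raw = self.driver.read(uris)                       # 2. read raw data
        return self.data_adapter.transform(raw)            # 3. harmonize


# Made-up data standing in for two yearly files.
storage = {"precip_2020.nc": {"tp": 1.5}, "precip_2021.nc": {"tp": 2.0}}
source = ToyDataSource(
    uri="precip_{year}.nc",
    uri_resolver=ToyURIResolver(),
    driver=ToyDriver(storage),
    data_adapter=ToyDataAdapter(rename={"tp": "precip"}, unit_mult={"precip": 1000.0}),
)
result = source.read_data([2020, 2021])
print(result)
```

The point of the sketch is that each step can be swapped independently: a different resolver for tiled rasters, a different driver for cloud storage, or a different adapter for another naming convention, without touching the other parts.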
Key format changes:
- path is renamed to uri.
- driver:
  - filesystem and driver_kwargs are moved under driver.
  - driver can be a single string or a dictionary with a name and options (the options are passed to the underlying function that reads the data, e.g. xarray.open_mfdataset).
- data_adapter:
  - unit_add, unit_mult, rename, etc. are moved under data_adapter.
- uri_resolver: can be specified to pass required options, mostly in the case of tiled rasters.
- metadata:
  - crs and nodata are moved under metadata (renamed from meta).
- A single catalog entry can now reference multiple data variants or versions.
See more information about the current format in the data catalog documentation.
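To make the key changes more concrete, here is a minimal Python sketch that rearranges an old-style entry (represented as a plain dictionary) into the new layout. It is an illustration only, not the logic used by HydroMT: the function name sketch_upgrade_entry is hypothetical, the sample entry is made up, and placing driver_kwargs under an options key and filesystem under a filesystem key inside driver is an assumption based on the list of changes above. It does not attempt any other changes that a real upgrade may perform.

```python
from typing import Any, Dict

# Keys that, per the list of changes, move under `data_adapter` and `metadata`.
ADAPTER_KEYS = ("unit_add", "unit_mult", "rename")
METADATA_KEYS = ("crs", "nodata")


def sketch_upgrade_entry(old: Dict[str, Any]) -> Dict[str, Any]:
    """Rearrange an old-style catalog entry into the new layout (illustrative)."""
    entry = dict(old)  # shallow copy so the input is not modified
    new: Dict[str, Any] = {}

    # `path` is renamed to `uri`.
    if "path" in entry:
        new["uri"] = entry.pop("path")

    # `driver` becomes a mapping with a name and options (assumed key names).
    driver: Dict[str, Any] = {}
    if "driver" in entry:
        driver["name"] = entry.pop("driver")
    if "driver_kwargs" in entry:
        driver["options"] = entry.pop("driver_kwargs")
    if "filesystem" in entry:
        driver["filesystem"] = entry.pop("filesystem")
    if driver:
        new["driver"] = driver

    # Harmonization settings move under `data_adapter`.
    adapter = {key: entry.pop(key) for key in ADAPTER_KEYS if key in entry}
    if adapter:
        new["data_adapter"] = adapter

    # `meta` is renamed to `metadata`, which also receives `crs` and `nodata`.
    metadata = dict(entry.pop("meta", {}))
    for key in METADATA_KEYS:
        if key in entry:
            metadata[key] = entry.pop(key)
    if metadata:
        new["metadata"] = metadata

    # Any remaining keys are carried over unchanged.
    new.update(entry)
    return new


# Made-up old-style entry used purely for illustration.
old_entry = {
    "data_type": "RasterDataset",
    "driver": "netcdf",
    "path": "era5_{year}.nc",
    "driver_kwargs": {"chunks": {"time": 100}},
    "rename": {"tp": "precip"},
    "unit_mult": {"precip": 1000},
    "crs": 4326,
    "meta": {"category": "meteo"},
}
new_entry = sketch_upgrade_entry(old_entry)
print(new_entry)
```

For real catalogs, the check command described in the next section is the supported route; the sketch is only meant to show where each old key ends up.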
How to upgrade
All existing pre-defined catalogs have been updated to the new format. For your own catalogs, you can upgrade
easily with the HydroMT check command:
hydromt check -d /path/to/data_catalog.yml --format v0 --upgrade -v
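If you maintain several catalog files, the same command can be generated for each of them from a short script. The following is a sketch under stated assumptions: the helper names build_upgrade_commands and run_upgrades are hypothetical, the catalog paths are placeholders, and only the flags shown in the command above are used. Running the commands requires the hydromt command to be available on your PATH.

```python
import shlex
import subprocess
from typing import Iterable, List


def build_upgrade_commands(catalogs: Iterable[str]) -> List[List[str]]:
    """Build one `hydromt check ... --upgrade` argument list per catalog file."""
    return [
        ["hydromt", "check", "-d", str(path), "--format", "v0", "--upgrade", "-v"]
        for path in catalogs
    ]


def run_upgrades(catalogs: Iterable[str]) -> None:
    """Run the upgrade for each catalog, stopping at the first failure."""
    for command in build_upgrade_commands(catalogs):
        print("Running:", shlex.join(command))
        subprocess.run(command, check=True)


# Placeholder paths; replace them with your own catalog files.
commands = build_upgrade_commands(
    ["/path/to/data_catalog.yml", "/path/to/other_catalog.yml"]
)
for command in commands:
    print(shlex.join(command))

# To actually execute the upgrades, call for example:
# run_upgrades(["/path/to/data_catalog.yml"])
```

Building the commands separately from running them makes it easy to review what will be executed before any catalog is processed.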