Data Catalog#
The data catalog is the way of providing a HydroMT model class with raw data to chose from when setting up the data for a model. In case of HydroMT-FIAT, this is done for the data mentioned in the data section. The data catalog for HydroMT-FIAT therefore caters to vector, raster and tabular data. This section will show a brief overview of a data catalog entry for each of them.
Important
The format of the data catalog is YAML.
Warning
Not all data catalog functionality is specified here, for more information see the data catalog information in HydroMT-core.
Meta#
It is good practice to start each data catalog with the meta section. This section is not necessary however as data is automatically searched for relative to the data catalog’s location.
meta:
roots:
- < root-to-data-location-1 >
- < root-to-data-location-2 >
version: < version >
name: < name >
Vector#
This section will effectively entail the exposure geometry data. The driver is always defined as pyogrio and the data_type is always set to GeoDataFrame.
Example data catalog entry:
< your-dataset-name >:
data_type: GeoDataFrame
uri: < path-to-dataset >
driver:
name: pyogrio
filesystem: local
rename: # Optional
< columns-name >: < new-column-name >
unit_add: # Optional
< columns-name >: < value >
unit_mult: # Optional
< columns-name >: < value >
metadata: # Optional, but good practice
category: < data-category >
crs: 4326 # Whatever the dataset is in
source_version: 1.0
Explanation of optional vector data entry options:
Option |
Description |
|---|---|
rename |
Rename a column header of the attribute table |
unit_add |
Add a set value to the values in a numerical column |
unit_multi |
Multiply the values of a numerical column with a set value |
This data entry is mainly called upon by:
Raster#
This section will cover the data that is needed for setup up the hazard data and the gridded exposure data. The data type of a raster data entry is always set to rasterdataset. In contrast to the vector data entry, two drivers are applicable in the case of raster data. These are listed in the small table down below:
Driver name |
Data type |
|---|---|
rasterio |
Spatially referenced gridded data readable by GDAL (e.g. .tif) |
raster_xarray |
Spatially referenced netCDF data (.nc) and zarr archives (.zarr) |
Example data catalog entry:
< your-dataset-name >:
data_type: RasterDataset
uri: < path-to-dataset >
driver:
name: < driver-name >
filesystem: local
options:
chunk: # Not necessary, but very handy for large datasets
< x-dim >: < value > # e.g. lon: 1000
< y-dim >: < value >
rename: # Optional
< columns-name >: < new-column-name >
unit_add: # Optional
< columns-name >: < value >
unit_mult: # Optional
< columns-name >: < value >
metadata: # Optional, but good practice
category: < data-category >
crs: 4326 # Whatever the dataset is in
source_version: 1.0
Explanation of optional raster data entry options:
Option |
Description |
|---|---|
chunk (driver options) |
Set the chunking per dimension specific |
rename |
Rename a variable (layer or band) in the dataset |
unit_add |
Add a set value to the numerical variable |
unit_multi |
Multiply a numerical variable with a set value |
This data entry is mainly called upon by:
Tabular#
Tabular data is used in most places throughout HydroMT-FIAT either as direct concrete input for the vulnerability data or as linking tables in other different places. The data type of tabular data is always defined as DataFrame and the driver is always set to pandas.
Example data catalog entry:
< your-dataset-name >:
data_type: DataFrame
uri: < path-to-dataset >
driver:
name: pandas
filesystem: local
options:
header: null # null translates to None in Python -> no header
index_col: 0 # Chose the first column as index
parse_dates: false # Whether or not to parse datetime entries
rename: # Optional
< columns-name >: < new-column-name >
unit_add: # Optional
< columns-name >: < value >
unit_mult: # Optional
< columns-name >: < value >
metadata:
category: < data-category >
source_version: 1.0
Explanation of optional tabular data entry options:
Option |
Description |
|---|---|
header (driver options) |
Whether or not to set a row as the column headers and which row |
index_col (driver options) |
Which column to use as the index for the rows |
rename |
Rename a column header of the attribute table |
unit_add |
Add a set value to the values in a numerical column |
unit_multi |
Multiply the values of a numerical column with a set value |
This data entry is mainly called upon by: