Architecture#

HydroMT supports a large variety of models, which all require different types of data. It is therefore important that the API that HydroMT exposes is extendable. HydroMT is composed of a small set of key classes that support extension. In this section we walk through these classes and describe their main responsibilities and where they interact.

Model#

The Model is the main representation of the model that is being built. A model is built step by step by adding ModelComponent s to the Model. Plugins can define steps which act on these components to implement complex interactions between different components. The area of interest for the model can be defined by the SpatialModelComponent. The complete model building workflow can be encoded in a workflow file file.

ModelComponent#

A Model can be populated with many different ModelComponent s. A component can represent any type of data you have on your area of interest. This component can have many properties, but always has a read() and write() component to read in and write out data. A Model must have at least one ModelComponent.

DataCatalog#

Model s need data. Where the data should be found and how it should be loaded is defined in the DataCatalog. Each item in the catalog is a DataSource. Users can create their own catalogs, using a yaml format, or they can share their PredefinedCatalog using the Plugins system.

DataSource#

The DataSource is the python representation of a parsed entry in the DataCatalog. The source is responsible for validating the catalog entry. It also carries the DataAdapter, URIResolver and Driver and serves as an entrypoint to the data. Per HydroMT data type (e.g. RasterDataset, GeoDataFrame), HydroMT has one DataSource, e.g. RasterDatasetSource, sources.geodataframe.GeoDataFrameSource. The read() method governs the full process of discovery with the URIResolver, reading data with the Driver, and transforming the data to a HydroMT standard with a DataAdapter.

URIResolver#

Finding the right address where the requested data is stored is not always straightforward. Searching for data differs between finding data in a web-service, database, a catalog or when dealing with a certain naming convention. Exploring where the right data can be found is implemented in the URIResolver. The URIResolver takes a single uri from the data catalog, and the query parameters from the model, such as the region, or the time range, and returns multiple absolute paths, or uri s, that can be read into a single python representation (e.g. xarray.Dataset). The URIResolver is extendable, so Plugins or other code can subclass the Abstract URIResolver class to implement their own conventions for data discovery.

Driver#

The Driver class is responsible for reading a set of geospatial data formats, like a geojson file or zarr archive, into their python in-memory representations: geopandas.GeoDataFrame or xarray.Dataset respectively. This class can also be extended using the Plugins. Because the merging of different files from different DataSource s can be non-trivial, the driver is responsible to merge the different python objects coming from the driver to a single representation. This is then returned from the read() method. The query parameters vary per HydroMT data type, so there is is a different driver interface per type, e.g. RasterDatasetDriver, GeoDataFrameDriver. To help with different filesystems, the driver class is handed a fsspec.Filesystem.

DataAdapter#

The DataAdapter homogenizes the data coming from the Driver. This means slicing the data to the right region, renaming variables, changing units, regridding and more. The adapter has a transform() method that takes a python object and returns the same type, e.g. an xr.Dataset. This method also accepts query parameters based on the data type, so there is a single DataAdapter per HydroMT data type.

Architecture Diagram#

The above is summarized in the following architecture diagram. Only the aforementioned methods and properties are used.