{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "0",
   "metadata": {},
   "source": [
    "# Example: Define hydrological model regions"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1",
   "metadata": {},
   "source": [
    "The first step when building a model is to define the model domain. For hydrological models this is typically a full **basin** or **sub-basin** which includes all upstream cells. For hydrodynamic models this can also be an **inter-basin** which does not necessary include all upstream cells. *HydroMT* has the [get_basin_geometry()](../_generated/hydromt.model.processes.basin_mask.get_basin_geometry.rst) method to do just that. "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2",
   "metadata": {},
   "source": [
    "## Import packages"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3",
   "metadata": {},
   "outputs": [],
   "source": [
    "import geopandas as gpd\n",
    "import numpy as np\n",
    "from shapely.geometry import box"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4",
   "metadata": {},
   "outputs": [],
   "source": [
    "# for plotting\n",
    "import cartopy.crs as ccrs\n",
    "import cartopy.io.img_tiles as cimgt\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "proj = ccrs.PlateCarree()  # plot projection"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5",
   "metadata": {},
   "outputs": [],
   "source": [
    "import hydromt\n",
    "from hydromt.model.processes.basin_mask import get_basin_geometry"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6",
   "metadata": {},
   "source": [
    "## Read data"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7",
   "metadata": {},
   "source": [
    "Read data using the [DataCatalog](../_generated/hydromt.data_catalog.DataCatalog.rst). If no yml files with data sources are provided, by default data for the Piave basin is downloaded from the hydromt-artifacts to to `~/.hydromt_data/`. Links to the original data sources and data licenses can be found in the meta dict of each source. Here we use flow direction data from [MERIT Hydro](http://hydro.iis.u-tokyo.ac.jp/~yamadai/MERIT_Hydro). Note that the original `'MERIT Hydro'` data has been extended with a basin mask layer, which together with the basin index vector data makes faster basin delineation possible."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8",
   "metadata": {},
   "outputs": [],
   "source": [
    "# instantiate instance of Data Catalog\n",
    "data_catalog = hydromt.DataCatalog(data_libs=[\"artifact_data=v1.0.0\"])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9",
   "metadata": {},
   "outputs": [],
   "source": [
    "# read MERIT hydro flow direction data\n",
    "print(data_catalog.get_source(\"merit_hydro\"))\n",
    "ds = data_catalog.get_rasterdataset(\"merit_hydro\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "10",
   "metadata": {},
   "outputs": [],
   "source": [
    "# read MERIT hydro basin index vector data. This data contains bounding box geometries of all basins globally.\n",
    "# Here we pass the GeoDataFrameAdapter instead of the GeoDataFrame itself\n",
    "# a spatial subset of the data loaded within the get_basin_geometry method\n",
    "basin_index = data_catalog.get_source(\"merit_hydro_index\")\n",
    "print(basin_index)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "11",
   "metadata": {},
   "outputs": [],
   "source": [
    "# derive river geometry based on stream order >= 7 (for plotting only)\n",
    "flwdir = hydromt.gis.flw.flwdir_from_da(ds[\"flwdir\"], ftype=\"d8\")\n",
    "feats = flwdir.streams(mask=ds[\"strord\"] >= 7)\n",
    "gdf_riv = gpd.GeoDataFrame.from_features(feats)\n",
    "\n",
    "\n",
    "def plot(extent, gdf_bas, gdf_out):\n",
    "    plt.figure(figsize=(3, 5))\n",
    "    ax = plt.subplot(projection=proj)\n",
    "    ax.set_extent(extent, crs=proj)\n",
    "    ax.add_image(cimgt.QuadtreeTiles(), 10)\n",
    "    gdf_bas.boundary.plot(ax=ax, edgecolor=\"k\", zorder=2)\n",
    "    gdf_riv.plot(ax=ax, color=\"blue\", alpha=0.7)\n",
    "    gdf_out.plot(ax=ax, markersize=40, c=\"red\", zorder=2)\n",
    "    return ax"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "12",
   "metadata": {},
   "source": [
    "## Delineate basins"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "13",
   "metadata": {},
   "source": [
    "Several examples to delineate sub-, inter- or full basins are provided below together with the command line interface (CLI) syntax for the HydroMT [build](../guides/user_guide/model_build.rst) command. All CLI options are described in the [parse_region_basin()](../_generated/hydromt.model.processes.region.parse_region_basin.rst) method."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "14",
   "metadata": {},
   "source": [
    "Get the basin based on a point location `[x, y]`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "15",
   "metadata": {},
   "outputs": [],
   "source": [
    "# CLI syntax: -r {'basin': [x, y]}\n",
    "# e.g.: -r {'basin': [12.6, 45.8]}\n",
    "\n",
    "xy = [12.6, 45.8]\n",
    "gdf_bas, _ = get_basin_geometry(\n",
    "    ds,\n",
    "    kind=\"basin\",\n",
    "    xy=xy,\n",
    "    basin_index=basin_index,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "16",
   "metadata": {},
   "outputs": [],
   "source": [
    "# plot results\n",
    "gdf_xy = gpd.GeoDataFrame(geometry=gpd.points_from_xy(x=[xy[0]], y=[xy[1]]), crs=4326)\n",
    "extent = np.array(gdf_bas.buffer(0.1).total_bounds)[[0, 2, 1, 3]]\n",
    "\n",
    "plot(extent, gdf_bas, gdf_xy)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "17",
   "metadata": {},
   "source": [
    "## Delineate sub-basins"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "18",
   "metadata": {},
   "source": [
    "We need an initial bounding box to delineate the sub-basin. This can be derived from the *merit_hydro_index* or user provided initial `bounds`. \n",
    "The latter might be faster if delineating a small sub-basin from a large basin. A warning is raised if not all contributing cells are included."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "19",
   "metadata": {},
   "source": [
    "Get a subbasin based on its outlet location `[x,y]`, snapped to a stream defined by a `<variable>:<threshold>` pair, in this case a stream order larger or equal to 7. The `variable` should be present in the dataset `ds`. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "20",
   "metadata": {},
   "outputs": [],
   "source": [
    "# CLI syntax: -r {'subbasin': [x, y], <variable>: <threshold>, 'bounds': [xmin, ymin, xmax, ymax]}\n",
    "# e.g.: -r {'subbasin': [12.6, 45.8], 'strord': 7, 'bounds': [12.1, 45.5, 12.9, 46.5]}\n",
    "\n",
    "xy = [12.6, 45.8]\n",
    "bounds = [12.1, 45.5, 12.9, 46.5]\n",
    "gdf_bas, gdf_out = get_basin_geometry(\n",
    "    ds,\n",
    "    kind=\"subbasin\",\n",
    "    xy=xy,\n",
    "    strord=7,\n",
    "    bounds=bounds,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "21",
   "metadata": {},
   "outputs": [],
   "source": [
    "# plot results\n",
    "gdf_bounds = gpd.GeoDataFrame(geometry=[box(*bounds)], crs=4326)\n",
    "extent = gdf_bounds.buffer(0.05).total_bounds[[0, 2, 1, 3]]\n",
    "\n",
    "ax = plot(extent, gdf_bas, gdf_out)\n",
    "gdf_bounds.boundary.plot(ax=ax, edgecolor=\"k\", ls=\"--\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "22",
   "metadata": {},
   "source": [
    "Get a sub-basin based on the location where the stream flows out of a `bbox`. Here the stream defined by a `<variable>:<threshold>` pair, in this case a stream order larger or equal to 8. The `variable` should be present in the dataset `ds`. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "23",
   "metadata": {},
   "outputs": [],
   "source": [
    "# CLI SYNTAX: -r {'subbasin': [xmin, ymin, xmax, ymax], <variable>: <threshold>}\n",
    "# e.g.: -r {'subbasin': [12.50, 45.72, 12.7, 46], 'strord': 8}\n",
    "\n",
    "bbox = [12.50, 45.72, 12.7, 46]\n",
    "gdf_bas, gdf_out = get_basin_geometry(\n",
    "    ds,\n",
    "    kind=\"subbasin\",\n",
    "    bbox=bbox,\n",
    "    strord=8,\n",
    "    basin_index=basin_index,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "24",
   "metadata": {},
   "outputs": [],
   "source": [
    "# plot results\n",
    "gdf_bbox = gpd.GeoDataFrame(geometry=[box(*bbox)], crs=4326)\n",
    "extent = gdf_bas.buffer(0.05).total_bounds[[0, 2, 1, 3]]\n",
    "\n",
    "ax = plot(extent, gdf_bas, gdf_out)\n",
    "gdf_bbox.boundary.plot(ax=ax, edgecolor=\"r\", ls=\"-\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "25",
   "metadata": {},
   "source": [
    "## Delineate interbasins"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "26",
   "metadata": {},
   "source": [
    "Get an inter-basin based on the streams within a `bbox`. The inter-basin is limited to the most downstream contiguous area within the bbox that drains to the stream.  Here the stream defined by a `<variable>:<threshold>` pair, in this case a stream order larger or equal to 8. The `variable` should be present in the dataset `ds`. The `buffer` is required to check wether streams flow in and out of the bbox. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "27",
   "metadata": {},
   "outputs": [],
   "source": [
    "# CLI SYNTAX -r {'interbasin': [xmin, ymin, xmax, ymax], <variable>: <threshold>}\n",
    "# e.g.: -r {'interbasin': [12.50, 45.72, 12.7, 46], 'strord': 8}\n",
    "\n",
    "bbox = [12.50, 45.72, 12.7, 46]\n",
    "gdf_bas, gdf_out = get_basin_geometry(\n",
    "    ds,\n",
    "    kind=\"interbasin\",\n",
    "    bbox=bbox,\n",
    "    strord=8,\n",
    "    buffer=20,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "28",
   "metadata": {},
   "outputs": [],
   "source": [
    "# plot results\n",
    "gdf_bbox = gpd.GeoDataFrame(geometry=[box(*bbox)], crs=4326)\n",
    "extent = gdf_bbox.buffer(0.02).total_bounds[[0, 2, 1, 3]]\n",
    "\n",
    "ax = plot(extent, gdf_bas, gdf_out)\n",
    "gdf_bbox.boundary.plot(ax=ax, edgecolor=\"r\", ls=\"-\", zorder=1)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "default",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.21"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}