{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Example: Working with geodatasets"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from hydromt import DataCatalog\n",
    "\n",
    "dc = DataCatalog(data_libs=[\"artifact_data=v1.0.0\"])"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here, we illustrate some common GIS problems and how the functionality of the DataArray/Dataset vector accessor can be used. The data is accessed using the HydroMT [DataCatalog](../_generated/hydromt.data_catalog.DataCatalog.rst). For more information see the [Reading vector data](reading_vector_data.ipynb) example."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Geospatial attributes "
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Some of the available geospatial attributes are listed here below. For all of attributes (and methods for that matter): check the [HydroMT API reference](../api/api.rst)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Get the waterlevel dataset from the datacatalog\n",
    "# The waterlevels are series per location (points; here in lat lon)\n",
    "ds = dc.get_geodataset(\"gtsmv3_eu_era5\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Coordinate reference system\n",
    "ds.vector.crs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Geospatial data as geometry objects (we show the first 5 points)\n",
    "ds.vector.geometry.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# names of x- and y coordinates\n",
    "\n",
    "(ds.vector.x_name, ds.vector.y_name)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Bounding box of the geospatial data\n",
    "ds.vector.bounds"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Reprojection"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Reproject data to Pseudo Mercator (EPSG: 3857)\n",
    "ds_pm = ds.vector.to_crs(3857)\n",
    "\n",
    "# Coordinate reference system\n",
    "ds_pm.vector.crs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Check that the coordinate values are indeed no longer in degrees\n",
    "ds_pm.lon.values[0]"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Conversion"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create an ogr compliant dataset from ds\n",
    "# When written in netcdf4 format, this can be read by ogr (osgeo; QGIS)\n",
    "from numpy import mean\n",
    "\n",
    "ds_ogr = ds.vector.ogr_compliant(reducer=mean)\n",
    "ds_ogr"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Convert the geospatial data to wkt strings\n",
    "ds_wkt = ds.vector.to_wkt()\n",
    "ds_wkt"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Convert the geospatial data to geometry objects\n",
    "ds_geom = ds.vector.to_geom()\n",
    "ds_geom"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## I/O (Internal and External)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create a dummy GeoDataFrame for this example\n",
    "import numpy as np\n",
    "from geopandas import GeoDataFrame\n",
    "from pyproj import CRS\n",
    "from shapely.geometry import Point\n",
    "\n",
    "from hydromt.gis import GeoDataArray\n",
    "\n",
    "gdf = GeoDataFrame(\n",
    "    [{\"Loc\": \"I\", \"Stuff\": 1}, {\"Loc\": \"II\", \"Stuff\": 2}],\n",
    "    geometry=[Point(0, 0), Point(1, 1)],\n",
    "    crs=CRS.from_epsg(4326),\n",
    ")\n",
    "\n",
    "ds_gdf = GeoDataArray.from_gdf(gdf, np.arange(gdf.index.size))\n",
    "ds_gdf"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Write the Dataset of the DataCatalog to a GeoDataFrame\n",
    "# Waterlevel has besides stations a time dimension\n",
    "# GeoDataFrames don't like vectors/ 2d array's, so if you want to keep the variable it can be reduced along the time dimension\n",
    "\n",
    "gdf = ds.vector.to_gdf(reducer=mean)\n",
    "gdf.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Write the Dataset to an ogr compliant netcdf\n",
    "from os.path import join\n",
    "\n",
    "ds.vector.to_netcdf(join(\"tmpdir\", \"ds_ogr.nc\"), ogr_compliant=True)\n",
    "\n",
    "# It is indeed ogr compliant as ogrinfo.exe is able to read it\n",
    "!ogrinfo tmpdir/ds_ogr.nc"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.9"
  },
  "vscode": {
   "interpreter": {
    "hash": "3808d5b5b54949c7a0a707a38b0a689040fa9c90ab139a050e41373880719ab1"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}