{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "0",
   "metadata": {},
   "source": [
    "# Example: Reading 2D tabular data (DataFrame)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "1",
   "metadata": {},
   "source": [
    "This example illustrates how to read 2D tabular data using the HydroMT [DataCatalog](../_generated/hydromt.data_catalog.DataCatalog.rst) with the `csv` driver."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2",
   "metadata": {},
   "outputs": [],
   "source": [
    "from hydromt import DataCatalog\n",
    "\n",
    "data_catalog = DataCatalog(\"data/tabular_data_catalog.yml\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "3",
   "metadata": {},
   "source": [
    "## Pandas driver"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "4",
   "metadata": {},
   "source": [
    "### Time series data"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "5",
   "metadata": {},
   "source": [
    "To read 2D tabular data from a comma-separated values (csv) file and parse it into a [pandas.DataFrame](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html), we use [pandas.read_csv()](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html). Any *driver_kwargs* in the data catalog are passed to this method, e.g., to parse dates in the \"time\" column and set that column as the index.\n",
    "\n",
    "This works similarly for Excel tables, but is based on the [pandas.read_excel()](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_excel.html#pandas.read_excel) method.\n",
    "\n",
    "For demonstration, we use a dummy time series dataset in csv format."
   ]
  },
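  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "5a",
   "metadata": {},
   "source": [
    "As a rough sketch of what such options do, the cell below calls `pandas.read_csv()` directly on a small in-memory table. The table contents and the chosen keyword arguments (`index_col`, `parse_dates`) are assumptions for illustration and are not taken from the data catalog used in this example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5b",
   "metadata": {},
   "outputs": [],
   "source": [
    "import io\n",
    "\n",
    "import pandas as pd\n",
    "\n",
    "# hypothetical csv content, made up for this sketch\n",
    "csv_text = \"\"\"time,col1,col2\n",
    "2016-01-01,0.5,1.0\n",
    "2016-01-02,0.7,1.2\n",
    "2016-01-03,0.6,1.1\n",
    "\"\"\"\n",
    "\n",
    "# parse the \"time\" column as dates and use it as the index\n",
    "df_sketch = pd.read_csv(io.StringIO(csv_text), index_col=\"time\", parse_dates=True)\n",
    "df_sketch.index"
   ]
  },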
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6",
   "metadata": {},
   "outputs": [],
   "source": [
    "# inspect data source entry in data catalog yaml file\n",
    "data_catalog.get_source(\"example_csv_data\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "7",
   "metadata": {},
   "source": [
    "We can load any 2D tabular data using [DataCatalog.get_dataframe()](../_generated/hydromt.data_catalog.DataCatalog.get_dataframe.rst). Note that if we don't provide any further arguments, it returns the full dataframe."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8",
   "metadata": {},
   "outputs": [],
   "source": [
    "df = data_catalog.get_dataframe(\"example_csv_data\")\n",
    "df.head()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "9",
   "metadata": {},
   "source": [
    "The data can be visualized with the pandas [.plot()](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.plot.html) method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "10",
   "metadata": {},
   "outputs": [],
   "source": [
    "df.plot(y=\"col1\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "11",
   "metadata": {},
   "source": [
    "### Reclassification table"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "12",
   "metadata": {},
   "source": [
    "Another typical use case for tabular data is a reclassification table, e.g. to reclassify land use data to Manning roughness. An example of this data is shown in the cells below. Note that the values are not validated and are likely too high!"
   ]
  },
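  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "12a",
   "metadata": {},
   "source": [
    "To make the idea concrete, the cell below sketches a reclassification with plain pandas and NumPy: each class value in a small grid is replaced by the value in the matching row of a lookup table. The class values, roughness values, and variable names are made up for illustration, and this is not necessarily how `raster.reclassify` is implemented."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "12b",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "# hypothetical lookup table: land use class -> Manning roughness\n",
    "lookup = pd.DataFrame({\"manning\": [0.03, 0.10, 0.05]}, index=[10, 20, 30])\n",
    "\n",
    "# hypothetical 2D land use grid with class values\n",
    "lulc = np.array([[10, 20], [30, 10]])\n",
    "\n",
    "# look up the roughness of every cell and restore the grid shape\n",
    "manning = lookup.loc[lulc.ravel(), \"manning\"].to_numpy().reshape(lulc.shape)\n",
    "manning"
   ]
  },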
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "13",
   "metadata": {},
   "outputs": [],
   "source": [
    "# read both the vito_reclass and artifact_data data catalogs\n",
    "data_catalog = DataCatalog([\"data/vito_reclass.yml\", \"artifact_data\"])\n",
    "data_catalog.get_source(\"vito_reclass\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "14",
   "metadata": {},
   "outputs": [],
   "source": [
    "df = data_catalog.get_dataframe(\"vito_reclass\")\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "15",
   "metadata": {},
   "outputs": [],
   "source": [
    "da_lulc = data_catalog.get_rasterdataset(\"vito_2015\")\n",
    "da_man = da_lulc.raster.reclassify(df[[\"manning\"]])\n",
    "da_man[\"manning\"].plot.imshow()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "default",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.21"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}