.. DO NOT EDIT. .. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY. .. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE: .. "user-guide\10-cleanup.py" .. LINE NUMBERS ARE GIVEN BELOW. .. only:: html .. note:: :class: sphx-glr-download-link-note :ref:`Go to the end ` to download the full example code. .. rst-class:: sphx-glr-example-title .. _sphx_glr_user-guide_10-cleanup.py: Data cleanup ============ More often than not, data contained in databases is not entirely consistent, causing errors. It therefore is useful to have some utilities at hand to clean up data. We included some convenience methods to help clean up inconsistent datasets. .. GENERATED FROM PYTHON SOURCE LINES 10-12 .. code-block:: Python :dedent: 1 .. GENERATED FROM PYTHON SOURCE LINES 14-15 We'll start with the usual imports .. GENERATED FROM PYTHON SOURCE LINES 15-19 .. code-block:: Python import numpy as np import imod .. GENERATED FROM PYTHON SOURCE LINES 20-21 These imports can be ignored .. GENERATED FROM PYTHON SOURCE LINES 21-24 .. code-block:: Python from imod.schemata import ValidationError from imod.util import print_if_error .. GENERATED FROM PYTHON SOURCE LINES 25-28 There is a separate example contained in :doc:`hondsrug ` that you should look at if you are interested in the model building .. GENERATED FROM PYTHON SOURCE LINES 28-33 .. code-block:: Python tmpdir = imod.util.temporary_directory() gwf_simulation = imod.data.hondsrug_simulation(tmpdir / "hondsrug_saved") .. GENERATED FROM PYTHON SOURCE LINES 34-40 Update existing model with new data ----------------------------------- Your dear colleague has brought you some new data for the river package, which should be much better than the previous dataset "riv" included in the database. .. GENERATED FROM PYTHON SOURCE LINES 41-45 .. code-block:: Python gwf_model = gwf_simulation["GWF"] new_riv_ds = imod.data.colleagues_river_data(tmpdir / "hondsrug_saved") .. GENERATED FROM PYTHON SOURCE LINES 46-47 Let's do a brief visual check if the colleague's data seems alright: .. GENERATED FROM PYTHON SOURCE LINES 47-52 .. code-block:: Python imod.visualize.plot_map( new_riv_ds["stage"].max(dim="layer"), "viridis", np.linspace(-1, 19, 9) ) .. image-sg:: /user-guide/images/sphx_glr_10-cleanup_001.png :alt: 10 cleanup :srcset: /user-guide/images/sphx_glr_10-cleanup_001.png :class: sphx-glr-single-img .. rst-class:: sphx-glr-script-out .. code-block:: none (
, ) .. GENERATED FROM PYTHON SOURCE LINES 53-56 Hmmmm, the western side of the river stage grid seems suspiciously inactive... We have to contact our colleague later. For now, let's work with what we have and update the model. .. GENERATED FROM PYTHON SOURCE LINES 57-62 .. code-block:: Python old_riv = gwf_model.pop("riv") gwf_model["new_riv"] = imod.mf6.River(**new_riv_ds) .. GENERATED FROM PYTHON SOURCE LINES 63-64 Lets's write the simulation with our updated model! .. GENERATED FROM PYTHON SOURCE LINES 65-73 .. code-block:: Python tmp_dir = imod.util.temporary_directory() # Ignore the "with" statement, it is to succesfully render the Jupyter notebook # online. with print_if_error(ValidationError): gwf_simulation.write(tmp_dir) .. rst-class:: sphx-glr-script-out .. code-block:: none Simulation validation status: * GWF model: * new_riv package: * stage: * not all values comply with criterion: >= bottom_elevation * conductance: * nodata is not aligned with stage * not all values comply with criterion: > 0.0 * bottom_elevation: * nodata is not aligned with stage -> You might fix this by calling the package's ``.cleanup()`` method. .. GENERATED FROM PYTHON SOURCE LINES 74-81 Oh no! Our river package has a completely inconsistent dataset. The model validation raises the following issues: * The bottom elevation exceeds stage in some cells * NoData cells are not aligned between stage and conductance * NoData cells are not aligned between stage and bottom_elevation * There are conductance values with value <= 0.0 .. GENERATED FROM PYTHON SOURCE LINES 84-88 *Exercise*: Use the function ``imod.visualize.plot_map`` to visually inspect the errors in your colleagues dataset. You can alter the variable name to inspect different variables. .. GENERATED FROM PYTHON SOURCE LINES 89-94 .. code-block:: Python imod.visualize.plot_map( new_riv_ds["stage"].max(dim="layer"), "viridis", np.linspace(-1, 19, 9) ) .. image-sg:: /user-guide/images/sphx_glr_10-cleanup_002.png :alt: 10 cleanup :srcset: /user-guide/images/sphx_glr_10-cleanup_002.png :class: sphx-glr-single-img .. rst-class:: sphx-glr-script-out .. code-block:: none (
, ) .. GENERATED FROM PYTHON SOURCE LINES 95-97 We can also check where stage exceeds river bottom elevation using basic xarray functionality. .. GENERATED FROM PYTHON SOURCE LINES 98-103 .. code-block:: Python stage_above_riv_bot = new_riv_ds["stage"] < new_riv_ds["bottom_elevation"] imod.visualize.plot_map(stage_above_riv_bot.max(dim="layer"), "viridis", [0, 1]) .. image-sg:: /user-guide/images/sphx_glr_10-cleanup_003.png :alt: 10 cleanup :srcset: /user-guide/images/sphx_glr_10-cleanup_003.png :class: sphx-glr-single-img .. rst-class:: sphx-glr-script-out .. code-block:: none (
, ) .. GENERATED FROM PYTHON SOURCE LINES 104-107 So luckily the area affected by this error is only small. Let's investigate the size of the error here. .. GENERATED FROM PYTHON SOURCE LINES 108-115 .. code-block:: Python diff = new_riv_ds["bottom_elevation"] - new_riv_ds["stage"] max_diff = diff.max().values max_diff .. rst-class:: sphx-glr-script-out .. code-block:: none array(0.10000038) .. GENERATED FROM PYTHON SOURCE LINES 116-118 That is a relatively subtle error, which is probably why it slipped your colleague's attention. Let's plot the troublesome area: .. GENERATED FROM PYTHON SOURCE LINES 119-126 .. code-block:: Python imod.visualize.plot_map( diff.where(stage_above_riv_bot).max(dim="layer"), "viridis", np.linspace(0, max_diff, 9), ) .. image-sg:: /user-guide/images/sphx_glr_10-cleanup_004.png :alt: 10 cleanup :srcset: /user-guide/images/sphx_glr_10-cleanup_004.png :class: sphx-glr-single-img .. rst-class:: sphx-glr-script-out .. code-block:: none (
, ) .. GENERATED FROM PYTHON SOURCE LINES 127-147 That is only a small area. Plotting these errors can help you analyze the size of problems and communicate with your colleague where there are problems. The ``cleanup`` method ---------------------- The data quality seems acceptable for now to carry on modelling, so we can already set up a model for the project. We need to communicate with colleagues later that there are mistakes, so that the errors can be corrected. iMOD Python has a set of utilities to help you with cleaning erronous data. Note that these cleanup utilities can only fix a limited set of common mistakes, in ways you might not need/like. It therefore is always wise to verify if you think the fixes are correct in this case. Let's clean up the problematic river data using the :meth:`imod.mf6.River.cleanup` method. Note that this method changes the data inplace. This means that the package will be updated, without returning a new copy of the package. This means that we need to copy the dirty dataset first, before we can do comparisons of the dataset before and after the fix. .. GENERATED FROM PYTHON SOURCE LINES 148-151 .. code-block:: Python dirty_ds = gwf_model["new_riv"].dataset.copy() .. GENERATED FROM PYTHON SOURCE LINES 152-156 We can now clean up the package. To clean up the River package, the method requires the model discretization, contained in the :class:`imod.mf6.StructuredDiscretization` package. In this model, we named this package "dis". .. GENERATED FROM PYTHON SOURCE LINES 157-163 .. code-block:: Python dis_pkg = gwf_model["dis"] gwf_model["new_riv"].cleanup(dis=dis_pkg) cleaned_ds = gwf_model["new_riv"].dataset .. GENERATED FROM PYTHON SOURCE LINES 164-170 According to the method's explanation (the method calls :func:`imod.prepare.cleanup_riv`), the bottom elevation should be altered by the cleanup function. Let's first see if the stages were altered, we'll calculate the difference between the original stage and cleaned version. .. GENERATED FROM PYTHON SOURCE LINES 171-179 .. code-block:: Python diff_stage = dirty_ds["stage"] - cleaned_ds["stage"] fig, ax = imod.visualize.plot_map( diff_stage.max(dim="layer"), "viridis", np.linspace(0, max_diff, 9) ) ax.set_title("stage lowered by cleanup (m)") .. image-sg:: /user-guide/images/sphx_glr_10-cleanup_005.png :alt: stage lowered by cleanup (m) :srcset: /user-guide/images/sphx_glr_10-cleanup_005.png :class: sphx-glr-single-img .. rst-class:: sphx-glr-script-out .. code-block:: none Text(0.5, 1.0, 'stage lowered by cleanup (m)') .. GENERATED FROM PYTHON SOURCE LINES 180-184 The stages are indeed not altered. Let's see if the bottom elevations are lowered, we'll calculate the difference between the original bottom elevation and cleaned version. .. GENERATED FROM PYTHON SOURCE LINES 185-193 .. code-block:: Python diff_riv_bot = dirty_ds["bottom_elevation"] - cleaned_ds["bottom_elevation"] fig, ax = imod.visualize.plot_map( diff_riv_bot.max(dim="layer"), "viridis", np.linspace(0, max_diff, 9) ) ax.set_title("river bottom lowered by cleanup (m)") .. image-sg:: /user-guide/images/sphx_glr_10-cleanup_006.png :alt: river bottom lowered by cleanup (m) :srcset: /user-guide/images/sphx_glr_10-cleanup_006.png :class: sphx-glr-single-img .. rst-class:: sphx-glr-script-out .. code-block:: none Text(0.5, 1.0, 'river bottom lowered by cleanup (m)') .. GENERATED FROM PYTHON SOURCE LINES 194-211 You can see the bottom elevation was lowered by the cleanup method. Furthermore the area in the west where no active stages were defined are also deactivated. We advice to always verify if the data is cleaned in a manner that fits your use-case. For example, you might need to raise the stages to the river bottom elevation, instead of lowering the latter to the former like the ``cleanup`` method just did. In such case, you need to manually fix the data. The ``cleanup`` method helps getting your data through the model validation, but is not guaranteed to be the "best" way to clean up your data! Writing the cleaned model ------------------------- Now that the river package has been cleaned, let's see if we can write the model. .. GENERATED FROM PYTHON SOURCE LINES 212-215 .. code-block:: Python gwf_simulation.write(tmp_dir) .. GENERATED FROM PYTHON SOURCE LINES 216-227 Great! The model was succesfully written! Cleaning data without a MODFLOW6 simulation ------------------------------------------- There might be situations where you do not have a MODFLOW6 simulation or River package at hand, and you still want to clean up your river grids. In this case, you can use the :func:`imod.prepare.cleanup_riv` function. This function requires you to to separately provide your grids and returns a dictionary of grids. .. GENERATED FROM PYTHON SOURCE LINES 228-241 .. code-block:: Python idomain = dis_pkg["idomain"] bottom = dis_pkg["bottom"] stage = dirty_ds["stage"] conductance = dirty_ds["conductance"] bottom_elevation = dirty_ds["bottom_elevation"] riv_cleaned_dict = imod.prepare.cleanup_riv( idomain, bottom, stage, conductance, bottom_elevation ) riv_cleaned_dict .. rst-class:: sphx-glr-script-out .. code-block:: none {'stage': Size: 5MB array([[[ nan, nan, nan, ..., 1.4399999, 1.41 , 1.39 ], [ nan, nan, nan, ..., nan, 1.39 , 1.39 ], [ nan, nan, nan, ..., nan, 1.0799999, 0.85 ], ..., [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan]], [[ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], ... [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan]], [[ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], ..., [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan]]], dtype=float32) Coordinates: * x (x) float64 4kB 2.375e+05 2.375e+05 2.376e+05 ... 2.5e+05 2.5e+05 * y (y) float64 2kB 5.64e+05 5.64e+05 5.639e+05 ... 5.59e+05 5.59e+05 * layer (layer) int32 52B 1 2 3 4 5 6 7 8 9 10 11 12 13 dx float64 8B 25.0 dy float64 8B -25.0, 'conductance': Size: 5MB array([[[ nan, nan, nan, ..., 29.86, 2.71, 29.86], [ nan, nan, nan, ..., nan, 4.52, 26.24], [ nan, nan, nan, ..., nan, 0.9 , 11.67], ..., [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan]], [[ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], ..., [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan]], [[ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], ..., ... ..., [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan]], [[ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], ..., [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan]], [[ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], ..., [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan]]], dtype=float32) Coordinates: * x (x) float64 4kB 2.375e+05 2.375e+05 2.376e+05 ... 2.5e+05 2.5e+05 * y (y) float64 2kB 5.64e+05 5.64e+05 5.639e+05 ... 5.59e+05 5.59e+05 * layer (layer) int32 52B 1 2 3 4 5 6 7 8 9 10 11 12 13 dx float64 8B 25.0 dy float64 8B -25.0, 'bottom_elevation': Size: 5MB array([[[ nan, nan, nan, ..., 1.39, 1.36, 1.34], [ nan, nan, nan, ..., nan, 1.34, 1.34], [ nan, nan, nan, ..., nan, 1.03, 0.51], ..., [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan]], [[ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], ..., [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan]], [[ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], ..., ... ..., [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan]], [[ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], ..., [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan]], [[ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], ..., [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan]]], dtype=float32) Coordinates: * x (x) float64 4kB 2.375e+05 2.375e+05 2.376e+05 ... 2.5e+05 2.5e+05 * y (y) float64 2kB 5.64e+05 5.64e+05 5.639e+05 ... 5.59e+05 5.59e+05 * layer (layer) int32 52B 1 2 3 4 5 6 7 8 9 10 11 12 13 dx float64 8B 25.0 dy float64 8B -25.0} .. GENERATED FROM PYTHON SOURCE LINES 242-244 This returns a dictionary which you can use however you want, and furthermore feed to the River package. .. GENERATED FROM PYTHON SOURCE LINES 245-250 .. code-block:: Python riv_pkg = imod.mf6.River(**riv_cleaned_dict) riv_pkg .. raw:: html
River
<xarray.Dataset> Size: 16MB
    Dimensions:           (x: 500, y: 200, layer: 13)
    Coordinates:
      * x                 (x) float64 4kB 2.375e+05 2.375e+05 ... 2.5e+05 2.5e+05
      * y                 (y) float64 2kB 5.64e+05 5.64e+05 ... 5.59e+05 5.59e+05
      * layer             (layer) int32 52B 1 2 3 4 5 6 7 8 9 10 11 12 13
        dx                float64 8B 25.0
        dy                float64 8B -25.0
    Data variables:
        stage             (layer, y, x) float32 5MB nan nan nan nan ... nan nan nan
        conductance       (layer, y, x) float32 5MB nan nan nan nan ... nan nan nan
        bottom_elevation  (layer, y, x) float32 5MB nan nan nan nan ... nan nan nan
        print_input       bool 1B False
        print_flows       bool 1B False
        save_flows        bool 1B False
        observations      object 8B None
        repeat_stress     object 8B None


.. rst-class:: sphx-glr-timing **Total running time of the script:** (0 minutes 3.587 seconds) .. _sphx_glr_download_user-guide_10-cleanup.py: .. only:: html .. container:: sphx-glr-footer sphx-glr-footer-example .. container:: sphx-glr-download sphx-glr-download-jupyter :download:`Download Jupyter notebook: 10-cleanup.ipynb <10-cleanup.ipynb>` .. container:: sphx-glr-download sphx-glr-download-python :download:`Download Python source code: 10-cleanup.py <10-cleanup.py>` .. container:: sphx-glr-download sphx-glr-download-zip :download:`Download zipped: 10-cleanup.zip <10-cleanup.zip>` .. only:: html .. rst-class:: sphx-glr-signature `Gallery generated by Sphinx-Gallery `_