Table of Contents
Introduction
Vision
D-Eco Impact is an open source spatial ecological impact postprocessing model. The model is aimed at aquatic applications and is suitable for coastal, river, lake, rural and urban settings. It applies criteria rules to environmental conditions to evaluate the ecological state. These environmental conditions can be of varying detail (e.g. expert knowledge, measurements, or model output), and the criteria applied can be of varying complexity (e.g. hard boundaries, gradual boundaries, multivariate relationships and AI-derived models). D-Eco Impact reduces the technical burden of applying these criteria to these environmental conditions. By reducing this technical burden, the following benefits can be achieved:
- It becomes easier to use different environmental data sources, and to replace them when new environmental model output or better descriptive data sources become available, without changing the ecological criteria.
- More time can be spent on exploring the ecological criteria used and improving the knowledge that supports them.
- The effect of changing the ecological criteria or the underlying environmental data on the ecological result (e.g. spatial/temporal resolution, accuracy of the environmental data used, missing pressures, knowledge rules used) can be explored more easily, while comparing the modelled result with the current situation in the field.
We distinguish between four types of users for D-Eco Impact:
- Users assessing the model results.
- Users working with the model using established functionality through the input file.
- Users expanding on the model by developing prototype functions on the existing framework.
- Developers or co-creators of the model adding accepted functionality that will be available to other users of the model.
To provide D-Eco Impact with one environmental input dataset and to analyse the results, we make use of the HydroMT_habitat plugin. HydroMT_habitat combines and prepares environmental data from various sources (e.g. expert knowledge, measurements, or model output) into one coherent dataset of environmental conditions, ready to be analyzed by D-Eco Impact. This dataset is a NetCDF file following the UGRID data format, developed for storing 1D, 2D and 3D (unstructured) data. HydroMT_habitat is also meant as a post-processing module, translating the D-Eco Impact result to a data format required by the user (e.g. shapefile, geopackage, GeoTiff, CSV) and providing statistical summaries (e.g. area created, change compared with a previous scenario, most limiting environmental variable, least limiting environmental variable).
Installation
D-Eco Impact is a command line operated model. To use D-Eco Impact, an installation of Python and the required libraries is (currently) needed. This is best achieved by installing D-Eco Impact in a virtual environment.
Conda or Visual Studio Code
Conda is a package and environment manager that can be used to install the poetry package and other packages needed to run D-Eco Impact.
Installation of D-Eco Impact with conda (use Miniforge or Miniconda)
Note: when using miniconda, make sure to update the defaults channel to conda-forge (instructions for changing the channel)!
- Open a command line tool (e.g. cmd or PowerShell) and create a new environment:
$ conda create -y -c conda-forge --name <env_name> python=3.11
- Activate the newly created environment:
$ conda activate <env_name>
- Move to the folder where you have placed the D-Eco Impact source code. You can use cd ../ and cd <folder name> to move to the location, or use Windows Explorer and type “cmd” + Enter in the path bar.
- To install the required libraries, Poetry is used (use Poetry 1.3 or higher; see the installation instructions). If you prefer to install Poetry with conda, we recommend installing Poetry only in the base environment.
Activate base environment:
$ conda activate base
Install poetry using pip:
$ pip install poetry
Activate your created environment:
$ conda activate <env_name>
- Poetry makes use of the poetry.lock and pyproject.toml files (present in the D-Eco Impact folder) to find the required libraries. Execute the following command:
poetry install
NB. If errors occur while installing the libraries, this might be related to your administrative rights. Either start the cmd prompt “As administrator” or discuss this with your IT support.
- Now D-Eco Impact is ready to use. You can test this by executing one of the input yaml files. To execute, use the following in the command prompt while your environment is active:
python main.py <your_input_file>.yaml
Installation of D-Eco Impact with Visual Studio Code and venv
- Install [Python version 3.11.2](https://www.python.org/downloads/)
- Open Visual Studio Code.
- Press CTRL + Shift + P and type “Python: Create Environment”, followed by Enter, and select “Venv”.
- Place the environment in the D-Eco Impact folder.
- Press CTRL + Shift + P and type “Python: Select interpreter” and select the newly created environment.
- In the terminal in Visual Studio Code execute the following command:
pip install poetry
- In the terminal in Visual Studio Code execute the following command:
poetry install
- Now D-Eco Impact is set up for use. You can test this by executing one of the input yaml files. To execute, use the following in the command prompt while your environment is active:
python main.py <your_input_file>.yaml
How to Cite
If you found D-Eco Impact useful for your study, please cite it as:
Weeber, M., Elzinga, H., Schoonveld, W., Van de Vries, C., Klapwijk, M., Mischa, I., Rodriguez Aguilera, D., Farrag, M., Ye, Q., Markus, A., Van Oorschot, M., Saager, P., & Icke, J. (2024). D-Eco Impact (v0.3.0). Zenodo. https://doi.org/10.5281/zenodo.10941913
User manual
Visualization of input and output data
There are multiple ways in which the data used and produced can be visualized. Here the use of Panoply to explore the data structure, Quickplot for 2D horizontal and 3D vertical visualization, and QGIS for spatially relevant visualization are discussed.
Panoply
Panoply is a NetCDF viewer developed by NASA GISS. Panoply can be downloaded here.
Panoply is useful for exploring the content of NetCDF files. It allows the user to see which variables are present in the file, over which dimensions these variables contain values (e.g. x, y, z, time) and what metadata is supplied with each variable. Especially when you have received a NetCDF file and are not yet familiar with its contents, it can be useful to open it first with Panoply.
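The same overview can also be obtained programmatically with xarray (one of the libraries D-Eco Impact itself depends on). Below is a minimal sketch using the example dataset referenced later in this manual; replace the file name with your own file.
# Minimal sketch: list the dimensions, variables and units of a NetCDF file
import xarray as xr

ds = xr.open_dataset("examples/data/FM-VZM_0000_map.nc")
print(ds.dims)                                        # dimensions and their sizes
for name, var in ds.data_vars.items():
    print(name, var.dims, var.attrs.get("units", ""))  # variable, dimensions, unit metadata
ds.close()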
Quickplot
Quickplot is a Deltares visualization tool used, amongst others, for Delft3D 4 and Delft3D FM models. Within Deltares the latest version of Quickplot can be obtained here:
Quickplot is also co-delivered with the installation of one of the Delft3D suites.
Quickplot allows the visualization of UGRID NetCDF files, both in the horizontal, over time and in the vertical (for 3D models).
QGIS
QuantumGIS (QGIS) is free, open source GIS software. The latest version of QGIS can be downloaded here.
QGIS can handle 2D Mesh data directly. See the QGIS 3.28 documentation here. QGIS does, however, not recognize the newly created time axes (e.g. time_year, time_month after using the D-Eco Impact "time_aggregation_rule").
For 3D mesh data, a Deltares plugin developed by Jan Mooiman (QGIS_Qmesh) can perform the visualisation, and it also makes visualization through time easy. Within Deltares the latest version of this plugin can be obtained here; outside Deltares it needs to be compiled externally.
When Mesh data is loaded directly in QGIS the spatial relevance can be easily displayed using the plugin QuickMapServices > OSM layer.
Structure of the model input file and functionality
D-Eco Impact is steered through a YAML input file. This input file informs the model which data to use, what ecological knowledge rules to apply and where to write the output data. The easiest way to edit these YAML files is by using Notepad++. When starting with your first application with D-Eco Impact, make use of earlier models to setup your YAML input file and edit from there. When running the YAML file with D-Eco Impact, the model will inform you if there are inconsistencies in the file provided.
Importing and exporting data
Importing and exporting data is always arranged in the input-data and output-data header in the YAML file.
version: …………………….
input-data:
…………………….
rules:
…………………….
output-data:
…………………….
The variables present in the input data, provided through “filename”, are selected for use. The filename can contain a pattern with a * in the name. Instead of using one single input file, all files matching the pattern within the folder are then processed by the same input_file.yaml. So, for example, if a folder contains two files test_1.nc and test_2.nc, the user can set the filename to "test_*.nc" and both files will be processed. It is possible to filter the input data by providing a start date or end date (format: "dd-mm-yyyy"); this is optional. The variables that are used can be selected under “variable_mapping”. Here, you are also able to rename variables, as the name used for storage is often cryptic.
Under output-data, the location where the output file needs to be written can be provided through “filename”. In this output file only variables that have been used from the input data and variables that have been created in the model are stored. If the user gives a pattern (filename with asterisk for partitions) in the input-data filename, the output-data filename needs to match the corresponding number of files being processed. Again, in the example of two files (test_1.nc and test_2.nc) and an input-data filename of "test_*.nc", the user can give an output-data filename with or without an asterisk. Without an asterisk (e.g. "output.nc"), the partitioned part of the input filename is appended to the output-data filename ("output_1.nc" and "output_2.nc"). With an asterisk (e.g. "*_output.nc"), the * indicates where the partitioned part of the input filename will be placed ("1_output.nc" and "2_output.nc"). It is possible to reduce the file size with the optional parameter "save_only_variables", which can take the name of one or several variables. The model needs at least one rule under “rules” to execute.
#FORMAT
version: <D-Eco_Impact_version_nr>
input-data:
- dataset:
filename: <path_to_file_including_file_name_and_type>
start_date: "<start_date>"
end_date: "<end_date>"
variable_mapping:
<variable1_input_file>: "<variable1_name_in_model>"
<variable2_input_file>: "<variable2_name_in_model>"
………
rules:
………
output-data:
filename: <path_to_file_including_file_name_and_type>
save_only_variables: <variable, or list_of_variables>
#EXAMPLE : Reading and writing an example model of the Volkerak-Zoommeer
version: 0.1.5
# Mapping: mesh2d_sa1 : Salinity (PSU)
# mesh2d_s1 : Water level (m NAP)
# mesh2d_waterdepth : Water depth (m)
input-data:
- dataset:
filename: examples/data/FM-VZM_0000_map.nc
start_date: "01-01-2011"
end_date: "31-12-2015"
variable_mapping:
mesh2d_sa1: "salinity"
mesh2d_s1: "water_level"
mesh2d_waterdepth: "water_depth"
rules:
- multiply_rule:
name: make variable test
description: Make a variable called test for testing purposes
multipliers: [1.0]
input_variable: water_depth
output_variable: test
output-data:
filename: examples/data_out/results_test8c.nc
save_only_variables: test
Functionality
The functionality is always arranged in the form of rules under the rules header in the yaml file.
version: …………………….
input-data:
…………………….
rules:
…………………….
output-data:
…………………….
The output of the following functionalities is shown for a section of the Lake Volkerak 3D hydrodynamic model in the Netherlands. This hydrodynamic model output contains 6 years of data (2011 – 2016) with a timestep of 10 days. The 3D hydrodynamic model has been set up with 22 vertical layers and 3290 horizontal flexible mesh grid cells.
Rules
Multiply rule
- multiply_rule:
name: <name_of_rule_in_text>
description: <description_of_rule_in_text>
multipliers: [<value_to_multiply_with>]
input_variable: <one_input_variable_name>
output_variable: <one_output_variable_name>
- multiply_rule:
name: <name_of_rule_in_text>
description: <description_of_rule_in_text>
multipliers_table:
- ["start_date", "end_date", "multipliers"]
- [<DD-MM>, <DD-MM>, [<value_to_multiply_with>]]
- [<DD-MM>, <DD-MM>, [<value_to_multiply_with>]]
input_variable: <one_input_variable_name>
output_variable: <one_output_variable_name>
The multiply rule allows for the multiplication of variables. This can be used for unit transformation (e.g., salinity to chloride) or scenario creation (e.g., water level at 80% of the existing value). The rule operates on all cells, both in 3D (horizontal and vertical) and along the time axis. The same dimensions are returned in the output variable. The rule needs to be applied to an existing variable; a new variable is created when the rule is executed.
When using the multiply rule with a start and end date (or multiple start and end dates), all periods that are not covered will be set to NaN. In this way the multiply rule can also be used as a filter in time. NaN values are ignored by any further calculations (for example the time_aggregation_rule).
#EXAMPLE: Salinity (psu) to chloride (mg/l) in a freshwater environment.
- multiply_rule:
name: Salinity to chloride
description: Converts salinity (psu) to chloride (CL- mg/l) for fresh water environments
multipliers: [0.0018066, 1e5]
input_variable: salinity
output_variable: chloride
- multiply_rule:
name: Select only the summer half year for chloride
description: Select only the summer half year for chloride as this is important for plant growth
multipliers_table:
- ["start_date", "end_date", "multipliers"]
- ["15-04" , "15-09" , [1.0]]
input_variable: chloride
output_variable: chloride_grow_period
Layer filter rule
FORMAT
- layer_filter_rule:
name: <name_of_rule_in_text>
description: <description_of_rule_in_text>
layer_number: <integer_nr_of_layer>
input_variable: <one_3D_input_variable_name>
output_variable: <one_output_variable_name>
The layer filter rule allows for the extraction of a layer from 3D variables. This can be used for extracting the top or bottom layer (e.g., from a multi-layered model result). The rule operates on all layers of a 3D variable (in the vertical) and along the time axis, and returns a 2D result with the time axis intact. The rule needs to be applied to an existing 3D variable; a new 2D variable is created when the rule is executed.
#EXAMPLE : Extracts the chloride concentration at surface.
- layer_filter_rule:
name: Extract chloride at surface
description: Extracts the chloride concentration at surface
layer_number: 22
input_variable: chloride
output_variable: chloride_top_layer
Time aggregation rule
FORMAT
- time_aggregation_rule:
name: <name_of_rule_in_text>
description: <description_of_rule_in_text>
operation: <statistic_operation_applied>
time_scale : <time_aggregation_applied>
input_variable: <one_input_variable_name>
output_variable: <one_output_variable_name>
The time aggregation rule allows for calculating a statistical summary over the time axis of 3D and 2D variables. This can be used for calculating the maximum value over a year (e.g., for water level) or the minimum value over a month (e.g., oxygen concentration). The rule operates on both 3D and 2D variables, as long as they have a time axis, and returns a 3D or 2D result (depending on the input) with the statistic calculated for a new time axis (e.g., year or month). Operations available: Add, Average, Median, Min, Max, period statistics, Stdev and Percentile(n). When using percentile, add a number for the nth percentile in brackets, like this: percentile(10). Stdev calculates the standard deviation over the time period. The period statistics are explained further in the text.
Time aggregation available: Year, Month
The rule needs to be applied to an existing 2D/3D variable with a time axis. A new 2D/3D variable with a new time axis is created when the rule is executed. With a yearly timestep the result is written to the last day of the year; with a monthly timestep the result is written to the last day of each month.
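Conceptually, the yearly aggregation corresponds to grouping the variable by year along its time axis. Below is a minimal xarray sketch of the MAX example that follows, assuming the time coordinate is named "time" (D-Eco Impact handles this internally).
# Minimal sketch of a yearly maximum, comparable to the MAX_water_level_year example
import xarray as xr

ds = xr.open_dataset("examples/data/FM-VZM_0000_map.nc")
# one maximum water level value per year
max_per_year = ds["mesh2d_s1"].groupby("time.year").max("time")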
#EXAMPLE : Calculate the maximum water level in a year.
- time_aggregation_rule:
name: Maximum water level year
description: Get maximum water level in a year
operation: MAX
time_scale: year
input_variable: water_level
output_variable: MAX_water_level_year
Period statistics: Time aggregation rule with COUNT_PERIODS, AVG_DURATION_PERIODS, MIN_DURATION_PERIODS and MAX_DURATION_PERIODS
When the operation type period statistics is used, the user needs to make sure that the input data consists of only 1s and 0s. If no such layer exists, the user can create one by combining, for example, the classification rule with the time aggregation rule. For example, the water depth can be used to check whether cells are dry or not (with a classification rule), and with the COUNT_PERIODS operation type in the time aggregation rule the number of consecutive periods within a year or month can be calculated (nr). AVG_DURATION_PERIODS, MIN_DURATION_PERIODS and MAX_DURATION_PERIODS take the respective statistic of the duration of those consecutive periods (duration). Empty values (NaN) are allowed and will be ignored. If only empty values occur for a specific dimension, the result of the aggregation will be 0.
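For reference, the sketch below illustrates what COUNT_PERIODS and the duration statistics compute for a single cell with a 0/1 time series. It is illustrative only (the series is made up) and not the actual D-Eco Impact implementation.
# Count consecutive runs of 1s and derive their durations for one cell
import numpy as np

values = np.array([0, 1, 1, 0, 1, 1, 1, 0, 1])   # example dry/wet classification in time
padded = np.concatenate(([0], values, [0]))
starts = np.where(np.diff(padded) == 1)[0]        # start index of each consecutive period
ends = np.where(np.diff(padded) == -1)[0]         # end index (exclusive) of each period
durations = ends - starts                         # duration of each consecutive period
count_periods = len(durations)                    # COUNT_PERIODS -> 3
avg_duration = durations.mean() if count_periods else 0   # AVG_DURATION_PERIODS
min_duration = durations.min() if count_periods else 0    # MIN_DURATION_PERIODS
max_duration = durations.max() if count_periods else 0    # MAX_DURATION_PERIODS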
#EXAMPLE : Calculate the number of consecutive periods of dry time monthly
- classification_rule:
name: Classify dry time
description: Classify to 0 and 1 the dry time
criteria_table:
- ["output", "water_depth"]
- [0, ">0.10"]
- [1, "<0.10"]
input_variables: ["water_depth"]
output_variable: dry_time_classified
- time_aggregation_rule:
name: Count periods
description: Count periods
operation: COUNT_PERIODS
time_scale: month
input_variable: dry_time_classified
output_variable: COUNT_PERIODS_water_level_month
Step function rule
FORMAT
- step_function_rule:
name: <name_of_rule_in_text>
description: <description_of_rule_in_text>
limit_response_table:
- [ "limit", "response"]
- [<limit_value>, <response_value>]
- [<limit_value>, <response_value>]
input_variable: <one_input_variable_name>
output_variable: <one_output_variable_name>
The step function rule performs a stepwise classification on the values of time-dependent 3D and 2D variables. This can be used for translating variables into classes (e.g., salinity classes based on salinity) or for indicating suitable/unsuitable ranges (e.g., checking whether the water level falls between the maximum and minimum water level policy criteria). The rule operates on both 3D and 2D variables, independent of the time axis, and returns a binomial result or classes in a 3D or 2D result, with or without time axis, depending on the input.
The rule needs to be applied to an existing 2D/3D variable with or without time axis. A new 2D/3D variable with or without time axis is created when the rule is executed.
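As an illustration, the stepwise lookup can be mimicked with numpy using the limits and responses from the salinity example that follows; the exact boundary handling in D-Eco Impact may differ.
# Illustrative piecewise-constant (step) lookup of responses by limit
import numpy as np

limits = np.array([-999.0, 0.0, 0.5, 1.2, 1.3, 999.0])
responses = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 4.0])
salinity = np.array([0.3, 0.8, 1.25, 2.0])
indices = np.searchsorted(limits, salinity, side="right") - 1   # largest limit not exceeding the value
salinity_class = responses[indices]                             # -> [1., 2., 3., 4.]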
#EXAMPLE : Salinity classes.
- step_function_rule:
name: Classify salinity
description: Make distinction between 0.0 – 0.5 , 0.5 – 1.2, 1.2 – 1.3 and >1.3 psu
limit_response_table:
- [ limit, response]
- [-999.0 , 0.0 ]
- [ 0.0 , 1.0 ]
- [ 0.5 , 2.0 ]
- [ 1.2 , 3.0 ]
- [ 1.3 , 4.0 ]
- [ 999.0 , 4.0 ]
input_variable: salinity
output_variable: salinity_class
#EXAMPLE : Check if the water level falls within the range of -0.10 and +0.15 m NAP.
- step_function_rule:
name: Check water level policy
description: Check if water level is within -0.10 (minimum) and +0.15 (maximum) m NAP
limit_response_table:
- [ limit, response]
- [-999.0 , 0.0 ]
- [ -0.10 , 1.0 ]
- [ 0.15 , 0.0 ]
- [ 999.0 , 0.0 ]
input_variable: water_level
output_variable : water_level_policy
Response curve rule
FORMAT
- response_curve_rule:
name: <name_of_rule_in_text>
description: <description_of_rule_in_text>
response_table:
- [ "input", "output"]
- [<limit_value>, <response_value>]
- [<limit_value>, <response_value>]
input_variable: <one_input_variable_name>
output_variable: <one_output_variable_name>
The response curve rule performs a linear interpolation over the provided values of time-dependent 3D and 2D variables. This can be used for a fuzzy-logic translation of variables into the ecological response to those variables (e.g., suitability for aquatic plants based on light availability). The rule operates on both 3D and 2D variables, independent of the time axis, and returns decimal or fractional values in a 3D or 2D result, with or without time axis, depending on the input.
The rule needs to be applied to an existing 2D/3D variable with or without time axis. A new 2D/3D variable with or without time axis is created when the rule is executed.
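As an illustration, the linear interpolation applied by this rule can be reproduced with numpy.interp, here using the values from the pond weed example below (illustrative only).
# Illustrative linear interpolation of a habitat response over water depth
import numpy as np

depth_input = np.array([-999.0, 0.0, 0.5, 1.0, 2.0, 999.0])
hsi_output = np.array([0.0, 0.0, 1.0, 1.0, 0.0, 0.0])
water_depth = np.array([0.25, 0.75, 1.5])
hsi = np.interp(water_depth, depth_input, hsi_output)   # -> [0.5, 1.0, 0.5]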
#EXAMPLE : Response of the habitat suitability of Long-leaf pond weed
# (Potamogeton nodosus) to water depth.
# Suitable between 0.0 – 2.0 m and highly suitable between 0.5 – 1.0 m
- response_curve_rule:
name: HSI Pond weed water depth
description: Response of Pond weed (Potamogeton nodosus) to water depth
response_table:
- ["input", "output"]
- [-999.0 , 0.0 ]
- [ 0.0 , 0.0 ]
- [ 0.5 , 1.0 ]
- [ 1.0 , 1.0 ]
- [ 2.0 , 0.0 ]
- [ 999.0 , 0.0 ]
input_variable: water_depth
output_variable: HSI_Pnodosus_water_depth
Combine results rule
FORMAT
- combine_results_rule:
name: <name_of_rule_in_text>
description: <description_of_rule_in_text>
operation: <statistic_operation_applied>
input_variables: [<list with_input_variable_names>]
output_variable: <one_output_variable_name>
ignore_nan: <boolean>
The combine results rule combines the output of two or more variables into one output variable. How the data is combined depends on the operation chosen. This can be used for adding mutually exclusive results (e.g., habitat suitability based on flow velocity and water depth) or for assessing the difference between results (e.g., water level and bathymetry to get the water depth). The rule operates on one or multiple 3D or 2D variables, independent of the time axis, as long as these all have the same dimensions, and returns a single 3D or 2D result, with or without time axis, depending on the input.
Operations available: Add, Subtract, Multiply, Average, Median, Min and Max. The parameter ignore_nan is optional and has a default value of False. When this parameter is set to True, empty values (NaN) will be ignored for all operations, except for Multiply.
The rule needs to be applied to existing 2D/3D variables with or without time axis. A new 2D/3D variable with or without time axis is created when the rule is executed.
#EXAMPLE : Calculate bathymetry over time
# This is just an example, there is a variable bed level without time (mesh2d_flowelem_bl)
- combine_results_rule:
name: Calculate bathymetry
description: Calculate bathymetry over time by subtracting water depth from the water level
operation: subtract
input_variables: ["water_level","water_depth"]
output_variable: bathymetry_time
ignore_nan: True
Formula rule
FORMAT
- formula_rule:
name: <name_of_rule_in_text>
description: <description_of_rule_in_text>
formula: <formula_applied>
input_variables: [<list with_input_variable_names>]
output_variable: <one_output_variable_name>
With the formula-based rule, multiple variables can be combined in a flexible way. The standard Python operators are supported (see the tables below).
The rule needs to be applied to existing 2D/3D variables with or without time axis. A new 2D/3D variable with or without time axis is created when the rule is executed.
#EXAMPLE : Calculate bathymetry over time
# This is just an example, there is a variable bedlevel without time (mesh2d_flowelem_bl)
- formula_rule:
name: Calculate bathymetry
description: Calculate bathymetry over time by adding water level and water depth
formula: water_level + water_depth
input_variables: ["water_level","water_depth"]
output_variable: bathymetry_time
A lot of operators are supported with the formula based rule. Given two variables "x" and "y", formulas can be implemented for the following operators:
Operator | Name | Example |
---|---|---|
+ | Addition | x + y |
- | Subtraction | x - y |
* | Multiplication | x * y |
/ | Division | x / y |
% | Modulus | x % y |
** | Exponentiation | x ** y |
// | Floor division | x // y |
When a formula results in a boolean, it will be converted to a float, meaning that True = 1 and False = 0. Comparison, logical, identity, membership and bitwise operators are supported:
Operator | Name | Example |
---|---|---|
== | Equal | x == y |
!= | Not equal | x != y |
> | Greater than | x > y |
< | Less than | x < y |
>= | Greater than or equal to | x >= y |
<= | Less than or equal to | x <= y |
and | Returns True if both statements are true | x < 5 and x < 10 |
or | Returns True if one of the statements is true | x < 5 or x < 4 |
not | Reverse the result, returns False if the result is true | not(x < 5 and x < 10) |
is | Returns True if both variables are the same object | x is y |
is not | Returns True if both variables are not the same object | x is not y |
in | Returns True if a sequence with the specified value is present in the object | x in y |
not in | Returns True if a sequence with the specified value is not present in the object | x not in y |
Operator | Name | Description | Example |
---|---|---|---|
& | AND | Sets each bit to 1 if both bits are 1 | x & y |
| | OR | Sets each bit to 1 if one of two bits is 1 | x | y |
^ | XOR | Sets each bit to 1 if only one of two bits is 1 | x ^ y |
~ | NOT | Inverts all the bits | ~x |
<< | Zero fill left shift | Shift left by pushing zeros in from the right and let the leftmost bits fall off | x << 2 |
>> | Signed right shift | Shift right by pushing copies of the leftmost bit in from the left, and let the rightmost bits fall off | x >> 2 |
For more information on these operators click here.
It is also possible to use functions of the libraries math and numpy. These are accessible by calling their full module names inside the formula, for instance: "numpy.where(water_depth > 1.0)" or "math.ceil(water_level)".
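For illustration only, the snippet below shows how such a formula string can be evaluated against named variables with the math and numpy modules in scope. This is a simplified sketch (with made-up variable values), not the actual D-Eco Impact implementation.
# Simplified sketch of evaluating a formula string against named variables
import math
import numpy as np

variables = {"water_level": np.array([0.2, 1.4]), "water_depth": np.array([1.0, 3.5])}
formula = "numpy.where(water_depth > 1.0, water_level, 0.0)"
# only math and numpy are made available besides the input variables
result = eval(formula, {"math": math, "numpy": np}, dict(variables))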
(Multiple) Classification rule
FORMAT
- classification_rule:
name: <name_of_rule_in_text>
description: <description_of_rule_in_text>
criteria_table:
- [ "output" , <input_variable_name1>, <input_variable_name2>]
- [<response_value>, <criteria_range>, <criteria_range>]
- [<response_value>, <criteria_range>, <criteria_range>]
input_variables: [<list with_input_variable_names>]
output_variable: <one_output_variable_name>
The classification rule allows for a classification based on the range of one or multiple input variables. The value ranges can be indicated in multiple ways. This rule can be used for indicating suitability (0 or 1) or for specifying categories (1, 2, 3, etc.). The rule starts with the last given criteria-range row and works upwards, hence overwriting is possible. Currently there is no check on whether possible ranges have been missed or are overlapping.
The rule needs to be applied to existing 2D/3D variables with or without time axis. A new 2D/3D variable with or without time axis is created when the rule is executed.
Criteria ranges available are:
Criteria range | Example | Description |
---|---|---|
"-" | "-" | Value is not applicable to category, all is allowed |
"criteria_value" | "5" | Value is exectly the criteria value (only applicable for integers) |
">criteria_value" | ">1" | Value needs to larger than criteria value |
"<criteria_value" | "<0.5" | Value needs to be smaller than criteria value |
">criteria_value" | ">=1" | Value needs to larger than or equal to criteria value |
"<criteria_value" | "<=0.5" | Value needs to be smaller than or equal to criteria value |
"criteria_value1:criteria_value2" | "0.2:4" | Value needs to be equal or be in between criteria_value1 and criteria_value2 |
#EXAMPLE : Determine the suitability for aquatic vegetation based on classification
- classification_rule:
name: Classification for aquatic plants
description: Derive the classification for aquatic plants based on water depth, flow velocity and chloride levels
criteria_table:
- ["output", "MIN_water_depth_mNAP", "MAX_flow_velocity", "MAX_chloride"]
- [ 1 , "<0.10" , "-" , "-"] # too dry
- [ 2 , ">4.0" , "-" , "-"] # too deep
- [ 3 , "-" , "-" , ">400"] # too salty
- [ 4 , "-" , ">1.5" , "-"] # too fast flowing
- [ 5 , "0.10:4.0" , "0.0:1.5" , "0:400"] # perfect for aquatic plants
input_variables: ["MIN_water_depth_mNAP", "MAX_flow_velocity", "MAX_chloride"]
output_variable: aquatic_plant_classes
- classification_rule:
name: Suitability for aquatic plants
description: Derive the suitability for aquatic plants based on the classification
criteria_table:
- ["output", "aquatic_plant_classes"]
- [ 0 , "1:4"] # not suitable
- [ 1 , "5"] # suitable
input_variables: ["aquatic_plant_classes"]
output_variable: aquatic_plant_suitability
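To make the row order explicit: as described above, the rows are applied from the last one upwards, so rows higher in the criteria_table overwrite lower rows where ranges overlap. Below is a simplified, single-criterion numpy sketch of that behaviour (illustrative only, not the D-Eco Impact implementation).
# Illustrative sketch: apply criteria rows from the last row upwards
import numpy as np

water_depth = np.array([0.05, 2.0, 6.0])
aquatic_plant_classes = np.full(water_depth.shape, np.nan)
rows = [  # (output, condition) in table order, simplified to one criterion
    (1, water_depth < 0.10),                               # too dry
    (2, water_depth > 4.0),                                # too deep
    (5, (water_depth >= 0.10) & (water_depth <= 4.0)),     # suitable
]
for output, condition in reversed(rows):                   # last row first, work upwards
    aquatic_plant_classes = np.where(condition, output, aquatic_plant_classes)
# result: [1., 5., 2.]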
Rolling statistic rule
FORMAT
- rolling_statistics_rule:
name: <name_of_rule_in_text>
description: <description_of_rule_in_text>
operation: <statistic_operation_applied>
time_scale : <time_step_unit_applied>
period: <time_step_value_applied>
input_variable: <one_input_variable_name>
output_variable: <one_output_variable_name>
The rolling statistic rule calculates a rolling statistic based on the chosen operation and the time period over which the statistic should be repeated. The calculated statistic is written to the last timestep of each period. Operations available: Add, Average, Median, Min, Max, count_periods, Stdev and Percentile(n). When using percentile, add a number for the nth percentile in brackets, like this: percentile(10).
Time scales available: hour, day. The period can be a float or integer value.
The rule needs to be applied to an existing 2D/3D variables with time axis. A new 2D/3D variable with the same time axis is created when the rule is executed.
An explanation of how the rolling statistic rule works is shown in the table below:
timestep | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
---|---|---|---|---|---|---|---|---|
period1 | - | - | - | i | | | | |
period2 | | - | - | - | i | | | |
period3 | | | - | - | - | i | | |
In the example shown above the dashes indicate the time period covered (4 timesteps in this case) and the i marks where the result of the statistic over that period is written. Hence, the first three timesteps in this example will not contain any values. This is repeated until the whole time series has been covered.
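Conceptually this matches a rolling window along the time axis. Below is a minimal xarray sketch with a window of 4 timesteps, as in the table above, assuming the time dimension is named "time" (D-Eco Impact derives the window from the period and time scale itself).
# Minimal sketch of a rolling maximum over 4 timesteps; the first 3 timesteps stay NaN
import xarray as xr

ds = xr.open_dataset("examples/data/FM-VZM_0000_map.nc")
rolling_max = ds["mesh2d_sa1"].rolling(time=4).max()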
#EXAMPLE : Determine a rolling statistic over salinity levels
- rolling_statistics_rule:
name: test rolling statistic 12.5 hours
description: test rolling statistic 12.5 hours
operation: MAX
time_scale: hour
period: 12.5
input_variable: IN_salinity_PSU
output_variable: salinity_tl_hour_max
- rolling_statistics_rule:
name: test rolling statistic 7 days
description: test rolling statistic 7 days
operation: MAX
time_scale: day
period: 7
input_variable: IN_salinity_PSU
output_variable: salinity_tl_week_max
Axis filter rule
FORMAT
- axis_filter_rule:
name: <name_of_rule_in_text>
description: <description_of_rule_in_text>
axis_name: <name_of_axis_applied>
layer_number: <integer_nr_of_layer_in_axis_applied>
input_variable: <one_3D_input_variable_name>
output_variable: <one_output_variable_name>
The axis filter rule is similar to the layer_filter_rule, but it allows filtering on any axis present in the data. This allows for the selection of a specific time step, spatial cell or other data axis value.
The rule needs to be applied to an existing 2D/3D variable with or without time axis. A new 2D/3D variable with or without time axis is created when the rule is executed, without the axis that was filtered upon.
#EXAMPLE : Select only the salinity in the cell for the channel entrance from the faces
- axis_filter_rule:
name: Filter face of channel entrance (13th face cell)
description: Filter face of channel entrance (13th face cell)
axis_name: mesh2d_nFaces
layer_number: 13
input_variable: IN_salinity_PSU
output_variable: salinity_PSU_channel_entrance
Depth average rule
FORMAT
- depth_average_rule:
name: <name_of_rule_in_text>
description: <description_of_rule_in_text>
input_variable: <one_input_variable_name>
bed_level_variable: <variable_indicating_bed_level>
water_level_variable: <variable_indicating_water_level>
interfaces_variable: <variable_indicating_interfaces>
output_variable: <one_output_variable_name>
The depth average rule averages a variable over depth, weighting the values according to a mesh with z- or sigma-layers. The current implementation has only been tested for input netCDF files generated by D-Hydro. The input file must include a variable containing the location of the horizontal interfaces between the layers over which the input variable will be averaged. Two variables specifying the bed level and the water level are also needed. The input_variable can be a 2D/3D variable, with or without time axis. The output_variable has the same dimensions, excluding the depth dimension, as the result is one averaged value per cell.
Note: combined z-sigma layers are currently not supported.
An explanation of how the depth average rule works is given in the example below, for a simplified model with the following dimensions:
- mesh2d_nFaces = 6 (number of faces)
- mesh2d_nLayers = 4 (number of layers in the z direction)
- mesh2d_nInterfaces = 5 (number of interfaces that define the depth)
- time = 2
Below is an example of an input file for the depth average rule:
#EXAMPLE : Determine a depth average for over salinity
- depth_average_rule:
name: test depth average
description: Test depth average
input_variable: salinity
bed_level_variable: mesh2d_flowelem_bl
water_level_variable: mesh2d_s1
interfaces_variable: mesh2d_interfaces_sigma
output_variable: average_salinity
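For a single cell with z-layers, the weighted averaging amounts to clipping the layer interfaces to the actual water column (between bed level and water level) and weighting each layer value by its wet thickness. Below is a minimal sketch with made-up numbers for one cell; it only illustrates the idea, D-Eco Impact performs this per cell and per timestep.
# Illustrative depth-weighted average for one cell with z-layers (made-up numbers)
import numpy as np

interfaces = np.array([-4.0, -3.0, -2.0, -1.0, 0.0])   # layer interfaces (m NAP), bottom to top
salinity = np.array([10.0, 8.0, 6.0, 4.0])             # one value per layer
bed_level = -3.5                                        # m NAP
water_level = -0.25                                     # m NAP

clipped = np.clip(interfaces, bed_level, water_level)  # limit interfaces to the water column
thickness = np.diff(clipped)                            # wet thickness of each layer
depth_average = np.sum(salinity * thickness) / np.sum(thickness)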
Filter extremes rule
FORMAT
- filter_extremes_rule:
name: <name_of_rule_in_text>
description: <description_of_rule_in_text>
input_variable: <one_input_variable_name>
output_variable: <one_output_variable_name>
extreme_type: troughs or peaks
distance: <int_of_time_scale>
time_scale: second, hour, day, month or year
mask: <boolean>
The filter extremes rule allows for temporal filtering of extremes in a dataset, i.e. peaks (local maxima) and troughs (local minima). The input variable can have any dimensions, as long as it has a time dimension. If mask = False, the output is a variable with the same shape as the input, containing the values where peaks occur and NaN where no peak occurs. If mask = True, the output is a variable of the same shape with 1 (True) at the peak values and NaN elsewhere. Furthermore, the user can provide a distance (with a time scale) to define the minimum distance between two peaks/troughs. The resulting mask can be applied to another layer with the combine rule (operation: multiply).
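Outside the model, a comparable peak selection with a minimum distance between peaks can be made with scipy.signal.find_peaks. The sketch below (with a made-up series) only mirrors the idea of the rule and is not the D-Eco Impact implementation.
# Illustrative peak detection with a minimum distance between peaks
import numpy as np
from scipy.signal import find_peaks

water_level = np.array([0.0, 0.3, 0.1, -0.2, 0.4, 0.2, -0.1, 0.5, 0.1])
peak_indices, _ = find_peaks(water_level, distance=2)  # peaks at least 2 timesteps apart
mask = np.full(water_level.shape, np.nan)
mask[peak_indices] = 1.0                               # comparable to the output with mask: True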
Below is an example of an input file using the filter_extremes_rule.
#EXAMPLE : Determine the peak waterlevel values
- filter_extremes_rule:
name: test filter extremes
description: test filter extremes
input_variable: water_level
output_variable: water_level_mask
extreme_type: peaks
distance: 12
time_scale: hour
mask: True
The input above is part of a simple test to calculate the salinity at the peaks and troughs of the water level. The filter extremes rule is first used to get the locations of the peaks and troughs of the water level (mask = True), and then the combine rule is used to obtain the salinity values at these points. The figure below shows these results: the salinity (blue line) and the water level (orange line) are plotted, and the calculated peaks and troughs are shown in purple and green respectively. This example can be reproduced with an iPython notebook (D-EcoImpact/scripts/test_extreme_filter.ipynb), which also includes the input_file.yaml used for the calculation.
Including data from another YAML file
It is possible to include data in the YAML file that originates from another file. At the moment this is only applicable to another YAML file. This can be useful for storing large classification_rule tables in a separate file (for a better overview of the work file), but this functionality is not limited to that specific rule.
Here is the original rule:
#EXAMPLE : Original
# This is a simplified example, only top layer of flow velocity and chloride was used and year statistics
- classification_rule:
name: classification for aquatic plants
description: classification for aquatic plants based on water depth, flow velocity and chloride.
criteria_table:
- ["output", "MIN_water_depth_mNAP", "MAX_flow_velocity", "MAX_chloride"]
- [ 1 , "<0.10" , "-" , "-"] # too dry
- [ 2 , ">4.0" , "-" , "-"] # too deep
- [ 3 , "-" , "-" , ">400"] # too salty
- [ 4 , "-" , ">1.5" , "-"] # too fast flowing
- [ 5 , "0.10:4.0" , "0.0:1.5" , "0:400"] # perfect for aquatic plants
input_variables: ["MIN_water_depth_mNAP", "MAX_flow_velocity", "MAX_chloride"]
output_variable: aquatic_plant_classes
And this is the rule while making using of an inclusion from another file:
#EXAMPLE : With include from another file
# This is a simplified example, only top layer of flow velocity and chloride was used and year statistics
- classification_rule:
name: classification for aquatic plants
description: classification for aquatic plants based on water depth, flow velocity and chloride.
criteria_table: !include tables/aquatic_plant_criteria.yaml
input_variables: ["MIN_water_depth_mNAP", "MAX_flow_velocity", "MAX_chloride"]
output_variable: aquatic_plant_classes
And this is the included file from tables/aquatic_plant_criteria.yaml:
- ["output", "MIN_water_depth_mNAP", "MAX_flow_velocity", "MAX_chloride"]
- [ 1 , "<0.10" , "-" , "-"] # too dry
- [ 2 , ">4.0" , "-" , "-"] # too deep
- [ 3 , "-" , "-" , ">400"] # too salty
- [ 4 , "-" , ">1.5" , "-"] # too fast flowing
- [ 5 , "0.10:4.0" , "0.0:1.5" , "0:400"] # perfect for aquatic plants
Examples
VKZM (3D) case on water level and chloride policy
Based on the criteria set in the “Waterakkoord” by RWS, the water level in Lake Volkerak is, under normal conditions, not allowed to exceed 0.15 m NAP or to drop below -0.10 m NAP. In addition to this threshold, the chloride level should not exceed 450 mg/l (between mid-March and mid-September, as measured at “Bathse burg”). This case was simplified by testing that the chloride level does not exceed 450 mg/l at any moment in the year in the top layer of the model. The dry embankment areas and islands included in the model were not excluded from the assessment (and are therefore indicated in the result as locations where the water level is too high).
Meuse (2D) case on Potamogeton spp. habitat suitability
Based on the knowledge rules available through the KRW-Verkenner Rijkswateren for the aquatic plant species long-leaf pondweed (Potamogeton nodosus) and sago pondweed (Potamogeton pectinatus), the criteria for flow velocity and water depth were applied to a predictive hydrodynamic scenario of the Meuse river. Based on these knowledge rules the habitat suitability for both species was assessed. The results for P. nodosus are shown below.
WetenschappelijkeNaam | Compartiment | VariabeleNaam | Eenheid | Ondergrens | Bovengrens |
---|---|---|---|---|---|
Potamogeton nodosus | Omgeving | GemDiepte | m | 0,05 | 2 |
Potamogeton nodosus | Water | Stroomsnelheid | m/s | 0 | 2 |
Potamogeton nodosus | Water | Droogval | Categorie | 1 | 2 |
Potamogeton pectinatus | Omgeving | GemDiepte | m | 0,05 | 10 |
Potamogeton pectinatus | Water | Stroomsnelheid | m/s | 0 | 2,5 |
Potamogeton pectinatus | Water | Droogval | Categorie | 1 | 2 |
Habitat suitability criteria for flow velocity, water depth and desiccation for Potamogeton nodosus and Potamogeton pectinatus.
Development
Application overview
The application is set up using a layered architecture (see link).
To create the application you will need to create these three components: logger, data-access layer and model builder (see main.py).
# configure logger and data-access layer
logger: ILogger = LoggerFactory.create_logger()
da_layer: IDataAccessLayer = DataAccessLayer(logger)
model_builder = ModelBuilder(da_layer, logger)
# create and run application
application = Application(logger, da_layer, model_builder)
application.run(path)
The logger provides logging functionality to the application, such as reporting errors, warnings, user information and debug messages, and is created using a factory pattern. The DataAccessLayer gives the application access to the file system and allows for parsing of input and output. The ModelBuilder uses the builder pattern to create a model from an IModelData data object (created by the data-access layer).
Running the application
After constructing the application, the application should be ready to run. During the running of the application the following steps are executed:
- The application starts by reading the ModelData object from the input files via the IDataAccessLayer.
- This gets passed to the IModelBuilder to convert the ModelData into an IModel that can be run.
- The static ModelRunner will then be called to run the created IModel and do the real computation.
Model run
When the ModelRunner run_model command is executed, the following steps are performed (using RuleBasedModel and ICellBasedRule as an example):
- The ModelRunner starts by validating the model (RuleBasedModel in this example).
- The RuleBasedModel delegates the validation to the set of rules that it is composed of, calling validate on every rule (ICellBasedRule in this example).
- After the model is successfully validated, the initialize of the model is called. In the case of the RuleBasedModel, this creates an instance of the RuleProcessor and initializes it.
- The ModelRunner continues by calling the execute method on the RuleBasedModel, which in turn calls process_rules on the RuleBasedProcessor.
- This method loops over all the specified rules and executes them based on their type. For example, with the ICellBasedRule the RuleBasedProcessor will loop over all the cells and call the ICellBasedRule execute method for every cell.
- When the model has successfully finished the execute step, the finalize method is called on the model to clean up all resources.
Class diagram
Development D-Eco Impact
Workflow
Developer:
1. Move the Jira issue you want to work on from "To do" to "In progress". (The issue should be in the sprint; if not, please discuss changing the sprint scope with the product owner.)
2. Create a development branch from the main branch with a name based on that of the issue: feat[issue id] {summary of the issue}, for example: feat[DEI-123] Improve functionality A. Then switch your local copy to the development branch.
3. Commit the necessary changes with clear messages on what has been done.
4. Verify that all checks have passed (a green checkmark is shown, not a red cross). If one or more checks fail, they must be fixed before continuing.
5. Once all checks pass, check whether there are any changes in the main branch. If so, merge them into the development branch, fix any conflicts in the code, and then go back to point 4 of this list.
6. Move the issue from "In progress" to "In review" and create a pull request with the name of the branch previously assigned: feat[issue id] {summary of the issue}.
Reviewer:
1. Change the status of the issue from "In review" to "Being reviewed". This should automatically make you the assignee.
2. Look at the development details of the issue.
3. Open the linked pull request in GitHub.
4. Change the reviewer to yourself if this did not already happen in point 1.
5. Go to the "Files changed" tab to see the modifications implemented for the issue.
6. Add your review comments (see the "comment on a PR" documentation). Some points to analyse during the review are:
   - Does the code work, including corner cases?
   - Is the code in the right place?
   - Is it readable?
   - Is the code documented (all public methods and classes should have docstrings)?
   - Are naming conventions used properly?
   - Is there any duplication of code?
   - Is the code maintainable?
   - Is the code covered by tests?
   - Are all tests and checks green?
   - Are the commit messages clear enough and do they satisfy the conventions?
7. Set the status of the review (comment, approve or request changes).
8. Change the status of the issue in Jira correspondingly:
   - Approved -> In Test
   - Request changes -> To do
   - Comment -> In review (with the developer as assignee)
Tester:
1. Change the issue status from "In test" to "Being tested". This should make you the assignee.
2. For a bug or improvement, check out the main branch and try to reproduce the issue or get familiar with the previous functionality.
3. Change your local checkout to the development branch (from which the pull request was created).
4. Test the new functionality or bug fix by running the main script from Python in a clean Python environment.
5. Try to think of situations or conditions that may have been overlooked in the implementation, and test these as well.
6. Add comments in the issue with your findings (OK, or not OK because ...). Describe your findings in enough detail that other people can easily reproduce any problems found. If needed, provide any required (additional) data.
7. Move the issue in Jira to the new corresponding state:
   - If the test is OK: Merge.
   - If the test is not OK: To do.
If the test is successful:
1. Go to the pull request on GitHub.
2. Check whether there are merge conflicts (shown by GitHub) and whether the development branch is up to date with the main branch.
   - If any merge conflicts are reported, check with the developer to resolve them.
   - If the branch has no merge conflicts but is not up to date, press the "Update branch" button.
3. If the branch is up to date and has no merge conflicts, you can merge the pull request into the main branch.
4. Change the issue status in Jira from "Merge" to "Validate".
5. Change your local checkout to the main branch and do a few checks to see whether the merge was correct.
6. If the merge was successful, change the issue status in Jira from "Validate" to "Done".
Agreements
Coding:
- We use the PEP8 style guide for Python development.
- We use typing where possible.
- We avoid using global variables.
- We use encapsulation by only making the necessary imports and variables public.
- For testing, we use the pytest module.
- For checking the style guide, we use flake8 and pylint.
- For managing external dependencies, we use poetry (.toml file).
- We prefer to use VS Code for development (sharing settings using the vscode folder) with the following plugins:
API Reference
decoimpact
business
application
Module for Application class
!!! classes Application
Application
Application for running command-line
Source code in business/application.py
class Application:
"""Application for running command-line"""
# get version
APPLICATION_VERSION = read_version_number()
APPLICATION_NAME = "D-EcoImpact"
# separate version into major, minor and patch:
APPLICATION_VERSION_PARTS = list(map(int, APPLICATION_VERSION.split(".", 2)))
def __init__(
self,
logger: ILogger,
da_layer: IDataAccessLayer,
model_builder: IModelBuilder,
):
"""Creates an application based on provided logger, data-access layer
and model builder
Args:
logger (ILogger): Logger that takes care of logging
da_layer (IDataAccessLayer): data-access layer for reading/writing
model_builder (IModelBuilder): builder for creating a model based on
IModelData
"""
self._logger = logger
self._da_layer = da_layer
self._model_builder = model_builder
def run(self, input_path: Path):
"""Runs application
Args:
input_path (Path): path to input file
"""
try:
# show application version
self._logger.log_info(f"Application version: {self.APPLICATION_VERSION}")
# read input file
model_data: IModelData = self._da_layer.read_input_file(input_path)
str_input_version = "".join([str(x) + "." for x in model_data.version])[:-1]
self._logger.log_info(f"Input file version: {str_input_version}")
# check version:
message = (
f"Application version {self.APPLICATION_VERSION} is older"
" than version from input file {str_input_version}"
)
# major version (app) should be equal or larger then input version --> error
if self.APPLICATION_VERSION_PARTS[0] < model_data.version[0]:
self._logger.log_error(message)
# minor version (app) should be equal or larger then input version --> warn
elif self.APPLICATION_VERSION_PARTS[1] < model_data.version[1]:
self._logger.log_warning(message)
# build model
for dataset in model_data.datasets:
input_files = self._da_layer.retrieve_file_names(dataset.path)
output_path_base = Path(model_data.output_path)
for key, file_name in input_files.items():
dataset.path = file_name
output_path = self._generate_output_path(output_path_base, key)
model_data.partition = key
model = self._model_builder.build_model(model_data)
# run model
_ModelRunner.run_model(model, self._logger)
# write output file
if model.status == _ModelStatus.FINALIZED:
settings = OutputFileSettings(
self.APPLICATION_NAME, self.APPLICATION_VERSION
)
settings.variables_to_save = model_data.output_variables
self._da_layer.write_output_file(
model.output_dataset, output_path, settings
)
except Exception as exc: # pylint: disable=broad-except
self._logger.log_error(f"Exiting application after error: {exc}")
def _generate_output_path(self, output_path_base, key):
if "*" in output_path_base.stem:
output_path = Path(str(output_path_base).replace("*", key))
else:
partition_part = ""
if key:
partition_part = f"_{key}"
output_path = Path.joinpath(
output_path_base.parent,
f"{output_path_base.stem}{partition_part}{output_path_base.suffix}",
)
return output_path
__init__(self, logger, da_layer, model_builder)
special
Creates an application based on provided logger, data-access layer and model builder
Parameters:
Name | Type | Description | Default |
---|---|---|---|
logger | ILogger | Logger that takes care of logging | required |
da_layer | IDataAccessLayer | data-access layer for reading/writing | required |
model_builder | IModelBuilder | builder for creating a model based on IModelData | required |
Source code in business/application.py
def __init__(
self,
logger: ILogger,
da_layer: IDataAccessLayer,
model_builder: IModelBuilder,
):
"""Creates an application based on provided logger, data-access layer
and model builder
Args:
logger (ILogger): Logger that takes care of logging
da_layer (IDataAccessLayer): data-access layer for reading/writing
model_builder (IModelBuilder): builder for creating a model based on
IModelData
"""
self._logger = logger
self._da_layer = da_layer
self._model_builder = model_builder
run(self, input_path)
Runs application
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_path | Path | path to input file | required |
Source code in business/application.py
def run(self, input_path: Path):
"""Runs application
Args:
input_path (Path): path to input file
"""
try:
# show application version
self._logger.log_info(f"Application version: {self.APPLICATION_VERSION}")
# read input file
model_data: IModelData = self._da_layer.read_input_file(input_path)
str_input_version = "".join([str(x) + "." for x in model_data.version])[:-1]
self._logger.log_info(f"Input file version: {str_input_version}")
# check version:
message = (
f"Application version {self.APPLICATION_VERSION} is older"
" than version from input file {str_input_version}"
)
# major version (app) should be equal or larger then input version --> error
if self.APPLICATION_VERSION_PARTS[0] < model_data.version[0]:
self._logger.log_error(message)
# minor version (app) should be equal or larger then input version --> warn
elif self.APPLICATION_VERSION_PARTS[1] < model_data.version[1]:
self._logger.log_warning(message)
# build model
for dataset in model_data.datasets:
input_files = self._da_layer.retrieve_file_names(dataset.path)
output_path_base = Path(model_data.output_path)
for key, file_name in input_files.items():
dataset.path = file_name
output_path = self._generate_output_path(output_path_base, key)
model_data.partition = key
model = self._model_builder.build_model(model_data)
# run model
_ModelRunner.run_model(model, self._logger)
# write output file
if model.status == _ModelStatus.FINALIZED:
settings = OutputFileSettings(
self.APPLICATION_NAME, self.APPLICATION_VERSION
)
settings.variables_to_save = model_data.output_variables
self._da_layer.write_output_file(
model.output_dataset, output_path, settings
)
except Exception as exc: # pylint: disable=broad-except
self._logger.log_error(f"Exiting application after error: {exc}")
entities
i_model
Module for IModel Interface
!!! interfaces IModel
!!! classes ModelStatus
IModel (ABC )
Interface for models
Source code in entities/i_model.py
class IModel(ABC):
"""Interface for models"""
@property
@abstractmethod
def name(self) -> str:
"""Name of the model"""
@property
@abstractmethod
def status(self) -> ModelStatus:
"""Status of the model"""
@status.setter
@abstractmethod
def status(self, status: ModelStatus):
"""Status of the model"""
@property
@abstractmethod
def input_datasets(self) -> List[_xr.Dataset]:
"""Input datasets for the model"""
@property
@abstractmethod
def output_dataset(self) -> _xr.Dataset:
"""Output dataset produced by this model"""
@property
def partition(self) -> str:
"""partition of the model"""
@partition.setter
def partition(self, partition: str):
"""partition of the model"""
@abstractmethod
def validate(self, logger: ILogger) -> bool:
"""Validates the model"""
@abstractmethod
def initialize(self, logger: ILogger) -> None:
"""Initializes the model"""
@abstractmethod
def execute(self, logger: ILogger) -> None:
"""Executes the model"""
@abstractmethod
def finalize(self, logger: ILogger) -> None:
"""Finalizes the model"""
input_datasets: List[xarray.core.dataset.Dataset] (property, readonly): Input datasets for the model
name: str (property, readonly): Name of the model
output_dataset: Dataset (property, readonly): Output dataset produced by this model
partition: str (property, writable): partition of the model
status: ModelStatus (property, writable): Status of the model
execute(self, logger)
Executes the model
Source code in entities/i_model.py
@abstractmethod
def execute(self, logger: ILogger) -> None:
"""Executes the model"""
finalize(self, logger)
Finalizes the model
Source code in entities/i_model.py
@abstractmethod
def finalize(self, logger: ILogger) -> None:
"""Finalizes the model"""
initialize(self, logger)
Initializes the model
Source code in entities/i_model.py
@abstractmethod
def initialize(self, logger: ILogger) -> None:
"""Initializes the model"""
validate(self, logger)
Validates the model
Source code in entities/i_model.py
@abstractmethod
def validate(self, logger: ILogger) -> bool:
"""Validates the model"""
ModelStatus (Enum )
Enum for the model status
Source code in entities/i_model.py
class ModelStatus(Enum):
"""Enum for the model status"""
CREATED = auto()
INITIALIZING = auto()
INITIALIZED = auto()
EXECUTING = auto()
EXECUTED = auto()
FINALIZING = auto()
FINALIZED = auto()
FAILED = auto()
VALIDATING = auto()
VALIDATED = auto()
rule_based_model
Module for RuleBasedModel class
!!! classes RuleBasedModel
RuleBasedModel (IModel )
Model class for models based on rules
Source code in entities/rule_based_model.py
class RuleBasedModel(IModel):
"""Model class for models based on rules"""
# pylint: disable=too-many-arguments
# pylint: disable=too-many-positional-arguments
# pylint: disable=too-many-instance-attributes
def __init__(
self,
input_datasets: List[_xr.Dataset],
rules: List[IRule],
mapping: Optional[dict[str, str]] = None,
name: str = "Rule-Based model",
partition: str = "",
) -> None:
self._name = name
self._status = ModelStatus.CREATED
self._rules = rules
self._input_datasets: List[_xr.Dataset] = input_datasets
self._output_dataset: _xr.Dataset
self._rule_processor: Optional[RuleProcessor]
self._mappings = mapping
self._partition = partition
@property
def name(self) -> str:
"""Name of the model"""
return self._name
@property
def status(self) -> ModelStatus:
"""Status of the model"""
return self._status
@status.setter
def status(self, status: ModelStatus):
"""Status of the model"""
self._status = status
@property
def rules(self) -> List[IRule]:
"""Rules to execute"""
return self._rules
@property
def input_datasets(self) -> List[_xr.Dataset]:
"""Input datasets for the model"""
return self._input_datasets
@property
def output_dataset(self) -> _xr.Dataset:
"""Output dataset produced by this model"""
return self._output_dataset
@property
def partition(self) -> str:
"""partition of the model"""
return self._partition
@partition.setter
def partition(self, partition: str):
"""partition of the model"""
self._partition = partition
def validate(self, logger: ILogger) -> bool:
"""Validates the model"""
valid = True
if len(self._input_datasets) < 1:
logger.log_error("Model does not contain any datasets.")
valid = False
if len(self._rules) < 1:
logger.log_error("Model does not contain any rules.")
valid = False
for rule in self._rules:
valid = rule.validate(logger) and valid
if self._mappings is not None:
valid = self._validate_mappings(self._mappings, logger) and valid
return valid
def initialize(self, logger: ILogger) -> None:
"""Initializes the model.
Creates an output dataset which contains the necessary variables obtained
from the input dataset.
"""
self._output_dataset = _du.create_composed_dataset(
self._input_datasets, self._make_output_variables_list(), self._mappings
)
self._rule_processor = RuleProcessor(self._rules, self._output_dataset)
if not self._rule_processor.initialize(logger):
logger.log_error("Initialization failed.")
def execute(self, logger: ILogger) -> None:
"""Executes the model"""
if self._rule_processor is None:
raise RuntimeError("Processor is not set, please initialize model.")
self._output_dataset = self._rule_processor.process_rules(
self._output_dataset, logger
)
def finalize(self, logger: ILogger) -> None:
"""Finalizes the model"""
logger.log_debug("Finalize the rule processor.")
self._rule_processor = None
def _make_output_variables_list(self) -> list:
"""Make the list of variables to be contained in the output dataset.
A list of variables needed is obtained from the dummy variable and
the dependent variables are recursively looked up. This is done to
support XUgrid and to prevent invalid topologies.
This also allows QuickPlot to visualize the results.
Args:
-
Returns:
list[str]: dummy, dependent, mapping and rule input variables
"""
for dataset in self._input_datasets:
dummy_var_name = _du.get_dummy_variable_in_ugrid(dataset)
var_list = _du.get_dependent_var_list(dataset, dummy_var_name)
mapping_keys = list((self._mappings or {}).keys())
rule_names = [rule.name for rule in self._rules]
all_inputs = self._get_direct_rule_inputs(rule_names)
all_input_variables = _lu.flatten_list(list(all_inputs.values()))
all_vars = var_list + mapping_keys + all_input_variables
return _lu.remove_duplicates_from_list(all_vars)
# pylint: disable=too-many-locals
def _validate_mappings(self, mappings: dict[str, str], logger: ILogger) -> bool:
"""Checks if the provided mappings are valid.
Args:
mappings (dict[str, str]): mappings to check
logger (ILogger): logger for logging messages
Returns:
bool: if mappings are valid
"""
input_vars = _lu.flatten_list(
[
_lu.flatten_list([_du.list_vars(ds), _du.list_coords(ds)])
for ds in self._input_datasets
]
)
valid = True
# check if mapping keys are available in the input datasets
mapping_vars_expected = list(mappings.keys())
missing_vars = _lu.items_not_in(mapping_vars_expected, input_vars)
if len(missing_vars) > 0:
logger.log_error(
"Could not find mapping variables "
f"'{', '.join(missing_vars)}' in any input dataset."
)
valid = False
# check for duplicates that will be created because of mapping
duplicates_created = _lu.items_in(list(mappings.values()), input_vars)
if len(duplicates_created) > 0:
logger.log_error(
"Mapping towards the following variables "
f"'{', '.join(duplicates_created)}', will create duplicates with"
" variables in the input datasets."
)
valid = False
rule_names = [rule.name for rule in self._rules]
rule_inputs = self._get_direct_rule_inputs(rule_names)
# check for missing rule inputs
for rule_name, rule_input in rule_inputs.items():
needed_rule_inputs = _lu.remove_duplicates_from_list(rule_input)
rule_input_vars = input_vars + list(mappings.values())
missing_rule_inputs = _lu.items_not_in(needed_rule_inputs, rule_input_vars)
if len(missing_rule_inputs) > 0:
logger.log_error(
f"Missing the variables '{', '.join(missing_rule_inputs)}' that "
f"are required by '{rule_name}'."
)
valid = False
return valid
def _get_direct_rule_inputs(self, rule_names) -> Dict[str, List[str]]:
"""Gets the input variables directly needed by rules from
input datasets.
Returns:
Dict[str, List[str]]
"""
rule_input_vars = [rule.input_variable_names for rule in self._rules]
rule_output_vars = [rule.output_variable_name for rule in self._rules]
needed_input_per_rule = {}
for index, inputs_per_rule in enumerate(rule_input_vars):
needed_input_per_rule[rule_names[index]] = _lu.items_not_in(
inputs_per_rule, rule_output_vars
)
return needed_input_per_rule
- input_datasets: List[xarray.core.dataset.Dataset] (property, readonly): Input datasets for the model
- name: str (property, readonly): Name of the model
- output_dataset: Dataset (property, readonly): Output dataset produced by this model
- partition: str (property, writable): Partition of the model
- rules: List[decoimpact.business.entities.rules.i_rule.IRule] (property, readonly): Rules to execute
- status: ModelStatus (property, writable): Status of the model
execute(self, logger)
Executes the model
Source code in entities/rule_based_model.py
def execute(self, logger: ILogger) -> None:
"""Executes the model"""
if self._rule_processor is None:
raise RuntimeError("Processor is not set, please initialize model.")
self._output_dataset = self._rule_processor.process_rules(
self._output_dataset, logger
)
finalize(self, logger)
Finalizes the model
Source code in entities/rule_based_model.py
def finalize(self, logger: ILogger) -> None:
"""Finalizes the model"""
logger.log_debug("Finalize the rule processor.")
self._rule_processor = None
initialize(self, logger)
Initializes the model. Creates an output dataset which contains the necessary variables obtained from the input dataset.
Source code in entities/rule_based_model.py
def initialize(self, logger: ILogger) -> None:
"""Initializes the model.
Creates an output dataset which contains the necessary variables obtained
from the input dataset.
"""
self._output_dataset = _du.create_composed_dataset(
self._input_datasets, self._make_output_variables_list(), self._mappings
)
self._rule_processor = RuleProcessor(self._rules, self._output_dataset)
if not self._rule_processor.initialize(logger):
logger.log_error("Initialization failed.")
validate(self, logger)
Validates the model
Source code in entities/rule_based_model.py
def validate(self, logger: ILogger) -> bool:
"""Validates the model"""
valid = True
if len(self._input_datasets) < 1:
logger.log_error("Model does not contain any datasets.")
valid = False
if len(self._rules) < 1:
logger.log_error("Model does not contain any rules.")
valid = False
for rule in self._rules:
valid = rule.validate(logger) and valid
if self._mappings is not None:
valid = self._validate_mappings(self._mappings, logger) and valid
return valid
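Taken together, the listing above shows the full life cycle of a rule-based model: validate the configuration, initialize (compose the output dataset and build the RuleProcessor), execute the rules, and finalize. The sketch below illustrates that life cycle directly from Python. It is a minimal, hedged example: the module paths, the input file and the PrintLogger stand-in for ILogger are assumptions; only the constructor and method signatures follow from the source shown above. In normal use the model and its rules are built from the YAML input file rather than by hand.

```python
# Minimal life-cycle sketch; module paths, input file and logger are assumptions.
import xarray as xr
from decoimpact.business.entities.rule_based_model import RuleBasedModel    # assumed path
from decoimpact.business.entities.rules.multiply_rule import MultiplyRule   # assumed path


class PrintLogger:
    """Stand-in for ILogger; only the four log methods used by the model are implemented."""
    def log_debug(self, message): print("DEBUG:", message)
    def log_info(self, message): print("INFO:", message)
    def log_warning(self, message): print("WARNING:", message)
    def log_error(self, message): print("ERROR:", message)


dataset = xr.open_dataset("input_ugrid.nc")                    # hypothetical UGRID input file
rule = MultiplyRule("salinity_factor", ["salinity"], [[2.0]])  # multiply 'salinity' by 2

model = RuleBasedModel([dataset], [rule], name="example model")
logger = PrintLogger()

if model.validate(logger):
    model.initialize(logger)   # composes the output dataset and builds the RuleProcessor
    model.execute(logger)      # runs the rules and adds their results to the output dataset
    model.finalize(logger)
    print(model.output_dataset)
```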
rule_processor
Module for RuleProcessor class
!!! classes RuleProcessor
RuleProcessor
Model class for processing models based on rules
Source code in entities/rule_processor.py
class RuleProcessor:
"""Model class for processing models based on rules"""
def __init__(self, rules: List[IRule], dataset: _xr.Dataset) -> None:
"""Creates instance of a rule processor using the provided
rules and input datasets
Args:
rules (List[IRule]): rules to process
input_dataset (_xr.Dataset): input dataset to use
"""
if len(rules) < 1:
raise ValueError("No rules defined.")
if dataset is None:
raise ValueError("No datasets defined.")
self._rules = rules
self._input_dataset = dataset
self._processing_list: List[List[IRule]] = []
def initialize(self, logger: ILogger) -> bool:
"""Creates an ordered list of rule arrays, where every rule array
contains rules that can be processed simultaneously.
Args:
logger (ILogger): logger for reporting messages
Returns:
bool: A boolean to indicate if all the rules can be processed.
"""
inputs: List[str] = []
inputs = _lu.flatten_list(
[_du.list_vars(self._input_dataset), _du.list_coords(self._input_dataset)]
)
tree, success = self._create_rule_sets(inputs, self._rules, [], logger)
if success:
self._processing_list = tree
return success
def process_rules(
self, output_dataset: _xr.Dataset, logger: ILogger
) -> _xr.Dataset:
"""Processes the rules defined in the initialize method
and adds the results to the provided output_dataset.
Args:
output_dataset (_xr.Dataset): Dataset to place the rule
results into
logger (ILogger): logger for reporting messages
Raises:
RuntimeError: if initialization is not correctly done
"""
if len(self._processing_list) < 1:
message = "Processor is not properly initialized, please initialize."
raise RuntimeError(message)
for rule_set in self._processing_list:
for rule in rule_set:
logger.log_info(f"Starting rule {rule.name}")
rule_result = self._execute_rule(rule, output_dataset, logger)
output_name = rule.output_variable_name
output_dataset[output_name] = (
rule_result.dims,
rule_result.values,
rule_result.attrs,
rule_result.coords,
)
for coord_key in rule_result.coords:
# the coord_key is overwritten in case we don't have the if
# statement below
if coord_key not in output_dataset.coords:
output_dataset = output_dataset.assign_coords(
{coord_key: rule_result[coord_key]}
)
return output_dataset
def _create_rule_sets(
self,
inputs: List[str],
unprocessed_rules: List[IRule],
current_tree: List[List[IRule]],
logger: ILogger,
) -> Tuple[List[List[IRule]], bool]:
"""Creates an ordered list of rule-sets that can be processed in parallel.
Args:
inputs (List[str]): input names that are available to rules
unprocessed_rules (List[IRule]): rules that still need to be handled
current_tree (List[List[IRule]]): the current output list state
logger (ILogger): logger for logging messages
Returns:
Tuple[List[List[IRule]], bool]: Ordered list of rule-sets
"""
solvable_rules = self._get_solvable_rules(inputs, unprocessed_rules)
if len(solvable_rules) == 0:
rules_list = [str(rule.name) for rule in unprocessed_rules]
rules_text = ", ".join(rules_list)
logger.log_warning(f"Some rules can not be resolved: {rules_text}")
return [], False
for rule in solvable_rules:
unprocessed_rules.remove(rule)
inputs.append(rule.output_variable_name)
current_tree.append(solvable_rules)
if len(unprocessed_rules) > 0:
return self._create_rule_sets(
inputs, unprocessed_rules, current_tree, logger
)
return current_tree, True
def _get_solvable_rules(
self, inputs: List[str], unprocessed_rules: List[IRule]
) -> List[IRule]:
"""Checks which rules can be resolved using the provided "inputs" list.
Args:
inputs (List[str]): available inputs to resolve rules with
unprocessed_rules (List[IRule]): rules that need to be checked
Returns:
List[IRule]: list of rules that can be resolved with the provided inputs
"""
solvable_rules: List[IRule] = []
for rule in unprocessed_rules:
names = rule.input_variable_names
if all(name in inputs for name in names):
solvable_rules.append(rule)
return solvable_rules
def _execute_rule(
self, rule: IRule, output_dataset: _xr.Dataset, logger: ILogger
) -> _xr.DataArray:
"""Processes the rule with the provided dataset.
Returns:
_xr.DataArray: result data set
"""
variable_lookup = dict(self._get_rule_input_variables(rule, output_dataset))
variables = list(variable_lookup.values())
if isinstance(rule, IMultiArrayBasedRule):
result = rule.execute(variable_lookup, logger)
# set output attributes, based on first array
self._set_output_attributes(rule, result, variables[0])
return result
if isinstance(rule, IMultiCellBasedRule):
result = self._process_by_multi_cell(rule, variable_lookup, logger)
self._set_output_attributes(rule, result, variables[0])
return result
if len(variables) != 1:
raise NotImplementedError("Array based rule only supports one input array.")
input_variable = variables[0]
if isinstance(rule, IArrayBasedRule):
result = rule.execute(input_variable, logger)
self._set_output_attributes(rule, result, input_variable)
return result
if isinstance(rule, ICellBasedRule):
result = self._process_by_cell(rule, input_variable, logger)
self._set_output_attributes(rule, result, input_variable)
return result
raise NotImplementedError(f"Can not execute rule {rule.name}.")
def _set_output_attributes(
self, rule: IRule, result: _xr.DataArray, input_variable: _xr.DataArray
):
self._copy_definition_attributes(input_variable, result)
result.attrs["long_name"] = rule.output_variable_name
result.attrs["standard_name"] = rule.output_variable_name
def _copy_definition_attributes(
self, source_array: _xr.DataArray, target_array: _xr.DataArray
) -> None:
attributes_to_copy = ["location", "mesh"]
for attribute_name in attributes_to_copy:
target_array.attrs[attribute_name] = get_dict_element(
attribute_name, source_array.attrs, False
)
def _process_by_cell(
self, rule: ICellBasedRule, input_variable: _xr.DataArray, logger: ILogger
) -> _xr.DataArray:
"""Processes every value of the input_variable and creates a
new one from it
Args:
rule (ICellBasedRule): rule to process
input_variable (_xr.DataArray): input variable/data
logger (ILogger): logger for log messages
Returns:
_xr.DataArray: _description_
"""
np_array = input_variable.to_numpy()
result_variable = _np.zeros_like(np_array)
# define variables to count value exceedings (for some rules): min and max
warning_counter_total = [0, 0]
# execute rule and gather warnings for exceeded values (for some rules)
for indices, value in _np.ndenumerate(np_array):
result_variable[indices], warning_counter = rule.execute(value, logger)
# update total counter for both min and max
warning_counter_total[0] += warning_counter[0]
warning_counter_total[1] += warning_counter[1]
# show warnings values outside range (for some rules):
if warning_counter_total[0] > 0:
logger.log_warning(
f"value less than min: {warning_counter_total[0]} occurence(s)"
)
if warning_counter_total[1] > 0:
logger.log_warning(
f"value greater than max: {warning_counter_total[1]} occurence(s)"
)
# use copy to get the same dimensions as the
# original input variable
return input_variable.copy(data=result_variable)
def _process_by_multi_cell(
self,
rule: IMultiCellBasedRule,
input_variables: Dict[str, _xr.DataArray],
logger: ILogger,
) -> _xr.DataArray:
"""Processes every value of the input_variable and creates a
new one from it
Args:
rule (IMultiCellBasedRule): rule to process
input_variables (_xr.DataArray): input variables/data
logger (ILogger): logger for log messages
Returns:
_xr.DataArray: _description_
"""
if len(input_variables) < 1:
raise NotImplementedError(
f"Can not execute rule {rule.name} with no input variables."
)
value_arrays = list(input_variables.values())
# Check the amount of dimensions of all variables
len_dims = _np.array([len(vals.dims) for vals in value_arrays])
# Use the variable with the most dimensions. Broadcast all other
# variables to these dimensions
most_dims_bool = len_dims == max(len_dims)
ref_var = value_arrays[_np.argmax(len_dims)]
for ind_vars, enough_dims in enumerate(most_dims_bool):
if not enough_dims:
var_orig = value_arrays[ind_vars]
value_arrays[ind_vars] = self._expand_dimensions_of_variable(
var_orig, ref_var, logger
)
# Check if all variables now have the same dimensions
self._check_variable_dimensions(value_arrays, rule)
result_variable = _np.zeros_like(ref_var.to_numpy())
cell_values = {}
for indices, _ in _np.ndenumerate(ref_var.to_numpy()):
for value in value_arrays:
cell_values[value.name] = value.data[indices]
result_variable[indices] = rule.execute(cell_values, logger)
# use copy to get the same dimensions as the
# original input variable
return ref_var.copy(data=result_variable)
def _get_rule_input_variables(
self, rule: IRule, output_dataset: _xr.Dataset
) -> Iterable[Tuple[str, _xr.DataArray]]:
input_variable_names = rule.input_variable_names
for input_variable_name in input_variable_names:
yield input_variable_name, self._get_variable_by_name(
input_variable_name, output_dataset
)
def _get_variable_by_name(
self, name: str, output_dataset: _xr.Dataset
) -> _xr.DataArray:
# search output dataset (generated output)
if name in output_dataset:
return output_dataset[name]
raise KeyError(
f"Key {name} was not found in input datasets or "
"in calculated output dataset.",
)
def _check_variable_dimensions(
self, value_arrays: List[_xr.DataArray], rule: IMultiCellBasedRule
):
for val_index in range(len(value_arrays) - 1):
var1 = value_arrays[val_index]
var2 = value_arrays[val_index + 1]
diff = set(var1.dims) ^ set(var2.dims)
# If the variables with the most dimensions have different dimensions,
# stop the calculation
if len(diff) != 0:
raise NotImplementedError(
f"Can not execute rule {rule.name} with variables with different \
dimensions. Variable {var1.name} with dimensions:{var1.dims} is \
different than {var2.name} with dimensions:{var2.dims}"
)
def _expand_dimensions_of_variable(
self, var_orig: _xr.DataArray, ref_var: _xr.DataArray, logger: ILogger
):
"""Creates a new data-array with the values of the var_org expanded to
include all dimensions of the ref_var
Args:
var_orig (_xr.DataArray): variable to expand with extra dimensions
ref_var (_xr.DataArray): reference variable to synchronize the
dimensions with
logger (ILogger): logger for logging messages
"""
# Let the user know which variables will be broadcast to all dimensions
dims_orig = var_orig.dims
dims_result = ref_var.dims
dims_diff = [str(x) for x in dims_result if x not in dims_orig]
str_dims_broadcasted = ",".join(dims_diff)
logger.log_info(
f"""Variable {var_orig.name} will be expanded to the following \
dimensions: {str_dims_broadcasted} """
)
# perform the broadcast
var_broadcasted = _xr.broadcast(var_orig, ref_var)[0]
# Make sure the dimensions are in the same order
return var_broadcasted.transpose(*ref_var.dims)
__init__(self, rules, dataset)
special
Creates an instance of a rule processor using the provided rules and input dataset
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| rules | List[IRule] | rules to process | required |
| input_dataset | _xr.Dataset | input dataset to use | required |
Source code in entities/rule_processor.py
def __init__(self, rules: List[IRule], dataset: _xr.Dataset) -> None:
"""Creates instance of a rule processor using the provided
rules and input datasets
Args:
rules (List[IRule]): rules to process
input_dataset (_xr.Dataset): input dataset to use
"""
if len(rules) < 1:
raise ValueError("No rules defined.")
if dataset is None:
raise ValueError("No datasets defined.")
self._rules = rules
self._input_dataset = dataset
self._processing_list: List[List[IRule]] = []
initialize(self, logger)
Creates an ordered list of rule arrays, where every rule array contains rules that can be processed simultaneously.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| logger | ILogger | logger for reporting messages | required |

Returns:

| Type | Description |
|---|---|
| bool | A boolean to indicate if all the rules can be processed. |
Source code in entities/rule_processor.py
def initialize(self, logger: ILogger) -> bool:
"""Creates an ordered list of rule arrays, where every rule array
contains rules that can be processed simultaneously.
Args:
logger (ILogger): logger for reporting messages
Returns:
bool: A boolean to indicate if all the rules can be processed.
"""
inputs: List[str] = []
inputs = _lu.flatten_list(
[_du.list_vars(self._input_dataset), _du.list_coords(self._input_dataset)]
)
tree, success = self._create_rule_sets(inputs, self._rules, [], logger)
if success:
self._processing_list = tree
return success
process_rules(self, output_dataset, logger)
Processes the rules defined in the initialize method and adds the results to the provided output_dataset.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| output_dataset | _xr.Dataset | Dataset to place the rule results into | required |
| logger | ILogger | logger for reporting messages | required |

Exceptions:

| Type | Description |
|---|---|
| RuntimeError | if initialization is not correctly done |
Source code in entities/rule_processor.py
def process_rules(
self, output_dataset: _xr.Dataset, logger: ILogger
) -> _xr.Dataset:
"""Processes the rules defined in the initialize method
and adds the results to the provided output_dataset.
Args:
output_dataset (_xr.Dataset): Dataset to place the rule
results into
logger (ILogger): logger for reporting messages
Raises:
RuntimeError: if initialization is not correctly done
"""
if len(self._processing_list) < 1:
message = "Processor is not properly initialized, please initialize."
raise RuntimeError(message)
for rule_set in self._processing_list:
for rule in rule_set:
logger.log_info(f"Starting rule {rule.name}")
rule_result = self._execute_rule(rule, output_dataset, logger)
output_name = rule.output_variable_name
output_dataset[output_name] = (
rule_result.dims,
rule_result.values,
rule_result.attrs,
rule_result.coords,
)
for coord_key in rule_result.coords:
# the coord_key is overwritten in case we don't have the if
# statement below
if coord_key not in output_dataset.coords:
output_dataset = output_dataset.assign_coords(
{coord_key: rule_result[coord_key]}
)
return output_dataset
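The key behaviour of initialize is the ordering step: rules are grouped into sets that can be processed one after another, where a rule only enters a set once all of its input variables are available, either directly from the dataset or as the output of a rule in an earlier set. The snippet below is a simplified, self-contained re-implementation of that layering (not the library code) to make the effect visible; rule objects are replaced by (name, inputs, output) tuples.

```python
# Simplified stand-in for the layering performed by _create_rule_sets / _get_solvable_rules.
def layer_rules(available, rules):
    """Group (name, inputs, output) tuples into sets that can run in sequence."""
    ordered, remaining = [], list(rules)
    while remaining:
        solvable = [r for r in remaining if all(i in available for i in r[1])]
        if not solvable:
            raise ValueError(f"Some rules can not be resolved: {[r[0] for r in remaining]}")
        available = available + [r[2] for r in solvable]
        remaining = [r for r in remaining if r not in solvable]
        ordered.append([r[0] for r in solvable])
    return ordered


# 'chloride' comes from the input dataset; 'habitat' needs the output of 'classify'.
print(layer_rules(
    ["chloride"],
    [("habitat", ["chloride_class"], "habitat_suitability"),
     ("classify", ["chloride"], "chloride_class")],
))
# -> [['classify'], ['habitat']]
```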
rules
axis_filter_rule
Module for AxisFilterRule class
!!! classes AxisFilterRule
AxisFilterRule (RuleBase, IArrayBasedRule)
Implementation for the axis filter rule
Source code in rules/axis_filter_rule.py
class AxisFilterRule(RuleBase, IArrayBasedRule):
"""Implementation for the axis filter rule"""
def __init__(
self,
name: str,
input_variable_names: List[str],
element_index: int,
axis_name: str,
):
super().__init__(name, input_variable_names)
self._element_index = element_index
self._axis_name = axis_name
@property
def element_index(self) -> int:
"""Value index of the provided axis to filter on"""
return self._element_index
@property
def axis_name(self) -> str:
"""Layer number property"""
return self._axis_name
def execute(self, value_array: _xr.DataArray, logger: ILogger) -> _xr.DataArray:
"""Obtain a 2D layer from a 3D variable
Args:
value (float): 3D value to obtain a layer from
Returns:
float: 2D variable
"""
if self._axis_name not in value_array.dims:
message = f"""Layer name is not in dim names \
[{value_array.dims}] layer_name [{self._axis_name}]"""
logger.log_error(message)
raise IndexError(message)
if not (
self._element_index >= 0
and self._element_index <= len(getattr(value_array, self._axis_name))
):
message = f"""Layer number should be within range \
[0,{len(getattr(value_array, self._axis_name))}]"""
logger.log_error(message)
raise IndexError(message)
return value_array.isel({self._axis_name: self._element_index - 1})
- axis_name: str (property, readonly): Name of the axis to filter on
- element_index: int (property, readonly): Value index of the provided axis to filter on
execute(self, value_array, logger)
Obtain a 2D layer from a 3D variable
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| value | float | 3D value to obtain a layer from | required |

Returns:

| Type | Description |
|---|---|
| float | 2D variable |
Source code in rules/axis_filter_rule.py
def execute(self, value_array: _xr.DataArray, logger: ILogger) -> _xr.DataArray:
"""Obtain a 2D layer from a 3D variable
Args:
value (float): 3D value to obtain a layer from
Returns:
float: 2D variable
"""
if self._axis_name not in value_array.dims:
message = f"""Layer name is not in dim names \
[{value_array.dims}] layer_name [{self._axis_name}]"""
logger.log_error(message)
raise IndexError(message)
if not (
self._element_index >= 0
and self._element_index <= len(getattr(value_array, self._axis_name))
):
message = f"""Layer number should be within range \
[0,{len(getattr(value_array, self._axis_name))}]"""
logger.log_error(message)
raise IndexError(message)
return value_array.isel({self._axis_name: self._element_index - 1})
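In short, the rule selects one element along a named axis with xarray's isel, and the configured element_index is interpreted as 1-based (the source above subtracts 1 before indexing). A small stand-alone illustration of that selection, with made-up dimension names, is given below.

```python
# Illustration of the isel-based selection done by AxisFilterRule (names are made up).
import numpy as np
import xarray as xr

data = xr.DataArray(np.arange(24).reshape(2, 3, 4), dims=("time", "cell", "layer"))

element_index = 2                                 # 1-based, as the rule expects
selected = data.isel({"layer": element_index - 1})
print(selected.dims)                              # ('time', 'cell')
```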
classification_rule
Module for ClassificationRule class
!!! classes ClassificationRule
ClassificationRule (RuleBase, IMultiArrayBasedRule)
Implementation for the (multiple) classification rule
Source code in rules/classification_rule.py
class ClassificationRule(RuleBase, IMultiArrayBasedRule):
"""Implementation for the (multiple) classification rule"""
def __init__(
self,
name: str,
input_variable_names: List[str],
criteria_table: Dict[str, List],
):
super().__init__(name, input_variable_names)
self._criteria_table = criteria_table
@property
def criteria_table(self) -> Dict:
"""Criteria property"""
return self._criteria_table
def execute(
self, value_arrays: Dict[str, _xr.DataArray], logger: ILogger
) -> _xr.DataArray:
"""Determine the classification based on the table with criteria
Args:
values (Dict[str, float]): Dictionary holding the values
for making the rule
Returns:
integer: classification
"""
# Get all the headers in the criteria_table representing a value to be checked
column_names = list(self._criteria_table.keys())
column_names.remove("output")
# Create an empty result_array to be filled
result_array = _xr.zeros_like(value_arrays[column_names[0]])
for row, out in reversed(list(enumerate(self._criteria_table["output"]))):
criteria_comparison = _xr.full_like(value_arrays[column_names[0]], True)
for column_name in column_names:
# DataArray on which the criteria needs to be checked
data = value_arrays[column_name]
# Retrieving criteria and applying it in correct format (number,
# range or comparison)
criteria = self.criteria_table[column_name][row]
comparison = self._get_comparison_for_criteria(criteria, data)
if comparison is None:
comparison = True
# Criteria_comparison == 1 -> to check where the value is True
criteria_comparison = _xr.where(
comparison & (criteria_comparison == 1), True, False
)
# For the first row set the default to None, for all the other
# rows use the already created dataarray
default_val = None
if row != len(self._criteria_table["output"]) - 1:
default_val = result_array
result_array = _xr.where(criteria_comparison, out, default_val)
return result_array
def _get_comparison_for_criteria(
self, criteria: str, data: _xr.DataArray
) -> Optional[_xr.DataArray]:
criteria_class = type_of_classification(criteria)
comparison = None
if criteria_class == "number":
comparison = data == float(criteria)
elif criteria_class == "range":
begin, end = str_range_to_list(criteria)
comparison = (data >= begin) & (data <= end)
elif criteria_class == "larger_equal":
comparison_val = read_str_comparison(criteria, ">=")
comparison = data >= float(comparison_val)
elif criteria_class == "smaller_equal":
comparison_val = read_str_comparison(criteria, "<=")
comparison = data <= float(comparison_val)
elif criteria_class == "larger":
comparison_val = read_str_comparison(criteria, ">")
comparison = data > float(comparison_val)
elif criteria_class == "smaller":
comparison_val = read_str_comparison(criteria, "<")
comparison = data < float(comparison_val)
return comparison
- criteria_table: Dict (property, readonly): Criteria property
execute(self, value_arrays, logger)
Determine the classification based on the table with criteria
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| values | Dict[str, float] | Dictionary holding the values for making the rule | required |

Returns:

| Type | Description |
|---|---|
| integer | classification |
Source code in rules/classification_rule.py
def execute(
self, value_arrays: Dict[str, _xr.DataArray], logger: ILogger
) -> _xr.DataArray:
"""Determine the classification based on the table with criteria
Args:
values (Dict[str, float]): Dictionary holding the values
for making the rule
Returns:
integer: classification
"""
# Get all the headers in the criteria_table representing a value to be checked
column_names = list(self._criteria_table.keys())
column_names.remove("output")
# Create an empty result_array to be filled
result_array = _xr.zeros_like(value_arrays[column_names[0]])
for row, out in reversed(list(enumerate(self._criteria_table["output"]))):
criteria_comparison = _xr.full_like(value_arrays[column_names[0]], True)
for column_name in column_names:
# DataArray on which the criteria needs to be checked
data = value_arrays[column_name]
# Retrieving criteria and applying it in correct format (number,
# range or comparison)
criteria = self.criteria_table[column_name][row]
comparison = self._get_comparison_for_criteria(criteria, data)
if comparison is None:
comparison = True
# Criteria_comparison == 1 -> to check where the value is True
criteria_comparison = _xr.where(
comparison & (criteria_comparison == 1), True, False
)
# For the first row set the default to None, for all the other
# rows use the already created dataarray
default_val = None
if row != len(self._criteria_table["output"]) - 1:
default_val = result_array
result_array = _xr.where(criteria_comparison, out, default_val)
return result_array
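The criteria table is evaluated from the last row to the first, so when several rows match the same cell, the earliest row wins. The sketch below is a hedged usage example: the import path is an assumption, and the comparison strings ('<0.5', '>=10', plain numbers) follow the branches of _get_comparison_for_criteria, although the exact accepted syntax (including ranges) is determined by parsing helpers not shown here.

```python
# Hedged usage sketch of ClassificationRule (import path and criteria syntax assumed).
import numpy as np
import xarray as xr
from decoimpact.business.entities.rules.classification_rule import ClassificationRule  # assumed path


class PrintLogger:
    """Minimal stand-in for ILogger."""
    def log_debug(self, m): print(m)
    def log_info(self, m): print(m)
    def log_warning(self, m): print(m)
    def log_error(self, m): print(m)


criteria_table = {
    "output":   [1,      2,       3],
    "depth":    ["<0.5", ">=0.5", ">=0.5"],
    "salinity": ["<10",  "<10",   ">=10"],
}
rule = ClassificationRule("habitat_class", ["depth", "salinity"], criteria_table)

values = {
    "depth":    xr.DataArray(np.array([0.2, 1.0, 2.0])),
    "salinity": xr.DataArray(np.array([5.0, 5.0, 20.0])),
}
print(rule.execute(values, PrintLogger()).values)   # one class per cell: 1, 2, 3
```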
combine_results_rule
Module for CombineResultsRule Class
!!! classes CombineResultsRule
CombineResultsRule (RuleBase, IMultiArrayBasedRule)
Implementation for the combine results rule
Source code in rules/combine_results_rule.py
class CombineResultsRule(RuleBase, IMultiArrayBasedRule):
"""Implementation for the combine results rule"""
def __init__(
self,
name: str,
input_variable_names: List[str],
operation_type: MultiArrayOperationType,
ignore_nan: bool = False,
):
super().__init__(name, input_variable_names)
self._operation_type: MultiArrayOperationType = operation_type
self._ignore_nan = ignore_nan
self._operations = self._create_operations()
@property
def operation_type(self) -> MultiArrayOperationType:
"""Name of the rule"""
return self._operation_type
@property
def ignore_nan(self) -> bool:
"""Indicates if NaN values should be ignored in the calculations"""
return self._ignore_nan
def validate(self, logger: ILogger) -> bool:
if self._operation_type not in self._operations:
message = (
f"Operation type {self._operation_type} is currently" " not supported."
)
logger.log_error(message)
return False
if len(self._input_variable_names) < 2:
logger.log_error("Minimum of two input variables required.")
return False
return True
def execute(
self, value_arrays: Dict[str, _xr.DataArray], logger: ILogger
) -> _xr.DataArray:
"""Calculate simple statistical operations with two or more input arrays
Args:
input_arrays (DataArray): array list containing the variables
Returns:
DataArray: Input arrays
"""
if len(value_arrays) != len(self._input_variable_names):
raise ValueError("Not all expected arrays where provided.")
np_arrays = [a_array.to_numpy() for a_array in value_arrays.values()]
if not self._check_dimensions(np_arrays):
raise ValueError("The arrays must have the same dimensions.")
operation_to_use = self._operations[self._operation_type]
first_value_array = next(iter(value_arrays.values()))
result_variable = _xr.DataArray(
data=operation_to_use(np_arrays),
dims=first_value_array.dims,
attrs=first_value_array.attrs,
)
return result_variable
def _create_operations(self) -> dict[MultiArrayOperationType, Callable]:
if self.ignore_nan:
return {
MultiArrayOperationType.MULTIPLY: lambda npa: _np.prod(npa, axis=0),
MultiArrayOperationType.MIN: lambda npa: _np.nanmin(npa, axis=0),
MultiArrayOperationType.MAX: lambda npa: _np.nanmax(npa, axis=0),
MultiArrayOperationType.AVERAGE: lambda npa: _np.nanmean(npa, axis=0),
MultiArrayOperationType.MEDIAN: lambda npa: _np.nanmedian(npa, axis=0),
MultiArrayOperationType.ADD: lambda npa: _np.nansum(npa, axis=0),
MultiArrayOperationType.SUBTRACT: lambda npa: _np.subtract(
npa[0], _np.nansum(npa[1:], axis=0)
),
}
# and if ignore_nan is False:
return {
MultiArrayOperationType.MULTIPLY: lambda npa: _np.prod(npa, axis=0),
MultiArrayOperationType.MIN: lambda npa: _np.min(npa, axis=0),
MultiArrayOperationType.MAX: lambda npa: _np.max(npa, axis=0),
MultiArrayOperationType.AVERAGE: lambda npa: _np.average(npa, axis=0),
MultiArrayOperationType.MEDIAN: lambda npa: _np.median(npa, axis=0),
MultiArrayOperationType.ADD: lambda npa: _np.sum(npa, axis=0),
MultiArrayOperationType.SUBTRACT: lambda npa: _np.subtract(
npa[0], _np.sum(npa[1:], axis=0)
),
}
def _check_dimensions(self, np_arrays: List[_np.ndarray]) -> bool:
"""Brief check if all the arrays to be combined have the
same size/dimension/length
Args:
np_arrays: List of numpy arrays
Returns:
Boolean: True of False
"""
expected_dimensions = np_arrays[0].ndim
for a_array in np_arrays[1:]:
if expected_dimensions != _np.ndim(a_array):
return False
expected_shape = np_arrays[0].shape
for a_array in np_arrays[1:]:
if expected_shape != a_array.shape:
return False
return True
- ignore_nan: bool (property, readonly): Indicates if NaN values should be ignored in the calculations
- operation_type: MultiArrayOperationType (property, readonly): Operation used to combine the input arrays
execute(self, value_arrays, logger)
Calculate simple statistical operations with two or more input arrays
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| input_arrays | DataArray | array list containing the variables | required |

Returns:

| Type | Description |
|---|---|
| DataArray | Combined result of the input arrays |
Source code in rules/combine_results_rule.py
def execute(
self, value_arrays: Dict[str, _xr.DataArray], logger: ILogger
) -> _xr.DataArray:
"""Calculate simple statistical operations with two or more input arrays
Args:
input_arrays (DataArray): array list containing the variables
Returns:
DataArray: Input arrays
"""
if len(value_arrays) != len(self._input_variable_names):
raise ValueError("Not all expected arrays where provided.")
np_arrays = [a_array.to_numpy() for a_array in value_arrays.values()]
if not self._check_dimensions(np_arrays):
raise ValueError("The arrays must have the same dimensions.")
operation_to_use = self._operations[self._operation_type]
first_value_array = next(iter(value_arrays.values()))
result_variable = _xr.DataArray(
data=operation_to_use(np_arrays),
dims=first_value_array.dims,
attrs=first_value_array.attrs,
)
return result_variable
validate(self, logger)
Validates if the rule is valid
Returns:

| Type | Description |
|---|---|
| bool | whether the rule is valid |
Source code in rules/combine_results_rule.py
def validate(self, logger: ILogger) -> bool:
if self._operation_type not in self._operations:
message = (
f"Operation type {self._operation_type} is currently" " not supported."
)
logger.log_error(message)
return False
if len(self._input_variable_names) < 2:
logger.log_error("Minimum of two input variables required.")
return False
return True
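A hedged usage sketch of the rule is given below; the import paths are assumptions, while the constructor arguments and the dictionary passed to execute (one entry per input variable name) follow the source above. With ignore_nan=True the NaN-aware numpy reductions are used.

```python
# Hedged usage sketch of CombineResultsRule (import paths assumed).
import numpy as np
import xarray as xr
from decoimpact.business.entities.rules.combine_results_rule import CombineResultsRule  # assumed path
from decoimpact.business.entities.rules.options.multi_array_operation_type import (     # assumed path
    MultiArrayOperationType,
)


class PrintLogger:
    """Minimal stand-in for ILogger."""
    def log_debug(self, m): print(m)
    def log_info(self, m): print(m)
    def log_warning(self, m): print(m)
    def log_error(self, m): print(m)


oxygen_top = xr.DataArray(np.array([1.0, 4.0, np.nan]), dims=("cell",))
oxygen_bottom = xr.DataArray(np.array([3.0, 2.0, 5.0]), dims=("cell",))

rule = CombineResultsRule(
    "minimum_oxygen",
    ["oxygen_top", "oxygen_bottom"],
    MultiArrayOperationType.MIN,
    ignore_nan=True,
)
result = rule.execute({"oxygen_top": oxygen_top, "oxygen_bottom": oxygen_bottom}, PrintLogger())
print(result.values)   # per-cell nanmin: [1. 2. 5.]
```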
depth_average_rule
Module for DepthAverageRule class
!!! classes DepthAverageRule
DepthAverageRule (RuleBase, IMultiArrayBasedRule)
Implementation for the depth average rule
Source code in rules/depth_average_rule.py
class DepthAverageRule(RuleBase, IMultiArrayBasedRule):
"""Implementation for the depth average rule"""
# pylint: disable=too-many-locals
def execute(
self, value_arrays: Dict[str, _xr.DataArray], logger: ILogger
) -> _xr.DataArray:
"""Calculate depth average of assumed z-layers.
Args:
value_array (DataArray): Values to average over the depth
Returns:
DataArray: Depth-averaged values
"""
# The first DataArray in our value_arrays contains the values to be averaged
# but the name of the key is given by the user, and is unknown here, so use
# the ordering defined in the parser.
values_list = list(value_arrays.values())
variable = values_list[0]
bed_level_values = values_list[1]
water_level_values = values_list[2]
depths_interfaces = values_list[3]
# Get the dimension names for the interfaces and for the layers
dim_interfaces_name = list(depths_interfaces.dims)[0]
interfaces_len = depths_interfaces[dim_interfaces_name].size
dim_layer_name = [
d for d in variable.dims if d not in water_level_values.dims
][0]
layer_len = variable[dim_layer_name].size
# interface dimension should always be one larger than layer dimension
# Otherwise give an error to the user
if interfaces_len != layer_len + 1:
logger.log_error(
f"The number of interfaces should be number of layers + 1. Number of "
f"interfaces = {interfaces_len}. Number of layers = {layer_len}."
)
return variable
# Deal with open layer system at water level and bed level
depths_interfaces.values[depths_interfaces.values.argmin()] = -100000
depths_interfaces.values[depths_interfaces.values.argmax()] = 100000
# Broadcast the depths to the dimensions of the bed levels. Then make a
# correction for the depths to the bed level, in other words all depths lower
# than the bed level will be corrected to the bed level.
depths_interfaces_broadcasted = depths_interfaces.broadcast_like(
bed_level_values
)
corrected_depth_bed = depths_interfaces_broadcasted.where(
bed_level_values < depths_interfaces_broadcasted, bed_level_values
)
# Make a similar correction for the waterlevels (first broadcast to match
# dimensions and then replace all values higher than waterlevel with
# waterlevel)
corrected_depth_bed = corrected_depth_bed.broadcast_like(water_level_values)
corrected_depth_bed = corrected_depth_bed.where(
water_level_values > corrected_depth_bed, water_level_values
)
# Calculate the layer heights between depths
layer_heights = corrected_depth_bed.diff(dim=dim_interfaces_name)
layer_heights = layer_heights.rename({dim_interfaces_name: dim_layer_name})
# Use the NaN filtering of the variable to set the correct depth per column
layer_heights = layer_heights.where(variable.notnull())
# Calculate depth average using relative value
relative_values = variable * layer_heights
# Calculate average
return relative_values.sum(dim=dim_layer_name) / layer_heights.sum(
dim=dim_layer_name
)
execute(self, value_arrays, logger)
Calculate depth average of assumed z-layers.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| value_array | DataArray | Values to average over the depth | required |

Returns:

| Type | Description |
|---|---|
| DataArray | Depth-averaged values |
Source code in rules/depth_average_rule.py
def execute(
self, value_arrays: Dict[str, _xr.DataArray], logger: ILogger
) -> _xr.DataArray:
"""Calculate depth average of assumed z-layers.
Args:
value_array (DataArray): Values to average over the depth
Returns:
DataArray: Depth-averaged values
"""
# The first DataArray in our value_arrays contains the values to be averaged
# but the name of the key is given by the user, and is unknown here, so use
# the ordering defined in the parser.
values_list = list(value_arrays.values())
variable = values_list[0]
bed_level_values = values_list[1]
water_level_values = values_list[2]
depths_interfaces = values_list[3]
# Get the dimension names for the interfaces and for the layers
dim_interfaces_name = list(depths_interfaces.dims)[0]
interfaces_len = depths_interfaces[dim_interfaces_name].size
dim_layer_name = [
d for d in variable.dims if d not in water_level_values.dims
][0]
layer_len = variable[dim_layer_name].size
# interface dimension should always be one larger than layer dimension
# Otherwise give an error to the user
if interfaces_len != layer_len + 1:
logger.log_error(
f"The number of interfaces should be number of layers + 1. Number of "
f"interfaces = {interfaces_len}. Number of layers = {layer_len}."
)
return variable
# Deal with open layer system at water level and bed level
depths_interfaces.values[depths_interfaces.values.argmin()] = -100000
depths_interfaces.values[depths_interfaces.values.argmax()] = 100000
# Broadcast the depths to the dimensions of the bed levels. Then make a
# correction for the depths to the bed level, in other words all depths lower
# than the bed level will be corrected to the bed level.
depths_interfaces_broadcasted = depths_interfaces.broadcast_like(
bed_level_values
)
corrected_depth_bed = depths_interfaces_broadcasted.where(
bed_level_values < depths_interfaces_broadcasted, bed_level_values
)
# Make a similar correction for the waterlevels (first broadcast to match
# dimensions and then replace all values higher than waterlevel with
# waterlevel)
corrected_depth_bed = corrected_depth_bed.broadcast_like(water_level_values)
corrected_depth_bed = corrected_depth_bed.where(
water_level_values > corrected_depth_bed, water_level_values
)
# Calculate the layer heights between depths
layer_heights = corrected_depth_bed.diff(dim=dim_interfaces_name)
layer_heights = layer_heights.rename({dim_interfaces_name: dim_layer_name})
# Use the NaN filtering of the variable to set the correct depth per column
layer_heights = layer_heights.where(variable.notnull())
# Calculate depth average using relative value
relative_values = variable * layer_heights
# Calculate average
return relative_values.sum(dim=dim_layer_name) / layer_heights.sum(
dim=dim_layer_name
)
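Stripped of the broadcasting and the bed/water-level corrections, the rule computes a layer-thickness weighted mean per water column. The plain-numpy illustration below shows only that weighting step, with made-up numbers; it is not a call into the rule itself.

```python
# Thickness-weighted average, the core of DepthAverageRule (made-up numbers).
import numpy as np

layer_values = np.array([10.0, 20.0, 30.0])       # value per z-layer
interfaces   = np.array([-6.0, -4.0, -1.0, 0.0])  # layer interfaces in metres, bed to surface
heights      = np.diff(interfaces)                # layer thicknesses: [2., 3., 1.]

depth_average = np.sum(layer_values * heights) / np.sum(heights)
print(depth_average)   # (10*2 + 20*3 + 30*1) / 6 ≈ 18.33
```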
filter_extremes_rule
Module for FilterExtremesRule class
!!! classes FilterExtremesRule
FilterExtremesRule (RuleBase, IArrayBasedRule)
Implementation for the filter extremes rule
Source code in rules/filter_extremes_rule.py
class FilterExtremesRule(RuleBase, IArrayBasedRule):
"""Implementation for the filter extremes rule"""
# pylint: disable=too-many-arguments
# pylint: disable=too-many-positional-arguments
def __init__(
self,
name: str,
input_variable_names: List[str],
extreme_type: ExtremeTypeOptions,
distance: int,
time_scale: str,
mask: bool,
):
super().__init__(name, input_variable_names)
self._settings = TimeOperationSettings(
{"second": "s", "hour": "h", "day": "D", "month": "M", "year": "Y"}
)
self._extreme_type: ExtremeTypeOptions = extreme_type
self._distance = distance
self._settings.time_scale = time_scale
self._mask = mask
@property
def settings(self):
"""Time operation settings"""
return self._settings
@property
def extreme_type(self) -> ExtremeTypeOptions:
"""Type of extremes (peaks or troughs)"""
return self._extreme_type
@property
def distance(self) -> int:
"""Minimal distance between peaks"""
return self._distance
@property
def mask(self) -> bool:
"""Return either directly the values of the filtered array or a
True/False array"""
return self._mask
def validate(self, logger: ILogger) -> bool:
"""Validates if the rule is valid
Returns:
bool: whether the rule is valid
"""
return self.settings.validate(self.name, logger)
def execute(self, value_array: _xr.DataArray, logger: ILogger) -> _xr.DataArray:
"""
Retrieve the extremes
extreme_type: Either retrieve the values at the peaks or troughs
mask: If False return the values at the peaks, otherwise return a
1 at the extreme locations.
Args:
value_array (DataArray): Values to filter at extremes
Returns:
DataArray: Filtered DataArray with only the extremes remaining
at all other times the values are set to NaN
"""
time_scale = get_dict_element(
self.settings.time_scale, self.settings.time_scale_mapping
)
time_dim_name = get_time_dimension_name(value_array, logger)
time = value_array.time.values
timestep = (time[-1] - time[0]) / len(time)
width_time = _np.timedelta64(self.distance, time_scale)
distance = width_time / timestep
results = _xr.apply_ufunc(
self._process_peaks,
value_array,
input_core_dims=[[time_dim_name]],
output_core_dims=[[time_dim_name]],
vectorize=True,
kwargs={
"distance": distance,
"mask": self.mask,
"extreme_type": self.extreme_type,
},
)
results = results.transpose(*value_array.dims)
return results
def _process_peaks(
self, arr: _xr.DataArray, distance: float, mask: bool, extreme_type: str
):
factor = 1
if extreme_type == "troughs":
factor = -1
peaks, _ = signal.find_peaks(factor * arr, distance=distance)
values = arr[peaks]
if mask:
values = True
new_arr = _np.full_like(arr, _np.nan, dtype=float)
new_arr[peaks] = values
return new_arr
- distance: int (property, readonly): Minimal distance between peaks
- extreme_type: ExtremeTypeOptions (property, readonly): Type of extremes (peaks or troughs)
- mask: bool (property, readonly): Return either directly the values of the filtered array or a True/False array
- settings (property, readonly): Time operation settings
execute(self, value_array, logger)
Retrieve the extremes.
extreme_type: either retrieve the values at the peaks or troughs.
mask: if False return the values at the peaks, otherwise return a 1 at the extreme locations.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| value_array | DataArray | Values to filter at extremes | required |

Returns:

| Type | Description |
|---|---|
| DataArray | Filtered DataArray with only the extremes remaining; at all other times the values are set to NaN |
Source code in rules/filter_extremes_rule.py
def execute(self, value_array: _xr.DataArray, logger: ILogger) -> _xr.DataArray:
"""
Retrieve the extremes
extreme_type: Either retrieve the values at the peaks or troughs
mask: If False return the values at the peaks, otherwise return a
1 at the extreme locations.
Args:
value_array (DataArray): Values to filter at extremes
Returns:
DataArray: Filtered DataArray with only the extremes remaining
at all other times the values are set to NaN
"""
time_scale = get_dict_element(
self.settings.time_scale, self.settings.time_scale_mapping
)
time_dim_name = get_time_dimension_name(value_array, logger)
time = value_array.time.values
timestep = (time[-1] - time[0]) / len(time)
width_time = _np.timedelta64(self.distance, time_scale)
distance = width_time / timestep
results = _xr.apply_ufunc(
self._process_peaks,
value_array,
input_core_dims=[[time_dim_name]],
output_core_dims=[[time_dim_name]],
vectorize=True,
kwargs={
"distance": distance,
"mask": self.mask,
"extreme_type": self.extreme_type,
},
)
results = results.transpose(*value_array.dims)
return results
validate(self, logger)
Validates if the rule is valid
Returns:

| Type | Description |
|---|---|
| bool | whether the rule is valid |
Source code in rules/filter_extremes_rule.py
def validate(self, logger: ILogger) -> bool:
"""Validates if the rule is valid
Returns:
bool: whether the rule is valid
"""
return self.settings.validate(self.name, logger)
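Under the hood the configured distance (for example 12 with time_scale "hour") is converted into a number of time steps and handed to scipy.signal.find_peaks for every cell; troughs are found by negating the signal first. The stand-alone snippet below shows find_peaks itself on a made-up series, which is the only SciPy call the rule relies on.

```python
# The peak detection used by FilterExtremesRule, shown on a made-up series.
import numpy as np
from scipy import signal

series = np.array([0.0, 1.0, 0.2, 0.8, 3.0, 0.5, 0.1, 2.0, 0.0])

peaks, _ = signal.find_peaks(series, distance=2)     # at least 2 samples between peaks
print(peaks, series[peaks])                          # [1 4 7] [1. 3. 2.]

troughs, _ = signal.find_peaks(-series, distance=2)  # troughs: peaks of the negated signal
print(troughs)                                       # [2 6]
```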
formula_rule
Module for Formula Rule class
!!! classes Formula Rule
FormulaRule (RuleBase, IMultiCellBasedRule)
Implementation for the Formula rule
Source code in rules/formula_rule.py
class FormulaRule(RuleBase, IMultiCellBasedRule):
"""Implementation for the Formula rule"""
formula_output_name: str = "formula_result"
def __init__(self, name: str, input_variable_names: List[str], formula: str):
super().__init__(name, input_variable_names)
self._formula = formula
self._byte_code = None
self._setup_environment()
def validate(self, logger: ILogger) -> bool:
try:
byte_code = _compile_restricted(
f"{self.formula_output_name} = {self._formula}",
filename="<inline code>",
mode="exec",
)
local_variables = dict.fromkeys(self.input_variable_names, 1.0)
exec(byte_code, self._global_variables, local_variables)
except (SyntaxError, NameError) as exception:
logger.log_error(f"Could not create formula function: {exception}")
return False
return True
@property
def formula(self) -> str:
"""Multiplier property"""
return self._formula
def execute(self, values: Dict[str, float], logger: ILogger) -> float:
"""Calculates the formula based on the
Args:
values (DataArray): values to Formula
Returns:
float: Calculated float
"""
if not self._byte_code:
self._byte_code = _compile_restricted(
f"{self.formula_output_name} = {self._formula}",
filename="<inline code>",
mode="exec",
)
local_variables = values.copy()
try:
exec(self._byte_code, self._global_variables, local_variables)
except SyntaxError as exception:
logger.log_error(f"The formula can not be executed. {exception}")
return float(local_variables[self.formula_output_name])
def _setup_environment(self):
# use standard libraries that are considered safe
self._safe_modules_dict = {
"math": math,
"numpy": numpy,
}
# Global data available in restricted code
self._global_variables = {
"__builtins__": {**_safe_builtins, "__import__": self._safe_import},
**self._safe_modules_dict,
}
self._byte_code = None
def _safe_import(self, name, *args, **kwargs):
# Redefine import, to only import from safe modules
if name not in self._safe_modules_dict:
raise _ArgumentError(None, f"Importing {name!r} is not allowed!")
return __import__(name, *args, **kwargs)
- formula: str (property, readonly): Formula property
execute(self, values, logger)
Calculates the formula based on the provided values.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| values | DataArray | values to apply the formula to | required |

Returns:

| Type | Description |
|---|---|
| float | Calculated float |
Source code in rules/formula_rule.py
def execute(self, values: Dict[str, float], logger: ILogger) -> float:
"""Calculates the formula based on the
Args:
values (DataArray): values to Formula
Returns:
float: Calculated float
"""
if not self._byte_code:
self._byte_code = _compile_restricted(
f"{self.formula_output_name} = {self._formula}",
filename="<inline code>",
mode="exec",
)
local_variables = values.copy()
try:
exec(self._byte_code, self._global_variables, local_variables)
except SyntaxError as exception:
logger.log_error(f"The formula can not be executed. {exception}")
return float(local_variables[self.formula_output_name])
validate(self, logger)
Validates if the rule is valid
Returns:

| Type | Description |
|---|---|
| bool | whether the rule is valid |
Source code in rules/formula_rule.py
def validate(self, logger: ILogger) -> bool:
try:
byte_code = _compile_restricted(
f"{self.formula_output_name} = {self._formula}",
filename="<inline code>",
mode="exec",
)
local_variables = dict.fromkeys(self.input_variable_names, 1.0)
exec(byte_code, self._global_variables, local_variables)
except (SyntaxError, NameError) as exception:
logger.log_error(f"Could not create formula function: {exception}")
return False
return True
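A hedged usage sketch is given below; the import path is an assumption, while the constructor and execute signatures follow the source above. The formula is compiled with RestrictedPython and evaluated once per cell, with math and numpy available as safe modules; boolean results come back as 1.0/0.0 because the result is cast to float.

```python
# Hedged usage sketch of FormulaRule (import path assumed).
from decoimpact.business.entities.rules.formula_rule import FormulaRule  # assumed path


class PrintLogger:
    """Minimal stand-in for ILogger."""
    def log_debug(self, m): print(m)
    def log_info(self, m): print(m)
    def log_warning(self, m): print(m)
    def log_error(self, m): print(m)


rule = FormulaRule(
    "oxygen_ok",
    ["oxygen", "temperature"],
    "oxygen > 3.0 and temperature < 25.0",
)
logger = PrintLogger()

print(rule.validate(logger))                                        # True
print(rule.execute({"oxygen": 5.0, "temperature": 20.0}, logger))   # 1.0
print(rule.execute({"oxygen": 1.0, "temperature": 20.0}, logger))   # 0.0
```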
i_array_based_rule
Module for IArrayBasedRule interface
!!! interfaces IArrayBasedRule
IArrayBasedRule (IRule, ABC)
Rule applied to an array of values
Source code in rules/i_array_based_rule.py
class IArrayBasedRule(IRule, ABC):
"""Rule applied to an array of values"""
@abstractmethod
def execute(self, value_array: _xr.DataArray, logger: ILogger) -> _xr.DataArray:
"""Executes the rule based on the provided array"""
execute(self, value_array, logger)
Executes the rule based on the provided array
Source code in rules/i_array_based_rule.py
@abstractmethod
def execute(self, value_array: _xr.DataArray, logger: ILogger) -> _xr.DataArray:
"""Executes the rule based on the provided array"""
i_cell_based_rule
Module for ICellBasedRule interface
!!! interfaces ICellBasedRule
ICellBasedRule (IRule, ABC)
Rule applied to every cell
Source code in rules/i_cell_based_rule.py
class ICellBasedRule(IRule, ABC):
"""Rule applied to every cell"""
@abstractmethod
def execute(self, value: float, logger: ILogger) -> float:
"""Executes the rule based on the provided value"""
execute(self, value, logger)
Executes the rule based on the provided value
Source code in rules/i_cell_based_rule.py
@abstractmethod
def execute(self, value: float, logger: ILogger) -> float:
"""Executes the rule based on the provided value"""
i_multi_array_based_rule
Module for IMultiArrayBasedRule interface
!!! interfaces IMultiArrayBasedRule
IMultiArrayBasedRule (IRule, ABC)
Rule applied to a set of (named) arrays
Source code in rules/i_multi_array_based_rule.py
class IMultiArrayBasedRule(IRule, ABC):
"""Rule applied to an a set of (named) arrays"""
@abstractmethod
def execute(
self, value_arrays: Dict[str, _xr.DataArray], logger: ILogger
) -> _xr.DataArray:
"""Executes the rule based on the provided array"""
execute(self, value_arrays, logger)
Executes the rule based on the provided array
Source code in rules/i_multi_array_based_rule.py
@abstractmethod
def execute(
self, value_arrays: Dict[str, _xr.DataArray], logger: ILogger
) -> _xr.DataArray:
"""Executes the rule based on the provided array"""
i_multi_cell_based_rule
Module for IMultiCellBasedRule interface
!!! interfaces IMultiCellBasedRule
IMultiCellBasedRule (IRule, ABC)
Rule applied to every cell
Source code in rules/i_multi_cell_based_rule.py
class IMultiCellBasedRule(IRule, ABC):
"""Rule applied to every cell"""
@abstractmethod
def execute(self, values: Dict[str, float], logger: ILogger) -> float:
"""Executes the rule based on the provided value"""
execute(self, values, logger)
Executes the rule based on the provided value
Source code in rules/i_multi_cell_based_rule.py
@abstractmethod
def execute(self, values: Dict[str, float], logger: ILogger) -> float:
"""Executes the rule based on the provided value"""
i_rule
Module for IRule interface
!!! interfaces IRule
IRule (ABC)
Interface for rules
Source code in rules/i_rule.py
class IRule(ABC):
"""Interface for rules"""
@property
@abstractmethod
def name(self) -> str:
"""Name of the rule"""
@property
@abstractmethod
def description(self) -> str:
"""Description of the rule"""
@property
@abstractmethod
def input_variable_names(self) -> List[str]:
"""Names of the input variable"""
@property
@abstractmethod
def output_variable_name(self) -> str:
"""Name of the output variable"""
@abstractmethod
def validate(self, logger: ILogger) -> bool:
"""Validates if the rule is valid
Returns:
bool: whether the rule is valid
"""
- description: str (property, readonly): Description of the rule
- input_variable_names: List[str] (property, readonly): Names of the input variables
- name: str (property, readonly): Name of the rule
- output_variable_name: str (property, readonly): Name of the output variable
validate(self, logger)
Validates if the rule is valid
Returns:

| Type | Description |
|---|---|
| bool | whether the rule is valid |
Source code in rules/i_rule.py
@abstractmethod
def validate(self, logger: ILogger) -> bool:
"""Validates if the rule is valid
Returns:
bool: whether the rule is valid
"""
layer_filter_rule
Module for LayerFilterRule class
!!! classes LayerFilterRule
LayerFilterRule (RuleBase, IArrayBasedRule)
Implementation for the layer filter rule
Source code in rules/layer_filter_rule.py
class LayerFilterRule(RuleBase, IArrayBasedRule):
"""Implementation for the layer filter rule"""
def __init__(self, name: str, input_variable_names: List[str], layer_number: int):
super().__init__(name, input_variable_names)
self._layer_number = layer_number
@property
def layer_number(self) -> int:
"""Layer number property"""
return self._layer_number
def execute(self, value_array: _xr.DataArray, logger: ILogger) -> _xr.DataArray:
"""Obtain a 2D layer from a 3D variable
Args:
value (float): 3D value to obtain a layer from
Returns:
float: 2D variable
"""
dim_name = value_array.dims[2]
if not (
self._layer_number >= 0
and self._layer_number <= len(getattr(value_array, dim_name))
):
message = f"""Layer number should be within range \
[0,{len(getattr(value_array, dim_name))}]"""
logger.log_error(message)
raise IndexError(message)
return value_array[:, :, self._layer_number - 1]
- layer_number: int (property, readonly): Layer number property
execute(self, value_array, logger)
Obtain a 2D layer from a 3D variable
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| value | float | 3D value to obtain a layer from | required |

Returns:

| Type | Description |
|---|---|
| float | 2D variable |
Source code in rules/layer_filter_rule.py
def execute(self, value_array: _xr.DataArray, logger: ILogger) -> _xr.DataArray:
"""Obtain a 2D layer from a 3D variable
Args:
value (float): 3D value to obtain a layer from
Returns:
float: 2D variable
"""
dim_name = value_array.dims[2]
if not (
self._layer_number >= 0
and self._layer_number <= len(getattr(value_array, dim_name))
):
message = f"""Layer number should be within range \
[0,{len(getattr(value_array, dim_name))}]"""
logger.log_error(message)
raise IndexError(message)
return value_array[:, :, self._layer_number - 1]
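Note that, unlike AxisFilterRule, this rule does not take an axis name: it always indexes the third dimension (dims[2]) of the variable, and layer_number is again 1-based. The made-up illustration below shows the equivalent selection.

```python
# Equivalent of LayerFilterRule.execute on a 3D array (dimension names made up).
import numpy as np
import xarray as xr

data = xr.DataArray(np.random.rand(3, 5, 4), dims=("time", "cell", "layer"))

layer_number = 1                              # 1-based; the rule indexes dims[2] with layer_number - 1
selected_layer = data[:, :, layer_number - 1]
print(selected_layer.dims)                    # ('time', 'cell')
```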
multiply_rule
Module for MultiplyRule class
!!! classes MultiplyRule
MultiplyRule (RuleBase, IArrayBasedRule)
Implementation for the multiply rule
Source code in rules/multiply_rule.py
class MultiplyRule(RuleBase, IArrayBasedRule):
"""Implementation for the multiply rule"""
def __init__(
self,
name: str,
input_variable_names: List[str],
multipliers: List[List[float]],
date_range: Optional[List[List[str]]] = None,
):
super().__init__(name, input_variable_names)
self._multipliers = multipliers
self._date_range = date_range
@property
def multipliers(self) -> List[List[float]]:
"""Multiplier property"""
return self._multipliers
@property
def date_range(self) -> Optional[List[List[str]]]:
"""Date range property"""
return self._date_range
def execute(self, value_array: _xr.DataArray, logger: ILogger) -> _xr.DataArray:
"""Multiplies the value with the specified multipliers. If there is no
date range, multiply the whole DataArray with the same multiplier. If
there is table format with date range, make sure that the correct values
in time are multiplied with the corresponding multipliers.
Args:
value_array (DataArray): Values to multiply
Returns:
DataArray: Multiplied values
"""
# Per time period multiple multipliers can be given, reduce this to
# one multiplier by taking the product of all multipliers.
result_multipliers = [_np.prod(mp) for mp in self._multipliers]
old_dr = _xr.DataArray(value_array)
new_dr = _xr.full_like(old_dr, _np.nan)
for index, _mp in enumerate(result_multipliers):
if self.date_range is not None and len(self.date_range) != 0:
# Date is given in DD-MM, convert to MM-DD for comparison
start = self._convert_datestr(self.date_range[index][0])
end = self._convert_datestr(self.date_range[index][1])
dr_date = old_dr.time.dt.strftime(r"%m-%d")
new_dr = _xr.where(
(start < dr_date) & (dr_date < end), old_dr * _mp, new_dr
)
else:
new_dr = old_dr * _mp
return new_dr
def _convert_datestr(self, date_str: str) -> str:
parsed_str = _dt.strptime(date_str, r"%d-%m")
return parsed_str.strftime(r"%m-%d")
date_range: Optional[List[List[str]]]
property
readonly
Date range property
multipliers: List[List[float]]
property
readonly
Multiplier property
execute(self, value_array, logger)
Multiplies the values with the specified multipliers. If no date range is given, the whole DataArray is multiplied by the same multiplier. If a table with date ranges is given, the values that fall within each date range are multiplied by the corresponding multiplier.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| value_array | DataArray | Values to multiply | required |
Returns:
| Type | Description |
|---|---|
| DataArray | Multiplied values |
Source code in rules/multiply_rule.py
def execute(self, value_array: _xr.DataArray, logger: ILogger) -> _xr.DataArray:
"""Multiplies the value with the specified multipliers. If there is no
date range, multiply the whole DataArray with the same multiplier. If
there is table format with date range, make sure that the correct values
in time are multiplied with the corresponding multipliers.
Args:
value_array (DataArray): Values to multiply
Returns:
DataArray: Multiplied values
"""
# Per time period multiple multipliers can be given, reduce this to
# one multiplier by taking the product of all multipliers.
result_multipliers = [_np.prod(mp) for mp in self._multipliers]
old_dr = _xr.DataArray(value_array)
new_dr = _xr.full_like(old_dr, _np.nan)
for index, _mp in enumerate(result_multipliers):
if self.date_range is not None and len(self.date_range) != 0:
# Date is given in DD-MM, convert to MM-DD for comparison
start = self._convert_datestr(self.date_range[index][0])
end = self._convert_datestr(self.date_range[index][1])
dr_date = old_dr.time.dt.strftime(r"%m-%d")
new_dr = _xr.where(
(start < dr_date) & (dr_date < end), old_dr * _mp, new_dr
)
else:
new_dr = old_dr * _mp
return new_dr
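The snippet below is a stand-alone sketch (made-up values, no D-Eco Impact imports) of the date-range branch of `execute`: the month-day string of each timestamp is compared lexicographically against the converted start and end dates, and only values inside the range are multiplied; everything else stays NaN.

```python
import numpy as np
import pandas as pd
import xarray as xr

times = pd.date_range("2020-01-01", periods=6, freq="MS")
values = xr.DataArray(np.ones(6), coords={"time": times}, dims="time")

# Start and end already converted from "DD-MM" to "MM-DD"; zero-padded strings
# compare correctly in lexicographic order.
start, end = "03-01", "06-01"
day = values.time.dt.strftime("%m-%d")
result = xr.where((start < day) & (day < end), values * 10.0, np.nan)
print(result.values)  # [nan nan nan 10. 10. nan]
```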
options
multi_array_operation_type
Module for MultiArrayOperationType Class
!!! classes MultiArrayOperationType
MultiArrayOperationType (IntEnum )
Classify the multi array operation types.
Source code in options/multi_array_operation_type.py
class MultiArrayOperationType(IntEnum):
"""Classify the multi array operation types."""
MULTIPLY = 1
MIN = 2
MAX = 3
AVERAGE = 4
MEDIAN = 5
ADD = 6
SUBTRACT = 7
options_filter_extreme_rule
Module for ExtremeTypeOptions Class
!!! classes ExtremeTypeOptions
ExtremeTypeOptions (str , Enum )
Classify the extreme type options.
Source code in options/options_filter_extreme_rule.py
class ExtremeTypeOptions(str, Enum):
"""Classify the extreme type options."""
PEAKS = "peaks"
TROUGHS = "troughs"
__format__(self, format_spec)
special
Default object formatter.
Source code in options/options_filter_extreme_rule.py
def __format__(self, format_spec):
return str.__format__(str(self), format_spec)
response_curve_rule
Module for Response Curve Rule class
!!! classes ResponseCurveRule
ResponseCurveRule (RuleBase , ICellBasedRule )
Rule for response function
Source code in rules/response_curve_rule.py
class ResponseCurveRule(RuleBase, ICellBasedRule):
"""Rule for response function"""
def __init__(
self,
name: str,
input_variable_name: str,
input_values: List[float],
output_values: List[float],
):
super().__init__(name, [input_variable_name])
self._input_values = _np.array(input_values)
self._output_values = _np.array(output_values)
@property
def input_values(self):
"""Input values property"""
return self._input_values
@property
def output_values(self):
"""Output values property"""
return self._output_values
def validate(self, logger: ILogger) -> bool:
if len(self._input_values) != len(self._output_values):
logger.log_error("The input and output values must be equal.")
return False
if not (self._input_values == _np.sort(self._input_values)).all():
logger.log_error("The input values should be given in a sorted order.")
return False
return True
def execute(self, value: float, logger: ILogger):
"""Interpolate a variable, based on given input and output values.
Values lower than lowest value will be set to NaN, values larger than
the highest value will be set to NaN
Args:
value (float): value to classify
input_values (_np.array): input values to use
output_values (_np.array): output values to use
Returns:
float: response corresponding to value to classify
int[]: number of warnings less than minimum and greater than maximum
"""
values_input = self._input_values
values_output = self._output_values
warning_counter = [0, 0]
# values are constant
if value < _np.min(values_input):
# count warning exceeding min:
warning_counter[0] = 1
return values_output[0], warning_counter
if value > _np.max(values_input):
# count warning exceeding max:
warning_counter[1] = 1
return values_output[-1], warning_counter
return _np.interp(value, values_input, values_output), warning_counter
input_values
property
readonly
Input values property
output_values
property
readonly
Output values property
execute(self, value, logger)
Interpolate a variable, based on the given input and output values. Values lower than the lowest input value are set to the first output value, and values higher than the highest input value are set to the last output value; both cases are counted as warnings.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| value | float | value to classify | required |
| input_values | _np.array | input values to use | required |
| output_values | _np.array | output values to use | required |
Returns:
| Type | Description |
|---|---|
| float | response corresponding to the value to classify |
| int[] | number of warnings less than minimum and greater than maximum |
Source code in rules/response_curve_rule.py
def execute(self, value: float, logger: ILogger):
"""Interpolate a variable, based on given input and output values.
Values lower than lowest value will be set to NaN, values larger than
the highest value will be set to NaN
Args:
value (float): value to classify
input_values (_np.array): input values to use
output_values (_np.array): output values to use
Returns:
float: response corresponding to value to classify
int[]: number of warnings less than minimum and greater than maximum
"""
values_input = self._input_values
values_output = self._output_values
warning_counter = [0, 0]
# values are constant
if value < _np.min(values_input):
# count warning exceeding min:
warning_counter[0] = 1
return values_output[0], warning_counter
if value > _np.max(values_input):
# count warning exceeding max:
warning_counter[1] = 1
return values_output[-1], warning_counter
return _np.interp(value, values_input, values_output), warning_counter
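A minimal numpy sketch (illustrative numbers only) of what `execute` computes per cell: linear interpolation through the given points, with values outside the input range returned as the first or last output value, matching the implementation above.

```python
import numpy as np

input_values = np.array([0.0, 1.0, 2.0])
output_values = np.array([0.0, 10.0, 40.0])

print(np.interp(0.5, input_values, output_values))   # 5.0   inside the curve
print(np.interp(-1.0, input_values, output_values))  # 0.0   below range -> first output
print(np.interp(3.0, input_values, output_values))   # 40.0  above range -> last output
```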
validate(self, logger)
Validates if the rule is valid
Returns:
| Type | Description |
|---|---|
| bool | whether the rule is valid |
Source code in rules/response_curve_rule.py
def validate(self, logger: ILogger) -> bool:
if len(self._input_values) != len(self._output_values):
logger.log_error("The input and output values must be equal.")
return False
if not (self._input_values == _np.sort(self._input_values)).all():
logger.log_error("The input values should be given in a sorted order.")
return False
return True
rolling_statistics_rule
Module for RollingStatisticsRule class
!!! classes RollingStatisticsRule
RollingStatisticsRule (RuleBase , IArrayBasedRule )
Implementation for the rolling statistics rule
Source code in rules/rolling_statistics_rule.py
class RollingStatisticsRule(RuleBase, IArrayBasedRule):
"""Implementation for the rolling statistics rule"""
def __init__(
self,
name: str,
input_variable_names: List[str],
operation_type: TimeOperationType,
):
super().__init__(name, input_variable_names)
self._settings = TimeOperationSettings({"hour": "H", "day": "D"})
self._settings.percentile_value = 0
self._settings.operation_type = operation_type
self._settings.time_scale = "day"
self._period = 1
@property
def settings(self):
"""Time operation settings"""
return self._settings
@property
def period(self) -> float:
"""Operation type property"""
return self._period
@period.setter
def period(self, period: float):
self._period = period
def validate(self, logger: ILogger) -> bool:
"""Validates if the rule is valid
Returns:
bool: wether the rule is valid
"""
return self.settings.validate(self.name, logger)
def execute(self, value_array: _xr.DataArray, logger: ILogger) -> _xr.DataArray:
"""Calculating the rolling statistics for a given period
Args:
value_array (DataArray): value to aggregate
Returns:
DataArray: Aggregated values
"""
time_scale = get_dict_element(
self.settings.time_scale, self.settings.time_scale_mapping
)
time_dim_name = get_time_dimension_name(value_array, logger)
result = self._perform_operation(
value_array,
time_dim_name,
time_scale,
logger,
)
return result
def _perform_operation(
self,
values: _xr.DataArray,
time_dim_name: str,
time_scale: str,
logger: ILogger,
) -> _xr.DataArray:
"""Returns the values based on the operation type
Args:
values (_xr.DataArray): values
time_dim_name (str): time dimension name
dim_name (str): dimension name
logger (ILogger): logger
Raises:
NotImplementedError: If operation type is not supported
Returns:
DataArray: Values of operation type
"""
result_array = _cp.deepcopy(values)
result_array = result_array.where(False, _np.nan)
if time_scale == "H":
operation_time_delta = _dt.timedelta(hours=self._period)
elif time_scale == "D":
operation_time_delta = _dt.timedelta(days=self._period)
else:
error_message = f"Invalid time scale provided : '{time_scale}'."
logger.log_error(error_message)
raise ValueError(error_message)
time_delta_ms = _np.array([operation_time_delta], dtype="timedelta64[ms]")[0]
last_timestamp = values.time.isel(time=-1).values
for time_step in values.time.values: # Interested in vectorizing this loop
if last_timestamp - time_step < time_delta_ms:
break
data = values.sel(time=slice(time_step, time_step + time_delta_ms))
last_timestamp_data = data.time.isel(time=-1).values
result = self._apply_operation(data, time_dim_name)
result_array.loc[{"time": last_timestamp_data}] = result
return _xr.DataArray(result_array)
def _apply_operation(
self, data: _xr.DataArray, time_dim_name: str
) -> _xr.DataArray:
operation_type = self.settings.operation_type
if operation_type is TimeOperationType.ADD:
result = data.sum(dim=time_dim_name)
elif operation_type is TimeOperationType.MIN:
result = data.min(dim=time_dim_name)
elif operation_type is TimeOperationType.MAX:
result = data.max(dim=time_dim_name)
elif operation_type is TimeOperationType.AVERAGE:
result = data.mean(dim=time_dim_name)
elif operation_type is TimeOperationType.MEDIAN:
result = data.median(dim=time_dim_name)
elif operation_type is TimeOperationType.STDEV:
result = data.std(dim=time_dim_name)
elif operation_type is TimeOperationType.PERCENTILE:
result = data.quantile(
self.settings.percentile_value / 100, dim=time_dim_name
).drop_vars("quantile")
else:
raise NotImplementedError(
f"The operation type '{operation_type}' " "is currently not supported"
)
return result
period: float
property
writable
Period property
settings
property
readonly
Time operation settings
execute(self, value_array, logger)
Calculating the rolling statistics for a given period
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| value_array | DataArray | value to aggregate | required |
Returns:
| Type | Description |
|---|---|
| DataArray | Aggregated values |
Source code in rules/rolling_statistics_rule.py
def execute(self, value_array: _xr.DataArray, logger: ILogger) -> _xr.DataArray:
"""Calculating the rolling statistics for a given period
Args:
value_array (DataArray): value to aggregate
Returns:
DataArray: Aggregated values
"""
time_scale = get_dict_element(
self.settings.time_scale, self.settings.time_scale_mapping
)
time_dim_name = get_time_dimension_name(value_array, logger)
result = self._perform_operation(
value_array,
time_dim_name,
time_scale,
logger,
)
return result
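A small stand-alone sketch (made-up daily data) of the windowing used by this rule: for each start time a window of `period` days is selected and the chosen statistic is written at the last timestamp of that window, leaving earlier timestamps as NaN.

```python
import numpy as np
import pandas as pd
import xarray as xr

times = pd.date_range("2020-01-01", periods=5, freq="D")
values = xr.DataArray([1.0, 2.0, 3.0, 4.0, 5.0], coords={"time": times}, dims="time")

period = np.timedelta64(2, "D")
result = xr.full_like(values, np.nan)
for start in values.time.values:
    if values.time.values[-1] - start < period:
        break
    window = values.sel(time=slice(start, start + period))
    result.loc[{"time": window.time.values[-1]}] = window.max()  # e.g. MAX operation
print(result.values)  # [nan nan  3.  4.  5.]
```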
validate(self, logger)
Validates if the rule is valid
Returns:
| Type | Description |
|---|---|
| bool | whether the rule is valid |
Source code in rules/rolling_statistics_rule.py
def validate(self, logger: ILogger) -> bool:
"""Validates if the rule is valid
Returns:
bool: wether the rule is valid
"""
return self.settings.validate(self.name, logger)
rule_base
Module for RuleBase
!!! classes RuleBase
RuleBase (IRule , ABC )
Implementation of the rule base
Source code in rules/rule_base.py
class RuleBase(IRule, ABC):
"""Implementation of the rule base"""
def __init__(self, name: str, input_variable_names: List[str]):
self._name = name
self._description = ""
self._input_variable_names = input_variable_names
self._output_variable_name = "output"
@property
def name(self) -> str:
"""Name of the rule"""
return self._name
@name.setter
def name(self, name: str):
"""Name of the rule"""
self._name = name
@property
def description(self) -> str:
"""Description of the rule"""
return self._description
@description.setter
def description(self, description: str):
"""Description of the rule"""
self._description = description
@property
def input_variable_names(self) -> List[str]:
"""Name of the input variable"""
return self._input_variable_names
@input_variable_names.setter
def input_variable_names(self, input_variable_names: List[str]) -> List[str]:
"""Name of the input variables"""
self._input_variable_names = input_variable_names
@property
def output_variable_name(self) -> str:
"""Name of the output variable"""
return self._output_variable_name
@output_variable_name.setter
def output_variable_name(self, output_variable_name: str):
"""Name of the output variable"""
self._output_variable_name = output_variable_name
def validate(self, logger: ILogger) -> bool:
"""Validates if the rule is valid
Returns:
bool: wether the rule is valid
"""
return True
description: str
property
writable
Description of the rule
input_variable_names: List[str]
property
writable
Name of the input variable
name: str
property
writable
Name of the rule
output_variable_name: str
property
writable
Name of the output variable
validate(self, logger)
Validates if the rule is valid
Returns:
| Type | Description |
|---|---|
| bool | whether the rule is valid |
Source code in rules/rule_base.py
def validate(self, logger: ILogger) -> bool:
"""Validates if the rule is valid
Returns:
bool: wether the rule is valid
"""
return True
step_function_rule
Module for StepFunction class
!!! classes StepFunction
StepFunctionRule (RuleBase , ICellBasedRule )
Rule for Step function
Defines a step function output (float) to an input (float).
The input sorted list [limit_1, limit_2, ..., limit_i, ..., limit_n] where limit_1 < limit_2 < ... < limit_i < ... < limit_n defines the limits of the interval for which the output values apply.
- f(val) = f(limit_i) if limit_i <= val < limit_(i+1); no warning message is logged.
- f(val) = f(limit_1) if val = limit_1; no warning message is logged.
- f(val) = f(limit_1) if val < limit_1, and a warning message is logged.
- f(val) = f(limit_n) if val = limit_n; no warning message is logged.
- f(val) = f(limit_n) if val > limit_n, and a warning message is logged.
Source code in rules/step_function_rule.py
class StepFunctionRule(RuleBase, ICellBasedRule):
"""Rule for Step function
Defines a step function output (float) to an input (float).
The input sorted list [limit_1, limit_2, ..., limit_i, ..., limit_n]
where limit_1 < limit_2 < ... < limit_i < ... < limit_n
defines the limits of the interval for which the output values apply.
f(val) = f(limit_i) if limit_i<= val < limit_(i+1), no warning message is logged.
f(val) = f(limit_1) if val = limit_1, no warning message is logged.
f(val) = f(limit_1) if val < limit_1, and a warning message is logged.
f(val) = f(limit_n) if val = limit_n, no warning message is logged.
f(val) = f(limit_n) if val > limit_n, and a warning message is logged.
"""
def __init__(
self,
name: str,
input_variable_name: str,
limits: List[float],
responses: List[float],
):
super().__init__(name, [input_variable_name])
self._limits = _np.array(limits)
self._responses = _np.array(responses)
@property
def limits(self):
"""Limits property"""
return self._limits
@property
def responses(self):
"""Responses property"""
return self._responses
def validate(self, logger: ILogger) -> bool:
if len(self._limits) != len(self._responses):
logger.log_error("The number of limits and of responses must be equal.")
return False
if len(self._limits) != len(set(self._limits)):
logger.log_error("Limits must be unique.")
return False
if not (self._limits == _np.sort(self._limits)).all():
logger.log_error("The limits should be given in a sorted order.")
return False
return True
def execute(self, value: float, logger: ILogger):
"""Classify a variable, based on given bins.
Values lower than lowest bin will produce a warning and will
be assigned class 0.
Values larger than the largest bin will produce a warning
and will get the highest bin index.
Args:
date (_type_): _description_
value (float): value to classify
Returns:
float: response corresponding to value to classify
int[]: number of warnings less than minimum and greater than maximum
"""
bins = self._limits
responses = self._responses
# bins are constant
selected_bin = -1
warning_counter = [0, 0]
if _np.isnan(value):
return value, warning_counter
if value < _np.min(bins):
# count warning exceeding min:
warning_counter[0] = 1
selected_bin = 0
else:
selected_bin = _np.digitize(value, bins) - 1
if value > _np.max(bins):
# count warning exceeding max:
warning_counter[1] = 1
return responses[selected_bin], warning_counter
limits
property
readonly
Limits property
responses
property
readonly
Responses property
execute(self, value, logger)
Classify a variable, based on the given bins (limits). Values lower than the lowest limit produce a warning and are assigned the first response; values larger than the largest limit produce a warning and are assigned the response of the last bin. NaN values are passed through unchanged.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| date | _type_ | _description_ | required |
| value | float | value to classify | required |
Returns:
| Type | Description |
|---|---|
| float | response corresponding to the value to classify |
| int[] | number of warnings less than minimum and greater than maximum |
Source code in rules/step_function_rule.py
def execute(self, value: float, logger: ILogger):
"""Classify a variable, based on given bins.
Values lower than lowest bin will produce a warning and will
be assigned class 0.
Values larger than the largest bin will produce a warning
and will get the highest bin index.
Args:
date (_type_): _description_
value (float): value to classify
Returns:
float: response corresponding to value to classify
int[]: number of warnings less than minimum and greater than maximum
"""
bins = self._limits
responses = self._responses
# bins are constant
selected_bin = -1
warning_counter = [0, 0]
if _np.isnan(value):
return value, warning_counter
if value < _np.min(bins):
# count warning exceeding min:
warning_counter[0] = 1
selected_bin = 0
else:
selected_bin = _np.digitize(value, bins) - 1
if value > _np.max(bins):
# count warning exceeding max:
warning_counter[1] = 1
return responses[selected_bin], warning_counter
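The sketch below (illustrative limits and responses, plain numpy) mirrors the binning above: `np.digitize` selects the interval a value falls in, values below the first limit fall back to the first response, and values above the last limit stay in the last bin; the out-of-range cases are the ones counted as warnings.

```python
import numpy as np

limits = np.array([0.0, 5.0, 10.0])
responses = np.array([1.0, 2.0, 3.0])

for value in (-1.0, 0.0, 7.5, 12.0):
    if value < limits[0]:
        selected_bin = 0                             # below range: first response
    else:
        selected_bin = np.digitize(value, limits) - 1
    print(value, "->", responses[selected_bin])
# -1.0 -> 1.0, 0.0 -> 1.0, 7.5 -> 2.0, 12.0 -> 3.0
```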
validate(self, logger)
Validates if the rule is valid
Returns:
| Type | Description |
|---|---|
| bool | whether the rule is valid |
Source code in rules/step_function_rule.py
def validate(self, logger: ILogger) -> bool:
if len(self._limits) != len(self._responses):
logger.log_error("The number of limits and of responses must be equal.")
return False
if len(self._limits) != len(set(self._limits)):
logger.log_error("Limits must be unique.")
return False
if not (self._limits == _np.sort(self._limits)).all():
logger.log_error("The limits should be given in a sorted order.")
return False
return True
string_parser_utils
Module for parser strings
read_str_comparison(compare_str, operator)
Read the string of a comparison (with the specified operator) and validate that it is in the correct format (`<operator><number>`, e.g. `>100`).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| compare_str | str | String to be checked | required |
| operator | str | Operator to split on | required |
Exceptions:
| Type | Description |
|---|---|
| ValueError | If the compared value is not a number |
Returns:
| Type | Description |
|---|---|
| float | The number from the comparison string |
Source code in rules/string_parser_utils.py
def read_str_comparison(compare_str: str, operator: str):
"""Read the string of a comparison (with specified operator) and
validate if this is in the correct format (<operator><number>, eg: >100)
Args:
compare_str (str): String to be checked
operator (str): Operator to split on
Raises:
ValueError: If the compared value is not a number
Returns:
float: The number from the comparison string
"""
compare_str = compare_str.strip()
try:
compare_list = compare_str.split(operator)
if len(compare_list) != 2:
raise IndexError(
f'Input "{compare_str}" is not a valid comparison '
f"with operator: {operator}"
)
compare_val = compare_list[1]
return float(compare_val)
except ValueError as exc:
raise ValueError(
f'Input "{compare_str}" is not a valid comparison with '
f"operator: {operator}"
) from exc
str_range_to_list(range_string)
Convert a string with a range in the form "x:y" of floats to two elements (begin and end of range).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| range_string | str | String to be converted to a range (begin and end) | required |
Exceptions:
| Type | Description |
|---|---|
| ValueError | If the string is not properly defined |
Returns:
| Type | Description |
|---|---|
| floats | Return the begin and end value of the range |
Source code in rules/string_parser_utils.py
def str_range_to_list(range_string: str):
"""Convert a string with a range in the form "x:y" of floats to
two elements (begin and end of range).
Args:
range_string (str): String to be converted to a range (begin and end)
Raises:
ValueError: If the string is not properly defined
Returns:
floats: Return the begin and end value of the range
"""
range_string = range_string.strip()
try:
begin, end = range_string.split(":")
return float(begin), float(end)
except ValueError as exc:
raise ValueError(f'Input "{range_string}" is not a valid range') from exc
type_of_classification(class_val)
Determine which type of classification is required: number, range, or NA (not applicable)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| class_val | _type_ | String to classify | required |
Exceptions:
| Type | Description |
|---|---|
| ValueError | Error when the string is not properly defined |
Returns:
| Type | Description |
|---|---|
| str | Type of classification |
Source code in rules/string_parser_utils.py
def type_of_classification(class_val: Any) -> str:
"""Determine which type of classification is required: number, range, or
NA (not applicable)
Args:
class_val (_type_): String to classify
Raises:
ValueError: Error when the string is not properly defined
Returns:
str: Type of classification
"""
class_type = None
if isinstance(class_val, (float, int)):
class_type = "number"
elif isinstance(class_val, str):
class_val = class_val.strip()
if class_val in ("-", ""):
class_type = "NA"
elif ":" in class_val:
str_range_to_list(class_val)
class_type = "range"
elif ">=" in class_val:
read_str_comparison(class_val, ">=")
class_type = "larger_equal"
elif "<=" in class_val:
read_str_comparison(class_val, "<=")
class_type = "smaller_equal"
elif ">" in class_val:
read_str_comparison(class_val, ">")
class_type = "larger"
elif "<" in class_val:
read_str_comparison(class_val, "<")
class_type = "smaller"
if not class_type:
try:
float(class_val)
class_type = "number"
except TypeError as exc:
raise ValueError(f"No valid criteria is given: {class_val}") from exc
return class_type
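To illustrate the criteria strings these helpers accept, the stand-alone sketch below reproduces the classification order of `type_of_classification` for a few example inputs (simplified re-implementation for illustration only; the input strings are made up).

```python
def classify(criterion: str) -> str:
    """Simplified sketch of type_of_classification, for illustration only."""
    criterion = str(criterion).strip()
    if criterion in ("-", ""):
        return "NA"
    if ":" in criterion:
        begin, end = (float(part) for part in criterion.split(":"))
        return f"range ({begin}, {end})"
    # Check the two-character operators before the single-character ones.
    for operator, label in ((">=", "larger_equal"), ("<=", "smaller_equal"),
                            (">", "larger"), ("<", "smaller")):
        if operator in criterion:
            return f"{label} ({float(criterion.split(operator)[1])})"
    return f"number ({float(criterion)})"

for crit in ("3.5", "0:10", ">100", "<=2.5", "-"):
    print(f"{crit!r} -> {classify(crit)}")
```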
time_aggregation_rule
Module for TimeAggregationRule class
!!! classes TimeAggregationRule
TimeAggregationRule (RuleBase , IArrayBasedRule )
Implementation for the time aggregation rule
Source code in rules/time_aggregation_rule.py
class TimeAggregationRule(RuleBase, IArrayBasedRule):
"""Implementation for the time aggregation rule"""
def __init__(
self,
name: str,
input_variable_names: List[str],
operation_type: TimeOperationType,
):
super().__init__(name, input_variable_names)
self._settings = TimeOperationSettings({"month": "ME", "year": "YE"})
self._settings.percentile_value = 0
self._settings.operation_type = operation_type
self._settings.time_scale = "year"
@property
def settings(self):
"""Time operation settings"""
return self._settings
def validate(self, logger: ILogger) -> bool:
"""Validates if the rule is valid
Returns:
bool: wether the rule is valid
"""
return self.settings.validate(self.name, logger)
def execute(self, value_array: _xr.DataArray, logger: ILogger) -> _xr.DataArray:
"""Aggregates the values for the specified start and end date
Args:
value_array (DataArray): value to aggregate
Returns:
DataArray: Aggregated values
"""
settings = self._settings
if settings.operation_type is TimeOperationType.COUNT_PERIODS:
# Check if all values in a COUNT_PERIODS value array
# are either 0 or 1 or NaN
compare_values = (
(value_array == 0) | (value_array == 1) | _np.isnan(value_array)
)
check_values = _xr.where(compare_values, True, False)
if False in check_values:
raise ValueError(
"The value array for the time aggregation rule with operation type"
" COUNT_PERIODS should only contain the values 0 and 1 (or NaN)."
)
dim_name = get_dict_element(settings.time_scale, settings.time_scale_mapping)
time_dim_name = get_time_dimension_name(value_array, logger)
aggregated_values = value_array.resample({time_dim_name: dim_name}, skipna=True)
result = self._perform_operation(aggregated_values)
# create a new aggregated time dimension based on original time dimension
result_time_dim_name = f"{time_dim_name}_{settings.time_scale}"
result = result.rename({time_dim_name: result_time_dim_name})
for key, value in value_array[time_dim_name].attrs.items():
if value:
result[result_time_dim_name].attrs[key] = value
result = result.assign_coords(
{result_time_dim_name: result[result_time_dim_name]}
)
result[result_time_dim_name].attrs["long_name"] = result_time_dim_name
result[result_time_dim_name].attrs["standard_name"] = result_time_dim_name
return result
def _perform_operation(self, aggregated_values: DataArrayResample) -> _xr.DataArray:
"""Returns the values based on the operation type
Args:
aggregated_values (DataArrayResample): aggregate values
Raises:
NotImplementedError: If operation type is not supported
Returns:
DataArray: Values of operation type
"""
period_operations = [
TimeOperationType.COUNT_PERIODS,
TimeOperationType.MAX_DURATION_PERIODS,
TimeOperationType.AVG_DURATION_PERIODS,
]
operation_type = self.settings.operation_type
if operation_type is TimeOperationType.ADD:
result = aggregated_values.sum()
elif operation_type is TimeOperationType.MIN:
result = aggregated_values.min()
elif operation_type is TimeOperationType.MAX:
result = aggregated_values.max()
elif operation_type is TimeOperationType.AVERAGE:
result = aggregated_values.mean()
elif operation_type is TimeOperationType.MEDIAN:
result = aggregated_values.median()
elif operation_type in period_operations:
result = aggregated_values.reduce(self.analyze_groups, dim="time")
elif operation_type is TimeOperationType.STDEV:
result = aggregated_values.std()
elif operation_type is TimeOperationType.PERCENTILE:
result = aggregated_values.quantile(
self.settings.percentile_value / 100
).drop_vars("quantile")
else:
raise NotImplementedError(
f"The operation type '{operation_type}' " "is currently not supported"
)
return _xr.DataArray(result)
def count_groups(self, elem):
"""
Count the amount of times the groups of 1 occur.
Args:
elem (Array): the data array in N-dimensions
Returns:
List: list with the counted periods
"""
# in case of an example array with 5 values [1,1,0,1,0]:
# subtract last 4 values from the first 4 values: [1,0,1,0] - [1,1,0,1]:
# (the result of this example differences: [0,-1,1,-1])
differences = _np.diff(elem)
# First add the first element of the array to the difference array (as this
# could also indicate a beginning of a group or not and the diff is calculated
# from the second element)
# when the difference of two neighbouring elements is 1, this indicates the
# start of a group. to count the number of groups: count the occurences of
# difference == 1: (the result of this examples: 1 + 1 = 2)
differences = _np.append(differences, elem[0])
return _np.count_nonzero(differences == 1)
def duration_groups(self, elem):
"""
Create an array that cumulative sums the values of the groups in the array,
but restarts when a 0 occurs. For example: [0, 1, 1, 0, 1, 1, 1, 0, 1]
This function will return: [0, 1, 2, 0, 1, 2, 3, 0, 1, 2, 0, 1]
Args:
elem (List): the data array in N-dimensions
Returns:
List: List with the duration of the periods
"""
# Function to create a cumsum over the groups (where the elements in elem are 1)
cumsum_groups = _np.frompyfunc(lambda a, b: a + b if b == 1 else 0, 2, 1)
return cumsum_groups.accumulate(elem)
def analyze_groups(self, elem, axis):
"""This function analyzes the input array (N-dimensional array containing 0
and 1) The function will reduce the array over the time axis, depending on a
certain time operation type. Below are the operation types with what this
function will do to this example input array: [0, 1, 1, 0, 1, 0]. A period
is all consecutive 1 values.
- COUNT_PERIODS: count the amount of periods (result: 2)
- MAX_DURATION_PERIODS: gives the longest period (result: 2)
- AVG_DURATION_PERIODS: gives the average of periods (result: 1.5)
Args:
elem (Array): the data array in N-dimensions
axis (integer): the value describing the time axis
Returns:
array: array with the analyzed periods, with the same dimensions as elem
"""
# Determine the number of axes in the array
no_axis = len(_np.shape(elem))
# The reduce function that calls this analyze_groups function should be reduces
# over the time axis. The argument axis in this function gives a number of which
# axis is in fact the time axis. This axis needs to move to the last position,
# because we need to reduce the N-dimensional arary to a 1D array with all the
# values in time for a specific cell in order to do the calculation for that
# cell. Because we are looping over the N-dimensional array iteratively, we
# should only move the time axis the first time this function is called (so when
# the axis is not yet set to -1!)
if axis != -1:
elem = _np.moveaxis(elem, axis, -1)
axis = -1
# in case of 1 dimension:
if no_axis == 1:
# remove NaN values from the array (these are to be ignored)
elem = elem[~_np.isnan(elem)]
if len(elem) == 0:
return 0
if self.settings.operation_type is TimeOperationType.COUNT_PERIODS:
group_result = self.count_groups(elem)
elif self.settings.operation_type is TimeOperationType.MAX_DURATION_PERIODS:
group_result = _np.max((self.duration_groups(elem)))
elif self.settings.operation_type is TimeOperationType.AVG_DURATION_PERIODS:
period = float(_np.sum(elem))
group_count = float(self.count_groups(elem))
group_result = _np.divide(
period,
group_count,
out=_np.zeros_like(period),
where=group_count != 0,
)
# in case of multiple dimensions:
else:
group_result = []
for sub_elem in elem:
# loop through this recursive function, determine output per axis:
group_result_row = self.analyze_groups(sub_elem, axis)
# add the result to the list of results, per axis:
group_result.append(group_result_row)
return group_result
settings
property
readonly
Time operation settings
analyze_groups(self, elem, axis)
This function analyzes the input array (an N-dimensional array containing 0 and 1) and reduces it over the time axis, depending on the time operation type. A period is a run of consecutive 1 values. For the example input array [0, 1, 1, 0, 1, 0] the operation types give:
- COUNT_PERIODS: counts the number of periods (result: 2)
- MAX_DURATION_PERIODS: gives the length of the longest period (result: 2)
- AVG_DURATION_PERIODS: gives the average period length (result: 1.5)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| elem | Array | the data array in N-dimensions | required |
| axis | integer | the value describing the time axis | required |
Returns:
| Type | Description |
|---|---|
| array | array with the analyzed periods, with the same dimensions as elem |
Source code in rules/time_aggregation_rule.py
def analyze_groups(self, elem, axis):
"""This function analyzes the input array (N-dimensional array containing 0
and 1) The function will reduce the array over the time axis, depending on a
certain time operation type. Below are the operation types with what this
function will do to this example input array: [0, 1, 1, 0, 1, 0]. A period
is all consecutive 1 values.
- COUNT_PERIODS: count the amount of periods (result: 2)
- MAX_DURATION_PERIODS: gives the longest period (result: 2)
- AVG_DURATION_PERIODS: gives the average of periods (result: 1.5)
Args:
elem (Array): the data array in N-dimensions
axis (integer): the value describing the time axis
Returns:
array: array with the analyzed periods, with the same dimensions as elem
"""
# Determine the number of axes in the array
no_axis = len(_np.shape(elem))
# The reduce function that calls this analyze_groups function should be reduces
# over the time axis. The argument axis in this function gives a number of which
# axis is in fact the time axis. This axis needs to move to the last position,
# because we need to reduce the N-dimensional arary to a 1D array with all the
# values in time for a specific cell in order to do the calculation for that
# cell. Because we are looping over the N-dimensional array iteratively, we
# should only move the time axis the first time this function is called (so when
# the axis is not yet set to -1!)
if axis != -1:
elem = _np.moveaxis(elem, axis, -1)
axis = -1
# in case of 1 dimension:
if no_axis == 1:
# remove NaN values from the array (these are to be ignored)
elem = elem[~_np.isnan(elem)]
if len(elem) == 0:
return 0
if self.settings.operation_type is TimeOperationType.COUNT_PERIODS:
group_result = self.count_groups(elem)
elif self.settings.operation_type is TimeOperationType.MAX_DURATION_PERIODS:
group_result = _np.max((self.duration_groups(elem)))
elif self.settings.operation_type is TimeOperationType.AVG_DURATION_PERIODS:
period = float(_np.sum(elem))
group_count = float(self.count_groups(elem))
group_result = _np.divide(
period,
group_count,
out=_np.zeros_like(period),
where=group_count != 0,
)
# in case of multiple dimensions:
else:
group_result = []
for sub_elem in elem:
# loop through this recursive function, determine output per axis:
group_result_row = self.analyze_groups(sub_elem, axis)
# add the result to the list of results, per axis:
group_result.append(group_result_row)
return group_result
count_groups(self, elem)
Count the amount of times the groups of 1 occur.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| elem | Array | the data array in N-dimensions | required |
Returns:
| Type | Description |
|---|---|
| List | list with the counted periods |
Source code in rules/time_aggregation_rule.py
def count_groups(self, elem):
"""
Count the amount of times the groups of 1 occur.
Args:
elem (Array): the data array in N-dimensions
Returns:
List: list with the counted periods
"""
# in case of an example array with 5 values [1,1,0,1,0]:
# subtract last 4 values from the first 4 values: [1,0,1,0] - [1,1,0,1]:
# (the result of this example differences: [0,-1,1,-1])
differences = _np.diff(elem)
# First add the first element of the array to the difference array (as this
# could also indicate a beginning of a group or not and the diff is calculated
# from the second element)
# when the difference of two neighbouring elements is 1, this indicates the
# start of a group. to count the number of groups: count the occurences of
# difference == 1: (the result of this examples: 1 + 1 = 2)
differences = _np.append(differences, elem[0])
return _np.count_nonzero(differences == 1)
duration_groups(self, elem)
Create an array with the cumulative sum of the values within each group of consecutive 1 values, restarting at every 0. For example, the input [0, 1, 1, 0, 1, 1, 1, 0, 1] returns [0, 1, 2, 0, 1, 2, 3, 0, 1].
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| elem | List | the data array in N-dimensions | required |
Returns:
| Type | Description |
|---|---|
| List | List with the duration of the periods |
Source code in rules/time_aggregation_rule.py
def duration_groups(self, elem):
"""
Create an array that cumulative sums the values of the groups in the array,
but restarts when a 0 occurs. For example: [0, 1, 1, 0, 1, 1, 1, 0, 1]
This function will return: [0, 1, 2, 0, 1, 2, 3, 0, 1, 2, 0, 1]
Args:
elem (List): the data array in N-dimensions
Returns:
List: List with the duration of the periods
"""
# Function to create a cumsum over the groups (where the elements in elem are 1)
cumsum_groups = _np.frompyfunc(lambda a, b: a + b if b == 1 else 0, 2, 1)
return cumsum_groups.accumulate(elem)
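A minimal numpy sketch (illustrative 0/1 series, where 1 means the criterion is met) of the bookkeeping used by `count_groups` and `duration_groups`, and how the three period operation types follow from it.

```python
import numpy as np

elem = np.array([0, 1, 1, 0, 1, 0], dtype=object)

# COUNT_PERIODS: a period starts where the difference between neighbours is 1;
# the first element is appended so a period starting at index 0 is counted too.
differences = np.append(np.diff(elem), elem[0])
n_periods = np.count_nonzero(differences == 1)

# MAX_DURATION_PERIODS: cumulative sum that restarts at every 0.
cumsum_groups = np.frompyfunc(lambda a, b: a + b if b == 1 else 0, 2, 1)
durations = cumsum_groups.accumulate(elem)          # [0 1 2 0 1 0]

print(n_periods)               # 2   (COUNT_PERIODS)
print(durations.max())         # 2   (MAX_DURATION_PERIODS)
print(elem.sum() / n_periods)  # 1.5 (AVG_DURATION_PERIODS)
```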
execute(self, value_array, logger)
Aggregates the values for the specified start and end date
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| value_array | DataArray | value to aggregate | required |
Returns:
| Type | Description |
|---|---|
| DataArray | Aggregated values |
Source code in rules/time_aggregation_rule.py
def execute(self, value_array: _xr.DataArray, logger: ILogger) -> _xr.DataArray:
"""Aggregates the values for the specified start and end date
Args:
value_array (DataArray): value to aggregate
Returns:
DataArray: Aggregated values
"""
settings = self._settings
if settings.operation_type is TimeOperationType.COUNT_PERIODS:
# Check if all values in a COUNT_PERIODS value array
# are either 0 or 1 or NaN
compare_values = (
(value_array == 0) | (value_array == 1) | _np.isnan(value_array)
)
check_values = _xr.where(compare_values, True, False)
if False in check_values:
raise ValueError(
"The value array for the time aggregation rule with operation type"
" COUNT_PERIODS should only contain the values 0 and 1 (or NaN)."
)
dim_name = get_dict_element(settings.time_scale, settings.time_scale_mapping)
time_dim_name = get_time_dimension_name(value_array, logger)
aggregated_values = value_array.resample({time_dim_name: dim_name}, skipna=True)
result = self._perform_operation(aggregated_values)
# create a new aggregated time dimension based on original time dimension
result_time_dim_name = f"{time_dim_name}_{settings.time_scale}"
result = result.rename({time_dim_name: result_time_dim_name})
for key, value in value_array[time_dim_name].attrs.items():
if value:
result[result_time_dim_name].attrs[key] = value
result = result.assign_coords(
{result_time_dim_name: result[result_time_dim_name]}
)
result[result_time_dim_name].attrs["long_name"] = result_time_dim_name
result[result_time_dim_name].attrs["standard_name"] = result_time_dim_name
return result
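For the non-period operation types the aggregation boils down to an xarray resample over the time dimension, as sketched below with made-up monthly values and a yearly MAX aggregation. The "YE" alias matches the time scale mapping shown above (it assumes a recent pandas version).

```python
import numpy as np
import pandas as pd
import xarray as xr

times = pd.date_range("2020-01-01", "2021-12-01", freq="MS")   # 24 monthly values
values = xr.DataArray(np.arange(len(times), dtype=float),
                      coords={"time": times}, dims="time")

yearly_max = values.resample(time="YE").max()
print(yearly_max.values)  # [11. 23.]
```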
validate(self, logger)
Validates if the rule is valid
Returns:
| Type | Description |
|---|---|
| bool | whether the rule is valid |
Source code in rules/time_aggregation_rule.py
def validate(self, logger: ILogger) -> bool:
"""Validates if the rule is valid
Returns:
bool: wether the rule is valid
"""
return self.settings.validate(self.name, logger)
time_operation_settings
Module for TimeOperationSettings class
!!! classes TimeOperationSettings
TimeOperationSettings
Implementation for the time operation settings
Source code in rules/time_operation_settings.py
class TimeOperationSettings:
"""Implementation for the time operation settings"""
def __init__(
self,
time_scale_mapping: Dict[str, str],
):
if len(time_scale_mapping) == 0:
raise ValueError("The time_scale_mapping does not contain any values")
self._time_scale_mapping = time_scale_mapping
self._time_scale = next(iter(time_scale_mapping.keys()))
self._operation_type = TimeOperationType.AVERAGE
self._percentile_value = 0.0
@property
def operation_type(self) -> TimeOperationType:
"""Operation type property"""
return self._operation_type
@operation_type.setter
def operation_type(self, operation_type: TimeOperationType):
self._operation_type = operation_type
@property
def percentile_value(self) -> float:
"""Operation parameter property"""
return self._percentile_value
@percentile_value.setter
def percentile_value(self, percentile_value: float):
self._percentile_value = percentile_value
@property
def time_scale(self) -> str:
"""Time scale property"""
return self._time_scale
@time_scale.setter
def time_scale(self, time_scale: str):
self._time_scale = time_scale.lower()
@property
def time_scale_mapping(self) -> Dict[str, str]:
"""Time scale mapping property"""
return self._time_scale_mapping
def validate(self, rule_name: str, logger: ILogger) -> bool:
"""Validates if the rule is valid
Returns:
bool: wether the rule is valid
"""
valid = True
allowed_time_scales = self.time_scale_mapping.keys()
if self.time_scale not in allowed_time_scales:
options = ",".join(allowed_time_scales)
logger.log_error(
f"The provided time scale '{self.time_scale}' "
f"of rule '{rule_name}' is not supported.\n"
f"Please select one of the following types: "
f"{options}"
)
valid = False
return valid
operation_type: TimeOperationType
property
writable
Operation type property
percentile_value: float
property
writable
Operation parameter property
time_scale: str
property
writable
Time scale property
time_scale_mapping: Dict[str, str]
property
readonly
Time scale mapping property
validate(self, rule_name, logger)
Validates if the rule is valid
Returns:
| Type | Description |
|---|---|
| bool | whether the rule is valid |
Source code in rules/time_operation_settings.py
def validate(self, rule_name: str, logger: ILogger) -> bool:
"""Validates if the rule is valid
Returns:
bool: wether the rule is valid
"""
valid = True
allowed_time_scales = self.time_scale_mapping.keys()
if self.time_scale not in allowed_time_scales:
options = ",".join(allowed_time_scales)
logger.log_error(
f"The provided time scale '{self.time_scale}' "
f"of rule '{rule_name}' is not supported.\n"
f"Please select one of the following types: "
f"{options}"
)
valid = False
return valid
utils
command_line_utils
Module for command line utils
read_command_line_arguments()
Reads the command line arguments given to the tool
Returns:
| Type | Description |
|---|---|
| Path | input yaml path |
Source code in utils/command_line_utils.py
def read_command_line_arguments():
"""Reads the command line arguments given to the tool
Returns:
Path: input yaml path
"""
# Initialize parser with the multiline description
parser = argparse.ArgumentParser(
description=PROGRAM_DESCRIPTION,
formatter_class=argparse.RawTextHelpFormatter,
)
# Adding optional argument
parser.add_argument(
"input_file",
nargs="?",
help="Input yaml file",
)
parser.add_argument("-v", "--version", action="store_true", help="Show version")
# Read arguments from command line
args = parser.parse_args()
if args.input_file:
input_path = Path(args.input_file)
elif args.version:
version = read_version_number()
print("D-EcoImpact version:", version)
sys.exit()
else:
print("\nNo inputfile given.\n")
print("===========================================")
parser.print_help()
print("===========================================")
input("\nPlease provide an input.yaml file. Hit Enter to exit.\n")
sys.exit()
return input_path
data_array_utils
Library for utility functions regarding an xarray data-arrays
get_time_dimension_name(variable, logger)
Retrieves the dimension name
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| value_array | DataArray | values to get time dimension | required |
Exceptions:
| Type | Description |
|---|---|
| ValueError | If time dimension could not be found |
Returns:
| Type | Description |
|---|---|
| str | time dimension name |
Source code in utils/data_array_utils.py
def get_time_dimension_name(variable: _xr.DataArray, logger: ILogger) -> str:
"""Retrieves the dimension name
Args:
value_array (DataArray): values to get time dimension
Raises:
ValueError: If time dimension could not be found
Returns:
str: time dimension name
"""
for dim in variable.dims:
dim_values = variable[dim]
# check if the dimension type is a datetime type
if dim_values.dtype.name.startswith("datetime64"):
return str(dim)
message = f"No time dimension found for {variable.name}"
logger.log_error(message)
raise ValueError(message)
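A stand-alone sketch (made-up coordinates) of the detection logic: the first dimension whose coordinate values have a datetime64 dtype is taken to be the time dimension.

```python
import numpy as np
import pandas as pd
import xarray as xr

da = xr.DataArray(
    np.zeros((3, 2)),
    coords={"time": pd.date_range("2020-01-01", periods=3), "face": [0, 1]},
    dims=("time", "face"),
)

# Same dtype check as get_time_dimension_name.
time_dims = [dim for dim in da.dims if da[dim].dtype.name.startswith("datetime64")]
print(time_dims)  # ['time']
```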
dataset_utils
Library for utility functions regarding an xarray dataset
add_variable(dataset, variable, variable_name)
Add variable to dataset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataset | _xr.Dataset | Dataset to add to | required |
| variable | _xr.DataArray | Variable containing new data | required |
| variable_name | str | Name of new variable | required |
Exceptions:
| Type | Description |
|---|---|
| ValueError | When variable can not be added |
Returns:
| Type | Description |
|---|---|
| _xr.Dataset | Dataset with the new variable added |
Source code in utils/dataset_utils.py
def add_variable(
dataset: _xr.Dataset, variable: _xr.DataArray, variable_name: str
) -> _xr.Dataset:
"""Add variable to dataset.
Args:
dataset (_xr.Dataset): Dataset to add to
variable (_xr.DataArray): Variable containing new data
variable_name (str): Name of new variable
Raises:
ValueError: When variable can not be added
Returns:
_xr.Dataset: original dataset
"""
if not isinstance(variable, _xr.DataArray):
raise ValueError("ERROR: Cannot add variable to dataset")
dataset[variable_name] = (variable.dims, variable.data)
try:
dataset[variable_name] = (variable.dims, variable.data)
except ValueError as exc:
raise ValueError("ERROR: Cannot add variable to dataset") from exc
return dataset
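The sketch below (illustrative variable names, plain xarray) shows the assignment pattern used by `add_variable`: the new variable is attached to the dataset as a (dims, data) tuple taken from the DataArray.

```python
import numpy as np
import xarray as xr

dataset = xr.Dataset({"water_depth": (("face",), np.array([1.0, 2.0, 3.0]))})
new_values = xr.DataArray([0.1, 0.2, 0.3], dims=("face",))

# Same assignment as add_variable: dims and data taken from the DataArray.
dataset["habitat_suitability"] = (new_values.dims, new_values.data)
print(list(dataset.data_vars))  # ['water_depth', 'habitat_suitability']
```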
copy_dataset(dataset)
Copy dataset to new dataset
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataset | _xr.Dataset | Dataset to copy | required |
Exceptions:
| Type | Description |
|---|---|
| ValueError | When the dataset cannot be copied |
Returns:
| Type | Description |
|---|---|
| _xr.Dataset | Copy of the dataset |
Source code in utils/dataset_utils.py
def copy_dataset(dataset: _xr.Dataset) -> _xr.Dataset:
"""Copy dataset to new dataset
Args:
dataset (_xr.Dataset): Dataset to remove variable from
variable (str): Variable to remove
Raises:
ValueError: When variable can not be removed
Returns:
_xr.Dataset: Original dataset
"""
try:
output_dataset = dataset.copy(deep=False)
except ValueError as exc:
raise ValueError("ERROR: Cannot copy dataset.") from exc
return output_dataset
create_composed_dataset(input_datasets, variables_to_use, mapping)
Creates a dataset based on the provided input datasets and the selected variables.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_datasets | List[_xr.Dataset] | inputs to copy the data from | required |
| variables_to_use | List[str] | selected variables to copy | required |
| mapping | dict[str, str] | mapping for variables to rename after copying | required |
Returns:
| Type | Description |
|---|---|
| _xr.Dataset | composed dataset (with selected variables) |
Source code in utils/dataset_utils.py
def create_composed_dataset(
input_datasets: List[_xr.Dataset],
variables_to_use: List[str],
mapping: Optional[dict[str, str]],
) -> _xr.Dataset:
"""Creates a dataset based on the provided input datasets and
the selected variables.
Args:
input_datasets (List[_xr.Dataset]): inputs to copy the data from
variables_to_use (List[str]): selected variables to copy
mapping (dict[str, str]): mapping for variables to rename after copying
Returns:
_xr.Dataset: composed dataset (with selected variables)
"""
merged_dataset = merge_list_of_datasets(input_datasets)
cleaned_dataset = remove_all_variables_except(merged_dataset, variables_to_use)
if mapping is None or len(mapping) == 0:
return cleaned_dataset
return cleaned_dataset.rename_vars(mapping)
get_dependent_var_list(dataset, dummy_vars)
Obtain the list of variables in a dataset. The variables are recursively looked up based on the dummy variable. This is done to support XUgrid and to prevent invalid topologies. This also allows QuickPlot to visualize the results.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataset | _xr.Dataset | Dataset to search for dummy variable | required |
| dummy_vars | List[str] | dummy variables | required |
Returns:
| Type | Description |
|---|---|
| List[str] | dependent variables |
Source code in utils/dataset_utils.py
def get_dependent_var_list(dataset: _xr.Dataset, dummy_vars) -> List:
"""Obtain the list of variables in a dataset.
The variables are
recursively looked up based on the dummy variable.
This is done to support XUgrid and to prevent invalid topologies.
This also allows QuickPlot to visualize the results.
Args:
dataset (_xr.Dataset): Dataset to search for dummy variable
dummy_vars (List[str]): dummy variables
Returns:
List[str]: dependent variables
"""
var_list = rec_search_dep_vars(dataset, dummy_vars, [], [])
var_list += dummy_vars
return _lu.remove_duplicates_from_list(var_list)
get_dependent_vars_by_var_name(dataset, var_name)
Get all the variables that are described in the attributes of the dummy variable, associated with the UGrid standard.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataset | _xr.Dataset | Dataset to get dependent variables from | required |
| var_name | str | the name of the dummy variable | required |
Returns:
| Type | Description |
|---|---|
| list[str] | list of the dependent variables to copy |
Source code in utils/dataset_utils.py
def get_dependent_vars_by_var_name(dataset: _xr.Dataset, var_name: str) -> List[str]:
"""Get all the variables that are described in the attributes of the dummy variable,
associated with the UGrid standard.
Args:
dataset (_xr.Dataset): Dataset to get dependent variables from
var_name (str): the name of the dummy variable
Returns:
list[str]: list of the dependent variables to copy
"""
vars_to_check = ["_coordinates", "_connectivity", "bounds"]
attrs_list = []
attrs = dataset[var_name].attrs
for attr in attrs.items():
if any(attr[0].endswith(var_check) for var_check in vars_to_check):
attrs_list = list(set(attrs_list + attr[1].split(" ")))
return attrs_list
get_dummy_variable_in_ugrid(dataset)
Get the name of the variable that serves as the dummy variable in the UGrid.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataset | _xr.Dataset | Dataset to search for dummy variable | required |
Returns:
| Type | Description |
|---|---|
| str | name of the dummy variable |
Source code in utils/dataset_utils.py
def get_dummy_variable_in_ugrid(dataset: _xr.Dataset) -> list:
"""Get the name of the variable that serves as the dummy variable in the UGrid.
Args:
dataset (_xr.Dataset): Dataset to search for dummy variable
Returns:
str: name of the dummy variable
"""
dummy = [
name
for name in dataset.data_vars
if ("cf_role", "mesh_topology") in dataset[name].attrs.items()
]
if len(dummy) == 0:
raise ValueError(
"No dummy variable defined and therefore input dataset does "
"not comply with UGrid convention."
)
return dummy
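A minimal illustration (made-up dataset) of how the UGRID dummy variable is found: it is the data variable carrying the attribute cf_role = "mesh_topology".

```python
import xarray as xr

ds = xr.Dataset({"mesh2d": ((), 0), "water_depth": ("face", [1.0, 2.0])})
ds["mesh2d"].attrs["cf_role"] = "mesh_topology"

# Same attribute lookup as get_dummy_variable_in_ugrid.
dummy = [name for name in ds.data_vars
         if ("cf_role", "mesh_topology") in ds[name].attrs.items()]
print(dummy)  # ['mesh2d']
```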
list_coords(dataset)
List coordinates in dataset
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataset | _xr.Dataset | Dataset to list coordinates from | required |
Returns:
| Type | Description |
|---|---|
| list[str] | list of coordinate names |
Source code in utils/dataset_utils.py
def list_coords(dataset: _xr.Dataset) -> list[str]:
"""List coordinates in dataset
Args:
dataset (_xr.Dataset): Dataset to list variables from
Returns:
list_variables
"""
return list((dataset.coords or {}).keys())
list_vars(dataset)
List variables in dataset
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataset | _xr.Dataset | Dataset to list variables from | required |
Returns:
| Type | Description |
|---|---|
| list[str] | list of variable names |
Source code in utils/dataset_utils.py
def list_vars(dataset: _xr.Dataset) -> list[str]:
"""List variables in dataset
Args:
dataset (_xr.Dataset): Dataset to list variables from
Returns:
list_variables
"""
return list((dataset.data_vars or {}).keys())
merge_datasets(dataset1, dataset2)
Merge two datasets into one dataset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataset1 | _xr.Dataset | Dataset 1 to merge | required |
| dataset2 | _xr.Dataset | Dataset 2 to merge | required |
Exceptions:
| Type | Description |
|---|---|
| ValueError | When datasets cannot be merged |
Returns:
| Type | Description |
|---|---|
| _xr.Dataset | Merged dataset |
Source code in utils/dataset_utils.py
def merge_datasets(dataset1: _xr.Dataset, dataset2: _xr.Dataset) -> _xr.Dataset:
"""Merge two datasets into one dataset.
Args:
dataset1 (_xr.Dataset): Dataset 1 to merge
dataset2 (_xr.Dataset): Dataset 2 to merge
Raises:
ValueError: When datasets cannot be merged
Returns:
_xr.Dataset: Original dataset
"""
try:
output_dataset = dataset1.merge(dataset2, compat="identical")
except ValueError as exc:
raise ValueError(f"ERROR: Cannot merge {dataset1} and {dataset2}.") from exc
return output_dataset
merge_list_of_datasets(list_datasets)
Merge list of datasets into 1 dataset
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| list_datasets | list | list of datasets to merge | required |
Exceptions:
| Type | Description |
|---|---|
| ValueError | When datasets cannot be merged |
Returns:
| Type | Description |
|---|---|
| _xr.Dataset | Merged dataset |
Source code in utils/dataset_utils.py
def merge_list_of_datasets(list_datasets: list[_xr.Dataset]) -> _xr.Dataset:
"""Merge list of datasets into 1 dataset
Args:
list_datasets (list): list of datasets to merge
Raises:
ValueError: When datasets cannot be merged
Returns:
_xr.Dataset: Original dataset
"""
try:
output_dataset = _xr.merge(list_datasets, compat="identical")
except ValueError as exc:
raise ValueError(f"ERROR: Cannot merge {list_datasets}.") from exc
return output_dataset
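A short sketch (made-up variables) of the merge behaviour: xr.merge with compat="identical" combines the datasets, but refuses to merge when the same variable is present with different values, which is the situation that raises the ValueError above.

```python
import xarray as xr

ds1 = xr.Dataset({"water_depth": ("face", [1.0, 2.0])})
ds2 = xr.Dataset({"salinity": ("face", [30.0, 31.0])})

merged = xr.merge([ds1, ds2], compat="identical")
print(list(merged.data_vars))  # ['water_depth', 'salinity']
```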
rec_search_dep_vars(dataset, var_list, dep_vars, checked_vars)
Recursive function to loop over all variables defined in the attribute of the dummy variable to find which are dependent and also the variables that are then again dependent on those variables etc.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataset | _xr.Dataset | the dataset to check | required |
| var_list | List[str] | a list of dummy variable names to start the check | required |
| dep_vars | List[str] | a list of dependent variables found | required |
| checked_vars | List[str] | a list of variables that have already been checked in this function (it's a check so the function does not endlessly keep searching in the variables) | required |
Returns:
| Type | Description |
|---|---|
| list[str] | list of names of dependent variables |
Source code in utils/dataset_utils.py
def rec_search_dep_vars(
dataset: _xr.Dataset,
var_list: List[str],
dep_vars: List[str],
checked_vars: List[str],
) -> list[str]:
"""Recursive function to loop over all variables defined in the
attribute of the dummy variable to find which are dependent and
also the variables that are then again dependent on those variables etc.
Args:
dataset (_xr.Dataset): the dataset to check
var_list (List[str]): a list of dummy variable names to start the check
dep_vars (List[str]): a list of dependent variables found
checked_vars (List[str]): a list of variables that have already been
checked in this function (it's a check so the function does not endlessly
keep searching in the variables)
Returns:
list[str]: list of names of dependent variables
"""
for var_name in var_list:
if var_name not in checked_vars:
dep_var = get_dependent_vars_by_var_name(dataset, var_name)
checked_vars.append(var_name)
if len(dep_var) > 0:
dep_vars = list(set(dep_var + dep_vars))
dep_vars = list(
set(
dep_vars
+ rec_search_dep_vars(dataset, dep_var, dep_vars, checked_vars)
)
)
return dep_vars
reduce_dataset_for_writing(dataset, save_only_variables, logger)
Reduce dataset before writing by only saving selected variables
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataset | DataSet | dataset | required |
| save_only_variables | List[str] | optional list of variables to be saved. If empty, all variables are saved | required |
Exceptions:
| Type | Description |
|---|---|
| OSError | If save_only_variables do not exist in dataset |
Returns:
| Type | Description |
|---|---|
| _xr.Dataset | reduced dataset |
Source code in utils/dataset_utils.py
def reduce_dataset_for_writing(
dataset: _xr.Dataset, save_only_variables: List[str], logger: ILogger
):
"""Reduce dataset before writing by only saving selected variables
Args:
dataset (DataSet): dataset
save_only_variables (List[str]): optional list of variables to be saved. If
empty, all variables are saved
Raises:
OSError: If save_only_variables do not exist in dataset
Returns:
dataset
"""
for var in save_only_variables:
if var not in dataset:
msg = f"ERROR: variable {var} is not present in dataset"
logger.log_error(msg)
raise OSError(msg)
dataset = remove_all_variables_except(dataset, save_only_variables)
return dataset
remove_all_variables_except(dataset, variables_to_keep)
Remove all variables from dataset except provided list of variables.
Parameters:
Name | Type | Description | Default
---|---|---|---
dataset | _xr.Dataset | Dataset to remove variables from | required
variables_to_keep | List[str] | selected variables to keep | required
Returns:
Type | Description
---|---
_xr.Dataset | reduced dataset (containing selected variables)
Source code in utils/dataset_utils.py
def remove_all_variables_except(
dataset: _xr.Dataset, variables_to_keep: List[str]
) -> _xr.Dataset:
"""Remove all variables from dataset except provided list of variables.
Args:
dataset (_xr.Dataset): Dataset to remove variables from
variables_to_keep (List[str]): selected variables to keep
Returns:
_xr.Dataset: reduced dataset (containing selected variables)
"""
dummy_var = get_dummy_variable_in_ugrid(dataset)
dependent_var_list = get_dependent_var_list(dataset, dummy_var)
variables_to_keep += dummy_var + dependent_var_list
all_variables = list_vars(dataset)
variables_to_remove = [
item for item in all_variables if item not in list(variables_to_keep)
]
cleaned_dataset = remove_variables(dataset, variables_to_remove)
return cleaned_dataset
remove_variables(dataset, variables)
Remove variable from dataset
Parameters:
Name | Type | Description | Default
---|---|---|---
dataset | _xr.Dataset | Dataset to remove variable from | required
variables | str/list | Variable(s) to remove | required
Exceptions:
Type | Description
---|---
ValueError | When variable can not be removed
Returns:
Type | Description
---|---
_xr.Dataset | Dataset with the variable(s) removed
Source code in utils/dataset_utils.py
def remove_variables(dataset: _xr.Dataset, variables: list[str]) -> _xr.Dataset:
"""Remove variable from dataset
Args:
dataset (_xr.Dataset): Dataset to remove variable from
variables (str/list): Variable(s) to remove
Raises:
ValueError: When variable can not be removed
Returns:
_xr.Dataset: Original dataset
"""
try:
dataset = dataset.drop_vars(variables)
except ValueError as exc:
raise ValueError(f"ERROR: Cannot remove {variables} from dataset.") from exc
return dataset
rename_variable(dataset, variable_old, variable_new)
Rename variable in dataset
Parameters:
Name | Type | Description | Default
---|---|---|---
dataset | _xr.Dataset | Dataset containing the variable to rename | required
variable_old | str | Variable to rename, old name | required
variable_new | str | Variable to rename, new name | required
Exceptions:
Type | Description
---|---
ValueError | When variable can not be renamed
Returns:
Type | Description
---|---
_xr.Dataset | Dataset with the variable renamed
Source code in utils/dataset_utils.py
def rename_variable(
dataset: _xr.Dataset, variable_old: str, variable_new: str
) -> _xr.Dataset:
"""Rename variable in dataset
Args:
dataset (_xr.Dataset): Dataset to remove variable from
variable_old (str): Variable to rename, old name
variable_new (str): Variable to rename, new name
Raises:
ValueError: When variable can not be renamed
Returns:
_xr.Dataset: Original dataset
"""
mapping_dict = {variable_old: variable_new}
try:
output_dataset = dataset.rename(mapping_dict)
except ValueError as exc:
raise ValueError(
f"ERROR: Cannot rename variable {variable_old} to {variable_new}."
) from exc
return output_dataset
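Both helpers are thin wrappers around the corresponding xarray methods; a minimal sketch (variable names invented):

```python
import numpy as np
import xarray as xr

ds = xr.Dataset({"a": ("x", np.arange(3)), "b": ("x", np.ones(3))})
ds = ds.drop_vars(["b"])        # what remove_variables wraps
ds = ds.rename({"a": "a_new"})  # what rename_variable wraps
```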
list_utils
Library for list utility functions
flatten_list(_2d_list)
Flattens list of lists to one list.
Parameters:
Name | Type | Description | Default
---|---|---|---
_2d_list | list | list to be flattened | required
Returns:
Type | Description
---|---
list | flat list
Source code in utils/list_utils.py
def flatten_list(_2d_list: list[Any]) -> list:
"""Flattens list of lists to one list.
Args:
_2d_list (list): list to be flattened
Returns:
list: flat list
"""
flat_list = []
# Iterate through the outer list
for element in _2d_list:
if isinstance(element, list):
flat_list = flat_list + list(element)
else:
flat_list.append(element)
return flat_list
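A small usage sketch (import path assumed; only one level of nesting is flattened):

```python
from decoimpact.business.utils.list_utils import flatten_list  # assumed import path

flatten_list([[1, 2], [3], 4])     # [1, 2, 3, 4]
flatten_list([["a", "b"], ["c"]])  # ["a", "b", "c"]
```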
items_in(first, second)
Returns a list of items in the first list that are in the second list.
Parameters:
Name | Type | Description | Default
---|---|---|---
first | List[str] | list of items to iterate | required
second | List[str] | list of items to check | required
Returns:
Type | Description
---|---
List[str] | list of items that were in second list
Source code in utils/list_utils.py
def items_in(first: List[str], second: List[str]) -> List[str]:
"""Returns a list of items in the first list that are in the second list.
Args:
first (List[str]): list of items to iterate
second (List[str]): list of items to check
Returns:
List[str]: list of items that were in second list
"""
return list(filter(lambda var: var in second, first))
items_not_in(first, second)
Returns a list of items in the first list that are not in the second list.
Parameters:
Name | Type | Description | Default
---|---|---|---
first | List[str] | list of items to iterate | required
second | List[str] | list of items to check | required
Returns:
Type | Description
---|---
List[str] | list of items that were not in second list
Source code in utils/list_utils.py
def items_not_in(first: List[str], second: List[str]) -> List[str]:
"""Returns a list of items in the first list that are not in the second list.
Args:
first (List[str]): list of items to iterate
second (List[str]): list of items to check
Returns:
List[str]: list of items that were not in second list
"""
return list(filter(lambda var: var not in second, first))
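A short sketch of the two membership helpers (import path assumed):

```python
from decoimpact.business.utils.list_utils import items_in, items_not_in  # assumed import path

items_in(["a", "b", "c"], ["b", "c", "d"])      # ["b", "c"]
items_not_in(["a", "b", "c"], ["b", "c", "d"])  # ["a"]
```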
remove_duplicates_from_list(list_with_duplicates)
Removes duplicates from list.
Parameters:
Name | Type | Description | Default
---|---|---|---
list_with_duplicates | list | list to be made distinct | required
Returns:
Type | Description
---|---
list | list without duplicates
Source code in utils/list_utils.py
def remove_duplicates_from_list(list_with_duplicates: list) -> list:
"""Removes duplicates from list.
Args:
list (list): list to be made distinct
Returns:
list: list without duplicates
"""
return list(set(list_with_duplicates))
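Because the helper relies on set, the order of the returned list is not guaranteed; a minimal sketch (import path assumed):

```python
from decoimpact.business.utils.list_utils import remove_duplicates_from_list  # assumed import path

sorted(remove_duplicates_from_list(["a", "b", "a", "c"]))  # ["a", "b", "c"]
```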
version_utils
Module for version utils
read_version_number()
Reads the version of the tool
Returns:
Type | Description
---|---
str | version number of tool
Source code in utils/version_utils.py
def read_version_number():
"""Reads the version of the tool
Returns:
str: version number of tool
"""
version_string = version("decoimpact")
return version_string
workflow
i_model_builder
Module for IModelBuilder interface
!!! interfaces IModelBuilder
IModelBuilder (ABC )
Factory for creating models
Source code in workflow/i_model_builder.py
class IModelBuilder(ABC):
"""Factory for creating models"""
@abstractmethod
def build_model(self, model_data: IModelData) -> IModel:
"""Creates an model based on model data
Returns:
IModel: instance of a model based on model data
"""
build_model(self, model_data)
Creates a model based on model data
Returns:
Type | Description
---|---
IModel | instance of a model based on model data
Source code in workflow/i_model_builder.py
@abstractmethod
def build_model(self, model_data: IModelData) -> IModel:
"""Creates an model based on model data
Returns:
IModel: instance of a model based on model data
"""
model_builder
Module for ModelBuilder class
!!! classes ModelBuilder
ModelBuilder (IModelBuilder )
Factory for creating models
Source code in workflow/model_builder.py
class ModelBuilder(IModelBuilder):
"""Factory for creating models"""
def __init__(self, da_layer: IDataAccessLayer, logger: ILogger) -> None:
self._logger = logger
self._da_layer = da_layer
def build_model(self, model_data: IModelData) -> IModel:
"""Creates a model based on model data.
Current mapping works only for one dataset.
Returns:
IModel: instance of a model based on model data
"""
self._logger.log_info("Creating rule-based model")
datasets = [self._da_layer.read_input_dataset(ds) for ds in model_data.datasets]
rules = list(ModelBuilder._create_rules(model_data.rules))
mapping = model_data.datasets[0].mapping
model: IModel = RuleBasedModel(
datasets, rules, mapping, model_data.name, model_data.partition
)
return model
@staticmethod
def _create_rules(rule_data: List[IRuleData]) -> Iterable[IRule]:
for rule_data_object in rule_data:
yield ModelBuilder._create_rule(rule_data_object)
@staticmethod
def _set_default_fields(rule_data: IRuleData, rule: RuleBase):
rule.description = rule_data.description
rule.output_variable_name = rule_data.output_variable
@staticmethod
def _create_rule(rule_data: IRuleData) -> IRule:
# from python >3.10 we can use match/case, better solution
# until then disable pylint.
# pylint: disable=too-many-branches
if isinstance(rule_data, IMultiplyRuleData):
rule = MultiplyRule(
rule_data.name,
[rule_data.input_variable],
rule_data.multipliers,
rule_data.date_range,
)
elif isinstance(rule_data, IDepthAverageRuleData):
rule = DepthAverageRule(
rule_data.name,
rule_data.input_variables,
)
elif isinstance(rule_data, IFilterExtremesRuleData):
rule = FilterExtremesRule(
rule_data.name,
rule_data.input_variables,
rule_data.extreme_type,
rule_data.distance,
rule_data.time_scale,
rule_data.mask,
)
elif isinstance(rule_data, ILayerFilterRuleData):
rule = LayerFilterRule(
rule_data.name,
[rule_data.input_variable],
rule_data.layer_number,
)
elif isinstance(rule_data, IAxisFilterRuleData):
rule = AxisFilterRule(
rule_data.name,
[rule_data.input_variable],
rule_data.element_index,
rule_data.axis_name,
)
elif isinstance(rule_data, IStepFunctionRuleData):
rule = StepFunctionRule(
rule_data.name,
rule_data.input_variable,
rule_data.limits,
rule_data.responses,
)
elif isinstance(rule_data, ITimeAggregationRuleData):
rule = TimeAggregationRule(
rule_data.name, [rule_data.input_variable], rule_data.operation
)
rule.settings.percentile_value = rule_data.percentile_value
rule.settings.time_scale = rule_data.time_scale
elif isinstance(rule_data, IRollingStatisticsRuleData):
rule = RollingStatisticsRule(
rule_data.name, [rule_data.input_variable], rule_data.operation
)
rule.settings.percentile_value = rule_data.percentile_value
rule.settings.time_scale = rule_data.time_scale
rule.period = rule_data.period
elif isinstance(rule_data, ICombineResultsRuleData):
rule = CombineResultsRule(
rule_data.name,
rule_data.input_variable_names,
MultiArrayOperationType[rule_data.operation_type],
rule_data.ignore_nan
)
elif isinstance(rule_data, IResponseCurveRuleData):
rule = ResponseCurveRule(
rule_data.name,
rule_data.input_variable,
rule_data.input_values,
rule_data.output_values,
)
elif isinstance(rule_data, IFormulaRuleData):
rule = FormulaRule(
rule_data.name,
rule_data.input_variable_names,
rule_data.formula,
)
elif isinstance(rule_data, IClassificationRuleData):
rule = ClassificationRule(
rule_data.name, rule_data.input_variable_names, rule_data.criteria_table
)
else:
error_str = (
f"The rule type of rule '{rule_data.name}' is currently "
"not implemented"
)
raise NotImplementedError(error_str)
if isinstance(rule, RuleBase):
ModelBuilder._set_default_fields(rule_data, rule)
return rule
build_model(self, model_data)
Creates a model based on model data. Current mapping works only for one dataset.
Returns:
Type | Description
---|---
IModel | instance of a model based on model data
Source code in workflow/model_builder.py
def build_model(self, model_data: IModelData) -> IModel:
"""Creates a model based on model data.
Current mapping works only for one dataset.
Returns:
IModel: instance of a model based on model data
"""
self._logger.log_info("Creating rule-based model")
datasets = [self._da_layer.read_input_dataset(ds) for ds in model_data.datasets]
rules = list(ModelBuilder._create_rules(model_data.rules))
mapping = model_data.datasets[0].mapping
model: IModel = RuleBasedModel(
datasets, rules, mapping, model_data.name, model_data.partition
)
return model
model_runner
Module for ModelRunner class
!!! classes ModelRunner
ModelRunner
Runner for models
Source code in workflow/model_runner.py
class ModelRunner:
"""Runner for models"""
@staticmethod
def run_model(model: IModel, logger: ILogger) -> bool:
"""Runs the provided model
Args:
model (IModel): model to run
"""
success = True
success = ModelRunner._change_state(
model.validate, model, logger, ModelStatus.VALIDATING, ModelStatus.VALIDATED
)
success = success and ModelRunner._change_state(
model.initialize,
model,
logger,
ModelStatus.INITIALIZING,
ModelStatus.INITIALIZED,
)
success = success and ModelRunner._change_state(
model.execute, model, logger, ModelStatus.EXECUTING, ModelStatus.EXECUTED
)
success = success and ModelRunner._change_state(
model.finalize, model, logger, ModelStatus.FINALIZING, ModelStatus.FINALIZED
)
if success:
part_str = ""
if model.partition:
part_str = f" (Partition: {model.partition})"
logger.log_info(
f'Model "{model.name}{part_str}" has successfully finished running'
)
return success
@staticmethod
def _change_state(
action: Callable[[ILogger], Any],
model: IModel,
log: ILogger,
pre_status: ModelStatus,
post_status: ModelStatus,
) -> bool:
part_str = ""
if model.partition:
part_str = f" (Partition: {model.partition})"
log.log_info(f'Model "{model.name}{part_str}" -> {str(pre_status)}')
model.status = pre_status
success = ModelRunner._change_state_core(action, log)
if success:
model.status = post_status
message = f'Model "{model.name}{part_str}" -> {str(post_status)}'
log.log_info(message)
return True
model.status = ModelStatus.FAILED
message = (
f'Model "{model.name}{part_str}" transition from '
f"{str(pre_status)} to {str(post_status)} has failed."
)
log.log_error(message)
return False
@staticmethod
def _change_state_core(action: Callable[[ILogger], Any], logger: ILogger) -> bool:
try:
return_value = action(logger)
if isinstance(return_value, bool) and return_value is False:
return False
return True
except RuntimeError:
return False
run_model(model, logger)
staticmethod
Runs the provided model
Parameters:
Name | Type | Description | Default
---|---|---|---
model | IModel | model to run | required
Source code in workflow/model_runner.py
@staticmethod
def run_model(model: IModel, logger: ILogger) -> bool:
"""Runs the provided model
Args:
model (IModel): model to run
"""
success = True
success = ModelRunner._change_state(
model.validate, model, logger, ModelStatus.VALIDATING, ModelStatus.VALIDATED
)
success = success and ModelRunner._change_state(
model.initialize,
model,
logger,
ModelStatus.INITIALIZING,
ModelStatus.INITIALIZED,
)
success = success and ModelRunner._change_state(
model.execute, model, logger, ModelStatus.EXECUTING, ModelStatus.EXECUTED
)
success = success and ModelRunner._change_state(
model.finalize, model, logger, ModelStatus.FINALIZING, ModelStatus.FINALIZED
)
if success:
part_str = ""
if model.partition:
part_str = f" (Partition: {model.partition})"
logger.log_info(
f'Model "{model.name}{part_str}" has successfully finished running'
)
return success
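A hedged usage sketch: the runner drives a model through the VALIDATING/VALIDATED, INITIALIZING/INITIALIZED, EXECUTING/EXECUTED and FINALIZING/FINALIZED transitions and stops as soon as one transition fails. The import paths and the input file name are assumptions for illustration only.

```python
from pathlib import Path

# Assumed import locations; adjust to the actual package layout.
from decoimpact.crosscutting.logger_factory import LoggerFactory
from decoimpact.data.entities.data_access_layer import DataAccessLayer
from decoimpact.business.workflow.model_builder import ModelBuilder
from decoimpact.business.workflow.model_runner import ModelRunner

logger = LoggerFactory.create_logger()
da_layer = DataAccessLayer(logger)

model_data = da_layer.read_input_file(Path("input_file.yaml"))  # hypothetical input file
model = ModelBuilder(da_layer, logger).build_model(model_data)

success = ModelRunner.run_model(model, logger)  # True only if all four transitions succeed
```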
crosscutting
i_logger
Module for ILogger interface
!!! interfaces ILogger
ILogger (ABC )
Interface for a Logger
Source code in crosscutting/i_logger.py
class ILogger(ABC):
"""Interface for a Logger"""
@abstractmethod
def log_error(self, message: str) -> None:
"""Logs an error message
Args:
message (str): message to log
"""
@abstractmethod
def log_warning(self, message: str) -> None:
"""Logs a warning message
Args:
message (str): message to log
"""
@abstractmethod
def log_info(self, message: str) -> None:
"""Logs a info message
Args:
message (str): message to log
"""
@abstractmethod
def log_debug(self, message: str) -> None:
"""Logs a debug message
Args:
message (str): message to log
"""
log_debug(self, message)
Logs a debug message
Parameters:
Name | Type | Description | Default
---|---|---|---
message | str | message to log | required
Source code in crosscutting/i_logger.py
@abstractmethod
def log_debug(self, message: str) -> None:
"""Logs a debug message
Args:
message (str): message to log
"""
log_error(self, message)
Logs an error message
Parameters:
Name | Type | Description | Default
---|---|---|---
message | str | message to log | required
Source code in crosscutting/i_logger.py
@abstractmethod
def log_error(self, message: str) -> None:
"""Logs an error message
Args:
message (str): message to log
"""
log_info(self, message)
Logs an info message
Parameters:
Name | Type | Description | Default
---|---|---|---
message | str | message to log | required
Source code in crosscutting/i_logger.py
@abstractmethod
def log_info(self, message: str) -> None:
"""Logs a info message
Args:
message (str): message to log
"""
log_warning(self, message)
Logs a warning message
Parameters:
Name | Type | Description | Default
---|---|---|---
message | str | message to log | required
Source code in crosscutting/i_logger.py
@abstractmethod
def log_warning(self, message: str) -> None:
"""Logs a warning message
Args:
message (str): message to log
"""
logger_factory
Module for LoggerFactory class
!!! classes LoggerFactory
LoggerFactory
Factory for creating loggers
Source code in crosscutting/logger_factory.py
class LoggerFactory:
"""Factory for creating loggers"""
@staticmethod
def create_logger() -> ILogger:
"""Creates a logger
Returns:
Logger: created logger
"""
return LoggingLogger()
create_logger()
staticmethod
Creates a logger
Returns:
Type | Description
---|---
Logger | created logger
Source code in crosscutting/logger_factory.py
@staticmethod
def create_logger() -> ILogger:
"""Creates a logger
Returns:
Logger: created logger
"""
return LoggingLogger()
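A hypothetical usage line (import path assumed); messages are written both to the console and to decoimpact.log, as configured in LoggingLogger below.

```python
from decoimpact.crosscutting.logger_factory import LoggerFactory  # assumed import path

logger = LoggerFactory.create_logger()      # returns a LoggingLogger
logger.log_info("Preparing input dataset")  # goes to the console and to decoimpact.log
```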
logging_logger
Module for LoggingLogger class
!!! classes LoggingLogger
LoggingLogger (ILogger )
Logger implementation based on default logging library
Source code in crosscutting/logging_logger.py
class LoggingLogger(ILogger):
"""Logger implementation based on default logging library"""
def __init__(self) -> None:
super().__init__()
self._log = self._setup_logging()
def log_error(self, message: str) -> None:
"""Logs an error message
Args:
message (str): message to log
"""
self._log.error(message)
def log_warning(self, message: str) -> None:
"""Logs a warning message
Args:
message (str): message to log
"""
self._log.warning(message)
def log_info(self, message: str) -> None:
"""Logs a info message
Args:
message (str): message to log
"""
self._log.info(message)
def log_debug(self, message: str) -> None:
"""Logs a debug message
Args:
message (str): message to log
"""
self._log.debug(message)
def _setup_logging(self) -> _log.Logger:
"""Sets logging information and logger setup"""
_log.basicConfig(
level=_log.INFO,
format="%(asctime)s: %(levelname)-8s %(message)s",
datefmt="%m-%d %H:%M:%S",
filename="decoimpact.log",
encoding="utf-8", # Only for Python > 3.9
filemode="w",
)
# define a Handler which writes INFO messages or higher to the sys.stderr
console = _log.StreamHandler()
console.setLevel(_log.INFO)
# set a format which is simpler for console use
formatter = _log.Formatter("%(asctime)s: %(levelname)-8s %(message)s")
# tell the handler to use this format
console.setFormatter(formatter)
logger = _log.getLogger()
# add the handler to the root logger
logger.addHandler(console)
return _log.getLogger()
log_debug(self, message)
Logs a debug message
Parameters:
Name | Type | Description | Default
---|---|---|---
message | str | message to log | required
Source code in crosscutting/logging_logger.py
def log_debug(self, message: str) -> None:
"""Logs a debug message
Args:
message (str): message to log
"""
self._log.debug(message)
log_error(self, message)
Logs an error message
Parameters:
Name | Type | Description | Default
---|---|---|---
message | str | message to log | required
Source code in crosscutting/logging_logger.py
def log_error(self, message: str) -> None:
"""Logs an error message
Args:
message (str): message to log
"""
self._log.error(message)
log_info(self, message)
Logs an info message
Parameters:
Name | Type | Description | Default
---|---|---|---
message | str | message to log | required
Source code in crosscutting/logging_logger.py
def log_info(self, message: str) -> None:
"""Logs a info message
Args:
message (str): message to log
"""
self._log.info(message)
log_warning(self, message)
Logs a warning message
Parameters:
Name | Type | Description | Default
---|---|---|---
message | str | message to log | required
Source code in crosscutting/logging_logger.py
def log_warning(self, message: str) -> None:
"""Logs a warning message
Args:
message (str): message to log
"""
self._log.warning(message)
data
api
i_axis_filter_rule_data
Module for IAxisFilterRuleData interface
!!! interfaces IAxisFilterRuleData
IAxisFilterRuleData (IRuleData , ABC )
Data for an axis filter rule
Source code in api/i_axis_filter_rule_data.py
class IAxisFilterRuleData(IRuleData, ABC):
"""Data for a axis filter rule"""
@property
@abstractmethod
def input_variable(self) -> str:
"""Property for the nput variable"""
@property
@abstractmethod
def element_index(self) -> int:
"""Property for the index of the element on the axis to filter on"""
@property
@abstractmethod
def axis_name(self) -> str:
"""Property for the dim name"""
axis_name: str
property
readonly
Property for the dim name
element_index: int
property
readonly
Property for the index of the element on the axis to filter on
input_variable: str
property
readonly
Property for the input variable
i_classification_rule_data
Module for IClassificationRuleData interface
!!! interfaces IClassificationRuleData
IClassificationRuleData (IRuleData , ABC )
Data for a classification rule
Source code in api/i_classification_rule_data.py
class IClassificationRuleData(IRuleData, ABC):
"""Data for a combine Results Rule"""
@property
@abstractmethod
def input_variable_names(self) -> List[str]:
"""Name of the input variable"""
@property
@abstractmethod
def criteria_table(self) -> Dict[str, List]:
"""Property for the formula"""
criteria_table: Dict[str, List]
property
readonly
Property for the criteria table
input_variable_names: List[str]
property
readonly
Name of the input variable
i_combine_results_rule_data
Module for ICombineResultsRuleData interface
!!! interfaces ICombineResultsRuleData
ICombineResultsRuleData (IRuleData , ABC )
Data for a combine Results Rule
Source code in api/i_combine_results_rule_data.py
class ICombineResultsRuleData(IRuleData, ABC):
"""Data for a combine Results Rule"""
@property
@abstractmethod
def input_variable_names(self) -> List[str]:
"""Name of the input variable"""
@property
@abstractmethod
def operation_type(self) -> str:
"""Property for the operation_type"""
@property
@abstractmethod
def ignore_nan(self) -> bool:
"""Property for the ignore_nan flag"""
ignore_nan: bool
property
readonly
Property for the ignore_nan flag
input_variable_names: List[str]
property
readonly
Name of the input variable
operation_type: str
property
readonly
Property for the operation_type
i_data_access_layer
Module for IDataAccessLayer interface
!!! interfaces IDataAccessLayer
IDataAccessLayer (ABC )
Interface for the data layer
Source code in api/i_data_access_layer.py
class IDataAccessLayer(ABC):
"""Interface for the data layer"""
@abstractmethod
def retrieve_file_names(self, path: Path) -> dict:
"""
Find all files according to the pattern in the path string
Args:
path (str): path to input file (with * for generic part)
Returns:
List: List of strings with all files in folder according to pattern
"""
@abstractmethod
def read_input_file(self, path: Path) -> IModelData:
"""Reads input file from provided path
Args:
path (str): path to input file
Returns:
IModelData: Data regarding model
"""
@abstractmethod
def read_input_dataset(self, dataset_data: IDatasetData) -> _xr.Dataset:
"""Uses the provided dataset_data to create/read a xarray Dataset
Args:
dataset_data (IDatasetData): dataset data for creating an
xarray dataset
Returns:
_xr.Dataset: Dataset based on provided dataset_data
"""
@abstractmethod
def write_output_file(
self, dataset: _xr.Dataset, path: Path, settings: OutputFileSettings
) -> None:
"""Write output files to provided path
Args:
dataset (XArray dataset): dataset to write
path (str): path to output file
settings (OutputFileSettings): settings to use for saving output
Returns:
None
Raises:
FileExistsError: if output file location does not exist
OSError: if output file cannot be written
"""
read_input_dataset(self, dataset_data)
Uses the provided dataset_data to create/read a xarray Dataset
Parameters:
Name | Type | Description | Default
---|---|---|---
dataset_data | IDatasetData | dataset data for creating an xarray dataset | required
Returns:
Type | Description
---|---
_xr.Dataset | Dataset based on provided dataset_data
Source code in api/i_data_access_layer.py
@abstractmethod
def read_input_dataset(self, dataset_data: IDatasetData) -> _xr.Dataset:
"""Uses the provided dataset_data to create/read a xarray Dataset
Args:
dataset_data (IDatasetData): dataset data for creating an
xarray dataset
Returns:
_xr.Dataset: Dataset based on provided dataset_data
"""
read_input_file(self, path)
Reads input file from provided path
Parameters:
Name | Type | Description | Default
---|---|---|---
path | str | path to input file | required
Returns:
Type | Description
---|---
IModelData | Data regarding model
Source code in api/i_data_access_layer.py
@abstractmethod
def read_input_file(self, path: Path) -> IModelData:
"""Reads input file from provided path
Args:
path (str): path to input file
Returns:
IModelData: Data regarding model
"""
retrieve_file_names(self, path)
Find all files according to the pattern in the path string
Parameters:
Name | Type | Description | Default
---|---|---|---
path | str | path to input file (with * for generic part) | required
Returns:
Type | Description
---|---
List | List of strings with all files in folder according to pattern
Source code in api/i_data_access_layer.py
@abstractmethod
def retrieve_file_names(self, path: Path) -> dict:
"""
Find all files according to the pattern in the path string
Args:
path (str): path to input file (with * for generic part)
Returns:
List: List of strings with all files in folder according to pattern
"""
write_output_file(self, dataset, path, settings)
Write output files to provided path
Parameters:
Name | Type | Description | Default
---|---|---|---
dataset | XArray dataset | dataset to write | required
path | str | path to output file | required
settings | OutputFileSettings | settings to use for saving output | required
Returns:
Type | Description
---|---
None | None
Exceptions:
Type | Description
---|---
FileExistsError | if output file location does not exist
OSError | if output file cannot be written
Source code in api/i_data_access_layer.py
@abstractmethod
def write_output_file(
self, dataset: _xr.Dataset, path: Path, settings: OutputFileSettings
) -> None:
"""Write output files to provided path
Args:
dataset (XArray dataset): dataset to write
path (str): path to output file
settings (OutputFileSettings): settings to use for saving output
Returns:
None
Raises:
FileExistsError: if output file location does not exist
OSError: if output file cannot be written
"""
i_dataset
Module for IDatasetData interface
!!! interfaces IDatasetData
IDatasetData (ABC )
Interface for dataset information
Source code in api/i_dataset.py
class IDatasetData(ABC):
"""Interface for dataset information"""
@property
@abstractmethod
def path(self) -> Path:
"""File path to the dataset"""
@property
@abstractmethod
def start_date(self) -> str:
"""start date to filter the dataset"""
# start_date is passed as string (not datetime) because it is optional
@property
@abstractmethod
def end_date(self) -> str:
"""end date to filter the dataset"""
# end_date is passed as string (not datetime) because it is optional
@property
@abstractmethod
def mapping(self) -> dict[str, str]:
"""Variable name mapping (source to target)"""
@path.setter
def path(self, path: Path):
"""path of the model"""
end_date: str
property
readonly
end date to filter the dataset
mapping: dict[str, str]
property
readonly
Variable name mapping (source to target)
path: Path
property
writable
File path to the dataset
start_date: str
property
readonly
start date to filter the dataset
i_depth_average_rule_data
Module for IDepthAverageRuleData interface
!!! interfaces IDepthAverageRuleData
IDepthAverageRuleData (IRuleData , ABC )
Data for a DepthAverageRule
Source code in api/i_depth_average_rule_data.py
class IDepthAverageRuleData(IRuleData, ABC):
"""Data for a DepthAverageRule"""
@property
@abstractmethod
def input_variables(self) -> List[str]:
"""List with input variable name, bed level name,
water level name and interface name (z or sigma)"""
input_variables: List[str]
property
readonly
List with input variable name, bed level name, water level name and interface name (z or sigma)
i_filter_extremes_rule_data
Module for IFilterExtremesRuleData interface
!!! interfaces IFilterExtremesRuleData
IFilterExtremesRuleData (IRuleData , ABC )
Data for a filter extremes rule
Source code in api/i_filter_extremes_rule_data.py
class IFilterExtremesRuleData(IRuleData, ABC):
"""Data for a filter extremes rule"""
@property
@abstractmethod
def input_variables(self) -> List[str]:
"""List with input variable name"""
@property
@abstractmethod
def extreme_type(self) -> str:
"""Type of extremes [peaks or throughs]"""
@property
@abstractmethod
def distance(self) -> int:
"""Property for the distance between peaks"""
@property
@abstractmethod
def time_scale(self) -> str:
"""Property for the timescale of the distance between peaks"""
@property
@abstractmethod
def mask(self) -> bool:
"""Property for mask"""
distance: int
property
readonly
Property for the distance between peaks
extreme_type: str
property
readonly
Type of extremes [peaks or troughs]
input_variables: List[str]
property
readonly
List with input variable name
mask: bool
property
readonly
Property for mask
time_scale: str
property
readonly
Property for the timescale of the distance between peaks
i_formula_rule_data
Module for IFormulaRuleData interface
!!! interfaces IFormulaRuleData
IFormulaRuleData (IRuleData , ABC )
Data for a formula rule
Source code in api/i_formula_rule_data.py
class IFormulaRuleData(IRuleData, ABC):
"""Data for a combine Results Rule"""
@property
@abstractmethod
def input_variable_names(self) -> List[str]:
"""Name of the input variable"""
@property
@abstractmethod
def formula(self) -> str:
"""Property for the formula"""
formula: str
property
readonly
Property for the formula
input_variable_names: List[str]
property
readonly
Name of the input variable
i_layer_filter_rule_data
Module for ILayerFilterRuleData interface
!!! interfaces ILayerFilterRuleData
ILayerFilterRuleData (IRuleData , ABC )
Data for a layer filter rule
Source code in api/i_layer_filter_rule_data.py
class ILayerFilterRuleData(IRuleData, ABC):
"""Data for a layer filter rule"""
@property
@abstractmethod
def input_variable(self) -> str:
"""Property for the nput variable"""
@property
@abstractmethod
def layer_number(self) -> int:
"""Property for the layer number"""
input_variable: str
property
readonly
Property for the input variable
layer_number: int
property
readonly
Property for the layer number
i_model_data
Module for IModelData interface
!!! interfaces IModelData
IModelData (ABC )
Interface for the model data
Source code in api/i_model_data.py
class IModelData(ABC):
"""Interface for the model data"""
@property
@abstractmethod
def name(self) -> str:
"""Name of the model"""
@property
@abstractmethod
def version(self) -> List[int]:
"""Version of the model"""
@property
@abstractmethod
def datasets(self) -> List[IDatasetData]:
"""Datasets of the model"""
@property
@abstractmethod
def output_path(self) -> Path:
"""Model path to the output file"""
@property
@abstractmethod
def output_variables(self) -> List[str]:
"""Output variables when a selection of output variables is made"""
@property
@abstractmethod
def rules(self) -> List[IRuleData]:
"""Rules of the model"""
datasets: List[decoimpact.data.api.i_dataset.IDatasetData]
property
readonly
Datasets of the model
name: str
property
readonly
Name of the model
output_path: Path
property
readonly
Model path to the output file
output_variables: List[str]
property
readonly
Output variables when a selection of output variables is made
rules: List[decoimpact.data.api.i_rule_data.IRuleData]
property
readonly
Rules of the model
version: List[int]
property
readonly
Version of the model
i_multiply_rule_data
Module for IMultiplyRuleData interface
!!! interfaces IMultiplyRuleData
IMultiplyRuleData (IRuleData , ABC )
Data for a multiply rule
Source code in api/i_multiply_rule_data.py
class IMultiplyRuleData(IRuleData, ABC):
"""Data for a multiply rule"""
@property
@abstractmethod
def input_variable(self) -> str:
"""Name of the input variable"""
@property
@abstractmethod
def multipliers(self) -> List[List[float]]:
"""Name of the input variable"""
@property
@abstractmethod
def date_range(self) -> Optional[List[List[str]]]:
"""Array with date ranges"""
date_range: Optional[List[List[str]]]
property
readonly
Array with date ranges
input_variable: str
property
readonly
Name of the input variable
multipliers: List[List[float]]
property
readonly
Name of the input variable
i_response_curve_rule_data
Module for IResponseCurveRuleData interface
!!! interfaces IResponseCurveRuleData
IResponseCurveRuleData (IRuleData , ABC )
Data for a response curve rule
Source code in api/i_response_curve_rule_data.py
class IResponseCurveRuleData(IRuleData, ABC):
"""Data for a response curve rule"""
@property
@abstractmethod
def input_variable(self) -> str:
"""Property for the input variable"""
@property
@abstractmethod
def input_values(self) -> List[float]:
"""Property for the input values"""
@property
@abstractmethod
def output_values(self) -> List[float]:
"""Property for the output values"""
input_values: List[float]
property
readonly
Property for the input values
input_variable: str
property
readonly
Property for the input variable
output_values: List[float]
property
readonly
Property for the output values
i_rolling_statistics_rule_data
Module for IRollingStatisticsRuleData interface
!!! interfaces IRollingStatisticsRuleData
IRollingStatisticsRuleData (IRuleData , ABC )
Data for a RollingStatisticsRule
Source code in api/i_rolling_statistics_rule_data.py
class IRollingStatisticsRuleData(IRuleData, ABC):
"""Data for a RollingStatisticsRule"""
@property
@abstractmethod
def input_variable(self) -> str:
"""Name of the input variable"""
@property
@abstractmethod
def operation(self) -> TimeOperationType:
"""Operation type"""
@property
@abstractmethod
def percentile_value(self) -> float:
"""Operation parameter"""
@property
@abstractmethod
def time_scale(self) -> str:
"""Time scale"""
@property
@abstractmethod
def period(self) -> float:
"""Period"""
input_variable: str
property
readonly
Name of the input variable
operation: TimeOperationType
property
readonly
Operation type
percentile_value: float
property
readonly
Operation parameter
period: float
property
readonly
Period
time_scale: str
property
readonly
Time scale
i_rule_data
Module for IRuleData interface
!!! interfaces IRuleData
IRuleData (ABC )
Interface for rules data information
Source code in api/i_rule_data.py
class IRuleData(ABC):
"""Interface for rules data information"""
@property
@abstractmethod
def name(self) -> str:
"""Name of the rule"""
@property
@abstractmethod
def description(self) -> str:
"""Description of the rule"""
@property
@abstractmethod
def output_variable(self) -> str:
"""Read the rule using the name"""
description: str
property
readonly
Description of the rule
name: str
property
readonly
Name of the rule
output_variable: str
property
readonly
Name of the output variable of the rule
i_step_function_rule_data
Module for IStepFunctionRuleData interface
!!! interfaces IStepFunctionRuleData
IStepFunctionRuleData (IRuleData , ABC )
Data for a step function rule
Source code in api/i_step_function_rule_data.py
class IStepFunctionRuleData(IRuleData, ABC):
"""Data for a step function rule"""
@property
@abstractmethod
def input_variable(self) -> str:
"""Name of the input variable"""
@property
@abstractmethod
def limits(self) -> List[float]:
"""Limits of the intervals defining the step function rule"""
@property
@abstractmethod
def responses(self) -> List[float]:
"""Responses corresponding to each of the intervals
defining the step function rule"""
input_variable: str
property
readonly
Name of the input variable
limits: List[float]
property
readonly
Limits of the intervals defining the step function rule
responses: List[float]
property
readonly
Responses corresponding to each of the intervals defining the step function rule
i_time_aggregation_rule_data
Module for ITimeAggregationRuleData interface
!!! interfaces ITimeAggregationRuleData
ITimeAggregationRuleData (IRuleData , ABC )
Data for a TimeAggregationRule
Source code in api/i_time_aggregation_rule_data.py
class ITimeAggregationRuleData(IRuleData, ABC):
"""Data for a TimeAggregationRule"""
@property
@abstractmethod
def input_variable(self) -> str:
"""Name of the input variable"""
@property
@abstractmethod
def operation(self) -> TimeOperationType:
"""Operation type"""
@property
@abstractmethod
def percentile_value(self) -> float:
"""Operation parameter"""
@property
@abstractmethod
def time_scale(self) -> str:
"""Time scale"""
input_variable: str
property
readonly
Name of the input variable
operation: TimeOperationType
property
readonly
Operation type
percentile_value: float
property
readonly
Operation parameter
time_scale: str
property
readonly
Time scale
output_file_settings
Module for OutputFileSettings class
!!! classes OutputFileSettings
OutputFileSettings
settings class used to store information about how to write the output file
Source code in api/output_file_settings.py
class OutputFileSettings:
"""settings class used to store information about how to write the
output file"""
def __init__(self, application_name: str, application_version: str) -> None:
"""Creates an instance of OutputFileSettings
Args:
application_version (str) : version of the application
application_name (str) : name of the application
"""
self._application_name: str = application_name
self._application_version: str = application_version
self._variables_to_save: Optional[List[str]] = None
@property
def application_name(self) -> str:
"""name of the application"""
return self._application_name
@property
def application_version(self) -> str:
"""version of the application"""
return self._application_version
@property
def variables_to_save(self) -> Optional[List[str]]:
"""variables to save to the output"""
return self._variables_to_save
@variables_to_save.setter
def variables_to_save(self, variables_to_save: Optional[List[str]]):
self._variables_to_save = variables_to_save
application_name: str
property
readonly
name of the application
application_version: str
property
readonly
version of the application
variables_to_save: Optional[List[str]]
property
writable
variables to save to the output
__init__(self, application_name, application_version)
special
Creates an instance of OutputFileSettings
Parameters:
Name | Type | Description | Default
---|---|---|---
application_version | str | version of the application | required
application_name | str | name of the application | required
Source code in api/output_file_settings.py
def __init__(self, application_name: str, application_version: str) -> None:
"""Creates an instance of OutputFileSettings
Args:
application_version (str) : version of the application
application_name (str) : name of the application
"""
self._application_name: str = application_name
self._application_version: str = application_version
self._variables_to_save: Optional[List[str]] = None
time_operation_type
Module for TimeOperationType
!!! classes TimeOperationType
TimeOperationType (IntEnum )
Classify the time operation types.
Source code in api/time_operation_type.py
class TimeOperationType(IntEnum):
"""Classify the time operation types."""
ADD = 1
MIN = 2
MAX = 3
AVERAGE = 4
MEDIAN = 5
COUNT_PERIODS = 6
MAX_DURATION_PERIODS = 7
AVG_DURATION_PERIODS = 8
STDEV = 9
PERCENTILE = 10
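Members can be looked up by name, mirroring how ModelBuilder resolves MultiArrayOperationType from the operation name of a combine results rule; a minimal sketch:

```python
op = TimeOperationType["MAX"]  # look up a member by its name
assert op is TimeOperationType.MAX
assert int(op) == 3            # IntEnum members behave as plain integers
```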
dictionary_utils
Module for dictionary utilities
convert_table_element(table)
Convert a table element into a dictionary
Parameters:
Name | Type | Description | Default
---|---|---|---
table | list[Any] | Table to convert | required
Exceptions:
Type | Description
---|---
ValueError | When table is not correctly defined
Returns:
Type | Description
---|---
Dict[Any, Any] | readable dictionary with parsed headers and values
Source code in data/dictionary_utils.py
def convert_table_element(table: List[Any]) -> Dict[Any, Any]:
"""Convert a table element into a dictionary
Args:
table (list[Any]): Table to convert
Raises:
ValueError: When table is not correctly defined
Returns:
Dict[Any, Any]: readable dictionary with parsed headers and values.
"""
if len(table) <= 1:
raise ValueError(
"Define a correct table with the headers in the first row and values in \
the others."
)
if not all(len(row) == len(table[0]) for row in table):
raise ValueError("Make sure that all rows in the table have the same length.")
headers = table[0]
if len(headers) != len(set(headers)):
seen = set()
dupes = [x for x in headers if x in seen or seen.add(x)]
raise ValueError(
f"There should only be unique headers. Duplicate values: {dupes}"
)
values = list(map(list, zip(*table[1:]))) # transpose list
return dict(zip(headers, values))
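A small sketch of the conversion (values invented): the first row supplies the keys and the remaining rows are transposed into per-column value lists.

```python
from decoimpact.data.dictionary_utils import convert_table_element  # assumed import path

table = [
    ["salinity", "water_depth", "output"],
    ["0:5",      "<2",          1],
    ["5:10",     ">=2",         2],
]
convert_table_element(table)
# {"salinity": ["0:5", "5:10"], "water_depth": ["<2", ">=2"], "output": [1, 2]}
```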
get_dict_element(key, contents, required=True)
Tries to get an element from the provided dictionary.
Parameters:
Name | Type | Description | Default
---|---|---|---
key | str | Name of the element to search for | required
contents | Dict[str, Any] | Dictionary to search | required
required | bool | If the key needs to be there. Defaults to True. | True
Exceptions:
Type | Description
---|---
AttributeError | Thrown when the key is required but is missing
Returns:
Type | Description
---|---
T | Value for the specified key
Source code in data/dictionary_utils.py
def get_dict_element(
key: str, contents: Dict[str, ValueT], required: bool = True
) -> Optional[ValueT]:
"""Tries to get an element from the provided dictionary.
Args:
key (str): Name of the element to search for
contents (Dict[str, Any]): Dictionary to search
required (bool, optional): If the key needs to be there. Defaults
to True.
Raises:
AttributeError: Thrown when the key is required but is missing).
Returns:
T: Value for the specified key
"""
has_element = key in contents.keys()
if has_element:
return contents[key]
if required:
raise AttributeError(f"Missing element {key}")
return None
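A short sketch (values invented): a missing optional key returns None, a missing required key raises AttributeError.

```python
from decoimpact.data.dictionary_utils import get_dict_element  # assumed import path

contents = {"name": "test model", "description": "example"}

get_dict_element("name", contents)                     # "test model"
get_dict_element("version", contents, required=False)  # None
get_dict_element("version", contents)                  # raises AttributeError("Missing element version")
```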
entities
axis_filter_rule_data
Module for AxisFilterRuleData class
!!! classes AxisFilterRuleData
AxisFilterRuleData (IAxisFilterRuleData , RuleData )
Class for storing data related to axis filter rule
Source code in entities/axis_filter_rule_data.py
class AxisFilterRuleData(IAxisFilterRuleData, RuleData):
"""Class for storing data related to axis filter rule rule"""
def __init__(
self, name: str, element_index: int, axis_name: str, input_variable: str
):
super().__init__(name)
self._input_variable = input_variable
self._element_index = element_index
self._axis_name = axis_name
@property
def input_variable(self) -> str:
"""Property for the input variable"""
return self._input_variable
@property
def element_index(self) -> int:
"""Property for the index of the element on the axis to filter on"""
return self._element_index
@property
def axis_name(self) -> str:
"""Property for the dimension name"""
return self._axis_name
axis_name: str
property
readonly
Property for the dimension name
element_index: int
property
readonly
Property for the index of the element on the axis to filter on
input_variable: str
property
readonly
Property for the input variable
classification_rule_data
Module for (multiple) ClassificationRule class
!!! classes (multiple) ClassificationRuleData
ClassificationRuleData (IClassificationRuleData , RuleData )
Class for storing data related to classification rule
Source code in entities/classification_rule_data.py
class ClassificationRuleData(IClassificationRuleData, RuleData):
"""Class for storing data related to formula rule"""
def __init__(
self,
name: str,
input_variable_names: List[str],
criteria_table: Dict[str, List],
):
super().__init__(name)
self._input_variable_names = input_variable_names
self._criteria_table = criteria_table
@property
def criteria_table(self) -> Dict:
"""Criteria property"""
return self._criteria_table
@property
def input_variable_names(self) -> List[str]:
return self._input_variable_names
criteria_table: Dict
property
readonly
Criteria property
input_variable_names: List[str]
property
readonly
Name of the input variable
combine_results_rule_data
Module for CombineResultsRuleData class
!!! classes CombineResultsRuleData
CombineResultsRuleData (ICombineResultsRuleData , RuleData )
Class for storing data related to combine results rule
Source code in entities/combine_results_rule_data.py
class CombineResultsRuleData(ICombineResultsRuleData, RuleData):
"""Class for storing data related to combine results rule"""
def __init__(self, name: str, input_variable_names: List[str],
operation_type: str, ignore_nan: bool = False):
super().__init__(name)
self._input_variable_names = input_variable_names
self._operation_type = operation_type
self._ignore_nan = ignore_nan
@property
def input_variable_names(self) -> List[str]:
"""Name of the input variable"""
return self._input_variable_names
@property
def operation_type(self) -> str:
"""Name of the input variable"""
return self._operation_type
@property
def ignore_nan(self) -> bool:
"""Property for the ignore_nan flag"""
return self._ignore_nan
ignore_nan: bool
property
readonly
Property for the ignore_nan flag
input_variable_names: List[str]
property
readonly
Name of the input variable
operation_type: str
property
readonly
Property for the operation type
data_access_layer
Module for DataAccessLayer class
!!! classes DataAccessLayer
DataAccessLayer (IDataAccessLayer )
Implementation of the data layer
Source code in entities/data_access_layer.py
class DataAccessLayer(IDataAccessLayer):
"""Implementation of the data layer"""
def __init__(self, logger: ILogger):
self._logger = logger
def retrieve_file_names(self, path: Path) -> dict:
"""
Find all files according to the pattern in the path string
If the user gives one filename, one file is returned. The user
can give in a * in the filename and all files that correspond to
that pattern will be retrieved.
Args:
path (str): path to input file (with * for generic part)
Returns:
List: List of strings with all files in folder according to pattern
"""
name_list = list(path.parent.glob(path.name))
# check if there is at least 1 file found.
if len(name_list) == 0:
message = f"""No files found for inputfilename {path.name}. \
Make sure the input file location is valid."""
raise FileExistsError(message)
names = {}
for name in name_list:
if "*" in path.name:
part = re.findall(path.name.replace("*", "(.*)"), name.as_posix())
names["_".join(part)] = name
else:
names[""] = name
return names
def read_input_file(self, path: Path) -> IModelData:
"""Reads input file from provided path
Args:
path (str): path to input file
Returns:
IModelData: Data regarding model
Raises:
FileExistsError: if file does not exist
AttributeError: if yaml data is invalid
"""
self._logger.log_info(f"Creating model data based on yaml file {path}")
if not path.exists():
msg = f"ERROR: The input file {path} does not exist."
self._logger.log_error(msg)
raise FileExistsError(msg)
with open(path, "r", encoding="utf-8") as stream:
contents: dict[Any, Any] = _yaml.load(
stream, Loader=self.__create_yaml_loader()
)
model_data_builder = ModelDataBuilder(self._logger)
try:
yaml_data = model_data_builder.parse_yaml_data(contents)
except AttributeError as exc:
raise AttributeError(f"Error reading input file. {exc}") from exc
return yaml_data
def read_input_dataset(self, dataset_data: IDatasetData) -> _xr.Dataset:
"""Uses the provided dataset_data to create/read a xarray Dataset
Args:
dataset_data (IDatasetData): dataset data for creating an
xarray dataset
Returns:
_xr.Dataset: Dataset based on provided dataset_data
"""
# get start and end date from input file and convert to date format
# if start or end date is not given, then use None to slice the data
date_format = "%d-%m-%Y"
filter_start_date = None
ds_start_date = dataset_data.start_date
if ds_start_date != "None":
filter_start_date = datetime.strptime(ds_start_date, date_format)
filter_end_date = None
ds_end_date = dataset_data.end_date
if ds_end_date != "None":
filter_end_date = datetime.strptime(ds_end_date, date_format)
if dataset_data.path.suffix != ".nc":
message = f"""The file {dataset_data.path} is not supported. \
Currently only UGrid (NetCDF) files are supported."""
raise NotImplementedError(message)
# open input dataset (from .nc file)
try:
dataset: _xr.Dataset = _xr.open_dataset(
dataset_data.path, mask_and_scale=True
)
# mask_and_scale argument is needed to prevent inclusion of NaN's
# in dataset for missing values. This inclusion converts integers
# to floats
except ValueError as exc:
msg = "ERROR: Cannot open input .nc file -- " + str(dataset_data.path)
raise ValueError(msg) from exc
# apply time filter on input dataset
try:
if filter_start_date is not None or filter_end_date is not None:
time_filter = f"({filter_start_date}, {filter_end_date})"
self._logger.log_info(f"Applying time filter {time_filter} on dataset")
dataset = dataset.sel(time=slice(filter_start_date, filter_end_date))
except ValueError as exc:
msg = "ERROR: error applying time filter on dataset"
raise ValueError(msg) from exc
return dataset
def write_output_file(
self, dataset: _xr.Dataset, path: Path, settings: OutputFileSettings
) -> None:
"""Write XArray dataset to specified path
Args:
dataset (XArray dataset): dataset to write
path (str): path to output file
settings (OutputFileSettings): settings to use for saving output
Returns:
None
Raises:
FileExistsError: if output file location does not exist
OSError: if output file cannot be written
"""
self._logger.log_info(f"Writing model output data to {path}")
if not Path.exists(path.parent):
# try to make intermediate folders
Path(path.parent).mkdir(parents=True, exist_ok=True)
if not Path.exists(path.parent):
message = f"""The path {path.parent} is not found. \
Make sure the output file location is valid."""
raise FileExistsError(message)
if Path(path).suffix != ".nc":
message = f"""The file {path} is not supported. \
Currently only UGrid (NetCDF) files are supported."""
raise NotImplementedError(message)
try:
dataset.attrs["Version"] = settings.application_version
dataset.attrs["Generated by"] = settings.application_name
if settings.variables_to_save and len(settings.variables_to_save) > 0:
dataset = reduce_dataset_for_writing(
dataset, settings.variables_to_save, self._logger
)
dataset.to_netcdf(path, format="NETCDF4")
# D-Flow FM sometimes still uses netCDF3.
# If necessary we can revert to "NETCDF4_CLASSIC"
# (Data is stored in an HDF5 file, using only netCDF 3 compatible
# API features.)
# TO DO: write application_version to output file as a global attribute
except OSError as exc:
msg = f"ERROR: Cannot write output .nc file -- {path}"
self._logger.log_error(msg)
raise OSError(msg) from exc
def yaml_include_constructor(self, loader: _yaml.Loader, node: _yaml.Node) -> Any:
"""constructor function to make !include (referencedfile) possible"""
file_path = Path(loader.name).parent
file_path = file_path.joinpath(loader.construct_yaml_str(node)).resolve()
with open(file=file_path, mode="r", encoding="utf-8") as incl_file:
return _yaml.load(incl_file, type(loader))
def __create_yaml_loader(self):
"""create yaml loader"""
loader = _yaml.FullLoader
loader.add_constructor("!include", self.yaml_include_constructor)
# Add support for scientific notation (example 1e5=100000)
# Define the YAML float tag and regex pattern for scientific notation
float_decimal = r"[-+]?(?:\d[\d_]*)\.[0-9_]*(?:[eE][-+]?\d+)?"
float_exponent = r"[-+]?(?:\d[\d_]*)(?:[eE][-+]?\d+)"
float_leading_dot = r"\.[\d_]+(?:[eE][-+]\d+)?"
float_time = r"[-+]?\d[\d_]*(?::[0-5]?\d)+\.[\d_]*"
float_inf = r"[-+]?\.(?:inf|Inf|INF)"
float_nan = r"\.(?:nan|NaN|NAN)"
float_regex_pattern = rf"""^(?:
{float_decimal}
|{float_exponent}
|{float_leading_dot}
|{float_time}
|{float_inf}
|{float_nan})$"""
float_regex = re.compile(float_regex_pattern, re.X)
loader.add_implicit_resolver(
"tag:yaml.org,2002:float",
float_regex,
list("-+0123456789."),
)
return loader
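The extra implicit resolver is needed because a default PyYAML loader reads bare exponent notation such as 1e5 as a string rather than a float; a minimal check of that default behaviour (assuming PyYAML is installed):

```python
import yaml

value = yaml.load("threshold: 1e5", Loader=yaml.FullLoader)["threshold"]
print(type(value))  # <class 'str'> with a plain FullLoader; the resolver above turns it into a float
```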
read_input_dataset(self, dataset_data)
Uses the provided dataset_data to create/read a xarray Dataset
Parameters:
Name | Type | Description | Default
---|---|---|---
dataset_data | IDatasetData | dataset data for creating an xarray dataset | required
Returns:
Type | Description
---|---
_xr.Dataset | Dataset based on provided dataset_data
Source code in entities/data_access_layer.py
def read_input_dataset(self, dataset_data: IDatasetData) -> _xr.Dataset:
"""Uses the provided dataset_data to create/read a xarray Dataset
Args:
dataset_data (IDatasetData): dataset data for creating an
xarray dataset
Returns:
_xr.Dataset: Dataset based on provided dataset_data
"""
# get start and end date from input file and convert to date format
# if start or end date is not given, then use None to slice the data
date_format = "%d-%m-%Y"
filter_start_date = None
ds_start_date = dataset_data.start_date
if ds_start_date != "None":
filter_start_date = datetime.strptime(ds_start_date, date_format)
filter_end_date = None
ds_end_date = dataset_data.end_date
if ds_end_date != "None":
filter_end_date = datetime.strptime(ds_end_date, date_format)
if dataset_data.path.suffix != ".nc":
message = f"""The file {dataset_data.path} is not supported. \
Currently only UGrid (NetCDF) files are supported."""
raise NotImplementedError(message)
# open input dataset (from .nc file)
try:
dataset: _xr.Dataset = _xr.open_dataset(
dataset_data.path, mask_and_scale=True
)
# mask_and_scale argument is needed to prevent inclusion of NaN's
# in dataset for missing values. This inclusion converts integers
# to floats
except ValueError as exc:
msg = "ERROR: Cannot open input .nc file -- " + str(dataset_data.path)
raise ValueError(msg) from exc
# apply time filter on input dataset
try:
if filter_start_date is not None or filter_end_date is not None:
time_filter = f"({filter_start_date}, {filter_end_date})"
self._logger.log_info(f"Applying time filter {time_filter} on dataset")
dataset = dataset.sel(time=slice(filter_start_date, filter_end_date))
except ValueError as exc:
msg = "ERROR: error applying time filter on dataset"
raise ValueError(msg) from exc
return dataset
read_input_file(self, path)
Reads input file from provided path
Parameters:
Name | Type | Description | Default
---|---|---|---
path | str | path to input file | required
Returns:
Type | Description
---|---
IModelData | Data regarding model
Exceptions:
Type | Description
---|---
FileExistsError | if file does not exist
AttributeError | if yaml data is invalid
Source code in entities/data_access_layer.py
def read_input_file(self, path: Path) -> IModelData:
"""Reads input file from provided path
Args:
path (str): path to input file
Returns:
IModelData: Data regarding model
Raises:
FileExistsError: if file does not exist
AttributeError: if yaml data is invalid
"""
self._logger.log_info(f"Creating model data based on yaml file {path}")
if not path.exists():
msg = f"ERROR: The input file {path} does not exist."
self._logger.log_error(msg)
raise FileExistsError(msg)
with open(path, "r", encoding="utf-8") as stream:
contents: dict[Any, Any] = _yaml.load(
stream, Loader=self.__create_yaml_loader()
)
model_data_builder = ModelDataBuilder(self._logger)
try:
yaml_data = model_data_builder.parse_yaml_data(contents)
except AttributeError as exc:
raise AttributeError(f"Error reading input file. {exc}") from exc
return yaml_data
retrieve_file_names(self, path)
Find all files according to the pattern in the path string If the user gives one filename, one file is returned. The user can give in a * in the filename and all files that correspond to that pattern will be retrieved.
Parameters:
Name | Type | Description | Default
---|---|---|---
path | str | path to input file (with * for generic part) | required
Returns:
Type | Description
---|---
List | List of strings with all files in folder according to pattern
Source code in entities/data_access_layer.py
def retrieve_file_names(self, path: Path) -> dict:
"""
Find all files according to the pattern in the path string
If the user gives one filename, one file is returned. The user
can give in a * in the filename and all files that correspond to
that pattern will be retrieved.
Args:
path (str): path to input file (with * for generic part)
Returns:
List: List of strings with all files in folder according to pattern
"""
name_list = list(path.parent.glob(path.name))
# check if there is at least 1 file found.
if len(name_list) == 0:
message = f"""No files found for inputfilename {path.name}. \
Make sure the input file location is valid."""
raise FileExistsError(message)
names = {}
for name in name_list:
if "*" in path.name:
part = re.findall(path.name.replace("*", "(.*)"), name.as_posix())
names["_".join(part)] = name
else:
names[""] = name
return names
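For example, a wildcard in the filename yields a dictionary keyed by the part of each file name that matched the wildcard. The sketch below continues the read_input_file example above (same `data_access_layer` object) and assumes a folder `input/` containing `data_2020.nc` and `data_2021.nc`.

```python
from pathlib import Path

# Sketch only: assumes input/ contains data_2020.nc and data_2021.nc,
# and reuses the data_access_layer object from the previous sketch.
names = data_access_layer.retrieve_file_names(Path("input/data_*.nc"))
# names would then be {"2020": Path("input/data_2020.nc"),
#                      "2021": Path("input/data_2021.nc")}
```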
write_output_file(self, dataset, path, settings)
Write XArray dataset to specified path
Parameters:

Name | Type | Description | Default |
---|---|---|---|
dataset | XArray dataset | dataset to write | required |
path | str | path to output file | required |
settings | OutputFileSettings | settings to use for saving output | required |

Returns:

Type | Description |
---|---|
None | None |

Exceptions:

Type | Description |
---|---|
FileExistsError | if output file location does not exist |
OSError | if output file cannot be written |
Source code in entities/data_access_layer.py
def write_output_file(
self, dataset: _xr.Dataset, path: Path, settings: OutputFileSettings
) -> None:
"""Write XArray dataset to specified path
Args:
dataset (XArray dataset): dataset to write
path (str): path to output file
settings (OutputFileSettings): settings to use for saving output
Returns:
None
Raises:
FileExistsError: if output file location does not exist
OSError: if output file cannot be written
"""
self._logger.log_info(f"Writing model output data to {path}")
if not Path.exists(path.parent):
# try to make intermediate folders
Path(path.parent).mkdir(parents=True, exist_ok=True)
if not Path.exists(path.parent):
message = f"""The path {path.parent} is not found. \
Make sure the output file location is valid."""
raise FileExistsError(message)
if Path(path).suffix != ".nc":
message = f"""The file {path} is not supported. \
Currently only UGrid (NetCDF) files are supported."""
raise NotImplementedError(message)
try:
dataset.attrs["Version"] = settings.application_version
dataset.attrs["Generated by"] = settings.application_name
if settings.variables_to_save and len(settings.variables_to_save) > 0:
dataset = reduce_dataset_for_writing(
dataset, settings.variables_to_save, self._logger
)
dataset.to_netcdf(path, format="NETCDF4")
# D-Flow FM sometimes still uses netCDF3.
# If necessary we can revert to "NETCDF4_CLASSIC"
# (Data is stored in an HDF5 file, using only netCDF 3 compatible
# API features.)
# TO DO: write application_version to output file as a global attribute
except OSError as exc:
msg = f"ERROR: Cannot write output .nc file -- {path}"
self._logger.log_error(msg)
raise OSError(msg) from exc
yaml_include_constructor(self, loader, node)
constructor function to make !include (referencedfile) possible
Source code in entities/data_access_layer.py
def yaml_include_constructor(self, loader: _yaml.Loader, node: _yaml.Node) -> Any:
"""constructor function to make !include (referencedfile) possible"""
file_path = Path(loader.name).parent
file_path = file_path.joinpath(loader.construct_yaml_str(node)).resolve()
with open(file=file_path, mode="r", encoding="utf-8") as incl_file:
return _yaml.load(incl_file, type(loader))
dataset_data
Module for DatasetData interface
!!! classes DatasetData
DatasetData (IDatasetData )
Class for storing dataset information
Source code in entities/dataset_data.py
class DatasetData(IDatasetData):
"""Class for storing dataset information"""
def __init__(self, dataset: dict[str, Any]):
"""Create DatasetData based on provided info dictionary
Args:
dataset (dict[str, Any]):
"""
self._path = Path(get_dict_element("filename", dataset)).resolve()
self._start_date = str(get_dict_element("start_date", dataset, False))
self._end_date = str(get_dict_element("end_date", dataset, False))
self._get_mapping(dataset)
@property
def path(self) -> Path:
"""File path to the input dataset"""
return self._path
@property
def start_date(self) -> str:
"""optional start date to filter the dataset"""
# start_date is passed as string (not datetime) because it is optional
return self._start_date
@property
def end_date(self) -> str:
"""optional end date to filter the dataset"""
# end_date is passed as string (not datetime) because it is optional
return self._end_date
@property
def mapping(self) -> dict[str, str]:
"""Variable name mapping (source to target)"""
return self._mapping
@path.setter
def path(self, path: Path):
"""path of the model"""
self._path = path
def _get_mapping(self, dataset: dict[str, Any]):
"""Get mapping specified in input file
Args:
dataset (dict[str, Any]):
"""
self._mapping = get_dict_element("variable_mapping", dataset, False)
end_date: str
property
readonly
optional end date to filter the dataset
mapping: dict[str, str]
property
readonly
Variable name mapping (source to target)
path: Path
property
writable
File path to the input dataset
start_date: str
property
readonly
optional start date to filter the dataset
__init__(self, dataset)
special
Create DatasetData based on provided info dictionary
Parameters:

Name | Type | Description | Default |
---|---|---|---|
dataset | dict[str, Any] | | required |
Source code in entities/dataset_data.py
def __init__(self, dataset: dict[str, Any]):
"""Create DatasetData based on provided info dictionary
Args:
dataset (dict[str, Any]):
"""
self._path = Path(get_dict_element("filename", dataset)).resolve()
self._start_date = str(get_dict_element("start_date", dataset, False))
self._end_date = str(get_dict_element("end_date", dataset, False))
self._get_mapping(dataset)
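For illustration, a DatasetData object can be built from the kind of dictionary the input file parsing produces. The keys below match the get_dict_element calls shown above; the module path is an assumption and the file and variable names are illustrative.

```python
from decoimpact.data.entities.dataset_data import DatasetData

dataset_info = {
    "filename": "input/flow_results.nc",
    # start_date / end_date are optional time filters; their expected date
    # format is the one used by the data-access layer (not shown in this sketch).
    "start_date": "01-01-2020",
    "end_date": "31-12-2020",
    # variable_mapping renames source variables to the names used in the rules.
    "variable_mapping": {"mesh2d_sa1": "salinity"},
}

dataset_data = DatasetData(dataset_info)
print(dataset_data.path, dataset_data.start_date, dataset_data.mapping)
```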
depth_average_rule_data
Module for (multiple) DepthAverageRule class
!!! classes (multiple) DepthAverageRuleData
DepthAverageRuleData (IDepthAverageRuleData , RuleData )
Class for storing data related to depth average rule
Source code in entities/depth_average_rule_data.py
class DepthAverageRuleData(IDepthAverageRuleData, RuleData):
"""Class for storing data related to depth average rule"""
def __init__(
self,
name: str,
input_variables: List[str],
):
super().__init__(name)
self._input_variables = input_variables
@property
def input_variables(self) -> List[str]:
"""List with input variables"""
return self._input_variables
input_variables: List[str]
property
readonly
List with input variables
filter_extremes_rule_data
Module for FilterExtremesRuleData class
!!! classes FilterExtremesRuleData
FilterExtremesRuleData (IFilterExtremesRuleData , RuleData )
Class for storing data related to filter extremes rule
Source code in entities/filter_extremes_rule_data.py
class FilterExtremesRuleData(IFilterExtremesRuleData, RuleData):
"""Class for storing data related to filter extremes rule"""
# pylint: disable=too-many-arguments
# pylint: disable=too-many-positional-arguments
def __init__(
self,
name: str,
input_variables: List[str],
extreme_type: str,
distance: int,
time_scale: str,
mask: bool,
):
super().__init__(name)
self._input_variables = input_variables
self._extreme_type = extreme_type
self._distance = distance
self._time_scale = time_scale
self._mask = mask
@property
def input_variables(self) -> List[str]:
"""List with input variables"""
return self._input_variables
@property
def extreme_type(self) -> str:
"""Property for the extremes type"""
return self._extreme_type
@property
def distance(self) -> int:
"""Property for the distance between peaks"""
return self._distance
@property
def time_scale(self) -> str:
"""Property for the timescale of the distance between peaks"""
return self._time_scale
@property
def mask(self) -> bool:
"""Property for mask"""
return self._mask
distance: int
property
readonly
Property for the distance between peaks
extreme_type: str
property
readonly
Property for the extremes type
input_variables: List[str]
property
readonly
List with input variables
mask: bool
property
readonly
Property for mask
time_scale: str
property
readonly
Property for the timescale of the distance between peaks
formula_rule_data
Module for FormulaRuleData class
!!! classes FormulaRuleData
FormulaRuleData (IFormulaRuleData , RuleData )
Class for storing data related to formula rule
Source code in entities/formula_rule_data.py
class FormulaRuleData(IFormulaRuleData, RuleData):
"""Class for storing data related to formula rule"""
def __init__(self, name: str, input_variable_names: List[str], formula: str):
super().__init__(name)
self._input_variable_names = input_variable_names
self._formula = formula
@property
def input_variable_names(self) -> List[str]:
"""List of input variable names"""
return self._input_variable_names
@property
def formula(self) -> str:
"""Formula as string using input variable names"""
return self._formula
formula: str
property
readonly
Formula as string using input variable names
input_variable_names: List[str]
property
readonly
List of input variable names
layer_filter_rule_data
Module for LayerFilterRuleData class
!!! classes LayerFilterRuleData
LayerFilterRuleData (ILayerFilterRuleData , RuleData )
Class for storing data related to the layer filter rule
Source code in entities/layer_filter_rule_data.py
class LayerFilterRuleData(ILayerFilterRuleData, RuleData):
"""Class for storing data related to layer filter rule rule"""
def __init__(self, name: str, layer_number: int, input_variable: str):
super().__init__(name)
self._input_variable = input_variable
self._layer_number = layer_number
@property
def input_variable(self) -> str:
"""Property for the input variable"""
return self._input_variable
@property
def layer_number(self) -> int:
"""Property for the layer number"""
return self._layer_number
input_variable: str
property
readonly
Property for the input variable
layer_number: int
property
readonly
Property for the layer number
model_data_builder
Module for ModelDataBuilder class
ModelDataBuilder
Builder for creating Model data objects (parsing rules and datasets read from the input file to Rule and DatasetData objects)
Source code in entities/model_data_builder.py
class ModelDataBuilder:
"""Builder for creating Model data objects (parsing rules and datasets
read from the input file to Rule and DatasetData objects)"""
def __init__(self, logger: ILogger) -> None:
"""Create ModelDataBuilder"""
self._rule_parsers = list(rule_parsers())
self._logger = logger
def parse_yaml_data(self, contents: dict[Any, Any]) -> IModelData:
"""Parse the Yaml input file into a data object
Raises:
AttributeError: when version is not available from the input file
"""
input_version = self._parse_input_version(contents)
if not input_version:
raise AttributeError(name="Version not available from input file")
input_datasets = list(self._parse_input_datasets(contents))
output_path = self._parse_output_dataset(contents)
output_variables = self._parse_save_only_variables(contents)
rules = list(self._parse_rules(contents))
model_data = YamlModelData("Model 1", input_version)
model_data.datasets = input_datasets
model_data.output_path = output_path
model_data.output_variables = list(output_variables)
model_data.rules = rules
return model_data
def _parse_input_version(self, contents: dict[Any, Any]) -> Optional[List[int]]:
input_version = None
try:
# read version string
version_string: str = get_dict_element("version", contents)
# check existence of version_string
if len(str(version_string)) == 0 or version_string is None:
self._logger.log_error(
f"Version ('{version_string}')" + " in input yaml is missing"
)
else:
# split string into 3 list items
version_list = version_string.split(".", 2)
# convert str[] to int[]
input_version = list(map(int, version_list))
except (ValueError, AttributeError, TypeError) as exception:
self._logger.log_error(f"Invalid version in input yaml: {exception}")
return None
return input_version
def _parse_input_datasets(self, contents: dict[str, Any]) -> Iterable[IDatasetData]:
input_datasets: List[dict[str, Any]] = get_dict_element("input-data", contents)
for input_dataset in input_datasets:
yield DatasetData(get_dict_element("dataset", input_dataset))
def _parse_output_dataset(self, contents: dict[str, Any]) -> Path:
output_data: dict[str, Any] = get_dict_element("output-data", contents)
return Path(output_data["filename"])
def _parse_save_only_variables(self, contents: dict[str, Any]) -> Iterable[str]:
output_data: dict[str, Any] = get_dict_element("output-data", contents)
save_only_variables = output_data.get("save_only_variables", [])
# Convert to list if not already one
if isinstance(save_only_variables, str):
save_only_variables = [save_only_variables]
return save_only_variables
def _parse_rules(self, contents: dict[str, Any]) -> Iterable[IRuleData]:
rules: List[dict[str, Any]] = get_dict_element("rules", contents)
for rule in rules:
rule_type_name = list(rule.keys())[0]
rule_dict = rule[rule_type_name]
parser = self._get_rule_data_parser(rule_type_name)
yield parser.parse_dict(rule_dict, self._logger)
def _get_rule_data_parser(self, rule_name: str) -> IParserRuleBase:
for parser in rule_parsers():
if parser.rule_type_name == rule_name:
return parser
raise KeyError(f"No parser for {rule_name}")
__init__(self, logger)
special
Create ModelDataBuilder
Source code in entities/model_data_builder.py
def __init__(self, logger: ILogger) -> None:
"""Create ModelDataBuilder"""
self._rule_parsers = list(rule_parsers())
self._logger = logger
parse_yaml_data(self, contents)
Parse the Yaml input file into a data object
Exceptions:

Type | Description |
---|---|
AttributeError | when version is not available from the input file |
Source code in entities/model_data_builder.py
def parse_yaml_data(self, contents: dict[Any, Any]) -> IModelData:
"""Parse the Yaml input file into a data object
Raises:
AttributeError: when version is not available from the input file
"""
input_version = self._parse_input_version(contents)
if not input_version:
raise AttributeError(name="Version not available from input file")
input_datasets = list(self._parse_input_datasets(contents))
output_path = self._parse_output_dataset(contents)
output_variables = self._parse_save_only_variables(contents)
rules = list(self._parse_rules(contents))
model_data = YamlModelData("Model 1", input_version)
model_data.datasets = input_datasets
model_data.output_path = output_path
model_data.output_variables = list(output_variables)
model_data.rules = rules
return model_data
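The sketch below shows the kind of dictionary parse_yaml_data expects, mirroring the keys read by the private helpers above (version, input-data, output-data and rules). The module path is an assumption and `_SilentLogger` is a hypothetical ILogger stand-in.

```python
from decoimpact.data.entities.model_data_builder import ModelDataBuilder


class _SilentLogger:
    """Hypothetical ILogger stand-in; only used when parsing reports problems."""

    def log_info(self, message: str) -> None: ...
    def log_warning(self, message: str) -> None: ...
    def log_error(self, message: str) -> None: ...


contents = {
    "version": "0.0.0",  # parsed into a list of three integers
    "input-data": [{"dataset": {"filename": "input/flow_results.nc"}}],
    "output-data": {"filename": "output/result.nc"},
    "rules": [],  # each entry would be {<rule_type_name>: {...rule keys...}}
}

builder = ModelDataBuilder(_SilentLogger())
model_data = builder.parse_yaml_data(contents)
print(model_data.version, model_data.output_path)
```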
multiply_rule_data
Module for MultiplyRuleData class
!!! classes MultiplyRuleData
MultiplyRuleData (IMultiplyRuleData , RuleData )
Class for storing data related to multiply rule
Source code in entities/multiply_rule_data.py
class MultiplyRuleData(IMultiplyRuleData, RuleData):
"""Class for storing data related to multiply rule"""
def __init__(
self,
name: str,
multipliers: List[List[float]],
input_variable: str,
date_range: Optional[List[List[str]]] = None,
):
super().__init__(name)
self._input_variable = input_variable
self._multipliers = multipliers
self._date_range = date_range
@property
def input_variable(self) -> str:
"""Name of the input variable"""
return self._input_variable
@property
def multipliers(self) -> List[List[float]]:
"""List of list with the multipliers"""
return self._multipliers
@property
def date_range(self) -> Optional[List[List[str]]]:
"""List of list with start and end dates"""
return self._date_range
date_range: Optional[List[List[str]]]
property
readonly
List of list with start and end dates
input_variable: str
property
readonly
Name of the input variable
multipliers: List[List[float]]
property
readonly
List of list with the multipliers
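A short sketch constructing this data object directly (module path assumed from the file name above; variable names and values are illustrative):

```python
from decoimpact.data.entities.multiply_rule_data import MultiplyRuleData

# One list of multipliers; date_range is optional and omitted here.
rule_data = MultiplyRuleData(
    name="unit_conversion",
    multipliers=[[0.001]],  # mg/l -> g/l
    input_variable="chloride_mg_per_l",
)
rule_data.output_variable = "chloride_g_per_l"
print(rule_data.multipliers)
```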
response_curve_rule_data
Module for ResponseCurveRuleData class
!!! classes ResponseCurveRuleData
ResponseCurveRuleData (IResponseCurveRuleData , RuleData )
Class for storing data related to the response curve rule
Source code in entities/response_curve_rule_data.py
class ResponseCurveRuleData(IResponseCurveRuleData, RuleData):
"""Class for storing data related to multiply rule"""
def __init__(
self,
name: str,
input_variable: str,
input_values: List[float],
output_values: List[float],
):
super().__init__(name)
self._input_variable = input_variable
self._input_values = input_values
self._output_values = output_values
@property
def input_variable(self) -> str:
"""Property for the input variable"""
return self._input_variable
@property
def input_values(self) -> List[float]:
"""Property for the input values"""
return self._input_values
@property
def output_values(self) -> List[float]:
"""Property for the output values"""
return self._output_values
input_values: List[float]
property
readonly
Property for the input values
input_variable: str
property
readonly
Property for the input variable
output_values: List[float]
property
readonly
Property for the output values
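A short sketch constructing this data object directly (module path assumed; values are illustrative):

```python
from decoimpact.data.entities.response_curve_rule_data import ResponseCurveRuleData

# Paired input and output values describing the response curve.
rule_data = ResponseCurveRuleData(
    name="suitability_salinity",
    input_variable="salinity",
    input_values=[0.0, 5.0, 10.0, 30.0],
    output_values=[1.0, 0.8, 0.2, 0.0],
)
rule_data.output_variable = "habitat_suitability_salinity"
print(rule_data.input_values, rule_data.output_values)
```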
rolling_statistics_rule_data
Module for RollingStatisticsRuleData class
!!! classes RollingStatisticsRuleData
RollingStatisticsRuleData (TimeOperationRuleData , IRollingStatisticsRuleData )
Class for storing data related to the rolling statistics rule
Source code in entities/rolling_statistics_rule_data.py
class RollingStatisticsRuleData(TimeOperationRuleData, IRollingStatisticsRuleData):
"""Class for storing data related to rolling_statistic rule"""
def __init__(
self,
name: str,
operation: TimeOperationType,
input_variable: str,
period: float,
):
super().__init__(name, operation)
self._input_variable = input_variable
self._period = period
@property
def input_variable(self) -> str:
"""Name of the input variable"""
return self._input_variable
@property
def period(self) -> float:
"""Period type"""
return self._period
input_variable: str
property
readonly
Name of the input variable
period: float
property
readonly
Period type
rule_data
Module for RuleData interface
!!! classes RuleData
RuleData (IRuleData , ABC )
Class for storing rule information
Source code in entities/rule_data.py
class RuleData(IRuleData, ABC):
"""Class for storing rule information"""
def __init__(self, name: str):
"""Create RuleData based on provided info dictionary
Args:
info (dict[str, Any]):
"""
self._name = name
self._output_variable = "output"
self._description = ""
@property
def name(self) -> str:
"""Name to the rule"""
return self._name
@property
def description(self) -> str:
"""Description of the rule"""
return self._description
@description.setter
def description(self, description: str):
self._description = description
@property
def output_variable(self) -> str:
"""Name of the output variable of the rule"""
return self._output_variable
@output_variable.setter
def output_variable(self, output_variable: str):
self._output_variable = output_variable
description: str
property
writable
Description of the rule
name: str
property
readonly
Name of the rule
output_variable: str
property
writable
Name of the output variable of the rule
__init__(self, name)
special
Create RuleData with the provided rule name

Parameters:

Name | Type | Description | Default |
---|---|---|---|
name | str | name of the rule | required |
Source code in entities/rule_data.py
def __init__(self, name: str):
"""Create RuleData based on provided info dictionary
Args:
info (dict[str, Any]):
"""
self._name = name
self._output_variable = "output"
self._description = ""
step_function_data
Module for StepFunctionRuleData class
!!! classes StepFunctionRuleData
StepFunctionRuleData (IStepFunctionRuleData , RuleData )
Class for storing data related to step function rule
Source code in entities/step_function_data.py
class StepFunctionRuleData(IStepFunctionRuleData, RuleData):
"""Class for storing data related to step function rule"""
def __init__(
self,
name: str,
limits: List[float],
responses: List[float],
input_variable: str,
):
super().__init__(name)
self._input_variable = input_variable
self._limits = limits
self._responses = responses
@property
def input_variable(self) -> str:
"""Name of the input variable"""
return self._input_variable
@property
def limits(self) -> List[float]:
"""Limits of the interval definition for the step function rule"""
return self._limits
@property
def responses(self) -> List[float]:
"""Step wise responses corresponding to each interval defined by the limits"""
return self._responses
input_variable: str
property
readonly
Name of the input variable
limits: List[float]
property
readonly
Limits of the interval definition for the step function rule
responses: List[float]
property
readonly
Step wise responses corresponding to each interval defined by the limits
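A short sketch constructing this data object directly (module path assumed; values are illustrative):

```python
from decoimpact.data.entities.step_function_data import StepFunctionRuleData

# limits define the interval boundaries; responses give the value per interval.
rule_data = StepFunctionRuleData(
    name="oxygen_class",
    limits=[0.0, 3.0, 6.0],
    responses=[0.0, 0.5, 1.0],
    input_variable="oxygen",
)
rule_data.output_variable = "oxygen_class"
print(rule_data.limits, rule_data.responses)
```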
time_aggregation_rule_data
Module for TimeAggregationRuleData class
!!! classes TimeAggregationRuleData
TimeAggregationRuleData (TimeOperationRuleData , ITimeAggregationRuleData )
Class for storing data related to time_aggregation rule
Source code in entities/time_aggregation_rule_data.py
class TimeAggregationRuleData(TimeOperationRuleData, ITimeAggregationRuleData):
"""Class for storing data related to time_aggregation rule"""
def __init__(self, name: str, operation: TimeOperationType, input_variable: str):
super().__init__(name, operation)
self._input_variable = input_variable
@property
def input_variable(self) -> str:
"""Name of the input variable"""
return self._input_variable
input_variable: str
property
readonly
Name of the input variable
time_operation_rule_data
Module for TimeOperationRuleData class
!!! classes TimeOperationRuleData
TimeOperationRuleData (RuleData )
Base class for rule data related to time operations
Source code in entities/time_operation_rule_data.py
class TimeOperationRuleData(RuleData):
"""Base class for rule data related to time operations"""
def __init__(
self,
name: str,
operation: TimeOperationType,
):
super().__init__(name)
self._operation = operation
self._percentile_value = 0
self._time_scale = "year"
@property
def operation(self) -> TimeOperationType:
"""Operation type"""
return self._operation
@property
def percentile_value(self) -> float:
"""Operation parameter"""
return self._percentile_value
@percentile_value.setter
def percentile_value(self, percentile_value: float):
self._percentile_value = percentile_value
@property
def time_scale(self) -> str:
"""Time scale type"""
return self._time_scale
@time_scale.setter
def time_scale(self, time_scale: str):
self._time_scale = time_scale
operation: TimeOperationType
property
readonly
Operation type
percentile_value: float
property
writable
Operation parameter
time_scale: str
property
writable
Time scale type
yaml_model_data
Module for YamlModelData class
!!! classes YamlModelData
YamlModelData (IModelData )
Implementation of the model data
Source code in entities/yaml_model_data.py
class YamlModelData(IModelData):
"""Implementation of the model data"""
def __init__(self, name: str, version: List[int]):
self._name = name
self._version = version
self._datasets = []
self._output_path = Path("")
self._output_variables = []
self._rules = []
@property
def name(self) -> str:
"""Name of the model"""
return self._name
@property
def version(self) -> List[int]:
"""Version of the model"""
return self._version
@property
def datasets(self) -> List[IDatasetData]:
"""Datasets of the model"""
return self._datasets
@datasets.setter
def datasets(self, datasets: List[IDatasetData]):
self._datasets = datasets
@property
def output_path(self) -> Path:
"""Model path to the output file"""
return self._output_path
@output_path.setter
def output_path(self, output_path: Path):
self._output_path = output_path
@property
def output_variables(self) -> List[str]:
"""Output variables"""
return self._output_variables
@output_variables.setter
def output_variables(self, output_variables: List[str]):
self._output_variables = output_variables
@property
def rules(self) -> List[IRuleData]:
"""Rules of the model"""
return self._rules
@rules.setter
def rules(self, rules: List[IRuleData]):
self._rules = rules
datasets: List[decoimpact.data.api.i_dataset.IDatasetData]
property
writable
Datasets of the model
name: str
property
readonly
Name of the model
output_path: Path
property
writable
Model path to the output file
output_variables: List[str]
property
writable
Output variables
rules: List[decoimpact.data.api.i_rule_data.IRuleData]
property
writable
Rules of the model
version: List[int]
property
readonly
Version of the model
parsers
criteria_table_validaton_helper
Module for validation logic of the (ClassificationRule) criteria table
validate_table_coverage(crit_table, logger)
Check if the criteria for the parameters given in the criteria_table cover the entire range of data values. If not, give the user feedback (warnings) concerning gaps and overlaps.

Parameters:

Name | Type | Description | Default |
---|---|---|---|
crit_table | Dict[str, Any] | User input describing criteria per parameter | required |
Source code in parsers/criteria_table_validaton_helper.py
def validate_table_coverage(crit_table: Dict[str, Any], logger: ILogger):
"""Check if the criteria for the parameters given in the criteria_table
cover the entire range of data values. If not give the user feedback (warnings)
concerning gaps and overlaps.
Args:
crit_table (Dict[str, Any]): User input describing criteria per parameter
"""
criteria_table = crit_table.copy()
del criteria_table["output"]
new_crit_table = criteria_table.copy()
unique = True
# If only 1 parameter is given in the criteria_table check the first parameter
# on all values and not only the unique values.
if len(new_crit_table.items()) == 1:
unique = False
# Make a loop over all variables from right to left to check combinations
msgs = []
for key in reversed(criteria_table.keys()):
msgs = msgs + list(
_divide_table_in_unique_chunks(new_crit_table, logger, {}, unique)
)
del new_crit_table[key]
max_msg = 6
if len(msgs) < max_msg:
logger.log_warning("\n".join(msgs))
else:
# Only show the first 6 lines. Print all msgs to a txt file.
logger.log_warning("\n".join(msgs[:max_msg]))
logger.log_warning(
f"{len(msgs)} warnings found concerning coverage of the "
f"parameters. Only first {max_msg} warnings are shown. See "
"multiple_classification_rule_warnings.log file for all warnings."
)
with open(
"multiple_classification_rule_warnings.log", "w", encoding="utf-8"
) as file:
file.write("\n".join(msgs))
i_parser_rule_base
Module for IParserRuleBase class !!! classes IParserRuleBase
IParserRuleBase (ABC )
Class for the parser of the basic rules
Source code in parsers/i_parser_rule_base.py
class IParserRuleBase(ABC):
"""Class for the parser of the basic rules"""
@property
@abstractmethod
def rule_type_name(self) -> str:
"""Type name for the rule"""
@abstractmethod
def parse_dict(self, dictionary: Dict[str, Any], logger: ILogger) -> IRuleData:
"""Parses the provided dictionary to a rule
Args:
dictionary (Dict[str, Any]): Dictionary holding the values
for making the rule
Returns:
RuleBase: Rule based on the provided data
"""
rule_type_name: str
property
readonly
Type name for the rule
parse_dict(self, dictionary, logger)
Parses the provided dictionary to a rule
Parameters:

Name | Type | Description | Default |
---|---|---|---|
dictionary | Dict[str, Any] | Dictionary holding the values for making the rule | required |

Returns:

Type | Description |
---|---|
RuleBase | Rule based on the provided data |
Source code in parsers/i_parser_rule_base.py
@abstractmethod
def parse_dict(self, dictionary: Dict[str, Any], logger: ILogger) -> IRuleData:
"""Parses the provided dictionary to a rule
Args:
dictionary (Dict[str, Any]): Dictionary holding the values
for making the rule
Returns:
RuleBase: Rule based on the provided data
"""
parser_axis_filter_rule
Module for ParserAxisFilterRule class
!!! classes ParserAxisFilterRule
ParserAxisFilterRule (IParserRuleBase )
Class for creating a AxisFilterRuleData
Source code in parsers/parser_axis_filter_rule.py
class ParserAxisFilterRule(IParserRuleBase):
"""Class for creating a AxisFilterRuleData"""
@property
def rule_type_name(self) -> str:
"""Type name for the rule"""
return "axis_filter_rule"
def parse_dict(self, dictionary: Dict[str, Any], logger: ILogger) -> IRuleData:
"""Parses the provided dictionary to a IRuleData
Args:
dictionary (Dict[str, Any]): Dictionary holding the values
for making the rule
Returns:
RuleBase: Rule based on the provided data
"""
name = get_dict_element("name", dictionary)
description = get_dict_element("description", dictionary)
input_variable_name = get_dict_element("input_variable", dictionary)
axis_name = get_dict_element("axis_name", dictionary)
if not isinstance(axis_name, str):
message = (
"Dimension name should be a string, "
f"received a {type(axis_name)}: {axis_name}"
)
raise ValueError(message)
element_index = get_dict_element("layer_number", dictionary)
if not isinstance(element_index, int):
message = (
"Layer number should be an integer, "
f"received a {type(element_index)}: {element_index}"
)
raise ValueError(message)
output_variable_name = get_dict_element("output_variable", dictionary)
rule_data = AxisFilterRuleData(
name, element_index, axis_name, input_variable_name
)
rule_data.output_variable = output_variable_name
rule_data.description = description
return rule_data
rule_type_name: str
property
readonly
Type name for the rule
parse_dict(self, dictionary, logger)
Parses the provided dictionary to a IRuleData
Parameters:

Name | Type | Description | Default |
---|---|---|---|
dictionary | Dict[str, Any] | Dictionary holding the values for making the rule | required |

Returns:

Type | Description |
---|---|
RuleBase | Rule based on the provided data |
Source code in parsers/parser_axis_filter_rule.py
def parse_dict(self, dictionary: Dict[str, Any], logger: ILogger) -> IRuleData:
"""Parses the provided dictionary to a IRuleData
Args:
dictionary (Dict[str, Any]): Dictionary holding the values
for making the rule
Returns:
RuleBase: Rule based on the provided data
"""
name = get_dict_element("name", dictionary)
description = get_dict_element("description", dictionary)
input_variable_name = get_dict_element("input_variable", dictionary)
axis_name = get_dict_element("axis_name", dictionary)
if not isinstance(axis_name, str):
message = (
"Dimension name should be a string, "
f"received a {type(axis_name)}: {axis_name}"
)
raise ValueError(message)
element_index = get_dict_element("layer_number", dictionary)
if not isinstance(element_index, int):
message = (
"Layer number should be an integer, "
f"received a {type(element_index)}: {element_index}"
)
raise ValueError(message)
output_variable_name = get_dict_element("output_variable", dictionary)
rule_data = AxisFilterRuleData(
name, element_index, axis_name, input_variable_name
)
rule_data.output_variable = output_variable_name
rule_data.description = description
return rule_data
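As an illustration, the dictionary below contains the keys read by parse_dict above. The module path, variable names and axis name are assumptions; the logger argument is not used by this parser (see the source), so None is passed in this sketch.

```python
from decoimpact.data.parsers.parser_axis_filter_rule import ParserAxisFilterRule

rule_dict = {
    "name": "select_surface_layer",
    "description": "Select one element along a chosen axis",
    "input_variable": "salinity",
    "axis_name": "mesh2d_nLayers",  # illustrative axis/dimension name
    "layer_number": 1,
    "output_variable": "salinity_surface",
}

rule_data = ParserAxisFilterRule().parse_dict(rule_dict, None)
print(rule_data.output_variable)
```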
parser_classification_rule
Module for ParserClassificationRule class
!!! classes ParserClassificationRule
ParserClassificationRule (IParserRuleBase )
Class for creating a ClassificationRuleData
Source code in parsers/parser_classification_rule.py
class ParserClassificationRule(IParserRuleBase):
"""Class for creating a ClassificationRuleData"""
@property
def rule_type_name(self) -> str:
"""Type name for the rule"""
return "classification_rule"
def parse_dict(self, dictionary: Dict[str, Any], logger: ILogger) -> IRuleData:
"""Parses the provided dictionary to a IRuleData
Args:
dictionary (Dict[str, Any]): Dictionary holding the values
for making the rule
Returns:
RuleBase: Rule based on the provided data
"""
name: str = get_dict_element("name", dictionary)
input_variable_names: List[str] = get_dict_element(
"input_variables", dictionary
)
criteria_table_list: List[Any] = get_dict_element("criteria_table", dictionary)
criteria_table = convert_table_element(criteria_table_list)
validate_table_with_input(criteria_table, input_variable_names)
validate_table_coverage(criteria_table, logger)
output_variable_name: str = get_dict_element("output_variable", dictionary)
description: str = get_dict_element("description", dictionary)
rule_data = ClassificationRuleData(name, input_variable_names, criteria_table)
rule_data.description = description
rule_data.output_variable = output_variable_name
return rule_data
rule_type_name: str
property
readonly
Type name for the rule
parse_dict(self, dictionary, logger)
Parses the provided dictionary to a IRuleData
Parameters:

Name | Type | Description | Default |
---|---|---|---|
dictionary | Dict[str, Any] | Dictionary holding the values for making the rule | required |

Returns:

Type | Description |
---|---|
RuleBase | Rule based on the provided data |
Source code in parsers/parser_classification_rule.py
def parse_dict(self, dictionary: Dict[str, Any], logger: ILogger) -> IRuleData:
"""Parses the provided dictionary to a IRuleData
Args:
dictionary (Dict[str, Any]): Dictionary holding the values
for making the rule
Returns:
RuleBase: Rule based on the provided data
"""
name: str = get_dict_element("name", dictionary)
input_variable_names: List[str] = get_dict_element(
"input_variables", dictionary
)
criteria_table_list: List[Any] = get_dict_element("criteria_table", dictionary)
criteria_table = convert_table_element(criteria_table_list)
validate_table_with_input(criteria_table, input_variable_names)
validate_table_coverage(criteria_table, logger)
output_variable_name: str = get_dict_element("output_variable", dictionary)
description: str = get_dict_element("description", dictionary)
rule_data = ClassificationRuleData(name, input_variable_names, criteria_table)
rule_data.description = description
rule_data.output_variable = output_variable_name
return rule_data
parser_combine_results_rule
Module for ParserCombineResultsRule class
!!! classes ParserCombineResultsRule
ParserCombineResultsRule (IParserRuleBase )
Class for creating a CombineResultsRuleData
Source code in parsers/parser_combine_results_rule.py
class ParserCombineResultsRule(IParserRuleBase):
"""Class for creating a CombineResultsRuleData"""
@property
def rule_type_name(self) -> str:
"""Type name for the rule"""
return "combine_results_rule"
def parse_dict(self, dictionary: Dict[str, Any], logger: ILogger) -> IRuleData:
"""Parses the provided dictionary to an IRuleData
Args:
dictionary (Dict[str, Any]): Dictionary holding the values
for making the rule
Returns:
RuleBase: Rule based on the provided data
"""
name = get_dict_element("name", dictionary, True)
input_variable_names = get_dict_element("input_variables", dictionary, True)
operation_type: str = get_dict_element("operation", dictionary, True)
self._validate_operation_type(operation_type)
operation_type = operation_type.upper()
output_variable_name = get_dict_element("output_variable", dictionary)
description = get_dict_element("description", dictionary, False)
if not description:
description = ""
ignore_nan = get_dict_element("ignore_nan", dictionary, False)
rule_data = CombineResultsRuleData(name, input_variable_names,
operation_type, ignore_nan)
rule_data.output_variable = output_variable_name
rule_data.description = description
return rule_data
def _validate_operation_type(self, operation_type: Any):
"""
Validates if the operation type is well formed (a string)
and if it has been implemented."""
if not isinstance(operation_type, str):
message = f"""Operation must be a string, \
received: {operation_type}"""
raise ValueError(message)
if operation_type.upper() not in dir(MultiArrayOperationType):
possible_operations = [
"\n" + operation_name
for operation_name in dir(MultiArrayOperationType)
if not operation_name.startswith("_")
]
message = f"""Operation must be one of: {possible_operations}"""
raise ValueError(message)
rule_type_name: str
property
readonly
Type name for the rule
parse_dict(self, dictionary, logger)
Parses the provided dictionary to an IRuleData
Parameters:

Name | Type | Description | Default |
---|---|---|---|
dictionary | Dict[str, Any] | Dictionary holding the values for making the rule | required |

Returns:

Type | Description |
---|---|
RuleBase | Rule based on the provided data |
Source code in parsers/parser_combine_results_rule.py
def parse_dict(self, dictionary: Dict[str, Any], logger: ILogger) -> IRuleData:
"""Parses the provided dictionary to an IRuleData
Args:
dictionary (Dict[str, Any]): Dictionary holding the values
for making the rule
Returns:
RuleBase: Rule based on the provided data
"""
name = get_dict_element("name", dictionary, True)
input_variable_names = get_dict_element("input_variables", dictionary, True)
operation_type: str = get_dict_element("operation", dictionary, True)
self._validate_operation_type(operation_type)
operation_type = operation_type.upper()
output_variable_name = get_dict_element("output_variable", dictionary)
description = get_dict_element("description", dictionary, False)
if not description:
description = ""
ignore_nan = get_dict_element("ignore_nan", dictionary, False)
rule_data = CombineResultsRuleData(name, input_variable_names,
operation_type, ignore_nan)
rule_data.output_variable = output_variable_name
rule_data.description = description
return rule_data
parser_depth_average_rule
Module for ParserDepthAverageRule class
!!! classes ParserDepthAverageRule
ParserDepthAverageRule (IParserRuleBase )
Class for creating a DepthAverageRuleData
Source code in parsers/parser_depth_average_rule.py
class ParserDepthAverageRule(IParserRuleBase):
"""Class for creating a ParserDepthAverageRule"""
@property
def rule_type_name(self) -> str:
"""Type name for the rule"""
return "depth_average_rule"
def parse_dict(self, dictionary: Dict[str, Any], logger: ILogger) -> IRuleData:
"""Parses the provided dictionary to a IRuleData
Args:
dictionary (Dict[str, Any]): Dictionary holding the values
for making the rule
Returns:
RuleBase: Rule based on the provided data
"""
name: str = get_dict_element("name", dictionary)
bed_level_variable = get_dict_element("bed_level_variable", dictionary)
water_level_variable = get_dict_element("water_level_variable", dictionary)
interfaces_variable = get_dict_element("interfaces_variable", dictionary)
input_variable_names: List[str] = [
get_dict_element("input_variable", dictionary),
bed_level_variable,
water_level_variable,
interfaces_variable,
]
output_variable_name: str = get_dict_element("output_variable", dictionary)
description: str = get_dict_element("description", dictionary, False) or ""
rule_data = DepthAverageRuleData(
name,
input_variable_names,
)
rule_data.output_variable = output_variable_name
rule_data.description = description
return rule_data
rule_type_name: str
property
readonly
Type name for the rule
parse_dict(self, dictionary, logger)
Parses the provided dictionary to a IRuleData
Parameters:

Name | Type | Description | Default |
---|---|---|---|
dictionary | Dict[str, Any] | Dictionary holding the values for making the rule | required |

Returns:

Type | Description |
---|---|
RuleBase | Rule based on the provided data |
Source code in parsers/parser_depth_average_rule.py
def parse_dict(self, dictionary: Dict[str, Any], logger: ILogger) -> IRuleData:
"""Parses the provided dictionary to a IRuleData
Args:
dictionary (Dict[str, Any]): Dictionary holding the values
for making the rule
Returns:
RuleBase: Rule based on the provided data
"""
name: str = get_dict_element("name", dictionary)
bed_level_variable = get_dict_element("bed_level_variable", dictionary)
water_level_variable = get_dict_element("water_level_variable", dictionary)
interfaces_variable = get_dict_element("interfaces_variable", dictionary)
input_variable_names: List[str] = [
get_dict_element("input_variable", dictionary),
bed_level_variable,
water_level_variable,
interfaces_variable,
]
output_variable_name: str = get_dict_element("output_variable", dictionary)
description: str = get_dict_element("description", dictionary, False) or ""
rule_data = DepthAverageRuleData(
name,
input_variable_names,
)
rule_data.output_variable = output_variable_name
rule_data.description = description
return rule_data
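As an illustration, the dictionary below contains the keys read by parse_dict above. The module path and the variable names are assumptions; the logger argument is not used here, so None is passed in this sketch.

```python
from decoimpact.data.parsers.parser_depth_average_rule import ParserDepthAverageRule

rule_dict = {
    "name": "depth_average_salinity",
    "input_variable": "salinity",
    "bed_level_variable": "mesh2d_flowelem_bl",  # illustrative variable names
    "water_level_variable": "mesh2d_s1",
    "interfaces_variable": "mesh2d_interface_z",
    "output_variable": "salinity_depth_averaged",
}

rule_data = ParserDepthAverageRule().parse_dict(rule_dict, None)
print(rule_data.input_variables)
```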
parser_filter_extremes_rule
Module for ParserFilterExtremesRule class
!!! classes ParserFilterExtremesRule
ParserFilterExtremesRule (IParserRuleBase )
Class for creating a FilterExtremesRuleData
Source code in parsers/parser_filter_extremes_rule.py
class ParserFilterExtremesRule(IParserRuleBase):
"""Class for creating a ParserFilterExtremesRule"""
@property
def rule_type_name(self) -> str:
"""Type name for the rule"""
return "filter_extremes_rule"
def parse_dict(self, dictionary: Dict[str, Any], logger: ILogger) -> IRuleData:
"""Parses the provided dictionary to a IRuleData
Args:
dictionary (Dict[str, Any]): Dictionary holding the values
for making the rule
Returns:
RuleBase: Rule based on the provided data
"""
name: str = get_dict_element("name", dictionary)
input_variable_names: List[str] = [
get_dict_element("input_variable", dictionary)
]
output_variable_name: str = get_dict_element("output_variable", dictionary)
description: str = get_dict_element("description", dictionary, False) or ""
extreme_type_name = "extreme_type"
extreme_type: str = get_dict_element(extreme_type_name, dictionary)
self._validate_extreme_type(extreme_type, extreme_type_name)
distance_name = "distance"
distance: int = get_dict_element("distance", dictionary) or 0
validate_type(distance, distance_name, int)
time_scale: str = get_dict_element("time_scale", dictionary) or "D"
mask_name = "mask"
mask: bool = get_dict_element("mask", dictionary) or False
validate_type(mask, mask_name, bool)
rule_data = FilterExtremesRuleData(
name, input_variable_names, extreme_type, distance, time_scale, mask
)
rule_data.output_variable = output_variable_name
rule_data.description = description
return rule_data
def _validate_extreme_type(self, extreme_type: Any, name: str):
"""
Validates if the extreme type is well formed (a string)
and has the correct values
"""
validate_type(extreme_type, name, str)
if extreme_type.upper() not in dir(ExtremeTypeOptions):
message = (
f"""Extreme_type must be one of: [{', '.join(ExtremeTypeOptions)}]"""
)
raise ValueError(message)
rule_type_name: str
property
readonly
Type name for the rule
parse_dict(self, dictionary, logger)
Parses the provided dictionary to a IRuleData
Parameters:

Name | Type | Description | Default |
---|---|---|---|
dictionary | Dict[str, Any] | Dictionary holding the values for making the rule | required |

Returns:

Type | Description |
---|---|
RuleBase | Rule based on the provided data |
Source code in parsers/parser_filter_extremes_rule.py
def parse_dict(self, dictionary: Dict[str, Any], logger: ILogger) -> IRuleData:
"""Parses the provided dictionary to a IRuleData
Args:
dictionary (Dict[str, Any]): Dictionary holding the values
for making the rule
Returns:
RuleBase: Rule based on the provided data
"""
name: str = get_dict_element("name", dictionary)
input_variable_names: List[str] = [
get_dict_element("input_variable", dictionary)
]
output_variable_name: str = get_dict_element("output_variable", dictionary)
description: str = get_dict_element("description", dictionary, False) or ""
extreme_type_name = "extreme_type"
extreme_type: str = get_dict_element(extreme_type_name, dictionary)
self._validate_extreme_type(extreme_type, extreme_type_name)
distance_name = "distance"
distance: int = get_dict_element("distance", dictionary) or 0
validate_type(distance, distance_name, int)
time_scale: str = get_dict_element("time_scale", dictionary) or "D"
mask_name = "mask"
mask: bool = get_dict_element("mask", dictionary) or False
validate_type(mask, mask_name, bool)
rule_data = FilterExtremesRuleData(
name, input_variable_names, extreme_type, distance, time_scale, mask
)
rule_data.output_variable = output_variable_name
rule_data.description = description
return rule_data
parser_formula_rule
Module for ParserFormulaRule class
!!! classes ParserFormulaRule
ParserFormulaRule (IParserRuleBase )
Class for creating a FormulaRuleData
Source code in parsers/parser_formula_rule.py
class ParserFormulaRule(IParserRuleBase):
"""Class for creating a FormulaRuleData"""
@property
def rule_type_name(self) -> str:
"""Type name for the rule"""
return "formula_rule"
def parse_dict(self, dictionary: Dict[str, Any], logger: ILogger) -> IRuleData:
"""Parses the provided dictionary to an IRuleData
Args:
dictionary (Dict[str, Any]): Dictionary holding the values
for making the rule
Returns:
RuleBase: Rule based on the provided data
"""
name = get_dict_element("name", dictionary, True)
input_variable_names = get_dict_element("input_variables", dictionary, True)
formula: str = get_dict_element("formula", dictionary, True)
self._validate_formula(formula)
output_variable_name = get_dict_element("output_variable", dictionary)
description = get_dict_element("description", dictionary, False)
if not description:
description = ""
rule_data = FormulaRuleData(name, input_variable_names, formula)
rule_data.output_variable = output_variable_name
rule_data.description = description
return rule_data
def _validate_formula(self, formula: str):
"""
Validates if the formula is well formed (a string)."""
if not isinstance(formula, str):
message = f"""Formula must be a string, \
received: {formula} (type: {type(formula)})"""
raise ValueError(message)
rule_type_name: str
property
readonly
Type name for the rule
parse_dict(self, dictionary, logger)
Parses the provided dictionary to an IRuleData
Parameters:

Name | Type | Description | Default |
---|---|---|---|
dictionary | Dict[str, Any] | Dictionary holding the values for making the rule | required |

Returns:

Type | Description |
---|---|
RuleBase | Rule based on the provided data |
Source code in parsers/parser_formula_rule.py
def parse_dict(self, dictionary: Dict[str, Any], logger: ILogger) -> IRuleData:
"""Parses the provided dictionary to an IRuleData
Args:
dictionary (Dict[str, Any]): Dictionary holding the values
for making the rule
Returns:
RuleBase: Rule based on the provided data
"""
name = get_dict_element("name", dictionary, True)
input_variable_names = get_dict_element("input_variables", dictionary, True)
formula: str = get_dict_element("formula", dictionary, True)
self._validate_formula(formula)
output_variable_name = get_dict_element("output_variable", dictionary)
description = get_dict_element("description", dictionary, False)
if not description:
description = ""
rule_data = FormulaRuleData(name, input_variable_names, formula)
rule_data.output_variable = output_variable_name
rule_data.description = description
return rule_data
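As an illustration, the dictionary below contains the keys read by parse_dict above (module path assumed; the logger argument is not used by this parser, so None is passed):

```python
from decoimpact.data.parsers.parser_formula_rule import ParserFormulaRule

rule_dict = {
    "name": "water_depth",
    "input_variables": ["water_level", "bed_level"],
    "formula": "water_level - bed_level",
    "output_variable": "water_depth",
}

rule_data = ParserFormulaRule().parse_dict(rule_dict, None)
print(rule_data.formula)
```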
parser_layer_filter_rule
Module for ParserLayerFilterRule class
!!! classes ParserLayerFilterRule
ParserLayerFilterRule (IParserRuleBase )
Class for creating a LayerFilterRuleData
Source code in parsers/parser_layer_filter_rule.py
class ParserLayerFilterRule(IParserRuleBase):
"""Class for creating a LayerFilterRuleData"""
@property
def rule_type_name(self) -> str:
"""Type name for the rule"""
return "layer_filter_rule"
def parse_dict(self, dictionary: Dict[str, Any], logger: ILogger) -> IRuleData:
"""Parses the provided dictionary to a IRuleData
Args:
dictionary (Dict[str, Any]): Dictionary holding the values
for making the rule
Returns:
RuleBase: Rule based on the provided data
"""
name = get_dict_element("name", dictionary)
description = get_dict_element("description", dictionary)
input_variable_name = get_dict_element("input_variable", dictionary)
layer_number = get_dict_element("layer_number", dictionary)
if not isinstance(layer_number, int):
message = (
"Layer number should be an integer, "
f"received a {type(layer_number)}: {layer_number}"
)
raise ValueError(message)
output_variable_name = get_dict_element("output_variable", dictionary)
rule_data = LayerFilterRuleData(name, layer_number, input_variable_name)
rule_data.output_variable = output_variable_name
rule_data.description = description
return rule_data
rule_type_name: str
property
readonly
Type name for the rule
parse_dict(self, dictionary, logger)
Parses the provided dictionary to a IRuleData
Parameters:

Name | Type | Description | Default |
---|---|---|---|
dictionary | Dict[str, Any] | Dictionary holding the values for making the rule | required |

Returns:

Type | Description |
---|---|
RuleBase | Rule based on the provided data |
Source code in parsers/parser_layer_filter_rule.py
def parse_dict(self, dictionary: Dict[str, Any], logger: ILogger) -> IRuleData:
"""Parses the provided dictionary to a IRuleData
Args:
dictionary (Dict[str, Any]): Dictionary holding the values
for making the rule
Returns:
RuleBase: Rule based on the provided data
"""
name = get_dict_element("name", dictionary)
description = get_dict_element("description", dictionary)
input_variable_name = get_dict_element("input_variable", dictionary)
layer_number = get_dict_element("layer_number", dictionary)
if not isinstance(layer_number, int):
message = (
"Layer number should be an integer, "
f"received a {type(layer_number)}: {layer_number}"
)
raise ValueError(message)
output_variable_name = get_dict_element("output_variable", dictionary)
rule_data = LayerFilterRuleData(name, layer_number, input_variable_name)
rule_data.output_variable = output_variable_name
rule_data.description = description
return rule_data
parser_multiply_rule
Module for ParserMultiplyRule class
!!! classes ParserMultiplyRule
ParserMultiplyRule (IParserRuleBase )
Class for creating a MultiplyRuleData
Source code in parsers/parser_multiply_rule.py
class ParserMultiplyRule(IParserRuleBase):
"""Class for creating a MultiplyRuleData"""
@property
def rule_type_name(self) -> str:
"""Type name for the rule"""
return "multiply_rule"
def parse_dict(self, dictionary: Dict[str, Any], logger: ILogger) -> IRuleData:
"""Parses the provided dictionary to a IRuleData
Args:
dictionary (Dict[str, Any]): Dictionary holding the values
for making the rule
Returns:
RuleBase: Rule based on the provided data
"""
name = get_dict_element("name", dictionary)
input_variable_name = get_dict_element("input_variable", dictionary)
output_variable_name = get_dict_element("output_variable", dictionary)
description = get_dict_element("description", dictionary, False) or ""
multipliers = [get_dict_element("multipliers", dictionary, False)]
date_range = []
if not multipliers[0]:
multipliers_table = get_dict_element("multipliers_table", dictionary)
multipliers_dict = convert_table_element(multipliers_table)
multipliers = get_dict_element("multipliers", multipliers_dict)
start_date = get_dict_element("start_date", multipliers_dict)
end_date = get_dict_element("end_date", multipliers_dict)
validate_type_date(start_date, "start_date")
validate_type_date(end_date, "end_date")
validate_start_before_end(start_date, end_date)
date_range = list(zip(start_date, end_date))
validate_all_instances_number(sum(multipliers, []), "Multipliers")
rule_data = MultiplyRuleData(
name,
multipliers,
input_variable_name,
date_range,
)
rule_data.output_variable = output_variable_name
rule_data.description = description
return rule_data
rule_type_name: str
property
readonly
Type name for the rule
parse_dict(self, dictionary, logger)
Parses the provided dictionary to a IRuleData
Parameters:

Name | Type | Description | Default |
---|---|---|---|
dictionary | Dict[str, Any] | Dictionary holding the values for making the rule | required |

Returns:

Type | Description |
---|---|
RuleBase | Rule based on the provided data |
Source code in parsers/parser_multiply_rule.py
def parse_dict(self, dictionary: Dict[str, Any], logger: ILogger) -> IRuleData:
"""Parses the provided dictionary to a IRuleData
Args:
dictionary (Dict[str, Any]): Dictionary holding the values
for making the rule
Returns:
RuleBase: Rule based on the provided data
"""
name = get_dict_element("name", dictionary)
input_variable_name = get_dict_element("input_variable", dictionary)
output_variable_name = get_dict_element("output_variable", dictionary)
description = get_dict_element("description", dictionary, False) or ""
multipliers = [get_dict_element("multipliers", dictionary, False)]
date_range = []
if not multipliers[0]:
multipliers_table = get_dict_element("multipliers_table", dictionary)
multipliers_dict = convert_table_element(multipliers_table)
multipliers = get_dict_element("multipliers", multipliers_dict)
start_date = get_dict_element("start_date", multipliers_dict)
end_date = get_dict_element("end_date", multipliers_dict)
validate_type_date(start_date, "start_date")
validate_type_date(end_date, "end_date")
validate_start_before_end(start_date, end_date)
date_range = list(zip(start_date, end_date))
validate_all_instances_number(sum(multipliers, []), "Multipliers")
rule_data = MultiplyRuleData(
name,
multipliers,
input_variable_name,
date_range,
)
rule_data.output_variable = output_variable_name
rule_data.description = description
return rule_data
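As an illustration, the dictionary below uses the plain multipliers variant read by parse_dict above (module path assumed; the logger argument is not used by this parser, so None is passed):

```python
from decoimpact.data.parsers.parser_multiply_rule import ParserMultiplyRule

rule_dict = {
    "name": "unit_conversion",
    "input_variable": "chloride_mg_per_l",
    "multipliers": [0.001],  # mg/l -> g/l
    "output_variable": "chloride_g_per_l",
}

rule_data = ParserMultiplyRule().parse_dict(rule_dict, None)
print(rule_data.multipliers)  # [[0.001]]
```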
parser_response_curve_rule
Module for ParserResponseCurveRule class !!! classes ParserResponseCurveRule
ParserResponseCurveRule (IParserRuleBase )
Class for creating a ResponseCurveRuleData
Source code in parsers/parser_response_curve_rule.py
class ParserResponseCurveRule(IParserRuleBase):
"""Class for creating a ResponseRuleData"""
@property
def rule_type_name(self) -> str:
"""Type name for the rule"""
return "response_curve_rule"
def parse_dict(self, dictionary: Dict[str, Any], logger: ILogger) -> IRuleData:
"""Parses the provided dictionary to a IRuleData
Args:
dictionary (Dict[str, Any]): Dictionary holding the values
for making the rule
Returns:
RuleBase: Rule based on the provided data
"""
name = get_dict_element("name", dictionary)
description = get_dict_element("description", dictionary)
input_variable_name = get_dict_element("input_variable", dictionary)
# read response_table and convert it to dicts with input and output values
response_table_list = get_dict_element("response_table", dictionary)
response_table = convert_table_element(response_table_list)
input_values = response_table["input"]
output_values = response_table["output"]
# check that response table has exactly two columns:
if len(response_table) != 2:
raise ValueError("ERROR: response table should have exactly 2 columns")
# validate input values to be int/float
if not all(isinstance(m, (int, float)) for m in input_values):
message = (
"Input values should be a list of int or floats, "
f"received: {input_values}"
)
position_error = "".join(
[
f"ERROR in position {index} is type {type(m)}. "
for (index, m) in enumerate(input_values)
if not isinstance(m, (int, float))
]
)
raise ValueError(f"{position_error}{message}")
# validate output_values to be int/float
if not all(isinstance(m, (int, float)) for m in output_values):
message = (
"Output values should be a list of int or floats, "
f"received: {output_values}"
)
position_error = "".join(
[
f"ERROR in position {index} is type {type(m)}. "
for (index, m) in enumerate(output_values)
if not isinstance(m, (int, float))
]
)
raise ValueError(f"{position_error}{message}")
output_variable_name = get_dict_element("output_variable", dictionary)
rule_data = ResponseCurveRuleData(
name, input_variable_name, input_values, output_values
)
rule_data.output_variable = output_variable_name
rule_data.description = description
return rule_data
rule_type_name: str
property
readonly
Type name for the rule
parse_dict(self, dictionary, logger)
Parses the provided dictionary to a IRuleData
Parameters:

Name | Type | Description | Default |
---|---|---|---|
dictionary | Dict[str, Any] | Dictionary holding the values for making the rule | required |

Returns:

Type | Description |
---|---|
RuleBase | Rule based on the provided data |
Source code in parsers/parser_response_curve_rule.py
def parse_dict(self, dictionary: Dict[str, Any], logger: ILogger) -> IRuleData:
"""Parses the provided dictionary to a IRuleData
Args:
dictionary (Dict[str, Any]): Dictionary holding the values
for making the rule
Returns:
RuleBase: Rule based on the provided data
"""
name = get_dict_element("name", dictionary)
description = get_dict_element("description", dictionary)
input_variable_name = get_dict_element("input_variable", dictionary)
# read response_table and convert it to dicts with input and output values
response_table_list = get_dict_element("response_table", dictionary)
response_table = convert_table_element(response_table_list)
input_values = response_table["input"]
output_values = response_table["output"]
# check that response table has exactly two columns:
if len(response_table) != 2:
raise ValueError("ERROR: response table should have exactly 2 columns")
# validate input values to be int/float
if not all(isinstance(m, (int, float)) for m in input_values):
message = (
"Input values should be a list of int or floats, "
f"received: {input_values}"
)
position_error = "".join(
[
f"ERROR in position {index} is type {type(m)}. "
for (index, m) in enumerate(input_values)
if not isinstance(m, (int, float))
]
)
raise ValueError(f"{position_error}{message}")
# validate output_values to be int/float
if not all(isinstance(m, (int, float)) for m in output_values):
message = (
"Output values should be a list of int or floats, "
f"received: {output_values}"
)
position_error = "".join(
[
f"ERROR in position {index} is type {type(m)}. "
for (index, m) in enumerate(output_values)
if not isinstance(m, (int, float))
]
)
raise ValueError(f"{position_error}{message}")
output_variable_name = get_dict_element("output_variable", dictionary)
rule_data = ResponseCurveRuleData(
name, input_variable_name, input_values, output_values
)
rule_data.output_variable = output_variable_name
rule_data.description = description
return rule_data
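For illustration, a minimal sketch of the rule dictionary this parse_dict expects, assuming the input file has already been read into a plain Python dict and that convert_table_element consumes a header row followed by data rows; all names and values below are examples, not taken from the source.

```python
# Hypothetical response_curve_rule entry (illustrative names and values)
rule_dict = {
    "name": "suitability from salinity",
    "description": "piecewise linear response of suitability to salinity",
    "input_variable": "salinity",
    "response_table": [
        ["input", "output"],  # header row (assumed convert_table_element format)
        [0.0, 1.0],
        [10.0, 0.5],
        [30.0, 0.0],
    ],
    "output_variable": "suitability_salinity",
}
# rule_data = ParserResponseCurveRule().parse_dict(rule_dict, logger)
```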
parser_rolling_statistics_rule
Module for ParserRollingStatisticsRule class
!!! classes ParserRollingStatisticsRule
ParserRollingStatisticsRule (IParserRuleBase )
Class for creating a RollingStatisticsRuleData
Source code in parsers/parser_rolling_statistics_rule.py
class ParserRollingStatisticsRule(IParserRuleBase):
"""Class for creating a RollingStatisticsRuleData"""
@property
def rule_type_name(self) -> str:
"""Type name for the rule"""
return "rolling_statistics_rule"
def parse_dict(self, dictionary: Dict[str, Any], logger: ILogger) -> IRuleData:
"""Parses the provided dictionary to a IRuleData
Args:
dictionary (Dict[str, Any]): Dictionary holding the values
for making the rule
Returns:
RuleBase: Rule based on the provided data
"""
# get elements
name: str = get_dict_element("name", dictionary)
input_variable_name: str = get_dict_element("input_variable", dictionary)
operation: str = get_dict_element("operation", dictionary)
time_scale: str = get_dict_element("time_scale", dictionary)
period: float = get_dict_element("period", dictionary)
description = get_dict_element("description", dictionary, False)
output_variable_name = get_dict_element("output_variable", dictionary)
if not period:
message = f"Period is not of a predefined type. Should be \
a float or integer value. Received: {period}"
raise ValueError(message)
operation_value, percentile_value = parse_operation_values(operation)
rule_data = RollingStatisticsRuleData(
name, operation_value, input_variable_name, period
)
rule_data.time_scale = time_scale
rule_data.percentile_value = percentile_value
rule_data.output_variable = output_variable_name
rule_data.description = description
return rule_data
rule_type_name: str
property
readonly
Type name for the rule
parse_dict(self, dictionary, logger)
Parses the provided dictionary to a IRuleData
Parameters:

Name | Type | Description | Default |
---|---|---|---|
dictionary | Dict[str, Any] | Dictionary holding the values for making the rule | required |

Returns:

Type | Description |
---|---|
RuleBase | Rule based on the provided data |
Source code in parsers/parser_rolling_statistics_rule.py
def parse_dict(self, dictionary: Dict[str, Any], logger: ILogger) -> IRuleData:
"""Parses the provided dictionary to a IRuleData
Args:
dictionary (Dict[str, Any]): Dictionary holding the values
for making the rule
Returns:
RuleBase: Rule based on the provided data
"""
# get elements
name: str = get_dict_element("name", dictionary)
input_variable_name: str = get_dict_element("input_variable", dictionary)
operation: str = get_dict_element("operation", dictionary)
time_scale: str = get_dict_element("time_scale", dictionary)
period: float = get_dict_element("period", dictionary)
description = get_dict_element("description", dictionary, False)
output_variable_name = get_dict_element("output_variable", dictionary)
if not period:
message = f"Period is not of a predefined type. Should be \
a float or integer value. Received: {period}"
raise ValueError(message)
operation_value, percentile_value = parse_operation_values(operation)
rule_data = RollingStatisticsRuleData(
name, operation_value, input_variable_name, period
)
rule_data.time_scale = time_scale
rule_data.percentile_value = percentile_value
rule_data.output_variable = output_variable_name
rule_data.description = description
return rule_data
parser_step_function_rule
Module for ParserStepFunctionRule class
!!! classes ParserStepFunctionRule
ParserStepFunctionRule (IParserRuleBase )
Class for creating a StepFunction
Source code in parsers/parser_step_function_rule.py
class ParserStepFunctionRule(IParserRuleBase):
"""Class for creating a StepFunction"""
@property
def rule_type_name(self) -> str:
"""Type name for the rule"""
return "step_function_rule"
def parse_dict(self, dictionary: dict[str, Any], logger: ILogger) -> IRuleData:
"""Parses the provided dictionary to a rule
Args:
dictionary (dict[str, Any]): Dictionary holding the values
for making the rule
Returns:
RuleBase: Rule based on the provided data
"""
name: str = get_dict_element("name", dictionary)
input_variable_name: str = get_dict_element("input_variable", dictionary)
limit_response_table_list = get_dict_element("limit_response_table", dictionary)
limit_response_table = convert_table_element(limit_response_table_list)
limits = limit_response_table["limit"]
responses = limit_response_table["response"]
output_variable_name: str = get_dict_element("output_variable", dictionary)
description: str = get_dict_element("description", dictionary, False)
if not all(a < b for a, b in zip(limits, limits[1:])):
logger.log_warning(
"Limits were not ordered. They have been sorted increasingly,"
" and their respective responses accordingly too."
)
unsorted_map = list(zip(limits, responses))
sorted_map = sorted(unsorted_map, key=lambda x: x[0])
limits, responses = map(list, zip(*sorted_map))
rule_data = StepFunctionRuleData(name, limits, responses, input_variable_name)
rule_data.output_variable = output_variable_name
rule_data.description = description
return rule_data
def _are_sorted(self, list_numbers: List[float]):
return all(a < b for a, b in zip(list_numbers, list_numbers[1:]))
rule_type_name: str
property
readonly
Type name for the rule
parse_dict(self, dictionary, logger)
Parses the provided dictionary to a rule
Parameters:

Name | Type | Description | Default |
---|---|---|---|
dictionary | dict[str, Any] | Dictionary holding the values for making the rule | required |

Returns:

Type | Description |
---|---|
RuleBase | Rule based on the provided data |
Source code in parsers/parser_step_function_rule.py
def parse_dict(self, dictionary: dict[str, Any], logger: ILogger) -> IRuleData:
"""Parses the provided dictionary to a rule
Args:
dictionary (dict[str, Any]): Dictionary holding the values
for making the rule
Returns:
RuleBase: Rule based on the provided data
"""
name: str = get_dict_element("name", dictionary)
input_variable_name: str = get_dict_element("input_variable", dictionary)
limit_response_table_list = get_dict_element("limit_response_table", dictionary)
limit_response_table = convert_table_element(limit_response_table_list)
limits = limit_response_table["limit"]
responses = limit_response_table["response"]
output_variable_name: str = get_dict_element("output_variable", dictionary)
description: str = get_dict_element("description", dictionary, False)
if not all(a < b for a, b in zip(limits, limits[1:])):
logger.log_warning(
"Limits were not ordered. They have been sorted increasingly,"
" and their respective responses accordingly too."
)
unsorted_map = list(zip(limits, responses))
sorted_map = sorted(unsorted_map, key=lambda x: x[0])
limits, responses = map(list, zip(*sorted_map))
rule_data = StepFunctionRuleData(name, limits, responses, input_variable_name)
rule_data.output_variable = output_variable_name
rule_data.description = description
return rule_data
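As a sketch, a step_function_rule entry with limits deliberately given out of order; per the source above, parse_dict then logs a warning and sorts the limits (and their responses) increasingly. Names and values are illustrative.

```python
# Hypothetical step_function_rule entry (illustrative values only)
rule_dict = {
    "name": "depth classes",
    "description": "step response on water depth",
    "input_variable": "water_depth",
    "limit_response_table": [
        ["limit", "response"],  # header row (assumed convert_table_element format)
        [5.0, 0.0],             # limits intentionally unsorted:
        [0.0, 1.0],             # parse_dict warns and reorders them to [0.0, 5.0]
    ],
    "output_variable": "depth_class",
}
```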
parser_time_aggregation_rule
Module for ParserTimeAggregationRule class
!!! classes ParserTimeAggregationRule
ParserTimeAggregationRule (IParserRuleBase )
Class for creating a TimeAggregationRuleData
Source code in parsers/parser_time_aggregation_rule.py
class ParserTimeAggregationRule(IParserRuleBase):
"""Class for creating a TimeAggregationRuleData"""
@property
def rule_type_name(self) -> str:
"""Type name for the rule"""
return "time_aggregation_rule"
def parse_dict(self, dictionary: Dict[str, Any], logger: ILogger) -> IRuleData:
"""Parses the provided dictionary to a IRuleData
Args:
dictionary (Dict[str, Any]): Dictionary holding the values
for making the rule
Returns:
RuleBase: Rule based on the provided data
"""
# get elements
name: str = get_dict_element("name", dictionary)
description: str = get_dict_element("description", dictionary, False)
input_variable_name: str = get_dict_element("input_variable", dictionary)
operation: str = get_dict_element("operation", dictionary)
time_scale: str = get_dict_element("time_scale", dictionary)
output_variable_name: str = get_dict_element("output_variable", dictionary)
operation_value, percentile_value = parse_operation_values(operation)
rule_data = TimeAggregationRuleData(name, operation_value, input_variable_name)
rule_data.percentile_value = percentile_value
rule_data.time_scale = time_scale
rule_data.output_variable = output_variable_name
rule_data.description = description
return rule_data
rule_type_name: str
property
readonly
Type name for the rule
parse_dict(self, dictionary, logger)
Parses the provided dictionary to a IRuleData
Parameters:

Name | Type | Description | Default |
---|---|---|---|
dictionary | Dict[str, Any] | Dictionary holding the values for making the rule | required |

Returns:

Type | Description |
---|---|
RuleBase | Rule based on the provided data |
Source code in parsers/parser_time_aggregation_rule.py
def parse_dict(self, dictionary: Dict[str, Any], logger: ILogger) -> IRuleData:
"""Parses the provided dictionary to a IRuleData
Args:
dictionary (Dict[str, Any]): Dictionary holding the values
for making the rule
Returns:
RuleBase: Rule based on the provided data
"""
# get elements
name: str = get_dict_element("name", dictionary)
description: str = get_dict_element("description", dictionary, False)
input_variable_name: str = get_dict_element("input_variable", dictionary)
operation: str = get_dict_element("operation", dictionary)
time_scale: str = get_dict_element("time_scale", dictionary)
output_variable_name: str = get_dict_element("output_variable", dictionary)
operation_value, percentile_value = parse_operation_values(operation)
rule_data = TimeAggregationRuleData(name, operation_value, input_variable_name)
rule_data.percentile_value = percentile_value
rule_data.time_scale = time_scale
rule_data.output_variable = output_variable_name
rule_data.description = description
return rule_data
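A sketch of a time_aggregation_rule entry; the operation string is handed to parse_operation_values (documented further below), and a percentile operation carries its parameter in parentheses. The variable names and the "year" time scale are assumptions.

```python
# Hypothetical time_aggregation_rule entry (illustrative names; "year" is assumed)
rule_dict = {
    "name": "yearly 90th percentile of chloride",
    "description": "",
    "input_variable": "chloride",
    "operation": "PERCENTILE(90)",  # parsed by parse_operation_values
    "time_scale": "year",
    "output_variable": "chloride_p90",
}
```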
rule_parsers
Module for available list of RuleParsers
!!! classes RuleParsers
rule_parsers()
Function to return rule parsers
Source code in parsers/rule_parsers.py
def rule_parsers() -> Iterator[IParserRuleBase]:
"""Function to return rule parsers"""
yield ParserMultiplyRule()
yield ParserCombineResultsRule()
yield ParserLayerFilterRule()
yield ParserTimeAggregationRule()
yield ParserRollingStatisticsRule()
yield ParserStepFunctionRule()
yield ParserResponseCurveRule()
yield ParserFormulaRule()
yield ParserClassificationRule()
yield ParserAxisFilterRule()
yield ParserDepthAverageRule()
yield ParserFilterExtremesRule()
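A small usage sketch: looking up the parser that claims a given rule type name. The helper find_parser is hypothetical and not part of the module.

```python
def find_parser(rule_type: str):
    """Hypothetical helper: return the first parser whose rule_type_name
    matches rule_type, or None when no parser claims that type."""
    return next((p for p in rule_parsers() if p.rule_type_name == rule_type), None)

# find_parser("step_function_rule")  -> ParserStepFunctionRule instance
# find_parser("unknown_rule")        -> None
```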
time_operation_parsing
Module for ParserTimeAggregationRule class
!!! classes ParserTimeAggregationRule
parse_operation_values(operation_str)
parses the operation_str to a TimeOperationType and optional percentile value
Parameters:

Name | Type | Description | Default |
---|---|---|---|
operation_str | str | string containing the time operation type | required |

Exceptions:

Type | Description |
---|---|
ValueError | if the time operation type is an unknown TimeOperationType |
ValueError | the operation parameter (percentile value) is not a number or < 0 or > 100 |

Returns:

Type | Description |
---|---|
Tuple[TimeOperationType, float] | parsed TimeOperationType and percentile value |
Source code in parsers/time_operation_parsing.py
def parse_operation_values(operation_str: str) -> Tuple[TimeOperationType, float]:
"""parses the operation_str to a TimeOperationType and optional
percentile value
Args:
operation_str (str): string containing the time operation type
Raises:
ValueError: if the time operation type is a unknown TimeOperationType
ValueError: the operation parameter (percentile value) is not a
number or < 0 or > 100
Returns:
Tuple[TimeOperationType, float]: parsed TimeOperationType and percentile value
"""
# if operation contains percentile,
# extract percentile value from operation:
if str.startswith(operation_str, "PERCENTILE"):
try:
percentile_value = float(str(operation_str)[11:-1])
except ValueError as exc:
message = (
"Operation percentile is missing valid value like 'percentile(10)'"
)
raise ValueError(message) from exc
# test if percentile_value is within expected limits:
if percentile_value < 0 or percentile_value > 100:
message = "Operation percentile should be a number between 0 and 100."
raise ValueError(message)
return TimeOperationType.PERCENTILE, percentile_value
# validate operation
match_operation = [o for o in TimeOperationType if o.name == operation_str]
operation_value = next(iter(match_operation), None)
if not operation_value:
message = (
f"Operation '{operation_str}' is not of a predefined type. Should be in:"
+ f"{[o.name for o in TimeOperationType]}."
)
raise ValueError(message)
return operation_value, 0
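Concretely, given the behaviour shown above, the function would act as in this sketch (TimeOperationType.MIN is taken from the tests later in this document):

```python
parse_operation_values("PERCENTILE(90)")   # -> (TimeOperationType.PERCENTILE, 90.0)
parse_operation_values("MIN")              # -> (TimeOperationType.MIN, 0)
parse_operation_values("PERCENTILE(abc)")  # ValueError: missing valid value
parse_operation_values("PERCENTILE(120)")  # ValueError: should be between 0 and 100
parse_operation_values("FOO")              # ValueError: not of a predefined type
```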
validation_utils
Module for Validation functions
validate_all_instances_number(data, name)
Check if all instances in a list are of type int or float
Parameters:

Name | Type | Description | Default |
---|---|---|---|
data | List | List to check | required |
name | str | Name to give in the error message | required |

Exceptions:

Type | Description |
---|---|
ValueError | Raise an error to define which value is incorrect |
Source code in parsers/validation_utils.py
def validate_all_instances_number(data: List, name: str):
"""Check if all instances in a list are of type int or float
Args:
data (List): List to check
name (str): Name to give in the error message
Raises:
ValueError: Raise an error to define which value is incorrect
"""
if not all(isinstance(m, (int, float)) for m in data):
message = f"{name} should be a list of int or floats, received: {data}"
position_error = "".join(
[
f"ERROR in position {index} is type {type(m)}. "
for (index, m) in enumerate(data)
if not isinstance(m, (int, float))
]
)
raise ValueError(f"{position_error}{message}")
validate_start_before_end(start_list, end_list)
Validate if for each row in the table the start date is before the end date.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
start_list | List[str] | list of dates | required |
end_list | List[str] | list of dates | required |
Source code in parsers/validation_utils.py
def validate_start_before_end(start_list: List[str], end_list: List[str]):
"""Validate if for each row in the table the start date is before the end date.
Args:
start_list (List[str]): list of dates
end_list (List[str]): list of dates
"""
for index, (start, end) in enumerate(zip(start_list, end_list)):
start_str = datetime.strptime(start, r"%d-%m")
end_str = datetime.strptime(end, r"%d-%m").replace()
if start_str >= end_str:
message = (
f"All start dates should be before the end dates. "
f"ERROR in position {index} where start: "
f"{start} and end: {end}."
)
raise ValueError(message)
validate_table_with_input(table, input_variable_names)
Check if the headers of the input table and the input variable names match
Parameters:

Name | Type | Description | Default |
---|---|---|---|
table | _type_ | Table to check the headers from | required |
input_variable_names | _type_ | Variable input names | required |

Exceptions:

Type | Description |
---|---|
ValueError | If there is a mismatch notify the user. |
Source code in parsers/validation_utils.py
def validate_table_with_input(table, input_variable_names):
"""Check if the headers of the input table and the input variable names match
Args:
table (_type_): Table to check the headers from
input_variable_names (_type_): Variable input names
Raises:
ValueError: If there is a mismatch notify the user.
"""
headers = list(table.keys())
difference = list(set(headers) - set(input_variable_names))
if len(difference) != 1:
raise ValueError(
f"The headers of the table {headers} and the input "
f"variables {input_variable_names} should match. "
f"Mismatch: {difference}"
)
if difference[0] != "output":
raise ValueError("Define an output column with the header 'output'.")
validate_type(variable, name, expected_type)
Validation function to check if the variable is of the expected type; otherwise raise a ValueError.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
variable | Any | the variable to check | required |
name | str | the name of the variable | required |
expected_type | Any | the type the variable should have | required |

Exceptions:

Type | Description |
---|---|
ValueError | If the type is not what it should be, raise an error |
Source code in parsers/validation_utils.py
def validate_type(variable: Any, name: str, expected_type: Any):
"""Validation function to check if the variable is of the
expected type. Otherwise give a ValueError
Args:
variable (Any): the variable to check
name (str): the name of the variable
type (str): the type the variable should have
Raises:
ValueError: If type is not what is should be, raise error
"""
if not isinstance(variable, expected_type):
raise ValueError(
f"The inputfield {name} must be of type {expected_type.__name__}, "
f"but is of type {type(variable).__name__}"
)
validate_type_date(data, name)
Check if all dates in the list are date strings of format DD-MM
Parameters:

Name | Type | Description | Default |
---|---|---|---|
data | List[str] | List of date strings | required |
name | str | Name of data to address in error message | required |

Exceptions:

Type | Description |
---|---|
ValueError | Raise this error to indicate which value is not a date in the proper format |
Source code in parsers/validation_utils.py
def validate_type_date(data: List[str], name: str):
"""
Check if all dates in list are a datestring of format: DD-MM
Args:
data (str): List of date strings
name (str): Name of data to address in error message
Raises:
ValueError: Raise this error to indicate which value is not
a date in the proper format.
"""
for index, date_string in enumerate(data):
try:
datetime.strptime(date_string, r"%d-%m")
except TypeError as exc:
message = (
f"{name} should be a list of strings, "
f"received: {data}. ERROR in position {index} is "
f"type {type(date_string)}."
)
raise TypeError(message) from exc
except ValueError as exc:
message = (
f"{name} should be a list of date strings with Format DD-MM, "
f"received: {data}. ERROR in position {index}, string: {date_string}."
)
raise ValueError(message) from exc
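A usage sketch with illustrative date strings:

```python
validate_type_date(["01-03", "15-06"], "start_date")
# passes: every entry parses with format DD-MM

validate_type_date(["01-03", "2021-06-15"], "start_date")
# raises ValueError: "start_date should be a list of date strings with Format DD-MM,
# received: ['01-03', '2021-06-15']. ERROR in position 1, string: 2021-06-15."
```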
examples
python_test_of_functions
Example for building a model in code
ScreenLogger (ILogger )
Logger implementation based on default logging library
Source code in examples/python_test_of_functions.py
class ScreenLogger(ILogger):
"""Logger implementation based on default logging library"""
def log_error(self, message: str) -> None:
print("error:" + message)
def log_warning(self, message: str) -> None:
print("warning:" + message)
def log_info(self, message: str) -> None:
print("info:" + message)
def log_debug(self, message: str) -> None:
pass
log_debug(self, message)
Logs a debug message
Parameters:

Name | Type | Description | Default |
---|---|---|---|
message | str | message to log | required |
Source code in examples/python_test_of_functions.py
def log_debug(self, message: str) -> None:
pass
log_error(self, message)
Logs an error message
Parameters:

Name | Type | Description | Default |
---|---|---|---|
message | str | message to log | required |
Source code in examples/python_test_of_functions.py
def log_error(self, message: str) -> None:
print("error:" + message)
log_info(self, message)
Logs an info message
Parameters:

Name | Type | Description | Default |
---|---|---|---|
message | str | message to log | required |
Source code in examples/python_test_of_functions.py
def log_info(self, message: str) -> None:
print("info:" + message)
log_warning(self, message)
Logs a warning message
Parameters:

Name | Type | Description | Default |
---|---|---|---|
message | str | message to log | required |
Source code in examples/python_test_of_functions.py
def log_warning(self, message: str) -> None:
print("warning:" + message)
main
Main script for running model using command-line
main(path)
Main function to run the application when running via command-line
Parameters:

Name | Type | Description | Default |
---|---|---|---|
path | Path | path to the input file | required |
Source code in D-EcoImpact/main.py
def main(path: Path):
"""Main function to run the application when running via command-line
Args:
input_path (Path): path to the input file
"""
# configure logger and data-access layer
logger: ILogger = LoggerFactory.create_logger()
da_layer: IDataAccessLayer = DataAccessLayer(logger)
model_builder = ModelBuilder(da_layer, logger)
# create and run application
application = Application(logger, da_layer, model_builder)
application.run(path)
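A minimal sketch of calling this entry point from Python instead of the command line; the import path and the input file name are assumptions.

```python
from pathlib import Path

from main import main  # assumed: main.py at the root of the D-EcoImpact checkout

main(Path("input_file.yaml"))  # hypothetical input file
```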
pyinstaller
Main script for creating executable based on Python source files
install()
Function to create self-contained executable out of python files. Function can be called from command line using a poetry function. Contains settings for pyinstaller.
Source code in D-EcoImpact/pyinstaller.py
def install():
"""Function to create self-contained executable out of python files.
Function can be called from command line using a poetry function.
Contains settings for pyinstaller."""
# MDK: this warning is disabled on purpose. Using PyInstaller.__main__
# comes directly from the documentation of PyInstaller.
# pylint: disable=maybe-no-member
PyInstaller.__main__.run(
[
PATH_TO_MAIN,
"--name=decoimpact",
"--console",
# other pyinstaller options...
]
)
scripts
tests
testing_utils
Helper module for test utilities
find_log_message_by_level(captured_log, level)
Finds the correct record from the captured_log using the provided level. Only one message is expected to be found.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
captured_log | LogCaptureFixture | captured log messages (just add "caplog: LogCaptureFixture" to your test method) | required |
level | str | level of the log message (like "INFO" or "ERROR") | required |

Returns:

Type | Description |
---|---|
LogRecord | found record for the provided log level |
Source code in tests/testing_utils.py
def find_log_message_by_level(captured_log: LogCaptureFixture, level: str) -> LogRecord:
"""Finds the correct record from the captured_log using the provided level
Only one message is expected to be found
Args:
captured_log (LogCaptureFixture): captured log messages
(just add "caplog: LogCaptureFixture"
to your test method)
level (str): level of the log message (like "INFO" or "ERROR")
Returns:
LogRecord: found record for the provided log level
"""
records = list(filter(lambda r: r.levelname == level, captured_log.records))
# expect only one message for the provided level
assert len(records) == 1
return records[0]
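A sketch of how this helper is used inside a pytest test; the code under test and the asserted message are hypothetical.

```python
from _pytest.logging import LogCaptureFixture

def test_rule_logs_one_error(caplog: LogCaptureFixture):
    # ... run the code under test that is expected to log exactly one ERROR ...
    record = find_log_message_by_level(caplog, "ERROR")
    assert "expected message fragment" in record.message
```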
get_test_data_path()
Creates default test data folder path based on current test path
Returns:

Type | Description |
---|---|
str | path to the default test data folder |
Source code in tests/testing_utils.py
def get_test_data_path() -> str:
"""Creates default test data folder path based on current test path
Returns:
str: path to the default test data folder
"""
test_info: str = getenv("PYTEST_CURRENT_TEST", "")
return test_info.split(".py::")[0] + "_data"
tests
business
entities
rules
test_axis_filter_rule
Tests for AxisFilterRule class
test_create_axis_filter_rule_should_set_defaults()
Test creating a AxisFilterRule with defaults
Source code in tests/business/entities/rules/test_axis_filter_rule.py
def test_create_axis_filter_rule_should_set_defaults():
"""Test creating a AxisFilterRule with defaults"""
# Arrange & Act
rule = AxisFilterRule("test", ["foo"], 3, "boo")
# Assert
assert rule.name == "test"
assert rule.description == ""
assert rule.input_variable_names == ["foo"]
assert rule.output_variable_name == "output"
assert rule.element_index == 3
assert rule.axis_name == "boo"
assert isinstance(rule, AxisFilterRule)
test_execute_value_array_axis_filtered()
Test execute of layer filter rule
Source code in tests/business/entities/rules/test_axis_filter_rule.py
def test_execute_value_array_axis_filtered():
"""Test execute of layer filter rule"""
# Arrange & Act
logger = Mock(ILogger)
rule = AxisFilterRule("test", ["foo"], 1, "dim_1")
data = [[1, 2], [3, 4]]
value_array = _xr.DataArray(data, dims=("dim_1", "dim_2"))
filtered_array = rule.execute(value_array, logger)
result_data = [1, 2]
result_array = _xr.DataArray(result_data, dims=("dim_2"))
# Assert
assert _xr.testing.assert_equal(filtered_array, result_array) is None
test_classification_rule
Tests for Classification class
test_create_classification_rule_should_set_defaults()
Test creating a classification rule with defaults
Source code in tests/business/entities/rules/test_classification_rule.py
def test_create_classification_rule_should_set_defaults():
"""Test creating a classification rule with defaults"""
# test data
criteria_test_table = {
"output": [1, 2, 3, 4],
"water_depth": [0.1, 3.33, 5, 5],
"temperature": ["-", "0.1: 15", 15, ">15"],
}
# Arrange and act
rule = ClassificationRule("test", ["water_depth", "salinity"], criteria_test_table)
# assert
assert rule.name == "test"
assert rule.input_variable_names == ["water_depth", "salinity"]
assert rule.criteria_table == criteria_test_table
assert rule.output_variable_name == "output"
assert rule.description == ""
assert isinstance(rule, ClassificationRule)
test_execute_classification()
Test executing a classification of values
Source code in tests/business/entities/rules/test_classification_rule.py
def test_execute_classification():
"""Test executing a classification of values"""
# test data
criteria_test_table = {
"output": [100, 200, 300, 400, 500, 900, 111, 222, 333],
"water_depth": [11, 12, 13, 13, 15, 0, "-", "0", "0"],
"salinity": ["-", "0.5: 5.5", 8.8, 8.8, 9, 0, ">10", "0", "0"],
"temperature": ["-", "-", "-", "-", ">25.0", 0, "<0", "0", "0"],
"another_val": ["-", "-", "-", "-", "-", "<0", ">0", ">=33", "<=24"],
}
# arrange
logger = Mock(ILogger)
rule = ClassificationRule("test", ["water_depth", "salinity"], criteria_test_table)
test_data = {
"water_depth": _xr.DataArray([13, 0, 11, 15, 12, 20, 0, 0]),
"salinity": _xr.DataArray([8.8, 0, 2, 9, 2.5, 11, 0, 0]),
"temperature": _xr.DataArray([20, -5, 20, 28, 1, -5, 0, 0]),
"another_val": _xr.DataArray([1, 2, 3, 4, 5, 9, 22, 33]),
}
# expected results:
# 1: take first when multiple apply --> 300
# 2: no possible classification --> None
# 3: allow '-' --> 100
# 4: greater than '>' --> 500
# 5: range --> 200
# 6: smaller than '<' --> 111
# 7: greater than/equal to '>=' --> 222
# 8: smaller than/equal to '<=' --> 333
expected_result = _xr.DataArray([300, None, 100, 500, 200, 111, 333, 222])
# act
test_result = rule.execute(test_data, logger)
# assert
assert _xr.testing.assert_equal(test_result, expected_result) is None
test_combine_results_rule
Tests for RuleBase class
test_all_operations_combine_results_rule(operation, expected_result)
Test the outcome of each operand for the combine results rule
Source code in tests/business/entities/rules/test_combine_results_rule.py
@pytest.mark.parametrize(
"operation, expected_result",
[
(MultiArrayOperationType.MIN, [4, 5, 3]),
(MultiArrayOperationType.MAX, [20, 12, 24]),
(MultiArrayOperationType.MULTIPLY, [1200, 420, 432]),
(MultiArrayOperationType.AVERAGE, [13, 8, 11]),
(MultiArrayOperationType.MEDIAN, [15, 7, 6]),
(MultiArrayOperationType.ADD, [39, 24, 33]),
(MultiArrayOperationType.SUBTRACT, [1, -10, -27]),
],
)
def test_all_operations_combine_results_rule(
operation: MultiArrayOperationType, expected_result: List[float]
):
"""Test the outcome of each operand for the combine results rule"""
# Arrange
logger = Mock(ILogger)
dict_vars = {
"var1_name": _xr.DataArray([20, 7, 3]),
"var2_name": _xr.DataArray([4, 5, 6]),
"var3_name": _xr.DataArray([15, 12, 24]),
}
# Act
rule = CombineResultsRule(
"test_name",
["var1_name", "var2_name", "var3_name"],
operation,
)
obtained_result = rule.execute(dict_vars, logger)
# Assert
_xr.testing.assert_equal(obtained_result, _xr.DataArray(expected_result))
test_all_operations_ignore_nan(operation, expected_result)
Test the outcome of each operand for the combine results rule
Source code in tests/business/entities/rules/test_combine_results_rule.py
@pytest.mark.parametrize(
"operation, expected_result",
[
(MultiArrayOperationType.MIN, [4, 5, 3]),
(MultiArrayOperationType.MAX, [20, 12, 24]),
(MultiArrayOperationType.MULTIPLY, [_np.nan, 420, 432]),
(MultiArrayOperationType.AVERAGE, [12, 8, 11]),
(MultiArrayOperationType.MEDIAN, [12, 7, 6]),
(MultiArrayOperationType.ADD, [24, 24, 33]),
(MultiArrayOperationType.SUBTRACT, [16, -10, -27]),
],
)
def test_all_operations_ignore_nan(
operation: MultiArrayOperationType, expected_result: List[float]
):
"""Test the outcome of each operand for the combine results rule"""
# Arrange
logger = Mock(ILogger)
dict_vars = {
"var1_name": _xr.DataArray([20, 7, 3]),
"var2_name": _xr.DataArray([4, 5, 6]),
"var3_name": _xr.DataArray([_np.nan, 12, 24]),
}
# Act
rule = CombineResultsRule(
"test_name", ["var1_name", "var2_name", "var3_name"], operation, ignore_nan=True
)
obtained_result = rule.execute(dict_vars, logger)
# Assert
_xr.testing.assert_equal(obtained_result, _xr.DataArray(expected_result))
test_all_operations_incl_nan(operation, expected_result)
Test the outcome of each operand for the combine results rule
Source code in tests/business/entities/rules/test_combine_results_rule.py
@pytest.mark.parametrize(
"operation, expected_result",
[
(MultiArrayOperationType.MIN, [_np.nan, 5, 3]),
(MultiArrayOperationType.MAX, [_np.nan, 12, 24]),
(MultiArrayOperationType.MULTIPLY, [_np.nan, 420, 432]),
(MultiArrayOperationType.AVERAGE, [_np.nan, 8, 11]),
(MultiArrayOperationType.MEDIAN, [_np.nan, 7, 6]),
(MultiArrayOperationType.ADD, [_np.nan, 24, 33]),
(MultiArrayOperationType.SUBTRACT, [_np.nan, -10, -27]),
],
)
def test_all_operations_incl_nan(
operation: MultiArrayOperationType, expected_result: List[float]
):
"""Test the outcome of each operand for the combine results rule"""
# Arrange
logger = Mock(ILogger)
dict_vars = {
"var1_name": _xr.DataArray([20, 7, 3]),
"var2_name": _xr.DataArray([4, 5, 6]),
"var3_name": _xr.DataArray([_np.nan, 12, 24]),
}
# Act
rule = CombineResultsRule(
"test_name",
["var1_name", "var2_name", "var3_name"],
operation,
)
obtained_result = rule.execute(dict_vars, logger)
# Assert
_xr.testing.assert_equal(obtained_result, _xr.DataArray(expected_result))
test_create_combine_results_rule_with_all_fields()
Test creating a combine results rule with all fields
Source code in tests/business/entities/rules/test_combine_results_rule.py
def test_create_combine_results_rule_with_all_fields():
"""Test creating a combine results rule with all fields"""
# Arrange & Act
rule = CombineResultsRule(
"test_rule_name", ["foo", "hello"], MultiArrayOperationType.MULTIPLY
)
rule.description = "test description"
# Assert
assert isinstance(rule, CombineResultsRule)
assert rule.name == "test_rule_name"
assert rule.description == "test description"
assert rule.input_variable_names == ["foo", "hello"]
assert rule.operation_type == MultiArrayOperationType.MULTIPLY
assert rule.output_variable_name == "output"
test_create_combine_results_rule_with_defaults()
Test creating a combine results rule with defaults
Source code in tests/business/entities/rules/test_combine_results_rule.py
def test_create_combine_results_rule_with_defaults():
"""Test creating a combine results rule with defaults"""
# Arrange & Act
rule = CombineResultsRule(
"test_rule_name", ["foo", "hello"], MultiArrayOperationType.MULTIPLY
)
# Assert
assert isinstance(rule, CombineResultsRule)
assert rule.name == "test_rule_name"
assert rule.description == ""
assert rule.input_variable_names == ["foo", "hello"]
assert rule.operation_type == MultiArrayOperationType.MULTIPLY
assert rule.output_variable_name == "output"
test_dims_present_in_result()
Test that the dims metadata of the result is equal to the one of the first xarray used.
Source code in tests/business/entities/rules/test_combine_results_rule.py
def test_dims_present_in_result():
"""Test that the dims metadata of the result is equal to the one of the first xarray used."""
# Arrange
logger = Mock(ILogger)
raw_data_1 = _np.ones((10, 20))
raw_data_2 = 2 * _np.ones((10, 20))
raw_data = [raw_data_1, raw_data_2]
xarray_data = [
_xr.DataArray(data=arr, dims=["test_dimension_1", "test_dimension_2"])
for arr in raw_data
]
dict_data = {"var1_name": xarray_data[0], "var2_name": xarray_data[1]}
# Act
rule = CombineResultsRule(
"test_name", ["var1_name", "var2_name"], MultiArrayOperationType.ADD
)
obtained_result = rule.execute(dict_data, logger)
# Assert
# _xr.testing.assert_equal(obtained_result.dims, xarray_data[0].dims)
assert obtained_result.dims == xarray_data[0].dims
test_execute_error_combine_results_rule_different_lengths()
Test setting input_variable_names of a RuleBase
Source code in tests/business/entities/rules/test_combine_results_rule.py
def test_execute_error_combine_results_rule_different_lengths():
"""Test setting input_variable_names of a RuleBase"""
# Arrange & Act
rule = CombineResultsRule(
"test", ["foo_data", "hello_data"], MultiArrayOperationType.MULTIPLY
)
value_array = {
"foo_data": _xr.DataArray([1, 2, 3]),
"hello_data": _xr.DataArray([4, 3, 2, 1]),
}
# Assert
with pytest.raises(ValueError) as exc_info:
rule.execute(value_array, logger=Mock(ILogger))
exception_raised = exc_info.value
assert exception_raised.args[0] == "The arrays must have the same dimensions."
test_execute_error_combine_results_rule_different_shapes()
Test setting input_variable_names of a RuleBase
Source code in tests/business/entities/rules/test_combine_results_rule.py
def test_execute_error_combine_results_rule_different_shapes():
"""Test setting input_variable_names of a RuleBase"""
# Arrange & Act
rule = CombineResultsRule(
"test", ["foo_data", "hello_data"], MultiArrayOperationType.MULTIPLY, False
)
value_array = {
"foo_data": _xr.DataArray([[1, 2], [3, 4]]),
"hello_data": _xr.DataArray([4, 3, 2, 1]),
}
# Assert
with pytest.raises(ValueError) as exc_info:
rule.execute(value_array, logger=Mock(ILogger))
exception_raised = exc_info.value
assert exception_raised.args[0] == "The arrays must have the same dimensions."
test_no_validate_error_with_correct_rule()
Test a correct combine results rule validates without error
Source code in tests/business/entities/rules/test_combine_results_rule.py
def test_no_validate_error_with_correct_rule():
"""Test a correct combine results rule validates without error"""
# Arrange
logger = Mock(ILogger)
rule = CombineResultsRule(
"test_rule_name", ["foo", "hello"], MultiArrayOperationType.MULTIPLY
)
# Act
valid = rule.validate(logger)
# Assert
assert isinstance(rule, CombineResultsRule)
assert valid
test_depth_average_rule
Tests for RuleBase class
test_create_depth_average_rule_with_defaults()
Test creating a depth average rule with defaults
Source code in tests/business/entities/rules/test_depth_average_rule.py
def test_create_depth_average_rule_with_defaults():
"""Test creating a depth average rule with defaults"""
# Arrange & Act
rule = DepthAverageRule("test_rule_name",
["foo", "hello"],
)
# Assert
assert isinstance(rule, DepthAverageRule)
assert rule.name == "test_rule_name"
assert rule.description == ""
assert rule.input_variable_names == ["foo", "hello"]
assert rule.output_variable_name == "output"
test_depth_average_rule(data_variable, mesh2d_interface_z, mesh2d_flowelem_bl, mesh2d_s1, result_data)
Make sure the calculation of the depth average is correct, including differing water and bed levels.
Source code in tests/business/entities/rules/test_depth_average_rule.py
@pytest.mark.parametrize(
"data_variable, mesh2d_flowelem_bl, mesh2d_s1, mesh2d_interface_z, result_data",
[
[
_np.array([[[20, 40], [91, 92]]]),
_np.array([-2, -2]),
_np.array([[0, 0]]),
_np.array([0, -1, -2]),
_np.array([[30.0, 91.5]]),
],
[
_np.tile(_np.arange(4, 0, -1), (2, 4, 1)),
_np.array([-10, -5, -10, -5]),
_np.array([[0, 0, -1.5, -1.5], [0, -6, 5, -5]]),
_np.array([-10, -6, -3, -1, 0]),
_np.array(
[[3.0, 2.2, 3.29411765, 2.57142857], [3.0, _np.nan, 2.33333, _np.nan]]
),
],
# Added this next test as to match the example in documentation
[
_np.tile(_np.arange(4, 0, -1), (2, 6, 1)),
_np.array([-7.8, -7.3, -5.2, -9.5, -7, -1.6]),
_np.array(
[[-1.4, -1.6, -3, -1.4, -1.6, -1.6], [0, -1.6, -3, 3, -1.6, -1.6]]
),
_np.array([-8.5, -6.5, -5, -2, 0]),
_np.array(
[
[2.546875, 2.473684, 2.090909, 2.851852, 2.388889, _np.nan],
[2.269231, 2.473684, 2.090909, 2.2, 2.388889, _np.nan],
]
),
],
],
)
def test_depth_average_rule(
data_variable: List[float],
mesh2d_interface_z: List[float],
mesh2d_flowelem_bl: List[float],
mesh2d_s1: List[float],
result_data: List[float],
):
"""Make sure the calculation of the depth average is correct. Including
differing water and bed levels."""
logger = Mock(ILogger)
rule = DepthAverageRule(
name="test",
input_variable_names=["foo",
"mesh2d_flowelem_bl",
"mesh2d_s1",
"mesh2d_interface_z"],
)
# Create dataset
ds = _xr.Dataset(
{
"var_3d": (["time", "mesh2d_nFaces", "mesh2d_nLayers"], data_variable),
"mesh2d_flowelem_bl": (["mesh2d_nFaces"], mesh2d_flowelem_bl),
"mesh2d_s1": (["time", "mesh2d_nFaces"], mesh2d_s1),
"mesh2d_interface_z": (["mesh2d_nInterfaces"], mesh2d_interface_z),
}
)
value_arrays = {
"var_3d": ds["var_3d"],
"mesh2d_flowelem_bl": ds["mesh2d_flowelem_bl"],
"mesh2d_s1": ds["mesh2d_s1"],
"mesh2d_interface_z": ds["mesh2d_interface_z"],
}
depth_average = rule.execute(value_arrays, logger)
result_array = _xr.DataArray(
result_data,
dims=["time", "mesh2d_nFaces"],
)
assert _xr.testing.assert_allclose(depth_average, result_array, atol=1e-08) is None
test_dimension_error()
If the number of interfaces > number of layers + 1, give an error; no calculation is possible
Source code in tests/business/entities/rules/test_depth_average_rule.py
def test_dimension_error():
"""If the number of interfaces > number of layers + 1. Give an error, no
calculation is possible"""
logger = Mock(ILogger)
rule = DepthAverageRule(
name="test",
input_variable_names=["foo",
"mesh2d_flowelem_bl",
"mesh2d_s1",
"mesh2d_interface_z"],
)
# Create dataset
ds = _xr.Dataset(
{
"var_3d": (
["time", "mesh2d_nFaces", "mesh2d_nLayers"],
_np.array([[[20, 40], [91, 92]]]),
),
"mesh2d_interface_z": (
["mesh2d_nInterfaces"],
_np.array([0, -1, -2, -3, -4]),
),
"mesh2d_flowelem_bl": (
["mesh2d_nFaces"],
_np.array([-2, -2]),
),
"mesh2d_s1": (["time", "mesh2d_nFaces"], _np.array([[0, 0]])),
}
)
value_arrays = {
"var_3d": ds["var_3d"],
"mesh2d_flowelem_bl": ds["mesh2d_flowelem_bl"],
"mesh2d_s1": ds["mesh2d_s1"],
"mesh2d_interface_z": ds["mesh2d_interface_z"],
}
rule.execute(value_arrays, logger)
logger.log_error.assert_called_with(
"The number of interfaces should be number of layers + 1. Number of "
"interfaces = 5. Number of layers = 2."
)
test_no_validate_error_with_correct_rule()
Test a correct depth average rule validates without error
Source code in tests/business/entities/rules/test_depth_average_rule.py
def test_no_validate_error_with_correct_rule():
"""Test a correct depth average rule validates without error"""
# Arrange
rule = DepthAverageRule(
"test_rule_name",
["foo", "hello"],
)
# Assert
assert isinstance(rule, DepthAverageRule)
test_filter_extremes_rule
Tests for RuleBase class
test_create_filter_extremes_rule_with_defaults()
Test creating a filter extremes rule with defaults
Source code in tests/business/entities/rules/test_filter_extremes_rule.py
def test_create_filter_extremes_rule_with_defaults():
"""Test creating a filter extremes rule with defaults"""
# Arrange & Act
rule = FilterExtremesRule("test_rule_name", ["input_var"], "peaks", 5, "hour", True)
# Assert
assert isinstance(rule, FilterExtremesRule)
assert rule.name == "test_rule_name"
assert rule.description == ""
assert rule.input_variable_names == ["input_var"]
assert rule.extreme_type == "peaks"
assert rule.distance == 5
assert rule.settings.time_scale == "hour"
assert rule.mask
test_filter_extremes_rule(data_variable, result_data, time_data, mask, distance, time_scale, extreme_type)
Make sure the calculation of the filter extremes is correct.
Source code in tests/business/entities/rules/test_filter_extremes_rule.py
@pytest.mark.parametrize(
"data_variable, result_data, time_data, mask, distance, time_scale, extreme_type",
[
# Test 1: check for multiple dimensions!
(
[
[
[1, 0],
[0, 3],
[-1, 0],
[0, 4],
[1, 0],
[2, 5],
[1, 0],
[0, 6],
[-3, 0],
[-4, 7],
[-2, 0],
[-1, 8],
[-3, 0],
[-5, 9],
]
],
[
[
[np.nan, np.nan],
[np.nan, 3],
[np.nan, np.nan],
[np.nan, 4],
[np.nan, np.nan],
[2, 5],
[np.nan, np.nan],
[np.nan, 6],
[np.nan, np.nan],
[np.nan, 7],
[np.nan, np.nan],
[-1, 8],
[np.nan, np.nan],
[np.nan, np.nan],
]
],
[
np.datetime64("2005-02-25T01:30"),
np.datetime64("2005-02-25T02:30"),
np.datetime64("2005-02-25T03:30"),
np.datetime64("2005-02-25T04:30"),
np.datetime64("2005-02-25T05:30"),
np.datetime64("2005-02-25T06:30"),
np.datetime64("2005-02-25T07:30"),
np.datetime64("2005-02-25T08:30"),
np.datetime64("2005-02-25T09:30"),
np.datetime64("2005-02-25T10:30"),
np.datetime64("2005-02-25T11:30"),
np.datetime64("2005-02-25T12:30"),
np.datetime64("2005-02-25T13:30"),
np.datetime64("2005-02-25T14:30"),
],
False,
1,
"hour",
"peaks",
),
# Test 2: multiple times
(
[
[
[0, 0],
[5, 3],
[0, 5],
[6, 4],
[0, 4],
],
[
[0, 0],
[5, 6],
[0, 5],
[0, 7],
[0, 4],
],
],
[
[
[np.nan, np.nan],
[5, np.nan],
[np.nan, 5],
[6, np.nan],
[np.nan, np.nan],
],
[
[np.nan, np.nan],
[5, 6],
[np.nan, np.nan],
[np.nan, 7],
[np.nan, np.nan],
],
],
[
np.datetime64("2005-02-25T01:30"),
np.datetime64("2005-02-25T02:30"),
np.datetime64("2005-02-25T03:30"),
np.datetime64("2005-02-25T04:30"),
np.datetime64("2005-02-25T05:30"),
],
False,
1,
"hour",
"peaks",
),
# Test 3: Mask true
(
[
[
[0, 0],
[5, 3],
[0, 5],
[6, 4],
[0, 4],
]
],
[
[
[np.nan, np.nan],
[1.0, np.nan],
[np.nan, 1.0],
[1.0, np.nan],
[np.nan, np.nan],
],
],
[
np.datetime64("2005-02-25T01:30"),
np.datetime64("2005-02-25T02:30"),
np.datetime64("2005-02-25T03:30"),
np.datetime64("2005-02-25T04:30"),
np.datetime64("2005-02-25T05:30"),
],
True,
1,
"hour",
"peaks",
),
# Test 4: Different time dimension
(
[
[
[1, 0],
[0, 3],
[-1, 0],
[0, 4],
[1, 0],
[2, 5],
[1, 0],
[0, 6],
[-3, 0],
[-4, 7],
[-2, 0],
[-1, 8],
[-3, 0],
[-5, 9],
]
],
[
[
[np.nan, np.nan],
[np.nan, np.nan],
[np.nan, np.nan],
[np.nan, 4],
[np.nan, np.nan],
[2, np.nan],
[np.nan, np.nan],
[np.nan, 6],
[np.nan, np.nan],
[np.nan, np.nan],
[np.nan, np.nan],
[-1, 8],
[np.nan, np.nan],
[np.nan, np.nan],
]
],
[
np.datetime64("2005-02-25T01:30"),
np.datetime64("2005-02-25T02:30"),
np.datetime64("2005-02-25T03:30"),
np.datetime64("2005-02-25T04:30"),
np.datetime64("2005-02-25T05:30"),
np.datetime64("2005-02-25T06:30"),
np.datetime64("2005-02-25T07:30"),
np.datetime64("2005-02-25T08:30"),
np.datetime64("2005-02-25T09:30"),
np.datetime64("2005-02-25T10:30"),
np.datetime64("2005-02-25T11:30"),
np.datetime64("2005-02-25T12:30"),
np.datetime64("2005-02-25T13:30"),
np.datetime64("2005-02-25T14:30"),
],
False,
3,
"hour",
"peaks",
),
# Test 5: Test troughs
(
[
[
[1, 0],
[0, 3],
[-1, 0],
[0, 4],
[1, 0],
[2, 5],
[1, 0],
[0, 6],
[-3, 0],
[-4, 7],
[-2, 0],
[-1, 8],
[-3, 0],
[-5, 9],
]
],
[
[
[np.nan, np.nan],
[np.nan, np.nan],
[-1, 0],
[np.nan, np.nan],
[np.nan, 0],
[np.nan, np.nan],
[np.nan, 0],
[np.nan, np.nan],
[np.nan, 0],
[-4, np.nan],
[np.nan, 0],
[np.nan, np.nan],
[np.nan, 0],
[np.nan, np.nan],
]
],
[
np.datetime64("2005-02-25T01:30"),
np.datetime64("2005-02-25T02:30"),
np.datetime64("2005-02-25T03:30"),
np.datetime64("2005-02-25T04:30"),
np.datetime64("2005-02-25T05:30"),
np.datetime64("2005-02-25T06:30"),
np.datetime64("2005-02-25T07:30"),
np.datetime64("2005-02-25T08:30"),
np.datetime64("2005-02-25T09:30"),
np.datetime64("2005-02-25T10:30"),
np.datetime64("2005-02-25T11:30"),
np.datetime64("2005-02-25T12:30"),
np.datetime64("2005-02-25T13:30"),
np.datetime64("2005-02-25T14:30"),
],
False,
1,
"hour",
"troughs",
),
],
)
def test_filter_extremes_rule(
data_variable: List[float],
result_data: List[float],
time_data: List[float],
mask: bool,
distance: int,
time_scale: str,
extreme_type: str,
):
"""Make sure the calculation of the filter extremes is correct. Including
differing water and bed levels."""
logger = Mock(ILogger)
rule = FilterExtremesRule(
"test", ["test_var"], extreme_type, distance, time_scale, mask
)
assert isinstance(rule, FilterExtremesRule)
# Create dataset
ds = _xr.Dataset(
{"test_var": (["dim1", "time", "dim2"], data_variable)},
coords={
"time": time_data,
},
)
value_array = ds["test_var"]
filter_extremes = rule.execute(value_array, logger)
result_array = _xr.DataArray(
result_data,
dims=["dim1", "time", "dim2"],
coords={
"time": time_data,
},
)
assert (
_xr.testing.assert_allclose(filter_extremes, result_array, atol=1e-08) is None
)
test_validation_when_not_valid()
Test that an incorrect filter extremes rule fails validation: time_scale is not in TimeOperationSettings
Source code in tests/business/entities/rules/test_filter_extremes_rule.py
def test_validation_when_not_valid():
"""
Test an incorrect filter extremes rule validates with error
time_scale is not in TimeOperationSettings
"""
logger = Mock(ILogger)
rule = FilterExtremesRule("test_rule_name", ["input_var"], "peaks", 5, "h", True)
valid = rule.validate(logger)
assert valid is False
test_validation_when_valid()
Test that a correct filter extremes rule validates without error: time_scale is in TimeOperationSettings
Source code in tests/business/entities/rules/test_filter_extremes_rule.py
def test_validation_when_valid():
"""
Test a correct filter extremes rule validates without error
time_scale is in TimeOperationSettings
"""
logger = Mock(ILogger)
rule = FilterExtremesRule("test_rule_name", ["input_var"], "peaks", 5, "hour", True)
valid = rule.validate(logger)
assert valid
test_formula_rule
Tests for RuleBase class
test_create_formula_rule_should_set_defaults()
Test creating a RuleBase with defaults
Source code in tests/business/entities/rules/test_formula_rule.py
def test_create_formula_rule_should_set_defaults():
"""Test creating a RuleBase with defaults"""
# Arrange & Act
rule = FormulaRule("test", ["foo", "bar"], "foo + bar")
rule.output_variable_name = "outputname"
# Assert
assert rule.name == "test"
assert rule.description == ""
assert rule.input_variable_names == ["foo", "bar"]
assert rule.output_variable_name == "outputname"
assert rule.formula == "foo + bar"
assert isinstance(rule, FormulaRule)
test_execute_adding_value_arrays()
Test formula on value_arrays of a RuleBase
Source code in tests/business/entities/rules/test_formula_rule.py
def test_execute_adding_value_arrays():
"""Test formula on value_arrays of a RuleBase"""
# Arrange
logger = Mock(ILogger)
rule = FormulaRule("test", ["foo", "bar"], "foo + bar")
values = {"foo": 1.0, "bar": 4.0}
# Act
result_value = rule.execute(values, logger)
# Assert
assert math.isclose(result_value, 5.0, abs_tol=1e-9)
test_execute_comparing_value_arrays(input_value1, input_value2, expected_output_value)
Test formula on value_arrays of a RuleBase
Source code in tests/business/entities/rules/test_formula_rule.py
@pytest.mark.parametrize(
"input_value1, input_value2, expected_output_value",
[(0.5, 10, 0.0), (11, 1.5, 1.0)],
)
def test_execute_comparing_value_arrays(
input_value1: float, input_value2: float, expected_output_value: float
):
"""Test formula on value_arrays of a RuleBase"""
# Arrange
logger = Mock(ILogger)
rule = FormulaRule("test", ["foo", "bar"], "foo > bar")
values = {
"foo": input_value1,
"bar": input_value2,
}
# Act
result_value = rule.execute(values, logger)
# Assert
assert result_value == expected_output_value
test_execute_math_value_arrays()
Test formula on value_arrays of a RuleBase
Source code in tests/business/entities/rules/test_formula_rule.py
def test_execute_math_value_arrays():
"""Test formula on value_arrays of a RuleBase"""
# Arrange
logger = Mock(ILogger)
rule = FormulaRule("test", ["val1"], "val1 * math.isqrt(9)")
values = {"val1": 2.0}
# Act
result_value = rule.execute(values, logger)
# Assert
assert math.isclose(result_value, 6.0, abs_tol=1e-9)
test_execute_multiplying_value_arrays()
Test formula on value_arrays of a RuleBase
Source code in tests/business/entities/rules/test_formula_rule.py
def test_execute_multiplying_value_arrays():
"""Test formula on value_arrays of a RuleBase"""
# Arrange
logger = Mock(ILogger)
rule = FormulaRule("test", ["foo", "bar"], "foo * bar")
values = {
"foo": 2.0,
"bar": 3.0,
}
# Act
result_value = rule.execute(values, logger)
# Assert
assert math.isclose(result_value, 6.0, abs_tol=1e-9)
test_execute_numpy_value_arrays()
Test formula on value_arrays of a RuleBase
Source code in tests/business/entities/rules/test_formula_rule.py
def test_execute_numpy_value_arrays():
"""Test formula on value_arrays of a RuleBase"""
# Arrange
logger = Mock(ILogger)
rule = FormulaRule("test", ["val1"], "val1 * numpy.size(numpy.array([1, 3]))")
values = {"val1": 1.0}
# Act
result_value = rule.execute(values, logger)
# Assert
assert math.isclose(result_value, 2.0, abs_tol=1e-9)
test_execute_unwanted_python_code()
Test formula on value_arrays of a RuleBase
Source code in tests/business/entities/rules/test_formula_rule.py
def test_execute_unwanted_python_code():
"""Test formula on value_arrays of a RuleBase"""
# Arrange
logger = Mock(ILogger)
rule = FormulaRule("test", ["foo", "bar"], "print('hoi')")
values = {
"foo": 2.0,
"bar": 3.0,
}
# Act
with pytest.raises(NameError) as exc_info:
rule.execute(values, logger)
exception_raised = exc_info.value
# Assert
expected_message = "name '_print_' is not defined"
assert exception_raised.args[0] == expected_message
test_formula_has_incorrect_variable_names()
Test formula on value_arrays of a RuleBase
Source code in tests/business/entities/rules/test_formula_rule.py
def test_formula_has_incorrect_variable_names():
"""Test formula on value_arrays of a RuleBase"""
# Arrange
logger = Mock(ILogger)
rule = FormulaRule("test", ["foo", "bar"], "foo + bas")
values = {
"foo": 2.0,
"bar": 3.0,
}
# Act
with pytest.raises(NameError) as exc_info:
rule.execute(values, logger)
exception_raised = exc_info.value
# Assert
expected_message = "name 'bas' is not defined"
assert exception_raised.args[0] == expected_message
test_validate_of_invalid_python_code(formula, expected_output_value)
Test formula on value_arrays of a RuleBase
Source code in tests/business/entities/rules/test_formula_rule.py
@pytest.mark.parametrize(
"formula, expected_output_value",
[("print('hoi')", False), ("foo + bar", True), ("output=foo + bar", True)],
)
def test_validate_of_invalid_python_code(formula: str, expected_output_value: bool):
"""Test formula on value_arrays of a RuleBase"""
# Arrange
logger = Mock(ILogger)
rule = FormulaRule("test", ["foo", "bar"], formula)
rule.output_variable_name = "outputname"
# Act
result = rule.validate(logger)
# Assert
assert result == expected_output_value
test_layer_filter_rule
Tests for LayerFilterRule class
test_create_layer_filter_rule_should_set_defaults()
Test creating a LayerFilterRule with defaults
Source code in tests/business/entities/rules/test_layer_filter_rule.py
def test_create_layer_filter_rule_should_set_defaults():
"""Test creating a LayerFilterRule with defaults"""
# Arrange & Act
rule = LayerFilterRule("test", ["foo"], 3)
# Assert
assert rule.name == "test"
assert rule.description == ""
assert rule.input_variable_names == ["foo"]
assert rule.output_variable_name == "output"
assert rule.layer_number == 3
assert isinstance(rule, LayerFilterRule)
test_execute_value_array_filtered()
Test execute of layer filter rule
Source code in tests/business/entities/rules/test_layer_filter_rule.py
def test_execute_value_array_filtered():
"""Test execute of layer filter rule"""
# Arrange & Act
logger = Mock(ILogger)
rule = LayerFilterRule("test", ["foo"], 3)
data = [[[1, 2, 3, 4]]]
value_array = _xr.DataArray(data)
filtered_array = rule.execute(value_array, logger)
result_data = [[3]]
result_array = _xr.DataArray(result_data)
# Assert
assert _xr.testing.assert_equal(filtered_array, result_array) is None
test_multiply_rule
Tests for RuleBase class
test_create_multiply_rule_should_set_defaults()
Test creating a multiply rule with defaults
Source code in tests/business/entities/rules/test_multiply_rule.py
def test_create_multiply_rule_should_set_defaults():
"""Test creating a multiply rule with defaults"""
# Arrange & Act
rule = MultiplyRule("test", ["foo"], [[0.5, 3.0]])
# Assert
assert rule.name == "test"
assert rule.description == ""
assert rule.input_variable_names == ["foo"]
assert rule.output_variable_name == "output"
assert rule.multipliers == [[0.5, 3.0]]
assert rule.date_range is None
assert isinstance(rule, MultiplyRule)
test_execute_value_array_multiplied_by_multipliers_no_dates()
Test executing Multiply Rule with single multipliers and no date range.
Source code in tests/business/entities/rules/test_multiply_rule.py
def test_execute_value_array_multiplied_by_multipliers_no_dates():
"""Test executing Multiply Rule with single multipliers and
no date range."""
# Arrange
logger = Mock(ILogger)
rule = MultiplyRule("test", ["foo"], [[0.5, 4.0]])
rule.description = "description"
data = [1, 2, 3, 4]
value_array = _xr.DataArray(data)
# Act
multiplied_array = rule.execute(value_array, logger)
result_data = [2.0, 4.0, 6.0, 8.0]
result_array = _xr.DataArray(result_data)
# Assert
assert _xr.testing.assert_equal(multiplied_array, result_array) is None
test_execute_value_array_multiplied_by_multipliers_with_dates()
Test executing Multiply Rule with multipliers and a date range.
Source code in tests/business/entities/rules/test_multiply_rule.py
def test_execute_value_array_multiplied_by_multipliers_with_dates():
"""Test executing Multiply Rule with multipliers and a date range."""
# Arrange
logger = Mock(ILogger)
rule = MultiplyRule(
"test",
["foo"],
[[1], [100, 10]],
date_range=[["01-01", "10-01"], ["11-01", "20-01"]],
)
values = [0.1, 0.7, 0.2, 0.2, 0.3, 0.1]
time = [
"2020-01-02",
"2020-01-12",
"2021-01-03",
"2021-01-13",
"2022-01-04",
"2022-01-14",
]
time = [_np.datetime64(t) for t in time]
value_array = _xr.DataArray(values, coords=[time], dims=["time"])
# Act
multiplied_array = rule.execute(value_array, logger)
result_data = [0.1, 700, 0.2, 200, 0.3, 100]
result_array = _xr.DataArray(result_data, coords=[time], dims=["time"])
# Assert
assert _xr.testing.assert_equal(multiplied_array, result_array) is None
test_execute_value_array_multiplied_by_multipliers_with_dates_missing_dates()
Test executing Multiply Rule with multipliers and a date range, and check that the values outside the given periods are filled with None
Source code in tests/business/entities/rules/test_multiply_rule.py
def test_execute_value_array_multiplied_by_multipliers_with_dates_missing_dates():
"""Test executing Multiply Rule with multipliers and a date range. And check
that the values that are outside the given periods are filled with None"""
# Arrange
logger = Mock(ILogger)
rule = MultiplyRule(
"test",
["foo"],
[[2], [100, 10]],
date_range=[["02-01", "10-01"], ["11-01", "20-01"]],
)
values = [0.1, 0.7, 0.2, 0.2, 0.3, 0.1]
time = [
"2020-01-02",
"2020-01-12",
"2021-01-03",
"2021-01-13",
"2022-01-04",
"2022-01-14",
]
time = [_np.datetime64(t) for t in time]
value_array = _xr.DataArray(values, coords=[time], dims=["time"])
# Act
multiplied_array = rule.execute(value_array, logger)
result_data = [None, 700, 0.4, 200, 0.6, 100]
result_array = _xr.DataArray(result_data, coords=[time], dims=["time"])
# Assert
assert _xr.testing.assert_equal(multiplied_array, result_array) is None
test_response_curve_rule
Tests for RuleBase class
fixture_example_rule()
Initiation of ResponseCurveRule to be reused in the following tests
Source code in tests/business/entities/rules/test_response_curve_rule.py
@pytest.fixture(name="example_rule")
def fixture_example_rule():
"""Initiation of ResponseCurveRule to be reused in the following tests"""
return ResponseCurveRule(
"test_response_name",
"input_variable_name",
[0, 50, 300, 5000],
[0, 1, 2, 3],
)
test_create_response_rule(example_rule)
Test creating a new (valid) Response rule
Source code in tests/business/entities/rules/test_response_curve_rule.py
def test_create_response_rule(example_rule):
"""
Test creating a new (valid) Response rule
"""
# Arrange
logger = Mock(ILogger)
# Assert
assert example_rule.name == "test_response_name"
assert example_rule.input_variable_names[0] == "input_variable_name"
assert (example_rule.input_values == [0, 50, 300, 5000]).all()
assert (example_rule.output_values == [0, 1, 2, 3]).all()
assert isinstance(example_rule, ResponseCurveRule)
assert example_rule.validate(logger)
test_execute_response_rule_values_between_limits(example_rule, input_value, expected_output_value)
Test the function execution with input values between the interval limits.
Source code in tests/business/entities/rules/test_response_curve_rule.py
@pytest.mark.parametrize(
"input_value, expected_output_value",
[(25, (0.5, [0, 0])), (75, (1.1, [0, 0])), (770, (2.1, [0, 0]))],
)
def test_execute_response_rule_values_between_limits(
example_rule, input_value: int, expected_output_value: float
):
"""
Test the function execution with input values between the interval limits.
"""
# Arrange
logger = Mock(ILogger)
# Assert
assert example_rule.execute(input_value, logger) == expected_output_value
logger.log_warning.assert_not_called()
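The parametrized cases above show that execute interpolates linearly between the given input/output pairs and returns a tuple of the interpolated value and two out-of-range flags. A minimal sketch with the fixture values above (the import path is assumed):
from unittest.mock import Mock
# Assumed import path, mirroring tests/business/entities/rules/:
from decoimpact.business.entities.rules.response_curve_rule import ResponseCurveRule

logger = Mock()  # stand-in for an ILogger

rule = ResponseCurveRule(
    "test_response_name",
    "input_variable_name",
    [0, 50, 300, 5000],   # input values (must be sorted)
    [0, 1, 2, 3],         # output values (same length as the inputs)
)

# Linear interpolation between the pairs: 1 + (75 - 50) / (300 - 50) = 1.1.
# The list holds the out-of-range flags: [1, 0] below the curve, [0, 1] above it.
print(rule.execute(75, logger))   # -> (1.1, [0, 0]) per the parametrized test above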
test_execute_values_combined_dec_inc(example_rule_combined, input_value, expected_output_value)
Test the function execution with input values between the interval limits.
Source code in tests/business/entities/rules/test_response_curve_rule.py
@pytest.mark.parametrize(
"input_value, expected_output_value",
[
(-1, (22, [1, 0])),
(0.5, (18.5, [0, 0])),
(1.5, (12.5, [0, 0])),
(3.5, (11, [0, 0])),
(7.5, (16, [0, 0])),
(10.5, (20, [0, 1])),
],
)
def test_execute_values_combined_dec_inc(
example_rule_combined,
input_value: int,
expected_output_value: int,
):
"""
Test the function execution with input values between the interval limits.
"""
# Arrange
logger = Mock(ILogger)
# Assert
assert example_rule_combined.execute(input_value, logger) == expected_output_value
test_input_values_are_not_sorted(example_rule)
Test the function execution when input values are not sorted
Source code in tests/business/entities/rules/test_response_curve_rule.py
def test_input_values_are_not_sorted(example_rule):
"""
Test the function execution when input values are not sorted
"""
# Arrange
logger = Mock(ILogger)
# Act
example_rule._input_values = _np.array([1, 2, 5, 3])
# Assert
assert not example_rule.validate(logger)
logger.log_error.assert_called_with(
"The input values should be given in a sorted order."
)
test_inputs_and_outputs_have_different_lengths(example_rule)
Test the function execution when input and outputs have different lengths
Source code in tests/business/entities/rules/test_response_curve_rule.py
def test_inputs_and_outputs_have_different_lengths(example_rule):
"""
Test the function execution when input and outputs have different lengths
"""
# Arrange
logger = Mock(ILogger)
# Act
example_rule._input_values = _np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
# Assert
assert not example_rule.validate(logger)
logger.log_error.assert_called_with("The input and output values must be equal.")
test_rolling_statistics_rule
Tests for rolling statistics rule
test_create_rolling_statistics_rule_should_set_defaults()
Test creating a rolling statistics rule with defaults
Source code in tests/business/entities/rules/test_rolling_statistics_rule.py
def test_create_rolling_statistics_rule_should_set_defaults():
"""Test creating a rolling statistics rule with defaults"""
# Arrange & Act
rule = RollingStatisticsRule(
name="test",
input_variable_names=["foo"],
operation_type=TimeOperationType.MIN,
)
# Assert
assert rule.name == "test"
assert rule.description == ""
assert isinstance(rule, RollingStatisticsRule)
test_execute_value_array_rolling_statistics_average()
RollingStatisticsRule (median, daily)
Source code in tests/business/entities/rules/test_rolling_statistics_rule.py
def test_execute_value_array_rolling_statistics_average():
"""RullingStatisticsRule (average, yearly)"""
# create test set
logger = Mock(ILogger)
rule = RollingStatisticsRule(
name="test",
input_variable_names=["foo"],
operation_type=TimeOperationType.MEDIAN,
)
rule.settings.time_scale = "day"
rule.period = 2
rolling_statistic = rule.execute(value_array, logger)
result_data = [np.nan, np.nan, 0.2, 0.2, 0.2, 0.2]
result_array = _xr.DataArray(result_data, coords=[time], dims=["time"])
# Assert
assert _xr.testing.assert_equal(rolling_statistic, result_array) is None
test_execute_value_array_rolling_statistics_max()
RollingStatisticsRule (max, daily)
Source code in tests/business/entities/rules/test_rolling_statistics_rule.py
def test_execute_value_array_rolling_statistics_max():
"""RollingStatisticsRule (max, yearly)"""
# create test set
logger = Mock(ILogger)
rule = RollingStatisticsRule(
name="test",
input_variable_names=["foo"],
operation_type=TimeOperationType.MAX,
)
rule.settings.time_scale = "day"
rule.period = 2
rolling_statistic = rule.execute(value_array, logger)
result_data = [np.nan, np.nan, 0.7, 0.7, 0.3, 0.3]
result_array = _xr.DataArray(result_data, coords=[time], dims=["time"])
# Assert
assert _xr.testing.assert_equal(rolling_statistic, result_array) is None
test_execute_value_array_rolling_statistics_min()
RollingStatisticsRule (min, daily)
Source code in tests/business/entities/rules/test_rolling_statistics_rule.py
def test_execute_value_array_rolling_statistics_min():
"""RullingStatisticsRule (min, yearly)"""
# create test set
logger = Mock(ILogger)
rule = RollingStatisticsRule(
name="test", input_variable_names=["foo"], operation_type=TimeOperationType.MIN
)
rule.settings.time_scale = "day"
rule.period = 2
rolling_statistic = rule.execute(value_array, logger)
result_data = [np.nan, np.nan, 0.1, 0.2, 0.2, 0.1]
result_array = _xr.DataArray(result_data, coords=[time], dims=["time"])
# Assert
assert _xr.testing.assert_equal(rolling_statistic, result_array) is None
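The module-level value_array used by these tests is not shown in this excerpt. The sketch below builds a small daily series of its own and configures the rule exactly as the tests do; the import paths are assumed, and the closing comment only restates what the expected arrays above show.
from unittest.mock import Mock
import numpy as _np
import xarray as _xr
# Assumed import paths, mirroring tests/business/entities/rules/:
from decoimpact.business.entities.rules.rolling_statistics_rule import RollingStatisticsRule
from decoimpact.business.entities.rules.time_operation_type import TimeOperationType  # path assumed

logger = Mock()

time = [_np.datetime64(t) for t in
        ["2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04"]]
value_array = _xr.DataArray([0.1, 0.7, 0.2, 0.3], coords=[time], dims=["time"])

rule = RollingStatisticsRule(
    name="rolling_max",                   # hypothetical rule name
    input_variable_names=["foo"],
    operation_type=TimeOperationType.MAX,
)
rule.settings.time_scale = "day"          # unit of the rolling window
rule.period = 2                           # window length in that unit

result = rule.execute(value_array, logger)
# Entries for which no full window is available yet are NaN; the remaining
# entries hold the chosen statistic over the window, as in the expected
# arrays of the tests above.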
test_operation_type_not_implemented()
Test that the rolling statistics rule gives an error if an unsupported operation_type is given
Source code in tests/business/entities/rules/test_rolling_statistics_rule.py
def test_operation_type_not_implemented():
"""Test that the rulling statistics rule gives an error
if no operation_type is given"""
# create test set
logger = Mock(ILogger)
rule = RollingStatisticsRule(
name="test", input_variable_names=["foo"], operation_type="test"
)
rule.settings.time_scale = "day"
rule.period = 2
with pytest.raises(NotImplementedError) as exc_info:
rule.execute(value_array, logger)
exception_raised = exc_info.value
# Assert
expected_message = "The operation type 'test' is currently not supported"
assert exception_raised.args[0] == expected_message
test_rolling_statistics_rule_without_time_dimension()
RollingStatisticsRule should give an error when no time dim is defined
Source code in tests/business/entities/rules/test_rolling_statistics_rule.py
def test_rolling_statistics_rule_without_time_dimension():
"""RollingStatisticsRule should give an error when no time dim is defined"""
# create test set
logger = Mock(ILogger)
rule = RollingStatisticsRule(
name="test", input_variable_names=["foo"], operation_type=TimeOperationType.ADD
)
rule.settings.time_scale = "day"
rule.period = 365
test_data = [1.2, 0.4]
test_array = _xr.DataArray(test_data, name="test_with_error")
with pytest.raises(ValueError) as exc_info:
rule.execute(test_array, logger)
exception_raised = exc_info.value
# Assert
expected_message = "No time dimension found for test_with_error"
assert exception_raised.args[0] == expected_message
test_rule_base
Tests for RuleBase class
TestRule (RuleBase)
Source code in tests/business/entities/rules/test_rule_base.py
class TestRule(RuleBase):
def validate(self) -> bool:
return True
validate(self)
Validates if the rule is valid
Returns:
bool: whether the rule is valid
Source code in tests/business/entities/rules/test_rule_base.py
def validate(self) -> bool:
return True
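RuleBase is the abstract base that concrete rules extend; the TestRule above overrides only validate. A minimal sketch of the same pattern, showing the defaults asserted by the tests below (the import path is assumed):
# Assumed import path, mirroring tests/business/entities/rules/:
from decoimpact.business.entities.rules.rule_base import RuleBase


class AlwaysValidRule(RuleBase):
    """Hypothetical minimal rule, mirroring the TestRule above."""

    def validate(self) -> bool:
        return True


rule = AlwaysValidRule("my_rule", [])    # name, input_variable_names
print(rule.description)                  # -> "" (default)
print(rule.output_variable_name)         # -> "output" (default)
rule.output_variable_name = "my_output"  # both properties are settable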
test_create_rule_base_should_set_defaults()
Test creating a RuleBase with defaults
Source code in tests/business/entities/rules/test_rule_base.py
def test_create_rule_base_should_set_defaults():
"""Test creating a RuleBase with defaults"""
# Arrange & Act
rule = TestRule("test", [])
# Assert
assert rule.name == "test"
assert rule.description == ""
assert rule.output_variable_name == "output"
assert isinstance(rule, RuleBase)
test_setting_description_of_rule()
Test setting description of a RuleBase
Source code in tests/business/entities/rules/test_rule_base.py
def test_setting_description_of_rule():
"""Test setting description of a RuleBase"""
# Arrange & Act
rule = TestRule("test", [])
# Assert
assert rule.description == ""
rule.description = "foo"
assert rule.description == "foo"
test_setting_name_of_rule()
Test setting name of a RuleBase
Source code in tests/business/entities/rules/test_rule_base.py
def test_setting_name_of_rule():
"""Test setting name of a RuleBase"""
# Arrange & Act
rule = TestRule("test", [])
# Assert
assert rule.name == "test"
rule.name = "foo"
assert rule.name == "foo"
test_setting_output_variable_name_of_rule()
Test setting input_variable_names of a RuleBase
Source code in tests/business/entities/rules/test_rule_base.py
def test_setting_output_variable_name_of_rule():
"""Test setting input_variable_names of a RuleBase"""
# Arrange & Act
rule = TestRule("test", [])
# Assert
assert rule.output_variable_name == "output"
rule.output_variable_name = "foo"
assert rule.output_variable_name == "foo"
test_step_function_rule
Tests for Step Function Rule class
fixture_example_rule()
Initiation of StepFunctionRule to be reused in the following tests
Source code in tests/business/entities/rules/test_step_function_rule.py
@pytest.fixture(name="example_rule")
def fixture_example_rule():
"""Inititaion of StepFunctionRule to be reused in the following tests"""
return StepFunctionRule(
"step_function_rule_name",
"input_variable_name",
[0, 1, 2, 5, 10],
[10, 11, 12, 15, 20],
)
fixture_example_rule_combined()
Initiation of StepFunctionRule to be reused in the following tests, with both increasing and decreasing steps.
Source code in tests/business/entities/rules/test_step_function_rule.py
@pytest.fixture(name="example_rule_combined")
def fixture_example_rule_combined():
"""Inititation of StepFunctionRule to be reused in the following tests,
with differences in increasing an decreasing of the steps."""
return StepFunctionRule(
"step_function_rule_name",
"input_variable_name",
[0, 1, 2, 5, 10],
[22, 15, 10, 12, 20],
)
test_create_step_function(example_rule)
Test creating a new (valid) Step Function rule
Source code in tests/business/entities/rules/test_step_function_rule.py
def test_create_step_function(example_rule):
"""
Test creating a new (valid) Step Function rule
"""
# Arrange
logger = Mock(ILogger)
# Assert
assert example_rule.name == "step_function_rule_name"
assert example_rule.input_variable_names[0] == "input_variable_name"
assert (example_rule.limits == [0, 1, 2, 5, 10]).all()
assert isinstance(example_rule, StepFunctionRule)
assert example_rule.validate(logger)
test_execute_values_at_limits(example_rule, input_value, expected_output_value)
Test the function execution with input values exactly at the interval limits.
Source code in tests/business/entities/rules/test_step_function_rule.py
@pytest.mark.parametrize(
"input_value, expected_output_value",
[
(0, (10, [0, 0])),
(1, (11, [0, 0])),
(2, (12, [0, 0])),
(5, (15, [0, 0])),
(10, (20, [0, 0])),
],
)
def test_execute_values_at_limits(
example_rule, input_value: int, expected_output_value: int
):
"""
Test the function execution with input values exactly at the interval limits.
"""
# Arrange
logger = Mock(ILogger)
# Assert
assert example_rule.execute(input_value, logger) == expected_output_value
logger.log_warning.assert_not_called()
test_execute_values_between_limits(example_rule, input_value, expected_output_value)
Test the function execution with input values between the interval limits.
Source code in tests/business/entities/rules/test_step_function_rule.py
@pytest.mark.parametrize(
"input_value, expected_output_value",
[
(0.5, (10, [0, 0])),
(1.5, (11, [0, 0])),
(2.5, (12, [0, 0])),
(5.5, (15, [0, 0])),
],
)
def test_execute_values_between_limits(
example_rule, input_value: int, expected_output_value: int
):
"""
Test the function execution with input values between the interval limits.
"""
# Arrange
logger = Mock(ILogger)
# Assert
assert example_rule.execute(input_value, logger) == expected_output_value
logger.log_warning.assert_not_called()
test_execute_values_combined_dec_inc(example_rule_combined, input_value, expected_output_value)
Test the function execution with input values between the interval limits.
Source code in tests/business/entities/rules/test_step_function_rule.py
@pytest.mark.parametrize(
"input_value, expected_output_value",
[
(-1, (22, [1, 0])),
(0.5, (22, [0, 0])),
(1.5, (15, [0, 0])),
(2.5, (10, [0, 0])),
(5.5, (12, [0, 0])),
(10.5, (20, [0, 1])),
],
)
def test_execute_values_combined_dec_inc(
example_rule_combined,
input_value: int,
expected_output_value: int,
):
"""
Test the function execution with input values between the interval limits.
"""
# Arrange
logger = Mock(ILogger)
# Assert
assert example_rule_combined.execute(input_value, logger) == expected_output_value
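Taken together, the parametrized cases show how a step function rule behaves: a value takes the response of the last limit at or below it, and values outside the limits keep the edge response while setting an out-of-range flag. A minimal sketch with the fixture values above (the import path is assumed):
from unittest.mock import Mock
# Assumed import path, mirroring tests/business/entities/rules/:
from decoimpact.business.entities.rules.step_function_rule import StepFunctionRule

logger = Mock()  # stand-in for an ILogger

rule = StepFunctionRule(
    "step_function_rule_name",
    "input_variable_name",
    [0, 1, 2, 5, 10],       # limits (must be sorted and unique)
    [10, 11, 12, 15, 20],   # one response per limit
)

print(rule.execute(2.5, logger))    # -> (12, [0, 0]): response of the limit below 2.5
# Below the first limit the flags become [1, 0], above the last limit [0, 1],
# as the combined increasing/decreasing cases above illustrate.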
test_limits_and_responses_have_different_lengths(example_rule)
Test the function execution when limits and responses have different lengths
Source code in tests/business/entities/rules/test_step_function_rule.py
def test_limits_and_responses_have_different_lengths(example_rule):
"""
Test the function execution when limits and responses have different lengths
"""
# Arrange
logger = Mock(ILogger)
# Act
example_rule._limits = _np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
# Assert
assert not example_rule.validate(logger)
logger.log_error.assert_called_with(
"The number of limits and of responses must be equal."
)
test_limits_must_be_unique(example_rule)
The ParserStepFunctionRule cannot sort limits if they are not unique. An error message should be sent.
Source code in tests/business/entities/rules/test_step_function_rule.py
def test_limits_must_be_unique(example_rule):
"""The ParserStepFunctionRule cannot sort
limits if they are not unique. An error message should be sent."""
# Arrange
logger = Mock(ILogger)
# Act
example_rule._limits = _np.array([0, 1, 2, 5, 1])
# Assert
assert not example_rule.validate(logger)
logger.log_error.assert_called_with("Limits must be unique.")
test_limits_should_be_ordered(example_rule)
The ParserStepFunctionRule cannot calculate responses if the limits are not sorted.
Source code in tests/business/entities/rules/test_step_function_rule.py
def test_limits_should_be_ordered(example_rule):
"""The ParserStepFunctionRule calculate responses if the limits
are not sorted."""
# Arrange
logger = Mock(ILogger)
# Act
example_rule._limits = _np.array([0, 1, 2, 5, 4])
# Assert
assert not example_rule.validate(logger)
logger.log_error.assert_called_with("The limits should be given in a sorted order.")
test_string_parser_utils
Tests for string parser utilities
test_read_str_comparison()
Test function to convert str to comparison and return value
Source code in tests/business/entities/rules/test_string_parser_utils.py
def test_read_str_comparison():
"""Test function to convert str to comparison and return value"""
assert read_str_comparison(">5", ">") == 5
assert read_str_comparison("<5", "<") == 5
test_read_str_comparison_fails(test_string, operator)
Test if a comparison string in an incorrect format gives an error
Source code in tests/business/entities/rules/test_string_parser_utils.py
@pytest.mark.parametrize(
"test_string, operator",
[[">=5", ">"], ["5<", "<"], ["<5>", "<"]],
)
def test_read_str_comparison_fails(test_string: str, operator: str):
"""Test if a range in incorrect format gives an error"""
# Act
with pytest.raises(ValueError) as exc_info:
read_str_comparison(test_string, operator)
exception_raised = exc_info.value
# Assert
assert (
exception_raised.args[0]
== f'Input "{test_string}" is not a valid comparison with operator: {operator}'
)
test_read_str_comparison_fails_multiple_operators(test_string, operator)
Test if a comparison string in an incorrect format gives an error
Source code in tests/business/entities/rules/test_string_parser_utils.py
@pytest.mark.parametrize(
"test_string, operator",
[
["4", "<"],
["<<5", "<"],
["5<<", "<"],
["<5<", "<"],
],
)
def test_read_str_comparison_fails_multiple_operators(test_string: str, operator: str):
"""Test if a range in incorrect format gives an error"""
# Act
with pytest.raises(IndexError) as exc_info:
read_str_comparison(test_string, operator)
exception_raised = exc_info.value
# Assert
assert (
exception_raised.args[0]
== f'Input "{test_string}" is not a valid comparison with operator: {operator}'
)
test_str_range_to_list()
Test function to validate range
Source code in tests/business/entities/rules/test_string_parser_utils.py
def test_str_range_to_list():
"""Test function to validate range"""
# test data
test_space = "0.5: 5.5"
test_negative_number = "-3 : 3"
assert str_range_to_list(test_space) == (0.5, 5.5)
assert str_range_to_list(test_negative_number) == (-3, 3)
test_str_range_to_list_fails()
Test if a range in an incorrect format gives an error
Source code in tests/business/entities/rules/test_string_parser_utils.py
def test_str_range_to_list_fails():
"""Test if a range in incorrect format gives an error"""
# Arrange
test_string: str = "0 - 5"
# Act
with pytest.raises(ValueError) as exc_info:
str_range_to_list(test_string)
exception_raised = exc_info.value
# Assert
assert exception_raised.args[0] == f'Input "{test_string}" is not a valid range'
test_type_of_classification(test_string, result)
Test the type_of_classification function
Source code in tests/business/entities/rules/test_string_parser_utils.py
@pytest.mark.parametrize(
"test_string, result",
[
["12.34", "number"],
["-12.34", "number"],
["0", "number"],
["-", "NA"],
[">5", "larger"],
["<5", "smaller"],
[">=5", "larger_equal"],
["<=5", "smaller_equal"],
[5, "number"],
[-8.0, "number"],
],
)
def test_type_of_classification(test_string: str, result: str):
"""Test function to type classification"""
assert type_of_classification(test_string) == result
test_type_of_classification_fails(test_string)
Test the type_of_classification function for invalid strings
Source code in tests/business/entities/rules/test_string_parser_utils.py
@pytest.mark.parametrize(
"test_string",
[["hello"], [">=5"], ["5<"], [""], ["--"], [":100:199"], ["3:>9"]],
)
def test_type_of_classification_fails(test_string: str):
"""Test function to type classification for failing strings"""
with pytest.raises(ValueError) as exc_info:
type_of_classification(test_string)
exception_raised = exc_info.value
assert exception_raised.args[0] == f"No valid criteria is given: {test_string}"
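The three helpers tested in this module parse criteria strings used by classification rules. A short usage sketch, limited to the cases covered above (the import path is assumed):
# Assumed import path, mirroring tests/business/entities/rules/:
from decoimpact.business.entities.rules.string_parser_utils import (
    read_str_comparison,
    str_range_to_list,
    type_of_classification,
)

print(type_of_classification(">5"))       # -> "larger"
print(type_of_classification("-12.34"))   # -> "number"
print(read_str_comparison(">5", ">"))     # -> 5
print(str_range_to_list("0.5: 5.5"))      # -> (0.5, 5.5)
# Malformed input raises a ValueError, e.g. str_range_to_list("0 - 5")
# or type_of_classification("3:>9"), as the failure tests above show.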
test_time_aggregation_rule
Tests for time aggregation rule
test_aggregate_time_rule_without_time_dimension()
TimeAggregationRule should give an error when no time dim is defined
Source code in tests/business/entities/rules/test_time_aggregation_rule.py
def test_aggregate_time_rule_without_time_dimension():
"""TimeAggregationRule should give an error when no time dim is defined"""
# create test set
logger = Mock(ILogger)
rule = TimeAggregationRule(
name="test",
input_variable_names=["foo"],
operation_type=TimeOperationType.ADD,
)
test_data = [1.2, 0.4]
test_array = _xr.DataArray(test_data, name="test_with_error")
with pytest.raises(ValueError) as exc_info:
rule.execute(test_array, logger)
exception_raised = exc_info.value
# Assert
expected_message = "No time dimension found for test_with_error"
assert exception_raised.args[0] == expected_message
test_create_time_aggregation_rule_should_set_defaults()
Test creating a time aggregation rule with defaults
Source code in tests/business/entities/rules/test_time_aggregation_rule.py
def test_create_time_aggregation_rule_should_set_defaults():
"""Test creating a time aggregation rule with defaults"""
# Arrange & Act
rule = TimeAggregationRule(
name="test",
input_variable_names=["foo"],
operation_type=TimeOperationType.MIN,
)
# Assert
assert rule.name == "test"
assert rule.description == ""
assert isinstance(rule, TimeAggregationRule)
test_execute_value_array_aggregate_time_monthly_add()
Aggregate input_variable_names of a TimeAggregationRule (add, monthly)
Source code in tests/business/entities/rules/test_time_aggregation_rule.py
def test_execute_value_array_aggregate_time_monthly_add():
"""Aggregate input_variable_names of a TimeAggregationRule (add, monthly)"""
# create test set
logger = Mock(ILogger)
rule = TimeAggregationRule(
name="test", input_variable_names=["foo"], operation_type=TimeOperationType.ADD
)
rule.settings.time_scale = "month"
time_aggregation = rule.execute(value_array_monthly, logger)
result_data = [0.1, 0.9, 0.5]
result_array = _xr.DataArray(
result_data, coords=[result_time_monthly], dims=["time_month"]
)
# Assert
assert (
_xr.testing.assert_allclose(time_aggregation, result_array, atol=1e-11) is None
)
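The module-level value_array_monthly used throughout these tests is not shown in this excerpt. The sketch below builds a small series of its own and runs the same monthly aggregation; the import paths are assumed, and the exact output values depend on the chosen data.
from unittest.mock import Mock
import numpy as _np
import xarray as _xr
# Assumed import paths, mirroring tests/business/entities/rules/:
from decoimpact.business.entities.rules.time_aggregation_rule import TimeAggregationRule
from decoimpact.business.entities.rules.time_operation_type import TimeOperationType  # path assumed

logger = Mock()

time = [_np.datetime64(t) for t in
        ["2020-01-05", "2020-01-20", "2020-02-05", "2020-02-20"]]
values = _xr.DataArray([0.1, 0.3, 0.2, 0.6], coords=[time], dims=["time"])

rule = TimeAggregationRule(
    name="monthly_max",                   # hypothetical rule name
    input_variable_names=["foo"],
    operation_type=TimeOperationType.MAX,
)
rule.settings.time_scale = "month"        # the default time scale is "year"

monthly_max = rule.execute(values, logger)
# One value per month on a "time_month" dimension (here 0.3 for January and
# 0.6 for February), analogous to the expected arrays in the tests above.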
test_execute_value_array_aggregate_time_monthly_average()
Aggregate input_variable_names of a TimeAggregationRule (average, monthly)
Source code in tests/business/entities/rules/test_time_aggregation_rule.py
def test_execute_value_array_aggregate_time_monthly_average():
"""Aggregate input_variable_names of a TimeAggregationRule (average, monthly)"""
# create test set
logger = Mock(ILogger)
rule = TimeAggregationRule(
name="test",
input_variable_names=["foo"],
operation_type=TimeOperationType.AVERAGE,
)
rule.settings.time_scale = "month"
time_aggregation = rule.execute(value_array_monthly, logger)
result_data = [0.1, 0.45, 0.25]
result_array = _xr.DataArray(
result_data, coords=[result_time_monthly], dims=["time_month"]
)
# Assert
assert (
_xr.testing.assert_allclose(time_aggregation, result_array, atol=1e-11) is None
)
test_execute_value_array_aggregate_time_monthly_max()
Aggregate input_variable_names of a TimeAggregationRule (max, monthly)
Source code in tests/business/entities/rules/test_time_aggregation_rule.py
def test_execute_value_array_aggregate_time_monthly_max():
"""Aggregate input_variable_names of a TimeAggregationRule (max, monthly)"""
# create test set
logger = Mock(ILogger)
rule = TimeAggregationRule(
name="test",
input_variable_names=["foo"],
operation_type=TimeOperationType.MAX,
)
rule.settings.time_scale = "month"
time_aggregation = rule.execute(value_array_monthly, logger)
result_data = [0.1, 0.7, 0.3]
result_array = _xr.DataArray(
result_data, coords=[result_time_monthly], dims=["time_month"]
)
# Assert
assert (
_xr.testing.assert_equal(
time_aggregation,
result_array,
)
is None
)
test_execute_value_array_aggregate_time_monthly_median()
Test aggregate input_variable_names of a TimeAggregationRule (median, monthly)
Source code in tests/business/entities/rules/test_time_aggregation_rule.py
def test_execute_value_array_aggregate_time_monthly_median():
"""Test aggregate input_variable_names of a TimeAggregationRule (median, monthly)"""
# create test set
logger = Mock(ILogger)
rule = TimeAggregationRule(
name="test",
input_variable_names=["foo"],
operation_type=TimeOperationType.MEDIAN,
)
rule.settings.time_scale = "month"
time_aggregation = rule.execute(value_array_monthly, logger)
result_data = [0.1, 0.45, 0.25]
result_array = _xr.DataArray(
result_data, coords=[result_time_monthly], dims=["time_month"]
)
# Assert
assert (
_xr.testing.assert_allclose(time_aggregation, result_array, atol=1e-11) is None
)
test_execute_value_array_aggregate_time_monthly_min()
Aggregate input_variable_names of a TimeAggregationRule (min, monthly)
Source code in tests/business/entities/rules/test_time_aggregation_rule.py
def test_execute_value_array_aggregate_time_monthly_min():
"""Aggregate input_variable_names of a TimeAggregationRule (min, monthly)"""
# create test set
logger = Mock(ILogger)
rule = TimeAggregationRule(
name="test",
input_variable_names=["foo"],
operation_type=TimeOperationType.MIN,
)
rule.settings.time_scale = "month"
time_aggregation = rule.execute(value_array_monthly, logger)
result_data = [0.1, 0.2, 0.2]
result_array = _xr.DataArray(
result_data, coords=[result_time_monthly], dims=["time_month"]
)
# Assert
assert _xr.testing.assert_equal(time_aggregation, result_array) is None
test_execute_value_array_aggregate_time_monthly_percentile()
Test aggregate input_variable_names of a TimeAggregationRule (PERCENTILE, monthly)
Source code in tests/business/entities/rules/test_time_aggregation_rule.py
def test_execute_value_array_aggregate_time_monthly_percentile():
"""Test aggregate input_variable_names of a TimeAggregationRule
(PERCENTILE, monthly)"""
# create test set
logger = Mock(ILogger)
rule = TimeAggregationRule(
name="test",
input_variable_names=["foo"],
operation_type=TimeOperationType.PERCENTILE,
)
rule.settings.time_scale = "month"
rule.settings.percentile_value = 10
time_aggregation = rule.execute(value_array_monthly, logger)
result_data = [0.1, 0.25, 0.21]
result_array = _xr.DataArray(
result_data, coords=[result_time_monthly], dims=["time_month"]
)
# Assert
assert (
_xr.testing.assert_allclose(time_aggregation, result_array, atol=1e-11) is None
)
test_execute_value_array_aggregate_time_monthly_stdev()
Test aggregate input_variable_names of a TimeAggregationRule (STDEV, monthly)
Source code in tests/business/entities/rules/test_time_aggregation_rule.py
def test_execute_value_array_aggregate_time_monthly_stdev():
"""Test aggregate input_variable_names of a TimeAggregationRule
(STDEV, monthly)"""
# create test set
logger = Mock(ILogger)
rule = TimeAggregationRule(
name="test",
input_variable_names=["foo"],
operation_type=TimeOperationType.STDEV,
)
rule.settings.time_scale = "month"
time_aggregation = rule.execute(value_array_monthly, logger)
result_data = [0.0, 0.25, 0.05]
result_array = _xr.DataArray(
result_data, coords=[result_time_monthly], dims=["time_month"]
)
# Assert
assert (
_xr.testing.assert_allclose(time_aggregation, result_array, atol=1e-11) is None
)
test_execute_value_array_aggregate_time_yearly_add()
Aggregate input_variable_names of a TimeAggregationRule (add, yearly)
Source code in tests/business/entities/rules/test_time_aggregation_rule.py
def test_execute_value_array_aggregate_time_yearly_add():
"""Aggregate input_variable_names of a TimeAggregationRule (add, yearly)"""
# create test set
logger = Mock(ILogger)
rule = TimeAggregationRule(
name="test",
input_variable_names=["foo"],
operation_type=TimeOperationType.ADD,
)
time_aggregation = rule.execute(value_array_yearly, logger)
result_data = [1.2, 0.4]
result_array = _xr.DataArray(
result_data, coords=[result_time_yearly], dims=["time_year"]
)
# Assert
assert _xr.testing.assert_equal(time_aggregation, result_array) is None
test_execute_value_array_aggregate_time_yearly_average()
Aggregate input_variable_names of a TimeAggregationRule (average, yearly)
Source code in tests/business/entities/rules/test_time_aggregation_rule.py
def test_execute_value_array_aggregate_time_yearly_average():
"""Aggregate input_variable_names of a TimeAggregationRule (average, yearly)"""
# create test set
logger = Mock(ILogger)
rule = TimeAggregationRule(
name="test",
input_variable_names=["foo"],
operation_type=TimeOperationType.AVERAGE,
)
time_aggregation = rule.execute(value_array_yearly, logger)
result_data = [0.3, 0.2]
result_array = _xr.DataArray(
result_data, coords=[result_time_yearly], dims=["time_year"]
)
# Assert
assert _xr.testing.assert_equal(time_aggregation, result_array) is None
test_execute_value_array_aggregate_time_yearly_max()
Aggregate input_variable_names of a TimeAggregationRule (max, yearly)
Source code in tests/business/entities/rules/test_time_aggregation_rule.py
def test_execute_value_array_aggregate_time_yearly_max():
"""Aggregate input_variable_names of a TimeAggregationRule (max, yearly)"""
# create test set
logger = Mock(ILogger)
rule = TimeAggregationRule(
name="test",
input_variable_names=["foo"],
operation_type=TimeOperationType.MAX,
)
time_aggregation = rule.execute(value_array_yearly, logger)
result_data = [0.7, 0.3]
result_array = _xr.DataArray(
result_data, coords=[result_time_yearly], dims=["time_year"]
)
# Assert
assert _xr.testing.assert_equal(time_aggregation, result_array) is None
test_execute_value_array_aggregate_time_yearly_median()
Test aggregate input_variable_names of a TimeAggregationRule (median, yearly)
Source code in tests/business/entities/rules/test_time_aggregation_rule.py
def test_execute_value_array_aggregate_time_yearly_median():
"""Test aggregate input_variable_names of a TimeAggregationRule (median, yearly)"""
# create test set
logger = Mock(ILogger)
rule = TimeAggregationRule(
name="test",
input_variable_names=["foo"],
operation_type=TimeOperationType.MEDIAN,
)
time_aggregation = rule.execute(value_array_yearly, logger)
result_data = [0.2, 0.2]
result_array = _xr.DataArray(
result_data, coords=[result_time_yearly], dims=["time_year"]
)
# Assert
assert _xr.testing.assert_equal(time_aggregation, result_array) is None
test_execute_value_array_aggregate_time_yearly_min()
Aggregate input_variable_names of a TimeAggregationRule (min, yearly)
Source code in tests/business/entities/rules/test_time_aggregation_rule.py
def test_execute_value_array_aggregate_time_yearly_min():
"""Aggregate input_variable_names of a TimeAggregationRule (min, yearly)"""
# create test set
logger = Mock(ILogger)
rule = TimeAggregationRule(
name="test",
input_variable_names=["foo"],
operation_type=TimeOperationType.MIN,
)
time_aggregation = rule.execute(value_array_yearly, logger)
result_data = [0.1, 0.1]
result_array = _xr.DataArray(
result_data, coords=[result_time_yearly], dims=["time_year"]
)
# Assert
assert _xr.testing.assert_equal(time_aggregation, result_array) is None
test_operation_type_not_implemented()
Test that the time aggregation rule gives an error if an unsupported operation_type is given
Source code in tests/business/entities/rules/test_time_aggregation_rule.py
def test_operation_type_not_implemented():
"""Test that the time aggregation rule gives an error
if no operation_type is given"""
# create test set
logger = Mock(ILogger)
rule = TimeAggregationRule(
name="test",
input_variable_names=["foo"],
operation_type="test",
)
rule.settings.time_scale = "month"
with pytest.raises(NotImplementedError) as exc_info:
rule.execute(value_array_monthly, logger)
exception_raised = exc_info.value
# Assert
expected_message = "The operation type 'test' is currently not supported"
assert exception_raised.args[0] == expected_message
test_time_aggregation_rule_analyze_periods
Tests for time aggregation rule for the operation types:
- COUNT_PERIODS
- MAX_DURATION_PERIODS
- AVG_DURATION_PERIODS
test_analyze_groups_function(operation_type, expected_result_data)
Test the analyze_groups function for several examples.
This function is used when 'count_periods' is given as the aggregation in the TimeAggregationRule. The result is aggregated per year; count_periods should return the number of groups of consecutive values of 1. This test shows that the counting respects the beginning and end of each year.
Source code in tests/business/entities/rules/test_time_aggregation_rule_analyze_periods.py
@pytest.mark.parametrize(
"operation_type, expected_result_data",
[
("COUNT_PERIODS", [2, 2, 2, 2]),
("MAX_DURATION_PERIODS", [2, 2, 3, 3]),
("AVG_DURATION_PERIODS", [1.5, 1.5, 2, 2]),
],
)
def test_analyze_groups_function(operation_type, expected_result_data):
"""Test the count_groups to count groups for several examples.
This function is being used when 'count_periods' is given
as aggregation in the TimeAggregationRule.
The result should be aggregated per year.
The count_periods should result in a number of the groups with value 1.
This test should show that the count_periods accounts for begin and end of the year.
"""
rule = TimeAggregationRule(
name="test",
input_variable_names=["foo"],
operation_type=TimeOperationType[operation_type],
)
t_data = [0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1]
t_time = [
"2000-01-01",
"2000-01-02",
"2000-01-03",
"2000-01-04",
"2000-01-05",
"2001-01-01",
"2001-01-02",
"2001-01-03",
"2001-01-04",
"2001-01-05",
"2002-01-01",
"2002-01-02",
"2002-01-03",
"2002-01-04",
"2002-01-05",
"2003-01-01",
"2003-01-02",
"2003-01-03",
"2003-01-04",
"2003-01-05",
]
t_time = [_np.datetime64(t) for t in t_time]
input_array = _xr.DataArray(t_data, coords=[t_time], dims=["time"])
result = input_array.resample(time="YE").reduce(rule.analyze_groups)
# expected results
expected_result_time = ["2000-12-31", "2001-12-31", "2002-12-31", "2003-12-31"]
expected_result_time = [_np.datetime64(t) for t in expected_result_time]
expected_result = _xr.DataArray(
expected_result_data, coords=[expected_result_time], dims=["time"]
)
assert _xr.testing.assert_equal(expected_result, result) is None
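To make the expected numbers above easier to verify by hand: within each year the 0/1 series is split into runs of consecutive ones, COUNT_PERIODS is the number of runs, MAX_DURATION_PERIODS is the longest run and AVG_DURATION_PERIODS is their mean length. The standard-library check below reproduces the 2002 values (1, 0, 1, 1, 1) independently of the rule implementation.
from itertools import groupby

def run_lengths(values):
    """Lengths of the runs of consecutive 1-values in a 0/1 sequence."""
    return [len(list(group)) for key, group in groupby(values) if key == 1]

year_2002 = [1, 0, 1, 1, 1]            # third year of t_data above
runs = run_lengths(year_2002)          # -> [1, 3]
print(len(runs))                       # COUNT_PERIODS        -> 2
print(max(runs))                       # MAX_DURATION_PERIODS -> 3
print(sum(runs) / len(runs))           # AVG_DURATION_PERIODS -> 2.0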
test_analyze_groups_function_2d(operation_type, expected_result_data)
Test if functional for 2d arrays
Source code in tests/business/entities/rules/test_time_aggregation_rule_analyze_periods.py
@pytest.mark.parametrize(
"operation_type, expected_result_data",
[
("COUNT_PERIODS", [[2, 2, 2, 2], [1, 2, 2, 2], [2, 1, 2, 2]]),
("MAX_DURATION_PERIODS", [[2, 2, 3, 3], [1, 2, 3, 3], [2, 2, 3, 3]]),
("AVG_DURATION_PERIODS", [[1.5, 1.5, 2, 2], [1, 1.5, 2, 2], [1.5, 2, 2, 2]]),
],
)
def test_analyze_groups_function_2d(operation_type, expected_result_data):
"""Test if functional for 2d arrays"""
rule = TimeAggregationRule(
name="test",
input_variable_names=["foo"],
operation_type=TimeOperationType[operation_type],
)
t_data = [
[0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1],
[0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1],
[0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1],
]
t_time = [
"2000-01-01",
"2000-01-02",
"2000-01-03",
"2000-01-04",
"2000-01-05",
"2001-01-01",
"2001-01-02",
"2001-01-03",
"2001-01-04",
"2001-01-05",
"2002-01-01",
"2002-01-02",
"2002-01-03",
"2002-01-04",
"2002-01-05",
"2003-01-01",
"2003-01-02",
"2003-01-03",
"2003-01-04",
"2003-01-05",
]
t_time = [_np.datetime64(t) for t in t_time]
t_cells = [0, 1, 2]
input_array = _xr.DataArray(
t_data, coords=[t_cells, t_time], dims=["cells", "time"]
)
result = input_array.resample(time="YE").reduce(rule.analyze_groups)
# expected results
expected_result_time = ["2000-12-31", "2001-12-31", "2002-12-31", "2003-12-31"]
expected_result_time = [_np.datetime64(t) for t in expected_result_time]
expected_result = _xr.DataArray(
expected_result_data,
coords=[t_cells, expected_result_time],
dims=["cells", "time"],
)
assert _xr.testing.assert_equal(expected_result, result) is None
test_analyze_groups_function_no_periods(operation_type, expected_result_data)
Test the analyze_groups function for several examples.
This function is used when 'count_periods' is given as the aggregation in the TimeAggregationRule. The result is aggregated per year; count_periods should return the number of groups of consecutive values of 1. This test shows that the counting respects the beginning and end of each year.
Source code in tests/business/entities/rules/test_time_aggregation_rule_analyze_periods.py
@pytest.mark.parametrize(
"operation_type, expected_result_data",
[
("COUNT_PERIODS", [0, 1, 0, 0]),
("MAX_DURATION_PERIODS", [0, 1, 0, 0]),
("AVG_DURATION_PERIODS", [0, 1, 0, 0]),
],
)
def test_analyze_groups_function_no_periods(operation_type, expected_result_data):
"""Test the count_groups to count groups for several examples.
This function is being used when 'count_periods' is given
as aggregation in the TimeAggregationRule.
The result should be aggregated per year.
The count_periods should result in a number of the groups with value 1.
This test should show that the count_periods accounts for begin and end of the year.
"""
rule = TimeAggregationRule(
name="test",
input_variable_names=["foo"],
operation_type=TimeOperationType[operation_type],
)
t_data = [0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
t_time = [
"2000-01-01",
"2000-01-02",
"2000-01-03",
"2000-01-04",
"2000-01-05",
"2001-01-01",
"2001-01-02",
"2001-01-03",
"2001-01-04",
"2001-01-05",
"2002-01-01",
"2002-01-02",
"2002-01-03",
"2002-01-04",
"2002-01-05",
"2003-01-01",
"2003-01-02",
"2003-01-03",
"2003-01-04",
"2003-01-05",
]
t_time = [_np.datetime64(t) for t in t_time]
input_array = _xr.DataArray(t_data, coords=[t_time], dims=["time"])
result = input_array.resample(time="YE").reduce(rule.analyze_groups)
# expected results
expected_result_time = ["2000-12-31", "2001-12-31", "2002-12-31", "2003-12-31"]
expected_result_time = [_np.datetime64(t) for t in expected_result_time]
expected_result = _xr.DataArray(
expected_result_data, coords=[expected_result_time], dims=["time"]
)
assert _xr.testing.assert_equal(expected_result, result) is None
test_analyze_groups_function_not_only_1_and_0()
Test whether an error is given if the data array contains values other than 0 and 1
Source code in tests/business/entities/rules/test_time_aggregation_rule_analyze_periods.py
def test_analyze_groups_function_not_only_1_and_0():
"""Test whether it gives an error if the data array contains
other values than 0 and 1"""
logger = Mock(ILogger)
rule = TimeAggregationRule(
name="test",
input_variable_names=["foo"],
operation_type=TimeOperationType.COUNT_PERIODS,
)
t_data = [2, 3, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1]
t_time = [
"2000-01-01",
"2000-01-02",
"2000-01-03",
"2000-01-04",
"2000-01-05",
"2001-01-01",
"2001-01-02",
"2001-01-03",
"2001-01-04",
"2001-01-05",
"2002-01-01",
"2002-01-02",
"2002-01-03",
"2002-01-04",
"2002-01-05",
"2003-01-01",
"2003-01-02",
"2003-01-03",
"2003-01-04",
"2003-01-05",
]
t_time = [_np.datetime64(t) for t in t_time]
input_array = _xr.DataArray(t_data, coords=[t_time], dims=["time"])
# Act
with pytest.raises(ValueError) as exc_info:
rule.execute(input_array, logger)
exception_raised = exc_info.value
# Assert
expected_message = (
"The value array for the time aggregation rule with operation type"
" COUNT_PERIODS should only contain the values 0 and 1 (or NaN)."
)
assert exception_raised.args[0] == expected_message
test_analyze_groups_function_only_1_and_0_and_nan()
Test that no error is given if the data array contains only the values 0 and 1 and NaN
Source code in tests/business/entities/rules/test_time_aggregation_rule_analyze_periods.py
def test_analyze_groups_function_only_1_and_0_and_nan():
"""Test whether it gives an error if the data array contains
other values than 0 and 1"""
logger = Mock(ILogger)
rule = TimeAggregationRule(
name="test",
input_variable_names=["foo"],
operation_type=TimeOperationType.COUNT_PERIODS,
)
t_data = [1, _np.nan, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1]
t_time = [
"2000-01-01",
"2000-01-02",
"2000-01-03",
"2000-01-04",
"2000-01-05",
"2001-01-01",
"2001-01-02",
"2001-01-03",
"2001-01-04",
"2001-01-05",
"2002-01-01",
"2002-01-02",
"2002-01-03",
"2002-01-04",
"2002-01-05",
"2003-01-01",
"2003-01-02",
"2003-01-03",
"2003-01-04",
"2003-01-05",
]
t_time = [_np.datetime64(t) for t in t_time]
input_array = _xr.DataArray(t_data, coords=[t_time], dims=["time"])
# Act
rule.execute(input_array, logger)
# Assert
assert rule.validate(logger)
test_analyze_groups_function_only_nan(operation_type, expected_result_data)
Test the analyze_groups function for an array that contains only NaN values.
This function is used when 'count_periods' is given as the aggregation in the TimeAggregationRule. The result is aggregated per year; count_periods should return the number of groups of consecutive values of 1. This test shows that the counting respects the beginning and end of each year.
Source code in tests/business/entities/rules/test_time_aggregation_rule_analyze_periods.py
@pytest.mark.parametrize(
"operation_type, expected_result_data",
[
("COUNT_PERIODS", [0, 0, 0, 0]),
("MAX_DURATION_PERIODS", [0, 0, 0, 0]),
("AVG_DURATION_PERIODS", [0, 0, 0, 0]),
],
)
def test_analyze_groups_function_only_nan(operation_type, expected_result_data):
"""Test the count_groups to count groups for several examples including NaN values.
This function is being used when 'count_periods' is given
as aggregation in the TimeAggregationRule.
The result should be aggregated per year.
The count_periods should result in a number of the groups with value 1.
This test should show that the count_periods accounts for begin and end of the year.
"""
rule = TimeAggregationRule(
name="test",
input_variable_names=["foo"],
operation_type=TimeOperationType[operation_type],
)
t_data = [
_np.nan,
_np.nan,
_np.nan,
_np.nan,
_np.nan,
_np.nan,
_np.nan,
_np.nan,
_np.nan,
_np.nan,
_np.nan,
_np.nan,
_np.nan,
_np.nan,
_np.nan,
_np.nan,
_np.nan,
_np.nan,
_np.nan,
_np.nan,
]
t_time = [
"2000-01-01",
"2000-01-02",
"2000-01-03",
"2000-01-04",
"2000-01-05",
"2001-01-01",
"2001-01-02",
"2001-01-03",
"2001-01-04",
"2001-01-05",
"2002-01-01",
"2002-01-02",
"2002-01-03",
"2002-01-04",
"2002-01-05",
"2003-01-01",
"2003-01-02",
"2003-01-03",
"2003-01-04",
"2003-01-05",
]
t_time = [_np.datetime64(t) for t in t_time]
input_array = _xr.DataArray(t_data, coords=[t_time], dims=["time"])
result = input_array.resample(time="YE").reduce(rule.analyze_groups)
# expected results
expected_result_time = ["2000-12-31", "2001-12-31", "2002-12-31", "2003-12-31"]
expected_result_time = [_np.datetime64(t) for t in expected_result_time]
expected_result = _xr.DataArray(
expected_result_data, coords=[expected_result_time], dims=["time"]
)
assert _xr.testing.assert_equal(expected_result, result) is None
test_analyze_groups_function_with_nan(operation_type, expected_result_data)
Test the analyze_groups function for several examples including NaN values.
This function is used when 'count_periods' is given as the aggregation in the TimeAggregationRule. The result is aggregated per year; count_periods should return the number of groups of consecutive values of 1. This test shows that the counting respects the beginning and end of each year.
Source code in tests/business/entities/rules/test_time_aggregation_rule_analyze_periods.py
@pytest.mark.parametrize(
"operation_type, expected_result_data",
[
("COUNT_PERIODS", [2, 2, 2, 2]),
("MAX_DURATION_PERIODS", [1, 1, 2, 2]),
("AVG_DURATION_PERIODS", [1, 1, 1.5, 1.5]),
],
)
def test_analyze_groups_function_with_nan(operation_type, expected_result_data):
"""Test the count_groups to count groups for several examples including NaN values.
This function is being used when 'count_periods' is given
as aggregation in the TimeAggregationRule.
The result should be aggregated per year.
The count_periods should result in a number of the groups with value 1.
This test should show that the count_periods accounts for begin and end of the year.
"""
rule = TimeAggregationRule(
name="test",
input_variable_names=["foo"],
operation_type=TimeOperationType[operation_type],
)
t_data = [
0,
1,
0,
1,
_np.nan,
1,
0,
1,
_np.nan,
0,
1,
0,
1,
1,
_np.nan,
_np.nan,
1,
1,
0,
1,
]
t_time = [
"2000-01-01",
"2000-01-02",
"2000-01-03",
"2000-01-04",
"2000-01-05",
"2001-01-01",
"2001-01-02",
"2001-01-03",
"2001-01-04",
"2001-01-05",
"2002-01-01",
"2002-01-02",
"2002-01-03",
"2002-01-04",
"2002-01-05",
"2003-01-01",
"2003-01-02",
"2003-01-03",
"2003-01-04",
"2003-01-05",
]
t_time = [_np.datetime64(t) for t in t_time]
input_array = _xr.DataArray(t_data, coords=[t_time], dims=["time"])
result = input_array.resample(time="YE").reduce(rule.analyze_groups)
# expected results
expected_result_time = ["2000-12-31", "2001-12-31", "2002-12-31", "2003-12-31"]
expected_result_time = [_np.datetime64(t) for t in expected_result_time]
expected_result = _xr.DataArray(
expected_result_data, coords=[expected_result_time], dims=["time"]
)
assert _xr.testing.assert_equal(expected_result, result) is None
test_count_groups_function_3d(operation_type, expected_result_data)
Test if functional for multiple dimensions
Source code in tests/business/entities/rules/test_time_aggregation_rule_analyze_periods.py
@pytest.mark.parametrize(
"operation_type, expected_result_data",
[
(
"COUNT_PERIODS",
[
[[2, 2, 2, 2], [1, 2, 2, 2], [2, 1, 2, 2]],
[[2, 2, 2, 2], [1, 2, 2, 2], [2, 1, 2, 2]],
],
),
(
"MAX_DURATION_PERIODS",
[
[[2, 2, 3, 3], [1, 2, 3, 3], [2, 2, 3, 3]],
[[2, 2, 3, 3], [1, 2, 3, 3], [2, 2, 3, 3]],
],
),
(
"AVG_DURATION_PERIODS",
[
[[1.5, 1.5, 2, 2], [1, 1.5, 2, 2], [1.5, 2, 2, 2]],
[[1.5, 1.5, 2, 2], [1, 1.5, 2, 2], [1.5, 2, 2, 2]],
],
),
],
)
def test_count_groups_function_3d(operation_type, expected_result_data):
"""Test if functional for multiple dimensions"""
rule = TimeAggregationRule(
name="test",
input_variable_names=["foo"],
operation_type=TimeOperationType[operation_type],
)
t_data = [
[
[0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1],
[0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1],
[0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1],
],
[
[0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1],
[0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1],
[0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1],
],
]
t_time = [
"2000-01-01",
"2000-01-02",
"2000-01-03",
"2000-01-04",
"2000-01-05",
"2001-01-01",
"2001-01-02",
"2001-01-03",
"2001-01-04",
"2001-01-05",
"2002-01-01",
"2002-01-02",
"2002-01-03",
"2002-01-04",
"2002-01-05",
"2003-01-01",
"2003-01-02",
"2003-01-03",
"2003-01-04",
"2003-01-05",
]
t_time = [_np.datetime64(t) for t in t_time]
t_cells = [0, 1, 2]
t_cols = [0, 1]
input_array = _xr.DataArray(
t_data, coords=[t_cols, t_cells, t_time], dims=["cols", "cells", "time"]
)
result = input_array.resample(time="YE").reduce(rule.analyze_groups)
# expected results
expected_result_time = ["2000-12-31", "2001-12-31", "2002-12-31", "2003-12-31"]
expected_result_time = [_np.datetime64(t) for t in expected_result_time]
expected_result = _xr.DataArray(
expected_result_data,
coords=[t_cols, t_cells, expected_result_time],
dims=["cols", "cells", "time"],
)
assert _xr.testing.assert_equal(expected_result, result) is None
test_create_time_aggregation_rule_should_set_defaults()
Test creating a time aggregation rule with defaults
Source code in tests/business/entities/rules/test_time_aggregation_rule_analyze_periods.py
def test_create_time_aggregation_rule_should_set_defaults():
"""Test creating a time aggregation rule with defaults"""
# Arrange & Act
rule = TimeAggregationRule(
name="test",
input_variable_names=["foo"],
operation_type=TimeOperationType.COUNT_PERIODS,
)
# Assert
assert rule.name == "test"
assert rule.description == ""
assert isinstance(rule, TimeAggregationRule)
assert rule.settings.operation_type == TimeOperationType.COUNT_PERIODS
assert rule.settings.time_scale == "year"
assert rule.settings.time_scale_mapping == {"month": "ME", "year": "YE"}
test_execute_value_array_condition_time_monthly_count_periods()
Test the TimeAggregationRule to count periods per month
Source code in tests/business/entities/rules/test_time_aggregation_rule_analyze_periods.py
def test_execute_value_array_condition_time_monthly_count_periods():
"""Test the TimeAggregationRule to count periods per month"""
# create test set
logger = Mock(ILogger)
rule = TimeAggregationRule(
name="test",
input_variable_names=["foo"],
operation_type=TimeOperationType.COUNT_PERIODS,
)
rule.settings.time_scale = "month"
time_condition = rule.execute(value_array_monthly, logger)
result_data = [0, 1, 1]
result_array = _xr.DataArray(
result_data, coords=[result_time_monthly], dims=["time_month"]
)
# Assert
assert _xr.testing.assert_equal(time_condition, result_array) is None
test_execute_value_array_condition_time_yearly_count_periods()
Test the TimeAggregationRule to count periods per year
Source code in tests/business/entities/rules/test_time_aggregation_rule_analyze_periods.py
def test_execute_value_array_condition_time_yearly_count_periods():
"""Test the TimeAggregationRule to count periods per year"""
# create test set
logger = Mock(ILogger)
rule = TimeAggregationRule(
name="test",
input_variable_names=["dry"],
operation_type=TimeOperationType.COUNT_PERIODS,
)
rule.output_variable_name = "number_of_dry_periods"
assert isinstance(rule, TimeAggregationRule)
time_condition = rule.execute(test_array_yearly, logger)
result_array = _xr.DataArray(
result_data_yearly, coords=[result_time_yearly], dims=["time_year"]
)
# Assert
assert _xr.testing.assert_equal(time_condition, result_array) is None
test_validation_when_not_valid()
Test if the rule is validated properly
Source code in tests/business/entities/rules/test_time_aggregation_rule_analyze_periods.py
def test_validation_when_not_valid():
"""Test if the rule is validated properly"""
logger = Mock(ILogger)
rule = TimeAggregationRule(
name="test",
input_variable_names=["foo"],
operation_type=TimeOperationType.COUNT_PERIODS,
)
rule.settings.time_scale = "awhile"
valid = rule.validate(logger)
allowed_time_scales = rule.settings.time_scale_mapping.keys()
options = ",".join(allowed_time_scales)
logger.log_error.assert_called_with(
f"The provided time scale '{rule.settings.time_scale}' "
f"of rule '{rule.name}' is not supported.\n"
f"Please select one of the following types: "
f"{options}"
)
assert not valid
test_validation_when_valid()
Test if the rule is validated properly
Source code in tests/business/entities/rules/test_time_aggregation_rule_analyze_periods.py
def test_validation_when_valid():
"""Test if the rule is validated properly"""
logger = Mock(ILogger)
rule = TimeAggregationRule(
name="test",
input_variable_names=["foo"],
operation_type=TimeOperationType.COUNT_PERIODS,
)
rule.settings.time_scale = "month"
valid = rule.validate(logger)
assert valid
test_rule_based_model
Tests for RuleBasedModel class
test_create_rule_based_model_with_defaults()
Test that the default properties of a rule-based model are set when creating the model using the default constructor.
Source code in tests/business/entities/test_rule_based_model.py
def test_create_rule_based_model_with_defaults():
"""Test that the default properties of a rule-based model
are set when creating the model using the default constructor."""
# Arrange
rule = Mock(IRule)
dataset = Mock(IDatasetData)
# Act
model = RuleBasedModel([dataset], [rule])
# Assert
assert isinstance(model, RuleBasedModel)
assert model.name == "Rule-Based model"
assert rule in model.rules
assert dataset in model.input_datasets
assert model.status == ModelStatus.CREATED
test_error_executing_model_with_processor_none()
Tests the error thrown when the processor of a rule based model is None.
Source code in tests/business/entities/test_rule_based_model.py
def test_error_executing_model_with_processor_none():
"""
Tests the error thrown when the processor of a rule based model is None.
"""
# Arrange
dataset = _xr.Dataset()
dataset["test"] = _xr.DataArray([32, 94, 9])
rule: IRule = Mock(IRule)
logger = Mock(ILogger)
model = RuleBasedModel([dataset], [rule])
model._rule_processor = None
# Act
with pytest.raises(RuntimeError) as exc_info:
model.execute(logger)
exception_raised = exc_info.value
# Assert
expected_message = "Processor is not set, please initialize model."
assert exception_raised.args[0] == expected_message
test_error_initializing_rule_based_model()
Tests if the error message sent when initializing a rule based model fails.
Source code in tests/business/entities/test_rule_based_model.py
def test_error_initializing_rule_based_model():
"""Tests if the error message sent when initializing a rule based model fails."""
# Arrange
dataset = _xr.Dataset()
dataset["test"] = _xr.DataArray([32, 94, 9])
dataset["test"].attrs = {"cf_role": "mesh_topology"}
rule: IRule = Mock(IRule)
rule.input_variable_names = ["unknown_var"] # ["unknown_var"]
rule.name = "rule with unknown var"
model = RuleBasedModel([dataset], [rule])
logger = Mock(ILogger)
# Act
model.initialize(logger)
# Assert
logger.log_error.assert_called_with("Initialization failed.")
test_run_rule_based_model()
Test if the model can correctly run the given rules and add the calculated results.
(The docstring sketches a dependency diagram, visible in the source below: variable test feeds rules R1 and R2, whose outputs out1 and out2 are combined by rule R3 into out3.)
Source code in tests/business/entities/test_rule_based_model.py
def test_run_rule_based_model():
"""Test if the model can correctly run the given
rules and adds the calculated results"
+------+
test --| R1 |-- out1 --+
+------+ | +-----+
+--| |
| R3 |-- out3
+--| |
+------+ | +-----+
test --| R2 |-- out2 --+
+------+
"""
# Arrange
dataset = _xr.Dataset()
dataset["test"] = _xr.DataArray([32, 94, 9])
dataset["test"].attrs = {"cf_role": "mesh_topology"}
logger = Mock(ILogger)
rule1 = Mock(IArrayBasedRule, id="rule1")
rule2 = Mock(IArrayBasedRule, id="rule2")
rule3 = Mock(IMultiArrayBasedRule, id="rule3")
rule1.input_variable_names = ["test"]
rule2.input_variable_names = ["test"]
rule3.input_variable_names = ["out1", "out2"]
rule1.output_variable_name = "out1"
rule2.output_variable_name = "out2"
rule3.output_variable_name = "out3"
rule1.execute.return_value = _xr.DataArray([32, 94, 9])
rule2.execute.return_value = _xr.DataArray([32, 94, 9])
rule3.execute.return_value = _xr.DataArray([32, 94, 9])
model = RuleBasedModel([dataset], [rule1, rule2, rule3])
# Act
assert model.validate(logger)
model.initialize(logger)
model.execute(logger)
model.finalize(logger)
# Assert
assert "out1" in model.output_dataset.keys()
assert "out2" in model.output_dataset.keys()
assert "out3" in model.output_dataset.keys()
test_status_setter()
Test if status is correctly set for a model
Source code in tests/business/entities/test_rule_based_model.py
def test_status_setter():
"""Test if status is correctly set for a model"""
# Arrange
rule = Mock(IRule)
dataset = Mock(IDatasetData)
logger = Mock(ILogger)
# Act
model = RuleBasedModel([dataset], [rule], logger)
assert model.status == ModelStatus.CREATED
model.status = ModelStatus.EXECUTED
assert model.status == ModelStatus.EXECUTED
test_validation_of_rule_based_model()
Test if the model correctly validates for required parameters (datasets, rules)
Source code in tests/business/entities/test_rule_based_model.py
def test_validation_of_rule_based_model():
"""Test if the model correctly validates for required
parameters (datasets, rules)
"""
# Arrange
rule = Mock(IRule)
dataset = _xr.Dataset()
logger = Mock(ILogger)
dataset["test"] = _xr.DataArray([32, 94, 9])
dataset["test"].attrs = {"cf_role": "mesh_topology"}
rule.input_variable_names = ["input"]
rule.output_variable_name = "output"
mapping_usual = {"test": "input"}
model_usual = RuleBasedModel([dataset], [rule], mapping_usual)
map_to_itself = {"test": "test"}
model_map_to_itself = RuleBasedModel([dataset], [rule], map_to_itself)
map_non_existing_var = {"non_existing_var": "input"}
model_map_non_existing_var = RuleBasedModel([dataset], [rule], map_non_existing_var)
map_to_wrong_var = {"test": "incorrect_var"}
model_map_to_wrong_var = RuleBasedModel([dataset], [rule], map_to_wrong_var)
map_from_non_existing_var_to_wrong_var = {"non_existing_var": "incorrect_var"}
model_map_from_non_existing_var_to_wrong_var = RuleBasedModel(
[dataset], [rule], map_from_non_existing_var_to_wrong_var
)
model_no_rules_and_datasets = RuleBasedModel([], [])
model_no_rules = RuleBasedModel([dataset], [])
model_no_datasets_model = RuleBasedModel([], [rule])
# Act & Assert
assert model_usual.validate(logger)
assert not model_map_to_itself.validate(logger)
assert not model_map_non_existing_var.validate(logger)
assert not model_map_to_wrong_var.validate(logger)
assert not model_map_from_non_existing_var_to_wrong_var.validate(logger)
assert not model_no_rules_and_datasets.validate(logger)
assert not model_no_rules.validate(logger)
assert not model_no_datasets_model.validate(logger)
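The test above also exercises the optional third constructor argument of RuleBasedModel: a mapping that renames dataset variables to the names the rules expect. A minimal sketch of a valid mapping, mirroring the setup of the test (import paths assumed; the rule is a mock, as in the test):
from unittest.mock import Mock
import xarray as _xr
# Assumed import paths, mirroring tests/business/entities/:
from decoimpact.business.entities.rule_based_model import RuleBasedModel
from decoimpact.business.entities.rules.i_rule import IRule  # path assumed

logger = Mock()

dataset = _xr.Dataset()
dataset["chloride_raw"] = _xr.DataArray([32, 94, 9])          # hypothetical variable
dataset["chloride_raw"].attrs = {"cf_role": "mesh_topology"}  # marks the mesh, as in the test

rule = Mock(IRule)
rule.input_variable_names = ["chloride"]           # name expected by the rule
rule.output_variable_name = "habitat_suitability"

mapping = {"chloride_raw": "chloride"}             # dataset name -> rule input name
model = RuleBasedModel([dataset], [rule], mapping)
print(model.validate(logger))                      # -> True, like model_usual above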
test_validation_of_rule_based_model_rule_dependencies()
Test if the model correctly validates the given rules for dependencies
Source code in tests/business/entities/test_rule_based_model.py
def test_validation_of_rule_based_model_rule_dependencies():
"""Test if the model correctly validates the given
rules for dependencies"
"""
# Arrange
dataset = _xr.Dataset()
dataset["test"] = _xr.DataArray([32, 94, 9])
rule: IRule = Mock(IRule)
logger = Mock(ILogger)
rule.validate.return_value = True
model = RuleBasedModel([dataset], [rule])
# Act & Assert
assert model.validate(logger)
rule.validate.assert_called_once_with(logger)
test_rule_processor
Tests for RuleProcessor class
fixture_example_rule()
Initiation of StepFunctionRule to be reused in the following tests
Source code in tests/business/entities/test_rule_processor.py
@pytest.fixture(name="example_rule")
def fixture_example_rule():
"""Inititaion of StepFunctionRule to be reused in the following tests"""
return StepFunctionRule(
"rule_name",
"input_variable_name",
[0, 1, 2, 5, 10],
[10, 11, 12, 15, 20],
)
test_creating_rule_processor_without_input_datasets_should_throw_exception()
Tests if input datasets are correctly checked during creation of the processor.
Source code in tests/business/entities/test_rule_processor.py
def test_creating_rule_processor_without_input_datasets_should_throw_exception():
"""
Tests if input datasets are correctly checked during creation of the processor.
"""
# Arrange
rule = Mock(IRule)
# Act
with pytest.raises(ValueError) as exc_info:
RuleProcessor([rule], None)
exception_raised = exc_info.value
# Assert
expected_message = "No datasets defined."
assert exception_raised.args[0] == expected_message
test_creating_rule_processor_without_rules_should_throw_exception()
Tests if absence of rules is correctly checked during creation of the processor.
Source code in tests/business/entities/test_rule_processor.py
def test_creating_rule_processor_without_rules_should_throw_exception():
"""
Tests if absence of rules is correctly checked during creation of the processor.
"""
# Arrange
dataset = _xr.Dataset()
dataset["test"] = _xr.DataArray([32, 94, 9])
rules = []
# Act
with pytest.raises(ValueError) as exc_info:
RuleProcessor(rules, dataset)
exception_raised = exc_info.value
# Assert
expected_message = "No rules defined."
assert exception_raised.args[0] == expected_message
test_execute_rule_throws_error_for_unknown_input_variable()
Tests that trying to execute a rule with an unknown input variable throws an error with the expected error message.
Source code in tests/business/entities/test_rule_processor.py
def test_execute_rule_throws_error_for_unknown_input_variable():
"""Tests that trying to execute a rule with an unknown input variable
throws an error, and the error message."""
# Arrange
output_dataset = _xr.Dataset()
input_array = _xr.DataArray([32, 94, 9])
output_dataset["test"] = input_array
logger = Mock(ILogger)
rule = Mock(IRule)
rule.name = "test"
rule.input_variable_names = ["unexisting"]
rule.output_variable_name = "output"
processor = RuleProcessor([rule], output_dataset)
# Act
with pytest.raises(KeyError) as exc_info:
processor._execute_rule(rule, output_dataset, logger)
exception_raised = exc_info.value
# Assert
expected_message = (
f"Key {rule.input_variable_names[0]} was not found "
+ "in input datasets or in calculated output dataset."
)
assert exception_raised.args[0] == expected_message
test_initialization_for_different_rule_dependencies(indices_to_remove, expected_result)
Tests if the processor can initialize given the rule dependencies.
Source code in tests/business/entities/test_rule_processor.py
@pytest.mark.parametrize(
"indices_to_remove, expected_result",
[
[[0], False],
[[1], False],
[[2], False],
[[3], True],
[[2, 3], True],
[[1, 2, 3], True],
],
)
def test_initialization_for_different_rule_dependencies(
indices_to_remove: List[int], expected_result: bool
):
"""Tests if the processor can initialize given the rule dependencies."""
# Arrange
dataset = _xr.Dataset()
dataset["test"] = _xr.DataArray([32, 94, 9])
logger = Mock(ILogger)
rules = _create_test_rules()
processor = RuleProcessor(rules, dataset)
rules_to_remove = [rules[index] for index in indices_to_remove]
# remove rules
for rule in rules_to_remove:
rules.remove(rule)
# Act & Assert
assert expected_result == processor.initialize(logger)
test_initialization_given_rule_dependencies()
Tests if the processor can correctly initialize given the rule dependencies.
Source code in tests/business/entities/test_rule_processor.py
def test_initialization_given_rule_dependencies():
"""Tests if the processor can correctly initialize given
the rule dependencies.
"""
# Arrange
dataset = _xr.Dataset()
dataset["test"] = _xr.DataArray([32, 94, 9])
logger = Mock(ILogger)
rules = _create_test_rules()
processor = RuleProcessor(rules, dataset)
# Act & Assert
assert processor.initialize(logger)
test_process_rules_calls_array_based_rule_execute_correctly()
Tests that, while processing the rule, the execute method of an IArrayBasedRule is called with the right parameter.
Source code in tests/business/entities/test_rule_processor.py
def test_process_rules_calls_array_based_rule_execute_correctly():
"""Tests if during processing the rule its execute method of
an IArrayBasedRule is called with the right parameter."""
# Arrange
output_dataset = _xr.Dataset()
input_array = _xr.DataArray([32, 94, 9])
output_dataset["test"] = input_array
logger = Mock(ILogger)
rule = Mock(IArrayBasedRule)
rule.input_variable_names = ["test"]
rule.output_variable_name = "output"
rule.execute.return_value = _xr.DataArray([4, 3, 2])
processor = RuleProcessor([rule], output_dataset)
# Act
assert processor.initialize(logger)
processor.process_rules(output_dataset, logger)
# Assert
assert len(output_dataset) == 2
assert rule.output_variable_name in output_dataset.keys()
rule.execute.assert_called_once_with(ANY, logger)
# get first call, first argument
array: _xr.DataArray = rule.execute.call_args[0][0]
_xr.testing.assert_equal(array, input_array)
test_process_rules_calls_cell_based_rule_execute_correctly()
Tests that, while processing the rule, the execute method of an ICellBasedRule is called with the right parameter.
Source code in tests/business/entities/test_rule_processor.py
def test_process_rules_calls_cell_based_rule_execute_correctly():
"""Tests if during processing the rule its execute method of
an ICellBasedRule is called with the right parameter."""
# Arrange
dataset = _xr.Dataset()
input_array = _xr.DataArray(_np.array([[1, 2, 3], [4, 5, 6]], _np.int32))
dataset["test"] = input_array
logger = Mock(ILogger)
rule = Mock(ICellBasedRule)
rule.input_variable_names = ["test"]
rule.output_variable_name = "output"
# expected return value = 1; number of warnings (min and max) = 0 and 0
rule.execute.return_value = [1, [0, 0]]
processor = RuleProcessor([rule], dataset)
# Act
assert processor.initialize(logger)
processor.process_rules(dataset, logger)
# Assert
assert len(dataset) == 2
assert rule.output_variable_name in dataset.keys()
assert rule.execute.call_count == 6
test_process_rules_calls_multi_array_based_rule_execute_correctly()
Tests that, while processing the rule, the execute method of an IMultiArrayBasedRule is called with the right parameters.
Source code in tests/business/entities/test_rule_processor.py
def test_process_rules_calls_multi_array_based_rule_execute_correctly():
"""Tests if during processing the rule its execute method of
an IMultiArrayBasedRule is called with the right parameters."""
# Arrange
dataset = _xr.Dataset()
array1 = _xr.DataArray([32, 94, 9])
array2 = _xr.DataArray([7, 93, 6])
dataset["test1"] = array1
dataset["test2"] = array2
logger = Mock(ILogger)
rule = Mock(IMultiArrayBasedRule)
rule.input_variable_names = ["test1", "test2"]
rule.output_variable_name = "output"
rule.execute.return_value = _xr.DataArray([4, 3, 2])
processor = RuleProcessor([rule], dataset)
# Act
assert processor.initialize(logger)
processor.process_rules(dataset, logger)
# Assert
assert len(dataset) == 3
assert rule.output_variable_name in dataset.keys()
rule.execute.assert_called_once_with(ANY, logger)
# get first call, first argument
array_lookup: Dict[str, _xr.DataArray] = rule.execute.call_args[0][0]
_xr.testing.assert_equal(array_lookup["test1"], array1)
_xr.testing.assert_equal(array_lookup["test2"], array2)
test_process_rules_calls_multi_cell_based_fails_with_different_dims()
MultiCellBasedRule allows for values with fewer dimensions, but not with different dimensions.
Source code in tests/business/entities/test_rule_processor.py
def test_process_rules_calls_multi_cell_based_fails_with_different_dims():
"""MultiCellBasedRule allows for values with less dimensions, but not
with different dimensions."""
# Arrange
dataset = _xr.Dataset()
input_array1 = _xr.DataArray(
_np.array([1, 2], _np.int32),
dims=["x"],
coords={"x": [0, 1]},
)
input_array2 = _xr.DataArray(
_np.array([1, 2], _np.int32),
dims=["y"],
coords={"y": [0, 1]},
)
dataset["test1"] = input_array1
dataset["test2"] = input_array2
logger = Mock(ILogger)
rule = Mock(IMultiCellBasedRule)
rule.name = "test_rule"
rule.input_variable_names = ["test1", "test2"]
rule.output_variable_name = "output"
rule.execute.return_value = 1
processor = RuleProcessor([rule], dataset)
processor.initialize(logger)
# Act
with pytest.raises(NotImplementedError) as exc_info:
processor.process_rules(dataset, logger)
exception_raised = exc_info.value
# Assert
expected = f"Can not execute rule {rule.name} with variables with different \
dimensions. Variable test1 with dimensions:('x',) is \
different than test2 with dimensions:('y',)"
assert exception_raised.args[0] == expected
test_process_rules_calls_multi_cell_based_rule_execute_correctly()
Tests that, while processing the rule, the execute method of an IMultiCellBasedRule is called with the right parameter.
Source code in tests/business/entities/test_rule_processor.py
def test_process_rules_calls_multi_cell_based_rule_execute_correctly():
"""Tests if during processing the rule its execute method of
an IMultiCellBasedRule is called with the right parameter."""
# Arrange
dataset = _xr.Dataset()
input_array1 = _xr.DataArray(_np.array([[1, 2, 3], [4, 5, 6]], _np.int32))
input_array2 = _xr.DataArray(_np.array([[1, 2, 3], [4, 5, 6]], _np.int32))
dataset["test1"] = input_array1
dataset["test2"] = input_array2
logger = Mock(ILogger)
rule = Mock(IMultiCellBasedRule)
rule.input_variable_names = ["test1", "test2"]
rule.output_variable_name = "output"
rule.execute.return_value = 1
processor = RuleProcessor([rule], dataset)
# Act
assert processor.initialize(logger)
processor.process_rules(dataset, logger)
# Assert
assert len(dataset) == 3
assert rule.output_variable_name in dataset.keys()
assert rule.execute.call_count == 6
test_process_rules_calls_multi_cell_based_rule_special_cases(input_array1, input_array2, dims)
Some exceptional cases need to be tested for the multi_cell rule: 1. variables with different dimensions (1D vs 2D) 2. variables with different dimensions (1D vs 3D)
Source code in tests/business/entities/test_rule_processor.py
@pytest.mark.parametrize(
"input_array1, input_array2, dims",
[
(
_xr.DataArray(
_np.array([1, 2], _np.int32),
dims=["x"],
coords={"x": [0, 1]},
),
_xr.DataArray(
_np.array([[1, 2], [3, 4]], _np.int32),
dims=["x", "y"],
coords={"x": [0, 1], "y": [0, 1]},
),
{"x": 2, "y": 2},
),
(
_xr.DataArray(
_np.array([1, 2], _np.int32),
dims=["x"],
coords={"x": [0, 1]},
),
_xr.DataArray(
_np.array([[[1, 2], [3, 4]], [[1, 2], [3, 4]]], _np.int32),
dims=["x", "y", "z"],
coords={"x": [0, 1], "y": [0, 1], "z": [0, 1]},
),
{"x": 2, "y": 2, "z": 2},
),
],
)
def test_process_rules_calls_multi_cell_based_rule_special_cases(
input_array1, input_array2, dims
):
"""Some exceptional cases need to be tested for the multi_cell rule:
1. variables with different dimensions (1D vs 2D)
2. variables with different dimensions (1D vs 3D)"""
# Arrange
dataset = _xr.Dataset()
dataset["test1"] = input_array1
dataset["test2"] = input_array2
logger = Mock(ILogger)
rule = Mock(IMultiCellBasedRule)
rule.input_variable_names = ["test1", "test2"]
rule.output_variable_name = "output"
rule.execute.return_value = 1
processor = RuleProcessor([rule], dataset)
# Act
assert processor.initialize(logger)
output_dataset = processor.process_rules(dataset, logger)
# Assert
print(output_dataset.output, output_dataset.dims, output_dataset.dims == dims)
assert output_dataset.dims == dims
test_process_rules_copies_multi_coords_correctly()
Tests if during processing the coords are copied to the output array and there are no duplicates.
Source code in tests/business/entities/test_rule_processor.py
def test_process_rules_copies_multi_coords_correctly():
"""Tests if during processing the coords are copied to the output array
and there are no duplicates."""
# Arrange
output_dataset = _xr.Dataset()
output_dataset["test"] = _xr.DataArray([32, 94, 9])
logger = Mock(ILogger)
rule = Mock(IArrayBasedRule)
rule_2 = Mock(IArrayBasedRule)
result_array = _xr.DataArray([27, 45, 93])
result_array = result_array.assign_coords({"test": _xr.DataArray([2, 4, 5])})
result_array_2 = _xr.DataArray([1, 2, 93])
result_array_2 = result_array_2.assign_coords({"test": _xr.DataArray([2, 4, 5])})
rule.input_variable_names = ["test"]
rule.output_variable_name = "output"
rule.execute.return_value = result_array
rule_2.input_variable_names = ["test"]
rule_2.output_variable_name = "output_2"
rule_2.execute.return_value = result_array_2
processor = RuleProcessor([rule, rule_2], output_dataset)
# Act
assert processor.initialize(logger)
result_dataset = processor.process_rules(output_dataset, logger)
# Assert
assert "test" in result_dataset.coords
# compare coords at the level of variable
result_array_coords = result_array.coords["test"]
result_output_var_coords = result_dataset.output.coords["test"] # output variable
assert (result_output_var_coords == result_array_coords).all()
# compare coords at the level of dataset /
# check if the coordinates are correctly copied to the dataset
result_dataset_coords = result_dataset.coords["test"]
assert (result_output_var_coords == result_dataset_coords).all()
    # check that, when an extra rule carries coordinates, they are not copied twice
assert len(result_dataset.output.coords) == 1
test_process_rules_fails_for_uninitialized_processor()
Tests if an error is thrown if process_rules is called on the processor when it is not properly initialized.
Source code in tests/business/entities/test_rule_processor.py
def test_process_rules_fails_for_uninitialized_processor():
"""Tests if an error is thrown if process_rules is called on the processor
when it is not properly initialized."""
# Arrange
input_dataset = _xr.Dataset()
output_dataset = _xr.Dataset()
input_dataset["test"] = _xr.DataArray([32, 94, 9])
logger = Mock(ILogger)
rule = Mock(IRule)
processor = RuleProcessor([rule], input_dataset)
# Act
with pytest.raises(RuntimeError) as exc_info:
processor.process_rules(output_dataset, logger)
exception_raised = exc_info.value
# Assert
expected_message = "Processor is not properly initialized, please initialize."
assert exception_raised.args[0] == expected_message
test_process_rules_given_rule_dependencies()
Tests if the processor can correctly process_rules given the rule dependencies.
Source code in tests/business/entities/test_rule_processor.py
def test_process_rules_given_rule_dependencies():
"""Tests if the processor can correctly process_rules given
the rule dependencies.
"""
# Arrange
dataset = _xr.Dataset()
dataset["test"] = _xr.DataArray([32, 94, 9])
rule1 = Mock(IArrayBasedRule, id="rule1")
rule2 = Mock(IArrayBasedRule, id="rule2")
rule3 = Mock(IMultiArrayBasedRule, id="rule3")
logger = Mock(ILogger)
rule1.input_variable_names = ["test"]
rule2.input_variable_names = ["test"]
rule3.input_variable_names = ["out1", "out2"]
rule1.output_variable_name = "out1"
rule2.output_variable_name = "out2"
rule3.output_variable_name = "out3"
rule1.execute.return_value = _xr.DataArray([1, 2, 3])
rule2.execute.return_value = _xr.DataArray([4, 5, 6])
rule3.execute.return_value = _xr.DataArray([7, 8, 9])
rules: List[IRule] = [rule1, rule2, rule3]
processor = RuleProcessor(rules, dataset)
assert processor.initialize(logger)
# Act
processor.process_rules(dataset, logger)
# Assert
assert len(dataset) == 4
for rule in [rule1, rule2, rule3]:
rule.execute.assert_called_once_with(ANY, logger)
assert rule.output_variable_name in dataset.keys()
test_process_rules_throws_exception_for_array_based_rule_with_multiple_inputs()
Tests if an error is thrown during processing of an IArrayBasedRule if two inputs were defined.
Source code in tests/business/entities/test_rule_processor.py
def test_process_rules_throws_exception_for_array_based_rule_with_multiple_inputs():
"""Tests if an error is thrown during processing of an IArrayBasedRule
if two inputs were defined."""
# Arrange
output_dataset = _xr.Dataset()
output_dataset["test1"] = _xr.DataArray([32, 94, 9])
output_dataset["test2"] = _xr.DataArray([32, 94, 9])
logger = Mock(ILogger)
rule = Mock(IArrayBasedRule)
rule.input_variable_names = ["test1", "test2"]
rule.output_variable_name = "output"
processor = RuleProcessor([rule], output_dataset)
assert processor.initialize(logger)
# Act
with pytest.raises(NotImplementedError) as exc_info:
processor.process_rules(output_dataset, logger)
exception_raised = exc_info.value
# Assert
expected_message = "Array based rule only supports one input array."
assert exception_raised.args[0] == expected_message
test_process_rules_throws_exception_for_unsupported_rule()
Tests if an error is thrown when trying to execute a rule that is not supported.
Source code in tests/business/entities/test_rule_processor.py
def test_process_rules_throws_exception_for_unsupported_rule():
"""Tests if an error is thrown when trying to execute a rule that is
not supported."""
# Arrange
output_dataset = _xr.Dataset()
input_array = _xr.DataArray([32, 94, 9])
output_dataset["test"] = input_array
logger = Mock(ILogger)
rule = Mock(IRule)
rule.name = "test"
rule.input_variable_names = ["test"]
rule.output_variable_name = "output"
processor = RuleProcessor([rule], output_dataset)
assert processor.initialize(logger)
# Act
with pytest.raises(NotImplementedError) as exc_info:
processor.process_rules(output_dataset, logger)
exception_raised = exc_info.value
# Assert
expected_message = f"Can not execute rule {rule.name}."
assert exception_raised.args[0] == expected_message
test_process_values_outside_limits(example_rule, input_value, expected_output_value, expected_log_message)
Test the function execution with input values outside the interval limits.
Source code in tests/business/entities/test_rule_processor.py
@pytest.mark.parametrize(
"input_value, expected_output_value, expected_log_message",
[
(-1, (10, [1, 0]), "value less than min: 1 occurence(s)"),
(11, (20, [0, 1]), "value greater than max: 1 occurence(s)"),
],
)
def test_process_values_outside_limits(
example_rule,
input_value: int,
expected_output_value: int,
expected_log_message: str,
):
"""
Test the function execution with input values outside the interval limits.
"""
# Arrange
logger = Mock(ILogger)
dataset = _xr.Dataset()
dataset["test1"] = _xr.DataArray(input_value)
rule = Mock(ICellBasedRule)
rule.input_variable_names = ["test1"]
rule.output_variable_name = "output"
rule.execute.return_value = expected_output_value
processor = RuleProcessor([rule], dataset)
# Act
assert processor.initialize(logger)
processor.process_rules(dataset, logger)
# Assert
assert example_rule.execute(input_value, logger) == expected_output_value
processor.process_rules(dataset, logger)
logger.log_warning.assert_called_with(expected_log_message)
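The two parametrized cases pin down how the example StepFunctionRule behaves outside its interval limits; the sketch below only restates those assertions (the execute signature and the returned warning counts are taken from this test, not from the implementation):

```python
# Hedged sketch: execute(value, logger) -> (output_value, [n_below_min, n_above_max]).
value, warnings = example_rule.execute(-1, logger)
assert (value, warnings) == (10, [1, 0])   # below the first limit: clamped to the first response

value, warnings = example_rule.execute(11, logger)
assert (value, warnings) == (20, [0, 1])   # above the last limit: clamped to the last response
```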
test_application
Tests for Application class
test_running_application()
Test running application for test file
Source code in tests/business/test_application.py
def test_running_application():
"""Test running application for test file"""
# Arrange
logger = Mock(ILogger)
data_layer = Mock(IDataAccessLayer)
dataset = Mock(IDatasetData)
model: IModel = Mock(IModel)
model_builder = Mock(IModelBuilder)
model_data = Mock(IModelData)
model.name = "Test model"
model.partition = ""
model_builder.build_model.return_value = model
data_layer.read_input_file.return_value = model_data
data_layer.retrieve_file_names.return_value = {"": "Test.nc"}
model_data.version = [0, 0, 0]
model_data.datasets = [dataset]
model_data.output_path = "Result_test.nc"
application = Application(logger, data_layer, model_builder)
application.APPLICATION_VERSION = "0.0.0"
application.APPLICATION_VERSION_PARTS = [0, 0, 0]
# Act
application.run("Test.yaml")
# Assert
expected_message = 'Model "Test model" has successfully finished running'
logger.log_info.assert_called_with(expected_message)
model.validate.assert_called()
model.initialize.assert_called()
model.execute.assert_called()
model.finalize.assert_called()
utils
test_dataset_utils
Tests for utility functions regarding an xarray dataset
TestAddVariables
Source code in tests/business/utils/test_dataset_utils.py
class TestAddVariables:
def test_dataset_contains_variable_after_addition(self):
"""Tests if new dataset contains variable after addition."""
# Arrange
variable = _xr.DataArray()
variable_name = "test_variable"
dataset = _xr.Dataset()
# Act
utilities.add_variable(dataset, variable, variable_name)
# Assert
assert variable_name in dataset
def test_add_incorrect_variable_to_dataset_throws_exception(self):
"""Tests if add variable throws exception when variable to be
added is not an XArray array."""
# Arrange
variable = None
variable_name = "test_variable"
dataset = _xr.Dataset()
# Act
with pytest.raises(ValueError) as error:
utilities.add_variable(dataset, variable, variable_name)
# Assert
assert error.value.args[0] == "ERROR: Cannot add variable to dataset"
test_add_incorrect_variable_to_dataset_throws_exception(self)
Tests if add variable throws exception when variable to be added is not an XArray array.
Source code in tests/business/utils/test_dataset_utils.py
def test_add_incorrect_variable_to_dataset_throws_exception(self):
"""Tests if add variable throws exception when variable to be
added is not an XArray array."""
# Arrange
variable = None
variable_name = "test_variable"
dataset = _xr.Dataset()
# Act
with pytest.raises(ValueError) as error:
utilities.add_variable(dataset, variable, variable_name)
# Assert
assert error.value.args[0] == "ERROR: Cannot add variable to dataset"
test_dataset_contains_variable_after_addition(self)
Tests if new dataset contains variable after addition.
Source code in tests/business/utils/test_dataset_utils.py
def test_dataset_contains_variable_after_addition(self):
"""Tests if new dataset contains variable after addition."""
# Arrange
variable = _xr.DataArray()
variable_name = "test_variable"
dataset = _xr.Dataset()
# Act
utilities.add_variable(dataset, variable, variable_name)
# Assert
assert variable_name in dataset
TestGetDependentVarsByVarName
Source code in tests/business/utils/test_dataset_utils.py
class TestGetDependentVarsByVarName:
def test_get_dummy_variable(self):
"""Test if you receive the name of the dummy variable in a ugrid dataset"""
# Arrange
var_list = ["var1", "var2", "var3", "var4", "var5"]
ds = _xr.Dataset(data_vars=dict.fromkeys(var_list, None))
ds["var1"].attrs = {
"cf_role": "mesh_topology",
"test_coordinates": "var2 var3",
"test_dimension": "var4",
"testbounds": "var5",
}
# Act
dummy_variable = utilities.get_dummy_variable_in_ugrid(ds)
# Assert
assert dummy_variable == ["var1"]
def test_get_dummy_variable_if_none(self):
"""Test if you receive nothing if there is no dependent variables in a ugrid dataset"""
# Arrange
var_list = ["var1", "var2", "var3", "var4", "var5"]
ds = _xr.Dataset(data_vars=dict.fromkeys(var_list, None))
# Act
dummy_variable = utilities.get_dependent_vars_by_var_name(ds, "var1")
# Assert
assert sorted(dummy_variable) == sorted([])
test_get_dummy_variable(self)
Test if you receive the name of the dummy variable in a ugrid dataset
Source code in tests/business/utils/test_dataset_utils.py
def test_get_dummy_variable(self):
"""Test if you receive the name of the dummy variable in a ugrid dataset"""
# Arrange
var_list = ["var1", "var2", "var3", "var4", "var5"]
ds = _xr.Dataset(data_vars=dict.fromkeys(var_list, None))
ds["var1"].attrs = {
"cf_role": "mesh_topology",
"test_coordinates": "var2 var3",
"test_dimension": "var4",
"testbounds": "var5",
}
# Act
dummy_variable = utilities.get_dummy_variable_in_ugrid(ds)
# Assert
assert dummy_variable == ["var1"]
test_get_dummy_variable_if_none(self)
Test if you receive nothing if there are no dependent variables in a ugrid dataset
Source code in tests/business/utils/test_dataset_utils.py
def test_get_dummy_variable_if_none(self):
"""Test if you receive nothing if there is no dependent variables in a ugrid dataset"""
# Arrange
var_list = ["var1", "var2", "var3", "var4", "var5"]
ds = _xr.Dataset(data_vars=dict.fromkeys(var_list, None))
# Act
dummy_variable = utilities.get_dependent_vars_by_var_name(ds, "var1")
# Assert
assert sorted(dummy_variable) == sorted([])
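For context on what these tests call the "dummy" variable: it is the variable carrying the UGRID mesh topology, recognisable by its cf_role attribute, and its other attributes name the variables the mesh depends on. A minimal sketch of such a dataset, using the same illustrative attribute names as the fixtures above (not the full UGRID attribute set):

```python
import xarray as xr

# Illustrative only; attribute names mirror the test fixtures above.
ds = xr.Dataset(data_vars=dict.fromkeys(["mesh2d", "var2", "var3"], None))
ds["mesh2d"].attrs = {
    "cf_role": "mesh_topology",       # marks mesh2d as the dummy/topology variable
    "test_coordinates": "var2 var3",  # variables the mesh depends on
}
```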
TestGetDummyVariableInUgrid
Source code in tests/business/utils/test_dataset_utils.py
class TestGetDummyVariableInUgrid:
def test_get_dummy_variable(self):
"""Test if you receive the name of the dummy variable in a ugrid dataset"""
# Arrange
variable1 = "variable1"
variable2 = "variable2"
ds = _xr.Dataset(data_vars={"variable1": variable1, "variable2": variable2})
ds["variable1"].attrs = {"cf_role": "mesh_topology"}
# Act
dummy_variable = utilities.get_dummy_variable_in_ugrid(ds)
# Assert
assert dummy_variable == ["variable1"]
def test_get_dummy_variable_fails(self):
"""Test if you receive the name of the dummy variable in a ugrid dataset"""
# Arrange
variable1 = "variable1"
variable2 = "variable2"
ds = _xr.Dataset(data_vars={"variable1": variable1, "variable2": variable2})
# Act
with pytest.raises(ValueError) as error:
utilities.get_dummy_variable_in_ugrid(ds)
# Assert
assert (
error.value.args[0]
== "No dummy variable defined and therefore input dataset does "
"not comply with UGrid convention."
)
def test_get_dummy_and_dependent_var_list(self):
"""Test if you receive the name of the dummy and dependent variables variable
in a ugrid dataset"""
# Arrange
var_list = ["mesh2d", "var2", "var3", "var4", "var5"]
ds = _xr.Dataset(data_vars=dict.fromkeys(var_list, None))
ds["mesh2d"].attrs = {
"cf_role": "mesh_topology",
"test_coordinates": "var2 var3",
"test_dimension": "var4",
"testbounds": "var5",
}
# Act
dummy_var_name = utilities.get_dummy_variable_in_ugrid(ds)
var_list = utilities.get_dependent_var_list(ds, dummy_var_name)
# Assert
assert sorted(var_list) == sorted(["var2", "var5", "var3", "mesh2d"])
assert dummy_var_name == ["mesh2d"]
test_get_dummy_and_dependent_var_list(self)
Test if you receive the names of the dummy variable and its dependent variables in a ugrid dataset
Source code in tests/business/utils/test_dataset_utils.py
def test_get_dummy_and_dependent_var_list(self):
"""Test if you receive the name of the dummy and dependent variables variable
in a ugrid dataset"""
# Arrange
var_list = ["mesh2d", "var2", "var3", "var4", "var5"]
ds = _xr.Dataset(data_vars=dict.fromkeys(var_list, None))
ds["mesh2d"].attrs = {
"cf_role": "mesh_topology",
"test_coordinates": "var2 var3",
"test_dimension": "var4",
"testbounds": "var5",
}
# Act
dummy_var_name = utilities.get_dummy_variable_in_ugrid(ds)
var_list = utilities.get_dependent_var_list(ds, dummy_var_name)
# Assert
assert sorted(var_list) == sorted(["var2", "var5", "var3", "mesh2d"])
assert dummy_var_name == ["mesh2d"]
test_get_dummy_variable(self)
Test if you receive the name of the dummy variable in a ugrid dataset
Source code in tests/business/utils/test_dataset_utils.py
def test_get_dummy_variable(self):
"""Test if you receive the name of the dummy variable in a ugrid dataset"""
# Arrange
variable1 = "variable1"
variable2 = "variable2"
ds = _xr.Dataset(data_vars={"variable1": variable1, "variable2": variable2})
ds["variable1"].attrs = {"cf_role": "mesh_topology"}
# Act
dummy_variable = utilities.get_dummy_variable_in_ugrid(ds)
# Assert
assert dummy_variable == ["variable1"]
test_get_dummy_variable_fails(self)
Test that an error is raised when no dummy variable is present in a ugrid dataset
Source code in tests/business/utils/test_dataset_utils.py
def test_get_dummy_variable_fails(self):
"""Test if you receive the name of the dummy variable in a ugrid dataset"""
# Arrange
variable1 = "variable1"
variable2 = "variable2"
ds = _xr.Dataset(data_vars={"variable1": variable1, "variable2": variable2})
# Act
with pytest.raises(ValueError) as error:
utilities.get_dummy_variable_in_ugrid(ds)
# Assert
assert (
error.value.args[0]
== "No dummy variable defined and therefore input dataset does "
"not comply with UGrid convention."
)
TestMergeDatasets
Source code in tests/business/utils/test_dataset_utils.py
class TestMergeDatasets:
def test_merged_dataset_is_xarray_dataset_and_contains_all_variables(self):
"""Tests if merged dataset returns an XArray dataset and contains all variables."""
# Arrange
variable1 = "variable1"
variable2 = "variable2"
dataset1 = _xr.Dataset(
data_vars={"variable1": variable1, "variable2": variable2}
)
variable3 = "variable3"
variable4 = "variable4"
dataset2 = _xr.Dataset(
data_vars={"variable3": variable3, "variable4": variable4}
)
# Act
merged_dataset = utilities.merge_datasets(dataset1, dataset2)
# Assert
assert isinstance(merged_dataset, _xr.Dataset)
assert variable1 in merged_dataset
assert variable2 in merged_dataset
assert variable3 in merged_dataset
assert variable4 in merged_dataset
def test_merged_list_of_datasets_is_xarray_dataset_and_contains_all_variables(self):
"""Tests if merged dataset returns an XArray dataset and contains all variables."""
# Arrange
variable1 = "variable1"
variable2 = "variable2"
dataset1 = _xr.Dataset(
data_vars={"variable1": variable1, "variable2": variable2}
)
variable3 = "variable3"
variable4 = "variable4"
dataset2 = _xr.Dataset(
data_vars={"variable3": variable3, "variable4": variable4}
)
variable5 = "variable5"
variable6 = "variable6"
dataset3 = _xr.Dataset(
data_vars={"variable5": variable5, "variable6": variable6}
)
list_datasets = [dataset1, dataset2, dataset3]
# Act
merged_dataset = utilities.merge_list_of_datasets(list_datasets)
# Assert
assert isinstance(merged_dataset, _xr.Dataset)
assert variable1 in merged_dataset
assert variable2 in merged_dataset
assert variable3 in merged_dataset
assert variable4 in merged_dataset
assert variable5 in merged_dataset
assert variable6 in merged_dataset
test_merged_dataset_is_xarray_dataset_and_contains_all_variables(self)
Tests if merged dataset returns an XArray dataset and contains all variables.
Source code in tests/business/utils/test_dataset_utils.py
def test_merged_dataset_is_xarray_dataset_and_contains_all_variables(self):
"""Tests if merged dataset returns an XArray dataset and contains all variables."""
# Arrange
variable1 = "variable1"
variable2 = "variable2"
dataset1 = _xr.Dataset(
data_vars={"variable1": variable1, "variable2": variable2}
)
variable3 = "variable3"
variable4 = "variable4"
dataset2 = _xr.Dataset(
data_vars={"variable3": variable3, "variable4": variable4}
)
# Act
merged_dataset = utilities.merge_datasets(dataset1, dataset2)
# Assert
assert isinstance(merged_dataset, _xr.Dataset)
assert variable1 in merged_dataset
assert variable2 in merged_dataset
assert variable3 in merged_dataset
assert variable4 in merged_dataset
test_merged_list_of_datasets_is_xarray_dataset_and_contains_all_variables(self)
Tests if merged dataset returns an XArray dataset and contains all variables.
Source code in tests/business/utils/test_dataset_utils.py
def test_merged_list_of_datasets_is_xarray_dataset_and_contains_all_variables(self):
"""Tests if merged dataset returns an XArray dataset and contains all variables."""
# Arrange
variable1 = "variable1"
variable2 = "variable2"
dataset1 = _xr.Dataset(
data_vars={"variable1": variable1, "variable2": variable2}
)
variable3 = "variable3"
variable4 = "variable4"
dataset2 = _xr.Dataset(
data_vars={"variable3": variable3, "variable4": variable4}
)
variable5 = "variable5"
variable6 = "variable6"
dataset3 = _xr.Dataset(
data_vars={"variable5": variable5, "variable6": variable6}
)
list_datasets = [dataset1, dataset2, dataset3]
# Act
merged_dataset = utilities.merge_list_of_datasets(list_datasets)
# Assert
assert isinstance(merged_dataset, _xr.Dataset)
assert variable1 in merged_dataset
assert variable2 in merged_dataset
assert variable3 in merged_dataset
assert variable4 in merged_dataset
assert variable5 in merged_dataset
assert variable6 in merged_dataset
TestRemoveVariables
Source code in tests/business/utils/test_dataset_utils.py
class TestRemoveVariables:
def test_remove_variable_and_keeps_others(self):
"""Tests if remove variable from dataset removes the desired variable, and
keeps the other variables."""
# Arrange
variable1 = "variable1"
variable2 = "variable2"
variable3 = "variable3"
dataset = _xr.Dataset(
data_vars={
"variable1": variable1,
"variable2": variable2,
"variable3": variable3,
}
)
variable_to_be_removed = [variable2]
# Act
dataset = utilities.remove_variables(dataset, variable_to_be_removed)
# Assert
assert variable1 in dataset
assert variable3 in dataset
assert variable2 not in dataset
def test_leave_only_one_variable(self):
"""Tests if remove all variables except 1 variable removes all variables, and
keeps the desired variable."""
# Arrange
variable1 = "variable1"
variable2 = "variable2"
variable3 = "variable3"
variable4 = "variable4"
dataset = _xr.Dataset(
data_vars={
"variable1": variable1,
"variable2": variable2,
"variable3": variable3,
"variable4": variable4,
}
)
dataset["variable2"].attrs = {"cf_role": "mesh_topology"}
variable_to_keep = [variable2]
# Act
dataset = utilities.remove_all_variables_except(dataset, variable_to_keep)
# Assert
assert variable1 not in dataset
assert variable2 in dataset
assert variable3 not in dataset
assert variable4 not in dataset
def test_reduce_for_writing_throws_exception_for_non_existing_variable(self):
"""Tests if reduce dataset for writing throws error when save_only_variable is
not present in dataset."""
# Arrange
variable1 = "variable1"
variable2 = "variable2"
variable3 = "variable3"
variable4 = "variable4"
dataset = _xr.Dataset(
data_vars={
"variable1": variable1,
"variable2": variable2,
"variable3": variable3,
"variable4": variable4,
}
)
dataset["variable2"].attrs = {"cf_role": "mesh_topology"}
logger = Mock(ILogger)
variable_to_keep = ["non_existing_variable"]
# Assert
with pytest.raises(OSError) as error:
utilities.reduce_dataset_for_writing(dataset, variable_to_keep, logger)
# Assert
assert (
error.value.args[0]
== "ERROR: variable non_existing_variable is not present in dataset"
)
def test_leave_multiple_variables(self):
"""Tests if remove all variables except multiple variable removes all
variables, and keeps the desired variables."""
# Arrange
variable1 = "variable1"
variable2 = "variable2"
variable3 = "variable3"
variable4 = "variable4"
dataset = _xr.Dataset(
data_vars={
"variable1": variable1,
"variable2": variable2,
"variable3": variable3,
"variable4": variable4,
}
)
dataset["variable2"].attrs = {"cf_role": "mesh_topology"}
variables_to_keep = [variable2, variable4]
# Act
dataset = utilities.remove_all_variables_except(dataset, variables_to_keep)
# Assert
assert variable1 not in dataset
assert variable2 in dataset
assert variable3 not in dataset
assert variable4 in dataset
def test_throws_exception_if_dataset_does_not_contain_variable(
self,
):
"""Tests if remove variable throws exception when variable is not present
in dataset."""
# Arrange
variable_name = "test_variable"
list_variables = [variable_name]
dataset = _xr.Dataset()
# Act
with pytest.raises(ValueError) as error:
utilities.remove_variables(dataset, list_variables)
# Assert
assert (
error.value.args[0]
== f"ERROR: Cannot remove {list_variables} from dataset."
)
test_leave_multiple_variables(self)
Tests if remove all variables except multiple variables removes all other variables, and keeps the desired variables.
Source code in tests/business/utils/test_dataset_utils.py
def test_leave_multiple_variables(self):
"""Tests if remove all variables except multiple variable removes all
variables, and keeps the desired variables."""
# Arrange
variable1 = "variable1"
variable2 = "variable2"
variable3 = "variable3"
variable4 = "variable4"
dataset = _xr.Dataset(
data_vars={
"variable1": variable1,
"variable2": variable2,
"variable3": variable3,
"variable4": variable4,
}
)
dataset["variable2"].attrs = {"cf_role": "mesh_topology"}
variables_to_keep = [variable2, variable4]
# Act
dataset = utilities.remove_all_variables_except(dataset, variables_to_keep)
# Assert
assert variable1 not in dataset
assert variable2 in dataset
assert variable3 not in dataset
assert variable4 in dataset
test_leave_only_one_variable(self)
Tests if remove all variables except 1 variable removes all other variables, and keeps the desired variable.
Source code in tests/business/utils/test_dataset_utils.py
def test_leave_only_one_variable(self):
"""Tests if remove all variables except 1 variable removes all variables, and
keeps the desired variable."""
# Arrange
variable1 = "variable1"
variable2 = "variable2"
variable3 = "variable3"
variable4 = "variable4"
dataset = _xr.Dataset(
data_vars={
"variable1": variable1,
"variable2": variable2,
"variable3": variable3,
"variable4": variable4,
}
)
dataset["variable2"].attrs = {"cf_role": "mesh_topology"}
variable_to_keep = [variable2]
# Act
dataset = utilities.remove_all_variables_except(dataset, variable_to_keep)
# Assert
assert variable1 not in dataset
assert variable2 in dataset
assert variable3 not in dataset
assert variable4 not in dataset
test_reduce_for_writing_throws_exception_for_non_existing_variable(self)
Tests if reduce dataset for writing throws error when save_only_variable is not present in dataset.
Source code in tests/business/utils/test_dataset_utils.py
def test_reduce_for_writing_throws_exception_for_non_existing_variable(self):
"""Tests if reduce dataset for writing throws error when save_only_variable is
not present in dataset."""
# Arrange
variable1 = "variable1"
variable2 = "variable2"
variable3 = "variable3"
variable4 = "variable4"
dataset = _xr.Dataset(
data_vars={
"variable1": variable1,
"variable2": variable2,
"variable3": variable3,
"variable4": variable4,
}
)
dataset["variable2"].attrs = {"cf_role": "mesh_topology"}
logger = Mock(ILogger)
variable_to_keep = ["non_existing_variable"]
# Assert
with pytest.raises(OSError) as error:
utilities.reduce_dataset_for_writing(dataset, variable_to_keep, logger)
# Assert
assert (
error.value.args[0]
== "ERROR: variable non_existing_variable is not present in dataset"
)
test_remove_variable_and_keeps_others(self)
Tests if remove variable from dataset removes the desired variable, and keeps the other variables.
Source code in tests/business/utils/test_dataset_utils.py
def test_remove_variable_and_keeps_others(self):
"""Tests if remove variable from dataset removes the desired variable, and
keeps the other variables."""
# Arrange
variable1 = "variable1"
variable2 = "variable2"
variable3 = "variable3"
dataset = _xr.Dataset(
data_vars={
"variable1": variable1,
"variable2": variable2,
"variable3": variable3,
}
)
variable_to_be_removed = [variable2]
# Act
dataset = utilities.remove_variables(dataset, variable_to_be_removed)
# Assert
assert variable1 in dataset
assert variable3 in dataset
assert variable2 not in dataset
test_throws_exception_if_dataset_does_not_contain_variable(self)
Tests if remove variable throws exception when variable is not present in dataset.
Source code in tests/business/utils/test_dataset_utils.py
def test_throws_exception_if_dataset_does_not_contain_variable(
self,
):
"""Tests if remove variable throws exception when variable is not present
in dataset."""
# Arrange
variable_name = "test_variable"
list_variables = [variable_name]
dataset = _xr.Dataset()
# Act
with pytest.raises(ValueError) as error:
utilities.remove_variables(dataset, list_variables)
# Assert
assert (
error.value.args[0]
== f"ERROR: Cannot remove {list_variables} from dataset."
)
test_copy_dataset_return_xarray_dataset()
Tests if copy dataset returns an XArray dataset.
Source code in tests/business/utils/test_dataset_utils.py
def test_copy_dataset_return_xarray_dataset():
"""Tests if copy dataset returns an XArray dataset."""
# Arrange
dataset1 = _xr.Dataset()
# Act
dataset2 = utilities.copy_dataset(dataset1)
# Assert
assert isinstance(dataset2, _xr.Dataset)
test_list_variables_in_dataset()
Tests if list dataset returns a list containing all variables.
Source code in tests/business/utils/test_dataset_utils.py
def test_list_variables_in_dataset():
"""Tests if list dataset returns a list containing all variables."""
# Arrange
variable1 = "variable1"
variable2 = "variable2"
variable3 = "variable3"
dataset = _xr.Dataset(
data_vars={
"variable1": variable1,
"variable2": variable2,
"variable3": variable3,
}
)
# Act
list_vars = utilities.list_vars(dataset)
# Assert
assert list_vars == [variable1, variable2, variable3]
test_rename_variable_returns_dataset_without_old_variable_and_with_new_variable()
Tests if rename variable returns a dataset without the old variable and with the new variable.
Source code in tests/business/utils/test_dataset_utils.py
def test_rename_variable_returns_dataset_without_old_variable_and_with_new_variable():
"""Tests if copy dataset returns an XArray dataset."""
# Arrange
variable1 = "variable1"
variable2 = "variable2"
variable3 = "variable3"
new_name = "new_name"
dataset1 = _xr.Dataset(
data_vars={
"variable1": variable1,
"variable2": variable2,
"variable3": variable3,
}
)
# Act
dataset2 = utilities.rename_variable(dataset1, "variable1", new_name)
# Assert
assert isinstance(dataset2, _xr.Dataset)
assert new_name in dataset2
assert "variable2" in dataset2
assert "variable1" not in dataset2
test_list_utils
Tests for utility functions regarding lists
test_flatten_list_returns_flat_list()
Test if flatten_list returns a flat list
Source code in tests/business/utils/test_list_utils.py
def test_flatten_list_returns_flat_list():
"""Test if flatten_list returns a flat list"""
# Arrange
mylist = ["a", "b", ["c"], "d"]
# Act
myflatlist = utilities.flatten_list(mylist)
# Assert
assert myflatlist == ["a", "b", "c", "d"]
test_remove_duplicates_from_list()
Test if remove_duplicates_from_list returns a list without duplicates
Source code in tests/business/utils/test_list_utils.py
def test_remove_duplicates_from_list():
"""Test if remove_duplicates_from_list returns a list without duplicates"""
# Arrange
mylist = ["a", "b", "c", "a", "b", "d", "e"]
# Act
myflatlist = utilities.flatten_list(utilities.remove_duplicates_from_list(mylist))
myflatlist.sort()
# Assert
assert myflatlist == ["a", "b", "c", "d", "e"]
test_version_utils
Tests for utility functions regarding version number
test_read_version_number_returns_string_matching_format()
Test read_version_number returns a string, corresponding to the major.minor.patch form.
Source code in tests/business/utils/test_version_utils.py
def test_read_version_number_returns_string_matching_format():
"""Test read_version_number returns a string, corresponding to
the major.minor.patch form."""
# Arrange
# Define the pattern to match the desired format
pattern = r"^\d+\.\d+\.\d+$"
# Act
version_string = utilities.read_version_number()
# Assert
assert isinstance(version_string, str)
assert len(version_string) > 0
assert re.match(pattern, version_string)
workflow
test_model_builder
Tests for ModelBuilder class
test_create_multiply_rule_based_model()
Test creating a multiply-rule-based model via factory
Source code in tests/business/workflow/test_model_builder.py
def test_create_multiply_rule_based_model():
"""Test creating a multiply-rule-based model via factory"""
test_create_rule_based_model()
Test creating a rule-based model via builder
Source code in tests/business/workflow/test_model_builder.py
def test_create_rule_based_model():
"""Test creating a rule-based model via builder"""
# Arrange
logger = Mock(ILogger)
model_data = Mock(IModelData)
dataset = Mock()
dataset_data = Mock(IDatasetData)
da_layer = Mock(IDataAccessLayer)
multiply_rule_data = MultiplyRuleData("abc", [[2.0, 5.86]], "a")
multiply_rule_data.output_variable = "b"
step_function_rule_data = StepFunctionRuleData(
"step_function_name", [0.0, 20.0, 100.0], [1.0, 2.0, 3.0], "input_name"
)
step_function_rule_data.description = "descript_step_func_rule"
step_function_rule_data.output_variable = "output_step_func_name"
rule_data_layer_filter_rule = LayerFilterRuleData("lfrname", 2, "var1")
rule_data_layer_filter_rule.output_variable = "output_name"
time_aggregation_rule = TimeAggregationRuleData(
"taname", TimeOperationType.MIN, "var1"
)
time_aggregation_rule.time_scale = "Month"
time_aggregation_rule.output_variable = "output"
combine_results_rule_data = CombineResultsRuleData(
"test_rule_name", ["foo", "hello"], "MULTIPLY"
)
combine_results_rule_data.output_variable = "output"
formula_rule_data = FormulaRuleData("test_rule_name", ["foo", "bar"], "foo + bar")
formula_rule_data.output_variable = "output"
model_data.name = "Test model"
model_data.datasets = [dataset_data]
model_data.rules = [
multiply_rule_data,
step_function_rule_data,
combine_results_rule_data,
rule_data_layer_filter_rule,
time_aggregation_rule,
formula_rule_data,
]
model_data.partition = ""
da_layer.read_input_dataset.return_value = dataset
model_builder = ModelBuilder(da_layer, logger)
# Act
model = model_builder.build_model(model_data)
# Assert
assert isinstance(model, RuleBasedModel)
assert model.name == "Test model"
assert dataset in model.input_datasets
assert len(model.rules) == 6
# logs info about model creation
logger.log_info.assert_called_once()
test_create_rule_based_model_with_non_supported_rule()
Test creating a rule-based model with a rule that is not supported/recognized by the builder. This should throw an exception
Source code in tests/business/workflow/test_model_builder.py
def test_create_rule_based_model_with_non_supported_rule():
"""Test creating a rule-based model with a rule that is
not supported/recognized by the builder.
This should throw an exception"""
# Arrange
logger = Mock(ILogger)
model_data = Mock(IModelData)
dataset_data = Mock(IDatasetData)
da_layer = Mock(IDataAccessLayer)
rules_data = Mock(IRuleData)
rules_data.name = "test"
model_data.name = "Test model"
model_data.datasets = [dataset_data]
model_data.rules = [rules_data]
model_builder = ModelBuilder(da_layer, logger)
# Act & Assert
with pytest.raises(NotImplementedError) as exc_info:
model_builder.build_model(model_data)
exception_raised = exc_info.value
# Assert
expected_message = "The rule type of rule 'test' is currently not implemented"
assert exception_raised.args[0] == expected_message
test_model_runner
Tests for ModelRunner class
test_run_model_with_invalid_model_should_fail()
Test that model runner puts an invalid model (a model that fails the validate method) into the Failed model state during run_model
Source code in tests/business/workflow/test_model_runner.py
def test_run_model_with_invalid_model_should_fail():
"""Test that model runner puts an invalid model (a model that
fails the validate method) into the Failed model state during run_model"""
# Arrange
logger = Mock()
model = Mock()
model.validate.return_value = False
# Act
success = ModelRunner.run_model(model, logger)
# Assert
assert success is False
assert model.status == ModelStatus.FAILED
test_run_model_with_model_throwing_exception_should_fail(method)
Test that model runner puts the model into the Failed state if an error occurred during the execution of the provided method
Source code in tests/business/workflow/test_model_runner.py
@pytest.mark.parametrize(
"method",
[
"initialize",
"execute",
"finalize",
],
)
def test_run_model_with_model_throwing_exception_should_fail(method: str):
"""Test that model runner puts the model into the Failed state
if an error occurred during the execution of the provided method"""
# Arrange
logger = Mock()
model = Mock()
method = getattr(model, method)
method.side_effect = RuntimeError()
model.validate.return_value = True
# Act
success = ModelRunner.run_model(model, logger)
# Assert
assert success is False
assert model.status == ModelStatus.FAILED
test_run_model_with_valid_model_should_pass_all_model_stages()
Test a valid model goes through all the ModelStatus states (except for Failed state) during a run
Source code in tests/business/workflow/test_model_runner.py
def test_run_model_with_valid_model_should_pass_all_model_stages():
"""Test a valid model goes through all the ModelStatus states
(except for Failed state) during a run"""
# Arrange
model = Mock()
logger = Mock()
# Act
success = ModelRunner.run_model(model, logger)
# Assert
assert success
assert model.status == ModelStatus.FINALIZED
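Taken together, these three tests imply the lifecycle ModelRunner.run_model is expected to drive: validate first, then initialize, execute and finalize, with the model ending in the FAILED state on a validation failure or an exception and in FINALIZED otherwise. A rough sketch of that flow, inferred from the assertions only (argument passing and exception handling are assumptions, not the actual implementation):

```python
# Hedged sketch of the lifecycle implied by the tests above.
def run_model_sketch(model, logger) -> bool:
    if not model.validate(logger):
        model.status = ModelStatus.FAILED
        return False
    try:
        model.initialize()
        model.execute()
        model.finalize()
    except Exception as error:          # assumption: any error fails the run
        logger.log_error(str(error))
        model.status = ModelStatus.FAILED
        return False
    model.status = ModelStatus.FINALIZED
    return True
```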
crosscutting
test_logger_factory
Tests for LoggerFactory class
test_create_default_logger_using_factory()
Test creating the default logger
Source code in tests/crosscutting/test_logger_factory.py
def test_create_default_logger_using_factory():
"""Test creating the default logger"""
# Arrange & Act
logger = LoggerFactory.create_logger()
# Assert
# implements base class
assert isinstance(logger, ILogger)
# currently expected default logger
assert isinstance(logger, LoggingLogger)
test_logging_logger
Tests for LoggingLogger class
test_log_message_is_passed_on_to_logger(method_name, level, caplog)
Test format of messages logged by LoggingLogger
Source code in tests/crosscutting/test_logging_logger.py
@pytest.mark.parametrize(
"method_name, level",
[
("log_debug", "DEBUG"),
("log_info", "INFO"),
("log_warning", "WARNING"),
("log_error", "ERROR"),
],
)
def test_log_message_is_passed_on_to_logger(
method_name: str, level: str, caplog: LogCaptureFixture
):
"""Test format of messages logged by LoggingLogger"""
# Arrange
logger = LoggingLogger()
message = "test message"
# Act
log_method = getattr(logger, method_name)
log_method(message)
# Assert
record = find_log_message_by_level(caplog, level)
assert record.message == message
data
entities
test_axis_filter_rule_data
Tests for AxisFilterRuleData class
test_axis_filter_rule_data_creation_logic()
The AxisFilterRuleData should parse the provided dictionary to correctly initialize itself during creation
Source code in tests/data/entities/test_axis_filter_rule_data.py
def test_axis_filter_rule_data_creation_logic():
"""The AxisFilterRuleData should parse the provided dictionary
to correctly initialize itself during creation"""
# Act
data = AxisFilterRuleData("test_name", 3, "axis_name", "input")
data.description = "description"
# Assert
assert isinstance(data, IRuleData)
assert data.input_variable == "input"
assert data.element_index == 3
assert data.axis_name == "axis_name"
test_classification_rule_data
Tests for ClassificationRuleData class
test_classification_rule_data_creation_logic()
The ClassificationRuleData should parse the provided dictionary to correctly initialize itself during creation
Source code in tests/data/entities/test_classification_rule_data.py
def test_classification_rule_data_creation_logic():
"""The ClassificationRuleData should parse the provided dictionary
to correctly initialize itself during creation"""
# Arrange
test_table = {"a": [1], "output": [2]}
# Act
data = ClassificationRuleData("test_name", ["foo", "bar"], test_table)
data.description = "description"
# Assert
assert isinstance(data, IRuleData)
assert data.criteria_table == test_table
assert data.input_variable_names == ["foo", "bar"]
test_combine_results_data
Tests for CombineResultsRuleData class
test_combine_results_rule_data_creation_logic()
The CombineResultsRuleData should parse the provided dictionary to correctly initialize itself during creation
Source code in tests/data/entities/test_combine_results_data.py
def test_combine_results_rule_data_creation_logic():
"""The CombineResultsRuleData should parse the provided dictionary
to correctly initialize itself during creation"""
# Act
data = CombineResultsRuleData("test_name", ["input1", "input2"], "MULTIPLY")
# Assert
assert isinstance(data, IRuleData)
assert data.input_variable_names == ["input1", "input2"]
assert data.operation_type == "MULTIPLY"
test_data_access_layer
Tests for DataAccessLayer class
test_data_access_layer_apply_time_filter()
The DataAccessLayer should apply a given time filter
Source code in tests/data/entities/test_data_access_layer.py
def test_data_access_layer_apply_time_filter():
"""The DataAccessLayer should apply a given time filter"""
# Arrange
logger = Mock(ILogger)
path = get_test_data_path() + "/test_time_filter.nc"
data_dict = {
"filename": path,
"start_date": "01-07-2014",
"end_date": "31-08-2014",
"variable_mapping": {"water_depth_m": "water_depth"},
}
input_data = DatasetData(data_dict)
date_format = "%d-%m-%Y"
start_date_expected = datetime.strptime("02-07-2014", date_format)
end_date_expected = datetime.strptime("31-08-2014", date_format)
# Act
da_layer = DataAccessLayer(logger)
ds_result = da_layer.read_input_dataset(input_data)
ds_result_date = ds_result["time"].indexes["time"].normalize()
min_date_result = ds_result_date.min()
max_date_result = ds_result_date.max()
# Assert
# test if result is time filtered for both start and end date
assert min_date_result == start_date_expected
assert max_date_result == end_date_expected
test_data_access_layer_provides_yaml_model_data_for_yaml_file()
The DataAccessLayer should provide a YamlModelData for a yaml file
Source code in tests/data/entities/test_data_access_layer.py
def test_data_access_layer_provides_yaml_model_data_for_yaml_file():
"""The DataAccessLayer should provide a YamlModelData
for a yaml file"""
# Arrange
logger = LoggerFactory.create_logger()
path = Path(get_test_data_path() + "/test.yaml")
# Act
da_layer = DataAccessLayer(logger)
model_data = da_layer.read_input_file(path)
# Assert
# implements interface
assert isinstance(model_data, IModelData)
assert isinstance(model_data, YamlModelData)
assert model_data.name == "Model 1"
assert len(model_data.datasets) == 1
first_dataset = model_data.datasets[0]
assert str(first_dataset.path).endswith("FM-VZM_0000_map.nc")
assert "mesh2d_sa1" in first_dataset.mapping
assert "mesh2d_s1" in first_dataset.mapping
assert first_dataset.mapping["mesh2d_sa1"] == "mesh2d_sa1"
assert first_dataset.mapping["mesh2d_s1"] == "water_level"
test_data_access_layer_read_input_file_throws_exception_for_invalid_path()
The DataAccessLayer should throw a FileNotFoundError if the provided path for a yaml file does not exist
Source code in tests/data/entities/test_data_access_layer.py
def test_data_access_layer_read_input_file_throws_exception_for_invalid_path():
"""The DataAccessLayer should throw a FileNotFoundError
if the provided path for a yaml file does not exists"""
# Arrange
logger = Mock(ILogger)
path = Path("test_invalid_path.yaml")
da_layer = DataAccessLayer(logger)
# Act
with pytest.raises(FileExistsError) as exc_info:
da_layer.read_input_file(path)
exception_raised = exc_info.value
# Assert
assert isinstance(exception_raised, FileExistsError)
expected_message = "ERROR: The input file test_invalid_path.yaml does not exist."
assert exception_raised.args[0] == expected_message
test_dataset_data_get_input_dataset_should_check_if_extension_is_correct()
When calling get_input_dataset the provided path needs to be checked for a supported file extension
Source code in tests/data/entities/test_data_access_layer.py
def test_dataset_data_get_input_dataset_should_check_if_extension_is_correct():
"""When calling get_input_dataset the provided path
needs to be checked if it exists"""
# Arrange
logger = Mock(ILogger)
path = get_test_data_path() + "/NonUgridFile.txt"
data_dict = {
"filename": path,
"outputfilename": "output.txt",
"variable_mapping": {"test": "test_new"},
}
data = DatasetData(data_dict)
# Act
da_layer = DataAccessLayer(logger)
with pytest.raises(NotImplementedError) as exc_info:
da_layer.read_input_dataset(data)
exception_raised = exc_info.value
# Assert
assert exception_raised.args[0].endswith(
"Currently only UGrid (NetCDF) files are supported."
)
test_dataset_data_get_input_dataset_should_not_read_incorrect_file()
When calling get_input_dataset on a dataset, the specified IDatasetData.path should be read; if the file is not correct (not readable), a ValueError should be raised.
Source code in tests/data/entities/test_data_access_layer.py
def test_dataset_data_get_input_dataset_should_not_read_incorrect_file():
"""When calling get_input_dataset on a dataset should
read the specified IDatasetData.path. If the file is not correct (not
readable), raise OSError.
"""
# Arrange
logger = Mock(ILogger)
path = get_test_data_path() + "/FlowFM_net_incorrect.nc"
data_dict = {
"filename": path,
"outputfilename": "output.txt",
"variable_mapping": {"test": "test_new"},
}
data = DatasetData(data_dict)
# Act
da_layer = DataAccessLayer(logger)
with pytest.raises(ValueError) as exc_info:
da_layer.read_input_dataset(data)
exception_raised = exc_info.value
# Assert
path = Path(path).resolve()
assert exception_raised.args[0].endswith(
f"ERROR: Cannot open input .nc file -- {str(path)}"
)
test_dataset_data_get_input_dataset_should_read_file()
When calling get_input_dataset on a dataset, the specified IDatasetData.path should be read to create a new DataSet.
Source code in tests/data/entities/test_data_access_layer.py
def test_dataset_data_get_input_dataset_should_read_file():
"""When calling get_input_dataset on a dataset should
read the specified IDatasetData.path to create a new DataSet
"""
# Arrange
logger = Mock(ILogger)
path = get_test_data_path() + "/FlowFM_net.nc"
data_dict = {
"filename": path,
"outputfilename": "output.txt",
"variable_mapping": {"test": "test_new"},
}
data = DatasetData(data_dict)
# Act
da_layer = DataAccessLayer(logger)
dataset = da_layer.read_input_dataset(data)
# Assert
assert isinstance(dataset, _xr.Dataset)
test_dataset_data_write_output_file_should_check_if_extension_is_correct()
When calling write_output_file the provided path extension needs to be checked against the current implementation (netCDF files).
Source code in tests/data/entities/test_data_access_layer.py
def test_dataset_data_write_output_file_should_check_if_extension_is_correct():
"""When calling write_output_file the provided path
extension needs to be checked if it matches
the currently implementation (netCDF files)"""
# Arrange
logger = Mock(ILogger)
path = Path(str(get_test_data_path()) + "/NonUgridFile.txt")
da_layer = DataAccessLayer(logger)
dataset = Mock(_xr.Dataset)
application_version = "0.0.0"
application_name = "D-EcoImpact"
# Act
with pytest.raises(NotImplementedError) as exc_info:
settings = OutputFileSettings(application_name, application_version)
da_layer.write_output_file(dataset, path, settings)
exception_raised = exc_info.value
# Assert
assert exception_raised.args[0].endswith(
"Currently only UGrid (NetCDF) files are supported."
)
test_dataset_data_write_output_file_should_write_file()
When calling write_output_file, the specified XArray dataset should be written to an output file
Source code in tests/data/entities/test_data_access_layer.py
def test_dataset_data_write_output_file_should_write_file():
"""When calling write_output_file on a dataset should
write the specified XArray dataset to an output file
"""
# Arrange
logger = Mock(ILogger)
path = Path(str(get_test_data_path()) + "abc/def/ghi" + "/results.nc")
da_layer = DataAccessLayer(logger)
data = [1]
time = pd.date_range("2020-01-01", periods=1)
dataset = _xr.Dataset(data_vars={"data": (["time"], data)}, coords={"time": time})
application_version = "0.0.0"
application_name = "D-EcoImpact"
# Act
settings = OutputFileSettings(application_name, application_version)
da_layer.write_output_file(dataset, path, settings)
# Assert
assert path.is_file()
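Taken together, the read and write tests above cover the full DataAccessLayer round trip. The sketch below simply strings those calls together; the file names are placeholders, the package imports are omitted (the classes are the same ones used in the tests), and this is an illustration rather than a prescribed workflow.

# Sketch of the round trip exercised by the tests above (placeholder paths,
# package imports omitted; only calls shown in the tests are used).
from pathlib import Path

logger = LoggerFactory.create_logger()
da_layer = DataAccessLayer(logger)

data = DatasetData({
    "filename": "FlowFM_net.nc",               # placeholder UGrid NetCDF input
    "variable_mapping": {"test": "test_new"},
})
dataset = da_layer.read_input_dataset(data)

settings = OutputFileSettings("D-EcoImpact", "0.0.0")
da_layer.write_output_file(dataset, Path("results.nc"), settings)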
test_input_version()
The DataAccessLayer should read the version from the input.yaml
Source code in tests/data/entities/test_data_access_layer.py
def test_input_version():
"""The DataAccessLayer should read the version from the input.yaml"""
# Arrange
logger = LoggerFactory.create_logger()
path = Path(get_test_data_path() + "/test.yaml")
# Act
da_layer = DataAccessLayer(logger)
model_data = da_layer.read_input_file(path)
input_version = model_data.version
# Assert
# input_version should consist of 3 elements (major, minor, patch):
assert len(input_version) == 3
test_retrieve_file_names_gives_dict_with_multiple_keys_if_path_contains_asterisk()
When calling retrieve_file_names with a path name including an asterisk, the result should be a dictionary with multiple entries, each key being the distinctive part of the file name, and the respective value the entire file name.
Source code in tests/data/entities/test_data_access_layer.py
def test_retrieve_file_names_gives_dict_with_multiple_keys_if_path_contains_asterisk():
"""When calling retrieve_file_names with a path name
including an asterisk, the result should be a dictionary
with multiple entries, each key being the distinctive part
of the file name, and the respective value the entire file name."""
# Arrange
logger = Mock(ILogger)
filename = Path(__file__)
filepath = Path.joinpath(
filename.parent, "test_data_access_layer_data", "FlowFM_*.nc"
)
# Act
da_layer = DataAccessLayer(logger)
names = da_layer.retrieve_file_names(filepath)
# Assert
assert names == {
"net_incorrect": Path.joinpath(filepath.parent, "FlowFM_net_incorrect.nc"),
"net": Path.joinpath(filepath.parent, "FlowFM_net.nc"),
}
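The dictionary keys asserted above are the parts of the matched file names that take the place of the asterisk. A rough way to picture that mapping (purely illustrative, not the retrieve_file_names implementation) is:

# Illustrative only: derive "distinctive part" keys for files matching FlowFM_*.nc,
# mirroring the keys "net" and "net_incorrect" asserted above.
from pathlib import Path

pattern = Path("test_data_access_layer_data") / "FlowFM_*.nc"
prefix, suffix = pattern.name.split("*")        # "FlowFM_" and ".nc"

names = {
    match.name[len(prefix):-len(suffix)]: match
    for match in pattern.parent.glob(pattern.name)
}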
test_retrieve_file_names_gives_dict_with_single_empty_key_if_single_file_found()
When calling retrieve_file_names and the provided path contains no asterisk and points to a single existing file, it should return a dictionary with one entry whose key is the empty string and whose value is that existing file.
Source code in tests/data/entities/test_data_access_layer.py
def test_retrieve_file_names_gives_dict_with_single_empty_key_if_single_file_found():
"""When calling retrieve_file_names and the provided path contains no
asteriskt and points to a unique existing file, then it should return
a dictionary with one registry and one single empty key to that
existing file."""
# Arrange
logger = Mock(ILogger)
filename = __file__
filepath = Path(filename)
# Act
da_layer = DataAccessLayer(logger)
names = da_layer.retrieve_file_names(filepath)
# Assert
assert names == {"": filepath}
test_retrieve_file_names_should_raise_exception_if_path_not_found()
When calling retrieve_file_names, the provided path needs to be checked to exist and an exception raised if it doesn't.
Source code in tests/data/entities/test_data_access_layer.py
def test_retrieve_file_names_should_raise_exception_if_path_not_found():
"""When calling retrieve_file_names, the provided path
needs to be checked to exist and an exception raised if it doesn't."""
# Arrange
logger = Mock(ILogger)
filename = Path("non_existing_file.nc")
# Act
da_layer = DataAccessLayer(logger)
with pytest.raises(FileExistsError) as exc_info:
da_layer.retrieve_file_names(filename)
exception_raised = exc_info.value
# Assert
exc = exception_raised.args[0]
assert exc.endswith("Make sure the input file location is valid.")
test_dataset_data
Tests for DatasetData class
test_dataset_data_creation_logic()
The DatasetData should parse the provided dictionary to correctly initialize itself during creation
Source code in tests/data/entities/test_dataset_data.py
def test_dataset_data_creation_logic():
"""The DatasetData should parse the provided dictionary
to correctly initialize itself during creation"""
# Arrange
data_dict = {
"filename": "test.yaml",
# start_date is left out to check it is optional
"end_date": "31-12-2020",
"variable_mapping": {"test": "new"},
}
# Act
data = DatasetData(data_dict)
# Assert
assert isinstance(data, IDatasetData)
assert str(data.path).endswith("test.yaml")
assert "test" in data.mapping
assert data.mapping["test"] == "new"
assert data.start_date == "None"
assert data.end_date == "31-12-2020"
test_dataset_data_time_filter()
The DatasetData should parse the provided dictionary to correctly initialize itself during creation; this test also checks the values of the start and end date and verifies that the time filter is optional
Source code in tests/data/entities/test_dataset_data.py
def test_dataset_data_time_filter():
"""The DatasetData should parse the provided dictionary
to correctly initialize itself during creation
and test the values of start and end date
and test whether the time filter is optional"""
# Arrange
data_dict = {
"filename": "test.yaml",
"start_date": "01-01-2019",
# end_date is left out to check it is optional
"variable_mapping": {"test": "new"},
}
# Act
data = DatasetData(data_dict)
# Assert
assert isinstance(data, IDatasetData)
assert str(data.path).endswith("test.yaml")
assert data.start_date == "01-01-2019"
assert data.end_date == "None"
# the result 'None' should result in not filtering the data set on end date
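Taken together, the two tests above suggest that filename and variable_mapping are the required keys of a dataset dictionary, while start_date and end_date are optional and fall back to the string "None" (no filtering on that side). A minimal dictionary, assuming both dates may indeed be omitted at the same time, would then look like this:

# Minimal dataset dictionary implied by the tests above (assumption: both date
# filters can be omitted together).
data_dict = {
    "filename": "test.yaml",
    "variable_mapping": {"test": "new"},
}
data = DatasetData(data_dict)
# Expected by analogy with the tests above:
# data.start_date == "None" and data.end_date == "None"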
test_depth_average_rule_data
Tests for DepthAverageRuleData class
test_depth_average_rule_data_creation_logic()
The DepthAverageRuleData should parse the provided dictionary to correctly initialize itself during creation
Source code in tests/data/entities/test_depth_average_rule_data.py
def test_depth_average_rule_data_creation_logic():
"""The DepthAverageRuleData should parse the provided dictionary
to correctly initialize itself during creation"""
# Act
data = DepthAverageRuleData("test_name",
"input1",
)
# Assert
assert isinstance(data, IRuleData)
assert data.input_variables == "input1"
test_filter_extremes_rule_data
Tests for FilterExtremesRuleData class
test_filter_extremes_rule_data_creation_logic()
The FilterExtremesRuleData should parse the provided dictionary to correctly initialize itself during creation
Source code in tests/data/entities/test_filter_extremes_rule_data.py
def test_filter_extremes_rule_data_creation_logic():
"""The FilterExtremesRuleData should parse the provided dictionary
to correctly initialize itself during creation"""
# Act
data = FilterExtremesRuleData("test_name", "input1", "peaks", 1, "hour", True)
# Assert
assert isinstance(data, IRuleData)
assert data.input_variables == "input1"
assert data.distance == 1
assert data.mask == True
assert data.time_scale == "hour"
test_formula_rule_data
Tests for FormulaRuleData class
test_formula_rule_data_creation_logic()
The FormulaRuleData should parse the provided dictionary to correctly initialize itself during creation
Source code in tests/data/entities/test_formula_rule_data.py
def test_formula_rule_data_creation_logic():
"""The FormulaRuleData should parse the provided dictionary
to correctly initialize itself during creation"""
# Act
data = FormulaRuleData("test_name", ["input1", "input2"], "input1 + input2")
# Assert
assert isinstance(data, IRuleData)
assert data.input_variable_names == ["input1", "input2"]
assert data.formula == "input1 + input2"
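FormulaRuleData is only a data holder for the variable names and the formula string; nothing is evaluated here. As a purely illustrative sketch (not the D-Eco Impact implementation), a formula such as "input1 + input2" could later be evaluated against named arrays like this:

# Illustrative only: evaluate a formula string against named numpy arrays.
import numpy as np

variables = {"input1": np.array([1.0, 2.0]), "input2": np.array([3.0, 4.0])}
formula = "input1 + input2"

result = eval(formula, {"__builtins__": {}}, variables)   # -> array([4., 6.])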
test_layer_filter_rule_data
Tests for LayerFilterRuleData class
test_layer_filter_rule_data_creation_logic()
The LayerFilterRuleData should parse the provided dictionary to correctly initialize itself during creation
Source code in tests/data/entities/test_layer_filter_rule_data.py
def test_layer_filter_rule_data_creation_logic():
"""The LayerFilterRuleData should parse the provided dictionary
to correctly initialize itself during creation"""
# Act
data = LayerFilterRuleData("test_name", 3, "input")
# Assert
assert isinstance(data, IRuleData)
assert data.input_variable == "input"
assert data.layer_number == 3
test_model_data_builder
Tests for ModelDataBuilder class
test_model_data_builder_gives_error_when_rule_not_defined()
The ModelDataBuilder should throw an exception when one of the rules is not defined
Source code in tests/data/entities/test_model_data_builder.py
def test_model_data_builder_gives_error_when_rule_not_defined():
"""The ModelDataBuilder should throw an exception
when one of the rules is not defined"""
# Arrange
logger = Mock(ILogger)
# Act
data = ModelDataBuilder(logger)
contents["rules"][0] = {"wrong_rule": "test"}
contents["version"] = "0.0.0"
with pytest.raises(KeyError) as exc_info:
data.parse_yaml_data(contents)
exception_raised = exc_info.value
# Assert
exc = exception_raised.args[0]
assert exc.endswith("No parser for wrong_rule")
test_model_data_builder_parse_dict_to_model_data()
The ModelDataBuilder should parse the provided dictionary to a IModelData object
Source code in tests/data/entities/test_model_data_builder.py
def test_model_data_builder_parse_dict_to_model_data():
"""The ModelDataBuilder should parse the provided dictionary
to a IModelData object"""
# Arrange
logger = Mock(ILogger)
# Act
data = ModelDataBuilder(logger)
contents["version"] = "0.0.0"
parsed_data = data.parse_yaml_data(contents)
# Assert
assert isinstance(parsed_data, IModelData)
test_multiply_rule_data
Tests for MultiplyRuleData class
test_multiply_rule_data_creation_logic()
The MultiplyRuleData should parse the provided dictionary to correctly initialize itself during creation
Source code in tests/data/entities/test_multiply_rule_data.py
def test_multiply_rule_data_creation_logic():
"""The MultiplyRuleData should parse the provided dictionary
to correctly initialize itself during creation"""
# Act
data = MultiplyRuleData("test_name", [1.0, 2.0], "input")
# Assert
assert isinstance(data, IRuleData)
assert data.input_variable == "input"
assert data.multipliers == [1.0, 2.0]
test_response_curve_rule_data
Tests for the ResponseCurveRuleData
test_response_curve_rule_data_creation_logic()
The ResponseCurveRuleData should parse the provided dictionary to correctly initialize itself during creation
Source code in tests/data/entities/test_response_curve_rule_data.py
def test_response_curve_rule_data_creation_logic():
"""The ResponseCurveRuleData should parse the provided dictionary
to correctly initialize itself during creation"""
# Act
data = ResponseCurveRuleData("test_name", "input", [1, 2, 3], [3, 2, 0])
data.output_variable = "output"
data.description = "description"
assert isinstance(data, IRuleData)
assert data.name == "test_name"
assert data.input_variable == "input"
assert data.input_values == [1, 2, 3]
assert data.output_values == [3, 2, 0]
assert data.description == "description"
assert data.output_variable == "output"
test_rolling_statistics_rule_data
Tests for RollingStatisticsRuleData class
test_rulling_statistics_rule_data_creation_logic()
The RollingStatisticsRuleData should parse the provided dictionary to correctly initialize itself during creation
Source code in tests/data/entities/test_rolling_statistics_rule_data.py
def test_rulling_statistics_rule_data_creation_logic():
"""The RullingStatisticsRuleData should parse the provided dictionary
to correctly initialize itself during creation"""
# Act
data = RollingStatisticsRuleData("test_name", TimeOperationType.MIN, "input", 1)
# Assert
assert isinstance(data, IRuleData)
assert data.input_variable == "input"
assert data.operation == 2
test_rule_data
Tests for RuleData class
test_rule_data_creation_logic()
The RuleData should parse the provided dictionary to correctly initialize itself during creation
Source code in tests/data/entities/test_rule_data.py
def test_rule_data_creation_logic():
"""The RuleData should parse the provided dictionary
to correctly initialize itself during creation"""
# Act
data = RuleData("test_name")
data.output_variable = "foo"
# Assert
assert isinstance(data, IRuleData)
assert data.name == "test_name"
assert data.description == ""
assert data.output_variable == "foo"
test_step_function_rule_data
Tests for the StepFunctionRuleData
test_step_function_rule_data_creation_logic()
The StepFunctionRuleData should parse the provided dictionary to correctly initialize itself during creation
Source code in tests/data/entities/test_step_function_rule_data.py
def test_step_function_rule_data_creation_logic():
"""The StepFunctionRuleData should parse the provided dictionary
to correctly initialize itself during creation"""
# Act
data = StepFunctionRuleData(
"test_rule_name", [1.0, 2.0, 3.0], [10.0, 20.0, 30.0], "test_input_vars_name"
)
data.description = "test_description"
data.output_variable = "test_output_var_name"
# Assert
assert isinstance(data, StepFunctionRuleData)
assert data.name == "test_rule_name"
assert data._limits == [1.0, 2.0, 3.0]
assert data._responses == [10.0, 20.0, 30.0]
assert data._input_variable == "test_input_vars_name"
assert data.description == "test_description"
assert data.output_variable == "test_output_var_name"
test_time_aggregation_rule_data
Tests for TimeAggregationRuleData class
test_time_aggregation_rule_data_creation_logic()
The TimeAggregationRuleData should parse the provided dictionary to correctly initialize itself during creation
Source code in tests/data/entities/test_time_aggregation_rule_data.py
def test_time_aggregation_rule_data_creation_logic():
"""The TimeAggregationRuleData should parse the provided dictionary
to correctly initialize itself during creation"""
# Act
data = TimeAggregationRuleData("test_name", TimeOperationType.MIN, "input")
# Assert
assert isinstance(data, IRuleData)
assert data.input_variable == "input"
assert data.operation == 2
test_yaml_model_data
Tests for YamlModelData class
test_yaml_model_data_default_settings_and_type()
Test if the YamlModelData implements the IModelData interface and gives the right default settings
Source code in tests/data/entities/test_yaml_model_data.py
def test_yaml_model_data_default_settings_and_type():
"""Test if the YamlModelData implements the IModelData
interface and gives the right default settings"""
# Arrange
datasets = [Mock(DatasetData)]
rules = [Mock(MultiplyRuleData)]
version = [0, 0, 0]
# Act
model_data = YamlModelData("Model 1", version)
model_data.output_path = Path("")
model_data.datasets = datasets
model_data.output_variables = []
model_data.rules = rules
# Assert
# implements interface
assert isinstance(model_data, IModelData)
assert model_data.name == "Model 1"
assert model_data.datasets == datasets
assert isinstance(model_data.datasets[0], IDatasetData)
assert model_data.rules == rules
assert isinstance(model_data.rules[0], IRuleData)
parsers
test_parser_axis_filter_rule
Tests for AxisLayerFilterRule class
test_parse_axis_name_type()
Test if an incorrect dictionary is not parsed
Source code in tests/data/parsers/test_parser_axis_filter_rule.py
def test_parse_axis_name_type():
"""Test if an incorrect dictionary is not parsed"""
# Arrange
contents = {
"name": "testname",
"description": "description",
"input_variable": "input",
"layer_number": 3,
"axis_name": 3,
"output_variable": "output",
}
logger = Mock(ILogger)
# Act
data = ParserAxisFilterRule()
with pytest.raises(ValueError) as exc_info:
data.parse_dict(contents, logger)
exception_raised = exc_info.value
# Assert
expected_message = "Dimension name should be a string, received a <class 'int'>: 3"
assert exception_raised.args[0] == expected_message
test_parse_dict_to__axis_rule_data_logic()
Test if a correct dictionary is parsed into a RuleData object
Source code in tests/data/parsers/test_parser_axis_filter_rule.py
def test_parse_dict_to__axis_rule_data_logic():
"""Test if a correct dictionary is parsed into a RuleData object"""
# Arrange
contents = {
"name": "testname",
"description": "description",
"input_variable": "input",
"layer_number": 3,
"axis_name": "axis_name",
"output_variable": "output",
}
logger = Mock(ILogger)
# Act
data = ParserAxisFilterRule()
parsed_dict = data.parse_dict(contents, logger)
# Assert
assert isinstance(parsed_dict, IRuleData)
test_parse_wrong_dict_to_axis_rule_data_logic()
Test if an incorrect dictionary is not parsed
Source code in tests/data/parsers/test_parser_axis_filter_rule.py
def test_parse_wrong_dict_to_axis_rule_data_logic():
"""Test if an incorrect dictionary is not parsed"""
# Arrange
contents = {
"name": "testname",
"description": "description",
"layer_number": 3,
"input_variable": "input",
"output_variable": "output",
}
logger = Mock(ILogger)
# Act
data = ParserAxisFilterRule()
with pytest.raises(AttributeError) as exc_info:
data.parse_dict(contents, logger)
exception_raised = exc_info.value
# Assert
expected_message = "Missing element axis_name"
assert exception_raised.args[0] == expected_message
test_parser_axis_filter_rule_creation_logic()
The ParserAxisFilterRule should parse the provided dictionary to correctly initialize itself during creation
Source code in tests/data/parsers/test_parser_axis_filter_rule.py
def test_parser_axis_filter_rule_creation_logic():
"""The ParserAxisFilterRule should parse the provided dictionary
to correctly initialize itself during creation"""
# Act
data = ParserAxisFilterRule()
# Assert
assert isinstance(data, IParserRuleBase)
assert data.rule_type_name == "axis_filter_rule"
test_parser_classification_rule
Tests for ParserClassificationRule class
test_feedback_for_criteria_multiple_parameters(criteria_table, calls)
Test the warning feedback for criteria tables covering multiple parameters
Source code in tests/data/parsers/test_parser_classification_rule.py
@pytest.mark.parametrize(
"criteria_table, calls",
[
(
[
["output", "varA", "varB", "varC"],
[1, "<0", "<5", "<10"],
[2, "<0", "<5", ">=10"],
[3, "<0", ">=5", "<10"],
[4, "<0", ">=5", ">=10"],
[5, ">=0", "<5", "<10"],
[6, ">=0", "<5", ">=10"],
[7, ">=0", ">=5", "<10"],
[8, ">=0", ">=5", ">=10"],
],
[call("")],
),
(
[
["output", "varA", "varB", "varC"],
[1, "<0", "<0", "0:10"],
[2, "<0", "<0", ">10"],
[3, "<0", ">=0", "0:10"],
],
[
call(
"""For conditions: (varA: <0, varB: <0). Gap for variable varC in range -inf:0.0.\nFor conditions: (varA: <0, varB: >=0). Gap for variable varC in range -inf:0.0.\nFor conditions: (varA: <0, varB: >=0). Gap for variable varC in range 10.0:inf.\nGap for variable varA in range 0.0:inf."""
)
],
),
(
[
["output", "varA", "varB", "varC"],
[1, "<0", "<0", "0:10"],
[2, "<0", "<0", ">10"],
[3, "-", "-", "-"],
],
[
call(
"""For conditions: (varA: <0, varB: <0). Gap for variable varC in range -inf:0.0.\nFor conditions: (varA: <0). Gap for variable varB in range 0.0:inf.\nOverlap for variable varA in range -inf:0.0."""
)
],
),
(
[
["output", "MIN_water_depth_mNAP", "MAX_flow_velocity", "MAX_chloride"],
[1, "<0.10", "-", ">300"], # too dry
[2, "<0.10", "-", "< 400"], # also too dry
[3, ">4.0", ">3.0", "-"], # too deep
[4, ">4.0", "<2.0", "-"], # also too deep
[5, "-", "-", ">400"], # too salty
[6, "-", ">1.5", "-"], # too fast flowing
[7, "-", ">1.5", ">300"], # also to fast flowing
[8, "0.20:4.0", "0.0:1.5", "0:400"], # perfect for aquatic plants
],
[
call(
"For conditions: (MIN_water_depth_mNAP: -, MAX_flow_velocity: -). Gap for variable MAX_chloride in range -inf:400.0.\nFor conditions: (MIN_water_depth_mNAP: -, MAX_flow_velocity: >1.5). Overlap for variable MAX_chloride in range 300.0:inf.\nFor conditions: (MIN_water_depth_mNAP: 0.20:4.0, MAX_flow_velocity: 0.0:1.5). Gap for variable MAX_chloride in range -inf:0.0.\nFor conditions: (MIN_water_depth_mNAP: 0.20:4.0, MAX_flow_velocity: 0.0:1.5). Gap for variable MAX_chloride in range 400.0:inf.\nFor conditions: (MIN_water_depth_mNAP: <0.10, MAX_flow_velocity: -). Overlap for variable MAX_chloride in range 300.0:400.0.\nFor conditions: (MIN_water_depth_mNAP: -). Overlap for variable MAX_flow_velocity in range 1.5:inf."
),
call(
"12 warnings found concerning coverage of the parameters. Only first 6 "
"warnings are shown. See multiple_classification_rule_warnings.log file for all warnings."
),
],
),
],
)
def test_feedback_for_criteria_multiple_parameters(criteria_table, calls):
"""Test if a correct dictionary is parsed into a RuleData object"""
# Arrange
contents = {
"name": "testname",
"input_variables": [
"varA",
"varB",
"varC",
"varD",
"MIN_water_depth_mNAP",
"MAX_flow_velocity",
"MAX_chloride",
],
"description": "test",
"criteria_table": criteria_table,
"output_variable": "output",
}
logger = Mock(ILogger)
# Act
data = ParserClassificationRule()
data.parse_dict(contents, logger)
logger.log_warning.assert_has_calls(calls)
test_feedback_for_criteria_multiple_parameters_more_10_warnings()
Test the warning feedback when the criteria table produces more than 10 coverage warnings
Source code in tests/data/parsers/test_parser_classification_rule.py
def test_feedback_for_criteria_multiple_parameters_more_10_warnings():
"""Test if a correct dictionary is parsed into a RuleData object"""
# Arrange
contents = {
"name": "testname",
"input_variables": ["varA", "varB", "varC", "varD"],
"description": "test",
"criteria_table": [
["output", "varA", "varB", "varC", "varD"],
[1, "<0", "<0", "0:10", "5"],
[3, "0", ">=0", "0:10", "5"],
],
"output_variable": "output",
}
logger = Mock(ILogger)
# Act
data = ParserClassificationRule()
data.parse_dict(contents, logger)
logger.log_warning.assert_called_with(
"11 warnings found concerning coverage of the parameters. Only first 6 "
"warnings are shown. See multiple_classification_rule_warnings.log file "
"for all warnings."
)
test_feedback_for_criteria_with_gaps_and_overlap(criteria_table, expected_warning_msg)
Test the gap and overlap warnings for single-parameter criteria tables
Source code in tests/data/parsers/test_parser_classification_rule.py
@pytest.mark.parametrize(
"criteria_table, expected_warning_msg",
[
(
[["output", "varA"], [1, "<0"], [2, "<=8"]],
"""Overlap for variable varA, multiple criteria with operators < or <= are defined.\nGap for variable varA in range 8.0:inf.""",
),
(
[["output", "varB"], [1, ">0"], [2, ">=8"]],
"""Overlap for variable varB, multiple criteria with operators > or >= are defined.\nGap for variable varB in range -inf:0.0.""",
),
(
[["output", "varC"], [1, "<0"]],
"""Gap for variable varC in range 0.0:inf.""",
),
(
[["output", "varD"], [1, ">=0"]],
"""Gap for variable varD in range -inf:0.0.""",
),
(
[["output", "varE"], [1, ">0"], [2, "<10"]],
"""Overlap for variable varE in range 0.0:10.0.""",
),
(
[["output", "varF"], [1, ">0"], [2, "<0"]],
"""Gap for variable varF in number 0.0.""",
),
(
[["output", "varF2"], [1, ">0"], [2, "<=0"]],
"",
),
(
[["output", "varF3"], [1, ">=0"], [2, "<0"]],
"",
),
(
[["output", "varG"], [1, ">=0"], [2, "<=0"]],
"Overlap for variable varG in number 0.0.",
),
(
[["output", "varH"], [1, "0:10"]],
"""Gap for variable varH in range -inf:0.0.\nGap for variable varH in range 10.0:inf.""",
),
(
[["output", "varH2"], [1, "0:10"], [1, "10:15"]],
"""Gap for variable varH2 in range -inf:0.0.\nOverlap for variable varH2 in number 10.0.\nGap for variable varH2 in range 15.0:inf.""",
),
(
[["output", "varI"], [1, ">0"], [2, "<0"], [3, 0]],
"",
),
(
[["output", "varJ"], [1, "<0"], [2, "3:5"], [3, 7]],
"Gap for variable varJ in range 0.0:3.0.\nGap for variable varJ in range 5.0:7.0.\nGap for variable varJ in range 7.0:inf.",
),
(
[["output", "varK"], [1, "0:10"], [2, "3:5"]],
"Gap for variable varK in range -inf:0.0.\nOverlap for variable varK in range 3.0:5.0.\nGap for variable varK in range 10.0:inf.",
),
(
[["output", "varL"], [1, "0:5"], [2, "10:15"], [3, "15:20"], [4, "7:17"]],
"Gap for variable varL in range -inf:0.0.\nGap for variable varL in range 5.0:7.0.\nOverlap for variable varL in range 10.0:15.0.\nOverlap for variable varL in range 15.0:17.0.\nGap for variable varL in range 20.0:inf.",
),
(
[
["output", "varM"],
[1, "<0"],
[2, "10:15"],
[3, "0:10"],
[4, 0],
[5, ">=12"],
],
"Overlap for variable varM in number 0.0.\nOverlap for variable varM in number 10.0.\nOverlap for variable varM in range 12.0:15.0.",
),
(
[["output", "varN"], [1, "0:5"], [2, 3], [3, ">=3"]],
"Gap for variable varN in range -inf:0.0.\nOverlap for variable varN in number 3.0.\nOverlap for variable varN in range 3.0:5.0.",
),
(
[["output", "varO"], [2, "-"]],
"",
),
(
[["output", "varO1"], [1, ">0"], [2, "-"]],
"Overlap for variable varO1 in range 0.0:inf.",
),
],
)
def test_feedback_for_criteria_with_gaps_and_overlap(
criteria_table, expected_warning_msg
):
"""Test if a correct dictionary is parsed into a RuleData object"""
# Arrange
contents = {
"name": "testname",
"input_variables": [
"varA",
"varB",
"varC",
"varD",
"varE",
"varF",
"varF2",
"varF3",
"varG",
"varH",
"varH2",
"varI",
"varJ",
"varK",
"varL",
"varM",
"varN",
"varO",
"varO1",
],
"description": "test",
"criteria_table": criteria_table,
"output_variable": "output",
}
logger = Mock(ILogger)
# Act
data = ParserClassificationRule()
data.parse_dict(contents, logger)
logger.log_warning.assert_called_with(expected_warning_msg)
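The expected warnings above all follow from reading each criterion as an interval on the real line: ranges covered by no criterion are reported as gaps, ranges covered more than once as overlaps. For the varA case, [1, "<0"] and [2, "<=8"] both cover everything below 0 (hence the overlap warning for multiple < or <= criteria) and neither covers anything above 8 (hence the gap 8.0:inf). This is an interpretation of the messages, not a description of the parser internals; the small check below only illustrates the varA case numerically.

# Illustrative only: the varA case above, checked on a coarse grid.
import numpy as np

xs = np.linspace(-20, 20, 4001)
covered_by = (xs < 0).astype(int) + (xs <= 8).astype(int)   # criteria "<0" and "<=8"

assert (covered_by[xs < 0] == 2).all()   # overlap below 0.0
assert (covered_by[xs > 8] == 0).all()   # gap in range 8.0:inf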
test_parse_dict_to_rule_data_logic()
Test if a correct dictionary is parsed into a RuleData object
Source code in tests/data/parsers/test_parser_classification_rule.py
def test_parse_dict_to_rule_data_logic():
"""Test if a correct dictionary is parsed into a RuleData object"""
# Arrange
contents = {
"name": "testname",
"input_variables": ["mesh2d_sa1", "mesh2d_waterdepth"],
"description": "test",
"criteria_table": [
["output", "mesh2d_waterdepth", "mesh2d_sa1"],
[100, 0, 30],
[300, 0, 0.5],
[400, 0, "0.3:0.6"],
],
"output_variable": "output",
}
logger = Mock(ILogger)
# Act
data = ParserClassificationRule()
parsed_dict = data.parse_dict(contents, logger)
assert isinstance(parsed_dict, IRuleData)
test_parser_classification_rule_creation_logic()
The ParserClassificationRule should parse the provided dictionary to correctly initialize itself during creation
Source code in tests/data/parsers/test_parser_classification_rule.py
def test_parser_classification_rule_creation_logic():
"""The ParserClassificationRule should parse the provided dictionary
to correctly initialize itself during creation"""
# Act
data = ParserClassificationRule()
# Assert
assert isinstance(data, IParserRuleBase)
assert data.rule_type_name == "classification_rule"
test_parser_combine_results_rule
Tests for ParserCombineResultsRule class
test_error_if_parse_operation_type_not_given_by_string(invalid_operation)
Test error if the operation is not given as a string
Source code in tests/data/parsers/test_parser_combine_results_rule.py
@pytest.mark.parametrize(
"invalid_operation",
[1, [2, 3, 4], (5, 5, 7, 9), {"key": "MULTIPLYI"}, lambda a: a + 10],
)
def test_error_if_parse_operation_type_not_given_by_string(invalid_operation: Any):
"""Test error if the operation is not a number"""
# Arrange
contents = {
"name": "testname",
"input_variables": "input",
"operation": invalid_operation,
"output_variable": "output",
}
rule = ParserCombineResultsRule()
# Act
with pytest.raises(ValueError) as exc_info:
rule.parse_dict(contents, logger=Mock(ILogger))
exception_raised = exc_info.value
# Assert
expected_message = f"""Operation must be a string, \
received: {str(invalid_operation)}"""
assert exception_raised.args[0] == expected_message
test_error_if_parse_unknown_operation_type()
Test error if the operation type is unknown
Source code in tests/data/parsers/test_parser_combine_results_rule.py
def test_error_if_parse_unknown_operation_type():
"""Test error if the operation type is unknown"""
# Arrange
contents = {
"name": "testname",
"input_variables": "input",
"operation": "unkown",
"output_variable": "output",
}
possible_operations = [
"\n" + operation_name
for operation_name in dir(MultiArrayOperationType)
if not operation_name.startswith("_")
]
expected_message = f"Operation must be one of: {possible_operations}"
rule = ParserCombineResultsRule()
# Act
with pytest.raises(ValueError) as exc_info:
rule.parse_dict(contents, logger=Mock(ILogger))
exception_raised = exc_info.value
# Assert
assert exception_raised.args[0] == expected_message
test_parse_dict_to_rule_data_logic()
Test if a correct dictionary is parsed into a RuleData object
Source code in tests/data/parsers/test_parser_combine_results_rule.py
def test_parse_dict_to_rule_data_logic():
"""Test if a correct dictionary is parsed into a RuleData object"""
# Arrange
contents = {
"name": "testname",
"input_variables": ["foo", "bar"],
"operation": "Multiply",
"output_variable": "test_output_name",
"description": "test description",
}
logger = Mock(ILogger)
# Act
parser = ParserCombineResultsRule()
parsed_dict = parser.parse_dict(contents, logger)
# Assert
assert isinstance(parsed_dict, IRuleData)
assert isinstance(parsed_dict, CombineResultsRuleData)
assert parsed_dict.name == "testname"
assert parsed_dict.input_variable_names == ["foo", "bar"]
assert parsed_dict.operation_type == "MULTIPLY"
assert parsed_dict.output_variable == "test_output_name"
assert parsed_dict.description == "test description"
test_parse_dict_without_description()
Test if description is set to empty string when not passed
Source code in tests/data/parsers/test_parser_combine_results_rule.py
def test_parse_dict_without_description():
"""Test if description is set to empty string when not passed"""
# Arrange
contents = {
"name": "testname",
"input_variables": ["foo", "bar"],
"operation": "Multiply",
"output_variable": "test_output_name",
}
# Act
parser = ParserCombineResultsRule()
parsed_dict = parser.parse_dict(contents, logger=Mock(ILogger))
# Assert
assert parsed_dict.description == ""
test_parse_wrong_dict_to_rule_data_logic()
Test if the operation type is included or not
Source code in tests/data/parsers/test_parser_combine_results_rule.py
def test_parse_wrong_dict_to_rule_data_logic():
"""Test if the operation type is included or not"""
# Arrange
contents = {
"name": "testname",
"input_variables": ["foo", "bar"],
"output_variable": "output",
}
# Act
data = ParserCombineResultsRule()
with pytest.raises(AttributeError) as exc_info:
data.parse_dict(contents, logger=Mock(ILogger))
exception_raised = exc_info.value
# Assert
expected_message = "Missing element operation"
assert exception_raised.args[0] == expected_message
test_parser_combine_results_rule_creation_logic()
The ParserCombineResultsRule should parse the provided dictionary to correctly initialize itself during creation
Source code in tests/data/parsers/test_parser_combine_results_rule.py
def test_parser_combine_results_rule_creation_logic():
"""The ParserCombinResultsRule should parse the provided dictionary
to correctly initialize itself during creation"""
# Act
rule = ParserCombineResultsRule()
# Assert
assert isinstance(rule, IParserRuleBase)
assert rule.rule_type_name == "combine_results_rule"
test_parser_depth_average_rule
Tests for ParserDepthAverageRule class
test_parse_dict_to_rule_data_logic()
Test if a correct dictionary is parsed into a RuleData object
Source code in tests/data/parsers/test_parser_depth_average_rule.py
def test_parse_dict_to_rule_data_logic():
"""Test if a correct dictionary is parsed into a RuleData object"""
# Arrange
contents = {
"name": "testname",
"input_variable": "input",
"bed_level_variable": "bedlevel",
"water_level_variable": "waterlevel",
"interfaces_variable": "interfaces",
"output_variable": "output",
}
logger = Mock(ILogger)
# Act
data = ParserDepthAverageRule()
parsed_dict = data.parse_dict(contents, logger)
assert isinstance(parsed_dict, IRuleData)
test_parse_wrong_dict_to_rule_data_logic()
Test if an incorrect dictionary is not parsed
Source code in tests/data/parsers/test_parser_depth_average_rule.py
def test_parse_wrong_dict_to_rule_data_logic():
"""Test if an incorrect dictionary is not parsed"""
# Arrange
contents = {
"name": "testname",
"output_variable": "output",
"bed_level_variable": "bedlevel",
"water_level_variable": "waterlevel",
"interfaces_variable": "interfaces_z",
}
logger = Mock(ILogger)
# Act
data = ParserDepthAverageRule()
with pytest.raises(AttributeError) as exc_info:
data.parse_dict(contents, logger)
exception_raised = exc_info.value
# Assert
expected_message = "Missing element input_variable"
assert exception_raised.args[0] == expected_message
test_parser_depth_average_rule_creation_logic()
The ParserDepthAverageRule should parse the provided dictionary to correctly initialize itself during creation
Source code in tests/data/parsers/test_parser_depth_average_rule.py
def test_parser_depth_average_rule_creation_logic():
"""The ParserDepthAverageRule should parse the provided dictionary
to correctly initialize itself during creation"""
# Act
data = ParserDepthAverageRule()
# Assert
assert isinstance(data, IParserRuleBase)
assert data.rule_type_name == "depth_average_rule"
test_parser_filter_extremes_rule
Tests for ParserFilterExtremesRule class
test_parse_dict_to_rule_data_logic()
Test if a correct dictionary is parsed into a RuleData object
Source code in tests/data/parsers/test_parser_filter_extremes_rule.py
def test_parse_dict_to_rule_data_logic():
"""Test if a correct dictionary is parsed into a RuleData object"""
# Arrange
contents = {
"name": "testname",
"input_variable": "input",
"output_variable": "output",
"distance": 1,
"time_scale": "hour",
"mask": True,
"extreme_type": "peaks",
}
logger = Mock(ILogger)
# Act
data = ParserFilterExtremesRule()
parsed_dict = data.parse_dict(contents, logger)
assert isinstance(parsed_dict, IRuleData)
test_parse_wrong_dict_to_rule_data_logic()
Test if an incorrect dictionary is not parsed
Source code in tests/data/parsers/test_parser_filter_extremes_rule.py
def test_parse_wrong_dict_to_rule_data_logic():
"""Test if an incorrect dictionary is not parsed"""
# Arrange
contents = {
"name": "testname",
"input_variable": "input",
"output_variable": "output",
"time_scale": "hour",
"mask": True,
"extreme_type": "peaks",
}
logger = Mock(ILogger)
# Act
data = ParserFilterExtremesRule()
with pytest.raises(AttributeError) as exc_info:
data.parse_dict(contents, logger)
exception_raised = exc_info.value
# Assert
expected_message = "Missing element distance"
assert exception_raised.args[0] == expected_message
test_parser_filter_extremes_rule_creation_logic()
The ParserFilterExtremesRule should parse the provided dictionary to correctly initialize itself during creation
Source code in tests/data/parsers/test_parser_filter_extremes_rule.py
def test_parser_filter_extremes_rule_creation_logic():
"""The ParserFilterExtremesRule should parse the provided dictionary
to correctly initialize itself during creation"""
# Act
data = ParserFilterExtremesRule()
# Assert
assert isinstance(data, IParserRuleBase)
assert data.rule_type_name == "filter_extremes_rule"
test_validate_extreme_type(extreme_type, expected_message)
Test the validation of the extreme_type field
Source code in tests/data/parsers/test_parser_filter_extremes_rule.py
@pytest.mark.parametrize(
"extreme_type, expected_message",
[
("peaks", ""),
("troughs", ""),
("test", "Extreme_type must be one of: [peaks, troughs]"),
(1, "The inputfield extreme_type must be of type str, but is of type int"),
],
)
def test_validate_extreme_type(extreme_type: str, expected_message: str):
"""Test if a correct dictionary is parsed into a RuleData object"""
# Arrange
contents = {
"name": "testname",
"input_variable": "input",
"output_variable": "output",
"distance": 1,
"time_scale": "hour",
"mask": True,
"extreme_type": extreme_type,
}
logger = Mock(ILogger)
# Act
data = ParserFilterExtremesRule()
# Act
if not expected_message:
parsed_dict = data.parse_dict(contents, logger)
assert isinstance(parsed_dict, IRuleData)
else:
with pytest.raises(ValueError) as exc_info:
data.parse_dict(contents, logger=Mock(ILogger))
exception_raised = exc_info.value
# Assert
assert exception_raised.args[0] == expected_message
test_parser_formula_rule
Tests for ParserFormulaRule class
test_error_if_parse_formula_not_given_by_string()
Test error if the formula is not a string
Source code in tests/data/parsers/test_parser_formula_rule.py
def test_error_if_parse_formula_not_given_by_string():
"""Test error if the formula is not a string"""
# Arrange
formula = 2
contents = {
"name": "testname",
"input_variables": "input",
"formula": formula,
"output_variable": "output",
}
rule = ParserFormulaRule()
# Act
with pytest.raises(ValueError) as exc_info:
rule.parse_dict(contents, logger=Mock(ILogger))
exception_raised = exc_info.value
# Assert
expected_message = f"""Formula must be a string, \
received: {str(formula)} (type: <class 'int'>)"""
assert exception_raised.args[0] == expected_message
test_parse_dict_to_rule_data_logic()
Test if a correct dictionary is parsed into a RuleData object
Source code in tests/data/parsers/test_parser_formula_rule.py
def test_parse_dict_to_rule_data_logic():
"""Test if a correct dictionary is parsed into a RuleData object"""
# Arrange
contents = {
"name": "testname",
"input_variables": ["foo", "bar"],
"formula": "foo - bar",
"output_variable": "test_output_name",
"description": "test description",
}
logger = Mock(ILogger)
# Act
parser = ParserFormulaRule()
parsed_dict = parser.parse_dict(contents, logger)
# Assert
assert isinstance(parsed_dict, IRuleData)
assert isinstance(parsed_dict, FormulaRuleData)
assert parsed_dict.name == "testname"
assert parsed_dict.input_variable_names == ["foo", "bar"]
assert parsed_dict.formula == "foo - bar"
assert parsed_dict.output_variable == "test_output_name"
assert parsed_dict.description == "test description"
test_parse_dict_without_description()
Test if description is set to empty string when not passed
Source code in tests/data/parsers/test_parser_formula_rule.py
def test_parse_dict_without_description():
"""Test if description is set to empty string when not passed"""
# Arrange
contents = {
"name": "testname",
"input_variables": ["foo", "bar"],
"formula": "foo * bar",
"output_variable": "test_output_name",
}
# Act
parser = ParserFormulaRule()
parsed_dict = parser.parse_dict(contents, logger=Mock(ILogger))
# Assert
assert parsed_dict.description == ""
test_parse_wrong_dict_to_rule_data_logic()
Test if the formula is included or not
Source code in tests/data/parsers/test_parser_formula_rule.py
def test_parse_wrong_dict_to_rule_data_logic():
"""Test if the formula is included or not"""
# Arrange
contents = {
"name": "testname",
"input_variables": ["foo", "bar"],
"output_variable": "output",
}
# Act
data = ParserFormulaRule()
with pytest.raises(AttributeError) as exc_info:
data.parse_dict(contents, logger=Mock(ILogger))
exception_raised = exc_info.value
# Assert
expected_message = "Missing element formula"
assert exception_raised.args[0] == expected_message
test_parser_formula_rule_creation_logic()
The ParserFormulaRule should parse the provided dictionary to correctly initialize itself during creation
Source code in tests/data/parsers/test_parser_formula_rule.py
def test_parser_formula_rule_creation_logic():
"""The ParserFormulaRule should parse the provided dictionary
to correctly initialize itself during creation"""
# Act
rule = ParserFormulaRule()
# Assert
assert isinstance(rule, IParserRuleBase)
assert rule.rule_type_name == "formula_rule"
test_parser_layer_filter_rule
Tests for ParserLayerFilterRule class
test_parse_dict_to_rule_data_logic()
Test if a correct dictionary is parsed into a RuleData object
Source code in tests/data/parsers/test_parser_layer_filter_rule.py
def test_parse_dict_to_rule_data_logic():
"""Test if a correct dictionary is parsed into a RuleData object"""
# Arrange
contents = {
"name": "testname",
"description": "description",
"input_variable": "input",
"layer_number": 3,
"output_variable": "output",
}
logger = Mock(ILogger)
# Act
data = ParserLayerFilterRule()
parsed_dict = data.parse_dict(contents, logger)
# Assert
assert isinstance(parsed_dict, IRuleData)
test_parse_layer_number_type()
Test if an incorrect dictionary is not parsed
Source code in tests/data/parsers/test_parser_layer_filter_rule.py
def test_parse_layer_number_type():
"""Test if an incorrect dictionary is not parsed"""
# Arrange
contents = {
"name": "testname",
"description": "description",
"input_variable": "input",
"layer_number": "3",
"output_variable": "output",
}
logger = Mock(ILogger)
# Act
data = ParserLayerFilterRule()
with pytest.raises(ValueError) as exc_info:
data.parse_dict(contents, logger)
exception_raised = exc_info.value
# Assert
expected_message = "Layer number should be an integer, received a <class 'str'>: 3"
assert exception_raised.args[0] == expected_message
test_parse_wrong_dict_to_rule_data_logic()
Test if an incorrect dictionary is not parsed
Source code in tests/data/parsers/test_parser_layer_filter_rule.py
def test_parse_wrong_dict_to_rule_data_logic():
"""Test if an incorrect dictionary is not parsed"""
# Arrange
contents = {
"name": "testname",
"description": "description",
"input_variable": "input",
"output_variable": "output",
}
logger = Mock(ILogger)
# Act
data = ParserLayerFilterRule()
with pytest.raises(AttributeError) as exc_info:
data.parse_dict(contents, logger)
exception_raised = exc_info.value
# Assert
expected_message = "Missing element layer_number"
assert exception_raised.args[0] == expected_message
test_parser_layer_filter_rule_creation_logic()
The ParserLayerFilterRule should parse the provided dictionary to correctly initialize itself during creation
Source code in tests/data/parsers/test_parser_layer_filter_rule.py
def test_parser_layer_filter_rule_creation_logic():
"""The ParserLayerFilterRule should parse the provided dictionary
to correctly initialize itself during creation"""
# Act
data = ParserLayerFilterRule()
# Assert
assert isinstance(data, IParserRuleBase)
assert data.rule_type_name == "layer_filter_rule"
test_parser_multiply_rule
Tests for ParserMultiplyRule class
test_multiply_parser_with_multipliers_incorrect_headers(multipliers_table, expected_message)
Test when multipliers table is available
Source code in tests/data/parsers/test_parser_multiply_rule.py
@pytest.mark.parametrize(
"multipliers_table, expected_message",
[
(
[["date", "end_date", "multipliers"], ["01-01", "15-07", [1, 100]]],
"Missing element start_date",
),
(
[
["start_date", "not_end_date", "multipliers"],
["01-01", "15-07", [1, 100]],
],
"Missing element end_date",
),
(
[
["start_date", "end_date", "something_else"],
["01-01", "15-07", [1, 100]],
],
"Missing element multipliers",
),
],
)
def test_multiply_parser_with_multipliers_incorrect_headers(
multipliers_table: List[List[Any]], expected_message: str
):
"""Test when multipliers table is available"""
# Arrange
contents = {
"name": "testname",
"input_variable": "input",
"output_variable": "output",
"multipliers_table": multipliers_table,
}
logger = Mock(ILogger)
# Act
data = ParserMultiplyRule()
with pytest.raises(AttributeError) as exc_info:
data.parse_dict(contents, logger)
exception_raised = exc_info.value
# Assert.
assert exception_raised.args[0] == expected_message
test_multiply_parser_with_multipliers_table_correct()
Test when multipliers table is available
Source code in tests/data/parsers/test_parser_multiply_rule.py
def test_multiply_parser_with_multipliers_table_correct():
"""Test when multipliers table is available"""
# Arrange
contents = {
"name": "testname",
"input_variable": "input",
"output_variable": "output",
"multipliers_table": [
["start_date", "end_date", "multipliers"],
["01-01", "15-07", [1, 100]],
["16-07", "31-12", [0]],
["16-7", "31-12", [1]],
["1-11", "31-12", [0]],
],
}
logger = Mock(ILogger)
# Act
data = ParserMultiplyRule()
parsed_dict = data.parse_dict(contents, logger)
assert isinstance(parsed_dict, IRuleData)
test_parse_dict_to_rule_data_logic()
Test if a correct dictionary is parsed into a RuleData object
Source code in tests/data/parsers/test_parser_multiply_rule.py
def test_parse_dict_to_rule_data_logic():
"""Test if a correct dictionary is parsed into a RuleData object"""
# Arrange
contents = {
"name": "testname",
"input_variable": "input",
"multipliers": [0.0, 1.0],
"output_variable": "output",
}
logger = Mock(ILogger)
# Act
data = ParserMultiplyRule()
parsed_dict = data.parse_dict(contents, logger)
assert isinstance(parsed_dict, IRuleData)
test_parse_multipliers_type()
Test if an incorrect dictionary is not parsed
Source code in tests/data/parsers/test_parser_multiply_rule.py
def test_parse_multipliers_type():
"""Test if an incorrect dictionary is not parsed"""
# Arrange
contents = {
"name": "testname",
"input_variable": "input",
"multipliers": ["a", "b", 2],
"output_variable": "output",
}
logger = Mock(ILogger)
# Act
data = ParserMultiplyRule()
with pytest.raises(ValueError) as exc_info:
data.parse_dict(contents, logger)
exception_raised = exc_info.value
# Assert
expected_message = (
"ERROR in position 0 is type <class 'str'>. "
"ERROR in position 1 is type <class 'str'>. "
"Multipliers should be a list of int or floats, "
"received: ['a', 'b', 2]"
)
assert exception_raised.args[0] == expected_message
test_parse_wrong_dict_to_rule_data_logic()
Test if an incorrect dictionary is not parsed
Source code in tests/data/parsers/test_parser_multiply_rule.py
def test_parse_wrong_dict_to_rule_data_logic():
"""Test if an incorrect dictionary is not parsed"""
# Arrange
contents = {
"name": "testname",
"input_variable": "input",
"output_variable": "output",
}
logger = Mock(ILogger)
# Act
data = ParserMultiplyRule()
with pytest.raises(AttributeError) as exc_info:
data.parse_dict(contents, logger)
exception_raised = exc_info.value
# Assert
expected_message = "Missing element multipliers_table"
assert exception_raised.args[0] == expected_message
test_parser_multiply_rule_creation_logic()
The ParserMultiplyRule should parse the provided dictionary to correctly initialize itself during creation
Source code in tests/data/parsers/test_parser_multiply_rule.py
def test_parser_multiply_rule_creation_logic():
"""The ParserMultiplyRule should parse the provided dictionary
to correctly initialize itself during creation"""
# Act
data = ParserMultiplyRule()
# Assert
assert isinstance(data, IParserRuleBase)
assert data.rule_type_name == "multiply_rule"
test_parser_response_curve_rule
Tests for ParserResponseCurveRule class
test_parse_dict_to_rule_data_logic()
Test if a correct dictionary is parsed into a RuleData object
Source code in tests/data/parsers/test_parser_response_curve_rule.py
def test_parse_dict_to_rule_data_logic():
"""Test if a correct dictionary is parsed into a RuleData object"""
# Arrange
contents = _get_example_response_curve_rule_dict()
logger = Mock(ILogger)
# Act
data = ParserResponseCurveRule()
parsed_dict = data.parse_dict(contents, logger)
# Assert
assert isinstance(parsed_dict, IRuleData)
test_parse_input_values_type()
Test if dictionary is not parsed in case of invalid input_values
Source code in tests/data/parsers/test_parser_response_curve_rule.py
def test_parse_input_values_type():
"""Test if dictionary is not parsed in case of invalid input_values"""
# Arrange
contents = dict(
{
"name": "test_name",
"description": "description",
"input_variable": "input",
"response_table": [
["input", "output"],
["a", 3],
["b", 2],
[2, 0],
],
"output_variable": "output",
}
)
logger = Mock(ILogger)
# Act
data = ParserResponseCurveRule()
with pytest.raises(ValueError) as exc_info:
data.parse_dict(contents, logger)
exception_raised = exc_info.value
# Assert
expected_message = (
"ERROR in position 0 is type <class 'str'>. "
"ERROR in position 1 is type <class 'str'>. "
"Input values should be a list of int or floats, "
"received: ['a', 'b', 2]"
)
assert exception_raised.args[0] == expected_message
test_parse_output_values_type()
Test if dictionary is not parsed in case of invalid output_values
Source code in tests/data/parsers/test_parser_response_curve_rule.py
def test_parse_output_values_type():
"""Test if dictionary is not parsed in case of invalid output_values"""
# Arrange
contents = dict(
{
"name": "test_name",
"description": "description",
"input_variable": "input",
"response_table": [
["input", "output"],
[1, "a"],
[2, "b"],
[3, 2],
],
"output_variable": "output",
}
)
logger = Mock(ILogger)
# Act
data = ParserResponseCurveRule()
with pytest.raises(ValueError) as exc_info:
data.parse_dict(contents, logger)
exception_raised = exc_info.value
# Assert
expected_message = (
"ERROR in position 0 is type <class 'str'>. "
"ERROR in position 1 is type <class 'str'>. "
"Output values should be a list of int or floats, "
"received: ['a', 'b', 2]"
)
assert exception_raised.args[0] == expected_message
test_parse_response_table_columns()
Test columns of response table to consist of only input and output
Source code in tests/data/parsers/test_parser_response_curve_rule.py
def test_parse_response_table_columns():
"""Test columns of response table to consist of only input and output"""
# Arrange
contents = dict(
{
"name": "test_name",
"description": "description",
"input_variable": "input",
"response_table": [
["input", "output", "extra"],
[1, 4, 7],
[2, 5, 8],
[3, 6, 9],
],
"output_variable": "output",
}
)
logger = Mock(ILogger)
# Act
data = ParserResponseCurveRule()
with pytest.raises(ValueError) as exc_info:
data.parse_dict(contents, logger)
exception_raised = exc_info.value
# Assert
expected_message = "ERROR: response table should have exactly 2 columns"
assert exception_raised.args[0] == expected_message
test_parser_response_curve_rule_creation_logic()
The ParserResponseCurveRule should parse the provided dictionary to correctly initialize itself during creation
Source code in tests/data/parsers/test_parser_response_curve_rule.py
def test_parser_response_curve_rule_creation_logic():
"""The ParserLayerFilterRule should parse the provided dictionary
to correctly initialize itself during creation"""
# Act
data = ParserResponseCurveRule()
# Assert
assert isinstance(data, IParserRuleBase)
assert data.rule_type_name == "response_curve_rule"
test_parser_response_curve_rule_missing_argument(argument_to_remove)
If an argument is missing, the ParserResponseCurveRule should give an error message indicating which one
Source code in tests/data/parsers/test_parser_response_curve_rule.py
@pytest.mark.parametrize(
"argument_to_remove",
[
"name",
"response_table",
"input_variable",
"output_variable",
"description",
],
)
def test_parser_response_curve_rule_missing_argument(argument_to_remove: str):
"""If an argument is missing, the ParserResponseCurveRule
should give an error message indicating which one"""
# Arrange
contents = _get_example_response_curve_rule_dict()
contents.pop(argument_to_remove)
parser = ParserResponseCurveRule()
logger = Mock(ILogger)
# Act
with pytest.raises(AttributeError) as exc_info:
parser.parse_dict(contents, logger)
exception_raised = exc_info.value
# Assert
assert exception_raised.args[0] == "Missing element " + argument_to_remove
test_parser_rolling_statistics_rule
Tests for ParserRollingStatisticsRule class
test_parse_dict_to_rule_data_logic()
Test if a correct dictionary is parsed into a RuleData object
Source code in tests/data/parsers/test_parser_rolling_statistics_rule.py
def test_parse_dict_to_rule_data_logic():
"""Test if a correct dictionary is parsed into a RuleData object"""
# Arrange
contents = {
"name": "testname",
"input_variable": "input",
"operation": "MIN",
"output_variable": "output",
"time_scale": "day",
"period": "2",
}
logger = Mock(ILogger)
# Act
data = ParserRollingStatisticsRule()
parsed_dict = data.parse_dict(contents, logger)
# Assert
assert isinstance(parsed_dict, IRuleData)
test_parse_operation_percentile_valid_parameter()
Test if operation percentile is parsed correctly
Source code in tests/data/parsers/test_parser_rolling_statistics_rule.py
def test_parse_operation_percentile_valid_parameter():
"""Test if operation percentile is parsed correctly"""
# Arrange
contents = {
"name": "testname",
"input_variable": "input",
"operation": "PERCENTILE(999)",
"output_variable": "output",
"time_scale": "year",
"period": "2",
}
logger = Mock(ILogger)
# Act
data = ParserRollingStatisticsRule()
with pytest.raises(ValueError) as exc_info:
data.parse_dict(contents, logger)
exception_raised = exc_info.value
# Assert
expected_message = "Operation percentile should be a number between 0 and 100."
assert exception_raised.args[0] == expected_message
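The test above implies that the numeric argument of a PERCENTILE(...) operation must lie between 0 and 100. An illustrative way to extract and validate that argument (not the parser's actual code) is:

# Illustrative only: extract and validate the percentile parameter from an
# operation string such as "PERCENTILE(999)".
import re

operation = "PERCENTILE(999)"
match = re.fullmatch(r"PERCENTILE\((\d+(?:\.\d+)?)\)", operation)
if match is None or not 0 <= float(match.group(1)) <= 100:
    raise ValueError("Operation percentile should be a number between 0 and 100.")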
test_parse_operation_type()
Test if an incorrect dictionary is not parsed
Source code in tests/data/parsers/test_parser_rolling_statistics_rule.py
def test_parse_operation_type():
"""Test if an incorrect dictionary is not parsed"""
# Arrange
contents = {
"name": "testname",
"input_variable": "input",
"operation": "Minimum",
"output_variable": "output",
"time_scale": "year",
"period": "2",
}
logger = Mock(ILogger)
# Act
data = ParserRollingStatisticsRule()
with pytest.raises(ValueError) as exc_info:
data.parse_dict(contents, logger)
exception_raised = exc_info.value
# Assert
expected_message = (
"Operation 'Minimum' is not of a predefined type. Should be in:"
+ f"{[o.name for o in TimeOperationType]}."
)
assert exception_raised.args[0] == expected_message
test_parser_rolling_statistics_rule_creation_logic()
The ParserRollingStatisticsRule should parse the provided dictionary to correctly initialize itself during creation
Source code in tests/data/parsers/test_parser_rolling_statistics_rule.py
def test_parser_rolling_statistics_rule_creation_logic():
"""The RollingStatisticsRule should parse the provided dictionary
to correctly initialize itself during creation"""
# Act
data = ParserRollingStatisticsRule()
# Assert
assert isinstance(data, IParserRuleBase)
assert data.rule_type_name == "rolling_statistics_rule"
test_parser_step_function_rule
Tests for the ParserStepFunctionRule class
test_parser_does_not_change_ordered__limits()
If the limits are sorted, the ParserStepFunctionRule should not modify the order.
Source code in tests/data/parsers/test_parser_step_function_rule.py
def test_parser_does_not_change_ordered__limits():
"""If the limits are sorted, the ParserStepFunctionRule
should not modify the order."""
# Arrange
rule_dict = {
"name": "Test name",
"description": "Test description",
"limit_response_table": [
["limit", "response"],
[0.0, 0.0],
[1.0, 10.0],
[2.0, 20.0],
[3.0, 30.0],
[4.0, 40.0],
[5.0, 50.0],
[6.0, 60.0],
[7.0, 70.0],
[8.0, 80.0],
[9.0, 90.0],
[10.0, 100.0],
],
"input_variable": "test input variable name",
"output_variable": "test output variable name",
}
parser = ParserStepFunctionRule()
logger = Mock(ILogger)
# Act
parser.parse_dict(rule_dict, logger)
# Assert
logger.log_warning.assert_not_called()
test_parser_sorts_unordered_limits()
The ParserStepFunctionRule should sort all values for limits in an increasing order. The respective responses should also be sorted accordingly.
Source code in tests/data/parsers/test_parser_step_function_rule.py
def test_parser_sorts_unordered_limits():
"""The ParserStepFunctionRule should sort all values for
limits in an increasing order. The respective responses
should also be sorted accordingly."""
# Arrange
rule_dict = {
"name": "Test name",
"description": "Test description",
"limit_response_table": [
["limit", "response"],
[0.0, 0.0],
[1.0, 10.0],
[2.0, 20.0],
[3.0, 30.0],
[4.0, 40.0],
[5.0, 50.0],
[6.0, 60.0],
[7.0, 70.0],
[8.0, 80.0],
[9.0, 90.0],
[10.0, 100.0],
],
"input_variable": "test input variable name",
"output_variable": "test output variable name",
}
parser = ParserStepFunctionRule()
logger = Mock(ILogger)
expected_log_message = "Limits were not ordered. They have been sorted increasingly, and their respective responses accordingly too."
# Act
original_step_function_data = parser.parse_dict(rule_dict, logger)
dictionary_shuffled = dict(rule_dict)
# shuffle limits and responses (but keep the header)
limit_response_table_shuffled = dictionary_shuffled["limit_response_table"][1:]
random.shuffle(limit_response_table_shuffled)
dictionary_shuffled["limit_response_table"][1:] = limit_response_table_shuffled
shuffled_step_function_data = parser.parse_dict(dictionary_shuffled, logger)
# get shuffled limits and responses for comparison
shuffled_limits = list(list(zip(*limit_response_table_shuffled))[0])
shuffled_responses = list(list(zip(*limit_response_table_shuffled))[1])
# Assert
assert isinstance(original_step_function_data, StepFunctionRuleData)
assert isinstance(shuffled_step_function_data, StepFunctionRuleData)
assert original_step_function_data.limits != shuffled_limits
assert original_step_function_data.responses != shuffled_responses
assert original_step_function_data.limits == shuffled_step_function_data.limits
assert (
original_step_function_data.responses == shuffled_step_function_data.responses
)
logger.log_warning.assert_called_with(expected_log_message)
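The behaviour asserted here, ordering the limits increasingly while keeping each response attached to its limit, can be expressed compactly with zip and sorted. The snippet below only illustrates the expected outcome; it is not the parser's implementation.

# Illustrative only: sort limits increasingly and keep responses paired with them.
limits = [3.0, 1.0, 2.0, 0.0]
responses = [30.0, 10.0, 20.0, 0.0]

limits, responses = map(list, zip(*sorted(zip(limits, responses))))
# limits    -> [0.0, 1.0, 2.0, 3.0]
# responses -> [0.0, 10.0, 20.0, 30.0]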
test_parser_step_function_rule_correct_input()
The ParserStepFunctionRule should parse the provided dictionary to correctly initialize itself during creation
Source code in tests/data/parsers/test_parser_step_function_rule.py
def test_parser_step_function_rule_correct_input():
"""The ParserStepFunctionRule should parse the provided dictionary
to correctly initialize itself during creation"""
# Arrange
new_dict = {
"name": "Test name",
"description": "Test description",
"limit_response_table": [
["limit", "response"],
[0.0, 10.0],
[1.0, 20.0],
[2.0, 30.0],
[10.0, 40.0],
],
"input_variable": "test input variable name",
"output_variable": "test output variable name",
}
parser = ParserStepFunctionRule()
logger = Mock(ILogger)
# Act
step_function_data = parser.parse_dict(new_dict, logger)
# Assert
assert isinstance(step_function_data, StepFunctionRuleData)
assert parser.rule_type_name == "step_function_rule"
assert step_function_data.name == "Test name"
assert step_function_data.description == "Test description"
assert step_function_data.limits == [0.0, 1.0, 2.0, 10.0]
assert step_function_data.responses == [10.0, 20.0, 30.0, 40.0]
assert step_function_data.input_variable == "test input variable name"
assert step_function_data.output_variable == "test output variable name"
test_parser_step_function_rule_missing_argument(argument_to_remove)
If an argument is missing, the ParserStepFunctionRule should give an error message indicating which one
Source code in tests/data/parsers/test_parser_step_function_rule.py
@pytest.mark.parametrize(
"argument_to_remove",
["name", "limit_response_table", "input_variable", "output_variable"],
)
def test_parser_step_function_rule_missing_argument(argument_to_remove: str):
"""If an argument is missing, the ParserStepFunctionRule
should give an error message indicating which one"""
# Arrange
example_step_function_dict = _get_example_step_function_dict()
example_step_function_dict.pop(argument_to_remove)
parser = ParserStepFunctionRule()
logger = Mock(ILogger)
# Act
with pytest.raises(AttributeError) as exc_info:
parser.parse_dict(example_step_function_dict, logger)
exception_raised = exc_info.value
# Assert
assert exception_raised.args[0] == "Missing element " + argument_to_remove
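Taken together, these tests document the dictionary the parser expects and the data object it produces. A minimal usage sketch outside pytest (the decoimpact import path is an assumption inferred from the test module; everything else mirrors the tests above):

```python
from unittest.mock import Mock

# The import path below is an assumption based on the project layout implied by
# the tests; only parse_dict and rule_type_name, shown above, are relied on.
from decoimpact.data.parsers.parser_step_function_rule import ParserStepFunctionRule

rule_dict = {
    "name": "Example step rule",
    "description": "Step response to an input variable",
    "limit_response_table": [
        ["limit", "response"],
        [0.0, 1.0],
        [10.0, 0.5],
        [30.0, 0.0],
    ],
    "input_variable": "salinity",
    "output_variable": "salinity_response",
}

parser = ParserStepFunctionRule()
logger = Mock()  # stands in for an ILogger, as in the tests above
rule_data = parser.parse_dict(rule_dict, logger)
# Limits (and their responses) come back sorted increasingly, as the tests assert.
print(rule_data.limits, rule_data.responses)
```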
test_parser_time_aggregation_rule
Tests for ParserTimeAggregationRule class
test_parse_dict_to_rule_data_logic()
Test if a correct dictionary is parsed into a RuleData object
Source code in tests/data/parsers/test_parser_time_aggregation_rule.py
def test_parse_dict_to_rule_data_logic():
"""Test if a correct dictionary is parsed into a RuleData object"""
# Arrange
contents = {
"name": "testname",
"input_variable": "input",
"operation": "MIN",
"output_variable": "output",
"time_scale": "year",
}
logger = Mock(ILogger)
# Act
data = ParserTimeAggregationRule()
parsed_dict = data.parse_dict(contents, logger)
# Assert
assert isinstance(parsed_dict, IRuleData)
test_parse_operation_percentile_has_parameter()
Test that operation PERCENTILE without a parameter raises an error
Source code in tests/data/parsers/test_parser_time_aggregation_rule.py
def test_parse_operation_percentile_has_parameter():
"""Test if operation percentile is parsed correctly"""
# Arrange
contents = {
"name": "testname",
"input_variable": "input",
"operation": "PERCENTILE",
"output_variable": "output",
"time_scale": "year",
}
logger = Mock(ILogger)
# Act
data = ParserTimeAggregationRule()
with pytest.raises(ValueError) as exc_info:
data.parse_dict(contents, logger)
exception_raised = exc_info.value
# Assert
expected_message = (
"Operation percentile is missing valid value like 'percentile(10)'"
)
assert exception_raised.args[0] == expected_message
test_parse_operation_percentile_valid_parameter()
Test that operation PERCENTILE with a value outside 0-100 raises an error
Source code in tests/data/parsers/test_parser_time_aggregation_rule.py
def test_parse_operation_percentile_valid_parameter():
"""Test if operation percentile is parsed correctly"""
# Arrange
contents = {
"name": "testname",
"input_variable": "input",
"operation": "PERCENTILE(999)",
"output_variable": "output",
"time_scale": "year",
}
logger = Mock(ILogger)
# Act
data = ParserTimeAggregationRule()
with pytest.raises(ValueError) as exc_info:
data.parse_dict(contents, logger)
exception_raised = exc_info.value
# Assert
expected_message = "Operation percentile should be a number between 0 and 100."
assert exception_raised.args[0] == expected_message
test_parse_operation_type()
Test if an incorrect dictionary is not parsed
Source code in tests/data/parsers/test_parser_time_aggregation_rule.py
def test_parse_operation_type():
"""Test if an incorrect dictionary is not parsed"""
# Arrange
contents = {
"name": "testname",
"input_variable": "input",
"operation": "Minimum",
"output_variable": "output",
"time_scale": "year",
}
logger = Mock(ILogger)
# Act
data = ParserTimeAggregationRule()
with pytest.raises(ValueError) as exc_info:
data.parse_dict(contents, logger)
exception_raised = exc_info.value
# Assert
expected_message = (
"Operation 'Minimum' is not of a predefined type. Should be in:"
+ f"{[o.name for o in TimeOperationType]}."
)
assert exception_raised.args[0] == expected_message
test_parse_wrong_dict_to_rule_data_logic()
Test if an incorrect dictionary is not parsed
Source code in tests/data/parsers/test_parser_time_aggregation_rule.py
def test_parse_wrong_dict_to_rule_data_logic():
"""Test if an incorrect dictionary is not parsed"""
# Arrange
contents = {
"name": "testname",
"input_variable": "input",
"output_variable": "output",
"time_scale": "year",
}
logger = Mock(ILogger)
# Act
data = ParserTimeAggregationRule()
with pytest.raises(AttributeError) as exc_info:
data.parse_dict(contents, logger)
exception_raised = exc_info.value
# Assert
expected_message = "Missing element operation"
assert exception_raised.args[0] == expected_message
test_parser_time_aggregation_rule_creation_logic()
The ParserTimeAggregationRule should parse the provided dictionary to correctly initialize itself during creation
Source code in tests/data/parsers/test_parser_time_aggregation_rule.py
def test_parser_time_aggregation_rule_creation_logic():
"""The ParserTimeAggregationRule should parse the provided dictionary
to correctly initialize itself during creation"""
# Act
data = ParserTimeAggregationRule()
# Assert
assert isinstance(data, IParserRuleBase)
assert data.rule_type_name == "time_aggregation_rule"
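The error-path tests above imply the constraints on the operation field: it must be one of the predefined TimeOperationType names, and PERCENTILE additionally requires a value between 0 and 100, written like 'PERCENTILE(10)'. A dictionary that should parse cleanly, as a sketch (the import path is an assumption):

```python
from unittest.mock import Mock

# The import path is an assumption based on the project layout implied by the tests.
from decoimpact.data.parsers.parser_time_aggregation_rule import ParserTimeAggregationRule

contents = {
    "name": "yearly 10th percentile",
    "input_variable": "water_depth",
    "operation": "PERCENTILE(10)",  # a value between 0 and 100 is required
    "output_variable": "water_depth_p10",
    "time_scale": "year",
}

rule_data = ParserTimeAggregationRule().parse_dict(contents, Mock())
```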
test_validation_utils
Tests for validation utilities
test_validate_all_instances_number_correct()
Test if all values in a List are numbers
Source code in tests/data/parsers/test_validation_utils.py
def test_validate_all_instances_number_correct():
"""Test if all values in a List are numbers"""
# Arrange
test_list: List[Any] = [1, 2, 3, 4.0]
# # Act
assert validate_all_instances_number(test_list, "test") is None
test_validate_all_instances_number_incorrect()
Validation gives error when not all values in List are numbers.
Source code in tests/data/parsers/test_validation_utils.py
def test_validate_all_instances_number_incorrect():
"""Validation gives error when not all values in List
are numbers."""
# Arrange
test_list: List[Any] = [1, 2, 3, 4.0, "test"]
# Act
with pytest.raises(ValueError) as exc_info:
validate_all_instances_number(test_list, "test")
exception_raised = exc_info.value
# Assert
assert exception_raised.args[0] == (
"ERROR in position 4 is type <class 'str'>. "
"test should be a list of int or floats, received: "
"[1, 2, 3, 4.0, 'test']"
)
test_validate_all_types_dates()
Test if all values in a List are dates
Source code in tests/data/parsers/test_validation_utils.py
def test_validate_all_types_dates():
"""Test if all values in a List are dates"""
# Arrange
test_list: List[Any] = ["02-03", "12-12"]
# # Act
assert validate_type_date(test_list, "test") is None
test_validate_start_before_end_correct()
Test if all elements in the start_date are before end_date
Source code in tests/data/parsers/test_validation_utils.py
def test_validate_start_before_end_correct():
"""Test if all elements in the start_date are before end_date"""
# Arrange
test_start: List[str] = ["01-01", "10-01"]
test_end: List[str] = ["11-01", "13-01"]
# Assert
assert validate_start_before_end(test_start, test_end) is None
test_validate_start_before_end_incorrect()
Check if all elements in the start_date are before end_date
Source code in tests/data/parsers/test_validation_utils.py
def test_validate_start_before_end_incorrect():
"""Check if all elements in the start_date are before end_date"""
test_start: List[str] = ["01-01", "10-01"]
test_end: List[str] = ["02-01", "03-01"]
# Act
with pytest.raises(ValueError) as exc_info:
validate_start_before_end(test_start, test_end)
exception_raised = exc_info.value
# Assert
assert exception_raised.args[0] == (
"All start dates should be before the end dates. ERROR in "
"position 1 where start: 10-01 and end: 03-01."
)
test_validate_table_with_input_correct()
Test if all headers of the table match the list of input variable names
Source code in tests/data/parsers/test_validation_utils.py
def test_validate_table_with_input_correct():
"""Test if all headers of table matches the list of input variable names"""
# Arrange
test_table: Dict[str, Any] = {"a": 1, "b": 2, "output": 3}
test_inputs: List[str] = ["a", "b"]
# Assert
assert validate_table_with_input(test_table, test_inputs) is None
test_validate_table_with_input_incorrect()
Test if all headers of the table match the list of input variable names
Source code in tests/data/parsers/test_validation_utils.py
def test_validate_table_with_input_incorrect():
"""Test if all headers of table matches the list of input variable names"""
# Arrange
test_table: Dict[str, Any] = {"a": 1, "b": 2, "output": 3}
test_inputs: List[str] = ["a", "c"]
headers = list(test_table.keys())
difference = list(set(headers) - set(test_inputs))
# Act
with pytest.raises(ValueError) as exc_info:
validate_table_with_input(test_table, test_inputs)
exception_raised = exc_info.value
# Assert
assert exception_raised.args[0] == (
f"The headers of the table {headers} and the input "
f"variables {test_inputs} should match. "
f"Mismatch: {difference}"
)
test_validate_table_with_input_incorrect_output()
Test if all headers of the table match the list of input variable names
Source code in tests/data/parsers/test_validation_utils.py
def test_validate_table_with_input_incorrect_output():
"""Test if all headers of table matches the list of input variable names"""
# Arrange
test_table: Dict[str, Any] = {"a": 1, "b": 2, "out": 3}
test_inputs: List[str] = ["a", "b"]
# Act
with pytest.raises(ValueError) as exc_info:
validate_table_with_input(test_table, test_inputs)
exception_raised = exc_info.value
# Assert
assert (
exception_raised.args[0] == "Define an output column with the header 'output'."
)
test_validate_type()
Test validation of expected type
Source code in tests/data/parsers/test_validation_utils.py
def test_validate_type():
"""Test validation of expected type"""
# Arrange
variable = 10
name = "test_var"
expected_type = int
# Act
assert validate_type(variable, name, expected_type) is None
test_validate_type_date_with_not_all_correct_date_strings()
Raise a ValueError if an element in the list is not in the correct date string format DD-MM.
Source code in tests/data/parsers/test_validation_utils.py
def test_validate_type_date_with_not_all_correct_date_strings():
"""Raise a ValueError if an element in the list is
not a in the correct date string format DD-MM."""
# Arrange
test_list: List[Any] = ["01-01", "12-12-2021"]
# Act
with pytest.raises(ValueError) as exc_info:
validate_type_date(test_list, "test")
exception_raised = exc_info.value
# Assert
assert exception_raised.args[0] == (
"test should be a list of date strings with Format DD-MM, "
"received: ['01-01', '12-12-2021']. ERROR in position 1, "
"string: 12-12-2021."
)
test_validate_type_date_with_not_all_correct_date_strings_2()
First check if all elements in a list are strings
Source code in tests/data/parsers/test_validation_utils.py
def test_validate_type_date_with_not_all_correct_date_strings_2():
"""First check if all elements in a list are strings"""
# Arrange
test_list: List[Any] = ["10-31"]
# Act
with pytest.raises(ValueError) as exc_info:
validate_type_date(test_list, "test")
exception_raised = exc_info.value
# Assert
assert exception_raised.args[0] == (
"test should be a list of date strings with Format DD-MM, "
"received: ['10-31']. ERROR in position 0, string: 10-31."
)
test_validate_type_date_with_not_all_strings()
Raise a TypeError if an element in the list is not a string.
Source code in tests/data/parsers/test_validation_utils.py
def test_validate_type_date_with_not_all_strings():
"""Raise a TypeError if an element in the list is
not a string."""
# Arrange
test_list: List[Any] = [1, 2, "test"]
# Act
with pytest.raises(TypeError) as exc_info:
validate_type_date(test_list, "test")
exception_raised = exc_info.value
# Assert
assert exception_raised.args[0] == (
"test should be a list of strings, received: [1, 2, 'test']. "
"ERROR in position 0 is type <class 'int'>."
)
test_validate_type_incorrect_input()
Test validation of expected type
Source code in tests/data/parsers/test_validation_utils.py
def test_validate_type_incorrect_input():
"""Test validation of expected type"""
# Arrange
variable = "ten"
name = "test_var"
expected_type = int
# Act
with pytest.raises(ValueError) as exc_info:
validate_type(variable, name, expected_type)
exception_raised = exc_info.value
# Assert
assert (
exception_raised.args[0] == "The inputfield test_var must be of type int, "
"but is of type str"
)
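These helpers are small building blocks used by the parsers; a short sketch of chaining them, under the assumption that they are importable from the validation_utils module exercised here (the module path is an assumption, the function names match the tests above):

```python
# The import path is an assumption; the function names match the tests above.
from decoimpact.data.parsers.validation_utils import (
    validate_all_instances_number,
    validate_start_before_end,
    validate_type,
    validate_type_date,
)

# Each helper returns None on success and raises a ValueError or TypeError otherwise.
validate_type(10, "layer_number", int)
validate_all_instances_number([0.5, 1, 2.5], "limits")
validate_type_date(["01-03", "31-10"], "start_date")  # DD-MM strings
validate_start_before_end(["01-03"], ["31-10"])       # element-wise comparison
```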
test_dictionary_utils
Tests for dictionary utilities
test_get_dict_element()
Test if getting an element of a dictionary works
Source code in tests/data/test_dictionary_utils.py
def test_get_dict_element():
"""Test if getting an element of a dictionary works"""
# Arrange
test_dict: Dict[str, Any] = {"test": 1, "test2": "abc"}
# Act
result1 = get_dict_element("test", test_dict)
result2 = get_dict_element("test2", test_dict)
# Assert
assert result1 == 1
assert result2 == "abc"
test_get_dict_element_should_return_none_if_key_is_missing()
Test if none is returned when the key is not available in the dictionary
Source code in tests/data/test_dictionary_utils.py
def test_get_dict_element_should_return_none_if_key_is_missing():
"""Test if none is returned when the key is not
available in the dictionary"""
# Arrange
test_dict: Dict[str, Any] = {"test": 1, "test2": "abc"}
# Act
result = get_dict_element("test3", test_dict, False)
# Assert
assert result is None
test_get_dict_element_should_throw_if_required_key_is_missing()
Test if an error is thrown if the required key is not available in the dictionary
Source code in tests/data/test_dictionary_utils.py
def test_get_dict_element_should_throw_if_required_key_is_missing():
"""Test if an error is thrown if the required key
is not available in the dictionary"""
# Arrange
test_dict: Dict[str, Any] = {"test": 1, "test2": "abc"}
# Act
with pytest.raises(AttributeError) as exc_info:
get_dict_element("test3", test_dict)
exception_raised = exc_info.value
# Assert
assert exception_raised.args[0] == "Missing element test3"
test_get_table_element()
Test if converting a table to a dict works
Source code in tests/data/test_dictionary_utils.py
def test_get_table_element():
"""Test if converting a table to a dict works"""
# Arrange
test_list: List = [["header1", "header2"], ["val1", "val2"], ["val3", "val4"]]
# Act
result = convert_table_element(test_list)
# Assert
assert result == {"header1": ["val1", "val3"], "header2": ["val2", "val4"]}
test_incorrect_table_shape()
Test if incorrect table shape raises an error
Source code in tests/data/test_dictionary_utils.py
def test_incorrect_table_shape():
"""Test if incorrect table shape raises an error"""
# Arrange
test_list: List = [["header1", "header2", "header1"], ["val1", "val2", "val4"]]
# Act
with pytest.raises(ValueError) as exc_info:
convert_table_element(test_list)
exception_raised = exc_info.value
# Assert
assert (
exception_raised.args[0]
== "There should only be unique headers. Duplicate values: ['header1']"
)
test_table_lentgh_all_rows()
Test if all rows have the same length.
Source code in tests/data/test_dictionary_utils.py
def test_table_lentgh_all_rows():
"""Test if all rows have the same lenght."""
# Arrange
test_list: List = [
["header1", "header2", "header1"],
["val1", "val2", "val4"],
["val1", "val2"],
]
# Act
with pytest.raises(ValueError) as exc_info:
convert_table_element(test_list)
exception_raised = exc_info.value
# Assert
assert (
exception_raised.args[0]
== "Make sure that all rows in the table have the same length."
)
test_table_without_values()
Test if incorrect table raises error
Source code in tests/data/test_dictionary_utils.py
def test_table_without_values():
"""Test if incorrect table raises error"""
# Arrange
test_list: List = [["header1", "header2"]]
# Act
with pytest.raises(ValueError) as exc_info:
convert_table_element(test_list)
exception_raised = exc_info.value
# Assert
assert (
exception_raised.args[0]
== "Define a correct table with the headers in the first row and values in \
the others."
)
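A brief sketch of the two utilities together, based directly on the cases above (the import path is an assumption):

```python
# The import path is an assumption; the function names match the tests above.
from decoimpact.data.dictionary_utils import convert_table_element, get_dict_element

rule = {"name": "example", "table": [["a", "b"], [1, 2], [3, 4]]}

get_dict_element("name", rule)            # -> "example"
get_dict_element("missing", rule, False)  # -> None instead of raising AttributeError
convert_table_element(rule["table"])      # -> {"a": [1, 3], "b": [2, 4]}
```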
testing_utils
Helper module for test utilities
find_log_message_by_level(captured_log, level)
Finds the correct record from the captured_log using the provided level. Only one message is expected to be found.
Parameters:

Name | Type | Description | Default
---|---|---|---
captured_log | LogCaptureFixture | captured log messages (just add "caplog: LogCaptureFixture" to your test method) | required
level | str | level of the log message (like "INFO" or "ERROR") | required

Returns:

Type | Description
---|---
LogRecord | found record for the provided log level
Source code in tests/testing_utils.py
def find_log_message_by_level(captured_log: LogCaptureFixture, level: str) -> LogRecord:
"""Finds the correct record from the captured_log using the provided level
Only one message is expected to be found
Args:
captured_log (LogCaptureFixture): captured log messages
(just add "caplog: LogCaptureFixture"
to your test method)
level (str): level of the log message (like "INFO" or "ERROR")
Returns:
LogRecord: found record for the provided log level
"""
records = list(filter(lambda r: r.levelname == level, captured_log.records))
# expect only one message for the provided level
assert len(records) == 1
return records[0]
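A sketch of how the helper is typically called from a test, with pytest's built-in caplog fixture (the tests.testing_utils import path is an assumption):

```python
from pytest import LogCaptureFixture

from tests.testing_utils import find_log_message_by_level  # assumed import path


def test_something_logs_one_error(caplog: LogCaptureFixture):
    # ... run code under test that is expected to log exactly one ERROR ...
    record = find_log_message_by_level(caplog, "ERROR")
    assert "expected text" in record.getMessage()
```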
get_test_data_path()
Creates default test data folder path based on current test path
Returns:

Type | Description
---|---
str | path to the default test data folder
Source code in tests/testing_utils.py
def get_test_data_path() -> str:
"""Creates default test data folder path based on current test path
Returns:
str: path to the default test data folder
"""
test_info: str = getenv("PYTEST_CURRENT_TEST", "")
return test_info.split(".py::")[0] + "_data"
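Because the helper reads the PYTEST_CURRENT_TEST environment variable, it only resolves meaningfully while a test is running; a sketch of the convention it encodes (the import path and file name are assumptions):

```python
from tests.testing_utils import get_test_data_path  # assumed import path


def test_reads_from_its_data_folder():
    # For a test in tests/data/test_example.py, PYTEST_CURRENT_TEST starts with
    # "tests/data/test_example.py::...", so this resolves to "tests/data/test_example_data".
    data_folder = get_test_data_path()
    with open(f"{data_folder}/some_input.txt") as handle:  # hypothetical file name
        assert handle.read() != ""
```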
tests_acceptance
test_main
Run all acceptance tests available in the input_yaml_files folder
test_process_input(input_filename)
Execute an acceptance test in a separate Python subprocess for each available input YAML file
Parameters:

Name | Type | Description | Default
---|---|---|---
input_filename | str | name of input file | required
Source code in tests_acceptance/test_main.py
@pytest.mark.parametrize("input_filename", input_yaml_filenames)
def test_process_input(input_filename):
"""Execute acceptance test using a python subprocess
using all input yaml files available
Args:
input_filename (str): name of input file
"""
input_file_path = input_yaml_files_path / input_filename
# Build the subprocess command
command = [sys.executable, str(main_script_path), str(input_file_path)]
# Run the script in a separate Python process
process = subprocess.Popen(
command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
)
stdout, stderr = process.communicate()
# do not use stdout ;-)
if MAIN_SCRIPT_NAME != "":
print(stdout)
# Check the exit code
assert (
process.returncode == 0
), f"Script {main_script_path} failed for {input_filename}\n{stderr}"
# Load the generated and reference NetCDF files using xarray
with open(str(input_file_path), "r") as f:
data = yaml.load(f, Loader=SafeLoaderIgnoreUnknown)
output_filename = Path(data["output-data"]["filename"])
if "*" in output_filename.name:
outputname = output_filename.name
else:
outputname = output_filename.stem + "*"
filenames_list = list(reference_files_path.glob(outputname))
assert len(filenames_list) > 0, f"No output files generated for {input_filename}"
for filename in filenames_list:
generated_nc = _xr.open_dataset(output_filename.parent / filename.name, engine="netcdf4")
reference_nc = _xr.open_dataset(filename, engine="netcdf4")
# Compare the datasets if they have matching variables and coordinates
assert generated_nc.equals(
reference_nc
), f"Generated output does not match reference for {input_filename}"