imod.formats.ipf.read#

imod.formats.ipf.read(path, kwargs={}, assoc_kwargs={})[source]#

Read one or more IPF files to a single pandas.DataFrame, including associated (TXT) files.

The different IPF files can be from different model layers, and column names may differ between them.

Note that this function always returns a pandas.DataFrame. IPF files always contain spatial information, for which geopandas.GeoDataFrame is a better fit, in principle. However, GeoDataFrames are not the best fit for the associated data.

To perform spatial operations on the points, you’re likely best served by (temporarily) creating a GeoDataFrame, doing the spatial operation, and then using the output to select values in the original DataFrame. Please refer to the examples.

Parameters:
  • path (str, Path or list) – This can be a single file, ‘wells_l1.ipf’, a glob pattern expansion, ‘wells_l*.ipf’, or a list of files, [‘wells_l1.ipf’, ‘wells_l2.ipf’]. Note that each file needs to have the same columns, such that they can be combined in a single pd.DataFrame.

  • kwargs (dict) – Dictionary containing the pandas.read_csv() keyword arguments for the IPF files (e.g. {“delim_whitespace”: True})

  • assoc_kwargs (dict) – Dictionary containing the pandas.read_csv() keyword arguments for the associated (TXT) files (e.g. {“delim_whitespace”: True})

Return type:

pandas.DataFrame

Examples

Read an IPF file into a dataframe:

>>> import imod
>>> df = imod.ipf.read("example.ipf")

Convert the x and y data into a GeoDataFrame, do a spatial operation, and use it to select points within a polygon. Note: gpd.points_from_xy() requires a geopandas version >= 0.5.

>>> import geopandas as gpd
>>> polygon = gpd.read_file("polygon.shp").geometry[0]
>>> ipf_points = gpd.GeoDataFrame(geometry=gpd.points_from_xy(df["x"], df["y"]))
>>> within_polygon = ipf_points.within(polygon)
>>> selection = df[within_polygon]

The same exercise is a little more complicated when associated files (like timeseries) are involved, since many duplicate values of x and y will exist. The easiest way to isolate these is by applying a groupby, and then taking first of x and y of every group:

>>> df = imod.ipf.read("example_with_time.ipf")
>>> first = df.groupby("id").first()  # replace "id" by what your ID column is called
>>> x = first["x"]
>>> y = first["y"]
>>> id_code = first.index  # id is a reserved keyword in python
>>> ipf_points = gpd.GeoDataFrame(geometry=gpd.points_from_xy(x, y))
>>> within_polygon = ipf_points.within(polygon)

Using the result is a little more complicated as well, since it has to be mapped back to many duplicate values of the original dataframe. There are two options. First, by using the index:

>>> within_polygon.index = id_code
>>> df = df.set_index("id")
>>> selection = df[within_polygon]

If you do not wish to change index on the original dataframe, use pandas.DataFrame.merge() instead.

>>> import pandas as pd
>>> within_polygon = pd.DataFrame({"within": within_polygon})
>>> within_polygon["id"] = id_code
>>> df = df.merge(within_polygon, on="id")
>>> df = df[df["within"]]