Modules#
hatyan.cli module#
- Console script for hatyan.
hatyan --help
hatyan.hatyan_core module#
hatyan_core.py contains core components that wrap around schureman/foreman definitions
- hatyan.hatyan_core.get_const_list_hatyan(listtype)[source]#
Definition of several hatyan component lists, taken from the tidegui initializetide.m code, often originating from correspondence with Koos Doekes
Parameters#
- listtypestr
- The type of the components list to be retrieved, options:
‘all’: all available components in hatyan_python
‘all_originalorder’: all 195 hatyan components in original hatyan-FORTRAN order
‘year’: default list of 94 hatyan components
‘halfyear’: list of 88 components to be used when analyzing approximately half a year of data
‘month’: list of 21 components to be used when analyzing one month of data. If desired, the K1 component can be split in P1/K1, N2 in N2/Nu2, S2 in T2/S2/K2 and 2MN2 in Labda2/2MN2.
‘month_deepwater’: list of 21 components to be used when analyzing one month of data for deep water (from tidegui).
‘springneap’: list of 14 components to be used when analyzing one spring neap period (approximately 15 days) of data
‘day’: list of 10 components to be used when analyzing one day
‘day_tidegui’: list of 5 components to be used when analyzing one day or two tidal cycles (from tidegui)
‘tidalcycle’: list of 6 components to be used when analyzing one tidal cycle (approximately 12 hours and 25 minutes)
Raises#
- Exception
DESCRIPTION.
Returns#
- const_list_hatyanlist of str
A list of component names.
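As a rough illustration of how the list types above relate to record length, the sketch below picks a list type name from the duration of a timeseries. The helper `suggest_listtype` and its thresholds (in days) are hypothetical and only loosely follow the period descriptions above; they are not part of hatyan.

```python
# Hypothetical helper, not part of hatyan. The minimum durations (in days) are
# illustrative assumptions based on the period descriptions in the list above.
LISTTYPE_MIN_DAYS = [
    ("year", 365.0),
    ("halfyear", 182.0),
    ("month", 29.0),
    ("springneap", 15.0),
    ("day", 1.0),
    ("tidalcycle", 12.42 / 24),  # one M2 period expressed in days
]


def suggest_listtype(duration_days):
    """Return the first list type whose assumed minimum duration is met."""
    for listtype, min_days in LISTTYPE_MIN_DAYS:
        if duration_days >= min_days:
            return listtype
    raise ValueError("record is shorter than one tidal cycle")


choice_long = suggest_listtype(400)
choice_month = suggest_listtype(30)
choice_short = suggest_listtype(0.6)
```

The returned string is meant to be passed as the `listtype` argument described above.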
hatyan.analysis_prediction module#
analysis_prediction.py contains hatyan definitions related to tidal analysis and prediction.
- hatyan.analysis_prediction.analysis(ts, const_list, nodalfactors=True, fu_alltimes=True, xfac=False, source='schureman', cs_comps=None, analysis_perperiod=False, return_allperiods=False, max_matrix_condition=12)[source]#
Analysis of timeseries. Optionally processes a timeseries per year and vector averages the results afterwards. The timezone of the timeseries will also be reflected in the phases of the resulting component set, so the resulting component set can be used to make a prediction in the original timezone.
Parameters#
- tspandas.DataFrame
The DataFrame should contain a ‘values’ column and a pd.DatetimeIndex as index, it contains the timeseries to be analysed, as obtained from e.g. readts_*.
- const_listlist, pandas.Series or str
list or pandas.Series: contains the tidal constituent names for which to analyse the provided timeseries ts. str: a predefined name of a component set for hatyan_core.get_const_list_hatyan()
- nodalfactorsbool
Whether or not to apply nodal factors. The default is True.
- fu_alltimesbool
Whether to calculate nodal factors in the middle of the analysis/prediction period (default) or on every timestep. The default is True.
- xfacbool
Whether or not to apply x-factors. The default is False.
- sourceTYPE
DESCRIPTION. The default is ‘schureman’.
- cs_compspandas.DataFrame, optional
contains the from/derive component lists for components splitting, as well as the amplitude factor and the increase in degrees. Only relevant for analysis. The default is None.
- max_matrix_condition: float or int
the maximum condition of the xTx matrix. The default is 12.
- analysis_perperiodFalse or Y/Q/W, optional
Caution: this tries to analyse each year/quarter/month separately, but skips a period if its analysis fails. The default is False.
- return_allperiodsbool, optional
Only relevant if analysis_perperiod is set (not False). The default is False.
Returns#
- COMP_mean_pdpandas.DataFrame
The DataFrame contains the component data with component names as index, and columns ‘A’ and ‘phi_deg’.
- COMP_all_pdpandas.DataFrame, optional
The same as COMP_mean_pd, but with the results of all periods added in a MultiIndex.
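The vector averaging mentioned in the description can be illustrated without hatyan. The sketch below averages the (A, phi_deg) pairs of one constituent over several periods by treating each pair as a complex vector; the function name `vector_average` is hypothetical, and whether hatyan implements the averaging in exactly this way is an assumption.

```python
import cmath
import math


def vector_average(components):
    """Average (A, phi_deg) pairs of one constituent as complex vectors.

    Hypothetical helper for illustration only, not part of hatyan.
    """
    vectors = [cmath.rect(a, math.radians(phi_deg)) for a, phi_deg in components]
    mean = sum(vectors) / len(vectors)
    amplitude = abs(mean)
    phase_deg = math.degrees(cmath.phase(mean)) % 360
    return amplitude, phase_deg


# Two periods with equal amplitude and phases on either side of 0 degrees.
# A plain arithmetic mean of the phases would give 180, which is wrong.
a_mean, phi_mean = vector_average([(1.0, 350.0), (1.0, 10.0)])
```

Averaging vectors instead of raw phases avoids the wrap-around problem at 0/360 degrees, and it reduces the mean amplitude when the phases of the periods disagree.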
- hatyan.analysis_prediction.prediction(comp, times=None, timestep=None)[source]#
Generates a tidal prediction from a set of components A and phi values. The component set has the same timezone as the timeseries used to create it. If times is timezone-naive the resulting prediction will be in component timezone. If times is timezone-aware the resulting prediction will be converted to that timezone. If a components dataframe contains multiple column levels (multiple periods), the prediction is a concatenation of predictions of all periods (based on the respective A/phi values).
Parameters#
- comppd.DataFrame
The DataFrame contains the component data with component names as index, and columns ‘A’ and ‘phi_deg’.
- times(pd.DatetimeIndex,slice), optional
pd.DatetimeIndex with prediction timeseries or slice(tstart,tstop,timestep) to construct it from. If None, pd.DatetimeIndex is constructed from the tstart/tstop/timestep metadata attrs of the comp object. Only allowed/relevant for component dataframes with single-level columns (single period). The default is None.
- timestepstr
Only allowed/relevant for component dataframes with multi-level columns (different periods). The string is parsed with pandas.tseries.frequencies.to_offset(). The default is None.
Returns#
- ts_predictionTYPE
The DataFrame contains a ‘values’ column and a pd.DatetimeIndex as index, it contains the prediction times and values.
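The core idea of a harmonic prediction from A and phi values can be sketched as a sum of cosines. This is a deliberately simplified illustration: it ignores nodal factors, x-factors, the astronomical argument at the reference time and all timezone handling, so it does not reproduce hatyan output. The function `predict` and the `SPEEDS_DEG_PER_HR` table are hypothetical names; the M2 and S2 angular speeds are the standard values.

```python
import math

# Angular speeds in degrees per hour (standard values for M2 and S2).
SPEEDS_DEG_PER_HR = {"M2": 28.9841042, "S2": 30.0}


def predict(comp, hours):
    """Sum A*cos(speed*t - phi) over all components for each time in hours.

    comp maps a component name to (A, phi_deg). Simplified sketch only:
    no nodal factors and no astronomical argument are applied.
    """
    values = []
    for t in hours:
        total = 0.0
        for name, (amplitude, phi_deg) in comp.items():
            angle_deg = SPEEDS_DEG_PER_HR[name] * t - phi_deg
            total += amplitude * math.cos(math.radians(angle_deg))
        values.append(total)
    return values


comp_example = {"M2": (1.0, 0.0), "S2": (0.5, 0.0)}
values_example = predict(comp_example, [0.0, 6.0])
```

At t=0 both components are in phase and add up to 1.5; six hours later S2 has advanced exactly 180 degrees and M2 almost 174 degrees, so the sum is close to -1.5.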
hatyan.components module#
components.py contains all the definitions related to hatyan components.
- hatyan.components.merge_componentgroups(comp_main, comp_sec)[source]#
Merges the provided component groups into one
Parameters#
- comp_mainpd.DataFrame
The reference component dataframe (with A/phi columns).
- comp_secpd.DataFrame
The dataframe with the components that will be used to overwrite the components in comp_main.
Returns#
- comp_mergedpd.DataFrame
The merged dataframe, a copy of comp_main with all components from comp_sec overwritten.
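The overwrite semantics described above can be sketched with plain dictionaries, where each component name maps to an (A, phi_deg) pair. The function `merge_components` is a hypothetical stand-in for illustration; that components present only in comp_sec are added to the result is an assumption of this sketch.

```python
def merge_components(comp_main, comp_sec):
    """Return a copy of comp_main with the entries from comp_sec overwritten.

    Hypothetical dict-based sketch, not the hatyan implementation.
    """
    merged = dict(comp_main)  # copy, so comp_main itself is left untouched
    merged.update(comp_sec)   # components from comp_sec take precedence
    return merged


comp_main = {"M2": (1.00, 80.0), "S2": (0.25, 140.0), "SA": (0.05, 200.0)}
comp_sec = {"SA": (0.08, 215.0)}
comp_merged = merge_components(comp_main, comp_sec)
```

A typical use would be to take most components from a short analysis and overwrite a few long-period ones with values from a longer record.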
- hatyan.components.plot_components(comp, comp_allperiods=None, comp_validation=None, sort_freqs=True)[source]#
Create a plot with the provided analysis results
Parameters#
- compTYPE
DESCRIPTION.
- comp_allperiodsTYPE, optional
DESCRIPTION. The default is None.
- comp_validationTYPE, optional
DESCRIPTION. The default is None.
- sort_freqsBOOL, optional
Whether to sort the component list on frequency or not, without sorting it is possible to plot components that are not available in hatyan. The default is True.
Returns#
- figmatplotlib.figure.Figure
The generated figure handle, with which the figure can be adapted and saved.
- axs(tuple of) matplotlib.axes._subplots.AxesSubplot
The generated axis handle, with which the figure can be adapted.
hatyan.timeseries module#
timeseries.py contains all definitions related to hatyan timeseries.
- class hatyan.timeseries.Timeseries_Statistics(ts)[source]#
Bases: object
Returns several statistics of the provided timeseries as a Timeseries_Statistics class, which is like a dict that pretty prints automatically.
Parameters#
- tspandas.DataFrame
The DataFrame should contain a ‘values’ column and a pd.DatetimeIndex as index, it contains the timeseries to be checked.
Returns#
- stats: class Timeseries_Statistics
Timeseries_Statistics is like a dict that pretty prints automatically.
- hatyan.timeseries.calc_HWLW(ts, calc_HWLW345=False, buffer_hr=6)[source]#
Calculates extremes (high and low waters) for the provided timeseries. This definition uses scipy.signal.find_peaks() with arguments ‘distance’ and ‘prominence’. The minimal ‘distance’ between two high or low water peaks is based on the M2 period: 12.42/1.5=8.28 hours for HW and 12.42/1.7=7.30 hours for LW (a larger divisor because of aggers). The prominence for local extremes is set to 0.01m, to filter out very minor dips in the timeseries. If there are two equal high or low water values, the first one is taken. No main high/low waters are calculated within 6 hours of the start/end of the timeseries (keyword buffer_hr), since these can be invalid. Since scipy.signal.find_peaks() warns about NaN values, those are removed first. NaNs/gaps can influence the results since find_peaks does not know about time registration. This is also tricky for input timeseries with a varying time interval.
Parameters#
- tspandas.DataFrame
The DataFrame should contain a ‘values’ column and a pd.DatetimeIndex as index, it contains the timeseries with a tidal prediction or water level measurements.
- calc_HWLW345boolean, optional
Whether to also calculate local extremes, first/second low waters and ‘aggers’. The default is False, in which case only extremes per tidal period are calculated. When first/second low waters and aggers are calculated, the local extremes around high water (e.g. double high waters and dips) are filtered out first.
Returns#
- data_pd_HWLWpandas.DataFrame
The DataFrame contains columns ‘times’, ‘values’ and ‘HWLWcode’, it contains the times, values and codes of the timeseries that are extremes. The codes are 1 (high water) and 2 (low water), and if calc_HWLW345=True also 3 (first low water), 4 (agger) and 5 (second low water).
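The minimal peak distances quoted in the description are given in hours, while a peak finder that works on array indices needs them in samples. The sketch below does that conversion for an equidistant timeseries; the helper `min_peak_distance_samples` is hypothetical, and how hatyan rounds or passes this value on is not stated in the text above.

```python
M2_PERIOD_HR = 12.42  # M2 period in hours, as used in the description above


def min_peak_distance_samples(timestep_min, divisor):
    """Convert the minimal peak distance (M2 period / divisor) to samples.

    Hypothetical helper for illustration; assumes an equidistant timeseries.
    """
    distance_hr = M2_PERIOD_HR / divisor
    return distance_hr * 60 / timestep_min


# For a 10-minute timeseries: divisor 1.5 for high waters, 1.7 for low waters.
distance_hw = min_peak_distance_samples(10, 1.5)
distance_lw = min_peak_distance_samples(10, 1.7)
```

For 10-minute data this gives roughly 49.7 samples for high waters and 43.8 samples for low waters, which shows why a varying time interval is problematic: the same number of samples then no longer corresponds to a fixed number of hours.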
- hatyan.timeseries.calc_HWLW12345to12(data_HWLW_12345)[source]#
Parameters#
- data_HWLW_12345TYPE
DESCRIPTION.
Returns#
None.
- hatyan.timeseries.calc_HWLWnumbering(ts_ext, station=None, doHWLWcheck=True)[source]#
For calculation of the extremes numbering, w.r.t. the first high water at Cadzand in 2000 (occurred on 1-1-2000 at approximately 9:45). The number of every high and low water is calculated by taking the time difference between itself and the first high water at Cadzand, correcting it with the station phase difference (M2phasediff). Low waters are searched for half an M2 period from the high waters. By adding a search window of half the period of M2 (searchwindow_hr), even strong time variance between consecutive high or low waters should be captured.
Parameters#
- ts_extpandas.DataFrame
The DataFrame should contain a ‘values’ and ‘HWLWcode’ column and a pd.DatetimeIndex as index, it contains the times, values and codes of the timeseries that are extremes.
- station: string, optional
The station for which the M2 phase difference should be retrieved from data_M2phasediff_perstation.txt. This value is the phase difference in degrees of the occurrence of the high water generated by the same tidal wave as the first high water in 2000 at Cadzand (actually difference between M2 phases of stations). This value is used to correct the search window of high/low water numbering. The default is None. Providing a value will result in a proper HWLWno, corresponding to CADZD. Providing None will result in a HWLWno that is a multiple of 360degrees/M2_period_hr off (positive or negative). This is only an issue when comparing different stations, not comparing e.g. measured and predicted HW values of one station.
Raises#
- Exception
DESCRIPTION.
Returns#
- ts_extpandas.DataFrame
The input DataFrame with the column ‘HWLWno’ added, which contains the numbers of the extremes.
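The numbering idea can be sketched as counting M2 periods since the reference high water, after shifting the time by the station phase difference. The function `hw_number`, the exact reference time and the rounding are assumptions made for this illustration; the sketch counts elapsed periods starting at 0 and does not claim to match the offset or search-window logic used by hatyan.

```python
from datetime import datetime, timedelta

M2_PERIOD_HR = 12.42                # M2 period in hours, as used in this module
T_REF = datetime(2000, 1, 1, 9, 45)  # approximate first high water at Cadzand


def hw_number(t_hw, m2_phasediff_deg=0.0):
    """Count M2 periods between the reference high water and t_hw.

    Hypothetical sketch: the station phase difference (degrees) is converted
    to hours and subtracted before dividing by the M2 period.
    """
    phase_shift_hr = m2_phasediff_deg / 360 * M2_PERIOD_HR
    elapsed_hr = (t_hw - T_REF).total_seconds() / 3600 - phase_shift_hr
    return round(elapsed_hr / M2_PERIOD_HR)


number_ref = hw_number(T_REF)
# A high water one hour "late" after ten periods still rounds to number 10.
number_ten = hw_number(T_REF + timedelta(hours=10 * M2_PERIOD_HR + 1))
# A station lagging 180 degrees sees the same tidal wave half a period later.
number_lag = hw_number(T_REF + timedelta(hours=M2_PERIOD_HR / 2), m2_phasediff_deg=180.0)
```

Rounding to the nearest whole period plays the role of the search window here: moderate variation in the timing of consecutive high waters does not change the resulting number.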
- hatyan.timeseries.crop_timeseries(ts, times, onlyfull=True)[source]#
Crops the provided timeseries
Parameters#
- tspandas.DataFrame
The DataFrame should contain a ‘values’ column and a pd.DatetimeIndex as index, it contains the timeseries.
- timesslice
slice(tstart,tstop).
Raises#
- Exception
DESCRIPTION.
Returns#
- ts_pd_outTYPE
DESCRIPTION.
- hatyan.timeseries.plot_HWLW_validatestats(ts_ext, ts_ext_validation)[source]#
This definition calculates (and plots and prints) some statistics when comparing extreme values. This is done by calculating the extreme number (sort of relative to Cadzand 1jan2000, but see ‘warning’) and subtracting the ts_ext and ts_ext_validation dataframes based on these numbers (and HWLWcode). It will only result in values for the overlapping extremes, other values will be NaN and are not considered for the statistics. Warning: the calculated extreme numbers in this definition are not corrected for the real phase difference with the M2phasediff argument, so the calculated extreme numbers are fine for internal use (to match corresponding extremes) but the absolute number might be incorrect.
Parameters#
- ts_extpandas.DataFrame
The DataFrame should contain a ‘values’ and ‘HWLWcode’ column and a pd.DatetimeIndex as index, it contains the times, values and codes of the timeseries that are extremes.
- ts_ext_validationpandas.DataFrame
The DataFrame should contain a ‘values’ and ‘HWLWcode’ column and a pd.DatetimeIndex as index, it contains the times, values and codes of the timeseries that are extremes.
Returns#
- figmatplotlib.figure.Figure
The generated figure handle, with which the figure can be adapted and saved.
- axs(tuple of) matplotlib.axes._subplots.AxesSubplot
The generated axis handle, with which the figure can be adapted.
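The matching step described above (subtracting two sets of extremes based on their number and code) can be sketched with dictionaries keyed on (HWLWno, HWLWcode). The function `extreme_differences` and the data layout are hypothetical; unlike the DataFrame-based approach, this sketch simply omits non-overlapping extremes instead of keeping them as NaN.

```python
from datetime import datetime


def extreme_differences(ext, ext_validation):
    """Return time (minutes) and value differences for overlapping extremes.

    Both inputs map (HWLWno, HWLWcode) to (time, value). Hypothetical sketch.
    """
    diffs = {}
    for key in sorted(ext.keys() & ext_validation.keys()):
        t, value = ext[key]
        t_val, value_val = ext_validation[key]
        diffs[key] = ((t - t_val).total_seconds() / 60, value - value_val)
    return diffs


ext = {
    (1, 1): (datetime(2020, 1, 1, 3, 10), 1.50),
    (1, 2): (datetime(2020, 1, 1, 9, 20), -1.20),
}
ext_validation = {
    (1, 1): (datetime(2020, 1, 1, 3, 0), 1.40),
    (2, 1): (datetime(2020, 1, 1, 15, 30), 1.45),
}
diffs = extreme_differences(ext, ext_validation)
```

Only the high water with number 1 occurs in both sets, so only that extreme contributes to the statistics.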
- hatyan.timeseries.plot_timeseries(ts, ts_validation=None, ts_ext=None, ts_ext_validation=None)[source]#
Creates a plot with the provided timeseries
Parameters#
- tspandas.DataFrame
The DataFrame should contain a ‘values’ column and a pd.DatetimeIndex as index, it contains the timeseries.
- ts_validationpandas.DataFrame, optional
The DataFrame should contain a ‘values’ column and a pd.DatetimeIndex as index, it contains the timeseries. The default is None.
- ts_extpandas.DataFrame, optional
The DataFrame should contain a ‘values’ and ‘HWLWcode’ column and a pd.DatetimeIndex as index, it contains the times, values and codes of the timeseries that are extremes. The default is None.
- ts_ext_validationpandas.DataFrame, optional
The DataFrame should contain a ‘values’ and ‘HWLWcode’ column and a pd.DatetimeIndex as index, it contains the times, values and codes of the timeseries that are extremes. The default is None.
Returns#
- figmatplotlib.figure.Figure
The generated figure handle, with which the figure can be adapted and saved.
- axs(tuple of) matplotlib.axes._subplots.AxesSubplot
The generated axis handle, with which the figure can be adapted.
- hatyan.timeseries.read_dia(filename, station=None, block_ids=None, allow_duplicates=False)[source]#
Reads an equidistant or non-equidistant dia file, or a list of dia files. Also works for dia files containing multiple blocks for one station.
Parameters#
- filenameTYPE
DESCRIPTION.
- stationTYPE
DESCRIPTION. The default is None.
- block_idsint, list of int or ‘allstation’, optional
DESCRIPTION. The default is None.
Returns#
- data_pdpandas.core.frame.DataFrame
DataFrame with a ‘values’ column and a pd.DatetimeIndex as index in case of an equidistant file, or more columns in case of a non-equidistant file.
- hatyan.timeseries.read_noos(filename, datetime_format='%Y%m%d%H%M', na_values=None)[source]#
Reads a noos file
Parameters#
- filenameTYPE
DESCRIPTION.
- datetime_formatTYPE, optional
DESCRIPTION. The default is ‘%Y%m%d%H%M’.
- na_valuesTYPE, optional
DESCRIPTION. The default is None.
Returns#
- data_pdTYPE
DESCRIPTION.
- hatyan.timeseries.resample_timeseries(ts, timestep_min, tstart=None, tstop=None)[source]#
Resamples the provided timeseries. Only overlapping timesteps are selected, so no interpolation is applied. With tstart/tstop it is possible to extend the timeseries with NaN values.
Parameters#
- tspandas.DataFrame
The DataFrame should contain a ‘values’ and ‘HWLWcode’ column and a pd.DatetimeIndex as index, it contains the timeseries to be resampled.
- timestep_minint
The number of minutes with which to resample the timeseries.
- tstartdt.datetime, optional
the start date for the resampled timeseries, the default is None which results in using the start date of the input ts.
- tstopdt.datetime, optional
the stop date for the resampled timeseries, the default is None which results in using the stop date of the input ts.
Returns#
- data_pd_resamplepandas.DataFrame with a ‘values’ column and a pd.DatetimeIndex as index
the resampled timeseries.
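The resampling behaviour described above (select only timesteps that already exist, fill everything else with NaN) can be sketched with a dictionary that maps datetimes to values. The function `resample` is a hypothetical stdlib stand-in, not the pandas-based hatyan implementation.

```python
import math
from datetime import datetime, timedelta


def resample(ts, timestep_min, tstart=None, tstop=None):
    """Select existing values on a regular time grid, without interpolation.

    ts maps datetime to value. Grid times without a value become NaN.
    Hypothetical sketch for illustration only.
    """
    tstart = min(ts) if tstart is None else tstart
    tstop = max(ts) if tstop is None else tstop
    step = timedelta(minutes=timestep_min)
    resampled = {}
    t = tstart
    while t <= tstop:
        resampled[t] = ts.get(t, math.nan)
        t += step
    return resampled


# Ten-minute values from 00:00 to 01:00, resampled to 30 minutes.
t0 = datetime(2020, 1, 1)
ts_10min = {t0 + timedelta(minutes=10 * i): float(i) for i in range(7)}
ts_30min = resample(ts_10min, 30)
ts_extended = resample(ts_10min, 30, tstop=t0 + timedelta(hours=2))
```

In the extended result the timesteps after 01:00 are present but contain NaN, which mirrors how tstart/tstop can be used to pad a timeseries.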
- hatyan.timeseries.timeseries_fft(ts_residue, min_prominence=1000, max_freqdiff=None, plot_fft=True, source='schureman')[source]#
- hatyan.timeseries.write_dia(ts, filename, headerformat='dia')[source]#
Writes the timeseries to an equidistant dia file or the extremes to a non-equidistant dia file. This is only supported for timeseries with a UTC+1 timestamp, since DONAR (and therefore dia) data is always in UTC+1 (MET/CET), also during summertime periods.
Parameters#
- tspandas.DataFrame
The DataFrame should contain a ‘values’ column and a pd.DatetimeIndex as index, it contains the timeseries. In case of extremes, the DataFrame should also contain a ‘HWLWcode’ column.
- stationTYPE
DESCRIPTION.
- vertrefTYPE
DESCRIPTION.
- filenameTYPE
DESCRIPTION.
Raises#
- Exception
DESCRIPTION.
Returns#
None.
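The UTC+1 requirement can be checked before writing with the standard library alone. The sketch below uses a fixed-offset timezone, so it also holds during summertime periods; the names `MET` and `is_utc_plus_one` are hypothetical and this is not the check that write_dia itself performs.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC+1 offset without daylight saving time (MET), as used by DONAR.
MET = timezone(timedelta(hours=1))


def is_utc_plus_one(t):
    """Return True if t is timezone-aware with a fixed offset of +01:00."""
    return t.utcoffset() == timedelta(hours=1)


t_utc = datetime(2020, 7, 1, 12, 0, tzinfo=timezone.utc)
t_met = t_utc.astimezone(MET)  # 13:00 in UTC+1, also in summer
ok_met = is_utc_plus_one(t_met)
ok_utc = is_utc_plus_one(t_utc)
ok_naive = is_utc_plus_one(datetime(2020, 7, 1, 12, 0))
```

Timezone-naive timestamps fail this check because their offset is unknown, so they would need to be localized explicitly first.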
- hatyan.timeseries.write_netcdf(ts, filename, ts_ext=None, nosidx=False, mode='w')[source]#
Writes the timeseries to a netCDF file
Parameters#
- tspandas.DataFrame
The DataFrame should contain a ‘values’ column and a pd.DatetimeIndex as index, it contains the timeseries.
- stationstr
DESCRIPTION.
- vertrefstr
DESCRIPTION.
- filenamestr
The filename of the netCDF file that will be written.
- ts_extpandas.DataFrame, optional
The DataFrame should contain a ‘values’ and ‘HWLWcode’ column and a pd.DatetimeIndex as index, it contains the times, values and codes of the timeseries that are extremes. The default is None.
- tzone_hrint, optional
The timezone (GMT+tzone_hr) that applies to the data. The default is 1 (MET).
Returns#
None.
hatyan.ddlpy_helpers module#
ddlpy_helpers.py contains functions to convert ddlpy timeseries dataframes to hatyan timeseries dataframes. The package ddlpy is available at Deltares/ddlpy
- hatyan.ddlpy_helpers.ddlpy_to_hatyan(ddlpy_meas, ddlpy_meas_exttyp=None)[source]#
Convert ddlpy measurements to hatyan timeseries dataframe.
Parameters#
- ddlpy_measpd.DataFrame
ddlpy measurements dataframe. It is assumed that it contains numeric values that represent waterlevel timeseries or waterlevel extremes (measured or astro).
- ddlpy_meas_exttyppd.DataFrame, optional
ddlpy measurements dataframe. If it is supplied it is assumed that it contains alphanumeric values and these represent tidal extreme types (high- and low waters). The default is None.
Returns#
- pd.DataFrame
hatyan timeseries DataFrame with values/qualitycode/status columns. If ddlpy_meas_exttyp is supplied, this DataFrame will also include a HWLWcode column.
hatyan.astrog module#
astrog.py contains all astro-related definitions, previously embedded in a separate program but now part of hatyan.
- hatyan.astrog.astrog_anomalies(tFirst, tLast, dT_fortran=False)[source]#
Makes use of the definitions dT, astrab and astrac. Calculates lunar anomalies. The lunar anomalies are independent of coordinates.
Parameters#
- tFirstdatetime.datetime or string (“yyyymmdd”)
Start of timeframe for output.
- tLastdatetime.datetime or string (“yyyymmdd”)
End of timeframe for output.
- dT_fortranboolean, optional
Reproduce fortran difference between universal time and terrestrial time (dT). Can be True (for latest fortran reproduction) or False (international definition). The default is False.
Raises#
- Exception
Checks input times tFirst and tLast.
Returns#
- astrog_dfpandas DataFrame
datetime: lunar anomaly in UTC (datetime); type: type of anomaly (1=perigee, 2=apogee)
- hatyan.astrog.astrog_culminations(tFirst, tLast, dT_fortran=False)[source]#
Makes use of the definitions dT, astrab and astrac. Calculates lunar culminations, parallax and declination. By default the lunar culmination is calculated at coordinates lon=0 (Greenwich), since EHMOON is used to calculate it. Possible to add lon-correction at end of definition.
Parameters#
- tFirstpd.Timestamp, datetime.datetime or string (“yyyymmdd”)
Start of timeframe for output.
- tLastpd.Timestamp, datetime.datetime or string (“yyyymmdd”)
End of timeframe for output.
- dT_fortranboolean, optional
Reproduce fortran difference between universal time and terrestrial time (dT). Can be True (for latest fortran reproduction) or False (international definition). The default is False.
Raises#
- Exception
Checks input times tFirst and tLast.
Returns#
- astrog_dfpandas DataFrame
datetime: lunar culmination at Greenwich in UTC (datetime); type: type of culmination (1=lower, 2=upper); parallax: lunar parallax (degrees); declination: lunar declination (degrees)
- hatyan.astrog.astrog_moonriseset(tFirst, tLast, dT_fortran=False, lon=5.3876, lat=52.1562)[source]#
Makes use of the definitions dT, astrab and astrac. Calculates moonrise and -set at requested location.
Parameters#
- tFirstdatetime.datetime or string (“yyyymmdd”)
Start of timeframe for output.
- tLastdatetime.datetime or string (“yyyymmdd”)
End of timeframe for output.
- dT_fortranboolean, optional
Reproduce fortran difference between universal time and terrestrial time (dT). Can be True (for latest fortran reproduction) or False (international definition). The default is False.
- lonfloat, optional
Longitude, defined positive eastward. The default is 5.3876 (Amersfoort).
- latfloat, optional
Latitude, defined positive northward, cannot exceed 59 (too close to poles). The default is 52.1562 (Amersfoort).
Raises#
- Exception
Checks input times tFirst and tLast.
Returns#
- astrog_dfpandas DataFrame
datetime: time of rise or set in UTC (datetime); type: type (1=moonrise, 2=moonset)
- hatyan.astrog.astrog_phases(tFirst, tLast, dT_fortran=False)[source]#
Makes use of the definitions dT, astrab and astrac. Calculates lunar phases. The lunar phases are independent of coordinates.
Parameters#
- tFirstdatetime.datetime or string (“yyyymmdd”)
Start of timeframe for output.
- tLastdatetime.datetime or string (“yyyymmdd”)
End of timeframe for output.
- dT_fortranboolean, optional
Reproduce fortran difference between universal time and terrestrial time (dT). Can be True (for latest fortran reproduction) or False (international definition). The default is False.
Raises#
- Exception
Checks input times tFirst and tLast.
Returns#
- astrog_dfpandas DataFrame
datetime: lunar phase in UTC (datetime); type: type of phase (1=FQ, 2=FM, 3=LQ, 4=NM)
- hatyan.astrog.astrog_seasons(tFirst, tLast, dT_fortran=False)[source]#
Makes use of the definitions dT, astrab and astrac. Calculates astronomical seasons. The seasons are independent of coordinates.
Parameters#
- tFirstdatetime.datetime or string (“yyyymmdd”)
Start of timeframe for output.
- tLastdatetime.datetime or string (“yyyymmdd”)
End of timeframe for output.
- dT_fortranboolean, optional
Reproduce fortran difference between universal time and terrestrial time (dT). Can be True (for latest fortran reproduction) or False (international definition). The default is False.
Raises#
- Exception
Checks input times tFirst and tLast.
Returns#
- astrog_dfpandas DataFrame
datetime: start of astronomical season in UTC (datetime); type: type of astronomical season (1=spring, 2=summer, 3=autumn, 4=winter)
- hatyan.astrog.astrog_sunriseset(tFirst, tLast, dT_fortran=False, lon=5.3876, lat=52.1562)[source]#
Makes use of the definitions dT, astrab and astrac. Calculates sunrise and -set at requested location.
Parameters#
- tFirstdatetime.datetime or string (“yyyymmdd”)
Start of timeframe for output.
- tLastdatetime.datetime or string (“yyyymmdd”)
End of timeframe for output.
- dT_fortranboolean, optional
Reproduce fortran difference between universal time and terrestrial time (dT). Can be True (for latest fortran reproduction) or False (international definition). The default is False.
- lonfloat, optional
Longitude, defined positive eastward. The default is 5.3876 (Amersfoort).
- latfloat, optional
Latitude, defined positive northward, cannot exceed 59 (too close to poles). The default is 52.1562 (Amersfoort).
Raises#
- Exception
Checks input times tFirst and tLast.
Returns#
- astrog_dfpandas DataFrame
datetime: time of rise or set in UTC (datetime); type: type (1=sunrise, 2=sunset)
- hatyan.astrog.convert2perday(dataframeIn, timeformat='%H:%M %Z')[source]#
Converts a normal astrog pd.DataFrame to one with the same information restructured per day.
Parameters#
- dataframeInpd.DataFrame
with columns ‘datetime’ and ‘type_str’.
- timeformatstr, optional
format of the timestrings in dataframeOut. The default is ‘%H:%M %Z’.
Returns#
- dataframeOutpd.DataFrame
The ‘datetime’ column contains dates, with columns containing all unique ‘type_str’ values.
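The per-day restructuring can be sketched with the standard library: events given as (datetime, type_str) pairs are grouped by date, with one entry per event type holding the formatted time. The function `to_per_day` is a hypothetical stand-in; it uses '%H:%M' as time format because the example datetimes are timezone-naive, and the example times are made up.

```python
from collections import defaultdict
from datetime import date, datetime


def to_per_day(events, timeformat="%H:%M"):
    """Group (datetime, type_str) events per date, one entry per type_str.

    Hypothetical sketch of the per-day restructuring, not the hatyan code.
    """
    per_day = defaultdict(dict)
    for t, type_str in events:
        per_day[t.date()][type_str] = t.strftime(timeformat)
    return dict(per_day)


# Illustrative (made-up) event times.
events = [
    (datetime(2020, 1, 1, 8, 48), "sunrise"),
    (datetime(2020, 1, 1, 16, 37), "sunset"),
    (datetime(2020, 1, 2, 8, 48), "sunrise"),
]
per_day = to_per_day(events)
```

Each date then acts as one row, and each unique type_str as one column, which is the layout described for dataframeOut.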
- hatyan.astrog.plot_astrog_diff(pd_python, pd_fortran, typeCol='type', typeUnit='-', typeLab=None, typeBand=None, timeBand=None)[source]#
Plots results of the FORTRAN and python versions of astrog for visual inspection. Top plot shows values or type, middle plot shows time difference, bottom plot shows value/type difference.
Parameters#
- pd_pythonpandas DataFrame
DataFrame from astrog (python) with times (UTC).
- pd_fortranpandas DataFrame
DataFrame from astrog (FORTRAN) with times (UTC).
- typeUnitstring, optional
Unit of provided values/types. The default is ‘-‘.
- typeLabTYPE, optional
Labels of provided types. The default is [‘rise’,’set’].
- typeBandlist of floats, optional
Expected bandwidth of accuracy of values/types. The default is [-.5,.5].
- timeBandlist of floats, optional
Expected bandwidth of accuracy of times (seconds). The default is [0,60].
- timeLimlist of floats, optional
Time limits of x-axis. The default is None (takes limits from pd_python).
Returns#
- figfigure handle
Output figure.
- axsaxis handles
Axes in figure.