Operations on distribution functions#

In this example, we demonstrate various operations on distribution functions using the probabilistic library.

Define a stochastic variable#

First, we import the necessary classes:

[1]:

from probabilistic_library import DistributionType, Stochast, StandardNormal

Next, we create a random variable using the Stochast class:

[2]:

stochast = Stochast()

Let’s consider a random variable that is uniformly distributed over the interval \([-1, 1]\). This is defined as follows:

[3]:

stochast.distribution = DistributionType.uniform
stochast.minimum = -1
stochast.maximum = 1

stochast.plot()

../_images/_examples_operations_on_distribution_functions_5_0.png

The properties of this distribution are the minimum and maximum values, as well as derived properties such as the mean, the standard deviation (deviation) and the coefficient of variation (variation):

[4]:

stochast.print()

Variable:
  distribution = uniform
Definition:
  minimum = -1.0
  maximum = 1.0
Derived values:
  mean = 0.0
  deviation = 0.5773502691896257
  variation = 0.0
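The derived values above agree with the standard closed-form moments of a uniform distribution. The following sketch recomputes them in plain Python, independently of the library. The helper name uniform_moments is hypothetical, and returning a variation of 0.0 when the mean is zero is an assumption chosen to mirror the printed output.

```python
import math


def uniform_moments(minimum, maximum):
    """Mean, standard deviation and coefficient of variation of U(minimum, maximum)."""
    mean = 0.5 * (minimum + maximum)
    deviation = (maximum - minimum) / math.sqrt(12.0)
    # Assumption: report 0.0 when the mean is zero, as in the printed output.
    variation = deviation / abs(mean) if mean != 0.0 else 0.0
    return mean, deviation, variation


mean, deviation, variation = uniform_moments(-1.0, 1.0)
print(f"mean = {mean}, deviation = {deviation}, variation = {variation}")
```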

We can also set the derived properties directly, which updates the other parameters:

[5]:

stochast.mean = 2.0
stochast.deviation = 1.0

stochast.print()

Variable:
  distribution = uniform
Definition:
  minimum = 0.2679491924311228
  maximum = 3.732050807568877
Derived values:
  mean = 2.0
  deviation = 0.9999999999999999
  variation = 0.49999999999999994

[6]:

stochast.mean = 2.0
stochast.variation = 1.0

stochast.print()

Variable:
  distribution = uniform
Definition:
  minimum = -1.4641016151377544
  maximum = 5.464101615137754
Derived values:
  mean = 2.0
  deviation = 1.9999999999999998
  variation = 0.9999999999999999
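For a uniform distribution, the bounds can be recovered from the mean and deviation in closed form: the half-width of the interval is \(\sqrt{3}\) times the deviation. The sketch below reproduces the minimum and maximum printed in the two cells above. The helper uniform_bounds is hypothetical, and it is an assumption that the library applies the same relations internally.

```python
import math


def uniform_bounds(mean, deviation):
    """Bounds of the uniform distribution with the given mean and standard deviation."""
    half_width = math.sqrt(3.0) * deviation
    return mean - half_width, mean + half_width


# Case 1: mean and deviation are given directly.
min_a, max_a = uniform_bounds(2.0, 1.0)

# Case 2: mean and variation are given; deviation = variation * |mean|.
mean, variation = 2.0, 1.0
min_b, max_b = uniform_bounds(mean, variation * abs(mean))

print(min_a, max_a)
print(min_b, max_b)
```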

CDF and PDF#

The CDF and PDF values of a distribution function can be obtained with stochast.get_cdf() and stochast.get_pdf(), respectively:

[7]:

print(f"CDF: {stochast.get_cdf(2.0)}")
print(f"PDF: {stochast.get_pdf(2.0)}")

CDF: 0.5
PDF: 0.14433756729740646
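As a cross-check, the same values can be computed from the textbook formulas for a uniform distribution. This is a minimal sketch in plain Python; the function names uniform_cdf and uniform_pdf are hypothetical and not part of the library.

```python
def uniform_pdf(x, minimum, maximum):
    """Density of U(minimum, maximum): constant inside the interval, zero outside."""
    if minimum <= x <= maximum:
        return 1.0 / (maximum - minimum)
    return 0.0


def uniform_cdf(x, minimum, maximum):
    """Non-exceedance probability of U(minimum, maximum)."""
    if x <= minimum:
        return 0.0
    if x >= maximum:
        return 1.0
    return (x - minimum) / (maximum - minimum)


# Bounds taken from the printed definition of the variable.
minimum, maximum = -1.4641016151377544, 5.464101615137754

cdf_value = uniform_cdf(2.0, minimum, maximum)
pdf_value = uniform_pdf(2.0, minimum, maximum)
print(f"CDF: {cdf_value}")
print(f"PDF: {pdf_value}")
```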

Quantiles#

A quantile can be calculated with stochast.get_quantile(), for example:

[8]:

p = 0.75
print(f"x({p}) {stochast.get_quantile(p)}")

x(0.75) 3.7320508075688776

Another option is to use StandardNormal.get_u_from_p(), which converts the non-exceedance probability of 0.75 into the corresponding value in the standard normal space (\(u\)-space). Subsequently, stochast.get_x_from_u() translates this value back to the original space (\(x\)-space):

[9]:

u = StandardNormal.get_u_from_p(p)
print(f"x({p}) = {stochast.get_x_from_u(u)}")

x(0.75) = 3.7320508075688776
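The round trip through \(u\)-space can be illustrated with the standard library alone. In this sketch, statistics.NormalDist stands in for StandardNormal, and the mapping \(x = F^{-1}(\Phi(u))\) is written out for the uniform case. That the library uses exactly this transformation is an assumption, although it is consistent with the identical results of the two cells above.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Bounds taken from the printed definition of the variable.
minimum, maximum = -1.4641016151377544, 5.464101615137754
p = 0.75

# Direct quantile of the uniform distribution.
x_direct = minimum + p * (maximum - minimum)

# Via u-space: p -> u = Phi^{-1}(p), then u -> x = F^{-1}(Phi(u)).
u = std_normal.inv_cdf(p)
x_via_u = minimum + std_normal.cdf(u) * (maximum - minimum)

print(f"u({p}) = {u}")
print(f"x({p}) = {x_direct} (direct), {x_via_u} (via u-space)")
```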

Design value#

The design_value of a variable is the quantile at a given design_quantile divided by the design_factor. For example:

[10]:

stochast.design_quantile = 0.75
stochast.design_factor = 0.99

stochast.print()

Variable:
  distribution = uniform
Definition:
  minimum = -1.4641016151377544
  maximum = 5.464101615137754
  design_quantile = 0.75
  design_factor = 0.99
Derived values:
  mean = 2.0
  deviation = 1.9999999999999998
  variation = 0.9999999999999999
  design_value = 3.769748290473614
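The printed design_value can be verified by hand: take the 0.75 quantile of the uniform distribution and divide it by the design factor. The following is a minimal sketch in plain Python that does not call the library.

```python
# Bounds taken from the printed definition of the variable.
minimum, maximum = -1.4641016151377544, 5.464101615137754
design_quantile = 0.75
design_factor = 0.99

# Quantile of the uniform distribution at the design quantile.
x_quantile = minimum + design_quantile * (maximum - minimum)

# Design value = quantile divided by the design factor.
design_value = x_quantile / design_factor
print(f"design_value = {design_value}")
```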

The design_value can also be set explicitly, which updates the properties of the random variable (while design_quantile, design_factor and variation remain unchanged):

[11]:

stochast.design_value = 3.5

stochast.print()

Variable:
  distribution = uniform
Definition:
  minimum = -1.3593370081945169
  maximum = 5.07311477919061
  design_quantile = 0.75
  design_factor = 0.99
Derived values:
  mean = 1.8568888854980465
  deviation = 1.856888885498046
  variation = 0.9999999999999998
  design_value = 3.500001850852857
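For a uniform variable with a positive mean, this update can be worked out in closed form. The quantile equals \(\mu + (2p - 1)\sqrt{3}\,\sigma\), and with a fixed coefficient of variation \(V\) we have \(\sigma = V\mu\), so the mean follows directly from the design value. The sketch below is an illustration under these assumptions, not the library's implementation. The printed values differ from it in roughly the sixth decimal, which suggests that the library solves the problem numerically.

```python
import math

design_value = 3.5
design_quantile = 0.75
design_factor = 0.99
variation = 1.0  # kept fixed during the update

# The quantile that corresponds to the requested design value.
x_quantile = design_value * design_factor

# Uniform distribution, positive mean:
#   x_quantile = mean * (1 + (2p - 1) * sqrt(3) * variation)
mean = x_quantile / (1.0 + (2.0 * design_quantile - 1.0) * math.sqrt(3.0) * variation)
deviation = variation * mean
minimum = mean - math.sqrt(3.0) * deviation
maximum = mean + math.sqrt(3.0) * deviation

print(f"minimum = {minimum}, maximum = {maximum}")
print(f"mean = {mean}, deviation = {deviation}")
```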

Truncated distribution function#

Let’s consider a normal distribution function with a location of 0.0 and a scale of 1.0:

[12]:

stochast = Stochast()
stochast.distribution = DistributionType.normal
stochast.location = 0.0
stochast.scale = 1.0
stochast.plot()

../_images/_examples_operations_on_distribution_functions_23_0.png

To truncate a distribution, we use the truncated attribute.

[13]:

stochast.truncated = True

The truncation interval is specified using the minimum and maximum properties. If these are not defined, the original domain of the distribution is used (i.e., no truncation is applied).

Suppose we want to truncate this distribution to the interval \([-0.5, \infty)\). If this is the first time truncation is applied in the project, it is sufficient to specify only the minimum value:

[14]:

stochast.minimum = -0.5

stochast.print()
stochast.plot()

Variable:
  distribution = normal (truncated)
Definition:
  location = 0.0
  scale = 1.0
  minimum = -0.5
  maximum = inf
Derived values:
  mean = 0.0
  deviation = 1.0
  variation = 0.0

../_images/_examples_operations_on_distribution_functions_27_1.png

If we want to truncate the same distribution to the interval \((-\infty, 0.5]\), we need to specify both the minimum and maximum properties. Otherwise, the minimum would remain set to -0.5.

[15]:

import numpy as np
stochast.truncated = True
stochast.minimum = -np.inf
stochast.maximum = 0.5

stochast.print()
stochast.plot()

Variable:
  distribution = normal (truncated)
Definition:
  location = 0.0
  scale = 1.0
  minimum = -inf
  maximum = 0.5
Derived values:
  mean = 0.0
  deviation = 1.0
  variation = 0.0

../_images/_examples_operations_on_distribution_functions_29_1.png

Below, we truncate the distribution to the interval \([-0.5, 0.5]\):

[16]:

stochast.truncated = True
stochast.minimum = -0.5
stochast.maximum = 0.5

stochast.print()
stochast.plot()

Variable:
  distribution = normal (truncated)
Definition:
  location = 0.0
  scale = 1.0
  minimum = -0.5
  maximum = 0.5
Derived values:
  mean = 0.0
  deviation = 1.0
  variation = 0.0

../_images/_examples_operations_on_distribution_functions_31_1.png
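To make the effect of truncation concrete, the sketch below evaluates the textbook truncated CDF, \(F_T(x) = (F(x) - F(a)) / (F(b) - F(a))\) for \(a \le x \le b\), for the three intervals used above. It relies only on statistics.NormalDist from the standard library. The helper names are hypothetical, and it is an assumption that the library uses this same definition.

```python
from statistics import NormalDist


def truncated_cdf(x, dist, minimum, maximum):
    """CDF of `dist` conditioned on minimum <= X <= maximum."""
    if x <= minimum:
        return 0.0
    if x >= maximum:
        return 1.0
    lower = dist.cdf(minimum)
    upper = dist.cdf(maximum)
    return (dist.cdf(x) - lower) / (upper - lower)


def truncated_pdf(x, dist, minimum, maximum):
    """Density of `dist` rescaled to integrate to one over [minimum, maximum]."""
    if x < minimum or x > maximum:
        return 0.0
    return dist.pdf(x) / (dist.cdf(maximum) - dist.cdf(minimum))


normal = NormalDist(mu=0.0, sigma=1.0)
inf = float("inf")

cdf_left = truncated_cdf(0.0, normal, -0.5, inf)    # interval [-0.5, inf)
cdf_right = truncated_cdf(0.0, normal, -inf, 0.5)   # interval (-inf, 0.5]
cdf_both = truncated_cdf(0.0, normal, -0.5, 0.5)    # interval [-0.5, 0.5]
print(cdf_left, cdf_right, cdf_both)
```

Because the interval \([-0.5, 0.5]\) is symmetric around the location, the truncated CDF at 0.0 stays at one half, whereas the one-sided truncations shift probability mass towards the remaining tail.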

Inverted distribution function#

Let’s consider a log-normal distribution function with a location of 1.0, a scale of 0.5 and a shift of 0.0:

[17]:

stochast = Stochast()
stochast.distribution = DistributionType.log_normal
stochast.location = 1.0
stochast.scale = 0.5
stochast.shift = 0.0

stochast.print()
stochast.plot()

Variable:
  distribution = log_normal
Definition:
  location = 1.0
  scale = 0.5
  shift = 0.0
Derived values:
  mean = 3.080216848918031
  deviation = 1.6415718456238662
  variation = 0.5329403500277882

../_images/_examples_operations_on_distribution_functions_33_1.png

We want to invert this distribution function with respect to the shift value. This is done by setting the inverted attribute:

[18]:

stochast.inverted = True

stochast.print()
stochast.plot()

Variable:
  distribution = log_normal (inverted)
Definition:
  location = 1.0
  scale = 0.5
  shift = 0.0
Derived values:
  mean = -3.080216848918031
  deviation = 1.6415718456238662
  variation = 0.5329403500277882

../_images/_examples_operations_on_distribution_functions_35_1.png
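The printed derived values match the usual log-normal moment formulas when location and scale are read as the mean and standard deviation of the underlying normal distribution. Inversion then behaves like mirroring about the shift, \(x' = 2 \cdot \text{shift} - x\), which flips the mean and leaves the deviation unchanged. The sketch below checks this in plain Python. The mirroring interpretation is an assumption inferred from the output, not a statement about the library's internals.

```python
import math

location, scale, shift = 1.0, 0.5, 0.0

# Moments of a shifted log-normal distribution.
unshifted_mean = math.exp(location + 0.5 * scale**2)
mean = shift + unshifted_mean
deviation = unshifted_mean * math.sqrt(math.exp(scale**2) - 1.0)

# Mirroring about the shift: x' = 2 * shift - x.
inverted_mean = 2.0 * shift - mean
inverted_deviation = deviation  # a reflection does not change the spread

print(f"mean = {mean}, deviation = {deviation}")
print(f"inverted mean = {inverted_mean}, inverted deviation = {inverted_deviation}")
```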

Fit parameters of a distribution function#

It is also possible to estimate parameters of a distribution function from data. In this example, we consider the following dataset:

[19]:

data = [2.3, 0.0, -1.0, 2.6, 2.7, 2.8, 3.3, 3.4, 1.0, 3.0, 0.0, -2.0, -1.0]

Let’s consider a normal distribution. By using stochast.fit(), we obtain the fitted location and scale:

[20]:

stochast = Stochast()
stochast.distribution = DistributionType.normal
stochast.fit(data)

stochast.print()
stochast.plot()

Variable:
  distribution = normal
Definition:
  location = 1.3153846153846152
  scale = 1.8959809043720852
Derived values:
  mean = 1.3153846153846152
  deviation = 1.8959809043720852
  variation = 1.4413889916279012

../_images/_examples_operations_on_distribution_functions_39_1.png
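For the normal distribution, the fitted values coincide with the sample mean and the sample standard deviation computed with the \(n - 1\) denominator. The sketch below reproduces them with the standard library. That stochast.fit() uses exactly this estimator is an assumption based on the matching numbers.

```python
import statistics

data = [2.3, 0.0, -1.0, 2.6, 2.7, 2.8, 3.3, 3.4, 1.0, 3.0, 0.0, -2.0, -1.0]

location = statistics.fmean(data)  # sample mean
scale = statistics.stdev(data)     # sample standard deviation (n - 1 denominator)

print(f"location = {location}")
print(f"scale = {scale}")
```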

The goodness of fit can be assessed using the Kolmogorov-Smirnov test through the get_ks_test() method:

[21]:

def get_ks_test(data):
    # Relies on the module-level `stochast`, so it tests whichever variable was fitted last.
    print(f"kolmogorov smirnov test = {stochast.get_ks_test(data)}")

get_ks_test(data)

kolmogorov smirnov test = 0.23669173779063168
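The reported number agrees with the Kolmogorov-Smirnov statistic \(D\), the largest distance between the empirical CDF of the data and the fitted CDF. The sketch below computes \(D\) for the fitted normal distribution with the standard library only. The helper ks_statistic is hypothetical, and the fitted parameters are recomputed as the sample mean and sample standard deviation, which is an assumption about how the fit was obtained.

```python
import statistics
from statistics import NormalDist

data = [2.3, 0.0, -1.0, 2.6, 2.7, 2.8, 3.3, 3.4, 1.0, 3.0, 0.0, -2.0, -1.0]


def ks_statistic(sample, cdf):
    """Largest distance between the empirical CDF of `sample` and `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # Compare against the empirical CDF just after and just before x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


fitted = NormalDist(statistics.fmean(data), statistics.stdev(data))
d = ks_statistic(data, fitted.cdf)
print(f"D = {d}")
```

A smaller \(D\) indicates a closer match between the data and the fitted distribution.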

When we consider a log-normal distribution, the fitted parameters are as follows:

[22]:

stochast = Stochast()
stochast.distribution = DistributionType.log_normal
stochast.fit(data)

stochast.print()
stochast.plot()

Variable:
  distribution = log_normal
Definition:
  location = 1.5964855623845133
  scale = 0.4261942592095209
  shift = -4.0
Derived values:
  mean = 1.4049020870683506
  deviation = 2.412211284018044
  variation = 1.7169960143284249

../_images/_examples_operations_on_distribution_functions_43_1.png

The result of the goodness-of-fit test is:

[24]:

get_ks_test(data)

kolmogorov smirnov test = 0.2550238105742991