Fitting the GEV
InĀ [1]:
Copied!
import lmo
import numpy as np
import matplotlib.pyplot as plt
import lmo import numpy as np import matplotlib.pyplot as plt
Peak streamflow values in $m^3 s^{-1}$
InĀ [3]:
Copied!
data = [
123, 2250, 543, 178, 67, 5100, 248, 1500, 342, 329,
543, 980, 1020, 4502, 3406, 856, 297,
] # fmt: skip
data = [ 123, 2250, 543, 178, 67, 5100, 248, 1500, 342, 329, 543, 980, 1020, 4502, 3406, 856, 297, ] # fmt: skip
Summarize the probability distribution of the data using L-moments:
- L-location $l_1$: the 1st L-moment, equivalent to the mean.
- L-scale $l_2$: the 2nd L-moment, equivalent to half the gini mean difference.
- L-skewness coefficient $t_3$: the 3rd L-moment ratio $l_3 / l_2$
- L-kurtosis coefficient $t_4$: the 4th L-moment ratio $l_4 / l_2$
InĀ [4]:
Copied!
l_stats_data = lmo.l_stats(data)
print("L-moment statistics [L-loc, L-scale, L-skew, L-kurt]:\n")
print("Data:", l_stats_data)
l_stats_data = lmo.l_stats(data) print("L-moment statistics [L-loc, L-scale, L-skew, L-kurt]:\n") print("Data:", l_stats_data)
L-moment statistics [L-loc, L-scale, L-skew, L-kurt]: Data: [1310.82353 799.77206 0.48093 0.19718]
For reference, the normal distribution has L-skewness $\tau_3 = 0$ (unskewed) and L-kurtosis $\tau_4 \approx 0.1226$. So this distribution is not normally distributed.
This is confirmed by the L-moment-based hypothesis test for non-normality.
InĀ [5]:
Copied!
lmo.diagnostic.normaltest(data).is_significant(0.01)
lmo.diagnostic.normaltest(data).is_significant(0.01)
Out[5]:
np.True_
Now fit the Generalized Extreme Value (GEV) distribution to the data, using scipy'y default MLE method, and the Method of L-moments (L-MM)
InĀ [6]:
Copied!
from scipy.stats.distributions import genextreme as GEV
# MM (method of moments) fails, as the variance is non-finite
# params_mm = GEV.fit(data, method='MM')
params_mle = GEV.fit(data, method="MLE")
params_lmm = GEV.l_fit(data)
print("Fitted GEV parameters [shape (c), loc, scale]:\n")
print("MLE:", np.round(params_mle, 5))
print("L-MM:", np.round(params_lmm, 5))
from scipy.stats.distributions import genextreme as GEV # MM (method of moments) fails, as the variance is non-finite # params_mm = GEV.fit(data, method='MM') params_mle = GEV.fit(data, method="MLE") params_lmm = GEV.l_fit(data) print("Fitted GEV parameters [shape (c), loc, scale]:\n") print("MLE:", np.round(params_mle, 5)) print("L-MM:", np.round(params_lmm, 5))
Fitted GEV parameters [shape (c), loc, scale]: MLE: [ -0.93795 391.81895 424.55898] L-MM: [ -0.43365 481.11193 629.29211]
InĀ [7]:
Copied!
X_mle = GEV(*params_mle)
X_lmm = GEV(*params_lmm)
print("Fitted GEV L-moment statistics (loc, scale, skew, kurtosis):\n")
print("Data:", l_stats_data)
print("MLE: ", X_mle.l_stats())
print("L-MM:", X_lmm.l_stats())
X_mle = GEV(*params_mle) X_lmm = GEV(*params_lmm) print("Fitted GEV L-moment statistics (loc, scale, skew, kurtosis):\n") print("Data:", l_stats_data) print("MLE: ", X_mle.l_stats()) print("L-MM:", X_lmm.l_stats())
Fitted GEV L-moment statistics (loc, scale, skew, kurtosis): Data: [1310.82353 799.77206 0.48093 0.19718] MLE: [6999.41443 6465.82686 0.93602 0.89896] L-MM: [1310.82353 799.77207 0.48093 0.34856]
Clearly, the MLE is completely wrong, as the mean is almost 5x larger than the sample mean.
Let's confirm this visually:
InĀ [8]:
Copied!
from scipy.stats import probplot
fig, axs = plt.subplots(1, 2, sharey=True, figsize=(10, 4.5))
probplot(data, dist=X_mle, plot=axs[0], rvalue=True)
probplot(data, dist=X_lmm, plot=axs[1], rvalue=True)
axs[0].set_title("MLE")
axs[1].set_title("L-MM")
plt.tight_layout()
from scipy.stats import probplot fig, axs = plt.subplots(1, 2, sharey=True, figsize=(10, 4.5)) probplot(data, dist=X_mle, plot=axs[0], rvalue=True) probplot(data, dist=X_lmm, plot=axs[1], rvalue=True) axs[0].set_title("MLE") axs[1].set_title("L-MM") plt.tight_layout()