Lmo reference¶
High-level API¶
Sample L-moments¶
Lmo: Robust statistics with trimmed L-moments and L-comoments.
lmo.l_kurtosis(a, /, trim=(0, 0), *, axis=None, dtype=np.float64, **kwargs)
¶
L-kurtosis coefficient; the 4th sample L-moment ratio.
Alias for lmo.l_ratio(a, 4, 2, *, **)
.
Examples:
>>> import lmo, numpy as np
>>> x = np.random.default_rng(12345).standard_t(2, 99)
>>> lmo.l_kurtosis(x)
0.28912787
>>> lmo.l_kurtosis(x, trim=(1, 1))
0.19928182
Notes
The L-kurtosis \(\tau_4\) lies within the interval \([-\frac{1}{4}, 1)\), and by the L-skewness \(\\tau_3\) as \(5 \tau_3^2 - 1 \le 4 \tau_4\).
See Also
lmo.l_loc(a, /, trim=(0, 0), *, axis=None, dtype=np.float64, **kwargs)
¶
L-location (or L-loc): unbiased estimator of the first L-moment, \(\lambda^{(s, t)}_1\).
Alias for lmo.l_moment(a, 1, *, **)
.
Examples:
The first moment (i.e. the mean) of the Cauchy distribution does not exist. This means that estimating the location of a Cauchy distribution from its samples, cannot be done using the traditional average (i.e. the arithmetic mean). Instead, a robust location measure should be used, e.g. the median, or the TL-location.
To illustrate, let’s start by drawing some samples from the standard Cauchy distribution, which is centered around the origin.
>>> import lmo
>>> import numpy as np
>>> rng = np.random.default_rng(12345)
>>> x = rng.standard_cauchy(200)
The mean and the untrimmed L-location (which are equivalent) give wrong results, so don’t do this:
>>> np.mean(x)
-3.6805
>>> lmo.l_loc(x)
-3.6805
Usually, the answer to this problem is to use the median. However, the median only considers one or two samples (depending on whether the amount of samples is odd or even, respectively). So the median ignores most of the available information.
>>> np.median(x)
0.096825
>>> lmo.l_loc(x, trim=(len(x) - 1) // 2)
0.096825
Luckily for us, Lmo knows how to deal with longs tails, as well – trimming them (specifically, by skipping the first \(s\) and last \(t\) expected order statistics).
Let’s try the TL-location (which is equivalent to the median)
>>> lmo.l_loc(x, trim=1) # equivalent to `trim=(1, 1)`
0.06522
Notes
The trimmed L-location naturally unifies the arithmetic mean, the median, the minimum and the maximum. In particular, the following are equivalent, given n = len(x)
:
l_loc(x, trim=0)
/statistics.mean(x)
/np.mean(x)
l_loc(x, trim=(n-1) // 2)
/statistics.median(x)
/np.median(x)
l_loc(x, trim=(0, n-1))
/min(x)
/np.min(x)
l_loc(x, trim=(n-1, 0))
/max(x)
/np.max(x)
Note that numerical noise might cause slight differences between their results.
Even though lmo
is built with performance in mind, the equivalent numpy
functions are always faster, as they don’t need to sort all samples. Specifically, the time complexity of lmo.l_loc
(and l_moment
in general) is \(O(n \log n)\), whereas that of numpy.{mean,median,min,max}
is O(n)
See Also
lmo.l_moment(a, r, /, trim=(0, 0), *, axis=None, dtype=np.float64, fweights=None, aweights=None, sort=None, cache=False)
¶
Estimates the generalized trimmed L-moment \(\lambda^{(s, t)}_r\) from the samples along the specified axis. By default, this will be the regular L-moment, \(\lambda_r = \lambda^{(0, 0)}_r\).
PARAMETER | DESCRIPTION |
---|---|
a | Array containing numbers whose L-moments is desired. If |
r | The L-moment order(s), non-negative integer or array. TYPE: |
trim | Left- and right-trim orders \((s, t)\), non-negative ints or floats that are bound by \(s + t < n - r\). A single scalar \(t\) can be proivided as well, as alias for \((t, t)\). Some special cases include:
TYPE: |
axis | Axis along which to calculate the moments. If TYPE: |
dtype | Floating type to use in computing the L-moments. Default is |
fweights | 1-D array of integer frequency weights; the number of times each observation vector should be repeated. TYPE: |
aweights | An array of weights associated with the values in All The algorithm is similar to that for weighted quantiles. |
sort | Sorting algorithm, see TYPE: |
cache | Set to TYPE: |
RETURNS | DESCRIPTION |
---|---|
l | The L-moment(s) of the input This is a scalar iff a is 1-d and r is a scalar. Otherwise, this is an array with |
Examples:
Calculate the L-location and L-scale from student-T(2) samples, for different (symmetric) trim-lengths.
>>> import lmo, numpy as np
>>> x = np.random.default_rng(12345).standard_t(2, 99)
>>> lmo.l_moment(x, [1, 2], trim=(0, 0))
array([-0.01412282, 0.94063132])
>>> lmo.l_moment(x, [1, 2], trim=(1/2, 1/2))
array([-0.02158858, 0.5796519 ])
>>> lmo.l_moment(x, [1, 2], trim=(1, 1))
array([-0.0124483 , 0.40120115])
The theoretical L-locations are all 0, and the the L-scale are 1.1107
, 0.6002
and 0.4165
, respectively.
See Also
lmo.l_moment_cov(a, r_max, /, trim=(0, 0), *, axis=None, dtype=np.float64, **kwargs)
¶
Non-parmateric auto-covariance matrix of the generalized trimmed L-moment point estimates with orders r = 1, ..., r_max
.
RETURNS | DESCRIPTION |
---|---|
S_l | Variance-covariance matrix/tensor of shape |
Examples:
Fitting of the cauchy distribution with TL-moments. The location is equal to the TL-location, and scale should be \(0.698\) times the TL(1)-scale, see Elamir & Seheult (2003).
>>> import lmo, numpy as np
>>> rng = np.random.default_rng(12345)
>>> x = rng.standard_cauchy(1337)
>>> lmo.l_moment(x, [1, 2], trim=(1, 1))
array([0.08142405, 0.68884917])
The L-moment estimates seem to make sense. Let’s check their standard errors, by taking the square root of the variances (the diagonal of the covariance matrix):
>>> lmo.l_moment_cov(x, 2, trim=(1, 1))
array([[ 4.89407076e-03, -4.26419310e-05],
[-4.26419310e-05, 1.30898414e-03]])
>>> np.sqrt(_.diagonal())
array([0.06995764, 0.03617989])
References
Todo
- Use the direct (Jacobi) method from Hosking (2015).
lmo.l_moment_influence(a, r, /, trim=(0, 0), *, sort=None, tol=1e-08)
¶
Empirical Influence Function (EIF) of a sample L-moment.
Notes
This function is not vectorized.
PARAMETER | DESCRIPTION |
---|---|
a | 1-D array-like containing observed samples. |
r | L-moment order. Must be a non-negative integer. TYPE: |
trim | Left- and right- trim. Can be scalar or 2-tuple of non-negative int or float. TYPE: |
PARAMETER | DESCRIPTION |
---|---|
sort | Sorting algorithm, see TYPE: |
tol | Zero-roundoff absolute threshold. TYPE: |
RETURNS | DESCRIPTION |
---|---|
influence_function | The (vectorized) empirical influence function. TYPE: |
lmo.l_ratio(a, r, s, /, trim=(0, 0), *, axis=None, dtype=np.float64, **kwargs)
¶
Estimates the generalized L-moment ratio.
Equivalent to lmo.l_moment(a, r, *, **) / lmo.l_moment(a, s, *, **)
.
The L-moment with r=0
is 1
, so the l_ratio(a, r, 0, *, **)
is equivalent to l_moment(a, r, *, **)
.
Notes
Often, when referring to the \(r\)th L-ratio, the L-moment ratio with \(k=2\) is implied, i.e. \(\tau^{(s, t)}_r\) is short-hand notation for \(\tau^{(s, t)}_{r,2}\).
The L-variation (L-moment Coefficient of Variation, or L-CB) is another special case of the L-moment ratio, \(\tau^{(s, t)}_{2,1}\). It is sometimes denoted in the literature by dropping the subscript indices: \(\tau^{(s, t)}\). Note that this should only be used with strictly positive distributions.
Examples:
Estimate the L-location, L-scale, L-skewness and L-kurtosis simultaneously:
>>> import lmo
>>> import numpy as np
>>> rng = np.random.default_rng(12345)
>>> x = rng.lognormal(size=99)
>>> lmo.l_ratio(x, [1, 2, 3, 4], [0, 0, 2, 2])
array([1.53196368, 0.77549561, 0.4463163 , 0.29752178])
>>> lmo.l_ratio(x, [1, 2, 3, 4], [0, 0, 2, 2], trim=(0, 1))
array([0.75646807, 0.32203446, 0.23887609, 0.07917904])
See Also
lmo.l_ratio_influence(a, r, k=2, /, trim=(0, 0), *, sort=None, tol=1e-08)
¶
Empirical Influence Function (EIF) of a sample L-moment ratio.
Notes
This function is not vectorized.
PARAMETER | DESCRIPTION |
---|---|
a | 1-D array-like containing observed samples. |
r | L-moment ratio order. Must be a non-negative integer. TYPE: |
k | Denominator L-moment order, defaults to 2. TYPE: |
trim | Left- and right- trim. Can be scalar or 2-tuple of non-negative int or float. TYPE: |
PARAMETER | DESCRIPTION |
---|---|
sort | Sorting algorithm, see TYPE: |
tol | Zero-roundoff absolute threshold. TYPE: |
RETURNS | DESCRIPTION |
---|---|
influence_function | The (vectorized) empirical influence function. TYPE: |
lmo.l_ratio_se(a, r, s, /, trim=(0, 0), *, axis=None, dtype=np.float64, **kwargs)
¶
Non-parametric estimates of the Standard Error (SE) in the L-ratio estimates from lmo.l_ratio
.
Examples:
Estimate the values and errors of the TL-loc, scale, skew and kurtosis for Cauchy-distributed samples. The theoretical values are [0.0, 0.698, 0.0, 0.343]
(Elamir & Seheult, 2003), respectively.
>>> import lmo, numpy as np
>>> rng = np.random.default_rng(12345)
>>> x = rng.standard_cauchy(42)
>>> lmo.l_ratio(x, [1, 2, 3, 4], [0, 0, 2, 2], trim=(1, 1))
array([-0.25830513, 0.61738638, -0.03069701, 0.25550176])
>>> lmo.l_ratio_se(x, [1, 2, 3, 4], [0, 0, 2, 2], trim=(1, 1))
array([0.32857302, 0.12896501, 0.13835403, 0.07188138])
lmo.l_scale(a, /, trim=(0, 0), *, axis=None, dtype=np.float64, **kwargs)
¶
L-scale unbiased estimator for the second L-moment, \(\lambda^{(s, t)}_2\).
Alias for lmo.l_moment(a, 2, *, **)
.
Examples:
>>> import lmo, numpy as np
>>> x = np.random.default_rng(12345).standard_cauchy(99)
>>> x.std()
72.87715244
>>> lmo.l_scale(x)
9.501123995
>>> lmo.l_scale(x, trim=(1, 1))
0.658993279
Notes
If trim = (0, 0)
(default), the L-scale is equivalent to half the Gini mean difference (GMD).
See Also
lmo.l_skew(a, /, trim=(0, 0), *, axis=None, dtype=np.float64, **kwargs)
¶
Unbiased sample estimator for the L-skewness coefficient.
Alias for lmo.l_ratio(a, 3, 2, *, **)
.
Examples:
>>> import lmo, numpy as np
>>> x = np.random.default_rng(12345).standard_exponential(99)
>>> lmo.l_skew(x)
0.38524343
>>> lmo.l_skew(x, trim=(0, 1))
0.27116139
See Also
lmo.l_stats(a, /, trim=(0, 0), num=4, *, axis=None, dtype=np.float64, **kwargs)
¶
Calculates the L-loc(ation), L-scale, L-skew(ness) and L-kurtosis.
Equivalent to lmo.l_ratio(a, [1, 2, 3, 4], [0, 0, 2, 2], *, **)
by default.
Examples:
>>> import lmo, scipy.stats
>>> x = scipy.stats.gumbel_r.rvs(size=99, random_state=12345)
>>> lmo.l_stats(x)
array([0.79014773, 0.68346357, 0.12207413, 0.12829047])
The theoretical L-stats of the standard Gumbel distribution are [0.577, 0.693, 0.170, 0.150]
.
See Also
lmo.l_stats_se(a, /, num=4, trim=(0, 0), *, axis=None, dtype=np.float64, **kwargs)
¶
Calculates the standard errors (SE’s) of the L-stats
.
Equivalent to lmo.l_ratio_se(a, [1, 2, 3, 4], [0, 0, 2, 2], *, **)
by default.
Examples:
>>> import lmo, scipy.stats
>>> x = scipy.stats.gumbel_r.rvs(size=99, random_state=12345)
>>> lmo.l_stats(x)
array([0.79014773, 0.68346357, 0.12207413, 0.12829047])
>>> lmo.l_stats_se(x)
array([0.12305147, 0.05348839, 0.04472984, 0.03408495])
The theoretical L-stats of the standard Gumbel distribution are [0.577, 0.693, 0.170, 0.150]
. The corresponding relative z-scores are [-1.730, 0.181, 1.070, 0.648]
.
See Also
lmo.l_variation(a, /, trim=(0, 0), *, axis=None, dtype=np.float64, **kwargs)
¶
The coefficient of L-variation (or L-CV) unbiased sample estimator:
Alias for lmo.l_ratio(a, 2, 1, *, **)
.
Examples:
>>> import lmo, numpy as np
>>> x = np.random.default_rng(12345).pareto(4.2, 99)
>>> x.std() / x.mean()
1.32161112
>>> lmo.l_variation(x)
0.59073639
>>> lmo.l_variation(x, trim=(0, 1))
0.55395044
Notes
If trim = (0, 0)
(default), this is equivalent to the Gini coefficient, and lies within the interval \((0, 1)\).
Sample L-comoments¶
Lmo: Robust statistics with trimmed L-moments and L-comoments.
lmo.l_cokurtosis(a, /, trim=(0, 0), *, dtype=np.float64, **kwargs)
¶
Sample L-cokurtosis coefficient matrix \(\tilde\Lambda^{(t_1, t_2)}_4\).
Alias for lmo.l_coratio(a, 4, 2, *, **)
.
See Also
lmo.l_coloc(a, /, trim=(0, 0), *, dtype=np.float64, **kwargs)
¶
L-colocation matrix of 1st L-comoment estimates, \(\Lambda^{(t_1, t_2)}_1\).
Alias for lmo.l_comoment(a, 1, *, **)
.
Notes
If trim = (0, 0)
(default), the L-colocation for \([ij]\) is the L-location \(\lambda_1\) of \(x_i\), independent of \(x_j\).
Examples:
Without trimming, the L-colocation only provides marginal information:
>>> import lmo, numpy as np
>>> rng = np.random.default_rng(12345)
>>> x = rng.multivariate_normal([0, 0], [[6, -3], [-3, 3.5]], 99).T
>>> lmo.l_loc(x, axis=-1)
array([-0.02678225, 0.03008309])
>>> lmo.l_coloc(x)
array([[-0.02678225, -0.02678225],
[ 0.03008309, 0.03008309]])
But the trimmed L-locations are a different story…
>>> lmo.l_loc(x, trim=(1, 1), axis=-1)
array([-0.10488868, -0.00625729])
>>> lmo.l_coloc(x, trim=(1, 1))
array([[-0.10488868, -0.03797989],
[ 0.03325074, -0.00625729]])
What this tells us, is somewhat of a mystery: trimmed L-comoments have been only been briefly mentioned once or twice in the literature.
See Also
lmo.l_comoment(a, r, /, trim=(0, 0), *, dtype=np.float64, rowvar=True, sort=None, cache=False)
¶
Multivariate extension of lmo.l_moment
.
Estimates the L-comoment matrix:
Whereas the L-moments are calculated using the order statistics of the observations, i.e. by sorting, the L-comoment sorts \(x_i\) using the order of \(x_j\). This means that in general, \(\lambda_{r [ij]}^{(t_1, t_2)} \neq \lambda_{r [ji]}^{(t_1, t_2)}\), i.e. \(\Lambda_{r}^{(t_1, t_2)}\) is not symmetric.
The \(r\)-th L-comoment \(\lambda_{r [ij]}^{(t_1, t_2)}\) reduces to the L-moment if \(i=j\), and can therefore be seen as a generalization of the (univariate) L-moments. Similar to how the diagonal of a covariance matrix contains the variances, the diagonal of the L-comoment matrix contains the L-moments.
Based on the proposed definition by Serfling & Xiao (2007) for L-comoments. Extended to allow for generalized trimming.
PARAMETER | DESCRIPTION |
---|---|
a | 1-D or 2-D array-like containing |
r | The L-moment order(s), non-negative integer or array. TYPE: |
trim | Left- and right-trim orders \((t_1, t_2)\), non-negative ints or floats that are bound by \(t_1 + t_2 < n - r\). Some special cases include:
TYPE: |
rowvar | If TYPE: |
dtype | Floating type to use in computing the L-moments. Default is |
sort | Sorting algorithm, see TYPE: |
cache | Set to TYPE: |
RETURNS | DESCRIPTION |
---|---|
L | Array of shape |
Examples:
Estimation of the second L-comoment (the L-coscale) from biviariate normal samples:
>>> import lmo, numpy as np
>>> rng = np.random.default_rng(12345)
>>> x = rng.multivariate_normal([0, 0], [[6, -3], [-3, 3.5]], 99).T
>>> lmo.l_comoment(x, 2)
array([[ 1.2766793 , -0.83299947],
[-0.71547941, 1.05990727]])
The diagonal contains the univariate L-moments:
>>> lmo.l_moment(x, 2, axis=-1)
array([1.2766793 , 1.05990727])
lmo.l_coratio(a, r, s, /, trim=(0, 0), *, dtype=np.float64, **kwargs)
¶
Estimate the generalized matrix of L-comoment ratio’s.
See Also
lmo.l_corr(a, /, trim=(0, 0), *, dtype=np.float64, **kwargs)
¶
Sample L-correlation coefficient matrix \(\tilde\Lambda^{(t_1, t_2)}_2\); the ratio of the L-coscale matrix over the L-scale column-vectors.
Alias for lmo.l_coratio(a, 2, 2, *, **)
.
The diagonal consists of all 1’s.
Where the pearson correlation coefficient measures linearity, the (T)L-correlation coefficient measures monotonicity.
Examples:
>>> import lmo, numpy as np
>>> rng = np.random.default_rng(12345)
>>> cov = np.array([[6, -3], [-3, 3.5]])
>>> x = rng.multivariate_normal([0, 0], [[6, -3], [-3, 3.5]], 99).T
>>> lmo.l_corr(x)
array([[ 1. , -0.65247355],
[-0.67503962, 1. ]])
Let’s compare this with the theoretical correlation
>>> cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])
-0.6546536707079772
and the (Pearson) correlation coefficient matrix:
>>> np.corrcoef(x)
array([[ 1. , -0.66383285],
[-0.66383285, 1. ]])
See Also
lmo.l_coscale(a, /, trim=(0, 0), *, dtype=np.float64, **kwargs)
¶
L-coscale matrix of 2nd L-comoment estimates, \(\Lambda^{(t_1, t_2)}_2\).
Alias for lmo.l_comoment(a, 2, *, **)
.
Analogous to the (auto-) variance-covariance matrix, the L-coscale matrix is positive semi-definite, and its main diagonal contains the L-scale’s. conversely, the L-coscale matrix is inherently asymmetric, thus yielding more information.
Examples:
>>> import lmo, numpy as np
>>> rng = np.random.default_rng(12345)
>>> x = rng.multivariate_normal([0, 0], [[6, -3], [-3, 3.5]], 99).T
>>> lmo.l_scale(x, trim=(1, 1), axis=-1)
array([0.66698774, 0.54440895])
>>> lmo.l_coscale(x, trim=(1, 1))
array([[ 0.66698774, -0.41025416],
[-0.37918065, 0.54440895]])
See Also
lmo.l_coskew(a, /, trim=(0, 0), *, dtype=np.float64, **kwargs)
¶
Sample L-coskewness coefficient matrix \(\tilde\Lambda^{(t_1, t_2)}_3\).
Alias for lmo.l_coratio(a, 3, 2, *, **)
.
See Also
lmo.l_costats(a, /, trim=(0, 0), *, dtype=np.float64, **kwargs)
¶
Calculates the L-coscale, L-corr(elation), L-coskew(ness) and L-cokurtosis.
Equivalent to lmo.l_coratio(a, [2, 2, 3, 4], [0, 2, 2, 2], *, **)
.
See Also
Distributions¶
Probability distributions, compatible with scipy.stats
.
lmo.distributions.kumaraswamy: RVContinuous[float, float] = kumaraswamy_gen(a=0.0, b=1.0, name='kumaraswamy')
module-attribute
¶
A Kumaraswamy random variable, similar to scipy.stats.beta
.
The probability density function for kumaraswamy
is:
for \( 0 < x < 1,\ a > 0,\ b > 0 \).
kumaraswamy
takes \( a \) and \( b \) as shape parameters.
See Also
lmo.distributions.wakeby: RVContinuous[float, float, float] = wakeby_gen(a=0.0, name='wakeby')
module-attribute
¶
A Wakeby random variable, a generalization of scipy.stats.genpareto
.
wakeby
takes \( b \), \( d \) and \( f \) as shape parameters.
For a detailed description of the Wakeby distribution, refer to Distributions - Wakeby.
lmo.distributions.genlambda: RVContinuous[float, float, float] = genlambda_gen(name='genlambda')
module-attribute
¶
A generalized Tukey-Lambda random variable.
genlambda
takes b
, d
and f
as shape parameters. b
and d
can be any float, and f
requires -1 <= f <= 1
.
If f == 0
and b == d
, genlambda
is equivalent to scipy.stats.tukeylambda
, with b
(or d
) as shape parameter.
For a detailed description of the GLD, refer to Distributions - GLD.
lmo.distributions.l_poly(lmbda, /, trim=(0, 0), *, seed=None)
¶
Polynomial quantile distribution with (only) the given L-moments.
Todo
- Examples
stats(moments='mv')
Create a new l_poly
instance.
PARAMETER | DESCRIPTION |
---|---|
lmbda | 1-d array-like of L-moments \( \tlmoment{s,t}{r} \) for \( r = 1, 2, \ldots, R \). At least 2 L-moments are required. All remaining L-moments with \( r > R \) are considered zero. |
trim | The trim-length(s) of L-moments TYPE: |
seed | Random number generator. |
random_state: np.random.Generator
property
writable
¶
The random number generator of the distribution.
fit(data, moments=None, trim=(0, 0))
classmethod
¶
Fit distribution using the (trimmed) L-moment estimates of the given data.
PARAMETER | DESCRIPTION |
---|---|
data | 1-d array-like with sample observations. |
moments | How many sample L-moments to use, TYPE: |
trim | The left and right trim-lengths \((s, t)\) to use. Defaults to \((0, 0)\). TYPE: |
RETURNS | DESCRIPTION |
---|---|
Self | A fitted |
RAISES | DESCRIPTION |
---|---|
TypeError | Invalid |
ValueError | Not enough |
ValueError | If the L-moments of the data do not result in strictly monotinically increasing quantile function (PPF). This generally means that either the left, the right, or both |
rvs(size=None, random_state=None)
¶
Draw random variates from the relevant distribution.
PARAMETER | DESCRIPTION |
---|---|
size | Defining number of random variates, defaults to 1. |
random_state | Seed or |
RETURNS | DESCRIPTION |
---|---|
float | _ArrF8 | A scalar or array with shape like |
ppf(p)
¶
Percent point function \( Q(p) \) (inverse of CDF, a.k.a. the quantile function) at \( p \) of the given distribution.
PARAMETER | DESCRIPTION |
---|---|
p | Scalar or array-like of lower tail probability values in \( [0, 1] \). |
See Also
isf(q)
¶
qdf(p)
¶
Quantile density function \( q \equiv \frac{\dd{Q}}{\dd{p}} \) ( derivative of the PPF) at \( p \) of the given distribution.
PARAMETER | DESCRIPTION |
---|---|
p | Scalar or array-like of lower tail probability values in \( [0, 1] \). |
See Also
cdf(x)
¶
Cumulative distribution function \( F(x) = \mathrm{P}(X \le x) \) at \( x \) of the given distribution.
Note
Because the CDF of l_poly
is not analytically expressible, it is evaluated numerically using a root-finding algorithm.
PARAMETER | DESCRIPTION |
---|---|
x | Scalar or array-like of quantiles. |
logcdf(x)
¶
sf(x)
¶
logsf(x)
¶
pdf(x)
¶
Probability density function \( f \equiv \frac{\dd{F}}{\dd{x}} \) (derivative of the CDF) at \( x \).
By applying the inverse function rule, the PDF can also defined using the QDF as \( f(x) = 1 / q\big(F(x)\big) \).
PARAMETER | DESCRIPTION |
---|---|
x | Scalar or array-like of quantiles. |
hf(x)
¶
Hazard function \( h(x) = f(x) / S(x) \) at \( x \), with \( f \) and \( S \) the PDF and SF, respectively.
PARAMETER | DESCRIPTION |
---|---|
x | Scalar or array-like of quantiles. |
mean()
¶
The mean \( \mu = \E[X] \) of random varianble \( X \) of the relevant distribution.
See Also
var()
¶
The variance \( \Var[X] = \E\bigl[(X - \E[X])^2\bigr] = \E\bigl[X^2\bigr] - \E[X]^2 = \sigma^2 \) (2nd central product moment) of random varianble \( X \) of the relevant distribution.
See Also
std()
¶
The standard deviation \( \Std[X] = \sqrt{\Var[X]} = \sigma \) of random varianble \( X \) of the relevant distribution.
See Also
entropy()
¶
Differential entropy \( \mathrm{H}[X] \) of random varianble \( X \) of the relevant distribution.
It is defined as
with \( f(x) \) the PDF, \( Q(p) \) the PPF, and \( q(p) = Q'(p) \) the QDF.
See Also
interval(confidence)
¶
Confidence interval with equal areas around the median.
For confidence
level \( \alpha \in [0, 1] \), this function evaluates
where \( Q(p) \) is the PPF.
PARAMETER | DESCRIPTION |
---|---|
confidence | Scalar or array-like. The Probability that a random varianble will be drawn from the returned range. Each confidence value should be between 0 and 1. |
moment(n)
¶
Non-central product moment \( \E[X^n] \) of \( X \) of specified order \( n \).
Note
The product moment is evaluated using numerical integration (scipy.integrate.quad
), which cannot check whether the product-moment actually exists for the distribution, in which case an invalid result will be returned.
PARAMETER | DESCRIPTION |
---|---|
n | Order \( n \ge 0 \) of the moment. TYPE: |
See Also
Todo
- For n>=2, attempt tot infer from
_l_moments
if the 2nd moment condition holds, usingdiagnostics.l_moment_bounds
.
stats(moments='mv')
¶
Some product-moment statistics of the given distribution.
PARAMETER | DESCRIPTION |
---|---|
moments | Composed of letters
TYPE: |
expect(g)
¶
Calculate expected value of a function with respect to the distribution by numerical integration.
The expected value of a function \( g(x) \) with respect to a random variable \( X \) is defined as
with \( f(x) \) the PDF, and \( Q \) the PPF.
PARAMETER | DESCRIPTION |
---|---|
g | Continuous and deterministic function \( g: \reals \mapsto \reals \). TYPE: |
l_moment(r, /, trim=None)
¶
Evaluate the population L-moment(s) \(\lambda^{(s,t)}_r\).
PARAMETER | DESCRIPTION |
---|---|
r | L-moment order(s), non-negative integer or array-like of integers. TYPE: |
trim | Left- and right- trim. Can be scalar or 2-tuple of non-negative int or float. TYPE: |
l_ratio(r, k, /, trim=None)
¶
Evaluate the population L-moment ratio(’s) \(\tau^{(s,t)}_{r,k}\).
PARAMETER | DESCRIPTION |
---|---|
r | L-moment order(s), non-negative integer or array-like of integers. TYPE: |
k | L-moment order of the denominator, e.g. 2. TYPE: |
trim | Left- and right- trim. Can be scalar or 2-tuple of non-negative int or float. TYPE: |
l_stats(trim=None, moments=4)
¶
Evaluate the L-moments (for \(r \le 2\)) and L-ratio’s (for \(r > 2\)).
PARAMETER | DESCRIPTION |
---|---|
trim | Left- and right- trim. Can be scalar or 2-tuple of non-negative int or float. TYPE: |
moments | The amount of L-moments to return. Defaults to 4. TYPE: |
l_loc(trim=None)
¶
L-location of the distribution, i.e. the 1st L-moment.
Alias for l_poly.l_moment(1, ...)
.
See Also
l_scale(trim=None)
¶
L-scale of the distribution, i.e. the 2nd L-moment.
Alias for l_poly.l_moment(2, ...)
.
See Also
l_skew(trim=None)
¶
L-skewness coefficient of the distribution; the 3rd L-moment ratio.
Alias for l_poly.l_ratio(3, 2, ...)
.
See Also
l_kurtosis(trim=None)
¶
L-kurtosis coefficient of the distribution; the 4th L-moment ratio.
Alias for l_poly.l_ratio(4, 2, ...)
.
See Also
lmo.distributions.l_rv_nonparametric(l_moments, trim=(0, 0), a=None, b=None, **kwargs)
¶
Bases: _rv_continuous
Warning
l_rv_nonparametric
is deprecated, and will be removed in version 0.13
. Use l_poly
instead.
Estimate a distribution using the given L-moments. See scipy.stats.rv_continuous
for the available method.
The PPF (quantile function) is estimated using generalized Fourier series, with the (shifted) Jacobi orthogonal polynomials as basis, and the (scaled) L-moments as coefficients.
The corrected version of theorem 3 from Hosking (2007) states that
converges almost everywhere as \( R \rightarrow \infty \), for any sufficiently smooth quantile function (PPF) \( Q(u) \) on \( u \in (0, 1) \). Here, \( \shjacobi n \alpha \beta x = \jacobi{n}{\alpha}{\beta}{2x - 1} \) is a shifted Jacobi polynomial.
References
PARAMETER | DESCRIPTION |
---|---|
l_moments | Vector containing the first \(R\) consecutive L-moments \(\left[ \lambda^{(s, t)}_1 \; \lambda^{(s, t)}_2 \; \dots \; \lambda^{(s, t)}_R \right]\), where \(R \ge 2\). Sample L-moments can be estimated using e.g. The trim-lengths \((s, t)\) should be the same for all L-moments. TYPE: |
trim | The left and right trim-lengths \((s, t)\), that correspond to the provided TYPE: |
a | Lower bound of the support of the distribution. By default it is estimated from the L-moments. TYPE: |
b | Upper bound of the support of the distribution. By default it is estimated from the L-moments. TYPE: |
**kwargs | Optional params for TYPE: |
RAISES | DESCRIPTION |
---|---|
ValueError | If |
l_moments: npt.NDArray[np.float64]
property
¶
Initial L-moments, for orders \(r = 1, 2, \dots, R\).
trim: tuple[int, int] | tuple[float, float]
property
¶
The provided trim-lengths \((s, t)\).
ppf_poly: PolySeries
property
¶
Polynomial estimate of the percent point function (PPF), a.k.a. the quantile function (QF), or the inverse cumulative distribution function (ICDF).
Note
Converges to the “true” PPF in the mean-squared sense, with weight function \(q^s (1 - q)^t\) of quantile \(q \in [0, 1]\), and trim-lengths \((t_1, t_2) \in \mathbb{R^+} \times \mathbb{R^+}\).
RETURNS | DESCRIPTION |
---|---|
PolySeries | A |
cdf_poly: PolySeries
cached
property
¶
Polynomial least-squares interpolation of the CDF.
RETURNS | DESCRIPTION |
---|---|
PolySeries | A |
pdf_poly: PolySeries
cached
property
¶
Derivative of the polynomial interpolation of the CDF, i.e. the polynomial estimate of the PDF.
RETURNS | DESCRIPTION |
---|---|
PolySeries | A |
fit(data, /, rmax=None, trim=(0, 0))
classmethod
¶
Estimate L-moment from the samples, and return a new l_rv_nonparametric
instance.
PARAMETER | DESCRIPTION |
---|---|
data | 1d array-like with univariate sample observations. |
rmax | The (maximum) amount of L-moment orders to use. Defaults to \(\lceil 4 \log_{10} N \rceil\). The quantile polynomial will be of degree TYPE: |
trim | The left and right trim-lengths \((s, t)\), that correspond to the provided TYPE: |
RETURNS | DESCRIPTION |
---|---|
l_rv_nonparametric | A fitted |
l_rv_nonparametric | |
l_rv_nonparametric | instance. |
Todo
- Optimal
rmax
selection (the error appears to be periodic..?) - Optimal
trim
selection
lmo.distributions.wakeby_gen
¶
Bases: _rv_continuous
Statistical test and tools¶
Hypothesis tests, estimator properties, and performance metrics.
lmo.diagnostic.HypothesisTestResult
¶
Bases: NamedTuple
Results of a hypothesis test.
ATTRIBUTE | DESCRIPTION |
---|---|
statistic | The raw test statistic. Its distribution depends on the specific test implementation. TYPE: |
pvalue | Two-sided probability value corresponding to the the null hypothesis, \(H_0\). TYPE: |
lmo.diagnostic.normaltest(a, /, *, axis=None)
¶
Statistical hypothesis test for non-normality, using the L-skewness and L-kurtosis coefficients on the sample data..
Adapted from Harri & Coble (2011), and includes Hosking’s correction.
Definition
- H0: The data was drawn from a normal distribution.
- H1: The data was drawn from a non-normal distribution.
Examples:
Compare the testing power with scipy.stats.normaltest
given 10.000 samples from a contaminated normal distribution.
>>> import numpy as np
>>> from lmo.diagnostic import normaltest
>>> from scipy.stats import normaltest as normaltest_scipy
>>> rng = np.random.default_rng(12345)
>>> n = 10_000
>>> x = 0.9 * rng.normal(0, 1, n) + 0.1 * rng.normal(0, 9, n)
>>> normaltest(x)[1]
0.04806618
>>> normaltest_scipy(x)[1]
0.08435627
At a 5% significance level, Lmo’s test is significant (i.e. indicating non-normality), whereas scipy’s test isn’t (i.e. inconclusive).
PARAMETER | DESCRIPTION |
---|---|
a | Array-like of sample data. |
axis | Axis along which to compute the test. TYPE: |
RETURNS | DESCRIPTION |
---|---|
HypothesisTestResult | A named tuple with:
|
lmo.diagnostic.l_moment_gof(rv_or_cdf, l_moments, n_obs, /, trim=(0, 0), **kwargs)
¶
Goodness-of-fit (GOF) hypothesis test for the null hypothesis that the observed L-moments come from a distribution with the given scipy.stats
distribution or cumulative distribution function (CDF).
H0
: The theoretical probability distribution, with the given CDF, is a good fit for the observed L-moments.H1
: The distribution is not a good fit for the observed L-moments.
The test statistic is the squared Mahalanobis distance between the \(n\) observed L-moments, and the theoretical L-moments. It asymptically (in sample size) follows the \(\chi^2\) distribution, with \(n\) degrees of freedom.
The sample L-moments are expected to be of consecutive orders \(r = 1, 2, \dots, n\). Generally, the amount of L-moments \(n\) should not be less than the amount of parameters of the distribution, including the location and scale parameters. Therefore, it is required to have \(n \ge 2\).
Notes
The theoretical L-moments and their covariance matrix are calculated from the CDF using numerical integration (scipy.integrate.quad
and scipy.integrate.nquad
). Undefined or infinite integrals cannot be detected, in which case the results might be incorrect.
If an IntegrationWarning
is issued, or the function is very slow, then the results are probably incorrect, and larger degrees of trimming should be used.
Examples:
Test if the samples are drawn from a normal distribution.
>>> import lmo
>>> import numpy as np
>>> from lmo.diagnostic import l_moment_gof
>>> from scipy.stats import norm
>>> rng = np.random.default_rng(12345)
>>> X = norm(13.12, 1.66)
>>> n = 1_000
>>> x = X.rvs(n, random_state=rng)
>>> x_lm = lmo.l_moment(x, [1, 2, 3, 4])
>>> l_moment_gof(X, x_lm, n).pvalue
0.82597
Contaminated samples:
>>> y = 0.9 * x + 0.1 * rng.normal(X.mean(), X.std() * 10, n)
>>> y_lm = lmo.l_moment(y, [1, 2, 3, 4])
>>> y_lm.round(3)
array([13.193, 1.286, 0.006, 0.168])
>>> l_moment_gof(X, y_lm, n).pvalue
0.0
See Also
lmo.diagnostic.l_stats_gof(rv_or_cdf, l_stats, n_obs, /, trim=(0, 0), **kwargs)
¶
Analogous to lmo.diagnostic.l_moment_gof
, but using the L-stats (see lmo.l_stats
).
lmo.diagnostic.l_moment_bounds(r, /, trim=(0, 0), scale=1.0)
¶
Returns the absolute upper bounds \(L^{(s,t)}_r\) on L-moments \(\lambda^{(s,t)}_r\), proportional to the scale \(\sigma_X\) (standard deviation) of the probability distribution of random variable \(X\). So \(\left| \lambda^{(s,t)}_r(X) \right| \le \sigma_X \, L^{(s,t)}_r\), given that standard deviation \(\sigma_X\) of \(X\) exists and is finite.
Warning
These bounds do not apply to distributions with undefined variance, e.g. the Cauchy distribution, even if trimmed L-moments are used. Distributions with infinite variance (e.g. Student’s t with \(\nu=2\)) are a grey area:
For the L-scale (\(r=2\)), the corresponding bound will not be a valid one. However, it can still be used to find the L-ratio bounds, because the \(\sigma_X\) terms will cancel out. Doing this is not for the faint of heart, as it requires dividing infinity by infinity. So be sure to wear safety glasses.
The bounds are derived by applying the Cauchy-Schwarz inequality to the covariance-based definition of generalized trimmed L-moment, for \(r > 1\):
where \(B\) is the Beta function, \(P^{(\alpha, \beta)}_r\) the Jacobi polynomial, and \(F\) the cumulative distribution function of random variable \(X\).
After a lot of work, one can (and one did) derive the closed-form inequality:
for \(r \in \mathbb{N}_{\ge 2}\) and \(s, t \in \mathbb{R}_{\ge 0}\), where \(\Gamma\) is the Gamma function, and \(B\) the multivariate Beta function
For the untrimmed L-moments, this simplifies to
Notes
For \(r=1\) there are no bounds, i.e. float('inf')
is returned.
There are no references; this novel finding is not (yet..?) published by the author, @jorenham.
PARAMETER | DESCRIPTION |
---|---|
r | The L-moment order(s), non-negative integer or array-like of integers. TYPE: |
trim | Left- and right-trim orders \((s, t)\), as a tuple of non-negative ints or floats. TYPE: |
scale | The standard deviation \(\sigma_X\) of the random variable \(X\). Defaults to 1. TYPE: |
RETURNS | DESCRIPTION |
---|---|
out | float array or scalar like TYPE: |
See Also
lmo.diagnostic.l_ratio_bounds(r, /, trim=(0, 0), *, legacy=False)
¶
Unlike the standardized product-moments, the L-moment ratio’s with \( r \ge 2 \) are bounded above and below.
Specifically, Hosking derived in 2007 that
But this derivation relies on unnecessarily loose Jacobi polynomial bounds. If the actual min and max of the Jacobi polynomials are used instead, the following (tighter) inequality is obtained:
where
Examples:
Without trim, the lower- and upper-bounds of the L-skewness and L-kurtosis are:
>>> l_ratio_bounds(3)
(-1.0, 1.0)
>>> l_ratio_bounds(4)
(-0.25, 1.0)
For the L-kurtosis, the “legacy” bounds by Hosking (2007) are clearly looser:
>>> l_ratio_bounds(4, legacy=True)
(-1.0, 1.0)
For the symmetrically trimmed TL-moment ratio’s:
>>> l_ratio_bounds(3, trim=3)
(-1.2, 1.2)
>>> l_ratio_bounds(4, trim=3)
(-0.15, 1.5)
Similarly, those of the LL-ratio’s are
>>> l_ratio_bounds(3, trim=(0, 3))
(-0.8, 2.0)
>>> l_ratio_bounds(4, trim=(0, 3))
(-0.233333, 3.5)
The LH-skewness bounds are “flipped” w.r.t to the LL-skewness, but they are the same for the L*-kurtosis:
>>> l_ratio_bounds(3, trim=(3, 0))
(-2.0, 0.8)
>>> l_ratio_bounds(4, trim=(3, 0))
(-0.233333, 3.5)
The bounds of multiple L-ratio’s can be calculated in one shot:
>>> np.stack(l_ratio_bounds([3, 4, 5, 6], trim=(1, 2)))
array([[-1. , -0.19444444, -1.12 , -0.14945848],
[ 1.33333333, 1.75 , 2.24 , 2.8 ]])
PARAMETER | DESCRIPTION |
---|---|
r | Scalar or array-like with the L-moment ratio order(s). TYPE: |
trim | L-moment ratio trim-length(s). TYPE: |
legacy | If set to TYPE: |
RETURNS | DESCRIPTION |
---|---|
tuple[float | _ArrF8, float | _ArrF8] | A 2-tuple with arrays or scalars, of the lower- and upper bounds. |
See Also
lmo.diagnostic.rejection_point(influence_fn, /, rho_min=0, rho_max=np.inf)
¶
Evaluate the approximate rejection point of an influence function \(\psi_{T|F}(x)\) given a statistical functional \(T\) (e.g. an L-moment) and cumulative distribution function \(F(x)\).
with a \(\epsilon\) a small positive number, corresponding to the tol
param of e.g. l_moment_influence , which defaults to 1e-8
.
Examples:
The untrimmed L-location isn’t robust, e.g. with the standard normal distribution:
>>> import numpy as np
>>> from scipy.stats import distributions as dists
>>> from lmo.diagnostic import rejection_point
>>> if_l_loc_norm = dists.norm.l_moment_influence(1, trim=0)
>>> if_l_loc_norm(np.inf)
inf
>>> rejection_point(if_l_loc_norm)
nan
For the TL-location of the Gaussian distribution, and even for the Student’s t distribution with 4 degrees of freedom (3 also works, but is very slow), they exist.
>>> influence_norm = dists.norm.l_moment_influence(1, trim=1)
>>> influence_t4 = dists.t(4).l_moment_influence(1, trim=1)
>>> influence_norm(np.inf), influence_t4(np.inf)
(0.0, 0.0)
>>> rejection_point(influence_norm), rejection_point(influence_t4)
(6.0, 206.0)
Notes
Large rejection points (e.g. >1000) are unlikely to be found.
For instance, that of the TL-location of the Student’s t distribution with 2 degrees of freedom lies between somewhere 1e4
and 1e5
, but will not be found. In this case, using trim=2
will return 166.0
.
PARAMETER | DESCRIPTION |
---|---|
influence_fn | Univariate influence function. |
rho_min | The minimum \(\rho^*_{T|F}\) of the search space. Must be finite and non-negative. Defaults to \(0\). TYPE: |
rho_max | The maximum \(\rho^*_{T|F}\) of the search space. Must be larger than |
RETURNS | DESCRIPTION |
---|---|
float | A finite or infinite scalar. |
lmo.diagnostic.error_sensitivity(influence_fn, /, domain=(-math.inf, math.inf))
¶
Evaluate the gross-error sensitivity of an influence function \(\psi_{T|F}(x)\) given a statistical functional \(T\) (e.g. an L-moment) and cumulative distribution function \(F(x)\).
Examples:
Evaluate the gross-error sensitivity of the standard exponential distribution’s LL-skewness (\(\tau^{(0, 1)}_3\)) and LL-kurtosis (\(\tau^{(0, 1)}_4\)) coefficients:
>>> from lmo.diagnostic import error_sensitivity
>>> from scipy.stats import expon
>>> ll_skew_if = expon.l_ratio_influence(3, 2, trim=(0, 1))
>>> ll_kurt_if = expon.l_ratio_influence(4, 2, trim=(0, 1))
>>> error_sensitivity(ll_skew_if, domain=(0, float('inf')))
1.814657
>>> error_sensitivity(ll_kurt_if, domain=(0, float('inf')))
1.377743
PARAMETER | DESCRIPTION |
---|---|
influence_fn | Univariate influence function. |
domain | Domain of the CDF. Defaults to \((-\infty, \infty)\). |
RETURNS | DESCRIPTION |
---|---|
float | Gross-error sensitivity \(\gamma^*_{T|F}\) . |
lmo.diagnostic.shift_sensitivity(influence_fn, /, domain=(-math.inf, math.inf))
¶
Evaluate the local-shift sensitivity of an influence function \(\psi_{T|F}(x)\) given a statistical functional \(T\) (e.g. an L-moment) and cumulative distribution function \(F(x)\).
Represents the effect of shifting an observation slightly from \(x\), to a neighbouring point \(y\). For instance, adding an observation at \(y\) and removing one at \(x\).
Examples:
Evaluate the local-shift sensitivity of the standard exponential distribution’s LL-skewness (\(\tau^{(0, 1)}_3\)) and LL-kurtosis (\(\tau^{(0, 1)}_4\)) coefficients:
>>> from lmo.diagnostic import shift_sensitivity
>>> from scipy.stats import expon
>>> ll_skew_if = expon.l_ratio_influence(3, 2, trim=(0, 1))
>>> ll_kurt_if = expon.l_ratio_influence(4, 2, trim=(0, 1))
>>> domain = 0, float('inf')
>>> shift_sensitivity(ll_skew_if, domain)
0.837735
>>> shift_sensitivity(ll_kurt_if, domain)
1.442062
Let’s compare these with the untrimmed ones:
>>> shift_sensitivity(expon.l_ratio_influence(3, 2), domain)
1.920317
>>> shift_sensitivity(expon.l_ratio_influence(4, 2), domain)
1.047565
PARAMETER | DESCRIPTION |
---|---|
influence_fn | Univariate influence function. |
domain | Domain of the CDF. Defaults to \((-\infty, \infty)\). |
RETURNS | DESCRIPTION |
---|---|
float | Local-shift sensitivity \(\lambda^*_{T|F}\) . |
Third party integration¶
scipy.stats
¶
Extensions for scipy.stats
distributions.
lmo.contrib.scipy_stats.l_rv_generic
¶
Additional methods that are patched into scipy.stats.rv_continuous
and scipy.stats.rv_discrete
.
l_moment(r, /, *args, trim=(0, 0), quad_opts=None, **kwds)
¶
Population L-moment(s) \(\lambda^{(s,t)}_r\).
with \(U = F_X(X)\) the rank of \(X\), and \(\tilde{P}^{(a,b)}_n(x)\) the shifted (\(x \mapsto 2x-1\)) Jacobi polynomial.
Examples:
Evaluate the population L-moments of the normally-distributed IQ test:
>>> import lmo
>>> from scipy.stats import norm
>>> norm(100, 15).l_moment([1, 2, 3, 4]).round(6)
array([100. , 8.462844, 0. , 1.037559])
>>> _[1] * np.sqrt(np.pi)
15.0000004
Discrete distributions are also supported, e.g. the Binomial distribution:
>>> from scipy.stats import binom
>>> binom(10, .6).l_moment([1, 2, 3, 4]).round(6)
array([ 6. , 0.862238, -0.019729, 0.096461])
PARAMETER | DESCRIPTION |
---|---|
r | L-moment order(s), non-negative integer or array-like of integers. TYPE: |
*args | The shape parameter(s) for the distribution (see docstring of the instance object for more information) TYPE: |
trim | Left- and right- trim. Can be scalar or 2-tuple of non-negative int or float. TYPE: |
quad_opts | Optional dict of options to pass to TYPE: |
**kwds | Additional keyword arguments to pass to the distribution. TYPE: |
RAISES | DESCRIPTION |
---|---|
TypeError |
|
ValueError |
|
RETURNS | DESCRIPTION |
---|---|
lmbda | The population L-moment(s), a scalar or float array like |
References
See Also
lmo.l_moment
: sample L-moment
l_ratio(r, k, /, *args, trim=(0, 0), quad_opts=None, **kwds)
¶
L-moment ratio(’s) \(\tau^{(s,t)}_{r,k}\).
Unless explicitly specified, the r-th (\(r>2\)) L-ratio, \(\tau^{(s,t)}_r\) refers to \(\tau^{(s,t)}_{r, 2}\). Another special case is the L-variation, or the L-CV, \(\tau^{(s,t)} = \tau^{(s,t)}_{2, 1}\). This is the L-moment analogue of the coefficient of variation.
Examples:
Evaluate the population L-CV and LL-CV (CV = coefficient of variation) of the standard Rayleigh distribution.
>>> import lmo
>>> from scipy.stats import distributions
>>> X = distributions.rayleigh()
>>> X.std() / X.mean() # legacy CV
0.5227232
>>> X.l_ratio(2, 1)
0.2928932
>>> X.l_ratio(2, 1, trim=(0, 1))
0.2752551
And similarly, for the (discrete) Poisson distribution with rate parameter set to 2, the L-CF and LL-CV evaluate to:
>>> X = distributions.poisson(2)
>>> X.std() / X.mean()
0.7071067
>>> X.l_ratio(2, 1)
0.3857527
>>> X.l_ratio(2, 1, trim=(0, 1))
0.4097538
Note that (untrimmed) L-CV requires a higher (subdivision) limit in the integration routine, otherwise it’ll complain that it didn’t converge (enough) yet. This is because it’s effectively integrating a non-smooth function, which is mathematically iffy, but works fine in this numerical application.
PARAMETER | DESCRIPTION |
---|---|
r | L-moment ratio order(s), non-negative integer or array-like of integers. TYPE: |
k | L-moment order of the denominator, e.g. 2. TYPE: |
*args | The shape parameter(s) for the distribution (see docstring of the instance object for more information) TYPE: |
trim | Left- and right- trim. Can be scalar or 2-tuple of non-negative int or float. TYPE: |
quad_opts | Optional dict of options to pass to TYPE: |
**kwds | Additional keyword arguments to pass to the distribution. TYPE: |
See Also
l_rv_generic.l_moment
lmo.l_ratio
- Sample L-moment ratio estimator
l_stats(*args, trim=(0, 0), moments=4, quad_opts=None, **kwds)
¶
The L-moments (for \(r \le 2\)) and L-ratio’s (for \(r > 2\)).
By default, the first moments = 4
population L-stats are calculated:
- \(\lambda^{(s,t)}_1\) - L-location
- \(\lambda^{(s,t)}_2\) - L-scale
- \(\tau^{(s,t)}_3\) - L-skewness coefficient
- \(\tau^{(s,t)}_4\) - L-kurtosis coefficient
This method is equivalent to X.l_ratio([1, 2, 3, 4], [0, 0, 2, 2], *, **)
, for with default moments = 4
.
Examples:
Summarize the standard exponential distribution for different trim-orders.
>>> import lmo
>>> from scipy.stats import distributions
>>> X = distributions.expon()
>>> X.l_stats().round(6)
array([1. , 0.5 , 0.333333, 0.166667])
>>> X.l_stats(trim=(0, 1/2)).round(6)
array([0.666667, 0.333333, 0.266667, 0.114286])
>>> X.l_stats(trim=(0, 1)).round(6)
array([0.5 , 0.25 , 0.222222, 0.083333])
Note
This should not be confused with the term L-statistic, which is sometimes used to describe any linear combination of order statistics.
PARAMETER | DESCRIPTION |
---|---|
*args | The shape parameter(s) for the distribution (see docstring of the instance object for more information) TYPE: |
trim | Left- and right- trim. Can be scalar or 2-tuple of non-negative int or float. TYPE: |
moments | The amount of L-moments to return. Defaults to 4. TYPE: |
quad_opts | Optional dict of options to pass to TYPE: |
**kwds | Additional keyword arguments to pass to the distribution. TYPE: |
See Also
l_rv_generic.l_ratio
lmo.l_stats
- Unbiased sample estimation of L-stats.
l_loc(*args, trim=(0, 0), **kwds)
¶
L-location of the distribution, i.e. the 1st L-moment.
Alias for X.l_moment(1, ...)
.
l_scale(*args, trim=(0, 0), **kwds)
¶
L-scale of the distribution, i.e. the 2nd L-moment.
Alias for X.l_moment(2, ...)
.
l_skew(*args, trim=(0, 0), **kwds)
¶
L-skewness coefficient of the distribution; the 3rd L-moment ratio.
Alias for X.l_ratio(3, 2, ...)
.
l_kurtosis(*args, trim=(0, 0), **kwds)
¶
L-kurtosis coefficient of the distribution; the 4th L-moment ratio.
Alias for X.l_ratio(4, 2, ...)
.
l_moments_cov(r_max, /, *args, trim=(0, 0), quad_opts=None, **kwds)
¶
Variance/covariance matrix of the L-moment estimators.
L-moments that are estimated from \(n\) samples of a distribution with CDF \(F\), converge to the multivariate normal distribution as the sample size \(n \rightarrow \infty\).
Here, \(\vec{l}^{(s, t)} = \left[l^{(s,t)}_r, \dots, l^{(s,t)}_{r_{max}}\right]^T\) is a vector of estimated sample L-moments, and \(\vec{\lambda}^{(s, t)}\) its theoretical (“true”) counterpart.
This function calculates the covariance matrix
where
the shifted Jacobi polynomial \(p_n(u) = P^{(t, s)}_{n-1}(2u - 1)\), \(P^{(t, s)}_m\), and \(w^{(s,t)}(u) = u^s (1-u)^t\) its weight function.
Notes
This function is not vectorized or parallelized.
For small sample sizes (\(n < 100\)), the covariances of the higher-order L-moments (\(r > 2\)) can be biased. But this bias quickly disappears at roughly \(n > 200\) (depending on the trim- and L-moment orders).
Examples:
>>> import lmo
>>> from scipy.stats import distributions
>>> X = distributions.expon() # standard exponential distribution
>>> X.l_moments_cov(4).round(6)
array([[1. , 0.5 , 0.166667, 0.083333],
[0.5 , 0.333333, 0.166667, 0.083333],
[0.166667, 0.166667, 0.133333, 0.083333],
[0.083333, 0.083333, 0.083333, 0.071429]])
>>> X.l_moments_cov(4, trim=(0, 1)).round(6)
array([[0.333333, 0.125 , 0. , 0. ],
[0.125 , 0.075 , 0.016667, 0. ],
[0. , 0.016667, 0.016931, 0.00496 ],
[0. , 0. , 0.00496 , 0.0062 ]])
PARAMETER | DESCRIPTION |
---|---|
r_max | The amount of L-moment orders to consider. If for example TYPE: |
*args | The shape parameter(s) for the distribution (see docstring of the instance object for more information) TYPE: |
trim | Left- and right- trim. Can be scalar or 2-tuple of non-negative int or float. or floats. TYPE: |
quad_opts | Optional dict of options to pass to TYPE: |
**kwds | Additional keyword arguments to pass to the distribution. TYPE: |
RETURNS | DESCRIPTION |
---|---|
cov | Covariance matrix, with shape |
RAISES | DESCRIPTION |
---|---|
RuntimeError | If the covariance matrix is invalid. |
l_stats_cov(*args, moments=4, trim=(0, 0), quad_opts=None, **kwds)
¶
Similar to l_moments_cov
, but for the l_rv_generic.l_stats
.
As the sample size \(n \rightarrow \infty\), the L-moment ratio’s are also distributed (multivariate) normally. The L-stats are defined to be L-moments for \(r\le 2\), and L-ratio coefficients otherwise.
The corresponding covariance matrix has been found to be
where \(\bf{\Lambda}^{(s, t)}\) is the covariance matrix of the L-moments from l_moment_cov_from_cdf
, and \(\tau^{(s,t)}_r = \lambda^{(s,t)}_r / \lambda^{(s,t)}_2\) the population L-ratio.
Examples:
Evaluate the LL-stats covariance matrix of the standard exponential distribution, for 0, 1, and 2 degrees of trimming.
>>> import lmo
>>> from scipy.stats import distributions
>>> X = distributions.expon() # standard exponential distribution
>>> X.l_stats_cov().round(6)
array([[1. , 0.5 , 0. , 0. ],
[0.5 , 0.333333, 0.111111, 0.055556],
[0. , 0.111111, 0.237037, 0.185185],
[0. , 0.055556, 0.185185, 0.21164 ]])
>>> X.l_stats_cov(trim=(0, 1)).round(6)
array([[ 0.333333, 0.125 , -0.111111, -0.041667],
[ 0.125 , 0.075 , 0. , -0.025 ],
[-0.111111, 0. , 0.21164 , 0.079365],
[-0.041667, -0.025 , 0.079365, 0.10754 ]])
>>> X.l_stats_cov(trim=(0, 2)).round(6)
array([[ 0.2 , 0.066667, -0.114286, -0.02 ],
[ 0.066667, 0.038095, -0.014286, -0.023333],
[-0.114286, -0.014286, 0.228571, 0.04 ],
[-0.02 , -0.023333, 0.04 , 0.086545]])
Note that with 0 trim the L-location is independent of the L-skewness and L-kurtosis. With 1 trim, the L-scale and L-skewness are independent. And with 2 trim, all L-stats depend on each other.
PARAMETER | DESCRIPTION |
---|---|
*args | The shape parameter(s) for the distribution (see docstring of the instance object for more information) TYPE: |
moments | The amount of L-statistics to consider. Defaults to 4. TYPE: |
trim | Left- and right- trim. Can be scalar or 2-tuple of non-negative int or float. or floats. TYPE: |
quad_opts | Optional dict of options to pass to TYPE: |
**kwds | Additional keyword arguments to pass to the distribution. TYPE: |
l_moment_influence(r, /, *args, trim=(0, 0), quad_opts=None, tol=1e-08, **kwds)
¶
Returns the influence function (IF) of an L-moment.
with \(F\) the CDF, \(\tilde{P}^{(s,t)}_{r-1}\) the shifted Jacobi polynomial, and
where \(B\) is the (complete) Beta function.
The proof is trivial, because population L-moments are linear functionals.
Notes
The order parameter r
is not vectorized.
PARAMETER | DESCRIPTION |
---|---|
r | The L-moment order \(r \in \mathbb{N}^+\).. TYPE: |
*args | The shape parameter(s) for the distribution (see docstring of the instance object for more information) TYPE: |
trim | Left- and right- trim. Can be scalar or 2-tuple of non-negative int or float. or floats. TYPE: |
quad_opts | Optional dict of options to pass to TYPE: |
tol | Values that are absolutely smaller than this will be rounded to zero. TYPE: |
**kwds | Additional keyword arguments to pass to the distribution. TYPE: |
RETURNS | DESCRIPTION |
---|---|
influence_function | The (vectorized) influence function, \(\psi_{\lambda^{(s, t)}_r | F}(x)\). TYPE: |
See Also
l_ratio_influence(r, k, /, *args, trim=(0, 0), quad_opts=None, tol=1e-08, **kwds)
¶
Returns the influence function (IF) of an L-moment ratio.
where the L-moment ratio is defined as
Because IF’s are a special case of the general Gâteuax derivative, the L-ratio IF is derived by applying the chain rule to the L-moment IF.
PARAMETER | DESCRIPTION |
---|---|
r | L-moment ratio order, i.e. the order of the numerator L-moment. TYPE: |
k | Denominator L-moment order, defaults to 2. TYPE: |
*args | The shape parameter(s) for the distribution (see docstring of the instance object for more information) TYPE: |
trim | Left- and right- trim. Can be scalar or 2-tuple of non-negative int or float. or floats. TYPE: |
quad_opts | Optional dict of options to pass to TYPE: |
tol | Values that are absolutely smaller than this will be rounded to zero. TYPE: |
**kwds | Additional keyword arguments to pass to the distribution. TYPE: |
RETURNS | DESCRIPTION |
---|---|
influence_function | The influence function, with vectorized signature TYPE: |
See Also
l_fit(data, *args, n_extra=0, trim=(0, 0), full_output=False, fit_kwargs=None, random_state=None, **kwds)
¶
Return estimates of shape (if applicable), location, and scale parameters from data. The default estimation method is Method of L-moments (L-MM), but the Generalized Method of L-Moments (L-GMM) is also available (see the n_extra
parameter).
See ‘lmo.inference.fit’ for details.
Examples:
Fitting standard normal samples Using scipy’s default MLE (Maximum Likelihood Estimation) method:
>>> import lmo
>>> import numpy as np
>>> from scipy.stats import norm
>>> rng = np.random.default_rng(12345)
>>> x = rng.standard_normal(200)
>>> norm.fit(x)
(0.0033254, 0.95554)
Better results can be obtained different by using Lmo’s L-MM (Method of L-moment):
>>> norm.l_fit(x, random_state=rng)
FitArgs(loc=0.0033145, scale=0.96179)
>>> norm.l_fit(x, trim=1, random_state=rng)
FitArgs(loc=0.019765, scale=0.96749)
To use more L-moments than the number of parameters, two in this case, n_extra
can be used. This will use the L-GMM (Generalized Method of L-Moments), which results in slightly better estimates:
>>> norm.l_fit(x, n_extra=1, random_state=rng)
FitArgs(loc=0.0039747, scale=0.96233)
>>> norm.l_fit(x, trim=1, n_extra=1, random_state=rng)
FitArgs(loc=-0.00127874, scale=0.968547)
PARAMETER | DESCRIPTION |
---|---|
data | 1-D array-like data to use in estimating the distribution parameters. |
*args | Starting value(s) for any shape-characterizing arguments ( those not provided will be determined by a call to TYPE: |
trim | Left- and right- trim. Can be scalar or 2-tuple of non-negative int or float. TYPE: |
n_extra | The amount of extra L-moment conditions to use than the amount of parameters. If 0 (default), L-MM will be used. If >0, \(k\)-step L-GMM will be used. TYPE: |
full_output | If set to True, a TYPE: |
fit_kwargs | Additional keyword arguments to be passed to ‘lmo.inference.fit’ or ‘scipy.optimize.minimize’. |
random_state | Integer or |
**kwds | Special keyword arguments are recognized as holding certain parameters fixed:
TYPE: |
RETURNS | DESCRIPTION |
---|---|
result | Named tuple with estimates for any shape parameters (if applicable), followed by those for location and scale. For most random variables, shape statistics will be returned, but there are exceptions (e.g. |
See Also
Todo
- Support integral parameters.
l_fit_loc_scale(data, *args, trim=(0, 0), **kwds)
¶
Estimate loc and scale parameters from data using the first two L-moments.
Notes
The implementation mimics that of fit_loc_scale()
PARAMETER | DESCRIPTION |
---|---|
data | Data to fit. |
*args | The shape parameter(s) for the distribution (see docstring of the instance object for more information). TYPE: |
trim | Left- and right- trim. Can be scalar or 2-tuple of non-negative int or float. TYPE: |
**kwds | Additional keyword arguments to pass to the distribution. TYPE: |
RETURNS | DESCRIPTION |
---|---|
loc_hat | Estimated location parameter for the data. TYPE: |
scale_hat | Estimated scale parameter for the data. TYPE: |
pandas
(optional)¶
Extension methods for pandas.Series
and pandas.DataFrame
.
Pandas is an optional dependency, and can be installed using pip install lmo[pandas]
.
Examples:
Univariate summary statistics:
>>> df = pd.DataFrame({'a': [1, 2, 2, 3, 4], 'b': [3, 4, 4, 4, 4]})
>>> df.l_stats()
a b
r
1 2.400000 3.8
2 0.700000 0.2
3 0.142857 -1.0
4 0.285714 1.0
>>> df.aggregate(['mean', 'std', 'skew', 'kurt'])
a b
mean 2.400000 3.800000
std 1.140175 0.447214
skew 0.404796 -2.236068
kurt -0.177515 5.000000
Comparison of L-correlation, and Pearson correlation matrices:
>>> df = pd.DataFrame({'dogs': [.2, .0, .5, .4], 'cats': [.3, .2, .0, .1]})
>>> df.l_corr()
dogs cats
dogs 1.0 -0.764706
cats -0.8 1.000000
>>> df.corr()
dogs cats
dogs 1.000000 -0.756889
cats -0.756889 1.000000
lmo.contrib.pandas.Series
¶
Extension methods for pandas.Series
.
This class is not meant to be used directly. These methods are curried and registered as series accessors.
l_moment(r, /, trim=(0, 0), **kwargs)
¶
See lmo.l_moment
.
RETURNS | DESCRIPTION |
---|---|
out | A scalar, or a TYPE: |
l_ratio(r, k, /, trim=(0, 0), **kwargs)
¶
See lmo.l_ratio
.
RETURNS | DESCRIPTION |
---|---|
out | A scalar, or TYPE: |
l_stats(trim=(0, 0), num=4, **kwargs)
¶
See lmo.l_stats
.
RETURNS | DESCRIPTION |
---|---|
out | A |
l_scale(trim=(0, 0), **kwargs)
¶
See lmo.l_scale
.
RETURNS | DESCRIPTION |
---|---|
out | A scalar. TYPE: |
l_variation(trim=(0, 0), **kwargs)
¶
See lmo.l_variation
.
RETURNS | DESCRIPTION |
---|---|
out | A scalar. TYPE: |
l_skew(trim=(0, 0), **kwargs)
¶
See lmo.l_skew
.
RETURNS | DESCRIPTION |
---|---|
out | A scalar. TYPE: |
l_kurtosis(trim=(0, 0), **kwargs)
¶
See lmo.l_kurtosis
.
RETURNS | DESCRIPTION |
---|---|
out | A scalar. TYPE: |
lmo.contrib.pandas.DataFrame
¶
Extension methods for pandas.DataFrame
.
This class is not meant to be used directly. These methods are curried and registered as dataframe accessors.
l_moment(r, /, trim=(0, 0), axis=0, **kwargs)
¶
See lmo.l_moment
.
RETURNS | DESCRIPTION |
---|---|
out | A TYPE: |
l_ratio(r, k, /, trim=(0, 0), axis=0, **kwargs)
¶
See lmo.l_ratio
.
RETURNS | DESCRIPTION |
---|---|
out | A TYPE: |
l_stats(trim=(0, 0), num=4, axis=0, **kwargs)
¶
See lmo.l_stats
.
RETURNS | DESCRIPTION |
---|---|
out | A |
l_loc(trim=(0, 0), axis=0, **kwargs)
¶
Alias for l_moment(1, ...)
. See lmo.l_loc
for details.
l_scale(trim=(0, 0), axis=0, **kwargs)
¶
Alias for l_moment(2, ...)
. See lmo.l_scale
for details.
l_variation(trim=(0, 0), axis=0, **kwargs)
¶
Alias for l_ratio(2, 1, ...)
. See lmo.l_variation
for details.
l_skew(trim=(0, 0), axis=0, **kwargs)
¶
Alias for l_ratio(3, 2, ...)
. See lmo.l_skew
for details.
l_kurtosis(trim=(0, 0), axis=0, **kwargs)
¶
Alias for l_ratio(4, 2, ...)
. See lmo.l_kurtosis
for details.
l_kurt(trim=(0, 0), axis=0, **kwargs)
¶
Alias for l_kurtosis
.
l_comoment(r, /, trim=(0, 0), **kwargs)
¶
See lmo.l_comoment
.
PARAMETER | DESCRIPTION |
---|---|
r | The L-moment order, as a non-negative scalar. TYPE: |
trim | Left- and right-trim orders. TYPE: |
**kwargs | Additional options to pass to TYPE: |
RETURNS | DESCRIPTION |
---|---|
out | A |
RAISES | DESCRIPTION |
---|---|
TypeError | If |
l_coratio(r, k=2, /, trim=(0, 0), **kwargs)
¶
See lmo.l_coratio
.
PARAMETER | DESCRIPTION |
---|---|
r | The L-moment order of the numerator, a non-negative scalar. TYPE: |
k | The L-moment order of the denominator, a non-negative scalar. Defaults to 2. If set to 0, this is equivalent to TYPE: |
trim | Left- and right-trim orders. TYPE: |
**kwargs | Additional options to pass to TYPE: |
RETURNS | DESCRIPTION |
---|---|
out | A |
RAISES | DESCRIPTION |
---|---|
TypeError | If |
l_coloc(trim=(0, 0), **kwargs)
¶
Alias for l_comoment(1, trim, **kwargs)
. See lmo.l_coloc
for details.
l_coscale(trim=(0, 0), **kwargs)
¶
Alias for l_comoment(2, trim, **kwargs)
. See lmo.l_coscale
for details.
l_corr(trim=(0, 0), **kwargs)
¶
Alias for l_coratio(2, 2, trim, **kwargs)
. See lmo.l_corr
for details.
l_coskew(trim=(0, 0), **kwargs)
¶
Alias for l_coratio(3, 2, trim, **kwargs)
. See lmo.l_coskew
for details.
l_cokurtosis(trim=(0, 0), **kwargs)
¶
Alias for l_coratio(4, 2, trim, **kwargs)
. See lmo.l_cokurtosis
for details.
l_cokurt(trim=(0, 0), **kwargs)
¶
Alias for l_cokurtosis
.
Low-level API¶
Lmo: Robust statistics with trimmed L-moments and L-comoments.
lmo.l_weights(r, n, /, trim=(0, 0), dtype=np.float64, *, cache=False)
¶
Projection matrix of the first \(r\) (T)L-moments for \(n\) samples.
For integer trim is the matrix is a linear combination of the Power Weighted Moment (PWM) weights (the sample estimator of \(\beta_{r_1}\)), and the shifted Legendre polynomials.
If the trimmings are nonzero and integers, a linearized (and corrected) adaptation of the recurrence relations from Hosking (2007) are applied, as well.
for \(s > 0\), and
for \(t > 0\).
If the trim values are floats instead, the weights are calculated directly from the (generalized) order statistics. At the time of writing (07-2023), these “generalized trimmed L-moments” have not been discussed in the literature or the R-packages. It’s probably a good idea to publish this…
TLDR
This matrix (linearly) transforms \(x_{i:n}\) (i.e. the sorted observation vector(s) of size \(n\)), into (an unbiased estimate of) the generalized trimmed L-moments, with orders \(\le r\).
RETURNS | DESCRIPTION |
---|---|
P_r | 2-D array of shape |
Examples:
>>> import lmo
>>> lmo.l_weights(3, 4)
array([[ 0.25 , 0.25 , 0.25 , 0.25 ],
[-0.25 , -0.08333333, 0.08333333, 0.25 ],
[ 0.25 , -0.25 , -0.25 , 0.25 ]])
>>> _ @ [-1, 0, 1 / 2, 3 / 2]
array([0.25 , 0.66666667, 0. ])
constants
¶
inference
¶
Parametric inference.
lmo.inference.GMMResult
¶
Bases: NamedTuple
Represents the Generalized Method of L-Moments (L-GMM) results. See lmo.inference.fit
for details.
ATTRIBUTE | DESCRIPTION |
---|---|
args | The estimated distribution arguments, as |
success | Whether or not the optimizer exited successfully. TYPE: |
eps | Final relative difference in the (natural) L-moment conditions. |
statistic | The minimized objective value, corresponding to the TYPE: |
n_samp | Amount of samples used to calculate the sample L-moment (after trimming). TYPE: |
n_step | Number of GMM steps (the amount of times the weight matrix has been estimated). TYPE: |
n_iter | Number of evaluations of the objective function (the theoretical L-moments). TYPE: |
weights | The final weight (precision, inverse covariance) matrix. |
n_arg: int
property
¶
The number of model parameters.
n_con: int
property
¶
The amount of L-moment conditions of the model.
n_extra: int
property
¶
The number of over-identifying L-moment conditions. For L-MM this is zero, otherwise, for L-GMM, it is strictly positive.
j_test: HypothesisTestResult
property
¶
Sargan-Hansen J-test for over-identifying restrictions; a hypothesis test for the invalidity of the model.
The test is defined through two hypotheses:
- \(H_0\): The data satisfies the L-moment conditions, i.e. the model is “valid”.
- \(H_1\): The data does not satisfy the L-moment conditions, i.e. the model is “invalid”.
AIC: float
property
¶
Akaike Information Criterion, based on the p-value of the J-test. Requires over-identified L-moment conditions, i.e. n_extra > 0
.
The AIC is useful for model selection, e.g. for finding the most appropriate probability distribution from the data (smaller is better).
AICc: float
property
¶
A modification of the AIC that includes a bias-correction small sample sizes.
lmo.inference.fit(ppf, args0, n_obs, l_moments, r=None, trim=(0, 0), *, k=None, k_max=50, l_tol=0.0001, l_moment_fn=None, n_mc_samples=9999, random_state=None, **kwds)
¶
Fit the distribution parameters using the (Generalized) Method of L-Moments (L-(G)MM).
The goal is to find the “true” parameters \(\theta_0\) of the distribution. In practise, this is done using a reasonably close estimate, \(\theta\).
In the (non-Generalized) Method of L-moments (L-MM), this is done by solving the system of equations \(l^{(s, t)}_r = \lambda^{(s, t)}_r\), for \(r = 1, \dots, k\), with \(n = |\theta|\) the number of parameters. Because the amount of parameters matches the amount of L-moment conditions, the solution is point-defined, and can be found using simple least squares.
L-GMM extends L-MM by allowing more L-moment conditions than there are free parameters, \(m > n\). This requires solving an over-identified system of \(m\) equations:
where \(W_m\) is a \(m \times m\) weight matrix.
The weight matrix is initially chosen as the matrix inverse of the non-parametric L-moment covariance matrix, see lmo.l_moment_cov
. These weights are then plugged into the the equation above, and fed into scipy.optimize.minimize
, to obtain the initial parameter estimates.
In the next step(s), Monte-Carlo sampling is used to draw samples from the distribution (using the current parameter estimates), with sample sizes matching that of the data. The L-moments of these samples are consequently used to to calculate the new weight matrix.
Todo
- Raise on minimization error, warn on failed k-step convergence
- Optional
integrality
kwarg with boolean mask for integral params. - Implement CUE: Continuously Updating GMM (i.e. implement and use
_loss_cue()
, then run withk=1
).
PARAMETER | DESCRIPTION |
---|---|
ppf | The (vectorized) quantile function of the probability distribution, with signature TYPE: |
args0 | Initial estimate of the distribution’s parameter values. |
n_obs | Amount of observations. TYPE: |
l_moments | Estimated sample L-moments. Must be a 1-d array-like s.t. |
r | The orders of TYPE: |
trim | The L-moment trim-length(s) to use. Currently, only integral trimming is supported. TYPE: |
PARAMETER | DESCRIPTION |
---|---|
k | If set to a positive integer, exactly \(k\) steps will be run. Will be ignored if TYPE: |
k_max | Maximum amount of steps to run while not reaching convergence. Will be ignored if \(k\) is specified or if TYPE: |
l_tol | Error tolerance in the parametric L-moments (unit-standardized). Will be ignored if \(k\) is specified or if TYPE: |
l_moment_fn | Function for parametric L-moment calculation, with signature: |
n_mc_samples | The number of Monte-Carlo (MC) samples drawn from the distribution to to form the weight matrix in step \(k > 1\). Will be ignored if TYPE: |
random_state | A seed value or TYPE: |
**kwds | Additional keyword arguments to be passed to TYPE: |
RAISES | DESCRIPTION |
---|---|
ValueError | Invalid arguments. |
RETURNS | DESCRIPTION |
---|---|
result | An instance of [ TYPE: |
linalg
¶
Linear algebra and linearized orthogonal polynomials.
lmo.linalg.sandwich(A, X, /, dtype=np.float64)
¶
Calculates the “sandwich” matrix product (A @ X @ A.T
) along the specified X
axis.
PARAMETER | DESCRIPTION |
---|---|
A | 2-D array of shape |
dtype | The data type of the result. |
X | Array of shape |
RETURNS | DESCRIPTION |
---|---|
C | Array of shape |
See Also
- https://wikipedia.org/wiki/Covariance_matrix
lmo.linalg.pascal(k, /, dtype=np.int64, *, inv=False)
¶
Construct the lower-diagonal Pascal matrix \(L_{k \times k\)}$, or its matrix inverse \(L^{-1}\).
Implemented using recursion, unlike the slow naive implementation from the equivalent scipy.linalg.pascal
and scipy.linalg.invpascal
functions using kind='lower'
. By using the binomial recurrence relation, assuming \(0 < j < i\), \(\binom{i}{j} = \frac{i}{j} \binom{i-1}{j-1}\), the following recursive definition is obtained:
Examples:
>>> import numpy as np
>>> from lmo.linalg import pascal
>>> pascal(4, dtype=np.int_)
array([[1, 0, 0, 0],
[1, 1, 0, 0],
[1, 2, 1, 0],
[1, 3, 3, 1]])
>>> pascal(4, dtype=np.int_, inv=True)
array([[ 1, 0, 0, 0],
[-1, 1, 0, 0],
[ 1, -2, 1, 0],
[-1, 3, -3, 1]])
>>> np.rint(np.linalg.inv(pascal(4))).astype(int)
array([[ 1, 0, 0, 0],
[-1, 1, 0, 0],
[ 1, -2, 1, 0],
[-1, 3, -3, 1]])
Now, let’s compare with scipy:
>>> from scipy.linalg import invpascal
>>> invpascal(4, kind='lower').astype(int)
array([[ 1, 0, 0, 0],
[-1, 1, 0, 0],
[ 1, -2, 1, 0],
[-1, 3, -3, 1]])
lmo.linalg.ir_pascal(k, /, dtype=np.float64)
¶
Inverse regulatized lower-diagonal Pascal matrix, \(\bar{L}_{ij} = L^{-1}_ij / i\).
Used to linearly combine order statistics order statistics into L-moments.
lmo.linalg.sh_legendre(k, /, dtype=np.int64)
¶
Shifted Legendre polynomial coefficient matrix \(\widetilde{P}\) of shape (k, k)
.
The \(j\)-th coefficient of the shifted Legendre polynomial of degree \(k\) is at \((k, j)\):
Useful for transforming probability-weighted moments into L-moments.
Danger
For \(k \ge 29\), all 64-bits dtypes (default is int64) will overflow, which results in either an OverflowError
(if you’re lucky), or will give incorrect results. Similarly, all 32-bits dtypes (e.g. np.int_
on Windows) already overflow when \(k \ge 16\).
This is not explicitly checked – so be sure to select the right dtype
depending on k
.
One option is to use dtype=np.object_
, which will use Python-native int
. However, this is a lot slower, and is likely to fail. For instance, when multiplied together with some float64
array, a TypeError
is raised.
PARAMETER | DESCRIPTION |
---|---|
k | The size of the matrix, and the max degree of the shifted Legendre polynomial. TYPE: |
dtype | Desired output data type, e.g, |
RETURNS | DESCRIPTION |
---|---|
P | 2-D array of the lower-triangular square matrix of size \(k^2\)`. |
Examples:
Calculate \(\widetilde{P}_{4 \times 4}\):
>>> from lmo.linalg import sh_legendre
>>> sh_legendre(4, dtype=int)
array([[ 1, 0, 0, 0],
[ -1, 2, 0, 0],
[ 1, -6, 6, 0],
[ -1, 12, -30, 20]])
See Also
- https://wikipedia.org/wiki/Legendre_polynomials
- https://wikipedia.org/wiki/Pascal_matrix
lmo.linalg.sh_jacobi(k, a, b, /, dtype=None)
¶
Shifted Jacobi polynomial coefficient matrix \(\widetilde{P}^{(a,b)}\) of shape (k, k)
.
The \(j\)-th coefficient of the shifted Jacobi polynomial of degree \(k\) is at \((k, j)\):
The “shift” refers to the change of variables \(x \mapsto 2x - 1\) in the (unshifted) Jacobi (or hypergeometric) polynomials.
The (shifted) Jacobi polynomials \(\widetilde{P}^{(a,b)}\) generalize the (shifted) Legendre polynomials \(\widetilde{P}\): \(\widetilde{P}^{(0, 0)} = \widetilde{P}\)
PARAMETER | DESCRIPTION |
---|---|
k | The size of the matrix, and the max degree of the polynomial. TYPE: |
a | The \(\alpha\) parameter, must be \(\ge 0\). TYPE: |
b | The \(\beta\) parameter, must be \(\ge 0\). TYPE: |
dtype | Desired output data type, e.g, |
RETURNS | DESCRIPTION |
---|---|
P | 2-D array of the lower-triangular square matrix of size \(k^2\)`. |
Examples:
Calculate \(\widetilde{P}^{(1, 1)}_{4 \times 4}\):
>>> from lmo.linalg import sh_jacobi
>>> sh_jacobi(4, 1, 1, dtype=int)
array([[ 1, 0, 0, 0],
[ -2, 4, 0, 0],
[ 3, -15, 15, 0],
[ -4, 36, -84, 56]])
Let’s compare \(\widetilde{P}^(1, \pi)_3\) with the scipy Jacobi poly1d. This requires manual shifting \(x \mapsto f(x)\), with \(f(x) = 2x - 1\):
>>> import numpy as np
>>> import scipy.special as sc
>>> f_x = np.poly1d([2, -1]) # f(x) = 2*x + 1
>>> sc.jacobi(3, 1, np.pi)(f_x)
poly1d([ 125.80159497, -228.55053774, 128.54584648, -21.79690371])
>>> sh_jacobi(4, 1, np.pi)[3]
array([ -21.79690371, 128.54584648, -228.55053774, 125.80159497])
Apart from the reversed coefficients of numpy.poly1d
(an awkward design choice, but it’s fixed in the new numpy.polynomial
module.)
See Also
- https://mathworld.wolfram.com/JacobiPolynomial.html
scipy.special.jacobi
lmo.linalg.succession_matrix(c)
¶
A toeplitz-like transformation matrix construction, that prepends \(i\) zeroes to \(i\)-th row, so that the input shape is mapped from (n, k)
to (n, k + n)
.
So all values \(i > j \vee i + j \ge k\) are zero in the succession matrix.
PARAMETER | DESCRIPTION |
---|---|
c | Dense matrix of shape |
RETURNS | DESCRIPTION |
---|---|
S | Matrix of shape |
Examples:
>>> from lmo.linalg import succession_matrix
>>> c = np.arange(1, 9).reshape(4, 2)
>>> c
array([[1, 2],
[3, 4],
[5, 6],
[7, 8]])
>>> succession_matrix(c)
array([[1, 2, 0, 0, 0],
[0, 3, 4, 0, 0],
[0, 0, 5, 6, 0],
[0, 0, 0, 7, 8]])
lmo.linalg.trim_matrix(r, /, trim, dtype=np.float64)
¶
Linearization of the trimmed L-moment recurrence relations, following the (corrected) derivation by Hosking (2007) from the (shifted) Jacobi Polynomials.
This constructs a \(r \times r + t_1 + t_2\) matrix \(T^{(t_1, t_2)}\) that “trims” conventional L-moments. E.g. the first 3 \((1, 1)\) trimmed L-moments can be obtained from the first \(3+1+1=5\) (untrimmed) L-moments (assuming they exist) with trim_matrix(3, (1, 1)) @ l_moment(x, np.ogrid[:5] + 1)
.
The big “L” in “L-moment”, referring to it being a Linear combination of order statistics, has been prominently put in the name by Hosking (1990) for a good reason. It means that transforming order statistics to a bunch of L-moments, can be done using a single matrix multiplication (see lmo.linalg.sh_legendre
). By exploiting liniarity, it can easily be chained with this trim matrix, to obtain a reusable order-statistics -> trimmed L-moments transformation (matrix).
Note that these linear transformations can be used in exactly the same way to e.g. calculate several population TL-moments of some random varianble, using nothing but its theoretical probablity-weighted moments (PWMs).
PARAMETER | DESCRIPTION |
---|---|
r | The max (trimmed) L-moment order. TYPE: |
trim | Left- and right-trim orders \((t_1, t_2)\), integers \(\ge 0\). If set to (0, 0), the identity matrix is returned. |
dtype | Desired output data type, e.g, |
RETURNS | DESCRIPTION |
---|---|
npt.NDArray[T] | Toeplitz-like matrix of shape \((r, r + t_1 + t_2)\). |
Examples:
>>> from lmo.linalg import trim_matrix
>>> trim_matrix(3, (0, 1))
array([[ 1. , -1. , 0. , 0. ],
[ 0. , 0.75 , -0.75 , 0. ],
[ 0. , 0. , 0.66666667, -0.66666667]])
>>> trim_matrix(3, (1, 0))
array([[1. , 1. , 0. , 0. ],
[0. , 0.75 , 0.75 , 0. ],
[0. , 0. , 0.66666667, 0.66666667]])
ostats
¶
Order statistics \(X_{i:n}\), with \(i \in [0, n)\).
Primarily used as an intermediate step for L-moment estimation.
lmo.ostats.weights(i, n, N, /, *, cached=False)
¶
Compute the linear weights \(w_{i:n|j:N}\) for \(j = 0, \dots, N-1\).
The unbiased sample estimator \(\mu_{i:n}\) is then derived from
where
Here, \(B\) denotes the Beta function, \(B(a,b) = \Gamma(a) \Gamma(b) / \Gamma(a + b)\).
Notes
This function uses “Python-style” 0-based indexing for \(i\), instead of the conventional 1-based indexing that is generally used in the literature.
PARAMETER | DESCRIPTION |
---|---|
i | 0-indexed sample (fractional) index, \(0 \le i \lt n\). Negative indexing is allowed. TYPE: |
n | Subsample size, optionally fractional, \(0 \le n0\) TYPE: |
N | Sample size, i.e. the observation count. TYPE: |
cached | Cache the result for TYPE: |
RETURNS | DESCRIPTION |
---|---|
npt.NDArray[np.float64] | 1d array of size \(N\) with (ordered) sample weights. |
lmo.ostats.from_cdf(F, i, n)
¶
Transform \(F(X)\) to \(F_{i:n}(X)\), of the \(i\)th variate within subsamples of size, i.e. \(0 \le i \le n - 1\).
PARAMETER | DESCRIPTION |
---|---|
F | Scalar or array-like with the returned value of some cdf, i.e. \(F_X(x) = P(X \le x)\). Must be between 0 and 1. |
i | 0-indexed sample (fractional) index, \(0 \le i < n\). TYPE: |
n | Subsample size, optionally fractional, \(0 \le n0\) TYPE: |
pwm_beta
¶
Low-level utility functions for calculating a special case of the probability-weighted moments (PWM’s), \(\beta_k = M_{1,k,0}\).
Primarily used as an intermediate step for L-moment estimation.
lmo.pwm_beta.weights(r, n, /, dtype=np.float64)
¶
Probability Weighted moment (PWM) projection matrix \(B\) of the unbiased estimator for \(\beta_k = M_{1,k,0}\) for \(k = 0, \dots, r - 1\).
The PWM’s are estimated by linear projection of the sample of order statistics, i.e. \(b = B x_{i:n}\)
PARAMETER | DESCRIPTION |
---|---|
r | The amount of orders to evaluate, i.e. \(k = 0, \dots, r - 1\). TYPE: |
n | Sample count. TYPE: |
dtype | Desired output floating data type. |
RETURNS | DESCRIPTION |
---|---|
P_b | Upper-triangular projection matrix of shape |
Examples:
>>> from lmo import pwm_beta
>>> pwm_beta.weights(4, 5)
array([[0.2 , 0.2 , 0.2 , 0.2 , 0.2 ],
[0. , 0.05 , 0.1 , 0.15 , 0.2 ],
[0. , 0. , 0.03333333, 0.1 , 0.2 ],
[0. , 0. , 0. , 0.05 , 0.2 ]])
lmo.pwm_beta.cov(a, r, /, axis=None, dtype=np.float64, **kwargs)
¶
Distribution-free variance-covariance matrix of the probability weighted moment (PWM) point estimates \(\beta_k = M_{1,k,0}\), with orders \(k = 0, \dots, r - 1\).
PARAMETER | DESCRIPTION |
---|---|
a | Array-like with observations. |
r | The amount of orders to evaluate, i.e. \(k = 0, \dots, r - 1\). TYPE: |
axis | The axis along which to calculate the covariance matrices. TYPE: |
dtype | Desired output floating data type. |
**kwargs | Additional keywords to pass to TYPE: |
RETURNS | DESCRIPTION |
---|---|
S_b | Variance-covariance matrix/tensor of shape |
See Also
- https://wikipedia.org/wiki/Covariance_matrix
special
¶
Mathematical “special” functions, extending scipy.special
.
lmo.special.fpow(x, n, out=None)
¶
Factorial power, or falling factorial.
It is defined as
PARAMETER | DESCRIPTION |
---|---|
x | Real-valued array-like or scalar. |
n | Real valued array-like or scalar. |
out | Optional output array for the function results |
RETURNS | DESCRIPTION |
---|---|
out | Array or scalar with the value(s) of the function. |
See Also
scipy.special.poch
– the rising factorial
lmo.special.gamma2(a, x, out=None)
¶
Incomplete (upper) gamma function.
It is defined as
for \( a \ge 0 \) and \( x \ge 0 \).
PARAMETER | DESCRIPTION |
---|---|
a | Non-negative scalar. TYPE: |
x | Non-negative array-like. |
out | Optional output array for the results. |
RETURNS | DESCRIPTION |
---|---|
out | Scalar of array with the values of the incomplete gamma function. |
See Also
scipy.special.gammaincc
for the regularized gamma function \( Q(a,\ x) \).
lmo.special.harmonic(n, /, out=None)
¶
Harmonic number \( H_n = \sum_{k=1}^{n} 1 / k \), extended for real and complex argument via analytic contunuation.
Examples:
>>> harmonic(0)
0.0
>>> harmonic(1)
1.0
>>> harmonic(2)
1.5
>>> harmonic(42)
4.32674
>>> harmonic(np.pi)
1.87274
>>> harmonic(-1 / 12)
-0.146106
>>> harmonic(1 - 1j)
(1.1718...-0.5766...j)
PARAMETER | DESCRIPTION |
---|---|
n | Real- or complex- valued parameter, as array-like or scalar. |
out | Optional real or complex output array for the results. TYPE: |
RETURNS | DESCRIPTION |
---|---|
out | Array or scalar with the value(s) of the function. TYPE: |
See Also
lmo.special.norm_sh_jacobi(n, alpha, beta)
¶
Evaluate the (weighted) \( L^2 \)-norm of a shifted Jacobi polynomial.
Specifically,
with
the normalized Jacobi polynomial on \( [0, 1] \).
lmo.special.fourier_jacobi(x, c, a, b)
¶
Evaluate the Fourier-Jacobi series, using the Clenshaw summation algorithm.
If \( c \) is of length \( n + 1 \), this function returns the value:
Here, \( \jacobi{n}{a}{b}{x} \) is a Jacobi polynomial of degree \( n = |\vec{c}| \), which is orthogonal iff \( (a, b) \in (-1,\ \infty)^2 \) and \( x \in [0,\ 1] \).
Tip
Trailing zeros in the coefficients will be used in the evaluation, so they should be avoided if efficiency is a concern.
PARAMETER | DESCRIPTION |
---|---|
x | Scalar or array-like with input data. |
c | Array-like of coefficients, ordered from low to high. All coefficients to the right are considered zero. For instance, |
a | Jacobi parameter \( a > -1 \). TYPE: |
b | Jacobi parameter \( a > -1 \). TYPE: |
RETURNS | DESCRIPTION |
---|---|
out | Scalar or array of same shape as |
theoretical
¶
Theoretical (population) L-moments of known univariate probability distributions.
lmo.theoretical.l_moment_from_cdf(cdf, r, /, trim=(0, 0), *, support=None, quad_opts=None, alpha=ALPHA, ppf=None)
¶
Evaluate the population L-moment of a continuous probability distribution, using its Cumulative Distribution Function (CDF) \(F_X(x) = P(X \le x)\).
where,
\(\widetilde{P}^{(\alpha, \beta)}_k(x)\) the shifted (\(x \mapsto 2x-1\)) Jacobi polynomial, \(H(x)\) the Heaviside step function, and \(I_x(\alpha, \beta)\) the regularized incomplete gamma function, and \(u = F_X(x)\) the probability integral transform of \(x \sim X\).
Notes
Numerical integration is performed with scipy.integrate.quad
, which cannot verify whether the integral exists and is finite. If it returns an error message, an IntegrationWarning
is issues, and nan
is returned (even if quad
returned a finite result).
Examples:
Evaluate the first 4 L- and TL-moments of the standard normal distribution:
>>> from scipy.special import ndtr # standard normal CDF
>>> l_moment_from_cdf(ndtr, [1, 2, 3, 4])
array([0. , 0.56418958, 0. , 0.06917061])
>>> l_moment_from_cdf(ndtr, [1, 2, 3, 4], trim=1)
array([0. , 0.29701138, 0. , 0.01855727])
Evaluate the first 4 TL-moments of the standard Cauchy distribution:
>>> def cdf_cauchy(x: float) -> float:
... return np.arctan(x) / np.pi + 1 / 2
>>> l_moment_from_cdf(cdf_cauchy, [1, 2, 3, 4], trim=1)
array([0. , 0.69782723, 0. , 0.23922105])
PARAMETER | DESCRIPTION |
---|---|
cdf | Cumulative Distribution Function (CDF), \(F_X(x) = P(X \le x)\). Must be a continuous monotone increasing function with signature TYPE: |
r | L-moment order(s), non-negative integer or array-like of integers. TYPE: |
trim | Left- and right- trim, either as a \((s, t)\) tuple with \(s, t > -1/2\), or \(t\) as alias for \((t, t)\). TYPE: |
PARAMETER | DESCRIPTION |
---|---|
support | The subinterval of the nonzero domain of TYPE: |
quad_opts | Optional dict of options to pass to TYPE: |
alpha | Split the integral into integrals with limits \([a, F^{-1}(\alpha)]\), \([F(\alpha), F^{-1}(1 - \alpha)]\) and \([F^{-1}(1 - \alpha), b]\) to improve numerical stability. So \(\alpha\) can be consideresd the size of the tail. Numerical experiments have found 0.05 to give good results for different distributions. TYPE: |
ppf | The inverse of the cdf, used with TYPE: |
RAISES | DESCRIPTION |
---|---|
TypeError |
|
ValueError |
|
RETURNS | DESCRIPTION |
---|---|
lmbda | The population L-moment(s), a scalar or float array like |
References
See Also
theoretical.l_moment_from_ppf
: population L-moment, using the inverse CDFl_moment
: sample L-moment
lmo.theoretical.l_moment_from_ppf(ppf, r, /, trim=(0, 0), *, support=(0, 1), quad_opts=None, alpha=ALPHA)
¶
Evaluate the population L-moment of a univariate probability distribution, using its Percentile Function (PPF), \(x(F)\), also commonly known as the quantile function, which is the inverse of the Cumulative Distribution Function (CDF).
where
and \(\widetilde{P}^{(\alpha, \beta)}_k(x)\) the shifted (\(x \mapsto 2x-1\)) Jacobi polynomial.
Notes
Numerical integration is performed with scipy.integrate.quad
, which cannot verify whether the integral exists and is finite. If it returns an error message, an IntegrationWarning
is issues, and nan
is returned (even if quad
returned a finite result).
Examples:
Evaluate the L- and TL-location and -scale of the standard normal distribution:
>>> from scipy.special import ndtri # standard normal inverse CDF
>>> l_moment_from_ppf(ndtri, [1, 2])
array([0. , 0.56418958])
>>> l_moment_from_ppf(ndtri, [1, 2], trim=1)
array([0. , 0.29701138])
PARAMETER | DESCRIPTION |
---|---|
ppf | The quantile function \(x(F)\), a monotonically continuous increasing function with signature TYPE: |
r | L-moment order(s), non-negative integer or array-like of integers. E.g. 0 gives 1, 1 the L-location, 2 the L-scale, etc. TYPE: |
trim | Left- and right- trim, either as a \((s, t)\) tuple with \(s, t > -1/2\), or \(t\) as alias for \((t, t)\). TYPE: |
PARAMETER | DESCRIPTION |
---|---|
support | Integration limits. Defaults to (0, 1), as it should. There is no need to change this to anything else, and only exists to make the function signature consistent with the TYPE: |
quad_opts | Optional dict of options to pass to TYPE: |
alpha | Split the integral into integrals with limits \([0, \alpha]\), \([\alpha, 1-\alpha]\) and \([1-\alpha, 0]\) to improve numerical stability. So \(\alpha\) can be consideresd the size of the tail. Numerical experiments have found 0.1 to give good results for different distributions. TYPE: |
RAISES | DESCRIPTION |
---|---|
TypeError | Invalid |
ValueError | Invalid |
RETURNS | DESCRIPTION |
---|---|
lmbda | The population L-moment(s), a scalar or float array like |
References
See Also
theoretical.l_moment_from_cdf
: population L-moment, using the CDF (i.e. the inverse PPF)l_moment
: sample L-moment
lmo.theoretical.l_moment_from_qdf(qdf, r, /, trim=(0, 0), *, support=(0, 1), quad_opts=None, alpha=ALPHA)
¶
Evaluate the population L-moments \( \tlmoment{s, t}{r} \) for \( r > 1 \) from the quantile distribution function (QDF), which is the derivative of the PPF (quantile function).
lmo.theoretical.l_ratio_from_cdf(cdf, r, s, /, trim=(0, 0), *, support=None, quad_opts=None, alpha=ALPHA, ppf=None)
¶
Population L-ratio’s from a CDF.
See Also
lmo.theoretical.l_ratio_from_ppf(ppf, r, s, /, trim=(0, 0), *, support=(0, 1), quad_opts=None, alpha=ALPHA)
¶
Population L-ratio’s from a PPF.
See Also
lmo.theoretical.l_stats_from_cdf(cdf, num=4, /, trim=(0, 0), *, support=None, quad_opts=None, alpha=ALPHA, ppf=None)
¶
Calculates the theoretical- / population- L-moments (for \(r \le 2\)) and L-ratio’s (for \(r > 2\)) of a distribution, from its CDF.
By default, the first num = 4
population L-stats are calculated:
- \(\lambda^{(s,t)}_1\) - L-location
- \(\lambda^{(s,t)}_2\) - L-scale
- \(\tau^{(s,t)}_3\) - L-skewness coefficient
- \(\tau^{(s,t)}_4\) - L-kurtosis coefficient
This function is equivalent to l_ratio_from_cdf(cdf, [1, 2, 3, 4], [0, 0, 2, 2], *, **)
.
Note
This should not be confused with the term L-statistic, which is sometimes used to describe any linear combination of order statistics.
See Also
l_stats_from_ppf
- Population L-stats from the quantile function.l_ratio_from_cdf
- Generalized population L-ratio’s from the CDF.lmo.l_stats
- Unbiased sample estimation of L-stats.
lmo.theoretical.l_stats_from_ppf(ppf, num=4, /, trim=(0, 0), *, support=(0, 1), quad_opts=None, alpha=ALPHA)
¶
Calculates the theoretical- / population- L-moments (for \(r \le 2\)) and L-ratio’s (for \(r > 2\)) of a distribution, from its quantile function.
By default, the first num = 4
population L-stats are calculated:
- \(\lambda^{(s,t)}_1\) - L-location
- \(\lambda^{(s,t)}_2\) - L-scale
- \(\tau^{(s,t)}_3\) - L-skewness coefficient
- \(\tau^{(s,t)}_4\) - L-kurtosis coefficient
This function is equivalent to l_ratio_from_cdf(cdf, [1, 2, 3, 4], [0, 0, 2, 2], *, **)
.
Note
This should not be confused with the term L-statistic, which is sometimes used to describe any linear combination of order statistics.
See Also
l_stats_from_cdf
- Population L-stats from the CDF.l_ratio_from_ppf
- Generalized population L-ratio’s from the quantile function.lmo.l_stats
- Unbiased sample estimation of L-stats.
lmo.theoretical.l_moment_cov_from_cdf(cdf, r_max, /, trim=(0, 0), *, support=None, quad_opts=None)
¶
L-moments that are estimated from \(n\) samples of a distribution with CDF \(F\), converge to the multivariate normal distribution as the sample size \(n \rightarrow \infty\).
Here, \(\vec{l}^{(s, t)} = \left[l^{(s, t)}_r, \dots, l^{(s, t)}_{r_{max}} \right]^T\) is a vector of estimated sample L-moments, and \(\vec{\lambda}^{(s, t)}\) its theoretical (“true”) counterpart.
This function calculates the covariance matrix
where \(u = F_X(x)\) and \(v = F_Y(y)\) (marginal) probability integral transforms, and
the shifted Jacobi polynomial \(p^{(s, t)}_n(u) = P^{(t, s)}_{n-1}(2u - 1)\), \(P^{(t, s)}_m\), and \(w^{(s, t)}(u) = u^s (1 - u)^t\) its weight function.
Notes
This function uses scipy.integrate.nquad
for numerical integration. Unexpected results may be returned if the integral does not exist, or does not converge. The results are rounded to match the order of magnitude of the absolute error of scipy.integrate.nquad
.
This function is not vectorized or parallelized.
For small sample sizes (\(n < 100\)), the covariances of the higher-order L-moments (\(r > 2\)) can be biased. But this bias quickly disappears at roughly \(n > 200\) (depending on the trim- and L-moment orders).
PARAMETER | DESCRIPTION |
---|---|
cdf | Cumulative Distribution Function (CDF), \(F_X(x) = P(X \le x)\). Must be a continuous monotone increasing function with signature TYPE: |
r_max | The amount of L-moment orders to consider. If for example TYPE: |
trim | Left- and right- trim. Must be a tuple of two non-negative ints or floats. TYPE: |
PARAMETER | DESCRIPTION |
---|---|
support | The subinterval of the nonzero domain of TYPE: |
quad_opts | Optional dict of options to pass to TYPE: |
RETURNS | DESCRIPTION |
---|---|
cov | Covariance matrix, with shape |
RAISES | DESCRIPTION |
---|---|
RuntimeError | If the covariance matrix is invalid. |
See Also
l_moment_from_cdf
- Population L-moments from the cumulative distribution functionl_moment_from_ppf
- Population L-moments from the quantile functionlmo.l_moment
- Unbiased L-moment estimation from sampleslmo.l_moment_cov
- Distribution-free exact L-moment exact covariance estimate.
lmo.theoretical.l_stats_cov_from_cdf(cdf, num=4, /, trim=(0, 0), *, support=None, quad_opts=None, alpha=ALPHA, ppf=None)
¶
Similar to l_moment_from_cdf
, but for the lmo.l_stats
.
As the sample size \(n \rightarrow \infty\), the L-moment ratio’s are also distributed (multivariate) normally. The L-stats are defined to be L-moments for \(r\le 2\), and L-ratio coefficients otherwise.
The corresponding covariance matrix has been found to be
where \(\bf{\Lambda}^{(s, t)}\) is the covariance matrix of the L-moments from l_moment_cov_from_cdf
, and \(\tau^{(s,t)}_r = \lambda^{(s,t)}_r / \lambda^{(s,t)}_2\) the population L-ratio.
PARAMETER | DESCRIPTION |
---|---|
cdf | Cumulative Distribution Function (CDF), \(F_X(x) = P(X \le x)\). Must be a continuous monotone increasing function with signature TYPE: |
num | The amount of L-statistics to return. Defaults to 4. TYPE: |
trim | Left- and right- trim. Must be a tuple of two non-negative ints or floats. TYPE: |
PARAMETER | DESCRIPTION |
---|---|
support | The subinterval of the nonzero domain of TYPE: |
quad_opts | Optional dict of options to pass to TYPE: |
alpha | Two-sided quantile to split the integral at. TYPE: |
ppf | Quantile function, for calculating the split integral limits. TYPE: |
lmo.theoretical.l_moment_influence_from_cdf(cdf, r, /, trim=(0, 0), *, support=None, l_moment=None, quad_opts=None, alpha=ALPHA, tol=1e-08)
¶
Influence Function (IF) of a theoretical L-moment.
with \(F\) the CDF, \(\tilde{P}^{(s,t)}_{r-1}\) the shifted Jacobi polynomial, and
where \(B\) is the (complete) Beta function.
The proof is trivial, because population L-moments are linear functionals.
Notes
The order parameter r
is not vectorized.
PARAMETER | DESCRIPTION |
---|---|
cdf | Vectorized cumulative distribution function (CDF). TYPE: |
r | The L-moment order. Must be a non-negative integer. TYPE: |
trim | Left- and right- trim lengths. Defaults to (0, 0). TYPE: |
PARAMETER | DESCRIPTION |
---|---|
support | The subinterval of the nonzero domain of TYPE: |
quad_opts | Optional dict of options to pass to TYPE: |
l_moment | The relevant L-moment to use. If not provided, it is calculated from the CDF. |
alpha | Two-sided quantile to split the integral at. TYPE: |
tol | Zero-roundoff absolute threshold. TYPE: |
RETURNS | DESCRIPTION |
---|---|
influence_function | The influence function, with vectorized signature TYPE: |
See Also
lmo.theoretical.l_ratio_influence_from_cdf(cdf, r, k=2, /, trim=(0, 0), *, support=None, l_moments=None, quad_opts=None, alpha=ALPHA, tol=1e-08)
¶
Construct the influence function of a theoretical L-moment ratio.
where the generalized L-moment ratio is defined as
Because IF’s are a special case of the general Gâteuax derivative, the L-ratio IF is derived by applying the chain rule to the L-moment IF.
PARAMETER | DESCRIPTION |
---|---|
cdf | Vectorized cumulative distribution function (CDF). TYPE: |
r | L-moment ratio order, i.e. the order of the numerator L-moment. TYPE: |
k | Denominator L-moment order, defaults to 2. TYPE: |
trim | Left- and right- trim lengths. Defaults to (0, 0). TYPE: |
PARAMETER | DESCRIPTION |
---|---|
support | The subinterval of the nonzero domain of TYPE: |
l_moments | The L-moments corresponding to \(r\) and \(k\). If not provided, they are calculated from the CDF. TYPE: |
quad_opts | Optional dict of options to pass to TYPE: |
alpha | Two-sided quantile to split the integral at. TYPE: |
tol | Zero-roundoff absolute threshold. TYPE: |
RETURNS | DESCRIPTION |
---|---|
influence_function | The influence function, with vectorized signature TYPE: |
See Also
lmo.theoretical.l_comoment_from_pdf(pdf, cdfs, r, /, trim=(0, 0), *, supports=None, quad_opts=None)
¶
Evaluate the theoretical L-comoment matrix of a multivariate probability distribution, using the joint PDF \(f(\vec x) \equiv f(x_1, x_2, \ldots, x_n)\) of random vector \(\vec{X}\), and the marginal CDFs \(F_k\) of its \(k\)-th random variable.
The L-comoment matrix is defined as
with elements
where \(U_j = F_j(X_j)\) and \(u_j = F_j(x_j)\) denote the (marginal) probability integral transform of \(X_j\) and \(x_j \sim X_j\). Furthermore, \(\widetilde{P}^{(\alpha, \beta)}_k\) is a shifted Jacobi polynomial, and
a positive constant.
For \(r \ge 2\), it can also be expressed as
and without trim (\(s = t = 0\)), this simplifies to
with \(\tilde{P}_n = \tilde{P}^{(0, 0)}_n\) the shifted Legendre polynomial. This last form is precisely the definition introduced by Serfling & Xiao (2007).
Note that the L-comoments along the diagonal, are equivalent to the (univariate) L-moments, i.e.
Notes
At the time of writing, trimmed L-comoments have not been explicitly defined in the literature. Instead, the author (@jorenham) derived it by generizing the (untrimmed) L-comoment definition by Serfling & Xiao (2007), analogous to the generalization of L-moments into TL-moments by Elamir & Seheult (2003).
Examples:
Find the L-coscale and TL-coscale matrices of the multivariate Student’s t distribution with 4 degrees of freedom:
>>> from scipy.stats import multivariate_t
>>> df = 4
>>> loc = np.array([0.5, -0.2])
>>> cov = np.array([[2.0, 0.3], [0.3, 0.5]])
>>> X = multivariate_t(loc=loc, shape=cov, df=df)
>>> from scipy.special import stdtr
>>> std = np.sqrt(np.diag(cov))
>>> cdf0 = lambda x: stdtr(df, (x - loc[0]) / std[0])
>>> cdf1 = lambda x: stdtr(df, (x - loc[1]) / std[1])
>>> l_cov = l_comoment_from_pdf(X.pdf, (cdf0, cdf1), 2)
>>> l_cov.round(4)
array([[1.0413, 0.3124],
[0.1562, 0.5207]])
>>> tl_cov = l_comoment_from_pdf(X.pdf, (cdf0, cdf1), 2, trim=1)
>>> tl_cov.round(4)
array([[0.4893, 0.1468],
[0.0734, 0.2447]])
The (Pearson) correlation coefficient can be recovered in several ways:
>>> cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1]) # "true" correlation
0.3
>>> l_cov[0, 1] / l_cov[0, 0]
0.3
>>> l_cov[1, 0] / l_cov[1, 1]
0.3
>>> tl_cov[0, 1] / tl_cov[0, 0]
0.3
>>> tl_cov[1, 0] / tl_cov[1, 1]
0.3
PARAMETER | DESCRIPTION |
---|---|
pdf | Joint Probability Distribution Function (PDF), that accepts a float vector of size \(n\), and returns a scalar in \([0, 1]\). |
cdfs | Sequence with \(n\) marginal CDF’s. |
r | Non-negative integer \(r\) with the L-moment order. TYPE: |
trim | Left- and right- trim, either as a \((s, t)\) tuple with \(s, t > -1/2\), or \(t\) as alias for \((t, t)\). TYPE: |
PARAMETER | DESCRIPTION |
---|---|
supports | A sequence with \(n\) 2-tuples, corresponding to the marginal integration limits. Defaults to \([(-\infty, \infty), \dots]\). |
quad_opts | Optional dict of options to pass to TYPE: |
RETURNS | DESCRIPTION |
---|---|
lmbda | The population L-comoment matrix with shape \(n \times n\). |
lmo.theoretical.l_coratio_from_pdf(pdf, cdfs, r, r0=2, /, trim=(0, 0), *, supports=None, quad_opts=None)
¶
Evaluate the theoretical L-comoment ratio matrix of a multivariate probability distribution, using the joint PDF \(f_{\vec{X}}(\vec{x})\) and \(n\) marginal CDFs \(F_X(x)\) of random vector \(\vec{X}\).
See Also
lmo.theoretical.ppf_from_l_moments(lmbda, /, trim=(0, 0), *, support=(-np.inf, np.inf), validate=True, extrapolate=False)
¶
Return a PPF (quantile function, or inverse CDF), with the specified. L-moments \( \tlmoment{s, t}{1}, \tlmoment{s, t}{2}, \ldots, \tlmoment{s, t}{R} \). Other L-moments are considered zero.
For \( R \) L-moments, this function returns
where \( \shjacobi{n}{a}{b}{x} \) is an \( n \)-th degree shifted Jacobi polynomial, which is orthogonal for \( (a, b) \in (-1, \infty)^2 \) on \( u \in [0, 1] \).
This nonparametric quantile function estimation method was first described by J.R.M. Hosking in 2007. However, his derivation contains a small, but obvious error, resulting in zero-division for \( r = 1 \). So Lmo derived this correct version himself, by using the fact that L-moments are the disguised coefficients of the PPF’s generalized Fourier-Jacobi series expansion.
With Parseval’s theorem it can be shown that, if the probability-weighted moment \( M_{2,s,t} \) (which is the variance if \( s = t = 0 \)) is finite, then \( \hat{Q}_R(u) = Q(u) \) as \( R \to \infty \).
PARAMETER | DESCRIPTION |
---|---|
lmbda | 1-d array-like of L-moments \( \tlmoment{s,t}{r} \) for \( r = 1, 2, \ldots, R \). At least 2 L-moments are required. All remaining L-moments with \( r > R \) are considered zero. |
trim | The trim-length(s) of L-moments TYPE: |
support | A tuple like |
validate | If TYPE: |
extrapolate | If set to TYPE: |
RETURNS | DESCRIPTION |
---|---|
ppf | A vectorized PPF (quantile function). Its extra optional keyword argument TYPE: |
lmo.theoretical.qdf_from_l_moments(lmbda, /, trim=(0, 0), *, validate=True, extrapolate=False)
¶
Return the QDF (quantile density function, the derivative of the PPF), with the specified L-moments \( \tlmoment{s, t}{1}, \tlmoment{s, t}{2}, \ldots, \tlmoment{s, t}{R} \). Other L-moments are considered zero.
This function returns
where \( \shjacobi{n}{a}{b}{x} \) is an \( n \)-th degree shifted Jacobi polynomial, which is orthogonal for \( (a, b) \in (-1, \infty)^2 \) on \( u \in [0, 1] \).
See ppf_from_l_moments
for options.
lmo.theoretical.cdf_from_ppf(ppf)
¶
Numerical inversion of the PPF.
lmo.theoretical.entropy_from_qdf(qdf, /, *args, **kwds)
¶
Evaluate the (differential / continuous) entropy \( H(X) \) of a univariate random variable \( X \), from its quantile density function (QDF), \( q(u) = \frac{\mathrm{d} F^{-1}(u)}{\mathrm{d} u} \), with \( F^{-1} \) the inverse of the CDF, i.e. the PPF / quantile function.
The derivation follows from the identity \( f(x) = 1 / q(F(x)) \) of PDF \( f \), specifically:
PARAMETER | DESCRIPTION |
---|---|
qdf | The quantile distribution function (QDF). TYPE: |
*args | Optional additional positional arguments to pass to TYPE: |
**kwds | Optional keyword arguments to pass to TYPE: |
RETURNS | DESCRIPTION |
---|---|
float | The differential entropy \( H(X) \). |