fit

pycafee.normalitycheck.lilliefors.Lilliefors.fit(self, x_exp, alfa=None, comparison=None, details=None)

This function is a wrapper around statsmodels.stats.diagnostic.lilliefors() [1] that performs the Lilliefors Normality test with some added conveniences.

The main difference between this method and the original one is that this wrapper only allows the comparison of a sample with the Normal distribution, using dist="norm". Also, the method used to estimate the p-value is fixed to the table method, using pvalmethod="table". Hence:

>>> statsmodels.stats.diagnostic.lilliefors(x_exp, dist="norm", pvalmethod="table")
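For orientation, the statistic behind this call can be sketched with the Python standard library alone. This is not pycafee or statsmodels code: the helper name `lilliefors_statistic` is hypothetical, and the sketch assumes the Normal distribution is fitted with the sample mean and the sample standard deviation (n - 1 denominator). It only computes the test statistic, not the critical value or the p-value.

```python
from statistics import NormalDist, mean, stdev


def lilliefors_statistic(x):
    """Largest gap between the empirical CDF of x and a Normal CDF fitted to x.

    Hypothetical helper for illustration; assumes the mean and the sample
    standard deviation (n - 1 denominator) are estimated from x itself.
    """
    n = len(x)
    fitted = NormalDist(mean(x), stdev(x))
    # Fitted Normal CDF evaluated at the ordered observations
    cdf = [fitted.cdf(v) for v in sorted(x)]
    # Gap just after each step of the empirical CDF
    d_plus = max((i + 1) / n - f for i, f in enumerate(cdf))
    # Gap just before each step of the empirical CDF
    d_minus = max(f - i / n for i, f in enumerate(cdf))
    return max(d_plus, d_minus)


# Ten-value sample taken from the Examples section of this page
x = [1.90642, 2.22488, 2.10288, 1.69742, 1.52229,
     3.15435, 2.61826, 1.98492, 1.42738, 1.99568]
d = lilliefors_statistic(x)
print(round(d, 4))
```

For this sample the sketch gives a value of about 0.1771, in line with the Statistic reported for the same data in the Examples section.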

Parameters

x_exp : numpy array

One-dimensional numpy array with at least 4 observations.

alfa : float, optional

The level of significance (ɑ). Default is None which results in 0.05 (ɑ = 5%).

comparison : str, optional

This parameter determines which comparison is used to reach the conclusion of the Normality test.

If comparison="critical" (or None), the conclusion is obtained by comparing the critical value (at the ɑ significance level) with the test statistic.
If comparison="p-value", the conclusion is obtained by comparing the p-value with the adopted significance level (ɑ).

Both approaches should lead to the same conclusion.

details : str, optional

The details parameter determines the amount of information presented about the hypothesis test.

If details="short" (or None), a simplified version of the test result is returned.
If details="full", a detailed version of the hypothesis test result is returned.
If details="binary", the conclusion will be 1 (\(H_0\) is rejected) or 0 (\(H_0\) is not rejected).

Returns

result : tuple with

statistic (float): The test statistic.
critical (float or None): The tabulated value for ɑ equal to 1%, 5%, 10%, 15% or 20%. Other values will return None.
p_value (float): The p-value for the hypothesis test.

conclusion : str or int

The test conclusion (e.g., Normal / not Normal).

See also

pycafee.normalitycheck.abdimolin.AbdiMolin.fit
pycafee.normalitycheck.andersondarling.AndersonDarling.fit
pycafee.normalitycheck.kolmogorovsmirnov.KolmogorovSmirnov.fit
pycafee.normalitycheck.shapirowilk.ShapiroWilk.fit

Notes

The critical values [2] cover sample sizes from 4 to 20, plus the values for 25 and 30 observations, for ɑ equal to 1%, 5%, 10%, 15% or 20%.

For data with a sample size between 21 and 24 (20 < n_rep < 25), the critical value returned is the value for 25 observations.
For data with a sample size between 26 and 29 (25 < n_rep < 30), the critical value returned is the value for 30 observations.
For data with a sample size higher than 30 (n_rep > 30), the critical value returned is the approximation proposed by the authors.
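The selection rules above can be summarised in a short sketch. The function name `tabulated_size` is hypothetical and is not part of pycafee. It only reports which tabulated sample size would be looked up and does not reproduce the critical values themselves. Raising ValueError for fewer than 4 observations is an assumption based on the stated minimum sample size.

```python
def tabulated_size(n_rep):
    """Return the sample size whose tabulated critical value would be used.

    Returns None when n_rep > 30, meaning the large-sample approximation
    applies instead of a tabulated value. Hypothetical helper for illustration.
    """
    if n_rep < 4:
        # Assumption: the table starts at 4 observations, so smaller samples are rejected
        raise ValueError("at least 4 observations are required")
    if n_rep <= 20:
        return n_rep  # exact table row for 4 <= n_rep <= 20
    if n_rep <= 25:
        return 25     # 21..24 use the row for 25 observations
    if n_rep <= 30:
        return 30     # 26..29 use the row for 30 observations
    return None       # n_rep > 30: approximation instead of a table row


for n in (10, 22, 27, 100):
    print(n, tabulated_size(n))
```

Rounding up to the next tabulated size (25 or 30) is what the rules above describe for sample sizes that fall between table rows.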

The Lilliefors Normality test evaluates the following hypotheses:

\(H_0:\) the data come from a Normal distribution.

\(H_1:\) the data do not come from a Normal distribution.

By default (comparison="critical"), the conclusion is based on the comparison between the critical value (at the ɑ significance level) and the test statistic. In summary:

if critical >= statistic:
    Data is Normal
else:
    Data is not Normal

The other option (comparison="p-value") bases the conclusion on the comparison between the p-value and ɑ:

if p-value >= ɑ:
    Data is Normal
else:
    Data is not Normal

If comparison="critical" and ɑ is not 0.01, 0.05, 0.10, 0.15 or 0.20, the function will raise ValueError.
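The two decision rules and the ValueError condition can be put together in a runnable sketch. The names `is_normal` and `SUPPORTED_ALPHAS` are hypothetical and this is not the pycafee implementation. It takes the statistic, critical value and p-value as already computed numbers, here copied from the Examples section.

```python
SUPPORTED_ALPHAS = (0.01, 0.05, 0.10, 0.15, 0.20)  # levels with tabulated critical values


def is_normal(statistic, critical, p_value, alfa=0.05, comparison="critical"):
    """Return True when the hypothesis of Normality is not rejected.

    Hypothetical helper mirroring the rules described above.
    """
    if comparison == "critical":
        if alfa not in SUPPORTED_ALPHAS:
            raise ValueError("no tabulated critical value for this alfa")
        return critical >= statistic
    if comparison == "p-value":
        return p_value >= alfa
    raise ValueError("comparison must be 'critical' or 'p-value'")


# Values reported in the Examples section (alfa=0.01, default comparison)
print(is_normal(0.17709753067016487, 0.294, 0.4976450090923252, alfa=0.01))

# Values reported for the not Normal example (alfa=0.05, p-value comparison)
rejected = not is_normal(0.3072356484569813, 0.195, 0.0009999999999998899,
                         alfa=0.05, comparison="p-value")
print(int(rejected))  # same 1/0 coding as details="binary"
```

With comparison="p-value" the sketch never consults the critical value, which is why an untabulated ɑ is only a problem for the "critical" comparison.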

References

1: STATSMODELS. statsmodels.stats.diagnostic.lilliefors. Available at: www.statsmodels.org. Accessed on: 10 May 2022.
2: Hubert W. Lilliefors (1967) On the Kolmogorov-Smirnov Test for Normality with Mean and Variance Unknown, Journal of the American Statistical Association, 62:318, 399-402, DOI: 10.1080/01621459.1967.10482916.

Examples

Applying the test with default values

>>> from pycafee.normalitycheck.lilliefors import Lilliefors
>>> import scipy.stats as stats
>>> x = stats.norm.rvs(loc=5, scale=3, size=100, random_state=42)
>>> li_test = Lilliefors()
>>> result, conclusion = li_test.fit(x)
>>> print(result)
LillieforsResult(Statistic=0.05177647360597687, Critical=0.0866, p_value=0.7370142762533124, Alpha=0.05)
>>> print(conclusion)
Data is Normal at a 95.0% of confidence level.

Applying the test using the p-value to make the conclusion

>>> from pycafee.normalitycheck.lilliefors import Lilliefors
>>> import scipy.stats as stats
>>> x = stats.norm.rvs(loc=5, scale=3, size=100, random_state=42)
>>> li_test = Lilliefors()
>>> result, conclusion = li_test.fit(x, comparison='p-value')
>>> print(result)
LillieforsResult(Statistic=0.05177647360597687, Critical=0.0866, p_value=0.7370142762533124, Alpha=0.05)
>>> print(conclusion)
Data is Normal at a 95.0% of confidence level.

Applying the test at a 1% significance level

>>> from pycafee.normalitycheck.lilliefors import Lilliefors
>>> import numpy as np
>>> x = np.array([1.90642, 2.22488, 2.10288, 1.69742, 1.52229, 3.15435, 2.61826, 1.98492, 1.42738, 1.99568])
>>> li_test = Lilliefors()
>>> result, conclusion = li_test.fit(x, alfa=0.01)
>>> print(result)
LillieforsResult(Statistic=0.17709753067016487, Critical=0.294, p_value=0.4976450090923252, Alpha=0.01)
>>> print(conclusion)
Data is Normal at a 99.0% of confidence level.

Applying the test with a detailed conclusion

>>> from pycafee.normalitycheck.lilliefors import Lilliefors
>>> import numpy as np
>>> x = np.array([5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9])
>>> li_test = Lilliefors()
>>> result, conclusion = li_test.fit(x, alfa=0.10, details="full")
>>> print(result)
LillieforsResult(Statistic=0.15459867079959644, Critical=0.239, p_value=0.7104644322958894, Alpha=0.1)
>>> print(conclusion)
Since the critical value (0.239) >= statistic (0.154), we have NO evidence to reject the hypothesis of data normality, according to the Lilliefors test at a 90.0% of confidence level.

Applying the test to data that are not Normal

>>> from pycafee.normalitycheck.lilliefors import Lilliefors
>>> import numpy as np
>>> x = np.array([0.8, 1, 1.1, 1.15, 1.15, 1.2, 1.2, 1.2, 1.2, 1.6, 1.8, 2, 2.2, 3, 5, 8.2, 8.4, 8.6, 9])
>>> li_test = Lilliefors()
>>> result, conclusion = li_test.fit(x, alfa=0.05, comparison="p-value", details="full")
>>> print(result)
LillieforsResult(Statistic=0.3072356484569813, Critical=0.195, p_value=0.0009999999999998899, Alpha=0.05)
>>> print(conclusion)
Since p-value (0.0) < alpha (0.05), we HAVE evidence to reject the hypothesis of data normality, according to the Lilliefors test at a 95.0% of confidence level.
