NormalityCheck

class pycafee.normalitycheck.normalitycheck.NormalityCheck(alfa=None, language=None, n_digits=None, **kwargs)

This class instantiates an object that applies a normality test to a dataset.

Attributes

critical (float): The critical value for the test at the ɑ significance level
msg (str): A message describing the test
normality_test (str): The chosen test
p_value (float): The p-value for the hypothesis test
statistic (float): The estimated statistic of the normality test
x_exp (numpy array): The data provided

Methods

`fit`(x_exp[, test, alfa, n_digits, ...])	This function aggregates all available Normality tests.
`get_alfa`()	Returns the current `alpha` value
`get_language`()	Returns the current language
`get_n_digits`()	Returns the `n_digits` parameter
`set_alfa`(alfa)	Changes the `alpha` value
`set_language`(language)	Changes the current language
`set_n_digits`(n_digits)	Sets the `n_digits` parameter

fit(x_exp, test=None, alfa=None, n_digits=None, comparison=None, details=None)

This function aggregates all available Normality tests.

Parameters

x_exp (numpy array)

One-dimensional numpy array with at least 3 samples, except for the Lilliefors and Abdi-Molin tests, which require at least 4 samples.
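
The size constraint can be checked before calling `fit`. The following is a minimal sketch only: `min_sample_size` and `has_enough_samples` are hypothetical helpers that are not part of pycafee, and the limits are simply the ones stated above.

```python
# Hypothetical helpers, not part of pycafee: minimum sample sizes as documented.
MIN_SIZE_DEFAULT = 3
MIN_SIZE_BY_TEST = {"lilliefors": 4, "li": 4, "abdi-molin": 4, "am": 4}


def min_sample_size(test=None):
    """Return the smallest sample size documented for the given test name."""
    return MIN_SIZE_BY_TEST.get(test, MIN_SIZE_DEFAULT)


def has_enough_samples(x, test=None):
    """Check whether a one-dimensional sequence is long enough for the test."""
    return len(x) >= min_sample_size(test)
```

For example, `has_enough_samples([5.1, 4.9, 4.7], "li")` evaluates to `False`, because the Lilliefors test needs at least 4 samples.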

test (str, optional)

The test that will be applied:

If "shapiro-wilk", "sw" or None, the function will apply the Shapiro Wilk normality test;
If "abdi-molin" or "am", the function will apply the Abdi Molin normality test;
If "anderson-darling" or "ad", the function will apply the Anderson Darling normality test;
If "kolmogorov-smirnov" or "ks", the function will apply the Kolmogorov Smirnov normality test;
If "lilliefors" or "li", the function will apply the Lilliefors normality test.
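
The alias handling described in this list can be pictured as a lookup table. This is an illustrative sketch, not the library's code: `TEST_ALIASES` and `resolve_test` are hypothetical names, and raising `ValueError` for an unknown name is an assumption.

```python
# Hypothetical lookup table built from the aliases documented above.
TEST_ALIASES = {
    None: "shapiro-wilk",  # default when no test is given
    "shapiro-wilk": "shapiro-wilk",
    "sw": "shapiro-wilk",
    "abdi-molin": "abdi-molin",
    "am": "abdi-molin",
    "anderson-darling": "anderson-darling",
    "ad": "anderson-darling",
    "kolmogorov-smirnov": "kolmogorov-smirnov",
    "ks": "kolmogorov-smirnov",
    "lilliefors": "lilliefors",
    "li": "lilliefors",
}


def resolve_test(test=None):
    """Map a documented alias (or None) to the full test name."""
    try:
        return TEST_ALIASES[test]
    except KeyError:
        # Assumed behavior for this sketch; pycafee may report errors differently.
        raise ValueError(f"unknown normality test: {test!r}") from None
```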

alfa (float, optional)

The level of significance (ɑ). Default is None, which results in 0.05 (ɑ = 5%).

comparison (str, optional)

This parameter determines which comparison is used to reach the conclusion of the normality test.

If comparison = "critical" (or None, i.e., the default), the critical value (at the ɑ significance level) is compared with the calculated value of the test statistic.
If comparison = "p-value", the p-value is compared with the adopted significance level (ɑ). This parameter does not influence the result if test = "abdi-molin".

Both comparisons should lead to the same conclusion.
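
The two comparison modes can be sketched as a single decision function. This is not pycafee's implementation: `is_normal` and `REJECT_WHEN_LARGE` are hypothetical names, the direction of each comparison follows the messages shown in the Examples section, and the boundary handling for Shapiro Wilk (where a small statistic indicates non-normality) is an assumption.

```python
# Hypothetical sketch, not pycafee code. For these tests a large statistic
# signals departure from normality, so the statistic is checked against an
# upper bound. Shapiro-Wilk is the opposite: a small W signals departure.
REJECT_WHEN_LARGE = {
    "abdi-molin",
    "anderson-darling",
    "kolmogorov-smirnov",
    "lilliefors",
}


def is_normal(test, statistic, critical=None, p_value=None, alfa=0.05, comparison=None):
    """Return True when there is no evidence to reject normality."""
    # The documentation states that `comparison` has no effect for Abdi-Molin.
    if test == "abdi-molin" or comparison in (None, "critical"):
        if critical is None:
            raise ValueError("a critical value is required for this comparison")
        if test in REJECT_WHEN_LARGE:
            return critical >= statistic
        return statistic >= critical  # assumed boundary handling for Shapiro-Wilk
    if comparison == "p-value":
        if p_value is None:
            raise ValueError("a p-value is required for this comparison")
        return p_value >= alfa
    raise ValueError(f"unknown comparison: {comparison!r}")
```

With the Lilliefors values from the Examples section (statistic 0.1546, critical value 0.258, p-value 0.7105), both `comparison=None` and `comparison="p-value"` return `True`, which matches the statement that the two comparisons should agree.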

details (str, optional)

The details parameter determines the amount of information presented about the hypothesis test.

If details = "short" (or None, i.e., the default), a simplified version of the test result is returned.
If details = "full", a detailed version of the hypothesis test result is returned.
If details = "binary", the conclusion will be 1 (\(H_0\) is rejected) or 0 (\(H_0\) is not rejected).

Returns

result (tuple) with

statistic (float): The test statistic.
critical (float or None): The critical value for the test. Each test has a different set of critical values, but all contain critical values for alpha equal to 1%, 5% or 10%. For more details, see the specific details for each test in its respective documentation.
p_value (float or None): The p-value for the hypothesis test.

conclusion (str or int)

The test conclusion (e.g., Normal / not Normal).

See also

pycafee.normalitycheck.abdimolin.AbdiMolin.fit
pycafee.normalitycheck.andersondarling.AndersonDarling.fit
pycafee.normalitycheck.lilliefors.Lilliefors.fit
pycafee.normalitycheck.kolmogorovsmirnov.KolmogorovSmirnov.fit
pycafee.normalitycheck.shapirowilk.ShapiroWilk.fit

Examples

Checking data normality with default values

>>> from pycafee.normalitycheck import NormalityCheck
>>> import numpy as np
>>> x = np.array([5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9])
>>> normality_test = NormalityCheck()
>>> result, conclusion = normality_test.fit(x)
>>> print(result)
ShapiroWilkResult(Statistic=0.9698116779327393, Critical=0.842, p_value=0.8890941739082336, Alpha=0.05)
>>> print(conclusion)
Data is Normal at a 95.0% of confidence level.

Checking data normality using the Shapiro Wilk test at ɑ = 10% in Portuguese ("pt-br")

>>> from pycafee.normalitycheck import NormalityCheck
>>> import numpy as np
>>> x = np.array([5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9])
>>> normality_test = NormalityCheck(language="pt-br")
>>> result, conclusion = normality_test.fit(x, test="sw", alfa=0.1)
>>> print(result)
ShapiroWilkResultado(Estatistica=0.9698116779327393, Critico=0.869, p_valor=0.8890941739082336, Alfa=0.1)
>>> print(conclusion)
Os dados são Normais com 90.0% de confiança.

Checking data normality using the Lilliefors test with details = "full"

>>> from pycafee.normalitycheck import NormalityCheck
>>> import numpy as np
>>> x = np.array([5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9])
>>> normality_test = NormalityCheck()
>>> result, conclusion = normality_test.fit(x, test="li", details="full")
>>> print(result)
LillieforsResult(Statistic=0.15459867079959644, Critical=0.258, p_value=0.7104644322958894, Alpha=0.05)
>>> print(conclusion)
Since the critical value (0.258) >= statistic (0.154), we have NO evidence to reject the hypothesis of data normality, according to the Lilliefors test at a 95.0% of confidence level.

Checking data normality using the Anderson Darling test with comparison = "p-value"

>>> from pycafee.normalitycheck import NormalityCheck
>>> import numpy as np
>>> x = np.array([5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9])
>>> normality_test = NormalityCheck()
>>> result, conclusion = normality_test.fit(x, test="ad", details="full", comparison="p-value")
>>> print(result)
AndersonDarlingResult(Statistic=0.22687861079050364, Critical=None, p_value=0.7479231606974011, Alpha=0.05)
>>> print(conclusion)
Since p-value (0.747) >= alpha (0.05), we have NO evidence to reject the hypothesis of data normality, according to the AndersonDarling test at a 95.0% of confidence level.

Checking data normality using the Kolmogorov Smirnov test with comparison = "p-value" at the default ɑ = 5%

>>> from pycafee.normalitycheck import NormalityCheck
>>> import numpy as np
>>> x = np.array([5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9])
>>> normality_test = NormalityCheck()
>>> result, conclusion = normality_test.fit(x, test="ks", details="full", comparison="p-value")
>>> print(result)
KolmogorovSmirnovResult(Statistic=0.15459867079959644, Critical=0.41, p_value=0.9706128123504146, Alpha=0.05)
>>> print(conclusion)
Since p-value (0.97) >= alpha (0.05), we have NO evidence to reject the hypothesis of data normality, according to the Kolmogorov Smirnov test at a 95.0% of confidence level.

Checking data normality using the Abdi Molin test with default values

>>> from pycafee.normalitycheck import NormalityCheck
>>> import numpy as np
>>> x = np.array([5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9])
>>> normality_test = NormalityCheck()
>>> result, conclusion = normality_test.fit(x, test="am")
>>> print(result)
AbdiMolinResult(Statistic=0.15459867079959644, Critical=0.2616, p_value=None, Alpha=0.05)
>>> print(conclusion)
Data is Normal at a 95.0% of confidence level.

get_alfa(): Returns the current alpha value

get_language(): Returns the current language

get_n_digits(): Returns the n_digits parameter

set_alfa(alfa)

Changes the alpha value

Parameters

alfa (float): The new significance level

Notes

This method only accepts values of type float between 0.0 and 1.0.
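
The constraint in this note can be expressed as a small validation routine. This is a sketch under stated assumptions, not the library's code: `validate_alfa` is a hypothetical name, and both the exclusive bounds and the exception types are assumptions, since the note does not specify them.

```python
def validate_alfa(alfa):
    """Hypothetical check mirroring the note above (not pycafee code)."""
    # Only float is accepted, so integers such as 1 are refused.
    if not isinstance(alfa, float):
        raise TypeError(f"alfa must be a float, got {type(alfa).__name__}")
    # Assumption: the bounds are exclusive, as 0.0 and 1.0 are degenerate levels.
    if not 0.0 < alfa < 1.0:
        raise ValueError(f"alfa must be between 0.0 and 1.0, got {alfa}")
    return alfa
```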

set_language(language)

Changes the current language

Parameters

language (str): The language code

Notes

The language must be a str with no more than 5 characters.

set_n_digits(n_digits)

Sets the n_digits parameter

Parameters

n_digits (int): The maximum number of decimal places to be shown.
