NormalityCheck


class pycafee.normalitycheck.normalitycheck.NormalityCheck(alfa=None, language=None, n_digits=None, **kwargs)

Creates an object that applies a Normality test to a dataset.

Attributes
critical : float

The critical value for the test at ɑ significance level

msg : str

A message describing the test

normality_test : str

The chosen test

p_value : float

The p-value for the hypothesis test.

statistic : float

The estimated statistic of the normality test

x_exp : numpy array

The data provided

Methods

fit(x_exp[, test, alfa, n_digits, ...])

This function aggregates all available Normality tests.

get_alfa()

Returns the current alpha value

get_language()

Returns the current language

get_n_digits()

Returns the n_digits parameter

set_alfa(alfa)

Changes the alpha value

set_language(language)

Changes the current language

set_n_digits(n_digits)

Sets the n_digits parameter

fit(x_exp, test=None, alfa=None, n_digits=None, comparison=None, details=None)

This function aggregates all available Normality tests.

Parameters
x_exp : numpy array

One-dimensional numpy array with at least 3 samples, except for the Lilliefors and Abdi-Molin tests, which require at least 4 samples.

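test : str, optional

The test that will be applied. The examples below use "sw" (Shapiro Wilk), "li" (Lilliefors), "ad" (Anderson Darling), "ks" (Kolmogorov Smirnov) and "am" (Abdi Molin). With the default (None), the first example below returns a Shapiro Wilk result.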

alfa : float, optional

The level of significance (ɑ). Default is None, which results in 0.05 (ɑ = 5%).

n_digits : int, optional

The maximum number of decimal places to be shown.

comparison : str, optional

Determines which comparison is used to reach the conclusion of the Normality test.

  • If comparison = "critical" (or None, i.e., the default), the calculated test statistic is compared with the critical value at the ɑ significance level.

  • If comparison = "p-value", the p-value is compared with the adopted significance level (ɑ).

This parameter does not influence the result if test = "abdi-molin".

Both results should lead to the same conclusion.
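
The two comparison modes can be sketched in plain Python. This is a minimal illustration, not pycafee's implementation: the helper names reject_by_critical and reject_by_p_value are hypothetical, and the decision directions are taken from the "full" conclusions printed in the examples on this page (no rejection when critical >= statistic for Lilliefors, and no rejection when p-value >= ɑ). The upper_tail flag is an assumption added to cover tests such as Shapiro Wilk, where small values of the statistic indicate non-normality.

```python
def reject_by_critical(statistic, critical, upper_tail=True):
    """Return True if H0 (normality) is rejected by the critical-value rule.

    upper_tail=True:  reject when statistic > critical (e.g. Lilliefors).
    upper_tail=False: reject when statistic < critical (e.g. Shapiro Wilk).
    """
    if upper_tail:
        return statistic > critical
    return statistic < critical


def reject_by_p_value(p_value, alfa=0.05):
    """Return True if H0 (normality) is rejected by the p-value rule."""
    return p_value < alfa


# Values copied from the Lilliefors example on this page.
statistic = 0.15459867079959644
critical = 0.258
p_value = 0.7104644322958894
alfa = 0.05

print(reject_by_critical(statistic, critical))  # False
print(reject_by_p_value(p_value, alfa))         # False
```

Both rules agree here: neither rejects the hypothesis of normality.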

details : str, optional

Determines the amount of information presented about the hypothesis test.

  • If details = "short" (or None, i.e., the default), a simplified version of the test result is returned.

  • If details = "full", a detailed version of the hypothesis test result is returned.

  • If details = "binary", the conclusion is 1 (\(H_0\) is rejected) or 0 (\(H_0\) is not rejected).

Returns
result : tuple with
statistic : float

The test statistic.

critical : float or None

Each test has its own set of critical values, but all of them include critical values for ɑ equal to 1%, 5% and 10%. For more details, see the documentation of each test.

p_value : float or None

The p-value for the hypothesis test.

conclusion : str or int

The test conclusion (e.g., Normal / not Normal). It is an int when details = "binary".
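
The shape of the returned result can be sketched with a standard-library named tuple. This is only a stand-in: the field names Statistic, Critical, p_value and Alpha are inferred from the representation printed in the examples on this page, and treating the result as a named tuple is an assumption, not something confirmed by the library source.

```python
from collections import namedtuple

# Hypothetical stand-in that mirrors the repr printed in the examples;
# the real object is created by pycafee.
ShapiroWilkResult = namedtuple(
    "ShapiroWilkResult", ["Statistic", "Critical", "p_value", "Alpha"]
)

# Values copied from the first example on this page.
result = ShapiroWilkResult(
    Statistic=0.9698116779327393,
    Critical=0.842,
    p_value=0.8890941739082336,
    Alpha=0.05,
)

# Fields can be read by name or unpacked positionally.
statistic, critical, p_value, alfa = result
print(result.p_value >= result.Alpha)  # True
```

Under that assumption, critical and p_value should be checked for None before use, since some tests do not provide them.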

Examples

Checking data normality with default values

>>> from pycafee.normalitycheck import NormalityCheck
>>> import numpy as np
>>> x = np.array([5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9])
>>> normality_test = NormalityCheck()
>>> result, conclusion = normality_test.fit(x)
>>> print(result)
ShapiroWilkResult(Statistic=0.9698116779327393, Critical=0.842, p_value=0.8890941739082336, Alpha=0.05)
>>> print(conclusion)
Data is Normal at a 95.0% of confidence level.

Checking data normality using the Shapiro Wilk test at ɑ = 10% in Portuguese ("pt-br")

>>> from pycafee.normalitycheck import NormalityCheck
>>> import numpy as np
>>> x = np.array([5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9])
>>> normality_test = NormalityCheck(language="pt-br")
>>> result, conclusion = normality_test.fit(x, test="sw", alfa=0.1)
>>> print(result)
ShapiroWilkResultado(Estatistica=0.9698116779327393, Critico=0.869, p_valor=0.8890941739082336, Alfa=0.1)
>>> print(conclusion)
Os dados são Normais com 90.0% de confiança.

Checking data normality using the Lilliefors test with details = "full"

>>> from pycafee.normalitycheck import NormalityCheck
>>> import numpy as np
>>> x = np.array([5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9])
>>> normality_test = NormalityCheck()
>>> result, conclusion = normality_test.fit(x, test="li", details="full")
>>> print(result)
LillieforsResult(Statistic=0.15459867079959644, Critical=0.258, p_value=0.7104644322958894, Alpha=0.05)
>>> print(conclusion)
Since the critical value (0.258) >= statistic (0.154), we have NO evidence to reject the hypothesis of data normality, according to the Lilliefors test at a 95.0% of confidence level.

Checking data normality using the Anderson Darling test with details = "full" and comparison = "p-value"

>>> from pycafee.normalitycheck import NormalityCheck
>>> import numpy as np
>>> x = np.array([5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9])
>>> normality_test = NormalityCheck()
>>> result, conclusion = normality_test.fit(x, test="ad", details="full", comparison="p-value")
>>> print(result)
AndersonDarlingResult(Statistic=0.22687861079050364, Critical=None, p_value=0.7479231606974011, Alpha=0.05)
>>> print(conclusion)
Since p-value (0.747) >= alpha (0.05), we have NO evidence to reject the hypothesis of data normality, according to the AndersonDarling test at a 95.0% of confidence level.

Checking data normality using the Kolmogorov Smirnov test with details = "full" and comparison = "p-value"

>>> from pycafee.normalitycheck import NormalityCheck
>>> import numpy as np
>>> x = np.array([5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9])
>>> normality_test = NormalityCheck()
>>> result, conclusion = normality_test.fit(x, test="ks", details="full", comparison="p-value")
>>> print(result)
KolmogorovSmirnovResult(Statistic=0.15459867079959644, Critical=0.41, p_value=0.9706128123504146, Alpha=0.05)
>>> print(conclusion)
Since p-value (0.97) >= alpha (0.05), we have NO evidence to reject the hypothesis of data normality, according to the Kolmogorov Smirnov test at a 95.0% of confidence level.

Checking data normality using the Abdi Molin test with default values

>>> from pycafee.normalitycheck import NormalityCheck
>>> import numpy as np
>>> x = np.array([5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9])
>>> normality_test = NormalityCheck()
>>> result, conclusion = normality_test.fit(x, test="am")
>>> print(result)
AbdiMolinResult(Statistic=0.15459867079959644, Critical=0.2616, p_value=None, Alpha=0.05)
>>> print(conclusion)
Data is Normal at a 95.0% of confidence level.

get_alfa()

Returns the current alpha value

get_language()

Returns the current language

get_n_digits()

Returns the n_digits parameter

set_alfa(alfa)

Changes the alpha value

Parameters
alfa : float

The new significance level

Notes

This method accepts only values of type float between 0.0 and 1.0.
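
The constraint in this note can be sketched as a small check. The function validate_alfa is hypothetical and is not part of pycafee; the exception types and the choice of exclusive bounds are assumptions made for illustration only.

```python
def validate_alfa(alfa):
    """Hypothetical check mirroring the documented constraint on alfa."""
    # int and bool are rejected because only float is documented as valid.
    if not isinstance(alfa, float):
        raise TypeError(f"alfa must be a float, got {type(alfa).__name__}")
    # Assumption: the bounds are exclusive.
    if not 0.0 < alfa < 1.0:
        raise ValueError(f"alfa must be between 0.0 and 1.0, got {alfa}")
    return alfa


print(validate_alfa(0.1))  # 0.1
```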

set_language(language)

Changes the current language

Parameters
language : str

The language code

Notes

The language must be a str with no more than 5 characters.

set_n_digits(n_digits)

Sets the n_digits parameter

Parameters
n_digits : int

The maximum number of decimal places to be shown.
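
How n_digits might affect the displayed values can be sketched as follows. In the "full" conclusions shown in the examples on this page, 0.15459867079959644 is printed as 0.154 and 0.7479231606974011 as 0.747, which is consistent with truncation (rather than rounding) to three decimal places. The helper truncate is hypothetical, and the truncation behaviour is inferred from those outputs, not from the library source.

```python
import math


def truncate(value, n_digits):
    """Drop every decimal place beyond n_digits without rounding."""
    factor = 10 ** n_digits
    return math.trunc(value * factor) / factor


# Values copied from the "full" examples on this page.
print(truncate(0.15459867079959644, 3))  # 0.154
print(truncate(0.7479231606974011, 3))   # 0.747
print(truncate(0.9706128123504146, 3))   # 0.97
```

Because trailing zeros are not printed, n_digits acts as a maximum: 0.970 is shown as 0.97.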