NormalityCheck
- class pycafee.normalitycheck.normalitycheck.NormalityCheck(alfa=None, language=None, n_digits=None, **kwargs)
This class instantiates an object that applies a normality test to a dataset.
- Attributes
  - critical
    float: The critical value for the test at the ɑ significance level.
  - msg
    str: A message describing the test.
  - normality_test
    str: The chosen test.
  - p_value
    float: The p-value for the hypothesis test.
  - statistic
    float: The estimated statistic of the normality test.
  - x_exp
    numpy array: The data provided.
Methods
fit(x_exp[, test, alfa, n_digits, ...]): This function aggregates all available Normality tests.
get_alfa(): Returns the current alpha value
get_language(): Returns the current language
get_n_digits(): Returns the n_digits parameter
set_alfa(alfa): Changes the alpha value
set_language(language): Changes the current language
set_n_digits(n_digits): Sets the n_digits parameter
- fit(x_exp, test=None, alfa=None, n_digits=None, comparison=None, details=None)
This function aggregates all available Normality tests.
- Parameters
  - x_exp
    numpy array: One-dimensional numpy array with at least 3 samples, except for the Lilliefors and Abdi-Molin tests, which require at least 4 samples.
  - test
    str, optional: The test that will be applied:
    - If "shapiro-wilk", "sw" or None, the function applies the Shapiro-Wilk normality test;
    - If "abdi-molin" or "am", the function applies the Abdi-Molin normality test;
    - If "anderson-darling" or "ad", the function applies the Anderson-Darling normality test;
    - If "kolmogorov-smirnov" or "ks", the function applies the Kolmogorov-Smirnov normality test;
    - If "lilliefors" or "li", the function applies the Lilliefors normality test.
  - alfa
    float, optional: The level of significance (ɑ). Default is None, which results in 0.05 (ɑ = 5%).
  - comparison
    str, optional: Determines which comparison is used to decide the outcome of the normality test.
    - If comparison = "critical" (or None, i.e., the default), the critical value (at the ɑ significance level) is compared with the calculated value of the test statistic.
    - If comparison = "p-value", the p-value is compared with the adopted significance level (ɑ). This parameter does not influence the result if test = "abdi-molin".
    Both comparisons should lead to the same conclusion.
  - details
    str, optional: Determines the amount of information presented about the hypothesis test.
    - If details = "short" (or None, i.e., the default), a simplified version of the test result is returned.
    - If details = "full", a detailed version of the hypothesis test result is returned.
    - If details = "binary", the conclusion will be 1 (\(H_0\) is rejected) or 0 (\(H_0\) is accepted).
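The accepted values of test and the minimum sample sizes listed above can be sketched as a small lookup. This is only an illustration of the documented rules: the names TEST_ALIASES, MIN_SAMPLES, resolve_test and check_sample_size are hypothetical and are not part of pycafee, whose internal validation may differ.

```python
# Hypothetical helpers illustrating the documented rules; not pycafee code.
TEST_ALIASES = {
    None: "shapiro-wilk",  # None selects the default test
    "shapiro-wilk": "shapiro-wilk", "sw": "shapiro-wilk",
    "abdi-molin": "abdi-molin", "am": "abdi-molin",
    "anderson-darling": "anderson-darling", "ad": "anderson-darling",
    "kolmogorov-smirnov": "kolmogorov-smirnov", "ks": "kolmogorov-smirnov",
    "lilliefors": "lilliefors", "li": "lilliefors",
}

# Lilliefors and Abdi-Molin need at least 4 samples; the others need 3.
MIN_SAMPLES = {"lilliefors": 4, "abdi-molin": 4}
DEFAULT_MIN_SAMPLES = 3


def resolve_test(test=None):
    """Map a test name or alias to its canonical name."""
    try:
        return TEST_ALIASES[test]
    except (KeyError, TypeError):
        raise ValueError(f"unknown normality test: {test!r}")


def check_sample_size(x_exp, test=None):
    """Return the canonical test name if x_exp has enough samples."""
    name = resolve_test(test)
    minimum = MIN_SAMPLES.get(name, DEFAULT_MIN_SAMPLES)
    if len(x_exp) < minimum:
        raise ValueError(f"{name} needs at least {minimum} samples")
    return name
```

For instance, check_sample_size([1.0, 2.0, 3.0]) returns "shapiro-wilk", while the same data with test="am" raises ValueError.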
- Returns
  - result
    tuple with
    - statistic
      float: The test statistic.
    - critical
      float or None: Each test has a different set of critical values, but all contain critical values for alpha equal to 1%, 5% or 10%. For more details, see the documentation of each test.
    - p_value
      float or None: The p-value for the hypothesis test.
  - conclusion
    str or int: The test conclusion (e.g., Normal / not Normal).
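The relationship between comparison, details = "binary" and the returned values can be sketched as follows. This is a minimal illustration, not pycafee's implementation. The function conclude and its lower_tail flag are hypothetical. The upper-tail rule follows the wording of the examples on this page ("critical value >= statistic" and "p-value >= alpha" both mean no rejection). The lower-tail rule for Shapiro-Wilk is an assumption based on how that statistic is conventionally interpreted.

```python
def conclude(statistic, critical=None, p_value=None, alfa=0.05,
             comparison="critical", lower_tail=False, binary=False):
    """Hypothetical decision rule for a normality test result.

    lower_tail=True rejects when the statistic falls below the critical
    value (assumed here for Shapiro-Wilk); otherwise rejection happens
    when the statistic exceeds the critical value.
    """
    if comparison == "p-value":
        if p_value is None:
            raise ValueError("this test does not provide a p-value")
        reject = p_value < alfa
    elif comparison == "critical":
        if critical is None:
            raise ValueError("this test does not provide a critical value")
        reject = statistic < critical if lower_tail else statistic > critical
    else:
        raise ValueError(f"unknown comparison: {comparison!r}")

    if binary:
        return 1 if reject else 0  # 1: H0 rejected, 0: H0 not rejected
    return "not Normal" if reject else "Normal"
```

With the Lilliefors values shown in the examples (statistic 0.1546, critical 0.258), conclude returns "Normal" under either comparison, which matches the statement that both comparisons should lead to the same conclusion.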
Examples
Checking data normality with default values
>>> from pycafee.normalitycheck import NormalityCheck
>>> import numpy as np
>>> x = np.array([5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9])
>>> normality_test = NormalityCheck()
>>> result, conclusion = normality_test.fit(x)
>>> print(result)
ShapiroWilkResult(Statistic=0.9698116779327393, Critical=0.842, p_value=0.8890941739082336, Alpha=0.05)
>>> print(conclusion)
Data is Normal at a 95.0% of confidence level.
Checking data normality using the Shapiro-Wilk test at ɑ = 10% in Portuguese ("pt-br")
>>> from pycafee.normalitycheck import NormalityCheck
>>> import numpy as np
>>> x = np.array([5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9])
>>> normality_test = NormalityCheck(language="pt-br")
>>> result, conclusion = normality_test.fit(x, test="sw", alfa=0.1)
>>> print(result)
ShapiroWilkResultado(Estatistica=0.9698116779327393, Critico=0.869, p_valor=0.8890941739082336, Alfa=0.1)
>>> print(conclusion)
Os dados são Normais com 90.0% de confiança.
Checking data normality using the Lilliefors test with details = "full"
>>> from pycafee.normalitycheck import NormalityCheck
>>> import numpy as np
>>> x = np.array([5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9])
>>> normality_test = NormalityCheck()
>>> result, conclusion = normality_test.fit(x, test="li", details="full")
>>> print(result)
LillieforsResult(Statistic=0.15459867079959644, Critical=0.258, p_value=0.7104644322958894, Alpha=0.05)
>>> print(conclusion)
Since the critical value (0.258) >= statistic (0.154), we have NO evidence to reject the hypothesis of data normality, according to the Lilliefors test at a 95.0% of confidence level.
Checking data normality using the Anderson-Darling test with details = "full" and comparison = "p-value"
>>> from pycafee.normalitycheck import NormalityCheck
>>> import numpy as np
>>> x = np.array([5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9])
>>> normality_test = NormalityCheck()
>>> result, conclusion = normality_test.fit(x, test="ad", details="full", comparison="p-value")
>>> print(result)
AndersonDarlingResult(Statistic=0.22687861079050364, Critical=None, p_value=0.7479231606974011, Alpha=0.05)
>>> print(conclusion)
Since p-value (0.747) >= alpha (0.05), we have NO evidence to reject the hypothesis of data normality, according to the AndersonDarling test at a 95.0% of confidence level.
Checking data normality using the Kolmogorov-Smirnov test with details = "full" and comparison = "p-value"
>>> from pycafee.normalitycheck import NormalityCheck
>>> import numpy as np
>>> x = np.array([5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9])
>>> normality_test = NormalityCheck()
>>> result, conclusion = normality_test.fit(x, test="ks", details="full", comparison="p-value")
>>> print(result)
KolmogorovSmirnovResult(Statistic=0.15459867079959644, Critical=0.41, p_value=0.9706128123504146, Alpha=0.05)
>>> print(conclusion)
Since p-value (0.97) >= alpha (0.05), we have NO evidence to reject the hypothesis of data normality, according to the Kolmogorov Smirnov test at a 95.0% of confidence level.
Checking data normality using the Abdi-Molin test with default values
>>> from pycafee.normalitycheck import NormalityCheck
>>> import numpy as np
>>> x = np.array([5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9])
>>> normality_test = NormalityCheck()
>>> result, conclusion = normality_test.fit(x, test="am")
>>> print(result)
AbdiMolinResult(Statistic=0.15459867079959644, Critical=0.2616, p_value=None, Alpha=0.05)
>>> print(conclusion)
Data is Normal at a 95.0% of confidence level.
- get_alfa()
  Returns the current alpha value
- get_language()
  Returns the current language
- get_n_digits()
  Returns the n_digits parameter
- set_alfa(alfa)
  Changes the alpha value
  - Parameters
    - alfa
      float: The new significance level
  Notes
  This method only accepts input of type float between 0.0 and 1.0.
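The constraint in the note above could be checked along these lines. This is a sketch under stated assumptions: validate_alfa is a hypothetical name, the exception types are illustrative, and the bounds are treated as exclusive because the note does not say whether 0.0 and 1.0 themselves are allowed.

```python
def validate_alfa(alfa):
    """Hypothetical check mirroring the documented set_alfa constraint."""
    # Only float is accepted, so ints such as 1 and bools are rejected.
    if not isinstance(alfa, float):
        raise TypeError("alfa must be a float")
    # Assumption: the interval is open, i.e. 0.0 < alfa < 1.0.
    if not 0.0 < alfa < 1.0:
        raise ValueError("alfa must be between 0.0 and 1.0")
    return alfa
```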
- set_language(language)
  Changes the current language
  - Parameters
    - language
      str: The language code
  Notes
  The language must be a str with no more than 5 characters.
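A check equivalent to the note above might look like the following. The name validate_language and the exception types are hypothetical. The sketch only enforces the type and length rule stated here and does not verify that the code corresponds to a language the library actually supports.

```python
def validate_language(language):
    """Hypothetical check mirroring the documented set_language constraint."""
    if not isinstance(language, str):
        raise TypeError("language must be a str")
    if len(language) > 5:
        raise ValueError("language must have no more than 5 characters")
    return language
```

Under this rule "pt-br" (5 characters) passes, while a longer string such as "pt-br-x" is rejected.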
- set_n_digits(n_digits)
  Sets the n_digits parameter
  - Parameters
    - n_digits
      int: The maximum number of decimal places to be shown.
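To illustrate what limiting the number of decimal places can look like, the sketch below truncates a value to n_digits places. Truncation rather than rounding is an assumption, inferred from the example output on this page, where 0.15459... is displayed as 0.154 and 0.7479... as 0.747. The helper truncate and its default of 3 digits are hypothetical and not taken from pycafee.

```python
from decimal import Decimal, ROUND_DOWN


def truncate(value, n_digits=3):
    """Hypothetical helper: cut value to at most n_digits decimal places."""
    if isinstance(n_digits, bool) or not isinstance(n_digits, int) or n_digits < 0:
        raise ValueError("n_digits must be a non-negative int")
    # Decimal(1).scaleb(-3) is Decimal('0.001'), the quantization step.
    step = Decimal(1).scaleb(-n_digits)
    # ROUND_DOWN discards the extra digits instead of rounding them.
    return float(Decimal(str(value)).quantize(step, rounding=ROUND_DOWN))
```

Trailing zeros disappear when converting back to float, so 0.9706... with three digits is displayed as 0.97, consistent with the Kolmogorov-Smirnov example.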