compare_with_constant

pycafee.sample.studentdistribution.StudentDistribution.compare_with_constant(self, x_exp, value, alfa=None, which=None, comparison=None, details=None)

This function is a wrapper around scipy.stats.ttest_1samp [1] that compares the mean of a sample with a constant using Student’s t-test (one-sided or two-sided).

The test is performed using:

>>> scipy.stats.ttest_1samp(x_exp, value, axis=None)
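
As a rough illustration of the wrapped call on its own, the sketch below runs scipy.stats.ttest_1samp directly on the sample used in the Examples section. It only shows what SciPy returns (the test statistic and the two-sided p-value). The critical values and the textual conclusion are added by compare_with_constant and are not reproduced here.

```python
import numpy as np
from scipy import stats

# Sample and reference constant taken from the two-sided example on this page.
x_exp = np.array([3.335, 3.328, 3.288, 3.198, 3.254])
value = 3.2

# Two-sided one-sample t-test; axis=None makes SciPy flatten the input first.
res = stats.ttest_1samp(x_exp, value, axis=None)
statistic, p_value = float(res.statistic), float(res.pvalue)

print(f"statistic = {statistic:.4f}")
print(f"p-value   = {p_value:.4f}")
```

The two numbers should agree with the statistic and p_value fields shown in the two-sided example of the Examples section.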

Parameters

x_exp : numpy array

One-dimensional numpy array with at least 2 observations.

value : int or float

The value that will be used as a reference. This value is treated as a constant.

alfa : float, optional

The level of significance (ɑ). Default is None, which results in 0.05 (ɑ = 5%).

which : str, optional

The kind of comparison to perform.

If which = "two-side" (or None, i.e., the default), the comparison test is performed with the two-sided Student’s distribution.
If which = "one-side", the comparison test is performed with the one-sided Student’s distribution.

comparison : str, optional

This parameter determines how the comparison test between the means is performed.

If comparison = "critical" (or None, i.e., the default), the calculated value of the test statistic is compared with the critical value (at the ɑ significance level).
If comparison = "p-value", the p-value is compared with the adopted significance level (ɑ).

Both results should lead to the same conclusion.

details : str, optional

The details parameter determines the amount of information presented about the hypothesis test.

If details = "short" (or None, i.e., the default), a simplified version of the test result is returned.
If details = "full", a detailed version of the hypothesis test result is returned.
If details = "binary", the conclusion will be 1 (\(H_0\) is rejected) or 0 (\(H_0\) is not rejected).

Returns

result : tuple with

statistic : float

The test statistic.

critical : list of two floats

The critical values for the adopted significance level, where:

critical[0] is the upper critical value (always positive);
critical[1] is the lower critical value (always negative).

p_value : float

The p-value for the hypothesis test.

which : str

The kind of comparison that was performed.

alpha : float

The adopted level of significance.

conclusion : str or int

The test conclusion (e.g., the mean is equal to / different from the constant).

See also

get_critical_value

Notes

The parameter comparison uses the hypothesis test to compare the means as follows:

\(H_0:\) the mean is equal to the constant

\(H_1:\) the mean is different from the constant (1)

\(H_1:\) the mean is lower than the constant (2)

\(H_1:\) the mean is greater than the constant (3)

The parameter which controls which alternative hypothesis will be used. If which = "two-side" the relation (1) will be used as the alternative hypothesis. In this case, when comparison = "critical", the comparison is performed between the calculated test statistic and the critical values (at alpha significance level) as follows:

if critical.Lower <= statistic <= critical.Upper:
    The mean is equal to the constant
else:
    The mean is different from the constant

The lower critical value is the alfa/2 quantile and the upper critical value is the 1 - alfa/2 quantile of the Student’s t distribution (i.e., a two-sided test).

When comparison = "p-value", the comparison is performed between the calculated p-value and the adopted significance level (ɑ) as follows:

if p-value >= ɑ:
    The mean is equal to the constant
else:
    The mean is different from the constant
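
The two-sided rules above can be sketched with SciPy alone. This is a minimal illustration, not pycafee's actual implementation: the helper name two_sided_decision and its return layout are hypothetical. It assumes the critical values are the alfa/2 and 1 - alfa/2 quantiles of a t distribution with n - 1 degrees of freedom.

```python
import numpy as np
from scipy import stats


def two_sided_decision(x_exp, value, alfa=0.05):
    """Hypothetical helper: apply both two-sided decision rules to one sample."""
    x_exp = np.asarray(x_exp, dtype=float)
    df = x_exp.size - 1  # degrees of freedom of the one-sample t-test

    res = stats.ttest_1samp(x_exp, value)
    statistic, p_value = float(res.statistic), float(res.pvalue)

    # Critical values from the quantiles of the Student's t distribution.
    upper = float(stats.t.ppf(1 - alfa / 2, df))
    lower = float(stats.t.ppf(alfa / 2, df))

    # True means "the mean is equal to the constant" (H0 is not rejected).
    equal_by_critical = bool(lower <= statistic <= upper)
    equal_by_p_value = bool(p_value >= alfa)
    return statistic, (upper, lower), p_value, equal_by_critical, equal_by_p_value


x = [3.335, 3.328, 3.288, 3.198, 3.254]
statistic, (upper, lower), p_value, eq_crit, eq_p = two_sided_decision(x, 3.2)
print(statistic, upper, lower, p_value, eq_crit, eq_p)
```

For this sample both rules reject the null hypothesis, which is consistent with the statement that the two comparison modes should lead to the same conclusion.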

If which = "one-side" the relation (2) or (3) will be used as the alternative hypothesis, which will depend on the difference between the sample mean and the value of the constant. If this difference is lower than zero (negative), the alternative hypothesis (2) will be used. In this case, when comparison = "critical", the comparison is performed between the calculated test statistic and the lower critical value (at alpha significance level) as follows:

if critical.Lower <= statistic:
    The mean is equal to the constant
else:
    The mean is lower than the constant

The lower critical value is the alfa quantile of the Student’s t distribution (one-sided test).

When comparison = "p-value", the comparison is performed between the calculated p-value and the adopted significance level (ɑ) as follows:

if p-value >= ɑ:
    The mean is equal to the constant
else:
    The mean is lower than the constant

If the difference between the sample mean and the value of the constant is higher than zero (positive), the alternative hypothesis (3) will be used. In this case, when comparison = "critical", the comparison is performed between the calculated test statistic and the upper critical value (at alpha significance level) as follows:

if statistic <= critical.Upper:
    The mean is equal to the constant
else:
    The mean is higher than the constant

The upper critical value is the 1 - alfa quantile of the Student’s t distribution (one-sided test).

When comparison = "p-value", the comparison is performed between the calculated p-value and the adopted significance level (ɑ) as follows:

if p-value >= ɑ:
    The mean is equal to the constant
else:
    The mean is higher than the constant
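
The one-sided logic can be sketched in the same way. The helper one_sided_decision below is hypothetical and is not pycafee's code. It assumes the one-tailed p-value is taken in the direction of the observed difference, and it treats a difference of exactly zero like the positive case, which the description above leaves unspecified.

```python
import numpy as np
from scipy import stats


def one_sided_decision(x_exp, value, alfa=0.05):
    """Hypothetical helper: pick the one-sided alternative from the sign of
    (sample mean - constant) and apply the critical-value and p-value rules."""
    x_exp = np.asarray(x_exp, dtype=float)
    df = x_exp.size - 1

    statistic = float(stats.ttest_1samp(x_exp, value).statistic)
    upper = float(stats.t.ppf(1 - alfa, df))  # one-sided upper critical value
    lower = float(stats.t.ppf(alfa, df))      # one-sided lower critical value

    # Assumption: one-tailed p-value in the direction of the observed difference.
    p_value = float(stats.t.sf(abs(statistic), df))

    if x_exp.mean() - value < 0:
        alternative = "lower"    # relation (2)
        equal_by_critical = bool(lower <= statistic)
    else:
        alternative = "higher"   # relation (3); a zero difference also lands here
        equal_by_critical = bool(statistic <= upper)

    return {
        "statistic": statistic,
        "critical": [upper, lower],
        "p_value": p_value,
        "alternative": alternative,
        "equal_by_critical": equal_by_critical,
        "equal_by_p_value": bool(p_value >= alfa),
    }


x = [3380, 3500, 3600, 3450, 3490, 3390]
out = one_sided_decision(x, 3450)
print(out)
```

For this sample the mean is above the constant, so relation (3) applies. The statistic stays below the upper critical value and the null hypothesis is not rejected, in line with the one-sided example in the Examples section.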

References

[1] SCIPY. scipy.stats.ttest_1samp. Available at: https://docs.scipy.org. Accessed on: 10 May 2022.

Examples

Two-sided t-test

>>> from pycafee.sample import StudentDistribution
>>> import numpy as np
>>> x = np.array([3.335, 3.328, 3.288, 3.198, 3.254])
>>> constant = 3.2
>>> comparison_test = StudentDistribution()
>>> result, conclusion = comparison_test.compare_with_constant(x, constant)
>>> print(result)
OneSampleStudentComparison(statistic=3.187090493341284, critical=[2.7764451051977987, -2.7764451051977996], p_value=0.03330866140058606, which='two-side', alpha=0.05)
>>> print(conclusion)
The mean (3.28) is different from the constant (3.2) (with 95.0% confidence).

>>> from pycafee.sample import StudentDistribution
>>> import numpy as np
>>> x = np.array([3.335, 3.328, 3.288, 3.198, 3.254])
>>> constant = 3.2
>>> comparison_test = StudentDistribution()
>>> result, conclusion = comparison_test.compare_with_constant(x, constant, comparison='p-value', details='full')
>>> print(result)
OneSampleStudentComparison(statistic=3.187090493341284, critical=[2.7764451051977987, -2.7764451051977996], p_value=0.03330866140058606, which='two-side', alpha=0.05)
>>> print(conclusion)
Since the p-value (0.033) is lower than the adopted significance level (0.05), we have evidence to reject the null hypothesis of equality of means, and we can say that the mean (3.28) is different from the constant (3.2) (with 95.0% confidence).

One-sided t-test

>>> from pycafee.sample import StudentDistribution
>>> import numpy as np
>>> x = np.array([3380, 3500, 3600, 3450, 3490, 3390])
>>> constant = 3450
>>> comparison_test = StudentDistribution()
>>> result, conclusion = comparison_test.compare_with_constant(x, constant, which="one-side")
>>> print(result)
OneSampleStudentComparison(statistic=0.5520741745513498, critical=[2.015048372669157, -2.0150483726691575], p_value=0.3023326513892771, which='one-side', alpha=0.05)
>>> print(conclusion)
The mean (3468.333) is equal to the constant (3450) (with 95.0% confidence).

>>> from pycafee.sample import StudentDistribution
>>> import numpy as np
>>> x = np.array([3380, 3500, 3600, 3450, 3490, 3390])
>>> constant = 3450
>>> comparison_test = StudentDistribution()
>>> result, conclusion = comparison_test.compare_with_constant(x, constant, which="one-side", alfa=0.01, details='full')
>>> print(result)
OneSampleStudentComparison(statistic=0.5520741745513498, critical=[3.3649299989072743, -3.3649299989072756], p_value=0.3023326513892771, which='one-side', alpha=0.01)
>>> print(conclusion)
Since the test statistic (0.552) is lower than the upper critical value (3.364), we have no evidence to reject the null hypothesis of equality between the means, and we can say that the mean (3468.333) is equal to the constant (3450) (with 99.0% confidence)
