_three

pycafee.sample.outliers.Grubbs._three(self, x_exp, which)

This function calculates the Grubbs test statistic (\(G^{'''}\)) used to check whether the sample has two outliers on the same side (the two highest or the two lowest observations), as described by Grubbs [1].

Parameters
x_exp : numpy array

One-dimensional numpy array with the data sorted in ascending order.

which : str

The side that should be evaluated.

  • If which == "max" (or None), the two highest values are checked as possible outliers.

  • If which == "min", the two lowest values are checked as possible outliers.

Returns
statistic : float

The test statistic.

Notes

If which == "min", the equation used is:

\[G^{'''} = \frac{(n-3)\times s^2_{2 \; lower}}{(n-1)\times s^2}\]

where \(n\) is the sample size, \(s^2\) is the variance of the full sample, and \(s^2_{2 \; lower}\) is the sample variance computed without the two observations suspected of being outliers (the two lowest values).

If which == "max", the equation used is:

\[G^{'''} = \frac{(n-3)\times s^2_{2 \; upper}}{(n-1)\times s^2}\]

where \(s^2_{2 \; upper}\) is the sample variance computed without the two observations suspected of being outliers (the two highest values).

The data must be sorted in ascending order before being passed to this function.
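The two equations in these Notes can be reproduced directly with NumPy. The block below is a minimal sketch, not the pycafee implementation: the function name `grubbs_two_same_side` is hypothetical, and it assumes the input is already sorted in ascending order and that both variances are sample variances (`ddof=1`).

```python
import numpy as np


def grubbs_two_same_side(x_sorted, which="max"):
    """Hypothetical helper: ratio (n-3)*s2_trimmed / ((n-1)*s2).

    Assumes x_sorted is a one-dimensional array sorted in ascending order.
    """
    x = np.asarray(x_sorted, dtype=float)
    n = x.size
    if n < 4:
        # With fewer than 4 points the trimmed sample variance is undefined.
        raise ValueError("at least four observations are required")

    if which is None or which == "max":
        kept = x[:-2]  # drop the two highest values
    elif which == "min":
        kept = x[2:]   # drop the two lowest values
    else:
        raise ValueError("which must be 'max', 'min' or None")

    s2 = np.var(x, ddof=1)          # sample variance of the full sample
    s2_kept = np.var(kept, ddof=1)  # sample variance without the two suspects
    return (n - 3) * s2_kept / ((n - 1) * s2)


data = np.sort(np.array([159, 153, 184, 153, 156, 150, 147]))
stat = grubbs_two_same_side(data, which="max")
print(round(stat, 6))
```

Because \((n-1) s^2\) is the sum of squared deviations of the full sample and \((n-3) s^2_{2}\) is the same sum for the trimmed sample, the statistic is a ratio of two sums of squares; values close to zero indicate that the two removed observations account for most of the spread.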

References

[1] GRUBBS, F. E. Sample Criteria for Testing Outlying Observations. The Annals of Mathematical Statistics, v. 21, n. 1, p. 27–58, 1950.

Examples

>>> from pycafee.sample.outliers import Grubbs
>>> import numpy as np
>>> x_exp = np.array([159, 153, 184, 153, 156, 150, 147])
>>> x_exp.sort(kind='quicksort')
>>> test = Grubbs()
>>> result = test._three(x_exp, which="max")
>>> print(result)
0.05121951219512194

>>> from pycafee.sample.outliers import Grubbs
>>> import numpy as np
>>> x_exp = np.array([15.42, 15.51, 15.52, 15.53, 15.68, 15.52, 15.56, 15.53, 15.54, 15.56])
>>> x_exp.sort(kind='quicksort')
>>> test = Grubbs()
>>> result = test._three(x_exp, which="min")
>>> print(result)
0.5353728489483768