_three
- pycafee.sample.outliers.Grubbs._three(self, x_exp, which)
This function calculates the Grubbs test statistic (\(G^{'''}\)) used to check whether the sample has two outliers on the same side (the two highest or the two lowest observations), as described by Grubbs [1].
- Parameters
- x_exp
numpy array One-dimensional numpy array with the data sorted in ascending order.
- which
str The side that should be evaluated.
If which == "max" (or None), the two highest values are checked as possible outliers.
If which == "min", the two lowest values are checked as possible outliers.
- Returns
- statistic
float The test statistic
Notes
If which == "min", the equation used is:
\[G^{'''} = \frac{(n-3)\times s^2_{2 \; lower}}{(n-1)\times s^2}\]
where \(s^2_{2 \; lower}\) is the sample variance disregarding the two observations suspected of being outliers (the two lowest values), \(s^2\) is the sample variance of the full sample, and \(n\) is the number of observations.
If which == "max", the equation used is:
\[G^{'''} = \frac{(n-3)\times s^2_{2 \; upper}}{(n-1)\times s^2}\]
where \(s^2_{2 \; upper}\) is the sample variance disregarding the two observations suspected of being outliers (the two highest values), and \(s^2\) and \(n\) are as defined above.
The data must be ordered.
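The two equations above can be sketched directly with NumPy. This is only an illustration of the formula, not pycafee's actual implementation: the function name `grubbs_two_same_side` is hypothetical, and the sketch assumes that both variances are sample variances (`ddof=1`) and that the input is sorted in ascending order.

```python
import numpy as np


def grubbs_two_same_side(x_sorted, which="max"):
    """Hypothetical sketch of the G''' statistic (not the pycafee code).

    Assumes x_sorted is a one-dimensional array sorted in ascending order.
    """
    x = np.asarray(x_sorted, dtype=float)
    n = x.size
    if n < 4:
        # With fewer than 4 points, the trimmed sample has no sample variance.
        raise ValueError("at least 4 observations are required")

    if which is None or which == "max":
        trimmed = x[:-2]  # drop the two highest values
    elif which == "min":
        trimmed = x[2:]   # drop the two lowest values
    else:
        raise ValueError("which must be 'max', 'min' or None")

    s2_full = np.var(x, ddof=1)           # sample variance of all n points
    s2_trimmed = np.var(trimmed, ddof=1)  # sample variance of the n - 2 kept points
    return ((n - 3) * s2_trimmed) / ((n - 1) * s2_full)


data = np.sort(np.array([159, 153, 184, 153, 156, 150, 147]))
print(round(grubbs_two_same_side(data, which="max"), 4))  # 0.0512
```

Because \((n-3)\times s^2_{2}\) and \((n-1)\times s^2\) are the sums of squared deviations of the trimmed and the full sample, the statistic is a ratio between 0 and 1; values close to 0 indicate that the two removed observations account for most of the total variation.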
References
- [1] GRUBBS, F. E. Sample Criteria for Testing Outlying Observations. The Annals of Mathematical Statistics, v. 21, n. 1, p. 27–58, 1950.
Examples
>>> from pycafee.sample.outliers import Grubbs
>>> import numpy as np
>>> x_exp = np.array([159, 153, 184, 153, 156, 150, 147])
>>> x_exp.sort(kind='quicksort')
>>> test = Grubbs()
>>> result = test._three(x_exp, which="max")
>>> print(result)
0.05121951219512194

>>> from pycafee.sample.outliers import Grubbs
>>> import numpy as np
>>> x_exp = np.array([15.42, 15.51, 15.52, 15.53, 15.68, 15.52, 15.56, 15.53, 15.54, 15.56])
>>> x_exp.sort(kind='quicksort')
>>> test = Grubbs()
>>> result = test._three(x_exp, which="min")
>>> print(result)
0.5353728489483768