The main problem when performing equivalence tests is the prospective (!) definition of the equivalence limits. First of all: this is a subject-matter question, not a statistical one. Nevertheless, setting the equivalence limits takes up a lot of space in statistical consulting. Wide limits lead to a smaller sample size, and the proof is easier to achieve. On the other hand, the validity of such a proof can be limited, and it may, for example, not be accepted by the regulatory authorities.
One can approach this question with the following considerations:
- Which difference is not relevant?
- "A difference that makes no difference."
- What is the minimally interesting difference (MID)? The equivalence range should be somewhat smaller, e.g. 0.7 times the MID.
- How large are the measurement uncertainty and the biological variability? Here, too, the equivalence range should be smaller.
In the area of bioequivalence studies, the limits are set by the authorities: "Decision in favor of bioequivalence will be accepted when the parametric confidence intervals do not exceed the limits of 80 and 125% for the ratio of AUC values and for the ratio of Cmax values. The decision procedure is based on 90% confidence intervals."
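This decision rule can be sketched in a few lines: compute a 90% confidence interval for the ratio on the log scale and check whether it lies entirely within 80–125%. The data are hypothetical, and a large-sample normal approximation stands in for the t distribution to keep the sketch stdlib-only.

```python
# Sketch: bioequivalence decision via a 90% confidence interval for the
# ratio, computed on the log scale. Hypothetical data; large-sample
# normal approximation instead of the t distribution.
import math
from statistics import NormalDist, mean, stdev

# Hypothetical per-subject ratios, e.g. AUC_test / AUC_reference
ratios = [0.95, 1.02, 0.98, 1.05, 0.97, 1.01, 0.99, 1.03, 0.96, 1.00]
log_ratios = [math.log(r) for r in ratios]

n = len(log_ratios)
m, s = mean(log_ratios), stdev(log_ratios)
z = NormalDist().inv_cdf(0.95)            # two-sided 90% interval
half_width = z * s / math.sqrt(n)
lo, hi = math.exp(m - half_width), math.exp(m + half_width)

# Bioequivalence is concluded only if the whole interval lies in 80-125%
bioequivalent = (lo >= 0.80) and (hi <= 1.25)
print(f"90% CI for the ratio: [{lo:.3f}, {hi:.3f}] -> {bioequivalent}")
```

Note that the conclusion requires the entire interval to stay inside the limits; a point estimate inside 80–125% is not sufficient.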
Method validation experiments are often about proving that a target parameter equals 0. For example, when comparing methods, the aim may be to show that the bias (the systematic error) of the test method relative to a comparison method is negligible, i.e. practically 0. Or, in a robustness or stability study, it must be shown that no relevant changes occur.
The frequently encountered reasoning "the test for difference does not yield a significant difference, so the groups are the same with regard to the examined feature" is incorrect from a statistical point of view. Such a result is an indication, but not a proof, because these are significance tests, with which rejection of the null hypothesis (which states equality) can be demonstrated, but not its acceptance.
If the aim of a project is to prove equivalence, the appropriate tools are the equivalence tests.
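The standard equivalence procedure is the TOST scheme ("two one-sided tests"): equivalence is concluded only if the difference is shown to be significantly above the lower limit AND significantly below the upper limit. The sketch below uses hypothetical paired differences and a hypothetical equivalence range of ±0.5 units, with a large-sample normal approximation in place of the t distribution.

```python
# Sketch of a TOST ("two one-sided tests") equivalence test for a mean
# difference. Data and the equivalence limit delta are hypothetical;
# the limit must be fixed prospectively, before seeing the data.
import math
from statistics import NormalDist, mean, stdev

diffs = [0.10, -0.05, 0.20, 0.00, -0.10, 0.15, 0.05, -0.15, 0.10, 0.00]
delta = 0.5                          # prospectively fixed equivalence limit

n = len(diffs)
m, se = mean(diffs), stdev(diffs) / math.sqrt(n)

z_lower = (m + delta) / se           # H0: true difference <= -delta
z_upper = (delta - m) / se           # H0: true difference >= +delta
p_lower = 1 - NormalDist().cdf(z_lower)
p_upper = 1 - NormalDist().cdf(z_upper)

# Equivalence is shown only if BOTH one-sided tests reject at alpha
alpha = 0.05
equivalent = max(p_lower, p_upper) < alpha
print(f"p_lower={p_lower:.4f}, p_upper={p_upper:.4f} -> {equivalent}")
```

Rejecting both one-sided null hypotheses at level alpha is the same decision as checking that the (1 − 2·alpha) confidence interval lies entirely within ±delta, which connects this test to the confidence-interval rule used in bioequivalence.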
While studies aimed at proving equivalence or non-inferiority are widespread in the pharmaceutical industry and have long been evaluated appropriately (since the 1990s the associated tests have been referred to as equivalence tests), the laboratory diagnostics community has struggled considerably to adopt the methodology. The first publication known to us [Lung KR, Gorko MA, Llewelyn J, Wiggins N. Statistical method for the determination of equivalence of automated test procedures. J Autom Methods Manag Chem 2003; 25: 123-7] found little resonance.
We have published corresponding procedures for carry-over studies, for demonstrating commutability, and for method comparison [Keller T, Brinkmann T (2014). Proposed guidance for carryover studies, based on elementary equivalence testing. Clin Lab 7: 1153-61; Keller T, Weber S (2009). Statistical Test for Equivalence in Analysis of Commutability Experiments. CCLM 47: 376-377; Keller T, Faye S, Katzorke T (2011). Statistical Test for Equivalence in Analysis of Method Comparison Experiments. Application in comparison of AMH assays. CCLM 49: 806].
In the meantime, the procedure is slowly finding its way into the community [Holland MD, Budd JR, et al. (2017). Improved statistical methods for evaluation of stability of in vitro diagnostic reagents. Stat Biopharm Res 9: 272-278], even if the test is not yet called an equivalence test in the context of commutability [Nilsson G, Budd JR, Greenberg N, Delatour V, Rej R, Panteghini M, Ceriotti F, Schimmel H, Weykamp C, Keller T, Camara JE, Burns C, Vesper HW, MacKenzie F, Miller WG (2018). IFCC Working Group Recommendations for Assessing Commutability Part 2: Using the Difference in Bias Between a Reference Material and Clinical Samples. Clin Chem 64: 455-464].
Figure: Carry-over as a non-inferiority problem. Figure from Keller T, Brinkmann T (2014). Proposed guidance for carryover studies, based on elementary equivalence testing. Clin Lab 7: 1153-61.