Asymptotic Relative Efficiencies of the Score and Robust Tests in Genetic Association Studies

Yuan, Ao; Fan, Ruzong; Xu, Jinfeng; Xue, Yuan; Li, Qizhai

The Open Mathematics, Statistics and Probability Journal

The Open Statistics & Probability Journal

(Discontinued)

ISSN: 2666-1489 ― Volume 10, 2020

RESEARCH ARTICLE

Ao Yuan¹,*, Ruzong Fan¹, Jinfeng Xu², Yuan Xue³, Qizhai Li⁴

¹ Department of Biostatistics, Bioinformatics and Biomathematics, Georgetown University, Washington DC, 20057, USA

² Department of Statistics and Actuarial Science, University of Hong Kong, Hong Kong

³ School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, 100190, China

⁴ LSC, NCMIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190, China

Abstract

Introduction:

The score statistic Z(θ) and the maximin efficient robust test statistic Z_MERT are commonly used in genetic association studies, but to our knowledge no formal comparison of the two has been made.

Methods:

In this report, we compare the asymptotic behavior of Z(θ) and Z_MERT by computing their Asymptotic Relative Efficiencies (AREs) relative to each other. Four commonly used ARE measures are considered: the Pitman ARE, the Chernoff ARE, the Hodges-Lehmann ARE and the Bahadur ARE. Some modifications of these methods are made to simplify the computations. We found that the Chernoff, Hodges-Lehmann and Bahadur AREs are suitable for our setting.

Results and Conclusion:

Based on our study, the efficiencies of the two test statistics vary with the criterion used, and with the parameter values under the same criterion, so each test has its advantages and disadvantages depending on the criterion used and the parameters involved, as described in the text. Numerical examples are given to illustrate the use of the two statistics in genetic association studies.

Keywords: Asymptotic relative efficiency, Genetic association study, Maximin efficiency robust test, Score test Z(θ), Test statistic Z_MERT, Pitman ARE, Chernoff ARE.

Article Information

Identifiers and Pagination:

Year: 2018
Volume: 9
First Page: 26
Last Page: 41
Publisher Id: TOSPJ-9-26
DOI: 10.2174/1876527001809010026

Article History:

Received Date: 7/3/2018
Revision Received Date: 23/7/2018
Acceptance Date: 2/10/2018
Electronic publication date: 28/12/2018
Collection year: 2018

© 2018 Yuan et al.

open-access license: This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International Public License (CC-BY 4.0), a copy of which is available at: https://creativecommons.org/licenses/by/4.0/legalcode. This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

* Address correspondence to this author at the Department of Biostatistics, Bioinformatics and Biomathematics, Georgetown University, Washington DC, 20057, USA, Tel: +91 22 33611111, E-mail: ay312@georgetown.edu

1. INTRODUCTION

In genetic association studies, several test statistics are often used, including the score test Z(θ) and the maximin efficient robust test statistic Z_MERT. Although the numerical behavior of the two tests has been reported in various simulation-based genetic association studies, to our knowledge a formal theoretical comparison of the two tests has not appeared in the literature. It is therefore meaningful to compare their asymptotic performances. For a likelihood ratio based statistic testing a simple null against a simple alternative, there is a uniformly most powerful test under some regularity conditions. However, most test statistics are not constructed directly from the likelihood ratio, the hypotheses are composite, and there is generally no such optimal test. The classical way to compare any two test statistics is therefore to evaluate the Asymptotic Relative Efficiency (ARE) between them.

The ARE is a well-studied area, with a vast literature and numerous different definitions. However, the computation of the ARE is often very difficult in the general case, and some of the classical methods require that the test statistics have certain standard forms, such as having the same asymptotic distribution or being i.i.d. summations. In practice, such as in genetic association studies, some test statistics do not have these forms. Sitlani and McKnight [1: C.M. Sitlani and B. McKnight, "Relative efficiency of trend tests with misspecified genetic models in stratified analyses of case-control or cohort data", Hum. Hered., vol. 71, no. 4, pp. 246-255. http://dx.doi.org/10.1159/000328858, PMID: 21811075] studied AREs for the trend test under different models and stratifications. In this communication, we compare the asymptotic behavior of two commonly used test statistics that arise in case-control genetic association studies, the score statistic Z(θ) and the maximin efficient robust test statistic Z_MERT, as given in Zheng, Li and Yuan [2: G. Zheng, "Some statistical properties of efficiency robust tests with applications to genetic association studies", Scand. J. Stat., pp. 762-774. http://dx.doi.org/10.1111/sjos.12060], hereafter ZLY, by evaluating their AREs relative to each other. Four commonly used ARE measures are considered: the Pitman ARE, the Chernoff ARE, the Hodges-Lehmann ARE and the Bahadur ARE. Pitman's ARE does not apply directly. We found that the Chernoff, Hodges-Lehmann and Bahadur AREs are suitable for our setting. Some modifications of these methods are made to simplify the computations.

Existing studies on ARE fall mainly into two categories. One compares the efficiencies of estimators of the same parameter; the other compares test statistics for the same hypothesis, where the test statistics need not estimate the same parameter. The latter comparison is often made under the assumption that the test statistics being compared are asymptotically normal, in which case the AREs can often be computed easily. There are also methods for comparing different test statistics in general, where different test statistics for the same hypothesis may have different asymptotic distributions. For this general case, Pitman, Bahadur, and Hodges and Lehmann proposed different ways to compute the ARE, and the computation is often difficult. When the test statistics have the same asymptotic distribution, however, the ARE can be computed easily. We also give a simple definition of ARE that can be computed in the case of different asymptotic distributions, as long as the asymptotic distributions of the test statistics are known.

In Section 2, we describe the background of the genetic association study problem and briefly review the classical definitions of ARE. In Section 3, we compare the AREs of the test statistics that arise in our genetic association study. We found that the performances, or efficiencies, of the two test statistics vary with the criterion used, and with the parameter values under the same criterion, as described in the text. Section 4 gives brief numerical examples, from our previous study, of the simulation and application of the two tests in genetic association studies, to illustrate their usage.

2. BACKGROUND

Denote the log-likelihood function as l_n(λ₁, λ₂, η), where Y_i is the outcome, (λ₁, λ₂) ∈ Λ ⊂ R² are the parameters of interest, η is a vector of parameters (m ≥ 0) for the covariate X_i = (x_i1, ..., x_im)^T, and n is the sample size. The goal is to test the null hypothesis H₀: (λ₁, λ₂) = (1, 1) against the alternative H₁: (λ₁, λ₂) ∈ Λ \ {(1, 1)}, where Λ has two edges with known slopes θ₀ and θ₁, and the null point (1, 1) is on the boundary of Λ. We assume −∞ < θ₀ < θ₁ < ∞ and that the endpoints θ₀ and θ₁ satisfy some constraints as specified in ZLY. If θ₁ = ∞, which corresponds to a vertical edge, we can switch λ₁ and λ₂ and define new (θ₀, θ₁) so that −∞ < θ₀ < θ₁ < ∞ is satisfied by the new (θ₀, θ₁). For example, we can write λ₁ = 1 + (λ₂ − 1)/θ₁ = 1 + θ₁^*(λ₂ − 1) and λ₁ = 1 + (λ₂ − 1)/θ₀ = 1 + θ₀^*(λ₂ − 1), where the new slopes θ₁^* and θ₀^* are finite.

Assume θ₀ and θ₁ are known from the problem of interest and/or scientific knowledge. Given λ₁ = λ ≥ 1, λ₂ can be written as λ₂ = 1 − θ + θλ for some θ ∈ [θ₀, θ₁]. We treat θ as a nuisance parameter that is not estimable under H₀: λ = 1 but is estimable under H₁. The log-likelihood then becomes l_n(λ, η, θ). The score test statistic for H₀: λ = 1 is given by

(1)

where η̂ is the MLE of η under H₀. It would be difficult to work with l_n(λ, η, θ) directly, because θ in Z(θ) is expressed only implicitly.

So we work with l_n(λ, 1 − θ + θλ, η), where θ is explicitly expressed. It is convenient to view l_n(λ, η, θ) as a tri-variate function with variables x₁ = λ, x₂ = 1 − θ + θλ and x₃ = η. Denote l_{n,u} = ∂l_n/∂x_u for u = 1, 2, 3, l_{n,uv} = ∂²l_n/∂x_u∂x_v for u = 1, 2 and v = 1, 2, 3, and l_{n,33} = ∂²l_n/∂x₃∂x₃^T. Assume and v = 3. Denote L_vu(η) = E_{H₀} l_{1,vu}(1, 1, η).

Suppose we have a family of asymptotically normally distributed tests T₀ = {Z(θ): θ ∈ [θ₀, θ₁]}, where Z(θ) is asymptotically N(0, 1) under H₀: λ = 1 for a given θ, which determines the data-generating model under H₁. When θ⁽⁰⁾ is the true value, Z(θ⁽⁰⁾) is asymptotically most powerful (optimal). In this case, when θ⁽¹⁾ ≠ θ⁽⁰⁾ is used, the Pitman ARE of Z(θ⁽¹⁾) relative to Z(θ⁽⁰⁾) is given by (Gastwirth [3: J. Gastwirth, "On robust procedures", J. Am. Stat. Assoc., pp. 929-948. http://dx.doi.org/10.1080/01621459.1966.10482185], [4: J. Gastwirth, "The use of maximin efficiency robust tests in combining contingency tables and survival analysis", J. Am. Stat. Assoc., pp. 380-384. http://dx.doi.org/10.1080/01621459.1985.10478127])

e(Z(θ⁽¹⁾), Z(θ⁽⁰⁾)) = ρ²_{θ⁽⁰⁾,θ⁽¹⁾},    (2)

where ρ_{θ⁽⁰⁾,θ⁽¹⁾} is the asymptotic null correlation coefficient between Z(θ⁽⁰⁾) and Z(θ⁽¹⁾). Consider the set of all convex linear combinations of the tests in T₀. A simple robust test derived under efficiency robust theory (Gastwirth [3, 4]; Birnbaum and Laska [5: A. Birnbaum, "Efficiency robust 2-sample rank tests", J. Am. Stat. Assoc., pp. 1241-1251. http://dx.doi.org/10.1080/01621459.1967.10500929]) is the maximin efficient robust test (MERT), denoted as Z_MERT. When T₀ = {Z(θ₀), Z(θ₁)}, Z_MERT is given by

Z_MERT = [Z(θ₀) + Z(θ₁)] / [2(1 + ρ_{θ₀,θ₁})]^{1/2}.    (3)

When T₀ has more than two members, Z_MERT generally exists and is unique (Gastwirth [3]), but its computation requires quadratic programming methods (Rosen [6: J. Rosen, "The gradient projection method for non-linear programming. Part I: Linear constraints", SIAM J., pp. 181-217.]). However, when there is an extreme pair (Z(θ_i), Z(θ_j)) in T₀, i.e. ρ_{θ_i,θ_j} is the smallest null correlation among all pairs in T₀, the MERT of the extreme pair is the MERT for T₀ if and only if (Gastwirth [7: J. Gastwirth, "On robust rank test", Nonparametric Techniques in Statistical Inference, Cambridge University Press: London.])

and thus

(4)

That is, the MERT attains the maximin ARE in the presence of model uncertainty. The MERT was first derived for linear rank tests for the two-sample problem (Gastwirth [3]; Birnbaum and Laska [5]) and later extended to a family of asymptotically normally distributed tests (Gastwirth [4]).
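As a minimal numerical sketch of the two-member case, assume the standard Gastwirth form Z_MERT = [Z(θ₀) + Z(θ₁)] / [2(1 + ρ)]^{1/2} for two unit-variance statistics with null correlation ρ, and the rule that the Pitman ARE of one asymptotically normal test relative to the optimal one is their squared null correlation. The function names below are hypothetical and are not taken from ZLY.

```python
import math

def mert_two(z0, z1, rho):
    """MERT of two standardized tests with null correlation rho (assumed form)."""
    return (z0 + z1) / math.sqrt(2.0 * (1.0 + rho))

def are_single_vs_optimal(rho):
    """Squared null correlation between the misspecified and the optimal test."""
    return rho ** 2

def are_mert_vs_optimal(rho):
    """Squared null correlation between the two-member MERT and either member.

    Cov(Z0 + Z1, Z0) = 1 + rho and Var(Z0 + Z1) = 2 (1 + rho), so the
    correlation is sqrt((1 + rho) / 2) and its square is (1 + rho) / 2.
    """
    return (1.0 + rho) / 2.0

# Worst-case efficiency when the wrong member is used versus when MERT is used.
for rho in (0.2, 0.5, 0.8):
    print(rho, are_single_vs_optimal(rho), are_mert_vs_optimal(rho))
```

For 0 ≤ ρ < 1 the second quantity is never smaller than the first, which is the sense in which the combined test protects against choosing the wrong member of the pair.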

The Z (θ) statistic has the following property (ZLY): Let. Then where and.

Let η̂ be the MLE of η under H₀, and (η̃, λ̃) be that of (η, λ) under H₁. For given θ, the χ² likelihood ratio test statistic is twice the difference between the log-likelihood maximized under H₁ and that maximized under H₀. For fixed θ, the number of parameters under H₁ is just one more than that under H₀, so by Wilks' theorem, under H₀, the statistic converges in distribution to χ²₁, the chi-squared distribution with one degree of freedom. The likelihood ratio test is also widely used in genetic association studies, and its properties, including its ARE, are well studied in the literature, so we will not investigate it here.
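The one-degree-of-freedom calibration can be sketched in a few lines. This is only an illustration under the assumption that the two maximized log-likelihoods are already available as numbers; the helper names are hypothetical. It uses the identity P(χ²₁ > x) = erfc((x/2)^{1/2}), so only the standard library is needed.

```python
import math

def chi2_1df_sf(x):
    """Upper tail probability of the chi-squared distribution with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))

def lrt_pvalue(loglik_null, loglik_alt):
    """Likelihood ratio statistic and its asymptotic p-value under H0.

    loglik_null and loglik_alt are maximized log-likelihoods (assumed inputs).
    """
    stat = max(2.0 * (loglik_alt - loglik_null), 0.0)
    return stat, chi2_1df_sf(stat)

# Hypothetical log-likelihood values, for illustration only.
stat, pval = lrt_pvalue(-1234.5, -1231.0)
print(stat, pval)
```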

Let the MLE here 0 presents a vector of 0’s. Let η₀ be the true value (unknown) of η under either H₀ or H₁, we define the score function as;

and the test statistic for H₀ as;

(5)

where “~” means asymptotically equivalent; in the above, is replaced by , and it is approximated by n⁻¹ l_{n,vu}(1, 1, η).

Denote . For a vector v (v₁, v₂, v₃)^T, denote . be the true density of the data y. The null model f (1, 1, η) is and the alternative model is . The following notation is also used under H₁. For fixed, (λ, θ) let;

(6)

Under H₁, the empirical version of η₀ is just . We denote the Fisher information and its inverse in the blocked forms as;

Let

by is replaced by Note that with defined in the above,

Below we give a brief review of the notions of ARE for test statistics in the general case; a more detailed account can be found in Serfling (1980) [8: R. Serfling, Approximation Theorems of Mathematical Statistics, Wiley. http://dx.doi.org/10.1002/9780470316481] and Nikitin (2011) [9: Y. Nikitin, "Asymptotic relative efficiency in testing (version 5)", Stat Prob: The Encyclopedia Sponsored by Statistics and Probability Societies].

The calculation of the existing versions of ARE is generally not easy, as the examples in the literature show (Serfling, 1980 [8]; Nikitin, 1995 [10: Y. Nikitin, Asymptotic Efficiency of Nonparametric Tests, Cambridge University Press. http://dx.doi.org/10.1017/CBO9780511530081]; van der Vaart, 1998 [11: A. van der Vaart, Asymptotic Statistics, Cambridge University Press. http://dx.doi.org/10.1017/CBO9780511802256]). We only point out that the Pitman ARE is based on the central limit theorem for test statistics, that the Bahadur ARE requires the large deviation asymptotics of test statistics under the null hypothesis, and that the Hodges-Lehmann ARE is connected with large deviation asymptotics under the alternative. Each type of ARE has its own advantages and disadvantages, and the different notions of ARE do not always give consistent conclusions.

If the condition of asymptotic normality (or common asymptotic distribution) fails, considerable difficulties arise in calculating the Pitman ARE, as it may not exist at all or may depend on α and β. Usually one considers the limiting Pitman ARE as α → 0. Wieand (1976) [12: H. Wieand, "A condition under which the Pitman and Bahadur", Ann. Stat., pp. 1003-1011. http://dx.doi.org/10.1214/aos/1176343600] established the correspondence between this kind of ARE and the limiting approximate Bahadur efficiency, which is easy to compute.

The Bahadur (1960) [13: R. Bahadur, "Stochastic comparison of tests", Ann. Math. Stat., pp. 276-295. http://dx.doi.org/10.1214/aoms/1177705894] ARE fixes the power of the tests and compares the exponential rates of decrease of their sizes as the number of observations increases, for a fixed alternative. Its computation is always non-trivial and depends heavily on advances in large deviation theory, as in Dembo and Zeitouni (1998) [14: A. Dembo, Large deviation techniques and applications, 2nd ed., Springer: New York. http://dx.doi.org/10.1007/978-1-4612-5320-4] and Deuschel and Stroock (1989) [15: J. Deuschel, Large deviations, Academic Press: Boston].
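To illustrate what an exponential rate of decrease of the size looks like, the following sketch uses the textbook example of the one-sided mean test for N(θ, 1) data, under the common convention that the Bahadur slope is c(θ) = lim −(2/n) log L_n, where L_n is the p-value. Both the convention and the example are assumptions made for illustration; they are not the statistics studied in this paper. For this example the exact slope is known to be θ².

```python
import math

def normal_sf(x):
    """Upper tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def finite_sample_slope(theta, n):
    """-(2/n) log p-value, evaluated at the typical value sqrt(n) * theta
    that the standardized sample mean takes under the alternative theta."""
    typical_statistic = math.sqrt(n) * theta
    return -2.0 / n * math.log(normal_sf(typical_statistic))

theta = 0.5
for n in (100, 500, 2000):
    print(n, finite_sample_slope(theta, n), theta ** 2)
```

The finite-sample values approach θ² slowly from above, which is one reason exact slopes are usually derived analytically rather than read off from simulations.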

It has been proved that, under some regularity conditions, the likelihood ratio statistic is asymptotically optimal in the Bahadur sense (Bahadur, 1967 [16: R. Bahadur, "Rates of convergence of estimates and test statistics", Ann. Math. Stat., pp. 303-324. http://dx.doi.org/10.1214/aoms/1177698949]; Arcones, 2005 [17: M. Arcones, "Bahadur efficiency of the likelihood ratio test", Math. Methods Stat., pp. 163-179.]). The Bahadur ARE is often difficult to compute for an arbitrary alternative, but it is possible to calculate its limit as θ approaches the null hypothesis, to obtain the local Bahadur efficiency.

The Hodges-Lehmann ARE, in contrast to the Bahadur efficiency, fixes the level of the tests and compares the exponential rates of decrease of their type II errors as the number of observations increases, for a fixed alternative. The computation of the Hodges-Lehmann ARE is also difficult, as it requires large deviation asymptotics of test statistics under the alternative.

The drawback of the Hodges-Lehmann efficiency is that most two-sided tests, such as the Kolmogorov and Cramer-von Mises tests, are all asymptotically optimal, and hence one cannot discriminate among them. On the other hand, under some regularity conditions, one-sided tests such as linear rank tests can be compared, and their Hodges-Lehmann efficiency coincides locally with the Bahadur efficiency (Nikitin, 1995 [10]).

The Chernoff ARE is based on minimizing, asymptotically, a linear combination of type I and type II errors; it depends on neither the nominal level nor the power. However, it essentially applies only to test statistics in the form of i.i.d. summations.

The local ARE is much easier to compute than the previous ones, but it only applies to test statistics that are asymptotically normal with rate n^{1/2}. We will see that some test statistics used in genetic association studies do not satisfy this condition.

Besides the four commonly used AREs for hypothesis tests described above, there are some other interesting methods. Hoeffding's (1965) ARE [18: W. Hoeffding, "Asymptotically optimal tests for multinomial distributions (with discussion)", Ann. Math. Stat., pp. 369-408. http://dx.doi.org/10.1214/aoms/1177700150], based on the work of Sanov (1957) [19: I. Sanov, "On the probability of large deviations of random variables", Sel. Transl. Math. Statist. Prob., pp. 213-244.], is theoretically appealing but only applies to multinomial data; the Rubin and Sethuraman ARE (1965) [20: H. Rubin, "Bayes risk efficiency", Sankhya A, pp. 325-346.] is based on Bayes risk; others include the Kallenberg ARE (1983) [21: W. Kallenberg, "Intermediate efficiency, theory and examples", Ann. Stat., pp. 170-182. http://dx.doi.org/10.1214/aos/1176346067] and the Borovkov-Mogulskii ARE (1993) [22: A. Borovkov, "Large deviations and testing of statistical hypotheses", Sib. Adv. Math.].

3. ARE OF TWO TESTS IN GENETIC ASSOCIATION STUDIES

In this section, we investigate the application of the Pitman ARE, Chernoff ARE, Hodges-Lehmann ARE and Bahadur ARE to the statistics commonly used in genetic association analysis. We focus on the statistics used in ZLY, Z(θ) and Z_MERT, and refer to the notation there. There are other commonly used test statistics in genetic association studies, such as the likelihood ratio statistic (chi-squared statistic), but we will not discuss them here, as most of them are well studied in the literature.

Pitman ARE. Consider testing H₀: λ = λ₀. Let S_n be a test statistic based on data of size n, with mean µ_n(λ) and standard deviation σ_n(λ). To use this method, the following conditions are needed.

(P1). For some continuous strictly increasing distribution function F independent of λ, and some, δ > 0 as n → ∞,

(P2). For , is k times differentiable, with µ_n⁽¹⁾ (λ₀) = ... =

(P3). For some d(n) → ∞ and some constant

(P4). For

Pitman appears to have been the first to introduce the notion of ARE for tests, in his unpublished lectures, and the following result was stated in Noether's work.

(Pitman, 1949 [23: E. Pitman, Lecture Notes on Nonparametric Statistical Inference, Columbia University, mimeographed, 1949]; Noether, 1950 [24: G. Noether, "Asymptotic properties of the Wald-Wolfowitz test of randomness", Ann. Math. Stat., pp. 231-246. http://dx.doi.org/10.1214/aoms/1177729841]). (i) Assume (P1)-(P4), that α_n = P_{λ₀}(S_n > then , if and only if

(7)

(ii) Let S_{1,n} and S_{2,n} each satisfy (P1)-(P4) with common F and k, and let n₁ and n₂ be the sample sizes required for S_{1,n} and S_{2,n} to have the same asymptotic power 1 − β; then

Thus, if d(n) = n^q (q > 0), then the Pitman ARE is given by; .

and Pitman ARE is then;

(8)
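For a concrete feel of the efficacy-ratio form, the sketch below works through the classical sign test versus mean test comparison for normal location data. It assumes d(n) = n^q with q = 1/2 and the usual expression (c₁/c₂)^{1/q}, where c_i = µ_i'(λ₀)/σ_i(λ₀) is the efficacy of test i; the helper names are hypothetical and the example is a textbook one, not one of the genetic association statistics.

```python
import math

def pitman_are(c1, c2, q=0.5):
    """Pitman ARE of test 1 relative to test 2 from efficacies c1 and c2,
    assuming d(n) = n**q and the form (c1 / c2) ** (1 / q)."""
    return (c1 / c2) ** (1.0 / q)

def efficacy_mean_test(sigma):
    """Sample mean: derivative of the mean is 1, standard deviation scale is sigma."""
    return 1.0 / sigma

def efficacy_sign_test(sigma):
    """Sign test for N(lambda, sigma^2) data: twice the density at the median."""
    density_at_zero = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return 2.0 * density_at_zero

are_sign_vs_mean = pitman_are(efficacy_sign_test(1.0), efficacy_mean_test(1.0))
print(are_sign_vs_mean, 2.0 / math.pi)
```

The two printed numbers agree: under normality the sign test needs about 1/0.637 times as many observations as the mean test to reach the same local power, independently of α and β.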

Let I(λ₀) be the Fisher information at λ₀. Under some additional conditions, Rao (1963) [25: C. Rao, "Criteria of estimation in large samples", Sankhya Ser. A, pp. 189-206.] proved that

Any test statistic S_n that achieves equality in the above is called Pitman efficient.

Under suitable conditions, the Pitman ARE can be expressed in terms of the correlation coefficient between the two test statistics in their standardized form, as given below.

(P5) are asymptotically jointly normal, uniformly in a neighborhood of λ₀.

Denote by ρ(λ) the asymptotic correlation coefficient between them under , and let and be the distribution and density function of . The following result is true.

(van Eden, 1963 [26: C. van Eden, "The relationship between Pitman's asymptotic relative efficiency of two tests and correlation coefficient between their test statistics", Ann. Math. Stat., pp. 1442-1451. http://dx.doi.org/10.1214/aoms/1177703876]). Assume that S_{1,n} and S_{2,n} satisfy (P1)-(P5) in their standardized form with , and that ρ(λ_n) → ρ(λ₀) := ρ as λ_n → λ₀. Then:

(i) For 0 ≤ γ ≤ 1, tests S_γn of the form satisfy (P1)-(P5), and the “best” S_γn, which maximizes , is the one with

and

(9)

(ii) If S_1n is the best test satisfying (P1)-(P5), then;

(10)

In the typical case, S_n is an i.i.d. summation (up to scale); then µ_n(λ) = nµ(λ)

Note that the Pitman ARE does not depend on (α, β); thus if , or C₁ > C₂, then {S_1n} is better than {S_2n} for all (α, β).

The Pitman AREs given by (3) or (4) are easy to use. However, they require that the two test statistics being compared have the same asymptotic distribution (after standardization), and (4) further requires that they be jointly asymptotically normal. In practice, these conditions sometimes cannot be satisfied. For example, the chi-squared test Z(θ₀) and have different asymptotic distributions. Below we give a generalized version of (3) for the case where the two test statistics being compared do not necessarily have the same asymptotic distribution (after standardization). Similar generalizations may already exist in the literature; we still state our version to see what form it takes in this case. Let F_i be the asymptotic distribution of . We have:

Assume (P1)-(P4) for S_in with µ_in, σ_in and F_i separately, but with the same k and nominal level α, and let n₁ and n₂ be the sample sizes required for S_1n and S_2n to have the same asymptotic power 1 − β (0 < β < 1 − α); then

Thus for d(n) = n^q (q > 0), we define the generalized Pitman ARE as;

(11)

In the typical case or 1/q = 2, and;

Note that, unlike the case F₁ = F₂, here the Pitman ARE depends on the level α and the power β, and the comparison of two tests may not give a consistent result.

Can we obtain the corresponding form of (10) when S_1n and S_2n have different asymptotic distributions? For this we checked the proof of (4) and found that in this case, although in principle there is a relationship among the asymptotic correlation coefficient ρ between S_1n and S_2n, the asymptotic distributions F_i, and the level α and power β, it is mathematically intractable. Below we give its actual value.

Proposition 1.

Remark: When some of the conditions (P1)-(P5) are not satisfied, the ARE may not be characterized by the correlation coefficient. For example, T₁ = Z is an estimate of θ = 0 under H₀, and Z is symmetrically distributed around 0, so E_{H₀}(Z) = 0, and suppose Var_{H₀}(Z) = 1. Let , is an estimate of can also be used to test H₀. However , but we cannot say that T₂ is a ‘bad’ test statistic, and .

Chernoff ARE. This notion only considers test statistics of the form S_n = Σ_{i=1}^n Y_i with the Y_i's i.i.d. Let M(t) = E e^{tY} be the moment generating function of Y, and

Let and (assume µ₀ ≤ µ₁), (i = 0, 1), and is called the Chernoff index. be a linear combination of type I and type II errors evaluated at the critical value t, and Q_n = inf_{µ₀ ≤ t ≤ µ₁} Q_n(t) be the minimum of these errors for the test statistic S_n. Chernoff (1952) [27: H. Chernoff, "A measure of asymptotic efficiency for tests of a hypothesis based on sums of observations", Ann. Math. Stat., pp. 493-507. http://dx.doi.org/10.1214/aoms/1177729330] showed that Q_n tends to 0 at an exponential rate (so the faster the rate, or the larger the absolute value of log Q_n, the better the test statistic), and established

the result is independent of γ.

Let {S_1,n} and {S_2,n} both be i.i.d. summations with Chernoff indices ρ₁ and ρ₂ respectively, and let n₁ and n₂ be the corresponding sample sizes for which Q_{1,n₁} ~ Q_{2,n₂}. The Chernoff ARE of {S_1,n} relative to {S_2,n} is defined and given by

(12)
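A small numerical sketch may help fix ideas. It assumes the presentation found in standard references such as Serfling, where the Chernoff index is ρ = inf over µ₀ ≤ t ≤ µ₁ of max{m₀(t), m₁(t)}, with m_i(t) = inf_z e^{−zt} M_i(z), and the Chernoff ARE is log ρ₁ / log ρ₂. The summands are taken to be N(µ_i, σ²) under H_i, for which m_i(t) = exp(−(t − µ_i)²/(2σ²)) in closed form. These choices and the function names are illustrative assumptions.

```python
import math

def m_normal(t, mu, sigma):
    """inf_z exp(-z t) M(z) for a N(mu, sigma^2) summand (closed form)."""
    return math.exp(-((t - mu) ** 2) / (2.0 * sigma ** 2))

def chernoff_index(mu0, mu1, sigma, grid=10001):
    """Grid search for inf over t in [mu0, mu1] of max(m0(t), m1(t))."""
    best = 1.0
    for k in range(grid):
        t = mu0 + (mu1 - mu0) * k / (grid - 1)
        best = min(best, max(m_normal(t, mu0, sigma), m_normal(t, mu1, sigma)))
    return best

def chernoff_are(rho1, rho2):
    """Chernoff ARE of test 1 relative to test 2 from their indices."""
    return math.log(rho1) / math.log(rho2)

rho_a = chernoff_index(0.0, 1.0, sigma=1.0)  # less noisy summands
rho_b = chernoff_index(0.0, 1.0, sigma=2.0)  # noisier summands
print(rho_a, rho_b, chernoff_are(rho_a, rho_b))
```

In this normal example the minimizing critical value is the midpoint of µ₀ and µ₁, so the index is exp(−(µ₁ − µ₀)²/(8σ²)) and the ARE reduces to the ratio of variances, σ_b²/σ_a².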

For a test statistic not in the form of an i.i.d. summation, the Chernoff index is difficult to compute. The following result is sometimes very helpful in this case, and gives an upper bound for the Chernoff index.

(Kallenberg, 1982 [28: W. Kallenberg, "Chernoff efficiency and deficiency", Ann. Stat., pp. 583-594. http://dx.doi.org/10.1214/aos/1176345799]) Let for some

Then

In the case of a simple null vs a simple alternative, Kallenberg (1982) [28] also gives an upper bound for the Chernoff index, and any test statistic that achieves this bound is said to be Chernoff efficient. As this bound itself is not easy to compute, we will not pursue it here; interested readers can consult the cited paper or the book by Nikitin (1995) [10].

As another way to simplify the computation, we consider a modified version of the Chernoff index. Let S be the weak limit of S_n, be the distribution function of S, and H_n: λ_n = λ₀ + n^{-1/2} be a sequence of local alternatives. As the sample size increases, the test statistic S_n is expected to be able to distinguish the local alternatives from the null. Let (assume µ₁ ≥ µ₀), and be the asymptotic linear combination of type I and local type II errors evaluated at t, and . The smaller is , the better S_n is as a test statistic for H₀ vs. H₁. For two test statistics S_1n and S_2n with , we define the modified Chernoff ARE as

(13)

Let, ;

Below we give the values of ρ_{Z(θ⁽⁰⁾)} and ρ_{Z_MERT}, so that their Chernoff ARE can be obtained. We also give and , so that their modified Chernoff ARE can be obtained. For the chi-squared test T, under H₁ its asymptotic distribution is a non-central chi-squared distribution, which has no closed form, so its modified Chernoff index is not directly computable. Let , where g_i is the observed genotype of the i-th individual and x_i is the corresponding vector of covariates, and let

Let, and

Proposition 2. (i) Assume is normal with mean and variance . Then, for E to denote expectation with respect to (x_i, g_i), we have;

Hodges-Lehmann ARE. Consider testing the null hypothesis . Given a level α test statistic S_n with critical value , the type II error at λ is β_n(λ) = . Typically, β_n(λ) tends to zero at an exponential rate, and the faster it does, the better S_n is. Hodges and Lehmann (1956) [29: J. Hodges, "The efficiency of some nonparametric competitors of the t-test", Ann. Math. Stat., pp. 324-335. http://dx.doi.org/10.1214/aoms/1177728261] proposed

as a measure of the performance of S_n, called the Hodges-Lehmann index of the statistic S_n. For two test statistics S_1n and S_2n for the same H₀ vs. H₁, with indices d₁(λ) and d₂(λ), the Hodges-Lehmann ARE of {S_1n} relative to {S_2n} at is defined as

(14)

For probability density functions f and g, let ∫ f(x) log[f(x)/g(x)] dx be the Kullback-Leibler divergence between f and g. For any test statistic S_n(X₁, ..., X_n) based on X₁, ..., X_n i.i.d. with density , the Hodges-Lehmann index has the following property:

and any test statistic that achieves equality in the above is said to be Hodges-Lehmann efficient.
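Since the bound is expressed through a Kullback-Leibler divergence, it may help to see how that quantity is computed for discrete genotype distributions. The sketch below is an illustration only: it assumes genotype frequencies under Hardy-Weinberg equilibrium and hypothetical allele frequencies, and it does not attempt to reproduce the constant or the direction of the divergence used in the bound. Because the divergence is not symmetric, both directions are printed.

```python
import math

def kl_divergence(f, g):
    """Kullback-Leibler divergence sum_j f_j log(f_j / g_j) for discrete f, g."""
    return sum(fj * math.log(fj / gj) for fj, gj in zip(f, g) if fj > 0.0)

def hwe_genotype_freqs(p):
    """Frequencies of genotypes with 0, 1, 2 minor alleles under HWE (assumption)."""
    q = 1.0 - p
    return [q * q, 2.0 * p * q, p * p]

null_model = hwe_genotype_freqs(0.30)   # hypothetical allele frequency under H0
alt_model = hwe_genotype_freqs(0.35)    # hypothetical allele frequency under H1

print(kl_divergence(null_model, alt_model))
print(kl_divergence(alt_model, null_model))
```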

Compared to the Pitman and Chernoff AREs, the Hodges-Lehmann ARE does not require that the test statistics being compared have the same asymptotic distribution, nor that they have the form of i.i.d. summations, so it has a wider scope of application.

Proposition 3. Under the conditions of Theorem 4 in Zheng et al. (2010) [30: Y. Zang, "Simple algorithms to calculate asymptotic", J. Stat. Softw., pp. 1-24.], with , given in (2), for λ > 1, we have

For the chi-squared test T, under H₁ its asymptotic distribution is a non-central chi-squared distribution, which has no closed form, so its Hodges-Lehmann ARE is not directly available.

Bahadur ARE. Consider testing the null hypothesis . Let F_{n,λ}(·) be the distribution function of a test statistic S_n under P_λ, and for , let

the p-value of the observed S_n under the distribution P_λ, and

if the limit exists. Typically, 1 − L_n tends to one and L_n tends to zero exponentially fast, and the faster, or the bigger c(·), the better S_n is. For two test statistics S_{i,n} (i = 1, 2) for the same hypothesis, with L_{i,n}, c_i(λ), and sample sizes n_i, to perform “equivalently” in the sense lim n₁⁻¹ log L_{2,n₂} = lim n₁⁻¹ log L_{1,n₁}, the Bahadur ARE of S_{1,n} relative to S_{2,n}, at , is defined as , and has the property

(15)

The limit c can be computed under the following conditions.

(B1). For

(B2). For the interval , there is a function g on I, such that

(Bahadur, 1960 [13]). If S_n satisfies (B1)-(B2), then for ,

For any test statistic S_n(X₁, ..., X_n) based on X₁, ..., X_n i.i.d. with density , Bahadur (1967) [16] obtained the following:

Note that although the above relationship is regarded as a dual to that of the Hodges-Lehmann index, the two are not equivalent, as . A test statistic is said to be Bahadur efficient if for each , lim_n log

Bahadur efficiency of the likelihood ratio test has been studied by a number of researchers for some special distribution families. Arcones (2005 [17], Theorem 3.3) proved that, under some regularity conditions, the likelihood ratio statistic is Bahadur efficient. Let be the density function of the data. Under the conditions of his Theorem 3.3, for each fixed λ > 1 and θ, we have

Like the Hodges-Lehmann ARE, the Bahadur ARE does not require that the test statistics being compared have the same asymptotic distribution, nor that they have the form of i.i.d. summations, so it has a wide scope of application.

For ease of computation, we consider a local version of the Bahadur ARE. Consider testing H₀: λ = λ₀ vs. the local alternative H_n: λ = λ₀ + n^{-1/2}. Let F₀ be the asymptotic distribution function of S_n under H₀. We define

Typically, 0 < < 1. The smaller , the better S_n is. For two test statistics S_{i,n} (i = 1, 2) for the same hypothesis with G_{i,n} and , we define the local Bahadur ARE of S_{1,n} relative to S_{2,n} as

(16)

Proposition 4. (i) with µ_MERT (λ) given in Proposition 3, we have;

(ii) Under the conditions of Theorem 4 in ZLY, with being the derivative of µ_MERT(λ) and θ₀ the value of θ under H₀, we have

4. SIMULATION AND APPLICATION TO GENETIC ASSOCIATION STUDIES

4.1. Simulation Study

Let P be the Minor Allele Frequency (MAF) of a marker of interest. We consider case-control data with r = 500 cases and s = 500 controls, and disease prevalence K = 0.05. We generate 1000 datasets and compute the means and standard deviations of the AREs. For Z_MERT, we choose θ_i = 0 and θ_j = 1.
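The data-generating step of such a simulation can be sketched as follows. This is a hedged illustration, not the authors' code: it assumes no covariates, Hardy-Weinberg equilibrium in the population, genotype relative risks lam1 and lam2 relative to the non-carrier genotype, and that Z(θ) reduces to the Cochran-Armitage trend statistic with scores (0, θ, 1). The MERT of the θ = 0 and θ = 1 members is formed as [Z(0) + Z(1)] / [2(1 + ρ)]^{1/2}, with ρ estimated from the pooled genotype counts. The sketch only generates data and evaluates the two statistics; it does not compute the AREs. All function names are hypothetical.

```python
import math
import random

def hwe(p):
    """Genotype frequencies (0, 1, 2 minor alleles) under Hardy-Weinberg equilibrium."""
    q = 1.0 - p
    return [q * q, 2.0 * p * q, p * p]

def case_control_probs(p, K, lam1, lam2):
    """Genotype distributions in cases and controls for prevalence K (assumed model)."""
    g = hwe(p)
    f0 = K / (g[0] + lam1 * g[1] + lam2 * g[2])   # baseline penetrance
    f = [f0, lam1 * f0, lam2 * f0]
    cases = [g[j] * f[j] / K for j in range(3)]
    controls = [g[j] * (1.0 - f[j]) / (1.0 - K) for j in range(3)]
    return cases, controls

def sample_counts(probs, size, rng):
    """Multinomial genotype counts drawn with the standard library only."""
    counts = [0, 0, 0]
    for j in rng.choices(range(3), weights=probs, k=size):
        counts[j] += 1
    return counts

def trend_z(r, s, theta):
    """Cochran-Armitage trend statistic with scores (0, theta, 1)."""
    x = [0.0, theta, 1.0]
    nr, ns = sum(r), sum(s)
    n = nr + ns
    tot = [r[j] + s[j] for j in range(3)]
    num = sum(x[j] * (ns * r[j] - nr * s[j]) for j in range(3))
    spread = n * sum(x[j] ** 2 * tot[j] for j in range(3)) - sum(x[j] * tot[j] for j in range(3)) ** 2
    var = nr * ns * spread
    return math.sqrt(n) * num / math.sqrt(var) if var > 0 else 0.0

def null_corr(tot, theta_a, theta_b):
    """Correlation of the two score vectors under the pooled genotype frequencies."""
    n = sum(tot)
    pr = [t / n for t in tot]
    xa, xb = [0.0, theta_a, 1.0], [0.0, theta_b, 1.0]
    ma = sum(pr[j] * xa[j] for j in range(3))
    mb = sum(pr[j] * xb[j] for j in range(3))
    cov = sum(pr[j] * (xa[j] - ma) * (xb[j] - mb) for j in range(3))
    va = sum(pr[j] * (xa[j] - ma) ** 2 for j in range(3))
    vb = sum(pr[j] * (xb[j] - mb) ** 2 for j in range(3))
    return cov / math.sqrt(va * vb) if va * vb > 0 else 0.0

def mert_z(r, s, theta_i=0.0, theta_j=1.0):
    """Two-member MERT built from the theta_i and theta_j trend statistics."""
    tot = [r[j] + s[j] for j in range(3)]
    rho = null_corr(tot, theta_i, theta_j)
    return (trend_z(r, s, theta_i) + trend_z(r, s, theta_j)) / math.sqrt(2.0 * (1.0 + rho))

def simulate(p, lam1, lam2, K=0.05, n_cases=500, n_controls=500, reps=1000, seed=0):
    """Return a list of (Z(0.5), Z_MERT) pairs, one per simulated dataset."""
    rng = random.Random(seed)
    cases_p, controls_p = case_control_probs(p, K, lam1, lam2)
    results = []
    for _ in range(reps):
        r = sample_counts(cases_p, n_cases, rng)
        s = sample_counts(controls_p, n_controls, rng)
        results.append((trend_z(r, s, 0.5), mert_z(r, s)))
    return results
```

A call such as `simulate(0.3, 1.5, 2.0, reps=1000)` returns the paired statistics, from which empirical rejection rates at a chosen critical value can be tabulated.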

Table 1 shows the results: the means of the AREs, with their standard deviations in brackets. First, the means of all three AREs are less than 1, which shows that Z(θ⁽⁰⁾) is consistently better than Z_MERT. This corresponds to the fact that, when θ = θ⁽⁰⁾ is the true value, Z(θ⁽⁰⁾) is asymptotically most powerful. Second, the three AREs increase as P or λ increases. Third, e_p has the lowest variance among the three AREs, followed in turn by the other two.

Table 1
The AREs of Z_MERT relative to Z(θ⁽⁰⁾).

4.2. Application

We use 6 reported SNPs associated with breast cancer 2 (Hunter et al., 2007 [31]; Li et al., 2008 [32]) to illustrate the ARE of Z_MERT. These 6 SNPs are rs10510126, rs12505080, rs17157903, rs1219648, rs7696175, and rs2420946. The genotype counts in cases and controls are shown in Table 2, where (r₀, r₁, r₂) are the counts of the three genotypes in cases and (s₀, s₁, s₂) are the corresponding counts in controls. From the table, the three AREs E_p, E_c and E_b are all higher than 75% and can reach 97%. For example, for SNP rs17157903, the three AREs are 0.8255, 0.8453 and 0.7642. These values indicate that Z_MERT loses relatively little efficiency, that is, Z_MERT is a robust test.

Table 2
Three AREs of Z_MERT for 6 reported SNPs associated with breast cancer 2.
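For readers who want to compute the two statistics from genotype counts (r₀, r₁, r₂) and (s₀, s₁, s₂), the sketch below may help. It assumes that Z(θ) is the Cochran-Armitage trend statistic with scores (0, θ, 1), and that Z_MERT takes the usual form (Z(θ_i) + Z(θ_j)) / √(2(1 + ρ)), with ρ the null correlation estimated from the pooled genotype frequencies. The paper's own definitions lie outside this section, so these forms are assumptions, and the counts used are hypothetical, not those of Table 2.

```python
import math

def trend_z(cases, controls, theta):
    """Cochran-Armitage trend statistic with genotype scores (0, theta, 1)."""
    x = (0.0, theta, 1.0)
    r, s = sum(cases), sum(controls)
    n = r + s
    tot = [a + b for a, b in zip(cases, controls)]
    num = math.sqrt(n) * sum(xj * (s * a - r * b)
                             for xj, a, b in zip(x, cases, controls))
    var = r * s * (n * sum(xj ** 2 * t for xj, t in zip(x, tot))
                   - sum(xj * t for xj, t in zip(x, tot)) ** 2)
    return num / math.sqrt(var)

def null_corr(cases, controls, theta_i, theta_j):
    """Null correlation of Z(theta_i) and Z(theta_j): the correlation of the
    two score vectors under the pooled genotype frequencies."""
    tot = [a + b for a, b in zip(cases, controls)]
    n = sum(tot)
    w = [t / n for t in tot]
    xi, xj = (0.0, theta_i, 1.0), (0.0, theta_j, 1.0)
    mi = sum(wk * a for wk, a in zip(w, xi))
    mj = sum(wk * b for wk, b in zip(w, xj))
    cov = sum(wk * (a - mi) * (b - mj) for wk, a, b in zip(w, xi, xj))
    vi = sum(wk * (a - mi) ** 2 for wk, a in zip(w, xi))
    vj = sum(wk * (b - mj) ** 2 for wk, b in zip(w, xj))
    return cov / math.sqrt(vi * vj)

def mert_z(cases, controls, theta_i=0.0, theta_j=1.0):
    """MERT built from the two extreme trend statistics (assumed form)."""
    rho = null_corr(cases, controls, theta_i, theta_j)
    return ((trend_z(cases, controls, theta_i) + trend_z(cases, controls, theta_j))
            / math.sqrt(2.0 * (1.0 + rho)))

# Hypothetical counts, for illustration only.
cases, controls = (200, 200, 100), (250, 200, 50)
print(trend_z(cases, controls, 0.5), mert_z(cases, controls))
```

With θ_i = 0 and θ_j = 1, `null_corr` reduces to √(p₀p₂ / ((1 − p₀)(1 − p₂))), where p₀ and p₂ are the pooled frequencies of the two homozygous genotypes.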

APPENDIX

Derivation of : From (P3), we have . Also, as in the proof in Serfling (1980 [8], pp. 317-318), if and only if

Thus, for β_{i,n}(θ_n) → β, we must have

Proof of Proposition 1: We use (4) to compute e_p(Z_MERT, Z(θ⁽⁰⁾)). By the definition of Z(θ⁽⁰⁾) and the CLT we have , and by Theorem 3 in ZLY, . Also, Z(θ⁽⁰⁾) and Z_MERT are jointly asymptotically normal with correlation . Thus the conditions of (4) are satisfied, and it gives

Proof of Proposition 2. (i) By assumption As in the proof of Theorem 4 in ZLY, we have that where the V_i = V_i (θ) ’s are i.i.d. with;

Under with and Under with and So we have

By Example A in Serfling (1980 [8], p. 330), we have

similar to that for (Z(θ⁽⁰⁾)).

(ii). We first compute . In this case, let be the weak limit of (Z(θ⁽⁰⁾)). Then,

Proof of Proposition 3. Since under , we have t_n(α) → Φ^{-1}(1 − α); and under is continuous on (−∞, ∞), the distribution function of converges uniformly to Φ(·). Note that µ(λ, θ) > 0, so for λ > 1 we have

Let , using L'Hôpital's rule twice, we get

Similarly, under where The same way we get;
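The displays in these proofs were not preserved. As a hedged illustration of the kind of limit that two applications of L'Hôpital's rule typically yield in normal-tail computations (a standard fact, not necessarily the authors' exact display), with φ the standard normal density:

```latex
\lim_{t\to\infty}\frac{\log\{1-\Phi(t)\}}{t^{2}}
  = \lim_{t\to\infty}\frac{-\phi(t)}{2t\,\{1-\Phi(t)\}}
  = -\frac{1}{2}\,\lim_{t\to\infty}\frac{\phi(t)/t}{1-\Phi(t)}
  = -\frac{1}{2}\,\lim_{t\to\infty}\frac{-\phi(t)\,(1+t^{-2})}{-\phi(t)}
  = -\frac{1}{2}.
```

The first equality differentiates the numerator and the denominator of the original ratio; the third does the same for the ratio (φ(t)/t) / (1 − Φ(t)), using φ′(t) = −tφ(t).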

Proof of Proposition 4. (i) In our case and when uniformly in S_n. From the proof of Theorem 4 in ZLY, we have that for (a.s.). Now we compute, for

Let , and use L'Hôpital's rule,

Since , and by L'Hôpital's rule, , so using L'Hôpital's rule on the above again,

Thus by Bahadur's (1960) [13] Theorem,

is similarly computed;

Similarly, under , (a.s.), so

CONSENT FOR PUBLICATION

Not applicable.

CONFLICT OF INTEREST

The authors declare no conflict of interest, financial or otherwise.

ACKNOWLEDGEMENTS

Declared none.

REFERENCES

[1]	C.M. Sitlani, and B. McKnight, "Relative efficiency of trend tests with misspecified genetic models in stratified analyses of case-control or cohort data", Hum. Hered., vol. 71, no. 4, pp. 246-255. [http://dx.doi.org/10.1159/000328858] [PMID: 21811075]
[2]	G. Zheng, "Some statistical properties of efficiency robust tests with applications to genetic association studies", Scand. J. Stat., pp. 762-774. [http://dx.doi.org/10.1111/sjos.12060]
[3]	J. Gastwirth, "On robust procedures", J. Am. Stat. Assoc., pp. 929-948. [http://dx.doi.org/10.1080/01621459.1966.10482185]
[4]	J. Gastwirth, "The use of maximin efficiency robust tests in combining contingency tables and survival analysis", J. Am. Stat. Assoc., pp. 380-384. [http://dx.doi.org/10.1080/01621459.1985.10478127]
[5]	A. Birnbaum, "Efficiency robust 2-sample rank tests", J. Am. Stat. Assoc., pp. 1241-1251. [http://dx.doi.org/10.1080/01621459.1967.10500929]
[6]	J. Rosen, "The gradient projection method for non-linear programming. Part I: Linear constraints", SIAM J., pp. 181-217.
[7]	J. Gastwirth, "On robust rank test", Nonparametric Techniques in Statistical Inference, Cambridge University Press: London.
[8]	R. Serfling, Approximation Theorems of Mathematical Statistics, Wiley. [http://dx.doi.org/10.1002/9780470316481]
[9]	Y. Nikitin, "Asymptotic relative efficiency in testing (version 5)", Stat Prob: The Encyclopedia Sponsored by Statistics and Probability Societies.
[10]	Y. Nikitin, Asymptotic Efficiency of Nonparametric Tests, Cambridge University Press. [http://dx.doi.org/10.1017/CBO9780511530081]
[11]	A. van der Vaart, Asymptotic Statistics, Cambridge University Press. [http://dx.doi.org/10.1017/CBO9780511802256]
[12]	H. Wieand, "A condition under which the Pitman and Bahadur", Ann. Stat., pp. 1003-1011. [http://dx.doi.org/10.1214/aos/1176343600]
[13]	R. Bahadur, "Stochastic comparison of tests", Ann. Math. Stat., pp. 276-295. [http://dx.doi.org/10.1214/aoms/1177705894]
[14]	A. Dembo, Large deviation techniques and applications, 2nd ed., Springer: New York. [http://dx.doi.org/10.1007/978-1-4612-5320-4]
[15]	J. Deuschel, Large deviations, Academic Press: Boston.
[16]	R. Bahadur, "Rates of convergence of estimates and test statistics", Ann. Math. Stat., pp. 303-324. [http://dx.doi.org/10.1214/aoms/1177698949]
[17]	M. Arcones, "Bahadur efficiency of the likelihood ratio test", Math. Methods Stat., pp. 163-179.
[18]	W. Hoeffding, "Asymptotically optimal tests for multinomial distributions (with discussion)", Ann. Math. Stat., pp. 369-408. [http://dx.doi.org/10.1214/aoms/1177700150]
[19]	I. Sanov, "On the probability of large deviations of random variables", Sel. Transl. Math. Statist. Prob., pp. 213-244.
[20]	H. Rubin, "Bayes risk efficiency", Sankhya, A., pp. 325-346.
[21]	W. Kallenberg, "Intermediate efficiency, theory and examples", Ann. Stat., pp. 170-182. [http://dx.doi.org/10.1214/aos/1176346067]
[22]	A. Borovkov, "Large deviations and testing of statistical hypotheses", Sib. Adv. Math.
[23]	E. Pitman, Lecture Notes on Nonparametric Statistical Inference, Columbia University, mimeographed, 1949.
[24]	G. Noether, "Asymptotic properties of the Wald-Wolfowitz test of randomness", Ann. Math. Stat., pp. 231-246. [http://dx.doi.org/10.1214/aoms/1177729841]
[25]	C. Rao, "Criteria of estimation in large samples", Sankhya Ser. A, pp. 189-206.
[26]	C. van Eden, "The relationship between Pitman’s asymptotic relative efficiency of two tests and correlation coefficient between their test statistics", Ann. Math. Stat., pp. 1442-1451. [http://dx.doi.org/10.1214/aoms/1177703876]
[27]	H. Chernoff, "A measure of asymptotic efficiency for tests of a hypothesis based on sums of observations", Ann. Math. Stat., pp. 493-507. [http://dx.doi.org/10.1214/aoms/1177729330]
[28]	W. Kallenberg, "Chernoff efficiency and deficiency", Ann. Stat., pp. 583-594. [http://dx.doi.org/10.1214/aos/1176345799]
[29]	J. Hodges, "The efficiency of some nonparametric competitors of the t-test", Ann. Math. Stat., pp. 324-335. [http://dx.doi.org/10.1214/aoms/1177728261]
[30]	Y. Zang, "Simple algorithms to calculate asymptotic", J. Stat. Softw., pp. 1-24.
[31]	D.J. Hunter, P. Kraft, K.B. Jacobs, D.G. Cox, M. Yeager, S.E. Hankinson, S. Wacholder, Z. Wang, R. Welch, A. Hutchinson, J. Wang, K. Yu, N. Chatterjee, N. Orr, W.C. Willett, G.A. Colditz, R.G. Ziegler, C.D. Berg, S.S. Buys, C.A. McCarty, H.S. Feigelson, E.E. Calle, M.J. Thun, R.B. Hayes, M. Tucker, D.S. Gerhard, J.F. Fraumeni Jr, R.N. Hoover, G. Thomas, and S.J. Chanock, "A genome-wide association study identifies alleles in FGFR2 associated with risk of sporadic postmenopausal breast cancer", Nat. Genet., vol. 39, no. 7, pp. 870-874. [http://dx.doi.org/10.1038/ng2075] [PMID: 17529973]
[32]	Q. Li, G. Zheng, Z. Li, and K. Yu, "Efficient approximation of P-value of the maximum of correlated tests, with applications to genome-wide association studies", Ann. Hum. Genet., vol. 72, no. Pt 3, pp. 397-406. [http://dx.doi.org/10.1111/j.1469-1809.2008.00437.x] [PMID: 18318785]
