ISSN: 2666-1489 ― Volume 10, 2020

RESEARCH ARTICLE

We studied the consistency of the semi-parametric maximum likelihood estimator (SMLE) under the Cox regression model with right-censored (RC) data.

Consistency proofs of the MLE are often based on the Shannon-Kolmogorov inequality, which requires finite *E*(lnL), where L is the likelihood function.

We establish the consistency of the SMLE under the Cox model with RC data.

Under the Cox model with RC data, E(lnL) may not exist. We used the Kullback-Leibler information inequality in our proof.

open-access license: This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International Public License (CC-BY 4.0), a copy of which is available at: https://creativecommons.org/licenses/by/4.0/legalcode. This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

| Open Peer Review Details | |
|---|---|
| Manuscript submitted on | 14-05-2020 |
| Original Manuscript | Consistency of the Semi-parametric MLE under the Cox Model with Right-Censored Data |

We studied the consistency of the semi-parametric maximum likelihood estimator (SMLE) under the Cox model with right-censored (RC) data.

Let *Y* be a random survival time and **X** a *p*-dimensional random covariate. Conditional on **X** = **x**, *Y* satisfies the Cox model if its hazard function satisfies

$$h(t \mid \mathbf{x}) = h_o(t)\, e^{\boldsymbol{\beta}'\mathbf{x}}, \tag{1.1}$$

where *h _{o}* is the baseline hazard function and *β* is the regression parameter (see [1]).
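As a minimal sketch of the model just defined, the snippet below evaluates the Cox hazard *h*(*t*|**x**) = *h _{o}*(*t*) exp(*β*'**x**) and, for a continuous *Y*, the implied survival function *S*(*t*|**x**) = *S _{o}*(*t*)^exp(*β*'**x**). The function names, the unit-rate exponential baseline, and the numerical values of *β* and **x** are illustrative assumptions, not taken from the paper.

```python
import math


def cox_hazard(baseline_hazard, beta, x, t):
    """Hazard h(t|x) = h_o(t) * exp(beta'x) under the Cox model."""
    linear_predictor = sum(b * xi for b, xi in zip(beta, x))
    return baseline_hazard(t) * math.exp(linear_predictor)


def cox_survival(baseline_survival, beta, x, t):
    """Survival S(t|x) = S_o(t) ** exp(beta'x); assumes Y is continuous."""
    linear_predictor = sum(b * xi for b, xi in zip(beta, x))
    return baseline_survival(t) ** math.exp(linear_predictor)


# Hypothetical baseline: unit-rate exponential, so h_o(t) = 1 and S_o(t) = exp(-t).
def h_o(t):
    return 1.0


def S_o(t):
    return math.exp(-t)


beta = [0.5, -1.0]   # assumed regression parameter
x = [1.0, 2.0]       # assumed covariate value

# The hazard ratio against the zero covariate does not depend on t.
hazard_ratio = cox_hazard(h_o, beta, x, 2.0) / cox_hazard(h_o, beta, [0.0, 0.0], 2.0)
survival_at_2 = cox_survival(S_o, beta, x, 2.0)
```

The hazard ratio equals exp(*β*'**x**) at every *t*, which is the proportional hazards property; the survival formula used here does not apply to a discrete *Y*.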

In this paper, we shall make use of the assumptions as follows:

**AS1**. Suppose that *C* is a random variable with the df *f _{C}* (

(1.2) |

and *S*(*t***|x**) is a function of (*S _{o}*,

Due to (AS1) and Eq. (1.2), the generalized likelihood function can be written as:

(1.3) |

which coincides with the standard form of the generalized likelihood [2]. Eq. (1.3) is identical to the next expression:

(1.4) |

where *η _{n}* = min{

(1.5) |

If Y is discrete then *S*(*t***|x**) = ∏* _{s≤t}*(1 -

The SMLE of (S* _{o}*,

[3]. Their simulation results suggest that the SMLE is more efficient than the partial likelihood estimator under the Cox model.

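The partial likelihood estimator is a common estimator under the Cox model; it maximizes the partial likelihood $\prod_{i \in D} e^{\boldsymbol{\beta}'\mathbf{X}_i} \big/ \sum_{j \in \mathbf{R}_i} e^{\boldsymbol{\beta}'\mathbf{X}_j}$, where *D* is the collection of indices of the exact observations and **R*** _{i}* is the risk set {j: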
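The partial likelihood described above can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's code: the function name and the toy data are hypothetical, the risk set is taken in its conventional form (subjects whose observed time is at least the *i*-th observed time), and tied exact observations are assumed absent.

```python
import math


def log_partial_likelihood(beta, times, events, covariates):
    """Log of prod_{i in D} exp(beta'x_i) / sum_{j in R_i} exp(beta'x_j).

    D   = indices with events[i] == 1 (exact observations).
    R_i = {j : times[j] >= times[i]} (assumed conventional risk set).
    Assumes there are no tied exact observations.
    """
    scores = [sum(b * v for b, v in zip(beta, x)) for x in covariates]
    total = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if d_i != 1:
            continue  # right-censored observations enter only through risk sets
        risk_sum = sum(
            math.exp(scores[j]) for j, t_j in enumerate(times) if t_j >= t_i
        )
        total += scores[i] - math.log(risk_sum)
    return total


# Toy right-censored sample (hypothetical): the second observation is censored.
times = [2.0, 3.0, 5.0, 7.0]
events = [1, 0, 1, 1]
covariates = [[1.0], [0.0], [1.0], [0.0]]

value_at_zero = log_partial_likelihood([0.0], times, events, covariates)
```

At *β* = 0 every subject has the same weight, so each exact observation contributes minus the log of the size of its risk set. The baseline survival function does not appear in this criterion at all, whereas the SMLE maximizes the likelihood over the baseline and *β* jointly.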

The consistency of the SMLE under the continuous Cox model with interval-censored (IC) data has been established, making use of the following result [5]:

**The Shannon-Kolmogorov (S-K) inequality**. *Let f _{o} and f be two densities with respect to (w.r.t.)* a measure

Under the Cox model with IC data, the S-K inequality becomes *E* (ln*L*(*S _{o}*,

[3].

That is, finite *E* (lnL (*S _{o}*,

A related inequality is as follows.

**The Kullback-Leibler (K-L) information inequality**. *Let f _{o} and f be two densities w.r.t. a measure μ*.

The K-L inequality says that ∫ *f*_{0}(*t*) ln(*f*_{0}/*f*)(*t*) *dμ*(*t*) exists, though it may be ∞. The two inequalities are not equivalent. In fact,

In this note, we show that the SMLE under the Cox model is consistent, making use of the Kullback-Leibler information inequality [6].
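To illustrate why the finiteness of *E*(ln*L*) matters, the following worked example exhibits a density *f*_{0} for which *E* ln *f*_{0}(*Y*) = −∞ while the K-L integral remains well defined. The example is an illustration added here, not taken from the paper; the particular distribution is a hypothetical choice.

```latex
% Hypothetical example: a density w.r.t. counting measure on {3, 4, ...}.
\[
  f_0(k) = \frac{c}{k(\ln k)^2}, \quad k = 3, 4, \dots, \qquad
  c^{-1} = \sum_{k \ge 3} \frac{1}{k(\ln k)^2} < \infty .
\]
% Because \ln\ln k > 0 for k \ge 3 and \sum_k f_0(k) = 1,
\[
  -E \ln f_0(Y)
  = \sum_{k \ge 3} \frac{c\,(\ln k + 2\ln\ln k - \ln c)}{k(\ln k)^2}
  \;\ge\; \sum_{k \ge 3} \frac{c}{k \ln k} \;-\; \ln c \;=\; \infty .
\]
% Hence E \ln f_0(Y) = -\infty, yet for every density f w.r.t. the same measure
\[
  \sum_{k \ge 3} f_0(k) \ln \frac{f_0(k)}{f(k)} \in [0, \infty],
  \qquad \text{with value } 0 \text{ when } f = f_0 .
\]
```

An argument that compares *E* ln *f*_{0} with *E* ln *f* term by term breaks down here, because both expectations can be −∞; working with the single integral of ln(*f*_{0}/*f*) avoids that difficulty.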

**2. The Main Results.** Notice that under the assumption that *h _{o}* exists,

**Theorem 1**. *Under the Cox model with RC data, if Y is either continuous or discrete, and if**S _{o}* (τ

The proof of Theorem 1 makes use of a modified K-L inequality. The K-L inequality requires that *f*_{0} and *f* both be densities w.r.t. the measure *μ*; that is, ∫ *f*(*t*)*dμ*(*t*) = 1. However, in our case, we encounter the situation that ∫ *f*(*t*)*dμ*(*t*) ∈ [0,1].

**Lemma 1** (the modified K-L inequality). If *f _{i}* ≥ 0,

**Proof.** In view of the K-L inequality, it suffices to prove the inequality ∫ *f*_{1}(*t*) ln(*f*_{1}/*f*_{2})(*t*) *dμ*_{1}(*t*) ≥ 0 under the additional assumptions that ∫ *f*_{2}(*t*)*dμ*_{1}(*t*) < 1, ∫ *f*_{1}(*t*)*dμ*_{2}(*t*) = 0 and ∫ *f*_{2}(*t*)*dμ*(*t*) = 1, where *μ*_{2} is a measure and *μ* = *μ*_{1} + *μ*_{2}. Since ∫ *f*_{2}(*t*)*dμ*(*t*) = 1, *f*_{1} and *f*_{2} are df's w.r.t. *μ*.
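The completion step in this proof can be checked numerically in the discrete case. The sketch below is an illustration under assumptions, not the paper's construction: the function name and the mass functions are hypothetical, *μ*_{1} is counting measure on three points, and *μ*_{2} is a single extra atom on which *f*_{1} vanishes and *f*_{2} carries its missing mass.

```python
import math


def kl_sum(p, q):
    """Return sum_i p[i] * ln(p[i] / q[i]), with 0 * ln(0 / q) taken as 0."""
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # terms with p_i = 0 contribute nothing
        if q_i == 0.0:
            return math.inf  # positive mass of p where q vanishes
        total += p_i * math.log(p_i / q_i)
    return total


p = [0.5, 0.3, 0.2]      # f_1: a probability mass function w.r.t. mu_1
q_sub = [0.4, 0.3, 0.1]  # f_2: total mass 0.8 < 1 w.r.t. mu_1

# Completion: one extra atom (the role of mu_2) where f_1 = 0 and f_2
# receives the missing mass, so both are densities w.r.t. mu = mu_1 + mu_2.
p_ext = p + [0.0]
q_ext = q_sub + [1.0 - sum(q_sub)]

kl_sub = kl_sum(p, q_sub)
kl_ext = kl_sum(p_ext, q_ext)
```

Because *f*_{1} puts no mass on the added atom, the sum over *μ* equals the sum over *μ*_{1}, so the ordinary K-L inequality on the completed space yields the inequality for the sub-density.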

**Proof of Theorem 1.** Let *Ω _{0 }* be the subset of the sample space

(2.1) |

Since *ω* can be arbitrary in Ω* _{0 }* and P(Ω

Before we prove Theorems 2 and 3, we present a preliminary result.

**Lemma 2** (Proposition 17 in Royden (1968), page 231). *Suppose that μ _{n} is a sequence of measures on the measurable space (J, B) such that μ_{n}(B) → μ(B),*

**Corollary 1.*** Suppose that μ _{n} is a sequence of measures on the measurable space (J , B) such that*

**Proof.** Let *k* = *inf _{n}* in

lim_{n→∞} ∫ *f _{n} dμ_{n}* = lim

**Theorem 2.** Under the discrete Cox model with RC data, Eq. (2.1) holds.

**Proof.** For the given *ω* ∈ Ω_{0} and (*S _{*}, β_{*}*) in the proof of Theorem 1, as assumed, () (

Let *G _{n}*(

(2.2) |


where *B* is a measurable set in **R**^{p+1}. To apply Lemma 2,

(2.3) |

(2.4) |

(2.5) |

(2.6) |

(2.7) |

(2.8) |

(2.9) |

and *v _{n}* converges setwise to a finite measure

(2.10) |

Thus, ∫ *ln* *dF*(*t*, 0, **x**) + ∫ *ln* *dF*(*t*, 1, **x**) = 0. Hence, (*S*_{0}(*t*), *β*) = (*S*_{*}(*t*), *β*_{*}) ∀ *t* ∈ D by the 2nd statement of the K-L inequality.

**Theorem 3.** *Under the Cox model with RC data, if Y is continuous, then Eq. (2.1) holds.*

**Proof.** For the given *ω* ∈ Ω_{0} and (

(2.11) |

In view of Eq. (1.4), since *Y* is continuous, we denote:

(2.12) |

(2.13) |

as *S _{*}* is a monotone function,

(2.14) |

The reason is as follows. For each (*t*, **x**) such that *F ^{'}*(

*F _{*}^{'}*(

We shall prove in Lemma 3 that

(2.15) |


(2.16) |


The last inequality further implies that ∫ *ln* *dF*(*t*, 0, **x**) + ∫ *ln* *dF*(*t*, 1, **x**) = 0. Thus, (*S*_{0}(*t*), *β*) = (*S*_{*}(*t*), *β*_{*}) ∀ *t* ∈ D by the 2nd statement of the K-L inequality and by the assumption AS1.

**Lemma 3.** Inequality (2.15) holds.

**Proof.** Let *k* ≥ 1 and , where *B* is a measurable set and



The author declares no conflict of interest, financial or otherwise.

The author would like to thank the editor and two referees for their invaluable comments.

[1] Q. Yu, "A note on the proportional hazards model with discontinuous data", Stat. Probab. Lett., vol. 77, no. 7, pp. 735-739. [http://dx.doi.org/10.1016/j.spl.2006.11.008]

[2] J. Kiefer and J. Wolfowitz, "Consistency of the maximum likelihood estimator in the presence of infinitely many incidental parameters", Ann. Math. Stat., vol. 27, pp. 887-906. [http://dx.doi.org/10.1214/aoms/1177728066]

[3] G.Y.C. Wong, M.P. Osborne, Q.G. Diao, and Q.Q. Yu, "The piece-wise Cox model with right-censored data", Comm. Statist. Simul. Comput., vol. 46, pp. 7894-7908. [http://dx.doi.org/10.1080/03610918.2016.1255968]

[4] D.R. Cox and D. Oakes, Analysis of Survival Data. Chapman & Hall, NY.

[5] Q.Q. Yu and Q.G. Diao, "Consistency of the semi-parametric MLE under the Cox model with linearly time-dependent covariates and interval-censored data", J. Adv. Stat., vol. 4, no. 1.

[6] S. Kullback and R.A. Leibler, "On information and sufficiency", Ann. Math. Stat., vol. 22, pp. 79-86. [http://dx.doi.org/10.1214/aoms/1177729694]
