Resampled Cox Proportional Hazards Models for Infant Mortality at the Kigali University Teaching Hospital
Paul Gatabazi*, Sileshi Fanta Melesse, Shaun Ramroop
School of Mathematics, Statistics and Computer Science University of KwaZulu-Natal Pietermaritzburg Private Bag X 01 Scottsville 3209, South Africa
A resampling technique is used as a way of overcoming instability in the Cox proportional hazards model, in order to measure the risk of infant mortality and the related standard error, given socio-economic and clinical covariates for mothers and children at the Kigali University Teaching Hospital in Rwanda.
Bootstrap and jackknife Cox proportional hazards models were applied to data on N=2117 newborns collected in 2016 at the Kigali University Teaching Hospital in Rwanda.
The unadjusted models revealed the significance of the age of female parents, information on previous abortion, gender of a newborn, number of newborns at a time, APGAR, the weight of a newborn and the circumference of the head of a newborn.
Statistical analysis supports two major findings: 1) newborns of parents under 20 years of age are at a relatively higher risk of infant death, and 2) abnormality in the newborn's head circumference and weight indicates a relatively higher risk of infant mortality. Recommendations include avoidance of pregnancy until after age 20 and clinically recommended nutrition for the mother during pregnancy to decrease the risk of infant mortality.
open-access license: This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International Public License (CC-BY 4.0), a copy of which is available at: https://creativecommons.org/licenses/by/4.0/legalcode. This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
* Address correspondence to this author at the School of Mathematics, Statistics and Computer Science University of KwaZulu-Natal Pietermaritzburg Private Bag X 01 Scottsville 3209, South Africa; Tel: +27710513309; E-mail: email@example.com
Resampling in the Cox proportional hazards model consists of fitting the Cox Proportional Hazards Model (CPHM) to a given number of samples obtained by applying a relevant resampling technique. Popular nonparametric resampling techniques include the bootstrap method, which is based on random sampling with replacement [1Efron B, Tibshirani RJ. An introduction to the bootstrap 1994.], the jackknife method, which consists of making samples by leaving out one observation at a time [1Efron B, Tibshirani RJ. An introduction to the bootstrap 1994.], and the jackknife after bootstrap [2Efron B. Jackknife-After-Bootstrap standard errors and influence functions. Wiley for the Royal Statistical Society 1992; 54(1): 83-127.]. This study focuses on the Bootstrap Cox Proportional Hazards Model (BCPHM) and the Jackknife Cox Proportional Hazards Model (JCPHM).
Hamada [3Hamada C. Bootstrap Cox regression using SAS procedures. SAS Institute Japan Ltd 1995; 211: 1258-62.] points out the aims of using the resampling technique in CPHM. Firstly, resampling allows the assessment of the stability of the CPHM; instability may be caused by correlation among the covariates. Secondly, resampling may be used when the sample size is relatively small. Model adequacy may then be addressed by selecting variables on which the model is stable rather than by testing the proportionality of variables.
BCPHM and JCPHM have been extensively applied to different studies. In [4Utzet F, Sánchez A. Some applications of the bootstrap to survival analysis. Anu Psicol 1992; 55: 155-67.], bootstrap is applied for estimating the survival function and the hazard rate with respective standard errors. Belašková, Fišerová, and Krupicková [5Belašková S, Fišerová E, Krupicková S. Study of bootstrap estimates in Cox regression model with delayed entry. Mathematica 2013; 52(2): 21-30.] published a clinical study which used BCPHM with consideration of right censoring and delayed entries. The study of Belašková et al. adapted BCPHM due to the small sample size (N=61). Xu, Sen, and Ying [6Xu G, Sen B, Ying Z. Bootstrapping a change-point Cox model for survival data. Electron J Stat 2014; 8(1): 1345-79.[http://dx.doi.org/10.1214/14-EJS927] [PMID: 25400719] ] conducted the BCPHM with consideration of a change-point along the study time with right censored survival data. The study proved the consistency of the model by making a comparison with the model based on data simulation. The JCPHM was adopted by Xiao, Yao-Hua, and Dong-Sheng [7Xiao L, Yao-Hua W, Dong-Sheng T. Jackknifed random weighting for Cox pro-portional hazards model. Sci China Math 2012; 55(4): 775-86.[http://dx.doi.org/10.1007/s11425-012-4380-4] ] together with a random weighting which consists of approximating the distribution of the maximum partial likelihood estimates in the CPHM [8Wang Z, Wu Y, Zhao LC. Approximation by randomly weighting method in censored regression model. Sci China Ser A 2009; 52: 567-76.[http://dx.doi.org/10.1007/s11425-008-0116-x] -10Zheng ZG, Tu D. Random weighting method in regression models. Sci Sinica Series A 1988; 31: 1442-59.]. Several other manuscripts also discussed the use of the resampled survival analysis including [11James LF. A study of a class of weighted bootstraps for censored data. Ann Stat 1997; 25: 1595-621.[http://dx.doi.org/10.1214/aos/1031594733] -17Kim J. 
Conditional bootstrap methods for censored data 1990.]. In this study, the BCPHM with 1000 bootstrap replicates and the JCPHM were used and compared to the CPHM in modeling the risk of infant death at the Kigali University Teaching Hospital from 01-January-2016 to 31-December-2016. The study comprises five sections, including the introduction presented in Section 1. Section 2 presents the methods of the study, where the mathematical formulations of the bootstrap and the jackknife are reviewed. Section 3 gives the main results. Section 4 discusses the results and Section 5 concludes the paper.
2.1. Bootstrap Method
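Assume a sample $x = (x_1, x_2, \ldots, x_n)$, where $x_1, x_2, \ldots, x_n$ are independent and identically distributed with distribution $F$, and $\theta$ is the statistical parameter of interest. Consider the distribution function $\hat{F}$ of a random variable that takes each of the values $x_1, x_2, \ldots, x_n$ with probability $1/n$. The bootstrap method, as described by Efron and Tibshirani [1Efron B, Tibshirani RJ. An introduction to the bootstrap 1994.], consists of generating $B$ samples $x^{*1}, x^{*2}, \ldots, x^{*B}$, where $x^{*b} = (x_1^{*b}, x_2^{*b}, \ldots, x_n^{*b})$, $b = 1, 2, \ldots, B$, are random samples of size n drawn with replacement from the sample x. The variables $x_1^{*b}, x_2^{*b}, \ldots, x_n^{*b}$ are independent and identically distributed with distribution $\hat{F}$; $\hat{\theta}$ is an estimator of $\theta$ from x; B is the number of bootstrap samples (replications).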
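The resampling step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `bootstrap_samples`, the seed and the five data values are hypothetical and are not part of the original analysis, which was carried out in STATA-15.

```python
import random

def bootstrap_samples(x, B, seed=0):
    """Return B bootstrap samples, each of size len(x), drawn with replacement from x."""
    rng = random.Random(seed)  # fixed seed only so that the sketch is reproducible
    n = len(x)
    # Each draw selects one of the n observations with probability 1/n
    return [[x[rng.randrange(n)] for _ in range(n)] for _ in range(B)]

# Hypothetical sample of n = 5 observations
x = [2.1, 3.4, 1.9, 5.0, 4.2]
samples = bootstrap_samples(x, B=1000)
print(samples[0])
```

Because sampling is with replacement, a given observation may appear several times in one bootstrap sample and not at all in another.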
2.1.2. Bootstrap Standard Error
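Assume B bootstrap samples $x^{*1}, x^{*2}, \ldots, x^{*B}$. Efron and Tibshirani [1Efron B, Tibshirani RJ. An introduction to the bootstrap 1994.] propose the estimated standard error of the bootstrap statistic of interest $\hat{\theta}^*$ as
$$\widehat{se}_B = \sqrt{\frac{1}{B-1}\sum_{b=1}^{B}\left[\hat{\theta}^*(b) - \hat{\theta}^*(\cdot)\right]^2}, \quad (1)$$
where $\hat{\theta}^*(\cdot) = \frac{1}{B}\sum_{b=1}^{B}\hat{\theta}^*(b)$ and $\hat{\theta}^*(b)$ is an estimate of the statistic of interest from the bth bootstrap sample, b = 1, 2, …, B.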
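As a hedged numerical illustration of the bootstrap standard error of Efron and Tibshirani, the sketch below computes the standard deviation of B replicate values with divisor B - 1. The function name `bootstrap_standard_error`, the choice of the median as the statistic and the data are assumptions made for the example only.

```python
import math
import random
import statistics

def bootstrap_standard_error(replicates):
    """Standard deviation of the bootstrap replicates, with divisor B - 1."""
    B = len(replicates)
    mean_rep = sum(replicates) / B  # average of the B replicate values
    return math.sqrt(sum((r - mean_rep) ** 2 for r in replicates) / (B - 1))

# Hand check: mean 3, squared deviations sum to 10, so the result is sqrt(10 / 4)
se_check = bootstrap_standard_error([1.0, 2.0, 3.0, 4.0, 5.0])

# Hypothetical data: bootstrap standard error of the sample median
rng = random.Random(0)
x = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8]
replicates = [statistics.median(rng.choices(x, k=len(x))) for _ in range(1000)]
se_median = bootstrap_standard_error(replicates)
print(se_check, se_median)
```

The same function applies unchanged when each replicate is a regression coefficient estimated from one bootstrap sample.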
2.1.3. Bootstrap Cox Proportional Hazard Model (BCPHM)
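Assume a CPHM, $h(t|x_i)$, over the p fixed covariates with values $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip})'$ and the hazard function $h_0(t)$ when values of all covariates are zeros, that is
$$h(t|x_i) = h_0(t)\exp(\beta' x_i) \quad (2)$$
[18Collett D. Modeling survival data in medical research 2nd ed. 2003.], where $\beta = (\beta_1, \beta_2, \ldots, \beta_p)'$ is a p-dimensional vector of model parameters.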
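A brief sketch may help to show what proportionality means in the Cox model: the ratio of the hazards of two individuals depends on their covariates and on the parameter vector, but not on time, because the baseline hazard cancels. The baseline hazard `h0`, the coefficients and the covariate values below are hypothetical and chosen for illustration only.

```python
import math

def hazard(t, x, beta, baseline_hazard):
    """Hazard h(t | x) = h0(t) * exp(beta'x) of a Cox proportional hazards model."""
    linear_predictor = sum(b * xi for b, xi in zip(beta, x))
    return baseline_hazard(t) * math.exp(linear_predictor)

def hazard_ratio(x_a, x_b, beta):
    """Ratio h(t | x_a) / h(t | x_b); the baseline hazard cancels out."""
    return math.exp(sum(b * (a - c) for b, a, c in zip(beta, x_a, x_b)))

# Hypothetical baseline hazard and coefficients, for illustration only
h0 = lambda t: 0.01 * (1.0 + t)
beta = [0.7, -0.3]
x_a, x_b = [1.0, 2.0], [0.0, 2.0]

# The ratio of the two hazards is the same at every time point
ratios = [hazard(t, x_a, beta, h0) / hazard(t, x_b, beta, h0) for t in (1.0, 5.0, 30.0)]
print(ratios, hazard_ratio(x_a, x_b, beta))
```

Here the two individuals differ only in the first covariate, so the hazard ratio reduces to the exponential of the first coefficient.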
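Consider three approaches of approximating the partial likelihood in the presence of tied events, namely the Breslow [19Breslow N. Covariance analysis of censored survival data. Biometrics 1974; 30(1): 89-99.[http://dx.doi.org/10.2307/2529620] [PMID: 4813387] ] approximation of the partial likelihood function given by:
$$L_B(\beta) = \prod_{j=1}^{r}\frac{\exp(\beta' s_j)}{\left[\sum_{l \in R(t_{(j)})}\exp(\beta' x_l)\right]^{d_j}}, \quad (3)$$
the Efron approximation given by:
$$L_E(\beta) = \prod_{j=1}^{r}\frac{\exp(\beta' s_j)}{\prod_{k=1}^{d_j}\left[\sum_{l \in R(t_{(j)})}\exp(\beta' x_l) - \frac{k-1}{d_j}\sum_{l \in D(t_{(j)})}\exp(\beta' x_l)\right]}, \quad (4)$$
and the Cox approximation given by:
$$L_C(\beta) = \prod_{j=1}^{r}\frac{\exp(\beta' s_j)}{\sum_{l \in R(t_{(j)};\, d_j)}\exp(\beta' s_l)}, \quad (5)$$
where $t_{(1)} < t_{(2)} < \cdots < t_{(r)}$ are the distinct event times, $d_j$ is the number of events at $t_{(j)}$, $s_j$ is the sum of the covariate vectors of the $d_j$ individuals who experience the event at $t_{(j)}$, $D(t_{(j)})$ is the set of those individuals, and $R(t_{(j)}; d_j)$ is the set of $d_j$ individuals drawn from the risk set $R(t_{(j)})$ at time $t_{(j)}$, the sum in (5) running over all such sets. The inference of model (2) based on bootstrap consists of applying model (2) to each of the B bootstrap samples $x^{*1}, x^{*2}, \ldots, x^{*B}$. Bootstrap model parameter estimation uses either the Breslow, Efron or Cox approach. The bootstrap standard error is obtained by using Equation (1).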
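To make the effect of tie handling concrete, the following sketch evaluates the logarithm of the partial likelihood under the Breslow and the Efron approaches on a small hypothetical dataset with tied event times. It is an illustration only: the function name `log_partial_likelihood` and the data are assumptions, the exact approach of Cox is omitted for brevity, and the original analysis relied on STATA-15 rather than on this code.

```python
import math
from collections import defaultdict

def log_partial_likelihood(beta, times, events, X, ties="breslow"):
    """Log partial likelihood of a Cox model under Breslow or Efron tie handling."""
    # Linear predictor beta'x_i for every subject
    eta = [sum(b * v for b, v in zip(beta, row)) for row in X]
    # Indices of the subjects who experience the event at each distinct event time
    deaths = defaultdict(list)
    for i, (t, d) in enumerate(zip(times, events)):
        if d == 1:
            deaths[t].append(i)
    loglik = 0.0
    for t_j, dead in deaths.items():
        d_j = len(dead)
        # Risk set: every subject still under observation at t_j
        risk_sum = sum(math.exp(eta[l]) for l in range(len(times)) if times[l] >= t_j)
        dead_sum = sum(math.exp(eta[l]) for l in dead)
        loglik += sum(eta[l] for l in dead)
        if ties == "breslow":
            loglik -= d_j * math.log(risk_sum)
        elif ties == "efron":
            # Each successive tied event removes a fraction k/d_j of the tied group
            for k in range(d_j):
                loglik -= math.log(risk_sum - k / d_j * dead_sum)
        else:
            raise ValueError("ties must be 'breslow' or 'efron'")
    return loglik

# Hypothetical data: two tied events at t = 2 and two at t = 5 (event = 1, censored = 0)
times = [2, 2, 3, 5, 5, 7]
events = [1, 1, 0, 1, 1, 0]
X = [[1.0], [0.0], [1.0], [0.0], [1.0], [0.0]]
breslow_0 = log_partial_likelihood([0.0], times, events, X, ties="breslow")  # -2*log(18)
efron_0 = log_partial_likelihood([0.0], times, events, X, ties="efron")      # -log(180)
print(breslow_0, efron_0)
```

When no event times are tied, the two branches give the same value; with ties, the Efron denominators are smaller, so its log partial likelihood is larger.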
2.2. Jackknife Method
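Assume a sample $x = (x_1, x_2, \ldots, x_n)$, where $x_1, x_2, \ldots, x_n$ are the values of the covariate x. Let $\hat{\theta} = s(x)$ be a statistic of interest. The jackknife samples consist of leaving out one observation at a time, that is n samples $x_{(i)} = (x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n)$, $i = 1, 2, \ldots, n$ [1Efron B, Tibshirani RJ. An introduction to the bootstrap 1994.]. The jackknife standard error estimate, as proposed in [1Efron B, Tibshirani RJ. An introduction to the bootstrap 1994.], is given as:
$$\widehat{se}_{jack} = \sqrt{\frac{n-1}{n}\sum_{i=1}^{n}\left[\hat{\theta}_{(i)} - \hat{\theta}_{(\cdot)}\right]^2}, \quad (6)$$
where $\hat{\theta}_{(\cdot)} = \frac{1}{n}\sum_{i=1}^{n}\hat{\theta}_{(i)}$ and $\hat{\theta}_{(i)} = s(x_{(i)})$ is a statistic of interest for the ith jackknife sample.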
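The jackknife standard error can be sketched as follows. The function name `jackknife_se` and the data are hypothetical. As a sanity check, the example uses the sample mean as the statistic, for which the jackknife standard error coincides with the classical standard error of the mean, s/√n.

```python
import math
import statistics

def jackknife_se(x, statistic):
    """Jackknife standard error of statistic(x) from the n leave-one-out samples."""
    n = len(x)
    # Leave-one-out replicates: the statistic recomputed without observation i
    reps = [statistic(x[:i] + x[i + 1:]) for i in range(n)]
    mean_rep = sum(reps) / n
    return math.sqrt((n - 1) / n * sum((r - mean_rep) ** 2 for r in reps))

# Hypothetical data
x = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8]
se_jack = jackknife_se(x, statistics.mean)
classical = statistics.stdev(x) / math.sqrt(len(x))
print(se_jack, classical)
```

The factor (n - 1)/n inflates the sum of squares because leave-one-out replicates are much less dispersed than bootstrap replicates.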
2.2.2. Jackknife Cox Proportional Hazard Model (JCPHM)
Model (2) based on jackknife is made by applying it to each of the n jackknife samples $x_{(i)}$, $i \in [1, n]$, of covariates $x$. Either the Breslow, Efron or Cox approach is used for estimating the jackknife model parameters, with the standard error given by Eq (6).
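The leave-one-out procedure for a Cox model can be sketched on a deliberately simplified case: a single covariate, a damped Newton-Raphson maximisation of the Breslow partial likelihood, and a small hypothetical dataset. The function names and data are assumptions made for illustration; the study itself fitted multi-covariate models in STATA-15.

```python
import math

def fit_cox_one_covariate(times, events, x, max_iter=50, tol=1e-10):
    """Newton-Raphson fit of a one-covariate Cox model (Breslow handling of ties)."""
    n = len(times)
    death_times = sorted({t for t, d in zip(times, events) if d == 1})
    beta = 0.0
    for _ in range(max_iter):
        score, info = 0.0, 0.0
        for t_j in death_times:
            risk = [i for i in range(n) if times[i] >= t_j]
            dead = [i for i in risk if times[i] == t_j and events[i] == 1]
            w = [math.exp(beta * x[i]) for i in risk]
            s0 = sum(w)
            s1 = sum(wi * x[i] for wi, i in zip(w, risk))
            s2 = sum(wi * x[i] ** 2 for wi, i in zip(w, risk))
            # First and (negative) second derivative of the log partial likelihood
            score += sum(x[i] for i in dead) - len(dead) * s1 / s0
            info += len(dead) * (s2 / s0 - (s1 / s0) ** 2)
        step = max(-1.0, min(1.0, score / info))  # damped Newton step
        beta += step
        if abs(step) < tol:
            break
    return beta

def jackknife_cox(times, events, x):
    """Leave-one-out coefficient estimates and their jackknife standard error."""
    n = len(times)
    reps = []
    for i in range(n):
        keep = [j for j in range(n) if j != i]
        reps.append(fit_cox_one_covariate([times[j] for j in keep],
                                          [events[j] for j in keep],
                                          [x[j] for j in keep]))
    mean_rep = sum(reps) / n
    se = math.sqrt((n - 1) / n * sum((r - mean_rep) ** 2 for r in reps))
    return reps, se

# Hypothetical data: follow-up time, event indicator (1 = event, 0 = censored), covariate
times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
events = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1]
x = [2.0, 0.5, 1.5, 1.0, 2.5, 0.0, 1.2, 0.8, -0.5, 0.3]

beta_hat = fit_cox_one_covariate(times, events, x)
reps, se_jack = jackknife_cox(times, events, x)
print(beta_hat, se_jack)
```

Replacing the leave-one-out index sets with index sets drawn with replacement, and the jackknife formula with the bootstrap standard error, would give the corresponding bootstrap version of the same loop.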
Table 1 describes the variables of interest and Table 2 summarises the dataset. The full dataset can be obtained from the authors of this article.
Table 1 Description of variables in the dataset on newborns at Kigali University Teaching Hospital (KUTH) during the period 01-January-2016 to 31-December-2016.
The primary time-to-event dataset of 2117 newborns at the Kigali University Teaching Hospital (KUTH) was recorded from 1st January to 31st December 2016. A complete case analysis is considered, where the event is the death of the infant. Eighty-two babies died during the study time, 69 stillborn babies were recorded and 1966 babies were censored. The eleven covariates of interest comprise demographic covariates, namely the age and the place of residence of parents; clinical covariates for parents, namely obstetric antecedents, type of childbirth and previous abortion; and clinical covariates for children, namely APGAR, gender, number of births at a time, weight, circumference of the head, and height. The minimum sample size according to Peduzzi et al. [22Peduzzi P, Concato J, Kemper E, Holford TR, Feinstein AR. A simulation study of the number of events per variable in logistic regression analysis. J Clin Epidemiol 1996; 49(12): 1373-9.[http://dx.doi.org/10.1016/S0895-4356(96)00236-3] [PMID: 8970487] ] is
$$N = \frac{10k}{p},$$
where k is the number of predictor variables and p is the proportion of events. This suggests the minimum sample size at KUTH as:
Table 2 Summary of newborns under study.
The results obtained with STATA-15 are displayed in three tables. Table 3 presents estimates of the unadjusted CPHM, BCPHM and JCPHM and of the corresponding adjusted models under the Breslow estimation method. The unadjusted and adjusted CPHM, BCPHM and JCPHM under Efron and Cox estimation are presented in Tables 4 and 5, respectively. The results of the jackknife model are relatively close to those of the Cox proportional hazards model (Table 3). The standard errors in JCPHM and CPHM are not critically different for any covariate except the upper levels of the covariates weight, head and height, where the standard error in JCPHM is more than 40 times that of CPHM. A critical difference in standard error is also observed in BCPHM for the upper levels of the covariates weight, head and height, for all levels of the covariate childbirth and for the covariate number, where the standard error is relatively higher in BCPHM. Also, BCPHM does not identify age and number as significant covariates, whereas JCPHM and CPHM do. Following suggestions in [23Parzen M, Lipsitz SR. A global goodness-of-fit statistic for Cox regression models. Biometrics 1999; 55(2): 580-4.[http://dx.doi.org/10.1111/j.0006-341X.1999.00580.x] [PMID: 11318217] ], the χ2 test statistics suggest a higher performance of the JCPHM as compared to the CPHM and BCPHM, since the value of χ2 is lower for the JCPHM throughout.
Table 3 Breslow estimation.
Table 4 Efron estimation.
Table 5 Cox estimation.
The resampling methods adopted in the Cox Proportional Hazards Model (CPHM) include the Bootstrap Cox Proportional Hazards Model (BCPHM) and the Jackknife Cox Proportional Hazards Model (JCPHM), each with three approaches to handling ties. As expected, the results under the different approaches to handling ties are not critically different. The analysis is therefore based on the STATA-15 default method [19Breslow N. Covariance analysis of censored survival data. Biometrics 1974; 30(1): 89-99.[http://dx.doi.org/10.2307/2529620] [PMID: 4813387] ]. The similarity observed between the results of JCPHM and those of CPHM is relatively stronger than that between BCPHM and CPHM. The similarity between CPHM and JCPHM suggests that the CPHM may be stable. The overall analysis confirms significant differences between the levels of the covariates age, gender, number, APGAR, weight and head. The results show a relatively higher risk for babies of parents under 20 years old as compared to babies of older parents, that is 4.651 times that of babies whose parents' ages range from 20 to 34 years, and 3.247 times that of babies whose parents are 35 years old and above. The risk of male babies is 1.942 times that of female babies. The risk of multiple babies is 0.264 times that of singleton babies. Babies with APGAR below 4/10 are at a relatively higher risk, that is 2.433 times that of babies with APGAR ranging from 4/10 to 6/10 and 16.949 times that of babies whose APGAR ranges from 7/10 to 10/10. The risk of babies whose weight is below 2500 g is 5.525 times that of babies whose weight ranges from 2500 g to 4500 g and 2.688 times that of babies with weight above 4500 g. The risk for babies born with a head circumference below 32 cm is 4.808 times that of newborns whose head circumference ranges from 32 cm to 36 cm, and 9.524 times that of newborns whose head circumference is above 36 cm.
The results of BCPHM are also close to those of JCPHM and CPHM for all significant covariates, but the model shows a relatively high standard error for non-significant levels of covariates. The critical discrepancy between standard errors after resampling for some covariates suggests instability of the CPHM at these specific covariates, and this emphasizes their non-significance in the CPHM.
The dataset was recorded for one year. The stability of the adjusted CPHM is justified by the non-critical difference between the adjusted resampled models.
This paper reviewed different methods of resampling in the Cox Proportional Hazards Model (CPHM), namely the Bootstrap Cox Proportional Hazards Model (BCPHM) and the Jackknife Cox Proportional Hazards Model (JCPHM). The results after resampling are compared to those of the CPHM for three different methods of handling ties, namely the Breslow, Efron and Cox approximations. The test statistics show a higher performance of the JCPHM as compared to the CPHM and BCPHM throughout.
The results displayed by the JCPHM and CPHM are very close and suggest the significance of the age of the female parent, information on previous abortion, the gender of a newborn, the number of newborns at a time, APGAR, the weight of a newborn and the circumference of the head of a newborn. Male babies are at a relatively higher risk as compared to female babies. The risk is higher for babies whose parents are under 20 years old as compared to babies of older parents. Babies born with APGAR less than 4/10 were found to have a higher risk as compared to newborns with APGAR greater than 4/10. Underweight babies were found to have a higher risk as compared to babies with normal weight and overweight babies. Babies with a normal circumference of the head were found to survive better than those with a relatively big or a relatively small head. Under-height babies were found to have a higher risk as compared to babies born with normal height and over-height newborns. The results of the BCPHM are not far from those of JCPHM and CPHM, but the non-significant covariates displayed a relatively higher standard error. The overall results for non-significant covariates showed a relatively higher standard error after resampling. Because babies of parents under 20 years old are at a relatively higher risk of death, pregnancy in this age range should be avoided. Also, as abnormality leads to a relatively higher risk of infant mortality, clinically recommended nutrition during pregnancy would decrease abnormality of the newborn; this would decrease infant mortality.
The analysis was limited to one event, namely the death of the infant. Resampling with multiple events could improve the models, where an alternative event would be the infant contracting a chronic disease or developing a clinical complication during the study time.
LIST OF ABBREVIATIONS
APGAR = Appearance, Pulse, Grimace, Activity and Respiration
CPHM = Cox Proportional Hazards Model
BCPHM = Bootstrap Cox Proportional Hazards Model
JCPHM = Jackknife Cox Proportional Hazards Model
KUTH = Kigali University Teaching Hospital
ETHICS APPROVAL AND CONSENT TO PARTICIPATE
The study was approved by the Kigali University Teaching Hospital, from whose database the dataset was taken, with consent that the names of both the parents and the children cannot be published.
HUMAN AND ANIMAL RIGHTS
No animals/ humans were used for the studies that are the basis of this research.
CONSENT FOR PUBLICATION
AVAILABILITY OF DATA AND MATERIALS
The data supporting the findings of the article is available in the School of Mathematics, Statistics and Computer Science, University of KwaZulu Natal at http://smscs.ukzn.ac.za/ Homepage.aspxL, reference number 00033 260 5610.
This work was funded by the University of KwaZulu Natal and Sub-Saharan Africa Consortium for Advanced Biostatistics (SSACAB) programme.
CONFLICT OF INTEREST
The authors declare that there is no conflict of interest, financial or otherwise.
This work was supported through the DELTAS Africa Initiative. The DELTAS Africa Initiative is an independent funding scheme of the African Academy of Sciences (AAS)’s Alliance for Accelerating Excellence in Science in Africa (AESA) and supported by the New Partnership for Africa’s Development Planning and Coordinating Agency (NEPAD Agency) with funding from the Wellcome Trust [grant 107754/Z/15/Z- DELTAS Africa Sub-Saharan Africa Consortium for Advanced Biostatistics (SSACAB) programme] and the UK government. The views expressed in this publication are those of the author(s) and not necessarily those of AAS, NEPAD Agency, Wellcome Trust or the UK government.
Efron B, Tibshirani RJ. An introduction to the bootstrap 1994.
Efron B. Jackknife-After-Bootstrap standard errors and influence functions. Wiley for the Royal Statistical Society 1992; 54(1): 83-127.
Hamada C. Bootstrap Cox regression using SAS procedures. SAS Institute Japan Ltd 1995; 211: 1258-62.
Utzet F, Sánchez A. Some applications of the bootstrap to survival analysis. Anu Psicol 1992; 55: 155-67.
Belašková S, Fišerová E, Krupicková S. Study of bootstrap estimates in Cox regression model with delayed entry. Mathematica 2013; 52(2): 21-30.