The Kaplan-Meier estimator of the survival function S(t) estimates the probability of surviving past time t.
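
In its standard product-limit form the estimator can be written as follows. The notation is assumed here rather than taken from the assignment: the t_i are the distinct event times, d_i is the number of events at t_i, and n_i is the number of patients still at risk just before t_i.

```latex
\hat{S}(t) = \prod_{i:\, t_i \le t} \left( 1 - \frac{d_i}{n_i} \right)
```

Censored patients never contribute to d_i, but they stay in n_i until the time at which they are censored.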

Eighty-six bladder cancer patients had tumors removed. After removal, the patients were separated into two groups: a placebo group (group 0) and a group treated with the drug thiotepa (group 1). The variable time represents the number of months until a tumor reoccurred in the patient. The variable censor indicates whether the tumor reoccurred; if it did not, the observation was censored. The variable number represents the number of tumors removed (1 vs. 2 or more).
Estimate the survival function in each treatment group using the Kaplan-Meier estimator. Report the Kaplan-Meier estimates of S(5), S(10), and S(25) for the treatment group and the placebo group. Give an interpretation of the Kaplan-Meier estimates of S(10) for both groups.


S(t) is the probability that a patient remains free of tumor reoccurrence past time t. For S(10), an estimated 55.77% of the placebo group had not experienced a tumor reoccurrence by 10 months, compared with 66.87% of the treatment group.
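
To make the estimator concrete, here is a minimal sketch that computes the Kaplan-Meier estimate by hand and checks it against survfit(). The toy data frame is hypothetical and is not the bladder data; the names toy, km_by_hand, and km_survfit are made up for illustration.

```r
library(survival)

# Hypothetical toy data (NOT the bladder data):
# months to event, with event = 1 for an observed event and 0 for censoring
toy <- data.frame(time  = c(2, 3, 3, 5, 8, 10, 12, 15),
                  event = c(1, 1, 0, 1, 0, 1,  0,  1))

# Distinct times at which at least one event was observed
event_times <- sort(unique(toy$time[toy$event == 1]))

# Number still at risk just before each event time, and number of events at it
n_at_risk <- sapply(event_times, function(tt) sum(toy$time >= tt))
n_events  <- sapply(event_times, function(tt) sum(toy$time == tt & toy$event == 1))

# Product-limit estimate: running product of (1 - d_i / n_i)
km_by_hand <- cumprod(1 - n_events / n_at_risk)

# The same estimate from survfit(), evaluated at the event times
km_fit     <- survfit(Surv(time, event) ~ 1, data = toy)
km_survfit <- summary(km_fit, times = event_times)$surv

print(data.frame(time = event_times, by_hand = km_by_hand, survfit = km_survfit))
```

The two columns should agree, which shows that S(10) is simply the running product of the survival fractions at every event time up to month 10.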


The log-rank test compares the treatment and placebo groups by testing whether their survival functions, here the probability of remaining free of tumor reoccurrence over time, differ.
Null Hypothesis: There is no difference in the distribution of tumor reoccurrence times between the placebo and treatment groups.
Alternative Hypothesis: There is a difference in the distribution of tumor reoccurrence times between the placebo and treatment groups.

Conclusion: Fail to reject the null hypothesis. The data does not support the claim that there is a difference in tumor reoccurrence times between the placebo and treatment groups.
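
As a sketch of how such a decision is reached, the log-rank statistic returned by survdiff() can be compared with a chi-squared distribution. The two-group toy data below are hypothetical (not the bladder data), and the 5% significance level is an assumption for illustration.

```r
library(survival)

# Hypothetical two-group toy data (NOT the bladder data)
toy <- data.frame(
  time  = c(1, 3, 4, 6, 9, 12, 2, 5, 7, 10, 14, 18),
  event = c(1, 1, 1, 0, 1, 1,  1, 0, 1, 1,  0,  0),
  group = rep(c(0, 1), each = 6)
)

# Log-rank test of equal survival functions in the two groups
lr <- survdiff(Surv(time, event) ~ group, data = toy)

# The statistic is referred to a chi-squared distribution
# with (number of groups - 1) degrees of freedom
df      <- length(lr$n) - 1
p_value <- pchisq(lr$chisq, df = df, lower.tail = FALSE)

alpha    <- 0.05  # assumed significance level
decision <- if (p_value < alpha) "reject H0" else "fail to reject H0"
cat("chi-squared =", round(lr$chisq, 3), " p =", round(p_value, 3), "->", decision, "\n")
```

Printing the survdiff object shows the same chi-squared statistic and p-value; extracting them only makes the decision rule explicit.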
The Cox proportional hazards model: h(t) = h0(t) exp(β1·group + β2·number), where h(t) is the hazard function, which gives the instantaneous rate of tumor reoccurrence at time t among patients who have not yet had a reoccurrence; h0(t) is the baseline hazard function (the hazard when all covariates equal zero); and β1 and β2 are the regression coefficients.

Holding the number of tumors fixed, the estimated hazard of tumor recurrence in the treatment group is exp(−0.3928) = 0.6751 times that in the placebo group. The 95% CI for this hazard ratio is (0.3726, 1.223), which contains 1, so the hazard of recurrence in the treatment group is not significantly different from that in the placebo group at the 5% level.
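
The arithmetic behind this interpretation can be sketched in a few lines of R using only the two reported numbers. The standard error and p-value below are back-calculated from the confidence interval, so they are approximations rather than values from the fitted model, and small rounding differences from the reported 0.6751 are expected because the coefficient itself is rounded.

```r
beta_group <- -0.3928           # reported coefficient for group
ci_hr      <- c(0.3726, 1.223)  # reported 95% CI for the hazard ratio

# Hazard ratio: exponentiate the coefficient
hr <- exp(beta_group)

# Approximate standard error implied by the CI width on the log scale
se_approx <- diff(log(ci_hr)) / (2 * qnorm(0.975))

# Approximate Wald z statistic and two-sided p-value
z_approx <- beta_group / se_approx
p_approx <- 2 * pnorm(-abs(z_approx))

cat("hazard ratio =", round(hr, 3), "\n")
cat("CI contains 1:", ci_hr[1] < 1 && 1 < ci_hr[2], "\n")
cat("approximate p-value =", round(p_approx, 3), "\n")
```

A 95% interval for the hazard ratio that contains 1 corresponds to a two-sided Wald p-value above 0.05, so the two checks tell the same story.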
R Code:
# Load the data (path as on the author's machine)
bladder <- read.csv("/Users/aspengulley/Desktop/bladder.csv", header = TRUE)
head(bladder)
library(survival)
library(survminer)
# Kaplan-Meier estimate of the survival function in each treatment group.
# Surv() treats censor = 1 as an observed reoccurrence and censor = 0 as censored.
bladder.km <- survfit(Surv(time, censor) ~ group, data = bladder)
bladder.km
# Kaplan-Meier estimates of S(5), S(10), and S(25) in each group
summary(bladder.km, times = c(5, 10, 25))
# Plot the estimated survival curves by group
ggsurvplot(
  fit = bladder.km,
  data = bladder,
  xlab = "Months",
  ylab = "Overall Probability")
# Log-rank test comparing the two groups
bladder.diff <- survdiff(Surv(time, censor) ~ group, data = bladder)
bladder.diff
# Cox proportional hazards model with group and number as predictors
fit <- coxph(Surv(time, censor) ~ group + number, data = bladder)
summary(fit)
ISLR2 11.8 Lab: Survival Analysis
Reference:
James, G., Witten, D., Hastie, T., and Tibshirani, R. (2021). An Introduction to Statistical Learning with Applications in R, Second Edition. Springer-Verlag, New York. https://www.statlearning.com
Here is the online link to the book if you want to check out some other machine learning topics and labs. The labs are at the end of every chapter. https://hastie.su.domains/ISLR2/ISLRv2_website.pdf
BrainCancer {ISLR2}
A data set consisting of survival times for patients diagnosed with brain cancer. A data frame with 88 observations and 8 variables:
sex: factor with levels “Female” and “Male”
diagnosis: factor with levels “Meningioma”, “LG glioma”, “HG glioma”, and “Other”
loc: location factor with levels “Infratentorial” and “Supratentorial”
ki: Karnofsky index
gtv: gross tumor volume, in cubic centimeters
stereo: stereotactic method factor with levels “SRS” and “SRT”
status: whether the patient is still alive at the end of the study: 0=Yes, 1=No
time: survival time, in months
Source:
I. Selingerova, H. Dolezelova, I. Horova, S. Katina, and J. Zelinka. Survival of patients with primary brain tumors: Comparison of two statistical approaches. PLoS One, 11(2):e0148733, 2016. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4749663/


The log-rank test compares the male and female groups by testing whether their survival functions differ over time.
Null Hypothesis: There is no difference in the survival functions of males and females.
Alternative Hypothesis: There is a difference in the survival functions of males and females.

Conclusion: Fail to reject the null hypothesis. The data does not support the claim that there is a difference in the probability of survival between the groups over time.
The Cox proportional hazard model results:

There is no evidence of a difference between survival in males and females.
Fit a Cox model and include other predictors


The estimated hazard of death for patients diagnosed with LG glioma is about 2 times that of patients with the baseline diagnosis of meningioma, while for HG glioma it is more than 8 times the baseline hazard, holding the other predictors fixed. Additionally, the Karnofsky index has a negative coefficient, suggesting that higher values are associated with a lower hazard of death and therefore longer survival.
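
These multipliers are the exponentiated Cox coefficients. The sketch below refits the full model and exponentiates them; it assumes the ISLR2 and survival packages are installed. The 10-point change in the Karnofsky index is an arbitrary choice for illustration, and hr and hr_ki_10 are illustrative names.

```r
library(survival)
library(ISLR2)

# Full Cox model, written with plain variable names and data = BrainCancer
fit <- coxph(Surv(time, status) ~ sex + diagnosis + loc + ki + gtv + stereo,
             data = BrainCancer)

# Hazard ratios relative to the baseline level of each factor
# (meningioma is the baseline diagnosis)
hr <- exp(coef(fit))
print(round(hr, 2))

# 95% confidence intervals on the hazard-ratio scale
print(round(exp(confint(fit)), 2))

# For a continuous predictor, the hazard ratio for a k-unit increase is exp(k * beta);
# here k = 10 points on the Karnofsky index (an assumed, illustrative increment)
hr_ki_10 <- exp(10 * coef(fit)["ki"])
print(hr_ki_10)
```

A hazard ratio below 1 for ki matches the negative coefficient noted above: a higher Karnofsky index goes with a lower hazard of death.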

R Code:
library(ISLR2)
library(survival)
library(survminer)
data("BrainCancer")
?BrainCancer
names(BrainCancer)
# Tabulate sex, diagnosis, and status (no attach() needed)
table(BrainCancer$sex)
table(BrainCancer$diagnosis)
table(BrainCancer$status)
# Overall Kaplan-Meier survival curve
fit.surv <- survfit(Surv(time, status) ~ 1, data = BrainCancer)
plot(fit.surv, xlab = "Months",
     ylab = "Estimated Probability of Survival")
ggsurvplot(
  fit = fit.surv,
  data = BrainCancer,
  xlab = "Months",
  ylab = "Estimated Probability of Survival")
# Kaplan-Meier curves stratified by sex
fit.sex <- survfit(Surv(time, status) ~ sex, data = BrainCancer)
# quartz()  # optional: opens a new graphics window on macOS only
plot(fit.sex, xlab = "Months",
     ylab = "Estimated Probability of Survival", col = c(2, 4))
legend("bottomleft", levels(BrainCancer$sex), col = c(2, 4), lty = 1)
# Log-rank test comparing males and females
logrank.test <- survdiff(Surv(time, status) ~ sex, data = BrainCancer)
logrank.test
# Cox model with sex as the only predictor
fit.cox <- coxph(Surv(time, status) ~ sex, data = BrainCancer)
summary(fit.cox)
# Cox model with the additional predictors; plain variable names are
# required so that survfit() can match them against newdata below
fit.cox <- coxph(Surv(time, status) ~ sex + diagnosis + loc + ki + gtv + stereo,
                 data = BrainCancer)
summary(fit.cox)
# One row per diagnosis, with the other predictors held at fixed values.
# Factor values must match the level names exactly (no padding spaces).
modaldata <- data.frame(
  diagnosis = levels(BrainCancer$diagnosis),
  sex = rep("Female", 4),
  loc = rep("Supratentorial", 4),
  ki = rep(mean(BrainCancer$ki), 4),
  gtv = rep(mean(BrainCancer$gtv), 4),
  stereo = rep("SRT", 4))
# Predicted survival curves for each diagnosis
survplots <- survfit(fit.cox, newdata = modaldata)
plot(survplots, xlab = "Months",
     ylab = "Survival Probability", col = 2:5)
legend("bottomleft", levels(BrainCancer$diagnosis), col = 2:5, lty = 1)
By Aspen Gulley
