Estimate lifetime risk using R package ltRISK
Qiong Chen Ph.D
Henan Cancer Center/Henan Cancer Hospital
Monday Nov 4, 2024(Beijing)
Contents
Introduction of lifetime risk
Introduction of R package ltRISK
Practice of lifetime risk estimation using GCO data
Results of Global and regional lifetime risk
A measure of the risk that a certain event will happen during a person’s lifetime. In cancer research, it is usually given as the likelihood that a person who is free of a certain type of cancer will develop or die from that type of cancer during his or her lifetime. 1
Estimates of lifetime risk are usually expressed as a percentage or as odds.
The lifetime risk of developing or dying from cancer refers to the chance a person has, over the course of their lifetime (from birth to death), of being diagnosed with or dying from cancer (Table 1).
$$\text{Cumrate} = \sum_{i=1}^{A} w_i p_i$$
The cumulative rate can be converted into true cumulative risk using the following formula:
$$\text{Cumrisk} = 1 - e^{-\text{Cumrate}}$$
Although the cumulative risk does not give an estimate of the risk of developing cancer over a lifetime, it has been used as an approximation of this when the truncated upper age band is chosen as an age close to the average life expectancy of the population.
A realistic estimate of the lifetime risk of getting cancer can be obtained by estimating the number of cancers that would arise during the lifetime of a hypothetical birth cohort. This approach was termed ‘current probability’ by Esteve et al (1994).
$$p = \frac{1}{\ell_0} \sum_{x=1}^{g} L_x t_x$$
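A minimal base R sketch of the current probability formula, under assumptions that are not stated on the slides: $t_x$ is read as the age-specific incidence rate, $L_x$ as the life-table person-years lived in age group $x$, and $\ell_0$ as the life-table radix. The function name `current_probability` is hypothetical and the rates are made up for illustration.

```r
# Hypothetical helper: current probability from age-specific rates.
# t_x: incidence rates per person-year; m_x: all-cause mortality rates.
current_probability <- function(t_x, m_x, width = 5, radix = 1e5) {
  stopifnot(length(t_x) == length(m_x))
  g <- length(m_x)
  # Survivors at the start of each age group (and at the end of the last one),
  # assuming a constant mortality rate within each group.
  l_x <- radix * exp(-c(0, cumsum(width * m_x)))
  # Person-years lived in each age group (trapezoidal approximation).
  L_x <- width * (l_x[-(g + 1)] + l_x[-1]) / 2
  sum(L_x * t_x) / radix
}

# Illustrative (made-up) rates for four 5-year age groups.
t_x <- c(0.001, 0.002, 0.005, 0.010)
m_x <- c(0.002, 0.004, 0.010, 0.030)
p <- current_probability(t_x, m_x)
p
```

With zero mortality the result reduces to the cumulative rate; with mortality it is lower, because fewer people remain at risk at older ages. The sketch stops at the end of the last age group rather than following the cohort to extinction.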
Two assumptions are made when using routine incidence data
The SEER analytical program adjusts the denominator in the current probability method.
The AMP (Adjusted for Multiple Primaries) method addresses the issue of multiple primary tumors in the same person, for registries that cannot precisely identify them or when individual-level data are not available.
$$S = \sum_{i=1}^{f} \frac{R_i}{R_i + M_i - D_i}\, \hat{S}^{*}_{0}(a_i) \times \left\{ 1 - \exp\left( -\frac{w_i}{N_i} (R_i + M_i - D_i) \right) \right\}$$
Assumptions for AMP method
The non-cancer mortality rates are the same in individuals without cancer as they are in the general population
The risk of (a new) cancer is the same in individuals who have never previously had cancer as it is in the general population
one cannot die of cancer if one has never had cancer;
people with cancer have the same risk of developing cancer again as those who have never had cancer before;
the probability of dying from other causes (not cancer) is the same between cancer patients and those who have never had cancer.
The AMP method is implemented in the R package ltRISK.
It can be installed from its GitHub repository
or from the local source file ltRISK_0.1.0.tar.gz
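For the local route, base R's `install.packages` can install a source tarball directly. The sketch below assumes the file sits in the current working directory; the `file.exists` guard is only there so the snippet is safe to run when it does not.

```r
# Path to the source tarball named above; assumed to be in the working directory.
pkg_file <- "ltRISK_0.1.0.tar.gz"

if (file.exists(pkg_file)) {
  # repos = NULL tells R to install from the given file rather than a repository.
  install.packages(pkg_file, repos = NULL, type = "source")
} else {
  message("Source file not found: ", pkg_file)
}
```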
library(ltRISK)
pop <- c(20005, 86920, 102502, 151494, 182932, 203107, 240289, 247076, 199665,
163820, 145382, 86789, 69368, 51207, 39112, 20509, 12301, 6586, 1909)
inci <- c(156, 58, 47, 49, 48, 68, 120, 162, 160, 294, 417, 522, 546, 628,
891, 831, 926, 731, 269)
mx <- inci / pop
r1 <- cumrate(mx, eage = 70)
r1
Cumulative Rate(1/1)
0.49771
Cumulative Rate(1/1)
0.29511
Cumulative Risk (1/100)
39.21
Cumulative Risk (1/100)
25.56
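The printed values can be cross-checked in base R without the package. This sketch assumes the 19 age groups are 0, 1-4, 5-9, ..., 80-84, 85+ (the slides do not state this). `cum_rate` is a hypothetical helper, not the package's `cumrate`, and its `n_groups` argument counts how many age groups from birth are summed.

```r
pop <- c(20005, 86920, 102502, 151494, 182932, 203107, 240289, 247076, 199665,
         163820, 145382, 86789, 69368, 51207, 39112, 20509, 12301, 6586, 1909)
inci <- c(156, 58, 47, 49, 48, 68, 120, 162, 160, 294, 417, 522, 546, 628,
          891, 831, 926, 731, 269)
mx <- inci / pop

# Assumed age-group widths: 0, 1-4, then 5-year groups. The open-ended last
# group gets a nominal width of 5 because it is never summed below.
width <- c(1, 4, rep(5, 17))

# Hypothetical helper: cumulative rate over the first n_groups age groups.
cum_rate <- function(mx, width, n_groups) sum((width * mx)[seq_len(n_groups)])

rate_to_70 <- cum_rate(mx, width, 15)  # ages 0-69
rate_to_75 <- cum_rate(mx, width, 16)  # ages 0-74
risk_to_70 <- 100 * (1 - exp(-rate_to_70))
risk_to_75 <- 100 * (1 - exp(-rate_to_75))

round(c(rate_to_70, rate_to_75), 5)
round(c(risk_to_70, risk_to_75), 2)
```

Under this assumption the sums agree with the two cumulative rates (0.29511 and 0.49771) and the two cumulative risks (25.56 and 39.21) printed above, which suggests those outputs correspond to summing through the 65-69 and 70-74 age groups respectively.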
The method requires aggregated data in 5-year age groups: the number of cancer cases, cancer deaths, all-cause deaths, and the population.
library(ltRISK)
ni <- c(
73872987, 82029530, 72267070, 78303514, 99425613, 119915673, 98068725,
96644427, 121225951, 121250720, 96012917, 79863455, 75972753, 52929797,
37551107, 29047207, 19584254, 13854299)
mi <- c(
60594, 17718, 18883, 28127, 37493, 75223, 83574, 100655, 211467, 278913,
419663, 445223, 770865, 929008, 1058922, 1346942, 1576852, 2305312)
di <- c(
3511, 2801, 2553, 3183, 4960, 9456, 13509, 23935, 62386, 111640, 147866,
203955, 301892, 304985, 302785, 323804, 275557, 197614)
ri <- c(
9303, 6887, 6248, 8509, 16961, 39439, 56670, 86535, 189251, 289320, 344395,
411232, 552071, 491213, 433786, 395544, 292672, 173503)
The ltr function estimates lifetime risk using the AMP method and returns an object storing the result: a list of 3 elements containing the age groups, the age-conditional probability, and the variance in each age group.
# mi         Annual number of deaths from all causes in each age group.
# di         Annual number of cancer deaths in each age group.
# ri         Annual number of diagnosed cancer cases in each age group.
# ni         Population size in each age group.
# age_width  Width (in years) of each age group.
# type       "developing" or "dying": estimate the probability of developing cancer or of dying from it.
ltr(mi, di, ri, ni, age_width = 5, type = "developing")
The estimate function returns the point estimate of lifetime risk and its 95% CI. When a starting age is specified, it is assumed that the individuals are cancer-free and alive at that age, so the lifetime cancer risk is the risk from that age until death.
You can also use the post_ci function to format the lifetime risk and its 95% CI as a single string.
[1] "26.85(26.70-27.00)"
[1] "25.68(25.61-25.75)"
[1] "23.80(23.74-23.86)"
[1] "19.85(19.81-19.90)"
[1] "13.20(13.17-13.23)"
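As a cross-check, the AMP formula from the introduction can be evaluated directly in base R for the data above. This is a sketch, not the package's implementation: `amp_contrib` and `from_age` are hypothetical helpers, and treating the last age group as open-ended (everyone alive and cancer-free at its start eventually leaves that state) is an assumption chosen because it reproduces the printed point estimates. Confidence intervals are not computed, since the variance formula is not shown in these slides.

```r
ni <- c(73872987, 82029530, 72267070, 78303514, 99425613, 119915673, 98068725,
        96644427, 121225951, 121250720, 96012917, 79863455, 75972753, 52929797,
        37551107, 29047207, 19584254, 13854299)
mi <- c(60594, 17718, 18883, 28127, 37493, 75223, 83574, 100655, 211467, 278913,
        419663, 445223, 770865, 929008, 1058922, 1346942, 1576852, 2305312)
di <- c(3511, 2801, 2553, 3183, 4960, 9456, 13509, 23935, 62386, 111640, 147866,
        203955, 301892, 304985, 302785, 323804, 275557, 197614)
ri <- c(9303, 6887, 6248, 8509, 16961, 39439, 56670, 86535, 189251, 289320, 344395,
        411232, 552071, 491213, 433786, 395544, 292672, 173503)

# Hypothetical helper: per-age-group contributions to the lifetime risk of
# developing cancer under the AMP formula (R = cases, M = all-cause deaths,
# D = cancer deaths, N = population, w = age-group width).
amp_contrib <- function(M, D, R, N, w = 5) {
  f <- length(N)
  exits <- R + M - D                       # events ending "alive and cancer-free"
  hazard <- w * exits / N                  # cumulative hazard within each group
  surv <- exp(-c(0, cumsum(hazard)[-f]))   # cancer-free survival to start of group
  p_exit <- 1 - exp(-hazard)
  p_exit[f] <- 1                           # assumption: last group is open-ended
  R / exits * surv * p_exit
}

s <- amp_contrib(mi, di, ri, ni)
start_age <- seq(0, 85, by = 5)

# Sum the contributions from a given starting age onward, in percent.
from_age <- function(a) 100 * sum(s[start_age >= a])
round(sapply(c(0, 40, 50, 60, 70), from_age), 2)
```

These come out at about 26.85, 25.68, 23.80, 19.85 and 13.20, in line with the point estimates printed above, which suggests those were computed for starting ages 0, 40, 50, 60 and 70.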
You can use the ztest function to test the difference between two groups.
We prepared an example dataset containing the number of cancer cases, cancer deaths, all-cause deaths, and the mid-year population size in 2022, taken from Global Cancer Observatory Today and the World Population Prospects 2022.
[1] "region" "cancers" "sex" "age" "inci" "mort" "death"
[8] "pop"
# A tibble: 6 × 8
  region                cancers   sex   age  inci  mort death     pop
  <chr>                   <int> <dbl> <dbl> <dbl> <dbl> <dbl>   <int>
1 Australia/New Zealand       1     1     0     0     0   632  930933
2 Australia/New Zealand       1     1     1     0     0    73  993835
3 Australia/New Zealand       1     1     2     0     0    99 1001248
4 Australia/New Zealand       1     1     3     1     0   364  949610
5 Australia/New Zealand       1     1     4     4     0   589 1019806
6 Australia/New Zealand       1     1     5    12     0   752 1169165
GCO_Today is a data frame with 40,824 rows and 8 variables; the variables are described in Table 2.
Suggested citation from Global Cancer Observatory, Cancer Today 1
library(ltRISK)
library(dplyr)
data(GCO_Today)
data <- GCO_Today |>
  filter(region == "World", cancers == 39) |>
  mutate(sex = factor(sex, levels = c(1, 2, 3), labels = c("Male", "Female", "Total"))) |>
  group_by(sex)
model <- data |>
  reframe(model_develop = list(ltr(death, mort, inci, pop, type = "developing")),
          model_dying = list(ltr(death, mort, inci, pop, type = "dying")))
model$model_develop[[1]]
$age
[1] 0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85
$si
[1] 0.0006451535 0.0004885180 0.0004816078 0.0005774606 0.0007812455
[6] 0.0012558958 0.0018863017 0.0026858277 0.0042788911 0.0072884901
[11] 0.0130064996 0.0196248331 0.0295701570 0.0376395945 0.0407228256
[16] 0.0383286564 0.0290418626 0.0278566411
$vari
[1] 4.162422e-07 5.216538e-11 1.249082e-10 5.752388e-11 5.298314e-11
[6] 1.345461e-10 2.526803e-10 3.535544e-10 5.925897e-10 1.049516e-09
[11] 1.698918e-09 1.958084e-09 2.333881e-09 2.225952e-09 1.846702e-09
[16] 1.520990e-09 9.331459e-10 3.543868e-10
attr(,"class")
[1] "ltr"
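The printed object can be turned into a headline estimate by hand. Judging from the numbers in these slides, the lifetime risk equals the sum of `$si`, and a normal-approximation 95% interval follows from the sum of `$vari`. This is an observation about the printed values, not a documented description of how `estimate` or `post_ci` work. The values below are copied from the output above.

```r
si <- c(0.0006451535, 0.0004885180, 0.0004816078, 0.0005774606, 0.0007812455,
        0.0012558958, 0.0018863017, 0.0026858277, 0.0042788911, 0.0072884901,
        0.0130064996, 0.0196248331, 0.0295701570, 0.0376395945, 0.0407228256,
        0.0383286564, 0.0290418626, 0.0278566411)
vari <- c(4.162422e-07, 5.216538e-11, 1.249082e-10, 5.752388e-11, 5.298314e-11,
          1.345461e-10, 2.526803e-10, 3.535544e-10, 5.925897e-10, 1.049516e-09,
          1.698918e-09, 1.958084e-09, 2.333881e-09, 2.225952e-09, 1.846702e-09,
          1.520990e-09, 9.331459e-10, 3.543868e-10)

est <- sum(si)                               # point estimate (proportion)
se <- sqrt(sum(vari))                        # standard error of the sum
ci <- est + c(-1, 1) * qnorm(0.975) * se     # normal-approximation 95% CI

# Format as "estimate(lower-upper)" in percent, in the style used in these slides.
out <- sprintf("%.2f(%.2f-%.2f)", 100 * est, 100 * ci[1], 100 * ci[2])
out
```

For these values the string is "25.62(25.49-25.74)", which matches the male row of the results table in this section.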
The overall lifetime cancer risk refers to the risk of developing cancer for an individual throughout their entire lifetime, from birth to death.
res <- model |>
  mutate(lr_developing = post_ci(estimate(model_develop)),
         lr_dying = post_ci(estimate(model_dying)))
res
# A tibble: 3 × 5
  sex    model_develop model_dying lr_developing      lr_dying
  <fct>  <list>        <list>      <chr>              <chr>
1 Male   <ltr>         <ltr>       25.62(25.49-25.74) 15.85(15.80-15.90)
2 Female <ltr>         <ltr>       23.91(23.80-24.02) 13.10(13.06-13.14)
3 Total  <ltr>         <ltr>       24.81(24.69-24.92) 14.50(14.45-14.54)
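To compare two of these estimates, a generic two-sample z-test can be sketched in base R. The slides mention a `ztest` function but do not show its signature, so it is not used here. `z_from_ci` is a hypothetical helper that backs the standard error out of each printed 95% CI, and it assumes the two groups are independent and the estimates approximately normal.

```r
# Hypothetical helper: z-test for the difference between two estimates,
# each given as c(estimate, lower, upper) with a 95% confidence interval.
z_from_ci <- function(a, b) {
  se <- function(x) (x[3] - x[2]) / (2 * qnorm(0.975))
  z <- (a[1] - b[1]) / sqrt(se(a)^2 + se(b)^2)
  c(z = z, p = 2 * pnorm(-abs(z)))
}

# Lifetime risk of developing cancer (%), copied from the table above.
male   <- c(25.62, 25.49, 25.74)
female <- c(23.91, 23.80, 24.02)
res_z <- z_from_ci(male, female)
res_z
```

Because the printed intervals are rounded to two decimals, the standard errors recovered this way are approximate. Here z is around 20, so the male-female difference is far larger than its sampling error.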
When a starting age is specified, it is assumed that the individuals are cancer-free and alive at that age, so the lifetime cancer risk is the risk from that age until death.
res <- model |>
  mutate(lr_deve_40 = post_ci(estimate(model_develop, sage = 40)),
         lr_deve_50 = post_ci(estimate(model_develop, sage = 50)),
         lr_deve_60 = post_ci(estimate(model_develop, sage = 60)),
         lr_deve_70 = post_ci(estimate(model_develop, sage = 70)),
         lr_deve_80 = post_ci(estimate(model_develop, sage = 80))) |>
  select(-model_develop, -model_dying)
res
# A tibble: 3 × 6
  sex    lr_deve_40         lr_deve_50         lr_deve_60  lr_deve_70 lr_deve_80
  <fct>  <chr>              <chr>              <chr>       <chr>      <chr>
1 Male   24.74(24.71-24.76) 23.58(23.56-23.60) 20.32(20.3… 13.59(13.… 5.69(5.68…
2 Female 22.39(22.35-22.42) 20.19(20.16-20.22) 16.42(16.4… 11.29(11.… 5.57(5.56…
3 Total  23.61(23.59-23.63) 21.95(21.93-21.96) 18.43(18.4… 12.50(12.… 5.65(5.64…
data(GCO_Today)
data <- GCO_Today |>
  filter(cancers == 39) |>
  mutate(sex = factor(sex, levels = c(1, 2, 3), labels = c("Male", "Female", "Total"))) |>
  group_by(region, sex)
model <- data |>
  reframe(model_develop = list(ltr(death, mort, inci, pop, type = "developing")),
          model_dying = list(ltr(death, mort, inci, pop, type = "dying")))
res1 <- model |>
  mutate(ltr1 = post_ci(estimate(model_develop)),
         ltr2 = post_ci(estimate(model_develop, sage = 40)),
         ltr3 = post_ci(estimate(model_develop, sage = 50)),
         ltr4 = post_ci(estimate(model_develop, sage = 60)),
         ltr5 = post_ci(estimate(model_develop, sage = 70)),
         ltr6 = post_ci(estimate(model_develop, sage = 80))) |>
  select(-model_develop, -model_dying)
Table 3 shows the estimated lifetime risk of developing cancer from different starting ages, stratified by sex.
Table 4 shows the estimated lifetime risk of developing cancer from different starting ages, stratified by region.
res2 <- model |>
  mutate(ltr1 = post_ci(estimate(model_dying)),
         ltr2 = post_ci(estimate(model_dying, sage = 40)),
         ltr3 = post_ci(estimate(model_dying, sage = 50)),
         ltr4 = post_ci(estimate(model_dying, sage = 60)),
         ltr5 = post_ci(estimate(model_dying, sage = 70)),
         ltr6 = post_ci(estimate(model_dying, sage = 80))) |>
  select(-model_develop, -model_dying)
Table 5 shows the estimated lifetime risk of dying from cancer from different starting ages, stratified by sex.
Table 6 shows the estimated lifetime risk of dying from cancer from different starting ages, stratified by region.