This article describes how to create an events SDTM domain using the {sdtm.oak} package. Examples are currently presented and tested in the context of the AE domain.
Before reading this article, users are advised to review the “Creating an Interventions Domain” article, which explains key {sdtm.oak} concepts such as `oak_id_vars` and `condition_add`. It also offers guidance on which mapping algorithm or function to use for each kind of mapping and explains in more detail how these algorithms and functions work.
In this article, we will dive directly into programming and provide further explanation only where it is required.
Repeat the above steps for different raw datasets before proceeding with the below steps.
Read all the raw datasets into the environment. In this example, the raw dataset name is `ae_raw`. Users can read it from the {pharmaverseraw} package using the below code:
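A minimal sketch of this step, assuming the {pharmaverseraw} package is installed and exposes the dataset as `pharmaverseraw::ae_raw`. The fallback branch is purely illustrative: it rebuilds three records and a handful of columns from the table shown in this article so that the snippet still runs without the package.

```r
# Load the raw AE data. The package path is an assumption based on the
# dataset name used in this article.
ae_raw <- if (requireNamespace("pharmaverseraw", quietly = TRUE)) {
  pharmaverseraw::ae_raw
} else {
  # Illustrative stand-in: first three records from the article's table.
  data.frame(
    PATNUM = c("701-1015", "701-1015", "701-1015"),
    FOLDER = "AE",
    IT.AETERM = c(
      "Application Site Erythema",
      "Application Site Pruritus",
      "Diarrhoea"
    ),
    AEOUTCOME = c(
      "Not Recovered/not Resolved",
      "Not Recovered/not Resolved",
      "Recovered/Resolved"
    ),
    AEDTCOL = "01/16/2014",
    IT.AESTDAT = c("01/03/2014", "01/03/2014", "01/09/2014"),
    IT.AEENDAT = c(NA, NA, "01/11/2014"),
    stringsAsFactors = FALSE
  )
}

# Quick structural check before any mapping is attempted.
str(ae_raw[, c("PATNUM", "IT.AETERM", "IT.AESTDAT")])
```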
PATNUM | FOLDER | IT.AETERM | AEOUTCOME | AEDECOD | IT.AESEV | IT.AESER | IT.AEREL | AEDTCOL | IT.AESTDAT | IT.AEENDAT |
---|---|---|---|---|---|---|---|---|---|---|
701-1015 | AE | Application Site Erythema | Not Recovered/not Resolved | APPLICATION SITE ERYTHEMA | Mild Adverse Event | No | Probably Related | 01/16/2014 | 01/03/2014 | NA |
701-1015 | AE | Application Site Pruritus | Not Recovered/not Resolved | APPLICATION SITE PRURITUS | Mild Adverse Event | No | Probably Related | 01/16/2014 | 01/03/2014 | NA |
701-1015 | AE | Diarrhoea | Recovered/Resolved | DIARRHOEA | Mild Adverse Event | No | Remote | 01/16/2014 | 01/09/2014 | 01/11/2014 |
701-1023 | AE | Atrioventricular Block Second Degree | Not Recovered/not Resolved | ATRIOVENTRICULAR BLOCK SECOND DEGREE | Mild Adverse Event | No | Possibly Related | 08/27/2012 | 08/26/2012 | NA |
701-1023 | AE | Erythema | Not Recovered/not Resolved | ERYTHEMA | Mild Adverse Event | No | Possibly Related | 08/27/2012 | 08/07/2012 | 08/30/2012 |
701-1023 | AE | Erythema | Not Recovered/not Resolved | ERYTHEMA | Moderate Adverse Event | No | Probably Related | 08/27/2012 | 08/07/2012 | NA |
701-1023 | AE | Erythema | Recovered/Resolved | ERYTHEMA | Mild Adverse Event | No | Possibly Related | 09/02/2012 | 08/07/2012 | 08/30/2012 |
701-1028 | AE | Application Site Erythema | Not Recovered/not Resolved | APPLICATION SITE ERYTHEMA | Mild Adverse Event | No | Possibly Related | 08/01/2013 | 07/21/2013 | NA |
701-1028 | AE | Application Site Pruritus | Not Recovered/not Resolved | APPLICATION SITE PRURITUS | Mild Adverse Event | No | Probably Related | 08/14/2013 | 08/08/2013 | NA |
701-1034 | AE | Application Site Pruritus | Not Recovered/not Resolved | APPLICATION SITE PRURITUS | Mild Adverse Event | No | Probably Related | 09/25/2014 | 08/27/2014 | NA |
oak_id | raw_source | patient_number | PATNUM | FOLDER | IT.AETERM | AEOUTCOME | AEDECOD |
---|---|---|---|---|---|---|---|
1 | ae_raw | 701-1015 | 701-1015 | AE | Application Site Erythema | Not Recovered/not Resolved | APPLICATION SITE ERYTHEMA |
2 | ae_raw | 701-1015 | 701-1015 | AE | Application Site Pruritus | Not Recovered/not Resolved | APPLICATION SITE PRURITUS |
3 | ae_raw | 701-1015 | 701-1015 | AE | Diarrhoea | Recovered/Resolved | DIARRHOEA |
4 | ae_raw | 701-1023 | 701-1023 | AE | Atrioventricular Block Second Degree | Not Recovered/not Resolved | ATRIOVENTRICULAR BLOCK SECOND DEGREE |
5 | ae_raw | 701-1023 | 701-1023 | AE | Erythema | Not Recovered/not Resolved | ERYTHEMA |
6 | ae_raw | 701-1023 | 701-1023 | AE | Erythema | Not Recovered/not Resolved | ERYTHEMA |
7 | ae_raw | 701-1023 | 701-1023 | AE | Erythema | Recovered/Resolved | ERYTHEMA |
8 | ae_raw | 701-1028 | 701-1028 | AE | Application Site Erythema | Not Recovered/not Resolved | APPLICATION SITE ERYTHEMA |
9 | ae_raw | 701-1028 | 701-1028 | AE | Application Site Pruritus | Not Recovered/not Resolved | APPLICATION SITE PRURITUS |
10 | ae_raw | 701-1034 | 701-1034 | AE | Application Site Pruritus | Not Recovered/not Resolved | APPLICATION SITE PRURITUS |
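The first three columns in the table above (`oak_id`, `raw_source`, `patient_number`) are the oak id variables. In {sdtm.oak} they are added with `generate_oak_id_vars()`. The base R sketch below only mimics the visible result (row number, source name, copy of the patient column) on a small stand-in data frame, so it should be read as an illustration rather than the package's implementation. The helper name `add_oak_id_vars()` is hypothetical.

```r
# Small stand-in for the raw dataset (values taken from the table above).
raw <- data.frame(
  PATNUM = c("701-1015", "701-1015", "701-1023"),
  IT.AETERM = c("Application Site Erythema", "Diarrhoea", "Erythema"),
  stringsAsFactors = FALSE
)

# Hypothetical helper mirroring what the table shows.
add_oak_id_vars <- function(raw_dat, pat_var, raw_src) {
  ids <- data.frame(
    oak_id = seq_len(nrow(raw_dat)),     # one id per raw record
    raw_source = raw_src,                # name of the raw dataset
    patient_number = raw_dat[[pat_var]], # copy of the patient column
    stringsAsFactors = FALSE
  )
  cbind(ids, raw_dat)
}

raw_with_ids <- add_oak_id_vars(raw, pat_var = "PATNUM", raw_src = "ae_raw")
print(raw_with_ids)

# With {sdtm.oak} loaded, the corresponding call would be:
# ae_raw <- generate_oak_id_vars(ae_raw, pat_var = "PATNUM", raw_src = "ae_raw")
```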
Read in the DM domain
Controlled Terminology is part of the SDTM specification and is prepared by the user. In this example, the study controlled terminology file is `sdtm_ct.csv`. Users can read it from the package using the below code:
codelist_code | term_code | term_value | collected_value | term_preferred_term | term_synonyms |
---|---|---|---|---|---|
C66726 | C25158 | CAPSULE | Capsule | Capsule Dosage Form | cap |
C66726 | C25394 | PILL | Pill | Pill Dosage Form | |
C66726 | C29167 | LOTION | Lotion | Lotion Dosage Form | |
C66726 | C42887 | AEROSOL | Aerosol | Aerosol Dosage Form | aer |
C66726 | C42944 | INHALANT | Inhalant | Inhalant Dosage Form | |
C66726 | C42946 | INJECTION | Injection | Injectable Dosage Form | |
C66726 | C42953 | LIQUID | Liquid | Liquid Dosage Form | |
C66726 | C42968 | PATCH | Patch | patch | Patch Dosage Form |
C66726 | C42998 | TABLET | Tablet | Tablet Dosage Form | tab |
C66728 | C25629 | BEFORE | Prior | Prior | |
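To make the role of the controlled terminology concrete, the sketch below recodes a collected value to its `term_value` within one codelist. It is a simplified base R illustration using three rows from the table above. The helper `recode_ct()` is hypothetical, and its matching rule (case-insensitive match on `collected_value` or `term_synonyms`, otherwise keep the input) is an assumption, not a description of how `assign_ct` is implemented.

```r
# Three rows of the study controlled terminology shown above.
ct <- data.frame(
  codelist_code = c("C66726", "C66726", "C66726"),
  term_value = c("CAPSULE", "TABLET", "LOTION"),
  collected_value = c("Capsule", "Tablet", "Lotion"),
  term_synonyms = c("cap", "tab", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: recode collected values against one codelist.
recode_ct <- function(x, ct_spec, ct_clst) {
  cl <- ct_spec[ct_spec$codelist_code == ct_clst, ]
  # Build a lookup from lower-cased collected values and synonyms to terms.
  keys <- c(tolower(cl$collected_value), tolower(cl$term_synonyms))
  vals <- c(cl$term_value, cl$term_value)
  keep <- !is.na(keys)
  lookup <- setNames(vals[keep], keys[keep])
  out <- unname(lookup[tolower(x)])
  # Values with no match are passed through unchanged.
  ifelse(is.na(out), x, out)
}

recode_ct(c("Capsule", "tab", "Gel"), ct, "C66726")
# "CAPSULE" "TABLET" "Gel"
```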
The topic variable is mapped first. It is the primary variable in the SDTM domain; the remaining variables add further definition to it. In this example, the topic variable is `AETERM`, mapped from the raw dataset column `IT.AETERM`. The mapping logic is: map the collected value in the `ae_raw` dataset `IT.AETERM` variable to `AE.AETERM`. This mapping does not involve any controlled terminology, so the `assign_no_ct` function is used. Once the topic variable is mapped, the Qualifier, Identifier, and Timing variables can be mapped.
```r
ae <-
  # Derive topic variable
  # Map AETERM using assign_no_ct, raw_var=IT.AETERM, tgt_var=AETERM
  assign_no_ct(
    raw_dat = ae_raw,
    raw_var = "IT.AETERM",
    tgt_var = "AETERM",
    id_vars = oak_id_vars()
  )
```
oak_id | raw_source | patient_number | AETERM |
---|---|---|---|
1 | ae_raw | 701-1015 | Application Site Erythema |
2 | ae_raw | 701-1015 | Application Site Pruritus |
3 | ae_raw | 701-1015 | Diarrhoea |
4 | ae_raw | 701-1023 | Atrioventricular Block Second Degree |
5 | ae_raw | 701-1023 | Erythema |
6 | ae_raw | 701-1023 | Erythema |
7 | ae_raw | 701-1023 | Erythema |
8 | ae_raw | 701-1028 | Application Site Erythema |
9 | ae_raw | 701-1028 | Application Site Pruritus |
10 | ae_raw | 701-1034 | Application Site Pruritus |
The Qualifiers, Identifiers, and Timing Variables can be mapped in any order.
```r
ae <- ae %>%
  # Map AEOUT using assign_ct, raw_var=AEOUTCOME, tgt_var=AEOUT
  assign_ct(
    raw_dat = ae_raw,
    raw_var = "AEOUTCOME",
    tgt_var = "AEOUT",
    ct_spec = study_ct,
    ct_clst = "C66768",
    id_vars = oak_id_vars()
  ) %>%
  # Map AESEV using assign_ct, raw_var=IT.AESEV, tgt_var=AESEV
  assign_ct(
    raw_dat = ae_raw,
    raw_var = "IT.AESEV",
    tgt_var = "AESEV",
    ct_spec = study_ct,
    ct_clst = "C66769",
    id_vars = oak_id_vars()
  ) %>%
  # Map AESER using assign_ct, raw_var=IT.AESER, tgt_var=AESER
  assign_ct(
    raw_dat = ae_raw,
    raw_var = "IT.AESER",
    tgt_var = "AESER",
    ct_spec = study_ct,
    ct_clst = "C66742",
    id_vars = oak_id_vars()
  ) %>%
  # Map AEACN using assign_no_ct, raw_var=IT.AEACN, tgt_var=AEACN
  assign_no_ct(
    raw_dat = ae_raw,
    raw_var = "IT.AEACN",
    tgt_var = "AEACN",
    id_vars = oak_id_vars()
  ) %>%
  # Map AEREL using assign_ct, raw_var=IT.AEREL, tgt_var=AEREL
  # The user-added AEREL codelist is defined in study_ct
  assign_ct(
    raw_dat = ae_raw,
    raw_var = "IT.AEREL",
    tgt_var = "AEREL",
    ct_spec = study_ct,
    ct_clst = "AEREL",
    id_vars = oak_id_vars()
  ) %>%
  # Map AESCAN using assign_ct, raw_var=AESCAN, tgt_var=AESCAN
  assign_ct(
    raw_dat = ae_raw,
    raw_var = "AESCAN",
    tgt_var = "AESCAN",
    ct_spec = study_ct,
    ct_clst = "C66742",
    id_vars = oak_id_vars()
  ) %>%
  # Map AESCONG using assign_ct, raw_var=AESCNO, tgt_var=AESCONG
  assign_ct(
    raw_dat = ae_raw,
    raw_var = "AESCNO",
    tgt_var = "AESCONG",
    ct_spec = study_ct,
    ct_clst = "C66742",
    id_vars = oak_id_vars()
  ) %>%
  # Map AESDISAB using assign_ct, raw_var=AEDIS, tgt_var=AESDISAB
  assign_ct(
    raw_dat = ae_raw,
    raw_var = "AEDIS",
    tgt_var = "AESDISAB",
    ct_spec = study_ct,
    ct_clst = "C66742",
    id_vars = oak_id_vars()
  ) %>%
  # Map AESDTH using assign_ct, raw_var=IT.AESDTH, tgt_var=AESDTH
  assign_ct(
    raw_dat = ae_raw,
    raw_var = "IT.AESDTH",
    tgt_var = "AESDTH",
    ct_spec = study_ct,
    ct_clst = "C66742",
    id_vars = oak_id_vars()
  ) %>%
  # Map AESHOSP using assign_ct, raw_var=IT.AESHOSP, tgt_var=AESHOSP
  assign_ct(
    raw_dat = ae_raw,
    raw_var = "IT.AESHOSP",
    tgt_var = "AESHOSP",
    ct_spec = study_ct,
    ct_clst = "C66742",
    id_vars = oak_id_vars()
  ) %>%
  # Map AESLIFE using assign_ct, raw_var=IT.AESLIFE, tgt_var=AESLIFE
  assign_ct(
    raw_dat = ae_raw,
    raw_var = "IT.AESLIFE",
    tgt_var = "AESLIFE",
    ct_spec = study_ct,
    ct_clst = "C66742",
    id_vars = oak_id_vars()
  ) %>%
  # Map AESOD using assign_ct, raw_var=AESOD, tgt_var=AESOD
  assign_ct(
    raw_dat = ae_raw,
    raw_var = "AESOD",
    tgt_var = "AESOD",
    ct_spec = study_ct,
    ct_clst = "C66742",
    id_vars = oak_id_vars()
  ) %>%
  # Map AEDTC using assign_datetime, raw_var=AEDTCOL
  assign_datetime(
    raw_dat = ae_raw,
    raw_var = "AEDTCOL",
    tgt_var = "AEDTC",
    raw_fmt = c("m/d/y"),
    id_vars = oak_id_vars()
  ) %>%
  # Map AESTDTC using assign_datetime, raw_var=IT.AESTDAT
  assign_datetime(
    raw_dat = ae_raw,
    raw_var = "IT.AESTDAT",
    tgt_var = "AESTDTC",
    raw_fmt = c("m/d/y"),
    id_vars = oak_id_vars()
  ) %>%
  # Map AEENDTC using assign_datetime, raw_var=IT.AEENDAT
  assign_datetime(
    raw_dat = ae_raw,
    raw_var = "IT.AEENDAT",
    tgt_var = "AEENDTC",
    raw_fmt = c("m/d/y"),
    id_vars = oak_id_vars()
  )
```
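The three `assign_datetime` calls convert collected dates such as `01/16/2014` into ISO 8601 text. As a rough base R illustration of that conversion for complete dates only (it does not handle partial or unknown date components and is not the package's parser):

```r
# Collected dates in month/day/year form, as in the raw AE table.
collected <- c("01/16/2014", "01/03/2014", NA)

# Parse with an explicit format, then render as ISO 8601 (YYYY-MM-DD).
# Missing collected dates stay missing.
iso <- format(as.Date(collected, format = "%m/%d/%Y"), "%Y-%m-%d")
iso
# "2014-01-16" "2014-01-03" NA
```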
This raw data source has only one topic variable, so there are no additional topic variable mappings and users can proceed to the next step. Repeating the topic variable mapping and the mapping of the remaining variables is required only when there is more than one topic variable to map.
The SDTM derived variables, or any SDTM mapping that applies to all the records in the `ae` dataset produced in the previous step, can be created now.
```r
ae <- ae %>%
  dplyr::mutate(
    STUDYID = ae_raw$STUDY,
    DOMAIN = "AE",
    USUBJID = paste0("01-", ae_raw$PATNUM),
    AELLT = ae_raw$AELLT,
    AELLTCD = ae_raw$AELLTCD,
    AEDECOD = ae_raw$AEDECOD,
    AEPTCD = ae_raw$AEPTCD,
    AEHLT = ae_raw$AEHLT,
    AEHLTCD = ae_raw$AEHLTCD,
    AEHLGT = ae_raw$AEHLGT,
    AEHLGTCD = ae_raw$AEHLGTCD,
    AEBODSYS = ae_raw$AEBODSYS,
    AEBDSYCD = ae_raw$AEBDSYCD,
    AESOC = ae_raw$AESOC,
    AESOCCD = ae_raw$AESOCCD,
    AETERM = toupper(AETERM)
  ) %>%
  # Derive the sequence number within each subject
  derive_seq(
    tgt_var = "AESEQ",
    rec_vars = c("USUBJID", "AETERM")
  ) %>%
  # Derive the study day of the AE start relative to first exposure
  derive_study_day(
    sdtm_in = .,
    dm_domain = dm,
    tgdt = "AESTDTC",
    refdt = "RFXSTDTC",
    study_day_var = "AESTDY"
  ) %>%
  # Derive the study day of the AE end against the same reference date,
  # so that AESTDY and AEENDY are on the same scale
  derive_study_day(
    sdtm_in = .,
    dm_domain = dm,
    tgdt = "AEENDTC",
    refdt = "RFXSTDTC",
    study_day_var = "AEENDY"
  ) %>%
  dplyr::select(
    "STUDYID", "DOMAIN", "USUBJID", "AESEQ", "AETERM", "AELLT", "AELLTCD", "AEDECOD", "AEPTCD", "AEHLT", "AEHLTCD", "AEHLGT",
    "AEHLGTCD", "AEBODSYS", "AEBDSYCD", "AESOC", "AESOCCD", "AESEV", "AESER", "AEACN", "AEREL", "AEOUT", "AESCAN", "AESCONG",
    "AESDISAB", "AESDTH", "AESHOSP", "AESLIFE", "AESOD", "AEDTC", "AESTDTC", "AEENDTC", "AESTDY", "AEENDY"
  )
```

STUDYID | DOMAIN | USUBJID | AESEQ | AETERM | AELLT | AELLTCD | AEDECOD | AEPTCD | AEHLT | AEHLTCD | AEHLGT | AEHLGTCD | AEBODSYS | AEBDSYCD | AESOC | AESOCCD | AESEV | AESER | AEACN | AEREL | AEOUT | AESCAN | AESCONG | AESDISAB | AESDTH | AESHOSP | AESLIFE | AESOD | AEDTC | AESTDTC | AEENDTC | AESTDY | AEENDY |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
CDISCPILOT01 | AE | 01-701-1015 | 1 | APPLICATION SITE ERYTHEMA | APPLICATION SITE REDNESS | 10003058 | APPLICATION SITE ERYTHEMA | NA | HLT_0617 | NA | HLGT_0152 | NA | GENERAL DISORDERS AND ADMINISTRATION SITE CONDITIONS | NA | GENERAL DISORDERS AND ADMINISTRATION SITE CONDITIONS | 10018065 | MILD | N | NA | PROBABLE | NOT RECOVERED/NOT RESOLVED | No | N | N | N | N | N | No | 2014-01-16 | 2014-01-03 | NA | 2 | NA |
CDISCPILOT01 | AE | 01-701-1015 | 2 | APPLICATION SITE PRURITUS | APPLICATION SITE ITCHING | 10003047 | APPLICATION SITE PRURITUS | NA | HLT_0317 | NA | HLGT_0338 | NA | GENERAL DISORDERS AND ADMINISTRATION SITE CONDITIONS | NA | GENERAL DISORDERS AND ADMINISTRATION SITE CONDITIONS | 10018065 | MILD | N | NA | PROBABLE | NOT RECOVERED/NOT RESOLVED | No | N | N | N | N | N | No | 2014-01-16 | 2014-01-03 | NA | 2 | NA |
CDISCPILOT01 | AE | 01-701-1015 | 3 | DIARRHOEA | DIARRHEA | 10012727 | DIARRHOEA | NA | HLT_0148 | NA | HLGT_0588 | NA | GASTROINTESTINAL DISORDERS | NA | GASTROINTESTINAL DISORDERS | 10017947 | MILD | N | NA | REMOTE | RECOVERED/RESOLVED | No | N | N | N | N | N | No | 2014-01-16 | 2014-01-09 | 2014-01-11 | 8 | 10 |
CDISCPILOT01 | AE | 01-701-1023 | 1 | ATRIOVENTRICULAR BLOCK SECOND DEGREE | AV BLOCK SECOND DEGREE | 10003851 | ATRIOVENTRICULAR BLOCK SECOND DEGREE | NA | HLT_0415 | NA | HLGT_0086 | NA | CARDIAC DISORDERS | NA | CARDIAC DISORDERS | 10007541 | MILD | N | NA | POSSIBLE | NOT RECOVERED/NOT RESOLVED | No | N | N | N | N | N | No | 2012-08-27 | 2012-08-26 | NA | 22 | NA |
CDISCPILOT01 | AE | 01-701-1023 | 2 | ERYTHEMA | ERYTHEMA | 10015150 | ERYTHEMA | NA | HLT_0284 | NA | HLGT_0192 | NA | SKIN AND SUBCUTANEOUS TISSUE DISORDERS | NA | SKIN AND SUBCUTANEOUS TISSUE DISORDERS | 10040785 | MILD | N | NA | POSSIBLE | NOT RECOVERED/NOT RESOLVED | No | N | N | N | N | N | No | 2012-08-27 | 2012-08-07 | 2012-08-30 | 3 | 26 |
CDISCPILOT01 | AE | 01-701-1023 | 3 | ERYTHEMA | LOCALIZED ERYTHEMA | 10024781 | ERYTHEMA | NA | HLT_0284 | NA | HLGT_0192 | NA | SKIN AND SUBCUTANEOUS TISSUE DISORDERS | NA | SKIN AND SUBCUTANEOUS TISSUE DISORDERS | 10040785 | MODERATE | N | NA | PROBABLE | NOT RECOVERED/NOT RESOLVED | No | N | N | N | N | N | No | 2012-08-27 | 2012-08-07 | NA | 3 | NA |
CDISCPILOT01 | AE | 01-701-1023 | 4 | ERYTHEMA | ERYTHEMA | 10015150 | ERYTHEMA | NA | HLT_0284 | NA | HLGT_0192 | NA | SKIN AND SUBCUTANEOUS TISSUE DISORDERS | NA | SKIN AND SUBCUTANEOUS TISSUE DISORDERS | 10040785 | MILD | N | NA | POSSIBLE | RECOVERED/RESOLVED | No | N | N | N | N | N | No | 2012-09-02 | 2012-08-07 | 2012-08-30 | 3 | 26 |
CDISCPILOT01 | AE | 01-701-1028 | 1 | APPLICATION SITE ERYTHEMA | APPLICATION SITE ERYTHEMA | 10003041 | APPLICATION SITE ERYTHEMA | NA | HLT_0617 | NA | HLGT_0152 | NA | GENERAL DISORDERS AND ADMINISTRATION SITE CONDITIONS | NA | GENERAL DISORDERS AND ADMINISTRATION SITE CONDITIONS | 10018065 | MILD | N | NA | POSSIBLE | NOT RECOVERED/NOT RESOLVED | No | N | N | N | N | N | No | 2013-08-01 | 2013-07-21 | NA | 3 | NA |
CDISCPILOT01 | AE | 01-701-1028 | 2 | APPLICATION SITE PRURITUS | APPLICATION SITE ITCHING | 10003047 | APPLICATION SITE PRURITUS | NA | HLT_0317 | NA | HLGT_0338 | NA | GENERAL DISORDERS AND ADMINISTRATION SITE CONDITIONS | NA | GENERAL DISORDERS AND ADMINISTRATION SITE CONDITIONS | 10018065 | MILD | N | NA | PROBABLE | NOT RECOVERED/NOT RESOLVED | No | N | N | N | N | N | No | 2013-08-14 | 2013-08-08 | NA | 21 | NA |
CDISCPILOT01 | AE | 01-701-1034 | 1 | APPLICATION SITE PRURITUS | APPLICATION SITE ITCHING | 10003047 | APPLICATION SITE PRURITUS | NA | HLT_0317 | NA | HLGT_0338 | NA | GENERAL DISORDERS AND ADMINISTRATION SITE CONDITIONS | NA | GENERAL DISORDERS AND ADMINISTRATION SITE CONDITIONS | 10018065 | MILD | N | NA | PROBABLE | NOT RECOVERED/NOT RESOLVED | No | N | N | N | N | N | No | 2014-09-25 | 2014-08-27 | NA | 58 | NA |
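The sequence number and start study day derivations can be illustrated in base R. The sketch below is not the {sdtm.oak} implementation: the helper `study_day()` is hypothetical, and the reference date `2014-01-02` is an assumption inferred from the table above (an event starting on 2014-01-03 has `AESTDY` 2). It applies the usual SDTM convention that there is no day 0.

```r
# A few start dates from the table above.
ae_demo <- data.frame(
  USUBJID = c("01-701-1015", "01-701-1015", "01-701-1015", "01-701-1023"),
  AESTDTC = c("2014-01-03", "2014-01-03", "2014-01-09", "2012-08-26"),
  stringsAsFactors = FALSE
)

# Sequence number: running count of records within each subject.
ae_demo$AESEQ <- ave(
  seq_along(ae_demo$USUBJID),
  ae_demo$USUBJID,
  FUN = seq_along
)

# Study day: date difference in days, plus 1 on or after the reference
# date, so the reference date itself is day 1 and the day before is -1.
study_day <- function(date, ref) {
  d <- as.integer(as.Date(date) - as.Date(ref))
  ifelse(d >= 0L, d + 1L, d)
}

# Start study days for the first subject (assumed reference: 2014-01-02).
study_day(ae_demo$AESTDTC[1:3], ref = "2014-01-02")
# 2 2 8
```

The three values agree with the `AESTDY` column for subject 01-701-1015 in the table above.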
Yet to be developed.