The Cell Key Method (CKM) is a statistical technique used to protect the confidentiality of tabular data by perturbing all of its cells. This package provides tools to apply the CKM in R, enabling users to select a suitable set of parameters and to generate perturbed count tables from microdata.
For more information on the Cell Key Method, refer to chapter 5.4 of the Handbook on Statistical Disclosure Control.
For the moment, the package is designed to perturb frequency tables only.
For detailed documentation, please refer to the package vignette.
The transition matrices are built using the ptable package.
French readers can also refer to a methodological document for more information on the Cell Key Method.
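
To give an intuition of the mechanism, here is a minimal base R sketch of the general idea behind the Cell Key Method. It is not the implementation used by this package: the toy dataset, the variable names and the noise distribution are all assumptions made for illustration. In particular, a real transition matrix (as built with ptable) defines one noise distribution per original count, whereas this sketch uses a single distribution for every cell.

```r
# Toy illustration of the Cell Key Method (NOT the ckm package's implementation).
set.seed(123)

# Hypothetical microdata: 200 records, one categorical variable,
# and one record key per record drawn uniformly on [0, 1).
micro <- data.frame(
  group = sample(c("A", "B", "C"), size = 200, replace = TRUE),
  rkey  = runif(200)
)

# Original counts, and cell keys computed as the fractional part
# of the sum of the record keys falling in each cell.
counts    <- tapply(micro$rkey, micro$group, length)
cell_keys <- tapply(micro$rkey, micro$group, function(k) sum(k) %% 1)

# Hypothetical noise distribution (assumption): symmetric, zero mean,
# maximum deviation of 2.
noise_values <- -2:2
noise_probs  <- c(0.1, 0.2, 0.4, 0.2, 0.1)

# Look up the noise by locating each cell key within the cumulative
# probabilities (the last break, equal to 1, is dropped).
breaks <- cumsum(noise_probs)[-length(noise_probs)]
noise  <- noise_values[findInterval(as.vector(cell_keys), breaks) + 1]

result <- data.frame(
  group     = names(counts),
  original  = as.vector(counts),
  cell_key  = as.vector(cell_keys),
  noise     = noise,
  perturbed = as.vector(counts) + noise
)
print(result)
```

Because the keys are attached to the records, a cell made of the same records always gets the same cell key, and therefore the same noise, whichever table it appears in.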
# install.packages("remotes")
remotes::install_github("inseefrlab/ckm", dependencies = TRUE)
library(ckm)
data("dtest", package = "ckm")
set.seed(4081789) # Ensure reproducibility
dtest_with_keys <- build_individual_keys(dtest)
hist(dtest_with_keys$rkey)
tab_before <- tabulate_cnt_micro_data(
  df = dtest_with_keys,
  cat_vars = c("DIPLOME", "SEXE", "AGE"),
  hrc_vars = list(GEO = c("REG", "DEP")),
  marge_label = "Total"
)
res_ckm <- apply_ckm(tab_before, D = 5, V = 2)
Alternatively, after generating the individual keys on your dataset, you can build the perturbed table directly in a single step:
res_ckm <- tabulate_and_apply_ckm(
  df = dtest_with_keys,
  cat_vars = c("DIPLOME", "SEXE", "AGE"),
  hrc_vars = list(GEO = c("REG", "DEP")),
  marge_label = "Total",
  D = 5, V = 2
)