GitHub - yng-me/tsgx

About the `tsgx` package

`tsgx` stands for "table summary generator". The package generates statistical summary tables and follows tidyverse conventions.

The package allows you to:

generate frequency tables, cross-tabulations (2-way table or more);
extract multiple-letter response variables from survey data;
include frequency and/or percent distributions in the generated tables;
specify whether the 'percent to total' is computed by row (default) or by column;
export to an Excel file with default formatting/styling, which can also be customized.

Installation

You can install the tsgx package from either GitHub or CRAN.

# Install devtools if it is not yet installed on your machine
if (!requireNamespace('devtools', quietly = TRUE)) {
  install.packages('devtools')
}

# Install the package from GitHub
devtools::install_github('yng-me/tsgx')

# Or install from CRAN
install.packages('tsgx')

Then load the package after installation.

library(tsgx)

`tsgx` core functions

1. `generate_frequency`

This function generates a frequency distribution table (marginal table) of a categorical variable `x`, specified in its second argument. If `x_group` is not specified, it returns five columns by default: (1) categories of `x`, (2) frequency of each category, (3) percent to total, (4) cumulative frequency, and (5) cumulative percent to total.

generate_frequency(
  .data,
  x,
  x_group = NULL,
  x_label = get_config('x_label'),
  sort_frequency = FALSE,
  x_as_group = FALSE,
  include_total = TRUE,
  include_cumulative = TRUE,
  exclude_zero_value = FALSE
)

Parameters:

`.data`	Required. A data frame, data frame extension (e.g. a `tibble`), a lazy data frame (e.g. from `dbplyr` or `dtplyr`), or Arrow data format.
`x`	Required. Variable to be used as categories.
`x_group`	Accepts a vector of string/character as grouping variables present in the input `.data`.
`x_label`	Stubhead label or label for `x`.
`x_as_group`	Whether to use the `x` variable as the top-level grouping.
`sort_frequency`	Whether to sort the output. If set to `TRUE`, the output is sorted by frequency in descending order.
`include_total`	Whether to include the row total.
`include_cumulative`	Whether to include cumulative frequencies and cumulative percentages.
`exclude_zero_value`	Whether to drop categories with zero (0) values.
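
To make the five default columns concrete, here is a minimal base R sketch that builds the same kind of table by hand. It does not call `tsgx` and is not the package's implementation. The column names and the use of the built-in `mtcars` data are assumptions made for illustration, so the actual output of `generate_frequency` may be labeled and formatted differently.

```r
# Built-in data: number of cylinders per car in mtcars
x <- factor(mtcars$cyl)

freq <- table(x)                 # frequency of each category
pct  <- 100 * prop.table(freq)   # percent to total

# Column names below are hypothetical, chosen only for this sketch
freq_table <- data.frame(
  category           = names(freq),               # (1) categories of x
  frequency          = as.vector(freq),           # (2) frequency
  percent            = as.vector(pct),            # (3) percent to total
  cumulative         = cumsum(as.vector(freq)),   # (4) cumulative frequency
  cumulative_percent = cumsum(as.vector(pct))     # (5) cumulative percent
)

print(freq_table)
```

The last row of the cumulative columns always equals the grand total and 100 percent, which is a quick sanity check for any frequency table.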

Example 1.1: Basic usage

library(palmerpenguins)

generate_frequency(penguins, species)

Example 1.2: Add grouping variable and define label for x

penguins |> 
  generate_frequency(
    x = sex, 
    x_group = 'species', 
    x_label = 'Sex'
  )

Example 1.3: Add grouping variable, use x as group, and exclude the total

penguins |> 
  generate_frequency(
    x = sex, 
    x_group = 'species', 
    x_as_group = TRUE, 
    include_total = FALSE
  )

Example 1.4: Exclude cumulative values and sort the output by frequency

penguins |> 
  generate_frequency(
    x = species, 
    x_label = 'Species', 
    sort_frequency = TRUE, 
    include_cumulative = FALSE
  )

Example 1.5: Exclude cumulative values and define multiple grouping variables

dplyr::starwars |> 
  generate_frequency(
    x = sex, 
    x_group = c('skin_color', 'gender'), 
    x_label = 'Sex', 
    include_cumulative = FALSE
  )

2. `generate_crosstab`

generate_crosstab extends the functionality of generate_frequency by allowing you to generate cross-tabulations of two (2) or more categorical variables.

generate_crosstab(
  .data,
  x,
  y = NULL,
  x_group = NULL,
  y_group = NULL,
  x_label = get_config('x_label'),
  y_group_separator = '>',
  x_as_group = FALSE,
  total_by = 'row',
  group_values_by = 'statistics',
  include_frequency = TRUE,
  include_proportion = TRUE,
  include_column_total = TRUE,
  convert_to_percent = TRUE,
  format_precision = 2,
  total_label = NULL,
  ...
)

Parameters:

`.data`	Required. A data frame, data frame extension (e.g. a tibble), a lazy data frame (e.g. from dbplyr or dtplyr), or Arrow data format.
`x`	Required. Variable to be used as categories.
`y`	Variable to be used as columns (like in `pivot_wider`). If not supplied, `generate_frequency` will be used instead.
`x_group`	Accepts a vector of string/character as grouping variables.
`y_group`	Accepts a vector of string/character as grouping variables in the column.
`x_label`	Stubhead label or label for `x`.
`x_as_group`	Whether to use the `x` variable as the top-level grouping.
`y_group_separator`	A character string that defines the column separator to be used to show table hierarchy.
`total_by`	Accepts `row` | `column`. Whether totals are computed by row or by column.
`group_values_by`	Accepts `statistics` | `indicators`.
`include_frequency`	Whether to include frequency columns.
`include_proportion`	Whether to include proportion/percentage columns.
`include_column_total`	Whether to include column total.
`convert_to_percent`	Whether to format to percent or proportion.
`format_precision`	[Not yet implemented] Specify the precision of rounding the percent or proportion. Default is `2`.
`total_label`	Custom label for the column total.
`...`	Valid arguments for `generate_frequency`.
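
The difference between the two `total_by` settings can be sketched with base R's `prop.table()`. This is an illustration only, not how `tsgx` computes its output. It assumes that `total_by = 'row'` means the percentages in each row sum to 100 and that `'column'` means each column sums to 100. The built-in `mtcars` data stands in for real survey data.

```r
# Built-in data: cylinders (rows) by transmission type (columns)
counts <- table(cyl = mtcars$cyl, am = mtcars$am)

# Percent by row: each row sums to 100
row_pct <- 100 * prop.table(counts, margin = 1)

# Percent by column: each column sums to 100
col_pct <- 100 * prop.table(counts, margin = 2)

print(round(row_pct, 2))
print(round(col_pct, 2))
```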

Example 2.1: Basic usage

penguins |> 
  generate_crosstab(
    x = species, 
    y = sex
  )

Example 2.2: Percent/proportion total by column

penguins |> 
  generate_crosstab(
    x = species, 
    y = sex,
    total_by = 'column'
  )

Example 2.3: Exclude frequencies

penguins |> 
  generate_crosstab(
    x = species, 
    y = sex,
    include_frequency = FALSE
  )

Example 2.4: Exclude percentages/proportions

penguins |> 
  generate_crosstab(
    x = species, 
    y = sex,
    include_proportion = FALSE
  )

Example 2.5: Add row grouping variable and exclude percentages/proportions

penguins |> 
  generate_crosstab(
    x = species, 
    y = sex, 
    x_group = 'island',
    include_proportion = FALSE
  )

Example 2.6: Add column grouping variable, exclude frequencies, and set convert_to_percent to FALSE

penguins |> 
  generate_crosstab(
    x = species, 
    y = sex, 
    y_group = 'island',
    convert_to_percent = FALSE,
    include_frequency = FALSE
  )

3. `generate_multiple_response`

This function generates a summary table from multiple-response variables.

generate_multiple_response(
  .data,
  x,
  ...,
  y = NULL,
  x_group = NULL,
  x_label = get_config('x_label'),
  x_as_group = FALSE,
  y_group_separator = '>',
  group_values_by = 'statistics',
  value_to_count = 1,
  include_frequency = TRUE,
  include_proportion = TRUE,
  format_precision = 2,
  convert_to_percent = TRUE
)

Parameters:

`.data`	Required. A data frame, data frame extension (e.g. a tibble), a lazy data frame (e.g. from dbplyr or dtplyr), or Arrow data format.
`x`	Required. Variable to be used as categories.
`...`	Columns with binary-coded response (generally). Use tidyselect specification.
`y`	Column variable to specify for a letter-coded response.
`x_group`	Accepts a vector of string/character as grouping variables.
`x_label`	Stubhead label or label for `x`.
`x_as_group`	Use `x` as top-level grouping. Applicable only if `x_group` is specified.
`y_group_separator`	A character string that defines the column separator to be used to show table hierarchy.
`group_values_by`	Accepts `statistics` | `indicators`.
`include_frequency`	Whether to include frequency columns.
`include_proportion`	Whether to include proportion/percentage columns.
`convert_to_percent`	Whether to format to percent or proportion.
`format_precision`	[Not yet implemented] Specify the precision of rounding the percent or proportion. Default is `2`.
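
A letter-coded response such as "ABC" packs several selected options into one string. The base R sketch below shows one way such strings could be split and counted per category. It is an illustration under the assumption that each letter is one option, not the package's internal method, and it reuses the small data frame from the examples in this section.

```r
df <- data.frame(
  category = c("G1", "G1", "G2", "G1", "G2", "G1"),
  response = c("AB", "AC", "B", "ABC", "AB", "C"),
  stringsAsFactors = FALSE
)

# Split each response into single letters, e.g. "ABC" -> "A", "B", "C"
letters_list <- strsplit(df$response, split = "")

# Repeat each respondent's category once per selected letter
long <- data.frame(
  category = rep(df$category, lengths(letters_list)),
  letter   = unlist(letters_list),
  stringsAsFactors = FALSE
)

# Count how often each letter was selected within each category
counts <- table(long$category, long$letter)
print(counts)
```

The resulting counts match the column sums of binary-coded `A`, `B`, and `C` columns for the same respondents, which is why the letter-coded and wide formats can describe the same data.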

Example 3.1: Basic usage (extract multiple-letter response)

df <- data.frame(
  category = c("G1", "G1", "G2", "G1", "G2", "G1"),
  response = c("AB", "AC", "B", "ABC", "AB", "C"),
  A = c(1, 1, 0, 1, 1, 0),
  B = c(1, 0, 1, 1, 1, 0),
  C = c(0, 1, 0, 1, 0, 1)
) 

df |> generate_multiple_response(category, y = response)

Example 3.2: Basic usage (wide format multiple response)

df |> generate_multiple_response(category, A:C)

4. `generate_as_list`

generate_as_list(
  .data,
  list_group,
  x,
  ...,
  fn = 'generate_crosstab',
  list_name_overall = 'ALL',
  exclude_overall = FALSE,
  collapse_overall = TRUE,
  save_as_excel = FALSE,
  formatted = TRUE,
  filename = NULL
)

Example 4.1: Basic usage

penguins |> 
  generate_as_list(
    list_group = island,
    x = species, 
    sex
  )
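
The argument names (`list_group`, `list_name_overall`, `exclude_overall`) suggest that the function returns one table per level of `list_group`, plus an overall table. That reading is an assumption, since the behavior is not described here. Under that assumption, the idea can be sketched in base R with `split()` and `lapply()`, using the built-in `mtcars` data and plain `table()` as a stand-in for `generate_crosstab`.

```r
# Split a built-in data set by a grouping column (stand-in for list_group)
by_group <- split(mtcars, mtcars$gear)

# One cross-tabulation per group (stand-in for fn = 'generate_crosstab')
tables <- lapply(by_group, function(d) table(cyl = d$cyl, am = d$am))

# Prepend an overall table under a hypothetical 'ALL' entry
overall <- table(cyl = mtcars$cyl, am = mtcars$am)
tables  <- c(list(ALL = overall), tables)

print(names(tables))
```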
