diff --git a/NEWS.md b/NEWS.md index 60a74294..45a4364d 100644 --- a/NEWS.md +++ b/NEWS.md @@ -18,6 +18,18 @@ - When `statistic == "min"` and `annotation %in% c("ws", "wd")`, the ws/wd returned will correspond to the minimum daily pollutant, rather than the minimum daily ws/wd. +- `timeProp()` refinements: + + - `proportion` is now treated more like `type` internally. For a user, this means it can now be passed `"default"` to avoid any conditioning and create a regular period average barchart. + + - `sub` can now be defined via `...`; set `sub = NA` to remove the text annotation which appears by default at the bottom of a `timeProp()` plot. + + - Gained the `key` argument to remove a legend. + + - `"season"` is now a permitted `avg.time` option in `timeProp()`, better aligning it with the options in `timeAverage()`. + + - `...` is now correctly passed to `cutData()` when using `type`/`proportion`. + - `trajPlot()` and `trajLevel()` have gained the `grid.nx` and `grid.ny` arguments which can be used to control the number of ticks on the coordinate grid, or remove it altogether. ## Bug Fixes diff --git a/R/timeAverage.R b/R/timeAverage.R index fc896564..ed41de04 100644 --- a/R/timeAverage.R +++ b/R/timeAverage.R @@ -45,8 +45,8 @@ #' @param avg.time This defines the time period to average to. Can be `"sec"`, #' `"min"`, `"hour"`, `"day"`, `"DSTday"`, `"week"`, `"month"`, `"quarter"` or #' `"year"`. For much increased flexibility a number can precede these options -#' followed by a space. For example, a timeAverage of 2 months would be -#' `period = "2 month"`. In addition, `avg.time` can equal `"season"`, in +#' followed by a space. For example, an average of 2 months would be +#' `avg.time = "2 month"`. In addition, `avg.time` can equal `"season"`, in #' which case 3-month seasonal values are calculated with spring defined as #' March, April, May and so on. #' diff --git a/R/timePlot.R b/R/timePlot.R index a4b1dd80..9371ceae 100644 --- a/R/timePlot.R +++ b/R/timePlot.R @@ -76,7 +76,7 @@ #' directly. This offers great flexibility for understanding the variation of #' different variables and how they depend on one another. #' -#' Only one `type` is currently allowed in `timePlot`. +#' `type` must be of length one. #' @param cols Colours to be used for plotting; see [openColours()] for details. #' @param plot.type The `lattice` plot type, which is a line (`plot.type = "l"`) #' by default. Another useful option is `plot.type = "h"`, which draws diff --git a/R/timeProp.R b/R/timeProp.R index 22392b54..7ecf77a1 100644 --- a/R/timeProp.R +++ b/R/timeProp.R @@ -3,92 +3,49 @@ #' This function shows time series plots as stacked bar charts. The different #' categories in the bar chart are made up from a character or factor variable #' in a data frame. The function is primarily developed to support the plotting -#' of cluster analysis output from [polarCluster()] and -#' [trajCluster()] that consider local and regional (back trajectory) -#' cluster analysis respectively. However, the function has more general use for -#' understanding time series data. +#' of cluster analysis output from [polarCluster()] and [trajCluster()] that +#' consider local and regional (back trajectory) cluster analysis respectively. +#' However, the function has more general use for understanding time series +#' data. #' #' In order to plot time series in this way, some sort of time aggregation is #' needed, which is controlled by the option `avg.time`. #' -#' The plot shows the value of `pollutant` on the y-axis (averaged -#' according to `avg.time`). The time intervals are made up of bars split -#' according to `proportion`. The bars therefore show how the total value -#' of `pollutant` is made up for any time interval. +#' The plot shows the value of `pollutant` on the y-axis (averaged according to +#' `avg.time`). The time intervals are made up of bars split according to +#' `proportion`. The bars therefore show how the total value of `pollutant` is +#' made up for any time interval. #' -#' @param mydata A data frame containing the fields `date`, -#' `pollutant` and a splitting variable `proportion` +#' @inheritParams timePlot +#' +#' @param mydata A data frame containing the fields `date`, `pollutant` and a +#' splitting variable `proportion` #' @param pollutant Name of the pollutant to plot contained in `mydata`. #' @param proportion The splitting variable that makes up the bars in the bar -#' chart e.g. `proportion = "cluster"` if the output from -#' `polarCluster` is being analysed. If `proportion` is a numeric -#' variable it is split into 4 quantiles (by default) by `cutData`. If -#' `proportion` is a factor or character variable then the categories are -#' used directly. -#' @param avg.time This defines the time period to average to. Can be -#' \dQuote{sec}, \dQuote{min}, \dQuote{hour}, \dQuote{day}, \dQuote{DSTday}, -#' \dQuote{week}, \dQuote{month}, \dQuote{quarter} or \dQuote{year}. For much -#' increased flexibility a number can precede these options followed by a -#' space. For example, a timeAverage of 2 months would be `period = "2 -#' month"`. -#' -#' Note that `avg.time` when used in `timeProp` should be greater -#' than the time gap in the original data. For example, `avg.time = -#' "day"` for hourly data is OK, but `avg.time = "hour"` for daily data -#' is not. -#' @param type `type` determines how the data are split i.e. conditioned, -#' and then plotted. The default is will produce a single plot using the -#' entire data. Type can be one of the built-in types as detailed in -#' `cutData` e.g. "season", "year", "weekday" and so on. For example, -#' `type = "season"` will produce four plots --- one for each season. -#' -#' It is also possible to choose `type` as another variable in the data -#' frame. If that variable is numeric, then the data will be split into four -#' quantiles (if possible) and labelled accordingly. If type is an existing -#' character or factor variable, then those categories/levels will be used -#' directly. This offers great flexibility for understanding the variation of -#' different variables and how they depend on one another. +#' chart e.g. `proportion = "cluster"` if the output from `polarCluster` is +#' being analysed. If `proportion` is a numeric variable it is split into 4 +#' quantiles (by default) by `cutData`. If `proportion` is a factor or +#' character variable then the categories are used directly. +#' @param avg.time This defines the time period to average to. Can be `"sec"`, +#' `"min"`, `"hour"`, `"day"`, `"DSTday"`, `"week"`, `"month"`, `"quarter"` or +#' `"year"`. For much increased flexibility a number can precede these options +#' followed by a space. For example, an average of 2 months would be +#' `avg.time = "2 month"`. In addition, `avg.time` can equal `"season"`, in +#' which case 3-month seasonal values are calculated with spring defined as +#' March, April, May and so on. #' -#' `type` must be of length one. -#' @param normalise If `normalise = TRUE` then each time interval is scaled -#' to 100. This is helpful to show the relative (percentage) contribution of -#' the proportions. -#' @param cols Colours to be used for plotting. Options include -#' \dQuote{default}, \dQuote{increment}, \dQuote{heat}, \dQuote{jet} and -#' `RColorBrewer` colours --- see the `openair` `openColours` -#' function for more details. For user defined the user can supply a list of -#' colour names recognised by R (type `colours()` to see the full list). -#' An example would be `cols = c("yellow", "green", "blue")` -#' @param date.breaks Number of major x-axis intervals to use. The function will -#' try and choose a sensible number of dates/times as well as formatting the -#' date/time appropriately to the range being considered. This does not -#' always work as desired automatically. The user can therefore increase or -#' decrease the number of intervals by adjusting the value of -#' `date.breaks` up or down. -#' @param date.format This option controls the date format on the x-axis. While -#' `timePlot` generally sets the date format sensibly there can be some -#' situations where the user wishes to have more control. For format types see -#' `strptime`. For example, to format the date like \dQuote{Jan-2012} set -#' `date.format = "\%b-\%Y"`. -#' @param key.columns Number of columns to be used in the key. With many -#' pollutants a single column can make to key too wide. The user can thus -#' choose to use several columns by setting `columns` to be less than the -#' number of pollutants. -#' @param key.position Location where the scale key is to plotted. Allowed -#' arguments currently include \dQuote{top}, \dQuote{right}, \dQuote{bottom} -#' and \dQuote{left}. +#' Note that `avg.time` when used in `timeProp` should be greater than the +#' time gap in the original data. For example, `avg.time = "day"` for hourly +#' data is OK, but `avg.time = "hour"` for daily data is not. +#' @param normalise If `normalise = TRUE` then each time interval is scaled to +#' 100. This is helpful to show the relative (percentage) contribution of the +#' proportions. #' @param key.title The title of the key. -#' @param auto.text Either `TRUE` (default) or `FALSE`. If `TRUE` -#' titles and axis labels etc. will automatically try and format pollutant -#' names and units properly e.g. by subscripting the `2' in NO2. -#' @param plot Should a plot be produced? `FALSE` can be useful when -#' analysing data to extract plot components and plotting them in other ways. -#' @param ... Other graphical parameters passed onto `timeProp` and -#' `cutData`. For example, `timeProp` passes the option -#' `hemisphere = "southern"` on to `cutData` to provide southern -#' (rather than default northern) hemisphere handling of `type = -#' "season"`. Similarly, common axis and title labelling options (such as -#' `xlab`, `ylab`, `main`) are passed to `xyplot` via +#' @param ... Other graphical parameters passed onto `timeProp` and `cutData`. +#' For example, `timeProp` passes the option `hemisphere = "southern"` on to +#' `cutData` to provide southern (rather than default northern) hemisphere +#' handling of `type = "season"`. Similarly, common axis and title labelling +#' options (such as `xlab`, `ylab`, `main`) are passed to `xyplot` via #' `quickText` to handle routine formatting. #' @export #' @return an [openair][openair-package] object @@ -96,7 +53,7 @@ #' @family time series and trend functions #' @family cluster analysis functions #' @examples -#' ## monthly plot of SO2 showing the contribution by wind sector +#' # monthly plot of SO2 showing the contribution by wind sector #' timeProp(mydata, pollutant = "so2", avg.time = "month", proportion = "wd") timeProp <- function( mydata, @@ -104,149 +61,137 @@ timeProp <- function( proportion = "cluster", avg.time = "day", type = "default", - normalise = FALSE, cols = "Set1", - date.breaks = 7, - date.format = NULL, + normalise = FALSE, + key = TRUE, key.columns = 1, key.position = "right", key.title = proportion, + date.breaks = 7, + date.format = NULL, auto.text = TRUE, plot = TRUE, ... ) { - ## keep check happy - sums <- NULL - freq <- NULL - Var1 <- NULL - means <- NULL - date2 <- NULL - mean_value <- weighted_mean <- xleft <- NULL - - ## greyscale handling - if (length(cols) == 1 && cols == "greyscale") { - trellis.par.set(list(strip.background = list(col = "white"))) - } - + # can only have one type if (length(type) > 1) { - stop("'type' can only be of length 1.") - } - - ## if proportion is not categorical then make it so - if (!class(mydata[[proportion]]) %in% c("factor")) { - mydata <- cutData(mydata, proportion, ...) + cli::cli_abort("{.arg type} can only be of length {1L}.") } - ## extra.args setup + # extra.args setup extra.args <- list(...) - ## set graphaics - current.strip <- trellis.par.get("strip.background") + # reset graphic parameters current.font <- trellis.par.get("fontsize") - - ## reset graphic parameters on.exit(trellis.par.set( fontsize = current.font )) - ## label controls - - main <- if ("main" %in% names(extra.args)) { - quickText(extra.args$main, auto.text) - } else { - quickText("", auto.text) - } - - xlab <- if ("xlab" %in% names(extra.args)) { - quickText(extra.args$xlab, auto.text) - } else { - "date" - } - - ylab <- if ("ylab" %in% names(extra.args)) { - quickText(extra.args$ylab, auto.text) - } else { - quickText(pollutant, auto.text) - } - - xlim <- if ("xlim" %in% names(extra.args)) { - extra.args$xlim - } else { - NULL - } - - ylim <- if ("ylim" %in% names(extra.args)) { - extra.args$ylim - } else { - NULL - } + # label controls + main <- quickText(extra.args$main %||% "", auto.text) + xlab <- quickText(extra.args$xlab %||% "date", auto.text) + ylab <- quickText(extra.args$ylab %||% pollutant, auto.text) + sub <- extra.args$sub %||% "contribution weighted by mean" + # fontsize handling if ("fontsize" %in% names(extra.args)) { trellis.par.set(fontsize = list(text = extra.args$fontsize)) } - ## variables needed - vars <- c("date", pollutant, proportion) - - if (any(type %in% dateTypes)) { - vars <- unique(c("date", vars)) + # greyscale handling + if (length(cols) == 1 && cols == "greyscale") { + trellis.par.set(list(strip.background = list(col = "white"))) } - ## check the data - mydata <- checkPrep(mydata, vars, type, remove.calm = FALSE) + # variables needed + vars <- c("date", pollutant) - # time zone of input data - tzone <- attr(mydata$date, "tzone") + # check the data + mydata <- checkPrep(mydata, vars, c(type, proportion), remove.calm = FALSE) # cut data - mydata <- cutData(mydata, c(type, proportion)) + mydata <- cutData(mydata, c(type, proportion), ...) + + # time zone of input data + tzone <- attr(mydata$date, "tzone") + # groups for dplyr group_1 <- c("xleft", "xright", type) group_2 <- c(type, "xleft", "xright", proportion) - # summarise by proportion, type etc - # add the most common non-zero time interval + # calculate left and right extremes of each bar, add the most common + # non-zero time interval to left to get right + if (avg.time == "season") { + if (any(c("season", "seasonyear") %in% type)) { + cli::cli_abort( + "In {.fun openair::timeProp}, {.arg type} and {.arg avg.time} cannot both be 'season'." + ) + } + results <- mydata |> + cutData(type = "seasonyear") |> + dplyr::mutate(xleft = min(.data$date), .by = c("seasonyear", type)) |> + dplyr::mutate( + xright = .data$xleft + median(diff(.data$xleft)[diff(.data$xleft) != 0]) + ) |> + dplyr::select(-dplyr::any_of("seasonyear")) + } else { + results <- + dplyr::mutate( + mydata, + xleft = as.POSIXct(cut(.data$date, avg.time), tz = tzone), + xright = .data$xleft + median(diff(.data$xleft)[diff(.data$xleft) != 0]) + ) + } - results <- mydata |> - mutate( - xleft = as.POSIXct(cut(date, avg.time), tz = tzone), - xright = xleft + median(diff(xleft)[diff(xleft) != 0]) + # summarise by proportion, type etc + results <- + results |> + # calculate group mean + dplyr::mutate( + mean_value = mean(.data[[pollutant]], na.rm = TRUE), + .by = dplyr::all_of(group_1) ) |> - group_by(across(group_1)) |> # group by type and date interval to get overall average - mutate(mean_value = mean(.data[[pollutant]], na.rm = TRUE)) |> - group_by(across(group_2)) |> - summarise( + # calculate mean & count per type & pollutant, retain type mean + dplyr::summarise( {{ pollutant }} := mean(.data[[pollutant]], na.rm = TRUE), - mean_value = mean(mean_value, na.rm = TRUE), - n = length(date) + mean_value = mean(.data$mean_value, na.rm = TRUE), + n = dplyr::n(), + .by = dplyr::all_of(group_2) + ) |> + # needs specific arrangement for lattice + dplyr::arrange( + .data[[type]], + .data$xleft, + .data$xright, + .data[[proportion]] ) |> - group_by(across(group_1)) |> - mutate( + # weighted mean, with cumulative sum for bar heights + dplyr::mutate( weighted_mean = .data[[pollutant]] * n / sum(n), - Var1 = replace_na(weighted_mean, 0), - var2 = cumsum(Var1), - date = xleft + Var1 = tidyr::replace_na(.data$weighted_mean, 0), + var2 = cumsum(.data$Var1), + date = .data$xleft, + .by = dplyr::all_of(group_1) ) - ## normlaise to 100 if needed - vars <- c(type, "date") + # normalise to 100 if needed if (normalise) { - results <- results |> - group_by(across(vars)) |> - mutate( - Var1 = Var1 * (100 / sum(Var1, na.rm = TRUE)), - var2 = cumsum(Var1) + results <- + dplyr::mutate( + results, + Var1 = .data$Var1 * (100 / sum(.data$Var1, na.rm = TRUE)), + var2 = cumsum(.data$Var1), + .by = dplyr::all_of(c(type, "date")) ) } - ## proper names of labelling ################################################### + # make sure we know order of data frame for adding other dates + results <- dplyr::arrange(results, .data[[type]], "date") + + # proper names of labelling # strip.dat <- strip.fun(results, type, auto.text) strip <- strip.dat[[1]] strip.left <- strip.dat[[2]] - pol.name <- strip.dat[[3]] - - ## work out width of each bar - nProp <- length(levels(results[[proportion]])) # labelling on plot labs <- sapply( @@ -254,76 +199,53 @@ timeProp <- function( function(x) quickText(x, auto.text) ) - # make sure we know order of data frame for adding other dates - results <- arrange(results, type, "date") - - # xleft, xright used by plot function - # results$xleft <- results$date - # results$xright <- results$date2 - # ## don't need date2 - # results <- select(results, -date2) - # the few colours used for scaling + nProp <- length(levels(results[[proportion]])) scaleCol <- openColours(cols, nProp) - # levels of proportion - thelevels <- levels(results[[proportion]]) - - # add colour directly to data frame for easy reference - cols <- data.frame(cols = scaleCol, stringsAsFactors = FALSE) - cols[[proportion]] <- as.character(levels(results[[proportion]])) - - # need to merge based on character, not factor - results[[proportion]] <- as.character(results[[proportion]]) - - results <- full_join(results, cols, by = proportion) - - results[[proportion]] <- factor(results[[proportion]], levels = thelevels) - - # remove missing so we can do a cumsum - # results <- na.omit(results) - - # y values for plotting rectangles - # results <- results |> - # group_by(across(vars)) |> - # mutate(var2 = cumsum(Var1)) + # add colours to the dataframe + results <- + dplyr::mutate( + results, + cols = scaleCol[as.integer(.data[[proportion]])] + ) + # formula for lattice myform <- formula(paste("Var1 ~ date | ", type, sep = "")) - dates <- dateBreaks(results$date, date.breaks)$major ## for date scale - - ## date axis formating - if (is.null(date.format)) { - formats <- dateBreaks(results$date, date.breaks)$format - } else { - formats <- date.format - } - + # date axis formatting + breaks <- dateBreaks(results$date, date.breaks) + dates <- breaks$major + formats <- date.format %||% breaks$format scales <- list(x = list(at = dates, format = formats)) - y.max <- max(results$var2, na.rm = TRUE) - - if (is.null(xlim)) { - xlim <- range(c(results$xleft, results$xright)) - } - + # change in style if normalising data if (normalise) { + ylab <- quickText(paste("% contribution to", pollutant), auto.text) pad <- 1 } else { pad <- 1.04 } - if (is.null(ylim)) { - ylim <- c(0, pad * y.max) - } - - if (normalise) { - ylab <- quickText(paste("% contribution to", pollutant), auto.text) - } - ## sub heading + # set limits, if not set by user + xlim <- extra.args$xlim %||% range(c(results$xleft, results$xright)) + ylim <- extra.args$ylim %||% c(0, pad * max(results$var2, na.rm = TRUE)) - sub <- "contribution weighted by mean" + # set up key, if required + if (key) { + key <- list( + rectangles = list(col = rev(scaleCol), border = NA), + text = list(labs), + space = key.position, + title = quickText(key.title, auto.text), + cex.title = 1, + columns = key.columns + ) + } else { + key <- NULL + } + # construct plot plt <- xyplot( myform, data = results, @@ -332,18 +254,10 @@ timeProp <- function( strip.left = strip.left, groups = get(proportion), stack = TRUE, - sub = sub, scales = scales, col = scaleCol, border = NA, - key = list( - rectangles = list(col = rev(scaleCol), border = NA), - text = list(labs), - space = key.position, - title = quickText(key.title, auto.text), - cex.title = 1, - columns = key.columns - ), + key = key, par.strip.text = list(cex = 0.8), ..., panel = function(..., col, subscripts) { @@ -353,7 +267,7 @@ timeProp <- function( } ) - ## update extra args; usual method does not seem to work... + # update extra args; usual method does not seem to work plt <- modifyList( plt, list( @@ -361,7 +275,8 @@ timeProp <- function( xlab = xlab, x.limits = xlim, y.limits = ylim, - main = main + main = main, + sub = sub ) ) @@ -378,8 +293,8 @@ timeProp <- function( invisible(output) } -# plot individual rectangles as lattice panel.barchar is *very* slow - +#' plot individual rectangles as lattice panel.barchar is *very* slow +#' @noRd panelBar <- function(dat) { xleft <- unclass(dat$xleft) ybottom <- lag(dat$var2, default = 0) diff --git a/man/calcPercentile.Rd b/man/calcPercentile.Rd index 50556f6f..6b426e0e 100644 --- a/man/calcPercentile.Rd +++ b/man/calcPercentile.Rd @@ -26,8 +26,8 @@ pollutant (e.g., \code{"o3"}).} \item{avg.time}{This defines the time period to average to. Can be \code{"sec"}, \code{"min"}, \code{"hour"}, \code{"day"}, \code{"DSTday"}, \code{"week"}, \code{"month"}, \code{"quarter"} or \code{"year"}. For much increased flexibility a number can precede these options -followed by a space. For example, a timeAverage of 2 months would be -\code{period = "2 month"}. In addition, \code{avg.time} can equal \code{"season"}, in +followed by a space. For example, an average of 2 months would be +\code{avg.time = "2 month"}. In addition, \code{avg.time} can equal \code{"season"}, in which case 3-month seasonal values are calculated with spring defined as March, April, May and so on. diff --git a/man/timeAverage.Rd b/man/timeAverage.Rd index bec19b52..c5557036 100644 --- a/man/timeAverage.Rd +++ b/man/timeAverage.Rd @@ -27,8 +27,8 @@ or \code{Date}.} \item{avg.time}{This defines the time period to average to. Can be \code{"sec"}, \code{"min"}, \code{"hour"}, \code{"day"}, \code{"DSTday"}, \code{"week"}, \code{"month"}, \code{"quarter"} or \code{"year"}. For much increased flexibility a number can precede these options -followed by a space. For example, a timeAverage of 2 months would be -\code{period = "2 month"}. In addition, \code{avg.time} can equal \code{"season"}, in +followed by a space. For example, an average of 2 months would be +\code{avg.time = "2 month"}. In addition, \code{avg.time} can equal \code{"season"}, in which case 3-month seasonal values are calculated with spring defined as March, April, May and so on. diff --git a/man/timePlot.Rd b/man/timePlot.Rd index 3e762a98..4aa07047 100644 --- a/man/timePlot.Rd +++ b/man/timePlot.Rd @@ -68,8 +68,8 @@ time series together in one panel.} \item{avg.time}{This defines the time period to average to. Can be \code{"sec"}, \code{"min"}, \code{"hour"}, \code{"day"}, \code{"DSTday"}, \code{"week"}, \code{"month"}, \code{"quarter"} or \code{"year"}. For much increased flexibility a number can precede these options -followed by a space. For example, a timeAverage of 2 months would be -\code{period = "2 month"}. In addition, \code{avg.time} can equal \code{"season"}, in +followed by a space. For example, an average of 2 months would be +\code{avg.time = "2 month"}. In addition, \code{avg.time} can equal \code{"season"}, in which case 3-month seasonal values are calculated with spring defined as March, April, May and so on. @@ -121,7 +121,7 @@ character or factor variable, then those categories/levels will be used directly. This offers great flexibility for understanding the variation of different variables and how they depend on one another. -Only one \code{type} is currently allowed in \code{timePlot}.} +\code{type} must be of length one.} \item{cols}{Colours to be used for plotting; see \code{\link[=openColours]{openColours()}} for details.} diff --git a/man/timeProp.Rd b/man/timeProp.Rd index 9a0c7d13..e99c9533 100644 --- a/man/timeProp.Rd +++ b/man/timeProp.Rd @@ -10,49 +10,50 @@ timeProp( proportion = "cluster", avg.time = "day", type = "default", - normalise = FALSE, cols = "Set1", - date.breaks = 7, - date.format = NULL, + normalise = FALSE, + key = TRUE, key.columns = 1, key.position = "right", key.title = proportion, + date.breaks = 7, + date.format = NULL, auto.text = TRUE, plot = TRUE, ... ) } \arguments{ -\item{mydata}{A data frame containing the fields \code{date}, -\code{pollutant} and a splitting variable \code{proportion}} +\item{mydata}{A data frame containing the fields \code{date}, \code{pollutant} and a +splitting variable \code{proportion}} \item{pollutant}{Name of the pollutant to plot contained in \code{mydata}.} \item{proportion}{The splitting variable that makes up the bars in the bar -chart e.g. \code{proportion = "cluster"} if the output from -\code{polarCluster} is being analysed. If \code{proportion} is a numeric -variable it is split into 4 quantiles (by default) by \code{cutData}. If -\code{proportion} is a factor or character variable then the categories are -used directly.} - -\item{avg.time}{This defines the time period to average to. Can be -\dQuote{sec}, \dQuote{min}, \dQuote{hour}, \dQuote{day}, \dQuote{DSTday}, -\dQuote{week}, \dQuote{month}, \dQuote{quarter} or \dQuote{year}. For much -increased flexibility a number can precede these options followed by a -space. For example, a timeAverage of 2 months would be \code{period = "2 month"}. - -Note that \code{avg.time} when used in \code{timeProp} should be greater -than the time gap in the original data. For example, \code{avg.time = "day"} for hourly data is OK, but \code{avg.time = "hour"} for daily data -is not.} - -\item{type}{\code{type} determines how the data are split i.e. conditioned, -and then plotted. The default is will produce a single plot using the -entire data. Type can be one of the built-in types as detailed in -\code{cutData} e.g. "season", "year", "weekday" and so on. For example, -\code{type = "season"} will produce four plots --- one for each season. - -It is also possible to choose \code{type} as another variable in the data -frame. If that variable is numeric, then the data will be split into four +chart e.g. \code{proportion = "cluster"} if the output from \code{polarCluster} is +being analysed. If \code{proportion} is a numeric variable it is split into 4 +quantiles (by default) by \code{cutData}. If \code{proportion} is a factor or +character variable then the categories are used directly.} + +\item{avg.time}{This defines the time period to average to. Can be \code{"sec"}, +\code{"min"}, \code{"hour"}, \code{"day"}, \code{"DSTday"}, \code{"week"}, \code{"month"}, \code{"quarter"} or +\code{"year"}. For much increased flexibility a number can precede these options +followed by a space. For example, an average of 2 months would be +\code{avg.time = "2 month"}. In addition, \code{avg.time} can equal \code{"season"}, in +which case 3-month seasonal values are calculated with spring defined as +March, April, May and so on. + +Note that \code{avg.time} when used in \code{timeProp} should be greater than the +time gap in the original data. For example, \code{avg.time = "day"} for hourly +data is OK, but \code{avg.time = "hour"} for daily data is not.} + +\item{type}{\code{type} determines how the data are split i.e. conditioned, and +then plotted. The default is will produce a single plot using the entire +data. Type can be one of the built-in types as detailed in \code{\link[=cutData]{cutData()}}, +e.g., \code{"season"}, \code{"year"}, \code{"weekday"} and so on. For example, \code{type = "season"} will produce four plots --- one for each season. + +It is also possible to choose \code{type} as another variable in the data frame. +If that variable is numeric, then the data will be split into four quantiles (if possible) and labelled accordingly. If type is an existing character or factor variable, then those categories/levels will be used directly. This offers great flexibility for understanding the variation of @@ -60,53 +61,48 @@ different variables and how they depend on one another. \code{type} must be of length one.} -\item{normalise}{If \code{normalise = TRUE} then each time interval is scaled -to 100. This is helpful to show the relative (percentage) contribution of -the proportions.} +\item{cols}{Colours to be used for plotting; see \code{\link[=openColours]{openColours()}} for details.} -\item{cols}{Colours to be used for plotting. Options include -\dQuote{default}, \dQuote{increment}, \dQuote{heat}, \dQuote{jet} and -\code{RColorBrewer} colours --- see the \code{openair} \code{openColours} -function for more details. For user defined the user can supply a list of -colour names recognised by R (type \code{colours()} to see the full list). -An example would be \code{cols = c("yellow", "green", "blue")}} +\item{normalise}{If \code{normalise = TRUE} then each time interval is scaled to +100. This is helpful to show the relative (percentage) contribution of the +proportions.} -\item{date.breaks}{Number of major x-axis intervals to use. The function will -try and choose a sensible number of dates/times as well as formatting the -date/time appropriately to the range being considered. This does not -always work as desired automatically. The user can therefore increase or -decrease the number of intervals by adjusting the value of -\code{date.breaks} up or down.} - -\item{date.format}{This option controls the date format on the x-axis. While -\code{timePlot} generally sets the date format sensibly there can be some -situations where the user wishes to have more control. For format types see -\code{strptime}. For example, to format the date like \dQuote{Jan-2012} set -\code{date.format = "\\\%b-\\\%Y"}.} +\item{key}{Should a key be drawn? The default is \code{TRUE}.} \item{key.columns}{Number of columns to be used in the key. With many pollutants a single column can make to key too wide. The user can thus choose to use several columns by setting \code{columns} to be less than the number of pollutants.} -\item{key.position}{Location where the scale key is to plotted. Allowed -arguments currently include \dQuote{top}, \dQuote{right}, \dQuote{bottom} -and \dQuote{left}.} +\item{key.position}{Location where the scale key is to plotted. Can include +\code{"top"}, \code{"bottom"}, \code{"right"} and \code{"left"}.} \item{key.title}{The title of the key.} -\item{auto.text}{Either \code{TRUE} (default) or \code{FALSE}. If \code{TRUE} -titles and axis labels etc. will automatically try and format pollutant -names and units properly e.g. by subscripting the `2' in NO2.} +\item{date.breaks}{Number of major x-axis intervals to use. The function will +try and choose a sensible number of dates/times as well as formatting the +date/time appropriately to the range being considered. This does not always +work as desired automatically. The user can therefore increase or decrease +the number of intervals by adjusting the value of \code{date.breaks} up or down.} + +\item{date.format}{This option controls the date format on the x-axis. While +\code{\link[=timePlot]{timePlot()}} generally sets the date format sensibly there can be some +situations where the user wishes to have more control. For format types see +\code{\link[=strptime]{strptime()}}. For example, to format the date like "Jan-2012" set +\code{date.format = "\\\%b-\\\%Y"}.} + +\item{auto.text}{Either \code{TRUE} (default) or \code{FALSE}. If \code{TRUE} titles and +axis labels will automatically try and format pollutant names and units +properly, e.g., by subscripting the '2' in NO2.} -\item{plot}{Should a plot be produced? \code{FALSE} can be useful when -analysing data to extract plot components and plotting them in other ways.} +\item{plot}{Should a plot be produced? \code{FALSE} can be useful when analysing +data to extract plot components and plotting them in other ways.} -\item{...}{Other graphical parameters passed onto \code{timeProp} and -\code{cutData}. For example, \code{timeProp} passes the option -\code{hemisphere = "southern"} on to \code{cutData} to provide southern -(rather than default northern) hemisphere handling of \code{type = "season"}. Similarly, common axis and title labelling options (such as -\code{xlab}, \code{ylab}, \code{main}) are passed to \code{xyplot} via +\item{...}{Other graphical parameters passed onto \code{timeProp} and \code{cutData}. +For example, \code{timeProp} passes the option \code{hemisphere = "southern"} on to +\code{cutData} to provide southern (rather than default northern) hemisphere +handling of \code{type = "season"}. Similarly, common axis and title labelling +options (such as \code{xlab}, \code{ylab}, \code{main}) are passed to \code{xyplot} via \code{quickText} to handle routine formatting.} } \value{ @@ -116,22 +112,22 @@ an \link[=openair-package]{openair} object This function shows time series plots as stacked bar charts. The different categories in the bar chart are made up from a character or factor variable in a data frame. The function is primarily developed to support the plotting -of cluster analysis output from \code{\link[=polarCluster]{polarCluster()}} and -\code{\link[=trajCluster]{trajCluster()}} that consider local and regional (back trajectory) -cluster analysis respectively. However, the function has more general use for -understanding time series data. +of cluster analysis output from \code{\link[=polarCluster]{polarCluster()}} and \code{\link[=trajCluster]{trajCluster()}} that +consider local and regional (back trajectory) cluster analysis respectively. +However, the function has more general use for understanding time series +data. } \details{ In order to plot time series in this way, some sort of time aggregation is needed, which is controlled by the option \code{avg.time}. -The plot shows the value of \code{pollutant} on the y-axis (averaged -according to \code{avg.time}). The time intervals are made up of bars split -according to \code{proportion}. The bars therefore show how the total value -of \code{pollutant} is made up for any time interval. +The plot shows the value of \code{pollutant} on the y-axis (averaged according to +\code{avg.time}). The time intervals are made up of bars split according to +\code{proportion}. The bars therefore show how the total value of \code{pollutant} is +made up for any time interval. } \examples{ -## monthly plot of SO2 showing the contribution by wind sector +# monthly plot of SO2 showing the contribution by wind sector timeProp(mydata, pollutant = "so2", avg.time = "month", proportion = "wd") } \seealso{