Create a frequency table of a `vector` or a `data.frame`. It supports tidyverse's quasiquotation and RMarkdown for reports. The easiest approach is `data %>% freq(var)`, using the tidyverse.

`top_freq` can be used to get the top/bottom *n* items of a frequency table, with counts as names. It respects ties.

```
freq(x, ...)

# S3 method for default
freq(
  x,
  sort.count = TRUE,
  nmax = getOption("max.print.freq"),
  na.rm = TRUE,
  row.names = TRUE,
  markdown = !interactive(),
  digits = 2,
  quote = NULL,
  header = TRUE,
  title = NULL,
  na = "<NA>",
  sep = " ",
  decimal.mark = getOption("OutDec"),
  big.mark = "",
  wt = NULL,
  ...
)

# S3 method for factor
freq(x, ..., droplevels = FALSE)

# S3 method for matrix
freq(x, ..., quote = FALSE)

# S3 method for table
freq(x, ..., sep = " ")

# S3 method for numeric
freq(x, ..., digits = 2)

# S3 method for Date
freq(x, ..., format = "yyyy-mm-dd")

# S3 method for hms
freq(x, ..., format = "HH:MM:SS")

is.freq(f)

top_freq(f, n)

header(f, property = NULL)

# S3 method for freq
print(
  x,
  nmax = getOption("max.print.freq", default = 10),
  markdown = !interactive(),
  header = TRUE,
  decimal.mark = getOption("OutDec"),
  big.mark = ifelse(decimal.mark != ",", ",", "."),
  ...
)
```

- x
vector of any class, or a `data.frame` or `table`

- ...
up to nine different columns of `x` when `x` is a `data.frame` or `tibble`, to calculate frequencies from; see Examples. Also supports quasiquotation.

- sort.count
sort on count, i.e. frequencies. This will be `TRUE` by default for everything except when using grouping variables.

- nmax
number of rows to print. The default, `10`, uses `getOption("max.print.freq")`. Use `nmax = 0`, `nmax = Inf`, `nmax = NULL` or `nmax = NA` to print all rows.

- na.rm
a logical value indicating whether `NA` values should be removed from the frequency table. The header (if set) will always print the number of `NA`s.

- row.names
a logical value indicating whether row indices should be printed as `1:nrow(x)`

- markdown
a logical value indicating whether the frequency table should be printed in markdown format. This will print all rows (except when `nmax` is defined) and is the default behaviour in non-interactive R sessions (like when knitting RMarkdown files).

- digits
how many significant digits are to be used for numeric values in the header (not for the items themselves, that depends on `getOption("digits")`)

- quote
a logical value indicating whether or not strings should be printed with surrounding quotes. Default is to print them only around characters that are actually numeric values.

- header
a logical value indicating whether an informative header should be printed

- title
text to show above the frequency table; by default, it tries to coerce it from the variables passed to `x`

- na
a character string that should be used to show empty (`NA`) values (only useful when `na.rm = FALSE`)

- sep
a character string to separate the terms when selecting multiple columns

- decimal.mark
the character to be used to indicate the numeric decimal point

- big.mark
character; if not empty, used as a mark between every `big.interval` decimals *before* (hence big) the decimal point

- wt
frequency weights. If a variable, computes `sum(wt)` instead of counting the rows.

- droplevels
a logical value indicating whether empty levels should be dropped from factors

- format
a character to define the printing format (it supports `format_datetime` to transform e.g. `"d mmmm yyyy"` to `"%e %B %Y"`)

- f
a frequency table

- n
number of top *n* items to return, use `-n` for the bottom *n* items. It will include more than `n` rows if there are ties.

- property
property in header to return this value directly
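
The effect of frequency weights (the `wt` argument above) can be illustrated with base R alone. This is only a sketch of the idea of summing weights per item instead of counting rows, not the package's implementation; the data frame `df` and its columns `item` and `n` are hypothetical.

```r
# Hypothetical pre-aggregated data: each row stands for `n` observations
df <- data.frame(
  item = c("a", "b", "a", "c"),
  n    = c(10, 5, 2, 1),
  stringsAsFactors = FALSE
)

# Unweighted: every row counts once
unweighted <- table(df$item)

# Weighted: sum the weights per item instead of counting rows
weighted <- tapply(df$n, df$item, sum)
weighted
```

Here item `a` appears in two rows, so it has an unweighted count of 2 but a weighted count of 12.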

A `data.frame` (with an additional class `"freq"`) with five columns: `item`, `count`, `percent`, `cum_count` and `cum_percent`.
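
To make the structure of the returned table concrete, the five columns can be reproduced with base R alone. This is an illustrative sketch, not the package's implementation: the vector `gender` is rebuilt from the counts shown in the Examples, and percentages are kept as proportions rather than formatted strings.

```r
# Rebuild a vector with the same counts as `unclean$gender` in the Examples
gender <- rep(c("male", "female", "man", "m", "F"),
              times = c(240, 220, 22, 15, 3))

# Count each unique value and sort by decreasing count
counts <- sort(table(gender), decreasing = TRUE)

# Assemble the five columns that a frequency table carries
tbl <- data.frame(
  item    = names(counts),
  count   = as.integer(counts),
  percent = as.integer(counts) / length(gender),
  stringsAsFactors = FALSE
)
tbl$cum_count   <- cumsum(tbl$count)
tbl$cum_percent <- cumsum(tbl$percent)
tbl
```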

Frequency tables (or frequency distributions) are summaries of the distribution of values in a sample. With the `freq` function, you can create univariate frequency tables. Multiple variables will be pasted into one variable, so it forces a univariate distribution.
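
What "pasting multiple variables into one" means can be sketched in base R. The data frame `df` and its columns are hypothetical, and the separator mirrors the default `sep = " "` from the usage; this only illustrates the idea and is not how `freq` is necessarily implemented internally.

```r
# Hypothetical two-column data set, for illustration only
df <- data.frame(
  sex  = c("F", "F", "M", "M", "F"),
  ward = c("ICU", "ICU", "ICU", "Clinical", "Clinical"),
  stringsAsFactors = FALSE
)

# Paste the columns into a single variable (cf. the `sep` argument) ...
combined <- paste(df$sex, df$ward, sep = " ")

# ... and tabulate that single variable, giving a univariate distribution
combined_counts <- sort(table(combined), decreasing = TRUE)
combined_counts
```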

Input can be done in many different ways. Base R methods are:
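
A minimal sketch of base R input, assuming the `cleaner` package is installed and using its `unclean` data set from the Examples; the object names `f1` and `f2` are only illustrative:

```r
# Runs only when the cleaner package is available (an assumption)
if (requireNamespace("cleaner", quietly = TRUE)) {
  library(cleaner)

  # Pass the column as a vector, selected with `$` ...
  f1 <- freq(unclean$gender)

  # ... or selected with `[`, which also returns a vector here
  f2 <- freq(unclean[, "gender"])
}
```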

Tidyverse methods are:
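
A minimal sketch of tidyverse-style input, assuming both `cleaner` and `dplyr` are installed (`dplyr` is used here only for the `%>%` pipe) and using the `unclean` data set from the Examples; the object names `f3` and `f4` are only illustrative:

```r
# Runs only when both packages are available (an assumption)
if (requireNamespace("cleaner", quietly = TRUE) &&
    requireNamespace("dplyr", quietly = TRUE)) {
  library(cleaner)
  library(dplyr)  # provides the %>% pipe

  # Pipe the data set and name the column unquoted
  f3 <- unclean %>% freq(gender)

  # Pipe a vector directly
  f4 <- unclean$gender %>% freq()
}
```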

For numeric values of any class, these additional values will all be calculated with `na.rm = TRUE` and shown in the header:

- Mean, using `mean`
- Standard Deviation, using `sd`
- Coefficient of Variation (CV), the standard deviation divided by the mean
- Median Absolute Deviation (MAD), using `mad`
- Tukey Five-Number Summaries (minimum, Q1, median, Q3, maximum), see *NOTE* below
- Interquartile Range (IQR) calculated as `Q3 - Q1`, see *NOTE* below
- Coefficient of Quartile Variation (CQV, sometimes called coefficient of dispersion) calculated as `(Q3 - Q1) / (Q3 + Q1)`, see *NOTE* below
- Outliers (total count and percentage), using `boxplot.stats`

*NOTE*: These values are calculated using the same algorithm as used by Minitab and SPSS: *p[k] = E[F(x[k])]*. See Type 6 on the `quantile` page.
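
The quartile-based header values can be reproduced with base R by passing `type = 6` to `quantile`. The following is a sketch on made-up data (`x` is illustrative), not the package's own code:

```r
x <- c(2, 4, 4, 5, 7, 9, 10, 12)  # illustrative data

# Five-number summary with the type 6 quantile definition
five_num <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1), type = 6)
q1 <- unname(five_num["25%"])
q3 <- unname(five_num["75%"])

iqr <- q3 - q1                 # interquartile range
cqv <- (q3 - q1) / (q3 + q1)   # coefficient of quartile variation
cv  <- sd(x) / mean(x)         # coefficient of variation

# The built-in helper gives the same IQR when using the same quantile type
stopifnot(isTRUE(all.equal(iqr, IQR(x, type = 6))))
```

Note that R's default is `type = 7`, so `quantile(x)` and `IQR(x)` without a `type` argument can give different quartiles for the same data.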

For dates and times of any class, these additional values will be calculated with `na.rm = TRUE` and shown in the header:

For factors, levels that do not occur in the input data are dropped when `droplevels = TRUE`.

The function `top_freq` will include more than `n` rows if there are ties. Use a negative number for *n* (like `n = -3`) to select the bottom *n* values.
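
One way to picture tie-respecting selection is the base R sketch below. The helper `top_n_with_ties` is hypothetical and only illustrates the behaviour described above; it is not how `top_freq` is necessarily implemented, and it works on a plain named vector of counts rather than on a frequency table.

```r
# Hypothetical helper, for illustration only: select the top (n > 0) or
# bottom (n < 0) items of a named count vector, keeping ties
top_n_with_ties <- function(counts, n) {
  k <- min(abs(n), length(counts))
  if (k == 0) return(counts[0])
  if (n > 0) {
    sorted <- sort(counts, decreasing = TRUE)
    sorted[sorted >= sorted[k]]   # everything at least as frequent as the k-th
  } else {
    sorted <- sort(counts)
    sorted[sorted <= sorted[k]]   # everything at most as frequent as the k-th
  }
}

counts <- c(a = 5, b = 3, c = 3, d = 1)
top2    <- top_n_with_ties(counts, 2)    # a, b and c: the tie at 3 adds a row
bottom1 <- top_n_with_ties(counts, -1)   # d only
```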

Interested in extending the `freq()` function with your own class? Add a method like the one below to your package, and optionally define some header info by passing a `list` to the `.add_header` parameter, as in the example below for class `difftime`. This example assumes that you use the `roxygen2` package for package development.

```
#' @method freq difftime
#' @importFrom cleaner freq.default
#' @export
#' @noRd
freq.difftime <- function(x, ...) {
  # delegate to the default method and add the units to the header
  freq.default(x = x, ...,
               .add_header = list(units = attributes(x)$units))
}
```

Be sure to call `freq.default` in your function and not just `freq`. Also, add `cleaner` to the `Imports:` field of your `DESCRIPTION` file, to make sure that it will be installed with your package, e.g.:

```
Imports: cleaner
```

```
freq(unclean$gender, markdown = FALSE)
#> Frequency table
#>
#> Class: character
#> Length: 500
#> Available: 500 (100%, NA: 0 = 0%)
#> Unique: 5
#>
#> Shortest: 1
#> Longest: 6
#>
#>      Item        Count    Percent    Cum. Count    Cum. Percent
#> ---  --------  -------  ---------  ------------  --------------
#> 1    male          240      48.0%           240           48.0%
#> 2    female        220      44.0%           460           92.0%
#> 3    man            22       4.4%           482           96.4%
#> 4    m              15       3.0%           497           99.4%
#> 5    F               3       0.6%           500          100.0%
#>
freq(x = clean_factor(unclean$gender,
                      levels = c("^m" = "Male",
                                 "^f" = "Female")),
     markdown = TRUE,
     title = "Frequencies of a cleaned version for a markdown report!",
     header = FALSE,
     quote = TRUE)
#>
#>
#> **Frequencies of a cleaned version for a markdown report!**
#>
#>
#>
#>
#> | |Item | Count| Percent| Cum. Count| Cum. Percent|
#> |:--|:---------|------:|--------:|-----------:|-------------:|
#> |1 |"Male" | 277| 55.4%| 277| 55.4%|
#> |2 |"Female" | 223| 44.6%| 500| 100.0%|
#>
#>
```