What does wt in count() mean (R language)?
Think of it as a “group by, then sum”. See the example:
library(dplyr)

# weighted count: sums mpg within each cyl group
mtcars %>%
  count(cyl, wt = mpg)
#   cyl     n
# 1   4 293.3
# 2   6 138.2
# 3   8 211.4
mtcars %>%
  group_by(cyl) %>%
  summarise(n = sum(mpg))
# # A tibble: 3 x 2
#     cyl     n
#   <dbl> <dbl>
# 1     4  293.
# 2     6  138.
# 3     8  211.
count {dplyr}: "Supply wt to perform weighted counts, switching the summary from n = n() to n = sum(wt)."
wt stands for "weights".
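The switch from n = n() to n = sum(wt) can be checked directly. Below is a minimal sketch on mtcars; the object names and the helper column w are mine, made up for illustration only. It compares count() against the equivalent group_by() + summarise() calls, and shows that a weight of 1 on every row gives back the plain row count.

```r
library(dplyr)

# Unweighted: every row contributes 1, i.e. n = n()
plain        <- mtcars %>% count(cyl)
plain_manual <- mtcars %>% group_by(cyl) %>% summarise(n = n())

# Weighted: every row contributes its mpg value, i.e. n = sum(mpg)
weighted        <- mtcars %>% count(cyl, wt = mpg)
weighted_manual <- mtcars %>% group_by(cyl) %>% summarise(n = sum(mpg))

# A weight of 1 on every row reproduces the plain count
# (`w` is a hypothetical helper column, not part of mtcars)
unit <- mtcars %>% mutate(w = 1) %>% count(cyl, wt = w)

all.equal(plain$n, plain_manual$n)        # TRUE
all.equal(weighted$n, weighted_manual$n)  # TRUE
all.equal(unit$n, as.numeric(plain$n))    # TRUE
```

In other words, an ordinary count is just a weighted count where each row weighs 1.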
The first example in help('count') that uses the object df is, in my opinion, very clear.
First, create the object.
library(dplyr)

df <- tribble(
  ~name,    ~gender,  ~runs,
  "Max",    "male",   10,
  "Sandra", "female",  1,
  "Susan",  "female",  4
)
1. Now, an example without wt.
As you can see from the data set above, there are
- 2 rows with gender == "female";
- 1 row with gender == "male".
And a non-weighted count will return those counts.
# counts rows:
df %>% count(gender)
# # A tibble: 2 x 2
#   gender     n
#   <chr>  <int>
# 1 female     2
# 2 male       1
2. Now, an example with weights, using the argument wt.
Suppose that the original data had 10 rows for males and 5 rows for females. All the male rows came from the same individual, "Max". The female rows came from two individuals: 1 row for "Sandra" and 4 rows for "Susan".
Then the user aggregated the original, unprocessed data by name, and the result was the data as posted. To get counts that account for the original rows, use a weighted count.
This is what the comment above the wt example says.
# use the `wt` argument to perform a weighted count. This is useful
# when the data has already been aggregated once
# counts runs:
df %>% count(gender, wt = runs)
# # A tibble: 2 x 2
#   gender     n
#   <chr>  <dbl>
# 1 female     5
# 2 male      10
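To see why the weighted count "accounts for the original" data, here is a minimal sketch that rebuilds a hypothetical version of the raw, unaggregated rows by repeating each row of df as many times as its runs value. The object raw and the way it is reconstructed are my own assumptions for illustration; they are not part of help('count'). A plain count on raw should then agree with the weighted count on df.

```r
library(dplyr)

df <- tribble(
  ~name,    ~gender,  ~runs,
  "Max",    "male",   10,
  "Sandra", "female",  1,
  "Susan",  "female",  4
)

# Hypothetical raw data: repeat each row `runs` times, drop the runs column
raw <- df[rep(seq_len(nrow(df)), times = df$runs), c("name", "gender")]

nrow(raw)  # 15 rows: 10 for Max, 1 for Sandra, 4 for Susan

# Plain count on the raw rows vs. weighted count on the aggregated data
raw_counts      <- raw %>% count(gender)
weighted_counts <- df %>% count(gender, wt = runs)

all.equal(as.numeric(raw_counts$n), weighted_counts$n)  # TRUE

# Aggregating the raw rows by name gets back a table like df.
# The first `name` is the column; `name = "runs"` is count()'s
# argument that names the output column.
raw %>% count(name, gender, name = "runs")
```

So wt = runs lets you skip rebuilding the raw rows: each aggregated row simply counts runs times.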