Function of the week: uncount
Background
I like the idea of short posts focussing on one function because there are so many great functions out there. I had been thinking about doing a function a week for a while. A post a week is way too ambitious, but a month sounds better than most other time periods, so I am sticking with it.
Uncount
A function that I think is really underappreciated is uncount. Sometimes you get summarised/cross-tabbed data that you want in a non-summarised format.
I remember being in this situation a few years ago, and I spent ages writing a loop in Stata to create a dataset in the right format. I had designed the data collection myself, so I have no one else to blame, but now I know that there is a better way. uncount to the rescue!
We will use some fake data that has the same format as the data I had to work with back when I was young and innocent but I have changed the values and labels. Below is a table with some of the data.
gt(animal_scores[1:10,])
score | animal | count |
---|---|---|
0 | dog | 8 |
0 | cat | 16 |
1 | dog | 23 |
1 | cat | 32 |
2 | dog | 61 |
2 | cat | 110 |
3 | dog | 107 |
3 | cat | 172 |
4 | dog | 194 |
4 | cat | 292 |
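If you want to follow along without the full dataset, the ten rows shown above can be typed in by hand. This is only a sketch: `animal_scores_head` is a name I made up for this snippet, and it holds just these ten rows rather than the whole of `animal_scores`.

```r
# Recreate the ten rows shown in the table above (base R only).
# `animal_scores_head` is a hypothetical name, not the original object.
animal_scores_head <- data.frame(
  score  = rep(0:4, each = 2),               # 0, 0, 1, 1, ..., 4, 4
  animal = rep(c("dog", "cat"), times = 5),  # dog, cat, dog, cat, ...
  count  = c(8, 16, 23, 32, 61, 110, 107, 172, 194, 292)
)

# Total number of animals that these ten summary rows stand for
sum(animal_scores_head$count)
```

Each row here is a summary: the `count` column says how many individual animals share that score and species.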
To make it ‘unsummarised’, you just specify the dataset and the weights, which is the variable that holds the counts. Each row is then repeated as many times as its count.
animals_uncounted <- uncount(data=animal_scores, weights=count)
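To see what uncount does on something small enough to check by eye, here is a minimal sketch with a made-up three-row table. The object `toy` and its values are invented for illustration and are not taken from my data. It assumes the tidyr package is installed.

```r
library(tidyr)

# Made-up summary table: one row per score/animal combination
toy <- data.frame(
  score  = c(0, 1, 1),
  animal = c("dog", "dog", "cat"),
  count  = c(2, 1, 3)
)

# Repeat each row `count` times; the weights column is dropped by default
toy_uncounted <- uncount(toy, weights = count)

# There is now one row per animal, so the number of rows
# should equal the sum of the original counts
nrow(toy_uncounted) == sum(toy$count)
```

The result has only the `score` and `animal` columns, with the first row of `toy` appearing twice, the second once and the third three times.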
In my original case, I wanted to change the format of my data to make density plots. And voilà!
ggplot(animals_uncounted, aes(x=score, group=animal, colour=animal)) + geom_density() + theme_minimal()