I like the idea of short posts focussing on one function because there are so many great functions out there. I had been thinking about doing a function a week for a while. A post a week is way too ambitious, but one a month sounds more manageable, so I am sticking with that.


A function that I think is really underappreciated is uncount() from the tidyr package. Sometimes you get summarised/cross-tabbed data that you want in a non-summarised format, with one row per observation.

I remember being in this situation a few years ago, and I spent ages writing a loop in Stata to create a dataset in the right format. I had designed the data collection myself, so I had no one else to blame, but now I know that there is a better way. uncount to the rescue!

We will use some fake data in the same format as the data I had to work with back when I was young and innocent; I have changed the values and labels. Below is a table with some of the data.

score animal count
0 dog 8
0 cat 16
1 dog 23
1 cat 32
2 dog 61
2 cat 110
3 dog 107
3 cat 172
4 dog 194
4 cat 292
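
If you want to follow along, the rows shown above can be typed in by hand. This is only a sketch: the post does not show how animal_scores was created, so the use of tribble() from the tibble package is my assumption, and it covers only the ten rows in the table.

```r
library(tibble)

# Only the rows shown in the table above, entered by hand
animal_scores <- tribble(
  ~score, ~animal, ~count,
  0, "dog",   8,
  0, "cat",  16,
  1, "dog",  23,
  1, "cat",  32,
  2, "dog",  61,
  2, "cat", 110,
  3, "dog", 107,
  3, "cat", 172,
  4, "dog", 194,
  4, "cat", 292
)

# Total number of observations these rows represent
sum(animal_scores$count)
```

The total of the count column is the number of rows you should expect once the data is 'unsummarised'.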

To make it ‘unsummarised’ you just specify the dataset and the weights, which is the variable that holds the counts. Each row is then repeated as many times as its count.

library(tidyr)
animals_uncounted <- uncount(data = animal_scores, weights = count)  # one row per counted observation
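
To see what is going on, it can help to try uncount on something tiny first. The toy data frame below is made up purely for illustration and is not part of the original data.

```r
library(tidyr)

# Hypothetical two-row summary, invented for this example
toy <- data.frame(animal = c("dog", "cat"), count = c(2, 3))

# Repeat each row as many times as its count
toy_long <- uncount(toy, weights = count)

toy_long$animal   # "dog" "dog" "cat" "cat" "cat"
names(toy_long)   # the count column is dropped by default (.remove = TRUE)
nrow(toy_long) == sum(toy$count)
```

If you would rather keep the count column alongside the repeated rows, uncount also takes .remove = FALSE.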

In my original case, I wanted to change the format of my data to make density plots. And voilà!

library(ggplot2)
ggplot(animals_uncounted, aes(x = score, group = animal, colour = animal)) + geom_density() + theme_minimal()