R was built by and for statisticians, so it’s not like other programming languages. Its idiosyncrasies can be a source of deep frustration for beginners. But I’d argue there is no better tool for data analysis.
That’s why I’m writing a free ebook, How To Make Mistakes In R, for O’Reilly. It’s modeled after the excellent How To Make Mistakes In Python, by Mike Pirnat.
The target audience is all R coders, from those just starting out to advanced developers. It’ll cover mistakes in set-up, style, and statistics, along with other surprises. I’m especially qualified to write this book because I’ve made so many R mistakes in my own work.
It’s an exciting project, but I need your help. What are your “favorite” R mistakes?
I’m looking for all types, ranging from the dead-simple, beginner-level screwups to the subtle, advanced bugs you’ve encountered. Here are a few examples of mistakes I plan to address:
- Function masking due to conflicting packages (e.g., dplyr and plyr)
- Repeatedly typing `stringsAsFactors = FALSE`
- The default `table()` function masking NA values
- Not using piping (`%>%`) to improve code readability
- Not using the `broom` package to standardize the output of statistical models
- Not using GitHub and/or RStudio Projects for collaboration
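To make one of these concrete, here is a minimal sketch of the `table()` surprise in base R. The `responses` vector is made up purely for illustration. The point is only that `table()` leaves missing values out of its counts unless you ask for them through the `useNA` argument.

```r
# Hypothetical survey answers with one missing value (illustrative data only)
responses <- c("yes", "no", NA, "yes")

# By default, table() leaves the NA out of the counts without any warning
default_counts <- table(responses)

# Asking for NAs explicitly keeps the missing value visible
full_counts <- table(responses, useNA = "ifany")

print(default_counts)
print(full_counts)
```

The two tables disagree on the total number of observations, which is easy to miss when the missing values are a small share of a large dataset.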
Send them to me via email, a GitHub pull request, or Twitter. The more the merrier. Feel free to contact me more than once as you recall your “favorite” R mistakes.