Routine for Starting a Data Science Project in R

Routine is mostly a good thing: morning routine, gym routine, bedtime routine, and so on. Thanks to a routine or a good habit, you don't spend much time and energy deciding what to do or how to do it, which saves energy for more important questions like "why".

Routine is mostly a good thing for data scientists, too. Here's my routine for starting a new data science project in R, large or small:

1. Create a GitHub repo for the project with a sensible name: all lowercase, dashes, no underscores (~1min)
2. git clone it to my usual project directory (~/projects/) (30sec)
3. Write a README.md saying what the project is about (~1min)
4. Fire up RStudio and create an RStudio project (.Rproj) in the directory (~1min)
5. Write the first R script, typically named initial-analysis.R
   The first few lines of the script are almost always the same, like:
       library(tidyverse)
       df <- read_csv("datafile")
       glimpse(df)
       df %>% ggplot(aes(x, y)) + geom_....
   Yes... that last line is where things start to diverge...
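Here is a minimal, runnable sketch of what such an initial-analysis.R could look like. Everything specific in it is an assumption for illustration: the built-in mpg dataset (shipped with ggplot2) is written to a temporary CSV to stand in for "datafile", and displ, hwy, and geom_point() are just one possible first plot, not the one any real project would need.

```r
library(tidyverse)

# Hypothetical stand-in for "datafile": dump a built-in dataset to a temp CSV
# so the script runs on any machine with the tidyverse installed.
datafile <- tempfile(fileext = ".csv")
write_csv(mpg, datafile)

# Read the data and take a first look at the columns and their types.
df <- read_csv(datafile, show_col_types = FALSE)
glimpse(df)

# First plot. The columns and the geom are placeholders; this is the point
# where every project starts to look different.
p <- df %>%
  ggplot(aes(displ, hwy)) +
  geom_point()
print(p)
```

In a real project the first two lines after library(tidyverse) disappear and read_csv() points straight at the project's data file.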

So, it takes about 10 minutes to hit the ground running and start producing useful stuff.

Once things start rolling, the daily routine looks much the same from one day to the next:

A bunch of data massaging, like:
    df %>%
      group_by(x) %>%
      filter(y %in% c("good", "fine")) %>%
      summarize(mz = median(z))
... and visualization:
    df %>%
      ggplot(aes(x, y)) +
      geom_... +
      facet_wrap(~w)
... and reporting:
    rmarkdown::render("that-special-markdown.Rmd")
... and git commit / git push frequently.
Talk to the stakeholders for questions, news, etc.
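To make the massaging and visualization steps concrete, here is a small sketch on made-up data. The tibble below is entirely hypothetical; it only reuses the placeholder column names x, y, z, and w from the snippets above. The plot maps z rather than y to the vertical axis and uses geom_point(), both of which are my assumptions, since the original leaves the geom open.

```r
library(tidyverse)

# Hypothetical toy data; x, y, z, w are the placeholder column names used above.
df <- tibble(
  x = c("a", "a", "a", "b", "b", "b"),
  y = c("good", "fine", "bad", "good", "bad", "fine"),
  z = c(1, 3, 100, 10, 200, 20),
  w = c("u", "v", "u", "v", "u", "v")
)

# Massaging: within each x group, keep the acceptable rows and take the median of z.
summary_df <- df %>%
  group_by(x) %>%
  filter(y %in% c("good", "fine")) %>%
  summarize(mz = median(z))
print(summary_df)  # mz is 2 for x = "a" and 15 for x = "b"

# Visualization: one panel per value of w.
p <- df %>%
  ggplot(aes(x, z)) +
  geom_point() +
  facet_wrap(~w)
print(p)
```

The "bad" rows (z = 100 and z = 200) are dropped before the median is taken, which is why the summary stays small.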

Overall, though, it's fairly automatic, fast, and effective. Yes, routine is mostly a good thing.

What's your routine for starting a data science project in R?

Is it very different from mine?

Let me (and the world) know!