Routine is mostly a good thing: morning routine, gym routine, bedtime routine, and so on. Thanks to a routine or a good habit, you don't spend much time and energy deciding what to do or how to do it, which saves energy for more important questions like "why".
Routine is mostly a good thing for data scientists, too. Here's my routine for starting a new data science project in R, large or small:
- Create a GitHub repo for the project with a sensible name: all lowercase with dashes, no underscores (~1min)
- `git clone` to my usual project directory
- `README.md` for what the project is about (~1min)
- Fire up RStudio and create an RStudio project (`.Rproj`) in the directory (~1min)
- Write the first R script, typically named
- The first few lines of the script are almost always the same, like:
  - `df <- read_csv("datafile")`
  - `df %>% ggplot(aes(x, y)) + geom_...`: yes, this is where things start to diverge.
So, that's about 10min to hit the ground running and start producing useful stuff.
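For the curious, here is a minimal sketch of what such a starter script might look like when it actually runs. The temporary CSV, its columns, and the choice of `geom_point()` are my assumptions for illustration; the real data file and the real geom depend on the project.

```r
# Starter script sketch. The data and the geom are made up for illustration.
library(readr)    # read_csv(), write_csv()
library(dplyr)    # %>%
library(ggplot2)  # ggplot(), aes(), geom_point()

# Stand-in for the real "datafile": write a tiny CSV to a temporary location.
datafile <- tempfile(fileext = ".csv")
write_csv(data.frame(x = 1:5, y = c(2.1, 3.9, 6.2, 7.8, 10.1)), datafile)

# The line that is almost always the same.
df <- read_csv(datafile, show_col_types = FALSE)

# The first plot, where projects start to diverge.
# Assumption: a scatter plot; the routine itself leaves the geom open.
p <- df %>%
  ggplot(aes(x, y)) +
  geom_point()

# print(p) would render the plot in the active graphics device.
```

Swapping `geom_point()` for a line, bar, or histogram geom is usually the first project-specific decision.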
Once things start rolling, daily routines are similar:
- A bunch of data massaging, like:
  - `filter(y %in% c("good", "fine")) %>%`
- ... and visualization:
  - `ggplot(aes(x, y)) +`
- ... and reporting:
- ... and
- Talk to the stakeholders for questions, news, etc.
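To make the daily loop a bit more concrete, here is a small sketch that chains the massaging and visualization fragments above into one runnable piece. The data frame, the extra labels "bad" and "ugly", the `count()` summary, and `geom_point()` are all assumptions of mine, not part of the original routine.

```r
# Daily-routine sketch: filter, summarise, plot. All data here is made up.
library(dplyr)    # filter(), count(), %>%
library(ggplot2)  # ggplot(), aes(), geom_point()

df <- data.frame(
  x = 1:6,
  y = c("good", "bad", "fine", "good", "ugly", "fine")
)

# Data massaging: keep only the rows labelled "good" or "fine".
kept <- df %>%
  filter(y %in% c("good", "fine"))

# A tiny summary table that could feed a report (hypothetical step).
summary_tbl <- kept %>%
  count(y)

# Visualization of the filtered rows.
# Assumption: points; any geom that fits the data would do.
p <- kept %>%
  ggplot(aes(x, y)) +
  geom_point()
```

Keeping the filtered data in its own object (`kept`) makes it easy to reuse the same subset for both the summary and the plot.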
But overall, it's all fairly automatic, fast, and effective. Yes, routine is mostly a good thing.
What's your routine for starting a data science project in R?
Very different from mine??
Let me (and the world) know!