A Workflow for Data Analysis

library(command)
library(fs)

https://r4ds.hadley.nz/workflow-scripts.html

https://www.tidyverse.org/blog/2017/12/workflow-vs-script/

The only way to write complex software that won’t fall on its face is to build it out of simple modules connected by well-defined interfaces, so that most problems are local and you can have some hope of fixing or optimizing a part without breaking the whole. http://catb.org/%7Eesr/writings/taoup/html/modularitychapter.html

https://swcarpentry.github.io/r-novice-inflammation/06-best-practices-R.html

“Another way you can be explicit about the requirements of your code and improve it’s reproducibility is to limit the “hard-coding” of the input and output files for your script. If your code will read in data from a file, define a variable early in your code that stores the path to that file. For example”

The problem of writing safe, flexible data analysis code

classic example: big file
- difficult to understand
- slow or confusing/unreliable - have to re-run
- difficult to debug
split into logical parts (modular), then source
- helps
- but doesn’t totally solve - confusing environment
better: break into pieces, and run each in own environment
a bit like functions (but without global environment)
push harder - make inputs outputs completely transparenet

have to re-run - slow
- difficult to change
- polluted environment
classic software solution: break into small files
but if
how to run them?

[https://stemurphy.com/post/rep_manu_think_about/]

The problem of writing safe, flexible data analysis code

Other solutions