Tuesday, January 02, 2007

How I maintain my data

As soon as you have a lot of experiments floating around, you tend to get a proliferation of code and data files. Usually, chaos ensues. If someone asks you for the data of some experiment you published, you can (a) ignore the request (this is the most economical but also the least ethical response), or (b) send them something they can realistically use. This post is about how to carry out (b).

Example.

Suppose I have a collection of data and R code that I will cryptically call intml.

1. Run package.skeleton("intml"). This will create some directories (see previous post).
2. Now add the data to the data directory.
3. Add the .Rnw file you used to analyze the data as a vignette.

Build a package as indicated in the last post, and send it to the person who asked for it. Or make it available on CRAN.
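
The steps above can be sketched roughly as follows. This is only an illustration, not the actual intml code: the data frame intml_data, the helper mean_rt, and the file name analysis.Rnw are all made up, and the package is created in a temporary directory. I also assume the current convention of a vignettes/ directory (older versions of R expected vignettes in inst/doc).

```r
# Sketch only: 'intml' is the package name from the post; everything else
# (intml_data, mean_rt, analysis.Rnw) is hypothetical and for illustration.
pkg_parent <- tempdir()  # scratch location; use your own project directory

# Made-up data and a made-up analysis function standing in for the real ones
intml_data <- data.frame(
  subject   = rep(1:3, each = 2),
  condition = rep(c("a", "b"), times = 3),
  rt        = c(310, 355, 298, 340, 325, 371)
)
mean_rt <- function(d) tapply(d$rt, d$condition, mean)

# Step 1: create the skeleton (DESCRIPTION, R/, man/, ...) around the R code
package.skeleton(name = "intml", list = "mean_rt",
                 environment = environment(),
                 path = pkg_parent, force = TRUE)
pkg_dir <- file.path(pkg_parent, "intml")

# Step 2: add the data to the data directory
data_dir <- file.path(pkg_dir, "data")
dir.create(data_dir, showWarnings = FALSE)
save(intml_data, file = file.path(data_dir, "intml_data.rda"))

# Step 3: add the .Rnw file as a vignette. Here a tiny placeholder is written;
# in practice you would file.copy() the .Rnw you actually used.
vig_dir <- file.path(pkg_dir, "vignettes")
dir.create(vig_dir, showWarnings = FALSE)
writeLines(c(
  "\\documentclass{article}",
  "%\\VignetteIndexEntry{Analysis of the intml data}",
  "\\begin{document}",
  "<<>>=",
  "data(intml_data, package = \"intml\")",
  "summary(intml_data)",
  "@",
  "\\end{document}"
), file.path(vig_dir, "analysis.Rnw"))

# Inspect what ended up in the package directory
list.files(pkg_dir, recursive = TRUE)
```

The help file templates that package.skeleton() puts in man/ still have to be filled in by hand. After that, running R CMD build intml from the parent directory produces the tarball you can send.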

It's simple and it enforces a certain self-discipline. Nothing like the knowledge that anyone can read your code to force you to write it properly :-)
