Data Availability 

 Currently most students in Gov 2001 are preparing for the final assignment of the course: replicating and then improving on a published article.  While scouting for a suitable piece myself, I came across the debate about whether (and how) data should be made available. 


 It is somewhat surprising that nowadays one can get all sorts of scholarly research off the web, except for the data that produced the results.  Given that methods already exist to ensure that data remain proprietary and confidential, omitting the data from publication seems rather antiquated, unnecessary and counterproductive to scientific progress.  Some health datasets -- such as AddHealth, which arguably contains some of the most sensitive information -- have successfully been public for a few years already.  There is of course an intriguing debate about this, which Gary's website partly documents.

 It seems that we are slowly coming within reach of universal data publication.  Apart from projects like ICPSR, several major journals have recently started asking authors to submit data and code.  The JPE explained to me that they expect to have data for some articles from April 2006, and that 'only the rare article will not include the relevant datasets' from early 2007.

 Since debating the robustness of existing results seems like good research, making data and code available could spur quite a lot of articles.  I wonder what the effects on journal content will be.  Rather than publishing the various replications in print, maybe journals will post them online only?  Or will specialized journals take them on, to keep the major publications from being jammed?