Data from China: Land of Plenty? (I) 

 While the media keeps preaching that this is the Chinese century, many researchers are getting excited about new opportunities for data collection and access to data.  For the past few decades, many development researchers have focused on India because of its regional variation and good infrastructure for surveys.  It seems that China now holds a similar promise, and could provide an interesting comparison to India.

 I recently started collecting information on China (here); below are some highlights. If you know of more surveys, do let me know.


 Probably the best-known micro-survey at this point is the China Health and Nutrition Survey (CHNS), which is a panel with rounds in 1989, 1991, 1993, 1997, 2000, and 2004 (the 2006 wave is funded) and covers more than 4,000 households in 9 provinces.  Though this is an amazing dataset, using it is not always easy.  For example, there are problems linking individuals over time.  New longitudinal master files are continuously released, but the fixes are sometimes hard to integrate into ongoing projects (the IDs are mixed up).  There also seem to be some inconsistencies in the recording, especially in earlier rounds and for some key variables such as education.  The best waves seem to be those of 1997 and 2000.
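 For anyone working with a panel like this, a quick sanity check on the linkage can save a lot of grief.  Below is a minimal sketch in Python: it flags person IDs whose supposedly fixed characteristics change from one wave to the next, which is one symptom of mixed-up IDs.  The field names (person_id, wave, birth_year, sex) and the sample rows are hypothetical and made up for illustration; they are not the actual CHNS variable names or data.

```python
from collections import defaultdict


def flag_inconsistent_ids(records, invariant_fields=("birth_year", "sex")):
    """Return {person_id: [field, ...]} for IDs whose supposedly fixed
    attributes take more than one value across waves.

    Field names are hypothetical; adapt them to the survey's codebook.
    """
    seen = defaultdict(lambda: defaultdict(set))
    for row in records:
        for field in invariant_fields:
            value = row.get(field)
            if value is not None:  # ignore missing values
                seen[row["person_id"]][field].add(value)

    flagged = {}
    for pid, fields in seen.items():
        bad = sorted(f for f, values in fields.items() if len(values) > 1)
        if bad:
            flagged[pid] = bad
    return flagged


# Made-up illustrative rows, not real survey data.
sample = [
    {"person_id": 101, "wave": 1997, "birth_year": 1960, "sex": "F"},
    {"person_id": 101, "wave": 2000, "birth_year": 1960, "sex": "F"},
    {"person_id": 102, "wave": 1997, "birth_year": 1955, "sex": "M"},
    {"person_id": 102, "wave": 2000, "birth_year": 1982, "sex": "F"},
]

print(flag_inconsistent_ids(sample))
```

 A flagged ID is not proof of a linkage error (it could just as well be a recording mistake in one round), but it gives a short list of cases worth checking by hand before running any longitudinal analysis.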

 There is also a World Bank Living Standards Measurement Study (LSMS) for China.  That survey used standardized (internationally comparable?) questionnaires and was conducted in 780 households and 31 villages in 1996/7.  For those interested in earlier periods, there is commercial data at the China Population Information and Research Center, which has mainly census-based data starting from 1982.  The census itself is now also available electronically (with GIS maps), but there is a lively debate as to how reliable the figures are and whether the definitions of key measures changed over time.  Still, it should be good for basic cross-sectional analysis.