The Changing Evidence Base of Political Science Research 

Kay Schlozman, Norman Nie, and I are preparing an edited volume in honor of Sidney Verba. The volume is entitled Political Science: What Should We Know? What Should They Know? Instead of the usual 10 or so chapters representing something other than each contributor's best work, we invited 100 scholars to write about 1,000 words each, basically one idea (similar to a blog entry) addressing one or both of the questions in the title. I include a draft of mine below. Comments welcome.

  The Changing Evidence Base of Political Science Research 

 I believe the evidence base of political science and the related social sciences is beginning an underappreciated but historic change. As a result, our knowledge of and practical solutions for problems of government and politics will begin to grow at an enormous rate, if we are ready.

For the last half-century, we have learned about human populations
primarily through sample surveys taken every few years, end-of-period
government statistics, and in-depth studies of particular places,
people, or events.  These sources of information have served us well
but, as is widely known, are limited: Survey research produces
occasional snapshots of random selections of isolated individuals from
unknown geographic locations, and increasing cell phone use and
growing levels of nonresponse are eroding its scientific foundation.
Aggregate government statistics are valuable, but in many countries
are of dubious validity and are reported only with intentionally
limited resolution or after obscuring valuable information.  One-off
in-depth studies are highly informative but for the most part do not
scale, are not representative, and do not measure long-term change.

 In the next half-century, these existing data collection mechanisms
will surely continue to be used and improved --- such as with
inexpensive web surveys, if the problems with their representativeness
can be addressed --- but they will be supplemented by the profusion of
massive databases already becoming available in many areas.  Some
produce extensive or continuous-time information on individual
political behavior and its causes, drawn from text sources (via
automated information extraction from blogs, emails, speeches,
government reports, and other web sources), electoral activity (via
ballot images, precinct-level results, and individual-level
registration, primary participation, and campaign contribution data),
commercial activity (through every credit card and real estate
transaction and via product RFIDs), geographic location (from the cell
phones people carry or the toll booths they pass through with Fastlane
or EZPass transponders), health information (through digital medical
records, hospital admissions, and accelerometers and other devices
included in cell phones), and others.  Parts of the biological
sciences are now effectively becoming social sciences, as developments
in genomics, proteomics, metabolomics, and brain imaging produce huge
numbers of person-level variables.  Satellite imagery is increasing in
scope, resolution, and availability.  The internet is spawning
numerous ways for individuals to interact, such as through social
networking sites, social bookmarking, comments on blogs, product
reviews, and virtual worlds, all of which offer
possibilities for observation and experimentation.  (Ensuring privacy
and protection of personal information during the analyses to be
conducted with this information will require considerable effort,
care, and new work in research ethics, but should not be markedly more
difficult than the now routine medical research involving experiments
on human subjects with drugs and surgical procedures of unknown safety
and efficacy.)

 The analogue-to-digital transformation of numerous devices people own
makes them work better, faster, and less expensively, but also enables
each one to produce data in domains not previously accessible via
systematic analysis.  This includes everything from real-time changes
in the web of contacts among people in society (the Bluetooth in
your cell phone knows whether other people are nearby!) to records
kept of individuals' web clicks, searches, and advertising
clickthroughs.  Partly as a result of new technology, governmental
bureaucracies are improving their record keeping by moving from paper
to electronic databases, many of which are increasingly available to
researchers.  Some governmental policies are furthering these changes
by requiring more data collection, such as through the "No Child Left
Behind Act" in education and the proliferation of randomized policy
experiments.  All these changes are being supplemented by the
replication movement in academia that encourages or requires social
scientists to share data we have created with other researchers.

 These data put numerous advances within our reach for the first time.
Instead of trying to extract information from a few thousand
activists' opinions about politics every two years, in the necessarily
artificial conversation initiated by a survey interview, we can use
new methods to mine the tens of millions of political opinions
expressed daily in published blogs.  Instead of studying the effects
of context and interactions among people by asking respondents to
recall the frequency and nature of their social contacts, we now have the
ability to obtain a continuous record of all phone calls, emails, text
messages, and in-person contacts among a much larger group.  In place
of dubious or nonexistent governmental statistics to study economic
development or population spread in Africa, we can use satellite
pictures of human-generated light at night or networks of roads and
other infrastructure measured from space during the day.  The number,
extent, and variety of questions we can address are considerable and
increasing fast.

 If we can tackle the substantial privacy issues, build more powerful
and more widely applicable theories with observable implications in
these new forms of data, help create informatics techniques to ensure
that the data are accessible and preserved, and develop new
statistical methods adapted to the new types of data, political
science can make more dramatic progress than ever before.  The
challenge before us as a profession, before each of us as researchers,
and before the broader community of social scientists, is to prepare
for the collection and analysis of these new data sources, to unlock
the secrets they hold, and to use this new information to better
understand and ameliorate the major problems that affect society and
the well-being of human populations.
