The Third Annual Journal of Information Technology & Politics Conference
May 16 & 17, 2011 – University of Washington - Seattle, WA

 

JITP 2011 Speaker: John Wilkerson, University of Washington

Title: "Tradeoffs in Accuracy and Efficiency in Supervised Learning Methods"

Abstract: Text is becoming a central source of data for social science research. With advances in digitization and open records practices, the central challenge has in large part shifted from availability to usability. Automated text classification methodologies are becoming increasingly important within political science because they hold the promise of substantially reducing the costs of converting text to data for a variety of tasks. In this paper, we consider a number of questions of interest to prospective users of supervised learning methods, which are appropriate to classification tasks where known categories are applied. For the right task, supervised learning methods can dramatically lower the costs associated with labeling large volumes of textual data while maintaining high reliability and accuracy. Information science researchers devote considerable attention to comparing the performance of supervised learning algorithms and different feature representations, but the questions posed are often less directly relevant to the practical concerns of social science researchers. The first question prospective social science users are likely to ask is: how well do such methods work? The second is likely to be: how much do they cost in terms of human labeling effort? Relatedly, how much do marginal improvements in performance cost? We address these questions in the context of a particular dataset, the Congressional Bills Project, which includes more than 400,000 labeled bill titles (19 policy topics). This corpus also provides opportunities to experiment with varying sample sizes and sampling methodologies. We are ultimately able to locate an accuracy/efficiency sweet spot of sorts for this dataset by leveraging results generated by an ensemble of supervised learning algorithms.

John Wilkerson (Ph.D., University of Rochester, 1991) is an associate professor in the Political Science Department at the University of Washington. His research centers on legislative organization and decision-making, with related interests in health politics and comparative legislative studies. He is particularly interested in how new information technologies can advance political science research and teaching.
 

