Our work builds on vast bodies of literature in communications, linguistics, sociology and political science. Below, we list some useful online references, followed by an annotated bibliography compiled by Lori Young that covers much of the critical work as of 2008, categorized by topic.

  Page Index:
    Further Information Online
    Content Analysis Software
        Quantifying Language? Theories and Reviews of Automated
                 Content Analysis
        Content Analytic Dictionaries, Affective Language and
                 Lexical Resources
        Sentiment Analysis
                            Dictionary-based approaches
                            Supervised machine-learning
                            Unsupervised machine-learning
        Automated Topic Coding
                            Dictionary-based approaches
                            Supervised machine-learning
                            Unsupervised machine-learning
        Beyond a Bag of Words: Advances in Computer Automation
                            Proximity and Lexical Rules
                            Valence Shifters and Modifiers
                            Semantic Parsing and Roles
                            Subjectivity Detection

Further Information Online

The Content Analysis Guidebook Online


Content Analysis @ utexas

An Overview of Content Analysis, by Steve Stemler (Yale) in Practical Assessment, Research and Evaluation

Content Analysis Software


The General Inquirer


Provalis Research (WordStat, etc.)


Quantifying Language? Theories and Reviews of Automated Content Analysis

Alexa, Melina and Cornelia Zuell. 2000. “Text Analysis Software: Commonalities, Differences and Limitations: The Results of a Review.” Quality & Quantity 34: 299-321.

Comprehensive review of fifteen software packages including: AQAD, ATLAS.ti, CoAn, Code-A-Text, DICTION, DIMAO-MCCA, HyperRESEARCH, KEDS, NUD*IST, QED, TATOE, TEXTPACK, TextSmart, WinMAXpro and WordStat. Includes capabilities and suggestions for future development.

Blismas, N. and A. Dainty. 2003. “Computer-aided Qualitative Data Analysis: Panacea or Paradox?” Building Research and Information 31(6): 455-463.

Argues that computer automation often restricts, rather than aids, the analytical process. Uses NVivo to demonstrate some of the limitations of automation.

Conway, Mike. 2006. “The Subjective Precision of Computers: A Methodological Comparison with Human Coding in Content Analysis.” Journalism and Mass Communication Quarterly 83(1): 186-200.

Few automated studies report computer-human reliability. This study tests it and finds dramatically different results between computer and human coding of issues, candidate attributes and tone in a political campaign.

Herrera and Braumoeller. 2004. “Symposium: Discourse and Content Analysis.” Qualitative Methods 15-19.

Summary of a symposium on the ontological, epistemological and methodological differences between content analysis and discourse analysis. Discusses limitations on the type of meaning that can be extracted with content analysis (discovering content vs mapping).

Hogenraad, R., D.P. McKenzie and N. Péladeau. 2003. “Force and Influence in Content Analysis: The Production of New Social Knowledge.” Quality & Quantity 37: 221-238.

Compares the epistemological foundations of supervised and unsupervised approaches to automation. Delimits types of texts appropriate for each approach.

Kadushin, Charles, Joseph Lovett and James D. Merriman. 1968. “REVIEWS: Literary Analysis with the Aid of the Computer: Review Symposium.” Computers and the Humanities 2(4): 177-202.

Three early and amusingly unfavourable reviews of the prospects for computer automation using Stone et al.’s (1966) General Inquirer program.

MacMillan, K. 2005. “More Than Just Coding?: Evaluating CAQDAS in a Discourse Analysis of News Texts.” Forum Qualitative Social Research 6(3), Art. 25.

Discussion of practical versus methodological concerns with respect to automated coding and discourse – delimiting what automatic coding can and cannot be expected to do. Concludes that discourse analysis cannot be automated.

Mayring, P. 2000. “Qualitative Content Analysis” Forum Qualitative Social Research 1(2), Art. 20.

A systematic, rules-guided approach to qualitative content analysis. The possibility for and limits of an automated procedure.

Mehl, M. R. 2005. “Quantitative Text Analysis.” In M. Eid & E. Diener (Eds.), Handbook of Multimethod Measurement in Psychology, Washington, DC: American Psychological Association, 141-156.

Overview and review of text analysis applications in psychology. Includes applications, conceptual foundations, different methodological approaches (i.e. GI, RID, LIWC, TAS/C, Diction, LSA), and some potentials and problems of automation.

Mohler, Peter Ph. and Cornelia Zuell. 2000. “Observe! A Popperian Critique of Automatic Content Analysis.” 5es Journées Internationales d’Analyse Statistique des Données Textuelles (JADT 2000).

Compares unsupervised automatic coding “untouched by human hands” to dictionary-based approaches with a priori theory-driven categories. Outlines the limitations of theory-free automatic coding and some virtues of computer assisted dictionary approaches.

Osherenko, A., & André, E. 2007. Lexical Affect Sensing: Are Affect Dictionaries Necessary to Analyze Affect? Second International Conference on Affective Computing and Intelligent Interaction, ACII 2007: Lecture Notes in Computer Science: 230-241.

Compares recognition rates for affect dictionaries (LIWC and DAL) and a general-purpose dictionary learned from manually annotated text. Methods perform equally well but affect lexicons are far more efficacious – 68 affect categories produce similar results to a classifier using several thousand features.

Psathas, George. 1969. “The General Inquirer: Useful or Not?” Computers and the Humanities 3(3): 163-174.

A slightly more favourable review of the General Inquirer. The program can be applied to substantive topics, but neither quickly nor easily. Text analysis requires long-term investment from the researcher – it is not yet ‘ready-made’ for the novice.

Riffe, Daniel and Alan Freitag. 1997. “A Content Analysis of Content Analyses: Twenty-Five Years of Journalism Quarterly.” Journalism and Mass Communication Quarterly 74(4): 873-882.

Exponential increase in quantitative content analytic studies. Most focus on news/editorials in the U.S.; few use random sampling or theoretical grounding/hypothesis testing; half report inter-coder reliability; 2/5 report descriptive statistics. Reliability and sophistication are improving, but not theoretical grounding.

Rourke, Liam, Terry Anderson, D. R. Garrison and Walter Archer. 2001. “Methodological Issues in the Content Analysis of Computer Conference Transcripts.” International Journal of Artificial Intelligence in Education 12(1): 8-22.

Methodological challenges of content analysis. Discussion of criteria for content analysis, research designs, types of content, units of analysis, ethical issues, and software to aid analysis.


Content Analytic Dictionaries, Affective Language and Lexical Resources

Bradley, M.M., & Lang, P.J. 1999. Affective Norms for English Words (ANEW): Stimuli, Instruction Manual and Affective Ratings. Technical report C-1, Gainesville, FL. The Center for Research in Psychophysiology, University of Florida.

A set of normative emotional ratings for a large number of words in the English language. The words have been manually rated for pleasure, arousal, and dominance in order to create a standard for use in studies of emotion and attention.

Esuli, Andrea and Fabrizio Sebastiani. 2006. “Sentiwordnet: A Publicly Available Lexical Resource for Opinion Mining.” In Proceedings of LREC-06, the 5th Conference on Language Resources and Evaluation, Genova, IT.

Describes SentiWordNet, a lexical resource that assigns scores to each WordNet synset describing how objective, positive or negative they are.

Fellbaum, Christiane. 1998. WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.

An electronic lexical database considered to be the most important resource available to researchers in computational linguistics, text analysis, and many related areas. Its design is inspired by current psycholinguistic and computational theories of human lexical memory.

Hart, R. P. 1984. Verbal Style and the Presidency: A Computer-Based Analysis. New York: Academic Press.

Text analysis system designed to analyze political rhetoric and campaign style using a series of dictionaries with five semantic features—Activity, Optimism, Certainty, Realism and Commonality—as well as thirty-five sub-features. Results can be compared to norms for different genres.

Hart, R. P. 2000. Political Keywords: Using Language That Uses Us. New York: Oxford University Press.

Political keywords are imbued with meaning and value, which can change over time. Study of the change in political discourse during the last fifty years focusing on eight dominant words.

Hogenraad, R. 2003. “The Words that Predict the Outbreak of Wars.” Empirical Studies of the Arts 21: 5-20.

Uses several existing resources (GI, RID, Lasswell Value Dictionary etc.) to build the Motive Dictionary around the three categories of need for achievement, need for affiliation, and need for power. The gap between power and affiliation motives predicts conflict.

Kelly, Edward and Philip Stone. 1975. Computer Recognition of English Word Senses. Amsterdam: North-Holland Publishing Co.

Outlines disambiguation routines for the General Inquirer program.

Lasswell, Harold D., Nathan Leites, and associates. 1949. Language of Politics: Studies in Quantitative Semantics. New York: George W. Stewart, Publisher, Inc.

The psychological basis of categories in the Lasswell Value dictionary.

Martindale, C. 1975. Romantic Progression: The Psychology of Literary History. Washington, D.C.: Hemisphere.

Martindale, C. 1990. The Clockwork Muse: The Predictability of Artistic Change. New York: Basic Books.

Outlines and validates the Regressive Imagery Dictionary (RID), a coding scheme for text analysis that is designed to measure primordial and conceptual content. The dictionary includes categories for primary and secondary cognitive processes and emotion.

Mathieu, Yvette Yannick. 2008. “Navigation dans un texte à la recherche des sentiments.” Lingvisticæ Investigationes 31(2) : 313–322.

Development of a lexicon of positive and negative sentiment in French for automated content analysis.

Mergenthaler, E. 1996. Emotion-abstraction Patterns in Verbatim Protocols: A New Way of Describing Psychotherapeutic Processes. Journal of Consulting and Clinical Psychology 64(6): 1306-1315.

Mergenthaler, E. 2008. Resonating Minds: A School-independent Theoretical Conception and its Empirical Application to Psychotherapeutic Processes. Psychotherapy Research 18(2): 109-126.

Outlines and validates the TAS/C word list for psychotherapy sessions, focusing on the two dimensions of emotional tone and abstraction. Words are labelled for positive or negative emotional tone along three dimensions: pleasure, approval and attachment.

Miller, G., R. Beckwith, C. Fellbaum, D. Gross and K. Miller. 1990. “Introduction to WordNet: An On-line Lexical Database.” International Journal of Lexicography 3(4): 235–312. [and 1993 update].

Introduction to WordNet, a comprehensive dictionary organized by underlying lexical concepts based on psycholinguistic theories of human lexical memory.

Lasswell, H.D. and J.Z. Namenwirth. 1969. The Lasswell Value Dictionary. New Haven: Yale University Press.

Classification scheme according to values expressed in a message. Language is divided into four deference domains: power, rectitude, respect, affiliation, and four welfare domains: wealth, well-being, enlightenment and skill, each with subcategories such as gains, losses, participants, ends, and arenas.

Osgood, C.E., G.J. Suci, and P.H. Tannenbaum, 1957. The Measurement of Meaning. Urbana, US: University of Illinois Press.

The theory of semantic differentiation delineates three dimensions of affective meaning: “evaluative”, i.e., orientation; “potency”, referring to the strength of feeling expressed; and “activity”, referring to how active or passive an evaluation is. Much subsequent work in affect theory has been based on these dimensions.

Pennebaker, James W., Matthias R. Mehl and Kate G. Niederhoffer. 2003. “Psychological Aspects of Natural Language Use: Our Words, Our Selves.” Annual Review of Psychology 54: 547-77.

Review of several dictionary-based approaches for studying language use. Theorizing affective language – how affective words are used and which ones we should be studying.

Pennebaker, James W., Martha E. Francis and Roger J. Booth. 2001. Linguistic Inquiry and Word Count: LIWC 2001. Mahwah, NJ: Erlbaum Publishers.

The LIWC dictionary comprises over 70 basic emotional and cognitive dimensions and other categories including positive or negative emotion words, self-references, big words, or words that refer to sex, eating, or religion. The dictionary was compiled and rated manually by judges.

Roget, Peter Mark. 1911. Roget’s Thesaurus of English Words and Phrases, supplemented electronic version (June 1991). Project Gutenberg Library Archive Foundation.

Roget, Peter Mark. 1883. Roget’s Thesaurus of English Words and Phrases. London: Longmans, Green, and Co.

Roget's Thesaurus was manually organized into six major categories of abstract relations, space, matter, intellect, volition and affections, and includes hundreds of subcategories.

Schrauf, R. and J. Sanchez. 2004. “The Preponderance of Negative Emotion Words in the Emotion Lexicon: A Cross-generational and Cross-linguistic Study.” Journal of Multilingual and Multicultural Development 25(2&3): 266-84.

Affect-as-emotion theory (negative emotions signal threat and are accompanied by systematic cognitive processing, whereas positive emotions signal safety and are accompanied by heuristic, schema-based cognitive processing). Evidence presented for the universality and cultural invariance of emotion processing based on free-lists of emotions under time-constraints by various monolingual speakers.

Stone, P.J. 1986. [untitled]. Review of Hart, R. P. 1984. Verbal Style and the Presidency: A Computer-Based Analysis. New York: Academic Press. In Contemporary Sociology 15(1): 75-77.

Generally favourable review with some criticism, including the ‘skimpy’ theory for dictionary development. Stone laments the lack of word sense disambiguation (words with multiple senses are collapsed or weighted), calling the tendency “a step backwards in both theory and technique.”

Stone, P.J., Robert F. Bales, J. Zvi Namenwirth and Daniel M. Ogilvie. 1962. The General Inquirer: A Computer System for Content Analysis and Retrieval Based on the Sentence as a Unit of Information. Behavioral Science 7: 484-94.

Stone, P.J., D.C. Dunphy and D.M. Ogilvie. 1966. The General Inquirer: A Computer Approach to Content Analysis. M.I.T. Press: Cambridge, MA.

Seminal work on dictionary-based automated analysis. The GI dictionary is developed from the Harvard-IV Psychosociological Dictionary and Lasswell Value Dictionary, the Stanford Political Dictionary (Holsti 1963; 1969, based on Osgood 1957) and the Need-Achievement Dictionary.

Strapparava, Carlo and Alessandro Valitutti. 2004. “WordNet-Affect: An Affective Extension of WordNet.” In 4th International Conference on Language Resources and Evaluation (LREC 2004), Lisbon, May: 1083–1086.

A linguistic resource for the lexical representation of affective knowledge, WordNet-Affect adds an additional hierarchy of “affective domain labels” to WordNet synsets representing affective concepts.

Whissell, C., Fournier, M., Pelland, R., Weir, D. & Makarec, K. 1986. A Dictionary of Affect in Language. IV. Reliability, Validity, and Applications. Perceptual and Motor Skills 62: 875–888.

Whissell, C. 1989. The Dictionary of Affect in Language. In R. Plutchik and H. Kellerman, eds. Emotion: Theory and Research. New York, Harcourt Brace, 113-131.

The DAL is an attempt to quantify emotion in language. Volunteers rated thousands of words in terms of their Pleasantness, Activation, and Imagery (concreteness). It has been applied to many literary texts and other applications.

Valitutti, Alessandro, Carlo Strapparava and Oliviero Stock. 2004. “Developing Affective Lexical Resources.” PsychNology Journal 2(1): 61-83.

Presents WordNet-Affect, a linguistic resource for a lexical representation of affective knowledge, developed from WordNet through the selection and labeling of the synsets representing affective concepts.


Sentiment Analysis

Dictionary-based approaches

Athanaselis, T., S. Bakamidis, I. Dologlou, R. Cowie, E. Douglas-Cowie, C. Cox. 2005. ASR for Emotional Speech: Clarifying the Issues and Enhancing Performance. Neural Networks 18: 437–444.

Recognition rates for spontaneous emotionally coloured speech can be improved by post-processing using a language model based on increased representation of emotional utterances. Whissell's DAL is used to extract emotional material from the British National Corpus (BNC).

Bolasco, Sergio and Francesca della Ratta-Rinaldi. 2004. “Experiments on Semantic Categorisation of Texts: Analysis of Positive and Negative Dimension.” JADT 2004: 7es Journées internationales d’Analyse statistique des Données Textuelles.

Classifies words with tone (positive or negative) using a thematic dictionary (GI) of adjectives, substantives and adverbs. Develops a threshold point from which various texts can be said to have a positive or negative connotation.
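
The thresholding idea in this annotation can be illustrated with a minimal Python sketch. It is not the authors' procedure: the tiny word lists, the score formula and the 0.05 threshold are all hypothetical placeholders chosen for illustration, and a real study would substitute a validated dictionary such as the GI categories.

```python
import re

# Hypothetical mini-lexicon for illustration only; these are NOT the GI word lists.
POSITIVE = {"good", "gain", "success", "improve"}
NEGATIVE = {"bad", "loss", "failure", "decline"}


def tone_score(text):
    """Return (positive hits - negative hits) / total tokens, or 0.0 for empty text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)


def classify(text, threshold=0.05):
    """Label a text by comparing its tone score to an assumed symmetric threshold."""
    score = tone_score(text)
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for sample in ["A good year with real success", "The loss was a failure"]:
        print(sample, "->", classify(sample))
```

Normalizing by text length keeps scores comparable across documents of different sizes; the threshold itself would need to be calibrated against human-coded texts.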

Cho, Jaeho, Michael P. Boyle, Heejo Keum, Mark D. Shevy, Douglas M. McLeod, Dhavan V. Shah, and Zhongdang Pan. 2003. Media, Terrorism, and Emotionality: Emotional Differences in Media Content and Public Reactions to the September 11th Terrorist Attacks. Journal of Broadcasting & Electronic Media 47(3): 309-327.

Analysis of differences in the “emotionality” of television and print media coverage of the 9/11 terrorism attacks using several emotion categories from LIWC.

Doucet, Lorna and Karen A. Jehn. 1997. “Analyzing Harsh Words in a Sensitive Setting: American Expatriates in Communist China.” Journal of Organizational Behaviour 18: 559-82.

Analyzes the use of hostile words by American expatriate managers in communist China to explore organizational conflicts using a conflict dictionary based on the Thesaurus-Snowball technique (Jehn and Werner 1993) and the DAL.

Forsythe, Alexandra M. 2004. “Mapping the Political Language of the 1998 Good Friday Agreement.” Current Psychology 23(3): 215-224.

Analyzes political statements of 12 prominent politicians involved in the 1998 Good Friday Agreement using Diction. Using multi-dimensional scaling, Diction’s five measures could be plotted along two dimensions coinciding with published policy positions. Considers the limitations of text analysis.

Hart, Roderick P. and Jay P. Childers. 2005. “The Evolution of Candidate Bush: A Rhetorical Analysis.” American Behavioral Scientist 49(2): 180-197.

Examination of Bush’s campaign rhetoric in 2000 and 2004 compared to his predecessors using DICTION. He begins tentatively, but dramatically increases oratory and narratives scores by 2004, signaling the important effects of 9/11 on his presidency.

Hart, R. P. 2000. DICTION 5.0: The Text Analysis Program. Thousand Oaks, CA: Sage-Scolari.

Hart’s DICTION program has been used in over fifty studies to analyze presidential rhetoric and campaign style and for numerous other applications. For a full reference list see: http://www.dictionsoftware.com/index.php

Hirschberg, Julia, Stefan Benus, Jason M. Brenier and associates. 2005. “Distinguishing Deceptive from Non-Deceptive Speech.” In Proceedings of Interspeech’2005 – Eurospeech.

Uses LIWC and DAL to discriminate deceptive from non-deceptive speech. In contrast to Newman et al. (2003), the study finds that deceptive speech tends to have a greater pleasantness score and a greater proportion of positive emotion words than truthful speech.

Hogenraad, R. 2005. “What the Words of War Can Tell Us About the Risk of War.” Peace and Conflict: Journal of Peace Psychology, 11(2): 137–151.

Automated analysis of conflict-related documents to explore motivation in terms of the gap between ‘power’ and ‘affiliation’ words in real and fictional texts.

Holsti, Ole R. 1969. Content Analysis for the Social Sciences and Humanities. Reading, Mass.: Addison-Wesley.

Holsti, Ole R., Richard A. Brody and Robert C. North. 1964. “Measuring Affect and Action in International Reaction Models: Empirical Materials from the 1962 Cuban Crisis.” Journal of Peace Research 1(3/4): 170-190.

Holsti, Ole R. 1963. A System of Automated Content Analysis of Documents. Stanford, Calif.: Stanford University.

Analyzes the affective content of US-Soviet correspondence during the Cuban Missile Crisis. Analyzes perception in bargaining behaviour of one’s own versus others’ actions. Develops the Stanford Political Dictionary to infer political attitudes from messages.

Lowry, Dennis T. 2008. “Network TV News Framing of Good vs. Bad Economic News Under Democrat and Republican Presidents: A Lexical Analysis of Political Bias.” Journalism and Mass Communication Quarterly 85(3): 483-498.

Analyzes objectively positive/negative economic stories (based on Dow Jones Scores) with Diction 5.0 to assess TV news bias during the Clinton and Bush administrations. Finds pro-Democrat and negative news bias.

Mairesse, F., Walker, M. 2006. “Words Mark the Nerds: Computational Models of Personality Recognition through Language.” In Proceedings of the 28th Annual Conference of the Cognitive Science Society (CogSci 2006): 543–548.

LIWC and the MRC are used to analyze conversations to identify a speaker’s personality.

Namenwirth, J. Zvi and Robert Philip Weber. 1987. Dynamics of Culture. Unwin Hyman.

Test the theory that in modern society culture changes in theoretically meaningful cyclical patterns by measuring shifting themes and content in language use over time in party platforms, political speeches and other social texts.

Newman, Matthew L., James W. Pennebaker, Diane S. Berry and Jane M. Richards. 2003. “Lying Words: Predicting Deception From Linguistic Styles.” Personality and Social Psychology Bulletin 29(5): 665-675.

Investigates features of linguistic style that distinguish true and false stories using LIWC. Compared to truth-tellers, liars showed lower cognitive complexity, used fewer self-references and other-references, and used more negative emotion words.

Pennebaker, J. W., R. B. Slatcher and C. K. Chung. 2005. Linguistic Markers of Psychological State through Media Interviews: John Kerry and John Edwards in 2004, Al Gore in 2000. Analyses of Social Issues and Public Policy 5(1): 197-204.

Examines the linguistic styles of John Kerry and John Edwards in television interviews during the presidential primary campaign and Al Gore in 2000 using LIWC. Kerry used more negative emotion words than Edwards, and Gore’s and Kerry’s styles were shown to be quite similar.

Saris-Gallhofer, I.N., W.E. Saris and E.L. Morton. 1978. “A Validation Study of Holsti’s Content Analysis Procedure.” Quality and Quantity 12: 131-145.

Test of the construct validity of the Osgood dimensions underlying the Stanford Political Dictionary. The study confirms the validity of ‘evaluation’ and ‘potency’ but finds ‘activity’ to be invalid.

Sigelman, Lee and Cynthia Whissell. 2002. ““The Great Communicator” and “The Great Talker” on the Radio: Projecting Presidential Personas.” Presidential Studies Quarterly 32(1): 137-146.

Study of presidential rhetoric comparing Reagan and Clinton on Saturday morning radio using the DAL for dimensions of ‘activity’ and ‘positivity.’ Clinton is found to be more active and positive, but Reagan closer to the American norm, explaining why he is generally considered the more effective communicator.

Tetlock, P., Saar-Tsechansky, M. and S. Macskassy. 2007. More than Words: Quantifying Language to Measure Firms’ Fundamentals. 9th Annual Texas Finance Festival. Available at SSRN: http://ssrn.com/abstract=923911.

Uses the number of negative words (GI) in news stories to predict firm earnings and stock returns. Stories reflect firm-specific news and predictions work best when stories focus on fundamentals.

Supervised machine-learning

Andreevskaia, A. and S. Bergler. 2006. “Mining WordNet For a Fuzzy Sentiment: Sentiment Tag Extraction From WordNet Glosses”. In Proceedings EACL-06, the 11th Conference of the European Chapter of the Association for Computational Linguistics, Trento, IT.

Andreevskaia, A. and S. Bergler. 2006. “Sentiment Tagging of Adjectives at the Meaning Level.” In L. Lamontagne and M. Marchand (eds). Canadian AI 2006, LNAI 4013. Springer-Verlag.

Tagging words for meaning and magnitude (positive, negative, or neutral) using WordNet glosses to generate dictionaries of sentiment-bearing words. Semi-supervised.

Baker, C., C. Fillmore and J. Lowe. 1998. “The Berkeley FrameNet Project”. In Proceedings of the Joint Conference on Computational Linguistics and the 36th Annual Meeting of the ACL (COLING-ACL98). Montreal, Canada: Association for Computational Linguistics.

Database of 10,000 manually-annotated sentences for supervised learning.

Balog, Krisztian, Gilad Mishne and Maarten de Rijke. 2006. “Why Are They Excited: Identifying and Explaining Spikes in Blog Mood Levels.” In Proceedings of EACL-06, Trento, Italy.

Detects spikes and cycles relating to mood in blogs.

Benamara, Farah, Carmine Cesarano, Antonio Picariello, Diego Reforgiato and VS Subrahmanian. 2007. “Sentiment Analysis: Adjectives and Adverbs are Better than Adjectives Alone.” In Proceedings of the International Conference on Weblogs and Social Media (ICWSM-2007) Boulder, CO, March 26-28, 2007.

Using adverbs and adverb-adjective combinations for sentiment analysis produces better results than adjectives alone.

Généreux, M. and R. Evans. 2006. Towards a Validated Model for Affective Classification of Texts. Proceedings of the Workshop of Sentiment and Subjectivity in Text, Association for Computational Linguistics, Sydney, AU, 55-62.

Validates a two-dimensional typology of affective states: positive/negative; active/passive. Trains on weblog posts annotated for mood by authors using support vector machine binary classifiers. This standard approach works just as well with multiple dimensions.

Hatzivassiloglou, V. and K. McKeown. 1997. “Predicting the Semantic Orientation of Adjectives.” In Proceedings of the 35th Annual Meeting of the ACL and the 8th Conference of the European Chapter of the ACL, 174–181. Madrid, Spain: Association for Computational Linguistics.

Uses constraints from conjunctions on adjectives to predict the semantic orientation of conjoined adjectives on a 21 million word Wall Street Journal corpus. In the aggregate, a clustering algorithm separates adjectives to label them positive or negative.

Ide, Nancy. 2006. “Making Senses: Bootstrapping Sense-Tagged Lists of Semantically-Related Words.” In Alexander Gelbukh (ed), Computational Linguistics and Intelligent Text, LNCS. Springer-Verlag.

Method for tagging lists of sentiment-bearing words according to senses. Merges all available sentiment word lists. Uses sense-tagging to eliminate false hits – maps lists to word senses. Augments the list and refines the categories.

Kotsiantis, S. B. 2007. “Supervised Machine Learning: A Review of Classification Techniques.” Informatica 31: 249–268.

An overview of approaches to and theoretical issues guiding the development of supervised learning classification techniques.

Leshed, Gilly and Joseph ’Jofish’ Kaye. 2006. “Understanding How Bloggers Feel: Recognizing Affect in Blog Posts.” In CHI-2006, April 22-27, 2006, Montreal, Canada.

Supervised machine-learning approach using support vector machine classifiers on blog entries tagged with moods by their authors. TF/IDF word frequencies are extracted from pre-defined categories.

Mishne, Gilad and Natalie Glance. 2006. “Predicting Movie Sales from Blogger Sentiment.” In Proceedings of AAAI-CAAW-06, the Spring Symposia on Computational Approaches to Analyzing Weblogs, Stanford, US.

Uses positive sentiment in weblogs to predict movie success.

Mishne, Gilad and Maarten de Rijke. 2006. “MoodViews: Tools for Blog Mood Analysis.” In Proceedings of AAAI-CAAW-06, the Spring Symposia on Computational Approaches to Analyzing Weblogs, Stanford, US.

MoodViews tracks moods over time on LiveJournal. It has three components: Moodgrapher tracks mood levels; Moodteller uses NLP and machine learning to estimate mood levels from the text of blog entries; and Moodsignals detects words and phrases associated with a given mood in a given time interval (e.g. Katrina and “worry”).

Mishne, Gilad. 2005. “Experiments with Mood Classification in Blog Posts.” In Proceedings of Style2005 - the 1st Workshop on Stylistic Analysis of Text for Information Access, at SIGIR 2005.

Supervised machine-learning approach using annotated blogs as reference texts.

Mullen, Tony and Nigel Collier. 2004. “Sentiment Analysis Using Support Vector Machines with Diverse Information Sources.” In Proceedings of EMNLP-04, 9th Conference on Empirical Methods in Natural Language Processing, Barcelona, Spain.

Uses support vector machine classification, incorporating topic information.

Joachims, T. 1998. “Text Categorization with Support Vector Machines: Learning with Many Relevant Features.” In Proceedings of the European Conference on Machine Learning.

Theoretical and empirical evidence for the use of support vector machines for automated analysis, which learn text classifiers from text, eliminating the need to build them manually.

Subasic, Pero and Alison Huettner. 2001. “Affect Analysis of Text Using Fuzzy Typing.” IEEE Transactions on Fuzzy Systems 9(4): 483–496.

Generates a weighted, fuzzy affect lexicon by clustering on a thesaurus, to obtain a centrality score reflecting all meanings of a word. Classifies news stories and movie reviews, in agreement with human coding.

Yang, Yiming. 1999. “An Evaluation of Statistical Approaches to Text Categorization.” Information Retrieval 1: 69–90.

Evaluation of a selection of classifiers for supervised machine learning approaches to automation.

Unsupervised machine-learning

Gamon, Michael and Anthony Aue. 2005. “Automatic Identification of Sentiment Vocabulary: Exploiting Low Association with Known Sentiment Terms.” In Proceedings of the ACL-05 Workshop on Feature Engineering for Machine Learning in Natural Language Processing, Ann Arbor, MI.

An extension of Turney’s (2002) technique to show that opposite sentiment terms do not co-occur at the sentence level. New terms in combination with Naïve Bayes bootstrapping improve classifier performance at the phrase level.

Grefenstette, Gregory, Yan Qu, David A. Evans and James G. Shanahan. 2004. “Validating the Coverage of Lexical Resources for Affect Analysis and Automatically Classifying New Words along Semantic Axes.” In Yan Qu, James Shanahan, and Janyce Wiebe (eds.), Exploring Attitude and Affect in Text: Theories and Applications, AAAI-2004 Spring Symposium Series: 71–78.

Expands an existing affect lexicon and lexical patterns using word clusters based on pointwise mutual information on an unannotated corpus.

Kamps, Jaap, Maarten Marx, Robert J. Mokken and Maarten de Rijke. 2004. “Using WordNet to Measure Semantic Orientation of Adjectives.” In 4th International Conference on Language Resources and Evaluation (LREC 2004), volume IV. Paris. European Language Resources Association: 115–118.

Measures the geodesic distance of adjectives in WordNet using synonymy to the words ‘good’ and ‘bad’ to determine the semantic orientation of adjectives.
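
As a rough illustration of the geodesic-distance idea, the Python sketch below runs a breadth-first search over a small synonymy graph. The graph is a hypothetical toy, not WordNet data, and the normalized score (d(w, bad) - d(w, good)) / d(good, bad) is an assumed form of the measure used here only for illustration.

```python
from collections import deque

# Hypothetical toy synonymy links; a real analysis would derive these from WordNet.
EDGES = [
    ("good", "sound"), ("sound", "heavy"), ("heavy", "big"),
    ("big", "bad"), ("good", "honest"), ("bad", "poor"),
]


def build_graph(edges):
    """Build an undirected adjacency map from (word, word) synonym pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def geodesic(graph, start, goal):
    """Shortest path length between two words via BFS, or None if unreachable."""
    if start not in graph or goal not in graph:
        return None
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def orientation(graph, word, pos="good", neg="bad"):
    """Assumed score in [-1, 1]: positive when the word is closer to `pos`."""
    d_pos = geodesic(graph, word, pos)
    d_neg = geodesic(graph, word, neg)
    d_ref = geodesic(graph, pos, neg)
    if None in (d_pos, d_neg, d_ref) or d_ref == 0:
        return None
    return (d_neg - d_pos) / d_ref


GRAPH = build_graph(EDGES)

if __name__ == "__main__":
    for w in ["honest", "sound", "heavy", "poor"]:
        print(w, orientation(GRAPH, w))
```

Words with no path to either reference word get no score, which mirrors the practical limit of any graph-based measure: it covers only the connected part of the lexicon.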

Landauer, T.K. and S.T. Dumais. 1997. “A Solution to Plato’s Problem: The Latent Semantic Analysis Theory of the Acquisition, Induction, and Representation of Knowledge.” Psychological Review 104: 211-240.

Explains the foundations of LSA – inducing knowledge from local co-occurrences in text. (The first unsupervised learning from text).

Landauer Thomas K., Darrell Laham, Bob Rehder, and M. E. Schreiner. 1997. “How Well Can Passage Meaning be Derived without Using Word Order? A Comparison of Latent Semantic Analysis and Humans.” Nineteenth Annual Conference of the Cognitive Science Society (COGSCI-97): 412-417.

Compares the intercoder reliability and predictive accuracy of human judgments of the quality and quantity of knowledge conveyed in student essays on scientific topics with those of LSA, a statistical model that uses a bag-of-words representation. There is surprisingly little difference between humans and the model.

Strapparava, Carlo, Alessandro Valitutti and Oliviero Stock. 2006. “The Affective Weight of Lexicon.” In Proceedings of LREC-2006, Genoa, IT.

A method to recognize and select affective evaluative terms using an extension of the WordNet-Affect lexical database. An unsupervised semantic similarity function applies relational concepts and emotional categories.

Thelen, Michael and Ellen Riloff. 2002. “A Bootstrapping Method for Learning Semantic Lexicons using Extraction Pattern Contexts.” In Conference on Empirical Methods in Natural Language Processing (EMNLP 2002).

Describes a bootstrapping algorithm called Basilisk that learns semantic lexicons for multiple categories with an unannotated corpus and seed words for each category. Hypothesizes semantic classes based on extraction patterns.

Turney, Peter and Michael Littman. 2003. “Measuring Praise and Criticism: Inference of Semantic Orientation from Association.” ACM Transactions on Information Systems (TOIS) 21(4): 315–346.

Method infers the semantic orientation (direction – pos/neg; and degree – mild/strong) of a word from statistical association with a set of positive and negative paradigm words.

Turney, P. 2002. “Thumbs up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews.” In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), Philadelphia, July 2002: 417-24.

To avoid labeling a training set, predicts the semantic orientation of phrases containing adjectives or adverbs from the difference in their mutual information with the words ‘excellent’ and ‘poor.’ A text is coded by the average semantic orientation of the phrases it contains. Reliability is tested against the ratings given in consumer reviews.
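
A minimal sketch of the mutual-information difference described above, assuming co-occurrence counts have already been collected from some corpus or search engine. The function names, the small smoothing constant and the example counts are illustrative assumptions, not values taken from the paper. Because the phrase's own frequency cancels out of the difference of the two PMI terms, only four counts are needed.

```python
import math


def semantic_orientation(near_pos, near_neg, hits_pos, hits_neg, smoothing=0.01):
    """PMI(phrase, positive word) minus PMI(phrase, negative word).

    near_pos / near_neg: how often the phrase occurs near the positive / negative word.
    hits_pos / hits_neg: how often the positive / negative word occurs overall.
    smoothing: small constant (an assumption here) that avoids log of zero.
    """
    return math.log2(
        ((near_pos + smoothing) * hits_neg) / ((near_neg + smoothing) * hits_pos)
    )


def classify_review(phrase_counts, hits_pos, hits_neg):
    """Label a review by the average orientation of its extracted phrases.

    phrase_counts: list of (near_pos, near_neg) pairs, one per phrase.
    """
    if not phrase_counts:
        raise ValueError("at least one phrase is required")
    scores = [
        semantic_orientation(near_pos, near_neg, hits_pos, hits_neg)
        for near_pos, near_neg in phrase_counts
    ]
    average = sum(scores) / len(scores)
    label = "recommended" if average > 0 else "not recommended"
    return label, average
```

For example, `classify_review([(8, 2), (3, 1)], 100, 100)` labels a review whose phrases appear more often near the positive word as "recommended".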

Turney, P. and M.L. Littman. 2002. “Unsupervised Learning of Semantic Orientation from a Hundred-Billion-Word Corpus.” Technical Report ERC-1094 (NRC 44929), National Research Council of Canada.

Describes and defends semantic-orientation point-wise mutual information (SO-PMI) in a research report.


Automated Topic Coding

Dictionary-based approaches

Supervised machine-learning

Bond, D., J. Bond, J. Craig Jenkins, C. Oh and C. L. Taylor. 2003. “Integrated Data for Events Analysis (IDEA): An Event Form Typology for Automated Events Data Development.” Journal of Peace Research 40(6): 733-745.

A comprehensive events framework for the analysis of international interactions. Uses a flexible multi-leveled event and actor/target hierarchy that can be expanded to incorporate new event forms and actors/targets. Can construct indicators for early warning and assessing conflict escalation.

Budge, Ian and Paul Pennings. 2007. “Do they work? Validating Computerised Word Frequency Estimates Against Policy Series.” Electoral Studies 26: 121-29.

Tests the validity and reliability of the Wordscores method against expert codes from the Manifesto Research Group/Comparative Manifesto Project (MRG/CMP). Finds that Wordscores flattens out party movement, as have previous computerized approaches.

Cieri, Christopher. 2000. “Multiple Annotations of Reusable Data Resources: Corpora for Topic Detection and Tracking.” 5es Journées Internationales d’Analyse Statistique des Données Textuelles (JADT 2000).

Responding to demands for very large, easily accessible, reusable news corpora to support research in the topic detection and tracking paradigm, the Linguistic Data Consortium created the manually annotated TDT corpora. Supports research in the Topic Detection and Tracking program and many other projects.

Hillard, D., S. Purpura and J. Wilkerson. “Computer Assisted Topic Classification for Mixed Methods Social Science Research.” Journal of Information Technology and Politics 4(4), forthcoming.

A supervised learning system that can provide estimates of the class of each document (or event). This system maintains high classification accuracy and provides accurate estimates of document proportions, while achieving reliability levels associated with human efforts.

Hillard, Dustin, S. Purpura and J. Wilkerson. 2007. “An Active Learning Framework for Classifying Political Text.” Prepared for delivery at the 2007 Annual Meeting of the Midwest Political Science Association, Chicago, IL. April 14-17.

A framework and tools to automate the classification of topics for Congressional Bills using a large corpus of human-labeled events.

King, Gary and Will Lowe. 2003. “An Automated Information Extraction Tool for International Conflict Data with Performance as Good as Human Coders: A Rare Events Evaluation Design.” International Organization 57: 617–642.

Following from the KEDS project, propose a software program (The Reader) to automatically analyze daily events data from Reuters Business Briefing (RBB) newswire. Advance a 157-category event categorization scheme called the Integrated Data for Events Analysis (IDEA).

Kwon, Namhee, Stuart Shulman, and Eduard Hovy. 2006. “Multidimensional Text Analysis for eRulemaking,” Proceedings of the Seventh National Conference on Digital Government Research (Digital Government Research Center).

Uses FrameNet to analyze a large number of public comments on proposed regulations, accounting for argument structure, topics and opinions.

Laver, Michael, Kenneth Benoit and John Garry. 2003. “Extracting Policy Positions from Political Texts Using Words as Data.” American Political Science Review 97(2): 311-31.

Presents a word-scoring method that is “language-blind” and replicates policy estimates elsewhere. Word scores generated from reference texts are used to score virgin texts based on a priori policy dimensions.
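
The scoring step described above can be sketched as follows, under stated assumptions: texts are already tokenized, each reference text has a single a priori position on one dimension, and the function names are hypothetical. Each word is scored by the position of the reference texts weighted by the probability of reading that text given the word; a virgin text is scored by the mean score of its scored words. Any rescaling of raw virgin-text scores is omitted here.

```python
from collections import Counter


def relative_freqs(tokens):
    """Relative frequency of each word within one text."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def word_scores(reference_texts, reference_positions):
    """Score every word found in the reference texts.

    reference_texts: {name: list of tokens}
    reference_positions: {name: a priori position on the policy dimension}
    """
    freqs = {name: relative_freqs(toks) for name, toks in reference_texts.items()}
    vocab = set().union(*(f.keys() for f in freqs.values()))
    scores = {}
    for word in vocab:
        total = sum(f.get(word, 0.0) for f in freqs.values())
        # P(reference text | word) weights each reference position.
        scores[word] = sum(
            (f.get(word, 0.0) / total) * reference_positions[name]
            for name, f in freqs.items()
        )
    return scores


def score_text(tokens, scores):
    """Mean word score over the tokens that appear in the reference vocabulary."""
    scored = [tok for tok in tokens if tok in scores]
    if not scored:
        return None  # no overlap with the reference texts
    return sum(scores[tok] for tok in scored) / len(scored)
```

Because only word frequencies enter the calculation, nothing in the sketch depends on the language of the texts, which is the sense in which the approach is “language-blind.”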

Laver, Michael, and Kenneth Benoit. 2002. “Locating TDs in Policy Spaces: Wordscoring Dáil Speeches.” Irish Political Studies 17 (Summer): 59–73.

Uses Wordscores to locate speakers on a pro- versus anti-government dimension in the Irish legislature (the Dáil).

Laver, Michael and John Garry. 2000. “Estimating Policy Positions from Political Texts.” American Journal of Political Science 44: 619–34.

Codes policy positions rather than policy emphasis. Seeks to overcome problems with static models and move beyond the unitary actor assumption.

Lowe, Will. 2008. “Understanding Wordscores.” Political Analysis 16: 356–371.

Addresses practical and theoretical problems of the Wordscores method. Shows that badly scaled document scores fail to yield consistent and unbiased estimates.

Purpura, Stephen and Dustin Hillard. 2006. “Automated Classification of Congressional Legislation.” The 7th Annual International Conference on Digital Government Research ’06, May 21-24, 2006, San Diego, CA, USA.

Congressional Bills Project: a supervised machine-learning approach using a human coded corpus to classify attentiveness to issues in Congress according to the Policy Agendas Project coding scheme.

Schrodt, P. A., E. M. Simpson and D. J. Gerner. 2001. Monitoring Conflict Using Automated Coding of Newswire Reports: A Comparison of Five Geographical Regions. Paper presented at the PRIO/Uppsala University/DECRG High-Level Scientific Conference on Identifying Wars: Systematic Conflict Research and Its Utility in Conflict Resolution and Prevention, Uppsala, Sweden.

Schrodt, P. A, S. G. Davis and J. L. Weddle. 1994. Political Science: KEDS—A Program for the Machine Coding of Event Data. Social Science Computer Review 12(4): 561–88.

The Kansas Event Data System (KEDS) project uses automated coding of English-language news reports to generate political event data, which is used in statistical early warning models to predict political change.

Unsupervised machine-learning

Quinn, K., B. Monroe, M. Colaresi and M. Crespin. 2006. “An Automated Method of Topic-Coding Legislative Speech Over Time with Application to the 105th-109th U.S. Senate.” Paper presented at the American Political Science Association Conference, Philadelphia, PA.

Unsupervised machine learning approach to estimate the probability that attention will be paid to topics in Congress over time. Inference is based on patterns in word choice with no predefined coding scheme.

Simon, A. and M. Xenos. 2004. “Dimensional Reduction of Word-Frequency Data as a Substitute for Intersubjective Content Analysis.” Political Analysis 12(1): 63-75.

Uses dimensional reduction drawing on latent semantic analysis theory to model human language in open-ended survey questions. This approach can be used to generate coding dictionaries while avoiding the circularity of a preset coding system. Performs well against human coding.

Treeratpituk, Pucktada and Jamie Callan. 2006. “Automatically Labeling Hierarchical Clusters.” Proceedings of the International Conference on Digital Government 151: 167-176.

Presents an algorithm which labels hierarchical clusters in a document, organizing and analyzing comments received by government agencies.


Beyond a Bag of Words: Advances in Automation

Proximity and Lexical Rules

Agrin, N. 2006. “Developing a Flexible Sentiment Analysis Technique for Multiple Domains.” Unpublished research note.

Compares sentiment analysis techniques in NSL: bag-of-words classification, lexical rule analysis using term expansion, and statistical classification through rule-generated features.

Dave, Kushal, Steve Lawrence and David M. Pennock. 2003. “Mining the Peanut Gallery: Opinion Extraction and Semantic Classification of Product Reviews.” In The Twelfth International World Wide Web Conference (WWW’03). Budapest, Hungary. ACM: 519–528.

Mixed-method approach incorporating feature extraction and scoring to distinguish positive and negative reviews about particular product attributes.

Murphy, Chad, S. Bowler, C. Burgess and M. Johnson. 2006. “The Rhetorical Semantics of State Ballot Initiative Arguments in California, 1980-2004.” Paper delivered at the American Political Science Association Conference, Philadelphia, PA, August 28-September 1, 2006.

Uses the local co-occurrence of words to analyze differences in the nature of pro and con arguments.

Pang, Bo, Lillian Lee and Shivakumar Vaithyanathan. 2002. “Thumbs up? Sentiment Classification using Machine Learning Techniques.” In Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP-02): 79-86.

Compares three supervised classifiers (Naïve Bayes, maximum entropy and support vector machines), concluding that they don’t work as well on tone as they do for topic because tone depends on the context of words (which may not be independently sentiment-laden).

Scharl, A., I. Pollach and C. Bauer. 2003. “Determining the Semantic Orientation of Web-Based Corpora.” In J. Liu et al. (eds). IDEAL 2003, LNCS 2690: 840-9.

Multiple methods. Web media monitoring methodology using the webLyzard project. Structural and textual analysis using word frequencies and distributions, distance measures, perceptual maps and semantic orientation.

Tong, R. 2001. “Detecting and Tracking Opinions in Online Discussions.” SIGIR 2001 Workshop on Operational Text Classification, New Orleans, LA.

Uses the proximity of subjects and affective judgments to determine opinions about particular topics.

Valence Shifters and Modifiers

Andreevskaia, Alina, Sabine Bergler and Monica Urseanu. 2007. “All Blogs are Not Made Equal: Exploring Genre Differences in Sentiment Tagging of Blogs.” In Proceedings of the International Conference on Weblogs and Social Media (ICWSM-2007), Boulder, CO, March 26-28, 2007.

Classifies blogs using ternary (positive/negative/neutral) classification of sentiment at the sentence level accounting for negations and other valence shifters to improve performance.

Durbin, Stephen D., J. Neal Richter and Doug Warner. 2003. “A System for Affective Rating of Texts.” In Proceedings of OTC-03, 3rd Workshop on Operational Text Classification.

Tone is harder to code than topics because it is based not on an accumulation of words but on the relationships between them. Uses a part-of-speech tagger, a list of affect words and modifiers, and syntactic rules to account for negation.

Kennedy, Alistair and Diana Inkpen. 2006. “Sentiment Classification of Movie and Product Reviews Using Contextual Valence Shifters.” Computational Intelligence 22(2): 110-125.

Frequency analysis of positive, negative and neutral terms accounting for negations and intensifiers (overstatements and understatements). Expands the GI dictionary using synonyms and computes the semantic orientation of terms using association scores with a small group of positive and negative terms.
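
A minimal sketch of term counting with negations and intensifiers, in the spirit of the approach described above. The word lists, the weights, and the rule that a shifter applies only to the word immediately following it are all simplifying assumptions made for illustration; they are not taken from the article.

```python
# Tiny illustrative lexicons (hypothetical; a real system would use a full dictionary).
POSITIVE = {"good", "great", "enjoyable"}
NEGATIVE = {"bad", "boring", "awful"}
NEGATIONS = {"not", "never", "no"}
INTENSIFIERS = {"very": 2.0, "extremely": 2.0}    # overstatements
DIMINISHERS = {"somewhat": 0.5, "slightly": 0.5}  # understatements


def score_tokens(tokens):
    """Sum term polarities, letting shifters modify the word that follows them."""
    total = 0.0
    weight = 1.0
    negate = False
    for token in tokens:
        word = token.lower()
        if word in NEGATIONS:
            negate = not negate
        elif word in INTENSIFIERS:
            weight *= INTENSIFIERS[word]
        elif word in DIMINISHERS:
            weight *= DIMINISHERS[word]
        else:
            if word in POSITIVE or word in NEGATIVE:
                value = 1.0 if word in POSITIVE else -1.0
                if negate:
                    value = -value
                total += value * weight
            # Any ordinary word, sentiment-bearing or not, clears pending shifters.
            weight = 1.0
            negate = False
    return total
```

Under this adjacency rule, "not very good" scores -2.0 while "not a great film" scores +1.0 because the intervening word clears the negation, which illustrates why the scope of a shifter is an important design choice.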

Semantic Parsing and Roles

Bethard, S., H. Yu, A. Thornton, V. Hatzivassiloglou and D. Jurafsky. 2004. “Automatic Extraction of Opinion Propositions and their Holders.” In Exploring Attitude and Affect in Text: Theories and Application (AAAI-EAAT2004), Stanford University.

Finding propositional opinions – the sentential complements that contain the actual opinion rather than full opinion sentences. Manually annotate propositional opinions. The same technique can be used to identify opinion holders and associate them with propositional opinions.

Gildea, D. and D. Jurafsky. 2002. “Automatic Labeling of Semantic Roles.” Computational Linguistics 28(3): 245–288.

Supervised machine-learning based on trained text from FrameNet. Labels semantic roles such as “speaker”, “message” and “topic.”

Hurst, Matthew and Kamal Nigam. 2004. “Retrieving Topical Sentiments from Online Document Collection.” In Exploring Attitude and Affect in Text: Theories and Application (AAAI-EAAT2004), Stanford University, March. AAAI.

Lightweight but robust approach to combine topic and polarity, enabling content access systems to select content based on opinion about a certain topic.

Holsti, Ole R. 1969. Content Analysis for the Social Sciences and Humanities. Reading, Mass.: Addison-Wesley.

Holsti, Ole R. 1963. A System of Automated Content Analysis of Documents. Stanford, Calif.: Stanford University.

Outlines an automated method using sparse parsing and parts of speech tagging to identify semantic roles – the ‘agents,’ ‘actions’ and ‘targets’ of sentences – to correctly attribute tone to the target, or to account for “who does what to whom.”

Hopmann, P. Terrence and Timothy King. 1976. Interactions and Perceptions in the Test Ban Negotiations. International Studies Quarterly 20(1): 105-142.

Analyze transcripts of nuclear test ban negotiations to measure bargaining behaviour and perceptions of stakeholders according to positive and negative linguistic cues in the text using Holsti’s (1963; 1969) method to identify semantic roles, identifying who does what to whom.

Kim, Soo-Min and Eduard Hovy. 2006. “Extracting Opinions, Opinion Holders, and Topics Expressed in Online News Media Text.” In Proceedings of the COLING/ACL-06 Workshop on Sentiment and Subjectivity in Text, Sydney, AU.

Method for identifying an opinion with its holder and topic in a phrase using semantic structure, anchored to an opinion-bearing verb or adjective. Semantic roles are labeled using FrameNet data.

Kim, Soo-Min and Eduard Hovy. 2005. “Automatic Detection of Opinion Bearing Words and Sentences.” In Companion Volume to the Proceedings of IJCNLP-05, the Second International Joint Conference on Natural Language Processing, Jeju Island, KR: 61–66.

Method to identify opinion-bearing words to recognize opinion-bearing sentences with positive or negative valence. Describes a system that identifies the [Holder/Topic/Valence] opinion triad.

Pradhan, S., K. Hacioglu, W. Ward, J. Martin and D. Jurafsky. 2003. “Semantic Role Parsing: Adding Semantic Structure to Unstructured Text.” In Proceedings of the International Conference on Data Mining (ICDM-2003).

Technique for semantic parsing to assign WHO did WHAT to WHOM in sentences using support vector machines and human-coded training data.

Yi and Niblack. 2005. “Sentiment Mining in WebFountain.” Proceedings from the 21st International Conference on Data Engineering (ICDE). The Computer Society.

Describes WebFountain, a sentiment (opinion) miner application that classifies the sentiment of each subject reference rather than the document.

Subjectivity Detection

Baroni, Marco and Stefano Vegnaduzzo. 2004. “Identifying Subjective Adjectives through Web-based Mutual Information.” In Konferenz zur Verarbeitung natürlicher Sprache (KONVENS): 613–619.

Method for ranking adjectives according to a subjectivity score without knowledge-intensive resources (i.e. lexical databases, parsers, manual annotation). The method relies on small sets of “seeds” using frequency and co-occurrence on the web.

Hatzivassiloglou, V. and J. Wiebe. 2000. “Effects of Adjective Orientation and Gradability on Sentence Subjectivity.” In Proceedings of the Conference on Computational Linguistics (COLING-2000).

Study of the effect of dynamic adjectives, semantically oriented adjectives, and gradable adjectives on a simple subjectivity classifier to establish that they are strong predictors of subjectivity.

Morinaga, S., K. Yamanishi, K. Tateishi, and T. Fukushima. 2002. “Mining Product Reputations on the Web.” In Proceedings of KDD-02, 8th ACM International Conference on Knowledge Discovery and Data Mining, Edmonton, CA, 2002: 341-9.

Uses training data together with syntactic and linguistic rules to determine whether a statement on a target topic is an opinion and whether that opinion is positive or negative. Combines dictionaries, co-occurrence, sentences and correspondence analysis.

Pang, B. and L. Lee. 2004. “A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts.” In Proceedings of the 42nd ACL: 271-278.

Machine-learning approach to sentiment analysis that applies text-categorization techniques to the subjective portions of the document only.

Riloff, E., J. Wiebe and T. Wilson. 2003. “Learning Subjective Nouns Using Extraction Pattern Bootstrapping.” In Proceedings of the Seventh Conference on Natural Language Learning (CoNLL-03).

Learns lists of subjective nouns by bootstrapping, then trains a Naïve Bayes subjectivity classifier using the subjective nouns, discourse features and subjectivity clues.

Riloff, E., J. Wiebe and T. Wilson. 2003. “Learning Extraction Patterns for Subjective Expressions.” In Proceedings of EMNLP-03, 8th Conference on Empirical Methods in Natural Language Processing, Sapporo, JP: 105–112.

Bootstrapping process to extract patterns for subjective expressions. Classifiers label unannotated data to create a training set which is then given to an extraction pattern learning algorithm. The learned patterns are then used to classify more subjective sentences.

Thomas, M., B. Pang and L. Lee. 2006. “Get Out the Vote: Determining Support or Opposition from Congressional Floor-Debate Transcripts.” In Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing (EMNLP-06): 327-335.

Considers inter-document similarity (rather than independent documents) to classify sentiment-polarity by incorporating rules to account for agreement between speakers. Concludes it is not possible given state of the art sentiment analysis techniques to determine the relationship between facts and speaker opinion.

Wiebe, Janyce and Rada Mihalcea. 2006. “Word Sense and Subjectivity.” In Proceedings of COLING/ACL-2006, Sydney, Australia.

Explores the interaction between subjectivity and meaning. Presents empirical evidence that subjectivity can be associated with word senses and word sense disambiguation benefits from subjectivity annotations.

Wiebe, Janyce and Ellen Riloff. 2005. “Creating Subjective and Objective Sentence Classifiers from Unannotated Texts.” In Proceedings of CICLing-05, International Conference on Intelligent Text Processing and Computational Linguistics, volume 3406 of Lecture Notes in Computer Science, Mexico City, MX. Springer-Verlag: 475-486.

Builds subjectivity classifiers that use unannotated texts for training.

Wiebe, Janyce, Theresa Wilson, Rebecca Bruce, Matthew Bell and Melanie Martin. 2004. “Learning Subjective Language.” Computational Linguistics 30(3): 277-308.

Tests a series of features to detect subjectivity in corpora, which are shown to perform consistently (well or poorly) on the same data sets. A higher-density of features is the best indicator of a word’s subjectivity.

Wiebe, Janyce, Eric Breck, Chris Buckley, Claire Cardie, Paul Davis, Bruce Fraser, Diane Litman, David Pierce, Ellen Riloff, Theresa Wilson, David Day and Mark Maybury. 2003. “Recognizing and Organizing Opinions Expressed in World Press.” In Proceedings of the AAAI Spring Symposium on New Directions in Question Answering.

Reports the results of a workshop on multiple perspectives, discussing the importance of opinion mining and presenting a framework for learning opinions from text.

Wiebe, J. 2002. “Instructions for Annotating Opinions in Newspaper Articles.” Technical Report TR-02-101, Department of Computer Science, University of Pittsburgh, Pittsburgh, Pennsylvania.

A comprehensive and detailed training guide for manual coding classifications for multiple automated techniques.

Wiebe, J. 2000. “Learning Subjective Adjectives from Corpora.” In Proceedings of the 17th National Conference on Artificial Intelligence (AAAI-2000), Menlo Park, CA: 735–740.

Subjectivity tagging to distinguish opinions and evaluations from sentences used to objectively present factual information using a method for clustering words according to distributional similarity seeded by manual annotation. Uses lexical semantic features of adjectives (polarity and gradability).

Wiebe, J., R. Bruce and T. O’Hara. 1999. “Development and Use of a Gold Standard Data Set For Subjectivity Classifications.” In Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics (ACL-99), 246–253.

Developing an automatic subjectivity classifier.

Wiebe, Janyce. 1994. “Tracking Point of View in Narrative.” Computational Linguistics 20(2):233–287.

The beginning of Janyce Wiebe’s legacy on subjectivity detection.

Wilson, Theresa, Janyce Wiebe and Paul Hoffmann. 2005. Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis. In Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP), Vancouver, BC.

Determines whether an expression is neutral or polar, then disambiguates the polarity of the polar expressions.

Wilson, T., J. Wiebe and R. Hwa. 2006. “Recognizing Strong and Weak Opinion Clauses.” Computational Intelligence 22(2): 73-99.

Classifies the intensity of opinions and other types of subjectivity and the subjectivity of deeply nested clauses. Uses support vector regression.

Yu, H. and V. Hatzivassiloglou. 2003. “Towards Answering Opinion Questions: Separating Facts from Opinions and Identifying the Polarity of Opinion Sentences.” In Michael Collins and Mark Steedman (eds.), Proceedings of EMNLP-03, 8th Conference on Empirical Methods in Natural Language Processing, Sapporo, Japan: 129–136.

Separates fact from opinion at the document and sentence level, using a Bayesian classifier to distinguish between documents with opinions (such as editorials) and regular news stories. Three unsupervised statistical techniques are used to detect opinions and identify the polarity of opinion sentences.