Classifying Latent User Attributes in Twitter

Association for Computing Machinery — Oct 30, 2010

Datasource
Association for Computing Machinery
Copyright
The ACM Portal is published by the Association for Computing Machinery. Copyright © 2010 ACM, Inc.
ISBN
978-1-4503-0386-6
doi
10.1145/1871985.1871993

Abstract

Classifying Latent User Attributes in Twitter

Delip Rao, David Yarowsky, Abhishek Shreevats, Manaswi Gupta
Department of Computer Science, Johns Hopkins University
3400 N. Charles Street, Baltimore, MD 21218
{delip, yarowsky, ashreev1, mgupta7}@cs.jhu.edu

ABSTRACT

Social media outlets such as Twitter have become an important forum for peer interaction. Thus the ability to classify latent user attributes, including gender, age, regional origin, and political orientation, solely from Twitter user language or similar highly informal content has important applications in advertising, personalization, and recommendation. This paper includes a novel investigation of stacked-SVM-based classification algorithms over a rich set of original features, applied to classifying these four user attributes. It also includes extensive analysis of features and approaches that are effective and not effective in classifying user attributes in Twitter-style informal written genres, as distinct from the other primarily spoken genres previously studied in the user-property classification literature. Our models, singly and in ensemble, significantly outperform baseline models in all cases. A detailed analysis of model components and features provides an often entertaining insight into distinctive language-usage variation across gender, age, regional origin, and political orientation in modern informal communication.

Categories and Subject Descriptors: I.2.7
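
The abstract names stacked SVM classifiers without describing their construction. As a rough illustration of the general idea only, and not the authors' implementation, the sketch below trains one linear SVM per feature view and then a second-level linear SVM on the base models' decision scores. Everything in it is an assumption made for illustration: the two feature views (NGRAM_VIEW, STYLE_VIEW), the toy counts, the binary labels, the plain subgradient-descent trainer, and all function names. For brevity the second-level model is trained on scores from the same examples the base models saw; a careful setup would use held-out scores instead.

```python
# Hypothetical toy data: each user has two feature views (made-up counts)
# and a binary attribute label in {-1, +1}.
NGRAM_VIEW = [[2, 0], [3, 1], [2, 1], [3, 0], [0, 2], [1, 3], [0, 3], [1, 2]]
STYLE_VIEW = [[2, 0], [1, 0], [2, 1], [3, 1], [0, 1], [0, 2], [1, 2], [1, 3]]
LABELS = [1, 1, 1, 1, -1, -1, -1, -1]


def train_linear_svm(X, y, epochs=500, lr=0.01, lam=0.01):
    """Fit a linear SVM (hinge loss + L2 penalty) by subgradient descent.

    X is a list of feature lists, y a list of labels in {-1, +1}.
    Returns (weights, bias).
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # L2 regularization shrinks the weights on every step.
            w = [wj * (1.0 - lr * lam) for wj in w]
            # The hinge loss contributes only when the margin is violated.
            if margin < 1.0:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def decision_score(model, x):
    """Signed score of example x under a (weights, bias) model."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_stacked(views, y):
    """Train one base SVM per view, then a meta SVM on their scores."""
    base_models = [train_linear_svm(X, y) for X in views]
    # Simplification: scores come from the training examples themselves.
    meta_X = [
        [decision_score(m, X[i]) for m, X in zip(base_models, views)]
        for i in range(len(y))
    ]
    meta_model = train_linear_svm(meta_X, y)
    return base_models, meta_model


def predict_stacked(stacked, example_views):
    """Predict +1 or -1 for one user, given one feature vector per view."""
    base_models, meta_model = stacked
    meta_x = [decision_score(m, x) for m, x in zip(base_models, example_views)]
    return 1 if decision_score(meta_model, meta_x) >= 0 else -1


if __name__ == "__main__":
    stacked = train_stacked([NGRAM_VIEW, STYLE_VIEW], LABELS)
    print(predict_stacked(stacked, [[3, 0], [3, 0]]))
    print(predict_stacked(stacked, [[0, 3], [0, 3]]))
```

Stacking of this kind lets each base model specialize on one kind of evidence while the second-level model learns how much weight to give each of them.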
