Metadata
Title
Topic Modeling for the Social Sciences
Category
general
UUID
984842e5a7764b608b43d3772a6270f0
Source URL
https://idl.uw.edu/papers/topic-modeling-social-sciences
Parent URL
https://idl.uw.edu/papers
Crawl Time
2026-03-11T03:12:23+00:00
Rendered Raw Markdown

Topic Modeling for the Social Sciences

Source: https://idl.uw.edu/papers/topic-modeling-social-sciences Parent: https://idl.uw.edu/papers

Daniel Ramage, Evan Rosen, Jason Chuang, Christopher D. Manning, Daniel A. McFarland. Proc. Workshop on Applications for Topic Models, NIPS, 2009

Daniel Ramage, Evan Rosen, Jason Chuang, Christopher D. Manning, Daniel A. McFarland

Proc. Workshop on Applications for Topic Models, NIPS, 2009

Materials

PDF | Software

Abstract

As textual datasets grow in size and scope, social scientists need better tools to help make sense of that data. Despite the natural applicability of topic modeling to many such problems, word counts and tag clouds are often used as the primary means of gleaning information from textual data. We characterize two barriers to adoption encountered during a collaboration between the Stanford NLP group and social scientists in the school of education: accessibility and trust. Accessibility refers to the technical barriers that make text processing and topic modeling difficult. Trust comes when practitioners can explore and validate a model being used to discover or support a hypothesis. We introduce recent work aimed at solving these challenges including the Stanford Topic Modeling Toolbox software.

BibTeX

@inproceedings{2009-topic-modeling-social-sciences,
  title = {Topic Modeling for the Social Sciences},
  author = {Ramage, Daniel AND Rosen, Evan AND Chuang, Jason AND Manning, Christopher AND McFarland, Dan},
  booktitle = {Proc. Workshop on Applications for Topic Models, NIPS},
  year = {2009},
  url = {https://idl.uw.edu/papers/topic-modeling-social-sciences}
}

{"status":200,"statusText":"","headers":{},"body":"[\n {\n \"fullName\": \"Proc. ACM Human Factors in Computing Systems (CHI)\",\n \"nickname\": \"CHI\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"IEEE Trans. Visualization & Comp. Graphics (Proc. VIS)\",\n \"nickname\": \"VIS\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Computer Graphics Forum (Proc. EuroVis)\",\n \"nickname\": \"EuroVis\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proc. EuroVis Short Papers\",\n \"nickname\": \"EuroVis-Short\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Proc. IEEE VIS Short Papers\",\n \"nickname\": \"VIS-Short\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Proc. ACM User Interface Software & Technology (UIST)\",\n \"nickname\": \"UIST\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Proc. ACM Computer-Supported Cooperative Work (CSCW)\",\n \"nickname\": \"CSCW\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Proc. ACM Intelligent User Interfaces\",\n \"nickname\": \"IUI\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"ACM Trans. on Computer-Human Interaction\",\n \"nickname\": \"ACM TOCHI\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proc. Advanced Visual Interfaces\",\n \"nickname\": \"AVI\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Proc. Conference on Innovative Data Systems Research (CIDR)\",\n \"nickname\": \"CIDR\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Proc. Very Large Database Endowment (PVLDB)\",\n \"nickname\": \"PVLDB\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proc. Empirical Methods in Natural Language Processing\",\n \"nickname\": \"EMNLP\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Proc. NAACL-HLT\",\n \"nickname\": \"NAACL-HLT\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Proc. International Conference on Weblogs and Social Media (ICWSM)\",\n \"nickname\": \"ICWSM\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"IEEE Trans. Visualization & Comp. Graphics (Proc. InfoVis)\",\n \"nickname\": \"InfoVis\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Beautiful Data\",\n \"nickname\": \"Beautiful Data\",\n \"venueType\": \"book\"\n },\n {\n \"fullName\": \"Information Visualization Journal\",\n \"nickname\": \"IV Journal\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proc. IEEE Visual Analytics Science & Technology (VAST)\",\n \"nickname\": \"VAST\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Cortex\",\n \"nickname\": \"Cortex\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proc. Hawaii International Conference on Systems Sciences (HICSS)\",\n \"nickname\": \"HICSS\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Proc. IEEE Information Visualization (InfoVis)\",\n \"nickname\": \"InfoVis (Pre-TVCG)\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Proc. Ubiquitous Computing\",\n \"nickname\": \"UbiComp\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Proc. WEBKDD Workshop\",\n \"nickname\": \"WEBKDD\",\n \"venueType\": \"workshop\"\n },\n {\n \"fullName\": \"ACM Trans. on Information Systems\",\n \"nickname\": \"ACM TOIS\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Communications of the ACM\",\n \"nickname\": \"CACM\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proc. Workshop on Social Network Mining & Analysis, ACM KDD\",\n \"nickname\": \"SNAKDD\",\n \"venueType\": \"workshop\"\n },\n {\n \"fullName\": \"Proc. Social Visualization Workshop, ACM CHI\",\n \"nickname\": \"CHI Social Vis\",\n \"venueType\": \"workshop\"\n },\n {\n \"fullName\": \"Proc. AVI Workshop on Invisible & Transparent Interfaces\",\n \"nickname\": \"AVI ITI\",\n \"venueType\": \"workshop\"\n },\n {\n \"fullName\": \"Proc. Color Imaging Conference\",\n \"nickname\": \"Color Imaging Conf.\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Proc. Workshop on Applications for Topic Models, NIPS\",\n \"nickname\": \"NIPS Topic Model Ws\",\n \"venueType\": \"workshop\"\n },\n {\n \"fullName\": \"Proc. Mining Software Repositories\",\n \"nickname\": \"MSR\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Journal of Animal Ecology\",\n \"nickname\": \"J Anim Eco\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"J Am Med Inform Assoc\",\n \"nickname\": \"JAMIA\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proc. International Conference on Machine Learning (ICML)\",\n \"nickname\": \"ICML\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Computer Graphics and Applications\",\n \"nickname\": \"CG&A\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proc. IEEE Biological Data Visualization (BioVis)\",\n \"nickname\": \"BioVis\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Poetics\",\n \"nickname\": \"Poetics\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proc. ACM Web Search and Data Mining (WSDM)\",\n \"nickname\": \"WSDM\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Proc. User Modeling and User-Adapted Interaction (UMUAI)\",\n \"nickname\": \"UMUAI\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proc. Workshop on Eye Tracking and Visualization (ETVIS)\",\n \"nickname\": \"ETVIS\",\n \"venueType\": \"workshop\"\n },\n {\n \"fullName\": \"Trends in Ecology & Evolution\",\n \"nickname\": \"TREE\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"PLOS ONE\",\n \"nickname\": \"PLOS ONE\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proc. ACM SIGMOD Human-in-the-Loop Data Analysis (HILDA)\",\n \"nickname\": \"HILDA\",\n \"venueType\": \"workshop\"\n },\n {\n \"fullName\": \"IEEE Trans. Visualization & Comp. Graphics (Proc. VAST)\",\n \"nickname\": \"VAST-TVCG\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proc. Workshop on Dealing with Cognitive Biases in Visualisations (DECISIVe), IEEE VIS\",\n \"nickname\": \"DECISIVe\",\n \"venueType\": \"workshop\"\n },\n {\n \"fullName\": \"arXiv\",\n \"nickname\": \"arXiv\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"The Journal of Open Source Software\",\n \"nickname\": \"JOSS\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proceedings of the National Academy of Sciences\",\n \"nickname\": \"PNAS\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proc. Association for Computational Linguistics (ACL)\",\n \"nickname\": \"ACL\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Distill\",\n \"nickname\": \"Distill\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Harvard Data Science Review\",\n \"nickname\": \"HDSR\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Organizational Behavior and Human Decision Processes\",\n \"nickname\": \"OBHDP\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"EPJ Data Science\",\n \"nickname\": \"EPJ-DS\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proc. IEEE Symposium on Visual Languages and Human Centric Computing (VL/HCC)\",\n \"nickname\": \"VL/HCC\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Proc. ACM Management of Data (SIGMOD)\",\n \"nickname\": \"SIGMOD\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Companion of ACM Management of Data (SIGMOD)\",\n \"nickname\": \"SIGMOD-Demo\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"IEEE Trans. Visualization & Comp. Graphics\",\n \"nickname\": \"TVCG\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proc. ACM Creativity & Cognition\",\n \"nickname\": \"C&C\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Workshop on Intelligent and Interactive Writing Assistants (In2Writing)\",\n \"nickname\": \"In2Writing\",\n \"venueType\": \"workshop\"\n }\n]\n"} {"status":200,"statusText":"","headers":{},"body":"{\n \"title\": \"Topic Modeling for the Social Sciences\",\n \"year\": 2009,\n \"start_page\": null,\n \"end_page\": null,\n \"volume\": null,\n \"issue\": null,\n \"editors\": \"\",\n \"publisher\": \"\",\n \"location\": \"\",\n \"pdf\": \"https://idl.cs.washington.edu/files/2009-TopicModels-NIPS-Workshop.pdf\",\n \"abstract\": \"As textual datasets grow in size and scope, social scientists need better tools to help make sense of that data. Despite the natural applicability of topic modeling to many such problems, word counts and tag clouds are often used as the primary means of gleaning information from textual data. We characterize two barriers to adoption encountered during a collaboration between the Stanford NLP group and social scientists in the school of education: accessibility and trust. Accessibility refers to the technical barriers that make text processing and topic modeling difficult. Trust comes when practitioners can explore and validate a model being used to discover or support a hypothesis. We introduce recent work aimed at solving these challenges including the Stanford Topic Modeling Toolbox software.\",\n \"thumbnail\": \"images/thumbs/topic-modeling-social-sciences.png\",\n \"figure\": \"\",\n \"caption\": \"\",\n \"web_name\": \"topic-modeling-social-sciences\",\n \"visible\": true,\n \"mod_date\": \"2010-08-23\",\n \"note\": \"\",\n \"pub_date\": \"2009-09-01\",\n \"venue\": \"NIPS Topic Model Ws\",\n \"authors\": [\n {\n \"first_name\": \"Daniel\",\n \"last_name\": \"Ramage\"\n },\n {\n \"first_name\": \"Evan\",\n \"last_name\": \"Rosen\"\n },\n {\n \"first_name\": \"Jason\",\n \"last_name\": \"Chuang\",\n \"url\": \"http://jason.chuang.info\"\n },\n {\n \"first_name\": \"Christopher\",\n \"last_name\": \"Manning\",\n \"display_name\": \"Christopher D. Manning\"\n },\n {\n \"first_name\": \"Dan\",\n \"last_name\": \"McFarland\",\n \"display_name\": \"Daniel A. McFarland\"\n }\n ],\n \"materials\": [\n {\n \"name\": \"Software\",\n \"link\": \"http://nlp.stanford.edu/software/tmt/tmt-0.2/\"\n }\n ],\n \"tags\": [],\n \"doi\": null\n}"}

{ __sveltekit_17copn9 = { base: new URL("..", location).pathname.slice(0, -1), assets: "/uwdata.github.io" }; const element = document.currentScript.parentElement; const data = [null,null]; Promise.all([ import("../_app/immutable/entry/start.CZdZnu7S.js"), import("../_app/immutable/entry/app.qRA-U4ZQ.js") ]).then(([kit, app]) => { kit.start(app, element, { node_ids: [0, 7], data, form: null, error: null }); }); }