Metadata
Title
An Algorithm and Analysis of Social Topologies from Email and Photo Tags
Category
general
UUID
f5dd005e882944dfacd9164cb76a9cfc
Source URL
https://idl.uw.edu/papers/analysis-social-topologies
Parent URL
https://idl.uw.edu/papers
Crawl Time
2026-03-11T03:13:31+00:00
Rendered Raw Markdown
# An Algorithm and Analysis of Social Topologies from Email and Photo Tags

**Source**: https://idl.uw.edu/papers/analysis-social-topologies
**Parent**: https://idl.uw.edu/papers

T. J. Purtell, [Diana MacLean](http://www.stanford.edu/~malcdi/), Seng Keat Teh, Sudheendra Hangal, Monica S. Lam, [Jeffrey Heer](http://homes.cs.washington.edu/~jheer/).
Proc. Workshop on Social Network Mining & Analysis, ACM KDD, 2011

Materials

[PDF](https://idl.cs.washington.edu/files/2011-SocialTopologies-SNAKDD.pdf)  | [Application](http://mobisocial.stanford.edu/groupgenie/) | [Software](http://github.com/mobisocial/groupgenie-algo/)

Abstract

As people's participation in social media increases, online social identities accumulate contacts and data. We need a mechanism for creating a succinct but contextually rich representation of a person’s "social landscape" that would facilitate activities such as browsing personal social media feeds or sharing data with nuanced social groups.
We formulate the social topology extraction problem as the compression of a group-tagged data set, in which each group has a significance value, into a set containing a smaller number of overlapping and nested groups that best represent the value of the initial data set. We present four variants of a greedy algorithm that constructs a user's social topology based on egocentric group communication data. We analyze our algorithm variants on about 2,000 personal email accounts and 1,100 tagged Facebook photograph collections. We find that our algorithm variants produce different topologies suitable for different purposes.
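
The compression framing above can be illustrated with a small sketch. This is not the paper's algorithm or any of its four variants: the function names, the Jaccard-similarity merge rule, and the choice to sum significance values on a merge are all assumptions made for illustration. It only shows the general shape of a greedy pass that reduces a set of weighted groups to a smaller set.

```python
from itertools import combinations


def jaccard(a, b):
    """Overlap between two non-empty member sets, in [0, 1]."""
    return len(a & b) / len(a | b)


def greedy_compress(groups, max_groups):
    """Merge the most similar pair of groups until at most
    `max_groups` remain.

    `groups` maps frozenset(members) -> significance value.
    The Jaccard merge rule and the summing of values are illustrative
    assumptions, not the criterion used in the paper.
    """
    if max_groups < 1:
        raise ValueError("max_groups must be at least 1")
    current = dict(groups)
    while len(current) > max_groups:
        # Pick the pair of groups with the largest member overlap.
        a, b = max(combinations(current, 2), key=lambda p: jaccard(*p))
        merged = a | b
        value = current.pop(a) + current.pop(b)
        # The union may coincide with a group that already exists.
        current[merged] = current.get(merged, 0) + value
    return current


# Hypothetical input: three email recipient groups with message counts.
example = {
    frozenset({"ann", "bob"}): 5,
    frozenset({"ann", "bob", "cat"}): 3,
    frozenset({"dan", "eve"}): 2,
}
compressed = greedy_compress(example, max_groups=2)
```

With this input, the two overlapping groups are merged into one and the disjoint group is kept. A real topology extractor would also need to preserve nested groups, which this sketch deliberately does not attempt.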
We show that our algorithm can capture 80% of the input data set value with 20% and 42% of the number of input groups for email and photographs, respectively. Using edit distance as an objective metric, we also show that our algorithm outperforms Newman’s modularity-based clustering algorithm. We conclude that our algorithm is appropriately designed to find significant groups of friends from social contact data.
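
To make the "80% of the value with 20% of the groups" style of measurement concrete, the following sketch counts how many of the highest-value groups are needed to reach a target share of the total value. It is a simplification under stated assumptions: the function name `groups_needed` is hypothetical, and value is counted only for groups kept exactly, whereas the paper measures how well an extracted topology represents the input.

```python
def groups_needed(values, target=0.8):
    """Smallest number of highest-value groups whose values sum to at
    least `target` of the total. Returns 0 for an empty input."""
    total = sum(values)
    running = 0
    for count, value in enumerate(sorted(values, reverse=True), start=1):
        running += value
        if running >= target * total:
            return count
    return len(values)


# Hypothetical significance values for four input groups.
values = [10, 50, 5, 35]
needed = groups_needed(values)       # groups required for 80% of value
fraction = needed / len(values)      # share of the input groups used
```

Here the two largest groups (50 and 35) already cover 85 of 100 units of value, so half of the groups suffice in this toy example.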

BibTeX

```
@inproceedings{2011-analysis-social-topologies,
  title = {An Algorithm and Analysis of Social Topologies from Email and Photo Tags},
  author = {Purtell, T. J. AND MacLean, Diana AND Teh, Seng Keat AND Hangal, Sudheendra AND Lam, Monica S. AND Heer, Jeffrey},
  booktitle = {Proc. Workshop on Social Network Mining \& Analysis, ACM KDD},
  year = {2011},
  url = {https://idl.uw.edu/papers/analysis-social-topologies}
}
```

