# LumberJack: Intelligent Discovery and Analysis of Web User Traffic Composition
**Source**: https://idl.uw.edu/papers/lumberjack
**Parent**: https://idl.uw.edu/papers
Ed H. Chi, Adam Rosien, [Jeffrey Heer](http://homes.cs.washington.edu/~jheer/).
Proc. WEBKDD Workshop, 2002
Ed H. Chi, Adam Rosien, [Jeffrey Heer](http://homes.cs.washington.edu/~jheer/)
Proc. WEBKDD Workshop, 2002
Materials
[PDF](https://idl.cs.washington.edu/files/2002-Lumberjack-WEBKDD.pdf)
Abstract
Web Usage Mining enables new understanding of user goals on the Web. This understanding has broad applications, and traditional mining techniques such as association rules have been used in business applications. We have developed an automated method to directly infer the major groupings of user traffic on a Web site [Heer01]. We do this by utilizing multiple data features in a clustering analysis. We have performed an extensive, systematic evaluation of the proposed approach, and have discovered that certain clustering schemes can achieve categorization accuracies as high as 99% [Heer02b]. In this paper, we describe the further development of this work into a prototype service called LumberJack, a push-button analysis system that is both more automated and accurate than past systems.
BibTeX
```
@inproceedings{2002-lumberjack,
title = {LumberJack: Intelligent Discovery and Analysis of Web User Traffic Composition},
author = {Chi, Ed AND Rosien, Adam AND Heer, Jeffrey},
booktitle = {Proc. WEBKDD Workshop},
year = {2002},
pages = {1--16},
url = {https://idl.uw.edu/papers/lumberjack},
doi = {10.1007/978-3-540-39663-5_1}
}
```
{"status":200,"statusText":"","headers":{},"body":"[\n {\n \"fullName\": \"Proc. ACM Human Factors in Computing Systems (CHI)\",\n \"nickname\": \"CHI\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"IEEE Trans. Visualization & Comp. Graphics (Proc. VIS)\",\n \"nickname\": \"VIS\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Computer Graphics Forum (Proc. EuroVis)\",\n \"nickname\": \"EuroVis\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proc. EuroVis Short Papers\",\n \"nickname\": \"EuroVis-Short\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Proc. IEEE VIS Short Papers\",\n \"nickname\": \"VIS-Short\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Proc. ACM User Interface Software & Technology (UIST)\",\n \"nickname\": \"UIST\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Proc. ACM Computer-Supported Cooperative Work (CSCW)\",\n \"nickname\": \"CSCW\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Proc. ACM Intelligent User Interfaces\",\n \"nickname\": \"IUI\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"ACM Trans. on Computer-Human Interaction\",\n \"nickname\": \"ACM TOCHI\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proc. Advanced Visual Interfaces\",\n \"nickname\": \"AVI\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Proc. Conference on Innovative Data Systems Research (CIDR)\",\n \"nickname\": \"CIDR\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Proc. Very Large Database Endowment (PVLDB)\",\n \"nickname\": \"PVLDB\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proc. Empirical Methods in Natural Language Processing\",\n \"nickname\": \"EMNLP\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Proc. NAACL-HLT\",\n \"nickname\": \"NAACL-HLT\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Proc. International Conference on Weblogs and Social Media (ICWSM)\",\n \"nickname\": \"ICWSM\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"IEEE Trans. Visualization & Comp. Graphics (Proc. InfoVis)\",\n \"nickname\": \"InfoVis\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Beautiful Data\",\n \"nickname\": \"Beautiful Data\",\n \"venueType\": \"book\"\n },\n {\n \"fullName\": \"Information Visualization Journal\",\n \"nickname\": \"IV Journal\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proc. IEEE Visual Analytics Science & Technology (VAST)\",\n \"nickname\": \"VAST\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Cortex\",\n \"nickname\": \"Cortex\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proc. Hawaii International Conference on Systems Sciences (HICSS)\",\n \"nickname\": \"HICSS\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Proc. IEEE Information Visualization (InfoVis)\",\n \"nickname\": \"InfoVis (Pre-TVCG)\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Proc. Ubiquitous Computing\",\n \"nickname\": \"UbiComp\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Proc. WEBKDD Workshop\",\n \"nickname\": \"WEBKDD\",\n \"venueType\": \"workshop\"\n },\n {\n \"fullName\": \"ACM Trans. on Information Systems\",\n \"nickname\": \"ACM TOIS\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Communications of the ACM\",\n \"nickname\": \"CACM\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proc. Workshop on Social Network Mining & Analysis, ACM KDD\",\n \"nickname\": \"SNAKDD\",\n \"venueType\": \"workshop\"\n },\n {\n \"fullName\": \"Proc. Social Visualization Workshop, ACM CHI\",\n \"nickname\": \"CHI Social Vis\",\n \"venueType\": \"workshop\"\n },\n {\n \"fullName\": \"Proc. AVI Workshop on Invisible & Transparent Interfaces\",\n \"nickname\": \"AVI ITI\",\n \"venueType\": \"workshop\"\n },\n {\n \"fullName\": \"Proc. Color Imaging Conference\",\n \"nickname\": \"Color Imaging Conf.\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Proc. Workshop on Applications for Topic Models, NIPS\",\n \"nickname\": \"NIPS Topic Model Ws\",\n \"venueType\": \"workshop\"\n },\n {\n \"fullName\": \"Proc. Mining Software Repositories\",\n \"nickname\": \"MSR\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Journal of Animal Ecology\",\n \"nickname\": \"J Anim Eco\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"J Am Med Inform Assoc\",\n \"nickname\": \"JAMIA\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proc. International Conference on Machine Learning (ICML)\",\n \"nickname\": \"ICML\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Computer Graphics and Applications\",\n \"nickname\": \"CG&A\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proc. IEEE Biological Data Visualization (BioVis)\",\n \"nickname\": \"BioVis\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Poetics\",\n \"nickname\": \"Poetics\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proc. ACM Web Search and Data Mining (WSDM)\",\n \"nickname\": \"WSDM\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Proc. User Modeling and User-Adapted Interaction (UMUAI)\",\n \"nickname\": \"UMUAI\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proc. Workshop on Eye Tracking and Visualization (ETVIS)\",\n \"nickname\": \"ETVIS\",\n \"venueType\": \"workshop\"\n },\n {\n \"fullName\": \"Trends in Ecology & Evolution\",\n \"nickname\": \"TREE\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"PLOS ONE\",\n \"nickname\": \"PLOS ONE\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proc. ACM SIGMOD Human-in-the-Loop Data Analysis (HILDA)\",\n \"nickname\": \"HILDA\",\n \"venueType\": \"workshop\"\n },\n {\n \"fullName\": \"IEEE Trans. Visualization & Comp. Graphics (Proc. VAST)\",\n \"nickname\": \"VAST-TVCG\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proc. Workshop on Dealing with Cognitive Biases in Visualisations (DECISIVe), IEEE VIS\",\n \"nickname\": \"DECISIVe\",\n \"venueType\": \"workshop\"\n },\n {\n \"fullName\": \"arXiv\",\n \"nickname\": \"arXiv\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"The Journal of Open Source Software\",\n \"nickname\": \"JOSS\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proceedings of the National Academy of Sciences\",\n \"nickname\": \"PNAS\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proc. Association for Computational Linguistics (ACL)\",\n \"nickname\": \"ACL\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Distill\",\n \"nickname\": \"Distill\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Harvard Data Science Review\",\n \"nickname\": \"HDSR\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Organizational Behavior and Human Decision Processes\",\n \"nickname\": \"OBHDP\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"EPJ Data Science\",\n \"nickname\": \"EPJ-DS\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proc. IEEE Symposium on Visual Languages and Human Centric Computing (VL/HCC)\",\n \"nickname\": \"VL/HCC\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Proc. ACM Management of Data (SIGMOD)\",\n \"nickname\": \"SIGMOD\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Companion of ACM Management of Data (SIGMOD)\",\n \"nickname\": \"SIGMOD-Demo\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"IEEE Trans. Visualization & Comp. Graphics\",\n \"nickname\": \"TVCG\",\n \"venueType\": \"journal\"\n },\n {\n \"fullName\": \"Proc. ACM Creativity & Cognition\",\n \"nickname\": \"C&C\",\n \"venueType\": \"conference\"\n },\n {\n \"fullName\": \"Workshop on Intelligent and Interactive Writing Assistants (In2Writing)\",\n \"nickname\": \"In2Writing\",\n \"venueType\": \"workshop\"\n }\n]\n"}
{"status":200,"statusText":"","headers":{},"body":"{\n \"title\": \"LumberJack: Intelligent Discovery and Analysis of Web User Traffic Composition\",\n \"year\": 2002,\n \"start\_page\": 1,\n \"end\_page\": 16,\n \"volume\": null,\n \"issue\": null,\n \"editors\": \"\",\n \"publisher\": \"\",\n \"location\": \"\",\n \"pdf\": \"https://idl.cs.washington.edu/files/2002-Lumberjack-WEBKDD.pdf\",\n \"abstract\": \"Web Usage Mining enables new understanding of user goals on the Web. This understanding has broad applications, and traditional mining techniques such as association rules have been used in business applications. We have developed an automated method to directly infer the major groupings of user traffic on a Web site [Heer01]. We do this by utilizing multiple data features in a clustering analysis. We have performed an extensive, systematic evaluation of the proposed approach, and have discovered that certain clustering schemes can achieve categorization accuracies as high as 99% [Heer02b]. In this paper, we describe the further development of this work into a prototype service called LumberJack, a push-button analysis system that is both more automated and accurate than past systems.\",\n \"thumbnail\": \"images/thumbs/lumberjack.gif\",\n \"figure\": \"\",\n \"caption\": \"\",\n \"web\_name\": \"lumberjack\",\n \"visible\": true,\n \"mod\_date\": \"2009-08-04\",\n \"note\": null,\n \"pub\_date\": \"2002-04-02\",\n \"venue\": \"WEBKDD\",\n \"authors\": [\n {\n \"first\_name\": \"Ed\",\n \"last\_name\": \"Chi\",\n \"display\_name\": \"Ed H. Chi\"\n },\n {\n \"first\_name\": \"Adam\",\n \"last\_name\": \"Rosien\"\n },\n {\n \"first\_name\": \"Jeffrey\",\n \"last\_name\": \"Heer\",\n \"url\": \"http://homes.cs.washington.edu/~jheer/\"\n }\n ],\n \"materials\": [],\n \"tags\": [],\n \"doi\": \"10.1007/978-3-540-39663-5\_1\"\n}"}
{
\_\_sveltekit\_17copn9 = {
base: new URL("..", location).pathname.slice(0, -1),
assets: "/uwdata.github.io"
};
const element = document.currentScript.parentElement;
const data = [null,null];
Promise.all([
import("../\_app/immutable/entry/start.CZdZnu7S.js"),
import("../\_app/immutable/entry/app.qRA-U4ZQ.js")
]).then(([kit, app]) => {
kit.start(app, element, {
node\_ids: [0, 7],
data,
form: null,
error: null
});
});
}