Abstract: In this report we give a survey of the state of the art in text categorisation. To be able to
measure progress in this field, it is important to use a standardised collection of documents
for analysis and testing. One such data set is the Reuters-21578 collection of newswire stories
from 1987, and our survey will focus on the work on text categorisation that has
used this collection for testing.
Context of citations to this paper:
.... and text categorization. Van Rijsbergen [11] has carried out seminal work in this area, while excellent modern surveys have been done in [12] and [13] while [14] also provides a helpful tutorial. Various techniques have been proposed that aim to develop accurate methods for...
...category. While binary and multi class problems were investigated extensively, multi label problems have received very little attention [1]. To assign documents to categories, text categorization methods usually employ dictionaries consisting of words extracted from training...
Cited by:
Using Text Categorization Techniques for Intrusion Detection - Yihua Liao Rao (2002)
A hierarchical text categorization approach and its.. - Tikk, Biró, Yang
Fuzzy Relational Thesauri in Information Retrieval.. - Tikk, Yang, Baranyi.. (2002)
BibTeX entry:
K. Aas and L. Eikvil. Text categorisation: A survey. Technical report, Norwegian Computing Center, June 1999. http://citeseer.ist.psu.edu/aas99text.html
@misc{ aas99text,
author = "K. Aas and L. Eikvil",
title = "Text categorisation: A survey",
text = "K. Aas and L. Eikvil. Text categorisation: A survey. Technical report,
Norwegian Computing Center, June 1999.",
year = "1999",
url = "citeseer.ist.psu.edu/aas99text.html" }
Citations (may not include all citations):
1680 : Pattern Classification and Scene Analysis - Duda, Hart (1973)
1044 : Classification and Regression Trees - Breiman, Friedman et al. (1984)
940 : An Introduction to Modern Information Retrieval - Salton, McGill (1983)
445 : Bagging predictors - Breiman (1996)
381 : Indexing by Latent Semantic Analysis - Deerwester, Dumais et al. (1990)
327 : Experiments with a new boosting algorithm - Freund, Schapire (1996)
304 : Term weighting approaches in automatic text retrieval - Salton, Buckley (1988)
252 : An Algorithm for Suffix Stripping - Porter (1980)
211 : Text categorization with support vector machines: Learning w.. - Joachims (1998)
211 : Text Categorization with Support Vector Machines: Learning w.. - Joachims (1997)
181 : Relevance feedback in information retrieval - Rocchio (1971)
136 : Using linear algebra for intelligent information retrieval - Berry, Dumais et al. (1995)
117 : Additive Logistic Regression: a Statistical View of Boosting - Friedman, Hastie et al. (1998)
99 : An Evaluation of Statistical Approaches to Text Categorizati.. - Yang (1997)
91 : A sequential algorithm for training text classifiers - Lewis, Gale (1994)
87 : A probabilistic analysis of the Rocchio algorithm with TFIDF.. - Joachims (1997)
73 : Context-sensitive learning methods for text categorization - Cohen, Singer (1996)
58 : Improving the retrieval of information from external sources - Dumais (1991)
44 : A neural network approach to topic spotting - Wiener, Pedersen et al. (1993)
35 : Feature selection in statistical learning of text categoriza.. - Yang, Pedersen (1997)
35 : Automatic Query Expansion Using SMART: TREC - Buckley, Salton et al. (1994)
21 : An exploratory technique for investigating large quantities .. - Kass (1980)
12 : Department of Computer Science - Berry, Do et al. (1993)
9 : Information Management Tools for Updating an SVD-Encoded Ind.. - O'Brien (1994)
4 : A comparison of two learning algorithms for text classificat.. - Lewis, Ringuette (1994)
2 : Programming for machine learning - Quinlan (1993)
2 : Automatic Text Categorization Using Support Vector Machine - Kwok (1998)
1 : BoosTexter: A System for Multi-Label Text Categorization - Schapire, Singer (1998)
1 : Norwegian Computing Center - Weiss, Apte et al. (1999)
1 : Inductive Learning Algorithms and Representations for.. - Dumais, Platt et al. (1998)
Documents on the same site (http://www.nr.no/research/samba/textmining.html):
Information Extraction from World Wide Web - A Survey - Eikvil (1999)