
Document Classification and Text Categorization

  1. Tong Zhang, Frank J. Oles
    2001
    Text Categorization Based on Regularized Linear Classification Methods.
    Information Retrieval, 4(1), pages 5-31.

  2. Ken Lang
    1995
    NewsWeeder: Learning to Filter Netnews
    Proceedings of the 12th International Conference on Machine Learning

  3. Shuigeng Zhou and Jihong Guan
    2002
    Chinese document classification based on N-grams.
    Proceedings of The Third International Conference of Computational Linguistics and Intelligent Text Processing, (CICLing2002). pages 405-414. Mexico-City, Mexico, February 17-23, 2002.

  4. Hersh WR, Buckley C, Leone TJ, Hickam DH,
    1994
    OHSUMED: an interactive retrieval evaluation and new large test collection for research
    Proceedings of the 17th Annual ACM SIGIR Conference on Research and Development in Information Retrieval, 1994, 192-201.

  5. Daphne Koller, Mehran Sahami
    1997
    Hierarchically classifying documents using very few words
    Proceedings of ICML-97, 14th International Conference on Machine Learning

  6. Ji He, Ah-Hwee Tan, and Chew-Lim Tan.
    2001
    "On Machine Learning Methods for Chinese Documents Classification".
    Applied Intelligence's Special Issue on Text and Web Mining, in press. 2001.

  7. William B. Cavnar, John M. Trenkle
    1994
    N-Gram-Based Text Categorization
    Proceedings of SDAIR-94, 3rd Annual Symposium on Document Analysis and Information Retrieval

  8. Apte, C., Damerau, F., and Weiss, S.
    1994
    Towards language independent automated learning of text categorization models,
    In Proceedings of the 17th Annual ACM/SIGIR conference, 1994.

  9. David D. Lewis, William A. Gale
    1994
    A Sequential Algorithm for Training Text Classifiers
    In Proceedings of the 17th Annual ACM/SIGIR conference, 1994.

  10. Shivakumar Vaithyanathan, Jianchang Mao, Byron Dom
    2000.
    Hierarchical Bayes for Text Classification
    PRICAI Workshop on Text and Web Mining

  11. Andrew McCallum.
    1999
    Multi-Label Text Classification with a Mixture Model Trained by EM.
    Revised version of paper appearing in AAAI'99 Workshop on Text Learning.

  12. Doug Baker, Andrew McCallum.
    1998.
    Distributional Clustering of Words for Text Classification.
    SIGIR-98.

  13. Bing Liu, Wee Sun Lee, Philip S Yu and Xiaoli Li.
    2002
    Partially Supervised Classification of Text Documents.
    To appear in the Proceedings of the Nineteenth International Conference on Machine Learning (ICML-2002)

  14. Fabrizio Sebastiani, Alessandro Sperduti and Nicola Valdambrini
    2000
    An improved boosting algorithm and its application to automated text categorization.
    In Arvin Agah, Jamie Callan and Elke Rundensteiner (eds.), Proceedings of CIKM-00, 9th ACM International Conference on Information and Knowledge Management, McLean, US, 2000, pp. 78-85.

  15. Fabrizio Sebastiani
    2002
    Machine learning in automated text categorization.
    ACM Computing Surveys, 34(1):1-47, 2002.

  16. David D. Lewis, Robert E. Schapire, James P. Callan, and Ron Papka.
    1996.
    Training algorithms for linear text classifiers.
    In Hans-Peter Frei, Donna Harman, Peter Schauble, and Ross Wilkinson, editors, SIGIR '96: Proceedings of the 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 298-306, Konstanz, 1996. Hartung-Gorre Verlag.

  17. David D. Lewis and Marc Ringuette.
    1994
    A comparison of two learning algorithms for text categorization.
    In Third Annual Symposium on Document Analysis and Information Retrieval, pages 81-93, Las Vegas, NV, April 11-13 1994. ISRI; Univ. of Nevada, Las Vegas.

  18. David D. Lewis.
    1996
    Challenges in machine learning for text classification.
    In Proceedings of the Ninth Annual Conference on Computational Learning Theory, page 1, New York, 1996. ACM.

  19. David Lewis
    1995
    Evaluating and Optimizing Autonomous Text Classification Systems
    Proceedings of the 18th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 246-254. Seattle, Washington.

  20. Lewis D D,
    1992
    "Representation and Learning in Information Retrieval",
    Ph.D. dissertation, University of Massachusetts, 1992

  21. D. D. Lewis.
    1992
    An evaluation of phrasal and clustered representations on a text categorization task.
    In Proceedings of the 15th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 37-50, June 1992.

  22. Robert E. Schapire and Yoram Singer.
    2000
    BoosTexter: A boosting-based system for text categorization.
    Machine Learning, 39(2/3):135-168, 2000.

  23. Rayid Ghani.
    2001.
    Using Error-Correcting Codes for Efficient Text Classification with a Large Number of Categories.
    Masters Project Report. Center for Automated Learning & Discovery, Carnegie Mellon University (2001)

  24. Sam Scott, Stan Matwin.
    1999.
    Feature Engineering for Text Classification.
    Machine Learning: Proceedings of the Sixteenth International Conference 1999, pp. 379-388.

  25. Sam Scott
    1998.
    Feature Engineering for a Symbolic Approach to Text Classification
    Masters Thesis. University of Ottawa, Computer Science Department Technical Report TR-98-09.

  26. Susana Eyheramendy, David Lewis, David Madigan
    2003
    On the Naive Bayes Model for Text Categorization
    In Proceedings of Artificial Intelligence & Statistics 2003. Key West, FL.

  27. Andrew McCallum and Kamal Nigam
    1998.
    A Comparison of Event Models for Naive Bayes Text Classification.
    AAAI-98 Workshop on "Learning for Text Categorization".

  28. T. Joachims,
    2001
    A Statistical Learning Model of Text Classification with Support Vector Machines.
    Proceedings of the Conference on Research and Development in Information Retrieval (SIGIR), ACM, 2001.

  29. T. Joachims,
    1998
    Text Categorization with Support Vector Machines: Learning with Many Relevant Features.
    Proceedings of the European Conference on Machine Learning (ECML), Springer, 1998.

  30. C. Apte, F. Damerau, S.M. Weiss
    1994
    Automated Learning of Decision Rules for Text Categorization
    ACM Transactions on Information Systems, 12(3), pages 233-251

  31. William J. Teahan and David J. Harper.
    2001.
    Using Compression-Based Language Models for Text Categorization
    Workshop on Language Modeling and Information Retrieval, 2001.

  32. S. T. Dumais
    1998.
    Using SVMs for text categorization.
    In IEEE Intelligent Systems Magazine, Trends and Controversies, Marti Hearst, ed., 13(4), July/August 1998.

  33. S. T. Dumais, J. Platt, D. Heckerman and M. Sahami
    1998.
    Inductive learning algorithms and representations for text categorization.
    In Proceedings of ACM-CIKM98, Nov. 1998, pp. 148-155.

  34. Nitin Thaper
    2001
    Using compression for source based classification of text.
    MS thesis, MIT

  35. Andrew McCallum and Kamal Nigam.
    1998.
    Employing EM in Pool-Based Active Learning for Text Classification.
    ICML-98.

  36. Lars Kai Hansen, Sigurdur Sigurdsson, Thomas Kolenda, Finn Arup Nielsen.
    2000.
    Modeling Text With Generalizable Gaussian Mixtures.
    ICASSP'2000 Istanbul, Turkey, 2000 June

  37. Kamal Nigam, Andrew McCallum, Sebastian Thrun and Tom Mitchell.
    2000.
    Text Classification from Labeled and Unlabeled Documents using EM.
    Machine Learning, 39(2/3), pp. 103-134, 2000.

  38. Rayid Ghani, Sean Slattery, Yiming Yang
    2001.
    Hypertext Categorization using Hyperlink Patterns and Meta Data.
    ICML 2001.

  39. Yiming Yang, Sean Slattery and Rayid Ghani.
    2001.
    A Study of Approaches for Hypertext Categorization.
    To appear in the Journal of Intelligent Information Systems - Special Issue on Automatic Text Categorization (2001).

  40. Yiming Yang and Xin Liu
    1999
    A re-examination of text categorization methods.
    Proceedings of ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'99, pp 42-49), 1999.

  41. Yiming Yang
    1999
    An evaluation of statistical approaches to text categorization.
    Journal of Information Retrieval, Vol 1, No. 1/2, pp 67-88, 1999.


Fuchun Peng 2003-08-21