Paper Abstracts

(Publications are listed in reverse chronological order. For a breakdown by publication type, please request a copy of my CV.)


Inductive Transfer for Text Classification using Generalized Reliability Indicators (pdf)
Paul N. Bennett, Susan T. Dumais, Eric Horvitz
Proceedings of the ICML-2003 Workshop on The Continuum from Labeled to Unlabeled Data, Washington DC, U.S.A., August 2003.

Machine-learning researchers face the omnipresent challenge of developing predictive models that converge rapidly in accuracy with increases in the quantity of scarce labeled training data. We introduce Layered Abstraction-Based Ensemble Learning (LABEL), a method that shows promise in improving generalization performance by exploiting additional labeled data drawn from related discrimination tasks within a corpus and from other corpora. LABEL first maps the original feature space, targeted at predicting membership in a specific topic, to a new feature space aimed at modeling the reliability of an ensemble of text classifiers. The resulting abstracted representation is invariant across the binary discrimination tasks, allowing the data to be pooled. We then construct a context-sensitive combination rule for each task using the pooled data. Thus, we can model domain structure more accurately than would have been possible using only the limited labeled data from each task separately. Using several corpora for an empirical evaluation of topic classification accuracy on text documents, we demonstrate that LABEL can increase generalization performance across a set of related tasks.
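
To make the abstraction step concrete, here is a minimal sketch of the pooling idea, assuming a small set of base classifiers whose behavior is summarized by hand-picked indicator features. The features, synthetic data, and logistic-regression combiner are stand-ins of my choosing, not the actual LABEL feature set or training procedure.

```python
# Sketch of the LABEL idea: map each task's examples into a shared
# "reliability indicator" space, pool the abstracted data across tasks,
# and learn a combination rule from the pooled set.
import numpy as np
from sklearn.linear_model import LogisticRegression

def reliability_features(classifier_scores, doc_length):
    """Task-invariant description of how the ensemble behaved on one document."""
    scores = np.asarray(classifier_scores, dtype=float)
    return np.array([
        scores.mean(),               # overall ensemble confidence
        scores.std(),                # disagreement among the classifiers
        np.abs(scores - 0.5).min(),  # distance of the least-sure member from chance
        np.log1p(doc_length),        # amount of evidence in the document
    ])

# Synthetic stand-in for several related binary discrimination tasks.
rng = np.random.default_rng(0)
def synthetic_task(n=200, n_classifiers=3):
    labels = rng.integers(0, 2, n)
    scores = np.clip(labels[:, None] * 0.4 + rng.normal(0.3, 0.2, (n, n_classifiers)), 0, 1)
    lengths = rng.integers(20, 500, n)
    return list(zip(scores, lengths, labels))

tasks = [synthetic_task() for _ in range(3)]

# The abstracted representation is the same for every task, so the data pools.
X = np.array([reliability_features(s, l) for task in tasks for s, l, _ in task])
y = np.array([label for task in tasks for _, _, label in task])
combiner = LogisticRegression().fit(X, y)
```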



Using Asymmetric Distributions to Improve Text Classifier Probability Estimates (pdf)
Paul N. Bennett
Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Toronto, Canada, July 28 - August 1, 2003. ACM Press.
An earlier version is available as CMU-CS-02-126.

Text classifiers that give probability estimates are more readily applicable in a variety of scenarios. For example, rather than choosing a single fixed decision threshold, they can be used in a Bayesian risk model to issue a run-time decision that minimizes a user-specified cost function dynamically chosen at prediction time. However, the quality of the probability estimates is crucial. We review a variety of standard approaches to converting scores (and poor probability estimates) from text classifiers into high-quality estimates and introduce new models motivated by the intuition that the empirical score distributions for the "extremely irrelevant", "hard to discriminate", and "obviously relevant" items are often significantly different. Finally, we analyze the experimental performance of these models over the outputs of two text classifiers. The analysis demonstrates that one of these models is theoretically attractive (introducing few new parameters while increasing flexibility), computationally efficient, and empirically preferable.
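
As a rough illustration of the modeling idea, the sketch below fits a class-conditional "asymmetric Gaussian" (a different spread on each side of the mode) to positive and negative score samples and converts a new score into a posterior via Bayes' rule. The method-of-moments fitting here is a crude stand-in for the paper's actual estimation procedure.

```python
# Recalibration sketch: model p(score | relevant) and p(score | irrelevant)
# with asymmetric Gaussians, then apply Bayes' rule to get P(relevant | score).
import numpy as np

def fit_asymmetric_gaussian(scores):
    """Crude (mode, sigma_left, sigma_right) fit; a stand-in for proper MLE."""
    scores = np.asarray(scores, dtype=float)
    mode = np.median(scores)
    left, right = scores[scores <= mode], scores[scores > mode]
    s_l = left.std() if left.size > 1 else 1e-3
    s_r = right.std() if right.size > 1 else 1e-3
    return mode, max(s_l, 1e-3), max(s_r, 1e-3)

def asym_pdf(s, mode, s_l, s_r):
    """Density with variance sigma_l^2 below the mode and sigma_r^2 above it."""
    sigma = np.where(s <= mode, s_l, s_r)
    norm = 2.0 / (np.sqrt(2.0 * np.pi) * (s_l + s_r))  # makes the density integrate to 1
    return norm * np.exp(-0.5 * ((s - mode) / sigma) ** 2)

def calibrated_posterior(s, pos_params, neg_params, prior_pos):
    """P(relevant | score) from the two fitted densities and a class prior."""
    p_pos = asym_pdf(s, *pos_params) * prior_pos
    p_neg = asym_pdf(s, *neg_params) * (1.0 - prior_pos)
    return p_pos / (p_pos + p_neg)

# Toy usage: scores of known-relevant and known-irrelevant training documents.
pos = fit_asymmetric_gaussian([0.7, 0.8, 0.85, 0.9, 0.95])
neg = fit_asymmetric_gaussian([0.1, 0.2, 0.25, 0.3, 0.6])
print(calibrated_posterior(0.75, pos, neg, prior_pos=0.5))
```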



Reducing Boundary Friction Using Translation-Fragment Overlap (pdf)
Ralf Brown, Rebecca Hutchinson, Paul N. Bennett, Jaime Carbonell, Peter Jansen
Proceedings of the Machine Translation Summit IX, New Orleans, U.S.A., September 2003.
(Shares substantial content with the listing below for CMU-CS-03-138, 2003.)

Many corpus-based Machine Translation (MT) systems generate a number of partial translations which are then pieced together rather than immediately producing one overall translation. While this makes them more robust to ill-formed input, they are subject to disfluencies at phrasal translation boundaries even for well-formed input. We address this "boundary friction" problem by introducing a method that exploits overlapping phrasal translations and the increased confidence in translation accuracy they imply. We specify an efficient algorithm for producing translations using overlap. Finally, our empirical analysis indicates that this approach produces higher quality translations than the standard method of combining non-overlapping fragments generated by our Example-Based MT (EBMT) system in a peak-to-peak comparison.
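
A toy version of the overlap idea, heavily simplified from the paper's algorithm: when the tail of one fragment's translation matches the head of the next fragment's, the two are merged on the shared words rather than concatenated at a hard boundary. The example sentences and minimum-overlap parameter are my own illustrations.

```python
# Merge two candidate translation fragments on their overlapping words.
def merge_overlapping(frag_a, frag_b, min_overlap=1):
    """Join token lists if a suffix of frag_a equals a prefix of frag_b."""
    max_k = min(len(frag_a), len(frag_b))
    for k in range(max_k, min_overlap - 1, -1):  # prefer the longest overlap
        if frag_a[-k:] == frag_b[:k]:
            return frag_a + frag_b[k:]
    return None  # no overlap: caller falls back to plain concatenation

a = "the big dog ran".split()
b = "dog ran quickly home".split()
print(merge_overlapping(a, b))
# ['the', 'big', 'dog', 'ran', 'quickly', 'home']
```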



Maximal Lattice Overlap in Example-Based Machine Translation (pdf)
Rebecca Hutchinson, Paul N. Bennett, Jaime Carbonell, Peter Jansen, Ralf Brown
CMU-CS-03-138, Computer Science Department, School of Computer Science, Carnegie Mellon University, June 2003.
(Also listed as CMU-LTI-03-174. Shares substantial content with the MT Summit IX listing above, 2003.)

Example-Based Machine Translation (EBMT) retrieves pre-translated phrases from a sentence-aligned bilingual training corpus to translate new input sentences. EBMT uses long pre-translated phrases effectively but is subject to disfluencies at phrasal translation boundaries. We address this problem by introducing a novel method that exploits overlapping phrasal translations and the increased confidence in translation accuracy they imply. We specify an efficient algorithm for producing translations using overlap. Finally, our empirical analysis indicates that this approach produces higher quality translations than the standard method of EBMT in a peak-to-peak comparison.



Probabilistic Combination of Text Classifiers Using Reliability Indicators: Models and Results (pdf)
Paul N. Bennett, Susan T. Dumais, Eric Horvitz
Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Tampere, Finland, August 2002. ACM Press.

The intuition that different text classifiers behave in qualitatively different ways has long motivated attempts to build a better metaclassifier via some combination of classifiers. We introduce a probabilistic method for combining classifiers that considers the context-sensitive reliabilities of contributing classifiers. The method harnesses reliability indicators: variables that provide a valuable signal about the performance of classifiers in different situations. We provide background, present procedures for building metaclassifiers that take into consideration both reliability indicators and classifier outputs, and review a set of comparative studies undertaken to evaluate the methodology.
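
In outline, the metaclassifier is a learned function whose inputs include both the base classifiers' outputs and per-document context variables. The sketch below shows one plausible shape for such a combiner; the indicator variables, toy data, and decision-tree learner are illustrative choices, not the paper's actual configuration.

```python
# Metaclassifier sketch: train a combiner on base-classifier scores *plus*
# reliability indicators describing each document's context.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def meta_features(base_scores, doc_text):
    words = doc_text.split()
    scores = np.asarray(base_scores, dtype=float)
    indicators = [
        len(words),                            # document length
        len(set(words)) / max(len(words), 1),  # lexical diversity
        float(scores.std()),                   # base-classifier disagreement
    ]
    return np.concatenate([scores, indicators])

# Each training example: (base-classifier scores, raw text, true label).
examples = [
    ([0.9, 0.8, 0.7], "stocks fell sharply on earnings news", 1),
    ([0.2, 0.3, 0.6], "the recipe calls for two eggs", 0),
    ([0.6, 0.4, 0.5], "market analysts expect rate cuts", 1),
    ([0.1, 0.2, 0.1], "a quiet walk through the old garden", 0),
]
X = np.array([meta_features(s, d) for s, d, _ in examples])
y = np.array([label for _, _, label in examples])
metaclassifier = DecisionTreeClassifier(max_depth=3).fit(X, y)
```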



Assessing the Calibration of Naive Bayes' Posterior Estimates (pdf)
Paul N. Bennett
CMU-CS-00-155, Computer Science Department, School of Computer Science, Carnegie Mellon University, September 2000.

In this paper, we give evidence that the posterior distribution of Naive Bayes goes to zero or one exponentially with document length. While exponential change may be expected as new bits of information are added, adding new words does not always correspond to adding new information. Essentially as a result of its independence assumption, Naive Bayes' estimates grow toward these extremes too quickly. We investigate one parametric family that attempts to downweight the growth rate. The parameters of this family are estimated with a maximum-likelihood scheme, and the results are evaluated.
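
The exponential behavior follows from a back-of-the-envelope calculation: Naive Bayes' log-odds are a sum of one term per word, so they grow roughly linearly with document length, and the posterior therefore saturates toward 0 or 1 exponentially fast. The square-root damping below is one illustrative way to slow the growth, not necessarily the parametric family studied in this report.

```python
# Illustration: with an average per-word log-likelihood-ratio of delta, an
# n-word document has log-odds ~ n * delta, so the posterior saturates fast.
import math

delta = 0.1  # assumed average per-word evidence favoring the class
for n in (10, 50, 100, 500):
    log_odds = n * delta
    raw = 1.0 / (1.0 + math.exp(-log_odds))                    # plain Naive Bayes
    damped = 1.0 / (1.0 + math.exp(-log_odds / math.sqrt(n)))  # slowed growth
    print(f"n={n:4d}  raw={raw:.6f}  damped={damped:.4f}")
```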



Book Recommending Using Text Categorization with Extracted Information (pdf)
Raymond J. Mooney, Paul N. Bennett, Loriene Roy
Appeared in the AAAI-98/ICML-98 Workshop on Learning for Text Categorization and the AAAI-98 Workshop on Recommender Systems, Madison, WI, July 1998.

Content-based recommender systems suggest documents, items, and services to users based on learning a profile of the user from rated examples containing information about the given items. Text categorization methods are very useful for this task but generally rely on unstructured text. We have developed a book-recommending system that utilizes semi-structured information about items gathered from the web using simple information extraction techniques. Initial experimental results demonstrate that this approach can produce fairly accurate recommendations.
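
One way to picture learning from semi-structured descriptions is a classifier that keeps each extracted slot as its own bag of words, so evidence from the title and the description is counted separately. The slot names and the naive-Bayes-style scoring below are illustrative sketches; the system's exact formulation differs.

```python
# Per-slot bag-of-words scorer over semi-structured item descriptions.
from collections import Counter
import math

class SlottedScorer:
    def __init__(self, slots):
        self.slots = slots
        self.counts = {s: {0: Counter(), 1: Counter()} for s in slots}
        self.class_totals = Counter()

    def fit(self, items, labels):
        for item, y in zip(items, labels):
            self.class_totals[y] += 1
            for slot in self.slots:
                self.counts[slot][y].update(item.get(slot, "").lower().split())
        return self

    def score(self, item):
        """Log-odds that the user will like the item (add-one smoothed)."""
        log_odds = math.log((self.class_totals[1] + 1) / (self.class_totals[0] + 1))
        for slot in self.slots:
            pos, neg = self.counts[slot][1], self.counts[slot][0]
            pos_n, neg_n = sum(pos.values()) + 1, sum(neg.values()) + 1
            for w in item.get(slot, "").lower().split():
                log_odds += math.log((pos[w] + 1) / pos_n)
                log_odds -= math.log((neg[w] + 1) / neg_n)
        return log_odds

# Toy usage with hypothetical slot names and ratings.
books = [
    {"title": "Dune", "description": "epic science fiction on a desert planet"},
    {"title": "Pride and Prejudice", "description": "a classic novel of manners"},
]
model = SlottedScorer(slots=["title", "description"]).fit(books, labels=[1, 0])
print(model.score({"title": "Foundation", "description": "galactic science fiction"}))
```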



Text Categorization Through Probabilistic Learning: Applications to Recommender Systems (pdf)
Paul N. Bennett
Undergraduate Honors Thesis, Department of Computer Sciences, University of Texas at Austin, May 1998. Also appears as AI TR 98-270.

With the growth of the World Wide Web, recommender systems have received an increasing amount of attention. Many recommender systems in use today are based on collaborative filtering. This project has focused on LIBRA, a content-based book recommending system. By utilizing text categorization methods and the information available for each book, the system determines a user profile that is used as the basis of recommendations made to the user. Instead of the bag-of-words approach used in many other statistical text categorization approaches, LIBRA parses each text sample into a semi-structured representation. We have used standard machine learning techniques to analyze the performance of several algorithms on this learning task. In addition, we analyze the utility of several methods of feature construction and selection (i.e., methods of choosing the representation of an item that the learning algorithm actually uses). After analyzing the system, we conclude that good recommendations are produced after a relatively small number of training examples. We also conclude that the feature selection method tested does not improve the performance of these algorithms in any systematic way, though the results indicate that other feature selection methods may prove useful. Feature construction, however, while not providing a large increase in performance with the particular construction methods used here, holds promise of providing performance improvements for the algorithms investigated. This text assumes only minor familiarity with concepts of artificial intelligence and should be readable by an upper-division computer science undergraduate familiar with basic concepts of probability theory and set theory.





Paul N. Bennett
Last modified: Fri Jul 25 03:24:42 EDT 2000