- Title Pages
- Series Foreword
- Preface
- 1 Introduction to Semi-Supervised Learning
- 2 A Taxonomy for Semi-Supervised Learning Methods
- 3 Semi-Supervised Text Classification Using EM
- 4 Risks of Semi-Supervised Learning: How Unlabeled Data Can Degrade Performance of Generative Classifiers
- 5 Probabilistic Semi-Supervised Clustering with Constraints
- 6 Transductive Support Vector Machines
- 7 Semi-Supervised Learning Using Semi-Definite Programming
- 8 Gaussian Processes and the Null-Category Noise Model
- 9 Entropy Regularization
- 10 Data-Dependent Regularization
- 11 Label Propagation and Quadratic Criterion
- 12 The Geometric Basis of Semi-Supervised Learning
- 13 Discrete Regularization
- 14 Semi-Supervised Learning with Conditional Harmonic Mixing
- 15 Graph Kernels by Spectral Transforms
- 16 Spectral Methods for Dimensionality Reduction
- 17 Modifying Distances
- 18 Large-Scale Algorithms
- 19 Semi-Supervised Protein Classification Using Cluster Kernels
- 20 Prediction of Protein Function from Networks
- 21 Analysis of Benchmarks
- 22 An Augmented PAC Model for Semi-Supervised Learning
- 23 Metric-Based Approaches for Semi-Supervised Regression and Classification
- 24 Transductive Inference and Semi-Supervised Learning
- 25 A Discussion of Semi-Supervised Learning and Transduction
- References
- Notation and Symbols
- Contributors
- Index
Entropy Regularization
- Chapter: (p.151) 9 Entropy Regularization
- Source: Semi-Supervised Learning
- Author(s): Yves Grandvalet, Yoshua Bengio
- Publisher: The MIT Press
This chapter promotes entropy regularization as a means to benefit from unlabeled data in the framework of maximum a posteriori estimation. The learning criterion is derived from clearly stated assumptions and can be applied to any smoothly parameterized model of posterior probabilities. The regularization scheme favors low-density separation, without any modeling of the density of the input features. The contribution of unlabeled data to the learning criterion induces local optima, but this problem can be alleviated by deterministic annealing. For well-behaved models of posterior probabilities, deterministic annealing expectation-maximization (EM) decomposes the learning problem into a series of concave subproblems. Other approaches to the semi-supervised problem are shown to be close relatives or limiting cases of entropy regularization. A series of experiments illustrates the algorithm's good performance and its robustness to violations of the postulated low-density separation assumption.
Keywords: maximum a posteriori estimation, entropy regularization, unlabeled data, posterior probabilities, regularization scheme, low-density separation, expectation-maximization, EM
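For orientation, here is a minimal sketch of an entropy-regularized maximum a posteriori criterion of the kind the abstract describes. The notation (n labeled pairs, m unlabeled inputs, K classes, trade-off weight λ) is illustrative and is not quoted from the chapter itself.

```latex
% Illustrative sketch (notation assumed, not taken from the chapter):
% labeled pairs (x_i, y_i) for i = 1..n, unlabeled inputs x_i for
% i = n+1..n+m, and a smoothly parameterized posterior model P(k | x; theta).
% The criterion to maximize trades the labeled log-likelihood against the
% conditional entropy of the predictions on unlabeled points:
\[
  C(\theta; \lambda) \;=\; \sum_{i=1}^{n} \ln P(y_i \mid x_i; \theta)
    \;-\; \lambda \sum_{i=n+1}^{n+m} H\bigl(Y \mid x_i; \theta\bigr),
\]
% where the conditional entropy of the class posterior at x_i is
\[
  H\bigl(Y \mid x_i; \theta\bigr) \;=\;
    -\sum_{k=1}^{K} P(k \mid x_i; \theta)\, \ln P(k \mid x_i; \theta).
\]
```

Maximizing such a criterion rewards confident (low-entropy) posteriors on unlabeled points, which pushes the decision boundary away from dense regions of unlabeled inputs; setting λ = 0 recovers plain supervised maximum a posteriori estimation, and gradually increasing the influence of the entropy term, in the spirit of the deterministic annealing scheme the abstract mentions, traces a path of solutions that helps avoid poor local optima.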