- Title Pages
- Series Foreword
- Preface
- 1 Introduction to Semi-Supervised Learning
- 2 A Taxonomy for Semi-Supervised Learning Methods
- 3 Semi-Supervised Text Classification Using EM
- 4 Risks of Semi-Supervised Learning: How Unlabeled Data Can Degrade Performance of Generative Classifiers
- 5 Probabilistic Semi-Supervised Clustering with Constraints
- 6 Transductive Support Vector Machines
- 7 Semi-Supervised Learning Using Semi-Definite Programming
- 8 Gaussian Processes and the Null-Category Noise Model
- 9 Entropy Regularization
- 10 Data-Dependent Regularization
- 11 Label Propagation and Quadratic Criterion
- 12 The Geometric Basis of Semi-Supervised Learning
- 13 Discrete Regularization
- 14 Semi-Supervised Learning with Conditional Harmonic Mixing
- 15 Graph Kernels by Spectral Transforms
- 16 Spectral Methods for Dimensionality Reduction
- 17 Modifying Distances
- 18 Large-Scale Algorithms
- 19 Semi-Supervised Protein Classification Using Cluster Kernels
- 20 Prediction of Protein Function from Networks
- 21 Analysis of Benchmarks
- 22 An Augmented PAC Model for Semi-Supervised Learning
- 23 Metric-Based Approaches for Semi-Supervised Regression and Classification
- 24 Transductive Inference and Semi-Supervised Learning
- 25 A Discussion of Semi-Supervised Learning and Transduction
- References
- Notation and Symbols
- Contributors
- Index
Semi-Supervised Protein Classification Using Cluster Kernels
- Chapter:
- 19 Semi-Supervised Protein Classification Using Cluster Kernels
- Source:
- Semi-Supervised Learning
- Author(s):
Jason Weston
Christina Leslie
Eugene Ie
William Stafford Noble
- Publisher:
- The MIT Press
This chapter describes an experimental study of large-scale semi-supervised learning for protein classification. The protein classification problem, a central problem in computational biology, is to predict the structural class of a protein from its amino acid sequence. Such a classification helps biologists understand the function of a protein. As with many tasks, building an accurate protein classification system depends critically on choosing a good representation of the input amino acid sequences. Early work using string kernels with support vector machines (SVMs) achieved state-of-the-art classification performance. However, such representations are based only on labeled data, that is, examples with known three-dimensional (3D) structures organized into structural classes, while in practice unlabeled data are far more plentiful.
Keywords: semi-supervised learning, protein classification problem, computational biology, amino acid sequence, input sequences, string kernels, support vector machines, SVMs, labeled data, unlabeled data
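To make the idea of a string kernel over amino acid sequences concrete, here is a minimal sketch of one well-known example, the k-mer spectrum kernel. The abstract does not say which string kernels the chapter uses, so the choice of kernel, the function names (`kmer_counts`, `spectrum_kernel`, `normalized_kernel`), the value `k=3`, and the toy sequences are all assumptions made for illustration only.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k=3):
    """Count every contiguous length-k substring (k-mer) of seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a, b, k=3):
    """Inner product of the k-mer count vectors of sequences a and b."""
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    # Iterate over the smaller table; a Counter returns 0 for missing keys.
    if len(ca) > len(cb):
        ca, cb = cb, ca
    return sum(n * cb[kmer] for kmer, n in ca.items())


def normalized_kernel(a, b, k=3):
    """Spectrum kernel scaled so that each sequence has unit self-similarity."""
    kaa = spectrum_kernel(a, a, k)
    kbb = spectrum_kernel(b, b, k)
    if kaa == 0 or kbb == 0:
        return 0.0  # a sequence shorter than k has no k-mers
    return spectrum_kernel(a, b, k) / sqrt(kaa * kbb)


# Toy sequences invented for this example (not real proteins).
seqs = ["MKVLAAGIVG", "MKVLSAGIVG", "GSTPPWWQQE"]

# Gram matrix of pairwise similarities between the sequences.
gram = [[normalized_kernel(s, t) for t in seqs] for s in seqs]
```

A Gram matrix like `gram` is the kind of input a kernel-based classifier such as an SVM consumes. It is computed here from the sequences alone, without class labels; the labels would enter only when a classifier is trained on it.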