Contents
1 Information Theory, Machine Learning, and Reproducing Kernel Hilbert Spaces
1.1 Introduction
1.2 Information Theory
1.3 Entropy
1.4 Mutual Information
1.5 Relative Entropy and Kullback-Leibler Divergence
1.6 Information Theory beyond Communications
1.7 Adaptive Model Building
1.8 Information-Theoretic Learning
1.9 ITL as a Unifying Learning Paradigm
1.10 Reproducing Kernel Hilbert Spaces
1.11 RKHS and ITL
1.12 Conclusions
2 Renyi's Entropy, Divergence and Their Nonparametric Estimators
Chapter Coauthors: Dongxin Xu and Deniz Erdogmus
2.1 Introduction
2.2 Definition and Interpretation of Renyi's Entropy
2.3 Quadratic Renyi's Entropy Estimator
2.4 Properties of Renyi's Nonparametric Entropy Estimators
2.5 Bias and Variance of the Information Potential Estimator
2.6 Physical Interpretation of Renyi's Entropy Kernel Estimators
2.7 Extension to α-Information Potential with Arbitrary Kernels
2.8 Renyi's Divergence and Mutual Information
2.9 Quadratic Divergences and Mutual Information
2.10 Information Potentials and Forces in the Joint Space
2.11 Fast Computation of IP and CIP
2.12 Conclusion
3 Adaptive Information Filtering with Error Entropy and Error Correntropy Criteria
Chapter Coauthors: Deniz Erdogmus and Weifeng Liu
3.1 Introduction
3.2 The Error Entropy Criterion (EEC) for Adaptation
3.3 Understanding the Error Entropy Criterion
3.4 Minimum Error Entropy Algorithm
3.5 Analysis of MEE Performance Surface
3.6 Error Entropy, Correntropy, and M-Estimation
3.7 Correntropy Induced Metric and M-Estimation
3.8 Normalized Information Potential as a Pseudometric
3.9 Adaptation of the Kernel Size in Adaptive Filtering
3.10 Conclusions
4 Algorithms for Entropy and Correntropy Adaptation with Applications to Linear Systems
Chapter Coauthors: Deniz Erdogmus, Seungju Han, and Abhishek Singh
4.1 Introduction
4.2 Recursive Information Potential for MEE (MEE-RIP)
4.3 Stochastic Information Gradient for MEE (MEE-SIG)
4.4 Self-Adjusting Stepsize for MEE (MEE-SAS)
4.5 Normalized MEE (NMEE)
4.6 Fixed-Point MEE (MEE-FP)
4.7 Fast Gauss Transform in MEE Adaptation
4.8 Incomplete Cholesky Decomposition for MEE
4.9 Linear Filter Adaptation with MSE, MEE and MCC
4.10 Conclusion
5 Nonlinear Adaptive Filtering with MEE, MCC and Applications
Chapter Coauthors: Deniz Erdogmus, Rodney Morejon and Weifeng Liu
5.1 Introduction
5.2 Backpropagation of Information Forces in MLP Training
5.3 Advanced Search Methods for Nonlinear Systems
5.4 ITL Advanced Search Algorithms
5.5 Application: Prediction of the Mackey-Glass Chaotic Time Series
5.6 Application: Nonlinear Channel Equalization
5.7 Error Correntropy Criterion (ECC) in Regression
5.8 Adaptive Kernel Size in System Identification and Tracking
5.9 Conclusions
6 Classification with EEC, Divergence Measures and Error Bounds
Chapter Coauthors: Deniz Erdogmus, Dongxin Xu and Kenneth Hild II
6.1 Introduction
6.2 Brief Review of Classification
6.3 Error Entropy Criterion in Classification
6.4 Nonparametric Classifiers
6.5 Classification with Information Divergences
6.6 ITL Algorithms for Divergence and Mutual Information
6.6.1 Case Study: Automatic Target Recognition (ATR) with ITL
6.7 The Role of ITL Feature Extraction in Classification
6.8 Error Bounds for Classification
6.9 Conclusions
7 Clustering with ITL Principles
Chapter Coauthors: Robert Jenssen and Sudhir Rao
7.1 Introduction
7.2 Information-Theoretic Clustering
7.3 Differential Clustering Using Renyi's Entropy
7.4 The Clustering Evaluation Function
7.5 A Gradient Algorithm for Clustering with DCS
7.6 Mean Shift Algorithms and Renyi's Entropy
7.7 Graph-Theoretic Clustering with ITL
7.8 Information Cut for Clustering
7.9 Conclusion
8 Self-Organizing ITL Principles for Unsupervised Learning
Chapter Coauthors: Sudhir Rao, Deniz Erdogmus, Dongxin Xu and Kenneth Hild II
8.1 Introduction
8.2 Entropy and Cross-Entropy Optimization
8.3 The Information Maximization Principle
8.4 Exploiting Spatial Structure for Self-Organization
8.5 Principle of Redundancy Reduction
8.6 Independent Component Analysis (ICA)
8.7 The Information Bottleneck (IB) Method
8.8 The Principle of Relevant Information (PRI)
8.9 Self-Organizing Principles with ITL Estimators
8.10 Conclusions
9 A Reproducing Kernel Hilbert Space Framework for ITL
Chapter Coauthors: Jianwu Xu, Robert Jenssen, Antonio Paiva and Il Park
9.1 Introduction
9.2 A RKHS Framework for ITL
9.3 ITL Cost Functions in the RKHS Framework
9.4 ITL Estimators in RKHS
9.5 Connection Between ITL and Kernel Methods via RKHS Hv
9.6 An ITL Perspective of MAP and SVM Classifiers
9.7 Case Study: RKHS for Computation with Spike Trains
9.8 Conclusion
10 Correntropy for Random Variables: Properties and Applications in Statistical Inference
Chapter Coauthors: Weifeng Liu, Puskal Pokharel, Jianwu Xu and Sohan Seth
10.1 Introduction
10.2 Cross-Correntropy: Definitions and Properties
10.3 Centered Cross-Correntropy and Correntropy Coefficient
10.4 Parametric Cross-Correntropy and Measures of Dependence
10.5 Application: Matched Filtering
10.6 Application: Nonlinear Coupling Tests
10.7 Application: Statistical Dependence Tests
10.8 Conclusions
11 Correntropy for Random Processes: Properties and Applications in Signal Processing
Chapter Coauthors: Puskal Pokharel, Ignacio Santamaria, Jianwu Xu, Kyu-hwa Jeong, and Weifeng Liu
11.1 Introduction
11.2 Autocorrentropy Function: Definition and Properties
11.3 Cross-Correntropy Function: Definition and Properties
11.4 Optimal Linear Filters in Hv
11.5 Correntropy MACE (CMACE) Filter in Hv
11.6 Application: Autocorrentropy Function as a Similarity Measure over Lags
11.7 Application: Karhunen-Loève Transform in Hv
11.8 Application: Blind Source Separation
11.9 Application: CMACE for Automatic Target Recognition
11.10 Conclusion
A PDF Estimation Methods and Experimental Evaluation of ITL Descriptors
Chapter Coauthors: Deniz Erdogmus and Rodney Morejon
A.1 Introduction
A.2 Probability Density Function Estimation
A.3 Nonparametric Entropy Estimation
A.4 Estimation of Information-Theoretic Descriptors
A.5 Convolution Smoothing
A.6 Conclusions
Bibliography
Index