Posts

Showing posts with the label Machine Learning

Bayesian and Frequentist -- two worldviews of machine learning

Before graduating with my Ph.D., some friends and I formed a Probabilistic Graphical Models (PGM) study group, but we stopped about a third of the way through the course content because everybody had their own business to mind. Only recently did I finally finish the online course, taught by Professor Daphne Koller, a co-founder of Coursera, and I am finally starting to see how the worldviews of the so-called "Bayesian" and "Frequentist" differ. Most Machine Learning (ML) courses, for example "Machine Learning Foundations" by Professor Hsuan-Tien Lin at National Taiwan University, start from the concept of "classifiers", or supervised discriminative models to be more specific. The core concept of classifiers can be summarized as the "separating hyperplane": imagine we are looking for a way to separate a bunch of samples in space so that we get the "best" outcome. Logistic regression or the perceptron is simply cutting this space in h...

From Restricted Boltzmann Machine to Deep Neural Network -- Missing Links Explained

LaTeX version: https://www.sharelatex.com/project/52295b43e77a8bec1401f6bc 1. Introduction Deep neural networks (DNNs) have been a hot topic in the speech processing community in recent years [1]. People are eager to learn about the various theories behind DNNs. However, those who are not familiar with background knowledge such as Probabilistic Graphical Models [2], Markov chain Monte Carlo (MCMC) methods [3], variational methods [4][5], etc., may find the related publications literally too deep to follow. Here I document the steps I took to figure out some of the fundamental theories of DNNs. This note revolves around the paper written by Hinton et al. in 2006 [6], which I think is the pivotal reference for understanding DNNs. I strongly recommend that readers at least skim most of the references co-authored by Hinton before reading this note. Below I will begin with the training algorithm of the restricted Boltzmann machine (RBM), i.e. contrastive divergence. Then I will investigate w...