Posts

Bayesian and Frequentist -- two worldviews of machine learning

Before I graduated with my Ph.D., some of my friends and I formed a Probabilistic Graphical Models (PGM) study group, but we stopped about a third of the way through the course since everybody had their own business to mind. Only recently did I finally finish this course online, taught by Professor Daphne Koller, a co-founder of Coursera, and I finally started to see how the worldviews of the so-called "Bayesians" and "Frequentists" differ. Most Machine Learning (ML) courses, for example "Machine Learning Foundations" by Professor Hsuan-Tien Lin at National Taiwan University, start from the concept of "classifiers", or supervised discriminative models to be more specific. The core concept of classifiers can be summarized as the "separating hyper-plane": imagine we're looking for a way to separate a bunch of samples in some space so that we get the "best" outcome. Logistic regression or the perceptron simply cuts this space in half…
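To make the "separating hyper-plane" picture concrete, here is a minimal sketch (not from the post itself) of the classic perceptron rule: the weight vector w and bias b define the plane w·x + b = 0, and classifying a sample just means asking which side of that plane it falls on. The toy 2-D data are made up for illustration.

```python
import numpy as np

def perceptron_train(X, y, epochs=100, lr=1.0):
    """X: (n_samples, n_features); y: labels in {-1, +1}."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            # Misclassified sample: nudge the hyperplane toward it.
            if yi * (np.dot(w, xi) + b) <= 0:
                w += lr * yi * xi
                b += lr * yi
    return w, b

def predict(w, b, X):
    # The sign of the signed distance to the hyperplane decides the class.
    return np.sign(X @ w + b)

# Two linearly separable clusters in 2-D.
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -1.0], [-3.0, -2.0]])
y = np.array([1, 1, -1, -1])
w, b = perceptron_train(X, y)
print(predict(w, b, X))  # -> [ 1.  1. -1. -1.]
```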

A Quick Guide of Multithreading with Queue in Python

As described at the beginning of [1]: "The Queue module implements multi-producer, multi-consumer queues. It is especially useful in threaded programming when information must be exchanged safely between multiple threads. The Queue class in this module implements all the required locking semantics." Multithreading with Queue is therefore quite handy and, in my opinion, approachable for beginners. I strongly recommend references [2][3] for a head-first understanding of this topic. Note that since Python 2.4, the threading module is generally recommended for multi-threaded programming instead of the thread module from earlier versions; don't get confused when you're crawling the web for information. Also note there is a well-known Global Interpreter Lock (GIL) issue in CPython [4]; we'll discuss this later. Multi-threading with Queue in Python 1. Producer-Consumer Model Say you want to parallelize a piece of code with multiple threads…
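As a companion to the producer-consumer model the preview introduces, here is a minimal sketch using the standard threading module together with the queue module (named Queue in Python 2, lowercase queue in Python 3, which this sketch targets). The sentinel-shutdown pattern and the squaring "work" are illustrative choices, not the post's own example.

```python
import threading
import queue  # the "Queue" module in Python 2

NUM_WORKERS = 4
SENTINEL = None  # signals a consumer to shut down

def producer(q, items):
    for item in items:
        q.put(item)           # blocks if the queue is full
    for _ in range(NUM_WORKERS):
        q.put(SENTINEL)       # one sentinel per consumer

def consumer(q, results, lock):
    while True:
        item = q.get()        # blocks until an item is available
        if item is SENTINEL:
            break
        with lock:
            results.append(item * item)  # stand-in for real work

q = queue.Queue(maxsize=10)
results, lock = [], threading.Lock()
workers = [threading.Thread(target=consumer, args=(q, results, lock))
           for _ in range(NUM_WORKERS)]
for t in workers:
    t.start()
producer(q, range(20))
for t in workers:
    t.join()
print(sorted(results))
```

Note that put and get handle all the synchronization internally, which is exactly the "required locking semantics" the quoted documentation refers to; the explicit lock here only protects the shared results list.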

From Restricted Boltzmann Machine to Deep Neural Network -- Missing Links Explained

LaTeX version: https://www.sharelatex.com/project/52295b43e77a8bec1401f6bc 1. Introduction Deep neural networks (DNNs) have been a hot topic in the speech processing community in recent years [1]. People are eager to learn about the various theories behind DNNs. However, those who are not familiar with the background knowledge of Probabilistic Graphical Models [2], Markov chain Monte Carlo (MCMC) methods [3], variational methods [4][5], etc., may find the related publications quite literally too deep to follow. Here I document my footprints in figuring out some fundamental theories of DNNs. This note revolves around the paper written by Hinton et al. in 2006 [6], which I think is the pivotal reference for understanding DNNs. I strongly recommend that readers at least scan through most of the references co-authored by Hinton before reading this note. Below I will begin with the training algorithm for the restricted Boltzmann machine (RBM), i.e., contrastive divergence. Then I will investigate…
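Since the preview names contrastive divergence as its starting point, here is a minimal NumPy sketch of a single CD-1 update for a binary RBM, following the standard recipe; the toy dimensions, learning rate, and random data are assumptions for illustration, not values from the post.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b, c, lr=0.1):
    """One contrastive-divergence (CD-1) update for a binary RBM.

    v0: batch of visible vectors, shape (batch, n_visible)
    W:  weights (n_visible, n_hidden); b, c: visible/hidden biases.
    """
    # Positive phase: hidden probabilities given the data.
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)  # sample hidden units

    # Negative phase: one Gibbs step back to the visible layer and up again.
    pv1 = sigmoid(h0 @ W.T + b)
    ph1 = sigmoid(pv1 @ W + c)

    # Update: data-driven correlations minus model-driven correlations.
    batch = v0.shape[0]
    W += lr * (v0.T @ ph0 - pv1.T @ ph1) / batch
    b += lr * (v0 - pv1).mean(axis=0)
    c += lr * (ph0 - ph1).mean(axis=0)
    return W, b, c

# Toy run on random binary data (dimensions are arbitrary).
n_visible, n_hidden = 6, 3
W = 0.01 * rng.standard_normal((n_visible, n_hidden))
b = np.zeros(n_visible)
c = np.zeros(n_hidden)
v0 = (rng.random((16, n_visible)) < 0.5).astype(float)
for _ in range(100):
    W, b, c = cd1_step(v0, W, b, c)
```

The single Gibbs step in the negative phase is what the "1" in CD-1 refers to; using hidden probabilities rather than sampled states in the weight update is a common variance-reduction choice.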