Bayesian and Frequentist -- two worldviews of machine learning
Before I graduated with my Ph.D., some friends and I formed a Probabilistic Graphical Models (PGM) study group, but we stopped about a third of the way through the course because everyone had other commitments. Only recently did I finish the online course taught by Professor Daphne Koller, a co-founder of Coursera, and I am finally starting to see how the so-called "Bayesian" and "Frequentist" worldviews differ.

Most Machine Learning (ML) courses, for example "Machine Learning Foundations" by Professor Hsuan-Tien Lin at National Taiwan University, start from the concept of "classifiers", or, more specifically, supervised discriminative models. The core idea of a classifier can be summarized as a "separating hyperplane": imagine we are looking for a way to divide a set of samples in space so that we get the "best" outcome. Logistic regression or the perceptron simply cuts this space in h...
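To make the separating-hyperplane picture concrete, here is a minimal sketch of the classic perceptron update rule in plain Python. The toy dataset and the names `train_perceptron`, `predict`, `points`, and `targets` are hypothetical and chosen only for illustration; this is one simple way to find some separating hyperplane, not necessarily the "best" one.

```python
# Minimal perceptron sketch on a hypothetical, linearly separable toy dataset.

def train_perceptron(samples, labels, max_epochs=100):
    """Return (weights, bias) of a hyperplane w.x + b = 0.

    Labels must be +1 or -1. Training stops early once a full pass
    over the data produces no misclassification.
    """
    dim = len(samples[0])
    weights = [0.0] * dim
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            if y * score <= 0:  # misclassified, or exactly on the boundary
                # Nudge the hyperplane toward classifying x correctly.
                weights = [w + y * xi for w, xi in zip(weights, x)]
                bias += y
                mistakes += 1
        if mistakes == 0:
            break
    return weights, bias


def predict(weights, bias, x):
    """Classify x by the side of the hyperplane it falls on."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else -1


# Hypothetical toy data: two well-separated clusters in the plane.
points = [(2.0, 3.0), (3.0, 3.5), (2.5, 4.0),
          (-1.0, -2.0), (-2.0, -1.5), (-1.5, -3.0)]
targets = [1, 1, 1, -1, -1, -1]

w, b = train_perceptron(points, targets)
predictions = [predict(w, b, p) for p in points]
```

Because the toy data is linearly separable, the loop terminates with a hyperplane that puts every sample on its correct side. Which hyperplane it lands on depends on the order of the samples, which is one reason the word "best" is in quotes above.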