
Machine learning – Supervised learning

Supervised learning – Bayesian inference

Bayes rule

In the case of a categorical variable d and a real vector x:

$$P(d \mid x) = \frac{p(x, d)}{p(x)} = \frac{p(x \mid d)\, P(d)}{p(x)} = \frac{p(x \mid d)\, P(d)}{\sum_{d'} p(x \mid d')\, P(d')}$$

P(d | x): probability that x is of class d,
p(x | d): distribution of x within class d,
P(d): frequency of class d.

Example of final classifier:

$$f(x; \theta) = \arg\max_d P(d \mid x)$$

Generative models carry more information: learning p(x | d) and P(d) allows one to deduce P(d | x). But they often require many more parameters and more training data.

Discriminative models are usually easier to learn and thus more accurate.
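As an illustration of the Bayes rule classifier above, here is a minimal sketch in Python. It assumes, purely for the example, that each class-conditional density p(x | d) is an isotropic Gaussian with known mean and variance; the helper names (`gaussian_pdf`, `posterior`, `classify`), the two classes "a" and "b", and all numerical values are hypothetical and not taken from the slides.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of an isotropic Gaussian N(mean, var * I) at the vector x (assumed model)."""
    k = len(x)
    sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    return math.exp(-0.5 * sq_dist / var) / (2.0 * math.pi * var) ** (k / 2.0)

def posterior(x, priors, likelihoods):
    """Return P(d | x) for every class d, using Bayes rule."""
    # Numerator of Bayes rule: p(x | d) P(d)
    joint = {d: likelihoods[d](x) * priors[d] for d in priors}
    # Denominator: p(x) = sum over classes of p(x | d) P(d)
    evidence = sum(joint.values())
    return {d: joint[d] / evidence for d in joint}

def classify(x, priors, likelihoods):
    """Final classifier: the class d that maximizes P(d | x)."""
    post = posterior(x, priors, likelihoods)
    return max(post, key=post.get)

# Hypothetical two-class problem in R^2 with equal class frequencies.
priors = {"a": 0.5, "b": 0.5}
likelihoods = {
    "a": lambda x: gaussian_pdf(x, (0.0, 0.0), 1.0),
    "b": lambda x: gaussian_pdf(x, (3.0, 3.0), 1.0),
}

post = posterior((0.5, 0.2), priors, likelihoods)
label = classify((0.5, 0.2), priors, likelihoods)
```

In a real generative model, the means, variances and class frequencies would be estimated from training data rather than fixed by hand.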
Machine learning – Workflow

Machine learning workflow

[Figure: machine learning workflow (Source: Michael Walker)]

Machine learning – Problem types

Problem types

[Figure: problem types (Source: Lucas Masuch)]
Machine learning – Clustering

Clustering

Clustering: group observations into “meaningful” groups.

[Figure (Source: Kasun Ranga Wijeweera)]

Task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar to each other than to objects in other groups.

Popular methods are K-means clustering and hierarchical clustering.

Machine learning – Clustering – K-means

Clustering – K-means

[Figure: data plotted as Feature #2 against Feature #1 (Source: Naftali Harris)]

1 Consider data in R^2 spread over three different clusters,
2 Randomly pick K = 3 data points as cluster centroids,
3 Assign each data point to the class with the closest centroid,
4 Update the centroids by taking the means within the clusters,
5 Go back to 3 until no more changes.
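The K-means steps listed above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a reference implementation: the function names, the `seed` and `max_iter` parameters, the squared Euclidean distance, the handling of empty clusters, and the toy data are all assumptions made for the example.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

def kmeans(points, k, seed=0, max_iter=100):
    """Return (centroids, labels) after running K-means on a list of tuples."""
    rng = random.Random(seed)
    # Step 2: randomly pick k data points as the initial centroids.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Step 3: assign each point to the cluster with the closest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Step 5: stop as soon as the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 4: move each centroid to the mean of the points assigned to it.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # assumption: an empty cluster keeps its old centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels

# Step 1: hypothetical data in R^2 spread over three clusters.
points = [
    (0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
    (5.0, 5.0), (5.2, 4.9), (4.9, 5.1),
    (0.0, 5.0), (0.1, 5.2), (0.2, 4.9),
]
centroids, labels = kmeans(points, k=3)
```

The result depends on the random initialization in step 2, so different seeds can lead to different final clusters; running the procedure several times and keeping the best result is a common remedy.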

