COS 424/SML 302 Features and Kernels February 18, 2019 24 / 49

Encoding a network as an adjacency matrix

  • Adjacency matrix A: may be directed (non-symmetric) or undirected (symmetric); weighted or unweighted
  • Adjacency can be thought of as a bag-of-neighbors: for every node, the number of features is the number of nodes, and the counts are edges
  • Eigenvectors of this matrix (e.g., PageRank [Page, Brin]) represent linear combinations of connected nodes, scaled by inverse path length
  • How can we compare networks, instead of nodes in a network?
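
The adjacency encoding above can be sketched in a few lines. This is a minimal illustration, not code from the lecture: the helper names `adjacency_matrix` and `is_symmetric` and the 4-node edge list are hypothetical, chosen only to show the directed/undirected distinction and the bag-of-neighbors reading of a row.

```python
def adjacency_matrix(n_nodes, edges, directed=False):
    """Build an n_nodes x n_nodes matrix from (source, target, weight) triples."""
    A = [[0.0] * n_nodes for _ in range(n_nodes)]
    for i, j, w in edges:
        A[i][j] += w
        if not directed and i != j:
            A[j][i] += w  # an undirected edge is stored in both directions
    return A


def is_symmetric(A):
    """True when A[i][j] == A[j][i] for every pair of nodes."""
    n = len(A)
    return all(A[i][j] == A[j][i] for i in range(n) for j in range(n))


# Hypothetical 4-node network; every weight is 1.0, i.e. unweighted.
edges = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0), (2, 3, 1.0)]
A_undirected = adjacency_matrix(4, edges)
A_directed = adjacency_matrix(4, edges, directed=True)

# Row i is node i's "bag of neighbors": one feature per node in the network.
bag_of_neighbors_2 = A_undirected[2]
```

Here `bag_of_neighbors_2` has one entry per node, with nonzero entries for nodes 0, 1, and 3, the neighbors of node 2. Passing `directed=True` fills only the source-to-target entry, so the resulting matrix is no longer symmetric.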
Features: summary

  • Feature extraction, or feature engineering, is an art, not a science
  • Carefully constructed features can mean the difference between a useful analysis and a failed analysis
  • Some features will be predictive; other features will not
  • Features may be correlated
  • The best features will be informed by the data domain and the analysis task
  • Prune extraneous features: feature selection. Why?

Kernel Functions

Kernels are a way to flexibly compare samples in a complex space. Kernels have shown great utility in comparing:

  • images of different sizes
  • protein sequences of different lengths
  • 3D structures of objects
  • networks with different numbers of edges or nodes
  • text documents of different lengths and formats
Why Kernels?

Often objects are difficult to compare on the basis of their features alone. What features would you use to classify:

  • 3D structures (e.g., molecules, proteins)
  • time series data (e.g., stock prices)
  • strings of unequal length (e.g., DNA sequences)
  • network structures across different sets of random variables (e.g., evolutionary trees)
  • many more...

Kernels are similarity functions that bypass feature representations.

Kernels for comparing samples

A well-defined kernel gives us a single number that quantifies the similarity between two samples. Evaluating it on every pair of samples gives a symmetric similarity matrix, here for five samples:

  1     0.32  0.23  0.62  0.44
  0.32  1     0.57  0.19  0.08
  0.23  0.57  1     0.81  0.29
  0.62  0.19  0.81  1     0.73
  0.44  0.08  0.29  0.73  1
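
A pairwise similarity matrix like the one above can be computed by evaluating a kernel on every pair of samples. The sketch below is an illustration under assumptions: the slide does not say which kernel produced its numbers, so a cosine (normalized linear) kernel stands in here, and the three feature vectors are made up.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine_kernel(u, v):
    """Normalized linear kernel (an assumed choice, not the slide's kernel)."""
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))


def kernel_matrix(samples, kernel):
    """Evaluate the kernel on every pair of samples."""
    return [[kernel(a, b) for b in samples] for a in samples]


# Hypothetical feature vectors, chosen only for illustration.
samples = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0], [3.0, 1.0, 0.0]]
K = kernel_matrix(samples, cosine_kernel)

for row in K:
    print("  ".join(f"{value:.2f}" for value in row))
```

With this kernel each sample has similarity 1 with itself and the matrix is symmetric, the same two properties visible in the matrix on the slide.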
Kernels for classification

Classifying or clustering samples:

  • Simple classifiers may not perform well for a given set of features
  • Kernels implicitly map features to a (higher-dimensional) feature space
  • Classifiers may work better in the kernelized feature space

When describing the naive Bayes classifier, we used a fixed set of features. Next lecture, we will describe classification methods that use kernels.
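
To make the idea of a higher-dimensional feature space concrete, here is a small sketch using the degree-2 polynomial kernel. That kernel is not named on the slide and is used here only as an assumed example. It shows that evaluating the kernel in the original 2-D space gives the same value as an inner product in an explicit 3-D feature space.

```python
import math


def phi(x):
    """Explicit degree-2 feature map for a 2-D input (x1, x2)."""
    x1, x2 = x
    return [x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2]


def inner(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def poly2_kernel(x, z):
    """Homogeneous polynomial kernel of degree 2: (x . z)^2."""
    return inner(x, z) ** 2


# Hypothetical inputs, chosen only for illustration.
x, z = [1.0, 2.0], [3.0, -1.0]
explicit = inner(phi(x), phi(z))  # inner product in the 3-D feature space
implicit = poly2_kernel(x, z)     # computed directly from the 2-D inputs

print(explicit, implicit)
```

The two values agree up to floating-point rounding, so a classifier that only needs inner products can work in the larger space without ever constructing `phi(x)`.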

Kernel function

Given some abstract space $X$ (e.g., documents, images, proteins), a function $\kappa : X \times X \to \mathbb{R}$ is called a kernel function.

Kernel functions quantify the similarity between two samples $x$ and $x'$ in $X$.

For a given feature vector $\phi(x)$, we can construct a naive kernel:

$$\kappa(x, x') = \phi(x)^T \phi(x')$$
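
As a minimal sketch of the naive kernel, assume $\phi$ is a bag-of-words count vector over a fixed vocabulary. The vocabulary, the documents, and the helper names `phi` and `naive_kernel` are all hypothetical and serve only to show the inner product at work on documents of different lengths.

```python
from collections import Counter


def phi(document, vocabulary):
    """Bag-of-words feature vector: one count per vocabulary word."""
    counts = Counter(document.lower().split())
    return [counts[word] for word in vocabulary]


def naive_kernel(x, x_prime, vocabulary):
    """kappa(x, x') = phi(x)^T phi(x')."""
    return sum(a * b for a, b in zip(phi(x, vocabulary), phi(x_prime, vocabulary)))


# Hypothetical vocabulary and documents of different lengths.
vocabulary = ["kernel", "feature", "network", "matrix"]
doc_a = "kernel matrix kernel feature"
doc_b = "feature matrix for a network"
doc_c = "network network network"

k_ab = naive_kernel(doc_a, doc_b, vocabulary)
k_ac = naive_kernel(doc_a, doc_c, vocabulary)
k_bc = naive_kernel(doc_b, doc_c, vocabulary)
```

Documents that share vocabulary words get a positive score, while `doc_a` and `doc_c` share none and score 0. The raw inner product also grows with document length, which is one reason normalized kernels are often preferred in practice.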