Regression-BasisFns


Machine Learning (Srihari)

Linear Models for Regression
Sargur Srihari

Overview
• Plan: discuss supervised learning, starting with regression
• Goal: predict the value of one or more target variables t
• Given: a d-dimensional vector x of input variables
• Terminology
  – Regression
    • When t is continuous-valued
  – Classification
    • When t takes values that are labels (non-ordered categories)
  – Ordinal Regression
    • Discrete values, ordered categories
    • Learning-to-rank problem: t is discrete (e.g., 1, 2, ..., 6) in the training set, but a continuous value in [1, 6] is learnt and used to rank objects

Types of Linear Regression Models
• Simplest form of linear regression model:
  – Linear function of a single input variable: $y(x, \mathbf{w}) = w_0 + w_1 x$
  – Task is to learn the weights $w_0, w_1$ from data $D = \{(y_i, x_i)\}_{i=1,\dots,N}$
• More useful class of functions:
  – Polynomial curve fitting: $y(x, \mathbf{w}) = w_0 + w_1 x + w_2 x^2 + \dots = \sum_i w_i x^i$
  – Linear combination of nonlinear functions of the input variables, $\phi_i(x)$ instead of $x^i$, called basis functions
• These models are linear functions of the parameters (which gives them simple analytical properties), yet are nonlinear with respect to the input variables
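
The polynomial model above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the slide says the weights are learned from data but does not name a fitting criterion, so ordinary least squares is assumed here, and the function names and the synthetic data are hypothetical.

```python
import numpy as np

def polynomial_design_matrix(x, degree):
    """Return the N x (degree + 1) matrix whose column j is phi_j(x) = x**j."""
    x = np.asarray(x, dtype=float)
    return np.vander(x, N=degree + 1, increasing=True)

def fit_least_squares(Phi, t):
    """Weights w minimizing ||Phi w - t||^2 (assumed criterion, not from the slides)."""
    w, *_ = np.linalg.lstsq(Phi, t, rcond=None)
    return w

def predict(x, w):
    """Evaluate y(x, w) = sum_j w_j x**j for each input in x."""
    return polynomial_design_matrix(x, len(w) - 1) @ w

# Synthetic, noise-free data generated from a known quadratic (illustrative only).
x_train = np.linspace(-1.0, 1.0, 20)
t_train = 1.0 - 2.0 * x_train + 0.5 * x_train**2

w_hat = fit_least_squares(polynomial_design_matrix(x_train, 2), t_train)
```

The model is nonlinear in x, yet the fit is a single linear solve because y is linear in w. Swapping `polynomial_design_matrix` for any other set of basis functions leaves `fit_least_squares` unchanged.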

Regression: Learning To Rank
• Input ($x_i$): d features of a query-URL pair, for example:
  – Log frequency of query in anchor text
  – Query word in color on page
  – # of images on page
  – # of (out) links on page
  – PageRank of page
  – URL length
  – URL contains "~"
  – Page length
• Output (y): relevance value
• In the LETOR 4.0 dataset: 46 query-document features, maximum of 124 URLs/query (d > 200)
• Yahoo! data set has d = 700
• Target variable
  – Point-wise (0, 1, 2, 3)
  – Regression returns a continuous value
  – Allows fine-grained ranking of URLs
• Traditional IR uses TF/IDF
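
The point-wise idea (train on discrete relevance grades, rank by the continuous prediction) can be sketched as follows. Everything concrete here is an assumption for illustration: the two features, the toy grades, the function names, and the use of least squares with an intercept are not taken from LETOR or the Yahoo! data.

```python
import numpy as np

def fit_pointwise_ranker(X, y):
    """Least-squares linear regression on relevance grades, with an intercept."""
    X1 = np.column_stack([np.ones(len(X)), X])
    w, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return w

def rank_urls(X, w):
    """Return URL indices sorted by descending predicted relevance, plus the scores."""
    scores = np.column_stack([np.ones(len(X)), X]) @ w
    order = np.argsort(-scores, kind="stable")
    return order, scores

# Hypothetical training set: 6 query-URL pairs, 2 made-up features each,
# with point-wise relevance grades in {0, 1, 2, 3}.
X_train = np.array([[0.0, 5.0], [1.0, 3.0], [2.0, 8.0],
                    [3.0, 1.0], [1.0, 7.0], [2.0, 2.0]])
y_train = np.array([0.0, 1.0, 2.0, 3.0, 1.0, 2.0])

w = fit_pointwise_ranker(X_train, y_train)

# Three candidate URLs for a new query; the regression output is continuous,
# so it can separate URLs that would share the same discrete grade.
X_new = np.array([[0.5, 4.0], [2.5, 4.0], [1.5, 4.0]])
order, scores = rank_urls(X_new, w)
```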

NRC Ranking of PhD programs (2006)
• S (survey) ranking
  – Ask faculty to rate how important d = 20 characteristics are to program quality
  – Randomly draw half of faculty program ratings 500 times
    • to produce 500 sets of direct weights
• R (regression) ranking
  – Ask faculty to rate quality (t = 1, ..., 6) of N specific programs in their field
    • Values of $t_i$ used for regression
  – Randomly draw half of program ratings 500 times
    • Obtain 500 sets of regression weights for the d characteristics
• Gather raw data ($x_i$) from institutions and other sources: measures of faculty productivity, student support and outcomes, diversity
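
The repeated half-sampling in the R ranking can be sketched as below. This is not the NRC's actual procedure: sampling without replacement, a least-squares fit with an intercept, the function name, and the synthetic ratings are all assumptions. The 5%/95% quantiles and the standard deviation are just one plausible way to summarize the 500 weight sets.

```python
import numpy as np

def half_sample_weights(X, t, n_draws=500, seed=0):
    """Fit a regression on a random half of the ratings, n_draws times.

    Returns an (n_draws, d + 1) array; column 0 is the intercept.
    """
    rng = np.random.default_rng(seed)
    n = len(t)
    X1 = np.column_stack([np.ones(n), X])
    weights = np.empty((n_draws, X1.shape[1]))
    for k in range(n_draws):
        # Assumed: each draw takes half of the ratings without replacement.
        idx = rng.choice(n, size=n // 2, replace=False)
        w, *_ = np.linalg.lstsq(X1[idx], t[idx], rcond=None)
        weights[k] = w
    return weights

# Synthetic stand-in for program characteristics and quality ratings.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(200, 3))
true_w = np.array([0.5, -0.2, 0.0])
t = 3.5 + X @ true_w + 0.1 * data_rng.normal(size=200)

weights = half_sample_weights(X, t)
q05, q95 = np.quantile(weights, [0.05, 0.95], axis=0)  # per-weight interval
stdev = weights.std(axis=0)                            # per-weight spread
```

A weight whose interval from `q05` to `q95` straddles zero, as for the third characteristic in this synthetic setup, has a sign that is not stable across the random draws.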

NRC Weights ($w_i$) for CS

| Characteristic | R-based 0.05 | R-based 0.95 | R-Stdev | S-based 0.05 | S-based 0.95 | S-Stdev |
| Publications per Allocated Faculty | 0.110 | 0.135 | 0.009 | 0.132 | 0.138 | 0.002 |
| Cites per Publication | 0.067 | 0.095 | 0.011 | 0.148 | 0.155 | 0.002 |
| Grants per Allocated Faculty | -0.001 | 0.047 | 0.014 | 0.129 | 0.135 | 0.002 |
| Percent Faculty Interdisciplinary | 0.044 | 0.068 | 0.008 | 0.044 | 0.049 | 0.001 |
| Percent Non-Asian Minority Faculty | 0.038 | 0.070 | 0.010 | 0.005 | 0.007 | 0.001 |
| Percent Female Faculty | 0.038 | 0.086 | 0.016 | 0.008 | 0.010 | 0.001 |
| Awards per Allocated Faculty | 0.083 | 0.125 | 0.015 | 0.099 | 0.106 | 0.002 |
| Average GRE-Q | 0.066 | 0.119 | 0.015 | 0.059 | 0.063 | 0.001 |
| Percent 1st yr. students w/ full support | -0.011 | 0.050 | 0.020 | 0.066 | 0.070 | 0.001 |
| Percent 1st yr. students with portable fellowships | -0.054 | -0.010 | 0.014 | 0.042 | 0.045 | 0.001 |
| Percent Non-Asian Minority Students | -0.047 | 0.004 | 0.015 | 0.011 | 0.014 | 0.001 |