Rely on model selection / methods robust to irrelevant features


Error metric: Root mean squared deviation (RMSD)

ALS Prediction Prize: Evaluation
   Oct. 1, 2012: Test set released to contestants

The Final Contest Data
•  918 training patients + 279 test patients
   12 months of data (demographic, ALSFRS, vital statistics, lab tests)
•  625 validation patients determined prize winners
   Data never seen by contestants, no prior feedback given
   Tests ability to generalize to new patients

Our Approach
Featurization
•  Static Data
•  Time Series Data
Modeling and Inference
•  Bayesian Additive Regression Trees
Post-hoc Evaluation
•  BART Performance
•  Feature Selection
•  Model Comparison

Featurization
Goal: Compact numeric representation of each patient
•  Features will serve as covariates in a regression model
•  Most extracted features will be irrelevant
•  Rely on model selection / methods robust to irrelevant features
Issue: Features manually specified by non-expert (me)
Open Question: Automatic featurization of longitudinal data?

Static Data
Demographics: Age, Race, Sex
ALS History: Time from onset, Site of onset
Family History: Mother, Father, Grandmother, Uncle…
Categorical variables encoded as binary indicators
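Two items in the slides are concrete enough to sketch in code: the RMSD error metric and the encoding of categorical static data as binary indicators. The following is a minimal Python sketch under stated assumptions, not the presenter's implementation. The function names (rmsd, binary_indicators), the field names, and the patient values are all hypothetical and chosen only for illustration.

```python
import math


def rmsd(predicted, actual):
    """Root mean squared deviation between two equal-length sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def binary_indicators(records, column):
    """Replace one categorical column with one 0/1 column per observed level."""
    levels = sorted({r[column] for r in records})
    encoded = []
    for r in records:
        # Copy every other field unchanged.
        row = {k: v for k, v in r.items() if k != column}
        # Add one indicator per level; exactly one of them is 1 per record.
        for level in levels:
            row[f"{column}={level}"] = 1 if r[column] == level else 0
        encoded.append(row)
    return encoded


# Hypothetical patients: field names and values are made up for illustration.
patients = [
    {"age": 52, "site_of_onset": "limb"},
    {"age": 61, "site_of_onset": "bulbar"},
]
features = binary_indicators(patients, "site_of_onset")

# Hypothetical predictions and outcomes, only to exercise the metric.
error = rmsd([1.0, 2.0], [1.0, 4.0])
```

Each categorical level becomes its own numeric column, so the result can be used directly as covariates in a regression model. Many of these indicator columns may turn out to be irrelevant, which is why the slides lean on model selection or methods that tolerate irrelevant features.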

This note was uploaded on 02/03/2014 for the course STATS 202 taught by Professor Taylor during the Fall '09 term at Stanford.
