
17 Pages

### lecture9

Course: STA 216, Fall 2008
School: Duke

Word Count: 851


#### Unformatted Document Excerpt

##### Discrete Time Survival Models

$$
\lambda_j = P(T_i = j \mid T_i \ge j, x_i) = h(\alpha_j + x_i'\beta),
$$

where

- $\lambda_j$ is the discrete hazard,
- $\alpha = (\alpha_1, \ldots, \alpha_k)'$ are parameters characterizing the baseline hazard,
- $x_i$ are time-independent covariates,
- $\beta$ are regression coefficients.

##### Proportional Hazards in Discrete Time

Assume $\lambda(t) = \lambda_0(t)\exp(x_i'\beta)$, and let $S_i$ denote the continuous event time, with survival function $S(\cdot)$ and cumulative hazard $\Lambda_i(\cdot)$. Suppose that $T_i = j$ if $S_i \in (a_{j-1}, a_j]$, for $j = 1, \ldots, k$. Then the discrete hazard is as follows:

$$
\begin{aligned}
\Pr(T_i = j \mid T_i \ge j, x_i) &= 1 - \frac{S(a_j)}{S(a_{j-1})} = 1 - \frac{\exp(-\Lambda_i(a_j))}{\exp(-\Lambda_i(a_{j-1}))} \\
&= 1 - \exp\left\{-\int_{a_{j-1}}^{a_j} \lambda_0(t)\exp(x_i'\beta)\,dt\right\} \\
&= 1 - \exp\left\{-\exp(x_i'\beta)\,(\Lambda_0(a_j) - \Lambda_0(a_{j-1}))\right\} \\
&= 1 - \exp\left\{-\exp(\alpha_j + x_i'\beta)\right\},
\end{aligned}
$$

where $\alpha_j = \log(\Lambda_0(a_j) - \Lambda_0(a_{j-1}))$ and $\Lambda_0(t) = \int_0^t \lambda_0(s)\,ds$.

Thus, a Cox proportional hazards model can be fit using a discrete-time approximation by using a binary response GLM with a complementary log-log link.

In doing this, the discrete event time $T_i$ must be coded as a $T_i \times 1$ vector of binary responses, $y_i = (0, \ldots, 0, \delta_i)'$, where $\delta_i = 1$ if the event was observed in interval $T_i$ and $\delta_i = 0$ if the subject was censored.

The corresponding design matrix is then $X_i = (x_{i1}, \ldots, x_{i,T_i})'$, where $x_{ij}$ is a $(k + p) \times 1$ vector consisting of 0s in each of the first $k$ positions except for the $j$th, which has a 1. The last $p$ elements are fixed at $x_i$.

##### Time-Varying Covariates

Often, in applications, one or more of the predictors may vary over time. For example, suppose that we are interested in assessing the effect of air pollution levels on mortality. The air pollution levels vary from day to day.

Reasonable model for the discrete hazard of death in age group $j$?

$$
\Pr(T_i = j \mid T_i \ge j, x_i, z_{ij}) = h(\alpha_j + x_i'\beta + z_{ij}\gamma),
$$

where $z_{ij}$ is the level of pollution for individual $i$ at age $j$.

This model accommodates the time-varying covariate. Are we making a restrictive assumption?

In the previous model, we assumed that the effect of air pollution was constant at different ages. In fact, infants and the elderly are more susceptible to pollution-induced mortality. How can we generalize the model to account for this age-dependent susceptibility?
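The binary coding and design matrix described above for the discrete-time proportional hazards model can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function names (`expand_subject`, `cloglog_hazard`, `linear_predictor`) and all numeric values are made up for the example and do not come from the lecture. It expands one subject into $T_i$ binary responses and a $T_i \times (k + p)$ design matrix, then evaluates the hazard $1 - \exp\{-\exp(\alpha_j + x_i'\beta)\}$ for each row.

```python
import math


def cloglog_hazard(eta):
    """Discrete hazard 1 - exp(-exp(eta)) implied by the complementary log-log link."""
    return 1.0 - math.exp(-math.exp(eta))


def expand_subject(t_i, delta_i, x_i, k):
    """Code one subject as T_i binary responses and a T_i x (k + p) design matrix.

    t_i     : discrete event or censoring interval, 1 <= t_i <= k
    delta_i : 1 if the event was observed in interval t_i, 0 if censored
    x_i     : list of p time-independent covariates
    k       : number of intervals
    """
    if not 1 <= t_i <= k:
        raise ValueError("t_i must lie in 1..k")
    # Response vector (0, ..., 0, delta_i) of length t_i.
    y = [0] * (t_i - 1) + [delta_i]
    rows = []
    for j in range(1, t_i + 1):
        # First k entries: indicator of interval j; last p entries: x_i.
        indicator = [1 if m == j else 0 for m in range(1, k + 1)]
        rows.append(indicator + list(x_i))
    return y, rows


def linear_predictor(row, alpha, beta):
    """Inner product of a design row with the stacked coefficients (alpha, beta)."""
    coef = list(alpha) + list(beta)
    return sum(r * c for r, c in zip(row, coef))


# Hypothetical subject: event in interval 3 of k = 4, with p = 2 covariates.
y, X = expand_subject(t_i=3, delta_i=1, x_i=[0.5, -1.0], k=4)
alpha = [-2.0, -1.5, -1.0, -0.5]  # made-up baseline parameters
beta = [0.3, 0.1]                 # made-up regression coefficients
hazards = [cloglog_hazard(linear_predictor(row, alpha, beta)) for row in X]
```

Stacking the `y` vectors and `X` rows over all subjects gives the data for a binary response GLM with a complementary log-log link. A time-varying covariate such as $z_{ij}$ would simply add one more column whose value changes from row to row.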
##### Time-Varying Coefficients

$$
\Pr(T_i = j \mid T_i \ge j, x_i, z_{ij}) = h(\alpha_j + x_i'\beta + z_{ij}\gamma_j),
$$

where we have now added a $j$ subscript to the parameter characterizing the air pollution effect.

Potentially, the effect of the time-independent predictors can also vary with time by allowing different $\beta$s for the different age intervals.

Dimensionality rapidly becomes problematic. Order restrictions?

What about computation and inference from these models?

- If we're frequentist, we can just fit the binary response GLM and proceed as before (maximum likelihood estimation, analysis of deviance, etc.).
- If we're Bayesian, we can potentially also proceed as in binary response GLMs, either using adaptive rejection sampling or (if probit) the Albert and Chib approach.

##### Continuation Ratio Probit Models

$$
\Pr(T_i = j \mid T_i \ge j, x_{ij}) = \Phi(x_{ij}'\beta),
$$

where we can potentially parameterize $x_{ij}$ to allow a nonparametric baseline and time-varying coefficients.

Note that $T_i \in \{1, \ldots, k\}$, with $k$ potentially large. Thus, we have a potentially large number of parameters, including the time-varying coefficients.

By choosing a probit model, we can update the high-dimensional $\beta$ vector jointly after augmenting the data with latent normal variables.

Now, we have $y_i = (0, \ldots, 0, \delta_i)'$ as a $T_i \times 1$ outcome vector for subject $i$. We introduce a vector $z_i = (z_{i1}, \ldots, z_{i,T_i})'$ of independent normal variables underlying $y_i$:

$$
y_{ij} = 1(z_{ij} > 0) \quad \text{and} \quad z_{ij} \sim N(x_{ij}'\beta, 1), \quad \text{for } j = 1, \ldots, T_i.
$$

The Gibbs sampler proceeds as before.

What about the prior specification? We have a potentially high-dimensional vector of time-varying baseline parameters and coefficients, and potentially no individuals with the event in certain intervals. What type of informative prior is reasonable?

Focusing initially on the model with no time-varying coefficients, we may want to do some smoothing. Values of $\alpha_j$ and $\alpha_{j'}$ are likely to be similar if $j$ is close to $j'$.

Autoregressive (Gaussian random walk) prior:

$$
\alpha_j \sim N(\alpha_{j-1}, \tau^{-1}),
$$

where $\tau$ is a precision parameter controlling the degree of smoothing.
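Two ingredients of the probit setup above are easy to sketch with the Python standard library: the draw of a latent $z_{ij}$ given $y_{ij}$, and the log density of the Gaussian random walk prior. The sketch below is illustrative only. The function names are hypothetical, the latent draw uses a simple inverse-CDF method (one of several ways to sample a truncated normal, and numerically fragile far in the tails), and the prior is evaluated conditionally on $\alpha_1$ because the passage above does not specify a distribution for the first element.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def draw_latent(eta, y, rng):
    """Draw z ~ N(eta, 1) truncated to z > 0 if y == 1 and to z <= 0 if y == 0."""
    p0 = STD_NORMAL.cdf(-eta)  # Pr(z <= 0) under N(eta, 1)
    u = rng.random()
    if y == 1:
        q = p0 + u * (1.0 - p0)  # uniform on the upper part (p0, 1)
    else:
        q = u * p0               # uniform on the lower part (0, p0)
    # inv_cdf is undefined at exactly 0 or 1, so clamp away from the ends.
    q = min(max(q, 1e-12), 1.0 - 1e-12)
    return eta + STD_NORMAL.inv_cdf(q)


def random_walk_log_prior(alpha, tau):
    """Log density of alpha_2..alpha_k under alpha_j ~ N(alpha_{j-1}, 1/tau).

    Assumption for this sketch: alpha_1 is treated as given.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    k = len(alpha)
    sq = sum((alpha[j] - alpha[j - 1]) ** 2 for j in range(1, k))
    return 0.5 * (k - 1) * math.log(tau / (2.0 * math.pi)) - 0.5 * tau * sq


# Hypothetical usage with a made-up linear predictor value.
rng = random.Random(0)
z_event = draw_latent(eta=-0.4, y=1, rng=rng)     # latent value for y_ij = 1
z_no_event = draw_latent(eta=-0.4, y=0, rng=rng)  # latent value for y_ij = 0
smooth = random_walk_log_prior([0.0, 0.1, 0.2], tau=4.0)
rough = random_walk_log_prior([0.0, 2.0, -2.0], tau=4.0)
```

For a fixed $\tau$, a smoother sequence of $\alpha_j$ values receives a higher log prior density than a jagged one, which is the sense in which $\tau$ controls the degree of smoothing.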
##### Penalized Likelihood

The autoregressive prior essentially penalizes values of $\alpha_j$ that are far from the neighboring values.

From a frequentist perspective, we can use a similar idea by including a penalty term in the likelihood and then maximizing the resulting penalized likelihood.

The penalty term can follow many forms, including an autoregressive normal density for the $\alpha$s.

Oft...
