HW4 - compound, represented by the logarithm of the...


STA 4702/5701 Homework 4
Due by 5pm on April 6

Data Analysis

The data for this analysis, from Umetrics (1995), come from the field of drug discovery. New drugs are developed from chemicals that are biologically active. Testing a compound for biological activity is an expensive procedure, so it is useful to be able to predict biological activity from cheaper chemical measurements. In fact, computational chemistry makes it possible to calculate certain chemical measurements without even making the compound. These measurements include size, lipophilicity, and polarity at various sites on the molecule. On the course web site there is a .sas file that creates a data set named penta, on which you will run the analyses. You would like to study the relationship between these measurements and the activity of the compound, represented by the logarithm of the relative Bradykinin activating activity, logRAI. Notice that these data consist of many predictors relative to the number of observations.

Use Proc PLS to perform both a principal components regression (PCR) and a partial least squares regression (PLSR) to find a few underlying predictive factors that account for most of the variation in the response. Conduct the appropriate model selection and report your conclusion in journal article format. Your Introduction should compare the general advantages of PCR with those of PLSR, and your Results section should include a comparison of the resulting models for the two techniques. Each analysis is worth 50 points.

1...
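
As a starting point, the two fits requested above could be set up roughly as follows. This is only a minimal sketch, not the required solution: it assumes the course .sas file has already been run so that the data set penta exists, that the response variable is named logRAI as written in the assignment, and that the predictors are named S1-S5, L1-L5, and P1-P5. Those predictor names are hypothetical placeholders and should be replaced with whatever names the data step actually defines. Leave-one-out cross validation (cv=one) is one reasonable choice for model selection with so few observations; the seed value is arbitrary.

```sas
/* Assumption: the course .sas file has been run, so WORK.penta exists.   */
/* Assumption: predictor names S1-S5, L1-L5, P1-P5 are placeholders;      */
/* substitute the variable names actually defined in penta.               */

/* Principal components regression with leave-one-out cross validation.  */
/* CVTEST requests a randomization test to help choose the factor count. */
proc pls data=penta method=pcr cv=one cvtest(seed=12345);
   model logRAI = S1-S5 L1-L5 P1-P5;
run;

/* Partial least squares regression with the same validation settings,   */
/* so the two models are selected under comparable conditions.           */
proc pls data=penta method=pls cv=one cvtest(seed=12345);
   model logRAI = S1-S5 L1-L5 P1-P5;
run;
```

Running both steps with identical cross validation settings makes it easier to compare the number of factors each method selects and how much of the variation in the predictors and in the response those factors account for.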

This note was uploaded on 07/14/2011 for the course STA 4702 taught by Professor Staff during the Spring '08 term at University of Florida.
