Class notes for “Introduction to R for Bioinformatics (BM1)”

Stephen Ellner*, modified by Ben Bolker†, further modified by Ramon Diaz-Uriarte‡

October 19, 2009

* Ecology and Evolutionary Biology, Cornell University
† Department of Zoology, University of Florida
‡ Spanish National Cancer Centre (CNIO), and Universidad Autonoma de Madrid, Spain

1 Scenarios

• You are designing an experiment: 20 plates are to be assigned (randomly) to 4 conditions. You are too young (or too old) to cut paper into pieces, place them in an urn, etc. You want a better, faster way, especially because your next experiment will involve 300 units, not 20.

• The authors of a paper claim there is a weak relationship between levels of protein A and growth. However, you know that some of the samples are from males and some are from females, and you suspect the correlation is present only in males. The authors provide the complete data.

• You’ve been working on a microarray study. For 100 subjects (50 of them with leukemia, 50 of them healthy) you have the Cy3/Cy5 intensity ratios for 300,000 spots. You just got the email with the compressed data file. You are leaving for home. In less than five minutes you’d like to get a quick idea of what the data look like: maximum and minimum values for all spots, average for 5 specific control spots (corresponding to probes 10, 23, 56, 10,004, 20,000), and a quick-and-dirty statistical test of differences for two specific probes (probes 7,000 and 99,000, which correspond to two well-known genes).

• Tomorrow you’ll look at the data in more detail. For a set of 20 selected probes you will want to: a) look at the mean intensity, the variance of the intensity, and the mean intensity in each of the two groups; b) plot the intensity vs. the age of the subject; c) plot the log of the intensity vs. the age of the subject.

• A paper describes a specific growth curve model (some non-linear function). You would like to see what the actual curve looks like, and how much variation you get if you modify the parameters slightly.

For each of those problems, would you ...

• Know how to do it?
• Do it quickly?
• Save all the steps of what you did so that 6 months from today you know exactly what you did, can repeat it, and apply it to new data?

This course is a quick introduction to an “environment for statistical computing and graphics” that will allow you to carry out each of the above.

2 This course, this document

2.1 What you should expect from this section of the course. What I (RDU) expect from you

This is a crash introduction to R, an “environment for statistical computing and graphics”....
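As a first taste of what this looks like in practice, here is a minimal sketch of the first scenario from Section 1 (randomly assigning 20 plates to 4 conditions). The condition names, the seed, and the equal group sizes (5 plates per condition) are illustrative assumptions, not part of the original notes; `sample()`, `rep()` and `set.seed()` are base R functions.

```r
# Hypothetical sketch: randomly assign 20 plates to 4 conditions (5 plates each).
set.seed(1)  # arbitrary seed, used only so the assignment can be reproduced later

n_plates <- 20
conditions <- c("A", "B", "C", "D")  # placeholder condition names

# Repeat each condition label 5 times, then shuffle the labels with sample()
labels <- rep(conditions, each = n_plates / length(conditions))
assignment <- sample(labels)

# One row per plate, with its randomly assigned condition
design <- data.frame(plate = seq_len(n_plates), condition = assignment)

# Check the balance: each condition should appear 5 times
table(design$condition)
```

Setting `n_plates <- 300` would cover the larger experiment in the same way, as long as the number of units is a multiple of the number of conditions. Saving these lines in a script also keeps a record of exactly how the assignment was made.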
This note was uploaded on 04/06/2010 for the course COMPUTER S COSC1520 taught by Professor Paul during the Spring '09 term at York University.