Figure 1: The Univac II, circa 1958.
Figure 2: The program Z(1) = Y + W(1) on a punch card.
Statistics and Computers
The U.S. Bureau of the Census used the UNIVAC computer to process the
1950 Census data. In 1951, a report stated that “it would take at least 650 keypunch
operators, working on 17 document punch machines during the week of peak
Census processing, to produce a million punched cards completely edited and
ready for tabulation.” To realize productivity gains, and to pursue their interest
in technical innovation, the Bureau acquired an electronic computing machine.
1.1 Introducing R
R is a statistics program and programming language capable of most basic and
advanced statistical procedures.
Some benefits:
• It’s free (GNU General Public License).
• Works on all major platforms.
• Compatible with S.
Anthony Tanbakuchi
MAT167
Introduction to R
• It’s a powerful calculator.
• It’s simple (therefore easy).
• It encourages reproducible research.
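To make the "powerful calculator" point concrete, here is a minimal sketch of a few expressions you could type at the R prompt. The numbers are chosen only for illustration and do not come from the lecture.

```r
# Arithmetic typed directly at the R prompt; R prints each result.
2 + 3 * 4    # multiplication is done before addition: 14
sqrt(16)     # built-in square root function: 4
10 / 4       # division gives a decimal result: 2.5
2^10         # exponentiation: 1024
```

Because these commands are plain text, they can be saved in a script file and rerun later, which is one way R supports reproducible work.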
S Version 3 (1983 (the `blue book’)
• Merged some new ideas with
• “Everything is an object” (inc functions).
• Functional evaluation model.
• .C(), .Fortran(), no Interface
• No direct back compatibility

Statistical Models in S (S3) (the `white book’)
• An object-based approach.
• Model formulas (& terms objects).
• Data Frames (& model frames, …).
• S3 methods
  – Give the user a simple call for plot, summary, predict, etc.
  – Minimal additions to S engine & API
Figure 3: Timeline of R. (Credit: From John M. Chambers 2006 talk.)
Why not use another statistics package?
• We could, but most complete packages cost hundreds to thousands of
dollars! Since R can do basic through highly advanced statistics, and
it’s free, it is a good choice!
• Why not use Excel? Excel, while an excellent spreadsheet package, is not
a statistical software package. Even with its statistical add-in (the
Analysis ToolPak), it cannot adequately perform all the necessary
functions.
• Why not use a TI statistics calculator? We could use it for trivial
problems, but you would be unlikely to use it after the class for real data.
It’s just not reasonable to enter large data sets (with potentially thousands
of numbers) into a calculator, and you can’t readily put your work or graphs
into a report.
• By learning R, you will learn a real-world statistics program that you
can actually use in your work if needed. Since it’s free and accepted by
the scientific community, you won’t have to ask your company to buy an
expensive piece of software. Perhaps you can convince your boss to give
you a raise since you’ve saved them thousands of dollars by using R!
Four key things you must learn from this lecture
By the end of this lecture, you must be able to:
1. Store a set of data (vector) in a variable.
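As a minimal sketch of item 1, the example below stores a small data set in a variable using `c()` and the assignment operator `<-`. The variable name `scores` and the five values are hypothetical, made up purely for illustration.

```r
# Hypothetical data: five exam scores (made up for this example).
# c() combines the values into a vector; <- stores it in a variable.
scores <- c(82, 91, 77, 68, 95)

scores            # typing the variable name prints the stored vector
length(scores)    # number of values stored: 5
mean(scores)      # arithmetic mean of the values: 82.6
```

Once the data are stored in a variable, they can be reused in later commands without retyping the numbers.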
 Spring '11
 Tanbakuchi
 Statistics, Mean, Anthony Tanbakuchi
