Solution, Assignment #4
Prof Stine
Due February 15
An important aspect of working with real data is recognizing and dealing with missing data, sometimes in
surprising places. Question 1, for instance, is no
#=
# CONCEPT: VECTOR COMPUTATIONS
# - The simplest forms of data consisting of a single variable
#   can be stored in a vector. (Later we will introduce matrices
#   and dataframes for data with more variables.)
#
#   Two typical tasks on such simple data are:
#
#=
# INDEXING / SUBSETTING VECTORS WITH INTEGERS
#
#   In what follows we will put all three data types to work for the
#   task of pulling data out of a larger data vector. Subsetting
#   datasets is one of the most important tasks
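As a minimal sketch of that task, the example below subsets one vector with each of the three index types (integer, logical, character). The vector `heights` and its names are made up for illustration and do not come from the course files.

```r
# Hypothetical named vector, invented for this example
heights <- c(ann = 160, bob = 175, cy = 182, dee = 168)

# Integer index: positions to keep (a negative position drops that element)
by_position   <- heights[c(1, 3)]
without_first <- heights[-1]

# Logical index: keep the elements where the condition is TRUE
tall <- heights[heights > 170]

# Character index: look elements up by name
by_name <- heights[c("bob", "dee")]
```

All three forms return a vector of the same type as `heights`; only the way the elements are selected differs.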
#=
# RANDOM NUMBERS:
# - Random numbers are the food of stochastic simulations.
#   They are essentially 'iid' draws from probability distributions.
#   Even though they 'look' truly random and distributed according
#   to the desired distribution, they are rea
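R's random number generators are deterministic algorithms started from a seed, so one way to see what lies behind the "random" look is to fix the seed and draw twice. This is a small sketch; the seed value 405 is arbitrary.

```r
# Fixing the seed makes the stream of draws reproducible
set.seed(405)                 # arbitrary seed, chosen for illustration
first_draw <- rnorm(5)

set.seed(405)                 # same seed, so the same sequence follows
second_draw <- rnorm(5)

identical(first_draw, second_draw)   # TRUE

# Without resetting the seed, the generator simply continues its sequence
third_draw <- rnorm(5)
```

Setting a seed at the top of a simulation script is a common way to make its results repeatable.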
#=
# INDEXING / SUBSETTING VECTORS WITH INTEGERS
#
#   In what follows we will learn two of four ways of solving the
#   task of pulling data out of a larger data vector. Subsetting
#   datasets is one of the most important tasks in any analysis o
Second R Practice
January 18, 2012
For this assignment we will be using the famous Boston housing data set.
You can download it here:
http://www-stat.wharton.upenn.edu/~magarick/471/boston.dat
Descriptions of the variables are here:
http://www-stat.wharton.u
First practice in R
February 7, 2012
Let's look at a simple data set from your first regression class here at Penn.
First grab the file:
http://www-stat.wharton.upenn.edu/~waterman/fsw/datasets/txt/Cleaning.txt
Once you have this file, you can analyse it either u
Obama vs the SP500
First the data itself:
> obama <- read.csv("obama.csv")
> sp500 <- read.csv("sp500.csv")
> obama$rowIndex <- 1:(dim(obama)[1])
> both <- merge(sp500, obama, "Date")
> both <- both[sort.list(both$rowIndex), ]
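The `rowIndex` column is there because `merge()` sorts its result on the key column by default, which scrambles dates stored as text. The sketch below reproduces the same steps on toy data; the two small data frames and their values are invented stand-ins for obama.csv and sp500.csv, whose real columns are not shown here.

```r
# Toy stand-ins for the two files (values invented for illustration)
obama <- data.frame(Date  = c("9/1/2010", "10/1/2010", "11/1/2010"),
                    Obama = c(45, 46, 44))
sp500 <- data.frame(Date  = c("9/1/2010", "10/1/2010", "11/1/2010"),
                    SP500 = c(1080, 1146, 1184))

obama$rowIndex <- seq_len(nrow(obama))     # remember the original row order
both <- merge(sp500, obama, by = "Date")   # result is sorted on the text key
both <- both[order(both$rowIndex), ]       # restore the original order
```

Here `seq_len(nrow(obama))` and `order()` are equivalent alternatives to `1:(dim(obama)[1])` and `sort.list()`.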
Basic plots:
plot(both$Obama
#=
# CONCEPT: VARIABLE NAMES AND ASSIGNMENT OF VALUES/DATA
# - As in math, we can use 'variable names' to point
#   to values and data structures.
#
#   Examples:
x <- 1.2     # preferred
x = 1.2      # same, less preferred
1.2 -> x     # ok but rarely used
x            # Printing t
#=
#
# - SOME BASICS: HISTORY, NUMBERS, OPERATIONS, FUNCTIONS -#
#
#   WHY R?
#
#   . We need standards - R is one of them.
#   . Huge developer community
#   . New stats algorithms appear first as R packages.
#   . Growing user community, also in industry
#   . Po
Solutions, Assignment #2
Prof Stine
Due February 1
This R-Notebook has both the questions as well as places for you to insert your answers. Turn in the .Rmd
file with your name inserted above in the header.
Solutions #3
Prof Stine
Due February 8
Question 1 Dates in R
R has numerous functions for manipulating dates. Most important, R can read dates from files in a variety
of formats and convert these into its o
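A short sketch of reading dates from text: `as.Date()` takes a format string describing the layout and returns an object of class `Date`, which supports arithmetic. The date strings below are made up for illustration.

```r
# Dates usually arrive as text; these strings are invented examples
us_style  <- as.Date("02/08/2017", format = "%m/%d/%Y")
iso_style <- as.Date("2017-02-15")    # ISO yyyy-mm-dd needs no format string

class(us_style)                       # "Date"
days_apart <- as.numeric(iso_style - us_style)   # 7
year_text  <- format(us_style, "%Y")             # "2017"
```

The same `%`-codes are used in both directions: by `as.Date()` to parse text and by `format()` to print a date.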
Lecture 2: Matrices and Lists
Robert Stine, Stat 405/471
Spring 2017
Lecture 1 introduces vectors, one-dimensional sequences consisting of all numbers, character strings, or
logical elements. Matrices organ
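As a small illustration of the step from vectors to matrices, the sketch below builds a matrix from a vector and indexes it by row and column. The values are arbitrary.

```r
# A 2 x 3 matrix built from a vector; R fills it column by column
m <- matrix(1:6, nrow = 2)

dim(m)             # 2 3
m[2, 3]            # 6: row 2, column 3
second_col <- m[, 2]   # 3 4: an empty index keeps every row
tm <- t(m)             # transpose, now 3 x 2
```

Like a vector, a matrix holds elements of a single type; only the two-dimensional layout is new.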
Assignment #1
Insert your name here
Due January 25
This R-Notebook has both the answers, along with some explanation and grading notes.
Question 1
Use R to compute the
1. cube root of 30.
30^(1/3)
[1] 3.107
Assignment #4
Prof Stine
Due February 22
This R-Notebook has both the questions as well as places for you to insert your answers. Turn in the .Rmd
file with your name inserted in the header above. Type the
---
title: "Lecture 10: R Graphics"
author: Robert Stine, Stat 405/705
date: Spring 2017
output: html_notebook
---
We have used quite a few of the features of R graphics. This lecture explores
more of the structure of R graphics, such as the ability to draw you
Statistics 405/705
Spring, 2017
Introduction to Programming in R
Instructor and TA
Robert Stine
444 Huntsman Hall
[email protected]
Ruoqi Yu
[email protected]
There are two sections of this course; these meet at 12-1:30 pm and 1:30-3:00 pm on Mondays
and Wednes
#=
# CHARACTER DATA / STRINGS / TEXT DATA
#
#   3 data types: number ; character ; logic
#
# - We use synonymously:
#     'text data' = 'string data' = 'character data'
#
#   For now we only consider vectors of text/string/character data.
#
#   Character data has many
#=
# CONCEPT: NUMERIC VECTORS AND FUNCTIONS TO CREATE THEM
# - This is our first 'composite data structure'.
#   We have seen simple examples: ladders with spacings of +/-1
#   (It will be followed by matrices, arrays, lists and data frames.)
#
# - A numeric
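A few of the standard functions for creating numeric vectors, sketched on arbitrary values: `:` gives the +/-1 ladders, `seq()` gives other spacings, and `rep()` repeats values.

```r
ladder_up   <- 1:5                        # 1 2 3 4 5
ladder_down <- 5:1                        # 5 4 3 2 1
evens       <- seq(2, 10, by = 2)         # 2 4 6 8 10
unit_grid   <- seq(0, 1, length.out = 5)  # 0.00 0.25 0.50 0.75 1.00
repeats     <- rep(c(1, 2), times = 3)    # 1 2 1 2 1 2
```

Use `by` when the spacing matters and `length.out` when the number of points matters.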
#=
# CHARACTER DATA / STRINGS / TEXT DATA
#
# - We use synonymously:
#     'text data' = 'string data' = 'character data'
#
#   Character data has many uses:
#   . It can label groups of data.
#     Examples: gender groups (female, male)
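A minimal sketch of character data used as group labels: a character vector that runs parallel to a numeric vector can count, select, and summarize the groups. Both vectors are hypothetical and invented for this example.

```r
# Hypothetical data: one label and one measurement per person
gender <- c("female", "male", "female", "male", "female")
score  <- c(82, 75, 90, 68, 88)

group_counts  <- table(gender)                 # how many in each group
female_scores <- score[gender == "female"]     # 82 90 88
group_means   <- tapply(score, gender, mean)   # one mean per label
```

The comparison `gender == "female"` yields a logical vector, which is then used to subset `score`.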
Class: Naive Bayes for Linguistics
Simple regression (R-squared of 0.54), then a 5-variable multiple regression
(R-squared of 0.73). Using 80 variables we have a simple regression
(R-squared = 1.0) and a naive Bayes. Last is a naive Bayes with 100s of variab
Administrivia
Large clouds of data
Projecting all words down to a low-dimensional space
Allows looking at them in a nice graph
Makes pretty pictures
How to project?
PCA
Look at words documents/articles
Each word is a variable
Compute the PCA
Gener
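A rough sketch of the projection described in these notes, using base R's `prcomp()` with each word as a variable. The document-by-word count matrix is invented for illustration; it is not the data used in class.

```r
# Hypothetical counts: rows are documents, columns are words
counts <- matrix(c(4, 0, 1, 3,
                   5, 1, 0, 2,
                   0, 6, 4, 0,
                   1, 5, 5, 1,
                   3, 1, 1, 4),
                 ncol = 4, byrow = TRUE,
                 dimnames = list(paste0("doc", 1:5),
                                 c("stock", "goal", "team", "bond")))

pca <- prcomp(counts, center = TRUE, scale. = FALSE)

word_coords <- pca$rotation[, 1:2]   # each word's loadings on the first two components
doc_coords  <- pca$x[, 1:2]          # each document projected into two dimensions
```

Calling `plot(word_coords)` or `biplot(pca)` then draws the two-dimensional picture. Whether to rescale the counts first (`scale. = TRUE`) is a modelling choice that this sketch leaves open.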
Class: Rare counts
April 2, 2012
(pdf version)
1
Administrivia
Lit review due Wednesday.
Lyle Ungar (Computer Science), Dean Foster (Wharton Statistics), and Mark Liberman (Linguistics) are looking for a student
to work this summer on an exploratory resea
Administrivia
HW 2 due next Tuesday
I put up notes (in Rnw) which have examples of all the bootstraps
I talked about in class last time.
Nice article by Jim Manzi on experiments.
Science is observation
Piaget did wonders for child psychology from just o
Class: Doglegs / piecewise linear / Bent stick
February 7, 2012
(online version)
Story time: Publishing books
Information wants to be free
I could tell you who said that, but wiki is down today
Academics write papers for free
Most musicians (as in numbe
Class: CCA
April 4, 2012
Administrivia
Last time: Prize for compression
Lit review due today
One shot learning
Descriptors
POS: Noun / verb
Tone: formal / informal
gender: male / female
sentiment: better / worse
etc
1
Fill in the blank
New word come