STA - 2023/0002
Statistical Methods I
Spring 2013
* http://pegasus.cc.ucf.edu/nuddin/Stat Methods I/Websta2023 Spring2013.htm *
Class Time, Days and Place: 12:00PM - 1:15PM, Tuesday and Thursday, PSY 108
Instructor: Dr. Nizam Uddin
Office: Technology Comm
Correlation and Regression Example and Homework
Example: Test scores and hours watching TV. The data are the number of hours 12 students watched TV
during the weekend and the score of each student on a test taken the following Monday.
Hours spent watching TV, x
0
1
Tes
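The data table above is cut off, so the calculation this example leads to can only be sketched with stand-in numbers. The block below computes the Pearson correlation coefficient and the least-squares regression line for 12 (hours, score) pairs. The values in `hours` and `scores` are hypothetical and were chosen only for illustration; the function names `pearson_r` and `least_squares_line` are likewise assumptions, not part of the course material.

```python
from math import sqrt

# HYPOTHETICAL data for 12 students: the source table is truncated,
# so these values are illustrative stand-ins, not the course data.
hours = [0, 1, 2, 3, 3, 5, 5, 5, 6, 7, 7, 10]
scores = [96, 85, 82, 74, 95, 68, 76, 84, 58, 65, 75, 50]


def sums_of_squares(x, y):
    """Return (Sxx, Syy, Sxy, mean_x, mean_y) for paired data."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxx, syy, sxy, mean_x, mean_y


def pearson_r(x, y):
    """Pearson correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    sxx, syy, sxy, _, _ = sums_of_squares(x, y)
    return sxy / sqrt(sxx * syy)


def least_squares_line(x, y):
    """Slope and intercept of the least-squares line y = intercept + slope * x."""
    sxx, _, sxy, mean_x, mean_y = sums_of_squares(x, y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


r = pearson_r(hours, scores)
slope, intercept = least_squares_line(hours, scores)
print(f"r = {r:.3f}")
print(f"predicted score = {intercept:.2f} + ({slope:.2f}) * hours")
```

With these stand-in values the correlation is negative: more hours of TV go with lower scores. The fitted line always passes through the point of means, which makes a quick check on hand calculations.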
Maxwell Redan
STA 2023 CHAPTER 2
Comparing the Mean and Median
Symmetric Distributions (mean = median)
Bell-Shaped Curve
U-Shaped Curve
Uniform
Skewed Distributions: one of the tails is longer
Right skewed: Median < Mean
Left skewed: Median > Mean
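The relationships above can be checked on small samples. The sketch below uses three hypothetical data sets (`symmetric`, `right_skewed`, `left_skewed`), made up for illustration, and Python's standard `statistics` module. The rule describes the typical pattern; unusual distributions can break it.

```python
from statistics import mean, median

# HYPOTHETICAL samples, chosen only to illustrate each shape.
symmetric = [1, 2, 3, 4, 5, 6, 7]            # mirror image around 4
right_skewed = [1, 2, 2, 3, 3, 4, 20]        # one large value stretches the right tail
left_skewed = [1, 17, 18, 18, 19, 19, 20]    # one small value stretches the left tail

for name, data in [
    ("symmetric", symmetric),
    ("right skewed", right_skewed),
    ("left skewed", left_skewed),
]:
    # The mean is pulled toward the long tail; the median is not.
    print(f"{name:13s} mean = {mean(data):6.2f}   median = {median(data):6.2f}")
```

In each skewed sample a single extreme value moves the mean toward the long tail while the median stays near the bulk of the data.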
PCA
PCA can be motivated and understood in several different (yet related) ways: 1) The first is
to (linearly) transform correlated variables into a set of uncorrelated ones. Having uncorrelated
predictors is useful in dealing with multicollinearity in
STA 6704: DATA MINING II
Assignment 2
Submitted by: SALAH UDDIN MOMTAZ
Date: March 18, 2017
Problem 1: Missing rate (in percentage) for variables
Using the attached R code, I computed the missing rate for each variable. The rates are as
follows:
Variable
Solutions to STA 6704 HW1, Spring 2017
We will use the apriori algorithm to create association rules for the Adult data set available
on the UCI Machine Learning Repository (http://archive.ics.uci.edu/ml/datasets/Adult). The
data is adult.data and the attr
Principal Component Analysis (PCA) is a method of dimension reduction. It is
not directly related to the prediction problem, but several regression methods depend
directly on it. We first set up a motivation for dimension reduction.
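As a concrete picture of dimension reduction, the sketch below runs PCA on two correlated variables using only the Python standard library. It relies on the closed-form eigendecomposition of a 2x2 covariance matrix, so it is an illustration for the two-variable case and not a general implementation. The data in `x` and `y` are hypothetical, and the helper names `cov` and `pca_2d` are assumptions introduced here.

```python
from math import sqrt


def cov(u, v):
    """Sample covariance (divisor n - 1) of two equal-length lists."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((p - mean_u) * (q - mean_v) for p, q in zip(u, v)) / (n - 1)


def pca_2d(x, y):
    """PCA for two variables via the eigenvalues of [[a, b], [b, c]]."""
    a, b, c = cov(x, x), cov(x, y), cov(y, y)
    centre = (a + c) / 2
    half_gap = sqrt(((a - c) / 2) ** 2 + b ** 2)
    lam1, lam2 = centre + half_gap, centre - half_gap  # lam1 >= lam2

    # Eigenvector for lam1 solves (a - lam1) * v_x + b * v_y = 0.
    if b != 0:
        v1 = (b, lam1 - a)
    else:  # already uncorrelated: the axes are the components
        v1 = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = sqrt(v1[0] ** 2 + v1[1] ** 2)
    v1 = (v1[0] / norm, v1[1] / norm)
    v2 = (-v1[1], v1[0])  # orthogonal to v1

    # Scores: centred data projected onto each component.
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    pc1 = [(p - mean_x) * v1[0] + (q - mean_y) * v1[1] for p, q in zip(x, y)]
    pc2 = [(p - mean_x) * v2[0] + (q - mean_y) * v2[1] for p, q in zip(x, y)]
    return (lam1, lam2), (v1, v2), (pc1, pc2)


# HYPOTHETICAL, strongly correlated data used only for illustration.
x = [2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1]
y = [2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9]

(lam1, lam2), (v1, v2), (pc1, pc2) = pca_2d(x, y)
print("share of total variance on PC1:", lam1 / (lam1 + lam2))
print("covariance between the two score columns:", cov(pc1, pc2))
```

Keeping only `pc1` replaces two correlated variables with one derived variable that carries most of the variance. The two score columns are uncorrelated by construction, up to floating-point error.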
Notation
The
STAT 5474
Intro to Data Mining and Statistical Learning
Topic 2 (a)
Introduction to R
Xiaogang Su, Ph.D.
Department of Mathematical Sciences
University of Texas at El Paso (UTEP)
Downloading and Installing R from CRAN
R, a free version of S o
2016 Analytics Shootout
Sponsored by SAS Institute Inc.
OFFICIAL RULES
NO PURCHASE NECESSARY
HOW TO ENTER: To submit your entry, go to the contest web site located at:
http://www.sas.com/en_us/events/analytics-conference/analytics-experience-2016/analytics
STAT 5370
Data Mining and Statistical Learning
Topic 2 (b)
Handling Large Data in R
Xiaogang Su, Ph.D.
Department of Mathematical Sciences
University of Texas at El Paso (UTEP)
INTRODUCTION
Data mining and statistical learning essentially pro
# #
# R-code for principal component analysis (PCA)
# Some code was modified from Dr. Marloes Maathuis's class notes:
# http://stat.ethz.ch/~maathuis/
# and the text An Introduction to Statistical Learning
# http://www-bcf.usc.edu/~gareth/ISL/Chapter%2010
The SAS Analytics Shootout
Annual Student Competition
Sponsored by SAS and
The Institute for Health & Business Insight at
Central Michigan University
chp.cmich.edu/ihbi
Joseph Pomerville, Research Analyst / Advanced Analytics
Overview
What is th
STA6704 HW1 Spring 2017 (Due on 2/6/2017)
Your assignment should be typed (for example, in Microsoft Word, LyX, LaTeX, etc.). Submit a single PDF
document together with the code (if applicable) as one zipped file. You should name your zipped file in the
fo
STA 6704
Data Mining Methodology II
Daoji Li
Spring 2017
Announcements
1) SAS Global Forum
http://www.sas.com/en_us/events/sas-global-forum/sas-global-forum-2017/program/scholarships-academicprograms.html#scholarships
Deadline: 01/13/2017
STA 6704 - 0001 Data Mining Methodology II
Spring 2017
(updated on 01/09/2017)
This syllabus is tentative and subject to change based on needs of students, University,
and instructor. Any change will be announced in class and posted on the course web site
If you are not familiar with R, please watch these videos on your own.
1) Introduction to R
https://www.youtube.com/watch?v=BlI3OVztQfM
2) Writing R Functions
https://www.youtube.com/watch?v=w6nYISxAJmA
3) Linear Regression in R
https://www.youtube.com/watch
Clustering of variables around latent components
Ricco RAKOTOMALALA
Tutoriels Tanagra - http://tutoriels-data-mining.blogspot.fr/
Overview
1. Clustering variables
2. Correlations, distances and latent variables
3. HAC based on latent v