Data Visualization
Section 1 Introduction
Section 2 Numerical Measurements for One Variable
Numerical Measures for Location Parameter
Numerical Measures for Scale Parameter
Section 3 Graphical Methods for One Variable
Histogram
Box Plot
Density Plot
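The location and scale measures named in Section 2 of this outline can be illustrated with a short Python sketch. The outline does not say which measures the course uses, so the choice here (mean and median for location; standard deviation, interquartile range, and range for scale) is an assumption, and the function names are hypothetical.

```python
import statistics


def location_measures(xs):
    """Two common location measures for one numeric variable."""
    return {
        "mean": statistics.mean(xs),
        "median": statistics.median(xs),
    }


def scale_measures(xs):
    """Three common scale measures for one numeric variable."""
    # quantiles(n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = statistics.quantiles(xs, n=4)
    return {
        "std": statistics.stdev(xs),   # sample standard deviation
        "iqr": q3 - q1,                # interquartile range
        "range": max(xs) - min(xs),
    }


if __name__ == "__main__":
    data = [1, 2, 3, 4, 100]  # made-up values with one extreme point
    print(location_measures(data))
    print(scale_measures(data))
```

With a single extreme value in the data, the mean and range move substantially while the median stays near the bulk of the observations, which is one reason both kinds of measure are usually reported together.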
Using SAS Enterprise Miner 6.2 - SEMMA
1. Opening Enterprise Miner
1.1 Initiating Project
1.2 Starting Project and Diagram
1.3 Initiating Data Exploration
1.4 Setting Prior Probability and Profit Matrix
2. SEMMA Data Mining Process Used in Enterprise Mine
Introduction to SAS/Graph
Philip Mason, Wood Street Consulting, Wallingford, Oxfordshire, England
ABSTRACT INTRODUCTION
SAS/GRAPH software offers device-intelligent color graphics for producing charts, maps and plots in a variety of patterns. Users can cu
Lab for Statistical Decision Theory
Data Explanation: Same as the data used in Practicum 3
Problem 1
For the target TAR1,
(a) Repeat Practicum 2 with exactly the same options used in the practicum.
(b) Write down the profit equation for each decision wher
STA 6714 Final Project
Report Due Date: April 28, 2006
Final Presentation: April 26, 2006
Most data-mining projects start with data preparation.
However, the
effectiveness of the data preparation process cannot be confirmed without adequate model
buil
Math Will Rock Your World
A generation ago, quants turned finance upside down. Now they're
mapping out ad campaigns and building new businesses from mountains
of personal data
Neal Goldman is a math entrepreneur. He works on Wall Street, where
numbers rule. B
Paper 313-2008
Introduction to the Graph Template Language
Sanjay Matange, SAS Institute, Cary, NC
ABSTRACT
In SAS 9.2, the SAS/GRAPH Graph Template Language (GTL) goes production. This system is used by many SAS analytical procedures to create the automa
SUGI 31
Tutorials
Paper 262-31
A Programmer's Introduction to the Graphics Template Language
Jeff Cartier, SAS Institute Inc., Cary, NC
ABSTRACT
In SAS 9.2, the ODS Graphics Template Language (GTL) becomes production software. This powerful language is us
PharmaSUG2010 - Paper TU-SAS01
The Graph Template Language and the Statistical Graphics Procedures: An Example-Driven Introduction
Warren F. Kuhfeld, SAS Institute Inc., Cary, NC
March 8, 2010
ABSTRACT
This paper provides a gentle, parallel, and example-d
Case Study II Comparing Recoding Efficient
Purpose: To show that model performance can be improved by transforming a
nominal-scale variable with many levels.
Data: SHOES_MVP
Complete Diagram:
Step 1: Add SAS Code in Appendix 1 in the Project
Visualizing Categorical Data with SAS and R
Michael Friendly
Part 4: Model-based methods for categorical data
logit(Admit) = Dept DeptA*Gender
[Figure: Arthritis treatment data, linear and logit regressions on Age; vertical axis: Probability (Improved)]
In-Class Exercise: Getting Familiar with SAS Enterprise Miner
(adapted from Applied Analytics using SAS Enterprise Miner, SAS Institute, Cary, NC. 2010)
Creating a SAS Enterprise Miner Project
A SAS Enterprise Miner project contains materials related to
How to Use Census Data to Enhance Marketing Analytics
What Is Census
The procedure of systematically acquiring and recording information about the members of a given population. The term is used mostly in connection with national population and housing censuses
Census Data and Marketing An
Case Study I Categorical Recoding in Enterprise Miner
Purpose: No single node in Enterprise Miner 7.1 can perform nominal variable recoding. However, SAS programmers have a utility program that can be used to perform this task. This
Using SAS Enterprise Miner 6.1 - SEMMA
1. Opening Enterprise Miner
1.1 Initiating Project
1.2 Starting Project and Diagram
1.3 Initiating Data Exploration
1.4 Setting Prior Probability and Profit Matrix
2. SEMMA Data Mining Process Used in Enterprise Mine
Assignment #3 Model Selection and Assessment
Due Date: February 28, 2011
Problem 1 (Target Marketing) Suppose that the revenue generated if the target responds is r
and the cost of sending out a letter is c.
(a) Set up the profit matrix. (1 Point)
(b) Wha
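The setup in Problem 1 can be sketched in Python as follows. This is a minimal illustration, not the assignment's official solution: it assumes that not mailing yields zero profit whether or not the target would have responded, and the response probability `p` and all function names are hypothetical additions that do not appear in the problem statement.

```python
def profit_matrix(r, c):
    """Profit for each (decision, outcome) pair.

    r: revenue if the target responds; c: cost of sending one letter.
    Assumption: no letter means no cost and no revenue.
    """
    return {
        ("mail", "respond"): r - c,
        ("mail", "no_response"): -c,
        ("no_mail", "respond"): 0.0,
        ("no_mail", "no_response"): 0.0,
    }


def expected_profit(decision, p, r, c):
    """Expected profit of a decision when the response probability is p."""
    m = profit_matrix(r, c)
    return p * m[(decision, "respond")] + (1 - p) * m[(decision, "no_response")]


def best_decision(p, r, c):
    """Pick the decision with the larger expected profit (ties go to 'mail')."""
    return max(("mail", "no_mail"), key=lambda d: expected_profit(d, p, r, c))


if __name__ == "__main__":
    # Made-up numbers: revenue 10 per response, cost 0.5 per letter.
    for p in (0.01, 0.05, 0.10):
        print(p, best_decision(p, r=10.0, c=0.5))
```

Under these assumptions the expected profit of mailing simplifies to p*r - c, so mailing is preferred whenever p exceeds c/r.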
STA 6714 Assignment # 4 Categorical Variable Recoding
Due Date: March 14, 2011
Data: The data (pmad_pva) is from the 1998 KDD Cup competition. It is the same data
known as the donors data. This data set includes 26 predictor variables and two target variabl
Auxiliary Uses of Decision Trees
Morgan C. Wang
Department of Statistics
Orlando, Florida 32816-2370
Morgan C. Wang
3/29/2010
Outlines
Introduction
Data Exploration
Data Preparation
Conclusions
Introduction
Model Selection, Assessment and Decision Making
Section 1 Introduction
Section 2 Loss Functions
Section 2.1 Loss Functions for Quantitative Responses
Section 2.2 Loss Functions for Qualitative Responses
Section 3 Selection and Assessment
Section 3.1 Data
Categorical Variable Recoding
Morgan C. Wang
Department of Statistics
University of Central Florida
Morgan C. Wang
2/28/2011
Outline
Introduction
Representations with Dummy Variables
Unary Variable with Missing Values
Binary Variable with Missing Valu
Missing and Empty Values
Morgan C. Wang
Department of Statistics
University of Central Florida
Morgan C. Wang
3/8/2010
Outline
Introduction
Criterion for Replacing Missing Values
Unconditional Imputation Methods
Conditional Imputation Methods
Conclusion
An Example of Building a Prediction Model Using the Logistic Regression Node
SECTION 1 INTRODUCTION
SECTION 2 ILLUSTRATIVE EXAMPLE
DATA
PURPOSE OF THE STUDY
COMPLETE DIAGRAM
STEP 1: DATA SOURCE NODE
STEP 2: FIXING DATA PROBLEMS WITH
Outliers
Morgan C. Wang
Department of Statistics
University of Central Florida
Morgan C. Wang
2/9/2011
Outline
Introduction
Data Anomaly
Univariate Outliers Detection
Multivariate Outliers Detection
Case Study
Conclusions
Intro
Programming and Computer Software, Vol. 29, No. 4, 2003, pp. 228-237. Translated from Programmirovanie, Vol. 29, No. 4, 2003.
Original Russian Text Copyright 2003 by Petrovskiy.
Outlier Detection Algorithms in Data Mining Systems
M. I. Petrovskiy
Departmen
STA 6714 Data Preparation (Spring 2011)
Class days & times: M & W 4:30 to 5:45 PM (CL1 220)
Office hours: M & W 10:00 to 12:00 PM and Tuesday 1:00 to 2:00
Office: Co
Phone:
e-mail:
Withdraw deadline:
Holidays:
Special note:
Required Text:
Reference Books:
Statistics 6714 Spring, 2011
Data Preparation
Instructor:
Dr. Morgan C. Wang
Office
Room 203, CC II
Phone:
(407) 823-2818
Email:
[email protected]
Office Hour M & W 10:00 to 12:00 PM and Tuesday 1:00 to 2:00
Website
http:/dms.stat.ucf.edu/STA6714/STA6714
STA 6714 Assignment # 1 Using Enterprise Miner Version 6.1
Due Date: January 26, 2011
Data: The data (HMEQ) is from a financial services company that extends a line of credit to
homeowners. This data set includes 12 predictor variables and one target vari