Documents about Root Mean Square Error

lect11

UPenn, STAT 112
Excerpt: ... Class 11: Thurs., Oct. 14 Finish transformations Example Regression Analysis Next Tuesday: Review for Midterm (I will take questions and go over practice midterm if there are no questions) Next Thursday: Midterm HW5 due Tuesday. I will e-mail review notes and a practice midterm to you tomorrow. Transformations in JMP 1. Use Tukey's Bulging rule (see handout) to determine transformations which might help. 2. After Fit Y by X, click red triangle next to Bivariate Fit and click Fit Special. Experiment with transformations suggested by Tukey's Bulging rule. 3. Make residual plots of the residuals for transformed model vs. the original X by clicking red triangle next to Transformed Fit to . and clicking plot residuals. Choose transformations which make the residual plot have no pattern in the mean of the residuals vs. X. 4. Compare different transformations by looking for transformation with smallest root mean square error on original y-scale. If using a transformation that involves transforming y, ...

notes12

UPenn, STAT 112
Excerpt: ... 2 year increase in mean life expectancy. Log Transformation of Both X and Y Variables: It is sometimes useful to transform both the X and Y variables. A particularly common transformation is to transform X to log(X) and Y to log(Y): \(E(\log Y \mid X) = \beta_0 + \beta_1 \log X\), \(E(Y \mid X) = \exp(\beta_0 + \beta_1 \log X)\). Heart Disease-Wine Consumption Data (heartwine.JMP). [Figure: Bivariate Fit of Heart Disease Mortality By Wine Consumption, with Linear Fit and Transformed Fit Log to Log.] [Figure: Residual Plot for Simple Linear Regression Model, Residual vs. Wine Consumption.] [Figure: Residual Plot for Log-Log Transformed Model, Residual vs. Wine Consumption.] Evaluating Transformed Y Variable Models: The residuals for a log-log transformation model on the original Y-scale are \(\hat{e}_i = Y_i - \hat{E}(Y \mid X_i) = Y_i - \exp(b_0 + b_1 \log X_i)\). The root mean square error and R2 on the original Y-scale are shown in JMP under Fit ...
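
The original-scale residual formula in this excerpt can be sketched in Python. This is a minimal illustration, not the JMP computation: the data are hypothetical (the heartwine.JMP values are not reproduced in the excerpt), the function name is made up for this sketch, and dividing by n - 2 is an assumption about the degrees-of-freedom convention.

```python
import numpy as np

def loglog_rmse_original_scale(x, y):
    """Fit log(y) = b0 + b1*log(x) by least squares and return (b0, b1, rmse),
    where rmse uses the residuals y_i - exp(b0 + b1*log(x_i)) on the original y-scale."""
    log_x, log_y = np.log(x), np.log(y)
    b1, b0 = np.polyfit(log_x, log_y, 1)  # polyfit returns the slope first
    y_hat = np.exp(b0 + b1 * log_x)       # back-transformed predictions
    resid = y - y_hat                     # residuals on the original y-scale
    n_params = 2                          # assumption: divide by n - 2
    rmse = np.sqrt(np.sum(resid ** 2) / (len(y) - n_params))
    return b0, b1, rmse

# Hypothetical data that follow an exact power law, y = 10 * x^(-0.5)
x = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
y = 10.0 * x ** -0.5
b0, b1, rmse = loglog_rmse_original_scale(x, y)
print(b0, b1, rmse)
```

Because the residuals are taken after back-transforming, the resulting RMSE is in the units of Y and can be compared with the RMSE of an untransformed linear fit.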

ME345_Lecture_03

Penn State, ME 345
Excerpt: ... M E 345 Spring 2008 Professor John M. Cimbala Lecture 03 01/18/2008 Today, we will: Review the pdf module: Errors and Calibration. Do some example problems on errors and calibration. Discuss an additional item not in the pdf notes: Mean Bias Error. Review the pdf module: Basic Statistics. Do some example problems on basic statistics. Discuss an additional item not in the pdf notes: Root Mean Square Error. Do Candy Questions for Candy Friday. Example Given: The actual (true) temperature is 22.100 °C. Six thermometer readings are taken: 22.15, 22.22, 22.09, 22.21, 22.18, and 22.24 °C. To do: Calculate the systematic error for this set of measurements, and calculate the accuracy error (inaccuracy) and the precision error for each measurement. Solution: a table with columns \(T_i\) (°C): 22.15, 22.22, 22.09, 22.21, 22.18, 22.24; Inaccuracy = \(T_i - T_{\text{true}}\) (°C); Precision error = \(T_i - \bar{T}\) (°C). [See also Excel spreadsheet on the website for this same problem] Example Given: Ten houses are sold in the State College ar ...
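
The thermometer example can be worked through in a few lines of Python. This sketch assumes the usual definitions: systematic error is the mean reading minus the true value, inaccuracy is each reading minus the true value, and precision error is each reading minus the mean reading. The variable names are illustrative only.

```python
# Readings and true value as given in the excerpt
readings = [22.15, 22.22, 22.09, 22.21, 22.18, 22.24]  # degrees C
t_true = 22.100                                         # degrees C

mean_reading = sum(readings) / len(readings)

# Assumed definitions (see the lead-in above)
systematic_error = mean_reading - t_true
inaccuracy = [t - t_true for t in readings]
precision_error = [t - mean_reading for t in readings]

print(f"mean reading     = {mean_reading:.4f}")
print(f"systematic error = {systematic_error:.4f}")
for t, acc, prec in zip(readings, inaccuracy, precision_error):
    print(f"T = {t:.2f}  inaccuracy = {acc:+.4f}  precision error = {prec:+.4f}")
```

Under these definitions the precision errors sum to zero, and each inaccuracy is the systematic error plus that reading's precision error.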

lectslides14

UPenn, STAT 112
Excerpt: ... Inference for Simple Linear Regression Lecture Notes XIV Statistic 112, Fall 2002 Outline Transforming Relationships (Chapter 2.6). Inference for Simple Linear Regression (Chapter 10.1). Reading for next class: Finish Chapter 10.1. Transformi ...

homework6

Delaware, CIEG 675
Excerpt: ... CIEG 675 Homework #6 Due Wednesday April 15, 2009 In a m-file do the following and verify it works by copy and pasting into the command window or running your m-file. 1) Load the data file called regr.mat from my web page. It contains two variables u and c (synthetic) that represent velocity and sediment concentration data respectively. There is some work that suggests the concentration should be related to the velocity squared. Do an analysis and tell me what you think. Also, determine the root-mean-square error between the data and the theorized linear fit. 2) The data contained in the file fit.mat (variables x and y) on my web page is thought to be well-suited to a cubic polynomial (x,y). Determine how well the data follows expectations using root mean square error and a plot. 3) Load the data file called surface.mat from my web page. It contains variables x (the cross-shore coordinate, y, the alongshore coordinate and z, the elevation) from a beach survey. Perform an interpolation to a uniform grid that ...

lecture-13

Furman, MATH 47
Excerpt: ... Standard Error Mathematics 47: Lecture 13 Dan Sloughter Furman University March 23, 2006 Root mean squared error Definition Suppose T is an estimator for a parameter \(\theta\). We call r ...

e6finalexamstudyguideW09

UC Davis, ENG ENG6
Excerpt: ... xam) Section 9. Vectors, Matrices and Linear Algebra Vectors Addition, dot product, Euclidean norm, unit vector, angles and orthogonality, projection Matrices Transpose, identity matrix Same-size matrix addition, scalar multiplication, matrix multiplication Matrix inverse, determinant Singular matrix, non-singular matrix Solutions to systems of linear equations Solution by matrix inverse Solution by matrix division Ill-conditioned set of equations Section 10. Curve Fitting and Interpolation Polynomial regression The polyfit command Mean square error, root mean square error Interpolation Linear, cubic spline, nearest neighbor Section 11. Integration and Differentiation Numerical integration Trapezoidal method, the trapz command Cumulative integration and the cumtrapz command The quad, quad8, and quadl functions (Simpson's Rule, Newton-Cotes 8 panel integration, and the Lobatto algorithm, respectively) Two-dimensional integration Numerical differentiation Backward, forward, and central differences and their ...

StudyGuide2

University of Illinois, Urbana Champaign, STAT 100
Excerpt: ... n estimates given the 5 summary statistics. (Convert the independent variable to standard units, multiply by r and then convert back to units of the dependent variable.) Compute regression estimates for percentiles. Understand the regression effect and the regression fallacy. Chapter 11: Residuals and RMSE. Residuals are prediction errors. (Residual = Actual − Predicted) Average of the residuals is always 0. The SD of the residuals is called the root mean square error (RMSE). \(\text{RMSE} = \sqrt{1 - r^2} \times SD_y\) (when you are predicting y from x). This tells you how accurate your predictions are. Note that when r = 1 (when there is a perfect correlation) then RMSE = 0, and when r = 0 then the RMSE = \(SD_y\). Rule of thumb: If the scatter plot is football-shaped, then about 2/3 of the predictions will be right to within one RMSE and about 95% will be right to within 2 RMSE. In a football-shaped scatter plot, the y-values associated with a fixed x-value are approximately normally distributed, with an average ...
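
The shortcut formula for RMSE in this excerpt can be checked numerically. This is a small sketch on made-up data; it assumes SDs are computed with divisor n (the population SD), which is the convention under which the identity holds exactly.

```python
import numpy as np

# Hypothetical data, for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.7])

r = np.corrcoef(x, y)[0, 1]
sd_x, sd_y = np.std(x), np.std(y)   # np.std defaults to divisor n

# Regression line for predicting y from x
slope = r * sd_y / sd_x
intercept = y.mean() - slope * x.mean()
residuals = y - (intercept + slope * x)

rmse_direct = np.sqrt(np.mean(residuals ** 2))   # SD of the residuals
rmse_shortcut = np.sqrt(1 - r ** 2) * sd_y       # sqrt(1 - r^2) * SD_y

print(rmse_direct, rmse_shortcut)
```

The two values agree up to floating-point rounding, and the residuals average to zero as the study guide states.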

Lecture_24

UPenn, ESE 24
Excerpt: ... Do a multiple regression to estimate (1). If P-value in Durbin-Watson test is small (say < .05), continue. STEP 2. Set equal to the Autocorrelation value in JMPIN STEP 3. Estimate the transformed variables: (3) (4) zt = yt yt1 , wit = xit xi ,t1 , t = 2,., T t = 2,., T , i = 1,., k STEP 4. Do multiple regression to estimate new linear model: (5) zt = 0 + i=1 i wit + ut , k t = 1,., T STEP 5. Use (1 ,., k ) from Step 4 to estimate (1 ,., k ) in (1), and use 0 from Step 4 plus in Step 2 to estimate 0 in (1) by (6) 0 = 0 /(1 ) COOKS D-MEASURE FOR OUTLIERS Given a regression of Y on ( x1 ,., xk ) using data set ( y j , x1 j ,., xkj ), j = 1,., n , if s is the root mean square error , and if Y j = regression prediction of E (Y j | x1 j ,., xkj ) Y j (i ) = regression prediction of E (Y j | x1 j ,., xkj ) with the i th data point ( yi , x1i ,., xki ) removed. then Cooks Distance Mea ...

lecture-19

UPenn, STAT 431
Excerpt: ... ary of Fit: RSquare 0.427626; RSquare Adj 0.420812; Root Mean Square Error 0.579659; Mean of Response 1.191754; Observations (or Sum Wgts) 171. Response: Mercury Concentration. Whole Model. [Figure: Actual by Predicted Plot.] Analysis of Variance (Source: DF, Sum of Squares, Mean Square, F Ratio): Model: 2, 42.173416, 21.0867, 62.7571 (Prob > F <.0001); Error: 168, 56.448858, 0.3360; C. Total: 170, 98.622274. Parameter Estimates (Term: Estimate, Std Error, t Ratio, Prob>|t|): Intercept: -1.496193, 0.365801, -4.09, <.0001; Length: 0.071353, 0.011979, 5.96, <.0001; Weight: -0.000143, 0.000117, -1.23, 0.2217. [Figure: Residual by Predicted Plot, Mercury Concentration Residual vs. Mercury Concentration Predicted.] Interpreting the coefficients: \(\mu(x_1, x_2) = E[Y \mid \text{Length} = x_1, \text{Weight} = x_2] = \beta_0 + \beta_1 x_1 + \beta_2 x_2\). Interpretation of \(\beta_0\): mean mercury concentration for a zero-dimensional fish (length = 0 cm, weight = 0 g). Interpretation of \(\beta_1\): if observed length increases by 1 cm, and other variables remain fixed, then av ...
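
The Summary of Fit entries in this excerpt follow from the Analysis of Variance table. The sketch below recomputes them from the sums of squares and degrees of freedom quoted in the excerpt, using the standard textbook relationships. It is not a description of JMP internals.

```python
import math

# Values quoted in the excerpt's Analysis of Variance table
ss_model, df_model = 42.173416, 2
ss_error, df_error = 56.448858, 168

ss_total = ss_model + ss_error
n = df_model + df_error + 1          # number of observations

mse = ss_error / df_error            # mean square error
rmse = math.sqrt(mse)                # Root Mean Square Error
r2 = ss_model / ss_total             # RSquare
r2_adj = 1 - (1 - r2) * (n - 1) / df_error   # RSquare Adj
f_ratio = (ss_model / df_model) / mse

print(f"n = {n}, RMSE = {rmse:.6f}, R2 = {r2:.6f}, "
      f"R2 adj = {r2_adj:.6f}, F = {f_ratio:.4f}")
```

The results match the reported 0.579659, 0.427626, 0.420812 and 62.7571 up to rounding.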

homework6

Delaware, CIEG 675
Excerpt: ... CIEG 675 Homework #6 Due Tuesday October 30, 2007 In a m-file do the following and verify it works by copy and pasting into the command window or running your m-file. 1) Load the data file called regr.mat from my web page. It contains two variables u and c (synthetic) that represent velocity and sediment concentration data respectively. There is some work that suggests the concentration should be related to the velocity squared. Do an analysis and tell me what you think. Also, determine the root-mean-square error between the data and the theorized linear fit. 2) The data contained in the file fit.mat (variables x and y) on my web page is thought to be well-suited to a cubic polynomial (x,y). Determine how well the data follows expectations using root mean square error and a plot. 3) Load the data file called surface.mat from my web page. It contains variables x (the cross-shore coordinate, y, the alongshore coordinate and z, the elevation) from a beach survey. Perform an interpolation to a uniform grid that ...

Fall2006Exam2A

Iowa State, STAT 104
Excerpt: ... = 505 grams (National Vital Statistics Report, Vol. 48, No. 3). a. What is the probability that a new full-term baby weighs more than 3000 grams? (6 points) b. A mother is told that the weight of her newborn full-term baby is at the 70th percentile. How much does her newborn baby weigh? (6 points) c. In a random sample of 20 full-term babies, what is the probability that the mean birth weight is between 3400 and 3600 grams? (6 points) Fit Y by X Group. [Figure: Bivariate Fit of Beef Consumption (CB) By Disposable Income (DINC), with Linear Fit.] [Figure: Bivariate Fit of Beef Consumption (CB) By Price of Beef, with Linear Fit.] Linear Fit: Consumption (CB) = 43.799135 + 0.1633466 Income. Linear Fit: Consumption (CB) = 80.238992 - 0.4655561 Price. Summary of Fit: RSquare 0.102552; RSquare Adj 0.042722; Root Mean Square Error 4.187743; Mean of Response 53.08824; Observations (or Sum Wgts) 17. S ...

CourseTerms

SUNY Albany, PAD 705
Excerpt: ... = error or residual variance x = standard deviation of a variable (here x) x = standard error of the coefficient on x s = sample standard deviation of a variable Note that s and s2 always refer to sample standard deviation and variance. For reference, the Root Mean Square Error can be calculated once a regression is run by using the estimated coefficients here called b. The formula is: \(\left[\sum_i (y_i - x_i b)^2 / (N - k)\right]^{1/2}\) where b = estimated coefficients, yi = dependent variable, xi = independent variables, N = number of observations, k = number of variables in the regression (including the constant). Probability Distributions The distribution of a variable is often denoted by a set of symbols that include the parameters that define the distribution. For instance, if is distributed normally with a mean of and 2 2 standard deviation of then we would write: ~ N(, ), or simply ~ N(, ). If the mean and standard deviation are known, then and would be replaced with the ac ...
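
The RMSE formula in this excerpt, with its N - k divisor, can be sketched in Python. The data and the function name `regression_rmse` are hypothetical. The design matrix includes a column of ones, so k counts the constant as the excerpt specifies.

```python
import numpy as np

def regression_rmse(X, y):
    """Return (b, rmse) for an OLS fit of y on the columns of X,
    with rmse = sqrt(sum((y_i - x_i b)^2) / (N - k))."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)  # estimated coefficients
    resid = y - X @ b
    n_obs, k = X.shape                         # k includes the constant column
    return b, np.sqrt(resid @ resid / (n_obs - k))

# Hypothetical data: one explanatory variable plus a constant
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
X = np.column_stack([np.ones_like(x), x])

b, rmse = regression_rmse(X, y)
print(b, rmse)
```

For these five points the fitted line is y = 1.4 + 0.8x, the residual sum of squares is 3.6, and N - k = 3, so the RMSE is the square root of 1.2.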

eml tekscan

UF, EML 5598
Excerpt: ... Digital Pressure Sensing Devices Tekscan Systems Presented by Stanley Kim Natural Joint Natural Joint 6MPa threshold for chondrocyte death under repeated mechanical loading Prosthetic joints UHMWPE wear particles common cause of aseptic loo ...

Lecture7

UPenn, STAT 102
Excerpt: ... tone patterns are often best treated by polynomial regression; to be discussed later. Our data is of the form suitable for trying a Group A. The transformation \(t(x) = \sqrt{x}\) works reasonably well here. Here's the plot showing the best curve of the form \(y = \beta_0 + \beta_1 \sqrt{x}\). "Best of this form" is in the sense that \(\beta_0, \beta_1\) minimize the SSE over all curves of this form. [Figure: Price ($1,000) vs. Age of House, with Transformed Fit to Sqrt.] Transformed Fit to Sqrt: Price ($1,000) = 385 - 37.69 Sqrt(Age of House). You can use the corresponding ANOVA tables to compare the value of this fit to that of a straight line. ANOVA Tables for fit to an ordinary line and to SqRt. Linear Fit: Price ($1,000) = 323 - 4.857 Age of House. Summary of Fit: RSquare 0.395; Root Mean Square Error 66.0; Observations 438. Analysis of Variance (Source: DF, Sum of Squares, Mean Square, F Ratio): Model: 1, 1240000, 1241676, 285.1378 (Prob > F <.0001); Error: 436, 1900000, 4355; C. Total: 437, 3140000. Transformed Fit to Sqrt: Price ($1,000) ...

ms4results

Penn State, MBC 5025
Excerpt: ... Dataset Longitude kiveo_dataset Latitude cane_coders_finaldataset Intensity SST 34-kt Wind Radius Max Wind Radius Classifier LinearRegression -S 0 -R 1.0E-8 Regression By Discretation (j48) Regression By Discretation (DecisionStump) Regressi ...

notes16

UPenn, STAT 112
Excerpt: ... Influential Points, High Leverage Points, Outliers in Multiple Regression. As in simple linear regression, we identify high leverage and high influence points by checking the leverages and Cook's distances (use Save Columns to save Cook's D Influence and Hats). High influence points: Cook's distance > 1. High leverage points: a point with Hat greater than 3*(# of explanatory variables + 1)/n is a point with high leverage. These are points for which the explanatory variables are an outlier in a multidimensional sense. Use the same guidelines for dealing with influential observations as in simple linear regression. Point that has unusual Y given its explanatory variables: point with a residual that is more than 3 RMSEs away from zero. Response MORT. Summary of Fit: RSquare 0.688278; RSquare Adj 0.659415; Root Mean Square Error 36.30065. Parameter Estimates (Term: Estimate): Intercept: 940.6541; PRECIP: 1.9467286; EDUC: -14.66406; NONWHITE: 3.028953; log NOX: 6.7159712; log S02: 11.35814. Std Error: 94.05424, 0.700696 ...

Lecture 17

UPenn, STAT 102
Excerpt: ... territory for 1999. Advert: The amount spent on advertising in each territory (in hundreds) in 1999. Bonus: The total amount paid in bonus in each territory (in hundreds) in 1999. Region: South, West and Midwest. These regions are divided into territories of approximately equal sales potential, and the data comes from the territories within each region. One-Way ANOVA 1. As a background analysis, let's look at whether Sales are related to Region. For now, ignore the Advert and Bonus variables. Here are the standard tables for the analysis of Sales on Region: Summary of Fit: Rsquare 0.669; Root Mean Square Error ?; Observations 26. Analysis of Variance (Source: DF, Sum of Squares, Mean Square, F Ratio): Region name: ?, 1045660, 522830, ? (Prob > F <.0001); Error: ?, 517333, 22493; C. Total: ?, 1562993. Means for Oneway Anova (Level: Number, Mean, Std Error, Lower 95%, Upper 95%): MIDWEST: 11, 1498.6, 45.2, 1405.1, 1592.2; SOUTH: 7, 1037.5, 56.7, ?, ?; WEST: 8, 1162.5, 53.0, 1052.8, 1272.2. Std Error uses a pooled esti ...
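
The quantities marked "?" in this excerpt's tables are related to the printed ones by standard one-way ANOVA identities. The sketch below shows one way they could be derived from the sums of squares and group sizes given in the excerpt. It assumes, as the excerpt's last sentence suggests, that each group's Std Error is the pooled RMSE divided by the square root of the group size.

```python
import math

# Values quoted in the excerpt
group_sizes = {"MIDWEST": 11, "SOUTH": 7, "WEST": 8}
ss_region, ss_error = 1045660, 517333

n = sum(group_sizes.values())   # 26 observations
k = len(group_sizes)            # 3 regions

df_region, df_error, df_total = k - 1, n - k, n - 1
ms_region = ss_region / df_region
ms_error = ss_error / df_error
rmse = math.sqrt(ms_error)      # pooled estimate of the error SD
f_ratio = ms_region / ms_error
r2 = ss_region / (ss_region + ss_error)

# Assumed: Std Error of each group mean = pooled RMSE / sqrt(group size)
std_errors = {level: rmse / math.sqrt(size) for level, size in group_sizes.items()}

print(df_region, df_error, df_total)
print(f"RMSE = {rmse:.2f}, F = {f_ratio:.2f}, R2 = {r2:.3f}")
print({level: round(se, 1) for level, se in std_errors.items()})
```

The recomputed Std Error column (45.2, 56.7, 53.0) and R2 (0.669) agree with the printed values, which supports reading the RMSE as the square root of the Error mean square.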

hw6_solns

Delaware, CIEG 675
Excerpt: ... ine yl=polyval(p,xl); plot(xl,yl,'k'); xlabel('velocity squared'); ylabel('sediment concentration'); [Figure: sediment concentration vs. velocity squared.] The rmse is 0.0328, which is small given the range of the data. R2 is 0.95, which is quite high for this data, meaning they are well correlated. 2) The data contained in the file fit.mat (variables x and y) on my web page is thought to be well-suited to a cubic polynomial (x,y). Determine how well the data follows expectations using root mean square error and a plot.
    % prob 2
    % make fake data
    x=-10:0.1:10;
    x=x+rand(size(x));
    y=x.^3+rand(size(x))*103 + x.^2 +x +1;
    save fit x y % this part done for you
    clear all
    load fit
    p=polyfit(x,y,3); % cubic fit
    yfit=polyval(p,x); % the polyfitted values of y from the cubic fit
    rmse=sqrt(mean((yfit-y).^2)); % root mean square error
    clf
    plot(x,y,'.'); hold on
    xl=-11:.2:11; % for polyfitted curve
    yl=polyval(p,xl); % to make the polyfitted curve
    plot(xl,yl,'k');
    xlabel('x'); ...
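
For readers without MATLAB, the cubic-fit RMSE calculation in this excerpt can be sketched with NumPy. The synthetic data here are an assumption (fit.mat is not available): the noise level and random seed are arbitrary choices, and the RMSE uses a plain mean of squared residuals, as the excerpt's code does.

```python
import numpy as np

# Synthetic stand-in for fit.mat (assumed: cubic trend plus Gaussian noise)
rng = np.random.default_rng(0)
x = np.linspace(-10.0, 10.0, 201)
y = x**3 + x**2 + x + 1 + rng.normal(scale=50.0, size=x.size)

p = np.polyfit(x, y, 3)        # cubic fit, highest power first
y_fit = np.polyval(p, x)       # fitted values at the data points
rmse = np.sqrt(np.mean((y_fit - y) ** 2))   # root mean square error

print("coefficients:", p)
print("rmse:", rmse)
```

An RMSE close to the noise level, together with a plot of the fitted curve over the data, is the kind of evidence the homework asks for.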

categorical

UPenn, INSR 260
Excerpt: ... [Figure: scatterplots against Households (000), with Linear Fit lines.] Linear Fit: Sales ($000) = 18.155451 + 0.887074*Households (000). Linear Fit: Sales ($000) = 50.630163 + 0.8289871*Households (000). Linear Fit: Sales ($000) = 7.9004191 + 0.9207038*Households (000). SRM (Term: Estimate, Std Error, t Ratio, Prob>|t|): Intercept: 14.867648, 13.12805, 1.13, 0.2779; Households (000): 0.9371196, 0.073045, 12.83, <.0001*. Qualitative Variables: Represent categories using dummy variables. A 0/1 indicator for each of the categories. Redundant: only need 2 dummies for the 3 categories. Data table. JMP software makes the manual creation of dummy variables unnecessary. Regression with Categorical: Add the dummy variables to the regression. Summary of Fit: RSquare 0.986846; RSquare Adj 0.983258; Root Mean Square Error 6.349409; Mean of Response 176.9893; Observations (or Sum Wgts) 15. Parameter Estimates (Term: Intercept, Households (000), DD, DM): Estimate 14.977693, 0 ...

categorical_nb

UPenn, INSR 260
Excerpt: ... [Figure: scatterplots against Households (000), with Linear Fit lines.] Linear Fit: Sales ($000) = 18.155451 + 0.887074*Households (000). Linear Fit: Sales ($000) = 50.630163 + 0.8289871*Households (000). Linear Fit: Sales ($000) = 7.9004191 + 0.9207038*Households (000). SRM (Term: Estimate, Std Error, t Ratio, Prob>|t|): Intercept: 14.867648, 13.12805, 1.13, 0.2779; Households (000): 0.9371196, 0.073045, 12.83, <.0001*. Qualitative Variables: Represent categories using dummy variables. A 0/1 indicator for each of the categories. Redundant: only need 2 dummies for the 3 categories. Data table. JMP software makes the manual creation of dummy variables unnecessary. Regression with Categorical: Add the dummy variables to the regression. Summary of Fit: RSquare 0.986846; RSquare Adj 0.983258; Root Mean Square Error 6.349409; Mean of Response 176.9893; Observations (or Sum Wgts) 15. Parameter Estimates (Term: Intercept, Households (000), DD, DM): Estimate 14.977693, 0 ...

Schuckman_ProjectFinal_Proposal_Revised

Penn State, GEOG 505
Excerpt: ... Karen Schuckman 19 November 2008 Geog 586: QuarterLong Project Revised Project Proposal One of the primary professional responsibilities of a Certified Photogrammetrist is to provide a spatial accuracy statement for base map products, particular ...

homework6

UPenn, STAT 112
Excerpt: ... Homework 6, Statistics 112, Fall 2005 This homework is due Thursday, November 3rd at the beginning of class. 1. This problem is based on Dielman, Problem 5.8. The data set MPGWT5.JMP contains data on the number of miles per gallon obtained by a car in city driving (CITYMPG) and the weight of a car in pounds (WEIGHT) for 147 cars listed in the Road and Track October 2002 issue. We would like to model E(CITYMPG|WEIGHT). (a) Fit a simple linear regression model of Y=CITYMPG on X=WEIGHT. Construct a residual plot. What is the most obvious problem you see with the residual plot compared to what you would expect to see if the ideal simple linear regression model holds? (b) Using Tukey's Bulging rule, try three appropriate transformations to try to achieve a better fit. Which transformation is best in terms of maximizing the R2 (equivalently, minimizing the root mean square error)? Does the transformation improve on the simple linear regression model? 2. Problem 1 continued. (a) Use polynomial regressions to model E(CITYMPG|WE ...

studyguide2

Arizona, RNR 417
Excerpt: ... hic data products, CAD conversions; issues and processes tablet calibration - definition, general procedures, issues related to transformation global positioning systems (GPS) - definition, use, basic functional understanding of time and distance relationship project development approach - understand the steps in the process as presented in class Vocabulary - vertical integration, integrated parametric mapping, manuscripting, vector feature resolution, image registration, geoid, ellipsoid, trilateration, datum, reliability, validity, accuracy, precision Questions What is IPU Mapping and what is it used for? What is vertical integration? What is Root Mean Square error as it applies to tablet calibration? What are some of the advantages of scanning as a form of cartographic data acquisition? Created on 11/2/2004 11:48 AM Why are there so many datums? What happens during the process of digitizer calibration? DATA MODELS and TOPOLOGY georelational model - definition, characteristics, funct ...