Transformations - Tarter, M. E. (2008), Data...

Tarter, M. E. (2008), Data transformations. In S. Boslaugh (Ed.), Encyclopedia of Epidemiology (pp. 249–254). Thousand Oaks, CA: Sage Publications.

Transformations

Introduction

Data transformations modify measured values systematically. For example, suppose the heart-rate (HR) ratio variate, HRR = (HR Work – HR Rest)/(HR Predicted Maximum – HR Rest), is transformed to the new variate arcsin(HRR). Because it better matches the assumptions underlying the statistical methodology applied to it, a variate like arcsin(HRR) is often preferred to a variate like HRR.

In modern statistical usage, transformations often help preprocess raw data before a general-purpose software package is applied. If the steps from data input to the output of some display or printing device were compared to a journey by car through a city, a transformation like arcsin(HRR) would play the role of an access road to the software package's freeway onramp. Software validity or, loosely speaking, journey safety, depends on underlying assumptions. Hence data transformations can be classified by the type of assumption they address. These include a measured variate's standard Normality, its general Normality, model linearity, and/or variate homoscedasticity, i.e., equal standard deviations.

In addition, some useful transformations are not designed to preprocess measurements individually. Instead, once an estimator or test statistic has been computed from raw measurements, these transformations can help enhance the Normality of that estimator or test statistic.

Transformations and Simulated Data

Transformations apply not only to measured values: a transformation procedure is usually among the steps used to simulate artificial data values.
For example, given a pair of uniformly distributed random numbers as input, a Box-Muller transformation (BMT) generates a pair of independent standard Normal variates, that is, Normal variates with zero expectation and unit variance. To answer two questions, (1) why does the BMT have so many applications? and (2) how are transformation components assembled?, it is helpful to call upon the following notational conventions. The two Greek letters φ and Φ represent the standard Normal density function (i.e., curve) and cumulative distribution function (cdf), respectively. In the same way that sin⁻¹ often designates the arcsin function, Φ⁻¹ designates the inverse of Φ. The three symbols that form Φ⁻¹ (which in older statistical and epidemiological texts is often called the probit function) provide a useful notational device because transformation and other data analysis steps tend to be taken in the reverse of the order in which data simulation process components are implemented. For instance, no data analysis text discusses a scale parameter σ before discussing a location parameter µ.
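
The Box-Muller step described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the article: it assumes the commonly stated polar form of the transformation, z0 = sqrt(-2 ln u1) cos(2π u2) and z1 = sqrt(-2 ln u1) sin(2π u2), and the function names box_muller and simulate_standard_normals are hypothetical.

```python
import math
import random


def box_muller(u1, u2):
    """Map uniforms u1 in (0, 1] and u2 in [0, 1) to two standard Normal values.

    Assumes the usual polar form of the Box-Muller transformation.
    """
    radius = math.sqrt(-2.0 * math.log(u1))  # requires u1 > 0
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


def simulate_standard_normals(n_pairs, seed=0):
    """Generate 2 * n_pairs simulated standard Normal values (hypothetical helper)."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_pairs):
        u1 = 1.0 - rng.random()  # random() lies in [0, 1), so u1 lies in (0, 1]
        u2 = rng.random()
        values.extend(box_muller(u1, u2))
    return values


if __name__ == "__main__":
    sample = simulate_standard_normals(10_000)
    mean = sum(sample) / len(sample)
    variance = sum((z - mean) ** 2 for z in sample) / len(sample)
    print(f"mean = {mean:.3f}, variance = {variance:.3f}")
```

With a large sample the printed mean should be close to zero and the variance close to one, which matches the zero-expectation, unit-variance description of the standard Normal given in the text.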

This note was uploaded on 02/04/2011 for the course PB HLTH 140 taught by Professor Tarter during the Fall '10 term at University of California, Berkeley.
