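$\log L(\lambda) = \sum_{i=1}^{r} \log\lambda - \lambda \sum_{i=1}^{n} t_i = r \log\lambda - \lambda \sum_{i=1}^{n} t_i$. 9.27
The maximum log-likelihood is obtained by setting to zero the derivative
of 9.27, yielding the following estimate of the parameter $\lambda$:
$\hat{\lambda} = r \Big/ \sum_{i=1}^{n} t_i$. 9.28
The standard error of this estimate is $\hat{\lambda}/\sqrt{r}$. The following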
stati
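As a minimal sketch of estimate 9.28 and its standard error, the function below assumes that r counts the uncensored cases and that the sum of the times runs over all n cases, censored or not. The function name and the toy data are hypothetical and are not taken from any dataset in the text.

```python
import math

def exponential_mle(times, events):
    """Return (lambda_hat, standard_error) for right-censored exponential data.

    times  -- observed times t_i for all n cases (event or censored)
    events -- 1 if the event was observed at t_i, 0 if the case is censored
    """
    r = sum(events)            # number of uncensored cases (assumed meaning of r)
    total_time = sum(times)    # sum of t_i over all n cases
    if r == 0 or total_time <= 0:
        raise ValueError("need at least one event and a positive total time")
    lam = r / total_time       # estimate 9.28
    se = lam / math.sqrt(r)    # standard error of the estimate
    return lam, se

# Hypothetical toy data: four cases, the third one censored.
lam_hat, se_hat = exponential_mle([2.0, 3.0, 5.0, 10.0], [1, 1, 0, 1])
```

With these toy values there are three events over a total observed time of 20, so the estimate is 3/20.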
Cumul Propn Surv at End Propn Terminating column is the
complement of the Propn Surviving column. Finally,
358 9 Survival Analysis
the Fatigue dataset. Determine the survivor, hazard and density
functions using the life-table estimate procedure. What is t
(AAPN). The scatter plot of Figure 8.6a shows that the pc scores
obtained with the covariance matrix are unable to discriminate the
several groups of rocks; u1 only
groups. On the other hand, the scatter plot in Figure 8.6b shows that
the pc scores obtain
diagonal elements of S, we have: $S = S_k + (D - H)$. 8.22 In order to cope
with different units of the original variables, it is customary to carry out
the factor analysis on correlation matrices: $R = R_k + (I - H)$. 8.23
There are several algorithms for finding
similar to the one mentioned in Example 9.8 (Weight 3). Note that the t
values are divided, as in Example 9.3, by $10^4$. The observed probability
of the chi-square goodness of fit test is very high: p = 0.96. The model param
to create a Kaplan-Meier estimate of the data using survfit(x) (or, if
preferred, survfit(Surv(t[1:8],ev[1:8]==0))). The survdiff function
provides tests for comparing groups of survival data. The argument rho
can be 0 or 1 depending on whether one wants the
8.12 Redo the principal factor analysis of Example 8.10 using three
factors and varimax rotation. With the help of a 3D plot interpret the
results obtained checking that the three factors are related to the
following original variables: SiO2-Al2O3-CaO (si
(scree plot), discarding those starting where the plot levels off. 4. A
more elaborate criterion is based on the so-called broken stick model.
This criterion discards the eigenvalues whose proportion of explained
variance is smaller than what should be th
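The broken stick criterion can be sketched as follows. The sketch assumes the usual form of the model, in which the expected proportion of variance for the k-th of d components is (1/d) times the sum of 1/i for i from k to d, and it assumes that one stops at the first eigenvalue falling below its expected proportion. Both function names are hypothetical.

```python
def broken_stick(d):
    """Expected proportions of variance for d components under the broken stick model."""
    return [sum(1.0 / i for i in range(k, d + 1)) / d for k in range(1, d + 1)]

def retained_by_broken_stick(eigenvalues):
    """Number of leading eigenvalues whose explained proportion is not
    smaller than the broken stick expectation (assumed stopping rule)."""
    total = sum(eigenvalues)
    expected = broken_stick(len(eigenvalues))
    kept = 0
    for lam, exp_prop in zip(sorted(eigenvalues, reverse=True), expected):
        if lam / total < exp_prop:
            break              # discard this eigenvalue and all smaller ones
        kept += 1
    return kept

# Hypothetical eigenvalues of a 2x2 covariance matrix.
n_kept = retained_by_broken_stick([1.99, 0.01])
```

For two components the expected proportions are 0.75 and 0.25, so only the first of the two hypothetical eigenvalues is retained.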
< 1, the hazard function decreases monotonically. Taking into account
9.23, one obtains:
$S(t) = e^{-\lambda t^{\gamma}}$. 9.31
The probability density function of the survival time is given by the
derivative of $F(t) = 1 - S(t)$. Thus: $f(t) = \lambda\gamma\, t^{\gamma-1} e^{-\lambda t^{\gamma}}$. 9.32
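The relation between 9.31 and 9.32 can be checked numerically. The sketch below assumes the parametrisation with positive parameters lam and gamma, in which the survivor function is exp(-lam * t^gamma); the function names and the parameter values are hypothetical.

```python
import math

def weibull_survivor(t, lam, gamma):
    """Survivor function S(t) of 9.31 (assumed parametrisation)."""
    return math.exp(-lam * t ** gamma)

def weibull_density(t, lam, gamma):
    """Density f(t) of 9.32, the derivative of F(t) = 1 - S(t)."""
    return lam * gamma * t ** (gamma - 1) * math.exp(-lam * t ** gamma)

def weibull_hazard(t, lam, gamma):
    """Hazard as the ratio f(t) / S(t)."""
    return weibull_density(t, lam, gamma) / weibull_survivor(t, lam, gamma)

# Hypothetical parameters with gamma < 1, so the hazard should decrease.
lam, gamma, t, h = 0.5, 0.8, 2.0, 1e-6

# Central difference approximation of -dS/dt at t.
numeric_density = -(weibull_survivor(t + h, lam, gamma)
                    - weibull_survivor(t - h, lam, gamma)) / (2 * h)
exact_density = weibull_density(t, lam, gamma)
```

The numerical derivative of 1 - S(t) should agree with the closed-form density, and with gamma below 1 the hazard at t = 1 exceeds the hazard at t = 2.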
T
8.5 Consider the CTG dataset with 2126 cases of foetal heart rate (FHR)
features computed in normal, suspect and pathological FHR tracings
(variable NSP). Perform a principal component analysis using the
feature set {LB, ASTV, MSTV, ALTV, MLTV, WIDTH,
function returns the eigenvectors, u, and eigenvalues, l, of a covariance
matrix C. The pcacov function determines the principal components of a
covariance matrix C, which are returned in pc. The return vectors lat
and expl store the variances and contrib
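For readers without MATLAB, the quantities described above (principal directions, their variances and the percentage of variance explained) can be illustrated for the two-variable case with a closed-form eigen-decomposition. This is only a sketch for a 2x2 symmetric covariance matrix, not a replacement for the MATLAB functions; the function name pca_2x2 and the example matrix are hypothetical.

```python
import math

def pca_2x2(c11, c12, c22):
    """Principal directions, variances and explained percentages of the
    symmetric covariance matrix [[c11, c12], [c12, c22]]."""
    if c12 == 0.0:
        # Already diagonal: the co-ordinate axes are the principal directions.
        pairs = [(c11, (1.0, 0.0)), (c22, (0.0, 1.0))]
        pairs.sort(key=lambda p: p[0], reverse=True)
    else:
        mean = (c11 + c22) / 2.0
        radius = math.hypot((c11 - c22) / 2.0, c12)
        pairs = []
        for lam in (mean + radius, mean - radius):   # eigenvalues, largest first
            v = (c12, lam - c11)                     # solves (C - lam*I) v = 0
            norm = math.hypot(v[0], v[1])
            pairs.append((lam, (v[0] / norm, v[1] / norm)))
    variances = [lam for lam, _ in pairs]
    vectors = [u for _, u in pairs]
    total = c11 + c22                                # trace = total variance
    explained = [100.0 * lam / total for lam in variances]
    return vectors, variances, explained

# Hypothetical covariance matrix: unit variances, covariance 0.99.
u, lat, expl = pca_2x2(1.0, 0.99, 1.0)
```

For this hypothetical matrix the first unit length eigenvector has both components equal to 1/sqrt(2), and the first component carries 99.5% of the total variance.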
8.2 Dimensional Reduction
When using principal component analysis for dimensional reduction,
one must decide how many components (and corresponding variances)
to retain. There are several criteria published in the literature
Figure 8.9. Principal component transformation of a bivariate dataset:
a) original data with a group delimited by an ellipse; b) standardised
data with the same group (delimited by a circle); c) standardised data
projection onto the F1-F2 plane; d) Plot
unobservable variables, called latent variables, which model the data in
such a way that the remaining errors are uncorrelated. Equation 8.19
then expresses the observations x in terms of the latent variables zk and
uncorrelated errors e. The true values
352 8 Data Structure Analysis
8.9 Consider the Stock Exchange dataset. Using principal factor analysis,
determine which economic variable best explains the variance of the
whole data.
for this fact taking into account the values of the variables highly
co
Assuming one has created the Surv object x as explained in Commands
9.1, one
confidence interval (see section 9.2.3), can be obtained with
plot(survfit(x)). Applying summary to survfit(x), the confidence intervals
for S(t) are displayed as follows:
time n.r
ones, f1 and f2, are computed with STATISTICA using the correlation
matrix. Figure 8.6 shows the corresponding pc scores (called factor
scores in STATISTICA), that is the data projections onto the principal
components.
We see
contemplating the interaction effects X12 = X1X2, X13 = X1X3, X23 = X2X3,
and show that these interactions have no valid contribution.
log10 FW = 1.2508 + 0.166 BPD + 0.046 AP − 0.002646 (BPD)(AP). Try to
obtain this formula using the Foetal Weight dataset and
,
from where we derive the unit length eigenvector: $u_1 = [0.7071\ \ 0.7071]' \equiv
[1/\sqrt{2}\ \ 1/\sqrt{2}]'$. For $\lambda_2$, in the same way we derive the unit length
eigenvector orthogonal to $u_1$: $u_2 = [-0.7071\ \ 0.7071]' \equiv [-1/\sqrt{2}\ \ 1/\sqrt{2}]'$.
Thus, the principal components of the co-ordinates
Figure 9.4. Kaplan-Meier estimate of the survivor function for the event-free
survival of patients with heart valve implant, obtained with
STATISTICA. (Horizontal axis: t (days); legend: Complete, Censored.)
9.2.3 Statistics for Non-Parametric Analysis
The following statistics are
ofte
Figure 9.7. Survivor function (a) and hazard function (b) for the Heart
Valve dataset with the fitted exponential estimates shown with dotted
lines. Plots obtained with STATISTICA.
9.4.2 The Weibull Model
The Weibull distribution offers a more general
mo
$\hat{x} = \bar{x} + U_k z_k$. 8.18
Using 8.17 and 8.18, we can express the original data in terms of the
estimation error $e = x - \hat{x}$, as: $x = \bar{x} + U_k z_k + (x - \hat{x}) = \bar{x} + U_k z_k + e$.
8.19 When all principal components are used, the covariance matrix
satisfies
$S = U \Lambda U'$ (see form
$C_2(\kappa) = 1/(2\pi I_0(\kappa))$, for the circle (p = 2); $C_3(\kappa) = \kappa/(4\pi \sinh(\kappa))$, for
the sphere (p = 3).
$I_p$ denotes the modified Bessel
function of the first kind and order p (see B.2.10).
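The two normalising constants can be checked numerically. The sketch below assumes that they normalise a density proportional to exp(kappa * cos(theta)), with theta the angle to the mean direction and kappa > 0 the concentration parameter; that density form is not shown in this passage and is an assumption here. The Bessel function of order 0 is evaluated by its power series, and all function names are hypothetical.

```python
import math

def bessel_i0(kappa, terms=60):
    """Modified Bessel function of the first kind and order 0 (power series)."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (kappa / 2.0) ** 2 / ((k + 1) ** 2)
    return total

def c2(kappa):
    """Normalising constant for the circle (p = 2)."""
    return 1.0 / (2.0 * math.pi * bessel_i0(kappa))

def c3(kappa):
    """Normalising constant for the sphere (p = 3); requires kappa > 0."""
    return kappa / (4.0 * math.pi * math.sinh(kappa))

def circle_total_mass(kappa, steps=2000):
    """Midpoint-rule integral of c2 * exp(kappa*cos(theta)) over [0, 2*pi)."""
    c, h = c2(kappa), 2.0 * math.pi / steps
    return h * sum(c * math.exp(kappa * math.cos((j + 0.5) * h))
                   for j in range(steps))

def sphere_total_mass(kappa, steps=2000):
    """Integral over the unit sphere; sin(theta) is the area element and the
    factor 2*pi comes from the azimuth."""
    c, h = c3(kappa), math.pi / steps
    return 2.0 * math.pi * h * sum(
        c * math.exp(kappa * math.cos((j + 0.5) * h)) * math.sin((j + 0.5) * h)
        for j in range(steps))

mass_circle = circle_total_mass(2.0)   # hypothetical concentration kappa = 2
mass_sphere = sphere_total_mass(2.0)
```

Under the stated assumption, both integrals should be close to 1.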
384 10 Directional Data
For p = 2, one obtains the circular distribution first stud
factor analysis solution afford an interpretation of the Apgar index. For
this purpose, use the varimax rotation and plot the categorised data
using three classes for the Apgar at 1 minute after birth (Apgar1: ≤5; >5
and ≤8; >8) and two classes for the Apga
always decreases with increasing time. The parameters of the
distribution can be estimated from the data using a log-likelihood
approach, as described in the previous section, resulting in a system of
two equations, which can only be solved by an iterati
in degrees for several joint surfaces of a granite structure. What are the
Cartesian co-ordinates of the unit length vector representing the first
measurement? A: Since the pitch is a descent angle, we use the
following MATLAB instructions (see Commands 1
correlated to the principal components found in a). c) Using a scatter
plot of the pc-scores check that the {ADI, CON} class set is separated
from all other classes by the first principal component only, whereas the
discrimination of the carcinoma class
The analysis and interpretation of directional data require specific data
representations, descriptions and distributions. Directional data occurs
in many areas, namely the Earth Sciences, Meteorology and Medicine.
Note that directional
usual statistics,
Y in describing the data is tenuous. In the limit, with $\sigma_Y^2 \to 0$, Y would
be discarded as an interesting variable and the equal density ellipse
would converge to a line segment. In Figure 8.1c, X and Y are correlated
($\rho$ = 0.99) and have the same variance, $\sigma^2$
obtained using either the SPSS or STATISTICA commands. Incidentally,
note how the logit and probit models afford a regression solution to
classification problems and constitute an alternative to the statistical
classification methods described in Chapter
The following properties are verified:
1. $U' S U = \Lambda$ and $S = U \Lambda U'$. 8.6
2. The determinant of the covariance matrix, |S|, is: $|S| = |\Lambda| = \lambda_1 \lambda_2 \cdots \lambda_d$. 8.7
|S| is called the generalised variance and its square root
is proportional to the area or volume of the
likelihood), which is a function of the coefficients, $L(\beta)$, can be
expressed as: $L(\beta) = \sum_{i} \left[ y_i(\beta_0 + \beta_1 x_i) - \ln\left(1 + \exp(\beta_0 + \beta_1 x_i)\right) \right]$. 7.69
The maximization of the $L(\beta)$ function can now be carried out using one
of many numerical optimisation methods, such as the quasi-Ne
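The function to be maximised in 7.69 can be sketched as follows, assuming a single predictor x, binary responses y coded 0 or 1, and coefficients b0 and b1. The function name and the toy data are hypothetical, and the sketch only evaluates the log-likelihood; it does not perform the optimisation.

```python
import math

def logistic_log_likelihood(b0, b1, xs, ys):
    """Sum over cases of y_i*(b0 + b1*x_i) - ln(1 + exp(b0 + b1*x_i))."""
    total = 0.0
    for x, y in zip(xs, ys):
        eta = b0 + b1 * x
        # Numerically stable evaluation of ln(1 + exp(eta)).
        softplus = max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))
        total += y * eta - softplus
    return total

# Hypothetical toy data: low x values labelled 0, high x values labelled 1.
xs = [0.5, 1.0, 1.5, 2.0]
ys = [0, 0, 1, 1]
ll_null = logistic_log_likelihood(0.0, 0.0, xs, ys)   # all coefficients zero
ll_fit = logistic_log_likelihood(-5.0, 4.0, xs, ys)   # a better-fitting choice
```

With both coefficients at zero every case contributes -ln 2, so any numerical optimiser should move away from that point towards coefficients with a higher value, such as the second pair tried here.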
performing ridge regression we go as far as r = 1. If we go beyond this
value the square norm of b is driven to small values and we may get
strange solutions such as the one shown in Figure 7.20d for r = 50
corresponding to b = [0.020 0.057 0.078] , i.e.,