140 Pages

fletcher_thesis

Course: STAT 322, Fall 2009
School: UNC
Rating:
 
 
 
 
 

Word Count: 42080

Document Preview

Variability Statistical in Nonlinear Spaces: Application to Shape Analysis and DT-MRI by P. Thomas Fletcher A dissertation submitted to the faculty of the University of North Carolina at Chapel Hill in partial fulllment of the requirements for the degree of Doctor of Philosophy in the Department of Computer Science. Chapel Hill 2004 Approved by: Stephen M. Pizer, Advisor Sarang Joshi, Advisor Guido Gerig,...

Register Now

Unformatted Document Excerpt

Coursehero >> North Carolina >> UNC >> STAT 322

Course Hero has millions of student submitted documents similar to the one
below including study guides, practice problems, reference materials, practice exams, textbook help and tutor support.

Course Hero has millions of student submitted documents similar to the one below including study guides, practice problems, reference materials, practice exams, textbook help and tutor support.
Variability Statistical in Nonlinear Spaces: Application to Shape Analysis and DT-MRI by P. Thomas Fletcher A dissertation submitted to the faculty of the University of North Carolina at Chapel Hill in partial fulllment of the requirements for the degree of Doctor of Philosophy in the Department of Computer Science. Chapel Hill 2004 Approved by: Stephen M. Pizer, Advisor Sarang Joshi, Advisor Guido Gerig, Reader J. S. Marron, Reader Michael Kerckhove, Reader ii iii ABSTRACT P. THOMAS FLETCHER: Statistical Variability in Nonlinear Spaces: Application to Shape Analysis and DT-MRI. (Under the direction of Stephen M. Pizer and Sarang Joshi.) Statistical descriptions of anatomical geometry play an important role in many medical image analysis applications. For instance, geometry statistics are useful in understanding the structural changes in anatomy that are caused by growth and disease. Classical statistical techniques can be applied to study geometric data that are elements of a linear space. However, the geometric entities relevant to medical image analysis are often elements of a nonlinear manifold, in which case linear multivariate statistics are not applicable. This dissertation presents a new technique called principal geodesic analysis for describing the variability of data in nonlinear spaces. Principal geodesic analysis is a generalization of a classical technique in linear statistics called principal component analysis, which is a method for computing an ecient parameterization of the variability of linear data. A key feature of principal geodesic analysis is that it is based solely on intrinsic properties, such as the notion of distance, of the underlying data space. The principal geodesic analysis framework is applied to two driving problems in this dissertation: (1) statistical shape analysis using medial representations of geometry, which is applied within an image segmentation framework via posterior optimization of deformable medial models, and (2) statistical analysis of diusion tensor data intended as a tool for studying white matter ber connection structures within the brain imaged by magnetic resonance diusion tensor imaging. It is shown that both medial representations and diusion tensor data are best parameterized as Riemannian symmetric spaces, which are a class of nonlinear manifolds that are particularly well-suited for principal geodesic analysis. While the applications presented in this dissertation are in the eld of medical image analysis, the methods and theory should be widely applicable to many scientic elds, including robotics, computer vision, and molecular biology. iv v ACKNOWLEDGMENTS First, I would like to thank my advisor Dr. Stephen M. Pizer, whose guidance and support were essential to this work. I thank him for always being there when I had a question or when I needed to be pointed in the right direction. I also want to thank my co-advisor Dr. Sarang Joshi, who has played an integral role in formulating much of this research and has been a joy to work with. My committee members Dr. Guido Gerig, Dr. J.S. Marron, and Dr. Michael Kerckhove have been enormous help with their expertise in image analysis, statistics and mathematics. This research was made possible with help from many people at the University of North Carolina. I want to thank all my fellow students (past and present) in the Medical Image Display and Analysis Group, including Dr. Paul Yushkevich, Dr. Andrew Thall, Yonatan Fridman, Sean Ho, Qiong Han, Eli Broadhurst, and many others. I also want to thank MIDAG members Dr. Ed Chaney, Dr. James Damon, Dr. Daniel Fristch, Dr. A. Graham Gash, Dr. Keith Muller, and Gregg Tracton. My work has beneted immensely from many discussions with MIDAG members in our regular meetings, including Image Lunch, Stats of Shape, and the m-rep group meetings. The funding for this work was provided by National Institutes of Health grants P01 CA47982 and EB02779. I am deeply grateful to the faculty and sta of the UNC Department of Computer Science. I thank each and every professor who has taken the time to teach me something, whether in or out of class. I thank the sta in the department who keep everything running smoothly. I am especially grateful to Delphine Bull whose organization and assistance have saved me countless hours. Finally, I am thankful for my friends and family, whose love and support made this all worthwhile. My friends Tom, Dave, Josh, Shelby, Zac, Kimberly and Eli kept me laughing when I needed it the most. My parents, Pete and Elizabeth, have supported me in everything that I do. They always stressed to me the importance of education, and I am glad they did. My brother David, sister Rebecca, and grandmother Dorothy have been an essential part of my support system. Most of all, I feel very lucky that I have been able to share this experience with such a wonderful person as my wife Erin. vi vii TABLE OF CONTENTS LIST OF FIGURES 1 Introduction 1.1 Motivation . . . . . . . . . . . . 1.1.1 Shape Analysis . . . . . 1.1.2 Diusion Tensor Imaging 1.2 Thesis and Claims . . . . . . . 1.3 Overview of Chapters . . . . . . xi xii 1 3 4 4 6 7 8 8 9 10 10 11 11 12 14 15 15 16 20 22 24 25 26 27 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 Mathematical Background 2.1 Topology . . . . . . . . . . . . . . . . . . . . . . . 2.1.1 Basics . . . . . . . . . . . . . . . . . . . . 2.1.2 Metric spaces . . . . . . . . . . . . . . . . 2.1.3 Continuity . . . . . . . . . . . . . . . . . . 2.1.4 Various Topological Properties . . . . . . . 2.2 Dierentiable Manifolds . . . . . . . . . . . . . . 2.2.1 Topological Manifolds . . . . . . . . . . . 2.2.2 Dierentiable Structures on Manifolds . . 2.2.3 Tangent Spaces . . . . . . . . . . . . . . . 2.3 Riemannian Geometry . . . . . . . . . . . . . . . 2.3.1 Riemannian Metrics . . . . . . . . . . . . 2.3.2 Geodesics . . . . . . . . . . . . . . . . . . 2.4 Lie Groups . . . . . . . . . . . . . . . . . . . . . . 2.4.1 Lie Group Exponential and Log Maps . . 2.4.2 Bi-invariant Metrics . . . . . . . . . . . . 2.5 Symmetric Spaces . . . . . . . . . . . . . . . . . . 2.5.1 Lie Group Actions . . . . . . . . . . . . . 2.5.2 Symmetric Spaces as Lie Group Quotients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii 3 Image Analysis Background 3.1 Statistical Shape Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.1.1 3.1.2 3.1.3 3.1.4 3.2 3.2.1 3.2.2 3.3 3.3.1 3.3.2 3.4 Point Set Shape Spaces . . . . . . . . . . . . . . . . . . . . . . . . Procrustes Distance and Alignment . . . . . . . . . . . . . . . . . Shape Variability . . . . . . . . . . . . . . . . . . . . . . . . . . . Nonlinear Statistical Analysis . . . . . . . . . . . . . . . . . . . . Active Contours . . . . . . . . . . . . . . . . . . . . . . . . . . . . Probabilistic Deformable Models . . . . . . . . . . . . . . . . . . The Medial Locus . . . . . . . . . . . . . . . . . . . . . . . . . . . M-reps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 29 30 32 34 38 39 39 41 43 44 48 57 61 Deformable Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Medial Representations . . . . . . . . . . . . . . . . . . . . . . . . . . . . Diusion Tensor Imaging . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 Manifold Statistics 4.1 Means on Manifolds 4.1.1 4.1.2 4.2 4.2.1 4.2.2 4.2.3 4.2.4 4.2.5 4.2.6 4.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Intrinsic vs. Extrinsic Means . . . . . . . . . . . . . . . . . . . . . Computing the Intrinsic Mean . . . . . . . . . . . . . . . . . . . . Variance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Geodesic Submanifolds . . . . . . . . . . . . . . . . . . . . . . . . Projection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Dening Principal Geodesic Analysis . . . . . . . . . . . . . . . . An Alternative Denition of PGA . . . . . . . . . . . . . . . . . . Approximating Principal Geodesic Analysis . . . . . . . . . . . . 61 62 63 64 65 66 67 67 68 69 71 73 74 75 76 77 79 81 83 84 Principal Geodesic Analysis . . . . . . . . . . . . . . . . . . . . . . . . . Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 Statistics of M-reps 5.1 M-reps as Elements of a Symmetric Space . . . . . . . . . . . . . . . . . 5.1.1 5.1.2 5.2 5.3 5.4 5.5 The Exponential and Log Maps for M-reps . . . . . . . . . . . . . The Hippocampus Data Set . . . . . . . . . . . . . . . . . . . . . M-rep Alignment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . M-rep Averages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . M-rep PGA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . PGA in Deformable M-reps Segmentation . . . . . . . . . . . . . . . . . 5.5.1 Principal Geodesic Deformations . . . . . . . . . . . . . . . . . . ix 5.6 5.5.2 PGA-Based Geometric Prior . . . . . . . . . . . . . . . . . . . . . Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 86 88 90 91 92 93 94 96 96 97 98 100 101 101 105 107 109 109 113 113 115 116 117 119 6 Statistics of Diusion Tensors 6.1 The Space of Diusion Tensors . . . . . . . . . . . . . . . . . . . 6.2 The Geometry of P D(n) . . . . . . . . . . . . . . . . . . . . . . . 6.2.1 The Lie Group Action on P D(n) . . . . . . . . . . . . . . 6.2.2 The Invariant Metric on P D(n) . . . . . . . . . . . . . . . 6.2.3 Computing Geodesics . . . . . . . . . . . . . . . . . . . . . 6.3 Statistics of Diusion Tensors . . . . . . . . . . . . . . . . . . . . 6.3.1 Averages of Diusion Tensors . . . . . . . . . . . . . . . . 6.3.2 Principal Geodesic Analysis of Diusion Tensors . . . . . . 6.4 Properties of PGA on P D(n) . . . . . . . . . . . . . . . . . . . . 6.5 New Methods: Comparison Metric, Interpolation, and Anisotropy 6.5.1 Comparison Metric . . . . . . . . . . . . . . . . . . . . . . 6.5.2 Diusion Tensor Interpolation . . . . . . . . . . . . . . . . 6.5.3 Geodesic Anisotropy Measure . . . . . . . . . . . . . . . . 6.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 Discussion and Future Work 7.1 Summary of Contributions . . . . . . 7.2 Future Work . . . . . . . . . . . . . . 7.2.1 Theoretical Questions . . . . 7.2.2 M-rep Extensions . . . . . . . 7.2.3 Future Diusion Tensor Work 7.2.4 Other Application Areas . . . BIBLIOGRAPHY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x xi LIST OF FIGURES 2.1 2.2 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 4.1 5.1 5.2 5.3 5.4 5.5 5.6 A local coordinate system (x, U ) on a manifold M . . . . . . . . . . . . . The Riemannian exponential map. . . . . . . . . . . . . . . . . . . . . . Three objects that have the same shape, yet have dierent positions, orientations, and scales. . . . . . . . . . . . . . . . . . . . . . . Projection of an object onto the tangent space of the shape space k . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . n Principal component analysis and Mahalanobis distance. . . . . . . . . . Medial axes of a 2D and 3D object. . . . . . . . . . . . . . . . . . . . . . Two representations of a 3D order 1 medial atom. . . . . . . . . . . . . . Two single gure m-rep models: a kidney (left) and a hippocampus (right). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A 3D medial end atom, showing the portion of the boundary crest implied by the atom. . . . . . . . . . . . . . . . . . . . . . . . . The gural coordinate directions (u, v, t, ) demonstrated on an m-rep model of the hippocampus. . . . . . . . . . . . . . . . . . . A multi-gure m-rep object showing the subgure hinge attachment. . . 12 19 31 34 37 45 47 50 50 51 52 71 76 79 80 81 82 83 The spherical triangle used in the calculation of the projection operator for S 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . The surfaces of 16 of the 86 original hippocampus m-rep models. . . . . . The 86 aligned hippocampus m-reps, shown as overlayed medial atom centers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . The surface of the mean hippocampus m-rep. . . . . . . . . . . . . . . . Comparison of linear averaging and symmetric space averaging of medial atoms. . . . . . . . . . . . . . . . . . . . . . . . . . . . The rst three PGA modes of variation for the hippocampus m-reps. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A plot of the eigenvalues from the modes of variation and their cumulative sums. . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii 6.1 6.2 6.3 6.4 The space P D(2), showing the geodesic and the straight line l between the two points p0 and p1 . . . . . . . . . . . . . . . . . . . . The rst two modes of variation of the simulated diusion tensor data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . An example of geodesic interpolation of two diusion tensors. 91 99 . . . . . . 101 The geodesic anisotropy values for a sequence of diusion tensors. . . . . 106 Chapter 1 Introduction 1.1 Motivation Advances in medical imaging technology have provided the ability to acquire highresolution 3D images of the human body. Imaging technologies such as CT and MR are non-invasive means for obtaining potentially life-saving information. The goal of medical image analysis is to maximize the potential benet of this data, expanding its use beyond simple visualization of the raw data. Image analysis techniques provide more advanced visualizations and aid in disease diagnosis, radiotherapy treatment, surgery planning, and tracking of anatomic growth. For example, automatic extraction of anatomical geometry from a medical image is useful in planning radiation beam therapy to apply maximum radiation dose to a tumor while minimizing the exposure to surrounding organs. Image analysis techniques have shown promise in diagnosing brain disorders such as schizophrenia by distinguishing healthy brain structures from those with disease. Analysis of diusion tensor magnetic resonance images of neonatal brains can give information about the early stages of development in brain connectivity. These examples benet from a particular tool from medical image analysis known as statistical shape analysis, which describes the geometric variability of anatomy. A probability distribution of the possible geometric congurations of an organ can be used as prior information to help guide the process of automatic extraction of anatomy geometry from medical images. Probability distributions of normal organ shape and of diseased organ shape can be used to assign a probability that a patient has a particular disease based on the shape of their anatomy. Statistics of diusion tensor image data can be used to explore what common changes occur in the connectivity of the brain during development. 2 Previous approaches to these problems have used linear models of anatomic shape, and thus, linear statistical techniques to analyze the shape variability. Most models of shape currently in use are based on linear representations of the boundary that can only undergo linear variations in shape. However, richer models of shape and richer variations of shape can be achieved with nonlinear models. For example, medial representations of geometry, or m-reps, have shown promise in representing the interior of anatomic structures and describing shape changes in intuitive terms such as local thickening, bending, and twisting. The parameters of m-rep models are inherently nonlinear. Also, previous statistical models of diusion tensor data have been linear models. As shown in this dissertation, diusion tensors are more naturally modeled as elements of a nonlinear space. The drawback of nonlinear models is that classical linear statistical methods can give very misleading results. This dissertation presents a new technique called principal geodesic analysis for describing the variability of data in nonlinear spaces. Principal geodesic analysis is a generalization of a classical technique in linear statistics called principal component analysis, which is a method for computing an ecient parameterization of the variability of linear data. Principal component analysis also allows the dimensionality of the data to be reduced to only the true variables of change. This dissertation extends these concepts to nonlinear spaces known as manifolds. The driving problems in this dissertation are two: (1) statistical shape analysis using medial representations of geometry, which is applied within an image segmentation framework via posterior optimization of deformable medial models, and (2) statistical analysis of diusion tensor data intended as a tool for studying white matter ber connection structures within the brain imaged by magnetic resonance diusion tensor imaging. While the applications presented in this dissertation are in the eld of medical image analysis, the methods and theory should be widely applicable to many scientic elds, including mechanical engineering, robotics, computer vision, and molecular biology. Many common geometric entities are elements of nonlinear spaces. These include transformations such as rotations, scalings, and ane transformations, and primitives such as lines, planes, and unit vectors. The statistical methods developed in this work can be applied to all of these spaces. These other possible applications are mentioned in the future work section in Chapter 7. The remainder of this section continues the motivation for the two driving applications of this work: shape analysis and diusion tensor imaging. 3 1.1.1 Shape Analysis Shape analysis concerns the statistical study of the geometry of objects that is invariant to position, size, and orientation. An important aspect of shape theory is in studying the geometric variability of objects. Anatomical shape analysis plays an important role in several medical image analysis applications. One motivation for statistical shape analysis is its use in segmentation of anatomical structures in medical images. Segmentation is the process of distinguishing important structures in an image from background. This is a fundamental task in medical image analysis that is often a prerequisite for further analysis, visualization, disease diagnosis, or planning of medical treatment. Knowledge of the geometric variability of the anatomy can be used as prior information to help guide the segmentation process. This geometric prior helps overcome diculties inherent to segmentation, such as image noise, sampling artifacts, and low contrast. Statistical shape analysis may also be useful in educational atlases of anatomy. Current anatomical atlases only present a single instance of the normal anatomy. A statistical shape atlas can present a full range of geometric variabilities that occur in normal anatomy. Another application of statistical shape analysis is its potential to serve as a tool in understanding and diagnosing disease. For instance, brain disorders such as Alzheimers and schizophrenia are often accompanied by structural changes in the brain. Understanding the changes in organ shape that occur could be fundamental in furthering our understanding of such diseases. Also, shape analysis can help in diagnosing disease by detecting dierences in the shape of an organ aected by disease. The goal of this work is to provide new methods for analyzing nonlinear shape variability. The medial representation of object geometry provides a powerful framework for describing shape variability in intuitive terms such as local thickness, bending, and widening. However, the medial parameters are not elements of a Euclidean space. Therefore, the standard linear techniques of shape analysis, namely linear averaging and principal component analysis, do not apply. This work describes how the medial parameters are in fact elements of a certain type of manifold known as a Riemannian symmetric space. The theory of principal geodesic analysis developed in this dissertation is applied to study the variability of medial representations of object shape. This dissertation develops a segmentation strategy for 3D medical images using m-rep models with a geometric prior based on principal geodesic analysis. 4 1.1.2 Diusion Tensor Imaging Diusion tensor magnetic resonance imaging (DT-MRI) is an imaging technique that is used to obtain information about ber structures such as the white matter ber in the brain or the ber structure of muscles. It produces at every voxel in an imaging volume a diusion tensor, which is a model of the diusivity properties of water in that voxel. In brous structures, such as the white matter ber in the brain, water tends to diuse more in the direction parallel to the bers. This gives the ability to determine the local direction of white matter bers from diusion tensor images. Furthermore, algorithms for tracking these local ber directions leads to global information about the connectivity of various regions in the brain. In addition to connectivity information, the diusivity information has shown promise in understanding certain brain disorders. Studies have shown that pathologies such as multiple sclerosis and stroke can aect the diusivity properties of the brain matter. Statistical analysis of diusion tensor images has the potential to further our understanding of the connectivity properties of the brain and the eects of disease on the brain ber structure. Probability distributions on diusion tensors could be used to generate statistical atlases of diusion tensor data, describing the normal variation in brain connectivity across individuals. Statistical methods on diusion tensor data might be used to quantify the variability of the brain matter that is caused by disease and to study the normal variability in the geometry of white matter ber bundles. In addition, more fundamental processing of diusion tensor data, such as smoothing, could benet from statistics. Previous approaches to statistical analysis of diusion tensor data have used linear statistical models. However, such methods can assign nonzero probability to illegal instances of diusion tensors, i.e., tensors that do not model any possible diusion of water. In this dissertation it is shown that diusion tensors are more naturally modeled by a nonlinear space. Principal geodesic analysis is applied to the space of diusion tensors to parameterize the statistical variability of diusion tensor data. It is shown that, unlike linear techniques, principal geodesic analysis preserves the important properties of diusion tensors, including producing only legal models of water diusion. 1.2 Thesis and Claims Thesis: Principal geodesic analysis is a natural generalization of principal component analysis for describing the statistical variability of geometric data that are parameter- 5 ized as curved manifolds. Such manifolds include medial representations of shape and diusion tensors. Principal geodesic analysis can be used to parameterize the shape variability of a population of m-rep models. The resulting probabilities can be used eectively as a statistical geometric prior in a deformable m-rep model segmentation of 3D medical images. The contributions of this dissertation are 1. A novel theory called principal geodesic analysis has been developed as a natural generalization of principal component analysis for describing the statistical variability of geometric data that are parameterized as curved manifolds. This generalization is natural in the sense that it uses only intrinsic distances and geodesics in the data space. 2. It has been shown that medial representations of shape, or m-reps, can be formulated as elements of a Riemannian symmetric space and that the variability of a population of m-rep objects can be eciently computed using principal geodesic analysis. 3. A new method for aligning m-reps to a common position, orientation and scale has been developed and demonstrated. It generalizes the Procrustes alignment method for aligning linear representations of shape. It proceeds by minimizing the sum-of-square geodesic distances between corresponding atoms in medial models. 4. A method for maximum posterior segmentation of 3D medical images via deformable m-reps models using principal geodesic analysis has been developed. The optimization of the objective function in the segmentation uses the principal geodesic modes of variation as a parameter space. A geometric prior based on principal geodesic analysis has been developed and incorporated into a Bayesian objective function. 5. It has been shown that diusion tensors can be treated as data in a Riemannian symmetric space and that the variability of diusion tensor data can be described using principal geodesic analysis. 6. New methods for interpolating diusion tensors, comparing the similarity of diffusion tensor images, and measuring the anisotropy of diusion tensors have been developed using the symmetric space formulation of the space of diusion tensors. 6 1.3 Overview of Chapters The remainder of this dissertation is organized in the following chapters: Chapter 2 provides an overview of the required mathematics used in this dissertation. This includes dierential geometry concepts such as Riemannian manifolds, Lie groups and symmetric spaces. Chapter 3 presents the background topics in medical image analysis that are related to the applications treated in this dissertation. The topics include deformable models, shape analysis, m-rep models and segmentation, and diusion tensor imaging. Chapter 4 presents the main theoretical contribution of this work, principal geodesic analysis, which is a method for describing the statistical variability of data on a curved manifold. A discussion of existing methods for computing averages on manifolds is also included. Chapter 5 applies the statistical methods from Chapter 4 to the space of 3D m-rep models. This provides a method for describing the statistical variability of m-rep models of anatomic shape. A generalization of the Procrustes alignment method is given for m-rep models. Principal geodesic analysis is applied to a collection of m-rep models of hippocampi, demonstrating the average and modes of variation. A geometric prior using principal geodesic analysis is developed for deformable m-rep model segmentation. Chapter 6 applies the statistical methods from Chapter 4 to the space of diusion tensors. Principal geodesic analysis is applied to synthetic diusion tensor data to show that the average and modes of variation produce legal instances of diusion tensors, while naive linear statistical analysis does not. A natural method for interpolating diusion tensors and describing their anisotropy is also discussed. Chapter 7 concludes with a discussion of the contributions of this dissertation and possible future work. Chapter 2 Mathematical Background The geometric entities studied in this thesis, namely m-rep shape models and diusion tensors, are elements of high-dimensional, curved manifolds. More precisely, they are Riemannian symmetric spaces. It is useful to think of a point in a symmetric space as a transformation from a xed base point. For example, when constructing the space of diusion tensors, the base point is chosen to be the identity matrix, and any diusion tensor is treated as a transformation from the identity. The transformation spaces that are being used are known as Lie groups, which are smooth manifolds themselves. It is useful to study these Lie group transformations of symmetric spaces because they tend to be algebraic in nature, and, therefore, certain computations on symmetric spaces, such as distances and shortest paths between two points, often have closed-form solutions. The same computations can require solving dierential equations if the manifold in question is not a symmetric space. Since distances and shortest paths will be essential in the denitions of statistics for manifolds, symmetric spaces are particularly nice spaces for doing statistics. Many geometric entities are representable as Lie groups or symmetric spaces. Transformations of Euclidean spaces such as translations, rotations, scalings, and ane transformations all arise as elements of Lie groups. Geometric primitives such as unit vectors, oriented planes, and symmetric, positive-denite matrices can be seen as points in symmetric spaces. This chapter is a review of the basic mathematical theory of Lie groups and symmetric spaces. The study of such spaces rst requires some background in basic topology and manifold theory, which is provided in the rst three sections. The reader already familiar with these topics may skip the appropriate sections. The various spaces that are described throughout this chapter are all generalizations, in one way or the other, of Euclidean space, Rn . Euclidean space is a topological space, a Riemannian manifold, a Lie group, and a symmetric space. Therefore, each section 8 will use Rn as a motivating example. Also, since the study of geometric transformations is stressed, the reader is encouraged to keep in mind that Rn can also be thought of as a transformation space, that is, as the set of translations on Rn itself. 2.1 Topology The study of a topological spaces arose from the desire to generalize the notion of continuity on Euclidean spaces to more general spaces. Topology is a fundamental building block for the theory of manifolds and function spaces. This section is a review of the basic concepts needed for the study of dierentiable manifolds. For a more thorough introduction see [93]. For several examples of topological spaces, along with a concise reference for denitions, see [120]. 2.1.1 Basics Remember that continuity of a function on the real line is phrased in terms of open intervals, i.e., the usual - denition. A topology denes which subsets of a set X are open, much in the same way an interval is open. As will be seen at the end of this subsection, open sets in Rn are made up of unions of open balls of the form B(x, r) = {y Rn : x y < r}. For a general set X this concept of open sets can be formalized by the following set of axioms. Denition 2.1. A topology on a set X is a collection T of subsets of X such that (1) and X are in T . (2) The union of an arbitrary collection of elements of T is in T . (3) The intersection of a nite collection of elements of T is in T . The pair (X, T ) is called a topological space. However, it is a standard abuse of notation to leave out the topology T and simply refer to the topological space X. Elements of T are called open sets. A set C X is a closed set if its complement, X C, is open. Unlike doors, a set can be both open and closed, and there can be sets that are neither open nor closed. Notice that the sets and X are both open and closed. Example 2.1. Any set X can be given a topology consisting of only and X being open sets. This topology is called the trivial topology on X. Another simple topology is the discrete topology on X, where any subset of X is an open set. 9 Denition 2.2. A basis for a topology on a set X is a collection B of subsets of X such that (1) For each x X there exists a B B containing x. (2) If B1 , B2 B and x B1 B2 , then there exists a B3 B1 B2 such that x B3 . The basis B generates a topology T by dening a set U X to be open if for each x U there exists a basis element B B with x B O. The reader can check that this does indeed dene a topology. Also, the reader should check that the generated topology T consists of all unions of elements of B. Example 2.2. The motivating example of a topological space is Euclidean space Rn . It is typically given the standard topological structure generated by the basis of open balls B(x, r) = {y Rn : x y < r} for all x Rn , r R. Therefore, a set in Rn is open if and only if it is the union of a collection of open balls. Examples of closed sets in Rn include sets of discrete points, vector subspaces, and closed balls, i.e., sets of the form B(x, r) = {y Rn : x y r}. 2.1.2 Metric spaces Notice that the topology on Rn is dened entirely by the Euclidean distance between points. This method for dening a topology can be generalized to any space where a distance is dened. Denition 2.3. A metric space is a set X with a function d : X X R that satises (1) d(x, y) 0, and d(x, y) = 0 if and only if x = y. (2) d(x, y) = d(y, x). (3) d(x, y) + d(y, z) d(x, z). The function d above is called a metric or distance function. Using the distance function of a metric space, a basis for a topology on X can be dened as the collection of open balls B(x, r) = {y X : d(x, y) < r} for all x X, r R. From now on when a metric space is discussed, it is assumed that it is given this topology. One special property of metric spaces will be important in the review of manifold theory. Denition 2.4. A metric d on a set X is called complete if every Cauchy sequence converges in X. A Cauchy sequence is a sequence x1 , x2 , . . . X such that for any > 0 there exists an integer N such that d(xi , xj ) < for all i, j > N . 10 2.1.3 Continuity As was mentioned at the beginning of this section, topology developed from the desire to generalize the notion of continuity of mappings of Euclidean spaces. That generalization is phrased as follows: Denition 2.5. Let X and Y be topological spaces. A mapping f : X Y is continuous if for each open set U Y , the set f 1 (U ) is open in X. It is easy to check that for a mapping f : Rd Rn the above denition is equivalent to the standard - denition. Denition 2.6. Again let X and Y be topological spaces. A mapping f : X Y is a homeomorphism if it is bijective and both f and f 1 are continuous. In this case X and Y are said to be homeomorphic. When X and Y are homeomorphic, there is a bijective correspondence between both the points and the open sets of X and Y . Therefore, as topological spaces, X and Y are indistinguishable. This means that any property or theorem that holds for the space X that is based only on the topology of X also holds for Y . 2.1.4 Various Topological Properties This section is a discussion of some special properties that a topological space may possess. The particular properties that are of interest are the ones that are important for the study of manifolds. Denition 2.7. A topological space X is said to be Hausdor if for any two distinct points x, y X there exist disjoint open sets U and V with x U and y V . Notice that any metric space is a Hausdor space. Given any two distinct points x, y in a metric space X, we have d(x, y) > 0. Then the two open balls B(x, r) and B(y, r), 1 where r = 2 d(x, y), are disjoint open sets containing x and y, respectively. However, not all topological spaces are Hausdor. For example, take any set X with more than one point and give it the trivial topology, i.e., and X as the only open sets. Denition 2.8. Let X be a topological space. A collection O of open subsets of X is said to be an open cover if X = U O U . A topological space X is said to be compact if for any open cover O of X there exists a nite subcollection of sets from O that covers X. 11 The Heine-Borel theorem (see [108], Theorem 2.41) gives intuitive criteria for a subset of Rn to be compact. It states that any closed and bounded subset of Rn is compact. Thus, for example, a closed ball B(x, r) is compact as is the unit sphere S n1 = {x Rn : x = 1}. The sphere, like Euclidean space, will be an important example throughout this chapter. This is partly because it is a simple example of a symmetric space, but also because it is an integral part of the medial representation used later. Denition 2.9. A separation of a topological space X is a pair of disjoint open sets U, V such that X = U V . If no separation of X exists, it is said to be connected. 2.2 Dierentiable Manifolds Dierentiable manifolds are spaces that locally behave like Euclidean space. Much in the same way that topological spaces are natural for talking about continuity, dierentiable manifolds are a natural setting for calculus. Notions such as dierentiation, integration, vector elds, and dierential equations make sense on dierentiable manifolds. This section gives a review of the basic formulations that will be needed later. A good introduction to the subject may be found in [15]. For a comprehensive overview of dierential geometry see [113,114,115,116,117]. Other good references include [2,89,58]. 2.2.1 Topological Manifolds A manifold is a topological space that is locally equivalent to Euclidean space. More precisely, Denition 2.10. A manifold is a Hausdor space M with a countable basis such that for each point p M there is a neighborhood U of p that is homeomorphic to Rn for some integer n. At each point p M the dimension n of the Rn in Denition 2.10 is unique. If the integer n is the same for every point in M , then M is called a n-dimensional manifold. The simplest example of a manifold is Rn , since it is trivially homeomorphic to itself. Likewise, any open set of Rn is also a manifold. 12 x U M x(U ) Figure 2.1: A local coordinate system (x, U ) on a manifold M . 2.2.2 Dierentiable Structures on Manifolds The next step in the development of the theory of manifolds is to dene a notion of dierentiation of manifold mappings. Dierentiation of mappings in Euclidean space is dened as a local property. Although a manifold is locally homeomorphic to Euclidean space, more structure is required to make dierentiation possible. First, recall that a function on Euclidean space f : Rn R is smooth or C if all of its partial derivatives exist. A mapping of Euclidean spaces f : Rm Rn can be thought of as a n-tuple of real-valued functions on Rm , f = (f 1 , . . . , f n ), and f is smooth if each f i is smooth. Given two neighborhoods U, V in a manifold M , two homeomorphisms x : U Rn and y : V Rn are said to be C -related if the mapping x y 1 : y(U V ) x(U V ) is C . The pair (x, U ) is called a chart or coordinate system, and can be thought of as assigning a set of coordinates to points in the neighborhood U (see Figure 2.1). That is, any point p U is assigned the coordinates x1 (p), . . . , xn (p). As will become apparent later, coordinate charts are important for writing local expressions for derivatives, tangent vectors, and Riemannian metrics on a manifold. A collection of charts whose domains cover M is called an atlas. Denition 2.11. An atlas A on a manifold M is said to be maximal if for any any other atlas A on M any coordinate chart (x, U ) A is also a member of A. Denition 2.12. A smooth structure on a manifold M is a maximal atlas A on M . 13 The manifold M along with such an atlas is termed a smooth manifold. The next theorem demonstrates that it is not necessary to dene every coordinate chart in a maximal atlas, but rather, one can dene enough compatible coordinate charts to cover the manifold. Theorem 2.1. Given a manifold M with an atlas A, there is a unique maximal atlas A such that A A . Example 2.3. Consider the sphere S 2 as a subset of R3 . The upper hemisphere U = {(x, y, z) S 2 : z > 0} is an open neighborhood in S 2 . Now consider the homeomorphism : S 2 R2 given by : (x, y, z) (x, y). This gives a coordinate chart (, U ). Similar charts can be produced for the lower hemisphere, and for hemispheres in the x and y dimensions. The reader may check that these charts are C -related and cover S 2 . Therefore, these charts make up an atlas on S 2 and by Theorem 2.1 there is a unique maximal atlas containing these charts that makes S 2 a smooth manifold. A similar argument can be used to show that the n-dimensional sphere, S n , for any n 1 is also a smooth manifold. Now consider a function f : M R on the smooth manifold M . This function is said to be a smooth function if for every coordinate chart (x, U ) on M the function f x1 : U R is smooth. More generally, a mapping f : M N of smooth manifolds is said to be a smooth mapping if for each coordinate chart (x, U ) on M and each coordinate chart (y, V ) on N the mapping y f x1 : x(U ) y(V ) is a smooth mapping. Notice that the mapping of manifolds was converted locally to a mapping of Euclidean spaces, where dierentiability is easily dened. As in the case of topological spaces, there is a desire to know when two smooth manifolds are equivalent. This should mean that they are homeomorphic as topological spaces and also that they have equivalent smooth structures. This notion of equivalence is given by Denition 2.13. Given two smooth manifolds M, N , a bijective mapping f : M N is called a dieomorphism if both f and f 1 are smooth mappings. 14 2.2.3 Tangent Spaces Given a manifold M Rd , it is possible to associate a linear subspace of Rd to each point p M called the tangent space at p. This space is denoted Tp M and is intuitively thought of as the linear subspace that best approximates M in a neighborhood of the point p. Vectors in this space are called tangent vectors at p. Tangent vectors can be thought of as directional derivatives. Consider a smooth curve : ( , ) M with (0) = p. Then given any smooth function1 f : M R, the composition f is a smooth function, and the following derivative exists: d (f )(0). dt This leads to an equivalence relation between smooth curves passing through p. Namely, if 1 and 2 are two smooth curves passing through the point p at t = 0, then 1 2 if d d (f 1 )(0) = (f 2 )(0), dt dt for any smooth function f : M R. A tangent vector is now dened as one of these equivalence classes of curves. It can be shown (see [2]) that these equivalence classes form a vector space, i.e., the tangent space Tp M , which has the same dimension as M . Given a local coordinate system (x, U ) containing p, a basis for the tangent space Tp M is given by the partial derivative operators /xi , which are the tangent vectors associated with the coordinate curves of x. Example 2.4. Again, consider the sphere S 2 as a subset of R3 . The tangent space at a point p S 2 is the set of all vectors in R3 perpendicular to p, i.e., Tp S 2 = {v R3 : v, p = 0}. This is of course a two-dimensional vector space, and it is the space of all tangent vectors at the point p for smooth curves lying on the sphere and passing through the point p. A vector eld on a manifold M is a function that smoothly assigns to each point p M a tangent vector Xp Tp M . This mapping is smooth in the sense that the components of the vectors may be written as smooth functions in any local coordinate system. A vector eld may be seen as an operator X : C (M ) C (M ) that maps a smooth function f C (M ) to the smooth function Xf : p Xp f . In other words, the directional derivative is applied at each point on M . Strictly speaking, the tangent vectors at p are dened as directional derivatives of smooth germs of functions at p, which are equivalence classes of functions that agree in some neighborhood of p. 1 15 For two manifolds M and N a smooth mapping : M N induces a linear mapping of the tangent spaces : Tp M T(p) N called the dierential of . It is given by (Xp )f = Xp (f ) for any vector Xp Tp M and any smooth function f C (M ). A smooth mapping of manifolds does not always induce a mapping of vector elds (for instance, when the mapping is not onto). However, a related concept is given in the following denition. Denition 2.14. Given a mapping of smooth manifolds : M N , a vector eld X on M and a vector eld Y on N are said to be -related if (X(p)) = Y (q) holds for each q N and each p 1 (q). 2.3 Riemannian Geometry As mentioned at the beginning of this chapter, the idea of distances on a manifold will be important in the denition of manifold statistics. The notion of distances on a manifold falls into the realm of Riemannian geometry. This section briey reviews the concepts needed. A good crash course in Riemannian geometry can be found in [88]. Also, see the books [15, 113, 114, 77]. Recall the denition of length for a smooth curve in Euclidean space. Let : [a, b] Rd be a smooth curve segment. Then at any point t0 [a, b] the derivative of the curve (t0 ) gives the velocity of the curve at time t0 . The length of the curve segment is given by integrating the speed of the curve, i.e., b L() = a (t) dt. The denition of the length functional thus requires the ability to take the norm of tangent vectors. On manifolds this is handled by the denition of a Riemannian metric. 2.3.1 Riemannian Metrics Denition 2.15. A Riemannian metric on a manifold M is a function that smoothly assigns to each point p M an inner product , on the tangent space Tp M . A Riemannian manifold is a smooth manifold equipped with such a Riemannian metric. Now the norm of a tangent vector v Tp M is dened as v = v, v 2 . Given local coordinates x1 , . . . , xn in a neighborhood of p, the coordinate vectors v i = /xi at p 1 16 form a basis for the tangent space Tp M . The Riemannian metric may be expressed in this basis as an n n matrix g, called the metric tensor, with entries given by gij = v i , v j . The gij are smooth functions of the coordinates x1 , . . . , xn . Given a smooth curve segment : [a, b] M , the length of can be dened just as in the Euclidean case as b L() = a (t) dt, (2.1) where now the tangent vector (t) is a vector in T(t) M , and the norm is given by the Riemannian metric at (t). Given a manifolds M and a manifold N with Riemannian metric , , a mapping : M N induces a metric , on M dened as Xp , Yp = (Xp ), (Yp ) . This metric is called the pull-back metric induced by , as it maps the metric in the opposite direction of the mapping . 2.3.2 Geodesics In Euclidean space the shortest path between two points is a straight line, and the distance between the points is measured as the length of that straight line segment. This notion of shortest paths can be extended to Riemannian manifolds by considering the problem of nding the shortest smooth curve segment between two points on the manifold. If : [a, b] M is a smooth curve on a Riemannian manifold M with endpoints (a) = x and (b) = y, a variation of keeping endpoints xed is a family of smooth curves: : ( , ) [a, b] M, such that 1. (0, t) = (t), 2. (s0 ) : t (s0 , t) is a smooth curve segment for xed s0 ( , ), 3. (s, a) = x, and (s, b) = y for all s ( , ). Now the shortest smooth path between the points x, y M can be seen as nding a critical point for the length functional (2.1), where the length of is considered as a 17 function of s. The path = (0) is a critical path for L if dL( (s)) ds = 0. s=0 It turns out to be easier to work with the critical paths of the energy functional, which is given by b E() = a (t) 2 dt. It can be shown (see [113]) that a critical path for E is also a critical path for L. Conversely, a critical path for L, once reparameterized proportional to arclength, is a critical path for E. Thus, assuming curves are parameterized proportional to arclength, there is no distinction between curves with minimal length and those with minimal energy. A critical path of the functional E is called a geodesic. Given a chart (x, U ) a geodesic curve U can be written in local coordinates as (t) = ( 1 (t), . . . , n (t)). Using any such coordinate system, satises the following dierential equation (see [113] for details): d2 k d i d j = k ((t)) . ij dt2 dt dt i,j=1 The symbols k are called the Christoel symbols and are dened as ij k ij 1 = 2 n n (2.2) g kl l=1 gjl gil gij + j xi x xl , where g ij denotes the entries of the inverse matrix g 1 of the Riemannian metric. Example 2.5. In Euclidean space Rn the Riemannian metric is given by the identity matrix at each point p Rn . Since the metric is constant, the Christoel symbols are zero. Therefore, the geodesic equation (2.2) reduces to d2 k = 0. dt2 The only solutions to this equation are straight lines, so geodesics in Rn must be straight lines. Given two points on a Riemannian manifold, there is no guarantee that a geodesic exists between them. There may also be multiple geodesics connecting the two points, 18 i.e., geodesics are not guaranteed to be unique. Moreover, a geodesic does not have to be a global minimum of the length functional, i.e., there may exist geodesics of dierent lengths between the same two points. The next two examples demonstrate these issues. Example 2.6. Consider the plane with the origin removed, R2 {0}, with the same metric as R2 . Geodesics are still given by straight lines. There does not exist a geodesic between the two points (1, 0) and (1, 0). Example 2.7. Geodesics on the sphere S 2 are given by great circles, i.e., circles on the sphere with maximal diameter. This fact will be shown later in the section on symmetric spaces. There are an innite number of equal-length geodesics between the north and south poles, i.e., the meridians. Also, given any two points on S 2 that are not antipodal, there is a unique great circle between them. This great circle is separated into two geodesic segments between the two points. One geodesic segment is longer than the other. The idea of a global minimum of length leads to a denition of a distance metric d : M M R (not to be confused with the Riemannian metric). It is dened as d(p, q) = inf{L() : a smooth curve between p and q}. If there is a geodesic between the points p and q that realizes this distance, i.e., if L() = d(p, q), then is called a minimal geodesic. Minimal geodesics are guaranteed to exist under certain conditions, as described by the following denition and the HopfRinow Theorem below. Denition 2.16. A Riemannian manifold M is said to be complete if every geodesic segment : [a, b] M can be extended to a geodesic from all of R to M . The reason such manifolds are called complete is revealed in the next theorem. Theorem 2.2 (Hopf-Rinow). If M is a complete, connected Riemannian manifold, then the distance metric d(, ) induced on M is complete. Furthermore, between any two points on M there exists a minimal geodesic. Example 2.8. Both Euclidean space Rn and the sphere S 2 are complete. A straight line in Rn can extend in both directions indenitely. Also, a great circle in S 2 extends indenitely in both directions (even though it wraps around itself). As guaranteed by the Hopf-Rinow Theorem, there is a minimal geodesic between any two points in Rn , i.e., 19 X Expp (X) p Tp M M Figure 2.2: The Riemannian exponential map. the unique straight line segment between the points. Also, between any two points on the sphere there is a minimal geodesic, i.e., the shorter of the two great circle segments between the two points. Of course, for antipodal points on S 2 the minimal geodesic is not unique. Given initial conditions (0) = p and (0) = v, the theory of second-order partial dierential equations guarantees the existence of a unique solution to the dening equation for (2.2) at least locally. Thus, there is a unique geodesic with (0) = p and (0) = v dened in some interval ( , ). When the geodesic exists in the interval [0, 1], the Riemannian exponential map at the point p (see Figure 2.2), denoted Expp : Tp M M , is dened as Expp (v) = (1). If M is a complete manifold, the exponential map is dened for all vectors v Tp M . Theorem 2.3. Given a Riemannian manifold M and a point p M , the mapping Expp is a dieomorphism in some neighborhood U Tp M containing 0. This theorem implies that the Expp has an inverse dened at least in the neighborhood Expp (U ) of p, where U is the same as in Theorem 2.3. Not surprisingly, this inverse is called the Riemannian log map and denoted by Logp : Expp (U ) Tp M . 20 Denition 2.17. An isometry is a dieomorphism : M N of Riemannian manifolds that preserves the Riemannian metric. That is, if , M and , N are the metrics for M and N , respectively, then , N = , M . It follows from the denitions that an isometry preserves the length of curves. That is, if c is a smooth curve on M , then the curve c is a curve of the same length on N . Also, the image of a geodesic under an isometry is again a geodesic. 2.4 Lie Groups The set of all possible translations of Euclidean space Rn is again the space Rn . A point p Rn is transformed by the vector v Rn by vector addition, p+v. This transformation has a unique inverse transformation, namely, translation by the negated vector, v. The operation of translation is a smooth mapping of the space Rn . Composing two translations (i.e., addition in Rn ) and inverting a translation (i.e., negation in Rn ) are also smooth mappings. A set of transformations with these properties, i.e., a smooth manifold with smooth group operations, is known as a Lie group. Many other interesting transformations of Euclidean space are Lie groups, including rotations, reections, and magnications. However, Lie groups also arise more generally as smooth transformations of manifolds. This section is a brief introduction to Lie groups. More detailed treatments may be found in [15, 36, 54, 58, 69, 113]. It is assumed that the reader knows the basics of group theory (see [59] for an introduction), but the denition of a group is listed here for reference. Denition 2.18. A group is a set G with a binary operation, denoted here by concatenation, such that 1. (xy)z = x(yz), for all x, y, z G, 2. there is an identity, e G, satisfying xe = ex = x, for all x G, 3. each x G has an inverse, x1 G, satisfying xx1 = x1 x = e. As stated at the beginning of this section, a Lie group adds a smooth manifold structure to a group. Denition 2.19. A Lie group G is a smooth manifold that also forms a group, where the two group operations, (x, y) xy x x1 : : GGG GG Multiplication Inverse 21 are smooth mappings of manifolds. Example 2.9. The space of all n n non-singular matrices forms a Lie group called the general linear group, denoted GL(n). The group operation is matrix multiplication, 2 and GL(n) can be given a smooth manifold structure as an open subset of Rn . The equations for matrix multiplication and inverse are smooth operations in the entries of the matrices. Thus, GL(n) satises the requirements of a Lie group in Denition 2.19. A matrix group is any closed subgroup of GL(n). Matrix groups inherit the smooth 2 structure of GL(n) as a subset of Rn and are thus also Lie groups. The books [30, 54] focus on the theory of matrix groups. Example 2.10. The n n rotation matrices are a closed matrix subgroup of GL(n) and thus form a Lie group. This group is called the special orthogonal group and is dened as SO(n) = {R GL(n) : RT R = I and det(R) = 1}. This space is a closed 2 and bounded subset of Rn , so it is compact by the Heine-Borel theorem. Given a point y in a Lie group G, it is possible to dene the following two dieomorphisms: Ly : x yx Ry : x xy (Left multiplication) (Right multiplication) A vector eld X on a Lie group G is called left-invariant if it is invariant under left multiplication, i.e., Ly X = X for every y G. Right-invariant vector elds are dened similarly. A left-invariant (or right-invariant) vector eld is uniquely dened by its value on the tangent space at the identity, Te G. Recall that vector elds on G can be seen as operators on the space of smooth functions, C (G). Thus two vector elds X and Y can be composed to form another operator XY on C (G). However, the operator XY is not necessarily vector eld. Surprisingly, however, the operator XY Y X is a vector eld on G. This leads to a denition of the Lie bracket of vector elds X, Y on G, dened as [X, Y ] = XY Y X. (2.3) Denition 2.20. A Lie algebra is a vector space V equipped with a bilinear product [, ] : V V V , called a Lie bracket, that satises (1) [X, Y ] = [Y, X], 22 (2) [[X, Y ], Z] + [[Y, Z], X] + [[Z, X], Y ] = 0, for all X, Y, Z V. The tangent space of a Lie group G, typically denoted g (a German Fraktur font), forms a Lie algebra. The Lie bracket on g is induced by the Lie bracket on the cor responding left-invariant vector elds. If X, Y are two vectors in g, then let X, Y be the corresponding unique left-invariant vector elds on G. Then the Lie bracket on g is given by [X, Y ] = [X, Y ](e). The Lie bracket provides a test for whether the Lie group G is commutative. A Lie group G is commutative if and only if the Lie bracket on the corresponding Lie algebra g is zero, i.e., [X, Y ] = 0 for all X, Y g. Example 2.11. The Lie algebra for Euclidean space Rn is again Rn . The Lie bracket is zero, i.e., [X, Y ] = 0 for all X, Y Rn . In fact, the Lie bracket for the Lie algebra of any commutative Lie group is always zero. Example 2.12. The Lie algebra for GL(n) is gl(n), the space of all real n n matrices. The Lie bracket operation for X, Y gl(n) is given by [X, Y ] = XY Y X. Here the product XY denotes actual matrix multiplication, which turns out to be the same as composition of the vector eld operators (compare to (2.3)). All Lie algebras corresponding to matrix groups are subalgebras of gl(n). Example 2.13. The Lie algebra for the rotation group SO(n) is so(n), the space of skew-symmetric matrices. A matrix A is skew-symmetric if A = AT . The following theorem will be important later. Theorem 2.4. A direct product G1 Gn of Lie groups is also a Lie group. 2.4.1 Lie Group Exponential and Log Maps Denition 2.21. A mapping of Lie groups : G1 G2 is called a Lie group homomorphism if it is a smooth mapping and a homomorphism of groups, i.e., (e1 ) = e2 , where e1 , e2 are the respective identity elements of G1 , G2 , and (gh) = (g)(h) for all g, h G1 . 23 The image of a Lie group homomorphism h : R G is called a one-parameter subgroup. A one-parameter subgroup is both a smooth curve and a subgroup of G. This does not mean, however, that any one-parameter subgroup is a Lie subgroup of G (it can fail to be an imbedded submanifold of G, which is required to be a Lie subgroup of G). As the next theorem shows, there is a bijective correspondence between the Lie algebra and the one-parameter subgroups. Theorem 2.5. Let g be the Lie algebra of a Lie group G. Given any vector X g there is a unique Lie group homomorphism hX : R G such that hX (0) = X. The Lie group exponential map, exp : g G, not to be confused with the Riemannian exponential map, is dened by exp(X) = hX (1). Example 2.14. For the Lie group Rn the unique Lie group homomorphism hX : R Rn in Theorem 2.5 is given by hX (t) = tX. Therefore, one-parameter subgroups are given by straight lines at the origin. The Lie group exponential map is the identity. In this case the Lie group exponential map is the same as the Riemannian exponential map at the origin. This is not always the case, however, as will be shown later. For matrix groups the Lie group exponential map of a matrix X gl(n) is computed by the formula 1 k X . (2.4) exp(X) = k! k=0 This series converges absolutely for all X gl(n). Example 2.15. For the Lie group of 3D rotations, SO(3), the matrix exponential map takes a simpler form. For a matrix X so(3) the following identity holds: X 3 = X, where = 1 tr(X T X). 2 Substituting this identity into the innite series (2.4), the exponential map for so(3) can now be reduced to I, = 0, exp(X) = I + sin X + 1 cos X 2 , (0, ). 2 24 The Lie group log map for a rotation matrix R SO(3) is given by I, = 0, (R RT ), || (0, ), 2 sin log(R) = where tr(R) = 2 cos + 1. The exponential map for 3D rotations has an intuitive meaning. Any vector X so(3), i.e., a skew-symmetric matrix, may be written in the form 0 z y X= z 0 x . y x 0 If v = (x, y, z) R3 , then the rotation matrix given by the exponential map exp(X) is a 3D rotation by angle = v about the unit axis v/ v . 2.4.2 Bi-invariant Metrics Denition 2.22. A Riemannian metric , on a Lie group G is said to be a biinvariant metric if it is invariant under both right and left multiplication, that is, Rg , = L , = , for all g G. g Theorem 2.6. For a Lie group G with bi-invariant metric the Lie group exponential map agrees with the Riemannian exponential map at the identity, that is, for any tangent vector X g exp(X) = Expe (X). Using the left-invariance of the Riemannian metric, any geodesic at a point g G may be written as the left multiplication of a geodesic at the identity. That is, the geodesic with initial conditions (0) = g and (0) = Lg (X) is given by (t) = g exp(tX). Theorem 2.7. A compact Lie group G has a unique bi-invariant metric (up to scale). 25 2.5 Symmetric Spaces Briey, a Riemannian symmetric space is a connected manifold M such that at each point the mapping that reverses geodesics through that point is an isometry. For a detailed treatment of symmetric spaces see the standard texts [15,58]. Common examples of symmetric spaces are Euclidean spaces, Rn , spheres, S n , and hyperbolic spaces, H n . Symmetric spaces, and the methods for computing geodesics and distances on them, arise naturally from certain Lie group actions on manifolds. A few preliminary denitions about mappings of sets are needed before symmetric spaces can be dened. Let X be a set and be any mapping of X into itself. A point x X is called a xed point of if (x) = x. The mapping is called involutive if is not the identity mapping, but its square is, i.e., = id. Denition 2.23. A symmetric space is a connected Riemannian manifold M such that at each point p M there is an involutive isometry p : M M that has p as an isolated xed point. The term isolated means that there is a neighborhood U of p such that p is the only point in U that is a xed point of p . This denition is somewhat illusive in that it is hard to get an intuitive feel for what kinds of manifolds are symmetric spaces. Fortunately, this denition is sucient to imply very nice properties of symmetric spaces. These properties are explained below, and the interested reader is referred to the appropriate references for derivations. The next theorem (see [15], Lemma 8.2 and Theorem 8.4) shows that the involutive isometry p in Denition 2.23 is more easily seen as the map that reverses geodesics through the point p. Theorem 2.8. A Riemannian symmetric space is complete, and if p is an involutive isometry of M , then p is a reection of the tangent space Tp M , i.e., p (X) = X, and p reverses geodesics through p, i.e., p (Expp (X)) = Expp (X) for all X Tp M such that those geodesics exist. As will be shown later, symmetric spaces arise naturally from certain Lie group transformations of a manifold M . This formulation requires a background to Lie group actions. 26 2.5.1 Lie Group Actions Denition 2.24. Given a smooth manifold M and a Lie group G, a smooth group action of G on M is a smooth mapping G M M , written (g, p) g p, such that for all g, h G and all p M 1. e p = p, 2. (gh) p = (g (h p)). The group action should be thought of as a transformation of the manifold M , just as matrices are transformations of Euclidean space. The orbit of a point p M is dened as G(p) = {g p : g G}. In the case that M consists of a single orbit, we call M a homogeneous space and say that the group action is transitive. The isotropy subgroup of p is dened as Gp = {g G : gp = p}, i.e., Gp is the subgroup of G that leaves the point p xed. Let H be a closed Lie subgroup of the Lie group G. Then the left coset of an element g G is dened as gH = {gh : h H}. The space of all such cosets is denoted G/H and is a smooth manifold. There is a natural bijection G(p) G/Gp given by the = mapping g p gGp . Now let M be a symmetric space and choose an arbitrary base point p M . We can always write M as a homogeneous space M = G/Gp , where G is a connected group of isometries of M , and the isotropy subgroup Gp is compact. The fact that G is a group of isometries means that d(p, q) = d(g p, g q), for all p, q M , g G. An element g G induces a smooth mapping g : M M via the group action, dened as g (p) = gp. Also, this mapping has a smooth inverse, namely g1 . Therefore, g is a dieomorphism. Denition 2.25. Given a Lie group action of G on a manifold M , a G-invariant Riemannian metric , on M is a metric such that the mapping g is an isometry for all g G, i.e., , . g Example 2.16. The standard Euclidean metric on Rn is invariant under the SO(n) group action. In other words, a rotation of Euclidean space is an isometry. The action of Rn on itself by translations is another example of a group of isometries. These two groups can be combined to form the special Euclidean group, SE(n) = SO(n) Rn . The semi-direct product means that SE(n) as a set is the direct product of SO(n) n and R , but multiplication is given by the formula (R1 , v1 ) (R2 , v2 ) = (R1 R2 , R1 v2 + v1 ). 27 2.5.2 Symmetric Spaces as Lie Group Quotients The following theorem (see [15], Theorem 9.1) provides criteria for a manifold to possess a G-invariant metric. Theorem 2.9. Consider a Lie group G acting transitively on a manifold M . If for some point p M the isotropy subgroup Gp is a connected, compact Lie subgroup of G, then M has a G-invariant metric. Symmetric spaces arise naturally from homogeneous spaces with G-invariant metrics, as the next theorem shows (see [15], Theorem 9.2 and Corollary 9.3). Theorem 2.10. Suppose that G, M , and p satisfy the conditions of Theorem 2.9. If : G G is an involutive automorphism2 with xed set Gp , then M is a symmetric space. The converse to Theorem 2.10 is also true, as shown in the next theorem (see [58], Theorem 3.3). Theorem 2.11. If M is a symmetric space and p any point in M , then M is dieomorphic to the Lie group quotient G/Gp , where G = I0 (M ) is the connected component of the Lie group of isometries of M and Gp is the compact Lie subgroup of G that leaves the point p xed. Furthermore, there is an involutive automorphism : G G that leaves Gp xed. Theorem 2.12. A connected Lie group G with bi-invariant metric is a symmetric space. Example 2.17. Euclidean space Rn is a symmetric space, as can be seen by Theorem 2.12. The involutive isometry p is given by reection about p, i.e., p reverses lines through p by the equation p (q) = 2p q. Geodesics on a symmetric space M = G/Gp are computed through the group action. Since G is a group of isometries acting transitively on M , it suces to consider only geodesics starting at the base point p. For an arbitrary point q M , geodesics starting at q are of the form g , where q = g p and is a geodesic with (0) = p. Geodesics are the image of the action of a one-parameter subgroup of G acting on the base point p, as the next theorem shows. 2 Recall that an automorphism of a group G is an isomorphism of G onto itself. 28 Theorem 2.13. If M is a symmetric space with G-invariant metric, as in Theorem 2.10, then a geodesic starting at the point p M is of the form (t) = exp(tX) p, where X is a vector in the Lie algebra g. Example 2.18. The sphere S 2 is a symmetric space. The rotation group SO(3) acts transitively on S 2 , that is, for any two unit vectors x, y there is a rotation R such that Rx = y. The north pole p = (0, 0, 1) is left xed by any rotation of the x-y plane. Therefore, the isotropy subgroup for p is equivalent to SO(2). The sphere can thus be written as the homogeneous space S 2 = SO(3)/SO(2). The involutive isometry p is given by reection about p, i.e., a rotation of the sphere about the axis p by an angle of . The geodesics at the base point p = (0, 0, 1) are the great circles through p, i.e., the meridians. Geodesics at an arbitrary point in S 2 are also great circles, i.e., rotated versions of the meridians. As Theorem 2.13 shows, these geodesics are realized by the group action of a one-parameter subgroup of SO(3). Such a subgroup consists of all rotations about a xed axis in R3 perpendicular to p. We consider a tangent vector in Tp S 2 as a vector v = (v1 , v2 , 0) in the x-y plane. Then the exponential map is given by Expp (v) = v1 sin v sin v , v2 , cos v v v , (2.5) 2 2 where v = v1 + v2 . This equation can be derived as a sequence of two rotations that rotate the base point p = (0, 0, 1) to the point Expp (v). The rst is a rotation about the y-axis by an angle of y = v . The second, aligning the geodesic with the tangent vector v, is a rotation about the z-axis by an angle of z , where cos(z ) = v1 / v and sin(z ) = v2 / v . The corresponding log map for a point x = (x1 , x2 , x3 ) S 2 is given by Logp (x) = x1 , x2 sin sin , (2.6) where = arccos(x3 ) is the spherical distance from the base point p to the point x. Notice that the antipodal point p is not in the domain of the log map. Chapter 3 Image Analysis Background This chapter provides the necessary background to the aspects of image analysis that are relevant to this dissertation. It begins in Section 3.1 with an overview of the statistical theory of shape. This theory is an important tool in deformable models, which are discussed in Section 3.2. Medial representations, the particular type of deformable model used in this work, are presented in Section 3.3. Finally, in Section 3.4 the necessary background for diusion tensor imaging is covered. 3.1 Statistical Shape Theory Statistical shape analysis is emerging as an important tool for understanding anatomical structures from medical images. Given a set of training images, the goal is to model the geometric variability of the anatomical structures within a class of images. Statistical models give an ecient parameterization of the geometric variability of anatomy. These models can provide shape constraints during image segmentation [27]. Also, statistical descriptions of shape are useful in understanding the processes behind growth and disease [29]. The study of anatomical shape and its relation to biological growth and function dates back to the landmark work of DArcy W. Thompson in 1917 [127]. This section is a review of several key concepts in statistical shape theory. Subsection 3.1.1 is a review of the shape theory of point sets introduced by David G. Kendall in a brief note in 1977 [71] and detailed in his 1984 paper [72]. Similar ideas in the theory of shape were independently developed by Fred L. Bookstein [12, 13]. In the Kendall and Bookstein theories of shape an object is represented by a nite set of points in Euclidean space Rn . In medical image analysis these points may represent a sampling of the boundary of an organ or important landmarks in a medical image. Subsection 3.1.2 is an overview 30 of methods for aligning geometric objects to a common position, orientation, and scale. Alignment is an important preprocessing step that is necessary for analyzing the shape dierences between objects. Subsection 3.1.3 discusses the common linear methods used to analyze the statistical variability of shape. Subsection 3.1.4 reviews several methods that have been proposed for the statistical analysis of nonlinear geometric data. For a more in-depth overview of shape theory, including applications beyond the realm of medical image analysis, see the books [14, 35, 111] and the review article [73]. Of these references the book by Dryden and Mardia [35] is the easiest to digest. Readers interested in more of the mathematical details of shape theory are encouraged to read Smalls book [111]. 3.1.1 Point Set Shape Spaces Shape is often dened as the geometry of objects that is invariant under translation, rotation, and scaling. This denition of shape provides an equivalence relation between objects, that is, two objects have the same shape if one can be transformed into the other by only a translation, rotation, and scaling (see Fig. 3.1). A shape space is a space in which each point represents an entire equivalence class of objects under this relation. This subsection is a review of Kendalls shape spaces of point sets in Rn [72]. The rst step in the construction of these shape spaces is to dene the transformation group of combined translations, rotations, and scalings. This group leads to an action on point sets in Rn . Shape spaces will be dened as the orbit spaces under this action; that is, point sets that can be transformed into each other under this action will be associated with the same point in shape space. A similarity transform of Rn is a combined translation, rotation, and scale of Rn . The space of all such transformations can be written as the Lie group Sim(n) = (SO(n) R+ ) Rn (recall the denition of the special Euclidean group in Example 2.16). An element S Sim(n) can be written as an (n + 1) (n + 1) matrix in the block form sR v S= T , 0 1 where v Rn , s R+ , and R SO(3). Composition of two similarity transformations is now achieved by multiplying their representative matrices. Writing a vector x Rn in homogeneous coordinates, i.e., as the column vector (x, 1)T , a similarity transform 31 Figure 3.1: Three objects that have the same shape, yet have dierent positions, orientations, and scales. matrix acts on x by Sx= sR v 0T 1 x . 1 Now consider a collection of k points x1 , . . . , xk in n-dimensional Euclidean space. This collection should be thought of as the vector x = (x1 , . . . , xn ) in Rnk . The conguration where all points are equal is not allowed. That is, the n-dimensional linear subspace V = {(x1 , . . . , xk ) Rnk : x1 = x2 = . . . = xk } is subtracted from the set Rnk to form the space of legal point congurations, Rnk \V . A similarity transformation S Sim(n) acts on Rnk \V by applying S to each of the points in the collection: S x = (S x1 , . . . , S xk ). The shape space k of k points in Rn is dened as the set of orbits under this acn tion. Recall that two points x and y are in the same orbit if there exists a similarity transformation S such that S x = y. The shape space k can be constructed by removing the translation, scale, and n rotation eects from the space Rnk \V . Consider the point set x = (x1 , . . . , xk ). The center of mass of these points is given by x = (1/k) k xk . Then x is determined up i=1 to translation by the point set x = (x1 x, . . . , xk x). The space of all such point sets 32 with zero centroid can be identied with the space Rn(k1) \{0}. The scale eects can be removed by dividing the point x by its Euclidean norm in Rnk . Therefore, the space of all scale-normalized point sets with zero centroid is a sphere of dimension n(k 1) 1. k This is called the preshape space Sn . Finally, the shape space k is the set of orbits n k k under the action of the rotation group SO(n), that is, n = Sn /SO(n). Recall that this k quotient of spaces means that points in Sn that can be transformed into each other by a k rotation in SO(n) are associated with the same point in the quotient space Sn /SO(n). The topology of the shape space k is somewhat harder to understand. For data n on the real line, i.e., n = 1, the rotation group SO(1) consists of only the identity transformation. Thus, k is identical to the preshape space, which is the sphere S k2 . 1 For planar data, n = 2, it helps to consider the plane R2 as the set of complex numbers C. In this case it can be shown (see [111]) that the shape space k is equivalent to the 2 complex projective space CP(k 2). Complex projective space CP(n) is the manifold of all one-dimensional complex subspaces of Cn+1 . The picture gets more dicult for higher dimensions, n 3. Here the shape space k , with k > n, is a singular manifold. n The singularities arise from the congurations of points that are invariant to certain rotations. For example, given a preshape of points in R3 that are collinear, the rotations about that common line leave all the points xed. This is a failure of SO(3) to act freely k on preshape space S3 . In the smooth parts of k , that is, where SO(n) acts freely, the n 1 dimension will be reduced by the dimension of SO(n), which is 2 n(n 1). Therefore, 1 the dimension of the smooth parts of k is n(k 1) 2 n(n 1) 1. The dimension n of the singularities of k is the same as the dimension of the lower-dimensional shape n n2 space k . For example, in the shape space of 3D objects, k , the singularities have 3 dimension k 2. For more information on the structure of the shape space manifolds, including the Riemannian metric and curvature properties, see the paper [76]. 3.1.2 Procrustes Distance and Alignment Consider the problem of dening a distance metric on the shape space k . Given two n sets of landmarks, x = (x1 , . . . , xk ) and y = (y1 , . . . , yk ), whose points are in one-toone correspondence with each other, the problem is to dene a distance d(x, y) that is invariant to translation, rotation, and scaling of either x or y. The Procrustes distance [47] is one such metric. It is based on a sum-of-squares Euclidean distance 33 between the corresponding points, k 1 2 d(x, y) = i=1 xi yi 2 . (3.1) This is of course equivalent to the Euclidean norm xy if the point sets are considered as elements of Rnk . Now the Procrustes distance is the distance induced on k by this n Euclidean distance. This is typically approximated by using the distance in (3.1) after rst aligning the two point sets to a common position, orientation, and scale in the following manner: 1. Translate each point set so that its centroid is located at zero. 2. Scale both point sets to norm one (considering them as points in Rnk ). 3. Rotate one point set to minimize the sum-of-square distances given in (3.1). This alignment process is known as ordinary Procrustes analysis (OPA). The rotation necessary in the last step may be computed using a singular value decomposition (SVD) of the n n matrix xT y, where x and y are considered as k n matrices, i.e., matrices with the landmarks as rows. Let U V be the corresponding SVD. Then the rotation matrix needed in step 3 to rotate y in alignment with x is U V T . This is the rotation matrix that maximizes the correlation between the two point sets. Alignment of more than two objects is achieved by a process called generalized Procrustes analysis (GPA) [48]. The GPA algorithm for a collection of objects x1 , . . . , xN is given by 1. Translate each object to a centroid at zero. 2. Compute the linear average of the objects, i.e., = to norm one. N i=1 xi . Normalize the mean 3. Align each object xi to the mean with respect to orientation and scale using OPA. 4. Repeat steps 2 and 3 until the mean does not change. In addition to aligning all objects into a common coordinate system, generalized Procrustes analysis also results in the production of a mean shape . In essence GPA 34 ^ xi xi m TmS k n 0 Sk n Figure 3.2: Projection of an object onto the tangent space of the shape space k . n maps each object onto the shape space k , which is a curved manifold. However, one n would like to use linear statistics to analyze the variability of these shapes. Therefore, it is necessary to linearize the data in some fashion. This is achieved by projecting the shapes onto the tangent space at the mean, i.e., T k . Since the shape space k n n has spherical curvature, the tangent space at is the set of vectors perpendicular to . The projection is accomplished by scaling the vector xi , producing a x such that the dierence x is perpendicular to (see Figure 3.2). Given one of the objects xi aligned using GPA, its projection onto the tangent space T k is given by n xi = 1 xi , xi , where is again the normalized mean, i.e., = 1, resulting from GPA. 3.1.3 Shape Variability The standard technique for describing the variability of linear shape data is principal component analysis (PCA), a method whose origins go back to Pearson [101] and Hotelling [60]. Its use in shape analysis and deformable models was introduced by Cootes and Taylor [26]. See the book [63] for a comprehensive review of PCA. The objectives of principal component analysis are (1) to eciently parameterize the variability of data 35 and (2) to decrease the dimensionality of the data parameters. This section describes PCA of multivariate data x1 , . . . , xN Rn with mean . The reader may think of this data as a set of shapes represented as projections onto the linear tangent space T k . n There are several dierent ways to describe PCA. The denitions given here may not necessarily be standard, but they are helpful as the basis for the generalization to nonlinear spaces. The goal of PCA is to nd a sequence of nested linear subspaces, V1 , . . . , Vn , through the mean that best approximate the data. This may be formulated in two ways, both resulting in the same answer. The rst is a least-squares approach, where the objective is to nd the linear subspaces such that the sum-of-squares of the residuals to the data are minimized. More precisely, the linear subspace Vk is dened by a basis of orthonormal vectors, i.e., Vk = span({v1 , . . . , vk }), which are given by N vk = arg min v =1 i=1 xk xk , v v 2 , i i (3.2) where the xk are dened recursively by i x1 = xi , i k1 xk = xk1 xi , vk1 vk1 i i Simply put, the point xk is obtained by removing from (xi ) the contributions of i the previous directions, v1 , . . . , vk1 . In other words, the point xk is the projection of i (xi ) onto the subspace perpendicular to Vk1 . The other way of dening principal component analysis is as the subspaces through the mean that maximize the total variance of the projected data. The total variance for a set of points y1 , . . . , yN is dened as 2 = 1 N N yi 2 . i=1 Then the linear subspaces Vk = span({v1 , . . . , vk }) are given by the vectors N vk = arg max v =1 i=1 xk , v 2 , i (3.3) where the xk are dened as above. It can be shown (see [63]) that both denitions of i PCA, i.e., (3.2) and (3.3), give the same results thanks to the Pythagorean theorem. 36 The computation of the spanning vectors vk proceeds as follows. First, the linear average of the data is computed as = 1 N N xi . i=1 Next, the sample covariance matrix of the data is computed as 1 S= N 1 N (xi )(xi )T . i=1 This is the unbiased estimate of the covariance matrix, that is, N 1 is used in the denominator instead of N . The covariance matrix is a symmetric, positive-semidenite quadratic form, that is, S = S T , and for any x Rn the inequality xT Sx 0 holds. Therefore, the eigenvalues of S are all real and nonnegative. Let 1 , . . . , n be the eigenvalues of S ordered so that 1 2 n , and let v1 , . . . , vn be the correspondingly ordered eigenvectors 1 . These directions are the solutions to the dening PCA equations, (3.2) and (3.3), and are called the principal directions or modes of variation. Any data point xi can be decomposed as n xi = + k=1 ik vk , for real coecients ik = xi , vk . The ik for xed i are called the principal components of xi . The total variation of the data is given by the sum of the eigenvalues, 2 = n k . The dimensionality of the data can be reduced by discarding the k=1 principal directions that contribute little to the variation, that is, choosing an m < n and projecting the data onto Vm , giving the approximation m xi = + k=1 ik vk . One method for choosing the cut-o value m is based on the percentage of total variation that should be preserved. The sample mean and covariance matrix S can be considered as the maximum like When repeated eigenvalues occur, there is an ambiguity in the corresponding eigenvectors, i.e., there is a hyperplane from which to choose the corresponding eigenvectors. This does not present a problem as any orthonormal set of eigenvectors may be chosen. 1 37 Figure 3.3: A set of points in Rn showing the resulting principal directions weighted by the corresponding variances and the level sets of Mahalanobis distance d. lihood estimates of the parameters of a Gaussian probability distribution. The resulting Gaussian distribution is given by the density p(x) = 1 exp (x )T S 1 (x ) . 2 (2) |S| n 2 1 2 1 This denes a probability distribution on the space of shapes that can be used as a geometric prior in a deformable models framework (described in the next section). However, PCA is a valid operation even if the data cannot be assumed to come from a Gaussian process. It can give reasonable resulting modes of variation for data that is Gaussianlike, i.e., densities that are unimodal and fall o rapidly away from the mean. As an alternative to using the full Gaussian probability density, a useful measure of the geometric typicality of an object is given by the squared Mahalanobis distance from the mean, d(x, )2 = (x )T S 1 (x ). The Mahalanobis distance function uses the covariance matrix as a quadratic form to create an inner product on Rn . This gives hyperelliptical level sets of distance emanating 38 from the mean, where the axes of the hyperellipsoids are the principal directions from PCA (see Fig. 3.3). The Mahalanobis distance skews Euclidean distance so that distance is stretched in directions with lower variance. Points nearer to the mean in Mahalanobis distance represent more probable shapes. 3.1.4 Nonlinear Statistical Analysis While most work on the statistical analysis of shape has focused on linear methods, there has been some work on statistical methods for nonlinear geometric data. Hunt [61] describes probability measures on Lie groups that satisfy the semigroup property under convolution. This leads to a natural denition of a Gaussian distribution on a Lie group as a fundamental solution to the heat equation f = f = div(gradf ) t 2f f = g ij ( i j k k ), ij x x x where g ij are the components of the inverse of the Riemannian metric, and k are the ij Christoel symbols. Wehn [131, 132] shows that such distributions satisfy a law of large numbers as in the Euclidean Gaussian case. Grenanders book [49] on probabilities on algebraic structures includes a review of these works on Gaussian distributions on Lie groups. Pennec [102] denes Gaussian distributions on a manifold as probability densities that minimize information. Bhattacharya [5] develops nonparametric statistics of the mean and dispersion values for data on a manifold. Mardia [81] describes several methods for the statistical analysis of directional data, i.e., data on spheres and projective spaces. Kendall [72] and also Mardia and Dryden [82] have studied the probability distributions induced on shape space k by independent identically distributed Gausn sian distributions on the landmarks. Olsen [97, 98] and Swann [123] describe Lie group actions on shape space k that result in nonlinear variations of shape. Klassen et n al. [75] develop an innite-dimensional shape space representing smooth curves in the plane. The space of dieomorphisms is an innite dimensional and curved Lie group, and statistical analysis of dieomorphisms has found interest recently. Miller, Trouv e and Younes [86,87,129,134] have studied the space of dieomorphisms as a Riemannian space, thus equipping it with a distance metric based on geodesic paths. Davis et al. [32] describe a method for estimating a minimum mean squared error dieomorphism from 39 a set of images. Nielsen et al. [96] and Markussen [83] use Brownian motion warps as a least-committed prior on the space of dieomorphisms. 3.2 Deformable Models Segmentation is the process of distinguishing important structures in an image from background. This is a fundamental task in medical image analysis that is often a prerequisite for further analysis, visualization, disease diagnosis, or planning of medical treatment. Medical images can be very large, especially 3D images, time sequence images, or images with multi-dimensional values, such as diusion tensor images. Finding complex geometric objects in such a vast amount of data can be a challenge. Segmentation is further complicated by the wide range of variability in the geometry and image intensities of the anatomy. Image noise, sampling artifacts, and the confusion of other nearby structures add to the diculty of the task. Deformable models is a powerful image analysis method that overcomes many of the diculties of segmentation by incorporating prior information of the objects to be segmented. A survey of deformable models methods can be found in [85]. The deformable models approach to segmentation involves the deformation of a geometric model into an image by optimizing an objective function. Deformable model approaches dier in the way they represent object geometry, how they deform objects, and in the objective functions they use to t into an image. 3.2.1 Active Contours The rst deformable models to gain popularity in image analysis were the active contours or snakes [68]. Snakes represent an object in a 2D image as a parametric contour c(s) = (x(s), y(s)), for s [0, 1]. In the early papers on deformable models such a contour is t to an object in an image I(x, y) by minimizing the following energy functional: 1 1 Esnake (c) = 0 c (s) 2 + c (s) 2 ds + 0 P (c(s))ds. (3.4) The rst integrand in the above equation is called the internal energy. The weights and specify the elasticity and stiness of the contour. The second integrand in the above equation is known as the external energy. It measures how well the contour ts the image data. The function P (x, y) is a potential function in the image plane that 40 typically measures the desired image features, such as specic intensities or edges. A common potential function is the edge-based potential P (x, y) = g( I(x, y) )2 , which attracts the contour to edges in the image, i.e., places with high gradient values. Here g : [0, ) R+ is a monotonic decreasing function, and the weight is chosen to balance the strength of the image attraction with the internal energy constraints. It is also common in this approach to rst convolve the image with a Gaussian kernel to remove noise and extend the capture range of the local minima. One drawback of the active contour method is that the snakes energy functional (3.4) is not intrinsic in the sense that it contains the term 1 E(c) = 0 c (s) 2 ds, which depends on the parameterization of the curve. This can be remedied by choosing an arclength parameterization for c. Caselles et al. [21] go a step further and phrase the active contour minimization as nding a geodesic curve under a particular Riemannian metric, resulting in a method called geodesic active contours. The metric is chosen in the image plane so that the resulting geodesics minimize the snakes energy functional (3.4) with the term = 0. Instead of using a parameterized curve model for the snake, geodesic active contours use a level-set approach based on the work of Osher and Sethian [99, 109]. This can be phrased as nding a smooth function f : R2 R in the image plane whose zero set is the desired contour. Following a steepest-descent approach, the geodesic active contours solve the evolution equation dc = g(I)N dt g, N N, where N is the curve normal, is the curvature, and g is a monotonic decreasing function as above. A distinguishing characteristic of active contours is that they are inherently a local approach. The components of the energy functional (3.4) are all local properties: rst and second derivatives of the contour and local image properties. In addition the search methods for active contours proceed by local searches along the normal direction at a point on the curve. This locality is an advantage in the sense that it is typically very fast because computations need only be made in local neighborhoods. However, it is also a 41 disadvantage in the sense that active contours are unable to describe global aspects of shape change and they do not take into account any correlations of image information at dierent points on an object. As a result, active contours have a tendency to be attracted to local spurious features in an image and can leak, i.e., continue searching for an edge when the desired boundary in an image has low contrast. 3.2.2 Probabilistic Deformable Models An alternative to the energy minimization formulation of deformable models is based on a probabilistic point of view.2 The probabilistic framework is based on Grenanders pattern theory [50, 51, 52, 92]. Pattern theory encompasses a wide variety of methods for analyzing signals generated by the world. These signals may include images, audio, DNA sequences, or weather measurements. Pattern theory holds that the real world cannot be modeled deterministically because it is too complex and the sensors used to observe it are too limited. Therefore, observations must be modeled partly stochastically and partly deterministically in order to make analysis of the observations computationally practical. In a stochastic model of observations, the world may be in one of many dierent states, and each state w in the set of possible states occurs with probability p(w). The probability p(w) is called a prior and must be learned from past experience. For example, a radiologist uses prior training in anatomy when segmenting a new CT image. An observation f of the world has conditional probability p(f |w), which is the likelihood of the observation f given that the world is in state w. The likelihood is often computed by generating a synthetic signal fw from a given model w and comparing it to the signal f . The goal is to infer the true state w given an observation f . This may be done by maximizing the a posteriori probability p(w|f ) with respect to w, that is, nd the state w that has maximum probability given the observation f . The posterior probability is computed using Bayes formula p(w|f ) = p(f |w)p(w) . p(f ) (3.5) Now the most probable estimate for the state w, called the maximum a posteriori The energy minimization snakes can actually be phrased in the probabilistic setting as well. This approach involves using Gibbs probability distributions for a smoothness prior term and an image likelihood term. Such a formulation can be shown to be equivalent to the energy minimization (3.4) from the previous section. See [85] for more details. 2 42 (MAP) estimate, is given by w = arg max p(w|f ) = arg max p(f |w)p(w). w w (3.6) The term p(f ) in the denominator of Bayes formula (3.5) is dropped from the MAP equation because it does not alter which state maximizes the posterior probability. In the deformable models setting, the observations f are images and the states w are geometric models of the objects in the images. Thus, the MAP estimate can be phrased as the most probable conguration of a geometric model with respect to a given particular image. In practice the posterior maximization cannot be solved analytically due to the large number of variables and the complexity of the deformable model problem. Therefore, the following procedure is used: 1. Begin with an initial estimate for the model w. 2. Generate a synthetic image fw from w. 3. Evaluate the posterior probability, comparing fw to f . 4. Update (deform) the model w according to the method used to search for the optimum w. 5. Repeat steps 2 through 5 until maximum is achieved. The search method used in step 4 may be one of several optimization strategies (see [106], Chp. 10). Local methods such as the simplex method, gradient descent, or conjugate gradient can nd optimum solutions quickly but can get stuck in local optima. Thus, local methods work well for problems where the initial estimate (step 1) can be placed near the correct answer. Global optimization strategies such as simulated annealing and genetic algorithms do a better job of avoiding local optima but are also much slower, even when near the correct solution. Several dierent geometric representations have been used to model anatomy in deformable models approaches. The active shape model (ASM) of Cootes and Taylor [26, 27] represents an objects geometry as a dense collection of boundary points. Cootes et. al. [25] have augmented their models to include the variability of the image information as well as shape. Delingette [33, 34] uses a simplex mesh to represent the boundary of objects. Staib and Duncan [119] use Fourier decompositions of contours. Szkely, e Kelemen, et. al. [70, 124] also use Fourier representations in 2D and use a spherical 43 harmonic (SPHARM) decomposition of the object geometry in 3D. Joshi [64, 65] and Christensen [24] use volumetric representations of anatomy along with dieomorphic transformations of that anatomy. In all of these approaches the underlying geometry is parameterized as a Euclidean vector space. The prior probability density must be inferred from a training sample of known instances of the object. The training data is given as a set of vectors x1 , . . . , xN in a vector space V . For active shape models each vector is constructed by concatenation of the boundary points in an object. This is followed by a general Procrustes analysis and a tangent space projection as described in the previous section. For Fourier and spherical harmonics each vector is constructed as the concatenation of the coecients of a harmonic representation of the object boundary. Although dieomorphisms themselves are not a vector space, prior probability models can be based on the velocity vector elds of the deformations, which do form a vector space. Therefore, in each of these approaches the parameters of the prior density can be inferred using the linear statistical techniques mentioned in the previous section, namely linear averaging and PCA. In contrast to active contours, probabilistic deformable models are a more global approach. Methods for describing the statistical variability of shape, such as PCA, take into account the global variations in shape. That is, changes in the components of variation cause changes across the entire object. This is a result of the fact that PCA models the correlations of geometric changes in dierent parts of the object. Also, statistical models of the image variability use correlations of image values at dierent points on the object. This global approach has the advantage that the models of the geometry and the image values stay consistent across the entire object. Search methods can take steps in the model parameters, which result in global changes to the model geometry and image intensities. Thus, probabilistic models are less likely to be attracted to spurious image features and are more robust under low contrast or missing data. This added power comes at the cost of more complex models and longer run times. 3.3 Medial Representations Medial representations of objects, or m-reps, are the foundation of the deformable models approach taken in this work. This section is a review of the necessary background in medial geometry representations and segmentation via deformable m-rep models. The rst subsection (Section 3.3.1) is an overview of the medial locus and some of its mathematical properties. The next subsection (Section 3.3.2) describes m-reps and the 44 deformable models approach based on them. The article by Pizer et al. [105] provides an overview of the properties of the medial locus and methods for extracting the medial locus from an object. The deformable m-reps approach to image segmentation is described by Pizer et al. [104]. A ne overview of medial techniques that goes beyond the material covered in this section can be found in the Ph.D. dissertation of Yushkevich [135]. 3.3.1 The Medial Locus The medial locus is a means of representing the middle or skeleton of a geometric object. Such representations have found wide use in computer vision, image analysis, graphics, and computer aided design [8, 9, 62, 110, 121]. Psychophysical and neurophysiological studies have shown evidence that medial relationships play an important role in the human visual system [6, 17, 78, 79, 84]. The medial locus was rst proposed by Blum in 1967 [10], and its properties were later studied in 2D by Blum and Nagel [11] and in 3D by Nackman [94]. Arising from the medial locus denition is a surprisingly rich mathematical theory that incorporates many aspects from dierential geometry and singularity theory (see, for instance, [31, 46, 45]). The denition of the medial locus of a set A Rn is based on the concept of a maximal inscribed ball. Denition 3.1. A maximal inscribed ball of a set A Rn is an open ball Br (x) = {y Rn : x y < r} such that Br (x) A, and there does not exist another ball B = Br (x) such that Br (x) B A. Denition 3.2. The medial locus of a set A Rn is the closure of the set of all pairs (x, r) Rn R+ such that Br (x) is a maximal inscribed ball in A. The medial axis refers to the set of positions x Rn that are centers of maximal inscribed balls in A, i.e., the medial axis is the image of the medial locus under the projection : Rn R+ Rn .3 Several authors have used the terms medial locus, medial axis, symmetry axis, and skeleton to mean either the medial positions or the medial position and radius tuples. The word axis is somewhat misleading (but it has stuck) since it connotes a straight line. However, as discussed below, the medial axis can have higher dimensions than a line and can be curved. The above denition of the medial locus is valid for any set A Rn . However, for the real-world objects that are found in images it is convenient to narrow the possible sets that can be considered. This leads to the following denition: The term medial locus is also used for other related skeletal structures. Here the terms medial locus and medial axis will always refer to this denition given by Blum. 3 45 A2 1 A3 A3 A3 1 A3 A3 1 A1A3 A3 A2 1 Figure 3.4: A 2D object and its medial axis (left). The medial axis of a 3D object (right). The dierent generic medial points are labelled (all types except the A4 are 1 present). Denition 3.3. An object in Rn is a connected, compact, imbedded, n-dimensional manifold with boundary. The compactness assures that an object is a bounded region, and the fact that an object is imbedded means that it cannot intersect itself in any way. Of course objects from images will be either 2D or 3D. An object does not have to have a smooth boundary, so, for example, a 2D region bounded by a polygon or a 3D region bounded by a closed polygonal surface is an object. Also, an object does not have to be simply connected, that is, it can have holes like an annulus in 2D or a solid doughnut in 3D. The medial axis forms a structure known as a stratied space. It consists of a collection of smooth manifolds of dierent dimensions known as strata. Medial points that are tangent to the boundary in exactly two points make up a codimension 1 stratum. This is referred to as the smooth part of the medial axis. For example, in 2D the smooth parts of the medial axis are smooth curves, and in 3D the smooth parts of the medial axis are smooth surfaces. Each connected piece of the smooth part of the medial axis is called a branch. The remaining parts of the medial axis are the singular parts, which form lower-dimensional strata. They consist of boundaries of the branches and places where the branches connect. The types of generic points4 on the medial axis The term generic refers to points or features of geometry that are stable under perturbations of that geometry. In this case a medial point is generic if it does not disappear under a small perturbation of the objects boundary. 4 46 have been classied in 3D by Giblin and Kimia [45]. A point on the medial axis is classied as an Am point when the resulting inscribed sphere is tangent at m distinct k points and the sphere has order k contact with the object boundary. No superscript indicates the sphere has contact at a single point. The types of medial points that can occur generically in 2D and 3D are (see Fig. 3.4) 1. A2 points are the smooth parts of the medial axis, where the sphere is tangent at 1 two distinct points. Points of this type form curves in 2D and surfaces in 3D. 2. A3 points are where two branches of the medial axis meet, and the sphere is tangent 1 at three distinct points on the boundary. Points of this type form points in 2D and curves in 3D. 3. A3 points are the edges of the medial branches, where the sphere is tangent at a single point and has the same radius of curvature as the boundary at that point. The boundary also has a maximum of curvature at that point. Points of this type form points in 2D and curves in 3D. 4. A1 A3 points are where a branch curve (A3 ) meets an edge curve (A3 ) in 3D. These 1 points do not occur in 2D, and they form points in 3D. 5. A4 points are where four branch curves (A3 ) meet in 3D. These points do not occur 1 1 in 2D, and they form points in 3D. In contrast to boundary representations, which sample the boundary of an object, medial representations sample the medial locus of an object. These medial samples, called medial atoms, come in two dierent varieties. Denition 3.4. An n-dimensional order 0 medial atom is a pair (x, r) Rn R+ . An order 0 medial atoms represents the position and radius of a maximal inscribed ball at a location on the medial axis of an object. The phrase order 0 is to distinguish these atoms from atoms with higher order information, i.e., derivatives of position and radius. The disadvantage of the order 0 medial atom is that it does not give enough information to reconstruct the corresponding boundary points, i.e., the points tangent to the sphere dened by the medial atom. This is remedied by adding rst order information to the medial atom. Denition 3.5. An n-dimensional order 1 medial atom is a tuple (x, r, n0 , n1 ) Rn R+ S n1 S n1 . 47 n0 x n1 x Figure 3.5: Two representations of a 3D order 1 medial atom and the portions of the implied boundaries associated with them. A medial atom as in Denition 3.5 as a position, radius, and two spoke directions (left). The same medial atom as a position, radius, frame, and object angle (right). The order 1 medial atom (see Fig. 3.5) adds two unit length vectors, n0 , n1 , thought of as two points on the unit sphere S n1 . These two points represent the tangency points of the boundary with the inscribed sphere. The vectors pointing from the medial locus position to the object boundary, called spokes, are given by rn0 and rn1 . Therefore, order 1 medial atoms give enough information to reconstruct the corresponding boundary points on the object, y0 , y1 , given by the formulas y0 = x + rn0 , y1 = x + rn1 . (3.7) The order 1 medial atom assumes that the inscribed sphere is bitangent to the object boundary. Thus, they are valid as samples of the smooth parts of the medial locus. Order 1 medial atoms encode the derivative information of the medial locus in a non-obvious way. Given an object in Rn , let M be a branch of the objects medial axis, i.e., a smooth manifold in Rn of codimension 1, and let r : M R+ be the radius function on that branch. An order 1 medial atom at a point x M is given by the tuple (x, r, n0 , n1 ) Rn R+ S n1 S n1 . Then the positional derivative information at x M is given by the tangent space Tx M . This tangent space also has codimension 1, and as such it is uniquely determined by a single normal vector. This (unit) normal 48 vector is given by n= n0 n1 . n0 n1 The derivative information of the radius function comes in the form of the gradient vector r Tx M . This gradient is given by the projection of the n0 vector (or, equivalently, the n1 vector) into the tangent space: r = n0 n0 , n n = n1 n1 , n n. Medial loci have the property that the gradient of the radius function in the smooth strata satises the inequality r 1. With this restriction it can be seen that an order 1 medial atom, dened as the tuple (x, r, n0 , n1 ), uniquely encodes the derivative information of the medial locus at the point x. One contribution of this dissertation is the specication of an order 1 medial atom as dened above (Denition 3.5). In previous papers [66, 103, 104] order 1 medial atoms were given as a tuple (x, r, F, ) R3 R+ SO(n) [0, /2) (see Fig. 3.5). In other words, the two unit vectors, n0 , n1 , were replaced with a frame F SO(n) and an angle [0, /2), known as the object angle. In 3D the frame is given by three orthonormal vectors {b, n, b }, where b is the unit bisector of the spokes, n is the unit normal to the medial axis, and b = b n. The unit spoke directions can be derived from this representation as n0 = cos()b + sin()n n1 = cos()b sin()n. The advantages of the order 1 medial atom as dened in this dissertation over the previous frame and object angle representation will be discussed in Chapter 5. 3.3.2 M-reps Two major contributions of this dissertation are 1) a new method for studying statistical shape variability using medial representations and 2) application of this method to a deformable models approach to 3D medical image segmentation. Both the medial representation of object geometry and the resulting deformable models framework that are used in this dissertation are due to Pizer et al. [103, 104]. These medial representations, or m-reps, are described in this section. After an introduction to the benets of m-reps as models of object shape, a description of the medial representation and data structure is given. Methods for interpolating smooth boundaries and gural-based coordinate sys- 49 tems from m-rep models are outlined. Then a dierent twist on medial representations called spline-based m-reps is reviewed. The section wraps up with a presentation of the deformable m-reps approach to 3D image segmentation. The main selling points of the m-rep method as an approach to object geometry and deformable models are 1. M-reps are multiscale. They decompose the geometry of object collections in a coarse-to-ne manner. Multiscale approaches to deformable models have proven to be more robust to image problems such as noise, aliasing, and missing data. Also, multiscale methods are capable of extending the capture range of optimization procedures and increasing the rate of convergence. 2. M-reps have a fuzzy boundary. That is, they have a built-in boundary tolerance that allows them to extract ne-scale boundary perturbations without creating extra branches in the medial locus. 3. M-reps provide a gural coordinate system. This coordinate system measures distances along the medial directions of a gure and through the gure. 4. M-reps are a solid representation. Instead of just modeling the boundary, or shell, of an object, m-reps also model the interior (and just outside the object). This gives a means for indexing image values inside an object and also for modeling the physical properties of the object interior. 5. M-reps directly model the medial locus of an object. Methods that extract the medial locus from a boundary can be time-consuming and sensitive to perturbations in the boundary. Also, comparing the medial structure of two similar objects is easier when the medial locus is modeled directly because the medial branching can be kept consistent between objects. The medial representation decomposes complex objects into a set of gures, which are slabs with unbranching medial locus. Each gure consists of a sheet of order 1 medial atoms. Single gures in 2D consist of 1D curve segment of atoms, and single gures in 3D consist of a 2D surface-with-boundary of atoms. The objects considered in this dissertation are all 3D single gure models (see Fig. 3.6), and they are the focus of this review. However, more complex models, i.e., models consisting of multiple gures and models consisting of collections of objects, are briey reviewed. Recall that an atom on the edge of the continuous medial locus has a single spoke with third order contact with the object boundary, i.e., an A3 point. However, a single 50 Figure 3.6: Two single gure m-rep models: a kidney (left) and a hippocampus (right). Figure 3.7: A 3D medial end atom, showing the portion of the boundary crest implied by the atom. 51 t u v t t = +1 t t = -1 Figure 3.8: The gural coordinate directions (u, v, t, ) demonstrated on an m-rep model of the hippocampus. Sample order 1 medial atoms on the sheet are also shown. point of contact can be an unstable feature for image analysis tasks. The representation can be stabilized by restricting each edge point of the medial locus to lie along the b vector of a medial atom shifted back from the edge curve. An end atom (see Fig. 3.7) is a special type of order 1 medial atom that models this atom shifted back from the edge of the medial locus. It has an extra spoke in the bisector direction, b, along which the true edge of the medial locus lies. This extra spoke points to the crest of the implied boundary and has length r, where is a parameter in the interval [1, 1/ cos()]. A value of = 1 gives a circular end cap, while at the other extreme a value of = 1/ cos() gives a sharp corner. Figural Coordinates An m-rep sheet should be thought of as representing a continuous branch of medial atoms with associated continuous implied boundary. This continuous sheet of medial atoms can be parameterized by two real parameters (u, v). The choice of this parameterization may depend on the need to make comparisons at corresponding points between similar objects. In this case parameterizations that are in one-to-one correspondence are chosen. This correspondence can be based on geometric properties of the objects, or it can be chosen with the desire to build optimal statistical models of a population. A full discussion of these correspondence issues is beyond the scope of this review, and it is assumed from here on that some (u, v) parameterization for the medial locus is given. Since each internal medial atom in a single gure implies two boundary points, an extra parameter t {1, 1} can be added to extend the medial coordinates to a parameterization (u, v, t) of the implied boundary. 52 Figure 3.9: Left: the hinge arrangement of a subgure with the subgure on top and parent gure on bottom. Right top: a protrusion subgure. Right bottom: an indentation subgure. (This gure is courtesy of Qiong Han and appears in [55].) The gural coordinates further extend the implied boundary coordinates to a parameterization of the space inside and just outside the m-rep gure. A gural coordinate (see Fig. 3.8) is a tuple (u, v, t, ), where the parameter gives the r-proportional signed distance of the point in question from the surface point at (u, v, t). That is, is given as the signed distance along the normal to the surface at (u, v, t) divided by the r value at (u, v, t). This coordinate system is valid inside the entire solid represented by the m-rep gure (i.e., each point has a unique coordinate). It is also valid outside the gures boundary up to the exterior shock set of the distance function to the boundary. Therefore, it is valid for all of R3 when the gure is convex. An important feature of the gural coordinate system is that the coordinates are invariant under translations, rotations, and scalings of the object. This makes the gural coordinate system an ideal parameterization when dealing with shape properties. Also, if one object is represented as a deformation of the medial representation of another object, the gural coordinates of the two objects are in one-to-one correspondence. This is useful, for example, during segmentation in comparing image intensity values of the target image with the image intensity values of a training image with respect to the current m-rep object. 53 Multi-gure Objects and Multi-object Complexes For objects consisting of multiple gures (see Fig. 3.9), the objects gures are arranged in a hierarchical fashion, i.e., a tree or directed acyclic graph (DAG). A parent gure in the tree represents a more substantial part of the object, and a child gure represents a protrusion or indentation of its parent gure. Protrusions are gures that add material to an object, while indentations are gures that subtract material from an object. To make the representation focus on the portions of the object with the major internal substance, a child gure, also called a subgure, is attached to its parent by a curve segment of its edge curve, called a hinge. The resulting implied boundary subdivision surfaces are blended using a method described in [55]. The gural coordinates of the subgure arrangement include the gural coordinates of the parent, the gural coordinates of the child, and an extra blend coordinate w that parameterizes the blend area between the parent and child. More details about the geometry and segmentation process for multi-gure models can be found in [55]. Sometimes one wishes to represent and then segment multiple disconnected objects at the same time. An example is the cerebral ventricles, hippocampus, and caudate in which the structures are related but one is not a protrusion or an indentation on another. Another example are the pair of kidneys and the liver. In the m-reps system these can be connected by one or more links between the representations of the respective objects, allowing the position of one gure to predict boundary positions of the other. This matter is explained in detail in a paper covering the segmentation of multi-object complexes [43]. Mesh-Based M-reps In 3D a single gure object can be represented by a quadrilateral mesh mij of order 1 medial atoms (see Fig. 3.6). Atoms on the edge of the mesh are represented by end atoms with three spokes as described above. The atoms in an m-rep mesh can be thought of as control points implying a full continuous sheet of order 1 medial atoms. The continuous medial locus extends beyond the end atoms to the curve of A3 atoms osculating the crest of the implied objects boundary. In multi-gure models the hinge curve of a subgure is represented as one edge of the subgures quadrilateral mesh of medial atoms. The implied boundary of an m-rep gure is interpolated from the boundary points (y0 , y1 ) and corresponding normals (n0 , n1 ) implied by the medial atoms. This also 54 includes the crest points implied by the third spokes of the end atoms. The surface interpolation used is due to Thall [125, 126] and is an extension of the Catmull-Clark subdivision method [22]. As a result of this interpolation, each boundary point can be associated a boundary gural coordinate (u, v, t), where the parameter t [1, 1] is equal to either 1 or 1 for internal atoms to distinguish between the two spoke ends. At end atoms the t parameter transitions continuously from 1 to 1 around the crest. Spline-Based M-reps Yushkevich et al. [135,136] describe a medial representation built on a continuous spline model of the medial locus of an object. This in turn implies a continuous boundary of the object as well as a parameterization of the interior of the object. Although spline-based m-reps have been dened in both 2D and 3D, this review concentrates on the 3D case. The medial locus is parameterized as a pair of continuous functions (x(u, v), r(u, v)), where x is the medial position and r is the associated radius eld. These functions are represented as b-spline surfaces. They are determined by control points (xij , rij ) : 0 i d1 , 0 j d2 and given by the b-spline formulas d1 d2 x(u, v) = i=0 j=0 d1 d2 Ni3 (u)Nj3 (v)xi j, Ni3 (u)Nj3 (v)ri j, i=0 j=0 r(u, v) = where Ni3 are third-order b-spline basis functions (see [38]). The control points (xij , rij ) can be thought of as order 0 medial atoms. However, order 1 medial atoms can be produced at each point in the continuous medial locus by using the rst partial derivatives xu , xv , ru , rv of the b-spline functions. The spoke directions of the order 1 medial atoms are given by the functions n0 = r + 1 r 2 n, n1 = r 1 r 2 n, where n = xu xv / xu xv is the unit surface normal to x. The gradient of the radius is given by the formula ru r = [xu xv ]I 1 , rv 55 where I is the metric tensor on the surface x, i.e., I= x u , xu x u , xv x u , xv x v , xv . The b-spline medial locus must satisfy several constraints to ensure that it implies a valid, non-folding boundary surfaces. First, it must satisfy the constraint that r <1 in the interior of the medial sheet. Second, the implied boundary must be constrained to not crease or fold by ensuring that the Jacobian of the medial-to-boundary mapping remains positive. Finally, the medial locus must be a manifold with boundary, where the edge curve of A3 medial atoms satises the constraint r = 1. This last condition is achieved by setting the edge of the control point grid to large negative radii, while the interior control points all have positive radii. This causes the level curve of r = 1 to lie within the b-spline surface. The surfaces is then trimmed along this curve, resulting in the desired edge for the medial sheet. Segmentation via Deformable M-reps Following the deformable models paradigm, a 3D m-rep model M is deformed into an image I(x, y, z) by optimizing an objective function, which is dened as F (M, I) = L(M, I) + G(M). The function L, the image match, measures how well the model matches the image information, while G, the geometric typicality, gives a prior on the possible variation of the geometry of the model. The relative importance of the two terms is weighted by the non-negative real parameter . The segmentation strategy described in this review is from Pizer et al. [104] and also developed in previous papers [103, 66]. One of the main contribution of this thesis, described later in Chapter 5, builds on this segmentation strategy by incorporating geometric statistics. This includes using a statistical geometric prior for the geometric typicality and using gural deformations based on the statistical modes of variation. This objective function is optimized in a multiscale fashion. That is, it is optimized over a sequence of transformations that are successively ner in scale. In this review only segmentation of single gures is considered, which includes three levels of scale: the gural level, the medial atom level, and the dense boundary sampling level. At each scale level the deformations are dened as transformations of the current primitives, 56 either gures, medial atoms, or boundary points. The model is rst initialized by the user placing a template model into the image I using a global translation, rotation, and scaling. At the gural level the transformation used is a similarity transformation plus an elongation of the entire gure. More generally, the gural stage transformation can be any operation that acts globally on the gure, such as an atom transformation applied to each atom in the gure. At the atom level each medial atom is independently transformed by a translation of the medial position, a 3D rotation of the frame, a scaling of the radius, and a rotation of the object angle. In the boundary stage each boundary point is displaced along its corresponding normal direction. The computation of the image match term in the objective function is based on a a template model M. Image values in a template image I at a particular gural coordinate of the template model are compared to image values in the target image I at the corresponding gural coordinate of the candidate model. The image match term of the objective function is computed as a correlation over a collar ( in the normal direction) about the object boundary: L(M, I) = B s G(t)I ( + (t/) ) I (s + (t/r)n) dt dw. r n In this equation a hat () always denotes an entity associated with the template model M, and the same entities without a hat are associated with the candidate model M. The parameter w = (u, v, t) is a gural boundary coordinate, B is the parameter domain of the boundary coordinates. The following are functions of the boundary gural coordinate w: s, are parameterizations of the boundaries, r, r are the radius functions, and s n, n are the boundary normals. The function G is the Gaussian kernel with standard deviation . The Gaussian kernel is used to weight the importance of the image match so that features closer to the boundary are given higher weight. The values for the collar width and Gaussian standard deviation have been set by experience to = 0.3 and = 0.15. The geometric typicality term G consists of two terms. Each term is computed using r-proportional squared distances. The rst term, denoted by P , measures the total amount of change in the object boundary during the current stage. The second term, denoted by N , measures the dierence between the boundary of the candidate and the boundary of the candidate replacing the current primitive with the prediction of its neighbors. This neighbor term enforces a local consistency between model primitives. 57 The geometric typicality term is dened as G(M) = (1 ) P (M) + N (M), where [0, 1] is a weighting term. The function P measures the change in the boundary from the previous level of scale in r-proportional terms: P (M) = B(M) ||s s0 ||2 ds, r2 where s0 is the initial position of the boundary at this scale level. The function N seeks to keep primitives in the same relationship with their neighboring primitives. It is dened as ||s s ||2 ds, N (M) = r2 B(M) where now s is the boundary surface of the model in which the current primitive is in the position predicted by its neighbors. For single-gure models there is no neighbor primitive at the gural stage. Therefore, the neighbor term is zero, i.e., = 0, during the gural level. For the atom scale level each medial atoms neighbors are the adjacent atoms in the grid (4 neighbors for internal atoms, 3 for edge atoms, and 2 for the corner atoms). The neighbor term at the boundary scale level comes from comparing a boundary point to the prediction by its neighbors in the boundary mesh. This prediction is given by an average of the neighboring points. 3.4 Diusion Tensor Imaging The statistical methods introduced in this dissertation are shown in Chapter 6 to have application in the statistical analysis of diusion tensor images. This section is a review of diusion tensor imaging. It begins with a description of diusion tensor imaging and Brownian motion. Several quantitative measures derived from diusion tensors are then discussed. Finally, the clinical applicability of diusion tensor imaging is reviewed, as well as research issues such as regularization, ber tracking, and statistical studies. Diusion tensor magnetic resonance imaging (DT-MRI), developed at NIH by Peter Basser et al. [3], measures the random 3D motion of water molecules, i.e., the diusion of water. It produces a 3D diusion tensor, that is, a 3 3, symmetric, positive-denite matrix, at each voxel of an 3D imaging volume. Recall that a matrix A is symmetric if A = AT , and it is positive-denite if xT Ax > 0 for all nonzero vectors x. This tensor 58 is the covariance in a Brownian motion model of the diusion of water at that voxel. In homogeneous materials water tends to diuse isotropically, that is, equally in all directions. For example, if a drop of dye is placed in water, it will slowly spread out equally in all directions. However, in brous materials, such as skeletal muscle, cardiac muscle, and brain white matter, water molecules tend to diuse faster in the directions parallel to the bers and slower in the directions perpendicular to the bers. Therefore, DT-MRI can give important information about the microstructure of brous tissues in the body. In brain imaging DT-MRI is used to track white matter bers and establish connectivity properties of the brain. Brownian motion was rst discovered by the biologist Robert Brown in 1827 [16]. He noticed that small pollen particles demonstrated a seemingly random, jittery, motion when suspended in water. However, a mathematical model of Brownian motion was not developed until 1905 by Albert Einstein [37]. Einstein showed that Wiener processes provide a reasonable model of Brownian motion (although more accurate and more complicated physical models appeared later). A Wiener process is a random process w(t) of time t that satises the following two axioms: 1. w(t) w(s) N (0, (t s)). 2. w(t) w(s) and w(v) w(u) are independent random variables for 0 s t u v. For DT-MRI the Wiener process w(t) is a function giving the 3D position of a molecule under diusion. The axioms state that the incremental motion of a particle is governed by a normal distribution at each point in space, and that the motion is independent of any previous movement of the particle. The covariance is a 3 3 symmetric, positivedenite matrix, which can take dierent values at dierent locations in space. This covariance matrix is what is produced by DT-MRI at each voxel of an image. The quantitative measurements derived from diusion tensors that people have used can be roughly broken down into two categories: size and shape An measurements. important aspect of these measurements is that they should be independent of the laboratory coordinate system. That is, a derived measurement should be invariant to translation and rotation of the diusion tensor. First of all, diusion tensors themselves are invariant to translation (a translated diusion tensor is the same tensor, just at a dierent point). Therefore, it suces to consider only rotational invariance. If the coordinate system is rotated by a matrix R SO(3), a diusion tensor D will be 59 transformed to the tensor D by the equation D = RDRT . The eigenvalues 1 , 2 , 3 of D are left invariant under this operation. Therefore, any combination of the eigenvalues is an invariant measurement of the tensor D. This leads to two measurements of the size of a diusion tensor. The rst size measurement, the mean diusivity is given by the average of the eigenvalues, D = (1/3)(1 + 2 + 3 ). The second measurement of size is the determinant of the diusion tensor, given by the product of its eigenvalues, det(D) = 1 2 3 . Measurements of the shape of a diusion tensor depend on the relative magnitudes of the eigenvalues. There are two common anisotropy measures, which measure how far the diusion tensor is from being isotropic. They are both based on the standard deviation of the eigenvalues of D, given by (D) = (1/ 3) (1 D )2 + (2 D )2 + (2 D )2 . The rst anisotropy measure, known as the relative anisotropy (RA), is given by the ratio of the standard deviation of the eigenvalues of D with the average of the eigenvalues, that is, (D) RA(D) = . D The second anisotropy measure, known as the fractional anisotropy (FA), is similar to the RA, except the denominator is the magnitude of the tensor under the Frobenius matrix norm, D = 2 + 2 + 2 . The FA is given by 1 2 3 F A(D) = 6 (D) . 2 D Diusion tensor imaging has shown promise in several clinical studies of the brain. See the paper by Le Bihan et al. [7] for a review. Ischemic areas of the brain, that is, areas with decreased blood supply, in stroke patients have demonstrated lower diusivity [91, 112, 130]. Thus, DT-MRI could help doctors understand which areas of the brain 60 have been damaged and might be salvageable in the rst hours after a stroke. DTMRI has shown promise in diagnosing diseases such as multiple sclerosis [128, 133] and Alzheimers [57, 56]. The health of the brain white matter can be assessed using the derived measures of the diusion tensor. The diusivity measures tell the overall content of water in the tissue, and the anisotropy measures indicate the health of the myelin bers. Also, studies have shown that DT-MRI could be used to assess the growth and maturity of white matter in the newborn brain [95, 138]. From the perspective of the image analyst, diusion tensor images present many new and interesting issues including visualization, regularization, ber tracking, and statistical analysis of diusion tensor data. A major challenge in these applications is that diusion tensor images contain 6-dimensional tensors at each voxel rather than a single value per voxel found in other modalities. Several authors have addressed the problem of estimation and smoothing within a DT image [23, 28, 137]. Further insights might be had from the use of diusion tensor imaging in intersubject studies. Statistical brain atlases have been used in the case of scalar images to quantify anatomical variability across patients. However, relatively little work has been done towards constructing statistical brain atlases from diusion tensor images. Alexander et al. [1] describe a method for the registration of multiple DT images into a common coordinate frame, however, they do not include a statistical analysis of the diusion tensor data. Previous attempts [4, 100] at statistical analysis of diusion tensors within a DT image are based on a Gaussian model of the linear tensor coecients. Chapter 4 Manifold Statistics This chapter1 presents a novel framework for computing the statistical variability of data on general manifolds. Principal component analysis is a standard technique for describing the statistical variability of data in Euclidean space Rn . The method presented in this chapter, called principal geodesic analysis (PGA), is a natural extension of principal component analysis to manifold-valued data. In Section 4.1 we review existing denitions for the mean of manifold-valued data. The denition of the mean used in this work is intrinsic to the geometry of the manifold. In Section 4.2 we present principal geodesic analysis for describing the variability of data on manifolds. This is based on generalizing the denition of principal component analysis, using either the variance-maximizing or least-squares denition. We give an algorithm for computing principal geodesic analysis as well as an algorithm for eciently approximating it. Finally, we demonstrate implementations of both the PGA and approximation to PGA algorithms on the sphere S 2 . 4.1 Means on Manifolds The rst step in extending statistical methods to manifolds is to dene the notion of a mean value. In this section we describe two dierent notions of means on manifolds called intrinsic and extrinsic means, and we argue that the intrinsic mean is a preferable denition. We then present a method for computing the intrinsic mean of a collection of data on a manifold. Throughout this section we consider only manifolds that are connected and have a complete Riemannian metric. The work presented in this chapter was done in collaboration with Dr. Sarang Joshi, Dr. Conglin Lu, and Dr. Stephen Pizer at the University of North Carolina. This chapter contains parts of the paper [42] and is also based on the previous papers [40, 41]. 1 62 4.1.1 Intrinsic vs. Extrinsic Means 1 Given a set of points x1 , . . . , xN Rn , the arithmetic mean x = N N xi is the point i=1 that minimizes the sum-of-squared Euclidean distances to the given points, i.e., N x = arg min xRn i=1 ||x xi ||2 . Since a general manifold M may not form a vector space, the notion of an additive mean is not necessarily well dened. However, like the Euclidean case, the mean of a set of points on M can be formulated as the point which minimizes the sum-of-squared distances to the given points. This formulation depends on the denition of distance. One way to dene distance on M is to embed it in a Euclidean space and use the Euclidean distance between points. This notion of distance is extrinsic to M , that is, it depends on the ambient space and the choice of embedding. Given an embedding : M Rn , dene the extrinsic mean [53] of a collection of points x1 , . . . , xN M as N = arg min xM i=1 ||(x) (xi )||2 . Given the above embedding of M , we can also compute the arithmetic (Euclidean) mean of the embedded points and then project this mean onto the manifold M . This projected mean is equivalent to the above denition of the extrinsic mean (see [118]). Dene a projection mapping : Rn G as (x) = arg min ||(y) x||2 . yM Then the extrinsic mean is given by 1 = N N (xi ) . i=1 A more natural choice of distance is the Riemannian distance on M . This denition of distance depends only on the intrinsic geometry of M . We now dene the intrinsic mean of a collection of points x1 , . . . , xN M as the minimizer in M of the sum-of- 63 squared Riemannian distances to each point. Thus the intrinsic mean is N = arg min xM i=1 d(x, xi )2 , (4.1) where d(, ) denotes Riemannian distance on M . This is the denition of a mean value that we use in this paper. The idea of an intrinsic mean goes back to Frchet [44], who denes it for a general e metric space. The properties of the intrinsic mean on a Riemannian manifold have been studied by Karcher [67]. Moakher [90] compares the properties of the intrinsic and extrinsic mean for the group of 3D rotations. Since the intrinsic mean is dened in (4.1) as a minimization problem, its existence and uniqueness are not ensured. However, Kendall [74] shows that the intrinsic mean exists and is unique if the data is welllocalized. We argue that the intrinsic mean denition is preferable over the extrinsic mean. The intrinsic mean is dened using only the intrinsic geometry of the manifold in question, that is, distances that are dependent only on the Riemannian metric of the manifold. The extrinsic mean depends on the geometry of the ambient space and the embedding . Also, the projection of the Euclidean average back onto the manifold may not be unique if the manifold has negative sectional curvatures. 4.1.2 Computing the Intrinsic Mean Computation of the intrinsic mean involves solving the minimization problem in (4.1). We will assume that our data x1 , . . . , xn M lie in a suciently small neighborhood so that a unique solution is guaranteed. We must minimize the sum-of-squared distance function N 1 d(x, xi )2 . f (x) = 2N i=1 We now describe a gradient descent algorithm, rst proposed by Pennec [102], for minimizing f . Using the assumption that the xi lie in a strongly convex neighborhood, i.e., a neighborhood U such that any two points in U are connected by a unique geodesic contained completely within U , Karcher [67] shows that the gradient of f is 1 f (x) = N N Logx (xi ). i=1 64 The gradient descent algorithm takes successive steps in the negative gradient direction. Given a current estimate j for the intrinsic mean, the equation for updating the mean by taking a step in the negative gradient direction is j+1 = Expj where is the step size. Because the gradient descent algorithm only converges locally, care must be taken in the choices of the initial estimate of the mean 0 and the step size . Since the data is assumed to be well-localized, a reasonable choice for the initial estimate 0 is one of the data points, say x1 . The choice of is somewhat harder and depends on the manifold M . Buss and Fillmore [18] prove for data on spheres, a value of = 1 is sucient. Notice that if M is a vector space, the gradient descent algorithm with = 1 is equivalent to linear averaging and thus converges in a single step. If M = R+ , the Lie group of positive reals under multiplication, the algorithm with = 1 is equivalent to the geometric average and again converges in a single step. In summary we have the following algorithm for computing the intrinsic mean of manifold data: Algorithm 4.1: Intrinsic Mean Input: x1 , . . . , xN M Output: M , the intrinsic mean 0 = x 1 Do = N N Logj xi i=1 j+1 = Expj () While |||| > . N N Logj (xi ) , i=1 4.2 Principal Geodesic Analysis Although averaging methods on manifolds have previously been studied, principal component analysis has not been developed for manifolds. We present a new method called principal geodesic analysis (PGA), a generalization of principal component analysis to manifolds. We start with a review of PCA in Euclidean space. Consider a set of points x1 , . . . , xN Rn with zero mean. Principal component analysis seeks a sequence of linear subspaces that best represent the variability of the data. To be more precise, 65 the intent is to nd an orthonormal basis {v1 , . . . , vn } of Rn , which satises the recursive relationship N v1 = arg max ||v||=1 N k1 i=1 (v xi )2 , (vj xi )2 + (v xi )2 . (4.2) vk = arg max ||v||=1 i=1 j=1 (4.3) In other words, the subspace Vk = span({v1 , . . . , vk }) is the k-dimensional subspace that maximizes the variance of the data projected to that subspace. The basis {vk } is computed as the set of ordered eigenvectors of the sample covariance matrix of the data. Now turning to manifolds, consider a set of points x1 , . . . , xN on a manifold M . Our goal is to describe the variability of the xi in a way that is analogous to PCA. Thus we will project the data onto lower-dimensional subspaces that best represent the variability of the data. This requires rst extending three important concepts of PCA into the manifold setting: Variance. Following the work of Frchet, we dene the sample variance of the e data as the expected value of the squared Riemannian distance from the mean. Geodesic subspaces. The lower-dimensional subspaces in PCA are linear subspaces. For general manifolds we extend the concept of a linear subspace to that of a geodesic submanifold. Projection. In PCA the data is projected onto linear subspaces. We dene a projection operator for geodesic submanifolds, and show how it may be eciently approximated. We now develop each of these concepts in detail. 4.2.1 Variance The variance 2 of a real-valued random variable x with mean is given by the formula 2 = E[(x )2 ], where E denotes expectation. It measures the expected localization of the variable x about the mean. When dealing with a vector-valued random variable x in Rn with mean 66 , the variance is replaced by a covariance matrix = E[(x )(x )T ]. However, this denition may not be well dened for general manifolds again since vector space operations may not exist for such spaces. The denition of variance we use comes from Frchet [44], who denes the variance e of a random variable in a metric space as the expected value of the squared distance from the mean. That is, for a random variable x in a metric space with intrinsic mean , the variance is given by 2 = E[d(, x)2 ]. Thus given data points x1 , . . . , xN on a complete, connected manifold M , we dene the sample variance of the data as 2 = 1 N N d(, xi )2 = i=1 1 N N || Log (xi )||2 , i=1 (4.4) where is the intrinsic mean of the xi . If M is a vector space, the variance denition in (4.4) is given by the trace of the sample covariance matrix, i.e., the sum of its eigenvalues, or the total sum of squares about the mean. It is in this sense that this denition captures the total variation of the data. 4.2.2 Geodesic Submanifolds The next step in generalizing PCA to manifolds is to generalize the notion of a linear subspace. A geodesic is a curve that is locally the shortest path between points. In this way a geodesic is the generalization of a straight line. Thus it is natural to use a geodesic curve as the one-dimensional subspace that provides the analog of the rst principal direction in PCA. In general if N is a submanifold of a manifold M , geodesics of N are not necessarily geodesics of M . For instance the sphere S 2 is a submanifold of R3 , but its geodesics are great circles, while geodesics of R3 are straight lines. A submanifold H of M is said to be geodesic at x H if all geodesics of H passing through x are also geodesics of M . For example a linear subspace of Rn is a submanifold geodesic at 0. Submanifolds geodesic at x preserve distances to x. This is an essential property for PGA because variance 67 is dened as the average squared distance to the mean. Thus submanifolds geodesic at the mean will be the generalization of the linear subspaces of PCA. 4.2.3 Projection The projection of a point x M onto a geodesic submanifold H of M is dened as the point on H that is nearest to x in Riemannian distance. Thus we dene the projection operator H : M H as H (x) = arg min d(x, y)2 . (4.5) yH Since projection is dened by a minimization, there is no guarantee that the projection of a point exists or that it is unique. However, by restricting to a small enough neighborhood about the mean, we can be assured that projection is unique for any submanifold geodesic at the mean. 4.2.4 Dening Principal Geodesic Analysis We are now ready to dene principal geodesic analysis for data x1 , . . . , xN on a connected, complete manifold M . Our goal, analogous to PCA, is to nd a sequence of nested geodesic submanifolds that maximize the projected variance of the data. These submanifolds are called the principal geodesic submanifolds. Let T M denote the tangent space of M at the intrinsic mean of the xi . Let U T M be a neighborhood of 0 such that projection is well-dened for all geodesic submanifolds of Exp (U ). We assume that the data is localized enough to lie within such a neighborhood. The principal geodesic submanifolds are dened by rst constructing an orthonormal basis of tangent vectors v1 , . . . , vn T M that span the tangent space T M . These vectors are then used to form a sequence of nested subspaces Vk = span({v1 , . . . , vk }) U . The principal geodesic submanifolds are the images of the Vk under the exponential map: Hk = Exp (Vk ). The rst principal direction is chosen to maximize the projected variance along the corresponding geodesic: N v1 = arg max ||v||=1 i=1 || Log (H (xi ))||2 , (4.6) where H = Exp (span({v}) U ). 68 The remaining principal directions are dened recursively as N vk = arg max ||v||=1 i=1 || Log (H (xi ))||2 , (4.7) where H = Exp (span({v1 , . . . , vk1 , v}) U ). 4.2.5 An Alternative Denition of PGA Recall from Section 3.1.3 that principal component analysis may be dened in two dierent ways, both giving the same end result. Thus far, we have based the denition of principal geodesic analysis on generalizing the variance maximization approach to PCA. In this section we describe an alternative denition of PGA based on generalizing the other approach to PCA, namely, the least-squares approach. The least-squares approach to PCA of a collection of data x1 , . . . , xN Rn seeks a sequence of linear subspaces that are closest to the data in a least-squares sense. These subspaces are generated from an orthonormal basis {v1 , . . . , vn } of Rn , which satises the recursive relationship N v1 = arg min v =1 N xi v, xi v 2 , i=1 k1 2 vk = arg min v =1 i=1 xi j=1 vj , xi vj + v, xi v . In other words, the subspace Vk = span({v1 , . . . , vk }) is the k-dimensional subspace that minimizes the sum-of-squared distances to the data. We now want to dene principal geodesic analysis of data x1 , . . . , xN on a manifold M by generalizing this least-squares approach. The least-squares distance is dened using geodesic distances on the manifold. Using the same notation as in the previous subsection, we dene principal geodesic submanifolds via subspaces Vk = span({v1 , . . . , vk }) of the tangent space T M . The principal geodesic submanifolds are again given by Hk = Exp (Vk ). The rst principal direction is now chosen to minimize the sum-ofsquared distance of the data to the corresponding geodesic: N v1 = arg max ||v||=1 i=1 || Logxi (H (xi ))||2 , where H = Exp (span({v}) U ). 69 The remaining principal directions are dened recursively as N vk = arg max ||v||=1 i=1 || Logxi (H (xi ))||2 , where H = Exp (span({v1 , . . . , vk1 , v}) U ). The only dierence in these equations from the variance approach given in (4.6) and (4.7) is that the base point for the Log is xi rather than . The question immediately arises: is the least-squares approach to PGA equivalent to the maximum variance denition? For data in Rn the two denitions are equivalent since PGA reduces to PCA in the linear case. The question remains unsolved for more general manifolds. This issue is discussed further in the future work section in Chapter 7. 4.2.6 Approximating Principal Geodesic Analysis Exact computation of PGA, that is, solution of the minimizations (4.6) and (4.7), requires computation of the projection operator H . However, the projection operator does not have a closed-form solution for general manifolds. Projection onto a geodesic submanifold can be approximated linearly in the tangent space of M . Let H M be a geodesic submanifold at a point p M and x M a point to be projected onto H. Then the projection operator is approximated by H (x) = arg min || Logx (y)||2 yH arg min || Logp (x) Logp (y)||2 . yH Notice that Logp (y) is simply a vector in Tp H. Thus we may rewrite the approximation in terms of tangent vectors as Logp (H (x)) arg min || Logp (x) v||2 . vTp H But this is simply the minimization formula for linear projection of Logp (x) onto the linear subspace Tp H. So, if v1 , . . . , vk is an orthonormal basis for Tp H, the projection 70 operator can be approximated by the formula k Logp (H (x)) i=1 vi , Logp (x) . (4.8) Analyzing the quality of the approximation to the projection formula (4.8) is dicult for general manifolds. It obviously gives the exact projection in the case of Rn . For other manifolds of constant curvature, such as spheres, S n , and hyperbolic spaces, H n , the projection formula can be computed exactly in closed form. This makes it possible to get an idea of how well the linear approximation does in these cases. The error computations for the sphere S 2 are carried out at the end of this subsection as an example. If we use (4.8) to approximate the projection operator H in (4.6) and (4.7), we get N v1 arg max v =1 N k1 i=1 v, Log (xi ) 2 , 2 vk arg max v =1 i=1 j=1 vj , Log (xi ) + v, Log (xi ) 2 . The above minimization problem is simply the standard principal component analysis in T M of the vectors Log (xi ), which can be seen by comparing the approximations above to the PCA equations, (4.2) and (4.3). Thus an algorithm for approximating the PGA of data on a manifold is given by Algorithm 4.2: Principal Geodesic Analysis Input: x1 , . . . , xN M Output: Principal directions, vk T M Variances, k R = intrinsic mean of {xi } (Algorithm 4.1) ui = Log (xi ) 1 S = N N u i uT i i=1 {vk , k } = eigenvectors/eigenvalues of S. Now we demonstrate the error computations for the projection operator in the special case of the sphere S 2 . Let H be a geodesic (i.e., a great circle) through a point p S 2 . Given a point x S 2 , we wish to compute its true projection onto H and compare that with the approximation in the tangent space Tp S 2 . Thus we have the spherical right triangle as shown in Fig. 4.1. We know the hypotenuse length c = d(p, x) and the angle 71 x c b m q a Figure 4.1: The spherical triangle used in the calculation of the projection operator for S 2. , and we want to derive the true projection, which is given by the side length a. We use the following two relations from the laws of spherical trigonometry: cos c = (cos a)(cos b), sin b = sin c. sin Solving for a in terms of the hypotenuse c and the angle , we have a = arccos cos c 1 (sin sin b)2 . The tangent-space approximation in (4.8) is equivalent to solving for the corresponding right triangle in R2 . Using standard Euclidean trigonometry, the tangent-space approximation (4.8) gives a c cos . For nearby data, i.e., small values for c, this gives a good approximation. For example, for c < the maximum absolute error is 0.07rad. However, the error can be signicant 4 for far away points, i.e., as c approaches . 2 4.3 Conclusions In this chapter we have presented principal geodesic analysis, a new methodology for analyzing the statistical variability of data on a manifold. We reviewed two denitions 72 of a mean value on a manifold, the intrinsic and extrinsic mean, and we argued that the intrinsic mean was preferable. Principal geodesic analysis is dened as a direct generalization of principal component analysis using either a variance maximizing or least-squares approach. We gave an algorithm for computing an approximation to PGA in the tangent space to the mean. The methods is this chapter will be applied to medial representations in Chapter 5, with the application of using the statistics as a geometric prior in a Bayesian deformable models segmentation method. In Chapter 6 we will apply PGA to the study of diusion tensor data, with the driving application to study the statistical variability of diusion tensor images across patient populations. There are several theoretical questions about principal geodesic analysis that remain to be solved. They are 1. Are the two denitions for PGA, variance maximization and least-squares, equivalent? 2. Is the greedy, i.e., recursive, approach to nding the vk in (4.6) and (4.7) equivalent to nding each subspace Vk independently? 3. If the Vk are found independently, are they even subsets of one another, i.e., do they satisfy Vk Vk+1 ? Another open problem is an algorithm for computing principal geodesic analysis exactly when the projection operator is known in closed form. These issues will be discussed further in the future work section of Chapter 7. Chapter 5 Statistics of M-reps In this chapter1 we apply the statistical framework presented in the previous chapter for general manifolds to the statistical analysis of m-rep models of anatomical objects. Throughout this chapter we use the mesh-based medial representation with order 1 medial atoms as described in Section 3.3. Thus the term m-rep model will always refer to this type of medial representation. We rst show in Section 5.1 that the space of m-rep models containing n medial atoms is a symmetric space that we denote by M(n). As is the case with other shape analysis methods, since we are interested in studying the variability of shape alone, we must rst align the models to a common position, orientation, and scale. In Section 5.2 we present an m-rep alignment algorithm that minimizes the sum-of-squared geodesic distances between models, i.e., has the desirable property that it minimizes the same metric as is used in the denition of the mean and principal geodesics, but over the global similarity transformations of alignment. Next the mean and PGA algorithms are adapted to the specic case of m-rep models in Sections 5.3 and 5.4. The initial data is a set of m-rep models that have been t to a particular class of objects in a training set of images. Finally, in Section 5.5 we describe how the statistical methods developed in this chapter give both an optimization parameter space and a geometric prior in the Bayesian deformable m-reps segmentation method. The work presented in this chapter was done in collaboration with Dr. Sarang Joshi, Dr. Conglin Lu, and Dr. Stephen Pizer at the University of North Carolina. This chapter contains parts of the paper [42] and is also based on the previous papers [40, 41]. We acknowledge Jerey Lieberman and Guido Gerig, UNC Psychiatry, for providing hippocampal data from a clinical Schizophrenia study which was supported by Stanley Foundation and by the UNC-MHNCRC (MH33127). 1 74 5.1 M-reps as Elements of a Symmetric Space In this section it is shown that m-rep models can be parameterized as a symmetric space. This formulation will open up m-reps to the statistical methods introduced in the previous chapter, namely, manifold means and PGA. We then give formulas for the Riemannian log and exponential maps that are required to be able to compute the statistics described in the previous chapter. Finally, we describe the data set of 86 hippocampus m-rep models that are used to demonstrate the methods that are presented in later sections. Let Md (n) denote the space of all d-dimensional m-rep models containing n (order 1) medial atoms. The focus in this chapter will be on the case d = 3, so M(n) without a subscript will be used to denote the three-dimensional case. Recall from Denition 3.5 that a three-dimensional order 1 medial atom is dened as a tuple (x, r, n0 , n1 ) R3 R+ S 2 S 2 . Recall that such an atom represents a position x and two equal length vectors emanating from this position with length r and directions n0 , n1 . Therefore, M(1) = R3 R+ S 2 S 2 is the space of all possible order 1 medial atoms. In general an m-rep model with n medial atoms is a point in the space M(n) = M(1)n = (R3 R+ S 2 S 2 )n , i.e., the direct product of n copies of M(1). As was shown in the background section on symmetric spaces (Section 2.5), each of the space R3 , R+ , and S 2 are symmetric spaces. Therefore, to show that M(n) is a symmetric space it suces to show that the direct product of symmetric spaces is also a symmetric space. (This is a well-known fact of symmetric spaces, but a quick derivation is given here all the same.) Recall from Theorem 2.4 that the direct product of Lie groups is again a Lie group. The direct product operation is also dened for mappings, as the next denition shows. Denition 5.1. The direct product of a collection of maps fi : Xi Yi , (1 i n), where Xi , Yi are sets, is dened as the map (f1 fn ) : X1 Xn Y1 Yn given by (f1 fn )(x1 , . . . , xn ) = (f1 (x1 ), . . . , fn (xn )). Now, if Gi , (1 i n) are Lie groups with automorphisms i : Gi Gi , it is easy to see that the product map 1 n is an automorphism of G1 Gn . The next theorem now follows easily from Theorems 2.4 and 2.10. Theorem 5.1. If Mi : 1 i n are symmetric spaces, then the direct product manifold M = M1 Mn is also a symmetric space. Proof. Since Mi is a symmetric space, by Theorem 2.11it can be written as the quotient space Mi = Gi /Hi , where Gi is a connected Lie group, and Hi is a connected, compact 75 subgroup of Gi . Also, from Theorem 2.11 there is an involutive automorphism i : Gi Gi that has xed set Hi . The direct product G = (G1 Gn ) is a connected Lie group by Theorem 2.4, and the direct product H = (H1 Hn ) is a connected, compact Lie subgroup of G. Also, the product map = (1 n ) is an involutive automorphism of G with xed set H. Now, M is dieomorphic to the Lie group quotient space G/H because of the equivalence (G1 /H1 ) (Gn /Hn ) = (G1 Gn )/(H1 Hn ). Therefore, M satises the conditions to be a symmetric space given in Theorem 2.10. 5.1.1 The Exponential and Log Maps for M-reps Before we can apply the statistical techniques for manifolds developed in the previous chapter, we must dene the exponential and log maps for the symmetric space M(n), the space of m-rep models with n atoms. We begin with a discussion of the medial atom space M(1) = R3 R+ S 2 S 2 . Let p = (0, 1, p0 , p1 ) M(1) be the base point, where p0 = p1 = (0, 0, 1) are the base points for the spherical components. The tangent space for M(1) at the base point p can be identied with R8 . We write a tangent vector u Tp M(1) as u = (x, , v0 , v1 ), where x R3 is the positional tangent component, R is the radius tangent component, and v0 , v1 R2 are the spherical tangent components. The exponential map for M(1) is now the direct product of the exponential map for each component. The exponential map for R3 is simply the identity map, for R it is the standard real exponential function, and for S 2 it is the spherical exponential map given in (2.5). Thus for M(1) we have Expp (u) = (x, e , Expp0 (v0 ), Expp1 (v1 )), where the two Exp maps on the right-hand side are the spherical exponential maps. Likewise, the log map of a point m = (x, r, n0 , n1 ) is the direct product map Logp (m) = (x, log r, Logp0 (n0 ), Logp1 (n1 )), where the two Log maps on the right-hand side are the spherical log maps given by (2.6). Finally, the exponential and log maps for the m-rep model space M(n) are just the direct products of n copies of the corresponding maps for the medial atom space M(1). For end atoms there is an extra parameter representing the elongation of the bisector spoke that points to the crest (see Section 3.3.2). This is treated as another positive real number under multiplication. Therefore, end atoms are represented as the 76 Figure 5.1: The surfaces of 16 of the 86 original hippocampus m-rep models. symmetric space R3 R+ S 2 S 2 R+ . The exponential and log maps for these atoms are just augmented with another copy of the corresponding map for R+ . Notice that the position, radius, and orientations are not in the same units. For the PGA calculations in Section 4.2 we scale the radius and sphere components (and for end atoms) in the Riemannian metric to be commensurate with the positional components. The scaling factor for both components is the average radius over all corresponding medial atoms in the population. Thus the norm of the vector u = Tp M(1) becomes ||u|| = ||x||2 + r2 (2 + ||v1 ||2 + ||v2 ||2 ) 1 2 , where r is the average radius over all corresponding medial atoms. Using this norm and the formula for Riemannian distance, the distance between two atoms m1 , m2 M(1) is given by d(m1 , m2 ) = || Logm1 (m2 )||. (5.1) 5.1.2 The Hippocampus Data Set The results of these techniques are demonstrated on a set of 86 m-rep models of hippocampi from a schizophrenia study. A subset of 16 of these models are displayed as 77 surfaces in Fig. 5.1. The m-rep models were automatically generated by the method described in [122], which chooses the medial topology and sampling that is sucient to represent the population of objects. The models were t to expert segmentations of the hippocampi from MRI data. The average distance error from the m-rep boundary to the original segmentation boundary ranged from 0.14mm and 0.27mm with a mean error of 0.17mm. This is well within the original MRI voxel size (0.9375mm x 0.9375mm 1.5mm). The sampling on each m-rep was 3 8, making each model a point on the symmetric space M(24). Since the dimensionality of M(1) is 8, the total number of dimensions required to represent the hippocampus models is 192. 5.2 M-rep Alignment To globally align objects described by boundary points to a common position, orientation, and scale, the standard method is the Procrustes method [47]. Procrustes alignment minimizes the sum-of-squared distances between corresponding boundary points, the same metric used in dening the mean and principal components. We now develop an analogous alignment procedure based on minimizing sum-of-squared geodesic distances on M(n), the symmetric space of m-rep objects with n atoms. Let S = (s, R, w) denote a similarity transformation in R3 consisting of a scaling by s R+ , a rotation by R SO(3), and a translation by w R3 . We dene the action of S on a medial atom m = (x, r, n0 , n1 ) by S m = S (x, r, n0 , n1 ) = (sR x + w, sr, R n0 , R n1 ). (5.2) This action is the standard similarity transform of the position x, and the scaling and rotation of the spokes are transformations about the medial position x. Now the action of S on an m-rep object M = {mi : i = 1, . . . , n} is simply the application of S to each of Ms medial atoms: S M = {S mi : i = 1, . . . , n}. (5.3) It is easy to check from the equation for the implied boundary points (3.7) that this action of S on M also transforms the implied boundary points of M by the similarity transformation S. Consider a collection M1 , . . . , MN M(n) of m-rep objects to be aligned, each consisting of n medial atoms. We write mi to denote the ith medial atom in the th m-rep object. Notice that the m-rep parameters, which are positions, orientations, and 78 scalings, are in dierent units. Before we apply PGA to the m-reps, it is necessary to make the various parameters commensurate. This is done by scaling the log rotations and log radii by the average radius value of the corresponding medial atoms. The squared-distance metric between two m-rep models Mi and Mj becomes n d(Mi , Mj ) = =1 2 d(mi , mj )2 , (5.4) where the d(, ) for medial atoms on the right-hand side is given by (5.1). The m-rep alignment algorithm nds the set of similarity transforms S1 , . . . , SN that minimize the total sum-of-squared distances between the m-rep gures: N i d(S1 , . . . , SN ; M1 , . . . , MN ) = i=1 j=1 d(Si Mi , Sj Mj )2 . (5.5) Following the algorithm for generalized Procrustes analysis for objects in R3 , minimization of (5.5) proceeds in stages: Algorithm 5.1: M-rep Alignment 1. Translations. First, the translational part of each Si in (5.5) is minimized once and for all by centering each m-rep model. That is, each model is translated so that the average of its medial atoms positions is the origin. 2. Rotations and Scalings. The ith model, Mi , is aligned to the mean of the remaining models, denoted i . The alignment is accomplished by a gradient descent algorithm on SO(3) R+ to minimize d(i , Si Mi )2 . The gradient is approximated numerically by a central dierences scheme. This is done for each of the N models. 3. Iterate. Step 2 is repeated until the metric (5.5) cannot be further minimized. The result of applying the m-rep alignment algorithm to the 86 hippocampus m-rep models is shown in Fig. 5.2. The resulting aligned gures are displayed as overlaid medial atom centers. Since the rotation and scaling step of the alignment algorithm is a gradient descent algorithm, it is important to nd a good starting position. Thus the alignment was initialized by rst aligning the m-rep models with the Procrustes method applied to the implied boundary points of the m-rep models. 79 Figure 5.2: The 86 aligned hippocampus m-reps, shown as overlayed medial atom centers. 5.3 M-rep Averages Algorithm 4.1 can be adapted for computing means of m-rep models by taking the manifold to be the symmetric space M(n). Recall that the gradient descent algorithm for the mean, Algorithm 4.1, has a parameter , which is the step size taken in the downhill gradient direction. For m-reps a step size of = 1 is used. Since M(n) is a direct product space, the algorithm will converge if each of the components converge. Notice that each of the R3 and R+ components in M(n) converge in a single iteration since they are commutative Lie groups. The step size of = 1 is sucient to ensure that the S 2 components converge as well. Also, care must be taken to ensure that the data is contained in a small enough neighborhood that the minimum in (4.1) is unique. For the R3 and R+ components there is no restriction on the spread of the data. However, for the S 2 components the data must lie within a neighborhood of radius (see [18]), 2 i.e., within an open hemisphere. This is a reasonable assumption for the aligned m-rep models, whose spoke directions for corresponding atoms are fairly localized, and we have not experienced in practice any models that do not fall within such constraints. We now have the following algorithm for computing the intrinsic mean of a collection of m-rep models: 80 Figure 5.3: The surface of the mean hippocampus m-rep. Algorithm 5.2: M-rep Mean Input: M1 , . . . , MN M(n), m-rep models Output: M(n), the intrinsic mean 0 = M 1 Do 1 = N N Logj Mi i=1 j+1 = Expj () While |||| > . Fig. 5.3 shows the surface of the resulting intrinsic mean of the 86 aligned hippocampus m-rep models computed by Algorithm 5.2. The maximum dierence in the rotation angle from the mean in either of the S 2 components was 0.1276 for the entire data set. Thus the data falls well within a neighborhood of radius as required. 2 One might be tempted to simplify the statistical computations by treating a medial atom as three points in R3 : the center point x, and the two implied boundary points y0 , y1 . With this linear representation, the symmetric space mean algorithm involving geodesic computations is replaced by a simpler linear average. However, linear averaging produces invalid medial atoms. To demonstrate this, we computed a linear average of the atoms at a corresponding location in the hippocampus mesh across the population. This average was compared to the symmetric space average described in this paper. The resulting two medial atoms are shown in Fig. 5.4. The symmetric space mean is a valid medial atom, while the linear average is not because the two spoke vectors do not have equal length. The ratio of the two spoke lengths in the linear average is 1.2 to 1. 81 (a) (b) Figure 5.4: The resulting average of corresponding medial atoms in the hippocampus models using (a) symmetric space averaging and (b) linear averaging. Notice that the linear average is not a valid medial atom as the two spokes do not have equal length. 5.4 M-rep PGA The PGA algorithm for m-rep models is a direct adaptation of Algorithm 4.2. The only concern is to check that the data is localized enough for the projection operator to be unique. That is, we must determine the neighborhood U used in (4.6) and (4.7). Again there is no restriction on the R3 and R+ components. For S 2 components it is also sucient to consider a neighborhood with radius . Therefore, there are no further 2 constraints on the data than those discussed for the mean. Also, we can expect the projection operator to be well-approximated in the tangent space, given the discussion of the error in Section 4.2.3 and the fact that the data lie within 0.1276 rad. from the mean. Finally, the computation of the PGA of a collection of m-rep models is given by Algorithm 5.3: M-rep PGA Input: M-rep models, M1 , . . . , MN M(n) Output: Principal directions, vk T M(n) Variances, k R = intrinsic mean of {Mi } (Algorithm 5.2) ui = Log (Mi ) 1 S = N N ui uT i i=1 {vk , k } = eigenvectors/eigenvalues of S. 82 Figure 5.5: The rst three PGA modes of variation for the hippocampus m-reps. From left to right are the PGA deformations for 3, 1.5, 1.5, and 3 times i . Analogous to linear PCA models, we may choose a subset of the principal directions vk that is sucient to describe the variability of the m-rep shape space. New m-rep models may be generated within this subspace of typical objects. Given a set of real coecients = (1 , . . . , d ), we generate a new m-rep model by d M() = Exp k=1 k vk , (5.6) where k is chosen to be within [3 k , 3 k ]. The m-rep PGA algorithm was applied to the aligned hippocampus data set. Fig. 5.5 displays the rst three modes of variation as the implied boundaries of the m-reps generated from PGA coecients k = 3 k , 1.5 k , 0, 1.5 k , 3 k . A plot of the eigenvalues and their cumulative sums is given in Fig. 5.6. The rst 30 modes capture 95 percent of the total variability, which is a signicant reduction from the original 192 dimensions of the hippocampus m-rep model. In this statistical analysis of the hippocampus, the resulting mean model (Fig. 5.3) and the models generated from the PGA (Fig. 5.5) qualitatively look like hippocampi. Also, the generated models are legal m-reps, that is, they produce valid meshes of medial 83 100 % of Total Variance 80 60 40 20 Cumulative 95% Line Eigenvalues 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 Number of Modes Figure 5.6: A plot of the eigenvalues from the modes of variation and their cumulative sums. atoms and smooth, non-folding implied boundaries. The mean and PGA algorithms have also been applied to populations of m-rep models of the kidney, prostate, heart, and liver. In our experience so far, we have found that the mean and PGA methods described in this chapter generate legal m-rep models when the input models are legal. While we do not have quantitative results to say that these methods produce legal models, our experiments indicate that they produce valid results for real-world data. 5.5 PGA in Deformable M-reps Segmentation In this section we describe how the method of principal geodesic analysis on m-reps that has been developed in this chapter can be used in a Bayesian deformable models segmentation method based on m-reps. Recall from Section 3.3.2 that m-reps segmentation proceeds in several stages corresponding to dierent levels of scale. In this section we focus on the gure stage of the optimization of a single gure model. Principal geodesic analysis will be used in two aspects of the segmentation process: 1. The principal geodesic components are used as a parameter space generating global deformations of the m-rep gure. 2. The geodesic Mahalanobis distance is used as the geometric prior term in the Bayesian objective function. In the segmentation problem we are given an image I, and we want to t an mrep model to a particular object in the image. A statistical m-rep model is trained on a population of known objects of the same class. The training proceeds by tting 84 a set of m-rep models to binary segmentations of objects from similar images. Next a mean m-rep model and a principal geodesic analysis are computed as described above. The principal geodesic analysis results in a set of principal directions vk T M(n) and variances k . The rst d principal directions are chosen depending on the desired amount of variation that is desired. 5.5.1 Principal Geodesic Deformations The mean model is used as the initial model in the optimization. It is placed within the image by the user applying a translation, rotation, and scale. As described in the background section on m-reps (Section 3.3.2), the gure stage proceeds by deforming the model by global transformations to optimize the objective function. The dierence is that we now use the principal geodesics as the global deformations of the model. This is achieved by optimizing over parameters c = (c1 , . . . , cd ) that generate deformed versions of the mean model given by d M(c) = S Exp i=1 ck vk . Here S represents the user-dened similarity transform used to place the mean model into the image. Care must be taken in the order that the similarity transform is applied with respect to the PGA transformations. The two operations do not commute, and since the principal directions are dened as tangent vectors to the mean model, it does not make sense to apply them to a transformed version of the mean. Therefore, the similarity transform must be applied after the principal geodesic deformation. An alternative would be to apply the similarity transform to the mean and also apply the derivative mapping of the similarity transform to the principal directions (since they are after all tangent vectors). Then the vk can be replaced by the transformed vectors, and the similarity transform need not be applied during the optimization. 5.5.2 PGA-Based Geometric Prior The next part of using principal geodesic analysis in the deformable m-reps segmentation is to use the geodesic Mahalanobis distance as a geometric prior in the objective function. Recall from Section 3.3.2 that the posterior objective function used for m-reps 85 segmentation is given by F (M(c), I) = L(M(c), I) + G(M(c)), where L is the image match term and G is the geometric typicality. In the Bayesian setting this objective function F with = 1 can be seen as a log posterior probability density, where the image match L is a log likelihood probability and the geometric typicality G is a log prior probability. We focus on the geometric typicality term G. We dene this term to be the squared geodesic Mahalanobis distance, which is proportional to the log prior probability d G(M(c)) = i=1 c2 k log(p(M(c)). k The probability distribution p can be constructed as a truncated Gaussian distribution in the tangent space to the intrinsic mean, M(n). If U T M(n) is the neighborhood in which PGA is well-dened (recall Section 4.2.4), then p is given by p(M) = 1 exp Log (M)T 1 Log (M) , 2 V (U )(2) || 1 n 2 1 2 where V (U ) denotes the normalization factor based on the neighborhood U to make p integrate to 1 and is the covariance matrix approximated in Algorithm 5.3. It should be stressed that this distribution is not a Gaussian distribution on the manifold M(n) as dened in [61] (recall Section 3.1.4). That is, the density p is not a fundamental solution to the heat equation on M(n). Also, the intrinsic mean and the covariance matrix that are derived from the training data are not maximum-likelihood estimates of the density parameters. It is not clear that a Gaussian distribution is the correct model, and further research is required to investigate possible probability models and the estimation of their parameters. The statistical segmentation method presented in this section has been implemented as a part of Pablo [104], the deformable m-reps segmentation tool developed at UNC. A study carried out by Rao et al. [107] compared deformable m-rep and human segmentations of kidneys from CT. The m-rep segmentation process used was the one presented in this section. The training set for the geometry statistics included 53 models of the right kidney and 51 models of the left kidney (left and right kidneys were trained as two separate groups). The target images to be segmented were 12 CT images of the 86 kidneys (left and right). Human segmentations were carried out by manual slice-byslice contour outlining by two dierent raters. The statistical m-rep segmentation gave reasonable results that compared favorably with the human segmentations. The mean surface separations between the human and m-rep segmentations were sub-voxel. The dierences between the human and m-rep segmentations were slightly larger than the dierences between the two human segmentations. However, the experiment was biased towards this result since the humans used a slice-based segmentation while the m-reps segmentation was a smooth 3D model. 5.6 Conclusions In this chapter we demonstrated how statistical m-rep shape models can be built using the mean and PGA methods presented in the previous chapter. We rst showed that m-rep models are elements of a symmetric space and gave formulas for the Riemannian log and exponential map. We then developed an alignment method for m-rep models analogous to the Procrustes alignment method for point set shape models, except that the m-rep alignment method is based on a least-squares approach using geodesic distances on M(n). Finally, we adapted the mean and PGA algorithms to the m-rep case. These methods were demonstrated on a set of 86 hippocampus m-rep models t from expert binary segmentations. The work of this chapter brings up several questions that remain to be answered: 1. Following an analogous construction (see Section 3.1) to the one used to build Kendalls shape spaces, k = (Rnk {0})/Sim(n), a medial shape space can n be constructed as the quotient M(n)/Sim(3). In other words, the medial shape space is the space created by identifying m-rep models that are dierent by only a similarity transform. An open problem is to classify the topology and Riemannian structure of these medial shape spaces. 2. The m-rep statistics described in this chapter used the tangent space approximation to principal geodesic analysis. It would be preferable to solve for the projection operator explicitly and nd the true principal geodesic analysis. In addition, averages and principal geodesic analysis could be computed on the medial shape space M(n)/Sim(3) rather than on aligned models in the space M(n). 3. A thorough validation of the segmentation method using PGA is ongoing. It is designed to test whether the statistical segmentation method has advantages over 87 a method without geometry statistics. This is measured in both advantages in segmentation accuracy and in the speed of the segmentation. 4. The methods presented in this chapter were for single-gure objects. This could be extended to multi-gure models and multi-object complexes. The challenge for multi-gure models is to allow only variations that preserve the hinge relationship of a child gure with its parent. Care must be taken in the multi-object situation to prevent adjacent objects from intersecting in the PGA deformations. These issues will be discussed further in the future work section in Chapter 7. 88 Chapter 6 Statistics of Diusion Tensors As discussed in the background chapter (Section 3.4) diusion tensor magnetic resonance imaging (DT-MRI) is emerging as an important tool in medical image analysis of the brain. However, relatively little work has been done on producing statistics of diusion tensors. A main diculty is that the space of diusion tensors, i.e., the space of symmetric, positive-denite matrices, does not form a vector space. Therefore, standard linear statistical techniques do not apply. This chapter1 presents new methods for the statistical analysis of diusion tensors. We demonstrate that the space of diusion tensors is more naturally described as a Riemannian symmetric space, rather than a linear space. Applying the ideas presented in Chapter 4 to this space, we develop new methods for averaging and describing the variability of diusion tensor data. It is shown that these statistics preserve natural properties of the diusion tensors, most importantly the positive-deniteness, that are not preserved by linear statistics. The framework presented in this chapter should be useful in the registration of diusion tensor images, the smoothing of diusion tensor images, the production of statistical atlases from diusion tensor data, and the quantication of the anatomical variability caused by disease. In Section 6.1 we show why the space of diusion tensors is not a linear space and how linear statistical methods such as PCA break down in this space. In Section 6.2 we formulate the space of diusion tensors as a Riemannian symmetric space. Section 6.3 presents the methods for averaging and principal geodesic analysis of diusion tensors. Finally, Section 6.5 develops several new methods based on the symmetric space formulation of diusion tensors that are essential for building statistical atlases of diusion The work presented in this chapter was done in collaboration with Dr. Sarang Joshi at the University of North Carolina. This chapter is an expanded version of the paper [39]. 1 90 tensor images. These methods include (1) a new similarity measure for comparing diusion tensors, (2) a method for interpolating diusion tensors, and (3) a new anisotropy measure. 6.1 The Space of Diusion Tensors Recall that a real nn matrix A is symmetric if A = AT and positive-denite if xT Ax > 0 for all nonzero x Rn . We denote the space of all n n symmetric, positive-denite matrices as P D(n). The tensors in DT-MRI are thus elements of P D(3). The space 2 P D(n) forms a convex subset of Rn . One can dene a linear average of N positive1 denite, symmetric matrices A1 , . . . , AN as = N N Ai . This denition minimizes i=1 n2 the Euclidean metric on R . Since P D(n) is convex, lies within P D(n). However, linear averages do not interpolate natural properties. The linear average of matrices of the same determinant can result in a matrix with a larger determinant. Second order statistics are even more problematic. The standard principal component analysis is invalid because the straight lines dened by the modes of variation do not stay within the space P D(n). In other words, linear PCA does not preserve the positive-deniteness of diusion tensors. The reason for such diculties is that space P D(n), although a subset of a vector space, is not a vector space; for example, the negation of a positivedenite matrix is not positive-denite. In this chapter we derive a more natural metric on the space of diusion tensors, 2 P D(n), by viewing it not simply as a subset of Rn , but rather as a Riemannian symmetric space. Following Frchet [44], we dene the average as the minimum mean squared e error estimator under this metric. We apply the method of principal geodesic analysis developed in Chapter 4 to describe the variability of diusion tensor data. In this framework the modes of variation are represented as ows along geodesic curves, i.e., shortest paths under the Riemannian metric. These geodesic curves, unlike the straight lines of 2 Rn , are completely contained within P D(n), so they preserve the positive-deniteness. To illustrate these issues, consider the space P D(2), the 2 2 symmetric, positivedenite matrices. A matrix A P D(2) is of the form A= a b , b c ac b2 > 0, a > 0. If we consider the matrix A as a point (a, b, c) R3 , then the above conditions describe the interior of a cone as shown in Fig. 6.1. The two labeled points are p0 = (1, 0, 7), p1 = 91 p0 c a p1 g l b Figure 6.1: The space P D(2), showing the geodesic and the straight line l between the two points p0 and p1 . (7, 0, 1). The straight line l between the two points, i.e., the geodesic in Rn , does not remain contained within the space P D(2). The curve is the geodesic between the two points when P D(2) is considered as a Riemannian symmetric space. This geodesic lies completely within P D(2). We chose P D(2) as an example since it can be easily visualized, but the same phenomenon occurs for general P D(n), i.e., n > 2. 2 6.2 The Geometry of P D(n) In this section we show that the space of diusion tensors, P D(n), can be formulated as a Riemannian symmetric space. This leads to equations for computing geodesics that will be essential in dening the statistical methods for diusion tensors. The dierential geometry of diusion tensors has also been used in [23], where the diusion tensor smoothing was constrained along geodesic curves. The fact that P D(n) is a symmetric space has been known for some time. In fact, Cartan accomplished a complete classication of the possible symmetric spaces in two papers in 1926 and 1927 [19, 20]. A review of symmetric spaces can be found in [15, 58]. Recall from Theorem 2.8 that a symmetric space is a connected Riemannian manifold M such that for each x M there is an isometry x that reverses all geodesics through the point x. We will show that the space P D(n) is a Riemannian symmetric space by constructing a transitive Lie group action on it. This leads to a natural Riemannian 92 metric on P D(n) that is invariant under the group action. The equations for computing geodesics are then derived from the action of one-parameter subgroups. 6.2.1 The Lie Group Action on P D(n) Consider the Lie group of all n n real matrices with positive determinant, denoted GL+ (n). This group acts on P D(n) via : GL+ (n) P D(n) P D(n) (g, p) = gpg T . We will sometimes write the group action as g p = (g, p). We will show that this action satises the conditions in Theorem 2.10 for P D(n) to be a symmetric space, namely, 1. The action is transitive. 2. The Lie group GL+ (n) is connected. 3. The resulting isotropy subgroup is compact. 4. There is an involutive automorphism of GL+ (n) leaving the isotropy subgroup xed. We prove each of these conditions in turn. (1) Recall that the group action is transitive if for any two points p, q P D(n), there exists an element g GL+ (n) such that q = (g, p). Let In denote the n n identity matrix. Given a matrix p P D(n) let p = U U T be the SVD of p. Then p can be written as the product p = gg T , where g GL+ (n) is given by g = U (1/2) . Therefore, In = (g 1 , p). Now write q = hhT for h GL+ (n). Then q = h(g 1 , p)hT = (h, (g 1 , p)) = (hg 1 , p), which shows that is transitive. In other words, the space P D(n) is a homogeneous space. (2) Let g1 , g2 be two matrices in GL+ (n). To show that GL+ (n) is connected, we show that there is a continuous path in GL+ (n) connecting g1 and g2 . Let g1 = U1 1 V1 and g2 = U2 2 V2 be singular value decompositions with Ui , Vi SO(n). We can safely assume that the matrices i have positive diagonal entries. If i has negative entries, (6.1) 93 we can write gi = Ui i Vi , where i = i Ri , Vi = Ri Vi , and Ri is the diagonal rotation matrix with 1 in the entries corresponding to negative values in the i and +1 in the other entries. Consider the paths ci : [0, 1] GL+ (n) for i = 1, 2 given by ci (s) = exp(s log Ui ) s exp(s log Vi ). i These are continuous paths from ci (0) = In to ci (1) = gi . Now consider the path c : [0, 1] GL+ (n) given by c(s) = c1 (1 s) c2 (s). This path is continuous with c(0) = g1 and c(1) = g2 . Furthermore, for any s [0, 1] we have (1s) det(c(s)) = det(1 s ) = det(g1 )(1s) det(g2 )s > 0. 2 This shows that the path c lies completely within GL+ (n), so GL+ (n) is connected. (3) The isotropy subgroup of In under is the set of all matrices g GL+ (n) that satisfy (g, In ) = In . Thus, the isotropy subgroup is given by SO(n) = {g GL+ (n) : gg T = In }, the space of n n rotation matrices. This is a compact Lie subgroup of GL+ (n) as was mentioned in the background section on Lie groups (Section 2.4). (4) The mapping : GL+ (n) GL+ (n) given by (g) = (g 1 )T is an involutive automorphism that leaves SO(n) xed. Therefore, the space of diusion tensors, P D(n), is a symmetric space and equivalent to the quotient space GL+ (n)/SO(n). An intuitive way to view this quotient is to think of the polar decomposition, which decomposes a matrix g GL+ (n) as g = pu, where p P D(n) and u SO(n). Thus, the diusion tensor space P D(n) GL+ (n)/SO(n) = comes from dividing out the rotational component in the polar decomposition of GL+ (n). 6.2.2 The Invariant Metric on P D(n) The space of diusion tensors, P D(n), has a Riemannian metric that is invariant under the GL+ (n) action, which follows from the fact that the isotropy subgroup SO(n) is connected and compact (recall Theorem 2.9). 94 The tangent space of P D(n) at the identity matrix can be identied with the space of n n symmetric matrices, Sym(n). Since the group action g : s gsg T is linear, its derivative map, denoted dg , is given by dg (X) = gXg T . If X Sym(n), it is easy to see that dg (X) is again a symmetric matrix. Thus the tangent space at any point p P D(n) is also identiable with Sym(n). If X, Y Sym(n) represent two tangent vectors at p P D(n), where p = gg T , g GL+ (n), then the Riemannian metric at p is given by the inner product X, Y p = tr(g 1 Xp1 Y (g 1 )T ). Finally, the mapping In (p) = p1 is an isometry that reverses geodesics of P D(n) at the identity, and this turns P D(n) into a symmetric space. 6.2.3 Computing Geodesics Geodesics on a symmetric space are easily derived via the group action (see [58] for details). Let p be a point on P D(n) and X a tangent vector at p. There is a unique geodesic, , with initial point (0) = p and tangent vector (0) = X. To derive an equation for such a geodesic, we begin with the special case where the initial point p is the n n identity matrix, In , and the tangent vector X is diagonal. Then the geodesic is given by (t) = exp(tX), where exp is the matrix exponential map given by the innite series exp(X) = k=0 1 k X . k! For the diagonal matrix X with entries xi , the matrix exponential is simply the diagonal matrix with entries exi . Now for the general case consider the geodesic starting at an arbitrary point p P D(n) with arbitrary tangent vector X Sym(n). We will use the group action to map this conguration into the special case described above, i.e., with initial point at the identity and a diagonal tangent vector. Since the group action is an isometry, geodesics and distances are preserved. Let p = gg T , where g GL+ (n). Then the action g1 maps p to In . The tangent vector is mapped via the corresponding tangent map to Y = dg1 (X) = g 1 X(g 1 )T . Now we may write Y = vv T , where v is a rotation 95 matrix and is diagonal. The group action v1 diagonalizes the tangent vector while leaving In xed. We can now use the procedure above to compute the geodesic with initial point (0) = In and tangent vector (0) = . Finally, the result is mapped back to the original conguration by the inverse group action, gv . That is, (t) = gv ( (t)) = (gv) exp(t)(gv)T . If we ow to t = 1 along the geodesic we get the Riemannian exponential map at p. That is, Expp (X) = (1). In summary we have Algorithm 6.1: Riemannian Exponential Map Input: Initial point p P D(n). Tangent vector X Sym(n). Output: Expp (X) Let p = uuT (u SO(n), diagonal) g=u Y = g 1 X(g 1 )T Let Y = vv T (v SO(n), diagonal) Expp (X) = (gv) exp()(gv)T An important property of the geodesics in P D(n) under this metric is that they are innitely extendible, i.e., the geodesic (t) is dened for < t < . This follows from the fact that all symmetric spaces are complete (Theorem 2.8). Again, Fig. 6.1 demonstrates that the symmetric space geodesic remains within P D(2) for all t. In contrast the straight line l quickly leaves the space P D(2). The map Expp has an inverse, called the Riemannian log map and denoted Logp . It maps a point x P D(n) to the unique tangent vector at p that is the initial velocity of the unique geodesic with (0) = p and (1) = x. Using a similar diagonalization procedure, the log map is computed by 96 Algorithm 6.2: Riemannian Log Map Input: Initial point p P D(n). End point x P D(n). Output: Logp (x) Let p = uuT (u SO(n), diagonal) g=u y = g 1 x(g 1 )T Let y = vv T (v SO(n), diagonal) Logp (x) = (gv) log()(gv)T Using the notation of Algorithm 6.2, geodesic distance between the diusion tensors p, x P D(n) is computed by d(p, x) = || Logp (x)||p = tr(log()2 ). (6.2) 6.3 Statistics of Diusion Tensors Having formulated the geometry of diusion tensors as a symmetric space, we now develop methods for computing statistics in this nonlinear space. The algorithms for computing the mean and PGA will be direct adaptations of the algorithms described in Chapter 4 to the space P D(n). The computations for the log and exponent maps described in the previous section will be instrumental in these statistical methods. 6.3.1 Averages of Diusion Tensors Again we dene the intrinsic mean of a set of diusion tensors p1 , . . . , pN P D(n) as the diusion tensor that minimizes the sum-of-squared distance to the pi . That is, the intrinsic mean is given by N = arg min pP D(N ) i=1 d(p, pi )2 . (6.3) Again let A denote the sum-of-squared distance function for the set of points A = {p1 , . . . , pN }, that is, N 1 d(p, pi )2 . A (p) = 2N i=1 97 Recall that the gradient of can be computed as 1 A (p) = N N Logp (pi ) i=1 when the data lie in a strongly convex neighborhood. In fact, the entire space P D(n) is strongly convex; that is, any two points can be connected by a unique geodesic curve. This fact follows from the curvature properties of P D(n) (it has everywhere non-positive sectional curvature). Therefore, the minimization problem for the intrinsic mean (6.3) has a unique solution. The gradient descent algorithm for computing the mean of diusion tensors, which is a direct adaptation of Algorithm 4.1, is given by Algorithm 6.3: Intrinsic Mean of Diusion Tensors Input: p1 , . . . , pN P D(n) Output: P D(n), the intrinsic mean 0 = I Do 1 Xi = N N Logi (pk ) k=1 i+1 = Expi (Xi ) While ||Xi || > . 6.3.2 Principal Geodesic Analysis of Diusion Tensors We are now in a position to dene principal geodesic analysis for diusion tensor data p1 , . . . , pN P D(n). Our goal, analogous to PCA, is to nd a sequence of nested geodesic submanifolds that maximize the projected variance of the data, according to the dening equations (4.6) and (4.7) presented in Chapter 4. We must again determine under what conditions the projection operator (4.5) for the space P D(n) is unique. That is, we must determine the neighborhood U used in the PGA equations (4.6) and (4.7). Actually, since P D(n) has non-positive sectional curvature, the projection operator is well-dened for the entire space U = P D(n). We will use the tangent space approximation to the projection operator. That is, we will adapt Algorithm 4.2 for the tangent space approximation to PGA, giving the following algorithm on P D(n): 98 Algorithm 6.4: PGA of Diusion Tensors Input: p1 , . . . , pN P D(n) Output: Principal directions, vk Sym(n) Variances, k R = intrinsic mean of {pi } (Algorithm 6.3) xi = Log (pi ) 1 S = N N xi xT (treating the xi as column vectors) i i=1 {vk , k } = eigenvectors/eigenvalues of S. A new diusion tensor p can now be generated from the PGA by the formula p = d Exp k=1 k vk , where the k R are the coecients of the modes of variation. 6.4 Properties of PGA on P D(n) We now demonstrate that principal geodesic analysis on the symmetric space P D(n) preserves certain important properties of the diusion tensor data, namely the properties of positive-deniteness, determinant, and orientation. This makes the symmetric space formulation an attractive approach for the statistical analysis of diusion tensor images. We have already mentioned that, in contrast to linear PCA, symmetric space PGA preserves positive-deniteness. That is, the principal geodesics are completely contained within P D(n), and any matrix generated by the principal geodesics will be positivedenite. The next two properties we consider are the determinant and orientation. Consider a collection of diusion tensors that all have the same determinant D. We wish to show that the resulting average and any tensor generated by the principal geodesic analysis will also have determinant D. To show this we rst look at the subset of P D(n) of matrices with determinant D, that is, the subset PD = {p P D(n) : det(p) = D}. This subset is a totally geodesic submanifold, meaning that any geodesic within PD is a geodesic of the full space P D(n). Recall in Chapter 4 we discussed submanifolds geodesic at a point p, i.e., submanifolds whose geodesics passing through p were also geodesics of the ambient manifold. This is dierent from totally geodesic submanifolds, which are submanifolds geodesic at every point. Now, the fact that PD is totally geodesic implies that the averaging process in Algorithm 6.3 will remain in PD if all the data lies in PD . Also, the principal directions vk in the PGA will lie in the tangent subspace T PD . Thus any diusion tensor generated by the principal geodesics will remain in the space PD . 99 Figure 6.2: The rst two modes of variation of the simulated data: (left) using symmetric space PGA, and (right) using linear PCA. Units are in standard deviations. The boxes labelled Not Valid indicate that the tensor was not positive-denite, i.e., it had negative eigenvalues. The same argument may be applied to show that symmetric space averaging and PGA preserve the orientation of diusion tensors. In fact, the subset of all diusion tensors having the same orientation is also a totally geodesic submanifold, and the same reasoning applies. Unlike the positive-deniteness and determinant, orientations are also preserved by linear averaging and PCA. To demonstrate these properties, we simulated random 3D diusion tensors and computed both their linear and symmetric space statistics. We rst tested the determinant preservation by generating 100 random 3D diusion tensors with determinant 1. To do this we rst generated 100 random 3 3 symmetric matrices, with entries distributed according to a normal distribution, N (0, 1 ). Then, we took the matrix exponential of 2 these random symmetric matrices, thus making them positive-denite diusion tensors. Finally, we normalized the random diusion tensors to have determinant 1 by dividing each tensor by the cube root of its determinant. We then computed the linear average and PCA and symmetric space average and PGA of the simulated tensors. The results are shown in Fig. 6.2 as the diusion tensors generated by the rst two modes of variation. The linear PCA generated invalid diusion tensors, i.e., tensors with negative eigenvalues, at +2 standard deviations in both the rst and second modes. All of the diusion tensors generated by the symmetric space PGA have determinant 1. The linear mean demonstrates the swelling eect of linear averaging. It has determinant 2.70, 100 and the linear PCA tensors within 2 standard deviations have determinants ranging from 2.80 to 2.82. The negative determinants came from the tensors that were not positive-denite. Therefore, we see that the symmetric space PGA has preserved the positive-deniteness and the determinant, while the linear PCA has preserved neither. Next we tested the orientation preservation by generating 100 random, axis-aligned, 3D diusion tensors. This was done by generating 3 random eigenvalues for each matrix, corresponding to the x, y, and z axes. The eigenvalues were chosen from a log-normal distribution with log mean 0 and log standard deviation 0.5. Next we generated a random orientation u SO(3) and applied it to all of the axis-aligned matrices by the map p upuT . Thus each of the diusion tensors in our test set had eigenvectors equal to the columns of the rotation matrix u. We computed both the symmetric space and linear statistics of the data. As was expected, both methods preserved the orientations. However, the linear PCA again generated tensors that were not positive-denite. 6.5 New Methods: Comparison Metric, Interpolation, and Anisotropy In this section we present several novel methods for the analysis of diusion tensors based on the symmetric space formulation of P D(n) presented earlier. As mentioned at the beginning of this chapter, a primary application of the statistical methods presented above is for inter-subject studies of diusion tensor data. To make such studies possible, images of dierent patients need to be registered into a common coordinate system for direct comparison. The rst two methods in this section are intended to be used as part of a registration method for diusion tensor images. The third method is a new anisotropy measure based on the dierential geometry of the symmetric space P D(n). An algorithm for diusion tensor image registration requires three important tools: (1) a method for warping a diusion tensor image, (2) a method for resampling the warped diusion tensor image, i.e., an interpolation method for diusion tensors, and (3) a comparison metric for computing how close two diusion tensor images are to one another. Warping diusion tensor images is nontrivial because it is not obvious how a warp of space should aect the shape and orientation of a tensor. Alexander et al. [1] describe several strategies for warping diusion tensor images. We focus on items (2) and (3) and present new methods for interpolating and comparing diusion tensors based on the symmetric space formulation of P D(n). We begin with the comparison 101 g(0) = p0 g(0.25) g(0.5) g(0.75) g(1) = p1 Figure 6.3: An example of geodesic interpolation of two diusion tensors. First, two diusion tensors p0 and p1 were chosen randomly. The diusion tensors at times 0.25, 0.5, and 0.75 were generated along the unique geodesic segment between p0 and p1 . metric. 6.5.1 Comparison Metric We propose a new comparison metric for diusion tensors dened as the geodesic distance between two tensors. That is, given two diusion tensors p1 , p2 P D(3) the error metric between them is given by E(p1 , p2 ) = d(p1 , p2 ), where the distance d is the geodesic distance given by (6.2). For two diusion tensor images I1 , I2 : P D(3), where R3 is the image domain, the error metric is given by E(I1 , I2 ) = d(I1 (x), I2 (x)) dx 2 1 2 . Alexander et al. [1] have proposed comparing diusion tensors based on the angular dierence between their principal directions. This dierence is then weighted by the relative anisotropy to give higher weight to the more anisotropic tensors. We argue that the geodesic error metric presented here takes into account both the orientation and the anisotropy. In addition, the geodesic error metric is consistent with the proposed statistical methods; that is, it is based on geodesic distance on P D(n) as are the mean and PGA methods presented above. 6.5.2 Diusion Tensor Interpolation The most basic method for resampling a warped image is a nearest neighbor approach. Another possibility is to use trilinear interpolation of the linear tensor coecients. The 102 tensor interpolation method that we propose is based on the symmetric space averaging method developed in Section 6.3. First, consider the case of two diusion tensors p1 , p2 P D(n). We would like an interpolation method given by a continuous curve c : [0, 1] P D(n) satisfying c(0) = p1 and c(1) = p2 . Given the symmetric space formulation for P D(n) presented above, an obvious choice for c is the unique geodesic curve segment between p1 and p2 . This geodesic interpolation is demonstrated between two randomly chosen diusion tensors in Fig. 6.3. Geodesic interpolation can be seen as a direct generalization of linear interpolation for scalar or vector data. Now for 3D images of diusion tensors an interpolation method can be thought of as a smooth function in a cube, where the tensor values to be interpolated are given at the corners of the cube. In other words, we want a smooth function f : [0, 1]3 P D(n), where the values f (i, j, k) : i, j, k {0, 1} are specied. It is tempting to rst create f using tri-geodesic interpolation, that is, by repeated geodesic interpolation in the three coordinate directions. However, unlike linear interpolation, geodesic interpolation of diusion tensors does not commute. Therefore, a tri-geodesic interpolation would be dependent on the order in which the coordinate interpolations were made. A better method for interpolating diusion tensors in three dimensions is using a weighted geodesic average. Weighted averaging of data on an sphere S n has been studied by Buss and Fillmore [18]. We follow their approach, extending the denition of weighted averages to diusion tensors. Given a set of diusion tensors p1 , . . . , pN P D(n) and a set of weights w1 , . . . , wN R, consider the weighted sum-of-squared distances function (p; p1 , . . . , pN ; w1 , . . . , wN ) = 1 N N wi d(p, pi )2 . i=1 Given non-negative real weights w1 , . . . , wN with sum equal to 1, the weighted average of the pi with respect to the weights wi is dened as a minimum of the weighted sum-of-squared distances function, i.e., Avg(p1 , . . . , pN ; w1 , . . . , wN ) = arg min (p; p1 , . . . , pN ; w1 , . . . , wN ). pP D(n) (6.4) The intrinsic mean denition given in Chapter 4 is equivalent to weighted average definition with all weights set to wi = (1/N ). For vector-valued data v1 , . . . , vN Rn the 103 weighted average is given by the weighted sum N Avg(v1 , . . . , vN ; w1 , . . . , wN ) = i=1 wi vi . For diusion tensor data the weighted average can be computed using a generalization of the intrinsic mean algorithm (Algorithm 6.3). The gradient of the sum-of-squared distances function is given by N (p; p1 , . . . , pN ; w1 , . . . , wN ) = i=1 wi Logp (pi ). Therefore, the gradient descent algorithm for nding the weighted average of a set of diusion tensors is given by Algorithm 6.5: Weighted Average of Diusion Tensors I...

Find millions of documents on Course Hero - Study Guides, Lecture Notes, Reference Materials, Practice Exams and more. Course Hero has millions of course specific materials providing students with the best way to expand their education.

Below is a small sample set of documents:

Stanford - MERQE - 1035
Case 5:05-cv-03395-JFDocument 153Filed 11/17/2006Page 1 of 31 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19KEKER &amp; VAN NEST, LLP JAN NIELSON LITTLE - #100029 ASIM M. BHANSALI - #194925 710 Sansome Street San Francisco, CA 94111-1704 Telephone: (415)
Stanford - ESST - 1025
Case 5:02-cv-04497-RMWDocument 203Filed 02/07/2006Page 1 of 51 2 3 4 5 6 7 8 9 10 11 12 In re ESS TECHNOLOGY, INC. 13 SECURITIES LITIGATION 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 This Document Relates To: ALL ACTIONS. ) ) ) ) ) ) ) ) Master File
Stanford - SIPX - 1033
Case 3:05-cv-00392-WHADocument 147Filed 04/04/2006Page 1 of 31 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28GLANCY BINKOW &amp; GOLDBERG LLP Lionel Z. Glancy (#134180) Michael Goldberg (#188669) 1801 Avenue of the Stars, Suite
Stanford - TVIA - 1036
Case 5:06-cv-06304-RMWDocument 22-1Filed 01/05/2007Page 1 of 51 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28Laurence M. Rosen (State Bar No. 219683) Phillip Kim THE ROSEN LAW FIRM, P.A. 350 Fifth Avenue, Suite 5508 New Yor
Stanford - ALVR - 1037
1 2 3 4 5 6 7 8LIONEL Z. GLANCY, ESQ. #134180 ANDY SOHRN, ESQ. #241388 GLANCY BINKOW &amp; GOLDBERG LLP 1801 Avenue of the Stars, Suite 311 Los Angeles, CA 90067 Telephone: (310) 201-9150 Fascimile: (310) 201-9160 LAW OFFICES OF JACOB SABO JACOB SABO The Tow
Washington University in St. Louis - CSE - 567
A Survey Paper on Processor WorkloadsChristoph Jechlitschek, q.wert@gmx.de Abstract Processors are ubiquitous in today's world. Almost any electronic device contains a processor, and their computing power ranges from a few thousand to millions of instruc
Delaware - CIS - 889
CISC 889 Bioinformatics Spring 2004 Homework 3Handed out: April 6, 2004 Due date: April 15, 20041. For five species a, b, c, d, and e, with distance given in the following tab le, reconstruct the tree using the neighbor-joining algorithm. Show the parti
Fayetteville State University - COP - 402003
Copyright (C) R.A. van Engelen, FSU Department of Computer Science, 2000-2003Parameter PassingFormal parameters: the parameters that appear in the declaration of a subroutine For example, int n in fac's declaration written in C:int fac(int n) cfw_ retu
UCCS - ECE - 5675
ECE 5675/4675 Final Project Spring 2008: DSP Based Carrier and Symbol Synchronization1 IntroductionA strict honor code is assumed for this project. Each person is to do his/her own work, with no consultation with others. Bring any questions you have abo
Duquesne - CPMA - 521
An Introduction to SplusPhil Spector Statistical Computing Facility University of California, Berkeley November 10, 19991BackgroundThe S programming language was developed at AT&amp;T Bell Labs, for internal use by a group of statisticians who wanted an i
Los Angeles Southwest College - CSCE - 718
Power Aware Real-Time SchedulingDr. Gang Quan Department of Computer Science &amp; Engineering University of South Carolina Spring, 2007The Power/Energy Consumption Dynamic Power Due to transistor switching activities Traditionally the dominant oneP dyn
Illinois State - ITK - 325
Illinois State University School of Information Technology ITK 325 Computer Organization Spring, 2005March 21, 20051Announcement and Agenda How Was Your Spring Break? Questions From Last Lecture? Discuss Homework #4 10 Minutes Quiz 5 Minutes Quiz Di
Illinois State - ITK - 169
Computer Application DevelopmentACS 169 Test 1Instructor:Summer 2002Dr. Joaquin VilaOne Solution Name _ Last 4 digits of SSN:_Grade:/1001.What is the output of the following C+ program? (10 points) #include &lt;iostream&gt; #include &lt;cstdlib&gt; using nam
Illinois State - ITK - 325
Illinois State University School of Information Technology ITK 325 Computer Organization Spring, 2005February 28, 20051Announcement and Agenda Questions From Last Lecture? Reading Assignment Chapter 3.4 and Chapter 4.1 (possible quiz) Homework #3 Stat
Illinois State - ITK - 325
Illinois State University School of Information Technology ITK 325 Computer Organization Spring, 2005March 23, 20051Announcement and Agenda Questions From Last Lecture? Homework #5 Due 04/11/2005 Chapter 4: 1, 3, 5, 7, 9, 10, 11 Chapter 5: 1, 3, 4, 5
Rutgers - ECE - 428
1N/FDLL 914/A/B / 916/A/B / 4148 / 44481N/FDLL 914/A/B / 916/A/B / 4148 / 4448Small Signal DiodeAbsolute Maximum Ratings*SymbolVRRM IF(AV) IFSMTA = 25C unless otherwise notedParameterMaximum Repetitive Reverse Voltage Average Rectified Forward Cur
Colorado - P - 4165
TheVersion 2.3GuideW. J. Owen Department of Mathematics and Computer Science University of RichmondBlack Cherry Treeslarge residual 70 Volume 10 20 30 40 50 60657075 Height8085Consider a log transform on Volume W. J. Owen 2007. A license is gr
Washington - ART - 383
979745005692234x 2x 2x1x1x1x221:13214563320 CM/8 INCH1x2333435
Binghamton - CS - 527
PreliminariesrRest of the semester, focus on mobile computingr rrUnfortunately, running out of time, we may need a make up class Feedback less networking/more computing next run of this class?Some topics I would like us to discuss in the remaining l
Binghamton - CS - 527
Misc. Next Monday: rst quiz 1/2 class; covers what we discussed in class New Reading Set Project proposals (before or after spring break?) Today: nish NWB, then begin Mobility Support/Routing Dene the general problem and the shape of possible solutions M
Binghamton - CS - 528
Last Time/Today OSPF, Inter-domain routing Today: Discussion of Peering/transit paper Today: discussion of Paxons E2E routing behavior paper Today (maybe): Finish some IP related stuff (Tunneling, Mobile IP, VPNs, Ad hoc routing .)SUNY-B INGHAMTON CS528
Binghamton - CS - 350
CS 350: First Midterm (Spring 03) 3/5/2003Problem 1: (12 points; 10 minutes) Briey explain any 3 of the following concepts 1. Programmed I/O 2. Conditional Critical Regions 3. Memory hierarchy 4. Atomic instruction Problem 2: (18 pts; 12 minutes) (a) (4
Clarkson - WEB - 209
Get Involved with CRC!HOW YOU CAN HELP BE A RESPONSIBLE CONSUMER BY DEMANDING CORPORATIONS BE RESPONSIBLE AS WELL:Boycotting the products found on our website: www.conrespcorp.com Calling our promotional office: 1 800555CRC1 Taking a stand when you KNOW
Lake County - CI - 332
I actually got to teach the Math meeting and lesson to my students this week. I was extremely nervous at first, but it eased my mind to know that I had a script to follow. It was a little difficult keeping my place because everything is so scripted and I
Clarkson - OM - 480
OM480 - PROJECT MANAGEMENT FALL 2003 TAKE HOME QUIZ NAME: _ (Print Legibly) Directions: This quiz will measure your understanding of the Project Life Cycle in the context of value of effort project management. Using the handouts from the Tuesday, November
Lake County - CI - 332
Lindsey BarnardInquiry Unit PlanI chose the topic of Medgar Evers and his impact on the Civil Rights Movement for my inquiry unit plan. My basis for this particular topic mostly stemmed from my background and previous knowledge on the subject. I was bor
Lake County - CI - 332
Assessment Throughout this math unit, there is both informal and formal assessment. Most of the assessments used are informal because they are more convenient throughout the unit. We feel that it is formal assessment isn't as useful because it doesn't rea
Lake County - CI - 332
Kathy Metalesson I thought it was a great idea for us to read Kathy and about the characteristics of Kathy's classroom as opposed to the issues that Ms. Carter had with the way the class was being taught. I like the idea of putting more discussions in the
Lake County - CI - 332
Grocery Store Field TripGrade: 1-2 Time: 30-55 minutesObjectives: Students will demonstrate comparing prices by looking for the least expensive item. Students will demonstrate estimation and rounding skills.Standards: Illinois State Standard: NCTM Stan
Lake County - CI - 332
Grocery Bagging 2nd Grade Objective: The students will practice addition, subtraction, and problem solving with single digit numbers. Materials: Grocery store flash cards Grocery/lunch bags (one per student) Math journals Pencil Procedures: 1. Explain to
Lake County - CI - 332
SELF-MANAGEMENT FORMI kept my hands to myself._____I listened when the teacher was talking._____I followed directions._____I raised my hand to answer questions._____I did/said something nice to someone._____I lined up the ri
UCSC - CMPE - 280
Image Processing in the Human Visual System, a Quick OverviewBy Orazio Gallo, April 24th, 2008The Visual SystemOur most advanced perception system:The optic nerve has 106 fibers, more than all the fibers inserting in the spine. The acoustic nerve has
University of Iowa - WM - 360
Check if previous acquisition is finished. Halt &amp; Save if Note from previous user.NS optional, only if you want more than 16 scans(multiple of 8)Sign into logbookZG( CNTRL H to halt long acquistion )LOCK buttons to OFFWR filename.001To Acquire a 1
Syracuse - CPS - 196
/*paintwall.c*/ /* computing the surface to paint on a wall, using two functions*/#include&lt;stdio.h&gt; /* function prototypes */ double ComputeRectArea(double height, double length); double ComputeCircleArea(double radius);int main(void) cfw_ double
Valparaiso - ECE - 490
VALPARAISO UNIVERSITY ELECTRICAL AND COMPUTER ENGINEERING DEPARTMENT ECE 490J DESIGN PROJECT #1 - Formal Specications Using UML SPRING 2003Objective: In this project, you will develop a set of formal specifications for a MP3 plug-in to a Handspring Visor
Allan Hancock College - PHAR - 2811
PHAR2811 ELMA Assessment Presented below is the sample data from OGTT for glucose and insulin levels for the following subjects: Normal, IDDM, NIDDM. Your are to determine which subject was assigned to you in the prac class. In order to do this you will:
Los Angeles Southwest College - CSCE - 580
Marco Valtorta wrote these notes (mainly around 1982). Blaine Nelson typed them on October 2-4, 2002. The main reference is: Bertel, Umberto and Francesco Brioschi. Non-Serial Dynamic Programming. Academic Press, 1972. Another reference used in notes writ
University of Louisiana at Lafayette - SXG - 5317
Magic Tcl Tutorial #4: Simulation with IRSIMR. Timothy EdwardsSpace Department Johns Hopkins University Applied Physics Laboratory Laurel, MD 20723 This tutorial corresponds to Tcl-based Magic version 7.2Tutorials to read rst:Magic Tutorial #1: Gettin
Mary Washington - EDUC - 303
Grade FiveThe fifth-grade standards emphasize the importance of selecting appropriate instruments for measuring and recording observations. The organization, analysis, and application of data continue to be an important focus of classroom inquiry. Scienc
Idaho - AGECON - 489
Week 2 L2Spec. Trading, Price Quotes, &amp; Margin AccountsUnderstanding and Using Futures and Options MarketsSpeculative Trading, Price Quotes, and Margin AccountsWeek 2 L2Spec. Trading, Price Quotes, &amp; Margin Accounts1. What is speculative trading?a.
Cal Poly - CSC - 580
Agent MobilityOverview mobile agents and mobile computing technical issues agent languages, distributed execution, environment, security multi-agent systems cooperation between agents to solve a taskFranz J. KurfessCSC/CPE 580 Intelligent AgentsSpring
University of Illinois, Urbana Champaign - MATH - 285
NAME:Math 285 Spring 2003 - Test 2Total points: 100. Do all questions. Explain all answers. No notes, books, calculators or computers. 1. [6 points] For the following differential equation, write down the form of the complementary solution yc , and of t
NYU - CPS - 272
North-South Lending and Endogenous Capital Market InefcienciesMark Gertler and Kenneth Rogoff, JME 1990 presented in T.Sargents RGIntroductions sFact: Capital ows depend on cap mkt efciency 2-period model with asymmetric information: 1. Cost of invest
UCSB - MATH - 115
FINAL EXAM Math 115B, UCSB, Winter 2009 - SOLUTIONS Due in SH6518 or as an email attachment at 12:00pm, March 16, 2009. You are to work on your own, and may only consult your notes, text and the class web page. It is very important that you write up your
Minnesota - LEC - 5980
July 1998WRL Research Report 98/10An Analysis of Database Workload Performance on Simultaneous Multithreaded ProcessorsJack L. Lo Luiz Andr Barroso Susan Eggers Kourosh Gharachorloo Henry Levy Sujay ParekhWestern Research Laboratory 250 University Ave
UCSB - ESM - 298
GREEN CHEMISTRYWorkshop on Biodegradable Plastics and Polymers, Osaka, Japan, 9 to 11 November 1993, Y. Doi, K. Fukuda, Eds. (Elsevier Science, Amsterdam, 1994). C. Bastoli, Polym. Degrad. Stab. 59, 263 (1998). , L. Marini, Novamont SpA, Global Status of
New Mexico - PHYS - 522
Physics 522. Quantum Mechanics II Problem Set #4 Due Friday, Feb. 21, 2003 Problem 1: Stark Shift in Hydrogen (10 points) Excluding nuclear spin, the n=2 manifold in Hydrogen has the configuration:2p3/2 2s1/2 DELamb DEFS 2p1/2where DEFS/h=10 GHz (the fi
Arizona - LEC - 510
FO C U S O N C Y TO S K E L E TA L DYN A M I C STHE CO-WORKERS OF ACTIN FILAMENTS: FROM CELL STRUCTURES TO SIGNALSCline Revenu, Raf ika Athman, Sylvie Robine and Daniel LouvardCells have various surface architectures, which allow them to carry out diff
University of Illinois, Urbana Champaign - PHYS - 498
1SpinorsLet us first discuss objects with upper and lower indices in a general finite dimensional space, without assuming a metric. We can get the required equations from many places, for example Weyl, Space-Time-Matter pp. 33-35. Suppose we have contra
UCSC - CE - 224
e 47)E4#4S1&amp;&quot;1A4#)! &quot;$ w)$ a847aS &quot;4!3 4@81)&quot;$ w)v' 8&quot;$ G4' Vab u d Y 4)$ TD 43S 5 3 ' g S ! 0 2 ! F 0 3 # 7 # 5 ! 0 6 3 y 9 3 r Q x 2! # 7 7 ' g h # g 3 e ` 30 # H 2$ #D ! F Q 3'$ 3 0 I 2 #D ! F ( 0 3 ( F $ ! 7 6 # 5 D # ! 0 6! 7 r ! 3 g g H 2 6 #D$ 7 g
Stanford - SCUR - 1037
3:07-cv-00392-SCDocument 30Filed 10/25/2007Page 1 of 21 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28[PROP] ORDER GRANTING PLF THOMAS' UNOPPOSED MOT TO WITHDRAW AS CO-LEAD PLF Case No. 07-CV-0392-SCUNITED STATES DISTRICT C
Stanford - TVIA - 1036
Case 5:06-cv-06304-RMWDocument 47Filed 04/16/2007Page 1 of 181 2 3 4 5 6 7 8 9 10 United States District Court For the Northern District of California 11 12 13 Plaintiff, 14 v. 15 16 17 18 19 20 21 22 23 24 25 26 27 28 Before the court are motions for
Stanford - GILD - 1029
COUGHLIN STOIA GELLER RUDMAN M ROBBINS LLPSAN DIEGO SAN FRANCISCO NEW YORK BOCA RATON WASHINGTON, DC - HOUSTON LOS ANGELES PHILADELPHIASusan K. Alexander SAlexander@csgrr.com 415-676-4411October 25, 2007 VIA MESSENGER Ms. Cathy Catterson, Clerk United
Iowa State - MT - 404
Assessment2/25/08 9:26 PMQuiz - 4: Oceans, paleoclimate and biogeochemical cyclesWilliam Gutowski Started: February 25, 2008 9:25 PM Questions: 10 Finish Save All(Points: 10)Help1. Ocean temperature measurementsFrom the middle of the twentieth cent
Minnesota - D - 3113
Linguistic Intelligence Jamie DobieThe first book I read was an alphabet bookEarly Childhood My first word was &quot;ball&quot;I wrote my first story about my family in 1st gradeElementary SchoolWas class leader in &quot;Book It&quot;Studied poetry and wrote my own poe
Rochester - PHY - 217
Chapter 6. Magnetostatic Fields in Matter6.1. MagnetizationAny macroscopic object consists of many atoms or molecules, each having electric charges in motion. With each electron in an atom or molecule we can associate a tiny magnetic dipole moment (due
UNC - ECON - 275
Microsoft, 1986-2000 120 100 80
UNC - MED - 146
Protein FoldingPart II.Protein Folding: Literature1. Ptitsyn &amp; Finkelstein, &quot;Protein physics&quot; 2. Fersht, &quot;Structure and mechanism in protein science&quot; 3. Creighton, &quot;Protein folding&quot; 4. Cantor &amp; Schimmel, &quot;Biophysical Chemistry&quot; (Math in the Appendix)P
Delta MI - CST - 151
Loan CalculatorInterest rate Term (Years) Loan amount Monthly PaymentAmortization SchedulePayment Number Beginning Balance Repayment of Principal Interest Ending Balance
Toledo - PHYS - 2130
Interactive-engagement vs traditional methods: A six-thousandstudent survey of mechanics test data for introductory physics courses*Richard R. Hakea) Department of Physics, Indiana University, Bloomington, Indiana 47405A survey of pre/post test data usi
CSU Mont. Bay - CBE - 3551
YearQuartersalestimeQ2Q3Q41987258.660811001987362.579720101987470.415430011988181.720640001988266.764251001988366.182760101988473.283970011989179.687680001989265.533591001989369.2599