

DELTAS(c) = number of deltas associated with c
... = number of lines deleted by c
DATE(c) = the date on which c is completed, which we term the date of c
INT(c) = the interval of c, the (calendar) time required to implement c
DEV(c) = number of developers implementing c

Derived variables
• FREQ(m, I) = CHNG(m, I) / |I|
• FILES(c) = number of files touched for change c
• NCSL(m) = number of non-commentary source lines per module
• AGE(m) = average age of its constituent lines

In defining a CDI, one confronts three critical issues.

The first issue is to select appropriate levels of aggregation for both changes and software units. Of the levels of changes described in Section 2, MRs seem in most instances to be the most informative: The associated data sets are rich enough to be interesting, but not so large as to create intractability. For most of the system we study, software can be aggregated to any of three levels.9 Files are the atomic unit of software. Modules are collections of related files, corresponding physically to a single directory in the software hierarchy. A subsystem is a collection of modules implementing a major function of the software system. In our studies, modules typically yield the most insight.

The second issue is scaling: In some cases it is helpful to scale a CDI to convert it into a rate per unit time or per unit of software size (usually, NCSL, the number of noncommentary source lines). In addition to being scaled for time, CDIs may also be functions of time, in order to illuminate the evolution of code decay.

The third issue is transformation: An index can sometimes be improved by transforming a variable mathematically, for example, by taking logarithms, powers, or roots. In some cases the rationale may be physical, while in others it will be statistical, in order to improve the "fit" of models.

9. Lines, even though tracked in the version management data base, lack sufficient structure to be appropriate.

4.2 Example CDIs

We present example CDIs that appear in the analyses in Section 5. They represent symptoms, risk factors, and ...

4.2.1 History of Frequent Changes

The historical count of changes is expressed by the CDI

$$\mathrm{CHNG}(m, I) = \sum_{c \rightsquigarrow m} \mathbf{1}\{\mathrm{DATE}(c) \in I\}, \tag{1}$$

the number of changes to a module m in the time interval I, which appears in Section 5.2. In other settings, the frequency of changes may be more relevant, as quantified by

$$\mathrm{FREQ}(m, I) = \frac{1}{|I|}\,\mathrm{CHNG}(m, I), \tag{2}$$

where |I| is the length of the time interval I.

4.2.2 Span of Changes

The span of a change is the number of files it touches (here, files yield a more sensitive index than modules), leading to the CDI

$$\mathrm{FILES}(c) = \sum_{f} \mathbf{1}\{c \rightsquigarrow f\}. \tag{3}$$

In Section 5.4, we will provide evidence that FILES predicts the effort necessary to make changes. There are three primary reasons why changes touching more files are more difficult to accomplish and, hence, that span is a symptom of decay. First is the necessity to get expertise about unfamiliar files from other developers; this is especially vexing in large-scale software, where each developer has a localized knowledge of the code. Second is the breakdown of encapsulation and modularity. Well-engineered code is modular and changes are localized. Changes spanning multiple files are more likely to modify an interface. Third is the size: Touching multiple files significantly increases the size of the change.

Study Approach: (3) Finding Correlation
• Linking risk factors to symptoms
• Statistical regression
• This requires designing some template models

Tuesday, September 10, 13

[Figure caption: Color represents the age of a source code line. Rainbow colored boxes represent frequently changed files, ... files that changed little since their creation.]

4.2.5 Fault Potential

Predictive CDIs are functions of CDIs that quantify symptoms or risk factors and are intended to predict the key responses of effort, interval, and quality. We the...
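The CDIs CHNG(m, I), FREQ(m, I), and FILES(c) defined in Sections 4.2.1 and 4.2.2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Change` record and its field names are hypothetical, a module is taken to be the directory holding a file (following the description of modules above), the interval I is treated as half-open, and |I| is measured in days.

```python
from dataclasses import dataclass
from datetime import date
from pathlib import PurePosixPath
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Change:
    """Hypothetical change record (for example, one MR)."""
    completed: date          # DATE(c), the date on which c is completed
    files: FrozenSet[str]    # paths of the files touched by c


def module_of(path: str) -> str:
    """Assumption: a module is the directory that contains the file."""
    return str(PurePosixPath(path).parent)


def files_span(c: Change) -> int:
    """FILES(c): number of distinct files touched by c."""
    return len(c.files)


def chng(changes: Iterable[Change], module: str, start: date, end: date) -> int:
    """CHNG(m, I): changes touching module m with DATE(c) in I = [start, end)."""
    return sum(
        1
        for c in changes
        if start <= c.completed < end
        and any(module_of(f) == module for f in c.files)
    )


def freq(changes: Iterable[Change], module: str, start: date, end: date) -> float:
    """FREQ(m, I) = CHNG(m, I) / |I|, with |I| in days (an assumption)."""
    length = (end - start).days
    if length <= 0:
        raise ValueError("interval must have positive length")
    return chng(list(changes), module, start, end) / length


# Made-up example data, for illustration only.
example = [
    Change(date(2000, 1, 5), frozenset({"net/tcp.c", "net/ip.c"})),
    Change(date(2000, 1, 20), frozenset({"net/tcp.c", "ui/menu.c"})),
    Change(date(2000, 3, 1), frozenset({"ui/menu.c"})),
]
january = (date(2000, 1, 1), date(2000, 2, 1))
print(chng(example, "net", *january))  # 2 changes touch module "net" in January
print(files_span(example[1]))          # 2 files touched by the second change
```

Counting a change once per module, however many of that module's files it touches, matches the indicator form of CHNG; a per-file count would instead sum FILES-style contributions.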