[...] that affect effort. A sample form for such a relationship is:

[...] = a_0 + a_1 \mathrm{FILES}(c) + a_2 \sum_{f} \mathbf{1}\{c \leadsto f\}\,|f| + a_3 \mathrm{ADD}(c) + a_4 \mathrm{DEL}(c) + a_5 \mathrm{INT}(c) + a_6 \mathrm{DEV}(c)    (6)

Here, |f| [...] the size in NCSL of the file f.

One motivation for the form used in (6) is to distinguish the dependency overhead associated with a change, captured in the terms involving a_0, a_1, and a_2, from the nominal effort, represented by the [...] a_3 and a_4. The remaining terms incorporate interval and developer overhead. A statistical analysis of this index appears in Section 5.4.

However, effort data (person hours) are available only at this level. (Further analysis of factors affecting effort, based on larger data sets but requiring the imputation of effort for individual changes given aggregated effort values, is in [...].)

Extreme variability of the feature-level data necessitated taking logarithms of variables. (The actual transformation, log(1 + ·), avoids negative numbers.) The resultant model is

[...] = 0.32 + 0.13 [...] - 0.09 [\log(1 + \mathrm{DEL}(c))]^2 + 0.12 \log(1 + \mathrm{ADD}(c)) \log(1 + \mathrm{DEL}(c)) + 0.11 \log(1 + \mathrm{INT}(c)) - 0.47 \log(1 + \mathrm{DELTAS}(c))

All coefficients shown are statistically significantly different from zero; the multiple R^2 value is [...].

Large deletions are implemented rather easily. The hardest changes require both additions and deletions. Large numbers of editing changes are rather easy to implement.

12. One might expect that modules modified by many developers would have confused logic as a result of different styles and, hence, be difficult [...].

5 THE EVIDENCE FOR DECAY

In this section, we discuss some of our major results to date. All these analyses are based on a single subsystem of the code, consisting of approximately 100 modules and 2,500 files. The change data consist of roughly 6,000 IMRs, 27,000 MRs, and 130,000 deltas. Some 500 different login names made changes to the code in this subsystem.

The results yield very strong evidence that code does [...].

[...] under study. There, we display the chance that an MR touches more than one file by smoothing [...], in which each point corresponds to an MR. The x-coordinate is time, represented by the opening [...]; the y-coordinate is one when more than one file is [...] and zero otherwise. Three local linear smooths ([...] and [...] for introduction and discussion) are [...] top plot. These smooths are essentially weighted averages, where the weights have a Gaussian [...]. [...] widths of the windows (i.e., standard deviations [...] weight function) are h = 0.3 (purple curve), h = 1.5 ([...]colored curve), and h = 7.5 (blue curve).

The central curve, h = 1.5, shows an initial [...] trend, which is natural because many files are [...] common changes in the initial develop[...], followed by a steady upward trend starting [...]. [...] last trend reflects breakdown in the modularity [...] code, as we discuss further in Section 5.2. [...] substantial increase comes from the fact that the y-axis represents probabilities (local [...]) [...] of the change will touch more than one file, which [...] from a low of less than 2 percent in [...] than 5 percent in 1996.

In the absence of more detailed analysis, [...] the top plot in Fig. 3 depend on the window [...] larger window width, h = 7.5, shows only [...]
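
To make the smoothing step concrete, the following is a minimal sketch in Python of a Gaussian-weighted local linear smoother applied to a 0/1 indicator (one when an MR touches more than one file, zero otherwise). It is an illustration under stated assumptions, not the authors' code: the function name local_linear_smooth, the synthetic data, and the evaluation grid are hypothetical, and only the three window widths h are taken from the text.

```python
import math
import random


def local_linear_smooth(xs, ys, grid, h):
    """Evaluate a Gaussian-weighted local linear fit at each grid point.

    At each x0, a straight line is fitted to (xs, ys) by weighted least
    squares, with weights from a Gaussian of standard deviation h centred
    at x0. The fitted value is the intercept of that line at x0.
    """
    fitted = []
    for x0 in grid:
        s0 = s1 = s2 = t0 = t1 = 0.0
        for x, y in zip(xs, ys):
            d = x - x0
            w = math.exp(-0.5 * (d / h) ** 2)  # Gaussian weight
            s0 += w
            s1 += w * d
            s2 += w * d * d
            t0 += w * y
            t1 += w * d * y
        # Closed-form intercept of the weighted least-squares line.
        denom = s0 * s2 - s1 * s1
        fitted.append((s2 * t0 - s1 * t1) / denom)
    return fitted


# Synthetic, hypothetical data (NOT the data behind Fig. 3): opening dates
# in years and a 0/1 flag whose chance of being 1 drifts upward over time.
random.seed(0)
years = [random.uniform(1989.0, 1996.0) for _ in range(2000)]
multi_file = [
    1.0 if random.random() < 0.02 + 0.03 * (t - 1989.0) / 7.0 else 0.0
    for t in years
]

grid = [1989.5 + k for k in range(7)]  # hypothetical evaluation points
curves = {h: local_linear_smooth(years, multi_file, grid, h)
          for h in (0.3, 1.5, 7.5)}   # window widths quoted in the text
```

A small h follows local fluctuations in the indicator closely, while a large h averages over most of the time span and yields a nearly straight line, which is why the appearance of such curves depends on the window width.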
First, in Section 5.1, statistical smoothing demon[...]

Tuesday, September 10, 13

Results: (4) Prediction of efforts increases over time

Expected results?
•Changes will take longer to implement as modules age
•Modularity breaks down over [...]
•The number of files that changed increases over time.
•Large deletions are easier than large deletions and additions together.

Any unexpected [...]
•Older code is less likely to have [...]
•More modules are affected by [...]