
git-cercs-05-10 - Improving the Classification of Software...

Improving the Classification of Software Behaviors using Ensembles of Control-Flow and Data-Flow Classifiers

James F. Bowring, Mary Jean Harrold, and James M. Rehg
College of Computing
Georgia Institute of Technology
Atlanta, Georgia 30332-0280
{bowring | harrold | rehg}@cc.gatech.edu

ABSTRACT

One approach to the automatic classification of program behaviors is to view these behaviors as the collection of all the program's executions. Many features of these executions, such as branch profiles, can be measured, and if these features accurately predict behavior, we can build automatic behavior classifiers from them using statistical machine-learning techniques. Two key problems in the development of useful classifiers are (1) the costs of collecting and modeling data and (2) the adaptation of classifiers to new or unknown behaviors. We address the first problem by concentrating on the properties and costs of individual features and the second problem by using the active-learning paradigm. In this paper, we present our technique for modeling a data-flow feature as a stochastic process exhibiting the Markov property. We introduce the novel concept of databins to summarize, as Markov models, the transitions of values for selected variables. We show by empirical studies that databin-based classifiers are effective. We also describe ensembles of classifiers and how they can leverage their components to improve classification rates. We show by empirical studies that ensembles of control-flow and data-flow based classifiers can be more effective than either component classifier.

Categories and Subject Descriptors: D.2.4 [Software Engineering]: Software/Program Verification; G.3 [Mathematics of Computing]: Probability and Statistics; I.2.6 [Artificial Intelligence]: Learning

General Terms: Measurement, Reliability, Experimentation, Verification

Keywords: Software testing, software behavior, machine learning, Markov models

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Copyright 200X ACM X-XXXXX-XX-X/XX/XX ... $5.00.

1. INTRODUCTION

The automatic detection and classification of software behaviors is an important component of many software development and maintenance activities. For example, to improve the quality of software after its release, developers can monitor the deployed software and use the results of the monitoring to determine automatically the frequency and properties of the software behaviors, including failures [7, 20]. As another example, to facilitate autonomic computing, developers can model software systems as self-regulating biological systems, which requires automated behavior detection to support system self-awareness [1, 10, 15].
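To make the databin idea from the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a databin is a value range for one monitored variable, that each execution yields a sequence of that variable's values, and that one first-order Markov model is trained per behavior label. The bin edges, the add-one smoothing, the sample traces, and all function names are hypothetical choices made only for illustration.

```python
import math

def to_bin(value, edges):
    """Return the index of the first edge the value is below (hypothetical binning rule)."""
    for i, edge in enumerate(edges):
        if value < edge:
            return i
    return len(edges)

def train_databin_model(traces, edges):
    """Estimate bin-to-bin transition probabilities from value traces.

    Add-one smoothing (an assumption) keeps unseen transitions nonzero.
    """
    n = len(edges) + 1
    counts = [[1.0] * n for _ in range(n)]
    for trace in traces:
        bins = [to_bin(v, edges) for v in trace]
        for a, b in zip(bins, bins[1:]):
            counts[a][b] += 1.0
    return [[c / sum(row) for c in row] for row in counts]

def log_likelihood(model, trace, edges):
    """Log-probability of the trace's bin transitions under one model."""
    bins = [to_bin(v, edges) for v in trace]
    return sum(math.log(model[a][b]) for a, b in zip(bins, bins[1:]))

def classify(models, trace, edges):
    """Pick the behavior label whose model gives the trace the highest likelihood."""
    return max(models, key=lambda label: log_likelihood(models[label], trace, edges))

# Made-up traces of one variable; bins are: below 0, 0 to 9, 10 and above.
edges = [0, 10]
models = {
    "pass": train_databin_model([[1, 2, 3, 4, 5], [2, 3, 4, 5, 6]], edges),
    "fail": train_databin_model([[1, -1, 12, -3, 15], [2, -5, 20, -1, 11]], edges),
}
steady = classify(models, [3, 4, 5, 6], edges)
erratic = classify(models, [5, -2, 14, -7], edges)
```

Under these assumptions, a trace that stays inside one bin is assigned to the "pass" model and a trace that jumps between bins is assigned to the "fail" model. An ensemble in the spirit of the abstract could combine such a data-flow classifier with a control-flow one, for example by voting, though the combination rule used in the paper is not given in this excerpt.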