Quantitative analysis: Statistically reliable and generalisable results.
In quantitative research we classify features, count them, and even construct
more complex statistical models in an attempt to explain what is observed.
Findings can be generalised to a larger population, and direct comparisons
can be made between two corpora, so long as valid sampling and significance
techniques have been used. Thus, quantitative analysis allows us to discover
which phenomena are likely to be genuine reflections of the behaviour of a
language or variety, and which are merely chance occurrences. The more
basic task of just looking at a single language variety allows one to get a
precise picture of the frequency and rarity of particular phenomena, and thus
their relative normality or abnormality.
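As a rough illustration of such a comparison between two corpora, the sketch below computes a Pearson chi-squared statistic for the frequency of one feature in two corpora. The function name, the counts, and the corpus sizes are all hypothetical, chosen only to make the arithmetic easy to follow; the threshold 3.841 is the standard chi-squared critical value for one degree of freedom at the 0.05 level.

```python
def chi_squared_2x2(count_a, total_a, count_b, total_b):
    """Pearson chi-squared statistic for one feature in two corpora.

    count_a, count_b: occurrences of the feature in corpus A and corpus B.
    total_a, total_b: total number of tokens in each corpus.
    """
    # Observed 2x2 table: rows are corpora, columns are feature / other tokens.
    observed = [
        [count_a, total_a - count_a],
        [count_b, total_b - count_b],
    ]
    grand_total = total_a + total_b
    row_totals = [total_a, total_b]
    col_totals = [count_a + count_b, grand_total - (count_a + count_b)]

    statistic = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if the feature were equally frequent in both corpora.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed[i][j] - expected) ** 2 / expected
    return statistic


# Critical value of chi-squared for 1 degree of freedom at p = 0.05.
CRITICAL_VALUE_DF1_P05 = 3.841

# Hypothetical counts: 120 hits in 100,000 tokens versus 80 hits in 100,000 tokens.
stat = chi_squared_2x2(120, 100_000, 80, 100_000)
is_significant = stat > CRITICAL_VALUE_DF1_P05
print(round(stat, 3), is_significant)
```

With these invented figures the difference exceeds the threshold, so it would be treated as unlikely to be a chance occurrence; identical relative frequencies would give a statistic of zero.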
However, the picture of the data which emerges from quantitative analysis is
less rich than that obtained from qualitative analysis. For statistical purposes,
classifications have to be of the hard-and-fast (so-called "Aristotelian") type.
An item either belongs to class x or it doesn't. So in the above example about
the phrase "the red flag" we would have to decide whether to classify "red" as
"politics" or "colour". As can be seen, many linguistic terms and phenomena
do not belong to simple, single categories: rather, they are more
consistent with the recent notion of "fuzzy sets", as in the red example.
Quantitative analysis is therefore an idealisation of the data in some cases.
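The contrast between hard-and-fast and fuzzy classification can be sketched as follows. The membership degrees are hypothetical values invented for illustration, not measurements, and the function name is likewise an assumption.

```python
# Hypothetical graded memberships for "red" in "the red flag".
memberships = {"colour": 0.6, "politics": 0.4}


def hard_classify(memberships):
    """Return the single class with the highest membership degree."""
    return max(memberships, key=memberships.get)


hard_label = hard_classify(memberships)

# Contribution of this one token to the frequency counts under each scheme.
hard_counts = {label: int(label == hard_label) for label in memberships}
fuzzy_counts = dict(memberships)

print(hard_label)
print(hard_counts)
print(fuzzy_counts)
```

Under the hard scheme the token counts wholly towards one class and the other reading disappears from the figures, which is the sense in which the counts idealise the data.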
Also, quantitative analysis tends to sideline rare occurrences. To ensure that
certain statistical tests (such as chi-squared) provide reliable results, it is
essential that minimum frequencies are obtained, meaning that categories
may have to be collapsed into one another, resulting in a loss of data richness.
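A minimal sketch of this collapsing step is given below. The threshold of 5 is a commonly cited rule of thumb and is an assumption here, as are the function name, the "other" label, and the invented sense counts for "red".

```python
def collapse_rare_categories(counts, minimum=5, other_label="other"):
    """Pool every category whose count is below `minimum` into one category."""
    collapsed = {}
    pooled = 0
    for label, count in counts.items():
        if count < minimum:
            pooled += count  # too rare to stand alone
        else:
            collapsed[label] = count
    if pooled:
        # The pooled category may itself still fall below the minimum.
        collapsed[other_label] = collapsed.get(other_label, 0) + pooled
    return collapsed


# Hypothetical counts of senses of "red" in a small corpus.
sense_counts = {"colour": 42, "politics": 11, "anger": 3, "debt": 2}
collapsed_counts = collapse_rare_categories(sense_counts)
print(collapsed_counts)
```

The total number of observations is unchanged, but the distinction between the two rare senses is no longer recoverable from the collapsed table.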
A recent trend
From this brief discussion it can be appreciated that both qualitative and
quantitative analyses have something to contribute to corpus study. There has
been a recent move in social science towards multi-method approaches,
which tend to reject narrow analytical paradigms in favour of the breadth
of information which the use of more than one method may provide. In any
case, as Schmied (1993) notes, a stage of qualitative research is often a
precursor for quantitative analysis, since before linguistic phenomena can be
classified and counted, the categories for classification must first be identified.
Schmied demonstrates that corpus linguistics could benefit as much as any
field from multi-method research.
