6

This is an OCR version from the book; I did this in order to keep the pagination of the original.

New Aspects of Human Ethology
Edited by Schmitt, A., Atzwanger, K., Grammer, K., and Schaefer, K.
1997 Plenum Press: London, New York.

THE COMMUNICATION PARADOX AND POSSIBLE SOLUTIONS
Towards a Radical Empiricism
Karl Grammer,1 Valentina Filova,2 and Martin Fieder1
1Ludwig-Boltzmann-Institute for Urban Ethology c/o Institute for Human
Biology/University of Vienna Althanstrasse 14 A-1090 Vienna/Austria
2Institute for Automation-Department for Pattern Recognition and Image Processing
Technical University Vienna Treitlstrasse 3 A-1040 Vienna/Austria

In the history of both animal and human ethology the direct observation of unstaged
interactions in a natural habitat plays a critical role for methodological and theoretical
considerations. Even when ethologists think that they already know much about adaptations
and the ways in which they interact with the environment, the principles which have been
involved in the evolution of increasingly complex human behaviour are still not very well understood.
A major reason for this lies in methodological problems connected with the observation and
description and the nature of human behaviour itself. In order to assess causation and function of
behaviour we rely on an "observational device." The process of information reduction which is
applied to the study of behaviour results in highly variable observations. The assessment of
meaning and function rarely produces reproducible results, and different signals especially in
human communication seem to take many meanings which are context-specific. Partially this
might be due to the observational approaches used for coding behaviour.

1. WHAT IS COMMUNICATION?
A straightforward definition of communication is not difficult. As a starting point we can
define it as the transfer of information between two communicative units. Ethology has created
many models for the process of information transfer. Basic to these approaches is the term
"signal," an information carrier which is produced through encoding information in a
communication channel by a sender. This signal is decoded by a receiver who adds information
to the signal and then decodes its meaning. In a classical ethological approach many of the
signals are a result of evolutionary constraints and work in a quasi-automatic way. Although many
human signals have been isolated as cultural universals (Eibl-Eibesfeldt, 1972) a closer look
reveals high variability. For instance Ekman and Friesen (1971) propose cross-cultural universal
facial signals for emotions. An observational approach to the study of emotions in everyday life
reveals high variability in the production of patterns, and pure emotion patterns occur only
rarely (Grammer et al., 1988).
The Lorenz-Tinbergen approach sees signals as discrete and deterministic: A sends a signal X
and B decodes and returns Y. In this definition, visual, tactile, acoustical and verbal information
are divided into units of meaning. Signals have a lexical structure: one entry in the lexicon has
one specific meaning and one definite function. The basic assumption of this approach is that
signals exist as independent units and sender and receiver share a common code. The signals
themselves are considered as discrete units of movements each with a beginning and an end in
time, which in turn can frame a static component like a posture. This is clearly demonstrated by
the acoustic properties of laughter which separate it from speech (Provine and Young, 1991), or
by bodily movements in interactions, like illustrating hand-movements (Ekman and Friesen,
1972). Thus, signals have a content which is different for each signal, and which makes them
reliably identifiable. If so, signals have to show a certain form constancy which is necessary for
identification, and two different signs do not overlap in their meaning. Most of the time form
constancy defines the relation of movements to each other, like the typical head movements
connected with laughter (Grammer and Eibl-Eibesfeldt, 1989). If laughter occurs in interactions,
the head is moved in a circular fashion away from the partner.
Another approach to signaling uses the same basic structure, but it takes the probabilistic nature
of communication into account. It still assumes that signals are discrete: A sends X and B
decodes and returns Y, or not (Argyle, 1988). These probabilistic models imply some law of
summation over space and time. This means that A sends X and Z and T to B at the same time,
or A sends X and Z and T sequentially. This model of communication is called "modulated
communication" (Markl, 1985). Grammer (1995) proposed that "natural" signs sent in parallel,
like age and sex, signals of dominance/submission or emotions contain the decoding instruction
for a signal. For instance age and sex of the sender can modulate the meaning of a smile from a
"come-on" to a simple friendly gesture. A slightly different approach was suggested by Schleidt
(1973) as tonic communication. He assumed that meaning could be encoded in the form of pulse
rate modulation. The sender sends a signal of uniform height and duration repeatedly in
distinctive intervals. The receiver then applies some kind of low pass filter in order to integrate
the signals over time. The effect on the receiver then is a slowly accumulating, tonic one.
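
The integration step in tonic communication can be illustrated with a minimal sketch. It assumes, purely for illustration, that the receiver's low pass filter behaves like a first-order leaky integrator; the function names, the time constant `tau` and the pulse intervals are hypothetical and are not taken from Schleidt (1973).

```python
def pulse_train(interval, duration):
    """Onset times (s) of identical pulses sent at a fixed interval."""
    n = int(duration / interval)
    return [k * interval for k in range(n)]


def tonic_response(pulse_times, duration, dt=0.1, tau=5.0):
    """Leaky integration (first-order low pass filter) of unit pulses.

    Every pulse has the same height (1.0) and the same duration (one
    time step dt); only the interval between pulses carries information.
    tau is an assumed receiver time constant, not an empirical value.
    """
    steps = int(round(duration / dt))
    pulse_steps = {int(round(t / dt)) for t in pulse_times}
    alpha = dt / (tau + dt)          # smoothing factor of the filter
    level = 0.0
    trace = []
    for i in range(steps):
        x = 1.0 if i in pulse_steps else 0.0
        level += alpha * (x - level)  # slow accumulation, slow decay
        trace.append(level)
    return trace


slow = tonic_response(pulse_train(2.0, 60.0), 60.0)   # one pulse every 2.0 s
fast = tonic_response(pulse_train(0.5, 60.0), 60.0)   # one pulse every 0.5 s
print(f"tonic level, pulses every 2.0 s: {slow[-1]:.3f}")
print(f"tonic level, pulses every 0.5 s: {fast[-1]:.3f}")
```

The pulses themselves are identical in both runs; under these assumptions the shorter interval alone leaves the receiver at a higher, slowly built-up level.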
For the observation of communication, behavioural categories are constructed as classes or
prototypes of signals. These classes have to be stereotypical, homogeneous and discrete in order
to be reliable. In addition we want to avoid functional descriptions because the goal of
ethological approaches is the description of function itself. Thus, in order to produce reliable
results the reduction of information has to be enormous. An example of such a classical
approach is provided by Grammer (1991) who paired strangers randomly and tried to find out if
body movements and postures could predict self-reported interest in another person. The
resulting clusters of movements proved to be inconsistent and unreliable in predicting interest in
the other person.
Furthermore, single postures did not covary with reported interest in this study. But as soon as
postures and vocal stimuli were combined, the situation changed dramatically.

Figure 1. Body postures and laughter. The figure shows body poses 2 seconds before two strangers laughed in a
waiting room experiment (Grammer, 1991). The postures were constructed by regression analysis from 4547 postures.
In (a) the most frequent postures during laughter are shown. In the right picture the view from the female is shown, the
middle picture shows the observer’s view and the left picture shows the view from the male. In (b) the postures
which males and females take during laughter which signals aversion are shown, and in (c) postures during laughter which signals
approval and interest in the other person are shown. Interestingly there is an additive effect for the single posture
elements in the different body parts: the more of the single elements are present, the higher is the correlation with the
self-reported intentions (Wire frame models by A. Jütte).

It was possible to show that postures which are taken during laughter might well transport the
meaning of laughter. The acoustic event laughter does not alone delineate interest from no
interest. Highly interested males or females do not laugh more often than persons with no
interest. Moreover, people who are together with strangers of the same sex laugh more often. In
sum, there is no contextual evidence that laughter alone is a sexual signal. When combined with
postures however, laughter may take different meanings on a continuum from rejection to
appraisal of the partner.
Evidence from the consequences of particular signals reveals many contradictions. The same
behaviour can have different meanings. Open legs among females and the Hair Flip (Fig. 2), where
the hair is moved out of the face with the hand and the head tossed backward, indicate low
female interest. If both behaviours are combined with laughter they covary with high interest.
Other behaviours may take different meanings when they are static or dynamic. The Head
Akimbo, a behaviour where the breast is pushed out and the hands are folded behind the neck, is
associated with high interest when it occurs as a movement during laughter, but with low interest when it occurs as a posture before laughter.

Figure 2. Hairflip, a female mannerism. The Hairflip consists of a typical movement sequence which starts with a slight
head tilt, followed by a head up movement. The hand reaches out into the hair and the head turns back into the starting
position with gaze aversion. This movement is performed more often by females (Grammer, 1991) than by males.

The communicative situation presents itself as unclear and ambiguous, although the receiver
seems to be able to decode the sender’s intentions. Receivers are generally aware of what the
sender wants to tell them. Thus the decoding of meaning in interactions cannot be described
with a simple signal-oriented approach. As an alternative, we could speculate that meaning and
intentions are communicated solely through the verbal channel by speech content itself. Krauss
et al. (1981) showed that speech-accompanying gestures were not related to the decoding of
meaning. They assume that body and arm movements are results of speech production itself and
facilitate speech production. In an earlier rating experiment on politicians Krauss et al. (1981) had
already shown that verbal information dominates visual and auditory information. If we agree
with this approach it seems useless to search for signaling intentions in human non-verbal
behaviour. In contrast to the above-mentioned results, Mehrabian (1972) showed in a series of
experiments the relative role facial expression, vocal behaviour and speech content play in the
perception of persons. Mehrabian comes to the general conclusion that non-verbal behaviour
plays the main role for the decoding of meaning. He finds that the meaning of messages is
determined to 55% by visual information, 38% by vocal information and only 7% by speech
content. These relative relations have been replicated by Siddiqi et al. (1973) and Wallbott
If we look at the content of the information which is transferred between interactants, we find
that facial information is used for decoding tendencies of dominance and positive affect
(Rosenthal and DePaulo, 1979). Ekman et al. (1980) gave information from seven different
communicative channels: only facial information, only bodily information, speech, filtered voice,
transcribed speech content, and combinations of voice and speech, voice, body and speech. In
this research none of the presented channels dominated the others in the transfer of
meaning.

3 What Is Communication for?
Communicative models are a description of a communicative process of information transfer.
Most models will fail when it comes to explaining the function of communication because of the
nature of communication itself, and the tools which are applied to reduce the information, as we
will show later. Social groups are complex structures and their main feature is that the goals of
the members rarely are in accordance. Human groups can be seen as an agglomeration of
conflicting interests. This fact ultimately may be the driving force behind the evolution of social
intelligence. Proximately it may be the basic constraint for communication and thus the
generation of signals in any channel of communication.
The probabilistic multi-meaning nature of human communication is present in verbal and in non-verbal communicative acts. Linguistic research shows that indirectness of verbalisation and
verbal acts like "hedging" depend on the risk of the intended communicative act (Brown and
Levinson, 1978). If the benefits for the sender are high and the costs for the receiver are also
high, it is obvious that the risk of not reaching the pursued goal for the sender is also high. As a
result, the sender has to use signals and actions which allow the receiver to be manipulated in the
sender’s favour. Evolutionary theorists have put forward comparable ideas. Openly presenting
intentions in communication might not pay (Dawkins and Krebs, 1981) because the signal
receiver might act directly against the sender’s intentions. The sender thus would not be able to
reach his/her goals. In addition, as soon as the receiver recognises the intentions of the sender the
probability of deception might rise. This situation drives any type of communication into manipulative efforts. The manipulative component of a
signal has to force the receiver into a certain state where he is willingly accepting the goals of the
sender, preferably without recognising that he was manipulated. This situation is the
communicative paradox: showing intentions and not getting caught by a suspicious receiver.
In this view the function of communication is manipulation, and it is used for risk-dependent
transfer of information. Thus, a prerequisite for any communicative model is the assessment of
risk, which will be highly context-dependent. Risk itself is created by the goal being pursued, i.e.
the imposition on the receiver, the relationship between the interactants and motivational factors.
In our introductory example of the waiting room situation risk should be high for a person who
develops interest in the other person. Risk is determined by the possible costs and benefits for
both sender and receiver. Risk-dependent communication allows the explanation of simple
straightforward transfer of meanings (under low-risk conditions) and highly ambiguous transfer
of information in situations with high risk. In this view both verbal and non-verbal channels can
be affected. So, any model of communication should take risk assessment into account. Non-verbal behaviour may be an important tool in high-risk situations because of its non-binding
standard, when compared to verbal behaviour (Grammer et al., 1996). The contradiction in the
results which show either dominance of verbal information over visual information or vice-versa
lies in the fact that an independent rater of situations had no possible costs in such an experiment,
nor had the sender of the signal. This trap forces research on communication to work under
naturalistic conditions, that is to observe unstaged social interactions.
1.1.1. Direct Communication. How can the sender achieve the delicate task of risk-dependent
communication? The sender has to assess risk and to act accordingly. The production of signals
then could be optimised. Signals used in a situation of high benefit and/or high costs for both the
receiver and the sender should be easily decodable. In this case the only preconditions of
effective signaling are low environmental noise and few encoding errors by the sender and decoding
errors by the receiver. An important means of actively reducing these errors is to increase
contrast in a signal. Another mechanism is to produce a signal repeatedly and constantly over
time. Both processes lead to ritualisation of signals. A ritualised signal consists only of a few
elements that are produced repeatedly and in a fixed sequence. The aim of ritualisation is to
make a signal definite and unmistakable. Grammer and Eibl-Eibesfeldt (1989) have shown that
laughter follows ritualisation principles. Under high risk conditions, female laughter becomes
more stereotypic, the threshold for performance is lowered and it is accompanied by typical
movement sequences. As soon as risk becomes more asymmetric and the possibility of
deception rises, the communicative situation changes drastically.
1.1.2. Lying, Deception and Mind Reading. The first possibility is the use of deception in the sense of
sending false information. There are some constraints connected to lying. An example is
children who cry in conflicts (Grammer, 1992). A crying child who is engaged in a conflict with
another child receives support from a third child in most cases. The risk for the supporter is high in
such a situation, because he might get attacked. The probability of getting support depends on
how frequently the child has cried in the past. If it cries too often and uses crying in a deceptive way
(i.e. to receive support) it will no longer receive support. The receiver only engages in
support if the honesty of the signal is guaranteed. Thus the use of deception will depend on the
frequency with which it is used and the costs and benefits connected to the interaction. Harper
(1992) pointed out that deception can only occur when the frequency of deception is low, the
signal has low costs for the sender and high benefits for the receiver. This situation forces the receiver to apply
"mind-reading" and try to find additional cues for possible detection of deception. If the sender
tries to deceive the receiver, the sender will try to control his behaviour in order to avoid
detection. Yet control is rarely complete. If the sender tries to control his emotions he for
instance creates leaks in the rest of his non-verbal behaviour. The receiver then will be able to
detect the deception (Ekman and Friesen, 1969). Therefore lying is not always a solution for the
The second form of deception is withholding information. According to Harper (1992) this is the
main form of deception. Even if its use is widespread, the signal sender has to clarify his
intentions sooner or later, or he will not be able to reach his goals.
1.1.3. Direct Cognitive and Physiological Manipulation: A Smile Is Not Just a Smile. Direct
manipulation of the cognitive apparatus or the physiology of the receiver can play a role. Lorenz
(1973) and later Cosmides et al. (1992) proposed that our information processing apparatus was
formed and optimized in the course of evolution. If our brains are optimized for adaptive
information processing then these adaptive structures can be exploited. An example of this
possibility is the perception of emotions. The signal receiver experiences the same physiological
changes as the sender of an emotion (Ekman et al., 1983). Thus by signaling "emotion"
the sender is able to influence the physiology of the receiver. In the case of a smile, this makes
sense because emotions change the cognitive processing of social stimuli. Happy people process
information less critically than sad people (Forgas, 1992). Thus, a smile does not only mean "I
am happy"; it also influences the information processing in the receiver in favour of the
sender (Grammer, 1995). This also explains why smiling is not necessarily bound to emotions
(Kraut and Johnston, 1979). Smiles are more reliably associated with social motivations than
with emotional experience. Comparable physiological changes occur with the perception of
olfactory stimuli. A female pheromone, i.e. copulin, which is produced in the vaginal secretion,
influences male processing of female attractiveness. Under the influence of female copulin,
males judge female attractiveness more positively (Grammer et al., 1996).
Although direct manipulation of the receiver through signals might play a critical role for
communication, it does not yet explain how senders can hide their intentions. This goal cannot
be achieved simply by doing nothing, because receivers will become suspicious. Again there are
many possible solutions to such a goal.
1.1.4. Being Honestly Dishonest. The first solution is the sending of meaningful signals either
out of context or with different motivational background. This motivational background might
contradict the actual goal of the sender. Grammer and Kruck (1996) showed that this solution is
not uncommon. Females who are not interested in males do not send negative signals, they send
a mixture of 60% positive and 40% negative signals. In this case the negative message is hidden
in positive signals. Here the deceptive attempt serves to avoid face-loss by the male
and thus potential aggression towards the female, which could occur when the male is bluntly rejected.
1.1.5. Multimodal Combinations: Signals from Different Channels and Metacommunication. A
second solution to the problem of intention hiding lies in combining signals from different
sensory channels. For instance in the case of laughter, body movements or postures, each with a
determined meaning itself, can be added, but also laughter quality (i.e. amount of vocalisation,
number of bouts etc.), odour and touching can co-occur. Such combinations have the potential to
create an almost infinite number of meanings of laughter, from sexual enticement to mobbing (Grammer, 1991). Furthermore, metacommunication
becomes possible through multimodal combinations. Laughter for instance might put everything
else that happens into a "play-mode" which simply says "Look, it is not serious what I am doing"
(van Hooff, 1972). By doing so, multi-meaning combinations are possible, which allow almost
endless combinations for communicative purposes.
In the case of multimodal combinations, communicative channels are interacting. If this occurs,
there has to be cross-modal integration of information. By comparing the impact of the different
parts of the signal the receiver has to arrive at a decision what actually is meant. This problem is
well known in research on the interaction of non-verbal behaviour and verbal utterances.
Eyebrow raises and slight vertical or lateral rotations of the head appear to serve both to punctuate
speech rhythm and to emphasize utterances. Although the eyebrow raise is a clearly identifiable,
cross-culturally constant and discrete signal (Eibl-Eibesfeldt, 1989; Grammer et al., 1988),
it can take many different meanings when associated with speech. Thus we would reach an
almost infinite number of possible combinations. This is also the case for combinations of
eyebrow raising with other facial muscle movements. Grammer et al. (1988) showed that the
resulting patterns were highly variable. This makes it unlikely that a receiver will encounter the same
combination of stimuli again, which could make decoding more difficult. The same could also
account for the high variance in human non-verbal behaviour.
1.1.6. Higher Order Combinations: Signals from the Same Channels. Higher order combinations
or the summation of signal combinations over time could also veil the sender’s intentions.
Combinations can occur when simple or multimodally combined signals are combined again
either sequentially or at the same time. It becomes even more complex when a member of the
combination can be substituted by another signal with the same meaning. As a result the same
meaning can arise through different combinations, which are not identical. Moreover, in such
combinations not only the presence of a signal plays a role; even its absence can be a signal
in itself. Moore’s (1985) description of female solicitation may serve as illustration. She found 51
behavioural units in the validation of her repertoire and any combination of at least 10 units of
female behaviour could predict male approaches. How could a male receiver have managed to
"evaluate" or "recognise" female interest? Indeed there are 1.3 × 10^10 possible combinations of 10
behaviour elements out of a repertoire of 51.
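
The size of this search space can be checked directly. The following sketch assumes that the order of the units does not matter and that exactly 10 of the 51 units are chosen; it only counts subsets and says nothing about how a receiver might evaluate them. The variable names are illustrative.

```python
import math

repertoire_size = 51   # behavioural units in the repertoire
units_per_combo = 10   # units in one combination

# Number of unordered subsets of 10 units drawn from 51
n_combinations = math.comb(repertoire_size, units_per_combo)

print(f"{n_combinations:,} combinations")   # 12,777,711,870, i.e. about 1.3 x 10^10
```

Counting combinations of "at least 10" units instead of exactly 10 would make the number larger still.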
1.1.7. Manipulation of the Time Structure: Good Vibrations and the Generation of Noise. There
is mixed evidence on the assessment of function when using discrete behavioural categories; thus
syntactical rules could provide some cue for the interpretation of signals independent of the signal
content. With a mathematically sophisticated method, Grammer et al. (in prep.) analysed
behaviour sequences in male-female interactions. They found a highly complex interweaving of
behavioural elements. Pairs created dance-like movement patterns. The drawback is that these
patterns are highly idiosyncratic: not one pattern occurred twice in almost 8000 patterns
identified on the pair level. The temporal organisation of these patterns varies with interest: they
become more stereotypical when interest in the partner is high. Moreover, the female initiates
the patterns, and the more patterns are present the better the male feels in the interaction.
This suggests that temporal organisation itself could predict male-female interest. Manipulation
of time structure and its perception can also be reached by trying to achieve synchronisation,
although most attempts to describe synchronisation empirically have not been
successful. In contrast to the impossibility of description, subjects can rate the degree of synchronisation in interactions, and these ratings correspond to the subjective experience of
interpersonal rapport among the interactants (Bernieri and Rosenthal, 1995).
Another solution to the hiding of intentions problem is the creation of "noise." That means
sending many signals without attaching meaning to them, and hiding the meaningful signals in this
"noise." The noise will make it almost impossible for the receiver to decode the real intentions
of the sender. In this context, the concept of "protean behaviour" (Chance and Russell, 1959)
could play a role. The concept proposes that behaving unpredictably and erratically will mask
the signal sender’s intentions, like prey trying to evade its predator. In the study cited above,
the repeated and time-constant patterns of behaviour are hidden in a continuous flow of behaviour.
A closer analysis of the data reveals that only the precise timing and not the content of the
performed behaviour plays a role for the generation of mutual understanding.
Another alternative is that rare events which are hidden in "noise" advertise intentions at "hot
times" at "hot-spots." This means that one single signal with a distinctive and explicit meaning
sent at the right time could signal the intentions of the sender. Thus one or two events which
might differ from interaction to interaction can be enough for communicating intentions. This
alternative is highly likely because individuals communicate stimulus information in a way
which is adaptive for perceivers to detect, and perceivers detect this information, when they are
attuned to it (McArthur and Baron, 1983).
Mutual understanding can also be reached by sending information in a way the receiver cannot
consciously assess. This means sending unknown signals to the receiver or sending variants of
signals the receiver is not likely to interpret. Thus, the receiver is forced to learn the shared code
slowly. This hazard would lead to a highly variant shared code. The emphasis lies on the term
"learning." The sender has to prepare the receiver slowly for receiving his actual intentions.

4 The Quality of Movements: A Neglected Dimension
Slow escalation and hiding intentions when at risk of failure on the sender’s side is paired with a
cognitive apparatus which tries to unravel the intentions on the receiver’s side. This situation will
force communication to a level where the receiver might not be able to assess consciously the
manipulation which is underway. The receiver then has to look for honest and develop strategies
This situation is contrasted with a scientific research apparatus which seems not at all adequate
for the analysis of such a paradoxical situation. It is obvious for any observer that behaviour is
distinctive and that there are many levels and methods of description for signals. Categories can
embody muscle movements or groups of such movements like in Ekman and Friesen’s Facial
Action Coding System (Ekman and Friesen, 1978) or descriptions of limb movements like in the
Berner System developed by Frey and Pool (1976) or even more complex units like walking.
Muscle movement description is the most basic level of description. On the next level behaviour
is already described by interpretative categories, even when the definition is highly
operationalized. This interpretation sometimes can involve hypotheses on the function of a
behaviour. The term "coy smile" is already a hypothesis on the function of behaviour although it
describes a distinct motor pattern of head movements combined with a smile (Eibl-Eibesfeldt,
Common to all these approaches is that a continuous behaviour stream is forced into a series of
event categories which might subsume comparable, but visually distinct behaviours. For
example a non-verbal threat can be performed in many ways: by raising an arm fast or slowly, with fist
clenched or not, with the movement staying at the maximum flexion for a certain time and going back fast or slowly. Any of the possible combinations will produce a
different type of "threat." We can also transfer the movement combination itself to a leg or even a
head movement: moving the head fast towards somebody else, then staring at him and finally
looking slowly away. In every case we will produce an event of "threat" by using a certain
movement configuration. A solution would be to describe all these different types of threat, with
the result that the number of events of each type of threat will become small and useless for
statistical treatment. An additional problem at this level is the reliability of observations: in order
to be identified reliably and to avoid the development of an observer bias, categories have to be
unmistakable, stereotypical, homogeneous and discrete. This can lead to oversimplification and
Thus a dilemma arises: observers surely can interpret and understand the behaviour of
others and their possible intentions correctly but we might not be able to identify them on the
basis of communication theories which use discrete signals.
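
The sparsity problem raised by the "threat" example can be sketched numerically. The discretisation below is an assumption made only for illustration: the three hold-duration bins and the total of 100 observed events are hypothetical and do not come from any data set discussed here.

```python
from itertools import product

# Hypothetical discretisation of an arm-raise "threat" (assumed bins)
dimensions = {
    "raise_speed": ["fast", "slow"],
    "fist": ["clenched", "open"],
    "hold_at_maximum": ["short", "medium", "long"],  # assumed bins
    "return_speed": ["fast", "slow"],
}

# Every combination of values is a separate "type of threat"
threat_types = list(product(*dimensions.values()))
n_types = len(threat_types)

observed_events = 100  # illustrative total, not an empirical figure
mean_events_per_type = observed_events / n_types

print(f"{n_types} threat types, "
      f"{mean_events_per_type:.1f} events per type on average")
```

Even this coarse scheme for a single limb spreads the observations thinly across categories; adding leg and head variants or finer timing bins multiplies the number of types further.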
The solution we propose to this dilemma is that the categories which are used to break up
the flow of behaviour are only a poor approximation to how the receiver processes information.
Any behaviour is a change in a continuous information flow which could be noticed by a
receiver. These changes in the information flow can be of various qualities. It is possible to
encode information in the quality of a change. We propose that body movements and parts of
body movements themselves are processed, i.e. the receiver does not perceive and summate
behaviour in single categories such as "Legs open" or "Hair Flips." Instead the receiver could assess
elementary dimensions of behaviour like speed, acceleration and amount of movement or motion
quality. The brain might not use categorical perception in the same way as we classically analyse
behaviour. An example of different possible meanings that arise through speed differences in
the onsets of events was given by Grammer et al. (1988) for "eye-brow-flashes" in a cross-cultural analysis. In the most common event the pattern starts with the contraction of the M.
corrugator supercilii. This contraction disappears and in a fast movement the brows are lifted and
a smile appears. The duration of the contraction of the M. frontalis and the Pars palpebralis which
causes the eyebrow lift is variable and disappears slowly, whereas the smile caused by a
contraction of the M. zygomaticus major stays on the face. The second pattern also starts with a
contraction of the M. corrugator supercilii. But then, there is a slow lift of the brows, and the
contraction of the M. corrugator supercilii does not disappear. Rarely a smile is added (see Figure
3 for details). In this case, meaning arises through the combination of elements present and the
dynamic properties of the elements.
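
The elementary dimensions named above (speed, acceleration, amount of movement) can be computed from a tracked position without assigning the movement to any category. The sketch below is only an illustration under stated assumptions: positions are hypothetical 2D samples in metres taken at a fixed interval, acceleration is taken as the change in speed between samples, and the function name and the two example trajectories are invented.

```python
import math


def movement_quality(positions, dt):
    """Describe a trajectory by its quality rather than its category.

    positions: list of (x, y) samples taken every dt seconds
               (at least three samples are needed).
    """
    # Speed between consecutive samples (finite differences)
    speeds = [math.hypot(x1 - x0, y1 - y0) / dt
              for (x0, y0), (x1, y1) in zip(positions, positions[1:])]
    # Change of speed between consecutive intervals
    accelerations = [(s1 - s0) / dt for s0, s1 in zip(speeds, speeds[1:])]
    return {
        "mean_speed": sum(speeds) / len(speeds),
        "peak_speed": max(speeds),
        "peak_acceleration": max(abs(a) for a in accelerations),
        "amount_of_movement": sum(s * dt for s in speeds),  # path length
    }


# Two hypothetical hand raises of 0.30 m, sampled every 0.1 s.
# Both would fall into the same event category "hand raise".
fast_raise = [(0.0, 0.0), (0.0, 0.15), (0.0, 0.30),
              (0.0, 0.30), (0.0, 0.30), (0.0, 0.30)]
slow_raise = [(0.0, 0.0), (0.0, 0.06), (0.0, 0.12),
              (0.0, 0.18), (0.0, 0.24), (0.0, 0.30)]

fast_q = movement_quality(fast_raise, dt=0.1)
slow_q = movement_quality(slow_raise, dt=0.1)
print("fast raise:", fast_q)
print("slow raise:", slow_q)
```

The two trajectories cover the same distance and would be counted as the same event, yet they differ clearly in peak speed and peak acceleration, which is the kind of information an event-based coding scheme discards.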
The solution to the dilemma between event oriented categorisation of behaviour and
possible qualitative differences in the same behaviour with different meaning can be found in the
way visual information is processed. The information is not only reduced; in parallel, new
information is created. Processing of visual signals takes place on two levels. The first is low-level
processing, where the perceived information is recoded into colour, motion, depth and the time
integration of movements. On this level, information is detected which seems to be necessary for
spatial navigation. During high-level processing, where pattern recognition occurs, a "world-model"
or a priori knowledge like form, size and schemes is added. On this level, the brain compares the
results it has obtained so far and then tries to come to a coherent interpretation of the world
(Arbib and Hansen, 1987). Traditional behaviour research works on the second level and has
neglected the first level
which basically describes the quality and not the content of behaviour.
In order to assess qualitative aspects of body movement Johansson (1973, 1976) used a
point light display fixed to the joints of his subjects and filmed them in the dark. The resulting
films appeared as a configuration of bright points against a dark background. If such point-light
clips are shown to raters, they can recognise sex and age of the subject (Cutting and Proffitt,
1981). This effect has been replicated several times (e.g. Runeson and Frykholm, 1983). If the
point light displays are presented statically they are rated as a random display of points.
Observers are able to isolate abstract patterns of movements from such displays. Moreover,
observers are able to detect effort, intentions and deception (Runeson and Frykholm, 1983), and if
the point light displays are applied to faces, perceivers are able to detect information connected
to emotion (Bassili, 1979). Gait and movement analysis has a considerable tradition in medicine
where it is used to describe movement disorders. It has rarely been used in behaviour research.

Figure 3. Time structure and patterns in the eye-brow-flash. The figure shows two different
patterns in an eyebrow-flash. The prototypical pattern in (a) was described by Eibl-Eibesfeldt
(1972). This pattern starts with a frown which disappears. The brows are lifted quickly and a smile
is added. The brow raise disappears, while the smile can stay on the face. The second pattern (b)
is completely different: it also starts with a frown, which does not disappear whilst the brow
raise appears on the face. The brow raise onset duration is three times as long as in the first
pattern and the pattern's duration is much longer than in (a) (Grammer et al., 1989). In addition
to the difference in combinations of muscle movements there is also a difference in the time
structure which changes the quality of the expression.
[Figure 3, panels (a) and (b). Timing labels recovered from the figure: 0.1 sec; 0.08 to 1.2 sec;
0.08 to 2 sec; apex: maximum contraction; onset >0.3 sec; offset 1.2 to 3 sec; 0.08 to 2 sec.]
Motion quality is a powerful descriptor which is used for signal decoding by humans. Yet there
is a serious methodological drawback: point light displays cannot be used in unstaged
interactions, because they are an obvious research device. They make any stimulus person rather
self-conscious and alert them to the variables of interest to the researcher (Berry et al., 1991).
We thus developed a digital image analysis system which can be used to decode automatically
speed, acceleration and size of moving body parts from digitised video images.

2. AUTOMATIC MOVIE ANALYSIS (AMA) AND
Machine understanding of human action has become a fascinating topic in computer science
in recent years. Unfortunately researchers have mostly avoided dealing with people in
their natural environments (Pentland, 1995). In contrast to this, several research groups have
developed devices which are able to accomplish the basic task of making
computers recognise who they are working with and to be sensitive to people's gestures and
expressions. Computers which are able to track and model human actions can be used for behaviour
analysis and not only as new input devices. They can recognise faces (Moghaddam and Pentland,
1995) and facial expressions (Essa and Pentland, 1995). Unfortunately these approaches are designed
in a way that the dynamical component is lost and expressions are again forced into event
categories, like surprise, happiness, disgust or anger, with all the drawbacks of categorisation.
In a more general approach, Starner and Pentland (1995) and Pentland and Liu (1995) have
developed a motion analysis system where the human body is modeled as a Markov device with
a number of internal mental states, each with its own particular behaviour, and inter-state
transition probabilities. Internal states are determined through an indirect estimation process,
using the person’s movement and vocalizations as measurements. These variables then are fed
directly into a computer which then can observe a person’s actions and respond accurately.
Niyogi and Adelson (1995) developed a set of techniques which are capable of analysing the
patterns which are generated by walking. These patterns are then translated into a stick-figure
and the gait can be analyzed.
Unfortunately these devices are quite sophisticated and expensive, and they have not been used
for behaviour research. In order to assess the quality of movements and human expression we
developed a system which can be used even in labs with a low budget. We dispensed with real-time
operation capabilities and turned to the analysis of digitised video. This procedure is
called Automatic Movie Analysis (AMA). The advantages are clear: it is possible to repeat any
type of analysis and control for artifacts.

2.1. The Programming Platform
Automatic Movie Analysis is a programming platform which can apply a series of sophisticated
filters to video images sequentially. In this article, we will restrict our analysis to Motion Energy
Detection (MED). This is a simple but elegant filtering method for the determination of
qualitative aspects of behaviour. The method relies on the fact that pictures of movies (frames)
are time dependent distributions of grey-scales or colour values. A video-picture consists of an
array of pixels which take different values. In a grey-scale these values usually range from 0
(white) to 255 (black). If the camera view is static, single pixels will change their values when
movement occurs. If there is no movement, the pixels stay at the same value. The solution is to
take the difference between two pictures: if there was no movement, we will get an all white
picture; if there is movement we will get grey values in those regions where movement occurred.
The amount of movement becomes visible through the number of pixels which are not white.
This image differencing (Sonka et al., 1993) works quite well if the camera position and the
lighting conditions remain unchanged. The method can detect movements, but not their direction. In
this case AMA consists of the following steps (illustrated in Fig. 4):
1. Digitising of a movie in a range between 12.5 and 25 pictures a second (320 x 240 pixels frame
size) in greyscales (Pixelvalue=[0..255] greys, where 0 is white and 255 is black).
2. Making difference pictures. The arithmetic difference is calculated between two or more
consecutive images of the movie (MED).
3. Digital noise-reduction. Video noise is related to tape quality, camera resolution and light
conditions. In a video picture, pixels will change their colour randomly. As a result "lonely
pixels" will appear in the difference picture. The noise was reduced by
the application of a median filter to the difference pictures, where a pixel received the median
grey value of its nine surrounding neighbours.
4. Error detection. Flashes result from poor videotape quality or technical problems of the
recording equipment. If such flashes occur, there is an immediate high change in mean grey
density in the overall picture. If this occurred the grey density for this frame (t) was replaced by
the difference density between points t-1
5. Calculate mean grey values for one or more predefined regions of the picture. These regions
have to be determined by hand. In a newer version of the program one or more persons can be
separated automatically from the background.
6. Standardising of mean grey values. Different viewing angles of the person and changes of body
postures during the experiment lead to different overall values of mean grey density, because
visible contours change. Thus the mean grey values were transformed to z-scores.
7. Smoothing. At this stage, noise was still present and resulted in small but very fast and short
changes. The scores then were smoothed with a 5-point moving average.
8. Thresholding. Further analysis is possible if the continuous recordings are collapsed into events.
A threshold method was used. When the grey level change was under a certain threshold, the grey
level change was set to zero. The threshold was determined with an optimal thresholding method
(Sonka et al., 1993).
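The steps above can be sketched in a few lines of Python. This is only an illustrative sketch and not the original AMA program: the function names, the use of the absolute difference in step 2, the border handling of the median filter, the population standard deviation in step 6 and the fixed threshold passed in by the caller (instead of the optimal thresholding of Sonka et al., 1993) are all assumptions made for the example. Frames are plain nested lists of grey values, and step 4 (error detection) is left out.

```python
from statistics import mean, median, pstdev


def difference_picture(frame_a, frame_b):
    """Step 2: absolute grey-value difference of two equally sized frames (assumed form)."""
    return [[abs(a - b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]


def median_filter(img):
    """Step 3: each pixel gets the median of its 3 x 3 neighbourhood.
    Border pixels use only the neighbours that exist (an assumption)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(median(window))
        out.append(row)
    return out


def mean_grey(img, region):
    """Step 5: mean grey value inside a rectangle (top, left, bottom, right)."""
    top, left, bottom, right = region
    return mean(v for row in img[top:bottom] for v in row[left:right])


def motion_energy(frames, region):
    """Steps 2, 3 and 5 for a whole clip: one value per pair of consecutive frames."""
    return [mean_grey(median_filter(difference_picture(a, b)), region)
            for a, b in zip(frames, frames[1:])]


def z_scores(series):
    """Step 6: standardise to zero mean and unit (population) standard deviation."""
    m, s = mean(series), pstdev(series)
    return [0.0 if s == 0 else (v - m) / s for v in series]


def moving_average(series, width=5):
    """Step 7: centred moving average; the window shrinks at both ends of the series."""
    half = width // 2
    return [mean(series[max(0, i - half): i + half + 1])
            for i in range(len(series))]


def threshold(series, level):
    """Step 8: every value below the threshold is set to zero."""
    return [v if v >= level else 0.0 for v in series]
```

A clip would then be processed as `threshold(moving_average(z_scores(motion_energy(frames, region))), level)`, where `frames`, `region` and `level` are supplied by the user. Working on difference pictures rather than on the frames themselves is what makes the measure independent of the static background.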
Figure 4a shows a video sequence of a female "Hair-Flip" and a male watching her. In Figure 4b
all difference pictures are shown. Figure 4c shows the transformed z-scores of the male and
female from Fig. 4a. The male shows no movement, whereas three movement clusters were identified
for the female (Burst 1, 2, 3). First there is a hand movement together with a head toss (Burst 1,
frame 17). Immediately after this hand movement the hair flip starts (Burst 2, frames 19-39).
Finally, the female turns the head towards the male (Burst 3, frames 58-65).
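Movement clusters such as those in Fig. 4c can be described numerically once a thresholded series is available. The following sketch is an assumption about how such a description could be computed, not the code used in the study: a burst is taken to be a run of consecutive non-zero values, its size is the sum of the values in the run, the number of elements is the number of strict local maxima (following the caption of Figure 4), and speed is size divided by duration. The function names and the default frame rate are chosen for illustration only.

```python
def find_bursts(series):
    """Return (start, end) index pairs of runs of non-zero values; end is exclusive."""
    bursts, start = [], None
    for i, v in enumerate(series):
        if v != 0 and start is None:
            start = i                      # a burst begins
        elif v == 0 and start is not None:
            bursts.append((start, i))      # the burst ended on the previous frame
            start = None
    if start is not None:                  # series ends while a burst is running
        bursts.append((start, len(series)))
    return bursts


def count_maxima(values):
    """Number of strict local maxima. The run is padded with zeros so that
    a peak on the first or last frame of the burst is counted as well."""
    padded = [0.0] + list(values) + [0.0]
    return sum(1 for i in range(1, len(padded) - 1)
               if padded[i - 1] < padded[i] > padded[i + 1])


def describe_burst(series, start, end, fps=25.0):
    """Duration (s), size, number of elements and speed of one burst."""
    values = series[start:end]
    duration = (end - start) / fps
    size = sum(values)                     # area enveloped by the burst (assumed: plain sum)
    return {"start": start, "end": end, "duration_s": duration,
            "size": size, "elements": count_maxima(values),
            "speed": size / duration}


def describe_all(series, fps=25.0):
    """Descriptors for every burst in a thresholded series."""
    return [describe_burst(series, s, e, fps) for s, e in find_bursts(series)]
```

The number of bursts is then simply `len(describe_all(series))`. Counting minima in addition to maxima would only require a second pass with the inequality reversed.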
Thus we have the following descriptors for movement quality: number of bursts, their duration
and the size of an event (the area enveloped by the burst). The number of elements in an event was
calculated by counting the number of maxima and minima in the burst. A burst can have several
elements which are produced by combining different movements. The number of elements
describes the complexity of the movement. Finally we were able to record speed of movement
change, which is the size of the burst divided by its duration.

2.2. Interactions between Strangers and Subliminal Manipulation
This procedure was applied by Grammer et al. (1996) to social interactions between strangers in a
waiting room situation. In an experiment, strangers of both sexes met and interacted for 10
minutes while they were videotaped with a hidden camera. The experiment was made in Japan
and in Germany. This situation can be described in terms of high risk of social non-acceptance;
thus communication should be forced into a manipulative level. After the experiment, the subjects
made a self report on their interest in their partner and how pleasant they found the interaction.
The first two minutes of the videoclips were analysed first with traditional methods and then with
AMA. On the one hand, traditional analyses yielded cultural differences between Japan and
Germany. A typical Japanese behaviour is "nodding", which rarely occurs with
comparably high frequencies in Germany. Moreover, when the frequencies of the generic
behaviour codes were compared to male and female interest, no significant correlations were
found. On the other hand, AMA reached similar results in Japan and in Germany. Females
changed the quality of their behaviour when they had high interest in the male. These qualitative
changes were not due to mere nervousness or excitement: the females actually moved more, but
showed smaller and slower movements. These qualitative
changes give an impression of slow and determined movements where the single parts were
accentuated. Males reacted to these qualitative modifications positively and experienced the
situation as more pleasant, although their interest in the partner was not affected. In addition,
males who perceived the situation positively talked more. Thus we can conclude that it is not the
content of a non-verbal behaviour but the quality of the movement which actually holds the
information about the interest of the sender.

Figure 4. A female Hairflip analyzed with AMA. Instance of a female Hairflip preceded by hand
movement. In (a), the original film sequence is shown (70 frames, every second frame skipped). The
rectangles in frame 1 delimit the areas of interest for which difference-pictures were calculated
(b). Note the changes in greydensity (e.g. head movement). In (c), terminology and the movement
descriptors resulting from the automatic movie analysis (AMA) are shown: number of bursts (total
amount of movement), and duration, size and speed of bursts (size divided by duration, i.e.
greydensity change per s). The number of elements (number of maxima) within a burst is considered
to be a measure of complexity since elements normally are produced by different movements. The
three movement bursts of the woman identified by the threshold method are in good agreement with
the film sequence. The first burst depicts a hand movement accompanied by a head toss (frames
~17), whereas the immediately following burst represents the hair flipping (frames 19-39). During
the third burst, after a pause of about 1.5 s, she turns her head towards him (frames 58-67).
[Panel (c): greydensity z-scores of the male and the female plotted against frames (0 to 70), with
Burst 1, Burst 2 and Burst 3, their maxima, an element, the burst size and the threshold marked.]
One objection could be that the movements are generated by speech and lively conversation
between the interactants. This was not confirmed. The amount of speech did not correlate with
qualitative changes in the females' behaviour nor with female interest. When the correlations
between movement data and interest were corrected for speech, it became clear that non-verbal
behaviour is the means of communication in this situation. Thus in real life situations where risk
is present, non-verbal behaviour plays the main role in communication. The evolutionary theory
behind this explanation is the fact that females actually have a greater risk in male-female
interactions than males (Trivers, 1972). But it is not only the risk of losing investment:
the risk of being deceived by a male is also quite high. In a questionnaire study by Tooke
and Camire (1991) 60% of the males reported that they had used deception in such
interactions. Thus it seems logical that females would try to manipulate the males slowly
without revealing their intentions in order to gather information about the male's behaviour
tendencies. This is possible when the male feels pleasant and starts to talk and reveal
A second, even more interesting result is the fact that behaviour is not a simple continuous flow
of movements: it is definitely structured into single bouts (see Figure 5). Times of movement
and non-movement alternate. Thus for a closer analysis we will look at the quality of the bouts
themselves, and ask which information qualitative changes might provide.

2.3. Showing off: Simple Movements and Their Communicative Value
In order to accomplish this task, video-material from an observational study on female cycle and
self-presentation (Grammer et al., 1996) was used. The starting point of this study was the fact
that in humans we find almost complete female sexual crypsis. This
means that in contrast to most non-human primates, humans show no obvious behavioural or
visual sign for ovulation. There has been a lot of speculation about the function of female sexual
crypsis and it is proposed that it leads to male-female bonding, that it promotes active female
choice or that it may give a chance to females to induce sperm-competition (Alexander and
Noonan, 1979; Benshoof and Thornhill, 1979; Baker and Bellis, 1995). By hiding the point
where conception is quite likely, females would force the male to stay near them to ensure
conception. The male would do so to be sure that the female conceives his child and not that of
another male. The fact that the male has to stay near the female could promote the emergence of
a male-female relationship and lead to male investment in the offspring. The induction of
sperm-competition and gene-shopping only works if females use extra-pair copulations at the time of
maximal likelihood of conception. This is actually the case and leads to relative paternity
security of about 90% (Baker and Bellis, 1995). The function of sperm-competition is thought to
be either the optimisation of offspring quality or, if a relationship already exists, to obtain
genetically variable offspring. Grammer et al. (1996) showed that females who had a steady
partner and who came alone to an anonymous discotheque showed more skin and wore
tighter clothes when oestrogen levels were high. Moreover, these females knew consciously
that they were signaling availability through clothing style.

Figure 5. Movements from a two-minute interaction. Greydensity changes of a person throughout a
whole film sequence of two minutes. In (a), greydensities obtained by averaging all greyvalue
differences of a difference-picture are plotted (end of step 5, cf. text for details). In (b), the
data are z-transformed (step 6), smoothed (step 7), and a threshold has been calculated (step 8).
AMA identified 15 bursts.
[Panels (a) and (b): greydensity plotted against frames (100 to 1500), with the threshold marked.]

Non-verbal behaviour can be used strategically when somebody tries to impress another person.
In the context of self-presentation non-verbal behaviour is irrepressible and impactful. It is
off-the-record and it is difficult to describe verbally (DePaulo, 1992). Many studies on
self-presentation have shown that people can successfully make clear to others, using non-verbal
cues, the internal state that they are actually experiencing and that they also can convey the
impression of a state that they are not really experiencing. In this process many non-verbal cues
can play a role; even body movements, postures (Cunningham, 1977) and gait (Montepare et al.,
1987) can be used as signals. According to DePaulo (1992) women are more concerned with
self-presentation than men; they are non-verbally more involved and spontaneously more
From these results we can hypothesise that the quality of behaviour may also change in order to
signal sexual availability. The need for unobtrusive signaling in this case is reinforced by
the fact that active female choice has to exist for the induction of sperm-competition. Thus risk is
also present in such a situation because signaling sexual availability will obviously attract all
males. The risk is that one might attract the wrong males; thus the signaling level should be subtle
and not exaggerated.
2.3.1. Turning Around: An Instance of Self-Presentation. During this study 123 females (mean age
23) were filmed from front and back in order to determine the clothing style. The females had to
turn around on a command issued by a female experimenter standing behind the camera. This
turn-around movement has no communicative value at first sight. In this situation however
we had onlookers (male and female) and a videocamera. According to Goffman (1959)
self-presentation will take place as soon as a stage for the presentation is present. In this type of
self-presentation public self-consciousness also plays a role. This is defined as "an awareness of and a
responsitivity to the impressions that are being made on others" (Scheier and Carver, 1981,
p. 198). We thus assumed that the turning-around movements can be used as communicative
tools. At the side of the subject, a male or female was placed for holding a blackboard with an
identification number on it. The female subjects all were non-pill-takers, who either were singles
or who had a steady partner but came alone to the discotheque. The result is a sex of
experimenter (male/female) times type of female (single/paired but alone) design. Oestrogen
levels, which can give a hint on cycle state, were determined by enzyme immunoassay from
saliva. In addition skin exposure and tightness were analysed by Grammer et al. (1996) with
digital methods and a rating scheme for tightness.
The turning movement was digitised in a 320x240 pixels size at 25 frames a second. Digitisation
started with the beginning of the turning movement and ended when the turning movement came
to a halt. These digital videos were analyzed with Automatic Movie Analysis. Motion Energy
Detection was applied for each subject on the screen, the experimenter and the female separately.
Duration of the movement, the number of bursts in the movement (the basic units of the
movement), complexity (the number of single maxima), the maximum speed of the movement,
and the information content (amount of grey value change per frame) were recorded (see Figure
6). The resulting movement graphs then were played back parallel to the movies. This procedure
allowed a frame-exact determination of the beginnings and the endings of the movements by
comparing the movement graph to the video pictures. The playback also showed that the
movements could be divided into several parts (see Figure 6 and Table 1). Basically a turn-around
can start with an intention movement prior to the actual turning. We find different types of
movement initiation. The head can introduce the movement as can one foot, one knee or a turn of
the upper body. Then the turning around itself follows. When the turning ends, the standing
position can be corrected or additional movements can be added. These movements may consist of hip
swaying, hair-flip, arm swaying or moving the body from the left to the right. Fig. 6 shows a
breast-presentation movement, where the upper body turns over after the stop of the actual
turning movement and the body silhouette becomes visible from the side. In addition a typical
"breast out-shoulder-back movement" occurs. These four types form only the main classes; the
additional movements can be done by any movable body part. The movements also were coded
"traditionally" with the observation categories presented in Table 1.

Figure 6. Female reaction to a command: Turning-around movements. Sequence (a) shows the digitized
pictures for a reaction to a command. The female command giver is standing at the camera. On the
command "Please turn around" the female turns around. On the left side of each picture the
"stimulus male" can be seen. Sequence (b) shows the difference pictures with the regions for the
stimulus and the subject which are used to calculate the mean grey density. In (c) these grey
density levels are presented as z-scores. The movement has three phases: phase I, where an
initiation movement is made; phase II, where the actual turn-around movement happens; and phase
III, where an additional movement is added to the turn around (see also text). The description
parameters for the movement are the same as in Fig. 4c. The difference to Figure 4 is that the
threshold is calculated dynamically for each 10 frames in order to deal with shifts of grey density
changes.
[Panel (c): grey density z-scores plotted against time (0 to 3000, in units of 1/600 sec), with the
threshold, the turning duration, and the head and female movements marked.]

2.3.2. Movement Quality and Oestrogen Levels. In a first step a conventional statistical
analysis was performed in order to see if the four experimental groups are qualitatively
different. Table 2 shows the results, which suggest that the four groups can be
differentiated on the basis of partner status and stimulus person only with respect to the
information content of the movement. The highest amount of information is present when the female is
confronted with a female stimulus. At first glance this contradicts the showing off
If we go one step further, we find a high interrelation between the recorded movement
parameters themselves. Duration, the number of bursts, complexity and information are
positively correlated in all groups (n=120, rs=0.65 to 0.84) and correlate negatively with speed
(-0.45 to -0.87). That is, short movements have fewer bursts, are less complex and carry less
information, and their speed is high. These interrelations suggest that there are physical
constraints on movements.
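For readers who want to reproduce this kind of analysis, a rank correlation between two movement parameters can be sketched as follows. We assume here that rs denotes Spearman's rank correlation, computed as the Pearson correlation of average ranks; the function names are ours and the numbers in the usage example are invented, not data from the study.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the block of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Product-moment correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Invented example: longer movements tend to be slower.
duration = [1.2, 2.0, 0.8, 3.1, 2.4]
speed = [9.0, 6.5, 11.0, 4.0, 6.0]
rs = spearman(duration, speed)   # negative for these made-up values
```

Because only ranks enter the calculation, the coefficient is insensitive to the arbitrary scale of the grey value measures.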
In a next step, we correlated oestrogen levels with the movement parameters. The
Bonferroni corrected correlations show that only females confronted with a male stimulus
change their behaviour quality. Single females make slower and more complex movements.
Both single and paired females show more information per time in their movements with
increasing oestrogen levels (Table 3).
2.3.3. Movement Quality and Stimulus Reaction. These results suggest that single females
react toward a male stimulus by changing the quality of their behaviour. But how does the
stimulus male perceive it? When we correlate the stimulus male’s behaviour quality with those
of the subjects, we do not find any significant correlation. Thus it seems that males do not
notice the female’s qualitative behaviour changes directly. It is also possible to directly test if
the male reacts to female oestrogen levels. The stimulus male’s behaviour depends on the
paired female's oestrogen levels. He changes the duration of his movements (n=14, rs=0.68),
makes more bursts (rs=0.66) and makes more complex movements
(rs=0.57). The stimulus female does not show any significant reaction to other females' oestrogen
levels.
But as we know, there is no direct coupling between the male's and the female's qualitative
behaviour changes, i.e. the stimulus male's movement quality does not change in parallel with the
subjects' qualitative change. Maybe the males use other sources of information, like skin showing or
tightness of clothes.
If this is the case, the reaction of the stimulus males might not be due to qualitative
changes in female behaviour at all. The reaction of the males could be a reaction to the heightened
sexual signaling through clothing style. Thus we controlled the correlations between stimulus
behaviour and oestrogen levels for skin with partial correlations. The correlations for the male
stimulus reactions to unpaired females' oestrogen levels do not disappear (df=11, duration: rs
partialized=0.64, p=0.02; complexity: rs partialized=0.62, p=0.02).
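The logic of controlling a correlation for a third variable can be made explicit with the standard first-order partial correlation formula. The sketch below is a generic illustration and not the statistics package used for the analysis; the function names are ours, and applying the formula to rank correlations is an assumption about how the partialized rs values were obtained. With one controlled variable the degrees of freedom are n - 3, so that 14 cases would give df = 11.

```python
from math import sqrt


def partial_corr(r_xy, r_xz, r_yz):
    """Correlation between x and y after removing the linear influence of z.

    r_xy: correlation between the two variables of interest
          (e.g. stimulus behaviour and oestrogen level)
    r_xz, r_yz: their correlations with the controlled variable (e.g. skin display)
    """
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def partial_df(n, n_controls=1):
    """Degrees of freedom for testing a partial correlation."""
    return n - 2 - n_controls
```

If the controlled variable is unrelated to both variables of interest, the partial correlation equals the original one; the more it is related to both, the more the coefficient shrinks.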
So far the results suggest that the behavioural changes are actually changes which occur
together with high oestrogen levels. We may conclude that females who develop interest in a
male signal high oestrogen levels. If this assumption is true, then we should expect that these
changes are present in all females with high oestrogen levels, and that it is impossible to
suppress these changes completely. Nevertheless, under the right stimulus conditions, females
could either fake or exaggerate them, which would suggest that they are cognitively accessible.
Females with higher oestrogen levels show higher information content in their movements
when they are confronted with the stimulus male, but the male reacts only in the case of paired
females. We have found the highest values of information content when a female stimulus is
present, but there is no difference between paired and single females when a male stimulus is
present (see Table 2).
If we look back at the considerations about possible communicative mechanisms, we find
many possible solutions for this. The most obvious one is that the simple physical measures we
used for movement description are not adequate, or there is a summation of different features
over time. So far we can exclude at least multimodal communication where skin showing and
movement quality add up.
2.3.4. A Neural Network Approach for the Analysis of Movement Quality: Parallel Distributed
Processing. In recent years, connectionism has become a focus of research in a number of disciplines. Neural networks represent a special kind of information processing: connectionist
systems simulated by a computer consist of many primitive cells which are working in parallel
and are connected via directed links. This forms an analogy to the human brain: the cells are
analogous to neurons and the links are the connections between those neurons. The main
processing principle of these cells is the distribution of activation patterns across the links similar
to the basic mechanisms of the brain. Information processing in the brain is based on the transfer
of activation from one group of neurons to others through synapses. In analogy to activation
passing in biological neurons each unit receives a net input that is computed from the weighted
output of prior units with connections leading to this unit. However, most current neural
networks do not try to imitate biological reality closely.
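The net input of a single unit can be written down directly. The sketch below is a minimal illustration of the principle; the logistic activation function, the bias term and the function names are assumptions for the example and are not taken from a particular simulator.

```python
from math import exp


def net_input(outputs, weights, bias=0.0):
    """Weighted sum of the outputs of all prior units connected to this unit."""
    return sum(o * w for o, w in zip(outputs, weights)) + bias


def logistic(x):
    """A common squashing function mapping the net input to the range (0, 1)."""
    return 1.0 / (1.0 + exp(-x))


def unit_activation(outputs, weights, bias=0.0):
    """Activation passed on by the unit: squashed net input."""
    return logistic(net_input(outputs, weights, bias))
```

Training a network means adjusting the `weights` (and `bias`) of all units so that the activations of the output units match the desired classification.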
In neural networks "knowledge" is distributed through the activation of cells and the
weighting of the links. The networks are organised by training. In supervised training, the
network "learns" a set of patterns together with their classification by repeated presentation.
Through this "learning process," classical logical conclusions are replaced by vague and
associative recalls. This is of advantage in all cases where no set of clear logical rules
can be given. After learning, the neural network may or may not be able to classify unlearned
patterns correctly. If it can, we can assume that the patterns contain at least some
information which is common to certain classes. Unfortunately it is very difficult to
recover the information the network has used for classification.
Neural networks thus can be used to test whether a pattern contains information which can be
used to classify it. A neural network analysis was applied to the raw data from the
"showing-off" study above. The network was constructed as a time-delay neural network (TDNN,
Waibel, 1989) on the SNNS-Simulator (Zell, 1994). The network embodied 10 (features) times
24 (total delay length) input units, 120 hidden units (receptive field) and 3 output units for low,
middle and high oestrogen levels. Time-delay networks do not use a static presentation of
patterns and they can be used for the independent recognition of features within a larger pattern.
The update algorithm forces the network to train on time/position independent detection of
sub-patterns. However, there is no specific set of rules on how to construct a network, and building
networks heavily relies on trial and error. Thus, the fact that it is not possible to train a network
does not mean that there is no information to learn.
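The idea of time/position independent detection can be illustrated with the forward pass of a single time-delay layer. This is a simplified sketch under our own assumptions and not the SNNS implementation or the network used here: every hidden unit applies the same weight kernel to each window of consecutive frames, and the maximum over all window positions is taken as the response (Waibel's original networks integrate over time instead). All names and the tiny example are hypothetical.

```python
from math import exp


def logistic(x):
    return 1.0 / (1.0 + exp(-x))


def time_delay_layer(frames, weights, biases):
    """Forward pass of one time-delay layer.

    frames:  list of T feature vectors, each of length F
    weights: one kernel per hidden unit, shaped [delay][F]; the same kernel
             is applied at every time position (weight sharing)
    biases:  one bias per hidden unit
    Returns T - delay + 1 activation vectors, one value per hidden unit.
    """
    delay = len(weights[0])
    out = []
    for t in range(len(frames) - delay + 1):
        window = frames[t:t + delay]
        acts = []
        for kernel, b in zip(weights, biases):
            net = b + sum(w * x
                          for w_row, x_row in zip(kernel, window)
                          for w, x in zip(w_row, x_row))
            acts.append(logistic(net))
        out.append(acts)
    return out


def pool_over_time(activations):
    """Strongest response of each unit over all window positions."""
    return [max(column) for column in zip(*activations)]


# One hidden unit that responds to a rise in a single feature (delay of 2 frames).
kernel = [[[-4.0], [4.0]]]
early = [[0.0], [1.0], [0.0], [0.0], [0.0]]
late = [[0.0], [0.0], [0.0], [1.0], [0.0]]
same = (pool_over_time(time_delay_layer(early, kernel, [0.0]))
        == pool_over_time(time_delay_layer(late, kernel, [0.0])))
```

In the example the same sub-pattern occurs early in one sequence and late in the other, and the pooled response of the unit is identical in both cases, which is what independence of time position means in practice.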
We applied two basic training methods. First the network was trained with data from single
females with male stimulus. The validation was done with single females with female stimulus.
The data from paired females with male and female stimulus were then tested for classification
analysis. Second, the training was done with data from paired females with male stimulus, the
validation was done with paired females with female stimulus, and the testing with single
females with male or female stimulus.
The classification results showed astonishing stability: 66% of the cases from method one and 70%
of the cases from method two were classified correctly. A closer look reveals that the wrong
classifications occurred only among low and middle oestrogen levels, which were
sometimes confused with each other but never classified as high. With both methods combined,
high oestrogen levels were classified 100% correctly. Thus the TDNN was able to discriminate
between high and middle/low oestrogen levels correctly (see Table 4) using MED data from
video pictures processed through AMA.
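How a moderate overall hit rate can coexist with a perfect hit rate for one class becomes clear when the classifications are laid out as a confusion matrix. The sketch below uses invented toy labels, not the data of Table 4, and the function names are ours.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Counts of cases per (true class, predicted class) pair."""
    matrix = {t: {p: 0 for p in classes} for t in classes}
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Proportion of all cases that were classified correctly."""
    total = sum(sum(row.values()) for row in matrix.values())
    correct = sum(matrix[c][c] for c in matrix)
    return correct / total


def recall(matrix, cls):
    """Proportion of the cases of one true class that were classified correctly."""
    row_total = sum(matrix[cls].values())
    return matrix[cls][cls] / row_total if row_total else float("nan")


# Invented toy example: low and middle are confused, high never is.
classes = ["low", "middle", "high"]
true = ["low", "low", "middle", "middle", "high", "high"]
pred = ["low", "middle", "low", "middle", "high", "high"]
m = confusion_matrix(true, pred, classes)
overall = accuracy(m)          # errors among low/middle lower the overall rate
high_rate = recall(m, "high")  # while every high case is recognised
```

Reporting the per-class rates alongside the overall rate shows where the errors of the network are concentrated.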
In order to isolate the movement prototypes, we calculated the mean movement curves for the
three classes of oestrogen levels. Figure 7 shows the results. The three curves for high, middle
and low oestrogen look different, but when tested the only significant difference is in the
information content. Lowest oestrogen shows lowest information content in movement
(Median:34). Middle oestrogen shows middle information (Median:36) and highest oestrogen
level shows the highest information content (Median:39, K-W 1-Way Anova, p=0.118). This
result finally brings us back to coding with discrete categories.
[Figure 7: three panels (a, b, c) of mean movement curves; x-axis: Time/frames, 0 to 150; y-axis: -2 to 3.]
Figure 7. Movement prototypes for oestrogen levels in females. This figure shows the mean movement curves for the
three oestrogen levels (high (a), middle (b) and low (c)). The black bars indicate the standard deviation for each frame.
The three movement phases and the apex (see Table 1) are indicated by dashed lines.
2.3.5. Discrete Coding of Movement Patterns. In order to find out if certain discrete patterns as
described in Table 1 are connected to oestrogen levels, a traditional coding was applied to the
digitised videos. It turned out that in none of the three phases could discrete codes discriminate
between oestrogen levels. The exception was an additional movement in phase three: the
presence of one or more additional movements occurred significantly more often under high
oestrogen levels. This relation was independent of the content of the behaviour (Median-test,
So far we can conclude that it is possible to describe intentions in communicative acts with the
help of qualitative changes in movements. Yet it is still unclear which changes are present,
because MED only crudely describes qualitative changes on a holistic level. Single movement
features are not captured by this method. Yet our hypothesis is confirmed: under high-risk
conditions, communicative acts are forced to a level where they are difficult to assess with
generic coding methods. This situation has led us to a series of new developments which we
currently pursue.
3. FUTURE DEVELOPMENTS
The description of the whole human body and its moving parts seems to be an unsolvable
endeavour. In recent years however, digital analysis of human movements and bodies has moved
far away from simple MED approaches. Basically, all methods which have been used up to now
are derivatives from two approaches. Either the contours of moving or non-moving objects are
separated from a background, or the displacement of pixels or groups of pixels is calculated as
optical flow (Sonka et al., 1993). For instance, when we want to look at emotions, a method for
surface analysis of the face has to be developed, in contrast to a three-dimensional tracking
method for an arm. The assessment of movements will differ from the assessment of postures
and a method to translate the static states will also be necessary. Basically a body has to be
separated from its background and then divided into its segments. This means that the body has
to be dissolved into head, face, arms, body and legs. Each of these parts can then be described
separately. Interestingly enough, there are many approaches to solve the task of body movement
tracking. The isolation of body parts including head and face does not pose a problem. This task
has been solved repeatedly (Pentland, 1995). Kakadiaris, Metaxas and Bajcsy (1994) for instance
proposed an integrated approach to segmentation, shape and motion estimation of complex
articulated objects which can also be used for human bodies.
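For comparison, the simple holistic measure that these newer methods move beyond can be sketched as plain frame differencing. The sketch assumes that motion energy is taken as the number of pixels whose grey value changes by more than a threshold between successive frames; the threshold value and the function names are illustrative assumptions, and the MED procedure used in the studies above may differ in detail.

```python
def motion_energy(prev_frame, next_frame, threshold=10):
    """Count pixels whose grey value changes by more than `threshold`.

    Frames are lists of rows, each row a list of grey values (0 to 255).
    """
    changed = 0
    for row_a, row_b in zip(prev_frame, next_frame):
        for a, b in zip(row_a, row_b):
            if abs(a - b) > threshold:
                changed += 1
    return changed

def motion_energy_curve(frames, threshold=10):
    """One motion energy value per pair of successive frames."""
    return [motion_energy(a, b, threshold) for a, b in zip(frames, frames[1:])]

# Toy 3x3 frames: a bright pixel moves one step to the right, then rests.
frame_0 = [[255, 0, 0], [0, 0, 0], [0, 0, 0]]
frame_1 = [[0, 255, 0], [0, 0, 0], [0, 0, 0]]
frame_2 = [[0, 255, 0], [0, 0, 0], [0, 0, 0]]
curve = motion_energy_curve([frame_0, frame_1, frame_2])
```

The resulting curve is [2, 0]: two pixels change while the point moves and none while it rests. The measure says how much changed, but not which body part moved or in which direction, which is the limitation that contour-based and optical-flow methods address.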
The best results in human body tracking and action recognition have been achieved at the MIT Media
Lab (Maes et al., 1995). The ALIVE system ("Artificial Life Interactive Video Interface") allows wireless
full-body interaction between the human participants and a rich graphical world inhabited by
autonomous agents. Agents are modelled as autonomous behaving entities that have their own
sensors and goals and that can interpret the actions of the human and react to them in "interactive
time." Vision routines compute figure/ground segmentation and analyse the user’s silhouette to
determine the location of the head, hands, and other parts of the body in a colour image. This
self-calibrating stereo person tracker can recover the 3D shape and motion of the hands and head
of the moving person (Pentland, 1995).
Our new developments work on a model base. The human body can be modelled as a system of
objects connected together by joints with one or more degrees of freedom. Tracking the motion of the
human body can be formulated as the real-time visual tracking of kinematic chains. A kinematic
object is a collection of objects connected by joints. With each object a local co-ordinate system can be associated to specify its 3D position and
orientation. Since the objects are connected, instead of using a six-dimensional vector for each
object to describe its 3D position and orientation, the joint parameters can be used to define the
mutual relationships of the objects and their degrees of freedom. We refer to these parameters as
kinematic parameters. For modelling the shape of the objects different models can be used,
ranging from lines and planes to more sophisticated surface models. We refer to these parameters as
shape parameters. As the objects project onto the image plane, the image data may reflect the
texture of the object's surface, the object contour, the optical flow if the objects are in motion, etc.
We refer to these as image features. The problem of tracking articulated objects can be viewed as
an estimation of the objects' kinematic and shape parameters from image features. Our aim is to
isolate the (movement) vectors for the joints of the model. First, this will give a position vector
for all body parts; second, we will get a movement vector for the head, which allows us to gather
data for the assessment of gaze direction, and for the shoulders, the elbows, the wrists, the lower body,
the thighs, the knees and the ankles. These movement vectors will then be applied to a rendered,
simultaneously moving model. Speech will be processed simultaneously for loudness and
frequency, thus allowing comparisons between movements and speech.
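How joint parameters replace a full position and orientation vector for every body part can be shown with a deliberately simplified sketch: a planar (2D) chain in which each segment has a fixed length and each joint a single angle relative to its parent. The restriction to two dimensions, the segment lengths and the function names are assumptions made for illustration; the model described above is three-dimensional.

```python
import math

def forward_kinematics(segment_lengths, joint_angles, origin=(0.0, 0.0)):
    """Return the 2D positions of all joints of a planar kinematic chain.

    joint_angles are given in radians, each relative to the parent segment,
    so one number per joint is enough to place the whole chain.
    """
    x, y = origin
    heading = 0.0
    positions = [(x, y)]
    for length, angle in zip(segment_lengths, joint_angles):
        heading += angle
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        positions.append((x, y))
    return positions

def movement_vectors(before, after):
    """Displacement of every joint between two postures."""
    return [(xb - xa, yb - ya) for (xa, ya), (xb, yb) in zip(before, after)]

# Assumed arm: upper arm 0.30 m, forearm 0.25 m; shoulder at the origin.
lengths = [0.30, 0.25]
posture_a = forward_kinematics(lengths, [math.pi / 2, 0.0])           # arm straight up
posture_b = forward_kinematics(lengths, [math.pi / 2, -math.pi / 2])  # elbow bent by 90 degrees
vectors = movement_vectors(posture_a, posture_b)  # shoulder, elbow, wrist
```

In this example the shoulder and elbow do not move, while the wrist is displaced by about (0.25, -0.25). A posture is thus one set of joint positions, and a movement is the change between two such sets, which is the kind of vector data the approach aims to extract from the image features.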
This analysis will produce a continuous data-stream that can be analysed in various ways. Each
posture is defined as a unique set of vectors and each movement through unique changes of
these vectors. Its advantage is that it needs no higher-level interpretation, either by
an observer or by a computer-based expert system which tries to reinterpret the movements. The
vector data can be fed directly to neural nets for pattern recognition, and the patterns can be
verified through rating studies. Future applications are person recognition from gait, the
monitoring of therapy success and the comparison of the quality of signals in different species,
under different contexts and physiological conditions.
4. COMMUNICATION THEORY AND PHYSIOLOGICAL STATES
Although it seems that we are just proposing a new method, this approach has consequences not
only for the observation but also for the explanation of behaviour.
The main advantage of this approach, when it is compared to those using conventional coding
methods, is that no presupposition on the structure and content of behaviour is made. This frees
us from restrictions of conventional coding methods. With the use of conventional codes we can
only find what we have put in the codes; behavioural codes are already hypotheses about
behaviour. Behavioural codes are categories which represent many, sometimes different
behavioural events. Although these categories correspond to the basic construction principles of
our brain, which uses prototypes in order to reduce environmental information (Rosch, 1978), the
assumption that communication works on the same level may be wrong. Indeed signals can be
organised as prototypes, but this is not necessarily so. We have shown that communication
between humans can work on a level where no categorisation exists. The fallacy of looking at
communication with the principles of the apparatus involved in it leads to a false and incomplete
understanding of the nature of communication.
With this approach we possess the almost complete data-stream and we can look at how the
brains of both the receiver and sender actually construct communicative reality. Our approach
allows the manipulation of stimuli which can be tested against reality. We will propose a new
communication theory which is in its nature multi-modal and multilayered with different
channels and many possible communicative principles.
The starting point for such a theory is that, like the evolution of intelligence, the evolution of
human communication has its basic constraints in Machiavellianism. Human brains are devices for
processing information. We can suppose that there was differential survival and reproduction
connected to optimal information processing (Lorenz, 1973; Cosmides et al., 1992). If there are
adaptations to optimal information processing, these adaptations can be exploited. Thus,
communication research has to deal with the constraints and possibilities presented by these
adaptations. Future studies should look for adaptive information processing structures which
could be exploitable through communication. We suppose that low-level processing of
information is at least one possibility. This means that levels where the basic information is
extracted from visual stimuli could be exploited. Comparable approaches could be made with the
complexity of a stimulus. The less complex a stimulus is, the easier it could be decoded,
producing higher levels of excitation in the brain. This is basically an open field, but it cannot be
mastered with traditional coding and research methods. The problems connected to any
communication theory, which we have shown in the introduction, starting from the possibility of
deception and ending with the possibility that noise is used tactically to veil intentions, are
avoided by our methodological approach. The method can even deal with repeated meaning
encoded in pulse rate modulation, when small changes or movements are repeated in time, as
shown in the waiting room study.
We propose that there are multi-layered processing mechanisms. The top layer for processing
holds consciously accessible information. The bottom layers cannot be accessed directly or
controlled. On the top layer communication is actually an accessible information exchange about
the real world with its social and ecological aspects. We can tell each other what we think about
each other or gossip about others (Dunbar, 1993), we can create and use non-verbal signals like
gestures (Morris et al., 1979) differently in different cultures and we are able to lie and detect
On the other hand we are able to veil our intentions by many measures like the creation of noise.
We try to manipulate each other’s physiological states and influence information processing in
our social environment. This is the basic assumption of a new communication theory: brains are
able to exploit other brains' functions and structures in order to manipulate them. These
manipulations are intended and planned, and conscious access to these plans is not necessary for
their realisation. Qualitative changes of behaviour which are present under different oestrogen
levels can be used intentionally when male brains have been selected for detection of qualitative
changes in behaviour caused by oestrogen levels which promise stable female cycles and
successful reproduction. Such an approach does not necessarily need innate behaviours, just
basic construction principles for "what brains like." If our brain perceives "approaching speed" as
danger then any fast movement toward another will be interpreted as threat. In such a way
individuals could learn to use the same behaviours again and again. This would explain the wide
variety and individual differences.
We are able to show that on the sender’s side information might be encoded in the quality of
behaviour. Females seem to do so under high risk conditions. This corresponds to the fact that
females are more sensitive to the production and decoding of non-verbal behaviour (Rosenthal
and DePaulo, 1979). The fact that communication is goal-directed and depends on the pursued
goals and possible risk of not achieving the goal has been neglected so far.
The results from qualitative movement changes in both studies have theoretical consequences. This
is the first time it has been shown that females try to manipulate male perception
directly. Moreover, there should be at least some conscious assessment of cycle state, because female signals can become more obvious when a stimulus is
present and when the female is at an ovulatory stage. This underlines the hypothesis that female
sexual crypsis indeed has the function of promoting active female choice and thus can be used to
induce sperm competition.
Showing off and subliminal manipulation are means to manipulate the perception of one’s self
by others, and there is no need to assume that this is done consciously. In this article we
have shown that the principles are comparable: changing the quality of behaviour, so that the
receiver actually cannot access the changes directly. This brings up another principle of
manipulative communication: the sender has to prevent the receiver from learning.
We can suppose that there is pressure to learn signals very fast in order to assess others'
intentions early and reliably. Thus, communication should be variable and constantly use different means in
the same situations. This leads to a model of parallel distributed processing for the
decoding of meaning. The results on the classification of movements through neural nets suggest
such a model, although the model is only a poor approximation of human parallel processing.
Classical communication theories also do not account for the fact that signaling is not only about
external information, but also about internal states and the manipulation of internal states which
are encoded in behaviour quality. An exception to these hypotheses seems to be emotions which
can be produced as signals. The problem in this is that the nature and signal value of emotions
are unclear and it is not known to what extent qualitative changes in facial muscle movements
affect emotional interpretation. There are some hints that actual movement quality is the cue
which could be used for decoding information and not the actual configuration of muscle
movements. Emotions among expressive actors are recognised more easily than emotions among non-expressive actors (Wallbott, 1990). The solution to the communicative paradox thus lies in the
possibility of observing the actual nature of communication with the help of new methods. Only
behaviour recordings which are free from interpretation and which produce direct data are useful
for the detection of communicative principles.
ACKNOWLEDGMENT
Funded by the Jubiläumsfonds of the Austrian National Bank, P5676.
REFERENCES
Alexander, R. D. & Noonan, K. M. 1979. Concealment of ovulation, parental care, and human social evolution. In:
Evolutionary biology and human social organization (Ed. by N. A. Chagnon & W. G. Irons), pp. 436-453.
Duxbury: North Scituate.
Arbib, M. A. & Hansen, A. R. 1987. Vision, Brain and Cooperative Computation: an overview. In: Vision, Brain
and cooperative computation (Ed. by M. A. Arbib & A. R. Hansen), pp. I86. Cambridge MA: The MIT
Argyle, M. 1988. Bodily communication. London: Methuen.
Baker, R. R. & Bellis, M. A. 1995. Human sperm competition. Copulation, masturbation and infidelity. London:
Chapman and Hall.
Bassili, J. N. 1979. Emotion recognition: The role of facial movement and the relative importance of upper and
lower areas of the face. J. Personality Soc. Psych., 37, 2049-2058.
Benshoof, L. & Thornhill, R. 1979. The evolution of monogamy and concealed ovulation in humans. J. Soc. Biol.
Struct., 2, 95-106.
Bernieri, F. J. & Rosenthal, R. 1991. Interpersonal coordination: behaviour matching and interactional synchrony. In:
Fundamentals of Nonverbal Behavior Part K Interpersonal Processes (Ed. by Feldman and Rimé), pp. 401-431.
Harvard: Harvard University Press.
Berry, D. S., Kean, K. J., Misovich, S. J. & Baron, R. M. 1991. Quantized displays of human movement: a
methodological alternative to the point light display. J. Nonverb. Behav., 15, 1-97.
Brown, P. & Levinson, S. 1978. Universals in Language Usage: politeness phenomena. In: Questions and Politeness.
Strategies in Social Interaction. (Ed. by E. Goody), pp. 56-289. Cambridge: Cambridge Univ. Press.
Chance, M. R. A. & Russel, W. M. S. 1959. Protean displays: a form of allaesthetic behaviour. Proc. Zool. Soc. London,
132, 65-70.
Cosmides, L., Tooby, J. & Barkow, J. H. 1992. Evolutionary psychology and conceptual integration. In: The adapted
mind (Ed. by L. Cosmides, J. Tooby & J. H. Barkow), pp. 3-18. Oxford: Oxford University Press.
Cunningham, M. R. 1977. Personality and the structure of the nonverbal communication of emotion. J. Personality, 45,
Cutting, J. E. & Proffitt, D. E. 1981. Gait perception as an example of how we may perceive events. In: Intersensory
perception and sensory integration (Ed. by R. D. Walk, & D. E. Proffitt), pp. 249-273. New York: Plenum Press.
Dawkins, R. & Krebs, J. R. 1981. Signale der Tiere: Information oder Manipulation In: Eco-Ethologie. (Ed. by J. R.
Krebs & N. B. Davies), pp. 222-242. Berlin und Hamburg: Parey.
De Paulo, B. M. 1992. Nonverbal Behavior and Self-Presentation. Psych. Bull., 111/2, 203-243.
Dunbar, R. I. M. 1993. Coevolution of neocortical size, group size and language in humans. Behav. Brain Sci., 16, 681
Eibl-Eibesfeldt, I. 1972. Similarities and differences between cultures in expressive movements. In: Nonverbal
communication. (Ed. by R. A. Hinde), pp. 297-312. Cambridge: Cambridge University Press.
Eibl-Eibesfeldt, I. 1989. Human Ethology. New York: Aldine de Gruyter.
Ekman, P. & Friesen, W. V. 1969. Nonverbal leakage and clues to deception. Psychiatry, 32, 88-106.
Ekman, P. & Friesen, W. V. 1971. Constants across cultures in the face of emotion. J. Personality Soc. Psych., 17, 124
Ekman, P. & Friesen, W. 1972. Hand Movements. J. Communication, 22, 353-374.
Ekman, P. & Friesen, W. 1978. Facial Action Coding System. Palo Alto, CA: Consulting Psychologists Press.
Ekman, P., Friesen, W. V., O’Sullivan, M. & Scherer, K. R. 1980. Relative importance of face, body, and speech in
judgements of personality and affect. J. Personality Soc. Psych., 38, 270-277.
Ekman, P., Levenson, R. W. & Friesen, W. V. 1983. Autonomous nervous activity distinguishes among emotions.
Science, 221, 1208-1209.
Essa, I. & Pentland, A. 1995. Facial expression recognition using a dynamic model and motion energy. Int’l Conference
on Computer Vision, Cambridge, MA, June 20-23, 1995.
Forgas, J. P. 1992. Affective Influences on Partner Choice: Role of Mood in Social Decisions. J. Personality Soc. Psych.,
Frey, S. & Pool, J. 1976. A New Approach to the Analysis of Visible Behaviour. Forschungsberichte aus dem
Psychologischen Institut der Universität Bern. Bern.
Goffman, E. 1959. The presentation of self in everyday life. New York: Doubleday.
Grammer, K. 1989. Human Courtship: Biological Bases and Cognitive Processing. In: The sociobiology of sexual
and reproductive Strategies (Ed. by A. Rasa, C. Vogel & E. Volland), pp. 147-169. London: Chapman and
Grammer, K. 1992. Intervention in conflicts among children: context and consequences. In: Coalitions and alliances in
humans and other animals. (Ed. by A. Harcourt & F. deWaal), pp. 259-283. Oxford: Oxford University Press.
Grammer, K. 1991. Strangers meet: laughter and nonverbal signs of interest in opposite-sex encounters.
J. Nonverb. Behav., 14, 209-236.
Grammer, K. 1995. Signale der Liebe. 3., neu überarbeitete Auflage. München: dtv-Wissenschaft.
Grammer, K. & Eibl-Eibesfeldt, I. 1989. The ritualisation of laughter. In: Natürlichkeit der Sprache und der Kultur
(Ed. by W. A. Koch), pp. 192-214. Bochum: Brockmeyer.
Grammer, K., Honda, M. & Schmitt, A. 1996. Human courtship: digital image analysis of body movements. J.
Personality Soc. Psych., under revision.
Grammer, K., Jütte, A. & Fischmann, B. 1996. Der Kampf der Geschlechter und der Krieg der Signale. In: Sexualität im
Spiegel der Wissenschaft. Edition Universitas, Stuttgart: Hirzel. In press.
Grammer, K. & Kruck, K. 1991. Decision making in opposite sex-encounters: love at first sight? Kyoto, 22nd
International Ethological Conference.
Grammer, K. & Kruck, K. 1996. Female control and female choice. In: When women want sex: perspectives on
female sexual initiation and aggression. (Ed. by B. Anderson & C. Struckmann-Johnson) New York:
Grammer, K., Kruck, K. & Magnusson, M. 1996. The courtship dance: mathematical algorithms for pattern
detection in non-verbal behaviour. J. Nonverb. Behav. (under revision).
Grammer, K., Schiefenhövel, W., Schleidt, M., Lorenz, B. & Eibl-Eibesfeldt, I. 1988. Patterns on the Face: brow
movements in a crosscultural comparison. Ethology, 77, 279-299.
Harper, D. G. C. 1992. Communication. In: Behavioural ecology. An evolutionary approach. (Ed. by J. R. Krebs
& N. B. Davies), pp. 347-398. Oxford: Blackwell.
Johansson, G. 1973. Visual perception of biological motion and a model of its analysis. Perception &
Psychophysics, 14, 201-211.
Johansson, G. 1976. Spatio-temporal differentiation and integration in visual motion perception. Psychol. Res.,
Kakadiaris, I. A., Metaxas, D. & Bajcsy, R. 1994. Active part-decomposition, shape and motion estimation of
articulated objects: A physics-based approach. Proc. of IEEE Conference on Computer Vision and Pattern
Recognition, pp. 980-984. Seattle, Washington.
Krauss, R. M., Apple, W., Morency, N. L., Wenzel, C. & Winton, W. 1981. Verbal, vocal, and visible factors in
judgements of another’s affect. J. Personality Soc. Psychol., 40, 312-320.
Kraut, R. E. & Johnston, R. E. 1979. Social and emotional messages of smiling: An ethological approach. J.
Personality Soc. Psych., 37, 1539-1553.
Lorenz, K. 1973. Die Rückseite des Spiegels. München: Piper.
MacArthur, L. Z. & Baron, R. M. 1983. Toward an ecological theory of social perception. Psychol. Rev., 90, 215
Maes, P., Darrell, T., Blumberg, B. & Pentland, A. 1995. The ALIVE system: wireless, full-body interaction with
autonomous agents. Proc. Computer Animation, IEEE Press, April 1995.
Malatesta, C. A. & Izard, C. E. 1984. The facial expression of emotion: Young, middle-aged, and other adult
expressions. In: Emotion in adult development (Ed. by C. Z. Malatesta & C. E. Izard), pp. 253-273. Beverly
Hills, CA: Sage.
Markl, H. 1985. Manipulation, modulation, information, cognition: some of the riddles of communication.
Fortschritte der Zoologie, 31, 163-194.
Mehrabian, A. 1972. Nonverbal communication. Chicago: Aldine.
Moghaddam, B. & Pentland, A. 1995. Probabilistic visual learning for object detection. Int’l Conference on
Computer Vision, Cambridge, MA, June 20-23, 1995.
Montepare, J. R., Goldstein, S. B. & Clausen, A. 1987. The identification of emotions from gait information. J.
Nonverb. Behav., 11, 33-42.
Moore, M. M. 1985. Nonverbal courtship patterns in women: context and consequences. Ethol. Sociobiol., 6,
Morris, D., Collett, B., Marsh, P. & O’Shaugnessy, M. 1979. Gestures, their origins and distribution. London:
Pentland, A. 1995. Machine understanding of human action. M.I.T. Media Laboratory Perceptual Computing
Section Technical Report No. 350, Sept. 1995. Appeared: 7th Int’l Forum on Frontier of Telecommunication
Technology, Nov. 1995, Tokyo, Japan.
Pentland, A. & Liu, A. 1995. Toward augmented control systems. IEEE Intelligent Vehicle Symposium 95,
September 25-26, Detroit, MI.
Provine, R. R. & Young, Y. L. 1991. Laughter: a stereotyped human vocalization. Ethology, 89, 115-124.
Rosch, E. H. 1978. Principles of Categorization. In: Cognition and Categorization (Ed. by E. Rosch & D. Lloyd),
pp. 27-48. Hillsdale: Erlbaum.
Rosenthal, R. & DePaulo, B. M. 1979. Sex differences in eavesdropping on non-verbal cues. J. Personality Soc.
Psychol., 37, 273-285.
Runeson, S. & Frykholm, G. 1983. Kinematic specification of dynamics as an informational basis for person-and
action perception. Expectation, gender recognition, and deceptive intention. J. Exp. Psychol., 112, 585-615.
Scheier, M. F. & Carver, C. S. 1981. Private and public aspects of self. In: Review of Personality and Social
Psychology, Vol. 2 (Ed. by L. Wheeler), pp. 189-216. Beverly Hills, CA: Sage.
Schleidt, W. M. 1973. Tonic communication: continuous effects of discrete signs in animal communication
systems. J. Theoret. Biol., 42, 369-386.
Siddiqi, J. A., Schwind, H. L. & Voss, H. G. 1973. Irrelevanz des Inhalts - Relevanz des Ausdrucks. Z. Experimentelle
und Angewandte Psychologie, 20, 472-488.
Sonka, M., Hlavac, V. & Boyle, R. 1993. Image Processing, Analysis and Machine Vision. London: Chapman and Hall.
Starner, T. & Pentland, A. 1995. Visual recognition of American Sign Language using hidden Markov models. Proc. Int’l
Workshop on Automatic Face- and Gesture-Recognition, Zurich, Switzerland, June 26-28, 1995.
Tooby, J. & Cosmides, L. 1990. On the universality of human nature and the uniqueness of the individual: the role of
genetics and adaptation. J. Personality, 58. 1.
Tooke, W. & Camire, L. 1991. Patterns of Deception in Intersexual and Intrasexual Mating Strategies. Ethol. Sociobiol.,
12, 345 345.
Trivers, R. L. 1972. Parental investment and sexual selection. In: Sexual selection and the descent of man 1871-1971.
(Ed. by B. Campbell), pp. 136-179. Chicago: Aldine.
Van Hooff, J. A. R. A. M. 1972. A Comparative Approach to the Phylogeny of Laughter and Smile. In: NonVerbal
Communication. (Ed. by R. A. Hinde), pp. 209-241. Cambridge: Cambridge University Press.
Waibel, A., Hanazawa, T., Hinton, G., Shikano, K. & Lang, K. J. 1989. Phoneme recognition using time-delay neural
networks. IEEE Transactions on Acoustics, Speech, and Signal Processing, 37/3, 328-339.
Wallbott, H. G. 1990. Mimik im Kontext. Göttingen: Verlag für Psychologie Dr. C. J. Hogrefe.
Wallbott, H. G. 1991. The Emotional in Social Psychology and the Social in Emotion Psychology: An Overview
Concerning the Intersection Between Social Psychology and Emotion Psychology. Z. für Sozialpsychologie, 22/1,
Zell, A. 1994. Simulation neuronaler Netze. Bonn: Addison-Wesley.