How Can We Measure Internal States And Traits? A Behaviour-based Approach

White Paper Series Part 2

Vima’s technology infers personality traits, soft skills, and emotions by capturing the behaviours that human experts consistently perceive and use in their assessment. For example, extraverts appear talkative and energetic, which is reflected in more speaking and a louder voice compared to introverts. These and many other behaviour indicators can be measured objectively using visual and acoustic analysis (covered later in this series).

Vima’s technology attains levels of reliability and accuracy never reached before while reducing bias (see part 5 on the human bias). This is only possible through continued empirical research in collaboration with renowned and independent research centres (HEC at the University of Lausanne, Idiap Research Institute, Switzerland).

Below we explain the scientific principles of behaviour-based psychological assessment that help to understand the processes underlying Vima’s technology.

Multimodal Behaviour


Decades of research on emotion as well as personality expression and judgment show that people behave in systematic ways that express their feelings, skills and personality traits. People are also quick and often correct at inferring what others are thinking and feeling from merely observing them.

These studies converge on the consistent finding that verbal and nonverbal behaviours act as cues to the emotions, skills and personality traits of a person. These cues are picked up by others and used to form impressions, even in very brief interactions and with limited or no verbal content. It is important to integrate behaviours from multiple modalities (voice tonality, speech content, facial expression, posture and gesture) to gain a full understanding of the underlying functions. Furthermore, nonverbal expressions are generally less self-monitored (i.e. less adapted to the interaction situation) than verbal expressions and thus help to build more truthful inferences.

Multimodal behaviour analysis integrates behaviours from the different communication channels. Two broad categories are defined:

  • Verbal behaviours describe what the person is saying. The use of language involves phonology (how sounds form words), syntax (grammar), semantics (word meaning) and pragmatics (the context of the interaction). For example, extraverts speak more and address others more often than introverts. They use more social and positive emotion words, first-person pronouns (e.g. I, me, myself), and present tense verbs.


  • Nonverbal behaviours describe how the person is moving and talking (often referred to as “body language”). They cover both static appearance-based cues (e.g. body height, clothing style) and dynamic visual and auditory cues coming from different channels of communication: the face, the hands, the body and the tonality of the voice. In an employment interview, for example, the applicant’s frequency of smiling, upright posture and fluent speech all contribute to a positive evaluation by the recruiter.

Implication For Measurement Techniques


The lens perspective (see Figure 1) has some methodological implications. The type of data gathered from typical judgment studies using non-expert raters informs us about people’s general perceptual representations of behaviours and intuitive inferences of personality. When the interest lies in understanding personality beyond basic intuition, standardized expert assessments and direct behavioural measurements are required. Behaviour-based assessment involves trained expert coders and/or computational techniques for feature extraction.

In sum, there are two bottlenecks for accurate person assessment:

  1. The quality of the expression of the internal traits or states (left side of Figure 1).
  2. The number, accuracy and precision by which behaviours (i.e. valid indicator cues) are measured or observed (middle and right side of Figure 1).

Vima has actively tackled these challenges and developed proprietary solutions to achieve the following:

  1. Making sure we have standardized and representative situations where soft skills and personality are expressed (self-presentation, common interview questions probing for expressing stress-resilience, persuasiveness, conflict management, team orientation, work organization, etc.)
  2. Scaling up the number of verbal and nonverbal behaviours measured from video using automatic extraction.

Removing these bottlenecks accelerates the development of knowledge on the relationship between internal states and traits and their behavioural expressions. Until recently, practitioners and researchers in the field of nonverbal behaviour relied mostly on subjective interpretation or on behaviour coding by one or a handful of persons (mostly just the interviewer).

With continued research powered by cutting-edge computational technologies, we expect that the knowledge base for accurate person assessment will increase rapidly. Vima leverages this knowledge in its development of new applications by being at the forefront of innovation in the field.



Inferences From Multimodal Behaviour


Given the inherent complexity of the human communication system, behavioural cues are often uncertain and partly redundant. This basically means that there is more than one way of expressing and perceiving a specific trait (e.g. personality trait, skill) or state (e.g. emotion).


Figure 1. A lens perspective on the expression and perception of internal traits and states

A particular trait or state does not have a one-to-one relationship with a fixed set of behaviours (see Figure 1). Instead, the same behaviour can correlate with more than one state or trait. Therefore, emotion or personality differentiation is probabilistic, not deterministic. For example, observing a person smiling can indicate that this person feels happy, but could equally imply that this person has friendly intentions or even feels nervous.
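The probabilistic nature of such inferences can be illustrated with Bayes' rule. The following is a minimal sketch, not Vima's method: the three candidate states, all probabilities, and the second cue ("fidgeting") are hypothetical numbers chosen only for illustration. Treating the two cues as independent given the state is also a simplifying assumption, since real cues are partly redundant.

```python
def posterior(prior, likelihood):
    """Return P(state | cue) for every state using Bayes' rule."""
    joint = {state: prior[state] * likelihood[state] for state in prior}
    evidence = sum(joint.values())  # P(cue), summed over all states
    return {state: p / evidence for state, p in joint.items()}


# Hypothetical prior beliefs about the person's state (illustrative only).
prior = {"happy": 0.4, "friendly": 0.4, "nervous": 0.2}

# Hypothetical P(cue | state) for two observed cues (illustrative only).
p_smile = {"happy": 0.8, "friendly": 0.7, "nervous": 0.4}
p_fidget = {"happy": 0.1, "friendly": 0.2, "nervous": 0.8}

# A smile alone leaves several states plausible.
after_smile = posterior(prior, p_smile)

# Adding a second cue shifts the belief; the cues are assumed to be
# conditionally independent given the state (a simplification).
after_both = posterior(after_smile, p_fidget)

for label, belief in (("smile", after_smile), ("smile + fidget", after_both)):
    print(label, {state: round(p, 2) for state, p in belief.items()})
```

With these made-up numbers, a smile on its own does not separate "happy" from "friendly", and a second cue from another channel can change which state is most probable. This is why single behaviours are treated as uncertain indicators rather than proof of a state.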

Linking Expression And Perception


The question of how behavioural cues relate to inferences of psychological states or traits can be answered in different steps along the communication process. Figure 1 illustrates these steps using a lens perspective. A person expresses his or her personality, skill, or emotion by several indicator cues in the face (e.g. smiling), body (e.g. forward-leaning posture), hand gesture (e.g. illustrating speech), speech content (e.g. positive experience words) and tonality (e.g. variation). In the presence of such valid cues, the observer can then use this information to infer the person’s skill, emotion or personality (e.g. high extraversion), based on his or her previous expert and/or personal experience and beliefs.

A lens perspective on human communication integrates two processes:

  1. Expression: the observed behaviour emits enough valid cues to the person’s personality, skill, or emotion (“validity”).
  2. Perception: the observer is able to perceive these cues given the environment, and makes a correct interpretation of the relationship between each cue and the personality, skill, or emotion that is inferred (“cue utilization”).
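Validity and cue utilization can be quantified as correlations: between a cue and the true trait (validity), and between a cue and the observer's judgment (utilization). The sketch below uses simulated data only. The cue names, the weights in the simulated judgment, and the sample size are assumptions made for illustration, and the correlation between trait and judgment is used as a simple measure of judgment accuracy.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


random.seed(0)  # reproducible simulation
n = 500

# Simulated "true" extraversion scores (standardized, hypothetical).
trait = [random.gauss(0, 1) for _ in range(n)]

# One valid cue (driven by the trait) and one irrelevant cue (pure noise).
cues = {
    "talk_time": [t + random.gauss(0, 0.5) for t in trait],
    "clothing": [random.gauss(0, 1) for _ in range(n)],
}

# Simulated observer who relies mostly on the valid cue but also, wrongly,
# on the irrelevant one.
judgment = [
    0.7 * talk + 0.3 * cloth + random.gauss(0, 0.5)
    for talk, cloth in zip(cues["talk_time"], cues["clothing"])
]

validity = {name: pearson(values, trait) for name, values in cues.items()}
utilization = {name: pearson(values, judgment) for name, values in cues.items()}
accuracy = pearson(trait, judgment)

for name in cues:
    print(f"{name}: validity={validity[name]:.2f}, utilization={utilization[name]:.2f}")
print(f"judgment accuracy: {accuracy:.2f}")
```

In this simulation the observer uses a cue that carries no information about the trait, which lowers accuracy compared with relying on the valid cue alone. Comparing the validity and utilization of each cue is one way to see where a judgment goes wrong.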

Accurate judgment requires both appropriate expression and perception, and thus a “shared code” between expresser and perceiver. In the example of a job interview, appropriate expression can be obtained when questions or scenarios are used that target specific skills or traits and are able to elicit expressive behaviour. Appropriate or effective perception means that the observer (e.g. interviewer) is trained and/or highly perceptive, picks up on the available cues and uses them correctly.

In sum, the lens perspective helps to identify different steps in the communication process, which all need to be taken into account for accurate person assessment based on behaviour observation. Inaccurate judgement may result from an absence of valid cues (e.g. irrelevant job question), from non-use of available valid cues (e.g. poor detection skills), or from inaccurate use of valid information or use of irrelevant information (see part 5 on the human bias).

Vima takes into account all steps in the communication process so as to make observation-based person assessment as accurate as possible. For example, Vima’s research and collaborations focus on a) developing relevant scenarios or questions for recording behaviour, b) establishing the number and selection of annotators who observe the recorded behaviour following a standardized coding protocol, and c) maximizing the number and precision of automatically extracted multimodal behaviours. More details can be found in the section “Implication For Measurement Techniques” and in the next parts of this white paper series.
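To illustrate why the number of annotators matters, the sketch below applies the Spearman-Brown formula, a standard psychometric result that predicts the reliability of the average of k raters from the reliability of a single rater. It is shown as one common way to reason about annotator numbers, not as a description of Vima's protocol. The single-rater reliability of 0.4 and the target of 0.85 are hypothetical values.

```python
def spearman_brown(single_rater_reliability, k):
    """Predicted reliability of the mean rating of k comparable raters."""
    r = single_rater_reliability
    return k * r / (1 + (k - 1) * r)


def raters_needed(single_rater_reliability, target):
    """Smallest number of raters whose averaged rating reaches the target."""
    if not 0 < single_rater_reliability < 1 or not 0 < target < 1:
        raise ValueError("reliabilities must lie strictly between 0 and 1")
    k = 1
    while spearman_brown(single_rater_reliability, k) < target:
        k += 1
    return k


# Hypothetical values, chosen only for illustration.
single = 0.4
for k in (1, 3, 5, 10):
    print(k, round(spearman_brown(single, k), 2))

print("raters needed for 0.85:", raters_needed(single, 0.85))
```

The formula assumes that raters are interchangeable and equally reliable. Under that assumption, reliability rises quickly with the first few additional raters and then levels off, which is why a single interviewer's judgment is a weak basis for assessment.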