MSc-IT Study Material
June 2010 Edition

Computer Science Department, University of Cape Town

Cognitive Modeling

As we discussed at the start of this unit, a model is a representation of something that we can easily work with. In the previous section we looked at models of user requirements – essentially models of users’ context. In contrast, this section considers models of users’ cognitive abilities. That is, rather than looking at the context in which users work we will consider models of how people behave. These cognitive models are important because they provide a way of understanding people’s cognitive abilities which can be used to inform design and, as we shall see in Unit 9, to evaluate systems. In this section we look at two cognitive models: GOMS and ICS.

GOMS

GOMS uses an explicit model of cognitive processes developed from studies of users. GOMS itself stands for Goals, Operators, Methods, and Selection rules. This core set of concepts describes the tasks that users perform, and how they achieve them. GOMS is more than just a way to describe tasks, though: it provides a family of modelling techniques which can be used to generate predictions of the time to complete tasks, given descriptions of the tasks and the user interfaces to be used. These predictions are based on a simple model of cognition called the Model Human Processor (MHP) and are typically within 10% to 20% of observed task times. The basic assumption of GOMS is evident in its name: users understand their goals, and GOMS considers the actions needed to meet those goals. Having made this assumption, GOMS can give us feedback on the coverage and consistency of the user interface. By coverage we mean whether the user interface contains the necessary functionality to support the tasks it was designed for. Consistency refers to whether similar tasks are performed similarly with the user interface. Moreover, GOMS can give an indication of whether frequent goals can be achieved quickly using the given interface.

Model Human Processor

The central part of GOMS (Card, Moran, and Newell, 1983; see Chapter 6 of Dix 1998 for a summary) is the Model Human Processor (MHP). This is a simplified view of the psychological processes involved in human cognition which can be used to make approximate predictions of user behaviour. Card et al. based this model on empirical evidence, i.e. psychological studies of users. Even though it is based on psychological evidence and theory, the motivation for the MHP is that it should be usable by non-psychologists, e.g. designers and evaluators. The MHP consists of:

  • set of processors – systems which process information and may make decisions

  • set of memories – areas in which information is stored in various forms by processors

  • set of principles of operation – these guide the operation of the processors and are crucial to a realistic account of human behaviour

The MHP itself is divided into 3 interacting subsystems: the perceptual system, the motor system, and the cognitive system. Each of these subsystems has its own memories and processes and works on different kinds of information as illustrated in the following diagram.

  1. Perceptual system - input from the eyes and ears is stored symbolically in visual and auditory image stores, then processed in working memory by the cognitive system, and possibly stored in long term memory for later access. E.g. the sound of a car horn is coded as a loud warning.

  2. Cognitive system - processes information from working and long term memory to make decisions about how to respond. E.g. processing the fact that there is a loud warning noise, recalling that evasive action should be taken in such situations, and deciding to run to the pavement.

  3. Motor system - executes responses to decisions made by the cognitive system. E.g. running to the pavement.

Each component of these systems has parameters which are used in the generation of behavioural predictions. Each memory system is defined in terms of its storage capacity (how many pieces of information it can hold at one time), and the decay time of an item (how long it is before an item is forgotten). Similarly, each processor is defined in terms of its processor cycle time – how long it takes to process one piece of information. Card et al. (1983; see Johnson, 1992, for a summary) defined approximate values for these parameters based on psychological research; these are listed below. Note that these values assume skilled (errorless) performance and only predict time for responses to stimuli – not for self-initiated actions.

Table 7.1. Memory parameters

Memory type        Storage capacity          Decay time
Auditory           4 letters                 1500 msec
Visual             17 letters                200 msec
Working memory     7 chunks of information   7 sec
Long term memory   Unlimited                 Never


Table 7.2. Processor parameters

Processor type     Processing cycle time
Perceptual         100 msec
(eye movement)     230 msec
Cognitive          70 msec
Motor              70 msec
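
As a rough illustration of how such parameters might be combined, the sketch below adds one cycle of each processor to estimate the time to respond to a simple stimulus. The function name and the one-cycle-per-processor assumption are illustrative only, and the 100 msec value is taken here to be the perceptual processor's cycle time.

```python
# Processor cycle times in msec, taken from Table 7.2.
# Assumption: the 100 msec entry is the perceptual processor.
CYCLE_TIME_MS = {"perceptual": 100, "cognitive": 70, "motor": 70}


def simple_reaction_time_ms(processors=("perceptual", "cognitive", "motor")):
    """Sum one cycle of each processor involved in a stimulus-response act."""
    return sum(CYCLE_TIME_MS[name] for name in processors)


print(simple_reaction_time_ms())  # 240
```

Under these assumptions the predicted response time is 100 + 70 + 70 = 240 msec. Real tasks usually need several cycles of each processor, so this is only a lower bound.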


So, we have described the MHP, but we need to return to the core of GOMS – the goals, operators etc. – before we can see how these parameters are used to make predictions.

Goals

Goals are something that the user wants to achieve e.g. go to airport, delete a file, or create a directory. They have a hierarchical structure – that is they are composed of many sub-goals which need to be achieved to meet the larger goal. These are similar to the goals identified in Task Analyses (see Unit 8).

Operators

Operators are elementary perceptual, motor or cognitive acts (that is, they cannot be decomposed into smaller operations) which are necessary to change the user’s mental state or environment. As such they are the lowest level of a GOMS analysis. Because operators cannot occur concurrently, a user’s behaviour can be recorded in GOMS as a sequence of operators. They are at a similar level of description to actions in task analysis (see Unit 8).

For example, to move a file to a different folder the user might perform the following operations:

  • Move cursor to item

  • Hold mouse button down

  • Locate destination icon

  • Let go of mouse button

Methods

From operators we build up methods, which are sequences of steps that accomplish a goal (and so are like tasks in a task analysis; see Unit 8). As with goals, these methods can include other (sub) goals. A fundamental assumption in GOMS is that methods are learned and routine (so no problem solving is involved), and that there is only one way a user stores knowledge of a task.

For example, a user moving a file to a different folder could be described in GOMS as:

  • Goal – move file to a different folder

  • Method – move file

  • Operators - Move cursor to item, Hold mouse button down, Locate destination icon, Let go of mouse button
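
The description above could be captured in a small data structure. The following is a minimal sketch only: the class names `Goal` and `Method` and the helper `operator_sequence` are hypothetical and not part of any GOMS notation or tool.

```python
from dataclasses import dataclass, field


@dataclass
class Method:
    name: str
    operators: list  # elementary acts, carried out strictly one after another


@dataclass
class Goal:
    name: str
    methods: list = field(default_factory=list)


def operator_sequence(goal):
    """Return the operators for a goal that has exactly one method."""
    if len(goal.methods) != 1:
        # With several methods, a selection rule would have to pick one.
        raise ValueError("expected exactly one method for this goal")
    return list(goal.methods[0].operators)


move_file = Goal(
    name="move file to a different folder",
    methods=[Method(name="move file", operators=[
        "Move cursor to item",
        "Hold mouse button down",
        "Locate destination icon",
        "Let go of mouse button",
    ])],
)

for step, operator in enumerate(operator_sequence(move_file), start=1):
    print(step, operator)
```

Keeping operators in an ordered list reflects the GOMS assumption that operators occur one at a time rather than concurrently.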

Selection Rules

If there is more than one method to accomplish a goal, the selection rules tell you which method to use. Again, as with methods, they assume error-free performance (so the user does not select the wrong method by accident). They are written as IF … THEN statements as below:

IF <condition> THEN accomplish <GOAL>

For example:

IF <restaurant accepts credit cards> THEN <pay by credit card>
ELSE IF <restaurant accepts cheques> THEN <pay by cheque>
ELSE <pay by cash>
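
The restaurant rule can also be written as an ordinary conditional. This is a minimal sketch; the function and parameter names are invented for illustration.

```python
def select_payment_method(accepts_credit_cards, accepts_cheques):
    """Apply the selection rules in order and return the chosen method."""
    if accepts_credit_cards:
        return "pay by credit card"
    elif accepts_cheques:
        return "pay by cheque"
    else:
        return "pay by cash"


print(select_payment_method(accepts_credit_cards=False, accepts_cheques=True))
# pay by cheque
```

Because the rules are tested in a fixed order and the last branch is unconditional, exactly one method is always selected, which matches the error-free assumption.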

Activity 4 – GOMS

For the example in Activity 3 construct a GOMS model of a customer withdrawing money from a cash machine.

A Discussion on this activity can be found at the end of the chapter.

Keystroke Level Model

The lowest level of GOMS analysis is called the Keystroke Level Model (KLM). This produces quantitative predictions of the time it would take a skilled operator to complete a task. Again, it assumes error-free performance by the operator.

Execution of a task is described in terms of:

  • 5 physical-motor operators:

    1. Tk: (k)eying – how long it takes to press a key (including using modifiers such as the shift key)

    2. Tp: (p)ointing – how long it takes to move the mouse (or other such input device) to a target

    3. Th: (h)oming – how long it takes to change between input devices e.g. changing between mouse and keyboard

    4. Td: (d)rawing – how long it takes to draw a line using an input such as a mouse

    5. Tb: click (b)utton – how long it takes to click the mouse button

  • Tm: (m)ental operator – how long it takes to perform the mental processing for the task

  • Tr: system (r)esponse operator – how long the system takes to respond

Therefore, execution time for a task is described in terms of the sum of the operators used. For example, suppose we had typed the sentence “the quick fox jumps over the lazy dog”. Now we want to insert “brown” just after “quick”. Using a word processor, and assuming that the current insertion point is at the end of the sentence, we need to perform the following steps:

  1. move hand to mouse

  2. position mouse just after quick

  3. move hand to keyboard

  4. formulate word to insert - brown

  5. type brown

  6. reposition insertion point at end of sentence

In terms of the KLM the following operators are needed for the above steps:

  1. H (mouse)

  2. P, B

  3. H (keyboard)

  4. M

  5. K (b) K (r) K (o) K (w) K (n)

  6. H (mouse), M, P, B

So, in total the execution time for this simple task is 3Th + 2Tp + 2Tb + 2Tm + 5Tk (assuming there is no significant response time for the system). Card et al. derived values for the time to complete these operators from empirical studies. These are listed below (for an expert typist), and give a total execution time of 1.2 + 2.2 + 0.4 + 2.7 + 0.6 = 7.1s in this case. As we shall see later in Unit 9, these quantitative predictions of execution time can also be used to compare designs.

Operator   Time (s)
Tk         0.12
Tp         1.10
Th         0.40
Td         1.06
Tb         0.20
Tm         1.35
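
The calculation for the “brown” example can be reproduced with a few lines of code. This is a sketch using the operator times listed above; the step encoding (one letter per operator) and the function name `klm_time` are assumptions made for illustration.

```python
from collections import Counter

# Operator times in seconds for an expert typist, from the table above.
OPERATOR_TIME_S = {"K": 0.12, "P": 1.10, "H": 0.40, "D": 1.06, "B": 0.20, "M": 1.35}

# Each step is paired with its KLM operators, one letter per operator.
STEPS = [
    ("move hand to mouse", "H"),
    ("position mouse just after quick", "PB"),
    ("move hand to keyboard", "H"),
    ("formulate word to insert", "M"),
    ("type brown", "KKKKK"),
    ("reposition insertion point at end of sentence", "HMPB"),
]


def klm_time(steps, response_time_s=0.0):
    """Count the operators used and sum their times, plus any system response time."""
    counts = Counter(op for _, ops in steps for op in ops)
    total = sum(OPERATOR_TIME_S[op] * n for op, n in counts.items()) + response_time_s
    return counts, round(total, 2)


counts, total = klm_time(STEPS)
print(dict(counts))
print(total)  # 7.1
```

The counts come out as 3 H, 2 P, 2 B, 2 M and 5 K, matching 3Th + 2Tp + 2Tb + 2Tm + 5Tk. Passing a non-zero `response_time_s` adds the Tr term for systems that respond slowly.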

Activity 5 – KLM

Carrying on with the example in Activity 4, imagine that customers could withdraw money using their personal computer. In this case data entry would be via the keyboard, and selection of options would be done using the mouse. Using KLM, work out the execution time for the activity of withdrawing £10 assuming that both keyboard and mouse are to be used, that the PIN is 1234, that it takes the system 10s to return the card and cash, and that £10 is one of the predetermined amounts listed.

A Discussion on this activity can be found at the end of the chapter.

Summary of GOMS

GOMS provides a way of making predictions about the time an expert user would take to complete a task using a given user interface. Furthermore, as GOMS modeling makes user tasks and goals explicit these descriptions could be usefully re-employed in the development of an on-line help system. These descriptions can additionally be used to suggest questions users will ask and the answers in terms of actions needed to complete tasks and meet goals.

However, as mentioned several times before, the tasks must be goal-directed, that is, the user must have a specific aim in mind. Some activities are more goal-directed than others, but it could be argued that even creative activities contain goal-directed tasks. Furthermore, GOMS assumes that tasks involve routine cognitive skill as opposed to problem solving, and that no errors occur, which is hardly realistic.

Review Question 5

GOMS is a cognitive modelling approach. What does it model, and how?

Answer at the end of the chapter.

Interacting Cognitive Subsystems

ICS is an elaborate framework which assumes that human perception, cognition, and action can be analysed in terms of discrete, inter-linked, information processing modules. In contrast to GOMS, ICS is a much richer way of modelling human cognition as we shall see in this section.

Subsystems

Each subsystem of ICS is independent and operates in one of three main domains of processing:

  • sensory – visual and auditory stimuli

  • representational – representations of information

  • effector – body movement

Although each subsystem operates in a different domain, and on a different kind of information, they all share a common structure as illustrated in the following diagram. Each subsystem has an input to its left, and one or more outputs to its right, as well as its own memory record and a set of transformations. All information which is input to the subsystem is stored in the memory record (and can at a later point be used as an input to the subsystem). The outputs depend on the transformations, which convert information from one form to another. In the example we have three transformations which convert the input information into three different codes – X, Y, and Z.
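
The common structure just described can be sketched in code. This is an illustrative model only: the class `Subsystem`, the helper `recode_as`, and the string representations are hypothetical and are not part of the ICS framework itself.

```python
class Subsystem:
    """One input, a memory record, and a set of transformations to output codes."""

    def __init__(self, name, transformations):
        self.name = name
        self.transformations = transformations  # maps output code -> function
        self.memory_record = []

    def process(self, representation):
        # Every input is copied to the memory record before being transformed.
        self.memory_record.append(representation)
        return {code: transform(representation)
                for code, transform in self.transformations.items()}


def recode_as(code):
    """Build a placeholder transformation that tags its input with a code."""
    return lambda representation: f"{code}({representation})"


example = Subsystem("example", {code: recode_as(code) for code in ("X", "Y", "Z")})
outputs = example.process("input")
print(outputs)
print(example.memory_record)
```

Each call to `process` yields one output per transformation, so a subsystem with three transformations produces the three codes X, Y and Z from a single input.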

As mentioned previously, ICS models human cognition in terms of several of these subsystems linked together. We shall come to the complete network later, but let’s start to understand the ICS framework by considering the operation of one subsystem. A good point to start understanding ICS is by considering the visual subsystem (illustrated below) which forms part of the sensory domain. This takes input from the eyes in the form of sensory representations with information about edges, angles, shades, contrasts, hues etc. (here the person is looking at the pentagon to the left of the diagram) and transforms it to object representations with information about depth, position, orientation, shapes, etc. (here the object representation describes the image as a grey pentagon). Note that a copy of the visual representation is stored in the memory record for later use. Note also that the clarity of the visual representation alters the success of the transformation to an object representation. It is important to remember that these two representations (visual and object) are both mental representations, but at different levels of information.

The key to using ICS is the inter-linking of the subsystems. Below is a diagram of the complete ICS framework – the VIsual Subsystem (VIS) just described can be seen in the bottom left of the diagram in the sensory domain, i.e. input from the senses. We need to further process information to extract meaning and make decisions, e.g. we may need to interpret the grey pentagon as a fifty pence piece and then use it appropriately.

ICS can be used to describe a user’s mental processes whilst performing a task. Such descriptions can then be used to assess the relative amounts of cognitive resources (the numbers of transformations) needed to complete tasks with different interfaces. In the example a user is revising some text in a word processor. Whilst reading the text they notice a particularly difficult sentence and decide to rephrase it.

In an ICS description of the cognitive processes involved in achieving this task we might start by considering what the sensory inputs are. As the user has read the text from the screen they are using their visual sensory input. So the first subsystem involved is the Visual Subsystem, which transforms the visual representation into an object representation (VIS → OBJ). This transformation produces information about what letters are being viewed. As the output is in OBJ form it is then processed by the Object Subsystem, which performs the transformation OBJ → MPL. This morphonolexical representation encodes the surface structure of the sentences in a speech-based code which discards the surface structure of the words, i.e. it encodes what the words are, rather than what letters they are composed of. The MPL representation is then processed by the Morphonolexical Subsystem, which performs the transformation MPL → PROP, producing a representation which encodes the meanings of the individual words in the sentence, and the relationships between the words. The Propositional Subsystem then transforms this into an implicational representation (PROP → IMPLIC) in which the meaning of the sentence itself is identified. Several different meanings of the sentence may be identified, in which case there may be several iterations of PROP → IMPLIC (Propositional Subsystem) and IMPLIC → PROP (Implicational Subsystem) to arrive at a satisfactory understanding of the meaning.

At this point the representation needs to be transformed into a form suitable for the Limb Subsystem so that the correction can be made. This would involve the following transformations: IMPLIC → PROP (Implicational Subsystem), PROP → OBJ (Propositional Subsystem), OBJ → LIM (Object Subsystem), LIM → MOT(hand) (Limb Subsystem, which transforms LIM into actual hand actions).

Of course, there is plenty more processing involved in locating the text on the screen, positioning the cursor, reading the menu bar etc. but this description gives an idea of the processes involved. The sequence described can be summarised as follows (⇒ indicates external input or output, → is a transformation, » indicates data transmitted across the data network, and {} enclose a set of transformations which may occur several times):

⇒ VIS → OBJ »

» OBJ → MPL »

» MPL → PROP »

{» PROP → IMPLIC »

» IMPLIC → PROP »}

» PROP → OBJ »

» OBJ → LIM »

» LIM → MOT(hand) ⇒
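
Since the number of transformations is used as a measure of cognitive resources, the summarised sequence can be counted mechanically. The sketch below is an illustration under stated assumptions: the names `READING`, `UNDERSTANDING_LOOP`, `ACTING` and both functions are hypothetical, and the number of loop iterations is a free parameter chosen by the analyst.

```python
# Each transformation is a (source code, target code) pair from the summary above.
READING = [("VIS", "OBJ"), ("OBJ", "MPL"), ("MPL", "PROP")]
UNDERSTANDING_LOOP = [("PROP", "IMPLIC"), ("IMPLIC", "PROP")]  # may repeat
ACTING = [("PROP", "OBJ"), ("OBJ", "LIM"), ("LIM", "MOT(hand)")]


def transformation_sequence(iterations=1):
    """Build the full sequence, repeating the bracketed loop `iterations` times."""
    return READING + UNDERSTANDING_LOOP * iterations + ACTING


def is_chained(sequence):
    """Check that each transformation's output code feeds the next one's input."""
    return all(a[1] == b[0] for a, b in zip(sequence, sequence[1:]))


for n in (1, 3):
    sequence = transformation_sequence(n)
    print(n, len(sequence), is_chained(sequence))
```

With one pass through the loop the task needs 8 transformations, and with three passes it needs 12. Comparing such counts across interface designs is one way to use the "relative amounts of cognitive resources" idea.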

Summary of ICS

ICS is intended to provide a detailed description of the cognitive resources needed to use and learn a system. As such it is far more powerful than GOMS, but also much harder to use, and the results it produces are harder to interpret. Even trained psychologists would find it hard to consistently determine which Subsystems are involved when users attempt to complete tasks. One approach to alleviating this problem is to develop some sort of software toolset to help evaluators use the framework. This might support activities such as determining which Subsystems are used, and which representations are appropriate.

Review Question 6

ICS provides a detailed model of human cognition which can be used to determine how much cognitive effort a user will have to employ to complete a task. How does this differ from GOMS in terms of its model of human cognition, the intended users of the model, and the kinds of assessments it can make?

Answer at the end of the chapter.

Activity 6 - ICS

ICS is notoriously difficult to use. How would you develop a tool to make its use easier, what functionality would it have, and how would that make using ICS easier?

A Discussion on this activity can be found at the end of the chapter.

Summary of Cognitive Modeling

In this section we looked at models of human cognition which are intended to help designers understand how users would behave with their systems. Both models have a computational feel, reflecting their roots in cognitive psychology. Furthermore, both models can be quite difficult to use, especially for large systems.

Summary of User Modeling

The two approaches discussed in this unit – user requirements modelling and cognitive modelling – are very different uses of models. The first models the situation in which users work and the constraints their environment places upon them, and therefore new systems, whereas the second tries to give designers some understanding of the working of the human mind. They may be radically different models, but the reason for their existence is the same – to ensure that designs take account of users. This is referred to as user centred design and is an important departure from the conventional approach to design which concentrates on functional requirements of the system. The aim is that by considering the user (either from the point of view of their work context, or their cognitive abilities) the developed systems will be successfully deployed in the workplace and accepted by the users.