MSc-IT Study Material
June 2010 Edition

Computer Science Department, University of Cape Town

Ways of Evaluating User Behaviour

There are many ways of evaluating user behaviour; we shall discuss them in more depth in later units. Which techniques should be used depends on what information you need to get from the users. The information you can gather ranges from quantifiable measures such as performance time to much more qualitative aspects such as levels of user satisfaction. Which measures are taken and acted upon is dictated largely by the purpose to which the system will be put. Designers of consumer leisure products will be much more interested in levels of user satisfaction than in performance measures. However, designers of systems where profits rely on performance, and whose users are paid to use the systems, will be much more interested in performance measures. Gray et al (1993) give an interesting economic figure: they were involved in redesigning the workstations used by New England telephone operators, and, given the number of operators and the number of calls they took, they estimated that a one-second reduction in work time per call would result in a saving of $3 million per year.
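To make the economics of such an argument concrete, a rough calculation of the same kind can be sketched as follows. The operator, call and cost figures below are invented purely for illustration; they are not Gray et al's actual data.

```python
# Hypothetical illustration of how a small per-call time saving scales up.
# All figures are invented for the example; they are NOT Gray et al's data.
operators = 10_000                 # hypothetical number of telephone operators
calls_per_operator_per_day = 600   # hypothetical workload
working_days_per_year = 250
cost_per_second = 0.005            # hypothetical cost of one second of operator time, in dollars

calls_per_year = operators * calls_per_operator_per_day * working_days_per_year
annual_saving = calls_per_year * 1 * cost_per_second   # one second saved per call

print(f"Estimated annual saving: ${annual_saving:,.0f}")
```

The point of the sketch is simply that, at this scale of use, even a one-second improvement per task multiplies up to a very large annual figure, which is why performance measures matter so much in such settings.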

The cost of performing an evaluation can also be an important factor: carrying out an evaluation and analysing the results can be very time-consuming and costly. Some techniques (e.g. questionnaires) are much cheaper than others.

Laboratory-based evaluation

This method of evaluation derives from scientific psychological studies. Psychologists study behaviour experimentally by giving subjects tasks to do in a controlled environment. By controlling the environment they attempt to control the variables that may affect the subjects’ behaviour. If the experimenter can argue that all variables were held steady except one, and that changing that one variable changed the subjects’ behaviour, then the experimenter is in a good position to argue that that variable has a causal effect on the subjects’ behaviour.

Laboratory-based evaluations are intended to produce results with a high level of scientific rigour. Typically users will be invited to perform certain tasks using a system under laboratory conditions; other users may then be invited to perform the same tasks using a variation on the system. If the two sets of users behave in different ways, the experimenter can claim that the differences in behaviour are caused by the differences in the system.

Users’ behaviour can be recorded on video, or the system can be programmed to log what users do automatically. After the experiment the analysts will usually try to get more qualitative responses from users by asking them to fill in questionnaires or by interviewing them.
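As an illustration of instrumenting a system to record user actions in a log file, the following is a minimal sketch; the event names, file format and logging function are invented for the example and are not part of any particular toolkit.

```python
import json
import time

LOG_PATH = "session_log.jsonl"   # hypothetical log file, one JSON record per line


def log_event(user_id, event, **details):
    """Append a timestamped record of a user action to the log file."""
    record = {"time": time.time(), "user": user_id, "event": event, **details}
    with open(LOG_PATH, "a") as f:
        f.write(json.dumps(record) + "\n")


# Calls like these would be wired into the interface under evaluation.
log_event("P01", "menu_opened", menu="File")
log_event("P01", "task_completed", task=3, duration_s=42.7)
```

Logs of this sort can later be analysed to extract performance measures such as task times and error counts, complementing the video record.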

The main criticism levelled at laboratory-based evaluations is that they lack ‘real-world validity’. Users may behave in a certain way in the rather unusual setting of a laboratory, but there is no guarantee that they will behave the same way in the ‘real world’. Furthermore, care needs to be taken over which users are asked to take part in the experiments. Many widely reported psychological experiments are undertaken in universities by researchers who use the undergraduate population as subjects; on closer investigation, the resulting statements about behaviour often apply not to the population as a whole but only to a rather small and demographically unrepresentative set of university students. Evaluators of interactive systems should take similar care to evaluate their systems with the sort of people who will actually be using them.

Laboratory-based evaluation is a way of achieving scientific rigour in an evaluation, which is a very strong requirement for an evaluator to set themselves. Unless the evaluator wants to publish the results in learned journals, there are easier and cheaper ways of getting data about user behaviour. Furthermore, laboratory investigations require skilled evaluators and specialised equipment; they can be very expensive.

Ethnographic studies

An ethnographic study is a way of getting round the problems of real-world validity presented by laboratory-based evaluations. Ethnographers recognise that valid user behaviour is to be found not in laboratories but in users’ homes and workplaces, and that users’ behaviour is influenced by the presence of the experimenter. Hence the experimenter will try to become part of the users’ environment. If a study is being made of a system in a workplace, the experimenter will join the workforce and perform the tasks that the users do.

Because the experimenter cannot control the environment to anywhere near the extent possible in a laboratory, it is much more difficult to make claims of cause and effect. Ethnographic studies are a newly emerging way of studying behaviour and are beginning to gain respect for the realistic and valuable insights they give into behaviour. New sets of procedures for performing ethnographic studies are emerging, and as a field ethnography is rapidly moving towards scientific respectability.

Much care must be taken with ethnographic studies, though: they must be studies of the users’ behaviour and not of the experimenter’s. Because the barriers between users and experimenters are deliberately broken down, there must be good evidence in the evaluation that the experimenter is reporting the users’ behaviour and not their own. Although an explicit laboratory is not required, ethnography is still expensive in terms of experimenter skill and time.

Questionnaires

Questionnaires are a cheap way of gathering opinions from a large number of users. They vary in how prescriptive they are about the answers users can give: they can ask open questions and leave the user space to respond in free text, or they can ask very specific questions with a set range of answers to be ticked. ‘Tick box’ answers, unlike free text, can be read automatically, which can dramatically decrease the cost of the evaluation. Web pages can also be written in the form of questionnaires so that users can send information to the developers automatically.

There is a trade-off between the amount of freedom given to users in how they fill in a questionnaire and the time and effort required to analyse it. A questionnaire made up entirely of questions with a fixed set of answers to tick allows very little freedom, but is very quick and easy to analyse; the more you allow users to fill in free text, the more effort is involved in analysing the responses.
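As a sketch of why fixed-choice answers are so cheap to analyse, the following tallies the responses to each question, assuming the completed questionnaires have been collected into a CSV file with one row per respondent and one column per question (the file name and layout are assumptions for the example).

```python
import csv
from collections import Counter

# Tally fixed-choice answers from a hypothetical responses.csv, in which each
# row is one completed questionnaire and each column is one question.
tallies = {}
with open("responses.csv", newline="") as f:
    for row in csv.DictReader(f):
        for question, answer in row.items():
            tallies.setdefault(question, Counter())[answer] += 1

for question, counts in tallies.items():
    print(question, dict(counts))
```

Free-text answers, by contrast, have to be read and coded by hand, which is where most of the analysis cost arises.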

The questions need to be carefully written so that users can understand them and give useful answers. Users will soon get bored and fill in spurious answers if they do not understand the questions or find them irrelevant. Furthermore, questionnaires should be fairly short and to the point to prevent users from getting bored; if a lot of questions are to be included, it is best to put the important ones first so that users answer them before giving up. Because filling in questionnaires is such a boring task for most users, evaluators will often offer incentives to complete them.

User interviews

Interviewing users requires considerable skill. The interviewer needs to draw appropriate information out of the users, while leaving the interview open enough for users to add their own opinions, yet without letting the users ‘run away’ with the interview and discuss things of interest to them but of little relevance to the system being evaluated.

Summary

Each of these approaches to evaluation has its own strengths and weaknesses. A really extensive evaluation will make use of many, if not all, of these techniques in order to maximise the benefits of each. A small evaluation will probably get the best results from several user interviews, as these tend to generate the best-quality information. Although evaluating by questionnaire is appealing because of its cheapness, it should be applied with care: a badly designed questionnaire can generate misleading results.

Review Question 7


The answer to this question can be found at the end of the chapter.