Introduction to Multimedia in Museums

Section Two: Developing Multimedia Systems


Chapter 10. Evaluation


Introduction

Multimedia technologies as a relatively new phenomenon are still undergoing development and rapid change. Despite the miraculous qualities attributed to multimedia by production companies and software developers, there are still many problems related to this technology and many areas that need to be evaluated and investigated further. As the multimedia craze and 'technolunacy' (the use of the technology as an end in itself) have started pervading museums, the need for careful re-assessment and examination of the effect of these programs on visitors becomes imperative. For museums, with their usually tight budgets and increased public accountability -- especially when funded by public authorities -- it is even more important to evaluate the success of any multimedia venture.

Multimedia design and production are labour intensive and time-consuming. Even when the programming and the technical design are complete, evaluation of multimedia projects can offer valuable information for improving the applications and useful lessons for further development.

Finally, evaluation should be an on-going process, integrated in the overall function of the museum. It raises questions that affect the whole institution and relate to its role, its function, and the ways in which the museum fulfils its goals.

Problems and Difficulties

Despite the urgent need for evaluation of multimedia applications, published reports of projects are unfortunately very limited in number, and most presentations focus only on positive outcomes. One of the reasons may be that properly designed and conducted evaluation surveys can be a demanding and daunting task, requiring specialized knowledge and expert advice, which not many museums possess or can afford. Furthermore, not many institutions are ready to share unsuccessful experiences and unpredictable or negative results, despite the fact that the application of a relatively new and experimental technology, such as multimedia in museums, is very likely to be associated with problems and with first efforts that are not well received.

Museum professionals have been sceptical of the effectiveness of evaluation in general and of the methodology and results of several specific visitor surveys. Evaluating the effect of museum exhibitions has been seen by some as a trivializing exercise which cannot record all the subtle and unmeasurable experiences that visitors might have in a gallery. Kenneth Hudson (1993, pp. 34-40) believes that visitor surveys "can be helpful, provided that they confine themselves to simple facts which can be processed and classified without too much distortion" (Hudson, 1993, p. 35) and goes further to question the usefulness of most of them: "Because I believe that museum-going is such a personal affair and its results so subtle and so unpredictable, I consider that a high proportion of visitor surveys are useless, impertinent and a waste of money" (Hudson, 1993, p. 38).

On the other hand, this sceptical attitude is sometimes considered as a hiding place for those who fear unpleasant facts or changes (Shettel, 1989, p. 134). Over the last few years, recording of the responses and satisfaction of the public has begun to develop into a proper science, borrowing research techniques and methodology from various disciplines.

Another criticism voiced about museum evaluation studies is that they have often focused on testing cognitive gain and the acquisition of factual information. "Too often, criteria better suited to more formal learning environments have been applied to the museum, and given such an inappropriate comparison, museums have not fared well" (Munley, 1986, p. 20). But the museum visit includes a wide range of other experiences and types of learning that are often ignored: social and aesthetic learning, development of new interests, consolidation of previous knowledge, awareness of issues, change of perceptions. Although these are experienced by many visitors in museums, they are usually elusive and difficult to quantify, measure, and record. This demanding task is made even more complex by the diversity and heterogeneity of the museum audience.

Types of Evaluation

Many authors distinguish three main types of evaluation: front-end, formative, and summative.

Front-end analysis refers to the evaluation carried out before an exhibit is developed (exhibit in this report indicates the components that form part of an exhibition, from an interactive program to an interpretative label or a museum case). This type of evaluation can gauge the reactions of users to the subject matter of the application and the appropriateness of the computer for communicating the intended messages. Before embarking on interactive multimedia development, it is worth asking whether this is the most suitable medium for the task at hand and whether it could be substituted by a more affordable and simpler solution (McLean, 1992, p. 4). After observing that visitors at the Hall of Human Biology and Evolution in the American Museum of Natural History in New York engaged with and appreciated the dioramas and artefacts much more than the computer interactives, Ellen Giusti (1994b) warns that "when they choose to visit a natural history museum, [visitors] expect educational displays consisting of objects and dioramas. When a natural history museum uses non-traditional media, there had better be a good reason (educationally) for doing so".

Formative evaluation takes place while the program is being developed, and its results help refine and change the application. It can illustrate the appropriateness and intuitiveness of the user interface and pinpoint problematic areas and programming 'bugs'. Both steps are vital exercises for the design of multimedia programs, ensuring that even if the final application is not perfect (Giusti, 1993b), it is at least better than one that hasn't been tested at all.

It is never too early to test the program and involve the final users in the design process. Even hand-drawn paper mock-ups of screens and testing with cheap and crude prototypes can offer valuable feedback and suggest changes before it is too late. In most cases even a brief survey with a small sample, if a large one cannot be administered, will offer a wealth of information.

Summative evaluation tests the effect and impact of the exhibit once it has been completed. This is often the first time that evaluators can test interpretive exhibits and gallery kiosks in greater depth in relation to the surrounding space, examine their role in the exhibition, and explore the dynamics between objects, visitors, and computer interactives.

Evaluation, mainly formative and summative, can help identify who uses the program.

Systematic and rigorous evaluation with a large random sample can give a valid indication of the profile of the users of the computer interactives. This can include demographic characteristics (information about age, gender, nationality, level of education, occupation), as well as other information such as the visitors' interests and computer skills. As is the case with visitor surveys in general, it is important to follow a rigorous sampling methodology in order to acquire results from the sample that can be generalized about most museum visitors. Random sampling ensures that every person in the population (the museum audience in this case) has an equal chance of being selected to complete a questionnaire or offer information.
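The random-sampling principle described above can be sketched in a few lines; the visitor pool and sample size here are hypothetical, and in a real survey the pool would typically be approximated by intercepting, say, every nth visitor at a sampling point.

```python
import random

# Hypothetical pool of visitors present during the survey period.
visitors = [f"visitor_{i}" for i in range(1, 501)]

# random.sample draws without replacement and gives every visitor an
# equal chance of selection -- the property that allows results from
# the sample to be generalized to the whole museum audience.
sample = random.sample(visitors, k=50)

print(len(sample))                       # 50 respondents
print(len(set(sample)) == len(sample))   # True: no visitor selected twice
```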

Museum staff sometimes face surprises with the disparity between the intended audience and the real public of multimedia programs, as was the case with the Musée d'Orsay in Paris. Its Gallery of Dates, where seven consultation stations offer information about the historical context of the period from 1848 to 1914, was designed essentially for cultivated adults, but -- as shown from observation, interviews, and questionnaires -- was in fact used by a younger audience with a widely varied level of knowledge (Le Coz and Lemessier, 1993, pp. 377-83). If the museum keeps records about its general visiting public, it is very useful to compare these with the profile of the multimedia users.

Formative and summative evaluations can also help answer the questions:

How do they use the program?

Observation, video-recording, and computer logging of the users' interaction can show whether the application is used by groups or individuals; the amount of time spent with the program; the depth of information reached; the choices made; and the navigation paths followed.
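The kind of computer log mentioned above can be very simple: a timestamped record of each screen the user visits, from which time spent and the navigation path can be derived. A minimal sketch, with hypothetical screen names and timings:

```python
from datetime import datetime

# Hypothetical log of one session: (timestamp, screen visited).
log = [
    (datetime(1996, 5, 1, 14, 0, 0),  "main_menu"),
    (datetime(1996, 5, 1, 14, 0, 40), "topic_overview"),
    (datetime(1996, 5, 1, 14, 2, 10), "object_detail"),
    (datetime(1996, 5, 1, 14, 5, 30), "main_menu"),
]

# Total time spent with the program, in seconds.
duration = (log[-1][0] - log[0][0]).total_seconds()

# The navigation path followed through the program.
path = [screen for _, screen in log]

print(duration)  # 330.0 seconds
print(path)
```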

What is the relation of the program with the exhibition and the museum?

The findings of the evaluation study can highlight the role and function of the multimedia program in the gallery; the relationship with the real objects displayed; the time spent with the objects compared with the time spent with the program; the effect of the program on the way the exhibition is viewed; the pattern of traffic flow around the museum, etc.

What is the effect of the program?

This is one of the most interesting and most frequently posed questions: What do users get out of the program? Do they learn anything? What is the impact of these applications? Do they help visitors understand and appreciate the objects better? These questions refer to educational effectiveness, emotional and aesthetic impact, successful communication, and visitor satisfaction. They are also among the most demanding and difficult characteristics to evaluate.

The evaluation of multimedia in museum settings should acknowledge the difficulty of measuring and recording the often elusive reactions to museum exhibits and the personal meanings that people derive from exhibits. It is best to employ a combination of methodologies to measure the effectiveness of multimedia programs at various levels, relating quantitative with qualitative results.

The most important thing to remember when evaluating museum multimedia programs is that there is no single golden method to be applied. It is usually necessary to combine several methods in order to have a better chance to verify and combine data.

Observation

Observations can be recorded on data collection sheets with the floor plan of the exhibition space or with checklists of specific behavior categories, together with personal notes. A stopwatch can help record time. Video-recording can also provide a wealth of data, although this often takes longer to analyse, thus raising the costs.

Interviewing

This can be open-ended, with the interviewer simply talking freely with visitors. Interviews can be conducted with a group of people or with individuals from the real or targeted audience of a program. Open-ended interviewing can be particularly useful for front-end analysis, to test how and what the targeted audience thinks about a topic before starting program development. If an application is intended for specific groups (e.g. a researchers' resource or a schoolchildren's outreach program), discussions with focus groups can be very useful during the planning and development stages. This often helps outline a list of questions for more formal interviewing. Interviews usually provide a lot of useful and meaningful data, but are time-consuming, demanding on the interviewer, and difficult to analyse and categorize.

When testing a prototype, interviewing can take the form of cued and uncued testing. Cued testing involves explaining to users what the program is about and asking them to perform specific tasks or answer questions. It might also entail engaging users in conversation and encouraging them to 'think aloud' as they go through the program, while recording their responses. With uncued testing, users are observed unobtrusively using the program and are then asked questions about their experience.

On-screen Questionnaires

Although not statistically valid on its own, this is an easy way to record feedback from the users of the program, who are encouraged to answer some questions or click on multiple-choice answers after using the program. The National Museums of Scotland used electronic questionnaires in their Western Isles National Database Evaluation Exercise (WINDEE) project to record basic information about the users, such as their age, sex, and place of residence. With World Wide Web virtual exhibitions and online applications this is often the only way to acquire some information about the users. The Smithsonian Institution's National Museum of American Art in Washington, DC, used this method with their America Online electronic pages to obtain information about the number of virtual visits, the place of residence and occupation of users, the areas visited, and the features they would like to see.

Printed Questionnaires and Comments Books

In a similar way, attractive and clearly laid out printed questionnaires placed next to the computer station can encourage visitors to leave their impressions and comments about the program. Providing enough pens and visible boxes or assigned points for returning the questionnaire can help increase the number of responses. Again, this method is not statistically valid and doesn't include answers from a representative sample, but it is cheap and easy to carry out and can often give useful information.

User Interaction Logging

This is an electronic, reliable way of recording the choices of the users and the path they selected through the program. Once the scripting has been set up, it is an easy and objective way of obtaining a large set of data, which can be analysed statistically. One problem with this method is that with programs in public galleries, it is sometimes difficult to distinguish in the log between the interactions of different users. Also, although it indicates the most popular visitor choices, it doesn't explain why they were chosen (Shneiderman et al, 1989, pp. 172-82). The results are not very meaningful on their own, but can be very useful when combined, for example, with interviews and observation.
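One common workaround for the difficulty of telling different users apart in a gallery log is to start a new session whenever the kiosk has been idle for longer than some threshold. This is only a heuristic, and the 120-second threshold below is a hypothetical value that would need tuning against observation; a sketch:

```python
def split_sessions(timestamps, idle_gap=120):
    """Group a sorted list of interaction times (in seconds) into
    presumed sessions, starting a new session after any pause
    longer than idle_gap seconds."""
    sessions = []
    current = []
    for t in timestamps:
        if current and t - current[-1] > idle_gap:
            sessions.append(current)
            current = []
        current.append(t)
    if current:
        sessions.append(current)
    return sessions

# Three bursts of activity separated by long idle pauses are
# interpreted as three different users at the kiosk.
events = [0, 10, 35, 400, 420, 900, 910, 960]
print(len(split_sessions(events)))  # 3
```

Combining such session estimates with observation of the kiosk, as the text suggests, is the only way to check how well the heuristic matches real user turnover.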

Feedback from real users of museum multimedia programs in their natural environment is very important and usually more useful than laboratory studies. Laboratory testing in controlled situations, with users solving tasks defined by the evaluators, can be useful in the first testing of the prototype to correct programming errors, but the sooner the system can be tested in its natural environment, the more valid and meaningful the findings.

Checklist of Evaluation Criteria

Identifying appropriate criteria for judgement is vital for every evaluation. These depend on the scope and purpose of the study, the aims of the multimedia program, the objectives set by the development team, and the time and funds available. The following checklist outlines only some of the basic aspects of multimedia applications that can be assessed:

User Interface/ Presentation

- Is the user interface consistent and appropriate for presenting the subject matter to the users?
- If real-world metaphors are used, are these successful?
- If icons are used for the buttons, are they understandable by the users?
- Is the quality of graphics, images, sound, and video adequate?
- Is the text legible (fonts, sizes, layout, spacing)?
- Is the screen design attractive and effective?
- Are the media used integrated successfully?
- Is the level of interactivity appropriate for the intended audience and environment?

Structure/ navigation

- Is the structure of the various components appropriate to the content (linear, hierarchical, network, combination)?
- Is the application easy to navigate? Does it indicate the user's position, past moves, and available paths?

Programming

- Are there any programming problems or errors?
- What happens if users don't use the application the way it was intended?
- Are users forgiven for making mistakes?
- Is there feedback for operations that may take a long time?
- Are users able to exit or restart at any time?
- Is there a 'Help' section? Is it easily accessible throughout the program?

Content

- Is the amount and depth of information adequate?
- Is the information accurate?
- Is the writing appropriate, correct, and clear?
- Is the presentation designed for different learning styles?
- Are the intended messages communicated effectively?

Integration with exhibition/museum

- Does the program complement the exhibition? Does it match the overall feel of the display?
- Does it integrate well with surrounding exhibits?
- Does it motivate users to look at the objects?
- Does it create bottlenecks in traffic flow?
- Can it be used by several people at the same time?
- Is the installation ergonomic (height of screen(s), interface devices, kiosk design, lighting, access for disabled users)?

Distribution

- Will the application be compatible with other hardware or future devices?
- Does it follow established standards?
- Can it be distributed for educational or home use?

Overall impressions

- Does the program provide enjoyment, and arouse curiosity and interest?
- Is it easy to use?
- What is its attracting and holding power?
- Does it fulfil its intended purpose? (e.g. does an orientation program help visitors find their way? does a research tool answer the needs of specialists?)

Research Findings

The few published evaluation studies of computer interactives give us some indication of how interactive multimedia applications have been used in museum settings. The users of museum interactives tend in most cases to be young and male (Sharpe, 1983; Doering et al, 1989; McManus, 1993b, pp. 74-114; Menninger, 1991; Morrissey, 1991, pp. 109-118; Giusti, 1994a). In their study of the use of the 'Information Age' exhibit at the National Museum of American History, Allison and Gwaltney (1991, pp. 62-73) also found that visitors over 45 were under-represented, while those under 25 were over-represented, but their study did not show any inequity linked to sex. A few studies seem to contradict this pattern (Hilke et al, 1988, pp. 34-49).

The study of the program 'The Caribou Connection' at the National Museum of Natural History in Washington, DC, recorded both direct and indirect users, the latter being those who observed without participating. The results showed that while over 60% of the direct users of the computer system were male, the indirect users were divided equally between the sexes. Although we need to be careful when interpreting these data and take into account the special characteristics of every case, it seems, however, that in general the visitors who will voluntarily approach a computer or compete for its use in a busy gallery are those who are already comfortable and familiar with using the technology (Doering et al, 1993b, pp. 21-24). At present, they are more likely to be young and male. In most cases visitors use the multimedia interactives in groups. Many adults make use of their attractive presentation and educational content to explain and discuss with children issues related to the objects and the exhibition.

As many museum professionals and educators suspect, visitors seem to like the very process of using the computer (Serrell and Raphling, 1992, pp. 181-189) and comment in many surveys on the attractiveness of the medium (see Doering et al, 1989; McManus, 1993b; Menninger, 1991; Giusti, 1994a and Giusti, 1993a). Computers are often more popular than any other exhibit in the display (Hilke et al, 1988) and have a strong holding power (McManus, 1993b; Menninger, 1991). One exception is the Hall of Human Biology and Evolution in the American Museum of Natural History, where visitors seem to appreciate the dioramas and artefacts much more than the computer interactives (Giusti, 1994b). These observations cause unease among many educators and curators, who fear that state-of-the-art machinery and dazzling programs are going to destroy the atmosphere of specialness of many galleries and steal attention away from the objects.

Interactive programs can indeed have a powerful impact in public exhibitions, isolating the users from the surrounding environment and distracting them from looking at the displays. On the other hand, research so far on the use of computers in the galleries suggests that they actually complement and increase the enjoyment of the exhibits (see Menninger, 1991; Morrissey, 1991; Allison and Gwaltney, 1991; Hilke et al, 1988; Wanning, 1991; Worts, 1990 and Mellor, 1993). When thoughtfully designed and carefully positioned, interactive systems seem to function as supplements and enhancements of exhibitions, instead of replacing objects.

The above studies report that after having used interactives interpreting the museum's collections, visitors spent more time viewing the galleries. Allison and Gwaltney observed that 'visitors are clearly still spending most of their time looking at the traditional displays in the exhibition. The availability of interactives does not diminish interest in seeing artifacts or period settings. In Information Age at least, the time visitors spend with interactives seems to increase the normal amount of time they would have spent in the gallery if the interactives were not present' (Allison and Gwaltney, 1991, p. 70). Although time spent viewing the exhibits is not always an accurate measure of learning or attention, it has been used in numerous studies of museum audiences, as it is an easily measurable factor which suggests increased interest and can be an indication of exhibit effectiveness.

The presence of interactive multimedia in most galleries has had a strong influence, but it seems to be a positive one, creating a lively atmosphere and increasing interest in the subject. Evaluators in many cases tried switching the machines off for certain periods and observed the differences in visitors' behavior. In the Art Gallery of Ontario, when the computers were on there were more animated conversations, and visitors were pointing to works of art and calling friends to look at details (Worts, 1990). In the 'Laser at 25' travelling exhibition 'visitors read, questioned, and discussed exhibit topics more frequently when the computer was on. Surprisingly, there were more reading and looking behaviors not only at the computer but also for visitors to all parts of the exhibition' (Hilke et al, 1988, p. 48). Interestingly, in the same exhibition, and also in the evaluation of the interactive videodisc program 'Birds in trouble in Michigan' at the Michigan State University Museum, non-computer users spent more time in the gallery when the computer was on than when it was off (Morrissey, 1991). Imitating the behavior of other visitors is probably an important factor in this situation.

Observation of the use of museum interactives and of the patterns of exploration showed that visitors appreciate the initiative and flexibility that these programs offer. Users prefer to have complete control over their choices rather than explore in a linear fashion (Diamond et al, 1989), and they often create unique paths through the program (Morrissey, 1991). The evaluation of the 'Electronic Newspaper' program at the American National Museum of Natural History showed that the things visitors learned and remembered most vividly were the more active, discovery-based parts of the program (Giusti, 1994a). Developers of multimedia interactives should therefore not forget to make full use of the powerful interactive elements of these programs.

Several studies included cognitive tests to investigate what people learn from multimedia interactives. Knowledge acquisition is a common evaluation criterion, although it is worth remembering that the museum visit is different from the classroom experience: it often includes many different types of learning, not related to the acquisition of factual information, which are difficult to measure and record. The evaluation studies of the Getty Museum videodiscs on Greek vases and on illuminated manuscripts tested cognitive gain by comparing the quiz answers of users of the interactive program with those of a control group which had only visited the gallery without viewing the program. The results indicated that the scores of the users of the interactives were significantly higher (Menninger, 1991 and Herman, 1986). In the studies of the 'Caribou Connection' and the 'Electronic Newspaper', at least two thirds of those interviewed mentioned that they had learned something specific (Doering et al, 1989 and Giusti, 1994a).

The visitor study of Birmingham Museums' 'Gallery 33' also offered some interesting indications about the impact of computer interactives. The permanent exhibition about cultural relativism, entitled 'Gallery 33: a Meeting Ground of Cultures', was organized by the Archaeology and Ethnography Department and includes the interactive video 'Collectors in the South Pacific'. This raises issues such as the act of collecting, Western influence in the South Pacific, and the return of cultural property. Although the study pointed out that the majority of visitors did not associate the interactive videodisc with the adjacent exhibit that the designers had intended it to be paired with, it also reported that the program 'had aroused [visitors] to think of matters related to the collection of artefacts which had never occurred to them previously' (McManus, 1993b, p. 79). Almost a third of those interviewed said that the interactive video had influenced their views on the repatriation of artefacts and on museums (McManus, 1993b). The interactive videos in Gallery 33 seem to have been influential in stimulating qualitative learning (Jones, 1993).

The summative evaluation of Gallery 33, one of the most extensive carried out, included an exit questionnaire, a tracking study of visitor use of the gallery, an analysis of memories of the gallery based on a postal survey, a questionnaire testing reactions to the interactive video, a statistical analysis of the hit counts and choices in the program, a study of schools' use of the interactive video, and an analysis of visitors' written comments. It was possible to track down a number of visitors about seven months after their visit to the exhibition, thanks to the addresses that they were encouraged to leave in the visitors' book. From those who responded about what they remembered from their visit, it was evident that the range and depth of memories of the exhibition was remarkable. The study showed that interaction with the multimedia programs had clearly been a memorable experience for the visitors. Of the total of memories related to objects or things, the largest proportion (a quarter) concerned the interactive videos (McManus, 1993a, p. 60).

Conclusion

It seems that museum interactives are used mainly by young males, who visit the exhibition in groups. The applications are often the most popular exhibits; visitors use them for a long time, enjoy the novelty of the technology, and have a memorable experience. The use and presence of the computers affects the way visitors behave in the museum, often encouraging them to stay longer and pay closer attention to the objects and exhibition themes. In some cases, the programs raise new issues and encourage visitors to challenge their perceptions.

Further research, experimentation, and communication will be necessary to confirm these findings and increase our knowledge and understanding of the use of multimedia in museums.

Front-end analysis and formative evaluation, even when informal and of small scale, can offer valuable information to the developers of multimedia interactives and help design successful and effective exhibits. 'A computer interactive that is prototyped and formatively tested will not necessarily be perfect, but it is guaranteed to be better than one that has not been tested at all' (Raphling 1994, p. 45).

It is difficult to find among the museum community a uniform answer to the question of what constitutes a successful exhibit, and consequently how to measure its effectiveness. Whether the objectives the team has set are attracting a large portion of the audience; contributing to visitors' entertainment and enjoyment; facilitating learning; stimulating curiosity and interest; raising issues; or encouraging self-exploration, one can only find out whether they have actually been achieved by studying the visitors. The effort and resources expended in observation, interviews, and analysis should be seen as an investment which will increase understanding, enable the museum to improve its exhibitions, and offer valuable insights for future ventures.

Today information technology and telecommunications are becoming increasingly important; we live in a media and computer-rich world. In this setting, museums are expected to explore the particular features and novel possibilities of multimedia and to investigate and invent effective ways of applying them. Continuous testing with the users, consideration of the audience needs, and further research can help museums take full advantage and make optimal use of this medium.
