to get the greatest advantage from improvements made to data collection procedures as a result of the exercise.
Before the observations in which she participated, the second observer was instructed by the principal observer in the forms used for data collection, the codes used to categorize discussions, the procedure used to time discussions, and some background on the development project and developers. A total of 42 discussions were recorded during the three doubly-observed meetings. Out of those, both observers agreed on the coding for 26, or 62%. Although, to our knowledge, there is no standard acceptable threshold for this agreement percentage, we had hoped to obtain a higher value. However, the two observers were later able to come to an agreement on coding for all discussions on which they initially disagreed. The observers generally agreed on the length of each discussion.
Fig. 2 Time log used to document discussions during inspection meetings
Many of the coding discrepancies were due to the second observer’s lack of familiarity with the project and the developers. Others arose from the second observer’s lack of experience with the instrument (the form and coding categories), and from the subjectivity of the categories. The coding scheme was actually modified slightly due to the problems the second observer had. It should be noted that some of the discrepancies over coding (3 out of 16 discrepancies) were eventually resolved in the second observer’s favor. That is, the principal observer had made an error. Another troubling result of this exercise was the number of discussions (five) that one observer had completely missed, but that had been recorded by the other. Both the principal and second observers missed discussions. This would imply that a single observer will usually miss some interaction.
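The agreement figures reported for the exercise can be reproduced with a few lines of arithmetic. The sketch below is only illustrative: the function name `agreement_summary`, the code labels, and the two lists are hypothetical, and the lists are synthetic, built solely to match the reported totals (42 discussions, 26 coded identically, five missed by one observer). The text does not say whether the five missed discussions were counted among the 42; the sketch assumes they were.

```python
from typing import Optional, Sequence


def agreement_summary(
    principal: Sequence[Optional[str]],
    second: Sequence[Optional[str]],
) -> dict:
    """Compare two observers' codes, one entry per discussion.

    None marks a discussion that an observer did not record at all.
    """
    if len(principal) != len(second):
        raise ValueError("both observers need one entry per discussion")
    total = len(principal)
    if total == 0:
        raise ValueError("no discussions to compare")

    agreed = 0
    missed_by_one = 0
    for a, b in zip(principal, second):
        if a is None or b is None:
            # Recorded by one observer only; counts as a disagreement.
            missed_by_one += 1
        elif a == b:
            agreed += 1

    return {
        "total": total,
        "agreed": agreed,
        "disagreed": total - agreed,
        "missed_by_one": missed_by_one,
        "percent_agreement": 100.0 * agreed / total,
    }


# Synthetic data (hypothetical labels), shaped to match the reported totals.
principal_codes = ["defect"] * 26 + ["question"] * 11 + [None] * 3 + ["defect"] * 2
second_codes = ["defect"] * 26 + ["defect"] * 11 + ["question"] * 3 + [None] * 2

summary = agreement_summary(principal_codes, second_codes)
print(f"{summary['agreed']} of {summary['total']} agreed "
      f"({summary['percent_agreement']:.0f}%)")
```

Simple percent agreement is used here because it is the measure the text reports. It does not correct for agreement that would occur by chance, which is one reason a universally accepted threshold is hard to state.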
The results of a rater agreement exercise, ideally, should confirm that the data collection techniques being used are robust. However, as in the Inspection Study, the exercise often reveals the limitations of the study.
This is valuable, however, as many of the limitations revealed in the study design can be overcome if they are discovered early enough. Even if they are not surmountable, they can be reported along with the results and can inform the design of future studies. For example, in the Inspection Study, the results of the rater agreement exercise indicated that the data collected during observations would have been more accurate if more observers had been used for all observations, or if the meetings had been recorded. These procedural changes would have either required prohibitive amounts of effort, or stretched the goodwill of the study’s subjects beyond its limits. However, these should be taken into consideration in the design of future studies.
Recording of observations, either with audio or video, is another issue to be considered when planning a study involving observation. The main advantage of electronically recording observations is in ensuring accuracy of the data. Usually, the field notes are written after the observation while listening to or watching the recording. In this way, the notes are much less likely to introduce inaccuracies due to the observer’s faulty memory or even bias.