unobtrusive measures

The logic and advantages of unobtrusive measures are briefly reviewed. The literature on unobtrusive measures is then summarized under four large classes: physical traces, archives, simple observations, and measures gathered with hardware. A small number of methods also fall into a residual category. Each class of methods is discussed briefly, with recommendations for and/or problems with its use. This is followed by a listing of the measures in that category, an indication of what variable is being indexed, and applicable references.

UNOBTRUSIVE MEASURES
An Inventory of Uses

THOMAS J. BOUCHARD, Jr.
University of Minnesota

The purpose of this paper is not to tout the merits of unobtrusive measures; they are as a rule inferentially weak. The intent is to illustrate a wide variety of them and, by example, to persuade the reader to consider seriously supplementing more traditional procedures with them. Neither will we discuss in any detail the problems entailed by their use. Every user should read the superb methodological discussion in Webb et al. (1966). Other excellent methodological discussions can be found in Brandt (1972, ch. 5) and Denzin (1970, chs. 11 and 12).

AUTHOR’S NOTE: This paper was written while the author was at the Oregon Research Institute supported by General Research Support Grant RR-05612 from the National Institutes of Health and Grant MH-12972 from the National Institutes of Mental Health Service. The author would appreciate being informed of any methods or references missed by this review so they can be included in a future article.

SOCIOLOGICAL METHODS & RESEARCH, Vol. 4 No. 3, February 1976

THE LOGIC OF MULTIPLE METHODS

If unobtrusive methods yield measures which are as a rule inferentially weak, why use them? Two reasons. First, because all methods are fallible and methods that are fallible in different ways complement each other even if one is absolutely weaker than the other. Convergence of findings by two methods with different weaknesses enhances our belief that the results are valid and not a methodological artifact. Second, researchers tend to develop preferences for single methods (e.g., the interview, the questionnaire, participant observation) and skills in their use. While this may increase the resolving power of the measures generated in specific instances, it also blinds the researchers to a myriad of other "events" and distorts those they record.

It is important to keep in mind the distinction between observations (records) and data (Runkel and McGrath, 1972). A method generates observations, the researcher turns those observations into data and the data into measures of constructs. Methods are in a sense "event recorders."

An example is useful here. Consider the movie camera or video camera as a data-gathering method (event recorder). The field of view of the camera only encompasses a limited perspective on an event. It is blind to everything that goes on outside the field of view. It distorts because it has a fixed locus, is two-dimensional in its recording, foreshortens, and often requires unusual lighting conditions. If we consider the camera operator as part of the method, these problems are all confounded by his skill and his biases. He may focus on what looks good at the expense of what is important. Close-ups may reveal what appear to be important considerations, but all opportunities to test rival interpretations are lost because close-ups are taken only of a selected subsample. Similar considerations apply to any method.

It is well known that substantial correlations between variables can occur simply because a common method was used to measure each of them (Campbell and Fiske, 1959). Given the modest predictive power of most social science research, it behooves a researcher to check and see if all his "predicted variance" might not be due to this fact. Since every investigator is interested in generalizing his findings beyond the circumstances which characterize his method of measurement, it behooves him to multiply his methods. A researcher who uses multiple methods in order to generate multiple measures is engaged in triangulation, and when the measures converge on a common finding, he is said to have generated convergent validity. High convergent validity enhances the researcher’s confidence that all his measures are tapping a common construct. Confirmation of theoretical expectations with various measures generated by divergent methods contributes heavily to the validity of the construct.
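The point about shared method variance and convergent validity can be illustrated with a small simulation. The sketch below is not drawn from the studies cited; it assumes two unrelated constructs (A and B), a questionnaire method whose respondent-specific bias contaminates every score it produces, and a noisier unobtrusive trace measure of construct A. All names (`simulate`, `correlations`, `a_trace`, and so on) and all noise magnitudes are hypothetical choices made for illustration only.

```python
import random


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n=5000, seed=0):
    """Generate hypothetical respondents measured by two methods.

    Assumption: constructs A and B are independent, and the questionnaire
    adds the same respondent-specific bias to both of its scores.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        trait_a = rng.gauss(0, 1)
        trait_b = rng.gauss(0, 1)      # independent of trait_a
        method_bias = rng.gauss(0, 1)  # shared by all questionnaire scores
        rows.append({
            "a_questionnaire": trait_a + method_bias + rng.gauss(0, 0.5),
            "b_questionnaire": trait_b + method_bias + rng.gauss(0, 0.5),
            # The trace measure is noisier but does not share the bias.
            "a_trace": trait_a + rng.gauss(0, 1.5),
        })
    return rows


def correlations(rows):
    """Correlate measures across traits and methods."""
    q_a = [r["a_questionnaire"] for r in rows]
    q_b = [r["b_questionnaire"] for r in rows]
    t_a = [r["a_trace"] for r in rows]
    return {
        "same_method_different_trait": pearson(q_a, q_b),
        "different_method_same_trait": pearson(q_a, t_a),
        "different_method_different_trait": pearson(q_b, t_a),
    }


if __name__ == "__main__":
    for name, r in correlations(simulate()).items():
        print(f"{name}: {r:.2f}")
```

Under these assumptions the two questionnaire scores correlate substantially even though the underlying constructs are unrelated, which is the method artifact described above. The correlation between the questionnaire and the trace measure of construct A, by contrast, cannot be attributed to a shared method, and so serves as evidence of convergence even though the trace measure is the weaker of the two.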

THE PROBLEM OF REACTIVITY

A special problem faced by many social science research methods is that of reactivity. The respondent (subject) knows he is being observed (tested), and so forth, and his behavior is affected thereby. Unobtrusive measures can often, but not always, eliminate the rival hypothesis "reactive measurement effects" (Webb et al., 1966: 173). When "a reactive measure effect" is a plausible rival hypothesis, its elimination via the use of an unobtrusive measure greatly enhances the generalizability of a set of findings. Great care must be taken in the choice of a measure when the intent is to exclude reactive measurement effects. There is a tendency to think that reactivity is restricted to verbal (interview) and test (questionnaire) behavior. This is short-sighted. People can and do dissimulate with their behavior. Furthermore, reactivity is not always a response to the experimenter. Routine records and documents which on their face might appear to be unbiased are often reactive to political considerations (Dalton, 1959). An individual or a group may be maintaining a facade with respect to everyone else in the environment. If this is true, much of the data that one would normally consider to be unobtrusive or nonreactive is contaminated prior to the investigator’s arrival.

FOCUS ON BEHAVIOR

An underplayed advantage of unobtrusive measures is that they tend to focus the researcher on behavior and the results of behavior rather than on verbal expressions of behavior. This is no small gain in light of the consistent finding that test-taking behavior, especially in the form of attitude measures, is often unrelated to the behavior of real interest to investigators (Brayfield and Crockett, 1955; Wicker, 1969). In our opinion the goal of the behavioral sciences is the prediction and control of behavior in its broadest sense, and we find it difficult to understand why researchers have held so tenaciously to paper-and-pencil methods rather than turning to a systematic examination of the structure of the behavior of interest (see Brandt, 1972; Wernimont and Campbell, 1968).

Webb et al. (1966) have suggested four large classes of unobtrusive measures: physical traces, archives, simple observations, and measures gathered with hardware. The following discussion is organized around this classification system.

PHYSICAL TRACES

Physical traces are generally very indirect indicators of psychological and social processes. They are, therefore, prone to misinterpretation and should be used with caution. An example of this is the use of indices of floor wear to assess frequency of use, and indirectly, popularity. Alternative interpretations might be: a bathroom or water fountain was located in the area, the arrangement of furniture allowed no degrees of freedom, the floor material in that area had different characteristics and simply wore faster (poorly calibrated instrument). Contaminating factors such as these should be looked for when physical traces are used.

A logical scheme for classifying physical traces is shown in Table 1. As a result of human activities, physical material can accumulate (accretion, e.g., litter); it also may be used up (erosion, e.g., worn spots on the floor in front of popular museum exhibits). When changes in physical material wrought by people are minor, people are said to have left traces, e.g., fingerprints. It is often possible to manipulate physical material in such a way that its use in assessing accretion, erosion, and traces is enhanced. The size of the material units may be fixed or randomized, the texture (wear qualities) may be standardized, its frequency of exposure regulated, or its capacity to retain traces enhanced.

TABLE 1
Six Categories of Physical Traces

We have been unable to find any examples for cells III (Controlled accretion) or IV (Controlled erosion). This means that no one (as far as we know) has manipulated physical material, except in a trace manner, in order to generate measures of human behavior. There are on the other hand a number of uses of uncontrolled accretion and erosion.

It is of interest to note that a parallel situation exists with respect to systematic observation methods (see Bouchard, forthcoming). Investigators collect data in either contrived situations (e.g., experimental groups) or in a naturalistic context. Natural settings are seldom purposefully manipulated in order to test hypotheses or facilitate the collection of data. This method is clearly underutilized by social scientists. Weick (1968) has called this method "tempered naturalness." It is another name for the "naturalist’s alternative." "Manipulate only as much as necessary to answer your questions clearly and otherwise leave things alone, for there is order even in what seems to you to be the worst confusion" (Menzel, 1969: 91).

Below we list examples by category when examples exist. Many of the examples are from Webb et al. (1966) and the page number where the material is discussed in that source is given. The remaining examples are also referenced. Unreferenced material is original.

Measure: Variable and/or Reference

Uncontrolled Accretion

Bent corners on library books: usage rate (Webb et al., 1966: 37)

Dirt on sections of books and encyclopedias: usage, interest patterns (Mosteller, cited in Webb et al., 1966: 38)

Dust on library books: recency and amount of usage (Webb et al., 1966: 38)

Inventories: before-and-after size of inventory can be used as a crude indicator of success of a sales campaign; absolute size of a shipment can index sales expectations

Litter: used to assess effectiveness of antilitter campaigns (Webb et al., 1966: 42)

Trash analysis: empty liquor and beer bottles, waste baskets, and garbage cans have been used as sources of information (Gold, 1964; Hughes, 1958; Shadegg, 1964)