Psych - Tests and Measurements in Psychology Unit 3 Essay
Question # 45228 | Psychology | 3 years ago |
Unit 3 Essay
Prompt: What is the purpose of “culture-fair” tests? What types of items do such tests usually employ?
500-750+ words. Okay to go over on word count. MLA style. Please cite any additional sources.
NOTE: BELOW IS A WORD FOR WORD QUOTE FROM THE PASSAGE OF TEXT CONCERNING THE PROMPT QUESTION.
Textbook: Hogan, T.P. Psychological Testing: A Practical Introduction, 3rd ed., Wiley Publishing Company, 2015
Text reads as follows…
Culture-Fair Tests of Measurement
Here is one of the great dilemmas in the psychometric universe. On the one hand, common definitions of intelligence emphasize the ability to succeed within one’s environment. This suggests testing with the ordinary symbols, conventions, and artifacts of the culture. On the other hand, we would like a “pure” measure of intelligence, untrammeled by the specifics of a particular culture. The desire for such a “pure” measure has seemed particularly urgent as concerns about minority group performance have increased. We should note, however, that psychologists’ pursuit of the “pure” measure predates contemporary concerns about minority group scores. The pursuit originated, as we will see, in the late 1930s, whereas concerns about minority group performance arose primarily in the 1960s.
Tests like the Stanford-Binet, Wechsler scales, Otis series, and SAT are obviously steeped in Western cultural practices. They are heavily verbal, specifically in standard English. Are there ways to measure intelligence without dependence on a particular culture and language? This is the question addressed by what we call culture-free or culture-fair tests. The term culture-fair is currently preferred, although a few sources use the term culture-reduced. Culture-free was the original tag for these tests. Not long after, it became apparent that no test could be entirely free from all cultural trappings. For example, using paper, assuming a left-to-right and top-to-bottom orientation on a page, giving direct answers to questions, and being responsive to time constraints are all culturally bound practices. Perhaps, however, we can create a test that is equally fair across cultures. Let us examine some efforts to accomplish this. We should note that, while we treat this topic under group mental ability tests, the topic is also directly relevant to individually administered tests.
Raven’s Progressive Matrices
Probably the best-known example of a test purporting to be culture-fair is Raven’s Progressive Matrices (RPM). This is an important example because it is so widely cited in the research literature on intelligence. Recall from our discussion of theories of intelligence the central role played by “g,” general intelligence. Many people consider the RPM to be a particularly useful measure of “g.” It often serves as a benchmark in factor-analytic studies of intelligence. Therefore, students of psychological testing should have a basic familiarity with this test.
Many sources refer to “the Raven’s” or “Raven’s Matrices” as if it were a single test. However, Raven’s matrices actually constitute three different test series, as summarized in Table 9.15. First, there is the Coloured Progressive Matrices (CPM), designed for younger children and, in general, the lower end of the intelligence distribution. The test uses color to enhance interest level. Second, there is the Standard Progressive Matrices (SPM). This is the classic version, consisting of 60 items. It is intended for persons in the middle of the mental ability spectrum. Its most recent edition is the “Extended Plus” version, released in 1998. Third, there is the Advanced Progressive Matrices (APM), designed for the upper 20% of the mental ability distribution. A single test, Raven’s Progressive Matrices, the forerunner of SPM, was published in 1938. Recall from Chapter 7 that Cattell published his seminal article proposing a culture-free test, based mostly on matrix-type items, in 1940.
Figure 9.9 shows simulations of Raven’s items at several levels of difficulty. The item “stem” shows a pattern (matrix) with a part missing. The examinee’s task is to select the option that completes the pattern. The important point for our presentation here is to observe the nature of these matrix-type items.
The Raven’s has several desirable features. It is completely nonverbal. Even the directions can be given in pantomime if necessary. Hence, it is attractive for use with culturally different, linguistically diverse, and physically challenged individuals (except the seeing impaired). It is multiple-choice and therefore easy to score. Its administration does not require special training. It can be used with individuals or groups. It is reasonably brief.
Why has the Raven’s not experienced wider use in practical settings? The answer, probably, lies in its rather confusing, uncoordinated hodge-podge of materials. There are three levels, each with distinct titles. There are at least five separately published manuals, with assorted supplements besides. There are a host of different norm groups. Much more crucial is the conflicting evidence regarding the trait or traits measured by the test(s). The Raven’s manual emphasizes measurement of the “eduction of relations” in Spearman’s “g” (Raven, Court, & Raven, 1993). Several reviewers (e.g., Llabre, 1984; Vernon, 1984) agree. On the other hand, some authors (e.g., Esquivel, 1984; Gregory, 2011) note that factor-analytic studies identify several different traits, even within a single level. We can surmise some of these different traits from the simulations given in Figure 9.9 above. The first item seems to rely primarily on perception. The third item calls for analogical reasoning. Shot through all of the findings is a kind of figural/spatial ability. Finally, the Raven’s manuals make continual reference to companion tests of vocabulary. However, within the Raven’s complex, the two types of tests are not well coordinated. Nevertheless, the Raven’s, in its various incarnations, is a widely cited instrument in psychological testing.
Other Culture-Fair Tests and Some Conclusions
The Raven’s is by no means the only attempt to provide a culture-fair test. There are numerous other examples. Earlier, in the description of theories of intelligence, we mentioned Cattell’s 1940 article announcing a culture-fair test. The test consisted mostly of matrix-type items. Cattell eventually elaborated the test into the regularly published Culture Fair Intelligence Test (CFIT). Early work with these matrix-type items served as part of the foundation for Cattell’s distinction between fluid and crystallized intelligence. The CFIT was supposed to tap the fluid dimension, while more verbally oriented tests tapped the crystallized dimension. Many other authors have constructed tests using matrix-type items, figural relationships, geometric designs, and other nonverbal stimuli. Figure 9.10 shows examples of such items. The hope nearly always is that the test will be culturally fair.
We should note that the elementary cognitive tasks (ECTs) used in information processing models of intelligence (see pp. 264-266) have been proposed as culture-fair (even culture-free) measures of intelligence. It is an intriguing notion. However, application of the ECTs has been largely confined to laboratory usage. Furthermore, they have a long way to go to prove their worth as measures of intelligence.
The following three conclusions emerge from a review of the work on culture-fair tests. First, the tests tend to be measures of figural and spatial reasoning ability. This may be of some use, especially when there is no reasonable prospect of using a more conventional measure of mental ability. However, these tests are obviously not measuring the same abilities as the benchmark tests of general intellectual functioning, tests in the tradition of the Wechsler and Otis tests. We must be wary of claims by test authors and publishers about “a nonverbal, culture-fair test of intelligence” on two counts. First, the “intelligence” referenced in the statement is of a very narrow variety. If the statement implies that this nonverbal test can serve as a substitute for, say, a WISC, then the statement is misleading. Second, we can determine by simple inspection that a test is largely nonverbal. We cannot determine by inspection that the test is culture-fair. To determine that a test is culture-fair requires research showing that the test functions equivalently in a variety of cultures.
Second, when used to predict success in such areas as school or jobs, tests that are primarily measures of figural and spatial reasoning are clearly inferior to verbally oriented tests. Furthermore, the figural/spatial tests add little, if any, predictive power to simple verbal tests for these predictive purposes. (There are some very limited exceptions to this generalization, for example, predicting success in architecture.) The reason for the superiority of verbal measures is probably quite simple. Most academic activities and jobs have more verbal demands than figural/spatial demands.
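One way to make the idea of added predictive power concrete is to ask how much the squared multiple correlation (R squared) rises when a figural/spatial score joins a verbal score as a predictor of some outcome. The sketch below is an illustration only, not a procedure from the textbook. The function names and every score in it are hypothetical, invented purely to show the arithmetic, and it relies on the standard formula for R squared with two predictors, computed from the three pairwise Pearson correlations.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def r_squared_two_predictors(y, x1, x2):
    """R squared for predicting y from x1 and x2 together."""
    r_y1 = pearson_r(y, x1)
    r_y2 = pearson_r(y, x2)
    r_12 = pearson_r(x1, x2)
    denominator = 1 - r_12 ** 2
    if denominator == 0:
        raise ValueError("predictors are perfectly correlated")
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / denominator


def incremental_validity(y, baseline, added):
    """Return (R2 with baseline only, R2 with both, the increment)."""
    r2_base = pearson_r(y, baseline) ** 2
    r2_both = r_squared_two_predictors(y, baseline, added)
    return r2_base, r2_both, r2_both - r2_base


# Hypothetical scores for eight examinees (made up for illustration).
verbal = [12, 15, 11, 18, 14, 20, 16, 13]
figural = [10, 14, 12, 15, 11, 17, 16, 12]
gpa = [2.4, 3.0, 2.3, 3.5, 2.8, 3.8, 3.1, 2.6]

r2_verbal, r2_combined, gain = incremental_validity(gpa, verbal, figural)
print(f"R2, verbal only:      {r2_verbal:.3f}")
print(f"R2, verbal + figural: {r2_combined:.3f}")
print(f"Increment:            {gain:.3f}")
```

A small increment would correspond to the textbook's claim that the figural/spatial test adds little beyond the verbal test. Any real conclusion requires actual validation data rather than invented scores like these.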
Third, the hope that culture-fair tests would eliminate differences in average scores between majority group and minority group examinees or between culturally different groups has not been fulfilled. Some research shows that these differences are reduced somewhat on nonverbal tests, in comparison with more conventional, verbally loaded tests. Other research shows that the group differences are approximately the same on the verbally loaded and the nonverbal tests. This search for the golden fleece will have to continue.
Intelligence Tests for Micro-Cultures
At the other extreme from attempts to build culture-fair tests are intelligence tests based on a highly specific subculture, what we might call a micro-culture. Not infrequently, one hears about such an “intelligence” test. For example, such a test could be developed for sailing terminology (port, tack, starboard, coming about, etc.); baseball (bunt, Texas leaguer, sacrifice fly, etc.); or a city’s subway system (the MTA, the D train, etc.). These tests are often put forward to discredit conventional mental ability tests such as the WAIS or SAT. Some wag will note that a person with a WAIS IQ of 150 (who lives in Chicago) failed the NYC subway test. The implication is that the WAIS IQ is unimportant. The public media are fond of touting such reports. The implication is transparent nonsense. We know a great deal about the generalizability of the WAIS IQ. We must ask the same questions about the subway test as about the WAIS. To what other behavior does subway test performance generalize? What does it correlate with? What are the norms for the test? Is the score reliable? Ordinarily, no such information is available for these micro-culture tests.