Setting Standards

Applying Faculty Judgment to CLEP Exam Cut Scores

Our standard-setting process ensures that the scores students earn on each examination consistently reflect their mastery of the subject. Scoring standards are determined through Web-based standard setting, carried out by a panel of 15 to 20 college faculty members who teach the equivalent college-level course.

Panelists receive training materials, conduct discussions, and render judgments collaboratively online. The studies are managed by a trained facilitator from ETS who answers questions, monitors the progress of the study, and leads the discussions. The panels follow the modified Angoff method to arrive at their judgments and recommendations. 

The Angoff methodology used in CLEP standard-setting studies is a modification of an approach first introduced by William H. Angoff in 1971. The modified Angoff method asks panelists, or judges, to estimate the percentage of typical students at grade levels B and C who would be able to answer a question correctly. This method reflects the fact that at any particular grade level, it is rare for every student to answer a question correctly, or for every student to answer it incorrectly. For exams that include essays, standards are established using the benchmark method¹ in addition to the Angoff method.

The Web-Based Standard-Setting Process

Before meeting, panel members begin the process by familiarizing themselves with the CLEP examinations in general and the examination under review in particular. They are each asked to define the performance characteristics of a typical college student at various grade levels (A, B, C and D). The final description of a typical test-taker to be used for the purposes of the standard setting is determined during an online discussion among all the judges on the panel.

In the next stage of the process, judges are trained to recognize factors, such as format or phrasing, that tend to either increase or decrease the difficulty of a given test question. This training is intended to help panel members critically assess factors other than content difficulty when predicting how students would perform on the questions.

After this training is complete, panel members are asked to estimate, or rate, the performance of typical students at various grade levels on each of the exam questions.

After the first round of ratings, standard-setting facilitators provide each judge with historical information about the items, the mean and variance of the ratings assigned to each item by the panel of judges, and the difference between the highest and lowest ratings for the item. Items with a particularly large or small variance are highlighted in the document. The judges participate in an online discussion in which they compare their individual ratings to those of other judges, with special attention given to the highlighted items.
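The per-item feedback described above can be sketched in a few lines of Python. This is only an illustration, not the tool ETS uses: the function name summarize_items, the variance thresholds low_var and high_var, and the sample ratings are all assumptions made for the example, since the source does not say how "particularly large or small" is defined.

```python
from statistics import mean, pvariance


def summarize_items(ratings, low_var=25, high_var=400):
    """Summarize first-round ratings item by item.

    ratings[j][i] is judge j's estimate, as a percentage (0 to 100), of
    typical students who would answer item i correctly.
    low_var and high_var are hypothetical highlighting thresholds.
    """
    summaries = []
    n_items = len(ratings[0])
    for i in range(n_items):
        # Collect every judge's rating for this one item.
        item = [judge[i] for judge in ratings]
        var = pvariance(item)
        summaries.append({
            "item": i + 1,
            "mean": mean(item),
            "variance": var,
            "range": max(item) - min(item),  # highest minus lowest rating
            "highlight": var <= low_var or var >= high_var,
        })
    return summaries


# Hypothetical ratings from three judges on three items.
example_ratings = [
    [60, 20, 50],
    [60, 80, 60],
    [60, 50, 40],
]
for row in summarize_items(example_ratings):
    print(row)
```

In this made-up data, the first item is flagged because the judges agree exactly, and the second because they disagree widely; both are the kinds of items the discussion would focus on.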

During the second round of ratings, judges may view and adjust their ratings from the first round, or leave them unchanged. The ratings each judge assigns to the individual items in this second round are then added together to give a total score for that judge. These totals are then averaged across all judges to determine the study's proposed passing score.
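The arithmetic in this final step can be illustrated with a short Python sketch. It assumes that each rating is a percentage, that dividing by 100 converts it to an expected number of points on a one-point question, and that ratings for a single grade level are processed at a time; the function name proposed_passing_score and the sample data are hypothetical.

```python
def proposed_passing_score(ratings):
    """Average the judges' total scores from the second round.

    ratings[j][i] is judge j's second-round estimate, as a percentage
    (0 to 100), for item i. Treating each percentage as an expected
    fraction of one point per item is an assumption of this sketch.
    """
    # Sum each judge's item ratings to get that judge's total score.
    judge_totals = [sum(judge) / 100 for judge in ratings]
    # Average the totals across all judges.
    return sum(judge_totals) / len(judge_totals)


# Hypothetical second-round ratings from three judges on three items.
second_round = [
    [60, 20, 50],
    [60, 80, 60],
    [60, 50, 40],
]
print(round(proposed_passing_score(second_round), 2))
```

Here the three judges' totals are 1.3, 2.0 and 1.5 points, so the proposed passing score for this three-item example is their average, 1.6 points.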

 

¹ J. Faggen, Setting Standards for Constructed Response Tests: An Overview, Educational Testing Service Research Memorandum RM-94-19 (Princeton, New Jersey: Educational Testing Service, November 1994).