J. Prof. Issues in Engr. Education & Practice, 128 (1), 1–3 (2002).

DESIGNING TESTS TO MAXIMIZE LEARNING

Richard M. Felder
North Carolina State University

It’s the middle of December. A colleague of yours who teaches mechanics has just gotten the tabulations of his end-of-course student evaluations and he’s steaming! His students clearly hated his course, giving him the lowest ratings received by any instructor in the department. He consoles himself by grumbling that student evaluations are just popularity contests and that even though his students don’t appreciate him now, in a few years they’ll realize that he really did them a favor by maintaining high standards.

He’s probably kidding himself. Although bashing student ratings is a popular faculty sport, several thousand research studies have shown that student ratings are remarkably consistent with retrospective senior and alumni ratings, peer ratings, and every other form of teaching evaluation used in higher education (Cashin 1988; Cashin 1995; Felder 1992). While there are always exceptions, teaching rated by most students as excellent usually is excellent, and teaching rated as atrocious usually is atrocious.

If your colleague decided to take a hard, objective look at those evaluations instead of dismissing them out of hand, there is a good chance that he would find that his examinations play a major role in the students’ complaints. Not the difficulty of the exams per se: the research also shows that the highest evaluations tend to go to some of the more demanding teachers, not the ones who hand out A’s for mediocre work (Felder 1992). With the exception of outright sadistic behavior, what students hate more than anything else is examinations that they perceive as unfair. Tests that fall into this category have one or more of the following features: (1) problems on content not covered in lectures or homework assignments; (2) problems the students consider tricky, with unfamiliar twists that must be worked out on the spur of the moment; (3) excessive length, so that only the best students can finish in the allotted time; (4) excessively harsh grading, with little distinction made between major conceptual errors and minor calculation mistakes; (5) inconsistent grading, so that two students who make the identical mistake lose different numbers of points. Most students can deal with tests that they fail because they don’t understand the material or didn’t study hard enough; however, if they understand the material but do poorly anyway for any of those five reasons, they feel cheated. Their feeling is not unjustified.

If you teach a course in a quantitative discipline, there are several specific things you can do to minimize your students’ perception that you are dealing with them unfairly on examinations.

Some instructors deliberately put problems on tests unlike any their students have seen before, arguing that solving familiar problems proves nothing about real understanding and that practicing engineers must routinely tackle unfamiliar problems. The logic of this argument is questionable, to say the least. People acquire skills through practice and feedback, period. No one has ever presented evidence that testing students on unpracticed skills teaches them anything. Moreover, engineers and scientists are never presented with brand-new varieties of quantitative problems and told that they have to solve them on the spot without consulting anyone. A student’s ability to solve hard puzzles quickly should not be the main determinant of whether he or she should be certified to practice engineering or science. The way to equip students to solve open-ended or poorly defined problems, or problems that call for critical or creative thinking, is to work out several such problems in class, then put several more on successive homework assignments and provide constructive feedback, and then put similar problems on tests.

Suggestions such as this one and the next are often equated with lowering standards or "spoon-feeding" students. They are nothing of the sort. Taking the guesswork out of expectations is not equivalent to lowering them: on the contrary, I advocate raising expectations to the highest level appropriate for the course being taught, knowing that only the best of the students will be capable of meeting all of them. The point is that the more clearly the students understand those expectations and the more explicit training they are given in the skills needed to meet them, the more likely it is that those with the aptitude to perform at the highest level will acquire the ability to do so.

A study guide is an effective way to communicate your expectations—among other reasons, because students are likely to pay attention to it. The guide should be thorough and detailed, with statements of every type of question you might include on the test—calculations, estimations, definitions and explanations, derivations, troubleshooting exercises, etc. The statements should begin with observable action words and not vague terms such as know, learn, understand, or appreciate. (You wouldn’t ask students to understand something on a test—you would ask them to do something to demonstrate their understanding.) Draw from the study guide when planning lectures and assignments and constructing the test. No surprises!

A number of benefits follow from the formulation of such instructional objectives for courses (Stice 1976; Felder and Brent 1997). A well-written set of objectives helps the instructor make the lectures, assignments, and tests coherent, gives other faculty members a good idea of what they can expect students who pass the course to know, and gives new instructors an invaluable head start when they are preparing to teach the course for the first time. An additional benefit in engineering is that the objectives provide accreditation visitors with an excellent summary of the knowledge and skills being imparted to the students in the course, particularly those having to do with Outcomes 3a–3k of Engineering Criteria 2000 (ABET 2000).

Long, complex problems present another testing challenge. In my courses, the problems get quite long: by the end of the course, a single problem might take two or three hours to solve completely. There’s no way I can put one of those problems on a 50-minute test, but I still have to assess my students’ ability to solve them. I do it with the following generic problem:

Given...(describe the process or system to be analyzed and state the values of known quantities), write in order the equations you would solve to calculate...(state the quantities to be determined). Just write the equations—don’t attempt to simplify or solve them. In each equation, circle the variable for which you would solve, or the set of variables if several equations must be solved simultaneously.

The students who understand the material can do that relatively quickly—it’s the calculus and algebra and number-crunching that take most of the solution time. Moreover, I know that if they can write equations that can be solved sequentially for the variables of interest, given sufficient time they could grind through the detailed calculations.
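
To make this concrete, here is a minimal sketch of such a problem and its answer. The example is hypothetical, not taken from an actual test: the adiabatic mixing of two water streams of known flow rates and temperatures, with constant heat capacity, and with boxes standing in for the circles called for above. Asked to write in order the equations needed to calculate the flow rate and temperature of the combined stream, a student who understands the material would produce something like this:

```latex
% Hypothetical example: adiabatic mixing of two water streams of known
% flow rates and temperatures, with constant heat capacity C_p.
% Equations are listed in solution order; the boxed variable in each
% equation is the one to be solved for.
\begin{align}
  \dot{m}_1 + \dot{m}_2 &= \boxed{\dot{m}_3}
    && \text{(mass balance: solve for } \dot{m}_3\text{)} \\
  \dot{m}_1 C_p T_1 + \dot{m}_2 C_p T_2 &= \dot{m}_3 C_p\,\boxed{T_3}
    && \text{(energy balance: solve for } T_3\text{)}
\end{align}
```

These two equations can be solved sequentially, so each has a single boxed variable; if the balances had to be solved simultaneously, the flow rate and temperature of the combined stream would be boxed together as a set.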

One cautionary note, however. If students have never worked on a problem framed in this manner and one suddenly appears on a test, many of them will be confused and may do worse than they would have if the problem had called for all the calculations to be done. Once again, the rule is no surprises on tests. If you plan to use this device, first work similar problems in class, then assign some on homework, and only then put one on a test.

One final suggestion: before you give a test, take it yourself, working against a clock. Professors don’t want to do this—I certainly don’t! There are only two choices, however. One is to write the test on Sunday night, give it a quick once-through to make sure there are no glaring errors, and administer it Monday morning. You’ll usually find that the test is too long—only a handful of students have time to finish it, and some who really understand the material fail miserably because the only way they’re capable of working is slowly and methodically. (Incidentally, people who work like that are the ones I want designing the bridges I drive across.) It may also happen—and frequently does—that 15 to 30 minutes into the test a puzzled student asks if something is missing from the statement of Problem 2, and you realize that you forgot to include an important piece of data. Telling the class at that point that they’ve been beating their heads against an impossible problem and then figuring out how to grade the test is not an experience you want to have.

The only alternative is to do what I have suggested. I make up my test, think it’s perfect, and then sit down with my stopwatch and take it. That’s when the problems invariably surface. First, it’s too long—in 32 years of teaching, I have yet to make up a test that wasn’t too long on the first round. And there are underspecified problems and overspecified problems and poorly worded problems and problems that call for time-consuming but relatively pointless number-crunching. Then I revise—cleaning up some questions, eliminating busywork in others, dropping still others completely—and take the test again. Sometimes the revised version is acceptable; other times I have to go back and make still more changes.

* * *

In Embracing Contraries, Peter Elbow (1986) notes that faculty members have two conflicting functions—gatekeeper and coach. As gatekeepers, we set and maintain high standards to assure that our students are qualified to enter the community of professional practice by the time they graduate, and as coaches we do everything in our power to help them meet and surpass those standards. Examinations are at the heart of both functions. By making our tests comprehensive and rigorous we fulfill the gatekeeper role, and by doing our best to prepare our students for them and ensuring that they are fairly graded, we satisfy our mission as coaches. The suggestions given in this paper are intended to help us serve well in both capacities. Clearly, adopting them can take time, but it is hard to imagine an expenditure of time more important to our students, their future employers, and the professions they will serve.

References

Accreditation Board for Engineering and Technology (ABET). (2000). Criteria for accrediting engineering programs: Effective for evaluations during the 2001–2002 accreditation cycle, ABET, Baltimore, MD. Available: <http://www.abet.org/downloads/2001-02_Engineering_Criteria.pdf>.

Cashin, W.E. (1988). "Student ratings of teaching: A summary of the research." IDEA Paper No. 20, IDEA Center, Kansas State University. Available: <http://www.idea.ksu.edu/products/Papers.html>.

Cashin, W.E. (1995). "Student ratings of teaching: The research revisited." IDEA Paper No. 32, IDEA Center, Kansas State University. Available: <http://www.idea.ksu.edu/products/Papers.html>.

Elbow, P. (1986). Embracing contraries: Explorations in learning and teaching, Oxford University Press, New York.

Felder, R.M. (1992). "What do they know, anyway?" Chemical Engineering Education, 26(3), 134–135. Available: <http://www.ncsu.edu/felder-public/Columns/Eval.html>.

Felder, R.M., and Brent, R. (1997). "Objectively speaking." Chemical Engineering Education, 31(3), 178–179. Available: <http://www.ncsu.edu/felder-public/Columns/Objectives.html>.

Stice, J.E. (1976). "A first step toward improved teaching." Engineering Education, 66, 394–398.