Measurement, Assessment, and Evaluation in Education
Dr. Bob Kizlik

Updated December 9, 2012

Throughout my years of teaching undergraduate courses and, to some extent, graduate courses, I was reminded each semester that many of my students who had taken the requisite course in "educational tests and measurements," or a course with a similar title, as part of their professional preparation often had confused ideas about the fundamental differences among terms such as measurement, assessment, and evaluation as they are used in education. When I asked the question, "What is the difference between assessment and evaluation?" I usually got a lot of blank stares. Yet understanding the differences between measurement, assessment, and evaluation is fundamental to the knowledge base of professional teachers and to effective teaching. Such understanding is also, or at the very least should be, a core component of the curricula that universities and colleges require in the education of future teachers.

In many places on the ADPRIMA website, the phrase "Anything not understood in more than one way is not understood at all" appears after some explanation or body of information. That phrase is, in my opinion, a fundamental idea that should be a cornerstone of all teacher education. Students often struggle to describe or explain what it means to "understand" something that they say they understand. I believe that in courses on educational tests and measurements it is often the case that "understanding" is inferred from responses on multiple-choice tests or from solving statistical problems. A semester later, when questioned about very fundamental ideas in statistics, measurement, assessment, and evaluation, the students in my courses had seemingly forgotten most, if not all, of what they "learned."

Measurement, assessment, and evaluation mean very different things, and yet most of my students were unable to adequately explain the differences. So, in keeping with the ADPRIMA approach to explaining things in as straightforward and meaningful a way as possible, here are what I think are useful descriptions of these three fundamental terms. These are personal opinions, but they have worked for me for many years. They have operational utility, and therefore may also be useful for your purposes.

Measurement refers to the process by which the attributes or dimensions of some physical object are determined. One exception seems to be the use of the word measure in determining the IQ of a person; the phrase "this test measures IQ" is commonly used. The word is also applied to measuring such things as attitudes or preferences. However, when we measure, we generally use some standard instrument to determine how big, tall, heavy, voluminous, hot, cold, fast, or straight something actually is. Standard instruments are physical devices such as rulers, scales, thermometers, and pressure gauges. We measure to obtain information about what is. Such information may or may not be useful, depending on the accuracy of the instruments we use and our skill at using them. There are few such instruments in the social sciences that approach the validity and reliability of, say, a 12" ruler. We measure how big a classroom is in terms of square feet, we measure the temperature of the room by using a thermometer, and we use a multimeter to determine the voltage, amperage, and resistance in a circuit. In all of these examples, we are not assessing anything; we are simply collecting information relative to some established rule or standard. Assessment is therefore quite different from measurement, and has uses that suggest very different purposes. When used in a learning objective, the definition provided on the ADPRIMA site for the behavioral verb measure is: To apply a standard scale or measuring device to an object, series of objects, events, or conditions, according to practices accepted by those who are skilled in the use of the device or scale. An important point in the definition is that the person be skilled in the use of the device or scale. For example, a person who has a working multimeter but does not know how to use it properly could apply it to an electrical circuit, but the obtained results would mean little or nothing in terms of useful information.

Click here for a brief explanation of the different types of measurement scales. The information will give you a little more context for the preceding section.

Assessment is a process by which information is obtained relative to some known objective or goal. Assessment is a broad term that includes testing. A test is a special form of assessment. Tests are assessments made under contrived circumstances especially so that they may be administered. In other words, all tests are assessments, but not all assessments are tests. We test at the end of a lesson or unit. We assess progress at the end of a school year through testing, and we assess verbal and quantitative skills through such instruments as the SAT and GRE. Whether implicit or explicit, assessment is most usefully connected to some goal or objective for which the assessment is designed. A test or assessment yields information relative to an objective or goal. In that sense, we test or assess to determine whether or not an objective or goal has been attained. Assessment of skill attainment is rather straightforward. Either the skill exists at some acceptable level or it doesn’t. Skills are readily demonstrable. Assessment of understanding is much more difficult and complex. Skills can be practiced; understandings cannot. We can assess a person’s knowledge in a variety of ways, but there is always a leap, an inference that we make about what a person does in relation to what it signifies about what he knows. In the section on this site on behavioral verbs, to assess means: To stipulate the conditions by which the behavior specified in an objective may be ascertained. Such stipulations are usually in the form of written descriptions.

Evaluation is perhaps the most complex and least understood of the terms. Inherent in the idea of evaluation is "value." When we evaluate, what we are doing is engaging in some process that is designed to provide information that will help us make a judgment about a given situation. Generally, any evaluation process requires information about the situation in question. A situation is an umbrella term that takes into account such ideas as objectives, goals, standards, procedures, and so on. When we evaluate, we are saying that the process will yield information regarding the worthiness, appropriateness, goodness, validity, legality, etc., of something for which a reliable measurement or assessment has been made. For example, I often point out to my students that if they wanted to determine the temperature of the classroom, they would need to get a thermometer, take several readings at different spots, and perhaps average the readings. That is simple measuring. The average temperature tells us nothing about whether or not it is appropriate for learning. To determine that, students would have to be polled in some reliable and valid way. That polling process is what evaluation is all about. A classroom average temperature of 75 degrees is simply information. It is the context of the temperature for a particular purpose that provides the criteria for evaluation. A temperature of 75 degrees may not be very good for some students, while for others it is ideal for learning. We evaluate every day. Teachers, in particular, are constantly evaluating students, and such evaluations are usually done in the context of comparisons between what was intended (learning, progress, behavior) and what was obtained. When used in a learning objective, the definition provided on the ADPRIMA site for the behavioral verb evaluate is: To classify objects, situations, people, conditions, etc., according to defined criteria of quality. Indication of quality must be given in the defined criteria of each class category. Evaluation differs from general classification only in this respect.
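
The classroom temperature example can be sketched in a few lines of Python. This is only an illustration of the distinction, not a procedure from the ADPRIMA site: the function names, the sample readings, the poll responses, the Fahrenheit unit, and the 50 percent threshold are all assumptions made up for the sketch.

```python
from statistics import mean


def measure_room_temperature(readings_f):
    """Measurement: average several thermometer readings (assumed Fahrenheit)."""
    return mean(readings_f)


def evaluate_temperature(poll_responses, threshold=0.5):
    """Evaluation: judge the room against a criterion of quality.

    poll_responses is a list of booleans, True when a student reports that
    the room is comfortable for learning. The threshold is a hypothetical
    criterion chosen for this sketch.
    """
    share_comfortable = sum(poll_responses) / len(poll_responses)
    verdict = "appropriate" if share_comfortable >= threshold else "not appropriate"
    return share_comfortable, verdict


# Made-up readings taken at three spots in the classroom.
readings = [74.0, 75.0, 76.0]
average = measure_room_temperature(readings)

# Made-up poll: three of four students find the room comfortable.
share, verdict = evaluate_temperature([True, True, False, True])

print(f"average temperature: {average} F")
print(f"{share:.0%} comfortable -> {verdict} for learning")
```

The first function only reports what is; the second one needs a criterion before it can say anything about quality, which is the difference the paragraph above describes.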

To sum up, we measure distance, we assess learning, and we evaluate results in terms of some set of criteria. These three terms certainly share some common attributes, but it is useful to think of them as separate but connected ideas and processes.

Here is a great link that offers different ideas about these three terms, with well-written explanations. Unfortunately, most information on the Internet concerning this topic amounts to little more than advertisements for services.

ASSESSMENT, MEASUREMENT, EVALUATION & RESEARCH

Testing with success series

Multiple choice tests

Multiple choice questions usually include a phrase or stem followed by three to five options.

Test strategies:

Read the directions carefully
Know if each question has one or more correct options
Know if you are penalized for guessing
Know how much time is allowed (this governs your strategy)
Preview the test
Read through the test quickly and answer the easiest questions first
Mark those you think you know in some way that is appropriate
Read through the test a second time and answer more difficult questions
You may pick up cues for answers from the first reading, or become more comfortable in the testing situation
If time allows, review both questions and answers
It is possible you misread questions the first time

Answering options
Improve your odds, think critically:

Cover the options, read the stem, and try to answer
Select the option that most closely matches your answer

Read the stem with each option
Treat each option as a true-false question, and choose the "most true"

Strategies for answering difficult questions:

Eliminate options you know to be incorrect
If allowed, mark words or alternatives in questions that eliminate the option
Give each option of a question the "true-false test:"
This may reduce your selection to the best answer
Question options that grammatically don't fit with the stem
Question options that are totally unfamiliar to you
Question options that contain negative or absolute words.
Try substituting a qualified term for the absolute one.
For example, frequently for always; or typical for every to see if you can eliminate an option
"All of the above:"
If you know two of three options seem correct, "all of the above" is a strong possibility
Number answers:
Toss out the high and low and consider the middle range numbers
"Look alike options:"
Probably one is correct; choose the best but eliminate choices that mean basically the same thing, and thus cancel each other out
Double negatives:
Create the equivalent positive statement
Echo options:
If two options are opposite each other, chances are one of them is correct
Favor options that contain qualifiers
The result is longer, more inclusive items that better fill the role of the answer
If two alternatives seem correct, compare them for differences, then refer to the stem to find your best answer

Guessing:

Always guess when there is no penalty for guessing or you can eliminate options
Don't guess if you are penalized for guessing and if you have no basis for your choice
Use hints from questions you know to answer questions you do not
Change your first answers when you are sure of the correction, or other cues in the test cue you to change
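
The reasoning behind the guessing advice can be checked with a little arithmetic. The sketch below assumes a scoring rule the tips do not spell out: one point for a correct answer and a fixed deduction for a wrong one, with one third of a point used as an example penalty for four-option questions. The function name and the numbers are hypothetical.

```python
from fractions import Fraction


def expected_guess_score(remaining_options, penalty):
    """Expected points from a blind guess among the remaining options.

    Assumed scoring rule: a correct answer earns 1 point and a wrong
    answer costs `penalty` points.
    """
    p_correct = Fraction(1, remaining_options)
    return p_correct * 1 - (1 - p_correct) * penalty


# No penalty: a blind guess among four options gains 1/4 point on average.
no_penalty = expected_guess_score(4, Fraction(0))

# Assumed penalty of 1/3 point: a blind guess among four options breaks even.
penalized_blind = expected_guess_score(4, Fraction(1, 3))

# Same penalty after eliminating two options: guessing now has a positive expectation.
penalized_after_elimination = expected_guess_score(2, Fraction(1, 3))

print(no_penalty, penalized_blind, penalized_after_elimination)
```

Under these assumptions a blind guess never hurts when there is no penalty, gains nothing on average when there is one, and becomes worthwhile once some options have been eliminated.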

Remember that you are looking for the best answer, not only a correct one, and not one which must be true all of the time, in all cases, and without exception

Another resource with a wide variety of information on many related topics is Development Gateway.

Cogito et Adoro Te

Saturday, 16 February 2013
