How consistently and accurately are we measuring what we intend to measure? What can we do to improve our measurement? And how can we identify measures that are better or worse than others? These questions all have to do with what makes measurement good, and they center on two terms, reliability and validity, which come up many times throughout the course.

Measurement is useless unless it is based on a clearly articulated purpose. This purpose describes the goals of administering a test or survey, including what will be measured, for whom, and why. It addresses how scores from the test are designed to be interpreted.

How would you express the purposes of these tests? When answering this question, be careful to avoid simply saying that the purpose of the test is to measure something. A statement of test purpose should clarify what can be done with the resulting scores. For example, scores from placement tests are used to determine what courses a student should take or to identify students in need of certain instructional resources.

Scores on admissions tests inform the selection of applicants for entrance to a college or university. Some of my work and research is based on a type of standardized placement testing that is used to measure student growth over a short period of time.

To summarize this section, the measurement process allows us to capture information about individuals that can be used to describe their standing on a variety of constructs, from educational ones, like math and vocabulary knowledge, to psychological ones, like sociability and aggression.

We measure these properties by operationalizing our construct, for example, in terms of the number of items answered correctly or the number of times individuals exhibit a certain behavior. These operational variables are then assumed to represent our construct of interest. The rules that guide the measurement process determine the type of measurement scale that is produced and the statistics that can be used with that scale.
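As a minimal sketch of operationalizing a construct as the number of items answered correctly, the example below scores a few students. The student labels, the item responses, and the `number_correct` helper are all hypothetical and are here only for illustration.

```python
# Hypothetical item responses: 1 = answered correctly, 0 = answered incorrectly.
responses = {
    "student_a": [1, 0, 1, 1, 0],
    "student_b": [1, 1, 1, 1, 1],
    "student_c": [0, 0, 1, 0, 0],
}


def number_correct(item_scores):
    """Operationalize the construct as a count of correct answers."""
    return sum(item_scores)


# The operational variable: one number-correct score per student.
scores = {student: number_correct(items) for student, items in responses.items()}
print(scores)
```

The resulting scores are the operational variable. Whether they represent the construct of interest well is a separate question that the scoring rule itself cannot answer.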

The most basic measurement scale, the nominal scale, is really the absence of a scale, because the values used are simple categories or names, rather than quantities of a variable. The nominal scale can also represent variables such as zip code or eye color, where multiple categories are present. Identifying variables such as student last name or school ID are also considered nominal. Only frequencies, proportions, and percentages (and related nonparametric statistics) are permitted with nominal variables.

Means and standard deviations (and related parametric statistics) are not appropriate for nominal variables. It would be meaningless to calculate something like an average gender or eye color, because nominal variables lack any inherent ordering or quantity in their values.
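The statistics permitted with a nominal variable can be sketched as a simple frequency table. The eye color values below are made up, and `frequency_table` is a hypothetical helper rather than part of any particular package.

```python
from collections import Counter

# Hypothetical nominal data: category labels with no order or quantity.
eye_colors = ["brown", "blue", "brown", "green", "brown", "blue"]


def frequency_table(values):
    """Return {category: (frequency, proportion)} for a nominal variable."""
    counts = Counter(values)
    total = len(values)
    return {category: (count, count / total) for category, count in counts.items()}


table = frequency_table(eye_colors)
for category, (count, proportion) in table.items():
    print(f"{category}: n={count}, proportion={proportion:.2f}, percent={proportion * 100:.0f}%")
```

Nothing here adds or averages the category labels themselves. Only the counts of each category are treated as numbers.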

Common examples of ordinal scales include ranks. The distance between the ordered categories in ordinal scale variables is not assumed to be consistent or meaningful. Statistics which rely on interval level information, such as the mean, standard deviation, and all mean-based statistical tests, are still not allowed with an ordinal scale. Statistics permitted with ordinal variables include the median and any statistics based on percentiles.
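A brief sketch of the statistics permitted with an ordinal variable follows. The category labels and ratings are hypothetical, and the integer codes are used only to keep track of order, not as quantities.

```python
import statistics

# Hypothetical ordered categories, listed from lowest to highest.
levels = ["low", "medium", "high", "very high"]
ratings = ["medium", "low", "high", "medium", "very high", "high", "medium"]

# Convert each rating to its position in the ordering.
codes = sorted(levels.index(rating) for rating in ratings)

# median_low always returns an observed value, so it maps back to a category.
median_category = levels[statistics.median_low(codes)]


def percentile_rank(category):
    """Percent of ratings at or below the given category."""
    cutoff = levels.index(category)
    return 100 * sum(code <= cutoff for code in codes) / len(codes)


print(median_category)
print(round(percentile_rank("medium"), 1))
```

Both statistics depend only on the order of the categories, so they would be unchanged if the categories were recoded with any other increasing set of numbers.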

Interval scales include ordered values where the distances, or intervals, between them are meaningful. One common example of an interval scale is a test score based on the number correct, where each item in a test is worth the same amount when calculating the total.
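If number-correct scores are treated as an interval scale, mean-based statistics become available. The sketch below uses hypothetical scores and assumes every item is worth the same amount.

```python
import statistics

# Hypothetical number-correct scores, assuming each item is worth one point.
number_correct = [12, 15, 9, 18, 15, 11]

mean_score = statistics.mean(number_correct)
sd_score = statistics.stdev(number_correct)  # sample standard deviation

# Under the interval assumption, equal score differences have equal meaning.
low_gain = 15 - 12
high_gain = 18 - 15

print(round(mean_score, 2), round(sd_score, 2))
print(low_gain == high_gain)
```

The last comparison is only meaningful under the equal-worth assumption. The arithmetic alone cannot confirm that a three-point gain means the same thing at different points on the scale.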

This can sometimes be problematic, because treating number correct as an interval scale requires that each item really be worth the same amount. Otherwise, scale intervals will not have a consistent meaning. Instead, the meaning of an increase in number correct will depend on which item, such as which vocabulary word, is answered correctly.

Another common example of an interval scale is temperature as measured in degrees centigrade or Fahrenheit.

These temperature scales both have meaningful intervals, where a given increase in heat, for example, produces the same increase in degrees no matter where you are on the scale.
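The consistency of these intervals can be checked with the standard conversion between the two temperature scales. In this short sketch, the function names and starting temperatures are chosen only for illustration.

```python
def celsius_to_fahrenheit(celsius):
    """Convert degrees centigrade (Celsius) to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def fahrenheit_increase(start_celsius, increase_celsius):
    """Fahrenheit change produced by a given centigrade increase."""
    end = celsius_to_fahrenheit(start_celsius + increase_celsius)
    start = celsius_to_fahrenheit(start_celsius)
    return end - start


# The same 10-degree centigrade increase, starting from different points.
for start in (0, 20, 90):
    print(start, fahrenheit_increase(start, 10))
```

Each starting point gives the same 18-degree Fahrenheit increase, which is what it means for the intervals to hold their meaning across the scale.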



