Tests and measurements in psychology

Psychological tests and measurements | SFU Library

  • What are psychological tests?
  • Access to psychological tests
  • Tests at SFU Library
  • Departmental collections
  • Administering psychological tests
  • Citing psychological tests

If you need help, please contact Yolanda Koscielski, Liaison Librarian for Criminology, Psychology & Philosophy, at 778.782.3315, or Ask a librarian.

Psychological tests (also known as mental measurements, psychological instruments, psychometric tests, inventories, or rating scales) are standardized measures of a particular psychological variable such as personality, intelligence, or emotional functioning. They often consist of a series of questions that subjects rate as true or false, or according to a Likert-type scale (agree, somewhat agree...); however, tests can use written, visual, or verbal methods.

Many tests are commercially published. One well-known commercial test is the Myers-Briggs Type Indicator. Commercial or published tests may need to be purchased from the publisher, and publishers may require proof that users have the professional credentials to administer the test.

In addition to commercial tests, there are countless unpublished tests that researchers design for particular studies in psychology, education, business and other fields.

Note: The SFU Library does not maintain a print or online collection of standardized tests.

Please note that full access (the measurement + scoring key and/or manual) to most clinical Psychological measures is not available to student researchers. Access to clinical tests is often restricted to Registered Psychologists only (those with a PhD in Psychology), to the clinical Psychology graduate students they supervise, and other professionals in health and counselling fields. Restricting access to tests helps ensure the validity of tests, including their persuasiveness when reported upon in a legal context, and reduces false diagnoses and misapplications by non-professionals.

In addition, publisher-imposed copyright and licensing restrictions (e.g., prohibitions on reproducing tests) often further restrict access.

Commercial psychological tests/measures require a fee to access them, and some (particularly in Business) may be prohibitively expensive for students. You will also likely require professional credentials to access them. However, library resources provide helpful descriptive and evaluative information about commercial tests.

Unpublished/non-commercial tests are free to access, but you may require permission from the test creator(s) to use or obtain the test, and access may be restricted, depending on your credentials.

Information about specific commercial and unpublished psychological tests is amply available, including journal articles that discuss the application and scoring of a particular test. In many cases, you may be able to track down an unpublished test or measure itself, but without the scoring key or manual. There is also a selection of tests with scoring keys that are available to general researchers.

It can be helpful to look at tests (even those without a scoring manual), such as those indexed in PsycTESTS, and reviews of commercially available psychological tests, to see how other researchers have measured a construct. This can inform your own research methods.


Search these databases to find:

  1. Descriptive information and reviews of both commercially published and unpublished tests
  2. The full text of an unpublished psychological test or measure (usually without the scoring key, with a few exceptions)



PsycTESTS provides information on over 27,000 psychological tests, measures, and other assessment tools, both commercial and unpublished. In many cases, the full text of the test instrument is provided; however, scoring materials are rarely provided. For non-commercial tests, you may wish to contact the test creator directly to ask whether further information can be provided to you. Contact information is often available via PsycTESTS.


Mental Measurements Yearbook with Tests in Print

Within Mental Measurements Yearbook with Tests in Print, Tests in Print (TIP) "serves as a comprehensive bibliography to all known commercially available tests that are currently in print in the English language".

Mental Measurements Yearbook with Tests in Print offers comprehensive details about commercial psychological tests, for example, the Myers-Briggs Type Indicator and the Strong-Campbell Interest Inventory. The Yearbook also includes information on obtaining a test, as well as insightful reviews about a test, such as its construct validity and reliability.  

  • Tests in print: Descriptive information on all known commercially available tests in English (also in print)
  • Mental measurements yearbook: Descriptive and evaluative information about tests (also in print)


Psychological Test Adaptation and Development

This new open access journal, Psychological Test Adaptation and Development, publishes papers "on adaptations of tests to specific cultural needs, test translations, and the development of existing measures. The journal will focus on the empirical testing of the psychometric quality of these measures". 


Health and Psychosocial Instruments

Health and Psychosocial Instruments includes information on measurement instruments (commercial or unpublished) in the health fields, psychosocial sciences, organizational behavior, and library and information science. It links to journal articles that discuss a particular test.


Free tests in journal articles and books

Tests that have been published within books or journal articles are readily available and may meet your research needs.  Note that many articles and books provide information about tests, but only some of them may include the actual test instruments.

Journal articles

  • PsycINFO: Type "appended" in one of the search boxes and select Tests & Measures from the drop-down menu to the right of the search box. This will narrow your search to articles with tests appended. Use additional search boxes to add keywords.
  • ERIC (EBSCO): Type "tests/questionnaires" in one of the search boxes and select Publication Type from the drop-down menu to the right of the search box to search for tests. Add keywords via additional search boxes.

Open Access

Open access repositories are a growing resource for accessing test measures, for example:

  • Zenodo - e.g., the PsycTEL online community  

You might also want to check:

  • Medline
    Useful MeSH (medical subject headings) include questionnaires, psychological tests, health status rating scales, psychiatric status rating scales, and personality inventory. You can also keyword search, e.g., depression and questionnaire.
    Enter an instrument name in the search box and select IN Instrumentation from the drop-down menu to find articles that used a particular test.
  • ProQuest Dissertations
    Tests may be included as appendices to dissertations
  • Health and psychosocial instruments
    Links to journal articles that discuss a particular test (commercial or unpublished). Select Primary Source for citation to the original source for the instrument.
  • Directory of unpublished experimental mental measures. 8 vols, 1997 (print)
    Check the index to find journal articles that describe tests of a particular variable. The actual test may or may not be included in the article.

For more detailed information on identifying tests on specific subjects see the American Psychological Association's guide:  Testing and Assessment.



Some examples of SFU Library books that include tests

  • Marketing scales handbook: a compilation of multi-item measures for consumer behavior & advertising research. Volume 5, 2009 
    Includes many examples of tests about consumer behaviour
  • Measuring health: A guide to rating scales and questionnaires, 2006 
    Some tests included
  • Handbook of research design and social measurement, 2002  
    Some tests included
  • Handbook of Psychiatric Measures, 2008 [print and CD-ROM]
    Sample items provided for most measures, many actual measures included in CD ROM
  • Communication research measures: a sourcebook, 1994 edition [print], and 2009 edition [print]
    Descriptive summaries of measures, most measures also provided
  • Measures of personality and social psychological attitudes, 1991 [print]
    Contains many tests related to personality, self-esteem, and other social attitudes 
  • Essentials of Psychological Assessment Series, 1999 - [various titles, print and online]
    Search for this title in the Catalogue's Browse Search to view full series. Some tests may be included

Note: There is no straightforward way to identify books in the Library Catalogue that include tests, but a subject search for either Psychological Tests or Psychological Testing is a good start.

The University of British Columbia holds a collection of standardized tests at the Psychoeducational Research and Training Centre (PRTC) within the Faculty of Education. Members of the SFU community can use this collection under some circumstances, but it is advisable to call first at (604) 822-5384.

Many online and print books are available at SFU Library to give you background information on using tests and measures. Below are just a few examples:

  • Sage research methods online (includes over 600 books)
  • An introduction to Psychological Tests and Scales, 2021
  • Measurement models for psychological attributes, 2021 [print]
  • Handbook of Psychological Assessment, 2016
  • Tests: a comprehensive reference for assessments in psychology, education, and business, 2008 (print)  
  • Comprehensive handbook of psychological assessment, 2004 (print, Vols 1-4)
  • Encyclopedia of social measurement, 2005
  • Dictionary of psychological testing, assessment and treatment, 2007
  • The use of psychological testing for treatment planning and outcomes assessment, 2004 (print) 

The APA blog outlines the format for citing a psychological test or measure. The general APA syntax is:

Who (Author) - When (Date) - What (Title) [format note] - Where (Place)

To distinguish whether you are citing the database record for a test or the test itself, write [Database record] or [Measurement instrument] in square brackets after the test's title.

Note that older citations for print tests (pre-internet) can look exactly like the citation for a book. This can be confusing when tracking down citations. If a citation is unclear, you can try searching PsycTESTS or WorldCat for more information.


Owned by: Yolanda Koscielski

Last revised: 2021-11-02

Understanding Psychological Measurement – Research Methods in Psychology – 2nd Canadian Edition

Chapter 5: Psychological Measurement

  1. Define measurement and give several examples of measurement in psychology.
  2. Explain what a psychological construct is and give several examples.
  3. Distinguish conceptual from operational definitions, give examples of each, and create simple operational definitions.
  4. Distinguish the four levels of measurement, give examples of each, and explain why this distinction is important.

Measurement is the assignment of scores to individuals so that the scores represent some characteristic of the individuals. This very general definition is consistent with the kinds of measurement that everyone is familiar with: for example, weighing oneself by stepping onto a bathroom scale, or checking the internal temperature of a roasting turkey by inserting a meat thermometer. It is also consistent with measurement in the other sciences. In physics, for example, one might measure the potential energy of an object in Earth’s gravitational field by finding its mass and height (which of course requires measuring those variables) and then multiplying them together along with the gravitational acceleration of Earth (9.8 m/s²). The result of this procedure is a score that represents the object’s potential energy.
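The physics example can be written out as a short calculation. This is only an illustrative sketch: the function name and the sample mass and height are assumptions, not values from the text.

```python
# Sketch of the potential-energy "measurement" described above.
# The function name and sample inputs are hypothetical.

G_EARTH = 9.8  # gravitational acceleration of Earth, in m/s^2 (from the text)


def potential_energy(mass_kg: float, height_m: float, g: float = G_EARTH) -> float:
    """Return potential energy in joules: mass x gravitational acceleration x height."""
    return mass_kg * g * height_m


# A hypothetical 2 kg object held 5 m above the ground.
print(potential_energy(2.0, 5.0))
```

The two inputs are themselves measurements, and the resulting score represents the characteristic of interest, which is the point the paragraph makes.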

This general definition of measurement is consistent with measurement in psychology too. (Psychological measurement is often referred to as psychometrics.) Imagine, for example, that a cognitive psychologist wants to measure a person’s working memory capacity, that is, his or her ability to hold in mind and think about several pieces of information all at the same time. To do this, she might use a backward digit span task, in which she reads a list of two digits to the person and asks him or her to repeat them in reverse order. She then repeats this several times, increasing the length of the list by one digit each time, until the person makes an error. The length of the longest list for which the person responds correctly is the score and represents his or her working memory capacity. Or imagine a clinical psychologist who is interested in how depressed a person is. He administers the Beck Depression Inventory, which is a 21-item self-report questionnaire in which the person rates the extent to which he or she has felt sad, lost energy, and experienced other symptoms of depression over the past 2 weeks. The sum of these 21 ratings is the score and represents his or her current level of depression.
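Both scoring procedures in this paragraph are simple enough to sketch in code. The function names, the trial format, and the sample responses below are assumptions made for illustration; they are not part of any published scoring manual.

```python
# Hypothetical sketch of the two scoring rules described above.


def backward_digit_span_score(trials):
    """Score a backward digit span task.

    `trials` is a list of (presented_digits, response_digits) pairs in the
    order they were administered. The score is the length of the longest
    list repeated correctly in reverse before the first error.
    """
    score = 0
    for presented, response in trials:
        if list(response) == list(reversed(presented)):
            score = len(presented)
        else:
            break  # testing stops at the first error
    return score


def questionnaire_total(ratings, n_items=21):
    """Sum item ratings for a questionnaire with a fixed number of items."""
    if len(ratings) != n_items:
        raise ValueError(f"expected {n_items} ratings, got {len(ratings)}")
    return sum(ratings)


# Made-up responses: correct for 2 and 3 digits, wrong for 4 digits.
trials = [
    ([3, 8], [8, 3]),
    ([5, 1, 9], [9, 1, 5]),
    ([2, 7, 4, 6], [6, 4, 2, 7]),
]
print(backward_digit_span_score(trials))  # 3
```

In each case the score comes from a systematic procedure rather than from a physical instrument.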

The important point here is that measurement does not require any particular instruments or procedures. It does not require placing individuals or objects on bathroom scales, holding rulers up to them, or inserting thermometers into them. What it does require is some systematic procedure for assigning scores to individuals or objects so that those scores represent the characteristic of interest.

Many variables studied by psychologists are straightforward and simple to measure. These include sex, age, height, weight, and birth order. You can often tell whether someone is male or female just by looking. You can ask people how old they are and be reasonably sure that they know and will tell you. Although people might not know or want to tell you how much they weigh, you can have them step onto a bathroom scale. Other variables studied by psychologists (perhaps the majority) are not so straightforward or simple to measure. We cannot accurately assess people’s level of intelligence by looking at them, and we certainly cannot put their self-esteem on a bathroom scale. These kinds of variables are called constructs (pronounced CON-structs) and include personality traits (e.g., extraversion), emotional states (e.g., fear), attitudes (e.g., toward taxes), and abilities (e.g., athleticism).

Psychological constructs cannot be observed directly. One reason is that they often represent tendencies to think, feel, or act in certain ways. For example, to say that a particular university student is highly extraverted does not necessarily mean that she is behaving in an extraverted way right now. In fact, she might be sitting quietly by herself, reading a book. Instead, it means that she has a general tendency to behave in extraverted ways (talking, laughing, etc.) across a variety of situations. Another reason psychological constructs cannot be observed directly is that they often involve internal processes. Fear, for example, involves the activation of certain central and peripheral nervous system structures, along with certain kinds of thoughts, feelings, and behaviours—none of which is necessarily obvious to an outside observer. Notice also that neither extraversion nor fear “reduces to” any particular thought, feeling, act, or physiological structure or process. Instead, each is a kind of summary of a complex set of behaviours and internal processes.

The Big Five is a set of five broad dimensions that capture much of the variation in human personality. Each of the Big Five can even be defined in terms of six more specific constructs called “facets” (Costa & McCrae, 1992)[1].

Table 5.1 The Big Five Personality Dimensions.
Dimension | Facets
Openness to experience | Fantasy, Aesthetics, Feelings, Actions, Ideas, Values
Conscientiousness | Competence, Order, Dutifulness, Achievement striving, Self-discipline, Deliberation
Extraversion | Warmth, Gregariousness, Assertiveness, Activity, Excitement seeking, Positive emotions
Agreeableness | Trust, Straightforwardness, Altruism, Compliance, Modesty, Tender-mindedness
Neuroticism | Worry, Anger, Discouragement, Self-consciousness, Impulsivity, Vulnerability

The conceptual definition of a psychological construct describes the behaviours and internal processes that make up that construct, along with how it relates to other variables. For example, a conceptual definition of neuroticism (another one of the Big Five) would be that it is people’s tendency to experience negative emotions such as anxiety, anger, and sadness across a variety of situations. This definition might also include that it has a strong genetic component, remains fairly stable over time, and is positively correlated with the tendency to experience pain and other physical symptoms.

Students sometimes wonder why, when researchers want to understand a construct like self-esteem or neuroticism, they do not simply look it up in the dictionary. One reason is that many scientific constructs do not have counterparts in everyday language (e.g., working memory capacity). More important, researchers are in the business of developing definitions that are more detailed and precise—and that more accurately describe the way the world is—than the informal definitions in the dictionary. As we will see, they do this by proposing conceptual definitions, testing them empirically, and revising them as necessary. Sometimes they throw them out altogether. This is why the research literature often includes different conceptual definitions of the same construct. In some cases, an older conceptual definition has been replaced by a newer one that fits and works better. In others, researchers are still in the process of deciding which of various conceptual definitions is the best.

An operational definition is a definition of a variable in terms of precisely how it is to be measured. These measures generally fall into one of three broad categories. Self-report measures are those in which participants report on their own thoughts, feelings, and actions, as with the Rosenberg Self-Esteem Scale. Behavioural measures are those in which some other aspect of participants’ behaviour is observed and recorded. This is an extremely broad category that includes the observation of people’s behaviour both in highly structured laboratory tasks and in more natural settings. A good example of the former would be measuring working memory capacity using the backward digit span task. A good example of the latter is a famous operational definition of physical aggression from researcher Albert Bandura and his colleagues (Bandura, Ross, & Ross, 1961)[2]. They let each of several children play for 20 minutes in a room that contained a clown-shaped punching bag called a Bobo doll. They filmed each child and counted the number of acts of physical aggression he or she committed. These included hitting the doll with a mallet, punching it, and kicking it. Their operational definition, then, was the number of these specifically defined acts that the child committed during the 20-minute period. Finally, physiological measures are those that involve recording any of a wide variety of physiological processes, including heart rate and blood pressure, galvanic skin response, hormone levels, and electrical activity and blood flow in the brain.

For any given variable or construct, there will be multiple operational definitions. Stress is a good example. A rough conceptual definition is that stress is an adaptive response to a perceived danger or threat that involves physiological, cognitive, affective, and behavioural components. But researchers have operationally defined it in several ways. The Social Readjustment Rating Scale is a self-report questionnaire on which people identify stressful events that they have experienced in the past year and that assigns points for each one depending on its severity. For example, a man who has been divorced (73 points), changed jobs (36 points), and had a change in sleeping habits (16 points) in the past year would have a total score of 125. The Daily Hassles and Uplifts Scale is similar but focuses on everyday stressors like misplacing things and being concerned about one’s weight. The Perceived Stress Scale is another self-report measure that focuses on people’s feelings of stress (e.g., “How often have you felt nervous and stressed?”). Researchers have also operationally defined stress in terms of several physiological variables including blood pressure and levels of the stress hormone cortisol.
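The Social Readjustment Rating Scale example amounts to a lookup and a sum. The sketch below uses only the three events and point values quoted in the text; the dictionary keys and the function name are hypothetical, and the full scale contains many more events than are listed here.

```python
# Partial point table: only the three events quoted in the text.
# Key names and the function name are illustrative assumptions.
LIFE_EVENT_POINTS = {
    "divorce": 73,
    "changed jobs": 36,
    "change in sleeping habits": 16,
}


def srrs_total(events_past_year):
    """Add up the points for each stressful event a person reports."""
    return sum(LIFE_EVENT_POINTS[event] for event in events_past_year)


reported = ["divorce", "changed jobs", "change in sleeping habits"]
print(srrs_total(reported))  # 125
```

The result matches the total of 125 given in the paragraph.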

When psychologists use multiple operational definitions of the same construct, either within a study or across studies, they are using converging operations. The idea is that the various operational definitions are “converging” or coming together on the same construct. When scores based on several different operational definitions are closely related to each other and produce similar patterns of results, this constitutes good evidence that the construct is being measured effectively and that it is useful. The various measures of stress, for example, are all correlated with each other and have all been shown to be correlated with other variables such as immune system functioning (also measured in a variety of ways) (Segerstrom & Miller, 2004)[3]. This is what allows researchers eventually to draw useful general conclusions, such as “stress is negatively correlated with immune system functioning,” as opposed to more specific and less useful ones, such as “people’s scores on the Perceived Stress Scale are negatively correlated with their white blood cell counts.”
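One way to see what "closely related to each other" means is to compute a correlation between scores from two operational definitions. The sketch below uses the Pearson correlation coefficient, which is one common choice and not necessarily the statistic used in the cited research. All scores are invented for illustration.

```python
import math

# Invented scores for five hypothetical participants on two different
# operational definitions of stress. These are NOT real data.
perceived_stress = [10, 14, 18, 22, 30]        # self-report questionnaire totals
cortisol_level = [9.0, 11.5, 12.0, 15.5, 19.0]  # physiological measure


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 scores)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


r = pearson_r(perceived_stress, cortisol_level)
print(f"r = {r:.2f}")
```

A strong positive value here would count as evidence that the two measures converge on the same construct.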

The psychologist S. S. Stevens suggested that scores can be assigned to individuals in a way that communicates more or less quantitative information about the variable of interest (Stevens, 1946)[4]. For example, the officials at a 100-m race could simply rank order the runners as they crossed the finish line (first, second, etc.), or they could time each runner to the nearest tenth of a second using a stopwatch (11.5 s, 12.1 s, etc.). In either case, they would be measuring the runners’ times by systematically assigning scores to represent those times. But while the rank ordering procedure communicates the fact that the second-place runner took longer to finish than the first-place finisher, the stopwatch procedure also communicates how much longer the second-place finisher took. Stevens actually suggested four different levels of measurement (which he called “scales of measurement”) that correspond to four different levels of quantitative information that can be communicated by a set of scores.

The nominal level of measurement is used for categorical variables and involves assigning scores that are category labels. Category labels communicate whether any two individuals are the same or different in terms of the variable being measured. For example, if you look at your research participants as they enter the room, decide whether each one is male or female, and type this information into a spreadsheet, you are engaged in nominal-level measurement. Or if you ask your participants to indicate which of several ethnicities they identify themselves with, you are again engaged in nominal-level measurement. The essential point about nominal scales is that they do not imply any ordering among the responses. For example, when classifying people according to their favourite colour, there is no sense in which green is placed “ahead of” blue. Responses are merely categorized. Nominal scales thus embody the lowest level of measurement[5].

The remaining three levels of measurement are used for quantitative variables. The ordinal level of measurement involves assigning scores so that they represent the rank order of the individuals. Ranks communicate not only whether any two individuals are the same or different in terms of the variable being measured but also whether one individual is higher or lower on that variable. For example, a researcher wishing to measure consumers’ satisfaction with their microwave ovens might ask them to specify their feelings as either “very dissatisfied,” “somewhat dissatisfied,” “somewhat satisfied,” or “very satisfied.” The items in this scale are ordered, ranging from least to most satisfied. This is what distinguishes ordinal from nominal scales. Unlike nominal scales, ordinal scales allow comparisons of the degree to which two individuals rate the variable. For example, our satisfaction ordering makes it meaningful to assert that one person is more satisfied than another with their microwave oven. Such an assertion reflects the first person’s use of a verbal label that comes later in the list than the label chosen by the second person.

On the other hand, ordinal scales fail to capture important information that will be present in the other levels of measurement we examine. In particular, the difference between two levels of an ordinal scale cannot be assumed to be the same as the difference between two other levels (just like you cannot assume that the gap between the runners in first and second place is equal to the gap between the runners in second and third place). In our satisfaction scale, for example, the difference between the responses “very dissatisfied” and “somewhat dissatisfied” is probably not equivalent to the difference between “somewhat dissatisfied” and “somewhat satisfied.” Nothing in our measurement procedure allows us to determine whether the two differences reflect the same difference in psychological satisfaction. Statisticians express this point by saying that the differences between adjacent scale values do not necessarily represent equal intervals on the underlying scale giving rise to the measurements. (In our case, the underlying scale is the true feeling of satisfaction, which we are trying to measure.)

The interval level of measurement involves assigning scores using numerical scales in which intervals have the same interpretation throughout. As an example, consider either the Fahrenheit or Celsius temperature scales. The difference between 30 degrees and 40 degrees represents the same temperature difference as the difference between 80 degrees and 90 degrees. This is because each 10-degree interval has the same physical meaning (in terms of the kinetic energy of molecules).

Interval scales are not perfect, however. In particular, they do not have a true zero point even if one of the scaled values happens to carry the name “zero.” The Fahrenheit scale illustrates the issue. Zero degrees Fahrenheit does not represent the complete absence of temperature (the absence of any molecular kinetic energy). In reality, the label “zero” is applied to its temperature for quite accidental reasons connected to the history of temperature measurement. Since an interval scale has no true zero point, it does not make sense to compute ratios of temperatures. For example, there is no sense in which the ratio of 40 to 20 degrees Fahrenheit is the same as the ratio of 100 to 50 degrees; no interesting physical property is preserved across the two ratios. After all, if the “zero” label were applied at the temperature that Fahrenheit happens to label as 10 degrees, the two ratios would instead be 30 to 10 and 90 to 40, no longer the same! For this reason, it does not make sense to say that 80 degrees is “twice as hot” as 40 degrees. Such a claim would depend on an arbitrary decision about where to “start” the temperature scale, namely, what temperature to call zero (whereas the claim is intended to make a more fundamental assertion about the underlying physical reality). In psychology, the intelligence quotient (IQ) is often considered to be measured at the interval level.
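The arithmetic behind this argument can be checked directly. In the sketch below, the helper name and its `zero_at` parameter are assumptions introduced for illustration; the temperatures are the ones used in the paragraph.

```python
def ratio_after_shift(a, b, zero_at=0.0):
    """Ratio of two scale values after moving the 'zero' label to `zero_at`."""
    return (a - zero_at) / (b - zero_at)


# With Fahrenheit's own zero, the two ratios look identical.
print(ratio_after_shift(40, 20))    # 2.0
print(ratio_after_shift(100, 50))   # 2.0

# Relabel the temperature Fahrenheit calls 10 degrees as "zero".
print(ratio_after_shift(40, 20, zero_at=10))    # 3.0  (30 to 10)
print(ratio_after_shift(100, 50, zero_at=10))   # 2.25 (90 to 40)

# Differences, by contrast, are unaffected by where zero is placed.
print((40 - 10) - (20 - 10) == 40 - 20)  # True
```

Ratios change when the zero point moves, while differences do not, which is why interval scales support statements about differences but not about ratios.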

Finally, the ratio level of measurement involves assigning scores in such a way that there is a true zero point that represents the complete absence of the quantity. Height measured in metres and weight measured in kilograms are good examples. So are counts of discrete objects or events such as the number of siblings one has or the number of questions a student answers correctly on an exam. You can think of a ratio scale as the three earlier scales rolled up in one. Like a nominal scale, it provides a name or category for each object (the numbers serve as labels). Like an ordinal scale, the objects are ordered (in terms of the ordering of the numbers). Like an interval scale, the same difference at two places on the scale has the same meaning. However, in addition, the same ratio at two places on the scale also carries the same meaning (see Table 5.2).

The Fahrenheit scale for temperature has an arbitrary zero point and is therefore not a ratio scale. However, zero on the Kelvin scale is absolute zero. This makes the Kelvin scale a ratio scale. For example, if one temperature is twice as high as another as measured on the Kelvin scale, then it has twice the kinetic energy of the other temperature.
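To illustrate, the same two temperatures can be compared on both scales. The conversion formula is the standard one; the function name and the chosen temperatures (80 and 40 degrees Fahrenheit) are just for this sketch.

```python
def fahrenheit_to_kelvin(deg_f):
    """Convert a Fahrenheit temperature to kelvins."""
    return (deg_f - 32.0) * 5.0 / 9.0 + 273.15


hot_f, cool_f = 80.0, 40.0

# On the Fahrenheit scale the ratio is 2, but this depends on an arbitrary zero.
fahrenheit_ratio = hot_f / cool_f

# On the Kelvin scale, zero is absolute zero, so the ratio is meaningful.
kelvin_ratio = fahrenheit_to_kelvin(hot_f) / fahrenheit_to_kelvin(cool_f)

print(f"Fahrenheit ratio: {fahrenheit_ratio:.2f}")
print(f"Kelvin ratio:     {kelvin_ratio:.2f}")
```

The Kelvin ratio comes out only slightly above 1, so 80 degrees Fahrenheit is nowhere near "twice as hot" as 40 degrees Fahrenheit in any physically meaningful sense.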

Another example of a ratio scale is the amount of money you have in your pocket right now (25 cents, 50 cents, etc.). Money is measured on a ratio scale because, in addition to having the properties of an interval scale, it has a true zero point: if you have zero money, this actually implies the absence of money. Since money has a true zero point, it makes sense to say that someone with 50 cents has twice as much money as someone with 25 cents.

Stevens’s levels of measurement are important for at least two reasons. First, they emphasize the generality of the concept of measurement. Although people do not normally think of categorizing or ranking individuals as measurement, in fact they are as long as they are done so that they represent some characteristic of the individuals. Second, the levels of measurement can serve as a rough guide to the statistical procedures that can be used with the data and the conclusions that can be drawn from them. With nominal-level measurement, for example, the only available measure of central tendency is the mode. Also, ratio-level measurement is the only level that allows meaningful statements about ratios of scores. One cannot say that someone with an IQ of 140 is twice as intelligent as someone with an IQ of 70 because IQ is measured at the interval level, but one can say that someone with six siblings has twice as many as someone with three because number of siblings is measured at the ratio level.
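The "rough guide" idea can be sketched as a lookup from level of measurement to measures of central tendency. The text only states that the mode is the sole option at the nominal level; allowing the median from the ordinal level upward and the mean from the interval level upward is the usual textbook convention and is an assumption here, as are all names and sample data.

```python
import statistics

# Conventional mapping (an assumption beyond what the text states for
# the ordinal and interval levels).
PERMISSIBLE = {
    "nominal": ("mode",),
    "ordinal": ("mode", "median"),
    "interval": ("mode", "median", "mean"),
    "ratio": ("mode", "median", "mean"),
}

_FUNCTIONS = {
    "mode": statistics.mode,
    "median": statistics.median_low,  # always returns an actual data value
    "mean": statistics.mean,
}


def summarize(scores, level):
    """Compute only the central-tendency measures suited to the given level."""
    return {name: _FUNCTIONS[name](scores) for name in PERMISSIBLE[level]}


# Made-up data for illustration.
handedness = ["right", "right", "left", "right"]   # nominal labels
satisfaction = [1, 2, 2, 3, 4]                     # ordinal codes, 1 = very dissatisfied
siblings = [0, 1, 1, 2, 6]                         # ratio-level counts

print(summarize(handedness, "nominal"))
print(summarize(satisfaction, "ordinal"))
print(summarize(siblings, "ratio"))
```

Only at the ratio level would it also make sense to say that the participant with six siblings has twice as many as one with three.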

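Table 5.2 Summary of Levels of Measurement
Level of measurement | Category labels | Rank order | Equal intervals | True zero
Nominal | Yes | No | No | No
Ordinal | Yes | Yes | No | No
Interval | Yes | Yes | Yes | No
Ratio | Yes | Yes | Yes | Yes
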
  • Measurement is the assignment of scores to individuals so that the scores represent some characteristic of the individuals. Psychological measurement can be achieved in a wide variety of ways, including self-report, behavioural, and physiological measures.
  • Psychological constructs such as intelligence, self-esteem, and depression are variables that are not directly observable because they represent behavioural tendencies or complex patterns of behaviour and internal processes. An important goal of scientific research is to conceptually define psychological constructs in ways that accurately describe them.
  • For any conceptual definition of a construct, there will be many different operational definitions or ways of measuring it. The use of multiple operational definitions, or converging operations, is a common strategy in psychological research.
  • Variables can be measured at four different levels—nominal, ordinal, interval, and ratio—that communicate increasing amounts of quantitative information. The level of measurement affects the kinds of statistics you can use and conclusions you can draw from your data.
  1. Practice: Complete the Rosenberg Self-Esteem Scale and compute your overall score.
  2. Practice: Think of three operational definitions for sexual jealousy, decisiveness, and social anxiety. Consider the possibility of self-report, behavioural, and physiological measures. Be as precise as you can.
  3. Practice: For each of the following variables, decide which level of measurement is being used.
    • A university instructor measures the time it takes her students to finish an exam by looking through the stack of exams at the end. She assigns the one on the bottom a score of 1, the one on top of that a 2, and so on.
    • A researcher accesses her participants’ medical records and counts the number of times they have seen a doctor in the past year.
    • Participants in a research study are asked whether they are right-handed or left-handed.

  1. Costa, P. T., Jr., & McCrae, R. R. (1992). Normal personality assessment in clinical practice: The NEO Personality Inventory. Psychological Assessment, 4, 5–13.
  2. Bandura, A., Ross, D., & Ross, S. A. (1961). Transmission of aggression through imitation of aggressive models. Journal of Abnormal and Social Psychology, 63, 575–582.
  3. Segerstrom, S. E., & Miller, G. E. (2004). Psychological stress and the human immune system: A meta-analytic study of 30 years of inquiry. Psychological Bulletin, 130, 601–630.
  4. Stevens, S. S. (1946). On the theory of scales of measurement. Science, 103, 677–680.
  5. Levels of Measurement. Retrieved from http://wikieducator.org/Introduction_to_Research_Methods_In_Psychology/Theories_and_Measurement/Levels_of_Measurement

5.3. Testing and measurement theory

One kind of procedure for measuring the properties of objects is the psychological test (for details, see Topic 6).

From a theoretical point of view, testing consists of two main components: the testing itself, that is, the interaction of the subject with the test, and interpretation, that is, relating the subject's data (indicators) to a set of data.

Depending on which properties the researcher deals with, and on whether the indicators are considered over a set of subjects (determined by the nature of the property) or over a set of indicators (defined by the description of behavior and tasks), different test models are obtained. If the property is not defined, the relation of differences over a set of people is considered. This relation gives rise to a new class of objects. Such a test reveals a measure of each person's similarity to a "standard person".

If the property is qualitatively defined, it is treated as a point, which makes it possible to restrict the class of objects: to separate the people who have the property from the people who do not. In this case the test allows a dichotomous classification.

If the property is linear or multidimensional, it is possible to determine the value of the property that characterizes each person. The test then allows the property to be quantified.

The cumulative-additive test model was proposed by the German psychologist K. Lewin, who understood behavior as a function of personality and situation. The test solves the problem of reconstructing a personality property from behavior in a situation. The situation is the test item, and the behavior is the subject's response. Thus, each indicator of the property is a combination of behavior and situation, and the personality property is derived from a set of indicators. The procedure for detecting the property, to which test measurement reduces, ends with the calculation of a total score. This raw score is taken as the assessment that characterizes the subject.

The cumulative hypothesis is checked by correlating the results obtained with different techniques. If the results show a high positive linear correlation coefficient, the cumulative-additive model is accepted for processing the data of a personality questionnaire.
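
The two steps described here, summing item indicators into a raw score and then checking the result against another technique by linear correlation, can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the scoring key, the answers, and the second set of scores are invented, and `raw_score` and `pearson_r` are hypothetical helper names rather than anything prescribed by the model.

```python
from math import sqrt


def raw_score(responses, key):
    """Cumulative-additive scoring: one point per response matching the key."""
    return sum(1 for r, k in zip(responses, key) if r == k)


def pearson_r(x, y):
    """Pearson linear correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Invented data: a 5-item key and the answers of four subjects.
key = [1, 0, 1, 1, 0]
answers = [
    [1, 0, 1, 1, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0],
]
scores_a = [raw_score(a, key) for a in answers]  # technique A: questionnaire
scores_b = [9, 6, 1, 8]                           # technique B: invented scores

r = pearson_r(scores_a, scores_b)
print(scores_a, round(r, 2))
```

How high the coefficient must be before the model is accepted is not specified in the text, so the sketch only computes it and leaves the threshold to the researcher.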

The probabilistic test model. A critical assessment of the cumulative-additive model was given by the Swiss psychologist R. Meili. He believed that tests measure only the probability that the subject has one psychological property or another, not its intensity. [62] According to V. N. Druzhinin, R. Meili's criticism is purely qualitative in character and has no mathematical or empirical justification. [63] From the point of view of the generalized model, the main requirement for a test is that the measurement and interpretation procedures be identical.

Topic 6. Psychological testing

6.1. General characteristics of psychological testing

Psychological testing is a method of measuring and evaluating the psychological characteristics of a person using special techniques. The subject of testing can be any psychological characteristic of a person: mental processes, states, properties, relationships, and so on. Psychological testing is based on the psychological test, a standardized system of trials designed to detect and measure qualitative and quantitative individual psychological differences.

Initially, testing was regarded as a kind of experiment. By now, however, the specific nature and independent significance of testing in psychology make it possible to distinguish it from experiment proper.

The theory and practice of testing are generalized in independent scientific disciplines: psychological diagnostics (psychodiagnostics) and testology. Psychological diagnostics is the science of identifying and measuring the individual psychological and individual psychophysiological characteristics of a person. Thus, psychodiagnostics is an experimental psychological branch of differential psychology. Testology is the science of developing and designing tests.

The testing process usually includes three stages:

1) choosing a technique adequate to the goals and objectives of testing;

2) the testing itself, i.e. collecting data in accordance with the instructions;

3) comparing the data obtained with the "norm" or with one another and making an assessment.

Because there are two ways of making an assessment from test results, there are two types of psychological diagnosis. The first type consists in establishing the presence or absence of some feature. In this case, the data obtained on the individual characteristics of the test taker's psyche are related to a given criterion. The second type of diagnosis makes it possible to compare several test takers with one another and to find each one's place on a certain "axis", depending on the degree to which particular qualities are expressed. This is done by ranking all those examined according to how strongly the indicator under study is represented, and by introducing high, medium, low, and other levels of the studied feature for the given sample.
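
The two types of diagnosis can be contrasted in a short Python sketch: one function compares a score with a fixed criterion, the other ranks the whole sample and labels its thirds. The cutoff, the scores, the function names, and the split into equal thirds are all assumptions made for illustration; the text does not say how the levels should be delimited.

```python
def criterion_diagnosis(score, cutoff):
    """First type: state whether the feature is present, given a criterion."""
    return score >= cutoff


def normative_levels(scores):
    """Second type: rank test takers and label thirds of the sample."""
    ranked = sorted(scores, key=scores.get)  # lowest score first
    n = len(ranked)
    levels = {}
    for position, name in enumerate(ranked):
        if position < n / 3:
            levels[name] = "low"
        elif position < 2 * n / 3:
            levels[name] = "medium"
        else:
            levels[name] = "high"
    return levels


# Invented raw scores and cutoff, for illustration only.
scores = {"A": 12, "B": 25, "C": 18, "D": 30, "E": 7, "F": 21}
CUTOFF = 20

present = {name: criterion_diagnosis(s, CUTOFF) for name, s in scores.items()}
levels = normative_levels(scores)
print(present)
print(levels)
```

Note that the second function gives a result that depends on who else is in the sample, while the first depends only on the individual score and the criterion.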

Strictly speaking, a psychological diagnosis is not only the result of comparing empirical data with a test scale or with one another, but also the result of a qualified interpretation that takes into account many accompanying factors (the mental state of the test taker, his or her readiness to understand the tasks and report on their performance, the testing situation, and so on).

Psychological tests show especially clearly the connection between the research method and the methodological views of the psychologist. For example, depending on the preferred theory of personality, the researcher chooses the type of personality questionnaire.

The use of tests is an integral feature of modern psychodiagnostics. Several areas in which the results of psychodiagnostics are put to practical use can be distinguished: education and upbringing; professional selection and career guidance; counselling and psychotherapeutic practice; and, finally, expert assessment (medical, forensic, and so on).

Measurement in psychology. Measurement scales.

Author of the article: Popov Oleg Aleksandrovich.

In psychology one quite often has to deal with measurement. In essence, any psychological test is a measurement tool whose result is, most often, numerical data.

Measurement is an operation for determining the relationship of one object to another. Measurement is carried out by assigning values to objects so that the relationships between the values reflect the relationships between the objects. For example, we measure the height of two people (the object of measurement is height). Having obtained values of 170 and 185 cm, we can say with certainty that one person is taller than the other. This conclusion was made possible by measuring height. Thus, the relationship between the objects was conveyed through numbers.

In psychology we can see phenomena similar to the previous example. We use intelligence tests to obtain a numerical IQ value and compare it with a normative value; we use personality tests to describe a person's psychological characteristics on the basis of the numbers obtained; we use achievement tests to find out how well learning material has been mastered. Measurement also includes counting the number of particular acts of behavior while observing subjects, calculating the shaded area in projective drawings, and counting the number of errors in a proofreading test.

In the height example, the object of measurement was not the person but his or her height. In studying the human psyche we likewise measure not the person but particular psychological characteristics: personality traits, intelligence, individual characteristics of the cognitive sphere, and so on. Everything we measure is called a variable.

A variable is a property that can change its value. Height is a property of all people, but it differs from person to person and is therefore a variable. Sex is also a variable, but it can take only two values. All test indicators in psychology are variables.

At first glance, the results of some psychological tests are very hard to see as the result of a measurement, and it is hard to understand which properties (variables) these tests measure. A vivid example is projective tests, especially drawing and verbal ones. Behind each element of a drawing lies some psychological feature (variable), and when we judge from that element whether the variable is expressed or not, we perform an act of measurement. Thus, despite the large number of variables measured with projective drawings, measurement most often comes down to simply stating that "the variable is expressed / not expressed"; less often there are three or more gradations. The situation is much simpler with tests in which something has to be put in order, because their result is a number that reflects an ordinal position. Even more obvious are the results of questionnaires and of tests of intelligence and cognitive abilities.

Thus, a test, as a measurement tool, imposes its own limitations on the result obtained. This limitation is called the measurement scale.

A measurement scale is a restriction on the type of relationships between the values of variables that is imposed on the measurement results. Most often, the measurement scale depends on the measurement tool.

For example, if the variable is eye color, we cannot say that one person has more or less of this variable than another, nor can we compute an arithmetic mean of colors. If the variable is the order (precisely the order) in which children were born in a family, we can say that the first child is definitely older than the second (a "greater/less" relationship), but we cannot say by how much. With the results of an intelligence test, we can say with certainty by how much one person is smarter than another.

S. Stevens considered four scales of measurement.

1. Nominal scale (scale of names) - the simplest of the measurement scales. Numbers (as well as letters, words, or any symbols) are used to distinguish objects. The scale represents the relationships by which objects are grouped into separate, non-overlapping classes. The number (letter, name) of a class does not reflect its quantitative content. Examples of such a scale are the classification of subjects into men and women, the numbering of players on sports teams, telephone numbers, passport numbers, and product barcodes. None of these variables reflects a greater/less relationship, and therefore all of them are on a nominal scale.

A special subtype of the nominal scale is the dichotomous scale, which is coded with two mutually exclusive values (1/0). A person's sex is a typical dichotomous variable.

On a nominal scale it is impossible to say that one object is greater or less than another, by how many units they differ, or by how many times. Only the operation of classification is possible: different / not different.

In psychology it is sometimes impossible to avoid the nominal scale, especially when analyzing drawings. For example, when drawing a house, children often draw the sun at the top of the sheet. One may assume that the position of the sun (on the left, in the middle, on the right) or the absence of the sun altogether says something about the child's psychological qualities. The listed positions of the sun are values of a nominal-scale variable. We can denote the positions by numbers or letters, or leave them as words, but however we name them, we cannot say that one child is "greater" than another because the sun is drawn on the left rather than in the middle. What we can say for certain is that the child who drew the sun on the right is not the same as (does not belong to the same group as) the child who drew the sun on the left.

Thus, the nominal scale reflects relationships of the type: similar / not similar, the same / not the same, belongs to the group / does not belong to the group.
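
The sun-position example can be sketched in Python to show what a nominal variable does and does not support. The data and the numeric codes below are invented; the point of the sketch is that counting category members and testing for equality survive any relabelling, while the numeric size of a code carries no meaning.

```python
from collections import Counter

# Invented codes for where each child drew the sun.
positions = ["left", "right", "middle", "left", "absent", "left", "right"]

counts = Counter(positions)  # frequency of each category
most_common_label, _ = counts.most_common(1)[0]

# Recoding the labels as arbitrary numbers changes nothing of substance.
codes = {"left": 7, "middle": 3, "right": 42, "absent": 0}
recoded_counts = Counter(codes[p] for p in positions)

# The only meaningful comparison between two children: same group or not.
same_group = positions[0] == positions[3]
print(counts, most_common_label, same_group)
```

The code 42 for "right" is larger than the code 7 for "left", but that ordering is an accident of the coding and says nothing about the children.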

2. Ordinal (rank) scale - represents relationships of order. The only possible relationships between the objects of measurement on this scale are greater/less and better/worse.

The most typical variable on this scale is the place taken by an athlete in a competition. The winners of a competition receive first, second and third place, and we know for certain that the athlete in first place has a better result than the athlete in second place. In addition to the place, we can also find out the athletes' specific results.

In psychology, less definite situations arise. For example, a person is asked to order colors by preference, from the most pleasant to the most unpleasant. In this case we can say with certainty that one color is more pleasant than another, but we cannot even guess at the units of measurement, because the person ranked the colors not on the basis of any unit of measurement but on the basis of his or her own feelings. The same happens in the Rokeach test, from whose results we also do not know by how many units one value is higher (greater) than another. That is, unlike in competitions, we do not even have the opportunity to find out the exact differences in scores.

Having made a measurement on an ordinal scale, we cannot know by how many units the objects differ, still less by how many times they differ.
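
The competition example can be made concrete with a small Python sketch that derives places from finishing times. The names and times are invented; the sketch shows that adjacent places always differ by one rank even when the underlying results differ by very unequal amounts, which is exactly the information an ordinal scale discards.

```python
# Invented finishing times in seconds.
times = {"Ann": 58.2, "Bea": 58.3, "Cy": 64.9}

# Places are an ordinal variable derived from the times (1 = fastest).
ordered = sorted(times, key=times.get)
places = {name: rank for rank, name in enumerate(ordered, start=1)}

# Adjacent places are always one rank apart, whatever the real gaps were.
gap_1_2 = round(times[ordered[1]] - times[ordered[0]], 1)
gap_2_3 = round(times[ordered[2]] - times[ordered[1]], 1)
print(places, gap_1_2, gap_2_3)
```

With a ranking of color preferences there is no counterpart to the `times` dictionary at all, so the gaps cannot be recovered even in principle.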

3. Interval scale - in addition to the relationships specified for the nominal and ordinal scales, it represents the distances (differences) between objects. The differences between adjacent points on this scale are equal. Most psychological tests contain norms that are an example of an interval scale. The intelligence quotient, results on the FPI test, and degrees Celsius are all interval scales. The zero in them is conventional: for IQ and the FPI, zero is the lowest possible test score (obviously, even random answers on an intelligence test will yield some score other than zero). If we did not create a conventional zero on the scale but used a real zero as the reference point, we would obtain a ratio scale; but we know that intelligence cannot be zero.

A non-psychological example of an interval scale is the Celsius scale. The zero here is conventional (the freezing point of water), and there is a unit of measurement, the degree Celsius. Yet we know that an absolute zero of temperature exists: the lowest temperature a physical body can have, which on the Celsius scale is -273.15 degrees. Thus, a conventional zero and equal intervals between units of measurement are the main features of the interval scale.

Having measured a phenomenon on an interval scale, we can say that one object is greater or less than another by a certain number of units.

4. Ratio scale. Unlike the interval scale, it can reflect how many times one indicator is greater than another. The ratio scale has a zero point that characterizes the complete absence of the measured quality.
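
The difference between a conventional zero and a true zero can be checked numerically with the temperature example. The Python sketch below uses the value of absolute zero given above (-273.15 degrees Celsius); the two sample temperatures and the helper name `to_kelvin` are assumptions chosen for illustration.

```python
ABSOLUTE_ZERO_C = -273.15  # absolute zero expressed in degrees Celsius


def to_kelvin(celsius):
    """Shift the conventional Celsius zero to the true zero of temperature."""
    return celsius - ABSOLUTE_ZERO_C


a_c, b_c = 10.0, 20.0  # two sample temperatures in degrees Celsius

difference_c = b_c - a_c                        # differences are meaningful
difference_k = to_kelvin(b_c) - to_kelvin(a_c)  # and survive the shift of zero

naive_ratio = b_c / a_c                         # 2.0, an artifact of the zero
true_ratio = to_kelvin(b_c) / to_kelvin(a_c)    # ratio measured from true zero

print(difference_c, round(difference_k, 2), naive_ratio, round(true_ratio, 3))
```

The difference of ten units is the same on both scales, but the apparent "twice as warm" disappears once the values are measured from a true zero, which is why statements about ratios are reserved for the ratio scale.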
