15 Item Response Theory

Item Response Theory (IRT) uses a mathematical function to model the probability of a person’s success on a test item, given their level of ability and the item’s difficulty. The function should be bounded by 0 and 1, meaning that it models probability. It should also be increasing, indicating that the chance of success is higher for people with higher abilities. The function should have an S-shaped curve, with the left side close to 0 and the right side close to 1 but not exceeding it.

There are several mathematical functions that fit this shape, such as cumulative distribution functions (cdf) or ogives. Examples of ogives that have been used in IRT include the logistic ogive and the normal ogive, but for the purposes of this discussion, we will focus on the logistic ogive. The logistic ogive is a function that can be used to model the probability of success on an item and has the desired S-shaped curve, making it a commonly used function in IRT.

The Rasch item response model (Rasch 1960) uses the following function to model the probability of success on an item with item difficulty δ and ability θ

\[ Prob(X=1) = \frac{\exp(\theta - \delta)}{1+\exp(\theta - \delta)} \]

where X is the person’s score on the item, taking values of 0 or 1.

rm(list=ls())  #remove all objects in the R environment
require(tidyverse)

Loading required package: tidyverse

Warning: package 'ggplot2' was built under R version 4.2.3

Warning: package 'tidyr' was built under R version 4.2.3

Warning: package 'readr' was built under R version 4.2.3

Warning: package 'dplyr' was built under R version 4.2.3

Warning: package 'stringr' was built under R version 4.2.3

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.0     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2

── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

# Calculate Rasch model probability
delta <- 0  #item difficulty is 0.6
theta <- 0  #person ability is 1.0
prob <- exp(theta-delta)/(1+exp(theta-delta))
prob

[1] 0.5

Change the values of delta and theta, and observe how prob changes. In particular, try values when (1) delta = theta, (2) delta > theta, and (3) delta < theta. What happens when theta is very high, and when theta is very low?

We will plot the probability as a function of θ.

#Calculate prob as a function the theta
id = 1 # item id
delta <- -10 #item difficulty is 1
theta <- seq(-3,3,0.01) #a vector of theta from -3 to 3 in steps of 0.01
prob_function <- function(theta, delta){
    return (exp(theta-delta)/(1+exp(theta-delta)))
}

probs <- tibble(id = id, delta=delta,theta=theta)
probs <- probs %>% mutate(prob = prob_function(theta, delta))

p <- ggplot(probs, aes(x=theta, y=prob))
p <- p + geom_line()
print(p)

15.1 Exercise

As an exercise, plot three ICC for three items with (1) δ = −1, (2) δ = 0, and (3) δ = 1.

#Calculate prob as a function the theta
id = 1 # item id
delta <- -1 #item difficulty is 1
theta <- seq(-3,3,0.01) #a vector of theta from -3 to 3 in steps of 0.01
prob_function <- function(theta, delta){
    return (exp(theta-delta)/(1+exp(theta-delta)))
}

probs <- tibble(id = id, delta=delta,theta=theta)
probs <- probs %>% mutate(prob = prob_function(theta, delta))

item1 <- tibble(id = 1, delta=-1,theta=theta)
item2 <- tibble(id=2, delta=0, theta=theta)
item3 <- tibble(id=3, delta=1, theta=theta)
items <- bind_rows(item1, item2, item3)

items <- items %>% mutate(item=factor(id))
items_probs <- items %>% mutate(prob = prob_function(theta, delta))

p <- ggplot(items_probs, aes(x=theta, y=prob,colour=item))
p <- p + geom_point()
print(p)

See if you can work out how to plot the three graphs using different colors. Can you easily identify the ICC of the easier item?

15.2 Theta Scale Unit

The θ scale is in “logit” unit. Answer the following questions, assuming the item responses follow a Rasch model:

What is the probability of success when θ = δ?
What is the probability of success when θ = 2 logit and δ = 1 logit?
What is the probability of success when a person’s ability is 1.5 logit higher than the item difficulty?
What is the probability of success when a person’s ability is 0.6 logit lower than the item difficulty?
Is the following statement TRUE or FALSE: The difference in logit between a person’s ability and an item’s difficulty determines the probability of success, irrespective of what the ability is.
Consider CTT item difficulty and person ability, can we say anything about the likelihood of success for a person who scored 80% on a test, on an item where 80% of the candidates answered correctly?