The propensity score is the probability of receiving treatment given observed covariates. Among units with the same propensity score, treated and control units have the same distribution of observed covariates, even though any two individual units may differ on specific covariates.
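In practice the propensity score is unknown and has to be estimated. A minimal sketch, assuming a logistic regression fitted by plain gradient descent on made-up data; the names `fit_propensity` and `propensity`, the learning rate, and the iteration count are illustrative choices, not a recommended implementation:

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_propensity(X, t, lr=0.1, epochs=2000):
    """Fit P(T=1 | X) with logistic regression via batch gradient descent.

    X: list of covariate rows (lists of floats); t: list of 0/1 treatment flags.
    Returns (weights, bias). lr and epochs are illustrative defaults.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, ti in zip(X, t):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - ti
            for j in range(d):
                grad_w[j] += err * row[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def propensity(row, w, b):
    # Estimated probability of treatment for one unit.
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, row)))

# Made-up data: one covariate, treatment more likely at higher values.
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
t = [0, 0, 1, 0, 1, 1]
w, b = fit_propensity(X, t)
scores = [propensity(row, w, b) for row in X]
```

In real analyses a statistics library would normally do the fitting; the point here is only that each unit's covariates are collapsed into a single number between 0 and 1.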

Matching pairs each treated unit with one or more control units that have close propensity scores, then compares outcomes within the matched sample. The goal is covariate balance: the treated and control groups should look alike on measured confounders before outcomes are compared.
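The pairing step can be sketched as greedy 1:1 nearest-neighbor matching without replacement. This is one of several possible matching schemes; the function name `match_nearest`, the caliper of 0.05, and the scores and outcomes are all made up for illustration:

```python
def match_nearest(treated, controls, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, controls: dicts mapping unit id -> propensity score.
    Returns a list of (treated_id, control_id) pairs. Treated units with
    no unused control within the caliper stay unmatched.
    """
    available = dict(controls)
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control on the propensity score.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs

# Hypothetical propensity scores and outcomes.
treated = {"t1": 0.30, "t2": 0.62, "t3": 0.95}
controls = {"c1": 0.28, "c2": 0.35, "c3": 0.60, "c4": 0.10}
outcome = {"t1": 12.0, "t2": 15.0, "t3": 20.0,
           "c1": 10.0, "c2": 11.0, "c3": 14.0, "c4": 8.0}

pairs = match_nearest(treated, controls)
# Average treated-minus-control outcome difference over matched pairs.
effect = sum(outcome[t] - outcome[c] for t, c in pairs) / len(pairs)
```

Here `t3` has no control within the caliper, so it is dropped, and the estimate describes only the treated units that could be matched.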

Strengths and limits

Matching is intuitive and works well when treatment is binary, the relevant confounders are measured, and overlap is reasonable (treated units have controls with similar propensity scores, so some controls could plausibly have been treated).

It does not fix unmeasured confounding. Poor overlap — when some treated units have no comparable controls — means estimates rely on extrapolation and should be interpreted cautiously.
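A quick way to spot poor overlap is to flag treated units whose propensity score falls outside the range seen among controls. A minimal sketch, assuming a crude min/max common-support rule (other trimming rules exist); `off_support` and the scores are hypothetical:

```python
def off_support(treated, controls):
    """Return ids of treated units outside the controls' score range.

    treated, controls: dicts mapping unit id -> propensity score.
    """
    lo, hi = min(controls.values()), max(controls.values())
    return [uid for uid, s in treated.items() if s < lo or s > hi]

# Hypothetical propensity scores.
treated = {"t1": 0.30, "t2": 0.62, "t3": 0.95}
controls = {"c1": 0.10, "c2": 0.35, "c3": 0.70}

flagged = off_support(treated, controls)
```

Units flagged this way have no comparable controls. Dropping them changes the population the estimate applies to, which is worth reporting alongside the result.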

Check balance, not just p-values

Compare standardized mean differences before and after matching.
Inspect whether key confounders are balanced in the matched sample.
Compare the matched estimate with inverse probability weighting (IPW) or doubly robust estimators as a robustness check.
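The balance check in the first two items can be sketched with a standardized mean difference (SMD) for a single covariate. This assumes the common definition that divides the difference in means by the pooled standard deviation of the two groups; some analysts fix the denominator at its pre-matching value instead. The ages below are made up:

```python
import math
from statistics import mean, variance

def smd(treated_vals, control_vals):
    """Standardized mean difference for one covariate.

    Difference in group means divided by the pooled standard deviation
    (square root of the average of the two sample variances).
    """
    pooled_sd = math.sqrt((variance(treated_vals) + variance(control_vals)) / 2)
    return (mean(treated_vals) - mean(control_vals)) / pooled_sd

# Hypothetical ages: all controls versus the matched subset.
age_treated = [50, 55, 60, 65]
age_controls_all = [30, 35, 40, 45, 50, 55]
age_controls_matched = [50, 55, 60, 60]

smd_before = smd(age_treated, age_controls_all)
smd_after = smd(age_treated, age_controls_matched)
```

A frequently quoted rule of thumb treats an absolute SMD below about 0.1 as adequately balanced, though that threshold is a convention rather than a guarantee. Unlike a p-value, the SMD does not shrink just because matching reduced the sample size.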