In the initial stages of statistics, we spend a lot of time measuring things—height, weight, temperature, or income. But what happens when the data doesn’t come in numbers? What if the data is about “Success” or “Failure,” “Yes” or “No,” or the “Type” of blood a patient has? This is the domain of Categorical Data Analysis. It is the essential toolkit for social scientists, medical researchers, and marketers who need to find patterns in “qualitative” information. For students, this unit is a transition into the logic of odds, ratios, and contingency tables.
Below is the exam paper download link
PDF Past Paper On Categorical Data Analysis For Revision
To help you move from basic counts to sophisticated modeling, we have gathered the concepts that most often come up in exams into this revision guide.
What is the fundamental difference between ‘Nominal’ and ‘Ordinal’ Data?
This is the starting block of any categorical study.
- Nominal Data: These are categories with no inherent order (e.g., Eye color, Nationality, or Political Party). You can’t say “Blue” is higher than “Brown.”
- Ordinal Data: These categories have a natural ranking (e.g., “Satisfied,” “Neutral,” “Dissatisfied,” or “Low,” “Medium,” “High”).
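In an exam, your choice of test depends heavily on whether your data has this “built-in” order. A standard Chi-Square test treats the categories as unordered, whereas rank-based tests such as the Mann-Whitney U make use of the ordering in an ordinal outcome.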
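The distinction can be illustrated with a minimal Python sketch. The survey responses, the `SATISFACTION_RANK` mapping, and the helper names are all hypothetical, chosen only to show that nominal data supports counting (and the mode), while ordinal data additionally supports order statistics such as the median.

```python
from collections import Counter

# Hypothetical responses, invented purely for illustration.
eye_color = ["Blue", "Brown", "Brown", "Green", "Brown"]  # nominal
satisfaction = ["Satisfied", "Neutral", "Dissatisfied",
                "Satisfied", "Satisfied"]                 # ordinal

def mode(values):
    """Most frequent category; meaningful for nominal and ordinal data."""
    return Counter(values).most_common(1)[0][0]

# An explicit ranking is what turns plain labels into ordinal data.
SATISFACTION_RANK = {"Dissatisfied": 0, "Neutral": 1, "Satisfied": 2}

def ordinal_median(values, rank):
    """Middle category after sorting by rank (upper middle if the count is even)."""
    ordered = sorted(values, key=lambda v: rank[v])
    return ordered[len(ordered) // 2]

eye_mode = mode(eye_color)
sat_median = ordinal_median(satisfaction, SATISFACTION_RANK)
```

No ranking dictionary could sensibly be written for `eye_color`, which is exactly why a median is undefined for nominal data.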
How do we use ‘Contingency Tables’ (Crosstabs)?
A contingency table is a matrix that shows the distribution of one variable across the levels of another. For example, you might look at “Vaccination Status” versus “Infection Status.” During revision, focus on calculating the Expected Frequencies for each cell: under independence, the expected count is (row total × column total) / grand total. If the “Observed” frequencies are very different from the “Expected” ones, that is evidence of an association between the two variables; a test such as the Chi-Square test tells you whether the difference is statistically significant.
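As a minimal sketch of the expected-frequency calculation, the Python function below applies (row total × column total) / grand total to every cell. The function name and the counts are hypothetical, made up for illustration.

```python
def expected_frequencies(observed):
    """Expected cell counts under independence for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

# Hypothetical counts: rows = vaccinated / unvaccinated,
# columns = infected / not infected.
observed = [[10, 90],
            [30, 70]]

expected = expected_frequencies(observed)
# Each row has 100 people and 40 of the 200 people are infected,
# so independence predicts 20 infected and 80 not infected per row.
```

Here the observed 10 and 30 infected sit well away from the expected 20 in each row, which is the kind of gap the Chi-Square statistic measures.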
What is the ‘Odds Ratio’ (OR) versus ‘Relative Risk’ (RR)?
This is a guaranteed favorite for “Interpretation” questions.
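- Relative Risk: Compares the probability (risk) of an event in one group with the probability in another. It is used in “Prospective” (cohort) studies, where you follow people over time and can estimate each group’s risk directly.
- Odds Ratio: Compares the odds of an event, p / (1 − p), between two groups. It is the standard measure for “Case-Control” studies (looking back in time), where participants are selected by outcome and the risks cannot be estimated directly.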
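If an Odds Ratio is 1.0, the odds are the same in both groups, so there is no association. If it is 2.5, the odds of the event in the treatment group are 2.5 times the odds in the control group. When the event is rare, the Odds Ratio is close to the Relative Risk; otherwise it lies further from 1.0 than the Relative Risk does.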
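Both measures can be computed from the four cells of a 2×2 table. The sketch below is illustrative only: the function names, the cell labels `a`, `b`, `c`, `d`, and the counts are assumptions, not taken from any particular past paper.

```python
def relative_risk(a, b, c, d):
    """Risk in group 1 divided by risk in group 2.

    a, b = events / non-events in group 1 (e.g. treatment)
    c, d = events / non-events in group 2 (e.g. control)
    """
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """Odds in group 1 divided by odds in group 2: (a/b) / (c/d) = ad / bc."""
    return (a * d) / (b * c)

# Hypothetical counts: 20 of 100 treated and 10 of 100 controls have the event.
rr = relative_risk(20, 80, 10, 90)   # risks are 0.20 and 0.10
odds = odds_ratio(20, 80, 10, 90)    # (20 * 90) / (80 * 10) = 2.25
```

With these counts the Relative Risk is about 2 while the Odds Ratio is 2.25, showing how the Odds Ratio sits further from 1.0 when the event is not rare.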
Why is ‘Logistic Regression’ the king of Categorical Analysis?
When your dependent variable is binary (0 or 1), a standard linear regression fails because it might predict a probability of 120% or -10%, which is impossible. Logistic Regression instead models the log-odds (the logit) of the outcome as a linear function of the predictors; the inverse of the Logit Link Function then “squeezes” the predictions between 0 and 1. In your past paper practice, make sure you can interpret the “Exp(B)” coefficients: each one is the Odds Ratio for a one-unit increase in that independent variable, holding the other variables constant.
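The link between the logit, predicted probabilities, and “Exp(B)” can be sketched in a few lines of Python. The `intercept` and `slope` values are hypothetical numbers, not estimates fitted to any data, and the function names are chosen only for this illustration.

```python
import math

def logit(p):
    """Log-odds of a probability p (0 < p < 1)."""
    return math.log(p / (1 - p))

def inverse_logit(z):
    """Map any real number to (0, 1); written to avoid overflow for large |z|."""
    if z >= 0:
        return 1 / (1 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1 + ez)

# Hypothetical coefficients for a model with one predictor x.
intercept, slope = -1.0, 0.7

def predicted_probability(x):
    return inverse_logit(intercept + slope * x)

def odds(p):
    return p / (1 - p)

# Exp(B): the factor by which the odds multiply for a one-unit increase in x.
exp_b = math.exp(slope)
ratio = odds(predicted_probability(2)) / odds(predicted_probability(1))
```

Here `ratio` agrees with `exp_b` up to rounding error, which is why Exp(B) is read as an Odds Ratio, and `predicted_probability` never leaves the 0 to 1 range however extreme `x` is.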
What is the ‘Pearson Chi-Square Test’ of Independence?
This test asks: “Are these two categorical variables related, or are they independent?” Under the null hypothesis of independence, the test statistic approximately follows a Chi-Square distribution with $(r-1)(c-1)$ degrees of freedom, where $r$ and $c$ are the numbers of rows and columns in the table. A key limitation to remember for your theory questions is that this approximation is unreliable if your Expected Cell Counts are too small (a common rule of thumb is less than 5). In those cases, Fisher’s Exact Test is the usual alternative, particularly for 2×2 tables.
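The sketch below computes the Pearson statistic, its degrees of freedom, and the smallest expected count, and adds a two-sided Fisher’s Exact Test for a 2×2 table. It is a standard-library illustration under stated assumptions: the function names and counts are hypothetical, the p-value helper is valid only for 1 degree of freedom, and the Fisher p-value uses the common convention of summing all table probabilities no larger than the observed one.

```python
import math

def chi_square_independence(observed):
    """Return (statistic, degrees of freedom, smallest expected count)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum((o - e) ** 2 / e
                    for o_row, e_row in zip(observed, expected)
                    for o, e in zip(o_row, e_row))
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, min(min(row) for row in expected)

def chi_square_p_value_1df(statistic):
    """Upper-tail p-value for 1 df only (chi-square with 1 df is a squared normal)."""
    return math.erfc(math.sqrt(statistic / 2))

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = math.comb(row1 + row2, col1)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Hypergeometric probability of each possible top-left cell value.
    probs = {x: math.comb(row1, x) * math.comb(row2, col1 - x) / total
             for x in range(lo, hi + 1)}
    cutoff = probs[a] * (1 + 1e-7)  # small tolerance for floating-point ties
    return sum(p for p in probs.values() if p <= cutoff)

# Hypothetical table with large expected counts: the chi-square test is fine.
stat, dof, min_expected = chi_square_independence([[10, 90], [30, 70]])
p_chi = chi_square_p_value_1df(stat)

# Hypothetical small table (every expected count is 2): use Fisher instead.
p_fisher = fisher_exact_two_sided(3, 1, 1, 3)
```

Checking `min_expected` before trusting `p_chi` mirrors the rule of thumb in the text; for tables larger than 2×2, a statistics package would be needed for the p-value.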
What are ‘Generalized Estimating Equations’ (GEE) and ‘Log-Linear Models’?
When you have more than two categorical variables and you want to see how they all interact simultaneously, you move into Log-Linear Modeling, which models the logarithm of the expected cell counts as a sum of main effects and interaction terms. This is like ANOVA but for counts. If your data is clustered or “Repeated” (like measuring the same patients multiple times), you use GEE to account for the fact that observations within the same person are correlated.
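As a small illustration of the log-linear idea, the sketch below fits only the simplest case: the independence model for a two-way table, log(expected count) = overall term + row effect + column effect, whose fitted counts have the closed form (row total × column total) / grand total. It then computes the likelihood-ratio statistic G² that measures lack of fit. The function name and counts are hypothetical; models with interaction terms, and GEE, need a statistics package and are not shown.

```python
import math

def independence_loglinear_fit(observed):
    """Fit the independence log-linear model to a two-way table.

    Returns (fitted counts, G^2 likelihood-ratio statistic, degrees of freedom).
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    fitted = [[r * c / n for c in col_totals] for r in row_totals]
    # G^2 = 2 * sum(O * ln(O / fitted)); cells with O = 0 contribute nothing.
    g2 = 2 * sum(o * math.log(o / m)
                 for o_row, m_row in zip(observed, fitted)
                 for o, m in zip(o_row, m_row) if o > 0)
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return fitted, g2, dof

# Hypothetical counts: a table with an association, and one without.
fitted, g2, dof = independence_loglinear_fit([[10, 90], [30, 70]])
_, g2_independent, _ = independence_loglinear_fit([[10, 20], [20, 40]])
```

A large G² relative to its degrees of freedom suggests the independence model fits poorly, so an interaction term between the two variables would be needed.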

Conclusion
Categorical Data Analysis is about finding the structure in the “names” and “labels” of the world. It requires a mindset that looks beyond averages and into proportions and probabilities. Success in your finals comes from your ability to look at a 2×2 table and immediately know whether to calculate a Chi-Square, a Risk Ratio, or a Sensitivity/Specificity score.
To help you master these qualitative calculations and secure your grade, we have provided a link to a comprehensive PDF resource below.
Last updated on: March 24, 2026