berksons_paradox
Berkson's Paradox occurs when conditioning on a shared consequence (a collider variable) of two independent causes creates a spurious correlation, typically negative, between those causes. When you select a sample based on a criterion that both variables influence, you artificially introduce a relationship that does not exist in the general population. This is the mirror image of confounding: instead of an upstream common cause creating a spurious association, conditioning on a downstream common effect creates one.
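The simplest version of the effect can be worked out directly. As an illustrative assumption (not part of the definition above), suppose A and B are independent events, each with probability strictly between 0 and 1, and that a unit enters the sample whenever at least one of them occurs, so the selection event is S = A ∪ B.

```latex
\begin{align*}
P(B \mid A, S) &= P(B \mid A) = P(B)
  && \text{since } A \subseteq S \text{ and } A, B \text{ are independent},\\
P(B \mid S) &= \frac{P(B)}{P(S)} = \frac{P(B)}{1 - (1 - P(A))(1 - P(B))} > P(B)
  && \text{since } P(S) < 1.
\end{align*}
```

Hence P(B | A, S) < P(B | S): inside the selected sample, learning that A occurred lowers the probability of B, even though the two are independent in the full population.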
Among hospitalized patients, a negative correlation is observed between diabetes and bone fractures. This does not mean diabetes prevents fractures. Rather, in this simplified picture, people are in the hospital because they have diabetes OR a fracture (or both), so a patient without one condition must have the other. Selecting only hospitalized people creates the illusion that these two independent conditions are inversely related.
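The hospital example can be checked with a small simulation. The sketch below is illustrative only: the prevalences (10% diabetes, 5% fracture), the population size, and the rule that anyone with either condition is hospitalized are assumptions chosen to mirror the simplified story, not real epidemiological figures.

```python
import random

def simulate_hospital(n=100_000, p_diabetes=0.10, p_fracture=0.05, seed=0):
    """Draw two independent conditions, then keep people with at least one."""
    rng = random.Random(seed)
    population = []
    for _ in range(n):
        diabetes = rng.random() < p_diabetes   # assumed prevalence
        fracture = rng.random() < p_fracture   # assumed prevalence, independent
        population.append((diabetes, fracture))
    # Assumed admission rule: hospitalized if diabetes OR fracture (the collider).
    hospitalized = [(d, f) for d, f in population if d or f]
    return population, hospitalized

def fracture_rate(rows, diabetes):
    """Share of people with a fracture among those with the given diabetes status."""
    group = [f for d, f in rows if d == diabetes]
    return sum(group) / len(group)

population, hospitalized = simulate_hospital()
for name, rows in (("population", population), ("hospitalized", hospitalized)):
    print(
        f"{name:>12}: fracture rate with diabetes = {fracture_rate(rows, True):.3f}, "
        f"without diabetes = {fracture_rate(rows, False):.3f}"
    )
```

In the full population the two fracture rates should be close to each other, while among the hospitalized every patient without diabetes has a fracture by construction, so diabetes appears strongly "protective" inside the sample.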
A talent agency notices that among its signed actors, those who are highly attractive tend to be less talented, and vice versa. This seems to suggest looks and talent are negatively correlated — but in reality, both traits independently increase the chance of being signed, creating the illusion of a tradeoff.
A dating app analyst observes that among users who receive many matches, kindness and physical attractiveness appear negatively correlated. In the general population, the two traits are unrelated — but getting many matches requires being high on at least one dimension, distorting the apparent relationship.
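The same effect shows up with continuous traits. The sketch below assumes, purely for illustration, that kindness and attractiveness are independent standard normal scores and that a user gets "many matches" when at least one score exceeds a hypothetical threshold of 1.0. Neither the distributions nor the threshold come from real data.

```python
import math
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(42)

# Assumed population: (kindness, attractiveness), drawn independently.
users = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(50_000)]

# Hypothetical selection rule: many matches if high on at least one dimension.
THRESHOLD = 1.0
many_matches = [(k, a) for k, a in users if max(k, a) > THRESHOLD]

r_all = pearson(*zip(*users))
r_selected = pearson(*zip(*many_matches))

print(f"correlation, all users:    {r_all:+.3f}")
print(f"correlation, many matches: {r_selected:+.3f}")
```

Under these assumptions the correlation across all users should be near zero, while the correlation within the selected group is clearly negative. A different selection rule, such as a threshold on the sum of the two scores, would give a different magnitude but the same direction.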
Binary (yes/no) questions an LLM must answer to identify this aspect:
Is the sample selected based on a criterion that both variables can influence?
Type: binary
Could the observed correlation be an artifact of the selection process?
Type: binary
Does the relationship hold in the general population, not just the selected subgroup?
Type: binary
Is a common effect (collider) being conditioned on, creating a spurious association?
Type: binary
Selection bias is invisible to someone analyzing only the selected sample. The data genuinely shows the negative correlation within the sample; the problem lies in the sampling process itself.
Identify whether your sample was selected based on a variable that could be a collider. Draw a causal diagram and check if conditioning on a descendant of both variables might create a spurious association.
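The diagram check described above can be sketched in a few lines. This is a screening heuristic under stated assumptions, not a full d-separation test: the causal diagram is supplied by the analyst as a dictionary mapping each node to its direct effects, and the function and node names (`descendants`, `selection_may_induce_bias`, `hospital_dag`) are hypothetical. A positive result only flags a selection variable worth examining. For instance, it also fires when one variable causes the other and selection sits further downstream.

```python
def descendants(graph, node):
    """Return every node reachable from `node` by following cause -> effect edges."""
    seen = set()
    stack = list(graph.get(node, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph.get(current, ()))
    return seen

def selection_may_induce_bias(graph, x, y, selected_on):
    """Flag the case where the selection variable is downstream of both x and y."""
    return selected_on in descendants(graph, x) and selected_on in descendants(graph, y)

# Hypothetical diagram for the hospital example: both conditions cause admission,
# and admission determines who ends up in the study.
hospital_dag = {
    "diabetes": ["hospitalized"],
    "fracture": ["hospitalized"],
    "hospitalized": ["in_study"],
}

# Hypothetical confounding diagram for contrast: age is upstream, not downstream.
confounded_dag = {
    "age": ["diabetes", "fracture"],
}

print(selection_may_induce_bias(hospital_dag, "diabetes", "fracture", "hospitalized"))
print(selection_may_induce_bias(hospital_dag, "diabetes", "fracture", "in_study"))
print(selection_may_induce_bias(confounded_dag, "diabetes", "fracture", "age"))
```

The second call illustrates that conditioning on a descendant of the collider (here, study enrollment) raises the same concern as conditioning on the collider itself, while the third shows that an upstream common cause is not flagged by this check.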
Berkson's Paradox appears in hospital-based epidemiological studies, dating pools (the 'attractiveness vs. niceness' tradeoff perceived in available partners), and university admissions studies.