tetrachoric correlation coefficient
Definition
Noun: A tetrachoric correlation coefficient is a statistical measure of association. It estimates the Pearson correlation coefficient between two theoretical, continuous, and normally distributed variables, based on the observed relationship in a 2x2 contingency table where both variables have been artificially dichotomized (split into two categories).
Usage
The tetrachoric correlation coefficient is used in statistics, psychometrics, and social sciences when researchers believe that the underlying constructs they are measuring (e.g., ability, trait, attitude) are continuous and normally distributed, but the observed data are binary (e.g., pass/fail, yes/no, correct/incorrect).
Examples
- In a research paper: "The tetrachoric correlation coefficient between the two test items was calculated to be 0.75, suggesting a strong underlying relationship between the latent abilities they measure."
- In data analysis: "Because our questionnaire items were scored as 'agree' or 'disagree,' we used the tetrachoric correlation coefficient to create the inter-item correlation matrix for the factor analysis."
- In methodology: "For dichotomous variables assumed to reflect an underlying normal distribution, the tetrachoric correlation coefficient provides a more accurate estimate of the true relationship than the phi coefficient."
Advanced Usage
- Assumption of Underlying Normality: The calculation is valid only under the assumption that there is an underlying bivariate normal distribution for the two latent variables. Violation of this assumption can lead to biased estimates.
- Estimation, Not Direct Calculation: Unlike the Pearson correlation, the tetrachoric correlation is not directly computed from raw data. It is estimated from the proportions in the four cells of a 2x2 table, often requiring iterative numerical methods.
- Use in Factor Analysis: It is commonly used as input for factor analysis or structural equation modeling with binary data (with polychoric correlations playing the same role for ordinal data) to avoid the distortions that can arise from applying Pearson or phi correlations to dichotomized variables.
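The estimation step described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function names `bivariate_normal_cdf` and `tetrachoric` and the cell layout (a = both variables 0, b = first 0 and second 1, c = first 1 and second 0, d = both 1) are assumptions made for this example. The sketch places a threshold on each latent variable from the marginal proportions, then searches for the correlation at which a standard bivariate normal distribution reproduces the observed proportion in cell a. It assumes SciPy is available and does not handle tables with empty cells.

```python
import math

from scipy.integrate import quad
from scipy.optimize import brentq
from scipy.stats import norm


def bivariate_normal_cdf(h, k, rho):
    """P(X <= h, Y <= k) for a standard bivariate normal with correlation rho.

    Uses the identity Phi2(h, k; rho) = Phi(h) * Phi(k) + integral from 0 to rho
    of the bivariate normal density evaluated at (h, k) with correlation r.
    """
    def density(t):
        r = rho * t  # substitution r = rho * t keeps the limits at [0, 1]
        q = 1.0 - r * r
        expo = -(h * h - 2.0 * r * h * k + k * k) / (2.0 * q)
        return math.exp(expo) / (2.0 * math.pi * math.sqrt(q))

    integral, _ = quad(density, 0.0, 1.0)
    return norm.cdf(h) * norm.cdf(k) + rho * integral


def tetrachoric(a, b, c, d):
    """Estimate the tetrachoric correlation from the four cell counts of a 2x2 table."""
    n = a + b + c + d
    h = norm.ppf((a + b) / n)  # latent threshold for the first variable
    k = norm.ppf((a + c) / n)  # latent threshold for the second variable
    p00 = a / n                # observed proportion with both variables equal to 0
    # The bivariate normal CDF increases with rho, so a bracketing root finder works.
    return brentq(lambda rho: bivariate_normal_cdf(h, k, rho) - p00, -0.999, 0.999)


r_tet = tetrachoric(40, 10, 10, 40)
print(round(r_tet, 3))
```

The search interval stops short of -1 and 1 because the density is undefined at those values; a table that implies a correlation outside the bracket makes `brentq` raise an error instead of returning an estimate.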
Variants and Related Words
- Tetrachoric Correlation: A common shortened form of "tetrachoric correlation coefficient."
- Polychoric Correlation Coefficient: A generalization of the tetrachoric correlation used when at least one of the observed ordinal variables has more than two categories.
- Biserial Correlation Coefficient: A related measure used when one variable is continuous and normally distributed and the other is artificially dichotomized.
- Phi Coefficient: A correlation measure for two truly dichotomous variables (not assumed to reflect an underlying continuous distribution).
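To make the contrast between the phi coefficient and the tetrachoric correlation concrete, the sketch below computes phi directly from the four cell counts. The function names and the cell layout (a and d on the main diagonal) are assumptions made for this illustration. For the special case where both variables are split exactly at the median, the tetrachoric correlation has a closed form, sin(2π(p − 1/4)), where p is the proportion in cell a; the second function uses that special case only and is not a general estimator.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi coefficient of a 2x2 table with rows (a, b) and (c, d)."""
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom


def tetrachoric_median_split(a, b, c, d):
    """Closed-form tetrachoric correlation, valid only when both margins are 50/50."""
    n = a + b + c + d
    if (a + b) * 2 != n or (a + c) * 2 != n:
        raise ValueError("closed form requires both variables to be split at the median")
    return math.sin(2.0 * math.pi * (a / n - 0.25))


table = (40, 10, 10, 40)
print(round(phi_coefficient(*table), 3))
print(round(tetrachoric_median_split(*table), 3))
```

On this table phi is 0.6, while the tetrachoric value is noticeably larger, which is the usual pattern when the binary data really do come from dichotomized normal variables.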
Synonyms
- Tetrachoric Correlation (the most direct synonym)
- Tetrachoric r (informal, using the common symbol for correlation)
Related Concepts
- Latent Variable: An underlying, not directly observed variable that the dichotomous measure is assumed to represent.
- Contingency Table: A table showing the frequency distribution of variables.
- Dichotomization: The process of splitting a continuous variable into two categories.
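The three concepts above can be tied together with a small simulation. It is a sketch under stated assumptions: a latent correlation of 0.6, a cut at the median of each variable, a fixed random seed, and NumPy as the only dependency; all variable names are illustrative. It draws two correlated latent normal variables, dichotomizes each at zero, tabulates the resulting 2x2 contingency table, and compares the correlation of the latent scores with the correlation of the binary scores.

```python
import numpy as np

rng = np.random.default_rng(0)
rho = 0.6  # assumed latent correlation for this illustration
cov = [[1.0, rho], [rho, 1.0]]

# Latent variables: continuous, bivariate normal, not observed in practice.
latent = rng.multivariate_normal([0.0, 0.0], cov, size=200_000)

# Dichotomization: split each latent variable at its median (zero).
binary = (latent > 0).astype(int)

# Contingency table: counts for each combination of the two binary variables.
table = np.zeros((2, 2), dtype=int)
np.add.at(table, (binary[:, 0], binary[:, 1]), 1)

latent_r = np.corrcoef(latent[:, 0], latent[:, 1])[0, 1]
binary_r = np.corrcoef(binary[:, 0], binary[:, 1])[0, 1]  # equals the phi coefficient

print(table)
print(round(latent_r, 3), round(binary_r, 3))
```

For a median split the binary correlation is expected to be close to (2/π)·arcsin(ρ), about 0.41 when ρ = 0.6, so the dichotomized data understate the latent relationship. That gap is what the tetrachoric correlation is meant to recover.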
Noun
- a correlation coefficient computed for two normally distributed variables that are both expressed as a dichotomy