The data points in Plot 3 appear to be randomly distributed. They do not fall close to the line, indicating a very weak relationship, if one exists. The Pearson correlation coefficient for this relationship is −0.253.

If a relationship between two variables is not linear, the rate of increase or decrease can change as one variable changes, causing a "curved pattern" in the data. This curved trend might be better modeled by a nonlinear function, such as a quadratic or cubic function, or the data might be transformed to make the relationship linear.

Plot 4 shows a strong relationship between two variables. However, because the relationship is not linear, the Pearson correlation coefficient is only +0.244. This relationship illustrates why it is important to plot the data in order to explore any relationships that might exist.

In a monotonic relationship, the variables tend to move in the same relative direction, but not necessarily at a constant rate. In a linear relationship, the variables move in the same direction at a constant rate. For example, the relationship shown in Plot 1 is both monotonic and linear. Plot 5 shows both variables increasing concurrently, but not at the same rate. This relationship is monotonic, but not linear. The Pearson correlation coefficient for these data is 0.843, but the Spearman correlation is higher, 0.948.

Boston University School of Public Health

In this section we discuss correlation analysis, which is a technique used to quantify the association between two continuous variables. For example, we might want to quantify the association between body mass index and systolic blood pressure, or between hours of exercise per week and percent body fat. Regression analysis is a related technique to assess the relationship between an outcome variable and one or more risk factors or confounding variables (confounding is discussed later). The outcome variable is also called the response or dependent variable, and the risk factors and confounders are called the predictors, or explanatory or independent variables. In regression analysis, the dependent variable is denoted "Y" and the independent variables are denoted by "X".

[NOTE: The term "predictor" can be misleading if it is interpreted as the ability to predict even beyond the limits of the data. Also, the term "explanatory variable" might give an impression of a causal effect in a situation in which inferences should be limited to identifying associations. The terms "independent" and "dependent" variable are less subject to these interpretations, as they do not strongly imply cause and effect.]

Learning Objectives

After completing this module, the student will be able to: define and provide examples of dependent and independent variables in a study of a public health problem; compute and interpret a correlation coefficient; and compute and interpret coefficients in a linear regression analysis.

In correlation analysis, we estimate a sample correlation coefficient, more specifically the Pearson Product Moment correlation coefficient. The sample correlation coefficient, denoted r, ranges between -1 and +1 and quantifies the direction and strength of the linear association between the two variables. The correlation between two variables can be positive (i.e., higher levels of one variable are associated with higher levels of the other) or negative (i.e., higher levels of one variable are associated with lower levels of the other).

The sign of the correlation coefficient indicates the direction of the association. The magnitude of the correlation coefficient indicates the strength of the association. For example, a correlation of r = 0.9 suggests a strong, positive association between two variables, whereas a correlation of r = -0.2 suggests a weak, negative association. A correlation close to zero suggests no linear association between two continuous variables.

It is important to note that there may be a non-linear association between two continuous variables, but computation of a correlation coefficient does not detect this. Therefore, it is always important to evaluate the data carefully before computing a correlation coefficient. Graphical displays are particularly useful to explore associations between variables. The figure below shows four hypothetical scenarios in which one continuous variable is plotted along the X-axis and the other along the Y-axis.
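The difference between linear, monotonic, and curved relationships can be checked numerically. The following is a minimal Python sketch, not the method used to produce the plots or coefficients quoted in the text. The data are made up for illustration, the function names (`pearson`, `ranks`, `spearman`) are hypothetical helpers, and the rank function assumes there are no tied values.

```python
import math


def pearson(xs, ys):
    """Sample Pearson correlation coefficient r for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Ranks 1..n of the values. Assumption: no ties (a simplification)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


def spearman(xs, ys):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Illustrative data only (assumed, not taken from the plots in the text).
x_pos = [i / 10 for i in range(1, 51)]      # 0.1 to 5.0
x_sym = [i / 10 for i in range(-30, 31)]    # -3.0 to 3.0, symmetric about 0

y_linear = [2 * x + 1 for x in x_pos]       # linear, hence also monotonic
y_growth = [math.exp(x) for x in x_pos]     # monotonic but not linear
y_curved = [x ** 2 for x in x_sym]          # strong relationship, not monotonic

print("linear   Pearson :", round(pearson(x_pos, y_linear), 3))
print("growth   Pearson :", round(pearson(x_pos, y_growth), 3))
print("growth   Spearman:", round(spearman(x_pos, y_growth), 3))
print("curved   Pearson :", round(pearson(x_sym, y_curved), 3))
```

In this sketch the exponential series gives a Spearman correlation of essentially 1 with a noticeably lower Pearson correlation, and the symmetric quadratic gives a Pearson correlation near zero even though y is fully determined by x. This is the same reason the text recommends plotting the data before relying on a correlation coefficient.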