R Studio guide

Cross Tabulation

A crosstabulation, or contingency table, shows the relationship between two or more variables by recording how many observations fall into each combination of categories. Crosstabulation tables give us a wealth of information on the relationship between the included variables. No formula is needed for a crosstabulation, since at its core it is simply counts and percentages of observations.
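
As a minimal sketch of the "counts and percentages" idea, the R snippet below rebuilds the sex-by-language counts reported in the Output section of this guide and converts them to row percentages. The object names `counts` and `row_pct` are illustrative and do not come from the original guide.

```r
# Rebuild the sex-by-language counts shown in this guide's output.
counts <- as.table(matrix(
  c(2999, 262, 564,
    2717, 235, 527),
  nrow = 2, byrow = TRUE,
  dimnames = list(sex = c("Female", "Male"),
                  language = c("English", "French", "Other"))
))

# Add row and column totals to the raw counts.
addmargins(counts)

# Row percentages: the share of each sex that reports each language.
row_pct <- round(100 * prop.table(counts, margin = 1), 1)
row_pct
```

Using `margin = 1` makes each row sum to 100, which suits a table with the independent variable in the rows. In these counts, roughly 78% of both women and men report English.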

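The Chi-squared test often accompanies a crosstabulation to test whether a statistically significant relationship exists between the variables. The test itself does not measure how strong that relationship is; a separate measure of association, such as Cramér's V, is needed for that.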

As a general rule, the dependent variable in a crosstabulation and Chi-squared test is placed in the columns, while the independent variable is placed in the rows. In this example, our two variables are sex, the independent variable, and language, the dependent variable. If you want to examine other variables, simply replace sex and language with other variables from the dataset.

RStudio cross tabulation code
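
The code itself is not reproduced in the text of this page. Judging from the `data:` line in the output further down, it was presumably the two lines sketched here. This sketch assumes that SLID is the data frame shipped with the carData package; the guide does not say where the dataset comes from.

```r
# Assumption: SLID comes from the carData package
# (run install.packages("carData") first if it is not installed).
library(carData)

# Line 1: crosstabulate sex (rows) by language (columns).
table(SLID$sex, SLID$language)

# Line 2: run Pearson's Chi-squared test on that crosstabulation.
chisq.test(table(SLID$sex, SLID$language))
```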

There are two lines of code above. In the first line, we use table() to crosstabulate the variables sex and language from the SLID dataset.

In the second line, we use chisq.test() to conduct a Chi-squared test on the crosstabulation of sex and language from the SLID dataset.

Output

A   

         English French Other
  Female    2999    262   564
  Male      2717    235   527

 

 

B

Pearson's Chi-squared test

data:  table(SLID$sex, SLID$language)
X-squared = 0.24422, df = 2, p-value = 0.8851

A

In the first output table, RStudio shows the crosstabulation of sex by language. We can see that sex is written first in the code and appears in the rows, while language is written second and appears in the columns.

B

In the second output table, Pearson's Chi-squared test, we can see that the X-squared (Chi-squared) value is 0.24422, the degrees of freedom are 2, and the p-value is 0.8851. Since we are using the standard cutoff of 0.05 or below for the significance level, and 0.8851 is well above 0.05, we conclude that the Chi-squared test is not statistically significant. This means that there is no statistically significant relationship between the variables sex and language in this dataset.
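
The interpretation above can be retraced with a short sketch that rebuilds the table from the counts in output A rather than loading the dataset. The names `counts`, `result`, and `alpha` are illustrative; the statistic and p-value it produces should be close to those shown in output B.

```r
# Counts copied from output A of this guide.
counts <- as.table(matrix(
  c(2999, 262, 564,
    2717, 235, 527),
  nrow = 2, byrow = TRUE,
  dimnames = list(sex = c("Female", "Male"),
                  language = c("English", "French", "Other"))
))

result <- chisq.test(counts)

# Degrees of freedom: (rows - 1) * (columns - 1) = (2 - 1) * (3 - 1) = 2.
result$parameter

# Counts we would expect if sex and language were unrelated.
round(result$expected, 1)

# Compare the p-value with the 0.05 cutoff.
alpha <- 0.05
if (result$p.value <= alpha) {
  message("Statistically significant relationship")
} else {
  message("No statistically significant relationship")
}
```

Looking at `result$expected` next to the observed counts shows why the test is not significant here: the observed and expected counts differ by only a few observations in each cell.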

  • Last Updated: Feb 1, 2023 1:45 PM
  • URL: https://research.library.gsu.edu/R