R Studio guide

Descriptive Statistics for One Variable

Getting descriptive statistics in RStudio is quick for one or multiple variables. Descriptive statistics are measures we can use to learn more about the distribution of observations in variables for analysis, transforming variables, and reporting. Each descriptive statistic has its own formula, which we will not cover in this guide, but we will walk through the interpretation of each.

Below is the code for calculating the descriptive statistics of the variable wages.

[Image: RStudio code for descriptive statistics of one variable]
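The code itself appears only as an image in the original page. As a minimal sketch of the pattern, assuming the usual summary(data$column) form, the example below applies it to a small made-up data frame (the toy object and its values are hypothetical stand-ins, not the SLID data). With the real dataset loaded, the equivalent call would presumably be summary(SLID$wages).

```r
# Hypothetical stand-in data: a few made-up wages with one missing value.
toy <- data.frame(wages = c(2.30, 9.50, 14.00, NA, 20.00, 49.92))

# Select one column with $ and summarize it.
# On the guide's data the call would presumably be summary(SLID$wages).
wage_summary <- summary(toy$wages)
print(wage_summary)
```

Because the toy column contains a missing value, the result includes an NA's entry alongside the minimum, quartiles, median, mean, and maximum.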

The output shows the descriptive statistics and the number of missing values. We are going to focus on a couple of the descriptive statistics in this output. Moving from left to right, we can see the Min. (minimum), 1st Qu. (first quartile), Median, Mean, 3rd Qu. (third quartile), Max. (maximum), and NA's (missing values).

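The average wage value in this dataset is 15.553, which is below the midpoint of the range, 26.11 ((49.92 + 2.30)/2), indicating that most observations are concentrated at the lower end of the range. The mean also exceeds the median of 14.090, which is consistent with a long tail of higher wages (a right-skewed distribution).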
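The comparison above can be checked with a few lines of arithmetic. This is only a sketch: the values are copied by hand from the summary output for wages, and the variable names are illustrative.

```r
# Values copied from the summary output for wages.
wage_min    <- 2.30
wage_max    <- 49.92
wage_mean   <- 15.553
wage_median <- 14.090

# Midpoint of the range: halfway between the smallest and largest wage.
midpoint <- (wage_min + wage_max) / 2
print(midpoint)

# TRUE when the mean lies below the midpoint of the range.
mean_below_midpoint <- wage_mean < midpoint
print(mean_below_midpoint)

# TRUE when the mean lies above the median, which suggests a tail of high values.
mean_above_median <- wage_mean > wage_median
print(mean_above_median)
```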

Descriptive Statistics for Multiple Variables

We can also calculate the descriptive statistics for all the variables with a single command.

Code

summary(SLID)

This runs summary() on the entire SLID dataset, producing one set of statistics per variable.

Output

         wages          education          age            sex          language   
 Min.   : 2.300   Min.   : 0.00   Min.   :16.00   Female:3880   English:5716  
 1st Qu.: 9.235   1st Qu.:10.30   1st Qu.:30.00   Male  :3545   French : 497  
 Median :14.090   Median :12.10   Median :41.00                 Other  :1091  
 Mean   :15.553   Mean   :12.50   Mean   :43.98                 NA's   : 121  
 3rd Qu.:19.800   3rd Qu.:14.53   3rd Qu.:57.00                               
 Max.   :49.920   Max.   :20.00   Max.   :95.00                               
 NA's   :3278     NA's   :249        

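In this output RStudio provides us with each variable name, Min. (minimum), 1st Qu. (first quartile), Median, Mean, 3rd Qu. (third quartile), Max. (maximum), and NA's (missing values). For the categorical variables sex and language, it lists the number of observations in each category instead.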
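To see how a whole-dataset summary treats different column types, here is a small sketch on a made-up data frame (the toy object and its values are hypothetical, not the SLID data). Numeric columns get the minimum, quartiles, median, mean, and maximum; factor columns get category counts; and missing values can also be counted directly with is.na().

```r
# Hypothetical stand-in data with one numeric and one factor column.
toy <- data.frame(
  wages = c(2.30, 9.50, NA, 20.00, 49.92),
  sex   = factor(c("Female", "Male", "Female", NA, "Male"))
)

# One call summarizes every column.
print(summary(toy))

# Count missing values per column.
na_counts <- colSums(is.na(toy))
print(na_counts)
```

Counting missing values separately can be handy when only the NA totals are needed, since it returns a plain named vector rather than a formatted table.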