Calculating Various Statistical Measures using R
Base R includes built-in functions for the most common descriptive statistics: measures of central tendency, measures of variability, and quantiles.
Measures of Central Tendency:
Mean: The average of a dataset.
data <- c(10, 15, 20, 25, 30)
mean_value <- mean(data)
print(mean_value)
Median: The middle value of a dataset when ordered.
data <- c(10, 15, 20, 25, 30)
median_value <- median(data)
print(median_value)
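Mode: The most frequently occurring value. R's built-in mode() function returns the storage mode of an object (for example "numeric"), not the statistical mode, so the statistical mode must be calculated with a custom function or with a package such as DescTools, which provides Mode().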
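A custom function for the statistical mode can be sketched with table(), which counts how often each distinct value occurs. The helper name stat_mode and the sample data below are hypothetical, chosen only for illustration; this version returns every value tied for the highest count rather than picking one.

```r
# Hypothetical helper (not part of base R): returns all values that share
# the highest frequency in x.
stat_mode <- function(x) {
  counts <- table(x)                            # frequency of each distinct value
  top <- names(counts)[counts == max(counts)]   # value(s) with the highest count
  # table() stores the values as character names, so convert back for numeric input
  if (is.numeric(x)) as.numeric(top) else top
}

# Illustrative data with two values tied for most frequent
data <- c(10, 15, 15, 20, 25, 25, 30)
print(stat_mode(data))
```

Note that table() drops NA values by default, so missing values are ignored in this sketch.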
Measures of Variability (Dispersion):
Standard Deviation: Measures the spread of data points around the mean. sd() computes the sample standard deviation, which uses an n - 1 denominator.
data <- c(10, 15, 20, 25, 30)
std_dev <- sd(data)
print(std_dev)
Variance: The square of the standard deviation. var() returns the sample variance, which uses an n - 1 denominator.
data <- c(10, 15, 20, 25, 30)
variance_value <- var(data)
print(variance_value)
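To make the relationship between var() and sd() concrete, the following sketch recomputes the sample variance by hand and compares it with the built-in results. The variable names manual_var and pop_var are illustrative; the population variance is included only to show that it differs from what var() returns.

```r
data <- c(10, 15, 20, 25, 30)
n <- length(data)

# Sample variance: sum of squared deviations from the mean, divided by n - 1
manual_var <- sum((data - mean(data))^2) / (n - 1)
print(manual_var)

# Both comparisons should report TRUE
print(all.equal(manual_var, var(data)))
print(all.equal(sqrt(manual_var), sd(data)))

# Population variance divides by n instead, so it is smaller than var(data)
pop_var <- manual_var * (n - 1) / n
print(pop_var)
```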
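Range: The difference between the maximum and minimum values. range() returns the minimum and maximum themselves; applying diff() to that result gives the difference.
data <- c(10, 15, 20, 25, 30)
range_limits <- range(data) # Returns a vector with min and max
range_value <- diff(range_limits) # Difference between max and min
print(range_value)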
Other Descriptive Statistics:
Quantiles (Percentiles): Cut points that divide the ordered data into equal-sized groups. Quartiles, for example, split the data into four parts.
data <- c(10, 15, 20, 25, 30)
quartiles <- quantile(data, probs = c(0.25, 0.5, 0.75))
print(quartiles)
Summary Statistics: The summary() function provides a quick overview of key descriptive statistics for a vector or data frame, including min, max, mean, median, and quartiles.
data <- c(10, 15, 20, 25, 30)
summary(data)
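Since summary() also accepts a data frame, a short sketch of that case may help. The data frame scores and its column names are made up for illustration. summary() reports the statistics column by column, and sapply() is one way to apply a single statistic to every column.

```r
# Hypothetical data frame; column names and values are illustrative only
scores <- data.frame(
  math = c(10, 15, 20, 25, 30),
  reading = c(12, 18, 22, 28, 35)
)

# One block of summary statistics per column
print(summary(scores))

# Apply a single statistic (here the mean) to each column
col_means <- sapply(scores, mean)
print(col_means)
```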