Difference between Factor and Vector in R


A factor is represented as an object of class factor: an integer vector of codes together with an attribute named levels.

A factor is useful when a potentially large collection of data contains relatively few discrete levels. Such data are usually referred to as categorical variables.

In the code below, we first set the random seed so that readers who run the code on their own machines get the same values. Note that the default sampling algorithm changed in R 3.6.0, so newer versions of R draw a different sample from the same seed.


> set.seed(123)
> x = sample(letters[1:5], 10, replace = TRUE)
> y = factor(x)
> y


 [1] b d c e e a c e c c
Levels: a b c d e
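
To make the "integer codes plus levels" description concrete, the following minimal sketch inspects the two parts of a factor. It assumes a fixed character vector (typed in by hand to match the sample printed above) instead of a random draw, so the result does not depend on the R version.

```r
# Fixed input, so no random seed is needed.
x <- c("b", "d", "c", "e", "e", "a", "c", "e", "c", "c")
y <- factor(x)

class(y)       # class of the object: "factor"
typeof(y)      # underlying storage type: "integer"
levels(y)      # the levels attribute, sorted alphabetically by default
as.integer(y)  # the integer codes: the position of each value in levels(y)
table(y)       # number of observations per level
```

Each code is simply an index into the levels, which is why a long column with only a few distinct values can be stored compactly as a factor.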


A vector is the simplest data structure in R. It represents a sequence of data points of the same type. A vector can be created with the function c() (for combine or concatenate).


> c(3, 1, 7)

[1] 3 1 7
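
The practical difference between the two structures shows up when values are converted. The sketch below is illustrative only (the variable names v, mixed, and f are made up for this example): it shows that c() coerces its arguments to a single type, and that converting a factor directly to a number returns its codes, not the labels.

```r
# A plain vector holds values of one type.
v <- c(3, 1, 7)
typeof(v)                    # "double"

# Mixed inputs are coerced to a common type, here character.
mixed <- c(3, "a", TRUE)
typeof(mixed)                # "character"

# A factor stores integer codes, so a direct numeric conversion
# gives the codes, not the original labels.
f <- factor(c("10", "20", "10"))
as.integer(f)                # codes: 1 2 1
as.numeric(as.character(f))  # original values: 10 20 10
```

Going through as.character() first is the usual way to recover numeric labels from a factor.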