When we work with data, we often run into an obstacle: repeated values. They are not a critical problem as long as we can identify them. Once we have the list of repeated values, it is easy to discard, remove or extract them.

We are going to look at two R functions that identify repeated values: *unique()* and *duplicated()*. As we will see below, both work with different types of data, such as **vectors**, **matrices** or **data frames**.

```r
# Example with vector of numbers
vector_example <- c(1, 2, 3, 4, 1)

unique(vector_example)
#> [1] 1 2 3 4

duplicated(vector_example)
#> [1] FALSE FALSE FALSE FALSE  TRUE

# Example with vector of strings
vector_example2 <- c("A", "B", "C", "D", "E", "A")

unique(vector_example2)
#> [1] "A" "B" "C" "D" "E"

duplicated(vector_example2)
#> [1] FALSE FALSE FALSE FALSE FALSE  TRUE
```

- As we can see, *unique()* returns the **unique values** themselves, keeping each repeated value only once.
- In contrast, *duplicated()* returns a logical vector that flags the **duplicated values**: an element is *TRUE* when it repeats an earlier one.
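To connect the two functions, here is a minimal sketch that uses the logical vector from *duplicated()* as an index. Negating it keeps the first occurrences (the same result as *unique()*), while using it directly extracts the repeated values. The vector `x` and the variable names are made up for illustration.

```r
# Hypothetical example vector (not taken from the examples above)
x <- c("A", "B", "A", "C", "B", "A")

# Keep only the first occurrence of each value: same result as unique(x)
first_seen <- x[!duplicated(x)]

# Values that occur more than once, each listed a single time
repeated <- unique(x[duplicated(x)])

# Flag every copy of a repeated value, including its first occurrence,
# by also scanning from the end with fromLast = TRUE
all_copies <- duplicated(x) | duplicated(x, fromLast = TRUE)

print(first_seen)
print(repeated)
print(all_copies)
```

The `fromLast = TRUE` argument is useful when we want to inspect all the copies of a value, not only the later ones.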

These functions also work on a matrix, where they compare entire rows:

```r
# Note: the values drawn by sample() depend on the R version
# (the default sampling method changed in R 3.6.0), so your matrix may differ.
set.seed(123)
m <- matrix(sample(1:3, 20, TRUE), ncol = 2, nrow = 10)
m
#>       [,1] [,2]
#>  [1,]    1    3
#>  [2,]    3    2
#>  [3,]    2    3
#>  [4,]    3    2
#>  [5,]    3    1
#>  [6,]    1    3
#>  [7,]    2    1
#>  [8,]    3    1
#>  [9,]    2    1
#> [10,]    2    3

duplicated(m)
#>  [1] FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE

unique(m)
#>      [,1] [,2]
#> [1,]    1    3
#> [2,]    3    2
#> [3,]    2    3
#> [4,]    3    1
#> [5,]    2    1
```
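By default both functions compare the rows of a matrix. As a small sketch, the `MARGIN` argument of *unique()* and *duplicated()* can be set to 2 to compare columns instead. The matrix `m2` below is a hypothetical example built by hand so that the result does not depend on random sampling.

```r
# Hypothetical 2 x 4 matrix whose 1st and 3rd columns are identical
# (matrix() fills column by column)
m2 <- matrix(c(1, 2,
               3, 4,
               1, 2,
               5, 6), nrow = 2)

# MARGIN = 2 compares columns instead of rows
dup_cols <- duplicated(m2, MARGIN = 2)
unique_cols <- unique(m2, MARGIN = 2)

print(dup_cols)
print(unique_cols)
```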

Now we will identify unique and duplicated rows in a very common data frame called iris, and we will also select the rows that are not repeated:

```r
nrow(iris)
#> [1] 150

nrow(unique(iris)) # Row 143 is dropped because it is equal to row 102.
#> [1] 149

nrow(iris[duplicated(iris), ]) # We select the repeated row (row 143).
#> [1] 1

nrow(iris[!duplicated(iris), ]) # We select all unique rows (150 - 1 = 149).
#> [1] 149
```

Finally, we can see that *unique(iris)* and *iris[!duplicated(iris),]* give the same result.
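As a quick check of that equivalence, the sketch below compares both expressions with *identical()* and locates the repeated row. The variable names are only illustrative.

```r
# iris ships with base R (datasets package), so no extra packages are needed

# Position of every row that repeats an earlier one
dup_rows <- which(duplicated(iris))

# Both ways of dropping repeated rows should give the same data frame
same <- identical(unique(iris), iris[!duplicated(iris), ])

# Show every copy of a repeated row, not only the later one
all_copies <- iris[duplicated(iris) | duplicated(iris, fromLast = TRUE), ]

print(dup_rows)
print(same)
print(all_copies)
```

Printing `all_copies` lets us confirm by eye that the two rows really hold the same measurements and species.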