Jason A. French

Northwestern University

Recursion in R

| Comments

Most technical interviews with companies will ask you to whiteboard code some type of recursive function in your favorite programming language. Although Python seems to be the dominate king in data science, recursion can be a powerful tool in R.

What is recursion?

Recursive functions call themselves. That is, they break down the problem into the smallest possible components and the function() calls itself within the original function() on each of the smaller components. Afterward, the results are put together to solve the original problem. Let’s take a look at more concrete examples.

Quicksort

A technical interview will usually ask you to implement some type of sorting function that can be solved using a recursive algorithm. Let’s try implementing quicksort in R:

Programming Quicksort in R
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
#!/usr/bin/env Rscript
# Author:  Jason A. French

quickSort <- function(vect) {
  # Args:
  #  vect: Numeric Vector
  
  # Stop if vector has length of 1
  if (length(vect) <= 1) {
      return(vect)
  }
  # Pick an element from the vector
  element <- vect[1]
  partition <- vect[-1]
  # Reorder vector so that integers less than element
  # come before, and all integers greater come after.
  v1 <- partition[partition < element]
  v2 <- partition[partition >= element]
  # Recursively apply steps to smaller vectors.
  v1 <- quickSort(v1)
  v2 <- quickSort(v2)
  return(c(v1, element, v2))
}
1
2
quickSort(c(4, 65, 2, -31, 0, 99, 83, 782, 1))
[1] -31   0   1   2   4  65  83  99 782

Merge Sort

A second sorting algorithm that we can implement using recursion is the Merge Sort. Sorting algorithms are important because they differ in their speed, depending on the nature of the input data. In the case of merge sort, the worst possible outcome for the time it takes to sort the data is n * log(n), where n is the length of your list of numbers.

Like Quicksort, we’re splitting the vector into smaller sub-vectors by halving it until each vector has just one integer. We then merge the vectors back together, this time in order. Let’s tackle this algorithm in R as well.

I found that Rosetta Code already provides a good example, so below is the their public domain code with my comments added to the code for explanation.

Merge sort in R
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
#!/usr/bin/env Rscript
# http://rosettacode.org/wiki/Sorting_algorithms/Merge_sort#R

mergesort <- function(m)

{
   merge_ <- function(left, right)
   # Recursive function to compare and append values in order
   {
      # Create a list to hold the results
      result <- c()
      # This is our stop condition. While left and right contain 
      # a value, compare them
      while(length(left) > 0 && length(right) > 0)
      {
           # If left is less than or equal to right,
           # add it to the result list
         if(left[1] <= right[1])
         {
            result <- c(result, left[1])
            # Remove the value from the list
            left <- left[-1]
         } else
         {
            # When right is less than or equal to left,
            # add it to the result.
            result <- c(result, right[1])
            # Remove the appended integer from the list.
            right <- right[-1]
         }
      }
      # Keep appending the values to the result while left and right
      # exist.
      if(length(left) > 0) result <- c(result, left)
      if(length(right) > 0) result <- c(result, right)
      result
   }

   # Below is our stop condition for the mergesort function.
   # When the length of the vector is 1, just return the integer. 
   len <- length(m)
   if(len <= 1) m else
   {
      # Otherwise keep dividing the vector into two halves.
      middle <- length(m) / 2
      # Add every integer from 1 to the middle to the left
      left <- m[1:floor(middle)]
      right <- m[floor(middle+1):len]
      # Recursively call mergesort() on the left and right halves.
      left <- mergesort(left)
      right <- mergesort(right)
      # Order and combine the results.
      if(left[length(left)] <= right[1])
      {
         c(left, right)
      } else
      {
         merge_(left, right)
      }
   }
}