Horse races. All the doo dah day.

The puzzle

Twenty-five horses, a five-lane track, and - if so desired - a fitting Johnny Cash song.
No watch. Those are the ingredients for this puzzle.

The objective?

To determine the minimum number of races required to find the fastest three horses.

Preliminaries

Horses. We need horses. Horses with names.

Luckily, extensive lists of horse names turn out to exist.

In [1]:
horse_names = c("Daddys Dollars", "Dakota", "Dalton", "Danny Boy", "Dash of Class",
                "Days of Thunder", "Deal Me In", "Destiny", "Devils Advocate", "Diablo",
                "Diamond Flame", "Dime a Dozen", "Dirty Harry", "Dixie", "Dock a da Bay",
                "Domingo", "Double Delight", "Dr. Pepper", "Dream Catcher", "Driftwood Pacific", 
                "Duchess", "Duke", "Durango", "Dustbuster", "Dynamite Jack")

Beautiful, no?

We also source a plot function, that will show us an overview of the situation after each race. For the interested readers, the code for the function plotOverview.fnc() is provided in Appendix A below.

In [2]:
source("plotOverview.R")

Approach

Below, I will illustrate the optimal strategy for finding the three fastest horses. For this illustration, we randomly assign a rank to each horse, ranging from 1 (fastest) to 25 (slowest).

In [3]:
# Create a dataframe
horses = data.frame("Name" = horse_names)
horses$Name = as.character(horses$Name)

# Set the seed for reproducibility of the code
set.seed(314159265)

# Assign random ranks for hoses
horses$Rank = sample(25,replace=FALSE)

# Show the horses dataframe
horses
NameRank
Daddys Dollars 20
Dakota 9
Dalton 14
Danny Boy 5
Dash of Class 8
Days of Thunder 6
Deal Me In 21
Destiny 10
Devils Advocate 24
Diablo 11
Diamond Flame 4
Dime a Dozen 18
Dirty Harry 16
Dixie 13
Dock a da Bay 17
Domingo 12
Double Delight 15
Dr. Pepper 22
Dream Catcher 19
Driftwood Pacific 2
Duchess 3
Duke 7
Durango 23
Dustbuster 25
Dynamite Jack 1

To be able to find the fastest horses, we need to be able to organize races on our five-lane track.

We therefore define a function that simulates a race between five horses and returns the order in which the horses finished:

In [4]:
race.fnc = function(entrants,data=horses) {
  
  # Create a dataframe with the lineup for the race
  lineup = data[which(data$Name%in%entrants),]

  # Determine the results by ordering the dataframe on the basis of the rank of the horses
  result = lineup$Name[order(lineup$Rank)]
    
  # Return the result of the race
  return(result)
  
}

The optimal strategy is based on step-wise elimination.

After each race, we can be certain that some horses are not among the three fastest horses. To be able to determine the three fastest horses, we need to organize the races in a manner that allows for the elimination of the 22 slowest horses in the minimum number of races.

A first step is to run five races. The first five horses run in race 1, the second five horses run in race 2, et cetera. Each horse therefore runs in one and only one race:

In [5]:
# Separate the horses into five races
# The split function splits a vector and writes the results to a list
races = split(horses$Name,rep(1:5,each=5))

# Apply the race.fnc function to each race in the races list
races = lapply(races,race.fnc)

# Show the results of the five races
for(r in 1:5) {
  cat(paste("Results of race ", r, ": ", sep=""))
  cat(paste(races[[r]], collapse=", "))
  cat("\n")
}
Results of race 1: Danny Boy, Dash of Class, Dakota, Dalton, Daddys Dollars
Results of race 2: Days of Thunder, Destiny, Diablo, Deal Me In, Devils Advocate
Results of race 3: Diamond Flame, Dixie, Dirty Harry, Dock a da Bay, Dime a Dozen
Results of race 4: Driftwood Pacific, Domingo, Double Delight, Dream Catcher, Dr. Pepper
Results of race 5: Dynamite Jack, Duchess, Duke, Durango, Dustbuster

We can now exclude all horses that did not finish in the top three in their race.

In [6]:
# Get the horses that ended up 4th and 5th in each race
exclude = lapply(races,FUN=function(x) {return(x[4:5])})
exclude = as.character(unlist(exclude))
cat("We can exclude the following 10 horses:\n")
cat(paste(exclude, collapse=", "))

# Create a column Eliminated in horses and set it to TRUE for the 10 horses that can be 
# eliminated
horses$Eliminated = FALSE
horses$Eliminated[which(horses$Name%in%exclude)] = TRUE

# Show the head of horses
head(horses)
We can exclude the following 10 horses:
Dalton, Daddys Dollars, Deal Me In, Devils Advocate, Dock a da Bay, Dime a Dozen, Dream Catcher, Dr. Pepper, Durango, Dustbuster
NameRankEliminated
Daddys Dollars 20 TRUE
Dakota 9 FALSE
Dalton 14 TRUE
Danny Boy 5 FALSE
Dash of Class 8 FALSE
Days of Thunder 6 FALSE

We can visually represent the current situation using a plotting function, plotOverview.fnc(). For interested readers, the code for plotOverview.fnc() is included in Appendix A below.

In [7]:
# Set plotting options
options(repr.plot.width=6, repr.plot.height=3)
par(mar=c(0,0,0,0))

# Plot
plotOverview.fnc(races[1:5])

Next, we organize a race between the five horses that won the first race:

In [8]:
# Get the winners of the first five races
winners = lapply(races,FUN=function(x) {return(x[1])})
winners = as.character(unlist(winners))
cat("The winners of the first five races were:\n")
cat(paste(winners,collapse=", "))

# Run race six
races[[6]] = race.fnc(entrants = winners)
cat("\n\nResult of race six:\n")
cat(paste(races[[6]],collapse=", "))
The winners of the first five races were:
Danny Boy, Days of Thunder, Diamond Flame, Driftwood Pacific, Dynamite Jack

Result of race six:
Dynamite Jack, Driftwood Pacific, Diamond Flame, Danny Boy, Days of Thunder

The results of race six allow us to exclude another 9 horses:

  • The two horses that finished 4th and 5th in race 6
  • Horses that finished 2nd or 3rd in earlier races that were won by the horses that finished 3th, 4th, and 5th in race 6
  • The horse that finished third in the previous race of the horse that finished 2nd in race 6

To do this, we define a function eliminate.fnc():

In [9]:
eliminate.fnc = function(race=1,results=races,data=horses) {
  
  # Get the result of the winner of the race in race six
  result_winner = which(results[[6]] == results[[race]][1])

  # Find out how many horses from the top three of the race should be excluded
  num_exclude = result_winner-1
  num_exclude = min(num_exclude,3)
  
  # Return names of horses that should be excluded
  if(num_exclude > 0) {
    results[[race]][(3-num_exclude+1):3]
  } else {
    return(NA)
  }

}

And we apply this function:

In [10]:
# Eliminate horses based on race 6
exclude = unlist(sapply(1:5,eliminate.fnc))
horses$Eliminated[which(horses$Name%in%exclude)] = TRUE

Furthermore, we now know that the horse that won race six is the fastest horse. We therefore select this horse as one of the fastest three horses:

In [11]:
# Select fastest horse based on race 6
horses$Selected = FALSE
horses$Selected[which(horses$Name==races[[6]][1])] = TRUE

Let's again look at the current situation:

In [12]:
# Show the current situation
options(repr.plot.width=6, repr.plot.height=3)
par(mar=c(0,0,0,0))
plotOverview.fnc(races[1:6])

The beauty of this puzzle unravels!

After race 6, we are able to exclude 19 horses from being among the three fastest horses. We also now that Dynamite Jack is the fastest horse. This leaves 5 of the original 25 horses that could still be the second and third fastest horse, which is exactly the number of horses that can participate in a single race.

A seventh race is therefore sufficient to establish the fastest three horses:

In [13]:
# Run race 7
entrants_final = horses$Name[which(!horses$Eliminated & !horses$Selected)]
races[[7]] = race.fnc(entrants = entrants_final)
races[[7]]

# Eliminate and select horses based on race 7
horses$Eliminated[which(horses$Name%in%races[[7]][3:5])] = TRUE
horses$Selected[which(horses$Name%in%races[[7]][1:2])] = TRUE

# Verify that we indeed selected three horses
table(horses$Eliminated,horses$Selected)
  1. 'Driftwood Pacific'
  2. 'Duchess'
  3. 'Diamond Flame'
  4. 'Duke'
  5. 'Domingo'
       
        FALSE TRUE
  FALSE     0    3
  TRUE     22    0

This yields the following results:

In [14]:
# Show the current situation
options(repr.plot.width=6, repr.plot.height=3)
par(mar=c(0,0,0,0))
plotOverview.fnc(races[1:7])

Appendix A

Code for the function plotOverview.fnc():

In [15]:
plotOverview.fnc = function(races) {
  
  # Create empty plot with correct layout
  plot(1:5,1:5,type="n",axes=F,xlim=c(-0.25,5.75),ylim=c(0,6.5),xlab="",ylab="")
  rect(-0.25,0,5.75,6.5,col="#EEEEEE")

  # Define plotting coordinates
  coords = list()
  coords[[1]] = rep(1:5,each=5)
  coords[[2]] = 6-rep(1:5,5)

  # Get names in the right order
  names = as.character(unlist(races))[1:25]

  # Get information about elimination status of each horse
  eliminated = lapply(races,FUN=function(x) {return(x[4:5])})
  eliminated = as.character(unlist(eliminated))

  # We can exclude more if we have race six
  if(length(races)>5) {
    
    # Get results of winners of first five races in race six
    results_winners = lapply(1:5,FUN=function(x){which(races[[6]] == races[[x]][1])})
    results_winners = unlist(results_winners)
    
    for(i in 1:5) {
    
      # Get the result of the winner of the race in race six
      result_winner = results_winners[i]
      
      # Find out how many horses from the top 3 of the race should be excluded
      num_exclude = result_winner-1
      num_exclude = min(num_exclude,3)
      
      # Update list of eliminated horses
      if(num_exclude > 0) {
        exclude = races[[i]][(3-num_exclude+1):3]
        eliminated = c(eliminated,exclude)
      }
    
    }
  
  }
    
  # Eliminate even more if we have race seven
  if(length(races)>6) {
  
    exclude = races[[7]][3:5]
    eliminated = c(eliminated,exclude)
    
  }
    
  # Define colors  
  colors = rep("#FFFFFF",25)
    
  # Define eliminated colors
  if(length(races) > 5) {
    order_winners = sapply(1:5, FUN = function(x) {which(results_winners==x)})    
    names = as.character(unlist(races[order_winners]))[1:25]
  }
  eliminated = which(names%in%eliminated)
  colors[eliminated] = "#CC000066"
    
  # Define selected colors
  selected = vector()
  if(length(races) > 5) {
    selected = which(names==races[[6]][1])
    colors[selected] = "#00CC0066"
  }
  if(length(races) > 6) {
    for(i in 1:2) {
      selected = c(selected,which(names==races[[7]][i]))
    }
    colors[selected] = "#00CC0066"
  }

  # Plot horse names and elimination status
  for(i in 1:25) {
    rect(coords[[1]][i]-0.5,coords[[2]][i]-0.5,coords[[1]][i]+0.5,coords[[2]][i]+0.5,
         col = colors[i])
    text(coords[[1]][i],coords[[2]][i],names[i],cex=0.6)
    if(i%in%selected) {
      text(coords[[1]][i]-0.42, coords[[2]][i]+0.3, which(selected==i),cex=0.6)
    }
  }
    
  # Add horizontal labels
  for(i in 1:5) {
    if(length(races) > 5) {
      message = paste("Race", order_winners[i])
    } else {
      message = paste("Race", i)
    }
    text(i, 6, message, font=2, cex=0.6)
  }
  
  # Add vertical labels
  diamonds = 5
  if(length(races) > 5) {
    text(0.12, 5, "Race 6", font=2, cex=0.6)
    diamonds = 4
  }
  points(rep(0.12,diamonds),1:diamonds,font=2,cex=0.6,pch=18)
  
}