**Backtracking**
# Introduction and Motivation
*Backtracking* is a recursive implementation of a brute-force algorithm. Why
would we want to use a recursive implementation when we already have an
iterative one (see "Iterative brute force" notes)?
The answer is that the recursive implementation is often more
elegant and easier to understand. And because of that, it may be easier to see
how to optimize it later on. At the end of this block we will also discuss
how one can turn a recursive implementation back into an iterative one, while
preserving the optimizations, thus getting the best of both worlds.
# Recursive subset and permutation generation
## Subsets
This was offered as a programming assignment in the "Recursion" block, but we will go through it here as well.
The idea is to "grow" the subsets one element at a time. We start with the empty set, then we look at the first element of the input set. We can either include it in the subset or not. If we include it, we add it to the current subset and move on to the next element. If we don't include it, we simply move on to the next element without adding it to the current subset. We repeat this process until we have considered all elements of the input set.
```
               E                  <-- decide whether to include 0 or not
             /   \
           /       \
         /           \
       0               E          <-- decide whether to include 1 or not
     /   \           /   \
   01      0       1       E      <-- decide whether to include 2 or not
  / \     / \     / \     / \
012  01  02  0   12  1   2   E
```
Pseudocode for this algorithm is as follows:
```cpp
void subset( set, subset, depth ) {
    if ( depth == max_depth ) {
        print subset;
        return; // hit the leaf, start backtracking (go up)
    }
    subset( set, subset + set[depth], depth+1 ); // add element
    subset( set, subset, depth+1 ); // do not add element
}
```
The C++ code for this algorithm:
```cpp
#include <iostream>
#include <vector>

void subset_aux( std::vector<int> const& set, std::vector<int> & subset, int depth ) {
    if ( depth == set.size() ) { //base case -> subset complete, print it
        std::cout << "{ ";
        for (int i=0; i < subset.size(); ++i) { std::cout << subset[i] << " "; }
        std::cout << "}" << std::endl;
        return;
    }
    subset.push_back( set[depth] ); //add
    subset_aux(set,subset,depth+1 ); //go down the recursion tree
    subset.pop_back(); //do not add - remove just added
    subset_aux(set,subset,depth+1 ); //go down the recursion tree
    return; //backtracking (go up)
}

//kick start recursion
void subsets( std::vector<int> const& set) {
    std::vector<int> subset;
    subset_aux(set,subset,0 );
}

int main () {
    std::vector<int> set;
    set.push_back(1); set.push_back(2); set.push_back(3); set.push_back(4);
    subsets(set);
}
```
Notes:
- we will be passing both set and subset by reference, to avoid unnecessary copying.
- since we modify the subset in place, we will need to explicitly remove the last added element after the first recursive call, so that the second recursive call gets "element not added" version of the subset.
- recursive functions often use extra parameters to keep track of the current state of the solution being built, in this case `subset` and `depth`. Instead of requiring the caller/user to provide correct values for these parameters, we create a simpler kick start function `subsets` that initializes those parameters and then calls the recursive function. This way, the user only needs to provide the input set, and does not need to worry about the details of how the recursion works.
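The same recursion can collect the subsets instead of printing them, which makes the result easier to inspect or test. The following is a minimal sketch under that assumption; the names `collect_subsets` and `collect_subsets_aux` and the extra `out` parameter are hypothetical and not part of the code above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical variant of subset_aux: stores each completed subset in `out`
// instead of printing it.
void collect_subsets_aux( std::vector<int> const& set, std::vector<int> & subset,
                          std::size_t depth, std::vector< std::vector<int> > & out ) {
    if ( depth == set.size() ) { // base case -> subset complete, store a copy
        out.push_back( subset );
        return;
    }
    subset.push_back( set[depth] );                    // branch 1: include set[depth]
    collect_subsets_aux( set, subset, depth+1, out );
    subset.pop_back();                                 // branch 2: exclude set[depth]
    collect_subsets_aux( set, subset, depth+1, out );
}

// kick start function: hides `subset`, `depth` and `out` from the caller
std::vector< std::vector<int> > collect_subsets( std::vector<int> const& set ) {
    std::vector< std::vector<int> > out;
    std::vector<int> subset;
    collect_subsets_aux( set, subset, 0, out );
    return out;
}
```

For a set of $n$ elements this returns $2^n$ subsets, in the left-to-right order of the leaves of the recursion tree: the full set comes first and the empty set last.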
## Permutations
For permutation generation we will use a similar approach - incrementally build the permutation one element at a time, but:
- the current depth of the recursion will correspond to which position in the permutation we are filling
- we will **not** know which elements have been used in the current permutation, so we will need to check that before we add an element to the current permutation
- branching will be changing as we have fewer and fewer elements to choose from as we go down the recursion tree, thus we will be using a loop to generate branches instead of a fixed number of recursive calls
Pseudocode:
```cpp
void permutation( set, permutation, depth ) {
    if ( depth == max_depth ) {
        print permutation;
        return; // hit the leaf, start backtracking (go up)
    }
    for each element in the set {
        if ( element is not used in the current permutation ) {
            permutation[depth] = element; // add element to the current permutation
            permutation( set, permutation, depth+1 ); // go down the recursion tree
        }
    }
}
```
C++ implementation:
```cpp
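#include <algorithm>
#include <iostream>
#include <vector>

void permutation_aux( std::vector<int> const& set, std::vector<int> & permutation, int depth ) {
    if ( depth == set.size() ) { //base case -> permutation complete, print it
        std::cout << "{ ";
        for (int i=0; i < permutation.size(); ++i) { std::cout << permutation[i] << " "; }
        std::cout << "}" << std::endl;
        return;
    }
    for (int i=0; i < set.size(); ++i) {
        //check if element is used in positions [0, depth) of the current permutation;
        //positions from depth onwards may hold stale values from earlier branches, so they are not searched
        if ( std::find( permutation.begin(), permutation.begin() + depth, set[i]) != permutation.begin() + depth ) continue;
        permutation[depth] = set[i]; //add element to the current permutation
        permutation_aux(set,permutation,depth+1 ); //go down the recursion tree
    }
    return; //backtracking (go up)
}

//kick start recursion - permutation is pre-sized because positions are filled by index
//(assumes the elements of set are distinct)
void permutations( std::vector<int> const& set ) {
    std::vector<int> permutation( set.size() );
    permutation_aux(set,permutation,0 );
}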
```
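The "is this element already used?" check can also be answered in constant time by keeping one flag per element. The sketch below illustrates that idea; the `used` vector, the `out` container and the names `permutations_flags` and `permutations_flags_aux` are assumptions of this sketch, not part of the implementation above. Since the flags track positions rather than values, it also works when the input contains repeated values (each position is treated as a distinct element).

```cpp
#include <cstddef>
#include <vector>

// Same recursion as permutation_aux, but used[i] records whether set[i]
// is already placed, replacing the linear search.
void permutations_flags_aux( std::vector<int> const& set, std::vector<int> & permutation,
                             std::vector<bool> & used, std::size_t depth,
                             std::vector< std::vector<int> > & out ) {
    if ( depth == set.size() ) { // base case -> permutation complete, store a copy
        out.push_back( permutation );
        return;
    }
    for ( std::size_t i = 0; i < set.size(); ++i ) {
        if ( used[i] ) continue;       // set[i] is already in the permutation
        used[i] = true;                // mark as used
        permutation[depth] = set[i];   // fill position `depth`
        permutations_flags_aux( set, permutation, used, depth+1, out );
        used[i] = false;               // undo the mark when backtracking
    }
}

// kick start function
std::vector< std::vector<int> > permutations_flags( std::vector<int> const& set ) {
    std::vector< std::vector<int> > out;
    std::vector<int> permutation( set.size() );
    std::vector<bool> used( set.size(), false );
    permutations_flags_aux( set, permutation, used, 0, out );
    return out;
}
```

The trade-off is one more piece of state that has to be undone on the way back up, in exchange for avoiding a search at every branch.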
# Example - knapsack problem
The subset generation example above just prints all possible subsets, but we can easily modify it to do something more interesting, such as solving the knapsack problem.
The problem, also called *0-1 knapsack problem*, is as follows: we are given a set of items, each with a weight and a value, and a knapsack with a maximum weight capacity. We want to find the subset of items that maximizes the total value while keeping the total weight within the capacity of the knapsack. The `0-1` means that we can either include an item in the knapsack or not, we cannot include fractional parts of an item.
We will reuse the subset printing algorithm from above and just add knapsack related parameters and logic:
- we will need to add two arrays - weights and values,
- as well as a current best subset and its value.
- instead of printing the subset when we hit the base case, we will check if its value is better than the current best and update the best subset and its value accordingly.
Pseudo code:
```cpp
void knapsack( weights, values, capacity, subset, depth ) {
    if ( depth == max_depth ) {
        int value = total value of the subset;
        int weight = total weight of the subset;
        if ( weight <= capacity ) { // check if subset is valid
            if ( value > best_value ) { // check if subset is better than the best so far
                best_value = value;
                best_subset = subset;
            }
        }
        return; // leaf node, start backtracking (go up)
    }
    knapsack( weights, values, capacity, subset + depth, depth+1 ); // add item with index depth
    knapsack( weights, values, capacity, subset, depth+1 ); // do not add item with index depth
}
```
Where:
- `weights` and `values` are arrays containing the weights and values of the items
- `capacity` is the maximum weight capacity of the knapsack
- `subset` is the current subset being built
- `best_subset` and `best_value` are the best subset and its value found so far, when recursion is done, they will contain the optimal solution to the knapsack problem.
C++ code:
```cpp
#include <iostream>
#include <vector>

void knapsack_aux( std::vector<int> const& weights,
                   std::vector<int> const& values,
                   int capacity,
                   std::vector<int> & subset,
                   int depth,
                   std::vector<int> & best_subset,
                   int & best_value )
{
    if ( depth == weights.size() ) { //base case
        int value = 0;
        int weight = 0;
        for (int i=0; i < subset.size(); ++i) { //subset stores item indices
            value += values[subset[i]];
            weight += weights[subset[i]];
        }
        if ( weight <= capacity && value > best_value ) { //check if subset is valid and better than the best so far
            best_value = value;
            best_subset = subset;
        }
        return;
    }
    subset.push_back( depth ); //add element
    knapsack_aux(weights,values,capacity,subset,depth+1,best_subset,best_value); //go down the recursion tree
    subset.pop_back(); //do not add - remove just added
    knapsack_aux(weights,values,capacity,subset,depth+1,best_subset,best_value); //go down the recursion tree
}

//kick start recursion
std::vector<int> knapsack( std::vector<int> const& weights, std::vector<int> const& values, int capacity) {
    std::vector<int> subset;
    std::vector<int> best_subset;
    int best_value = 0;
    knapsack_aux(weights,values,capacity,subset,0,best_subset,best_value);
    return best_subset;
}

int main () {
    std::vector<int> weights;
    std::vector<int> values;
    weights.push_back(3); values.push_back(7);
    weights.push_back(2); values.push_back(4);
    weights.push_back(2); values.push_back(4);
    weights.push_back(1); values.push_back(2);
    int capacity = 4;
    std::vector<int> solution = knapsack(weights,values,capacity);
    std::cout << "Best subset: { ";
    int solution_value = 0;
    for (int i=0; i < solution.size(); ++i) {
        std::cout << solution[i] << " ";
        solution_value += values[solution[i]];
    }
    std::cout << "}, value: " << solution_value << std::endl;
}
```
Notes:
- `best_subset` and `best_value` are local variables of `knapsack` function and are passed by reference to `knapsack_aux` which updates them as the function is trying different subsets.
- once `knapsack_aux` is done, `best_subset` and `best_value` will contain the final optimal solution to the knapsack problem, which is then returned to the caller of `knapsack` function.
# Example - assignment problem
*Assignment problem* is a minimization problem of assigning $n$ tasks to $n$ agents, such that each task is assigned to exactly one agent (thus each agent is assigned to exactly one task), and the total cost of the assignment is minimized. The cost of assigning task $i$ to agent $j$ is given by a cost matrix $C$, where $C[i][j]$ is the cost of assigning task $i$ to agent $j$. Example:
| | Agent 0 | Agent 1 | Agent 2 |
|--------|---------|---------|---------|
| Task 0 | 6 | 2 | 5 |
| Task 1 | 2 | 3 | 1 |
| Task 2 | 3 | 1 | 2 |
A solution to the problem is an assignment of tasks to agents, for example:
- agent 0 is assigned to task 1
- agent 1 is assigned to task 2
- agent 2 is assigned to task 0
with total cost of $2 + 1 + 5 = 8$.
An assignment of $n$ tasks to $n$ agents can be represented as a permutation of $n$ elements, where the $i$-th element of the permutation is the index of the task assigned to agent $i$. For example, the assignment above can be represented as the permutation $(1, 2, 0)$.
With this representation, we can use a backtracking algorithm to generate all permutations of $n$ elements and calculate the cost of each permutation, keeping track of the best one found so far.
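As a small illustration of this representation, the sketch below computes the cost of a permutation using the $C[i][j]$ convention from the table above (row = task, column = agent). The function name `assignment_cost` is a hypothetical helper introduced only for this example.

```cpp
#include <cstddef>
#include <vector>

// C[task][agent] is the cost of giving `task` to `agent`;
// assignment[agent] is the index of the task assigned to that agent.
int assignment_cost( std::vector< std::vector<int> > const& C,
                     std::vector<int> const& assignment ) {
    int cost = 0;
    for ( std::size_t agent = 0; agent < assignment.size(); ++agent ) {
        cost += C[ assignment[agent] ][ agent ];
    }
    return cost;
}
```

With the example table and the permutation $(1, 2, 0)$ this adds $C[1][0] + C[2][1] + C[0][2] = 2 + 1 + 5 = 8$, matching the total above.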
Pseudo-code
```cpp
void assignment_aux( cost_matrix, assignment, depth, best_assignment, best_cost ) {
    if ( depth == max_depth ) { // base case - complete assignment
        int cost = calculate_cost( cost_matrix, assignment );
        if ( cost < best_cost ) { // check if found a better solution (cheaper)
            best_cost = cost;
            best_assignment = assignment;
        }
        return;
    }
    // recursive step - for each unassigned task, assign it to the current agent (agent index depth)
    for (int i=0; i < n; ++i) { // try assigning task i to agent depth
        if ( task i is not assigned to any agent yet ) {
            assignment[depth] = i; // assign task i to current agent (agent index depth)
            assignment_aux( cost_matrix, assignment, depth+1, best_assignment, best_cost ); // go down the recursion tree
        }
    }
}

std::vector<int> assignment( cost_matrix ) {
    int n = cost_matrix.size();
    std::vector<int> assignment(n); // current assignment
    std::vector<int> best_assignment(n); // best assignment found so far
    int best_cost = std::numeric_limits<int>::max(); // best cost found so far
    assignment_aux( cost_matrix, assignment, 0, best_assignment, best_cost ); // kick start recursion
    return best_assignment;
}
```
C++ implementation of the above algorithm
```cpp
#include <algorithm>
#include <iostream>
#include <limits>
#include <tuple>
#include <vector>

using Matrix = std::vector< std::vector<int> >;

/////////// BACKTRACKING VANILLA ///////////////////////
void
backtracking_vanilla_aux(
        Matrix const& m, int depth,
        std::vector<int> & current_assignment,
        int & cost_current_assignemnt,
        std::vector<int> & best_solution_so_far,
        int & cost_best_solution_so_far )
{
    // base case - last level reached
    if ( depth == m.size() ) {
        if ( cost_current_assignemnt < cost_best_solution_so_far ) {
            cost_best_solution_so_far = cost_current_assignemnt;
            best_solution_so_far = current_assignment;
        }
        return;
    }
    // recursive case - try all possible jobs for agent index `depth`
    for ( int j=0; j < m.size(); ++j ) {
        //skip if job j is already assigned
        if ( std::find( current_assignment.begin(), current_assignment.end(), j) != current_assignment.end() ) continue;
        //otherwise assign and call recursively
        current_assignment.push_back( j );
        cost_current_assignemnt += m[depth][j];
        backtracking_vanilla_aux( m, depth+1,
                current_assignment, cost_current_assignemnt,
                best_solution_so_far, cost_best_solution_so_far );
        // undo assignment and cost update
        current_assignment.pop_back( );
        cost_current_assignemnt -= m[depth][j];
    }
}

std::vector<int> backtracking_vanilla( Matrix const& m ) {
    std::vector<int> best_solution_so_far;
    std::vector<int> current_assignment;
    int cost_current_assignemnt = 0;
    int cost_best_solution_so_far = std::numeric_limits<int>::max();
    backtracking_vanilla_aux( m, 0, current_assignment, cost_current_assignemnt, best_solution_so_far, cost_best_solution_so_far );
    return best_solution_so_far;
}

int main() {
    Matrix m = {
        {6,2,4,8},
        {3,4,7,6},
        {2,7,8,5},
        {3,5,4,2}
    };
    int N = m.size();
    std::vector<int> solution = backtracking_vanilla( m ); // takes only the matrix
    int cost = 0;
    for (int i=0; i < N; ++i) {
        cost += m[i][ solution[i] ];
        std::cout << solution[i] << " ";
    }
    std::cout << " cost " << cost << std::endl;
}
```
Notes:
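- `current_assignment` holds all task assignments made so far, where `current_assignment[i]` is the index of the task assigned to agent `i`.
- the code indexes the cost matrix as `m[agent][task]` (rows are agents, columns are tasks), which is the transpose of the $C[i][j]$ convention used in the table above.
- we use `std::find` to check if a task is already used in the current assignment `current_assignment`
- `best_solution_so_far` and `cost_best_solution_so_far` are the best assignment found so far and its cost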
# Optimizations
The recursive structure of the algorithm makes it easier to add some optimizations.
In the knapsack we can maintain the current weight and value of the subset as
we build it (same way as we maintain the `subset` itself), so that we do not
have to compute them from scratch every time we hit the base case. The reason why
this is an optimization is that we eliminate duplicate calculations - consider
a subtree of the recursion tree rooted at the node corresponding to some subset
`S`. Every leaf in this subtree will contain that subset, thus the original
algorithm will be adding up the same weights and values of the items in `S` in
every leaf.
Second optimization is to check if the current subset is already invalid (i.e.
its weight exceeds the capacity) before we even make the recursive call.
Currently we wait until we hit the base case to check if the subset is valid,
thus making a lot of recursive calls that we know will not lead to a valid
solution. This optimization is called *pruning* - we are pruning the recursion
tree by not exploring branches that we know will not lead to a valid solution.
```cpp
void knapsack_aux( std::vector<int> const& weights, // problem description
                   std::vector<int> const& values,  // problem description
                   int capacity,                    // problem description
                   std::vector<int> & subset,       // current candidate solution
                   int current_weight,              // current candidate solution - weight
                   int current_value,               // current candidate solution - value
                   int depth,
                   std::vector<int> & best_subset,  // best solution found so far
                   int & best_value                 // best solution found so far - value
                 )
{
    // pruning - check if current candidate solution is valid before we continue
    if ( current_weight > capacity ) { // prune - no need to explore this branch
        return;
    }
    // base case
    if ( depth == weights.size() ) { //base case -> subset complete, check if it's the best
        if ( current_value > best_value ) { //check if subset is better than the best so far
            best_value = current_value;
            best_subset = subset;
        }
        return;
    }
    // recursive case
    subset.push_back( depth );
    knapsack_aux( weights, values, capacity,
                  subset, current_weight + weights[depth], current_value + values[depth], // updated weight and value
                  depth+1,
                  best_subset, best_value);
    subset.pop_back();
    knapsack_aux( weights, values, capacity,
                  subset, current_weight, current_value, // do not add element, so weight and value do not change
                  depth+1,
                  best_subset, best_value);
}

//kick start recursion
std::vector<int> knapsack( std::vector<int> const& weights, std::vector<int> const& values, int capacity) {
    std::vector<int> subset;
    std::vector<int> best_subset;
    int best_value = 0;
    // we maintain the current weight and value of the subset as we build it
    int current_weight = 0, current_value = 0;
    knapsack_aux(weights,values,capacity,subset,current_weight,current_value,0,best_subset,best_value);
    return best_subset;
}
```
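To get a feel for how much work the pruning check saves, one can count recursive calls with and without it. The sketch below does only the counting and ignores values; `count_calls` and its `prune` flag are hypothetical names introduced for this illustration, and the check mirrors the `current_weight > capacity` test in `knapsack_aux` above.

```cpp
#include <cstddef>
#include <vector>

// Counts the calls made by the include/exclude recursion.
// With prune == true, a call whose weight already exceeds the capacity
// returns immediately, as in knapsack_aux above.
long count_calls( std::vector<int> const& weights, int capacity,
                  std::size_t depth, int current_weight, bool prune ) {
    if ( prune && current_weight > capacity ) return 1; // pruned: count this call only
    if ( depth == weights.size() ) return 1;            // leaf
    return 1
         + count_calls( weights, capacity, depth+1, current_weight + weights[depth], prune ) // add item
         + count_calls( weights, capacity, depth+1, current_weight, prune );                 // do not add item
}
```

For the weights `{3, 2, 2, 1}` and capacity 4 used in the earlier `main`, the unpruned recursion makes 31 calls (a full binary tree of depth 4), while the pruned version makes 23. The saving grows quickly when many items are heavy relative to the capacity.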
## Branch and bound optimization
The optimization described above is a basic pruning technique that relies only on the quality of the current candidate solution.
A more effective approach is to use a stronger pruning method known as branch and bound.
Instead of considering only the current state of the partial solution, branch and bound also evaluates the potential future value that can still
be achieved from the current partial solution. Computing the exact best possible future outcome would be as difficult as solving the original problem itself, so instead we use an estimate of that outcome.
This estimate, called a *bound*, represents a value that is guaranteed to be at least as good as the best possible solution that could still be obtained within the current branch. For maximization problems, the bound is an *upper bound* on the best possible solution in the current branch, i.e. all possible solutions in this branch will have a value that is less than or equal to the bound.
For minimization problems, the bound is a *lower bound* on the best possible solution in the current branch and all possible solutions in this branch will have a value that is greater than or equal to the bound.
The estimate will be used as follows - if the bound is the same as or worse
than the best solution we have found so far, then we can prune this branch.
I.e. even if our estimate is exact, and the value of the bound can be achieved
(best case scenario), it is still not better than the best solution we have
found so far, thus there is no point in exploring this branch.
Formulas for the branch and bound optimization applied to the knapsack problem (maximization):
- definition of the bound
$$
\mbox{any solution in this branch} \leq \mbox{upper bound}
$$
- pruning condition: $\mbox{upper bound} \leq \mbox{best solution found so far}$
Results in
$$
\begin{align*}
\mbox{any solution in this branch} & \\
&\leq \mbox{upper bound} \\
&\leq \mbox{best solution found so far}
\end{align*}
$$
i.e.
$$
\mbox{any solution in this branch} \leq \mbox{best solution found so far}
$$
so there is no point in exploring this branch.
### Branch and bound optimization for the knapsack problem
For the knapsack problem, we can use the following estimate:
- sort the remaining items by their value-to-weight ratio, in decreasing order
- build the estimate by adding whole items in that order for as long as they fit
- add the first item that does not fit fractionally, taking just enough of it to fill the remaining capacity
Pseudocode for the estimate:
```cpp
double estimate( weights, values, capacity, current_weight, current_value, depth ) {
    estimate = current_value;
    weight = current_weight; // weight used by the estimate so far
    for each remaining item in decreasing order of value-to-weight ratio {
        if ( weight + item_weight <= capacity ) { // item fits
            estimate += item_value; // take the whole item
            weight += item_weight;
        } else {
            // add fractional part of the first item that does not fit
            estimate += item_value * (capacity - weight) / item_weight;
            break; // we have reached the capacity, no need to continue
        }
    }
    return estimate;
}

void knapsack_aux( weights, values, capacity, subset, current_weight, current_value, depth, best_subset, best_value ) {
    // prune illegal solutions
    if ( current_weight > capacity ) {
        return;
    }
    double est = estimate( weights, values, capacity, current_weight, current_value, depth );
    // prune branches that cannot lead to a better solution than the best one found so far
    if ( est <= best_value ) { // prune - no need to explore this branch
        return;
    }
    // base case
    ...
    // recursive case
    ...
}
```
Several notes:
- if we just add items until the next item cannot fit (no fractional part of the last item), we will get a valid solution, but not necessarily the best one. Example - maximum capacity is 2, and we have two items, one with value 3 and weight 1 (value-to-weight ratio 3) and another with value 4 and weight 2 (value-to-weight ratio 2). The first item has a better value-to-weight ratio, so we add it first, and there is no room for the second item, thus getting a solution with value 3. However, if we add the second item instead, we get a better solution with value 4. What this tells us is that a **greedy** strategy of adding items in order of their value-to-weight ratio is not always optimal. We will be discussing greedy algorithms in more detail later in this course.
- another observation about the above incorrect approach is that greedy solution produces the opposite inequality to the one we need for branch and bound optimization:
$$
\mbox{best possible solution in this branch} \geq \mbox{greedy solution}
$$
while for branch and bound we need the opposite inequality
$$
\mbox{best possible solution in this branch} \leq \mbox{bound (estimate)}
$$
- by using the fractional part of the last item, we get an **invalid** solution (knapsack problem does not allow fractional items), but we get an **upper bound** of the best possible value that we can get in this branch.
- we only need to sort the weights and values once at the beginning of the algorithm.
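The estimate and the pruning rule above can be put together in C++ roughly as follows. This is a minimal sketch rather than a reference implementation: the `Item` struct and the names `knapsack_bb` and `knapsack_bb_aux` are assumptions made here, weights are assumed to be positive, and only the best value (not the best subset) is tracked to keep the code short.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Item { int weight; int value; }; // hypothetical helper type

// Upper bound for the current branch: take the remaining items (index >= depth)
// while they fit, then a fraction of the first one that does not fit.
// Assumes `items` is sorted by decreasing value-to-weight ratio.
double estimate( std::vector<Item> const& items, int capacity,
                 int current_weight, int current_value, std::size_t depth ) {
    double est = current_value;
    int weight = current_weight;
    for ( std::size_t i = depth; i < items.size(); ++i ) {
        if ( weight + items[i].weight <= capacity ) {   // whole item fits
            est += items[i].value;
            weight += items[i].weight;
        } else {                                        // fractional part fills the rest
            est += static_cast<double>( items[i].value ) * ( capacity - weight ) / items[i].weight;
            break;
        }
    }
    return est;
}

void knapsack_bb_aux( std::vector<Item> const& items, int capacity,
                      int current_weight, int current_value, std::size_t depth,
                      int & best_value ) {
    if ( current_weight > capacity ) return;            // prune illegal solutions
    if ( estimate( items, capacity, current_weight, current_value, depth ) <= best_value ) {
        return;                                         // prune: cannot beat the best so far
    }
    if ( depth == items.size() ) {                      // leaf: estimate == current_value > best_value
        best_value = current_value;
        return;
    }
    knapsack_bb_aux( items, capacity, current_weight + items[depth].weight,
                     current_value + items[depth].value, depth+1, best_value ); // add item
    knapsack_bb_aux( items, capacity, current_weight, current_value,
                     depth+1, best_value );                                     // do not add item
}

// kick start function: sorts once, then starts the recursion
int knapsack_bb( std::vector<Item> items, int capacity ) {
    std::sort( items.begin(), items.end(), []( Item const& a, Item const& b ) {
        // a.value/a.weight > b.value/b.weight, compared without division
        return static_cast<long long>( a.value ) * b.weight > static_cast<long long>( b.value ) * a.weight;
    } );
    int best_value = 0;
    knapsack_bb_aux( items, capacity, 0, 0, 0, best_value );
    return best_value;
}
```

For the two-item example from the notes above (capacity 2, items with weight 1 / value 3 and weight 2 / value 4), the estimate at the root is $3 + 4 \cdot \frac{1}{2} = 5$, which is an upper bound on the true optimum of 4, and `knapsack_bb` returns 4.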
### Branch and bound optimization for the assignment problem
Assignment problem is a minimization problem, so we will be looking for a *lower bound* on the cost of the best possible solution in the current branch. To find a lower bound, we can use the following estimation:
- for each unassigned agent, find the minimum cost of assigning one of the still unassigned tasks to that agent (in the C++ code below, where rows correspond to agents, this is the minimum over the free columns of that agent's row)
- add these minimum costs to the cost of the current partial assignment to get a lower bound on the cost of the best possible solution in the current branch
```cpp
int estimate( cost_matrix, assignment ) {
    int est = cost of current assignment;
    for each unassigned agent i {
        min_cost = find cheapest unassigned task for agent i;
        est += min_cost;
    }
    return est;
}
```
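As a small worked example, take the $3 \times 3$ cost table from the beginning of the assignment problem section (with $C[i][j]$ the cost of assigning task $i$ to agent $j$) and suppose agent 0 has already been assigned task 1. The unassigned tasks are 0 and 2, so the estimate is

$$
\begin{align*}
\mbox{est} &= C[1][0] + \min(C[0][1], C[2][1]) + \min(C[0][2], C[2][2]) \\
&= 2 + \min(2, 1) + \min(5, 2) \\
&= 2 + 1 + 2 = 5
\end{align*}
$$

Both remaining agents pick task 2 here, so the estimate does not correspond to a valid assignment. That is acceptable for a bound: the two valid completions of this branch cost $2 + 1 + 5 = 8$ and $2 + 2 + 2 = 6$, and both are at least 5.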
With the estimation function defined, we can add the branch and bound optimization to the backtracking algorithm as follows:
```cpp
void assignment_aux( cost_matrix, assignment, depth, best_assignment, best_cost ) {
    if ( depth == max_depth ) { // base case - complete assignment
        int cost = calculate_cost( cost_matrix, assignment );
        if ( cost < best_cost ) { // check if found a better solution (cheaper)
            best_cost = cost;
            best_assignment = assignment;
        }
        return;
    }
    int est = estimate( cost_matrix, assignment ); // calculate the lower bound on the cost of the best possible solution in this branch
    if ( est >= best_cost ) { // prune - no need to explore this branch
        return;
    }
    for (int i=0; i < n; ++i) { // try assigning task i to agent depth
        if ( task i is not assigned to any agent yet ) {
            assignment[depth] = i; // assign task i to current agent (agent index depth)
            assignment_aux( cost_matrix, assignment, depth+1, best_assignment, best_cost ); // go down the recursion tree
        }
    }
}

std::vector<int> assignment( cost_matrix ) {
    int n = cost_matrix.size();
    std::vector<int> assignment(n); // current assignment
    std::vector<int> best_assignment(n); // best assignment found so far
    int best_cost = std::numeric_limits<int>::max(); // best cost found so far
    assignment_aux( cost_matrix, assignment, 0, best_assignment, best_cost ); // kick start recursion
    return best_assignment;
}
```
Notice that the pruning condition is reversed compared to the knapsack problem, since we are minimizing the cost.
Thus "this branch cannot lead to a better solution than the best one found so far" is equivalent to "the lower bound on the cost of the best possible solution in this branch is greater than or equal to the cost of the best solution found so far", which is expressed as `est >= best_cost`.
C++ implementation of branch and bound optimization for the assignment problem
```cpp
//lower bound evaluation function
int
lower_bound(
        Matrix const& m,
        int depth,
        std::vector<int> const& current_assignment,
        int const& cost_current_assignemnt )
{
    int bound = cost_current_assignemnt;
    // for each future agent (row) find the minimum cost of assigning it to a job (column)
    // that is not already assigned to a previous agent
    for ( int i=depth; i < m.size(); ++i ) {
        int min_in_row = std::numeric_limits<int>::max();
        for ( int j=0; j < m.size(); ++j ) {
            //check if job is not assigned (column is free)
            if ( std::find( current_assignment.begin(), current_assignment.end(), j) == current_assignment.end() ) {
                if ( min_in_row > m[i][j] ) min_in_row = m[i][j];
            }
        }
        bound += min_in_row;
    }
    return bound;
}

void
backtracking_branch_bound_aux(
        Matrix const& m, int depth,
        std::vector<int> & current_assignment,
        int & cost_current_assignemnt,
        std::vector<int> & best_solution_so_far,
        int & cost_best_solution_so_far )
{
    // base case - last level reached
    if ( depth == m.size() ) {
        if ( cost_current_assignemnt < cost_best_solution_so_far ) {
            cost_best_solution_so_far = cost_current_assignemnt;
            best_solution_so_far = current_assignment;
        }
        return;
    }
    for ( int j=0; j < m.size(); ++j ) {
        //skip if job j is already assigned
        if ( std::find( current_assignment.begin(), current_assignment.end(), j) != current_assignment.end() ) continue;
        current_assignment.push_back( j );
        cost_current_assignemnt += m[depth][j];
        int lb = lower_bound( m, depth+1, current_assignment, cost_current_assignemnt );
        //branch cancellation check
        if ( lb < cost_best_solution_so_far ) {
            backtracking_branch_bound_aux( m, depth+1,
                    current_assignment, cost_current_assignemnt,
                    best_solution_so_far, cost_best_solution_so_far );
        }
        // undo assignment and cost update
        current_assignment.pop_back( );
        cost_current_assignemnt -= m[depth][j];
    }
}

std::vector<int>
backtracking_branch_bound( Matrix const& m ) {
    std::vector<int> best_solution_so_far;
    std::vector<int> current_assignment;
    int cost_current_assignemnt = 0;
    int cost_best_solution_so_far = std::numeric_limits<int>::max();
    backtracking_branch_bound_aux( m, 0, current_assignment, cost_current_assignemnt, best_solution_so_far, cost_best_solution_so_far );
    return best_solution_so_far;
}
```
Notes:
- `lower_bound` function calculates the lower bound on the cost of the best possible solution in the current branch, based on the current assignment and the cost matrix.
- the function iterates over the remaining unassigned agents (rows of the cost matrix); for each of them, it finds the unassigned task (column of the cost matrix) with the minimum cost and adds that cost to the lower bound.
- we moved the pruning check `lb < cost_best_solution_so_far` to be **before** the recursive call, rather than in the **beginning** of the recursive function. For this particular implementation, it does not make much difference - instead of making a recursive call and then immediately returning (assuming the branch is pruned), we are just not making the recursive call. But in the next optimization (best-first), it will be more convenient to have the pruning check this way.
## Best-first optimization
*Best-first* optimization is built on top of the branch and bound optimization.
It is based on the idea of exploring the most promising branches first, where the promise of a branch is determined by the value of the bound (estimate) for that branch:
- precalculate the bound (estimate) for each branch at the current depth. Notice that in the previous examples we only calculated the bound for the current branch, best-first will require us to calculate the bounds earlier (before we start the branch).
- sort the branches in order of their bound (estimate) values, and explore them in that order. For a minimization problem, we will be exploring branches with lower bound values first, while for a maximization problem, we will be exploring branches with higher bound values first.
Pseudo-code for best-first optimization:
```cpp
void assignment_aux( cost_matrix, assignment, depth, best_assignment, best_cost ) {
    if ( depth == max_depth ) { // base case - complete assignment
        int cost = calculate_cost( cost_matrix, assignment );
        if ( cost < best_cost ) { // check if found a better solution (cheaper)
            best_cost = cost;
            best_assignment = assignment;
        }
        return;
    }
    // calculate the bound (estimate) for each branch at the current depth
    branch_estimates; // branch (index) -> bound (estimate)
    for each unassigned task i {
        assignment[depth] = i; // tentatively assign task i to current agent (agent index depth)
        int est = estimate( cost_matrix, assignment ); // calculate the bound (estimate) for this branch
        branch_estimates[i] = est;
    }
    sort the branches in order of their bound (estimate) values
    for each branch in order of their bound (estimate) values {
        if ( branch_estimate >= best_cost ) { // prune - no need to explore this branch
            break; // since branches are sorted, all subsequent branches will also be pruned
        }
        assignment[depth] = branch_index; // assign task corresponding to this branch to current agent (agent index depth)
        assignment_aux( cost_matrix, assignment, depth+1, best_assignment, best_cost ); // go down the recursion tree
    }
}
```
Notes:
- the main difference between branch and bound and best-first is that in branch and bound we explore branches in the order they are generated, while in best-first we explore branches in the order of their bound (estimate) values.
- by exploring more promising branches first we hope to find better solutions earlier, which will allow us to prune more branches later on, thus reducing the overall search space and improving the performance of the algorithm.
- code-wise, best-first has an extra loop that precalculates the bounds for all branches at the current depth, and then sorts the branches by those bounds before exploring them.
- the pruning check is still the same, but since we explore branches in order of their bound values, we can break out of the loop as soon as we reach a branch that gets pruned (its bound is no better than the best cost found so far): every subsequent branch has a bound that is at least as bad, so it would be pruned as well.
C++ implementation of best-first optimization for the assignment problem
```cpp
void
backtracking_branch_bound_best_first_aux(
        Matrix const& m, int depth,
        std::vector<int> & current_assignment,
        int & cost_current_assignemnt,
        std::vector<int> & best_solution_so_far,
        int & cost_best_solution_so_far )
{
    // base case
    if ( depth == m.size() ) {
        if ( cost_current_assignemnt < cost_best_solution_so_far ) {
            cost_best_solution_so_far = cost_current_assignemnt;
            best_solution_so_far = current_assignment;
        }
        return;
    }
    std::vector< std::tuple<int,int> > ordered_branches; // tuples of (job index, lower_bound)
    // precompute lower bound for each branch and store it in a vector of tuples (job index, lower_bound)
    for ( int j=0; j < m.size(); ++j ) {
        //skip if job j is already assigned
        if ( std::find( current_assignment.begin(), current_assignment.end(), j) != current_assignment.end() ) continue;
        // tentatively assign job j, evaluate the bound, then undo
        current_assignment.push_back( j );
        cost_current_assignemnt += m[depth][j];
        int lb = lower_bound( m, depth+1, current_assignment, cost_current_assignemnt );
        ordered_branches.push_back( std::make_tuple(j,lb) );
        current_assignment.pop_back( );
        cost_current_assignemnt -= m[depth][j];
    }
    // sort the vector of tuples by lower_bound (second element of the tuple)
    std::sort( ordered_branches.begin(), ordered_branches.end(),
            []( std::tuple<int,int> const& a, std::tuple<int,int> const& b ) {
                return std::get<1>(a) < std::get<1>(b); // increasing order of lower bound
            } );
    for ( auto const& t : ordered_branches ) {
        int job_index = std::get<0>(t);
        current_assignment.push_back( job_index );
        cost_current_assignemnt += m[depth][job_index];
        int lb = std::get<1>(t);
        //branch cancellation check
        if ( lb < cost_best_solution_so_far ) {
            backtracking_branch_bound_best_first_aux( m, depth+1,
                    current_assignment, cost_current_assignemnt,
                    best_solution_so_far, cost_best_solution_so_far);
        } else {
            current_assignment.pop_back( );
            cost_current_assignemnt -= m[depth][job_index];
            break; // since the vector is ordered by lower bound, all the following
                   // branches have a lower bound >= the cost of the best solution so far
        }
        current_assignment.pop_back( );
        cost_current_assignemnt -= m[depth][job_index];
    }
}

std::vector<int>
backtracking_branch_bound_best_first( Matrix const& m ) {
    std::vector<int> best_solution_so_far;
    std::vector<int> current_assignment;
    int cost_current_assignemnt = 0;
    int cost_best_solution_so_far = std::numeric_limits<int>::max();
    backtracking_branch_bound_best_first_aux( m, 0, current_assignment, cost_current_assignemnt, best_solution_so_far, cost_best_solution_so_far );
    return best_solution_so_far;
}
```
Notes:
- we calculate the bound (estimate) for each branch at the current depth before we start exploring the branches, and store it in a vector of tuples (job index, lower_bound).
- we sort the vector of tuples by lower_bound in increasing order, so that we explore branches with lower bound values first.
- we check the pruning condition `lb < cost_best_solution_so_far` before making the recursive call, and if the condition is not satisfied, we break out of the loop since all subsequent branches will also be pruned.
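To experiment with the ordering idea from the notes above, here is a compact, self-contained sketch. It is not the implementation from these notes: the `Matrix` alias, the function names, and in particular `simple_lower_bound` (cost so far plus the cheapest still-free job for every unassigned row) are assumptions made only for illustration, standing in for whatever `Matrix` and `lower_bound` are used elsewhere.

```cpp
#include <algorithm>
#include <limits>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<int>>; // assumption: m[row][job] is the cost

// Hypothetical stand-in bound: cost so far plus, for every unassigned row,
// the cheapest job that is still free. It never overestimates the true cost.
int simple_lower_bound( Matrix const& m, int depth, std::vector<int> const& assigned, int cost ) {
    int n = static_cast<int>( m.size() );
    for ( int i = depth; i < n; ++i ) {
        int cheapest = std::numeric_limits<int>::max();
        for ( int j = 0; j < n; ++j ) {
            if ( std::find( assigned.begin(), assigned.end(), j ) == assigned.end() ) {
                cheapest = std::min( cheapest, m[i][j] );
            }
        }
        cost += cheapest;
    }
    return cost;
}

void best_first_aux( Matrix const& m, int depth, std::vector<int> & current, int cost,
                     std::vector<int> & best, int & best_cost ) {
    int n = static_cast<int>( m.size() );
    if ( depth == n ) { // base case: complete assignment
        if ( cost < best_cost ) { best_cost = cost; best = current; }
        return;
    }
    std::vector< std::pair<int,int> > branches; // (lower bound, job index)
    for ( int j = 0; j < n; ++j ) {
        if ( std::find( current.begin(), current.end(), j ) != current.end() ) continue;
        current.push_back( j ); // tentative assignment, only to compute the bound
        branches.push_back( { simple_lower_bound( m, depth+1, current, cost + m[depth][j] ), j } );
        current.pop_back();
    }
    std::sort( branches.begin(), branches.end() ); // pairs compare by bound first
    for ( auto const& b : branches ) {
        if ( b.first >= best_cost ) break; // every later branch has an equal or larger bound
        current.push_back( b.second );
        best_first_aux( m, depth+1, current, cost + m[depth][b.second], best, best_cost );
        current.pop_back(); // backtrack
    }
}

// returns (assignment, cost); assignment[i] is the job given to row i
std::pair< std::vector<int>, int > best_first( Matrix const& m ) {
    std::vector<int> current, best;
    int best_cost = std::numeric_limits<int>::max();
    best_first_aux( m, 0, current, 0, best, best_cost );
    return { best, best_cost };
}
```

Storing the bound as the first element of the pair lets `std::sort` order the branches without a custom comparator; the tuple-plus-lambda version above is equivalent.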
# More backtracking algorithms
The two examples we have considered used subsets and permutations. More complex problems may require more complex structures, possibly unique to that problem, but the general idea of backtracking remains the same - we build a candidate solution incrementally, and at each step we check whether the current candidate is valid and whether it can lead to a better solution than the best one found so far. If not, we backtrack and try a different path.
## Rummikub
*Rummikub* is a tile-based game for 2 to 4 players, where the goal is to be the
first player to get rid of all your tiles by forming them into valid sets: groups and runs.
A *group* is formed by three or four tiles of the same number but different colors,
while a *run* is made of three or more consecutive numbers of the same color. Examples:
- group: (red 5, blue 5, black 5)
- run: (red 3, red 4, red 5)
- group: (red 5, blue 5, black 5, yellow 5)
- run: (black 5, black 6, black 7, black 8, black 9)
We will consider a simplified version of the game, where instead of a turn-by-turn
game we are dealing with a static problem: given a set of tiles on the table and our own hand,
is it possible to organize all the tiles into valid groups and runs (which will finish the game)?
We will use backtracking to solve this problem. Although the candidate solutions do not form a
predefined combinatorial structure like subsets or permutations, we can still build them
incrementally. Assume all tiles (table and hand) are stored in a vector.
At each step, we select the next tile from that vector and place it on the table.
Depending on the current state of the table, we may have several options for placing the tile:
- we can start a new group (always possible)
- we can start a new run (always possible)
- we can add the tile to an existing group (if denomination is the same and color is not used yet)
- we can add the tile to an existing run (if color is the same and denomination is not used yet)
For example, suppose the table is:
- Group 1: (red 5, blue 5)
- Group 2: (red 6)
- Run 1: (black 3, black 4)
If the next tile is black 5, we have the following options:
- start a new group with black 5
- start a new run with black 5
- add black 5 to group 1 (since it has the same denomination and different color)
- add black 5 to run 1 (since it has the same color and denomination is not used yet in that run)
But if the next tile is red 5, we only have the following options:
- start a new group with red 5
- start a new run with red 5
since group 1 already has a red 5, group 2 has a different denomination, and run 1 has a different color.
The resulting algorithm looks like this:
```cpp
// this algorithm determines whether we can win during this turn
bool rummikub_aux( tiles ) {
if ( tiles are empty ) { // base case - all tiles are placed on the table
if ( all runs and groups on the table are valid ) {
return true; // we have found a valid solution
} else {
return false; // continue backtracking
}
}
tile = next tile from the vector of tiles;
for each valid option of placing the tile on the table {
place the tile on the table according to this option;
if ( rummikub_aux( remaining tiles ) ) { // continue building the solution with the remaining tiles
return true; // we have found a valid solution in this branch
}
remove the tile from the table; // backtrack
}
return false; // we have explored all options and did not find a valid solution
}
bool rummikub( table, hand ) {
return rummikub_aux( table + hand ); // combine table and hand into a single vector of tiles
}
```
Notes:
- this is a decision problem: we are only interested in whether a solution exists. There may be multiple valid solutions, but we only need to find one of them, so we can return true as soon as we find a valid solution, without exploring other options. This is different from optimization problems, where we need to explore all options to find the best solution.
- notice how the algorithm terminates as soon as it finds a solution: if the recursive call returns true, we return true immediately without exploring other options. This is a common pattern in decision backtracking algorithms.
- when building groups and runs on the table, we do not know whether they will become complete/valid, which is why we have to check that in the base case. By keeping track of the colors of the runs and the denominations of the groups, we check the **necessary** conditions for a valid solution, but we do not know whether they are **sufficient** until we hit the base case and check that all groups and runs are valid.
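The pseudo-code above can be turned into a small runnable sketch. The tile representation is an assumption made for illustration: `Tile`, `Set`, `can_add` and `is_valid` are hypothetical names, colors are plain integers, and a set is tagged as a group or a run when it is started. `can_add` checks the necessary conditions during the search, and `is_valid` checks the sufficient conditions in the base case.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Tile { int color; int number; };                 // hypothetical representation
struct Set  { bool is_run; std::vector<Tile> tiles; };  // a group or a run on the table

// necessary condition: the tile does not immediately break the set
bool can_add( Set const& s, Tile const& t ) {
    if ( s.is_run ) { // same color, denomination not used yet in this run
        return s.tiles[0].color == t.color &&
               std::none_of( s.tiles.begin(), s.tiles.end(),
                             [&]( Tile const& x ) { return x.number == t.number; } );
    }
    // group: same denomination, color not used yet in this group
    return s.tiles[0].number == t.number &&
           std::none_of( s.tiles.begin(), s.tiles.end(),
                         [&]( Tile const& x ) { return x.color == t.color; } );
}

// sufficient condition, checked only in the base case
bool is_valid( Set const& s ) {
    if ( s.tiles.size() < 3 ) return false;
    if ( !s.is_run ) return s.tiles.size() <= 4; // can_add already enforced the rest
    auto by_number = []( Tile const& a, Tile const& b ) { return a.number < b.number; };
    auto mm = std::minmax_element( s.tiles.begin(), s.tiles.end(), by_number );
    // numbers are distinct, so they are consecutive iff the span equals the size
    return mm.second->number - mm.first->number + 1 == static_cast<int>( s.tiles.size() );
}

bool rummikub_aux( std::vector<Tile> const& tiles, std::size_t next, std::vector<Set> & table ) {
    if ( next == tiles.size() ) { // base case: all tiles are placed
        return std::all_of( table.begin(), table.end(), is_valid );
    }
    Tile t = tiles[next];
    // options: add the tile to an existing group or run
    for ( std::size_t i = 0; i < table.size(); ++i ) {
        if ( !can_add( table[i], t ) ) continue;
        table[i].tiles.push_back( t );
        if ( rummikub_aux( tiles, next+1, table ) ) return true;
        table[i].tiles.pop_back(); // backtrack
    }
    // options: start a new group or a new run (always possible)
    for ( bool is_run : { false, true } ) {
        table.push_back( Set{ is_run, { t } } );
        if ( rummikub_aux( tiles, next+1, table ) ) return true;
        table.pop_back(); // backtrack
    }
    return false; // all options explored, no valid solution in this branch
}

bool rummikub( std::vector<Tile> table_tiles, std::vector<Tile> const& hand ) {
    table_tiles.insert( table_tiles.end(), hand.begin(), hand.end() ); // table + hand
    std::vector<Set> table;
    return rummikub_aux( table_tiles, 0, table );
}
```

Because the function returns as soon as a recursive call succeeds, the `table` vector still holds the solution that was found at that moment; a variant that needs the actual arrangement could return it instead of a `bool`.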
We can presort the `tiles` before the search begins. One possible heuristic is to sort the tiles by denomination. This ordering allows the algorithm to detect impossible partial solutions earlier during backtracking. For example, suppose the table currently contains the partial run `{black 3, black 4}` and the next available tile is `black 6`. Since the tiles are processed in sorted order, every remaining tile has denomination 6 or higher, so neither black 2 nor black 5 will appear later in the search. As a result, this partial run can never be extended into a valid run, and the algorithm can prune this branch of the search tree immediately instead of exploring it further.
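The dead-run test described above could look like the following sketch. The `Run` structure and the function names are hypothetical, and the check assumes that tiles are processed in nondecreasing order of denomination and that the numbers already in the run are consecutive. It is only a necessary-condition test: it never rejects a run that could still be completed, but it does not detect every hopeless run.

```cpp
#include <algorithm>
#include <vector>

// hypothetical representation: denominations kept in increasing order
struct Run { int color; std::vector<int> numbers; };

// next_number is the denomination of the next tile to be placed; because the
// tiles are sorted, every remaining tile has a denomination >= next_number
bool run_is_dead( Run const& run, int next_number ) {
    if ( run.numbers.empty() ) return false;      // nothing to judge yet
    if ( run.numbers.size() >= 3 ) return false;  // already long enough to be valid
    // the run needs the tile (last + 1), which can no longer appear
    return run.numbers.back() + 1 < next_number;
}

// prune the current branch if any run on the table can never be completed
bool table_has_dead_run( std::vector<Run> const& runs, int next_number ) {
    return std::any_of( runs.begin(), runs.end(),
                        [&]( Run const& r ) { return run_is_dead( r, next_number ); } );
}
```

A check like `table_has_dead_run` would be called at the top of the recursive function, before trying any placement of the next tile. An analogous test could be written for groups that have fewer than three tiles once all tiles of their denomination have been processed.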
# Exercises
- implement using pseudo-code the backtracking algorithm for the N-queens problem
- implement using pseudo-code the best-first optimization for the knapsack problem
- implement using pseudo-code the brute-force recursive backtracking for the following problem: given a set of integers and a target sum, determine if there is a subset of the integers that sums up to the target. For example, given the set {3, 34, 4, 12, 5, 2} and the target sum 9, the algorithm should return true since there is a subset {4, 5} that sums up to 9. Given the set {3, 34, 4, 12, 5, 2} and the target sum 30, the algorithm should return false since there is no subset that sums up to 30.
- implement using pseudo-code the brute-force recursive backtracking for the following problem: given a sequence of tetrominoes (Tetris pieces), determine if it is possible to play them all using the standard Tetris rules. Assume the size of the Tetris cup is 10x20 (width 10 and height 20), and each piece may be rotated before being dropped. When a piece is dropped, it falls down until it either reaches the bottom of the cup or lands on top of another piece (it cannot move once it has landed). If a horizontal line is completely filled with pieces, it is cleared and all pieces above it fall down by one row. The algorithm should return true if it is possible to play all the pieces without losing (i.e. without any piece reaching the top of the cup), and false otherwise. For example, given an arbitrarily long sequence of 1x4 pieces (called `I`), the answer should be true, since we can place them vertically and never need more than 4 rows of the cup. On the other hand, a sufficiently long alternating sequence of `S` and `Z` pieces (the two pieces that look like zig-zags) is known to lead to a loss, since they cannot be stacked forever without leaving gaps, and thus the stack will eventually reach the top of the cup.
- modify the above algorithm to find the best possible game play - the one that clears the most lines. The algorithm should return the maximum number of lines that can be cleared by playing the given sequence of pieces.
- implement using pseudo-code the brute-force recursive backtracking for the following problem: given a set of words and a string, determine if the string can be segmented into a space-separated sequence of one or more dictionary words. For example, given the string "leetcode" and the dictionary ["leet", "code"], the algorithm should return true since "leetcode" can be segmented as "leet code". Given the string "applepenapple" and the dictionary ["apple", "pen"], the algorithm should return true since "applepenapple" can be segmented as "apple pen apple". Given the string "catsandog" and the dictionary ["cats", "dog", "sand", "and", "cat"], the algorithm should return false since there is no way to segment "catsandog" into a sequence of dictionary words.
- implement using pseudo-code the brute-force recursive backtracking for the following problem: given a set of arbitrary tiles (orthogonal polygons - i.e. polygons with all edges either horizontal or vertical) and a rectangular board, determine if it is possible to place all the tiles on the board without overlapping and without going outside the board. Note that the tiles cannot be rotated. Assume the sum of the areas of the tiles is equal to the area of the board. The algorithm should return true if such a placement exists, and false otherwise. For example, given a 3x3 board and the following tiles: a 2x2 square, a 1x3 rectangle, and a 2x1 rectangle (assume width first, height second), the algorithm should return true since we can place the 2x2 square in the top-left corner of the board, the 1x3 rectangle on the right side of the board, and the 2x1 rectangle in the bottom-left corner of the board.
```
+-------+---+
|       |   |
|  2x2  |   |
|       |1x3|
+-------+   |
|  2x1  |   |
+-------+---+
```
On the other hand, given a 3x3 board and the following tiles: a 2x2 square, a 1x3 rectangle, and a 1x2 rectangle (the last tile differs from the previous example), the algorithm should return false since there is no way to place all the tiles on the board.
```
+-------+  +---+  +---+
|       |  |   |  |   |
|  2x2  |  |   |  |   |
|       |  |1x3|  |1x2|
+-------+  |   |  +---+
           |   |
           +---+
```
Discuss how one can detect dead ends earlier in the search, and thus prune the search tree.