Chapter 18
IN THIS CHAPTER
Determining how to perform a local search on an NP-hard problem
Working with heuristics and neighboring solutions
Solving the 2-SAT problem with local search and randomization
Discovering that you have many tricks to apply to a local search
When dealing with an NP-hard problem, a problem for which no known algorithm runs in less than exponential time (see the NP-completeness theory discussion in Chapter 15), you have a few alternatives worth trying. Based on the idea that NP-class problems require some compromise (such as accepting partial or nonoptimal results), the following options offer a solution to this otherwise intractable problem:
Local search is a general approach to solving problems that comprises a large range of algorithms, which will help you escape the exponential complexities of many NP problems. A local search starts from an imperfect problem solution and moves away from it, a step at a time. It determines the viability of nearby solutions, potentially leading to a perfect solution, based on random choice or an astute heuristic (no exact method is involved).
You have no guarantee that a local search will arrive at a problem solution, but your chances do improve from the starting point when you provide enough time for the search to run its computations. It stops only after it can’t find any further way to improve the solution it has reached.
Local search algorithms iteratively improve from a starting solution, moving one step at a time through neighboring solutions until they can’t improve the solution any further. Because local search algorithms are as simple and intuitive as greedy algorithms, designing a local search approach to an algorithmic problem is not difficult. The key is defining the correct procedure:
At the start of the local search, you pick an initial solution. If you decide on a random solution, it’s helpful to wrap the search in repeated iterations in which you generate different random start solutions. Sometimes, arriving at a good final solution depends on the starting point. If you start from an existing solution to be refined, plugging the solution into a greedy algorithm may prove to be a good compromise in fitting a solution that doesn’t take too long to produce.
After choosing a starting point, define the neighborhood and determine its size. Defining a neighborhood requires figuring the smallest change you can impose on your solution. If a solution is a set of elements, all the neighboring solutions are the sets in which one of the elements mutates. For instance, in the traveling salesman problem (TSP), neighboring solutions could involve changing the ending cities of two (or more) trips, as shown in Figure 18-1.
Based on how you create the neighborhood, you may have a shorter or a longer candidate list. Larger lists require more time and computation but, contrary to short lists, may offer more opportunities for the process to end earlier and with a better result. List length involves a trade-off that you refine by experimentation: after each test, determine whether enlarging or shrinking the candidate list brings an advantage or a disadvantage in terms of time to complete and solution quality.
Base the choice of the new solution on a heuristic and, given the problem, decide on the best solution. For instance, in the TSP problem, use the trip switches that shorten the total tour length the most. In certain cases, you can use a random solution in place of a heuristic (as you discover in the SAT-2 problem in this chapter). Even when you have a clear heuristic, the algorithm could find multiple best solutions. Injecting some randomness could make your local search more efficient. When faced with many solutions, you can safely choose one randomly.
Your search has to stop at a certain point, so you need to choose stopping rules for the local search. A natural rule is to stop when your heuristic can't find good neighbors anymore or can't improve solution quality (measured, for instance, by a cost function, as happens in TSP with the total length of the tour). Depending on the problem, if you don't create a stopping rule, your search may go on forever or take an unacceptably long time. If you can't define a stopping rule, just limit the time spent looking for solutions or count the number of trials. When counting trials, you can calculate the probability of success, and when that probability becomes too small, decide that it's not worth going on.
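The steps above can be sketched as a generic local-search loop. This is a minimal, illustrative version (the function and parameter names here are hypothetical, not from the book): it takes a starting solution, a function that generates the neighborhood, and a heuristic score, breaks ties among equally good neighbors at random, and stops when no neighbor improves or a trial limit is hit.

```python
import random

def local_search(start, neighbors, score, rng=None, max_steps=1000):
    """Generic local search: move to the best neighbor until none improves.

    start     -- initial solution (random or produced by a greedy algorithm)
    neighbors -- function returning the candidate list around a solution
    score     -- heuristic objective; here, higher is better
    """
    rng = rng or random.Random(0)
    current = start
    for _ in range(max_steps):          # stopping rule: cap the number of trials
        candidates = neighbors(current)
        best_score = max(score(c) for c in candidates)
        if best_score <= score(current):
            return current              # no improving neighbor: stop here
        # Break ties among equally good candidates at random
        ties = [c for c in candidates if score(c) == best_score]
        current = rng.choice(ties)
    return current

# Toy problem: integers with neighbors x-1 and x+1, maximizing -(x - 7)^2
found = local_search(0, lambda x: [x - 1, x + 1], lambda x: -(x - 7) ** 2)
```

Starting from 0, the search climbs one step at a time and stops at 7, where both neighbors score worse.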
Local search tracks the current solution and moves to neighboring solutions one at a time until it finds a solution (or can’t improve on the present solution). It presents some key advantages when working on NP-hard problems because it
Local search provides a viable way to find acceptable solutions to NP-hard problems. However, it can’t work properly without the right heuristic. Randomization can provide a good match with local search, and it helps by using
Randomization isn’t the only heuristic available. A local search can rely on a more reasoned exploration of solutions using an objective function to get directions (as in hill-climbing optimization) and avoid the trap of so-so solutions (as in simulated annealing and Tabu Search). An objective function is a computation that can assess the quality of your solution by outputting a score number. If you need higher scores in hill climbing, you have a maximization problem; if you are looking for smaller score numbers, you have a problem of minimization.
You can easily find analogies of the techniques employed by local search because many phenomena imply a gradual transition from one situation to another. Local search isn't just a technique devised by experts on algorithms but is actually a process that you see in both nature and human society. In society and science, for instance, you can view innovation as a local search of the next step among the currently available technologies (https://www.technologyreview.com/s/603366/mathematical-model-reveals-the-patterns-of-how-innovations-arise/). Many heuristics derive from the physical world, taking inspiration from the force of gravity, the fusion of metals, the evolution of DNA in animals, and the behavior of swarms of ants, bees, and fireflies (the paper at https://arxiv.org/pdf/1003.1464.pdf explains the Lévy-Flight Firefly algorithm).
Hill climbing takes inspiration from the force of gravity. It relies on the observation that as a ball rolls down a valley, it takes the steepest descent, and when it climbs a hill, it tends to take the most direct upward direction to reach the top. Gradually, one step after the other, no matter whether it’s climbing up or down, the ball arrives at its destination, where proceeding higher or lower isn’t possible.
In local search, you can mimic the same procedure successfully using an objective function, a measurement that evaluates the neighboring solutions and determines which one improves on the current one. Using the hill-climbing analogy, having an objective function is like feeling the inclination of the terrain and determining the next best move. From the current position, a hiker evaluates each direction to determine the terrain’s slope. When the goal is to reach the top, the hiker chooses the direction with the greatest upward slope to reach the top. However, that’s just the ideal situation; hikers often encounter problems during a climb and must use other solutions to circumnavigate them.
An objective function is similar to a greedy criterion (see Chapter 5). It's blind with respect to its final destination, so it can determine direction but not detect obstacles. Think about the effect of blindness when climbing a mountain — it's difficult to say when a hiker reaches the top. Flat terrain that lacks any opportunities for upward movement could indicate that the hiker reached the top. Unfortunately, a flat spot can also be a plain, an elbow, or even a hole the hiker happened to fall into. You can't be sure because the hiker can't see.
The same problem happens when using a local search guided by a hill-climbing heuristic: It pursues progressively better neighbor solutions until it can’t find a better solution by checking the solutions that exist around the current one. At this point, the algorithm declares it found the solution. It also says that it has found a global solution, even though, as illustrated in Figure 18-2, it may have simply found a local maximum, a solution that’s the best around because it’s surrounded by worse solutions. It’s still possible to find a better solution through further exploration.
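The hiker analogy translates directly into code. The sketch below (an illustrative example, not the book's implementation) climbs a one-dimensional terrain by probing the slope on each side and moving toward the steeper ascent; it declares victory on the first flat spot it finds, which is exactly how it can get fooled by a plateau or a local maximum.

```python
def hill_climb(f, x, step=0.1, max_steps=10_000):
    """One-dimensional greedy ascent (a minimal sketch).

    Probe the terrain left and right of the current spot and move in the
    steeper upward direction; stop on flat ground, which may be the top --
    or just a plateau or a local maximum.
    """
    for _ in range(max_steps):
        left, right = f(x - step), f(x + step)
        if max(left, right) <= f(x):
            return x                    # nowhere higher to go
        x = x - step if left > right else x + step
    return x

# A single smooth hill with its top at x = 2
top = hill_climb(lambda x: -(x - 2) ** 2, x=0.0)
```

On this single-peaked terrain the climb ends near x = 2; on a terrain with several peaks, it would stop at whichever peak is uphill from the starting point.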
An example of hill climbing in action (and of the risks of getting stuck in a local maximum or, when you're descending as in this example, a local minimum) is the n-queens puzzle, first proposed by the chess expert Max Bezzel in 1848 as a challenge for chess lovers. In this problem, you have a number of queens (this number is n) to place on a chessboard of n x n dimensions. You must place them so that no queen is threatened by any other. (In chess, a queen can attack in any direction: along a row, a column, or a diagonal.)
Unfortunately, this approach works only about 14 percent of the time because it gets stuck in a chessboard configuration that won’t allow any further improvement 86 percent of the time. (The number of queens under attack won’t diminish for all 2*n moves available as next solutions.) The only way you get away from such a block is to restart the local search from scratch by choosing another random starting configuration of the queens on the chessboard. Figure 18-3 shows a successful solution.
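A possible sketch of this hill-climbing approach with random restarts follows (the function names are hypothetical and the details are one reasonable interpretation, not the book's exact procedure): each board places one queen per column, each step moves the single queen that most reduces the number of attacking pairs, and when no move helps, the search restarts from a fresh random board.

```python
import random

def conflicts(queens):
    """Count attacking pairs; queens[i] is the row of the queen in column i."""
    n = len(queens)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if queens[i] == queens[j]
               or abs(queens[i] - queens[j]) == j - i)

def queens_hill_climb(n, max_restarts=200, seed=42):
    """Hill climbing with random restarts: each step moves one queen
    within its column to the placement that lowers conflicts the most."""
    rng = random.Random(seed)
    for restart in range(1, max_restarts + 1):
        queens = [rng.randrange(n) for _ in range(n)]
        while True:
            current = conflicts(queens)
            if current == 0:
                return queens, restart              # solved
            best_move, best_score = None, current
            for col in range(n):
                original = queens[col]
                for row in range(n):
                    if row != original:
                        queens[col] = row
                        score = conflicts(queens)
                        if score < best_score:
                            best_move, best_score = (col, row), score
                queens[col] = original
            if best_move is None:                   # stuck in a local minimum
                break                               # restart from scratch
            queens[best_move[0]] = best_move[1]
    return None, max_restarts

board, restarts = queens_hill_climb(8)
```

Each restart alone succeeds only a fraction of the time, but a couple hundred random restarts make finding a conflict-free board for n = 8 all but certain.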
In spite of this weakness, hill-climbing algorithms are used everywhere, especially in artificial intelligence and machine learning. Neural networks that recognize sounds or images, power mobile phones, and motivate self-driving cars mostly rely on a hill-climbing optimization called gradient descent. Randomized starts and random injections in the hill-climbing procedure make it possible to escape any local solution and reach the global maximum. Both simulated annealing and Tabu Search are smart ways to use random decisions in hill climbing.
At a certain point in the search, if your objective function stops giving you the right indications, you can use another heuristic to control the situation and try to find a better path to a better task solution. This is how both simulated annealing and Tabu Search work: They provide you with an emergency exit when needed.
Simulated annealing takes its name from a technique in metallurgy, which heats a metal to a high temperature and then slowly cools it to soften the metal for cold working and to remove internal crystalline defects (see http://www.brighthubengineering.com/manufacturing-technology/30476-what-is-heat-treatment/ for details on this metal-working process). Local search replicates this idea by viewing the solution search as an atomic structure that changes to improve its workability. The temperature is the game changer in the optimization process. Just as high temperatures make the structure of a material relax (solids melt and liquids evaporate at high temperatures), so high temperatures in a local search algorithm induce relaxation of the objective function, allowing it to prefer worse solutions to better ones. Simulated annealing modifies the hill-climbing procedure, keeping the objective function for neighbor solution evaluation but allowing it to determine the search solution choice in a different way:
(See http://www.iue.tuwien.ac.at/phd/binder/node87.html.) At this point, you must iterate the search for solutions. For each step in the previous iteration, between the preceding Steps 3 and 4, do the following:
Simulated annealing is a smart way to improve hill climbing because it avoids having the search stopped at a local solution. When the temperature is high enough, the search might use a random solution and find another way to a better optimization. Because the temperature is higher at the beginning of the search, the algorithm has a chance of injecting randomness into the optimization. As the temperature cools to zero, less and less chance exists for picking a random solution, and the local search proceeds as in hill climbing. In TSP, for instance, the algorithm achieves simulated annealing by challenging the present solution at high temperatures by
If the resulting adjustments worsen the tour’s length, the algorithm keeps or rejects them according to the temperature in the simulated annealing process.
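The acceptance rule at the heart of simulated annealing can be sketched generically (this is an illustrative minimization example with hypothetical names, not the book's TSP code): worsening moves are accepted with probability e^(-delta/T), so randomness dominates while the temperature is high and fades as it cools.

```python
import math
import random

def anneal(objective, neighbor, start, t0=10.0, cooling=0.995,
           steps=5000, seed=0):
    """Minimize `objective` by simulated annealing (a generic sketch)."""
    rng = random.Random(seed)
    current, temp = start, t0
    best = current
    for _ in range(steps):
        candidate = neighbor(current, rng)
        delta = objective(candidate) - objective(current)
        # Always accept improvements; accept worse moves with
        # probability e^(-delta / T), which shrinks as T cools
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current = candidate
        if objective(current) < objective(best):
            best = current
        temp *= cooling                 # the schedule: cool a little each step
    return best

# Toy minimization: the cost bowl (x - 3)^2 explored by random steps
found = anneal(lambda x: (x - 3) ** 2,
               lambda x, rng: x + rng.uniform(-1, 1),
               start=0.0)
```

Early on, the high temperature lets the search wander over worse solutions; by the end, the near-zero temperature makes it behave like plain hill climbing, settling near the minimum at x = 3.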
Tabu is an ancient word from Polynesian Tongan indicating that certain things can't be touched because they're sacred. The word tabu (spelled taboo in English) passed from anthropological studies into everyday language to indicate something that is prohibited. In local search optimization, it's common to become stuck in a neighborhood of solutions that don't offer any improvement; that is, in a local solution that appears to be the best around but is far from the solution you want. Tabu Search relaxes some rules and enforces others to offer a way out of local minima and help you reach better solutions.
The Tabu Search heuristic wraps the objective function and works its way along many neighboring solutions. It intervenes when you can't proceed because the next solutions don't improve on your objective. When that happens, Tabu Search does the following:
You can relate Tabu Search to caching and memoization (see Chapter 16) because it requires the algorithm to track its steps to save time and avoid retracing previously used solutions. In the TSP, it can help when you try optimizing your solution by swapping the visit order of two or more cities by avoiding repeat solution sets.
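A minimal sketch of the tabu idea follows (an illustrative example with hypothetical names, not the book's TSP code): a short-term memory of recently visited solutions is off-limits, so the search must move to the best non-tabu neighbor even when that neighbor is worse, which is exactly what lets it walk out of a local minimum instead of bouncing back into it.

```python
from collections import deque

def tabu_search(objective, neighbors, start, tabu_size=10, steps=50):
    """Minimize `objective`; recently visited solutions are tabu (a sketch)."""
    current, best = start, start
    tabu = deque([start], maxlen=tabu_size)   # short-term memory
    for _ in range(steps):
        # Move to the best neighbor that isn't tabu -- even a worse one,
        # which is what lets the search escape a local minimum
        candidates = [c for c in neighbors(current) if c not in tabu]
        if not candidates:
            break
        current = min(candidates, key=objective)
        tabu.append(current)
        if objective(current) < objective(best):
            best = current
    return best

# Toy problem: integers with neighbors x-1 and x+1, minimizing (x - 5)^2
found = tabu_search(lambda x: (x - 5) ** 2, lambda x: [x - 1, x + 1], start=0)
```

The search reaches the minimum at x = 5, then the tabu list forces it to keep exploring worse neighbors; tracking the best solution seen means the detour costs nothing.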
As a practical view of how a local search works, this example delves into circuit satisfiability, a classical NP-complete problem. It uses randomization and a Monte Carlo algorithm approach. As seen in Chapter 17, a Monte Carlo algorithm relies on random choices during its optimization process and isn't guaranteed to succeed in its task, although it has a high likelihood of completing the task successfully. The problem isn't merely theoretical, though, because it tests how electronic circuits work, optimizing them by removing circuits that can't transport electric signals. Moreover, the solving algorithm sees use in other applications: automatic labeling on maps and charts, discrete tomography, scheduling with constraints, data clustering into groups, and other problems for which you have to make conflicting choices.
Computer circuits are composed of a series of connected components, each one opening or closing a circuit based on its inputs. Such elements are called logic gates (physically, their role is played by transistors) and if you build a circuit with many logic gates, you need to understand whether electricity can pass through it and under what circumstances.
Chapter 14 discusses the internal representation of a computer, based on zeros (absence of electricity in the circuit) or ones (presence of electricity). You can render this 0/1 representation from a logical perspective, turning signals into False (there isn't electricity in the circuit) or True (there is indeed electricity) conditions. Chapter 4 examines the Boolean operators (AND, OR, and NOT), as shown in Figure 18-4, which work on True and False conditions as inputs and transform them into a different output. All these concepts help you represent a physical electric circuit as a sequence of Boolean operators defining logic gates. The combination of all their conditions determines whether the circuit can carry electricity.
This logic representation is a Boolean combinational circuit, and the test to verify its functionality is the circuit satisfiability. In the easiest scenario, the circuit consists of only NOT conditions (called inverters) that accept one wire input, and OR conditions that accept two wires as inputs. This is a satisfiability-two (2-SAT) scenario, and if the algorithm were to go through it using an exhaustive search, it would take at worst 2^k trials (having k as the number of input wires) to find a set of conditions that makes electricity pass through the whole circuit. There are even more complex versions of the problem, accepting more inputs for each OR logic gate and using AND gates, but they are beyond the scope of this book.
No matter the electronic circuit you have to test using a Boolean representation, you can render it as a vector of Boolean variables. You can also create another vector to contain the clauses, the set of conditions the circuit needs to satisfy (for example, that wire A and wire B should both be True). This isn't the only way to represent the problem; in fact, there are other solutions involving the use of graphs. However, for this example, these two vectors are enough.
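To make the two-vector representation concrete, here is a tiny, made-up 2-SAT instance checked by exhaustive search (an illustrative sketch, not the book's code): each clause is a pair of signed integers, and brute force must try up to 2^k assignments, which is exactly the exponential cost the upcoming local search avoids.

```python
from itertools import product

# A tiny (hypothetical) 2-SAT instance over wires 1..3: each clause is a
# pair of signed integers, where +j reads "wire j is True" and -j reads
# "wire j is False" (a NOT gate on that wire)
clauses = [(1, 2), (-1, 3), (-2, -3)]

def satisfied(assignment, clauses):
    """assignment maps wire -> bool; each OR clause needs one true literal."""
    return all((assignment[abs(a)] == (a > 0)) or
               (assignment[abs(b)] == (b > 0))
               for a, b in clauses)

# Exhaustive search: 2^k candidate assignments for k wires
wires = sorted({abs(lit) for pair in clauses for lit in pair})
solutions = [dict(zip(wires, bits))
             for bits in product([False, True], repeat=len(wires))
             if satisfied(dict(zip(wires, bits)), clauses)]
```

For these three clauses, only two of the eight possible assignments satisfy the whole circuit; with 1,000 wires, enumerating 2^1000 assignments is hopeless, which is why a randomized local search is needed.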
You solve the problem using a randomized local search in polynomial time. Professor Christos H. Papadimitriou, teaching at the University of California at Berkeley (https://people.eecs.berkeley.edu/~christos/), devised this algorithm, called RandomWalkSAT. He presented it in his paper "On Selecting a Satisfying Truth Assignment," published in 1991 in the Proceedings of the 32nd IEEE Symposium on the Foundations of Computer Science. The algorithm is competitive when compared to more reasoned approaches, and it is an exemplary local search because it makes just one change at a time on the current solution. It uses two nested loops, one for trying the starting solution multiple times and one for randomly amending the initial random solution. The outer loop repeats log2(k) times (where k is the number of wires); the inner loop amends the solution, repeating 2*k^2 times.
To solve the 2-SAT problem using Python and the RandomWalkSAT algorithm, you need to set up a few helpful functions. The create_clauses and signed functions help generate a circuit problem to solve by handling the OR and NOT gates, respectively. Using these functions, you specify the number of OR gates and provide a seed number that guarantees that you can recreate the resulting problem later (allowing you to try the problem multiple times and on different computers).
The create_random_solution function provides a cold problem start by providing a random solution that sets the inputs. The chance of finding the right solution by pure luck is slim (one in two to the power of the number of gates), but on average, you can expect that three quarters of the gates are correctly set (because, as the truth table for the OR function shows, three of the four possible input combinations are True). The check_solution function determines when the circuit is satisfied (indicating a correct solution). Otherwise, it outputs what conditions aren't satisfied. (You can find this code in the A4D; 18; Local Search.ipynb file on the Dummies site as part of the downloadable code; see the Introduction for details.)
import random
from math import log2

import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

def signed(v):
    # Randomly negate v to represent a NOT gate on the wire
    return v if np.random.random() < 0.5 else -v

def create_clauses(i, seed=1):
    # Each clause is an OR gate over two (possibly negated) wires
    np.random.seed(seed)
    return [(signed(np.random.randint(i)), signed(
        np.random.randint(i))) for j in range(i)]

def create_random_solution(i, *args):
    # Assign a random True/False state to every wire
    return {j: signed(1) == 1 for j in range(i)}

def check_solution(solution, clauses):
    # Return the indexes of the clauses the solution doesn't satisfy
    violations = list()
    for k, (a, b) in enumerate(clauses):
        if not (((solution[abs(a)]) == (a > 0)) |
                ((solution[abs(b)]) == (b > 0))):
            violations.append(k)
    return violations
After setting these functions, you have all the building blocks for a sat2 function to solve the problem. This solution uses two nested iterations: The first replicates many starts; the second picks unsatisfied conditions at random and makes them true. The solution runs in polynomial time. The function isn't guaranteed to find a solution if one exists, but chances are that it will provide one. In fact, the internal iteration loop makes 2*k^2 random attempts to solve the circuit, which usually proves enough for a random walk on a line to reach its destination.
A random walk on a line is the easiest example of a random walk. On average, k^2 steps of a random walk are required to arrive at distance k from the starting point. This expected effort explains why RandomWalkSAT requires 2*k^2 random chances to amend the starting solution. The number of chances provides a high probability that the algorithm will fix the k clauses. Moreover, it works like the random card guessing game discussed in the previous chapter: As the algorithm proceeds, choosing the right answer becomes easier. The external replications guarantee an escape from unfortunate internal-loop random choices that may stop the process in a local solution.
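You can check the k^2 rule with a quick simulation (an illustrative sketch, not part of the book's solution): flip a fair coin many times and measure how far, on average, the walk ends up from the origin.

```python
import random

def mean_walk_distance(steps, trials=2000, seed=0):
    """Average absolute distance from the origin of a 1-D random walk."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        position = 0
        for _ in range(steps):
            position += 1 if rng.random() < 0.5 else -1
        total += abs(position)
    return total / trials

# After k^2 steps, the walk ends up roughly distance k from the origin:
# 100 steps should land about 8 units away (sqrt(2 * 100 / pi) ~ 8)
d100 = mean_walk_distance(100)
```

Doubling the distance you want to cover quadruples the steps required, which is why the inner loop budget grows with the square of the number of wires.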
def sat2(clauses, n, start=create_random_solution):
    for external_loop in range(round(log2(n))):
        solution = start(n, clauses)
        history = list()
        for internal_loop in range(2*n**2):
            response = check_solution(solution, clauses)
            unsatisfied = len(response)
            history.append(unsatisfied)
            if unsatisfied == 0:
                print("Solution in %i external loops," %
                      (external_loop+1), end=" ")
                print("%i internal loops" %
                      (internal_loop+1))
                break
            else:
                # Pick a violated clause at random and flip one of
                # its two wires so that the clause becomes satisfied
                r1 = random.choice(response)
                r2 = np.random.randint(2)
                clause_to_fix = clauses[r1][r2]
                solution[abs(clause_to_fix)] = (
                    clause_to_fix > 0)
        else:
            continue   # inner loop exhausted: try a new random start
        break          # a solution was found: stop restarting
    return history, solution
Now that all the functions are correctly set, you can run the code to solve a problem. Here’s the first example, which tries the circuit created by seed 0 and uses 1,000 logic gates.
n = 1000
# Solvable seeds with n=1000 : 0,1,2,3,4,5,6,9,10
# Unsolvable seeds with n=1000 : 8
clauses = create_clauses(n, seed=0)
history, solution = sat2(clauses, n,
                         start=create_random_solution)

Solution in 1 external loops, 1360 internal loops
Plotting the solution as a chart, with the number of steps (random adjustments of the solution) on the abscissa and the clauses left to fix on the ordinate, you can verify that the algorithm tends to find the correct solution in the long run, as shown in Figure 18-5.
plt.plot(np.array(history), 'b-')
plt.xlabel("Random adjustments")
plt.ylabel("Unsatisfied clauses")
plt.grid(True)
plt.show()
If you try the circuit with 1,000 gates and a seed equal to 8, you'll notice that it seems never to end. This happens because the circuit is not solvable, and making all the random choices and attempts takes a long time. In the end, the algorithm won't provide you with any solution.
Even though the RandomWalkSAT algorithm has a worst-case runtime complexity of O(log2(k) * k^2), with k the number of inputs, you can speed it up by hacking the starting point. In fact, even though starting with a random configuration means that, on average, a quarter of the clauses remains unsatisfied at the start, you can fix many of them using a pass over the data.
The problem with clauses is that many require a true input and, simultaneously, many others require a false input. When all the clauses involving an input require it to be true, or all require it to be false, you can set that input to the required condition, which satisfies a large number of clauses and makes solving the residual ones easier. The following new RandomWalkSAT implementation includes a start phase that immediately resolves every input that all its clauses require to be in a specific true or false setting:
def better_start(n, clauses):
    # Map each wire to the set of signed forms it appears in
    clause_dict = dict()
    for pair in clauses:
        for clause in pair:
            if abs(clause) in clause_dict:
                clause_dict[abs(clause)].add(clause)
            else:
                clause_dict[abs(clause)] = {clause}
    solution = create_random_solution(n)
    # Wires appearing with a single sign can be fixed immediately
    for clause, value in clause_dict.items():
        if len(value) == 1:
            solution[clause] = value.pop() > 0
    return solution
The code defines a new function for the cold start where, after generating a random solution, it scans through the solution and finds all the inputs associated with a single state (true or false). By setting them immediately to the required state, you can reduce the number of clauses requiring amendment, and have the local search do less work and complete earlier.
n = 1000
# Solvable seeds = 0,1,2,3,4,5,6,9,10
# Unsolvable seeds = 8
clauses = create_clauses(n, seed=0)
history, solution = sat2(clauses, n, start=better_start)

Solution in 1 external loops, 393 internal loops
By providing this new, simplified starting point and charting the results, you can immediately see an improvement because, on average, fewer operations are needed to complete the task.