Quantum Computing for Optimization Problems — Solving the Knapsack Problem

How to solve an optimization problem using quantum computing, compared to a traditional solution

Florin Andrei
Towards Data Science

--

There are many problems that require finding the maximum or minimum of a function, called the objective function, that depends on several (or many) variables; certain constraints may or may not need to be applied to the variables. The variables could be binary, integer, elements in sets, floating point, etc. The constraints may apply to variables individually, or could be more complex.

Specifically, these problems could offer solutions for optimizing industrial processes, or optimizing the flow of goods between factories and warehouses (by cost, by time, by throughput), could optimize driving routes, could find all members of a social network that satisfy some complex criterion — the examples could go on.

The algorithms for solving these problems are numerous: linear programming, integer programming, simulated annealing, pure simulation techniques, and many more. These tend to be studied in fields you may know as Prescriptive Analytics, Operations Research, etc. Generally, these algorithms run on conventional CPUs. Where applicable, GPU solvers are also used.

Quantum computing is a future-oriented topic that has been gaining popularity in recent years. While full general-purpose quantum computing is not generally available right now (January 2023), certain specialized machines can be used already to solve optimization problems extremely fast. Here is an article published by Ars Technica showcasing some applications for these early quantum systems.

In this article, we will describe briefly one of these early quantum architectures, and we will compare it with a classical approach for solving a typical optimization case — the knapsack problem.

But first, let’s talk about models.

Quadratic Models — Binary, Discrete, Constrained

There is a class of models called binary quadratic (BQM). Their variables are binary (can take one out of two possible values), and they are a combination of linear and quadratic terms. The Ising formula provides an example of an objective function for a BQM model:

Ising model
Ising model, objective function

The variables s can take values from {-1, +1}. The h coefficients are the linear biases. The J coefficients are the quadratic couplings. The objective function can be thought of as the energy of the system — more on that later.

An equivalent way to describe the same BQM models is the QUBO formula:

QUBO model
QUBO model, objective function

The model variables are the vector x of binary values — the possible values are {0, 1}. The upper-diagonal matrix Q contains the linear coefficients on the diagonal, and the quadratic coefficients are off-diagonal.

Both the Ising and the QUBO equations describe the same models, and converting between them should be trivial. Ising may look more familiar from a physics background, whereas QUBO is more akin to a computer science model. Solving these models means finding the values for variables that minimize the objective function.

If the variables can take discrete values from a set larger than just binary, the model is called discrete quadratic — DQM. If the variables can have any integer or real values, and various constraints may bind these variables, then the model is called constrained quadratic — CQM. These are generalizations of the basic BQM description.

Let’s explore a hardware architecture that can solve such models very efficiently.

D-Wave Quantum Annealers

A general-purpose quantum computer may contain a number of qubits (quantum bits, with values of either 0 or 1) connected via quantum logic gates. Such a machine could be programmed via an SDK like Qiskit, and could solve virtually any problem — hence the qualifier “general-purpose”. Right now, no such machines can yet solve real-world problems since the hardware is still very limited by low qubit counts and high error rates.

But for specialized applications, quantum computers can already be used. D-Wave has built quantum computers that can be used to solve various optimization problems today. This is the architecture of a D-Wave QPU:

D-Wave Chimera QPU
D-Wave Chimera QPU

The thin, elongated black shapes, grouped in 4+4 grids (unit cells), are tiny superconductive loops, criss-crossing the surface of the chip. Each loop, being superconductive, can hold an electric current. Due to the current, the loop will have a small magnetic field. The direction of the magnetic field could be either up or down (either +1 or -1). Think of the loops as qubits — quantum bits.

The green dots are called internal couplers — they are internal to each 4x4 unit cell. They couple the magnetic fields of two intersecting loops by some amount, so the values of the fields are partially co-dependent. The loops do not touch, they are just coupled magnetically.

The blue dots are called external couplers. They couple the magnetic fields of loops in different unit cells.

There are many unit cells in a Chimera QPU, with the total number of qubits exceeding 2000 per QPU.

Remember the Ising formula for BQMs? The magnetic fields in the loops are the variables s in the model. Each loop can have a hardware bias that strengthens or weakens the field — the biases are the linear coefficients h in the Ising formula. And the couplers (internal or external) are the quadratic coefficients J in Ising.

The total energy of the system, as calculated via the Ising formula, is given by:

  • the orientation (up or down) of the magnetic field of each loop (the s variables in Ising, which can be either +1 or -1)
  • the bias of each loop (the h coefficients in Ising)
  • the couplings (the J coefficients in Ising)

In other words, the total energy of the quantum device is the same as the objective function in the Ising model. If the physical system reaches its minimum possible energy, that ought to be the same as the smallest possible value for the objective function.

To solve a BQM, the Ising formula needs to be written according to the specifics of the problem. The h and J coefficients from the Ising formula give the biases and couplings within the QPU, and are coded into the hardware. Initially, all loops are set in a state of quantum superposition, where their magnetic fields are pointing both up and down. That corresponds to a high energy state. The whole setup process can be done in Python via the D-Wave API.

The total energy of the system is then lowered gradually (a process called annealing), until it reaches the minimum possible value given the h and J coefficients (biases and couplings). At that point, the quantum superposition of the magnetic fields is destroyed; the field of any loop will point either up or down.

The minimum value of the total energy is the value of the objective function that corresponds to the solution to the BQM. The directions of the magnetic fields of the loops are the values of the variables which solve the optimization problem (up=+1, down=-1). Simply reading the loops at that point provides the solution to the problem.

In practice, quantum annealing takes a fraction of a second. Regardless of the size of the problem, if the Ising BQM can fit on a QPU chip, the solution (the annealed final state) is obtained very quickly. D-Wave exploits the tendency of physical systems to reach low energy states, and uses that to solve NP-hard problems very quickly. The name Ising may ring a bell — the equations determined by Lenz and Ising a century ago are a statistical model for ferromagnetism, and are here used to solve a specific type of model via physical processes.

In a sense, the quantum annealer (D-Wave QPU) is like an analog computer that simulates numeric problems with physical processes. It could also be thought of as related to an FPGA, in the sense that it’s a device that can be configured for a specific problem, solves the problem quickly, then can be reconfigured for a different problem.

Back to the Models

As long as your problem can be expressed as a BQM, it can be solved by a D-Wave quantum annealer directly and very quickly.

What if the problem is not quadratic? It could be linear, or cubic, etc. In general, models can be reformulated by introducing dummy variables, and via other techniques, until they become quadratic. The D-Wave Problem-Solving Handbook describes several such techniques.

What if the models, even after reformulation, are quadratic but not purely binary? Some could be discrete models (DQM — more than just two values for each variable), others may have constraints and the variables could be anything (CQM — constrained quadratic models). DQM and CQM cannot run directly on a quantum annealer. However, you can still code the problem using the D-Wave API and submit it to the solvers in the cloud. D-Wave will decompose the problem into BQM (which can be solved by the quantum machine) and the rest (which is then solved by D-Wave using regular solvers). Think of it as divide-and-conquer: D-Wave tries to solve as much of your model as possible using the quantum annealer, but some problems still have parts that do not conform to the BQM equations. The combination of quantum and classical solvers that can solve different kinds of quadratic models is called a hybrid solver.

In practice, even when a DQM or a CQM is solved with hybrid solvers, the process tends to still be very fast. The hard kernel of the problem is solved as BQM (perhaps many BQMs are sampled from a large parameter space), and the classical algorithms can be thought of as summarizing across large sample sets of BQMs.

It is worth spending time to formulate the problem as a BQM, as that will be solved extremely fast on a quantum annealer. It may still be worth formulating the problem as a DQM or CQM, since in many cases the problem will still be solved by the D-Wave hybrid solvers in a very short time.

If the objective function needs to be maximized instead of minimized, just flip the sign on the whole formula. If you need to solve an equality instead of minimizing some expression, move all terms to the left side (which then becomes equal to zero), then square it — the squared amount, when minimized, will reach zero, which is what you want.

Many mathematical tricks like these can be used to formulate the problem in terms close to the quadratic models that the D-Wave solvers (quantum and hybrid) can solve very efficiently — minimize a function that looks a lot like a quadratic model.

The Knapsack Problem

The example we will use here to compare quantum technology with traditional approaches is called the knapsack problem.

knapsack problem
knapsack problem

You have a knapsack with a weight limit — it can only carry a certain maximum weight. You have many items, each with a different weight, and a different monetary value. You want to choose a number of items to carry in the knapsack, while maximizing the monetary value of the whole transport, but without exceeding the maximum allowable weight.

The objective function is obviously the monetary value of all items in the knapsack — bigger is better. The constraint is the total weight of the items, which must be at or below some value.

There are many ways to solve this problem. A brute force approach is doable if the number of items is small. This becomes very quickly unfeasible even for a moderate number of items.

The classical approach we will describe below is called integer programming. Then we will compare it with the D-Wave hybrid solvers.

Integer Programming

Out of the many ways to solve this problem, we will pick the Pyomo library to construct the integer programming model, and then we will use the GLPK solver to solve it. There are certainly faster methods available, but this code is simple and looks rather similar to the quantum code in the next section.

The full Jupyter notebook is linked at the end of the article. This is just the integer programming part:

print(f"solving for {len(values)} items")
print("build the model")
pecm = pe.ConcreteModel(name="Knapsack")
pecm.x = pe.Var(range(0, len(values)), domain=pe.Boolean)
pecm.worth = pe.Objective(
expr=sum(values[j] * pecm.x[j] for j in range(0, len(values))),
sense=pe.maximize,
)
pecm.weight = pe.ConstraintList()
pecm.weight.add(
sum(weights[j] * pecm.x[j] for j in range(0, len(values))) <= max_weight
)

solver_name = "glpk"
print(f"submit model to solver {solver_name}")
solver = pe.SolverFactory(solver_name)
solver.solve(pecm)

print("parse the solver output")
total_value = int(pecm.worth())
total_weight = int(sum(weights[j] * pecm.x[j]() for j in range(0, len(values))))
selected_items = pecm.x

The values and weights lists contain the monetary values and the weights of all items. The objective function is created by summing the monetary values of all chosen items. The constraint is built by summing all weights of chosen items, then capping that sum as equal to or less than max_weight.

The complete model is sent to the solver, running locally. When the solver is done, the results are parsed out of the solver output.

Quantum Programming

We use the D-Wave SDK, which formulates the model, then submits it to the D-Wave solvers running in the cloud.

print(f"solving for {len(values)} items")
print("build the model")
cqm = ConstrainedQuadraticModel()
obj = BinaryQuadraticModel(vartype="BINARY")
constraint = QuadraticModel()

for i in range(len(values)):
obj.add_variable(i)
obj.set_linear(i, -values[i])
constraint.add_variable("BINARY", i)
constraint.set_linear(i, weights[i])

cqm.set_objective(obj)
cqm.add_constraint(constraint, sense="<=", rhs=max_weight, label="capacity")

sampler = LeapHybridCQMSampler()
print(f"submit model to solver {sampler.solver.name}")
sampleset = sampler.sample_cqm(cqm, label="knapsack problem")

print("parse the solver output")
feasible_sampleset = sampleset.filter(lambda row: row.is_feasible)
if not len(feasible_sampleset):
raise ValueError("No feasible solution found")
best = feasible_sampleset.first

selected_items = [key for key, val in best.sample.items() if val == 1.0]
total_weight = sum(list(weights.loc[selected_items]))
total_value = sum(list(values.loc[selected_items]))

We start with the same weights and values lists. The model for this problem cannot be pure BQM, since we have a constraint. Therefore, we choose a hybrid sampler.

An interesting twist is that many sample sets may be returned by D-Wave as part of the solution. Some are not feasible (e.g. violate constraints) and we need to filter them out of the results. Out of the feasible sample sets, one or several will have the lowest Ising energy — those are the best solutions found by the quantum computer.

Assuming there is a “best” solution, the quantum annealer does not always converge on it. Details are complex, see the D-Wave handbook, but in essence a heuristic search is sometimes employed to find a “good enough” solution. In other cases, the solver can be coaxed into converging more strongly toward the absolute best solution. Finally, another trick is to repeatedly sample the solutions until the absolute best combination is obtained. The D-Wave documentation is very good, and it provides many techniques for fine-tuning the solvers for finding the best solutions.

Results

We’ve tested both algorithms with a randomly generated dataset containing 50 thousand items, with weights and values varying randomly in a discrete uniform distribution between 1 and 9999. The knapsack weight limit was set to 80% of the total weight of all items — so some items will always be discarded.

This is the output from Pyomo / GLPK:

solving for 50000 items
build the model
submit model to solver glpk
parse the solver output
solver time: 129.44616556167603
knapsack max weight: 200052600
items total weight: 200052581
items total value: 241399378

GLPK took over 2 minutes to solve the problem. It came up with a combination of items that is just under the maximum allowed knapsack weight.

This is the output from D-Wave:

solving for 50000 items
build the model
submit model to solver hybrid_constrained_quadratic_model_version1
parse the solver output
solver time server side: 18.006079
solver time QPU: 0.016048
knapsack max weight: 200052600
items total weight: 200052560
items total value: 240318952
best sol. energy: -240318952.0

The total time spent by the D-Wave hybrid solver was 18 seconds. This is not quite an order-of-magnitude improvement over integer programming, but it’s getting there. Moreover, slight increases in the number of items force the integer programming code to spend a much, much longer time solving the problem — the increase is non-linear.

Not so with the D-Wave solver. It, too, spends a longer time solving a bigger problem, but the time it takes to do that does not increase catastrophically with the number of items.

The time spent on the actual QPU (the quantum annealer) is only 16 milliseconds. Most of the 18 seconds total is spent decomposing the problem, sampling the solution space, then sending the results back to the coder. The larger the fraction of the problem that runs on the actual QPU, the bigger the speed gain.

With very large problems, your Internet connection may become a limiting factor. In my case, running this problem over a mediocre connection, my code spent a few extra seconds pushing and pulling data across the Internet.

Finally, please note that the solution found by the hybrid solver is actually a tiny bit worse than the solution found by the integer programming solver. The hybrid solvers can be optimized to find even better solutions, as mentioned above, but that is outside the scope of this article.

Notes

The complete notebook with the entire code used in this article, along with other files and materials, can be found here:

The quantum solver code is based on D-Wave’s own examples repository, refactored for clarity:

All images shown in this article are created by the author.

--

--

BS in Physics. MS in Data Science. Over a decade experience with cloud computing.