I was deep in a side-project rabbit hole. It turns out that the outer edge of a jigsaw puzzle form a TSP instance… I was thinking of solving medium sized TSP instances, and reading the web.

This wasn’t even related; technically I didn’t even have a complete graph (most jigsaw pieces don’t fit with each other), and I could apply some constraints like the puzzle must be rectangular. And my TSP was actually asymmetric! So the normal approach probably wouldn’t work here.

But I was googling around, and I found the mmost demanding SO question I have ever seen!

https://stackoverflow.com/questions/7159259/optimized-tsp-algorithms

Question:

So, does anyone know how to implement a simple (by simple I mean that an implementation doesn't take more than 100-200 lines of code) TSP solver that works in reasonable time (a few seconds) for at least 100 cities?
I am only interested in exact solutions.

You may assume that the input will be randomly generated, so I don't care for inputs that are aimed specifically at breaking a certain algorithm.

Please solve exact, medium sized TSP instances, in <200 lines of code. Talk about demanding free labor from others!

But the immensely cool part was that the question was answered exactly! This is asking for a miracle and getting it! I was really impressed. 200 lines of java code!

I recommend reading the SO answer, and I’ll try to elucidate upon it a little, piece by piece. The first concept we need is Branch & Bound.

Branch and Bound

The first step of branch and bound is to formulate your problem as a large binary tree of yes/no questions.

Do I include this edge in the final solution? Yes/ no, split left/right. Do I include this next edge? Split left/right again. You might have a tree with E levels and 2^E leaves at the bottom, which is massive.

And the big idea is, after branching, is to be able to put an upper bound on the whole subtree. So you can say, after we’ve picked edges A, B, and C, the length of the minimum TSP tour that can exist in this subtree is 50.

And then, with this bound, you can exclude the whole subtree if you’ve already found a solution that is better than 50. This is incredible useful, and all current TSP solvers (and other LP/MIP solvers) use it.

Held Karp

The tough step in formulating a Branch & Bound approach is finding a good heuristic for the lower bound of a subtree: if you had a perfect one, then you’d have solved the problem already. Held Karp a very used heuristic in modern solvers: It’s complicated and I don’t understand it, but essentially it takes the form of finding the answer to a linear program.

From the Post:

The point of solving Held–Karp is that the value of the LP is at most the length OPT of the optimal tour
but also conjectured to be at least 3/4 OPT (in practice, usually closer to OPT).

So here Held-Karp is a heuristic that we can use as the lower bound (best possible solution) for a subtree. It’s described a little bit in the post, but HK is the value of a solved linear program. There maybe a typo in the post, because somewhere else I found it was at most 4/3 OPT, which makes more sense.

But the idea here is that we use the value of Held Karp to evaluate a subtree and ignore it or not. To do this, we need to go solve a Linear Program!

1-Tree

Held–Karp is a linear program but it has an exponential number of constraints.
One way to solve it is to introduce Lagrange multipliers and then do subgradient optimization.
That boils down to a loop that computes a minimum spanning tree and then updates some vectors, but the details are sort of involved.
Besides "Held–Karp" and "subgradient (descent|optimization)", "1-tree" is another useful search term.

Oof we’re going deep here.

So, we have an integer program we need to solve. The approach taken by this answer is to do a “Langragian Relaxation”, where the answer to whether an edge is in the solution is not a 0 or 1, but can be between 0.0 and 1.0, and then use those for Lagrangian optimization

Next we do or in this case “subgradient descent optimization”, which is like gradient descent, but more complicated, and you use it when the function you are optimizing is not differentiable, but still convex.

I think the two methods combine so that the Lagrange multipliers give each node a special extra weight, and subgradient descent messes with them to iterate towards an optimal solution.

A one-tree is built by taking a node, finding the Minimum Spanning Tree without it, and then adding the two cheapest edges that connect the node to the tree back.

That’s about all I can tell.

The Rust Version

I read this, and was amazed. But it was (in the best way) code-golfy, and in Java. So obviously the answer was to rewrite in Rust, for fun!

It was really fun, and at times frustrating. It was everything a side-project should be. I spent a long time learning Nom to replace the ~35 lines of Regex in java.

Here is the code. And here are a few notes about the Rust version:

More LOCS (but who cares)

About 2x a long, but a lot of it is linter-enforced newlines, and Rust’s weird Trait boilerplatey-ness. By this I mean, Rust itself isn’t boilerplate-y like Go, but once in a while you find yourself doing something that could be done more succintly, but you don’t care too much:

impl PartialEq for Node {
fn eq(&self, other: &Self) -> bool {
self.lower_bound == other.lower_bound
}
}

I mean, it might be better, but I don’t understand one-trees, so….

The same-ish runtime

I didn’t do this for the performance boost, I did it to try to understand the solution and to have a non JVM version. But secretly, I was expecting a tiny perf boost. Like, ~20% faster. However, on my small sample, I saw 20% slower runtimes. They both run in close to 9 seconds on a 101 node graph (TSPLIB’s eil101).

It makes sense though, as I have no idea what I’m doing in Rust, and I think I’m allocating more memory in the inner loops than the Java version. I was afraid of unallocated arrays so I zeroed them all in the beginning, and I don’t think the SO answer does this.

It’s also all single-threaded; theoretically one shouldc be able to import Rayon and drop a few par_iters in there, but this solution also seems to use a lot of shared state. In the future I’d like to be able to be good enough at rust to write a parallel DFS-ish search, where each thread communicates its best answer so far to a globally locked counter.

Anyway, I have another TSP solver to write (for rectangular tours on asymmetric, non-complete graphs!) to so I’m going to end this here and leave the code the way it is, for someone in the future to explain to me/teach me Rust/help optimize.