2021-10-18 Simulated Annealing for Image Tracing graphics python

I wrote a bit of Python to help “trace” an image (perform a raster-to-vector transformation) using simulated annealing.

# Background

## Image tracing

### Raster images

Raster images (such as digital photos) are stored in the computer as grids of pixels. Since a raster image lives in a grid, it necessarily has a fixed size (number of pixels) and resolution (number of pixels per inch). Image resolution relates to how much detail a photograph has. An image with higher resolution can be printed larger before it begins to appear pixellated or blocky.

### Vector images

Vector images are not grids of pixels. They are instructions to a graphics processor for visiting points, drawing lines between them, shading areas, etc. Since points and lines are stored in the vector format as floating-point numbers, vector images have dramatically higher resolution than raster images. They can be printed at arbitrary size without pixellation since they do not consist of pixels.

### Rasterization

Rasterization refers to the process of converting a vector image to a raster image. This might be useful for rendering a vector to any destination which requires a grid of inputs such as an LCD screen or an inkjet printer. Rasterization is an interesting field full of optimizations (antialiasing or subpixel resolution) but it is essentially straightforward: we want to find a target raster which looks most like its source vector to the human eye.

### Image tracing

Image tracing refers to the process of converting a raster image to a vector image. It is also essentially straightforward: we want to find the target vector which looks most like its source raster to the human eye.

But image tracing is inherently more imprecise since it involves extrapolating information from the raster which isn’t there. (i.e. image content from in-between pixels.)

## Simulated annealing

Simulated annealing is an optimization technique inspired by a physical phenomenon, so let’s discuss the physical phenomenon first.

### Annealing

Annealing is a metallurgical process for making metal more ductile (softer) by heating it up and letting it cool slowly.

Metals usually consist of small regions of atoms called grains, where all the atoms in a grain are packed regularly. But adjacent grains are often misaligned (see the “polycrystalline” example above), leading to fracture lines called discontinuities.

When a metal is heated above some recrystalliaztion temperature, bonds within the metal begin to break.

Cooling that metal again slowly heals fracture lines between grains, aligns adjacent grains with each other to create bigger grains, and eventually leads to a more regular (and more ductile) metal.

### Simulated Annealing

Simulated Annealing is an optimization technique for finding a local maximum within some search space which is too large to search completely. It is named after metallurgical annealing.

For a large and complicated search space where finding a maximum is sufficient (i.e. finding a short path between points but not necessarily the shortest possible), simulated annealing is a surprisingly effective technique.

In physical annealing, we raise and then slowly lower the temperature of a metal. In simulated annealing, temperature corresponds to the internal energy of the system, i.e. how often we spontaneously hop between candidate positions.

When simulated temperature is high, our cursor is flightier and hops between positions at random. As temperature lowers, we still seek out more optimal neighboring candidates, but our cursor “settles” and becomes increasingly unlikely to pick a new point at random.

Simulated annealing and other optimization techniques are complicated and interesting. Read much more about the techniques and history behind it here.

# Work

With a bit of simulated annealing, we can partially automate the process of image tracing.

Here are the four kinds of image we’ll deal with:

Source  ( don't have )            Candidate  ( we generate    )
vector  ( this data  )            vector     ( this candidate )

│                                 │
│  ( rasterized )                 │  ( rasterized )
│  ( by someone )                 │  ( by us      )
│                                 │
▼                                 ▼

Source   ◀────(compared to)────►  Candidate
raster                            raster


The plan is as follows: We guess a series of candidate vectors and trivially rasterize them to produce a series of candidate rasters. Once we have generated a candidate raster which sufficiently resembles the source raster, we can assume its associated candiate vector also sufficiently resembles the source vector (which we cannot see).

## Overview

Our whole algorithm runs in a loop:

                                  Initial guess
parameters
│
│
Guess new     │
use as input  ╭──► parameters ───┤ use them
for annealing  │                  │ to make a new
│                  ▼
Fitness            Candidate
▲               vector
│                  │
compare to  │                  │ rasterize
target raster  ╰─── Candidate  ◀──╯ this to produce
to calculate       raster          a new


Eventually our measure of ‘fitness’ (more on that later) becomes sufficiently low and we decide that our candidate vector is close enough.

## Caveats

Here are some of the important caveats:

• we need a reasonable initial vector guess. This guess can be generated by a human or by some low-quality heuristic.

• we need a very fast way to evaluate fitness. This is the bottleneck, i.e. the majority of the time spent in the loop will be spent doing this. (Running the annealing algorithm, generating vectors, and rasterizing images are all extremely well-optimized already.)

# Worked Example

Take as a source vector this crude vector illustration of an ampersand, and as a source raster this even cruder rasterization.

Hopefully you can see that the raster image has lower resolution. If not, zoom in on both images until the one on the right becomes pixellated.

This is our source. Now let’s make a reasonable initial vector guess:

It doesn’t look very close. But it’s a good start.

As our simulated annealing cycle runs, we mutate each point in this initial candidate vector until the candidate raster image starts to look more and more like our source raster.

Until eventually we arrive at a final raster guess and a final vector guess.

# Measuring image similarity

How do we measure image similarity?

## Measuring image similarity, for humans

For a human, it’s really handy to look at two images with a red/blue overlay. (A red/blue overlay is a blend of two images where one input becomes the red channel of the output, and the other input becomes the blue channel of the output. Pixels which are present in both images are rendered black.)

Here are two overlays: one of the initial guess against the source image, and another of the final guess against the source image.

You can see that the initial guess (in blue) is quite poor. But our final guess is much better. You can see only a few blue or red edges peeking out.

## Measuring image similarity, for computers

It might be tempting to define fitness as “number of black pixels in an overlay”. But it turns out that this definition has some serious shortcomings.

Under this scheme, if two shapes don’t overlap at all, it doesn’t matter how far apart they are. The pixel overlap of two squares with an inch between them is zero, but the pixel overlap of two squares with three inches betweeen them is also zero.

Put another way: if we measure fitness in this way, then we’re not rewarding incremental improvement.

## Blur

The way around this is to blur both images before comparing them. Now we have a smooth transition between different colors in the image, so pixel overlap increases smoothly as our candidate converges.

There is obviously a sweet spot here. If you blur an image too much, then it becomes a mess and the annealing algorithms have no incentive to position the shapes precisely. If you blur them too little then you don’t get the benefit. This almost certainly needs to be tuned to the specific use case.

# Conclusion

This is a powerful technique for a very specific set of circumstances and inputs. But I don’t think this is a useful general-purpose technique for image tracing, since most people who want to trace images want the entire task automated. And the application domain needs a very fast evaluation loop, so this is probably unsuitable for scripting heavyweight professional vector editors. But for certain circumstances, it seems like a fast way to fine-tune the contents of a vector.

# Code

Here is some (lightly commented) code I used to perform the simulated annealing discussed in this article, as well as generate the illustrations above.