real time circle packing - animated edition!
In which the author finally “finishes” a project. We finally have real time circle packing!
While the last post had good performance, it had one fundamental limitation - it was absolutely hammering the GPU with render requests. A single run could make approximately 200 render calls! This was leading to times of around 500ms per full render. The new version takes 10ms (on my weaker laptop), and it can be run in a loop as shown in the demo below!
Note that while the box below displays the initial render time, the code does not use the previous rendered state to speed up future renders - the only difference between the first and following renders is time spent allocating framebuffers, which is negligible.
So how did I achieve this massive (50x) speedup? I tried a few different things, such as implementing some radius based heuristics to improve the probability that selected circles wouldn’t conflict, thus reducing the expected number of renders required. This had little impact. The next thing was to try precomputing the radius’ of all points to at least remove CPU latency. However, CPU was not the bottleneck to begin with so this only had a modest impact, and wouldn’t be suitable when we switch over to using webcam input anyway.
While thinking about the problem, I realized that until now I’d been thinking about constructing a solution set by iteratively adding new points to the output. However, another way of looking at the problem is to start by selecting every pixel, and remove pixels that aren’t part of the solution. We also have bounds on the max radius, so we know for every pixel, the full set of pixels that might hold circles that could collide with the current pixel. Since every pixel probably collides with its immediate neighbors, we also need some source of randomness to give some notion of priority so that every circle doesn’t delete itself. This approach is also good because all of these steps can be done on the GPU.
Thus, the algorithm implemented is as follows:
img\(\leftarrow\) copy input image to a webgl buffer
random_buf\(\leftarrow\) create a \(n\) x \(n\) gl buffer and initialize it with random values between \(0\) and \(1\).
radius_buf\(\leftarrow\) for every pixel in
imgcompute the radius of the circle that would be drawn at that point if selected.
selected_circles\(\leftarrow\) for every pixel in radius buf, check every pixel that is up to
MAX_RADIUSpx away. If the circle at that pixel would not collide with the circle at the current pixel or the corresponding value in
random_buffor that pixel is less than the value in
random_buffor the current pixel output the current radius, otherwise output \(0\).
- As output, for every pixel, check every pixel that is up to
MAX_RADIUSaway. If any pixel has a non-zero value and a radius large enough to include the current pixel in the circle, draw the circle.
Now, there’s virtually no work done on the CPU! The only thing we need to do on the CPU for every frame is initialize the random buffer, which is pretty fast! This doesn’t give us a completely optimal solution, and we could get better solutions by this this multiple times and keeping old results around, but that’s not really necessary for this effect to look good.
The final steps will be to integrate this with my video synth, so that I can combine it with other effects and inputs (such as webcam input :D).