"Genetic" image compression using transparent polygons a hot candidate for an implementati
Found this on boingboing today:

[url="http://rogeralsing.com/2008/12/07/genetic-programming-evolution-of-mona-lisa/"]http://rogeralsing.com/2008/12/07/genetic-...n-of-mona-lisa/[/url]

Roger Alsing re-created the Mona Lisa from 50 transparent polygons using what he claims to be a genetic algorithm. It took about a million iterations. Some contest this and say it is really just a hill-climbing algorithm - nevertheless the result is outstanding.

The drawing has so far been implemented in GDI+; the fitness function that drives the evolutionary selection is a pixel-by-pixel comparison with the source image. The computation took about 3 hours on his laptop. I think this algorithm is a hot candidate for an implementation in CUDA - or possibly a combination of CUDA and DirectX or OpenGL.
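
To illustrate the fitness function described above, here is a minimal sketch (my own illustration, not Roger Alsing's code) of how the pixel-by-pixel comparison could look as a CUDA kernel, assuming the candidate rendering and the source image are already in device memory as packed 8-bit RGB:

[code]
#include <cuda_runtime.h>

// Hypothetical fitness function: sum of squared per-pixel RGB differences
// between the candidate rendering and the source image. Lower error means
// better fitness. The 64-bit atomicAdd needs compute capability >= 1.2.
__global__ void fitnessKernel(const uchar3 *candidate, const uchar3 *source,
                              int numPixels, unsigned long long *error)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= numPixels) return;

    int dr = (int)candidate[i].x - (int)source[i].x;
    int dg = (int)candidate[i].y - (int)source[i].y;
    int db = (int)candidate[i].z - (int)source[i].z;

    // Naive global reduction; a per-block shared-memory reduction would be
    // faster, but this keeps the sketch short.
    atomicAdd(error, (unsigned long long)(dr * dr + dg * dg + db * db));
}

// Host side: a mutation is kept only if the returned error does not increase.
unsigned long long evaluate(const uchar3 *d_candidate, const uchar3 *d_source,
                            int numPixels, unsigned long long *d_error)
{
    cudaMemset(d_error, 0, sizeof(unsigned long long));
    int threads = 256;
    int blocks  = (numPixels + threads - 1) / threads;
    fitnessKernel<<<blocks, threads>>>(d_candidate, d_source, numPixels, d_error);

    unsigned long long h_error = 0;
    cudaMemcpy(&h_error, d_error, sizeof(h_error), cudaMemcpyDeviceToHost);
    return h_error;
}
[/code]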

The author has published his C# source code; see the next posting.

A few good samples from his gallery:
[img]http://rogeralsing.files.wordpress.com/2008/12/lillie.jpg[/img]
[img]http://rogeralsing.files.wordpress.com/2008/12/billgates.gif[/img]
[img]http://rogeralsing.files.wordpress.com/2008/12/moonman.gif[/img]
[img]http://rogeralsing.files.wordpress.com/2008/12/scream.gif[/img]
[img]http://rogeralsing.files.wordpress.com/2008/12/darwin.gif[/img]
[img]http://rogeralsing.files.wordpress.com/2008/12/operahouseday.gif[/img]

The entire gallery can be found at this URL:
[url="http://rogeralsing.com/2008/12/11/genetic-gallery/"]http://rogeralsing.com/2008/12/11/genetic-gallery/[/url]

Christian

#1
Posted 12/09/2008 11:08 PM   
Sounds neat...I'll have to take a look at it when he publishes the source code.

#2
Posted 12/10/2008 01:59 AM   
Source code is now published.

[url="http://rogeralsing.com/2008/12/11/genetic-programming-mona-lisa-source-code-and-binaries/"]http://rogeralsing.com/2008/12/11/genetic-...e-and-binaries/[/url]

hack away!

Christian

#3
Posted 12/11/2008 11:08 PM   
A portable C++ implementation of the same algorithm is available here; it might be a better starting point for a CUDA port. It has no GUI for parameters and uses SDL for rendering the polygons. It compiled fine on my Mac Mini (Intel) using the SDL libraries installed through Fink.

[url="http://www.ggsoft.org/archives/genetic.zip"]http://www.ggsoft.org/archives/genetic.zip[/url]

see the blog post here for more info:
[url="http://www.ggsoft.org"]http://www.ggsoft.org[/url]

Christian

#4
Posted 12/19/2008 09:14 PM   
So how many polygons do the resulting images contain? Can they be saved as SVG or another vector format?

What kind of compression ratio are we achieving here? (Must be something extreme)

#5
Posted 12/20/2008 12:21 AM   
[quote name='alex_dubinsky' post='479397' date='Dec 19 2008, 05:21 PM']So how many polygons do the resulting images contain? Can they be saved as SVG or another vector format?

What kind of compression ratio are we achieving here? (Must be something extreme)[/quote]

We're talking about 50 to 150 polygons.

These tools have no SVG export function yet, but they can save their current progress so computation can be resumed later.

The compression factor also depends on how you store the resulting vector coordinates: quantized or floating point, and how much entropy coding you apply to them (delta coding, Huffman coding, arithmetic coding - you name it).

Don't expect too much compression efficiency. 100 polygons with, say, 8 vertices each already amount to 800 vertices (1,600 coordinates) that have to be stored. Add an RGB color and an alpha value per polygon and you end up with enough data to transmit a JPEG image of similar quality at the same byte size ;)

If everything is output as XML (SVG) you get a lot of extra overhead, so the compression efficiency is really not good. The cool part is the resolution independence: vector data scales nicely.


One problem I see with this general approach is that the resulting image depends on the drawing order of the polygons. During drawing the RGB pixels can saturate, which is a nonlinear operation. So the polygons cannot really be treated as independent contributors that each reduce the error function. If we had strict linearity, I could think of some optimizations that would speed up the generation of the polygon coordinates. One could, for example, optimize the polygons one by one - reshaping/recoloring each so that it reduces the error that remains after adding up all the other polygons, as sketched below.
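
To make that concrete, here is a small sketch of my own (not taken from any of the posted code): under strict linear superposition, optimizing a single polygon only requires the residual image (source minus the sum of all the other polygons), and for a fixed polygon shape the best additive color is simply the mean residual over the covered pixels:

[code]
#include <vector>
#include <cstddef>

struct Color3f { float r, g, b; };

// Hypothetical helper: given the per-pixel residual (source minus the sum of
// all OTHER polygons) and a boolean coverage mask for one polygon, return the
// additive color that minimizes the remaining squared error inside it. Under
// linear superposition this is just the mean residual over the covered pixels.
Color3f bestAdditiveColor(const std::vector<Color3f> &residual,
                          const std::vector<bool>    &covered)
{
    double sumR = 0.0, sumG = 0.0, sumB = 0.0;
    std::size_t n = 0;
    for (std::size_t i = 0; i < residual.size(); ++i) {
        if (!covered[i]) continue;
        sumR += residual[i].r;
        sumG += residual[i].g;
        sumB += residual[i].b;
        ++n;
    }
    if (n == 0) return Color3f{0.0f, 0.0f, 0.0f};
    return Color3f{ float(sumR / n), float(sumG / n), float(sumB / n) };
}
[/code]

Reshaping the polygon would still need a search, but the recoloring step becomes closed-form - something that is not possible with order-dependent alpha blending.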

Christian

#6
Posted 12/20/2008 12:32 AM   
[quote name='cbuchner1' post='479401' date='Dec 19 2008, 08:32 PM']Don't expect too much compression efficiency. 100 polygons consisting of say 8 vertices on average already make up 800 coordinates hat have to be stored. Add a RGB color and alpha per polygon and you end up with enough data to transmit a similar quality JPEG image of same byte size ;)[/quote]
Well, if we represent each vertex as a vec2 of 10-bit ints, you get about 2 KB per image. Add in compression, and maybe 1 KB? Here are the above pictures saved as JPEG with the lowest settings to get 1-2 KB file sizes. I wish I could test JPEG2000, but Paint.NET doesn't support it.
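
(A rough sanity check of that figure, using the numbers from the previous post: 100 polygons x 8 vertices x 2 coordinates x 10 bits = 16,000 bits ≈ 2.0 KB, plus roughly 100 x 32 bits of RGBA ≈ 0.4 KB, before any entropy coding.)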

In general, it seems patently obvious that a genetic algorithm, whether based on polygons or Gaussians or wavelets or whatnot, will be the next step in image compression. An awfully computationally intensive step, but a step nevertheless.

#7
Posted 12/20/2008 01:07 AM   
Here are the vector images resized like the JPEGs. The difference is stunning.

#8
Posted 12/20/2008 02:23 AM   
Interesting! This certainly produces nice images, but these examples are quite small and already slow to compute. I can see applications for vectorization in tools like Inkscape, but for normal image compression I can think of very few cases in which the extra time investment will be worth it. How would it perform on multi-megapixel images, or on video streams?

#9
Posted 12/20/2008 02:47 PM   
[quote name='wumpus' post='479625' date='Dec 20 2008, 07:47 AM']How would it perform on multi-megapixel images. Or video streams?[/quote]

With CUDA this could reach acceptable performance.

Without CUDA it takes several hours of computation to get a decent polygon vector image. This is because the "evolution" of the image starts with random polygons that are mutated randomly and selected according to their similarity with the original image. This takes hundreds of thousands of iterations (generations), and for each iteration the CPU has to compute the fitness function.

One complex problem is the drawing of the polygons. Non-convex polygons and polygons with intersecting edges have to be tessellated (broken down into a triangle mesh) in order to be drawn by the GPU. This tessellation is usually done on the CPU and can be quite slow for complex polygons - especially when it has to be done over and over in each iteration. If the algorithm directly mutated a triangle mesh instead of arbitrarily shaped polygons, it would be much better suited to the GPU hardware.

I am thinking of modifying the algorithm for a CUDA implementation, maybe using simpler primitives (e.g. individual triangles) or even just lines that separate two distinct color regions. These could then be rendered directly by a CUDA kernel, and the error function ("fitness") could be determined during the rendering pass. One could even detect that the most recent mutation was "bad" before rendering finishes.
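
To illustrate the per-pixel work for such simple primitives (my own sketch, not code from this thread): a triangle can be tested against a pixel center with three edge functions, which needs no CPU-side tessellation at all:

[code]
// Hypothetical device-side coverage test for a single triangle (a, b, c).
// Each thread would evaluate one pixel center (px, py).
__device__ bool insideTriangle(float px, float py,
                               float ax, float ay,
                               float bx, float by,
                               float cx, float cy)
{
    // Signed edge functions (2D cross products); the point is inside when
    // all three share the same sign, which handles either winding order.
    float e0 = (bx - ax) * (py - ay) - (by - ay) * (px - ax);
    float e1 = (cx - bx) * (py - by) - (cy - by) * (px - bx);
    float e2 = (ax - cx) * (py - cy) - (ay - cy) * (px - cx);
    bool allNonNegative = (e0 >= 0.0f) && (e1 >= 0.0f) && (e2 >= 0.0f);
    bool allNonPositive = (e0 <= 0.0f) && (e1 <= 0.0f) && (e2 <= 0.0f);
    return allNonNegative || allNonPositive;
}
[/code]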

Christian

#10
Posted 12/20/2008 10:32 PM   
How about we make a contest out of it?

#11
Posted 12/21/2008 10:25 PM   
[quote name='alex_dubinsky' post='480216' date='Dec 21 2008, 03:25 PM']How about we make a contest out of it?[/quote]

Ha! Nice idea. The question is just how exactly to lay out the rules here.

Christian

#12
Posted 12/21/2008 11:25 PM   
In terms of algorithm, there are two ways to go here. Stick to something uniform (like the original polygon+pseudoevolution or some agreed variation, like using triangles) and make it more about optimization, or give complete freedom (any primitive function, any evolution algorithm) and make it more about genetic compression.

To judge, I think we should pick several (resolution, compression_ratio, time) tuples as contest rounds. (E.g., contestants must compress a 2 MP image to 30 KB in 60 seconds, plus or minus a few KB or seconds.) The winner is whoever takes the plurality of scenarios (but anyone who wins a round can brag). Judging could be objective via an error function, or subjective via voting.

#13
Posted 12/21/2008 11:43 PM   
[quote name='alex_dubinsky' post='480264' date='Dec 21 2008, 05:43 PM']In terms of algorithm, there's two ways to go here. Stick to something uniform (like the original polygon+pseudoevolution or some agreed variation, like using triangles) and make it more about optimization, or give complete freedom (any primitive function, any evolution algorithm) and make it more about genetic compression.

To judge, I think we should pick several (resolution,compression_ratio,time) tuples as contest rounds. (Eg, contestants must compress a 2 MP image to 30KB in 60 seconds, plus or minus a few KB or seconds.) Winner is the victor in the plurality of scenarios (but anyone who wins a round can brag). Judging could be objective via error function, or subjective via voting.[/quote]

I'd say make the first phase of the contest constrained to the uniform scenario, where everyone gets exactly the same number of degrees of freedom (200 triangles, each with uniform color and alpha, for example).

Once people have gotten familiar with the problem and winners have been declared, phase two should be the freestyle round: use any primitives you like to produce an output representation of the image within a fixed size and time budget. Contestants will probably have learned enough in phase 1 (especially by checking out the winning solutions) to make phase 2 really interesting....

#14
Posted 12/22/2008 02:44 AM   
Here is some CUDA code (work in progress).

Contrary to Roger Alsing's approach, I am not using alpha blending - but rather strict linear superposition of the polygons. This means each polygon adds to the resulting image without affecting previous polygon contributions. This linearity will allow for a better optimization of each individual polygon (while leaving the other polygons fixed).

Rendering is done in a CUDA kernel, currently into a 512x512 RGB matrix. Rendering works on convex polygons, each having a *signed* 8-bit RGB color - meaning a polygon can either add to the color output or subtract from it. Superposition of the polygon RGB colors is currently done in the floating-point domain, before 8-bit RGB clamping is applied for the final output. I found working in floating point to be a little faster than working with ints. I intend to compute the error function as part of the rendering process, before the RGB clamping, so I can abort rendering early when the error function goes out of bounds (i.e. the mutation was bad).
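
To give an idea of what such a kernel could look like, here is a heavily stripped-down sketch of my own - it is not the attached code: it uses triangles instead of general convex polygons, reuses the insideTriangle() test sketched a few posts up, and leaves out the early-abort logic:

[code]
#include <cuda_runtime.h>

struct Tri { float2 a, b, c; float3 color; };   // color components may be negative

__constant__ Tri d_tris[256];   // triangle list uploaded by the host

// One thread per pixel: accumulate signed triangle colors in floating point,
// clamp to 8 bit only at the very end, and add up the squared error against
// the source image on the fly. *error must be zeroed by the host beforehand.
__global__ void renderAndScore(const uchar3 *source, uchar3 *out,
                               int width, int height, int numTris,
                               unsigned long long *error)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    float px = x + 0.5f, py = y + 0.5f;
    float3 acc = make_float3(0.0f, 0.0f, 0.0f);

    for (int t = 0; t < numTris; ++t) {
        const Tri tr = d_tris[t];
        if (insideTriangle(px, py, tr.a.x, tr.a.y, tr.b.x, tr.b.y, tr.c.x, tr.c.y)) {
            acc.x += tr.color.x;   // strict linear superposition,
            acc.y += tr.color.y;   // no alpha blending
            acc.z += tr.color.z;
        }
    }

    uchar3 pix;
    pix.x = (unsigned char)fminf(fmaxf(acc.x, 0.0f), 255.0f);
    pix.y = (unsigned char)fminf(fmaxf(acc.y, 0.0f), 255.0f);
    pix.z = (unsigned char)fminf(fmaxf(acc.z, 0.0f), 255.0f);

    int idx = y * width + x;
    out[idx] = pix;

    int dr = (int)pix.x - (int)source[idx].x;
    int dg = (int)pix.y - (int)source[idx].y;
    int db = (int)pix.z - (int)source[idx].z;
    atomicAdd(error, (unsigned long long)(dr * dr + dg * dg + db * db));
}
[/code]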

The current code sample renders 3 triangles (the middle one subtracting from the final output by using negative colors). The sample is based on the "Box Filter" SDK sample and uses OpenGL. It should also compile on Linux if you define TARGET_LINUX instead of TARGET_WIN32 and apply a modified "BoxFilter" Makefile from the CUDA SDK 2.0.

Now my problem is that this kernel does "only" ~900 FPS on my nVidia 9600 GSO when rendering just three polygons. I need some advice on how to make this rasterization faster.

You can still enable the box filter with the + , - and [ , ] keys. It brings down the FPS even more but softens the edges.

UPDATE: The most recent addition to my renderer is that it computes an error metric during rendering (which can be easily converted into a PSNR). The FPS values reported are now much more accurate.
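
For reference, converting such a summed squared error into a PSNR value (assuming 8-bit channels and counting all three color channels) is a one-liner:

[code]
#include <math.h>

// PSNR in dB from the summed squared error over all pixels and channels.
// Not defined for sse == 0 (a perfect match), so guard against that case.
double psnrFromSSE(unsigned long long sse, int width, int height)
{
    double mse = (double)sse / ((double)width * (double)height * 3.0);
    return 10.0 * log10(255.0 * 255.0 / mse);
}
[/code]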

Christian

#15
Posted 01/01/2009 08:59 PM   