C++ – Does someone have an optimized function to premultiply bitmap by alpha

alphablending, c, gdi+, winapi

GDI+ blend functions work on bitmaps whose RGB channels are premultiplied by alpha, for efficiency. However, premultiplying by alpha is very costly, since you have to process each pixel one by one.

It seems that this would be a good candidate for SSE assembly. Would someone here be willing to share their implementation? I know this is hard work, which is why I'm asking. I'm not trying to steal your work, and you'll have my full appreciation if you can share it.

Edit: I'm not trying to do alpha blending in software. I'm trying to premultiply each color component of each pixel in an image by its alpha. I'm doing this because alpha blending is defined by the formula dst = src*src.alpha + dst*(1 - src.alpha), whereas the Win32 AlphaBlend function implements dst = src + dst*(1 - src.alpha) for optimization reasons. To get the correct result, src must already be equal to src*src.alpha before AlphaBlend is called.

It would take me some time to write this myself, as I know little about assembly, so I was asking whether someone would like to share their implementation. SSE would be great: in the paper, the reported gain for software alpha blending is 300%.
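For reference, the premultiplication step itself can be sketched in plain C++ before any SIMD work. This is only a minimal baseline, not taken from any of the linked material. It assumes 32-bit pixels stored as B, G, R, A bytes in memory (the usual layout of a 32-bit DIB), and the names `mul_div255` and `premultiply_alpha` are made up for illustration. The division by 255 is replaced by a well-known shift-and-add trick that gives the rounded result for 8-bit inputs.

```cpp
#include <cstddef>
#include <cstdint>

// Rounded c * a / 255 without a division; exact for c, a in [0, 255].
inline std::uint8_t mul_div255(std::uint32_t c, std::uint32_t a) {
    const std::uint32_t t = c * a + 128;
    return static_cast<std::uint8_t>((t + (t >> 8)) >> 8);
}

// Premultiplies `count` pixels in place.
// Assumption: 4 bytes per pixel, alpha in the last byte (B, G, R, A).
void premultiply_alpha(std::uint8_t* pixels, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        std::uint8_t* p = pixels + 4 * i;
        const std::uint32_t a = p[3];
        p[0] = mul_div255(p[0], a);  // blue
        p[1] = mul_div255(p[1], a);  // green
        p[2] = mul_div255(p[2], a);  // red
        // p[3] (alpha) is left unchanged
    }
}
```

With a real bitmap, rows may be padded, so it is probably safest to call this once per scanline using the bitmap's stride rather than once for the whole buffer.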

Best Answer

There's a good article, listed below. It's a bit old, but you might find something useful in the section where it uses MMX to implement alpha blending. This could easily be translated to SSE instructions to take advantage of the larger (128-bit) registers.

MMX Enhanced Alpha Blending

Intel Application Notes here, with source code

Using MMX™ Instructions to Implement Alpha Blending
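As a rough idea of what the translation to 128-bit registers might look like for the premultiply step specifically, here is a sketch using SSE2 intrinsics instead of hand-written assembly. It is not the code from the articles above and has not been benchmarked. It assumes B, G, R, A byte order per pixel, and all function names (`div255_round`, `premul_two_pixels`, `premultiply_alpha_sse2`) are hypothetical. It processes four pixels per iteration and falls back to scalar code for the remainder, or entirely when SSE2 is not available at compile time.

```cpp
#include <cstddef>
#include <cstdint>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>  // SSE2 intrinsics
#define PREMUL_HAVE_SSE2 1
#endif

// Rounded c * a / 255, exact for c, a in [0, 255].
static inline std::uint8_t div255_round(std::uint32_t c, std::uint32_t a) {
    const std::uint32_t t = c * a + 128;
    return static_cast<std::uint8_t>((t + (t >> 8)) >> 8);
}

#ifdef PREMUL_HAVE_SSE2
// v holds two pixels widened to eight 16-bit lanes: B G R A B G R A.
static inline __m128i premul_two_pixels(__m128i v) {
    // Broadcast each pixel's alpha across its own four lanes.
    __m128i a = _mm_shufflelo_epi16(v, _MM_SHUFFLE(3, 3, 3, 3));
    a = _mm_shufflehi_epi16(a, _MM_SHUFFLE(3, 3, 3, 3));
    // Force the multiplier in the alpha lanes to 255 so alpha is preserved.
    a = _mm_or_si128(a, _mm_set_epi16(0xFF, 0, 0, 0, 0xFF, 0, 0, 0));
    // Same rounding trick as div255_round, on eight lanes at once.
    __m128i t = _mm_add_epi16(_mm_mullo_epi16(v, a), _mm_set1_epi16(128));
    t = _mm_add_epi16(t, _mm_srli_epi16(t, 8));
    return _mm_srli_epi16(t, 8);
}
#endif

// Premultiplies `count` pixels in place (assumed layout: B, G, R, A).
void premultiply_alpha_sse2(std::uint8_t* pixels, std::size_t count) {
    std::size_t i = 0;
#ifdef PREMUL_HAVE_SSE2
    const __m128i zero = _mm_setzero_si128();
    for (; i + 4 <= count; i += 4) {
        __m128i* p = reinterpret_cast<__m128i*>(pixels + 4 * i);
        const __m128i px = _mm_loadu_si128(p);          // 4 pixels, unaligned load
        const __m128i lo = premul_two_pixels(_mm_unpacklo_epi8(px, zero));
        const __m128i hi = premul_two_pixels(_mm_unpackhi_epi8(px, zero));
        _mm_storeu_si128(p, _mm_packus_epi16(lo, hi));  // narrow back to bytes
    }
#endif
    // Scalar tail (or the whole buffer when SSE2 is unavailable).
    for (; i < count; ++i) {
        std::uint8_t* p = pixels + 4 * i;
        const std::uint32_t a = p[3];
        p[0] = div255_round(p[0], a);
        p[1] = div255_round(p[1], a);
        p[2] = div255_round(p[2], a);
    }
}
```

The unaligned load and store keep the sketch safe for arbitrary buffers; if the scanlines are known to be 16-byte aligned, the aligned variants could be tried instead. Whether this actually beats a compiler-vectorized scalar loop would need to be measured.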
