The more programmers learn, the more we forget. This article highlights a bit of forgotten knowledge in the form of a fast algorithm for bitmap scaling and rotation - vintage knowledge from Amiga days. Some time ago, I read with interest an article about optimized bitmap rotation. Like many other game programmers, I started programming many years ago in the “I can spin more cubes than you” Amiga demo scene. I read that article hoping to find a new algorithm to beat the one I picked up some 10 years ago — only to find an algorithm that was slower and less flexible.
I spent a good while searching the Web with no avail and found only a few side notes from the Usenet archives. Maybe what I thought was once well-known has died along with the Amiga. Since I’m a die-hard fan, I decided to write this article to relight a piece of useful code from the Amiga.
There are several methods to scale and rotate a bitmap but generally not at the same time. One well known method for rotation is the shear and transpose [1] method where the image is first sheared, and then transposed, to produce the rotated effect.
One way to rotate and scale at the same time is to scan the target bitmap passing each pixel’s position through a rotation matrix. For each destination pixel at position (x, y) on the target bitmap, this transformation produces the source position (u, v) that supplies that pixel from the original source bitmap, as illustrated in this pseudo code:
for(y=0; y<height; y++) { for(x=0; x<width; x++) { u = cos(-angle) * x * (1.0/scale) + sin(-angle) * y * (1.0/scale); v = -sin(-angle) * x * (1.0/scale) + cos(-angle) * y * (1.0/scale); dstArray[x,y]=srcArray[u,v]; } }
A faster version of this includes passing the end points of the line through the matrix and using Bresenham’s algorithm to generate the source pixel position.
This is very similar to the method I’ll present in this article, except I go a stage further and calculate the x and y displacements between each pixel in the rotated bitmap. I can use these displacements to generate two vectors, one between each pixel for the horizontal (duRow, dvRow) in the destination image, and one for the vertical (duCol, dvCol). These two vectors are at right angles to each other:
duCol = sin(-angle)*(1.0/scale) dvCol = cos(-angle)*(1.0/scale) duRow = dvCol dvRow =-duCol
Using these two vectors, scaling and rotating the bitmap is simply a case of holding a second set of coordinates for the source pixel, scanning the target image row by row, copying the pixel at the source position to the linear target position, and then adding the horizontal vector. Once at the end of a row, the coordinate for the source pixel is reset and the vertical vector added. Because the vectors are common over the complete operation, I only need to calculate them once. The original coordinate for the source pixel needs to be rotated and scaled so that it points to the correct pixel for the first pixel of the target image. I do this by rotating the coordinates of the center point of the target image and subtracting it from the coordinates of the center of the source:
startingu = fSrcRotCX - (fDstRotCX * dvCol + fDstRotCY * duCol); startingv = fSrcRotCY - (fDstRotCX * dvRow + fDstRotCY * duRow);
Given these calculated values, I use the following pseudo code to render the rotated and scaled image:
rowu = startingu rowv = startingv for(y=0; y<height; y++) { u = rowu; v = rowv; for(x=0; x<width; x++) { dstArray[x,y]=srcArray[u,v]; u += duRow v += dvRow } rowu += duCol rowv += dvCol }As you can see from the above code, I still have the problem of what to do with out of bound u,v coordinates. This really only happens when the destination bitmap is much larger than the source bitmap, or the scale is small — less than 1.0. There are two possible solutions; the first is to clip to the source image dimensions using the following pseudo code:
if(u>0 && v>0 && u<dstW && v<dstH) dstArray[x,y] = srcArray[u,v]; else dstArray[x,y] = 0;This introduces four more comparisons to the main loop and is a considerable slowdown. The other solution is to wrap the image. If the source image dimensions are a power of two, you can just use a logical AND to produce the wrapping:
dstArray[x,y] = srcArray[u1 & srcW-1,v1 & srcH-1];Alternatively, you could use modulus. Alas, modulus has the problem of negative values that you can fix with explicit checking and inversion:
u1 = ((u < 0.0 ? (int)-u : (int)u) + srcW) % srcW; v1 = ((v < 0.0 ? (int)-v : (int)v) + srcH) % srcH; dstArray[x,y] = srcArray[u1,v1];or by the less correct, but faster method of loading the u,v coordinate with a high multiple of its modulus before dividing:
u1 = u + (srcW << 8) % srcW v1 = v + (srcH << 8) % srcH dstArray[x, y] = srcArray[u1, v1];Obviously, this is a slowdown in itself and can still lead to problems if the scale goes too small. The multiple of the modulus can be made dependent on operating parameters. Suffice it to say that in the high-speed, cutthroat world of Amiga demos, the only version to see the light of day was the wrapping power-of-two version with source and target dimensions of 256, allowing for automatic wrapping by accessing bytes. Implementation In the world of Windows programming, the only real way of getting to the pixels of a bitmap is to create the bitmap as a DIB (device-independent bitmap) using CreateDIBSection() or to define the LR_CREATEDIBSECTION flag in LoadImage() and use GetObject() to get the bits. To keep the implementation simple and fast, I decided to only use 16-bit DIBs. It allows me to access the DIB as an array of WORDs, whereas 24-bit DIBs offer uneven alignment, and 8-bit DIBs would require extra code for palette handling. The DIB creation code is in the CDIB class shown in cdib.h (ListingOne) and cdib.cpp (Listing Two). Listing One: cdib.h — Declaration of CDIB
/////////////////////////////////////////////////////////////////// class CDIB { public: CDIB(); ~CDIB(); bool Create(HDC hdcSrc, int iSrcX, int iSrcY, int iWidth, int iHeight); void Release(); // Varibles HDC m_hdc; // HDC of the DIB HBITMAP m_hbm; // HBITMAP of the DIB HBITMAP m_hbmOld; // old HBITMAP from the hdc WORD* m_pSrcBits; // pointer to DIB pixel array int m_iWidth; // Width of the DIB int m_iHeight; // Height of the DIB int m_iSWidth; // Storage Width (in bytes) }; /* End of File */Listing Two: cdib.cpp — Definition of CDIB
////////////////////////////////////////////////////////////////// // CDIB - Small wrapper class to handle DIB`s #include <windows.h> #include "cdib.h" ////////////////////////////////////////////////////////////////// CDIB::CDIB() { m_hdc = 0; m_hbm = 0; m_hbmOld = 0; m_pSrcBits = NULL; m_iWidth = 0; m_iHeight = 0; m_iSWidth = 0; } ////////////////////////////////////////////////////////////////// CDIB::~CDIB() { Release(); } ////////////////////////////////////////////////////////////////// // Takes DC handle, creates a DIB from given location and size bool CDIB::Create(HDC hdcSrc, int iSrcX, int iSrcY, int iWidth, int iHeight) { // Release the old DIB Release(); // fill in the BITMAPINFO structure BITMAPINFO bmi; ZeroMemory(&bmi, sizeof(BITMAPINFO)); bmi.bmiHeader.biSize = sizeof(BITMAPINFOHEADER); bmi.bmiHeader.biWidth = iWidth; bmi.bmiHeader.biHeight = iHeight; bmi.bmiHeader.biPlanes = 1; bmi.bmiHeader.biBitCount = 16; bmi.bmiHeader.biCompression = BI_RGB; bmi.bmiHeader.biXPelsPerMeter = 72; bmi.bmiHeader.biYPelsPerMeter = 72; m_iWidth = iWidth; m_iHeight = iHeight; // Each line of the DIB is always quad aligned. m_iSWidth = (((iWidth*sizeof(WORD)) + 3) & ~3); // Get hdc of screen for CreateDIBSection and create the DIB HDC hdcScreen = GetDC(NULL); m_hbm = CreateDIBSection(hdcScreen, &bmi, DIB_RGB_COLORS, (void**)&m_pSrcBits, NULL, 0); // Create and select the bitmap into a DC m_hdc = CreateCompatibleDC(hdcScreen); m_hbmOld = (HBITMAP)SelectObject(m_hdc, m_hbm); // Copy the original image into the DIB BitBlt(m_hdc,0,0,iWidth,iHeight,hdcSrc,iSrcX,iSrcY,SRCCOPY); // all GDI functions must be flushed before we can use the DIB GdiFlush(); ReleaseDC(NULL, hdcScreen); return true; } ////////////////////////////////////////////////////////////////// void CDIB::Release() { if(m_hdc) { if(m_hbmOld) { SelectObject(m_hdc, m_hbmOld); } if(m_hbm) { DeleteObject(m_hbm); } DeleteDC(m_hdc); } m_hdc = NULL; m_hbm = NULL; m_hbmOld = NULL; m_pSrcBits = NULL; m_iWidth = 0; m_iHeight = 0; } ////////////////////////////////////////////////////////////////// //End of FileThe example code uses two 16-bit DIBs, one for the source image and one for the window image. The window DIB is recreated each time the window receives a WM_SIZE message, to keep the window DIB the same size as the window. The example uses WM_TIMER messages to pass the rotate function the pixels of both the source and window DIBs to perform the rotation, then uses BitBlt() to copy the window DIB to the window. Another option would have been to use DirectX to get direct access to the screen display, thereby cutting out a render stage. However, for demonstration purposes, this would have overcomplicated the example. I provide three versions of the rotate routine in rotate.h (Listing Three) and rotate.cpp (Listing Four). Rotate() is the slowest but supports any sized source image (by using modulus). FastRotate() requires the source image to be of a power of two. Finally, RotateWithClip() clips the u,v of the source image to the limits of the source dimensions. This removes the need for wrapping the u,v coordinate. main.cpp (Listing Five) demonstrates how to use these functions. Listing Three: rotate.h — Interface to rotation functions
/////////////////////////////////////////////////////////////////// void Rotate( WORD *pDstBase, int dstW, int dstH, int dstDelta, WORD *pSrcBase, int srcW, int srcH, int srcDelta, float fDstRotCenterX, float fDstRotCenterY, float fSrcRotCenterX, float fSrcRotCenterY, float fAngle, float fScale); /////////////////////////////////////////////////////////////////// void FastRotate( WORD *pDstBase, int dstW, int dstH, int dstDelta, WORD *pSrcBase, int srcW, int srcH, int srcDelta, float fDstRotCenterX, float fDstRotCenterY, float fSrcRotCenterX, float fSrcRotCenterY, float fAngle, float fScale); /////////////////////////////////////////////////////////////////// void RotateWithClip( WORD *pDstBase, int dstW, int dstH, int dstDelta, WORD *pSrcBase, int srcW, int srcH, int srcDelta, float fDstRotCenterX, float fDstRotCenterY, float fSrcRotCenterX, float fSrcRotCenterY, float fAngle, float fScale); /* End of File */Listing Four: rotate.cpp — Implementation of rotation functions
#include <windows.h> #include <math.h> ////////////////////////////////////////////////////////////////// // Rotate - wrapping version // This version takes any dimension source bitmap and wraps. ////////////////////////////////////////////////////////////////// void Rotate( WORD *pDstBase, int dstW, int dstH, int dstDelta, WORD *pSrcBase, int srcW, int srcH, int srcDelta, float fDstCX, float fDstCY, float fSrcCX, float fSrcCY, float fAngle, float fScale) { srcDelta /= sizeof(WORD); dstDelta /= sizeof(WORD); float duCol = (float)sin(-fAngle) * (1.0f / fScale); float dvCol = (float)cos(-fAngle) * (1.0f / fScale); float duRow = dvCol; float dvRow = -duCol; float startingu = fSrcCX - (fDstCX * dvCol + fDstCY * duCol); float startingv = fSrcCY - (fDstCX * dvRow + fDstCY * duRow); float rowu = startingu; float rowv = startingv; for(int y = 0; y < dstH; y++) { float u = rowu; float v = rowv; WORD *pDst = pDstBase + (dstDelta * y); for(int x = 0; x < dstW ; x++) { int sx = ((u < 0.0 ? (int)-u : (int)u) + srcW) % srcW; int sy = ((v < 0.0 ? (int)-v : (int)v) + srcH) % srcH; WORD *pSrc = pSrcBase + sx + (sy * srcDelta); *pDst++ = *pSrc++; u += duRow; v += dvRow; } rowu += duCol; rowv += dvCol; } } ////////////////////////////////////////////////////////////////// // FastRotate - wrapping version // // This version assumes the dimensions of the source image to be a // power of two. For none-power2 dimensions, use Rotate // ////////////////////////////////////////////////////////////////// void FastRotate( WORD *pDstBase, int dstW, int dstH, int dstDelta, WORD *pSrcBase, int srcW, int srcH, int srcDelta, float fDstCX, float fDstCY, float fSrcCX, float fSrcCY, float fAngle, float fScale) { srcDelta /= sizeof(WORD); dstDelta /= sizeof(WORD); float duCol = (float)sin(-fAngle) * (1.0f / fScale); float dvCol = (float)cos(-fAngle) * (1.0f / fScale); float duRow = dvCol; float dvRow = -duCol; float startingu = fSrcCX - (fDstCX * dvCol + fDstCY * duCol); float startingv = fSrcCY - (fDstCX * dvRow + fDstCY * duRow); float rowu = startingu; float rowv = startingv; for(int y = 0; y < dstH; y++) { float u = rowu; float v = rowv; WORD *pDst = pDstBase + (dstDelta * y); for(int x = 0; x < dstW ; x++) { WORD *pSrc = pSrcBase + (((int)u) & srcW-1) + ((((int)v) & srcH-1) * srcDelta ); *pDst++ = *pSrc++; u += duRow; v += dvRow; } rowu += duCol; rowv += dvCol; } } ////////////////////////////////////////////////////////////////// // Rotate - nowrapping clipping version // // this version does not have the limitation of power of 2 // dimensions and will clip the source image instead of wrapping. ////////////////////////////////////////////////////////////////// void RotateWithClip( WORD *pDstBase, int dstW, int dstH, int dstDelta, WORD *pSrcBase, int srcW, int srcH, int srcDelta, float fDstCX, float fDstCY, float fSrcCX, float fSrcCY, float fAngle, float fScale) { srcDelta /= sizeof(WORD); dstDelta /= sizeof(WORD); float duCol = (float)sin(-fAngle) * (1.0f / fScale); float dvCol = (float)cos(-fAngle) * (1.0f / fScale); float duRow = dvCol; float dvRow = -duCol; float startingu = fSrcCX - (fDstCX * dvCol + fDstCY * duCol); float startingv = fSrcCY - (fDstCX * dvRow + fDstCY * duRow); float rowu = startingu; float rowv = startingv; for(int y = 0; y < dstH; y++) { float u = rowu; float v = rowv; WORD *pDst = pDstBase + (dstDelta * y); for(int x = 0; x < dstW ; x++) { if(u>0 && v>0 && u<srcW && v<srcH) { WORD *pSrc = pSrcBase + (int)u + ((int)v * srcDelta); *pDst++ = *pSrc++; } else { *pDst++ = 0; } u += duRow; v += dvRow; } rowu += duCol; rowv += dvCol; } } ////////////////////////////////////////////////////////////////// //End of FileListing Five: main.cpp — Demonstration program
/* ************************************************************** ** NAME: Fast Bitmap Rotation and Scaling ** AUTHOR: Steven M Mortimer ** COMPILER: Visual C++ V6.0 Enterprise SP3 ** Target Platform: Win32 ** RIGHTS: Stevem Mortimer ** Date: 18th Jan 2000 ************************************************************** */ #include <windows.h> #include <windowsx.h> #include <stdio.h> #include "cdib.h" #include "rotate.h" ///////////////////////////////////////////////////////////////// // defines #define _MINSCALE 0.4f #define _MAXSCALE 5.0f #define SZIMAGE "test1.bmp" ///////////////////////////////////////////////////////////////// // Globals CDIB* gDibSrc = NULL; CDIB* gDibDst = NULL; float gdScale = _MAXSCALE; float gdAngle = 0.0f; float gdScaleDir = 0.1f; double gdTicksPerSec = 0.0; bool gbTimeFunction = false; ///////////////////////////////////////////////////////////////// // Function Prototypes LRESULT CALLBACK WndProc (HWND, UINT, WPARAM, LPARAM) ; ///////////////////////////////////////////////////////////////// // WinMain int WINAPI WinMain (HINSTANCE hInstance, HINSTANCE hPrevInstance, LPSTR szCmdLine, int iCmdShow) { HWND hwnd; MSG msg; // Register the window class WNDCLASS wndclass; ZeroMemory(&wndclass, sizeof(WNDCLASS)); wndclass.style = CS_HREDRAW | CS_VREDRAW; wndclass.lpfnWndProc = WndProc; wndclass.hInstance = hInstance; wndclass.hCursor = LoadCursor(NULL, IDC_ARROW); wndclass.lpszClassName = __FILE__; RegisterClass(&wndclass); // Check if we can use the high performace counter // for timing the rotation function LARGE_INTEGER perfreq; if(QueryPerformanceFrequency(&perfreq)) { gdTicksPerSec = (double)perfreq.LowPart; gbTimeFunction = true; } else { MessageBox(NULL,"High resolution timer not available", "Warning", MB_OK); } // Create our source and destination DIB`s gDibSrc = new CDIB(); gDibDst = new CDIB(); if(gDibSrc && gDibDst) { // Create Window hwnd = CreateWindow( __FILE__, "Fast Bitmap Rotation", WS_OVERLAPPEDWINDOW, CW_USEDEFAULT, CW_USEDEFAULT, 400, 400, NULL, NULL, hInstance, NULL); ShowWindow (hwnd, iCmdShow) ; UpdateWindow (hwnd) ; while (GetMessage (&msg, NULL, 0, 0)) { TranslateMessage (&msg) ; DispatchMessage (&msg) ; } // cleanup DIB`s delete gDibSrc; delete gDibDst; } return msg.wParam; } ///////////////////////////////////////////////////////////////// double GetTimer() { if(gbTimeFunction) { LARGE_INTEGER time; QueryPerformanceCounter(&time); return (double)time.QuadPart / gdTicksPerSec; } return 0.0; } ///////////////////////////////////////////////////////////////// void Update(HDC hdc) { double dStartT = GetTimer(); // Call Rotate routine, using center of the window and the // center of the source image as the points to rotate around FastRotate( gDibDst->m_pSrcBits, gDibDst->m_iWidth, gDibDst->m_iHeight, gDibDst->m_iSWidth, gDibSrc->m_pSrcBits, gDibSrc->m_iWidth, gDibSrc->m_iHeight, gDibSrc->m_iSWidth, (float)gDibDst->m_iWidth/2, (float)gDibDst->m_iHeight/2, (float)gDibSrc->m_iWidth/2, (float)gDibSrc->m_iHeight/2, gdAngle, gdScale); // Change direction of the scale if(gdScale <= _MINSCALE || gdScale >= _MAXSCALE) { gdScaleDir *= -1.0; } // Update angle and scale gdScale += gdScaleDir; gdAngle += 0.02f; double dUpdateT = GetTimer(); // Copy our rotated image to the screen BitBlt(hdc, 0, 0, gDibDst->m_iWidth, gDibDst->m_iHeight, gDibDst->m_hdc, 0, 0, SRCCOPY); double dRenderT = GetTimer(); // Print function timing satistics if(gbTimeFunction) { char szBuffer[256]; TextOut(hdc, 5, 5, szBuffer, sprintf(szBuffer, "Update took %3.6f (~%3.2ffps)", dUpdateT-dStartT, 1.0 / (dUpdateT-dStartT))); TextOut(hdc, 5, 20, szBuffer, sprintf(szBuffer, "Render took %3.6f (~%3.2ffps)", dRenderT-dUpdateT, 1.0 / (dRenderT-dUpdateT))); } } //////////////////////////////////////////////////////////////// BOOL OnCreate(HWND hwnd, CREATESTRUCT FAR* lpCreateStruct) { BOOL bSuccess = FALSE; // Load test bitmap HBITMAP hbm = (HBITMAP)LoadImage(NULL, SZIMAGE, IMAGE_BITMAP, 0, 0, LR_LOADFROMFILE); if(hbm) { // Map the bitmap into a dc HDC hdc = CreateCompatibleDC(NULL); HBITMAP hbmOld = (HBITMAP)SelectObject(hdc, hbm); // Get info about this bitmap BITMAP bm; if(GetObject(hbm, sizeof(BITMAP), &bm) != 0) { // Convert the bitmap into DIB of known colour depth if(gDibSrc->Create(hdc, 0, 0, bm.bmWidth, bm.bmHeight)) { // Start the update timer SetTimer(hwnd, 0, 100, NULL); bSuccess = TRUE; } // cleanup hdc SelectObject(hdc, hbmOld); DeleteDC(hdc); } // delete the loaded image DeleteObject(hbm); } else { char szError[512]; wsprintf(szError, "Error loading image %s", SZIMAGE); MessageBox(hwnd, szError, "oops", MB_OK); } return bSuccess; } ///////////////////////////////////////////////////////////////// void OnSize(HWND hwnd, UINT state, int x, int y) { // Recreate the window DIB to match the size of the window gDibDst->Create(NULL, 0, 0, x, y); } ///////////////////////////////////////////////////////////////// BOOL OnEraseBkGnd(HWND hwnd, HDC hdc) { // clearing the bk results in tears. return TRUE; } ///////////////////////////////////////////////////////////////// void OnPaint(HWND hwnd) { HDC hdc; PAINTSTRUCT ps; hdc = BeginPaint (hwnd, &ps); Update(hdc); EndPaint (hwnd, &ps) ; } ///////////////////////////////////////////////////////////////// BOOL OnTimer(HWND hwnd, UINT id) { HDC hdc = GetDC(hwnd); Update(hdc); ReleaseDC(hwnd, hdc); return TRUE; } ///////////////////////////////////////////////////////////////// void OnDestroy(HWND hwnd) { PostQuitMessage(0); } ///////////////////////////////////////////////////////////////// LRESULT CALLBACK WndProc(HWND hwnd, UINT iMsg, WPARAM wParam, LPARAM lParam) { switch (iMsg) { HANDLE_MSG( hwnd, WM_CREATE, OnCreate ); HANDLE_MSG( hwnd, WM_SIZE, OnSize ); HANDLE_MSG( hwnd, WM_PAINT, OnPaint ); HANDLE_MSG( hwnd, WM_ERASEBKGND, OnEraseBkGnd ); HANDLE_MSG( hwnd, WM_TIMER, OnTimer ); HANDLE_MSG( hwnd, WM_DESTROY, OnDestroy ); } return DefWindowProc (hwnd, iMsg, wParam, lParam) ; } ///////////////////////////////////////////////////////////////// // End of File
a----b | | | | c----dto the given x component of the angle as such:
a----b \ \ —> \ \ c----dThe bitmap is then transposed:
b / \ / \ a d \ / \ / cTransformations around the diagonals can be used to increase the range of rotation to the full 360 degrees, however this method requires several passes in order to achieve the final rotation. — S.M.