2006/02/20

Quartz Musings (I): Basic drawing

I will set aside for a while the final goal of building a small simulation engine, and just look at the very simple CoreGraphics implementation that displays the game of life example from my previous post. I'll use this as a pretext to describe a bit of CoreGraphics programming.

The first version is very naive. We first need to draw the grid:

 CGContextSetFillColorWithColor(c, _backColor);
 CGContextFillRect(c, r);
 CGContextBeginPath(c);

 for (i=0; i<gridWidth; i++)
 {
  CGContextMoveToPoint(c,i*cellWidth,0);
  CGContextAddLineToPoint(c,i*cellWidth,height);
 }
 for (i=0; i<gridHeight; i++)
 {
  CGContextMoveToPoint(c,0,i*cellHeight);
  CGContextAddLineToPoint(c,width,i*cellHeight);
 }
 CGContextSetStrokeColorWithColor(c,_gridColor);
 CGContextDrawPath(c,kCGPathStroke);

And the cells:

 CGContextSetFillColorWithColor(c, _cellColor);
 for (i=0; i < gridWidth; i++)
 {
  for (j=0; j < gridHeight; j++)
  {
   if (alive(i,j))
    CGContextFillEllipseInRect(c,CGRectMake(i*cellWidth,
                     j*cellHeight,cellWidth,cellHeight));
  }
 }

(This is a simplified version of the actual code, just highlighting the CoreGraphics operations.) The c parameter is a CGContextRef, the graphic context for our game of life window.

Let's do some time measurements by forcing display in a for loop, without even computing new states.
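
A minimal sketch of such a measurement loop, in plain C so that it compiles without CoreGraphics. The names measure_fps, draw_fn and count_frame are hypothetical (the actual benchmark code is not shown in this post); in the real application the callback would force a redisplay of the view instead of just counting.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical callback type: draws (and flushes) one frame. */
typedef void (*draw_fn)(void *userData);

/* Stand-in drawing routine: only counts how often it was called. */
static void count_frame(void *userData)
{
    (*(int *)userData)++;
}

/* Wall-clock time in seconds. Wall-clock time (not CPU time) is what
   matters here, because a throttled application spends its time blocked. */
static double now_seconds(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/* Calls draw() frameCount times and returns the average number of frames
   per second, or 0 if the elapsed time was too short to measure. */
static double measure_fps(draw_fn draw, void *userData, int frameCount)
{
    double start, elapsed;
    int i;

    assert(draw != NULL && frameCount > 0);

    start = now_seconds();
    for (i = 0; i < frameCount; i++)
        draw(userData);
    elapsed = now_seconds() - start;

    return (elapsed > 0.0) ? frameCount / elapsed : 0.0;
}
```

Measured this way, the different drawing variants give: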

Full display, drawing cells as ellipses 30 fps
Full display, drawing cells as rectangles 59.6 fps
Grid only, no cells 59.9 fps
Black background 59.9 fps
Nothing drawn 59.9 fps

Strange, isn't it? For simple drawing, we seem to be hitting a ceiling of 60 frames per second. And for our more complex drawing, boom, 30 frames. Exactly half. How curious.

What we're actually seeing here is the effect of coalesced updates. This is a new behavior in 10.4 that enables CoreGraphics to be more efficient when updating the frame buffer (and also avoids visual artifacts). The trick is that the window server composites all changes into a single buffer before flushing, and if the application requests a flush, it won't reach the screen until the next refresh. And this is happening system-wide.

Our small test app here really is a bad citizen: it is over-flushing, so CoreGraphics throttles us. Let's check the technote about Coalesced Updates:

"Over-flushing: Applications which draw and flush much faster than the display refresh are throttled down to the refresh rate. Ideally, applications should not draw faster than the display refresh as it would be wasting time drawing pixels the user won't see on the display. Once a window has been drawn into and flushed the buffer needs to be locked in preparation for window server access, so an application can do anything it likes until that flush makes it to the screen except draw into the buffer again. If an application tries to draw immediately after a flush it will block until that flush actually completes, so if the application just misses a frame sync it has to wait around until the next one, and won't be able to start drawing the next frame in the meantime."
"If an animation spends more time in its drawing routine than it takes for the screen to refresh, then it will become throttled to some factor of the refresh rate. So, if the refresh rate was 60 fps and the animation can run at at most 55 fps, it will be throttled down to 30 fps."
That clearly explains our measurements. But we already know our small drawing benchmark is a bad citizen, so it would be nice to override this throttling. The trick is in the "Debugging Graphics" TechNote: the "Show Beam Sync Tools" item in Quartz Debug allows us to completely shut down beam synchronization. By the way, the coalesced updates feature is only activated for Mach-O applications linked on 10.4.

Let's disable that (Quartz Debug, Show Beam Sync Tools, Disable Beam Synchronization).

Full display with ellipse 49 fps (was: 30 fps)
Full display with rectangle 104 fps (was: 59.6 fps)
Grid only, no cells 145 fps (was: 59.9 fps)
Black background 262 fps (was: 59.9 fps)
Nothing drawn 315 fps (was: 59.9 fps)

That's better.
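
The throttled figures can be roughly reproduced from the unthrottled ones with a very simple model, sketched below. It assumes a 60 Hz refresh and that every frame has to wait for the next frame sync, as the technote describes; throttled_fps is a hypothetical helper written for this post, not a CoreGraphics function.

```c
#include <assert.h>

/* Hypothetical helper (not part of CoreGraphics): given the display
   refresh rate and the rate the drawing code could reach when it is not
   throttled, returns the rate expected under beam synchronization. */
static double throttled_fps(double refreshHz, double unthrottledFps)
{
    int intervals;

    assert(refreshHz > 0.0 && unthrottledFps > 0.0);

    /* Whole refresh intervals that fit into one frame time... */
    intervals = (int)(refreshHz / unthrottledFps);
    /* ...rounded up, because a frame that just misses a sync has to
       wait for the next one. */
    if (intervals * unthrottledFps < refreshHz)
        intervals++;

    return refreshHz / intervals;
}
```

With a 60 Hz refresh, throttled_fps(60, 49) gives 30 and throttled_fps(60, 104) gives 60, in line with the 30 fps and 59.6 fps measured with beam synchronization enabled. The technote's own example, throttled_fps(60, 55), also comes out at 30.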

We can now play a bit more with our drawing code and CoreGraphics. Let's start with the grid display and disable the cells for now.

With our current code, we're basically rebuilding the same path on every redraw. Instead of defining the path in the current context, we can create a standalone path (that is, a CGMutablePathRef) and add the lines to it once, when creating our view. On redraw, we then just add our precomputed path to the context and stroke it:

 CGContextAddPath(c,gridPath);
 CGContextStrokePath(c);

But as we're only drawing lines here, we could also use a function at CGContext level that just takes an array of points (organized in pairs) and strokes the line segments:

 k=0;
 for (i=0; i<gridWidth; i++)
 {
  segments[k++] = CGPointMake(i*cellWidth,0);
  segments[k++] = CGPointMake(i*cellWidth,height);
 }
 for (i=0; i<gridHeight; i++)
 {
  segments[k++] = CGPointMake(0,i*cellHeight);
  segments[k++] = CGPointMake(width,i*cellHeight);
 }
 CGContextStrokeLineSegments(c,segments,2*(gridWidth+gridHeight));
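
The snippet above does not show how the segments array is sized. Here is a self-contained sketch of that part, written against a stand-in point type so that it compiles without CoreGraphics. Point2D and build_grid_segments are hypothetical names, and the sketch assumes that width and height are exactly gridWidth*cellWidth and gridHeight*cellHeight, which may not hold in the real view.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for CGPoint, so that the sketch compiles anywhere. */
typedef struct { float x, y; } Point2D;

/* Fills segments[] with the end points of all grid lines (two points per
   line) and returns the number of points written. The caller must provide
   room for at least 2*(gridWidth+gridHeight) points. */
static size_t build_grid_segments(Point2D *segments, size_t capacity,
                                  int gridWidth, int gridHeight,
                                  float cellWidth, float cellHeight)
{
    /* Assumption: the grid covers the whole drawing area. */
    float width  = gridWidth  * cellWidth;
    float height = gridHeight * cellHeight;
    size_t k = 0;
    int i;

    assert(segments != NULL && gridWidth >= 0 && gridHeight >= 0);
    assert(capacity >= 2 * (size_t)(gridWidth + gridHeight));

    for (i = 0; i < gridWidth; i++) {        /* vertical lines */
        segments[k].x = i * cellWidth;  segments[k].y = 0;       k++;
        segments[k].x = i * cellWidth;  segments[k].y = height;  k++;
    }
    for (i = 0; i < gridHeight; i++) {       /* horizontal lines */
        segments[k].x = 0;      segments[k].y = i * cellHeight;  k++;
        segments[k].x = width;  segments[k].y = i * cellHeight;  k++;
    }
    return k;
}
```

In the real drawing code the array would hold CGPoint values, and the returned count is what goes into the last argument of CGContextStrokeLineSegments. As the grid never changes, the array only needs to be filled once.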

Another approach: instead of accumulating all lines in a path and drawing the path at the end, we can also draw one line at a time, and replace each for loop with something like:

 for (i=0; i<gridWidth; i++)
 {
  CGContextBeginPath(c);
  CGContextMoveToPoint(c,i*cellWidth,0);
  CGContextAddLineToPoint(c,i*cellWidth,height);
  CGContextDrawPath(c,kCGPathStroke);
 }

Of course, we can use this idea of drawing each line separately with the CGContextStrokeLineSegments technique:

 for (i=0; i<gridWidth; i++)
 {
  line[0] = CGPointMake(i*cellWidth,0);
  line[1] = CGPointMake(i*cellWidth,height);
  CGContextStrokeLineSegments(c,line,2);  
 }

By the way, a Quartz graphics context is anti-aliased by default, but this can be turned off manually with

 CGContextSetShouldAntialias(context,NO);

                                        AA        non-AA
A single path                           133 fps   154 fps
A single path, cached                   134 fps   155 fps
Line segments, precomputed              134 fps   155 fps
Each line as a single path              155 fps   154 fps
Each line stroked with single segment   156 fps   155 fps

Interesting points: the difference between a cached and an uncached path, or between a path and a segments array, is negligible. The only somewhat surprising result is that the 'single path / single array' variants are slower than everything else, and only in the antialiased case (about 134 fps versus 155 fps). Probably our extremely simple path is missing a fast path somewhere (CoreGraphics paths are really powerful beasts, think clipping + quad curves + dash patterns + blend modes).

But can we do better? Yes. In all these approaches, we're basically drawing the grid again and again. As the grid will almost never change in the simulation, we should cache it. Mac OS X 10.4 introduced a very interesting concept here: CGLayer. A CGLayer lets an application draw with layers, constructing and reusing layer contents as desired. There are two interesting things about CGLayers. First, they are implemented with performance in mind. When a CGLayer is constructed, it is given a reference to the intended destination graphic context, so it can be optimized for the same color depth, resolution, ... and even kind of context (think bitmap or screen context versus PDF context versus OpenGL context). It might therefore be stored as an offscreen buffer, a texture in VRAM, or some PDF primitives, depending on what is the most efficient in a given situation. Even though the layer is optimized for the intended destination, it can be used on any type of context. Second, a CGLayer is much more powerful than a simple stored path: once created, it acts as a full graphic context itself, so the full range of CoreGraphics operations can be applied to it. Once a layer has been set up, it can be drawn in any graphic context at a given point or in a given rectangle.

 if (_gridLayer==NULL) {
  CGContextRef layerContext;
  _gridLayer = CGLayerCreateWithContext(c,view.size,NULL);
  layerContext = CGLayerGetContext(_gridLayer);
  
  CGContextSetFillColorWithColor(layerContext, _backColor);
  CGContextFillRect(layerContext, r);
  // (and our grid drawing code, using layerContext and not c)
 }
 CGContextDrawLayerAtPoint(c,CGPointMake(0,0),_gridLayer);

Let's compare this against the previous implementation:

Line by line drawing 156 fps
CGLayer caching 224 fps

Nice improvement, and easy to implement.

Actually, we can use the exact same technique for cell drawing: instead of using CGContextFillEllipseInRect for each cell, we can cache one ellipse into a CGLayer, and just reuse this CGLayer when we have to draw a cell.

Comparing this approach with our original measurement on the complete display reveals a very significant overall improvement:

Naive implementation 49 fps
CGLayer implementation 115 fps

Using layers has additional advantages. One example: if I'm drawing a layer into a rect with a size different from the original layer size, CoreGraphics will silently scale it. For instance, while I'm resizing my simulation window, I could get very fast, although a bit fuzzy, drawing through layer scaling, and recompute the layers when the window resize is finished. We could also have something more complex than just an ellipse (for instance a small image for each cell) without any performance impact.

Another nice feature of CoreGraphics is device independence. Next, we'll play with other kinds of graphic contexts.

