2006/02/27

Quartz Musings (V): GPU Filters

Core Image is an interesting image processing framework in Mac OS X 10.4. The basic principles are very simple: set up a pipeline of effects, and render it. The interesting part is that Core Image tries to leverage the GPU as much as possible by computing the filters there, which means that Core Image uses the GPU and OpenGL as a generic 'pixel computing' engine (or falls back on the CPU when needed).

What are the main concepts in Core Image? Mainly three, which we'll meet below: CIContext (where rendering happens), CIFilter (one effect and its parameters), and CIImage (an immutable recipe for an image).

Let's add another way of rendering our game of life grid. We will add a realtime zoom blur effect on top.

The first thing is to create the CIContext and our filter pipeline, which will only contain the zoom blur filter. We could create it from an OpenGL context and reuse our NSOpenGLView subclass, but for simplicity's sake, let's just take the plain CoreGraphics route.

We already saw how to get a CoreGraphics context: if we're in a Cocoa drawRect: method, we just obtain it through the current NSGraphicsContext. The Core Image context can then be created by passing this regular CGContext.

 NSGraphicsContext* nsctx = [NSGraphicsContext currentContext];
 CGContextRef context = [nsctx graphicsPort];
  
 _cicontext = [[CIContext contextWithCGContext:context options:nil] retain];
Next, we'd like to use our still unchanged rendering code and feed the result to Core Image. Conveniently, CIContext provides a way to create a CGLayer, and a CGLayer has an associated CoreGraphics context, so we'll be able to plug our rendering code in there.

We will also create our Core Image filter. Filters can be discovered by exploring broad categories (color management, distortion, compositing, tiling, ...) or by name, which is what we do here. Another interesting choice in Core Image is how parameters are handled, through generic key-value coding. It makes it extremely easy to plug a filter parameter into a GUI binding, and setting a parameter is clearly not in the performance hot spot.
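
For the curious, discovery by category is a one-liner. A quick sketch (kCICategoryDistortionEffect is one of the built-in category constants):

 NSArray* names = [CIFilter filterNamesInCategory:kCICategoryDistortionEffect];
 NSLog(@"Distortion filters: %@", names);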

The zoom blur filter offers two (or three, counting the input image) parameters: the zoom center and zoom/blur amount.

 CGSize size = CGSizeMake([self bounds].size.width,[self bounds].size.height);

 _cicontext = [[CIContext contextWithCGContext:context options:nil] retain];
 _layer   = [_cicontext createCGLayerWithSize: CGSizeMake(size.width, 
                       size.height)  info: nil];
 _cgcontext = CGLayerGetContext(_layer);  

 _filter = [[CIFilter filterWithName:@"CIZoomBlur"] retain]; 
 [_filter setDefaults]; 
 [_filter setValue:[CIVector vectorWithX:(size.width/2) Y:(size.height/2)] 
            forKey:@"inputCenter"];
 [_filter setValue:[NSNumber numberWithFloat:5] 
           forKey:@"inputAmount"];
The actual rendering will be a three-step process: extract a CIImage from the CGLayer (easy, there is an initializer for that), feed the image to our filter (we could also change a filter parameter here, for instance to have an increasing/decreasing zoom effect in real time), and get the resulting CIImage, which we finally draw through the CIContext.

 CIImage* input;
 CIImage* result;

 // Actual rendering code, applied to _cgcontext
 
 input = [[CIImage alloc] initWithCGLayer:_layer];
 [_filter setValue: input forKey: @"inputImage"]; 
 result = [_filter valueForKey:@"outputImage"];
 
 [_cicontext drawImage:result atPoint:CGPointZero fromRect:bounds]; // bounds: the layer rect, as a CGRect
 [input release];
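The increasing/decreasing zoom mentioned above would be just one more setValue: per frame, before asking for the output image. A sketch, assuming _amount and _direction are float instance variables:

 _amount += _direction;
 if (_amount > 20.0 || _amount < 0.0)
  _direction = -_direction;  // bounce the blur amount between 0 and 20
 [_filter setValue:[NSNumber numberWithFloat:_amount] 
            forKey:@"inputAmount"];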
Tadaam!


Quartz Musings (IV): Bitmap & Textures

To complete the CoreGraphics contexts comparison, I still have to try CGBitmapContext.

CGBitmapContext has some common points with CGLayer, but is more specialized: drawing into a bitmap (with an associated colorspace, bit depth, bytes per row, etc). CGLayer is more of a generic drawing cache, optimized for a given context type. It's easy to reuse a CGLayer by drawing it into a graphics context. And it's easy to use a bitmap context to generate a CGImage (the generic CoreGraphics image abstraction), or to keep control of its data buffer.
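
For instance, snapshotting a bitmap context into a CGImage is a one-liner. A sketch (bitmapContext, destinationContext and destinationRect are placeholders):

 CGImageRef image = CGBitmapContextCreateImage(bitmapContext);
 CGContextDrawImage(destinationContext, destinationRect, image);
 CGImageRelease(image);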

So, what could we do with a CGBitmapContext and control over its buffer? A texture. Let's build another way of using the same drawing code.

We will again use an NSOpenGLView subclass. We just need to note a few details. First, the view and OpenGL setup: I created a NIB file to store the interface, so this class is going to be deserialized when loading the NIB, which means that the initializer will -not- be initWithFrame:pixelFormat: but initWithCoder:. As I need to set up my pixel format, I override it and set the pixel format immediately after calling initWithCoder: on the superclass.
-(id)initWithCoder:(NSCoder*)o
{
 self = [super initWithCoder:o];
 [self setPixelFormat:[self myPixelFormat]];
 // Additional setup code
 return self;
}
Nothing notable in the pixel format and OpenGL setup: double buffered (I won't run full screen, so no justification for direct mode), 16-bit color, texturing enabled.
- (NSOpenGLPixelFormat*)myPixelFormat
{
 NSOpenGLPixelFormatAttribute pixelAttribs[ 4 ];
 int pixNum = 0;
 
 pixelAttribs[ pixNum++ ] = NSOpenGLPFADoubleBuffer;
 pixelAttribs[ pixNum++ ] = NSOpenGLPFAColorSize;
 pixelAttribs[ pixNum++ ] = 16;
 pixelAttribs[ pixNum ] = 0;
 return [[[NSOpenGLPixelFormat alloc] initWithAttributes:pixelAttribs]
                   autorelease];
}

-(void)prepareOpenGL
{
 glEnable( GL_TEXTURE_2D );
 glShadeModel( GL_SMOOTH );
 glClearColor( 0.0f, 0.0f, 0.0f, 0.5f ); 
 glGenTextures(1, &texture );
}
Now, the real thing: generating the texture for a given width and height. As I am using a GL_TEXTURE_2D, WIDTH and HEIGHT must each be a power of 2. The important point here is to use compatible bitmap formats between the CGBitmapContext creation and the glTexImage2D parameters.
- (void)generateTexture
{
 CGRect r = CGRectMake(0,0,WIDTH, HEIGHT);

 if (!setup)
 { 
  // data, cs and context are instance variables, kept across frames
  data = malloc( r.size.width * r.size.height * 4);
  cs = CGColorSpaceCreateDeviceRGB();
  context = CGBitmapContextCreate(data, r.size.width, r.size.height,
   8, r.size.width * 4, cs, kCGImageAlphaPremultipliedFirst);
  setup = TRUE;
 }
 
 // My rendering code, using context 
 CGContextFlush(context);

 glBindTexture(GL_TEXTURE_2D, texture );
 
 glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, r.size.width, r.size.height, 
    0, GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV, data);
 glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR );
 glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR );
}
The drawing code: update the texture on each frame, clear everything, bind the texture and finally draw it.
- (void) drawRect:(NSRect)rect
{
 [self generateTexture];
 glClear( GL_COLOR_BUFFER_BIT );  // the 'clear everything' step
 
 GLfloat texturewidth = 1.0f;
 GLfloat textureheight = 1.0f;
 glBindTexture( GL_TEXTURE_2D, texture );
 
 glBegin(GL_QUADS);
 
 glTexCoord2f(0.0f, 0.0f);
 glVertex2f(-1.0f, 1.0f);
 glTexCoord2f(0.0f, textureheight );
 glVertex2f(-1.0f, -1.0f);
 glTexCoord2f(texturewidth, textureheight );
 glVertex2f(1.0f, -1.0f);
 glTexCoord2f(texturewidth, 0.0f );
 glVertex2f(1.0f, 1.0f);

 glEnd();   

 [[self openGLContext] flushBuffer ];
}

It is actually possible to do better, and more easily. First, by using the rectangle texture extension, which works on all modern cards and allows rectangular, non power of 2 sizes. To use it, we need to replace all GL_TEXTURE_2D occurrences with GL_TEXTURE_RECTANGLE_EXT. There is another trick: texture coordinates must be expressed in pixels and not in normalized coordinates, so the texturewidth and textureheight assignments must be replaced with:

 GLfloat texturewidth = WIDTH;
 GLfloat textureheight = HEIGHT;
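The texture calls keep the same parameters; only the target changes:

 glBindTexture(GL_TEXTURE_RECTANGLE_EXT, texture );
 glTexImage2D(GL_TEXTURE_RECTANGLE_EXT, 0, GL_RGBA, r.size.width, r.size.height, 
    0, GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV, data);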
The other trick we can use is Apple's OpenGL extensions to improve texture upload performance; there is an interesting Apple sample code describing this. The first extension is GL_UNPACK_CLIENT_STORAGE_APPLE, which avoids one data copy if the application retains the data; that's the case here (we don't recreate the bitmap context and its buffer on each frame). The other trick is a hint passed to the system, telling whether you want VRAM or AGP texturing: GL_STORAGE_CACHED_APPLE is good for textures cached and reused, and GL_STORAGE_SHARED_APPLE works best for one-shot textures, which is what we have here.

I just have to insert before glTexImage2D in the texture generation code the following snippet (it also works for GL_TEXTURE_2D):

 glTexParameteri(GL_TEXTURE_RECTANGLE_EXT, 
    GL_TEXTURE_STORAGE_HINT_APPLE, GL_STORAGE_SHARED_APPLE); 
 glPixelStorei(GL_UNPACK_CLIENT_STORAGE_APPLE,  GL_TRUE); 
Some measurements. With these OpenGL extensions, we're more than tripling the performance!

Regular CGContext, CGLayer: 178 fps
CGBitmapContext, CGLayer, OpenGL texture: 63 fps
CGBitmapContext, CGLayer, OpenGL texture (RECT, CACHED): 211 fps

We must also remember that we're measuring not only the texture upload and drawing, but also the rendering into the bitmap context. If the bitmap context rendering takes too much time, the OpenGL texture upload might not be competitive.

But of course, what is really interesting here is that from this point on, I'm directly dealing with OpenGL, so it's a piece of cake to achieve some fun effects, for instance by mapping my rendering onto a cube, etc.

But there are other ways to apply effects to a rendering on OS X, easier than raw OpenGL programming. More about that later.


2006/02/25

Quartz Musings (III): OpenGL Context

I mentioned that one of the graphics contexts that CoreGraphics provides is an OpenGL CGContext. With this context, everything drawn is supposed to be translated to the equivalent OpenGL commands and displayed through a low-level OpenGL context.

Here is my sample code. I'm inheriting from the AppKit-provided NSOpenGLView for convenience (the superclass defines some convenience methods and behaviors). But it should be noted that you can associate an OpenGL context with any NSView (there's some sample code for that).
A side note: you can also very easily create a texture from an NSView.
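
A minimal sketch of that idea (assuming a current OpenGL context, a power-of-two sized view, and a 4-samples-per-pixel rep; real code should check [rep samplesPerPixel]):

 [view lockFocus];
 NSBitmapImageRep* rep = [[NSBitmapImageRep alloc] 
               initWithFocusedViewRect:[view bounds]];
 [view unlockFocus];
 glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, [rep pixelsWide], [rep pixelsHigh], 
    0, GL_RGBA, GL_UNSIGNED_BYTE, [rep bitmapData]);
 [rep release];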

@interface TSGLWorldView : NSOpenGLView
{
 CGContextRef context;
}
@end


@implementation TSGLWorldView

- (void)drawRect:(NSRect)rect
{
 if (context==NULL)
 {
  CGSize size = CGSizeMake([self bounds].size.width,
                                            [self bounds].size.height);
  CGColorSpaceRef cs = CGColorSpaceCreateDeviceRGB();
  context = CGGLContextCreate([[self openGLContext] CGLContextObj], 
                   size, cs);   
 }

  /* My drawing code, using 'context' */

 CGContextFlush(context);
}

- (void)update
{
 glViewport(0, 0,[self bounds].size.width,[self bounds].size.height );
 if (context!=NULL)
   CGGLContextUpdateViewportSize (context,
      CGSizeMake([self bounds].size.width,[self bounds].size.height));

 [super update];
}

@end
The update method is here to handle window resizing, keeping the view bounds, the GL viewport and the CGGLContext in sync. Again, no change in my drawing code.

Although this is a very cool addition to the CoreGraphics arsenal, I have to say that it is apparently one of the less common uses of CoreGraphics, can be occasionally buggy, and might be made obsolete in the future if Quartz 2D Extreme is someday enabled by default, making basically everything rendered through an OpenGL pipe without having to use CGGLContext explicitly (but you can play with it right now through Quartz Debug).


Quartz Musings (II): PDF Explorations

In the previous post, I explored various ways of drawing simple shapes with CoreGraphics in a window. I was mainly talking about what is possible to do with a CoreGraphics context. But how do we get such a context?

My application is a small AppKit-based application, so I just have to define my custom view, something inheriting from the generic NSView class, and define my own "draw" method:

 NSGraphicsContext* nsctx = [NSGraphicsContext currentContext];
 CGContextRef context = [nsctx graphicsPort];
 myCustomDrawingInContext(context);

AppKit maintains a notion of what the current context is, so I just have to get my raw CoreGraphics CGContextRef from that. Easy. But CoreGraphics offers more: an OpenGL context, a bitmap context for offscreen drawing, or a PDF context.

PDF generation for free

Let's play with the PDF context. It's a bit more complicated to set up than a simple onscreen context, but actually very straightforward.

 CGContextRef context;
 CGRect bounds = CGRectMake(0,0,500,500);
 context = CGPDFContextCreateWithURL(myFilePathURL, &bounds , NULL);
 CGContextBeginPage (context, &bounds);

 myCustomDrawingInContext(context);
 CGContextEndPage (context);
 CGContextRelease (context);
CGPDFContext actually defines additional PDF-specific functions to handle pages, but also to set up a hyperlink for a given zone, or to include author, title, and output intent metadata in the PDF file. What is really interesting here is that I don't have to change a single line of my drawing code: my PDF context is a specialized instance of a generic CoreGraphics context, so the various drawing techniques should still work. And it's fully vector-based, look at the closeup.
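
For instance, metadata goes into the auxiliary dictionary at creation time, and hyperlinks are set per zone. A sketch (the title, author and linkURL values are made up):

 CFMutableDictionaryRef info = CFDictionaryCreateMutable(NULL, 0,
    &kCFTypeDictionaryKeyCallBacks, &kCFTypeDictionaryValueCallBacks);
 CFDictionarySetValue(info, kCGPDFContextTitle, CFSTR("Game of Life"));
 CFDictionarySetValue(info, kCGPDFContextAuthor, CFSTR("Me"));
 context = CGPDFContextCreateWithURL(myFilePathURL, &bounds, info);
 CFRelease(info);

 // between BeginPage and EndPage: make a zone of the page clickable
 CGPDFContextSetURLForRect(context, linkURL, CGRectMake(20, 20, 100, 20));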

Let's look at the generated PDF for the 'naive' drawing code (omitting the grid drawing for concision, just keeping the ellipse part):

%PDF-1.3
2 0 obj
<< /Length 4 0 R >>
stream
q Q q /Cs1 cs 0 0.5 0 sc 25 62.5 m 25 62.5 l 25 69.403557 19.403561
75 12.5 75 c 5.5964403 75 0 69.403557 0 62.5 c 0 55.596439 
5.5964403 50 12.5 50 c 19.403561 50 25 55.596439 25 62.5 c 25 
62.5 25 62.5 25 62.500004 c h 25 112.5 m 25 112.5 l 25 119.40356 
19.403561 125 12.5 125 c 5.5964403 125 0 119.40356 0 112.5 c 0 
105.59644 5.5964403 100 12.5 100 c 19.403561 100 25 105.59644
[...]
What do we have here? 2 0 obj tells us that we're going to define an object whose identity is 2 0. Within the stream (it is actually a bit more complicated, as the stream is zlib-encoded, but I show the decoded stream here), we find a big list of PDF instructions. For instance, /Cs1 cs specifies the colorspace (referenced by /Cs1), sc sets the (non-stroking) fill color, m is a "move to" instruction and l is a "line to". c defines a Bézier curve, and this continues for lines and lines. So apparently this defines the drawing for all our ellipses. Neat.

Remember the CGLayer technique? We were constructing the ellipse in a 'layer' context and then stamping the layer into the drawing context for better performance, to avoid drawing the ellipse again and again. Let's have a look at the CGLayer-based PDF.

%PDF-1.3
2 0 obj
<< /Length 4 0 R >>
stream
q Q q Q q 0 50 25 25 re W n q 1 0 0 1 0 50 cm /Fm1 Do Q Q q Q q 
0 100 25 25 re W n q 1 0 0 1 0 100 cm /Fm1 Do Q Q q Q q 0 175 
25 25 re W n q 1 0 0 1 0 175 cm /Fm1 Do Q Q q Q q 0 200 25 25 
re W n q 1 0 0 1 0 200 cm /Fm1 Do Q Q q Q q 0 225 25 25 re W n 
q 1 0 0 1 0 225 cm /Fm1 Do Q Q q Q q 0 275 25 25 re W n q 1 0 
0 1 0 275 cm /Fm1 Do Q Q q Q q 0 300 25 25 re W n q 1 0 0 1 0
300 cm /Fm1 Do Q Q q Q q 0 375 25 25 re W n q 1 0 0 1 0 375 
cm /Fm1 Do Q Q
[...]
What's different here? cm is a coordinate transformation operator, re defines a rectangle, q and Q push and pop the graphics state, and we have this strange /Fm1 Do.

/Fm1 is actually an external object - XObject in PDF speak - something defined outside of the current stream. If we look further into this PDF file, we'll find:

3 0 obj
<< /ProcSet [ /PDF ] /XObject << /Fm1 5 0 R >> >>
endobj
5 0 obj
<< /Length 416 0 R /Type /XObject /Subtype /Form /FormType 1 /BBox
[0 0 25 25] /Resources 6 0 R >>
stream
q Q q /Cs1 cs 0 0.5 0 sc 25 12.5 m 25 12.5 l 25 19.403561 19.403561 
25 12.5 25 c 5.5964403 25 0 19.403561 0 12.5 c 0 5.5964403 
5.5964403 0 12.5 0 c 19.403561 0 25 5.5964403 25 12.5 c 25 
12.500001 25 12.500002 25 12.500002 c h f* Q
endstream
endobj
This snippet is in two parts: the first one defines the name /Fm1 and tells us that it actually refers to the object 5 0 (defined immediately after). This object 5 0 is quite interesting. It's defined as a "form" XObject (more about that later), a bounding box for this object is present (0 0 - 25 25), and what do we find in its definition (the stream section)? A suite of m, l, c: the same move-to, line-to and Bézier curve operators we discovered in the previous file. But this time, the stream is really short. So apparently, Quartz is doing the right thing here: defining the ellipse once in this Fm1 object, and reusing it all over the place with /Fm1 Do, which redraws the Fm1 definition in place.

Let's check the PDF specification for confirmation.

A form XObject is a self-contained description of any sequence of graphics objects (including path objects, text objects, and sampled images), defined as a PDF content stream. It may be painted multiple times—either on several pages or at several locations on the same page—and will produce the same output each time, subject only to the graphics state at the time it is invoked. Not only is this shared definition economical to represent in the PDF file, but under suitable circumstances, the PDF viewer can optimize execution by caching the results of rendering the form XObject for repeated reuse.

Conclusion? Quartz does the right thing. When I use a CGLayer in my CoreGraphics code to get a bit of caching, if I'm using a window graphics context, CoreGraphics will cache the drawing and paste it where needed; but if I'm using a PDF graphics context, the (vector-based) drawing definition is cached and the correct PDF abstraction is used to store and reference it: an XObject. And, for instance, an OpenGL-based context could upload the CGLayer rendering to a texture in VRAM and reuse it directly from there on the GPU.

Rendering PDF

Just for fun, I adapted the PDF context code to actually use it as an onscreen renderer. Three steps: render into a PDF context backed by an in-memory data consumer, wrap the resulting data in a PDFDocument, and hand it to a PDFView.

Here is the code (slightly edited for readability):

  CGContextRef context;
  CGDataConsumerRef datacon;
  NSMutableData* data;
  
  data = [[[NSMutableData alloc] init] autorelease];

  datacon = CGDataConsumerCreateWithCFData((CFMutableDataRef)data);
  context = CGPDFContextCreate(datacon,&bounds,NULL);
    
  CGContextBeginPage (context, &bounds);

  /* My drawing code, using 'context' */

  CGContextEndPage (context);
  
  CGContextRelease (context);
  CGDataConsumerRelease(datacon);

  [self setDocument:[[[PDFDocument alloc] initWithData:data] autorelease]]; 
I was surprised to obtain almost 20 fps with this highly inefficient way to animate.

2006/02/23

Visual Regexp

I usually don't use this space as a link-blog, but as somebody who (a long time ago) learnt and then taught finite-state automata, I find this tool by Oliver Steele awfully nice. Learn more about it.

And that was even before my discovery of his svn2ics tool. Yay!


2006/02/20

Quartz Musings (I): Basic drawing

I will forget for a while the final goal of having a small simulation engine, and just look at the very simple CoreGraphics implementation that displays the game of life example from my previous post. I'll use this as a pretext to describe a bit of CoreGraphics programming.

The first version is very naive. We first need to draw the grid:

 CGContextSetFillColorWithColor(c, _backColor);
 CGContextFillRect(c, r);
 CGContextBeginPath(c);

 for (i=0; i<gridWidth; i++)
 {
  CGContextMoveToPoint(c,i*cellWidth,0);
  CGContextAddLineToPoint(c,i*cellWidth,height);
 }
 for (i=0; i<gridHeight; i++)
 {
  CGContextMoveToPoint(c,0,i*cellHeight);
  CGContextAddLineToPoint(c,width,i*cellHeight);
 }
 CGContextSetStrokeColorWithColor(c,_gridColor);
 CGContextDrawPath(c,kCGPathStroke);

And the cells:

 CGContextSetFillColorWithColor(c, _cellColor); 
 for (i=0; i < gridWidth; i++)
  for (j=0; j < gridHeight; j++)
 {
  if (alive(i,j))
   CGContextFillEllipseInRect(c,CGRectMake(i*cellWidth,
                     j*cellHeight,cellWidth,cellHeight));
 }

(this is a simplified version of the actual code, just highlighting the CoreGraphics operations). The c parameter is a CGContextRef, the graphics context for our game of life window.

Let's do some time measurements by forcing display in a for loop, without even computing new states.

Full display, drawing cells as ellipses: 30 fps
Full display, drawing cells as rectangles: 59.6 fps
Grid only, no cells: 59.9 fps
Black background: 59.9 fps
Nothing drawn: 59.9 fps

Strange, isn't it? It looks like for simple drawing, we're hitting a 60 frames per second mark. And for our more complex drawing, boom, 30 frames. Exactly half. How curious.

What we're actually seeing here is the effect of coalesced updates. This is a new behavior in 10.4 that enables CoreGraphics to be more efficient when updating the frame buffer (and also avoids visual artifacts). The trick is that the window server composites all changes into a single buffer before flushing, and if the application requests a flush, it won't hit the screen until the next refresh. And this is happening system-wide.

Our small test app here really is a bad citizen: it is over-flushing, so CoreGraphics throttles us. Let's check the technote about coalesced updates:

Over-flushing: Applications which draw and flush much faster than the display refresh are throttled down to the refresh rate. Ideally, applications should not draw faster than the display refresh as it would be wasting time drawing pixels the user won't see on the display. Once a window has been drawn into and flushed the buffer needs to be locked in preparation for window server access, so an application can do anything it likes until that flush makes it to the screen except draw into the buffer again. If an application tries to draw immediately after a flush it will block until that flush actually completes, so if the application just misses a frame sync it has to wait around until the next one, and won't be able to start drawing the next frame in the meantime.
"If an animation spends more time in its drawing routine than it takes for the screen to refresh, then it will become throttled to some factor of the refresh rate. So, if the refresh rate was 60 fps and the animation can run at at most 55 fps, it will be throttled down to 30 fps."
That clearly explains our measurements. But we do know our small drawing benchmark is a bad citizen, so it would be nice to override this. The trick is in the "Debugging Graphics" TechNote: the "Show Beam Sync Tools" menu item lets us completely shut down beam synchronization. By the way, the coalesced updates feature is only activated for Mach-O applications linked on 10.4.

Let's disable that (Quartz Debug, Show Beam Sync Tools, Disable Beam Synchronization).

Full display with ellipse: 49 fps (was: 30 fps)
Full display with rectangle: 104 fps (was: 59.6 fps)
Grid only, no cells: 145 fps (was: 59.9 fps)
Black background: 262 fps (was: 59.9 fps)
Nothing drawn: 315 fps (was: 59.9 fps)

That's better.

We can now play a bit more with our drawing code and CoreGraphics. Let's start with the grid display and disable the cells for now.

With our current code, we're basically building the same path over and over. Instead of defining the path in the current context, we can also create a new path (that is, a CGMutablePathRef), add the lines to this path once, when creating our view, and on redraw just add our precomputed path to the context and stroke it:

 CGContextAddPath(c,gridPath);
 CGContextStrokePath(c);
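
Building that cached path once might look like this (a sketch; gridPath is assumed to be a CGMutablePathRef instance variable, filled when the view is set up):

 gridPath = CGPathCreateMutable();
 for (i=0; i<gridWidth; i++)
 {
  CGPathMoveToPoint(gridPath, NULL, i*cellWidth, 0);
  CGPathAddLineToPoint(gridPath, NULL, i*cellWidth, height);
 }
 for (i=0; i<gridHeight; i++)
 {
  CGPathMoveToPoint(gridPath, NULL, 0, i*cellHeight);
  CGPathAddLineToPoint(gridPath, NULL, width, i*cellHeight);
 }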

But as we're only drawing lines here, we could also use a CGContext-level function that just takes an array of points (organized in pairs) and strokes the line segments:

 k=0;
 for (i=0; i<gridWidth; i++)
 {
  segments[k++] = CGPointMake(i*cellWidth,0);
  segments[k++] = CGPointMake(i*cellWidth,height);
 }
 for (i=0; i<gridHeight; i++)
 {
  segments[k++] = CGPointMake(0,i*cellHeight);
  segments[k++] = CGPointMake(width,i*cellHeight);
 }
 CGContextStrokeLineSegments(c,segments,2*(gridWidth+gridHeight));

Another approach: instead of accumulating all the lines in a path and drawing the path at the end, we can also draw one line at a time, changing the for loop blocks to something like:

 for (i=0; i<gridWidth; i++)
 {
  CGContextBeginPath(c);
  CGContextMoveToPoint(c,i*cellWidth,0);
  CGContextAddLineToPoint(c,i*cellWidth,height);
  CGContextDrawPath(c,kCGPathStroke);
 }

Of course, we can combine this idea of drawing each line separately with the CGContextStrokeLineSegments technique:

 for (i=0; i<gridWidth; i++)
 {
  line[0] = CGPointMake(i*cellWidth,0);
  line[1] = CGPointMake(i*cellWidth,height);
  CGContextStrokeLineSegments(c,line,2);  
 }

By the way, a Quartz graphics context is anti-aliased by default, but this can be turned off manually with

 CGContextSetShouldAntialias(context,NO);

                                        AA       non-AA
A single path                           133 fps  154 fps
A single path, cached                   134 fps  155 fps
Line segments, precomputed              134 fps  155 fps
Each line as a single path              155 fps  154 fps
Each line stroked with single segment   156 fps  155 fps

Interesting points: the difference between a cached and an uncached path, or between a path and a segments array, is negligible. The only somewhat surprising result is the difference, in the antialiased case only, between the 'single path / single array' approaches and all the other results. Probably our extremely simple path is missing a fast path somewhere (CoreGraphics paths are really powerful beasts: think clipping + quad curves + dash patterns + blend modes).

But can we do better? Yes. In all these approaches, we're basically drawing the grid again and again. As the grid will almost never change in the simulation, we should cache it. Mac OS X 10.4 introduced a very interesting concept here: CGLayer. A CGLayer is a way for an application to construct layer contents once and reuse them for drawing as desired. Two things are interesting with CGLayers. First, they are implemented with performance in mind: when a CGLayer is constructed, it keeps a reference to the intended destination context, so it will be optimized for the same color depth, resolution, and even context kind (think bitmap or screen context versus PDF context versus OpenGL context), and it might be stored as an offscreen buffer, a texture in VRAM, or some PDF primitives, depending on what is most efficient in a given situation. Even though the layer is optimized for the intended destination, it can be used on any type of context. Secondly, a CGLayer is much more powerful than a simple stored path: once created, it acts as a full graphics context itself, so the full range of CoreGraphics operations can be applied. Once a layer has been set up, it can be drawn into any graphics context, at a given point or in a given rectangle.

 if (_gridLayer==NULL) {
  CGContextRef layerContext;
  _gridLayer = CGLayerCreateWithContext(c,view.size,NULL);
  layerContext = CGLayerGetContext(_gridLayer);
  
  CGContextSetFillColorWithColor(layerContext, _backColor);
  CGContextFillRect(layerContext, r);
   // and our grid drawing code, using layerContext and not c
 }
 CGContextDrawLayerAtPoint(c,CGPointMake(0,0),_gridLayer);

Let's compare this against the previous implementation:

Line by line drawing: 156 fps
CGLayer caching: 224 fps

Nice improvement, and easy to implement.

Actually, we can use the exact same technique for cell drawing: instead of calling CGContextFillEllipseInRect for each cell, we can cache one ellipse into a CGLayer, and just reuse this CGLayer whenever we have to draw a cell.
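
A sketch of that cell layer, mirroring the grid layer code (assuming a _cellLayer instance variable):

 if (_cellLayer==NULL) {
  CGContextRef layerContext;
  _cellLayer = CGLayerCreateWithContext(c,
     CGSizeMake(cellWidth,cellHeight), NULL);
  layerContext = CGLayerGetContext(_cellLayer);
  CGContextSetFillColorWithColor(layerContext, _cellColor);
  CGContextFillEllipseInRect(layerContext,
     CGRectMake(0, 0, cellWidth, cellHeight));
 }
 // then, for each live cell:
 CGContextDrawLayerAtPoint(c,
    CGPointMake(i*cellWidth, j*cellHeight), _cellLayer);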

Comparing this approach with our original measurement on the complete display reveals a very significant overall improvement:

Naive implementation: 49 fps
CGLayer implementation: 115 fps

Using layers has additional advantages. One example: if I'm drawing a layer into a rect with a size different from the original layer size, CoreGraphics will silently scale it. For instance, if I'm resizing my simulation window, I could have a very fast, although a bit fuzzy, drawing through layer scaling, and recompute the layers once the window resize is finished. We could also have something more complex than just an ellipse (for instance a small image for each cell) and avoid any performance impact.
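
The scaled variant is the same drawing call, rect-based. A sketch:

 // fast but a bit fuzzy: let CoreGraphics scale the cached layer
 CGContextDrawLayerInRect(c,
    CGRectMake(0, 0, [self bounds].size.width, [self bounds].size.height),
    _gridLayer);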

Another nice feature of CoreGraphics is device independence. Next, we'll play with other kinds of graphic contexts.


2006/02/14

Current toy

As I have this irrepressible need to start new projects without finishing the current ones, I started coding a small multi-agent simulation toolkit. This has a distinctive 'back to the future' tone to me, as I spent a few years writing an agent platform, but what interests me here is mostly having a pretext to have fun with some technologies: CoreGraphics to get a simulation display that looks and prints good, maybe CoreImage to play a bit with GPU programming, and bindings and KVO/KVC to have a very declarative setup of the display, measures, and models.

2006/02/09

Apache 2, svn authentication, and OS X Server

I'm using Subversion for my source management needs, and I use it with Apache, not svnserve; the whole thing runs on top of a test OS X Server, which incorporates a unified, pluggable directory / security architecture, OpenDirectory. And of course, I'd like my Subversion repositories to use the regular system users and authentication scheme.

This works nicely with the provided web server, first because web management is integrated in the GUI system administration console, but also because Apple developed a few OS X specific modules to handle Zeroconf discovery, MacBinary support ... and OpenDirectory authentication. The only drawback here is that the default web server is Apache 1, and even if Apple does ship an optional Apache 2, it is not supported in the GUI console (no big deal) and does not have Apache 2 versions of the Apple-specific modules (more annoying).

So, how can we enable OpenDirectory authentication on Apache 2 or Apache 2.2? The first solution would be to port the mod_auth_apple authentication module to Apache 2. That module, like most of the OS X BSD layer, is open source and available in the Darwin Server source code.

Another solution, maybe less elegant but totally functional, is to use PAM authentication. Since 10.2, PAM is supported in OS X and bridges authentication to OpenDirectory (and is actually used by some services in the OS; look in /etc/pam.d). Thus the only missing part of the equation is a way to authenticate Apache with PAM, which is quite easy to find: mod_auth_pam provides a pluggable authentication module for Apache 1.3 or 2.0, and works almost out of the box (you just need to fix a PAM header from #include <security/pam_appl.h> to #include <pam/pam_appl.h>).
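
The resulting Apache 2.0 setup is then the usual mod_dav_svn block plus the mod_auth_pam switch. A sketch, with made-up paths and names:

 <Location /svn/myproject>
   DAV svn
   SVNPath /usr/local/svn/myproject
   AuthType Basic
   AuthName "Subversion repository"
   AuthPAM_Enabled on
   Require valid-user
 </Location>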

Apache 2.2 changed the authentication and authorization modules slightly, but another project, mod_auth, implements several modules for the new architecture, including a mod_authn_pam which works just fine, with the same small header fix.

Porting mod_auth_apple could be a good exercise though (side note: the ADC has an example of down-to-the-metal OpenDirectory authentication, replacing crypt use).

