3 Jan 2013
 

NSFetchedResultsController -sectionNameKeyPath discussion

by Ben Blakely

Core Data and NSFetchedResultsController do clever things under the hood to improve performance, such as loading data in batches as it’s needed. But there’s a gotcha when grouping data with sectionNameKeyPath that can cause a big hit in performance. Check this out.

## Starting Simply

Let’s start with a simple table view without any sections. Our entity will be an Event with a date and a name:

@interface Event : NSManagedObject

@property (nonatomic) NSDate *date;
@property (nonatomic) NSString *name;

@end

Next, we’ll ask our fetched results controller to load 1000 events (ordered by date, newest first) for our table view. Even with this many events, our table view loads very quickly. To see what’s happening behind the scenes, we can have Core Data log its SQL statements by doing the following:

* In Xcode, go to the Product menu and choose Edit Scheme.
* Select the Run action (shown with the Debug configuration) on the left.
* Select Arguments on the right.
* Under Arguments Passed On Launch, click the add button.
* Enter: `-com.apple.CoreData.SQLDebug 1`
* Click OK.

Now when we Build and Run, SQL statements will be listed in the app’s output:

SELECT 0, t0.Z_PK FROM ZEVENT t0 ORDER BY t0.ZDATE DESC

SELECT 0, t0.Z_PK, t0.Z_OPT, t0.ZNAME, t0.ZDATE FROM ZEVENT t0 WHERE t0.Z_PK IN (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?) ORDER BY t0.ZDATE DESC LIMIT 20

The first query fetches only the primary keys of all 1000 events. The second loads the full records for just the first batch of 20.

Scrolling the table view triggers another SQL query:

SELECT 0, t0.Z_PK, t0.Z_OPT, t0.ZNAME, t0.ZDATE FROM ZEVENT t0 WHERE t0.Z_PK IN (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?) ORDER BY t0.ZDATE DESC LIMIT 20

The fetched results controller smartly loads the next batch of 20 records when they’re needed.
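The queries above are consistent with a fetch request that sorts by date in descending order and uses a batch size of 20. Here is a minimal sketch of such a setup. The helper names `EventsController` and `LoadEvents`, the nil cache name, and the batch size itself are assumptions for illustration, not code from the original project.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: builds a controller for Event objects, newest first.
// Pass nil for sectionNameKeyPath to get a flat list without sections.
static NSFetchedResultsController *EventsController(NSManagedObjectContext *context,
                                                    NSString *sectionNameKeyPath) {
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"Event"];
    request.sortDescriptors = @[[NSSortDescriptor sortDescriptorWithKey:@"date"
                                                              ascending:NO]];
    // Assumed value: full rows are loaded 20 at a time,
    // which matches the LIMIT 20 in the logged SQL.
    request.fetchBatchSize = 20;

    return [[NSFetchedResultsController alloc] initWithFetchRequest:request
                                               managedObjectContext:context
                                                 sectionNameKeyPath:sectionNameKeyPath
                                                          cacheName:nil];
}

// Hypothetical usage: create the controller and run the initial fetch.
static NSFetchedResultsController *LoadEvents(NSManagedObjectContext *context) {
    NSFetchedResultsController *controller = EventsController(context, nil);
    NSError *error = nil;
    if (![controller performFetch:&error]) {
        NSLog(@"Fetch failed: %@", error);
        return nil;
    }
    return controller;
}
```

With `fetchBatchSize` set, the initial fetch only needs the object IDs. Full rows are pulled in as the table view asks for them.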

## Grouping

Now we want to group these events by year. Easy, right? We’ll just write a method to extract the year from the date, and use that as the sectionNameKeyPath. Our object then becomes:

@interface Event : NSManagedObject

@property (nonatomic) NSDate *date;
@property (nonatomic) NSString *name;

- (NSNumber *)year;

@end
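The original post does not show how `year` is implemented. One plausible version is sketched below, assuming the user’s current calendar and the `NSCalendarUnitYear` constant (SDKs from 2013 spelled it `NSYearCalendarUnit`). The controller would then be created with `sectionNameKeyPath:@"year"`.

```objc
#import <CoreData/CoreData.h>

@interface Event : NSManagedObject

@property (nonatomic) NSDate *date;
@property (nonatomic) NSString *name;

- (NSNumber *)year;

@end

@implementation Event

// Accessors for the persistent attributes are generated by Core Data at runtime.
@dynamic date, name;

// Computed in code, not stored: SQLite knows nothing about this value,
// so it can only be obtained from a fully instantiated Event.
- (NSNumber *)year {
    if (self.date == nil) {
        return nil;
    }
    // Assumption: the current calendar is the right one for grouping.
    NSDateComponents *components =
        [[NSCalendar currentCalendar] components:NSCalendarUnitYear
                                        fromDate:self.date];
    return @(components.year);
}

@end
```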

When we load our table view, it looks great, but there’s a long delay before the view appears. Let’s check the SQL output:

SELECT 0, t0.Z_PK FROM ZEVENT t0 ORDER BY t0.ZDATE DESC

SELECT 0, t0.Z_PK, t0.Z_OPT, t0.ZNAME, t0.ZDATE FROM ZEVENT t0 WHERE t0.Z_PK IN (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?) ORDER BY t0.ZDATE DESC LIMIT 20

SELECT 0, t0.Z_PK, t0.Z_OPT, t0.ZNAME, t0.ZDATE FROM ZEVENT t0 WHERE t0.Z_PK IN (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?) ORDER BY t0.ZDATE DESC LIMIT 20

SELECT 0, t0.Z_PK, t0.Z_OPT, t0.ZNAME, t0.ZDATE FROM ZEVENT t0 WHERE t0.Z_PK IN (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?) ORDER BY t0.ZDATE DESC LIMIT 20

Our fetched results controller is now loading *everything*: the batch query repeats until every record has been fetched. It doesn’t keep all the objects loaded in memory (i.e. scrolling down still loads 20 at a time), but all the objects are temporarily loaded up front.

Why would it do such a thing? There’s no way for the fetched results controller to know how many groups there are without instantiating all our objects and calling `- (NSNumber *)year` on each event.

The bigger our dataset becomes, the longer the delay will be. Fortunately, there’s a workaround.

## Fast Grouping

What if we were to use a persistent attribute to group our records? Let’s [denormalize](http://en.wikipedia.org/wiki/Denormalization) (i.e. store redundant data to optimize read performance) by saving the year as a persistent attribute. The trade-off is that the stored year has to be updated whenever the date changes. So our updated Event entity has a date, a name, and now a year:

@interface Event : NSManagedObject

@property (nonatomic) NSDate *date;
@property (nonatomic) NSString *name;
@property (nonatomic) NSNumber *year;

@end
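The original post does not say how the stored `year` is populated. One possible approach, sketched here, is a custom setter for `date` that refreshes `year` in the same step. This assumes the data model declares `year` as an integer attribute and that the current calendar is the right one for grouping. Existing records would also need a one-time backfill.

```objc
#import <CoreData/CoreData.h>

@interface Event : NSManagedObject

@property (nonatomic) NSDate *date;
@property (nonatomic) NSString *name;
@property (nonatomic) NSNumber *year;

@end

@implementation Event

// Core Data supplies the remaining accessors at runtime.
@dynamic date, name, year;

// Custom setter (an assumption, not the author's code):
// stores the date, then derives and stores the matching year.
- (void)setDate:(NSDate *)date {
    [self willChangeValueForKey:@"date"];
    [self setPrimitiveValue:date forKey:@"date"];
    [self didChangeValueForKey:@"date"];

    if (date == nil) {
        self.year = nil;
        return;
    }
    NSDateComponents *components =
        [[NSCalendar currentCalendar] components:NSCalendarUnitYear
                                        fromDate:date];
    self.year = @(components.year);
}

@end
```

Deriving the value in the setter means callers keep assigning `date` as before, and `year` cannot drift out of sync through that path.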

Let’s load the view and check out the SQL queries:

SELECT 0, t0.Z_PK FROM ZEVENT t0 ORDER BY t0.ZDATE DESC

SELECT t0.ZYEAR, COUNT (DISTINCT t0.Z_PK) FROM ZEVENT t0 GROUP BY t0.ZYEAR ORDER BY t0.ZYEAR DESC

SELECT 0, t0.Z_PK, t0.Z_OPT, t0.ZNAME, t0.ZDATE, t0.ZYEAR FROM ZEVENT t0 WHERE t0.Z_PK IN (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?) ORDER BY t0.ZDATE DESC LIMIT 20

Look how it’s using GROUP BY. It now does the grouping with SQL! The second SQL query calculates the number of groups and how many records are in each group. From there, it only loads the first batch of 20 records. We get lazy fetching again, which means a quick-loading table view.
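On the table view side, the section counts from that GROUP BY query surface through the controller’s `sections` array. The data source below is a rough sketch that reads them. The class name `EventsViewController`, the `EventCell` reuse identifier, and the assumption that the controller was created with `sectionNameKeyPath:@"year"` and has already performed its fetch are all illustrative.

```objc
#import <UIKit/UIKit.h>
#import <CoreData/CoreData.h>

// Hypothetical view controller; the fetched results controller is assumed
// to be configured and fetched before the table view loads.
@interface EventsViewController : UITableViewController
@property (nonatomic) NSFetchedResultsController *fetchedResultsController;
@end

@implementation EventsViewController

- (NSInteger)numberOfSectionsInTableView:(UITableView *)tableView {
    return (NSInteger)self.fetchedResultsController.sections.count;
}

- (NSInteger)tableView:(UITableView *)tableView numberOfRowsInSection:(NSInteger)section {
    id<NSFetchedResultsSectionInfo> info = self.fetchedResultsController.sections[section];
    return (NSInteger)info.numberOfObjects;
}

- (NSString *)tableView:(UITableView *)tableView titleForHeaderInSection:(NSInteger)section {
    // The section name is derived from the value at the section key path (the year).
    id<NSFetchedResultsSectionInfo> info = self.fetchedResultsController.sections[section];
    return info.name;
}

- (UITableViewCell *)tableView:(UITableView *)tableView
         cellForRowAtIndexPath:(NSIndexPath *)indexPath {
    UITableViewCell *cell = [tableView dequeueReusableCellWithIdentifier:@"EventCell"];
    if (cell == nil) {
        cell = [[UITableViewCell alloc] initWithStyle:UITableViewCellStyleDefault
                                      reuseIdentifier:@"EventCell"];
    }
    // Accessing the object here is what triggers loading of its batch.
    NSManagedObject *event = [self.fetchedResultsController objectAtIndexPath:indexPath];
    cell.textLabel.text = [event valueForKey:@"name"];
    return cell;
}

@end
```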

## Summary

It’s pretty amazing that giving NSFetchedResultsController a method to emit sections works at all. But that clever behavior comes at a cost in performance. For small datasets the performance hit might not be noticeable. But if you’re dealing with a lot of data, you don’t want to keep your users waiting while all the data loads upfront. In those cases, you can use denormalization to store your section names in the database. The end result is much snappier performance and, hopefully, happier users.

## Empirical Development

This guest post was contributed by Ben Blakely of Empirical Development. As part of our development cycle we frequently run across interesting discoveries. When these are potentially of interest to other development teams we will be contributing them to Cocoa Is My Girlfriend.