27 Aug 2008

Cocoa Tutorial: Sync Services without Core Data

by Marcus Zarra

Sync Services has come a long way in Leopard. Before Leopard, syncing was an extremely complex operation that was almost completely manual. Needless to say, this sucked, and it was probably one of the reasons most developers shunned it.

If you are using Core Data in a Leopard application then Sync Services is so trivial that you should be syncing if it makes sense. In this article we are going to cover syncing in a non-Core Data situation as that is quite a bit more complex.

If you have read the Sync Services documentation then you know it is complex. Let me dispel an illusion right away. It is hard. The documentation is not poor; syncing is very hard and very few people get it right. Take a look at OmniFocus to see an example of a company thinking it is easy and losing data. Therefore if you are expecting this subject to be trivial you will be disappointed.

In this example we will be syncing with the bookmarks schema and displaying the bookmarks in a simple outline view. The outline view itself will be editable and those edits can be synced back. It is not terribly useful, but it provides a very simple example.

Sync Services Setup

There are a few different ways to implement Sync Services inside of your application. You can set up a sync client and handle everything manually just like we did in Tiger but to be honest, unless you need Tiger compatibility, I would not use that approach again. The option we are going to use in this article is new to Leopard: employing an ISyncSessionDriver. The driver is basically a default sync client that handles some of the mess for us. To create the driver, we need to pass it a data source:

syncDriver = [[ISyncSessionDriver sessionDriverWithDataSource:self] retain];
[syncDriver setDelegate:self];

[self setPreferredSyncMode:ISyncSessionDriverModeFast];

SEL syncSEL = @selector(client:willSyncEntityNames:);
[[syncDriver client] setSyncAlertHandler:self selector:syncSEL];

[syncDriver sync];

In our example application, we are going to hold on to a reference to the syncDriver object so that we can request syncs whenever it is appropriate. To be a data source for the ISyncSessionDriver class, the object needs to implement the protocol ISyncSessionDriverDataSource. While this protocol has a few optional methods, pretty much all of the methods in the protocol are required. In the example application for this article, the application’s delegate is both the data source and the driver's delegate. I would not recommend this for a production application as it makes the application delegate pretty big, but it is sufficient for this example.

Once the syncDriver has been initialized, I want to request a sync immediately on start up. This makes sure that the application is dealing with fresh data as opposed to something that may have changed while the application was not running[1]. Just before I call [syncDriver sync] though I do want to flag my object as the handler for sync alerts. This is a callback method used by the sync engine to tell my application that a sync is occurring on data that I care about, and it gives me the opportunity to be a part of that sync. Once I have set myself up as a receiver for those alerts it is time to perform the first sync.
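The alert handler registered above could look something like the following. This is a minimal sketch, not the code from the example application: it assumes syncDriver is the instance variable created earlier and simply joins the sync whenever the engine announces one.

```objc
#import <SyncServices/SyncServices.h>

// Sketch only: called by the sync engine when another client starts a sync
// that involves entities this client has registered for.
- (void)client:(ISyncClient *)client willSyncEntityNames:(NSArray *)entityNames
{
  NSLog(@"Sync requested for entities: %@", entityNames);
  // Join the session by running our own sync through the driver.
  [syncDriver sync];
}
```

The selector has to match the one passed to -setSyncAlertHandler:selector:, which in this article is client:willSyncEntityNames:.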

There are essentially three phases when your application performs a sync. Our application is directly involved in two of them. The first phase, called the push phase, is where we send our data to the Truth. The Truth is a database that lives on each Mac and keeps track of all the data that is being synced. The Truth handles all merging of data and each application can request a copy of data from it.

Push

Entity Names
Entity names are similar to bundle identifiers in that they uniquely name a data object. In this example we are syncing with the system bookmarks, so the two entity names we are dealing with are com.apple.bookmarks.Bookmark and com.apple.bookmarks.Folder.

In the first phase of a sync, we push our data to the Truth. Since this example application does not retain any data this will be very quick. In the first part of this push, our application needs to list all of the data objects it knows about for each entity type. The method that handles this is -recordsForEntityName:moreComing:error:. This method returns a dictionary of all the data for a specific entity name.

In that NSDictionary, the key is the unique identifier for that record which is provided by the Truth. The object is another NSDictionary with the properties of the object contained within. In our example application, this method is implemented as follows:

-(NSDictionary*)recordsForEntityName:(NSString*)entityName 
                          moreComing:(BOOL*)moreComing 
                               error:(NSError**)outError
{
  NSMutableDictionary *dict = [NSMutableDictionary dictionary];

  for (NSString *recordID in [[self recordLookup] allKeys]) {
    id object = [self objectForRecordIdentifier:recordID];
    NSString *objectEntityName = [[object class] entityName];
    if (![objectEntityName isEqualToString:entityName]) {
      continue;
    }
    [dict setValue:[object fullRecord] forKey:recordID];
  }

  *moreComing = NO;
  return dict;
}

The moreComing flag is used when an application wants to send the data in batches. If it is set to YES then the method will be called repeatedly with the same entityName until a NO is received. NOTE: This method is normally only called on the first sync.

Once our application has told the Truth about all of the data it knows about, the next part of the push phase is to send any changes that have occurred since the last sync. This is handled in one of two methods. If our application can accurately describe each property that has changed then the -changesForEntityName:moreComing:error: method can be used. If, however, we only know that an object has changed but not which specific properties, then the -changedRecordsForEntityName:moreComing:error: method would be used. The former method is easier on the sync engine since it does not need to work out which properties changed. In our implementation we will be using the latter method to make our lives easier.
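To make the batching behaviour concrete, here is a rough sketch of the same method handing records back in slices. It is an illustration under assumptions, not code from the example application: kBatchSize and the batchOffset instance variable (an NSUInteger that starts at 0) are hypothetical names.

```objc
#import <SyncServices/SyncServices.h>

// Hypothetical: kBatchSize and the batchOffset ivar are not part of the
// example application.
static const NSUInteger kBatchSize = 100;

-(NSDictionary*)recordsForEntityName:(NSString*)entityName
                          moreComing:(BOOL*)moreComing
                               error:(NSError**)outError
{
  // Collect the identifiers for this entity in a stable order so that
  // successive calls walk through the same list.
  NSMutableArray *ids = [NSMutableArray array];
  for (NSString *recordID in [self recordLookup]) {
    id object = [self objectForRecordIdentifier:recordID];
    if ([[[object class] entityName] isEqualToString:entityName]) {
      [ids addObject:recordID];
    }
  }
  [ids sortUsingSelector:@selector(compare:)];

  // Work out which slice to return on this call.
  NSUInteger start = MIN(batchOffset, [ids count]);
  NSUInteger length = MIN(kBatchSize, [ids count] - start);

  NSMutableDictionary *dict = [NSMutableDictionary dictionary];
  for (NSString *recordID in [ids subarrayWithRange:NSMakeRange(start, length)]) {
    id object = [self objectForRecordIdentifier:recordID];
    [dict setObject:[object fullRecord] forKey:recordID];
  }

  // Ask to be called again while records remain; reset for the next entity.
  batchOffset = start + length;
  *moreComing = (batchOffset < [ids count]);
  if (!*moreComing) batchOffset = 0;
  return dict;
}
```

For a data set as small as the bookmarks in this example a single batch is plenty; slicing only starts to pay off when building every record dictionary at once would be expensive.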

-(NSDictionary*)changedRecordsForEntityName:(NSString*)entityName 
                                 moreComing:(BOOL*)moreComing 
                                      error:(NSError**)outError
{
  NSMutableDictionary *dict = [NSMutableDictionary dictionary];
  for (NSString *key in [[self recordLookup] allKeys]) {
    id object = [[self recordLookup] objectForKey:key];
    if (![object updated]) continue;
    if (![[object entityName] isEqualToString:entityName]) continue;
    [dict setObject:[object fullRecord] forKey:key];
  }
  *moreComing = NO; // everything is returned in a single batch
  return dict;
}

Here we are looping through all of our objects (we store them in an NSDictionary as well to make life easier in the demo) and if an object has the right entity name and is flagged as updated we send its data back. Again, we can set the moreComing flag to YES if we need to handle the data in batches.
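For comparison, the property-level alternative might be sketched as follows. This is not part of the example application. It assumes each object can report which of its properties changed through a hypothetical -changedPropertyNames method, and it builds the ISyncChange objects with +changeWithType:recordIdentifier:changes: using the same keys that appear later in this article.

```objc
#import <SyncServices/SyncServices.h>

-(NSArray*)changesForEntityName:(NSString*)entityName
                     moreComing:(BOOL*)moreComing
                          error:(NSError**)outError
{
  NSMutableArray *changes = [NSMutableArray array];
  for (NSString *key in [self recordLookup]) {
    id object = [[self recordLookup] objectForKey:key];
    if (![object updated]) continue;
    if (![[[object class] entityName] isEqualToString:entityName]) continue;

    // Describe each changed property as a set or a clear.
    NSMutableArray *propertyChanges = [NSMutableArray array];
    // Hypothetical: -changedPropertyNames returns an NSArray of NSString.
    for (NSString *name in [object changedPropertyNames]) {
      id value = [object valueForKey:name];
      NSDictionary *propertyChange = nil;
      if (value) {
        propertyChange = [NSDictionary dictionaryWithObjectsAndKeys:
                          ISyncChangePropertySet, ISyncChangePropertyActionKey,
                          name, ISyncChangePropertyNameKey,
                          value, ISyncChangePropertyValueKey, nil];
      } else {
        propertyChange = [NSDictionary dictionaryWithObjectsAndKeys:
                          ISyncChangePropertyClear, ISyncChangePropertyActionKey,
                          name, ISyncChangePropertyNameKey, nil];
      }
      [propertyChanges addObject:propertyChange];
    }

    [changes addObject:[ISyncChange changeWithType:ISyncChangeTypeModify
                                  recordIdentifier:key
                                           changes:propertyChanges]];
  }
  *moreComing = NO;
  return changes;
}
```

The trade-off is bookkeeping: the application has to track changes per property rather than keeping a single updated flag per object.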

Mingle

Once we have pushed our data and our changes up to the sync engine and all of the other participants of this sync session have done the same, the Truth mingles the data. It will do its best to merge everything and if it fails it will ask the user to decide which change is the correct one.

Clients have no participation in this phase of the sync session. Once the mingle is complete the next phase is started automatically by the ISyncSessionDriver.

Pull

Once the data has been mingled, the Truth will then inform each client of any changes that have taken place. It is expected that each client will accept these changes and store them in its local data store. The method that is called is -applyChange:forEntityName:remappedRecordIdentifier:formattedRecord:error:. This method will be called once for each record that has changed as a result of the mingle so it needs to be as efficient as possible.

When this method is called, an ISyncChange object is passed in. This object contains all of the information about the change that needs to be performed. The first thing that needs to be looked at is what type of change it is. A change can add a new record, delete an existing record or modify a record. Our implementation of this method is as follows:

-(ISyncSessionDriverChangeResult)applyChange:(ISyncChange*)change 
                                forEntityName:(NSString*)entityName 
                     remappedRecordIdentifier:(NSString**)outRecordIdentifier 
                              formattedRecord:(NSDictionary**)outRecord 
                                        error:(NSError**)err
{
  NSDictionary *errDict = nil;
  id record = nil;
  switch ([change type]) {
    case ISyncChangeTypeDelete:
      [[self recordLookup] removeObjectForKey:[change recordIdentifier]];
      return ISyncSessionDriverChangeAccepted;
    case ISyncChangeTypeAdd:
      // The lookup dictionary retains the record, so autorelease our reference.
      if ([entityName isEqualToString:[FolderEntity entityName]]) {
        record = [[[FolderEntity alloc] init] autorelease];
      } else if ([entityName isEqualToString:[BookmarkEntity entityName]]) {
        record = [[[BookmarkEntity alloc] init] autorelease];
      } else {
        errDict = [NSDictionary dictionaryWithObject:@"Unknown entity"
                                              forKey:NSLocalizedDescriptionKey];
        // Guard against a NULL error pointer before writing through it.
        if (err) *err = [NSError errorWithDomain:@"CIMGF" code:8002 userInfo:errDict];
        return ISyncSessionDriverChangeError;
      }
      [record setRecordIdentifier:[change recordIdentifier]];
      [[self recordLookup] setObject:record forKey:[change recordIdentifier]];
      break;
    case ISyncChangeTypeModify:
      record = [[self recordLookup] objectForKey:[change recordIdentifier]];
      break;
    default:
      errDict = [NSDictionary dictionaryWithObject:@"Unknown type"
                                            forKey:NSLocalizedDescriptionKey];
      // Guard against a NULL error pointer before writing through it.
      if (err) *err = [NSError errorWithDomain:@"CIMGF" code:8001 userInfo:errDict];
      return ISyncSessionDriverChangeError;
  }
  for (NSDictionary *changeDict in [change changes]) {
    NSString *action = [changeDict valueForKey:ISyncChangePropertyActionKey];
    NSString *name = [changeDict valueForKey:ISyncChangePropertyNameKey];
    if ([name isEqualToString:kRecordEntityName]) {
      continue;
    }
    if ([action isEqualToString:ISyncChangePropertyClear]) {
      // Clear through KVC; NSObject's default -setNilValueForKey: raises.
      [record setValue:nil forKey:name];
      continue;
    }
    id value = [changeDict valueForKey:ISyncChangePropertyValueKey];
    [record setValue:value forKey:name];
  }
  [record setUpdated:NO];
  return ISyncSessionDriverChangeAccepted;
}

In this method we use a switch to determine what type of change it is. If it is a delete we remove the record from our global store and return happy. If it is an add then we create a new empty object and set the recordIdentifier. Lastly, if it is a modify we retrieve the record from our global store.

Once the object that is going to be changed has been referenced we loop through the changes which are stored in an array inside of the ISyncChange object. Each object in the array is a dictionary containing up to three values. The first value has a key of ISyncChangePropertyActionKey and the value can either be ISyncChangePropertySet or ISyncChangePropertyClear. If it is a clear then we are to set the property to nil. Otherwise we change the property to the value stored under key ISyncChangePropertyValueKey. The name of the property is stored under the key ISyncChangePropertyNameKey. If our properties use the same name as the sync objects they represent (and in our case they do) then this step is performed in a loop using KVC to update each property.

Once we have updated the object we return ISyncSessionDriverChangeAccepted to let the sync engine know that everything is happy. We could have returned ISyncSessionDriverChangeIgnored or ISyncSessionDriverChangeRefused if we did not accept the change for some reason. In either of those cases the object and its changes would not be sent to us again unless it changed in the future. Lastly, if there was a problem, we can return ISyncSessionDriverChangeError. If we return that, however, the sync engine expects us to populate the NSError pointer before returning.

There are two other pointers that get passed in that should be mentioned. The first one, called outRecordIdentifier in this example, is used in the case where we do not want to use the recordIdentifier being passed to us. Perhaps we have our own internal schema or we want to reference this record by another identifier. If we set this pointer then the Truth will remember that change and use the new identifier in the future when talking to us. It will not use this new identifier when talking to anyone else though. This is useful when syncing with a database that uses something other than a GUID for uniqueness. The second one, called outRecord in this example, is used to update the object. Perhaps we add a field on new records or change a field; in either case those changes can be sent back to the Truth via this pointer.
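As a rough illustration of those two out-parameters, the add case could hand back a locally generated identifier and the stored form of the record. This fragment is a sketch only: -localPrimaryKey is a hypothetical stand-in for whatever identifier the local store assigns, and the example application does not do this.

```objc
// Inside the ISyncChangeTypeAdd case, after the new record has been created.
// Hypothetical: -localPrimaryKey returns the identifier our own store uses.
NSString *localID = [record localPrimaryKey];
[record setRecordIdentifier:localID];
[[self recordLookup] setObject:record forKey:localID];

// Tell the Truth to use our identifier when it talks to this client.
if (outRecordIdentifier) *outRecordIdentifier = localID;

// If storing the record changed any property values, return the result
// (here, after the loop that applies the changes has run).
if (outRecord) *outRecord = [record fullRecord];
```

Both pointers are checked before they are written, for the same reason the error pointer should be.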

Other Data Source methods

Those are the major points of a sync. We of course have a lot more flexibility to cover edge cases and we can make changes at any step in the process via the delegate methods, but those are the points of contact for passing data back and forth. Other than those complicated touch points, there are a few other methods that need to be implemented to make a sync work. Those methods are more administrative and far less complicated.

  • clientIdentifier
    This is the unique string that identifies this client to the Truth. I like to use the bundle identifier here just to make things simple.
  • schemaBundleURLs
    This method returns an array of NSURL objects. Each NSURL object points to a schema either on disk or on the net that describes the data to be stored in the Truth. If we are creating a new set of data then we would need to generate a schema (and there is a template in Xcode for that purpose). If not then we need to reference the existing schema.
  • entityNamesToSync
    It is not necessary to sync every entity in a schema (although in our case we do since there are only two). If we only care about a subset of the data then we can request to sync only those entities we care about. In any case we pass back an array of strings here to let the sync engine know which entities we care about.
  • preferredSyncModeForEntityName:
    A fast sync is always the goal. However if we need to refresh an object or perform a slow sync (useful when a large amount of data has changed or we lost our copy of the data) then this is the method that determines it. This method will be called once for each entity we want to sync.
  • clientDescriptionURL
    The client description is a plist file that describes this client to the sync engine. For example, the icon used in dialogs is defined in this plist along with the human readable name of the sync client, etc. Also in this plist we define what properties of each entity we want synced. If we only want a subset of the properties we can control that here.
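Taken together, the administrative methods listed above might be sketched like this. The file names and paths are assumptions rather than values taken from the example application: the schema path is where the system Bookmarks schema is assumed to live, ClientDescription.plist is a hypothetical resource name, and -preferredSyncMode is assumed to be the getter that pairs with the -setPreferredSyncMode: call shown earlier.

```objc
#import <SyncServices/SyncServices.h>

- (NSString *)clientIdentifier
{
  // Reuse the bundle identifier as the unique client name.
  return [[NSBundle mainBundle] bundleIdentifier];
}

- (NSArray *)schemaBundleURLs
{
  // Assumed location of the existing Bookmarks schema; verify on your system.
  NSString *path = @"/System/Library/SyncServices/Schemas/Bookmarks.syncschema";
  return [NSArray arrayWithObject:[NSURL fileURLWithPath:path]];
}

- (NSArray *)entityNamesToSync
{
  // Both entities in the schema; return a subset here to sync less.
  return [NSArray arrayWithObjects:[FolderEntity entityName],
                                   [BookmarkEntity entityName], nil];
}

- (ISyncSessionDriverMode)preferredSyncModeForEntityName:(NSString *)entityName
{
  // Called once per entity; this sketch uses the same mode for all of them.
  return [self preferredSyncMode];
}

- (NSURL *)clientDescriptionURL
{
  // Hypothetical resource name; the plist must be in the application bundle.
  NSString *path = [[NSBundle mainBundle] pathForResource:@"ClientDescription"
                                                   ofType:@"plist"];
  return [NSURL fileURLWithPath:path];
}
```

Returning ISyncSessionDriverModeSlow or ISyncSessionDriverModeRefresh from -preferredSyncModeForEntityName: for a particular entity is how a client would ask for something other than a fast sync.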

Conclusion

I would not be at all surprised if that was as clear as mud. The Sync Services framework is complicated, no doubt about that. That complication has a purpose though: it is incredibly powerful as well. Now that I have scratched the surface in this article I will be adding other articles to explain some of the more complicated aspects such as adding properties to an existing schema, mid-sync data changing and more.

If a part of this is unclear, please post a comment and I will update it to help clarify. I would also recommend downloading the example application that is attached as I find the code much easier to understand.

Sync Services Tutorial 1

1. There is an option to have a helper tool that sync services will run when your application is not running but that is beyond the scope of this article.

Comments

smorr says:

You. Are. A. God.

I spent last week banging my head against the wall trying to incorporate syncing into my upcoming Mail Act-On 2 (plug in for Mail) — trying to futz in new entities etc into Mail’s existing schema — and I had basically given up —

This provides a much more clear approach and it may indeed work — fingers crossed.

Thanks Thanks Many Many Thanks

Marcus Zarra says:

Thanks smorr, glad you liked the article.

BTW, if anyone reads this or any other article and feels the need to respond on Twitter: DON’T!

I will not respond to twitter comments about CIMGF. Put the comments here where they can be preserved and used by the other readers.

Thanks!

Ross P says:

In cases where the clients are NOT always connected to the server, a sync CANNOT take place after each change at a client, but must be done on a delayed basis. In such cases, how does Core Data KNOW that an entity or property has changed? Does it compare the “Last Modified” timestamp to the Last Sync? Does it set a flag at the entity level and then clear those flags after a successful sync? Does it set a flag at the property level and then clear those flags after a successful sync? Does it create a temporary snapshot at the beginning of a sync session and then permanently save that snapshot after the sync completes successfully?

Marcus Zarra says:

The truth database is on the same operating system as the application so the client is always connected to the server.

In addition, both sides keep track of deltas since last sync.

Ross P says:

No, I’m talking about a case where the “client” is an iPhone (for example) that may not always be connected to a server (for example, a change is made while traveling on a plane). So, when the sync does occur, the client must know what changed since the last sync. Thus, my question — how does it know what changed? Does it compare the “Last Modified” timestamp to the Last Sync? Does it set a flag at the entity level and then clear those flags after a successful sync? Does it set a flag at the property level and then clear those flags after a successful sync? Does it create a temporary snapshot at the beginning of a sync session and then permanently save that snapshot after the sync completes successfully? Does it find some other way to identify changes?

Marcus Zarra says:

There is no sync services on the iPhone.

IF there were then it would keep a delta file as I mentioned.

Without a delta file then each property of each record would need to be compared to determine what has changed.

Ross P says:

Interesting. In Apple’s Sync Services Programming Guide, it states “Users might have several computers, at home and at work, an iPhone, an iPod and other cell phones. Users will want to automatically sync data on all their computers and devices—especially, contacts and calendars which are supported on most devices.”

That appears to say that sync services DOES exist on the iPhone.

Marcus Zarra says:

It does not exist on the iPhone as a public API and the hooks for Core Data are also not in any of the public API. So at this time it does not exist on the iPhone.

Ross P says:

So, if I have an iPhone to use while traveling, and a Mac desktop back at my office, and I want to keep my Contacts in sync at all times (when I have an Internet connection for the iPhone), how do I do that?

Marcus Zarra says:

That is an Apple application which has absolutely nothing to do with this topic. Apple is using a PRIVATE API. They could be using Sync Services or little furry rodents. Doesn’t matter, it’s a private API.

3rd Party Developers cannot use Sync Services on the iPhone at this time even if it does potentially exist for Apple to use.

Is there a reason you are going down this path or are you just trying to wind me up?

Ross P says:

Just trying to assess the feasibility of investing in the development of an application that requires sync and might potentially be integrated with Contacts.

Marcus Zarra says:

There is no built in sync’ing mechanism at this time for iOS that a 3rd party developer can access. There are a few open source and other third party solutions being developed such as ZSync.

Ross P says:

Then, what’s the point of Apple’s “Sync Services Programming Guide”?

Marcus Zarra says:

Sync Services was built long before the iPhone was on the drawing board. It is designed for sharing data between multiple applications (two different browsers sharing bookmarks) and for sharing data through MobileMe. Both of which are extremely useful.

Ross P says:

So, let’s start back at the top, and assume we are using (or trying to use) Apple Sync Services. In cases where the clients (assume Mac laptops traveling on airplanes) are NOT always connected to the server (assume a Mac desktop back at the office), a sync CANNOT take place after each change at a client, but must be done on a delayed basis. In such cases, how does Core Data KNOW that an entity or property has changed? Does it compare the “Last Modified” timestamp to the Last Sync timestamp? Does it set a flag at the entity level and then clear those flags after a successful sync? Does it set a flag at the property level and then clear those flags after a successful sync? Does it create a temporary snapshot at the beginning of a sync session and then permanently save that snapshot after the sync completes successfully?

Marcus Zarra says:

I have already answered this question twice. The truth is on the desktop. There is no server. Think of MobileMe as the only existing sync client that is external to the desktop that sync services is being run on.

In addition, Core Data creates a delta file between syncs; even when the sync server is sitting right there. The delta file is a file that records what objects have changed since the last sync. Core Data then uses that delta file for the next sync to make it faster. On the desktop this file is called the fast sync file. It is part of the Sync Services framework that is added onto Core Data on the desktop.

Keeping track of timestamps is a bad idea in syncing because no two clocks are exactly the same. Keeping a delta of what has changed (Core Data actually keeps this at the entity level) is the most sane way to keep track of what needs to be synced next.

Ross P says:

But, if the client and the truth are both on the same desktop, then all the timestamps would be using the same clock, so no issue as to clock accuracy. Plus, if “conflict resolution” can be resolved by “last modified”, then doesn’t this suggest that timestamps really are used?

Marcus Zarra says:

No, clocks can and do change so timestamps can never be trusted in syncing.

Ross P says:

So, what does “last modified” really mean?

newacct says:

for (NSString *recordID in [[self recordLookup] allKeys])

should be written as

for (NSString *recordID in [self recordLookup])

Agent Bangla says:

Hello Marcus,

Thanks for the great tutorial. I have found it handy as I am trying to develop an application that can read/write iCal data from Calendar.SyncSchema. I am finding it hard to understand the effects of Sync Schema relationships. Could you please help me to understand, how the schema relationships are maintained in your project or more specifically with ISyncSessionDriver?
Thanks