Ever since Sitecore 7 introduced the ContentSearch namespace, more than half of the projects I’ve worked on have required me to update Lucene indexes from code. This topic has proven to be even more complicated than publishing Sitecore items programmatically! This post will examine some of the ways you can update the Sitecore indexes as needed from your code, along with my opinions on when you should use each method.
Types of updates
First of all, we need to distinguish between the different kinds of index update actions. Here is a brief table that should help make the differences clear.
|Given an index, deletes all index documents before crawling and re-indexing all documents again|
|Given an index and a starting item, re-indexes (creates or updates the index documents for) that item and all of its descendants.|
|Given a starting item, calls
|Given a document identifier and an index, updates the index for specified item only. If specified item cannot be found by the crawler, then it is deleted from the index. This also calls
|Given a list of document identifiers and an index, calls
|Same as Incremental except that it will start the index operation even if indexing is paused or stopped.|
|Given a document identifier and an index, it removes that document from the index.|
There are basically two ways to initiate index operations:
- Calling the methods directly on the Index object
- Using IndexCustodian
Like the PublishManager, the IndexCustodian does some extra work that is useful, like queuing and running asynchronous indexing jobs when you call DeleteItem, or refreshing an item in all indexes when calling RefreshTree. Because of this, you should usually prefer to use the IndexCustodian. On the other hand, if you want to deliberately perform a synchronous action on an index, you would not use the IndexCustodian.
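To make the difference concrete, here is a minimal sketch of both approaches for a refresh. It assumes the Sitecore 7 ContentSearch assemblies are referenced and that an index named "sitecore_web_index" exists. The class and method names are hypothetical, and exact signatures may differ between Sitecore versions, so treat this as an illustration rather than a definitive implementation.

```csharp
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.Maintenance;
using Sitecore.Data.Items;

// Hypothetical helper class, for illustration only.
public static class IndexUpdateSketch
{
    // Assumed index name; use whichever index your solution defines.
    private const string IndexName = "sitecore_web_index";

    // Option 1: call the method directly on the index object.
    // The work runs on the calling thread, so this call returns
    // only after the refresh has finished.
    public static void RefreshDirectly(Item startItem)
    {
        ISearchIndex index = ContentSearchManager.GetIndex(IndexName);
        index.Refresh(new SitecoreIndexableItem(startItem));
    }

    // Option 2: hand the work to IndexCustodian, which queues
    // indexing jobs for the item and its descendants instead of
    // running the refresh inline.
    public static void RefreshViaCustodian(Item startItem)
    {
        IndexCustodian.RefreshTree(new SitecoreIndexableItem(startItem));
    }
}
```

The second option is usually the better default for the reasons given above. The first is for the cases where you deliberately need the index to be up to date before your code continues.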
There is an additional consideration when updating search indexes. Since each index lives as a physical file set on each server, updating the index on one server does not synchronize it to the others. When a new item gets added to the web index on the Content Management server, it is not necessarily added to the web index on the Content Delivery servers.
I won’t go into it here except to point you to an article by John West that explains it well. What you are looking for in that article is the “RemoteRebuildStrategy”.