Showing posts with label cross-farm services. Show all posts
Showing posts with label cross-farm services. Show all posts

Friday, 17 February 2012

Search result URLs not correctly mapped to the current zone when the content is crawled in a different farm

This one is just a nice-to-know. I didn't know about this setting, so I'll briefly document it here for future reference.

Situation
The set-up consists of a typical enterprise services farm (http://technet.microsoft.com/en-us/library/cc560988.aspx#Enterprise):

  • Farm A hosts the collaboration web-applications, in which the users upload their documents and other content.
  • Farm B is a remote services farm and hosts the cross-farm service applications. This includes all of our search service application(s). These service applications are published and then consumed on Farm A. 
Farm B crawls farm A using the web-application's default zone url, e.g. http://sps2010. When a user performs a search in farm A on the default zone, all of his search results link back to the default zone. So far so good.

The problem
Now lets say we add a new Intranet AAM zone on farm A's web-applications, which points to http://sps2010intra. When a user performs a search from that zone, all of his search results will again refer to the default zone!

The explanation
This is actually pretty logical: the content is crawled on farm B, using the default zone's URL.
In a single farm setup, the search service application will have access to the crawled web-application's alternate access mappings and will take those into account when a query is done on the crawled content.
Here however, the search service application is hosted on a different farm (farm B) and that search service application doesn't know how the alternate access mapping is set up on farm A!

The solution
When you google this problem, you will get a lot of posts saying that you need to use the Server Name Mapping setting on your search service application. There's a good overview of this setting here.

Personally I would not recommend using that approach, as it only works for one AAM zone.  There's no way to map multiple zones (intranet AND internet for instance) using the server name mapping route.

A much more flexible solution is to mimic the crawled content farm's alternate access mappings on the remote services farm. If you did set-up server name mappings to fix this: delete them and recrawl! ;)
  1. Open the central admin site on your remote services farm (farm B).
  2. Click on Application Management and select Configure alternate access mappings (in the web applications section).
  3. Now click on Map to external resource in the toolbar. This little bugger allows you to create an alternate access mapping for resources that aren't in the farm.
  4. Give it a meaningful name, like Collab {Name of the crawled webApp}. Enter the crawled web-application's default zone URL in the URL protocol, host and port field.
  5. Click on Save to create the external resource AAM. Now you can set this resource up as if it were a web-application on your remote services farm!
  6. Use Edit Public URLs and Add Internal URLs to configure the external resource's AAM just like it is set up on your crawled farm (farm A).
That's it ... no need to recrawl or anything. If all was done well, search results on the content farm (Farm A) will now be correctly translated according to their zone, even while the content was indexed on another farm.





SP2010 - Search Crawl not working on remote services farm


A quick post on two issues I've encountered lately. Both are related to SharePoint 2010 Search crawls that suddenly stopped working.
I'm just going to post the symptoms and what worked as solution for me.

The SharePoint set-up consists of two farms:
  • Farm A hosts the collaboration web-applications, in which the users upload their documents
  • Farm B hosts most of the service applications, including the search service application(s). These search service applications crawl content from Farm A. Furthermore, these service applications are consumed on Farm A (so people on that farm can search the crawled content).

Issue #1
The crawl log was filled with the following errors when crawling farm A:
The SharePoint item being crawled returned an error when requesting data from the web service. ( Error from SharePoint site: Value does not fall within the expected range. ).

Solution
Apparently there was an issue with the web-apps that were being crawled. In order to get them crawled again, I
  1. Detached the content databases linked to the web-apps on farm A
  2. Removed the web-app (make sure to delete the related IIS site)
  3. Recreate the web-app
  4. Attach the old content db again
Simple as that.

Issue #2
Some weeks later, my crawl log got filled with this lovely message:
The SharePoint item being crawled returned an error when requesting the data from the web service. ( Error from SharePoint site: *** Could not find part of the path 'c:\TEMP\gthrsvc_OSearch14\{random-text}.tmp' 

Solution
Despite the error, that path was indeed present on all my machines in farm B. I tried fiddling with the permissions on that folder, but nothing helped.

In the end I just:
  1. Removed the consumed search service applications on farm A.
  2. Removed the search service application(s) on farm B
  3. Recreated the search service application(s) on farm B. Luckily for me, I had already scripted this procedure in PowerShell.
  4. Publishd the service apps again on farm B and consume them on farm A.
  5. Recrawled all content
Conclusion: if some part of SharePoint is fucked up, you can usually fix it by removing & recreating it :)