Slide 2: Search Content Enrichment and Extensibility in SharePoint 2013
Corey Roth (@coreyroth)
OFC-B269
Slide 3: Key Takeaways
Understanding of search extensibility points
Basics of Content Enrichment
Custom Index Connectors
Custom entity extraction
Slide 4: State of Search
SharePoint 2013: Result Sources, Display Templates, Result Types, Query Rules, Content Enrichment Web Service, Custom Entity Extraction
SharePoint Online: Result Sources, Display Templates, Result Types, Query Rules
Slide 5: This talk is about on-premises SharePoint
Slide 6: SharePoint 2013 Search Architecture
[Architecture diagram. Components: Search Admin, Content, UX, Crawl, Content Processing, Index, Query Processing, WFE, API, Analytics Processing. Databases: Crawl, Search Admin, Link, Analytics Reporting. Legend: FAST Search Index, Public API, Unit of scale/role boundary, Extensibility Points, Query Features.]
Slide 7: Content Enrichment Web Service
Slide 8: What is it?
Modifies properties after crawling an item
Custom web service external to SharePoint
Receives input managed properties
Returns output managed properties
Executes during content processing
Slide 9: Content Enrichment Service
Slide 10: Why Content Enrichment?
Integrate external data into search results
Good for classification and tagging
Clean up or validate existing properties
Slide 11: Implementing CEWS
Create a new WCF Service Application project
Add a reference to Microsoft.Office.Server.Search.ContentProcessingEnrichment.dll
Implement IContentProcessingEnrichmentService
Implement the ProcessItem method
Return output managed properties
Register content enrichment with PowerShell
Slide 12: Registering CEWS with PowerShell
Create the configuration object with New-SPEnterpriseSearchContentEnrichmentConfiguration and set its endpoint
Specify input and output properties
Optionally specify trigger and other values
Save values with Set-SPEnterpriseSearchContentEnrichmentConfiguration
Slide 13: Configuration Object Properties
Parameter | Description
Endpoint | URL to the endpoint
InputProperties | Managed properties to send into the service
OutputProperties | Managed properties returned by the service
SendRawData | Sends the raw data of the file to the service. (Read-only)
MaxRawDataSize | Maximum size in kilobytes of raw data to send
Trigger | If the trigger returns true, the service is called (e.g., Contains(ContentType, 'Document'))
DebugMode | When true, Trigger, InputProperties, and OutputProperties are ignored
Timeout | Time in milliseconds that search will wait for your service (default: 5000)
Slide 14: Registering CEWS with PowerShell
# Get the Search Service Application and create an empty configuration object
$ssa = Get-SPEnterpriseSearchServiceApplication
$config = New-SPEnterpriseSearchContentEnrichmentConfiguration
# The endpoint URL must be a quoted string
$config.Endpoint = "http://server/service.svc"
$config.InputProperties = "Author", "LastModifiedTime"
$config.OutputProperties = "TestProperty"
# Send the raw file content as well, up to 8192 KB
$config.SendRawData = $True
$config.MaxRawDataSize = 8192
Slide 15: Registering CEWS with PowerShell
$config.Trigger = "Contains(ContentType, 'Document')"
Set-SPEnterpriseSearchContentEnrichmentConfiguration -SearchApplication $ssa -ContentEnrichmentConfiguration $config
Slide 16: Demo Use Case
Oil and Gas industry
Integrate external well data given a unique id
Retrieves additional data via service and returns new managed properties
Use display templates to show values
Implement new refiners
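The flow of this use case can be sketched in a language-neutral way. The real service is a WCF service that implements ProcessItem in .NET; the Python below only mirrors the shape of that logic and is not the SharePoint API. The property names WellId, WellOperator, and WellField, and the in-memory WELL_DATA lookup, are hypothetical stand-ins for the external well data service.

```python
# Illustrative sketch only: mirrors the enrichment flow, not the SharePoint API.
# WellId, WellOperator, WellField and WELL_DATA are hypothetical names.

# Stand-in for the external well data system, keyed by unique well id.
WELL_DATA = {
    "W-1001": {"WellOperator": "Contoso Energy", "WellField": "North Field"},
}


def process_item(input_properties):
    """Return output managed properties for one crawled item.

    input_properties maps managed property names to values, in the role of
    the configured InputProperties. The returned dict plays the role of the
    configured OutputProperties.
    """
    well_id = input_properties.get("WellId")
    if not well_id:
        # No id on the item: return nothing and leave the item as crawled.
        return {}
    record = WELL_DATA.get(well_id)
    if record is None:
        # Unknown id: also leave the item unchanged.
        return {}
    return {
        "WellOperator": record["WellOperator"],
        "WellField": record["WellField"],
    }


if __name__ == "__main__":
    print(process_item({"WellId": "W-1001", "Author": "Corey Roth"}))
    print(process_item({"Author": "Corey Roth"}))
```

The returned properties are the ones that display templates and refiners would then surface in the results page.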
Slide 17: Content Enrichment Web Service
Corey Roth (@coreyroth)
Slide 18: Triggers
Allow calling CEWS conditionally
Web service executed when condition is true
Examples
!IsNullOrEmpty(MyManagedProperty)
Contains(ContentType, "Document")
http://bit.ly/1neTva0
Slide 19: Tips for working with CEWS
Property names are case-sensitive
No aliases
Work with small datasets
Managed properties must already exist
Watch out for read-only properties
Increase the timeout when debugging
Slide 20: CEWS Pipeline Toolkit
Toolkit and working examples for Content Enrichment
Supports SharePoint 2013 and FS4SP
Currently available through MCS or Premier
Publicly available "soon"
http://bit.ly/1bC0x25
Slide 21: Custom connectors with BCS
Slide 22: Custom Connectors
Similar to a protocol handler
Crawl content in custom systems when no connector is available
Support for incremental crawl and security trimming
Mapping interface between BCS and Search
Slide 24: Implementing a custom connector
Use class library project type
Reference Microsoft.BusinessData.dll and Microsoft.Office.Server.Search.Connector.dll
Function | Description
Connector Class | Defines type. Implements Finder and SpecificFinder
System Utility Type | Implements ISystemUtility
Input URI Processor | Inherits LobUri class. Maps URLs from Search to BCS
Output URI Processor | Implements INamingContainer. Maps URLs from BCS to Search
Model XML | BCS schema for the connector
Slide 25: Installing a custom connector
Install assembly to GAC
Copy model file to file share
Register with New-SPEnterpriseSearchCrawlCustomConnector
Add protocol handler registry key
Restart search service
Create content source
Run a full crawl
Slide 26: Custom File Connector
Corey Roth (@coreyroth)
Slide 27: Who should use this?
Slide 28: No one!
Slide 29: Why?
Extremely complicated
Only suitable for ISVs
Solution is not cloud-focused
Will most likely never be supported in Office 365
Slide 30: But wait…
Slide 31: Leverage Search Indexing Toolkit (SIT)
Provides a generic implementation of a custom indexing connector
Provides support for crawling and security trimming
Demonstrated at SPC14 (SPC414)
http://bit.ly/1ixcrNO
Also "coming soon"
Slide 32: Custom Entity Extraction
Slide 33: What is it?
Adds refiners to content without metadata
Seed search with dictionary (CSV file)
Matches values in dictionary to create refiners
Similar to FS4SP offering but more flexible
Slide 34: Entity Extraction Process
Slide 35: What kind of data to use?
Product names
Departments / Business Units
Vendors
Wells, leases, or facility names
Slide 36: Dictionary Example
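A dictionary file for custom entity extraction might look like the output of this sketch. It assumes the two-column layout with the header row Key,Display form, where Key is the text matched in content and Display form is the value shown in the refiner. The sample entries, the function names, and the file handling are hypothetical.

```python
import csv
import io

# Hypothetical sample entries: (key matched in content, display form in the refiner).
ENTRIES = [
    ("contoso", "Contoso"),
    ("fabrikam", "Fabrikam"),
    ("north field", "North Field"),
]


def build_dictionary_csv(entries):
    """Return CSV text with the assumed header row 'Key,Display form'."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Key", "Display form"])
    for key, display in entries:
        writer.writerow([key, display])
    return buffer.getvalue()


def write_dictionary(path, entries):
    """Save the dictionary as UTF-8 so non-ASCII display forms are preserved."""
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(build_dictionary_csv(entries))


if __name__ == "__main__":
    print(build_dictionary_csv(ENTRIES))
```

Generating the file from a script like this keeps the dictionary in step with a source system such as a product or well list, rather than maintaining the CSV by hand.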
Slide 37: Why is it important?
Most organizations don't have any metadata
Provides basic refinement with minimal effort
Works against any content source (even file shares)
Slide 38: 12 custom entity extractors
Type | Description | Managed Properties
Word Extraction | Case-insensitive, word match | WordCustomRefiner1, WordCustomRefiner2, WordCustomRefiner3, WordCustomRefiner4, WordCustomRefiner5
Word Part Extraction | Case-insensitive, word part match | WordPartCustomRefiner1, WordPartCustomRefiner2, WordPartCustomRefiner3, WordPartCustomRefiner4, WordPartCustomRefiner5
Word Exact Extraction | Case-sensitive, word match | WordExactCustomRefiner
Word Part Exact Extraction | Case-sensitive, word part match | WordPartExactCustomRefiner
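The difference between the four extractor types comes down to two switches: whole word versus word part, and case-insensitive versus case-sensitive. The sketch below illustrates that distinction with regular expressions. It is a simplification under stated assumptions, not the search engine's actual tokenizer, and find_entities is a hypothetical helper.

```python
import re


def find_entities(text, entries, whole_word, case_sensitive):
    """Return the dictionary entries found in text.

    whole_word=True   -> entry must match a complete word (Word, Word Exact)
    whole_word=False  -> entry may match inside a word (Word Part, Word Part Exact)
    case_sensitive    -> True for the two "Exact" extractor types
    """
    flags = 0 if case_sensitive else re.IGNORECASE
    found = []
    for entry in entries:
        pattern = re.escape(entry)
        if whole_word:
            # \b approximates a token boundary; real tokenization may differ.
            pattern = r"\b" + pattern + r"\b"
        if re.search(pattern, text, flags):
            found.append(entry)
    return found


if __name__ == "__main__":
    text = "Drilling report for the Contoso5000 rig and CONTOSO headquarters"
    entries = ["Contoso"]
    print("Word:", find_entities(text, entries, True, False))
    print("Word Part:", find_entities(text, entries, False, False))
    print("Word Exact:", find_entities(text, entries, True, True))
    print("Word Part Exact:", find_entities(text, entries, False, True))
```

In this sample only Word Exact finds nothing: "CONTOSO" differs in case, and "Contoso5000" contains the entry only as part of a longer word.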
Slide 39: Implementing entity extraction
Slide 40: Dictionary names for PowerShell
Reference entity extractors by name in PowerShell
Microsoft.UserDictionaries.EntityExtraction.Custom.Word.n [where n = 1, 2, 3, 4, or 5]
Microsoft.UserDictionaries.EntityExtraction.Custom.WordPart.n [where n = 1, 2, 3, 4, or 5]
Microsoft.UserDictionaries.EntityExtraction.Custom.ExactWord.1
Microsoft.UserDictionaries.EntityExtraction.Custom.ExactWordPart.1
Parameter | Description
DictionaryName | Specifies which entity extractor to use: one of the dictionary names above
Filename | Full UNC path to the CSV file containing the dictionary
SearchApplication | Search Service Application object
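Each dictionary name feeds one managed property, which is what makes up the 12 extractors. The sketch below enumerates that pairing. The mapping is an assumption based on the naming pattern of the dictionaries and managed properties, and extractor_properties is a hypothetical helper, so verify the names against your farm before relying on them.

```python
PREFIX = "Microsoft.UserDictionaries.EntityExtraction.Custom."


def extractor_properties():
    """Map each dictionary name to the managed property it is assumed to fill."""
    mapping = {}
    for n in range(1, 6):
        mapping[f"{PREFIX}Word.{n}"] = f"WordCustomRefiner{n}"
        mapping[f"{PREFIX}WordPart.{n}"] = f"WordPartCustomRefiner{n}"
    # The two case-sensitive extractors have a single dictionary each.
    mapping[f"{PREFIX}ExactWord.1"] = "WordExactCustomRefiner"
    mapping[f"{PREFIX}ExactWordPart.1"] = "WordPartExactCustomRefiner"
    return mapping


if __name__ == "__main__":
    for dictionary_name, managed_property in extractor_properties().items():
        print(dictionary_name, "->", managed_property)
```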
Slide 41: Entity Extraction with PowerShell
$searchApp = Get-SPEnterpriseSearchServiceApplication
# Filename is described as a full UNC path; substitute one for the relative path if the import fails
Import-SPEnterpriseSearchCustomExtractionDictionary -SearchApplication $searchApp -Filename .\WordPartExtraction.csv -DictionaryName Microsoft.UserDictionaries.EntityExtraction.Custom.WordPart.1
Slide 42: Custom Entity Extraction
Corey Roth (@coreyroth)
Slide 43: Key Takeaways
Understanding of search extensibility points
Basics of Content Enrichment
Custom Index Connectors
Custom entity extraction
Slide 44: Resources
How to: Content Enrichment Web Service
http://bit.ly/1fMIOcL
My File Custom Connector Example
http://bit.ly/1hLSC1D
How to: Custom Entity Extraction
http://bit.ly/1iek3ET
Slide 49: © 2014 Microsoft Corporation. All rights reserved.