Slide1
Building Search-Driven Applications with Microsoft FAST Search Server 2010 for SharePoint
Jeff Fried
CTO & VP Engineering
BA Insight
OSP311
Slide2
Slide3
About The Speaker: Jeff Fried
CTO, BA Insight
Previously VP Advanced Solutions for FAST, then Microsoft technical product manager for FAST and SharePoint 2010 Search
Author: Professional Microsoft Search; Professional SharePoint 2010 Development
Blog: http://www.ba-insight.net/Blogs/Sharepoint-Search-Expert/default.aspx
Twitter: http://twitter.com/jefffried
Based in Boston, MA, USA
Slide4
Outline
Slide5
Case Study: Pharma R&D
R&D researched where it was spending its time and discovered that 56% of R&D time (human capital and budget) was spent duplicating existing research.
It blew them away, until they reviewed why…
Slide6
Top 3 reasons for 56% effort duplication:
Research done in separate groups
Seemingly unrelated research projects
Later in lifecycle (mfg, reg/test)
Data not accessible
Isolated content source
Restricted / limited access
Source not searchable
Special knowledge required
Data not linked
Various names/changes leave data disconnected
People not connected to data (experts)
Data managed in many unconnected systems
Slide7
Case Study: Pharma R&D
Documentum Image
SharePoint Doc
Regulatory Record
MEDLINE article
Multiple Sources One Search
Search: amgen 655
Relationships Discovered:
Antibodies: mAb
Receptors: DR5, IGF-1R
Labs: Oncology 1
People: David Chang
Slide8
Recognizing Search Driven Applications
They are everywhere
“How do I support the unique search needs of teams and work that impact our business?”
To do so, you need a search platform that has
A deep understanding of your information
Flexible relevance to meet diverse needs
A customizable UX to increase user efficiency
Sales: 360° Customer Insight
Services: Knowledge Browser
Marketing: Competitive Intelligence
Research & Development: Innovation Portal
Support: Call Center Advisor
Operations: Systems/Logistics Portal
Legal, HR, IT, Finance, …
Slide9
Slide10
Slide11
Slide12
Slide13
Search-Based Applications
Research Portal
Unified View
Customer Service
Compliance
Analyst’s workbench
Management Adviser
Innovation Center
Voice of the Customer
Logistics Center
Consolidated Dashboard
Call Center
Online Service
Sales Dashboard
Fraud Center
E-Discovery
Info Governance
Search Based Applications are found in every industry and every function
Traditionally, search vendors describe these as possibilities using their platforms, but implementation costs have been >$1M
Packaged apps are now possible, leveraging the SharePoint ecosystem
Slide14
Demo: Search Driven Application
Contoso Consulting Demo
Slide15
How would you create this?
Content Crawling: bring in data from lots of places
OOB connectors to SharePoint (reports, account documents), shared files, and CMS systems across multiple divisions of Contoso
Crawl web intelligently for background content
Content processing: creating metadata
Names of projects, offerings, key concepts, clients, external experts
Industry terms & taxonomy (from external sources)
Synonyms for key concepts
OOB web parts configured for style
Federation, People Search, Search actions, Scopes
Custom Relevance Profile
Custom web parts for visual navigation
Tab-like selectors, sliders, maps, and taxonomy-aware refiners
Build PowerPoint on the Fly
Custom development using the Open XML SDK
SharePoint workflows for act-on-selected-items
Slide16
FAST Search for SharePoint: Designed for Customization
[Architecture diagram. Components: Crawler; Content Processor; Indexer; Index Partition; Query Processor; Search Center; People Search; Federation (OpenSearch). Sources: Content; User Profiles; … Content Processing Pipeline stages: Format Conversion; Language Detection; Entity Extraction; Lemmatization; Mapper; … Customization areas: Metadata Creation; Relevance Control; User Context; Indexing Connectivity; User Experience; Federation]
Slide17
FAST Search for SharePoint UI OOB
Built on SharePoint Search Center
Open
Web Parts, Federation, query suggestions, related queries, Did you mean?
Visual results
Thumbnails for Word and PowerPoint
Visual Best Bets
Preview in browser
Deep Federation
Thumbnails
Previews
Sorting
Slide18
What can I do with a Managed Property?
Metadata quality is critical to a good search experience
Precise hit counts in deep refiners are computed across the whole result set.
Concepts, Products, Companies, File Formats, and many more…
Enables deep refinement
Makes search conversational, guiding users to navigate and refine, while summarizing the results that are found
Enables precision relevancy
Managed properties are also used for relevancy tuning & ranking, multi-level sorting, and advanced (or fielded) search
Slide19
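To illustrate the managed property slide's point that deep refiner counts are computed across the whole result set, here is a minimal Python sketch. It is not FAST code: the sample hits, the `company` property, and the `refiner_counts` helper are all hypothetical. It only contrasts counting over every hit with counting over the first few hits.

```python
from collections import Counter

def refiner_counts(results, prop, sample_size=None):
    """Count the values of one managed property across search hits.

    sample_size=None counts every hit (deep refinement);
    a number counts only the first N hits (shallow refinement).
    """
    hits = results if sample_size is None else results[:sample_size]
    return Counter(hit[prop] for hit in hits if prop in hit)

# Hypothetical result set: six hits carrying a "company" managed property.
results = [
    {"title": "Q1 report", "company": "Contoso"},
    {"title": "Q2 report", "company": "Contoso"},
    {"title": "Proposal", "company": "Fabrikam"},
    {"title": "Contract", "company": "Fabrikam"},
    {"title": "Invoice", "company": "Fabrikam"},
    {"title": "Memo", "company": "Northwind"},
]

deep = refiner_counts(results, "company")                     # all six hits
shallow = refiner_counts(results, "company", sample_size=3)   # first three hits only
print(deep)
print(shallow)
```

In this toy data the shallow count never sees Northwind and undercounts Fabrikam, which is the kind of gap that exact, whole-result-set counts avoid.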
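PowerShell
# Look up a crawled property, create a managed property, then map one to the other.
$crawledProps = Get-FASTSearchMetadataCrawledProperty -Name "crawled1"
# Type is an integer code; 1 (Text) is assumed here for illustration.
$dataType = 1
$mp = New-FASTSearchMetadataManagedProperty -Name "managed1" -Type $dataType
New-FASTSearchMetadataCrawledPropertyMapping -ManagedProperty $mp -CrawledProperty $crawledProps
Slide20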
TechNet Script Repository
Slide21
Introducing the Processing Pipeline
Sequential stages perform specific tasks while ingesting content
Breaks down content to the smallest addressable chunks to build meaning
Understands file encoding, data formats, and written languages
Supports 400+ file formats, 80+ languages
Process your content to make it searchable
Normalizes content so that a consistent relevancy model can be applied
Identifies structured and unstructured metadata in your content
Maps document metadata to SharePoint Crawled Properties
A systematic approach to interpreting your content
Map Crawled Properties: Maps all of the metadata that was discovered by the various pipeline stages.
Web Link Analysis: Analyzes documents for hyperlinks, extracting anchor text, which reinforces the authority ranking of a document.
Document Vector: Creates a unique representation of a document that reflects important terms and frequency of occurrence. Used to find similar documents.
Date and Time Normalization: Converts dates and times to a standard representation, to handle locale-specific representations. For example, knows that 14-Mar-10 is equivalent to March 14, 2010.
Entity Extraction: Finds terms in the content and maps them to predefined categories. Out-of-the-box support for People, Companies and Locations, but can be extended to any category.
Lemmatization: Finds the root of a word for a given language. For English it maps run, runs, running and ran back to a single lemma. Understands language-specific grammar and context.
Tokenization: Applies the language-specific rules for identifying words, concepts, idioms and phrases. Also applies custom word breakers for patterns such as those found in part numbers or telephone numbers.
Language Encoding and Detection: Identifies the native written language and locale-specific encoding so that the proper dictionaries can be used by the tokenization and lemmatization stages.
Format Conversion: Extracts plain text from multiple file formats, encodings, and applications.
Slide22
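The idea of sequential stages from the processing pipeline slide can be sketched in a few lines of Python. This is a toy illustration, not how FAST implements its pipeline: the stage functions, the tiny lemma table, and the two date formats are assumptions chosen to mirror the slide's own examples (run/runs/running/ran, and 14-Mar-10 versus March 14, 2010).

```python
from datetime import datetime

# Toy lemma table; a real lemmatizer uses language-specific dictionaries.
LEMMAS = {"runs": "run", "running": "run", "ran": "run"}
# Two date layouts taken from the slide's example.
DATE_FORMATS = ("%d-%b-%y", "%B %d, %Y")

def tokenize(doc):
    """Split text into lowercase tokens on whitespace."""
    doc["tokens"] = doc["text"].lower().split()
    return doc

def lemmatize(doc):
    """Map each token to its lemma when the table knows it."""
    doc["lemmas"] = [LEMMAS.get(token, token) for token in doc["tokens"]]
    return doc

def normalize_date(doc):
    """Store the date as ISO 8601; leave it unset if no format matches."""
    for fmt in DATE_FORMATS:
        try:
            doc["date"] = datetime.strptime(doc["raw_date"], fmt).date().isoformat()
            break
        except ValueError:
            continue
    return doc

# Stages run in order, each one enriching the same document record.
PIPELINE = [tokenize, lemmatize, normalize_date]

def process(doc):
    for stage in PIPELINE:
        doc = stage(doc)
    return doc

a = process({"text": "She ran while he runs", "raw_date": "14-Mar-10"})
b = process({"text": "Running", "raw_date": "March 14, 2010"})
print(a["lemmas"], a["date"])
print(b["lemmas"], b["date"])
```

Both documents end up with the same normalized date and the same lemma for every form of "run", which is what lets one relevancy model treat them consistently.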
Extending Pipeline capabilities
Configure Optional Processing Steps
XML Properties mapper
Offensive Content Filter
Verbatim (wholeword) extractor
Use a dictionary for custom extraction
Pipeline Extensibility: Calls external applications for custom item processing
Field Collapsing
Entity Extraction
Straightforward way to add custom text analysis functionality
Add Custom Processing
Pipeline Extensibility is a specially defined stage that takes a set of crawled properties as flat text input and maps its output to another crawled property
Sandboxed execution
Executable arguments and temporary files are automatically handled with timeouts.
Runs just before the Crawled Property Mapper, providing accessibility within SharePoint
Slide23
DEMO: Pipeline Extensibility
Configuration
Logging and generating test data
Debugging
Useful included FAST tools
…
Extensibility
Mapper
Standard processing
Let’s make it easier
Slide24
Pipeline Extensibility example
Purpose
Create a new property based on value of an existing property
Enable searching for a value being empty
Implementation
Get value of ows_ContentType using LINQ to XML
Set new property mycontentcheck to either hascontenttype or nocontenttype
Slide25
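The Pipeline Extensibility example (derive mycontentcheck from ows_ContentType) could look roughly like the Python sketch below. The slide's implementation used LINQ to XML; Python is used here only for illustration. The sketch assumes the stage hands the program an input XML file of `CrawledProperty` elements under a `Document` root and expects an output file in the same shape, with both paths passed as arguments. The property set GUID and `varType` value are placeholders, not values from the deck.

```python
import sys
import xml.etree.ElementTree as ET

# Placeholder values (assumptions): use the property set and type
# configured for your own output crawled property.
OUTPUT_PROPERTY_SET = "00000000-0000-0000-0000-000000000000"
OUTPUT_VAR_TYPE = "31"

def build_output(input_root):
    """Build an output document that sets mycontentcheck from ows_ContentType."""
    value = ""
    for prop in input_root.iter("CrawledProperty"):
        if prop.get("propertyName") == "ows_ContentType":
            value = (prop.text or "").strip()
            break
    output_root = ET.Element("Document")
    output_prop = ET.SubElement(
        output_root,
        "CrawledProperty",
        propertySet=OUTPUT_PROPERTY_SET,
        varType=OUTPUT_VAR_TYPE,
        propertyName="mycontentcheck",
    )
    # An empty or missing content type becomes a searchable value.
    output_prop.text = "hascontenttype" if value else "nocontenttype"
    return output_root

def main(input_path, output_path):
    input_root = ET.parse(input_path).getroot()
    ET.ElementTree(build_output(input_root)).write(
        output_path, encoding="utf-8", xml_declaration=True
    )

if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

Because "no content type" is written as an explicit value, a query for mycontentcheck:nocontenttype can find the items where the original property was empty.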
Tune Relevancy
Improve accuracy and control with Rank Profiles
Quality: Also known as static rank, consists of multiple managed properties including site, URL depth (preference for shorter URLs), and relative importance of links to this document.
Authority: Rank given when the query word falls in the link or anchor text.
Query Authority: Rank given when the query word falls within items that have been selected in previously executed queries.
Freshness: Rank given for new content, based on last modified parameter.
Proximity: Rank given depending on where query terms fall and how close they are to each other within a document.
Context: Rank given depending on where a query term falls. If the query term falls within a managed property, it is given an additional boost.
Field Boosts: Any managed property can be used for an additional boost.
Rank Profiles are exposed by modifying the sorting web part.
Rank Profiles are made by combining multiple ranking elements
Create custom ranking algorithms to combine multiple ranking properties
Rank Profiles created in PowerShell
Slide26
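As a rough illustration of the Tune Relevancy slide's statement that rank profiles combine multiple ranking elements, the Python sketch below scores documents with a weighted sum. This is not FAST's actual ranking formula. The component names follow the slide, while the weights, scores, and profile names are invented for the example.

```python
def rank_score(components, weights):
    """Combine per-component scores into one rank value as a weighted sum."""
    return sum(weights.get(name, 0.0) * score for name, score in components.items())

# Hypothetical rank profiles: same components, different emphasis.
DEFAULT_PROFILE = {
    "quality": 1.0, "authority": 1.0, "freshness": 1.0,
    "proximity": 1.0, "context": 1.0,
}
NEWS_PROFILE = dict(DEFAULT_PROFILE, freshness=3.0)  # boost new content

# Hypothetical component scores for two documents.
old_doc = {"quality": 0.8, "authority": 0.6, "freshness": 0.1,
           "proximity": 0.5, "context": 0.5}
new_doc = {"quality": 0.4, "authority": 0.3, "freshness": 0.7,
           "proximity": 0.5, "context": 0.5}

for name, profile in (("default", DEFAULT_PROFILE), ("news", NEWS_PROFILE)):
    ordered = sorted(
        (("old_doc", old_doc), ("new_doc", new_doc)),
        key=lambda item: rank_score(item[1], profile),
        reverse=True,
    )
    print(name, [label for label, _ in ordered])
```

With the default weights the older, more authoritative document ranks first; raising the freshness weight flips the order, which is the kind of per-audience tuning a separate rank profile allows.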
Demo: FAST Search for SharePoint Platform Capabilities
Slide27
Microsoft SharePoint 2010
The Business Collaboration Platform for the Enterprise and the Internet
Deliver the Best Productivity Experience
Cut Costs with a Unified Infrastructure
Rapidly Respond to Business Needs
Communities
Search
Sites
Composites
Content
Insights
Slide28
SharePoint as an Application Platform
Slide29
Application Lifecycle
[Diagram of the development workflow. Labels: Developer Machine; Development; Testing; F5 Deploy; Team Foundation Server; Check In; Staging; Automated Testing; Warm-Blooded User Testing; TFS Build Server; SharePoint Projects; SP2010 DLLs; Build; Run Tests?; Fix Bugs (repeat as necessary); Nightly Build or Continuous Integration; Deploy Using PowerShell; Open/Close Bugs; WSP]
Slide30
Customizing Search
Configure
Extend
Create
Intranet Search
People Search
Site Search
Research Portal
Case Management
Save Results to Excel file
…..
IP Portfolio mgmt
Intel/Surveillance
Drug Discovery
…
Slide31
“Out of the Box” Search Experience
A powerful baseline for customization
Refinement summarizes and narrows results
Query federation brings together results
(FAST) Previews provide rich interaction
Query Suggestions guide user keywords
(FAST) User Context provides a personal experience
Slide32
Top Customization Scenarios
Modify the OOB End User Experience
Add new Refinement category
Show results from federated location
Modify the look and feel of OOB end user experience
Enable sorting by custom metadata
Add visual Best Bet for upcoming sales event
Configure different ranking for HR vs. Engineering department
Create new Search Verticals
Create new visual elements
Query & Result Pipeline Plug-ins
Query & Indexing Shims
Create new Search Driven Applications
Create new customer search experience
Indexing content
Define content for search
Design search experience
Create new Audio/Video/Image search experience
Show Location refinement on Chart/Maps
Show tags in tag cloud
Enable export results to Spread Sheet
Summarize Financial Information from customers in Graphs
Expand query terms based on synonyms defined in Term Store
Augment customer results with project information
Show popular customers/people inline with search results
Show people results from other sources
Show email results from personal mailbox on Exchange Server through the EWS
Index content from custom repositories like Documentum
Create content processing plug-ins to create new metadata
Create a new customer page that shows:
Customer Contact Details
Customer Project Details
Customer Contacts
Internal Experts
Customer related documents
Slide33
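One of the customization scenarios listed above is expanding query terms based on synonyms defined in the Term Store. The Python sketch below shows the general idea only. The in-memory `SYNONYMS` table stands in for the Term Store, the `expand_query` helper is hypothetical, and the OR/quote syntax is a generic keyword-query style rather than a statement about the exact FAST query language.

```python
# Hypothetical synonym table standing in for terms managed in the Term Store.
SYNONYMS = {
    "mab": ["monoclonal antibody"],
    "crm": ["customer relationship management"],
}

def expand_query(query, synonyms=SYNONYMS):
    """Rewrite each term that has known synonyms as an OR group."""
    parts = []
    for term in query.split():
        alternatives = [term] + synonyms.get(term.lower(), [])
        if len(alternatives) == 1:
            parts.append(term)  # no synonyms: keep the term unchanged
        else:
            # Quote multi-word alternatives so they stay together as phrases.
            quoted = ['"%s"' % alt if " " in alt else alt for alt in alternatives]
            parts.append("(" + " OR ".join(quoted) + ")")
    return " ".join(parts)

print(expand_query("mAb oncology"))
# prints: (mAb OR "monoclonal antibody") oncology
```

In a real deployment this kind of rewrite would sit in the query path (for example in a customized web part or query plug-in) so that users searching for an abbreviation also match documents that spell the term out.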
Search 2010 Architecture
The platform for Search Customization
[Architecture diagram. Labels: Search Web Parts; Federation OM; Web Service, RSS; SharePoint Search Index; SharePoint Indexer; FAST Search Index; FAST Indexer; OpenSearch / Custom Source]
What’s New in 2010?
Primary Search Web Parts now Unsealed
Federation now a key Public OM layer
All Web Parts built on federation
Query alteration, custom Runtimes, blending results from multiple sources
Web Service / RSS Enhancements
FAST Search / SharePoint Search:
Shared Web Parts, RSS, Web Service
Shared Federation OM
Index and Crawling Separate
FAST unique Content Processing Pipeline
[Layer labels: Search Web Parts; Federation / Query OM; Web Service, RSS; Content Processing]
Slide34
Slide35
Example: Legal Application
Attorneys spend too much time looking for information and assessing the relevancy of search results
Must download entire document and conduct manual review to determine relevancy
No way to internally search documents & items
Excessive burden to network resources for geo-distributed offices
No way to quickly review, annotate, compile, and reuse relevant information
Slide36
Longitude Search
Packaged Application
Demo
Slide37
Slide38
Search: Find and Explore
Penicillin
LSD
Uranus
Viagra
Safety glass
Infrared radiation
Microwave ovens
Inkjet printers
Corn Flakes
Chocolate chip cookies
Slide39
Research Portal example
Similar Patterns across Industries
Component-based
Example Mockup
Slide40
Multiple Search Centers
Slide41
The Search Platform Concept
Content Integration to enable applications
[Platform diagram. Content sources: Legacy Data; RDBMS; SharePoint, Salesforce; Portals; RSS Feeds, Data Streams; DMS, CMS, Files; Applications; Email; Rich Media; XML; Internet. Platform: Longitude Connectors; FAST Search Platform; SharePoint 2010; Information Integration; SOA-enabled Services. Applications: Customer Facing Apps (Site Search; Transactions; Media Content); Enterprise-Facing Apps (Intranet Search; Knowledge Management; CRM; Fraud Detection; Risk & Compliance)]
Slide42
Slide43
Slide44
Tips for successful FAST projects
Start early with OOB experience
Stand up FAST, show it to users
Keep an active staging system
Full scale, with production content
Search exposes/‘audits’ security issues!
Grow incrementally & continually
Additional content sources
Design and feature changes and additions
Content grooming / gardening
Use regular rhythm to debug/tune
Don’t be afraid to customize
Buy what you can, build what you can’t
Establish success early, build on
Many search apps will emerge
Slide45
Outline
Slide46
jeff.fried@ba-insight.net
Q&A
jeff.fried@BAinsight.com
Slide47
DEV Track Resources
http://www.microsoft.com/visualstudio
http://www.microsoft.com/visualstudio/en-us/lightswitch
http://www.microsoft.com/expression/
http://blogs.msdn.com/b/somasegar/
http://blogs.msdn.com/b/bharry/
http://www.microsoft.com/sqlserver/en/us/default.aspx
http://www.facebook.com/visualstudio
Slide48
Resources
Sessions On-Demand & Community: www.microsoft.com/teched
Microsoft Certification & Training Resources (Learning): www.microsoft.com/learning
Resources for IT Professionals: http://microsoft.com/technet
Resources for Developers: http://microsoft.com/msdn
Connect. Share. Discuss.: http://northamerica.msteched.com
Slide49
Complete an evaluation on CommNet and enter to win!
Slide50
Slide51
© 2011 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.