Exchange Design Concepts and Best Practices

Boris Lokhvitsky, MCM | Exchange, Principal Consultant / Delivery Architect, Microsoft Consulting Services. BRK3131. Agenda: Architecture Concepts and Design Principles; Exchange Evolution: from Hardware to Software Powered Solution.

Slide1
Slide2

Exchange Design Concepts and Best Practices

Boris Lokhvitsky

MCM | Exchange

Principal Consultant / Delivery Architect

Microsoft Consulting Services

BRK3131

Slide3

Agenda

Architecture Concepts and Design Principles

Exchange Evolution: from Hardware to Software Powered Solution

Exchange Design Principles

Availability and Reliability: role of Critical Dependencies and Redundant Components

Architecture Models: Shared Infrastructure vs. Building Blocks

Modern Servers and Datacenters: Scale Like the Cloud

Design Options: Supported vs. Recommended vs. Structured

People and Process, not just Technology

How Architecture Drives Design Decisions

Consolidated vs. Distributed Design

Service Site Model: Bound vs. Unbound

All about DAGs: Sizing, IP-less DAG, Database Copy Layout, and Dedicated Replication Network

Virtualization and Role Consolidation

Storage Challenges: SAN/DAS, RAID/JBOD, Thin Provisioning, Native vs. Low Level Data Replication

Server and Storage Platform, Disk and Database Layout

Backups and Lagged Copies

Archiving: Retention vs. Compliance

Client Breakdown and Penalty Factors

Slide4

Architecture Concepts and Design Principles

Slide5

Hardware vs. Software powered solution

Exchange 2003

Shared infrastructure with redundant hardware components

Exchange 2010/2013/2016

Commodity building blocks with software controlled redundancy

New architecture and design principles

I/O Meter

How much disk performance does an Exchange database need?

Slide6

Exchange Design Principles

In the modern Exchange world, software, not hardware, powers and controls the solution

Availability

Reduce complexity, simplify the solution

Decrease the number of system dependencies to improve availability and lower the risks

Use native capabilities where possible as it makes the design simpler

Deploy redundant solution components to increase availability and protect the solution

Avoid failure domains: do not group redundant solution components into blocks that could be impacted by a single failure

Functionality / Productivity

Enable and enhance user experience

Provide functionality and access that is required or expected by the end users

Provide large low cost mailboxes

Use Exchange as a single data repository

Increase value with Lync and SharePoint integration

Build a bridge to the cloud – ensure feature rich cloud integration and co-existence

Operations: Optimize People and Process, not just Technology

Decrease complexity of team collaboration by leveraging solution / workload focused teams

Simplify / optimize administration / monitoring / troubleshooting process

Cost: Reduce / minimize total cost of ownership (TCO) for the solution

Use commodity hardware and leverage native product capabilities

Implement storage solution that minimizes cost, complexity, and administrative overhead

Slide7

Availability and Reliability

Failures *do* happen!

Critical system dependencies decrease availability

Deploy Multi-role servers

Avoid intermediate and extra components (e.g. SAN; network teaming; archiving servers)

Simpler design is always better: KISS

Redundant components increase availability

Multiple database copies

Multiple balanced servers

Failure domains combining redundant components decrease availability

Examples: SAN; Blade chassis; Virtualization hosts

Software, not hardware, is driving the solution

Exchange powered replication and managed availability

Redundant transport and Safety Net

Load balancing and proxying to the destination

Availability principles: DAG beyond the “A”

http://blogs.technet.com/b/exchange/archive/2011/09/16/dag-beyond-the-a.aspx

Slide8
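The two availability claims on the Availability and Reliability slide (critical dependencies decrease availability, redundant components increase it) can be illustrated with the standard series/parallel availability formulas. This is a minimal sketch, not taken from the deck: it assumes component failures are independent, and the 99% and 99.9% figures are hypothetical.

```python
from math import prod

def serial_availability(components):
    """Availability of a chain in which every component is a critical dependency."""
    return prod(components)

def parallel_availability(components):
    """Availability of redundant components when any single survivor is enough."""
    return 1 - prod(1 - a for a in components)

# Hypothetical figures: a 99% server that also depends on a SAN and a switch (99.9% each).
chain = serial_availability([0.99, 0.999, 0.999])

# Hypothetical figures: four independent database copies on servers that are 99% available.
copies = parallel_availability([0.99] * 4)

print(f"server + SAN + switch in series: {chain:.4%}")
print(f"four redundant copies:           {copies:.6%}")
```

Every extra dependency multiplies in another factor below 1, while every extra independent copy multiplies the probability of total failure by a small number. A failure domain breaks the independence assumption, which is why grouping redundant components behind one SAN or chassis gives up most of the benefit.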

Shared Infrastructure and Failure Domains

Classical shared infrastructure design introduces numerous critical dependency components

Relying on hardware requires expensive redundant components

Failure domains reduce availability and introduce significant extra complexity

Slide9

Building Block Architecture

Inexpensive commodity servers and storage

Scale the solution out, not in; more servers mean better availability

Nearline SAS storage: provide large mailboxes by using large low cost drives

Exchange I/O reduced by 93% since Exchange 2003

Exchange 2013 database needs ~10 IOPS; single Nearline SAS disk provides ~60 IOPS; single 2.5” 15K rpm SAS disk provides ~230 IOPS

Redundancy and availability provided and controlled by Exchange, not by infrastructure

3+ redundant database copies eliminate the need for RAID and backups

Redundant servers eliminate the need for redundant server components (e.g. NIC teaming or MPIO)

DAG is the ultimate building block allowing you to scale the solution

Slide10
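The IOPS figures on the Building Block Architecture slide imply how many databases a single disk can carry on I/O alone. The sketch below only divides the slide's approximate numbers; the function name is made up for illustration, and capacity, peak load, and maintenance I/O are ignored.

```python
def databases_per_disk(disk_iops, database_iops):
    """How many databases one disk can serve on IOPS alone (capacity is ignored)."""
    return disk_iops // database_iops

DATABASE_IOPS = 10   # ~10 IOPS per Exchange 2013 database (from the slide)
NL_SAS_IOPS = 60     # ~60 IOPS per Nearline SAS disk (from the slide)
SAS_15K_IOPS = 230   # ~230 IOPS per 2.5" 15K rpm SAS disk (from the slide)

print(databases_per_disk(NL_SAS_IOPS, DATABASE_IOPS))   # 6
print(databases_per_disk(SAS_15K_IOPS, DATABASE_IOPS))  # 23
```

With these numbers a Nearline SAS disk has I/O headroom for about six databases, so a layout of a few database copies per large, slow disk is bounded by capacity rather than by IOPS.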

Modern Server: Commodity Hardware

Google, Microsoft, Amazon, and Yahoo! have used commodity hardware for 10+ years already

Not only for messaging but for other technologies as well (started with search, actually)

Inexpensive commodity server and storage as a building block

Easily replaceable, highly scalable, extremely cost efficient

Software, not hardware, is the brain of the solution

Photo Credit: Stephen Shankland/CNET

Slide11

People and Process (not just Technology)

Decrease complexity of team collaboration

Simplify administration / monitoring / troubleshooting

Solution focused teams

Traditional application team owns only a small piece of the solution / workflow

Multiple teams must be engaged to implement the design or troubleshoot the issue

Team organization based on solution, not on specific infrastructure or technology, simplifies administration / troubleshooting and reduces operational costs

Own your solution!

Or… if you can’t do it right, reduce your OpEx by moving to O365 ;)

Slide12

Exchange Product Line Architecture

Exchange PLA: Special tightly scoped reference architecture offering from Microsoft Consulting Services

Based on deployment best practices and collective customer experience

Structured design based on detailed rule sets to avoid common mistakes and misconceptions

Based on cornerstone design principles:

4 database copies across 2 sites

Unbound Service Site model (single namespace)

Witness in the 3rd site

Multi-role servers

DAS storage with NL SAS or SATA JBOD configuration

L7 load balancing (no session affinity)

Large low cost mailboxes (25/50 GB standard mailbox size)

Enable access for all internal / external clients

System Center for monitoring

Exchange Online Protection for messaging hygiene

[Figure: design options Supported, Recommended, Structured, Standardized; callout: "You should be here!"]

Slide13

How Architecture Drives Design Decisions

Slide14

Consolidated vs. Distributed Design

Consolidated design is usually the preferred option

Max reduction of server footprint, deployment costs, and administrative overhead

Simplifies use of single unified namespace

Still need 2-3 datacenters for proper implementation of site resilience

Distributed design places servers closer to end users

Optimizes client to server traffic (except perhaps in DR scenarios)

Might be driven by regulatory compliance requirements (keep data on the home soil)

Presents challenges for site resilience / DR design / firewalls / administration

Choice between client-server vs. server-server traffic

Slide15

Bound vs. Unbound Service Site Model

Bound Service model

Binds user mailboxes to a preferred datacenter

Uses site specific namespaces

Uses Active/Passive DAG design

Building block: two active/passive DAGs

PLA recommended for Exchange 2010

Unbound Service model (recommended)

Users don’t have a preferred datacenter

Allows use of single unified namespace

Uses Active/Active DAG design

Building block: one active/active DAG

PLA recommended for Exchange 2013

Slide16

DAG Sizing: Small, Medium, Large?

DAG size is important because DAG is the building block

General guidance is to prefer larger DAG size

Larger DAGs provide better availability and load balancing

If 1 server with X active mailboxes fails in the N-node DAG, active mailbox count on each server increases only by X/(N-1)

Proper symmetric database copy layout is important to achieve good mailbox load balancing

Larger DAGs, however, have disadvantages too

Large DAGs are more vulnerable to network issues, as the number of node pairs in an N-server DAG is N*(N-1)/2 (a 16-node DAG has 120 node pairs, or 240 P2P connections when each direction is counted separately!)

Intermittent network issues can leave databases misbalanced and DAG nodes evicted; see this article for details:

http://aka.ms/partitioned-cluster-networks

More impact on cluster writes and increased failure zone

Scalability planning due to growth also impacts the decision

Adding just a few servers to the existing DAG is hard as it requires database copy layout changes

Slide17
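The two formulas from the DAG sizing slide, X/(N-1) extra active mailboxes per surviving server and N*(N-1)/2 node pairs, can be tabulated for a few DAG sizes. This is a small sketch with a hypothetical 1000 active mailboxes on the failed server. It assumes a perfectly symmetric copy layout, and it counts directed connections as twice the node pairs, which is how a 16-node DAG reaches 240.

```python
def extra_active_per_survivor(active_on_failed, nodes):
    """Extra active mailboxes each surviving server takes over: X / (N - 1)."""
    return active_on_failed / (nodes - 1)

def node_pairs(nodes):
    """Number of distinct server pairs in an N-node DAG: N * (N - 1) / 2."""
    return nodes * (nodes - 1) // 2

# Hypothetical load: the failed server hosted 1000 active mailboxes.
for n in (4, 8, 16):
    extra = extra_active_per_survivor(1000, n)
    pairs = node_pairs(n)
    print(f"{n:2d} nodes: +{extra:.0f} mailboxes per survivor, "
          f"{pairs} node pairs, {2 * pairs} directed connections")
```

The output shows the trade-off from the slide: going from 4 to 16 nodes cuts the per-survivor impact of a failure from about 333 to about 67 mailboxes, while the number of node pairs grows from 6 to 120.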

Database Copy Layout Principles

Goal: Provide symmetric database copy layout to ensure even load distribution

http://blogs.technet.com/b/exchange/archive/2010/09/10/3410995.aspx

[Figure: database copy layout diagram with Server3 failure and Server6 failure scenarios]
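To show what a symmetric database copy layout buys, the sketch below builds a deliberately simplified layout and checks how active databases redistribute after one server fails. It is an illustration under stated assumptions, not the layout from the linked article: each database has only two copies (one active, one passive), there is one database per ordered server pair, and the server names are hypothetical.

```python
from collections import Counter
from itertools import permutations

def symmetric_layout(servers):
    """Create one database per ordered (active, passive) pair of distinct servers."""
    return list(permutations(servers, 2))

def active_counts_after_failure(layout, failed):
    """Count active databases per surviving server after `failed` goes down."""
    counts = Counter()
    for active, passive in layout:
        # A database whose active copy was on the failed server fails over
        # to its passive copy; every other database stays where it is.
        counts[passive if active == failed else active] += 1
    return dict(counts)

servers = ["SVR01", "SVR02", "SVR03", "SVR04"]  # hypothetical server names
layout = symmetric_layout(servers)
after = active_counts_after_failure(layout, "SVR03")
print(after)  # {'SVR01': 4, 'SVR02': 4, 'SVR04': 4}
```

Before the failure each server hosts three active databases. Afterwards each survivor hosts four, so the failed server's load is spread evenly instead of landing on a single partner.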

Slide18

Homeless DAG

New capability in Exchange 2013: DAG without a Cluster Administrative Access Point (a.k.a. IP-less DAG)

http://blogs.technet.com/b/scottschnoll/archive/2014/02/25/database-availability-groups-and-windows-server-2012-r2.aspx

http://blogs.technet.com/b/timmcmic/archive/2015/04/29/my-exchange-2013-dag-has-gone-commando.aspx

Recommended and preferred model, default in Exchange 2016

Advantages: reduced complexity

No need to deal with cluster name object (CNO) computer account and permissions

No need to reserve and manage DAG IP addresses

Single cluster resource left (File Share Witness)

Disadvantages:

Cannot use Failover Cluster Manager – must use PowerShell cluster cmdlets

Might present issues to some 3rd party applications that use cluster name (e.g. backups) – move away from those

Useful PowerShell cmdlets:

Get-Cluster -Name DAG01 | select *

Get-ClusterNode -Cluster DAG01 [-Name SVR01] | select *

Get-ClusterNetwork -Cluster DAG01 [-Name DAGNetwork01] | select *

Get-ClusterQuorum -Cluster DAG01 | fl

Get-ClusterGroup -Cluster DAG01

Move-ClusterGroup -Cluster DAG01 -Name "Cluster Group" -Node SVR01

Get-ClusterLog -Cluster DAG01

Slide19

Network: HA/SR and Replication Network

High Availability (HA) is redundancy of solution components within a datacenter

Site Resilience (SR) is redundancy across datacenters providing a DR solution

Both HA and SR are based on native Exchange data replication

Each database exists in multiple copies, one of them active

Data is shipped to passive copies via transaction log replication over the network

It is possible to use dedicated isolated network for Exchange data replication

Network requirements for replication:

Each active → passive database replication stream generates X bandwidth

The more database copies, the more bandwidth is required

Exchange natively encrypts and compresses replication traffic

Pros and cons for dedicated replication network => Not recommended

Replication network can help isolate client traffic from replication traffic

Replication network must be truly isolated along the entire data transfer path: having separate NICs but sharing the network path after the first switch is meaningless

Replication network requires configuring static routes and eliminating cross talk; this leads to extra complexity and increases risk of human error

If server NICs are 10Gbps capable, it’s easier to have a single network for everything

No need for network teaming: think of a NIC as JBOD

Slide20
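The replication network slide states that each active-to-passive stream generates X bandwidth and that more copies need more bandwidth. A rough way to put numbers on that is sketched below. The per-stream figure is a hypothetical placeholder that would have to come from measured log generation rates, and the function ignores compression and the split between in-site and cross-site copies.

```python
def replication_bandwidth(databases, copies_per_database, per_stream_mbps):
    """Total replication bandwidth when every passive copy receives one stream."""
    passive_copies = copies_per_database - 1  # one copy of each database is active
    return databases * passive_copies * per_stream_mbps

# Hypothetical input: 100 databases, 4 copies each, 0.5 Mbps per replication stream.
total = replication_bandwidth(databases=100, copies_per_database=4, per_stream_mbps=0.5)
print(f"{total} Mbps of replication traffic across the DAG")
```

With these made-up inputs the DAG-wide total is 150 Mbps, which is small next to a 10 Gbps NIC and is consistent with the slide's advice to run a single network for everything.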

Virtualization vs. Role Consolidation

Virtualization introduces an additional critical solution component and associated performance and maintenance overhead

Reduces availability and introduces extra complexity

Could make sense for small deployments helping consolidate workloads – but this introduces shared infrastructure

Consolidated roles have been the guidance since Exchange 2010 – and now there is only a single role in Exchange 2016!

Deploying multiple Exchange servers on the same host would create a failure domain

Hypervisor powered high availability is not needed with proper Exchange DAG designs

No real benefits from virtualization as Exchange provides equivalent benefits natively at the application level

Slide21

Storage Design Options / Challenges

Many designs are supported; there are three storage design dimensions

SAN → DAS

SAN is NOT faster than DAS

Reduce complexity

No need for expensive redundant high performing intermediate SAN components

SAN concept follows shared infrastructure model, not building block

RAID → JBOD (RBOD)

No need for disk redundancy: data redundancy is moved to application level

Think of Ex2013 servers as software RAID

RAID is supported but doubles disk count (assuming RAID-10) and cost

Enable controller caching: 75/25 write/read

FC → SAS → SATA

Need large disks to provide large mailboxes

In Ex2013 IOPS requirements reduced ~93% from Ex2003!

Typical Ex2013 database requires ~10 IOPS

7200 rpm LFF (3.5”) SATA/NL-SAS disk provides ~60 IOPS

15K rpm SFF (2.5”) SAS/FC disk provides ~230 IOPS

No need for fast but small and expensive high performing disks

Slide22

RAID vs. JBOD with Native Replication

Conceptually similar replication – goal is to introduce redundant copy of the data

Software, not hardware powered

Application aware replication

Enables each server and associated storage as independent isolated building block

Exchange 2013 is capable of automatic reseed using hot spare (no manual actions besides replacing the failed disk!)

Finally, cost factor: RAID1/0 requires 2x storage (you still want 4 database copies for Exchange availability)!

Slide23

Thick Risks of Thin Provisioning

Exchange mailboxes will grow but they don’t consume that much on Day 1

The desire not to pay for full storage capacity upfront is understood

However, inability to provision more storage and extend capacity quickly when needed is a big risk

Successful thin provisioning requires significant operational maturity and process excellence unseen in the wild

Microsoft guidance and best practice is to use thick provisioning with low cost storage

Incremental provisioning model can be considered a reasonable compromise

[Figure: comparison of Thick Provisioning, Thin Provisioning, and Incremental Provisioning]

Slide24

Native vs. Low Level Data Replication

Exchange continuous replication is a native transactional replication (based on transaction data shipping)

Database itself is not replicated (transaction logs are played back to target database copy)

Each transaction is checked for consistency and integrity before replay (hence physical corruption cannot propagate)

Page patching is automatically activated for corrupted pages

Replication data stream can be natively encrypted and compressed (both settings configurable, default is cross site only)

In case of data loss Exchange automatically reseeds or resynchronizes the database (depending on type of loss)

If hot spare disk is configured, Exchange automatically uses it for reseeding (like RAID rebuild does)

Slide25

Typical Exchange Disk Layout

Two mirrored (RAID1) disks for system partition (OS; Exchange binaries, transport queues, logs)

One hot spare disk

Nine or more RBOD disks (single disk RAID-0) for Exchange databases with collocated transaction logs

Four database copies collocated per disk, not to exceed 2TB database size

Slide26

Need more? Layout for more storage

There are servers that can house more than 12 LFF disks (up to 16 with rear bay)

There are already DAS enclosures available that provide 720 TB capacity in a single 4U unit (90 x 8TB drives)!

Scalability limits: still 100 database copies / server

This means no more than 25 drives @ 4 databases/disk or 50 drives @ 2 databases/disk

Slide27
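The drive limits quoted for the larger storage layout follow directly from the limit of 100 database copies per server. The arithmetic is spelled out below as a minimal sketch; the function name is made up for illustration.

```python
MAX_COPIES_PER_SERVER = 100  # scalability limit quoted on the slide

def max_database_drives(databases_per_disk):
    """Largest number of database drives a server can use under the copy limit."""
    return MAX_COPIES_PER_SERVER // databases_per_disk

print(max_database_drives(4))  # 25
print(max_database_drives(2))  # 50
```

A 90-drive enclosure therefore cannot be filled by a single server at these densities: the copy limit, not the enclosure, becomes the constraint.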

Backup and Recovery

Challenge 1: Backup or Extra copy?

Exchange Native Data Protection: 3+ highly available database copies

Lagged copy to protect from unlikely scenarios (logical corruption, admin error)

Replay Lag Manager with automatic copy conversion

What will you do with your tapes?

Challenge 2: How do I recover data without backup?

Self Service Recovery to restore items from Recoverable Items – Deletions

Single Item Recovery to protect items in Recoverable Items and restore them administratively via mailbox search

Large mailboxes can accommodate large “dumpster” – no need for archive mailbox or 3rd party archiving

Slide28

Archiving: Retention vs. Compliance

Where is my archive?

10+ years ago archiving enabled offloading Exchange data to the lower cost storage

With large mailboxes on commodity storage it does not make sense anymore

Single data repository is better than multiple solutions and lots of PST files

What do you need archiving for?

There’s still in-place archive a.k.a. online archive which is just a second mailbox available in online mode

Only needed based on client performance considerations or when you cannot extend mailbox capacity

Outlook 2013+ has the magic slider = no impact from large OST file

1K → 20K → 100K → 1M items in critical path folders (client has impact too) – are you stubbing?

Retention

Retention is to help users delete their data when it is no longer needed

Retention tags and retention policies

Compliance

Compliance is to prevent users from deleting sensitive data that might be requested by legal

Litigation Hold, In-Place Hold to protect and preserve data at rest (in the mailbox)

Data Loss Prevention (DLP) to protect data in transit

Unified Compliance Center to combine Exchange/Lync and SharePoint together for eDiscovery

Slide29

Client Breakdown

In today’s world, users access Exchange mailboxes from many clients and devices

Cumulative client concurrency can be over 100%

Penalty factors are measured in units of load created by a single Outlook client

Some clients generate more server load than a baseline Outlook client

Penalty factor should be calculated as weighted average across all types of clients

Caveats of this model:

Individual penalty factors are provided for illustration purpose only and should be adjusted based on internal test results, client configurations, vendor guidance and other relevant factors

Penalty factor for BES 5 is based on performance benchmarking guide published by BlackBerry at http://aka.ms/bes5performance

Penalty factor for Good is based on data published on Good Portal at http://aka.ms/goodperformance

Continue to monitor system utilization closely and adjust sizing model as necessary as you scale out

Sample client breakdown calculation (for illustration only)

Slide30
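The sample calculation table from the client breakdown slide did not survive in this transcript, so the sketch below shows one plausible reading of "penalty factor as weighted average across all types of clients". It weights each client's penalty factor by its concurrency. The client names, concurrency values, and penalty factors are all made up for illustration and are not the figures from the slide.

```python
def weighted_penalty_factor(client_mix):
    """Return (weighted average penalty factor, total load in baseline Outlook units).

    client_mix maps a client name to (concurrency, penalty_factor), where
    concurrency is the fraction of users running that client at the same time.
    """
    total_concurrency = sum(c for c, _ in client_mix.values())
    total_load = sum(c * p for c, p in client_mix.values())
    return total_load / total_concurrency, total_load

# All numbers below are hypothetical; they are not from the slides.
sample_mix = {
    "Outlook (baseline)": (0.80, 1.0),
    "Mobile device": (0.60, 0.5),
    "Web client": (0.20, 1.5),
}

factor, load = weighted_penalty_factor(sample_mix)
print(f"weighted penalty factor: {factor:.3f}")
print(f"load per user in baseline Outlook units: {load:.2f}")
```

Note that the concurrencies in the sample add up to 160%, matching the slide's point that cumulative client concurrency can exceed 100% when users run several clients at once.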

Exchange 2013 PLA Conceptual Design

Four or more physical servers (collocated roles) in each DAG split symmetrically between two datacenter sites

Four database copies (one lagged with Replay Lag Manager) for HA and SR on DAS storage with JBOD; minimized failure domains

Unbound Service site model with single unified load balanced namespace and Witness in the 3rd datacenter

Slide31

Main takeaways

Keep your design simple!

Follow building block architecture principles

Ensure sufficient availability

Do your Exchange design right or go to Office 365!

Slide32

Sessions to attend

BRK3197 - Exchange Server Preferred Architecture

BRK3129 - Deploying Exchange Server 2016

BRK3178 - Exchange on IaaS: Concerns, Tradeoffs, and Best Practices

BRK3173 - Experts Unplugged: Exchange Server Deployment and Architecture

BRK2189 - Desktop Outlook: Evolved and Redefined

BRK3102 - Experts Unplugged: Exchange Server High Availability and Site Resilience

BRK3125 - High Availability and Site Resilience: Learning from the Cloud and Field

BRK3147 - Meeting Complex Security Requirements for Publishing Exchange

BRK3160 - Mail Flow and Transport Deep Dive

BRK3163 - Making Managed Availability Easier to Monitor and Troubleshoot

BRK3180 - Tools and Techniques for Exchange Performance Troubleshooting

BRK3186 - Behind the Curtain: Running Exchange Online

BRK3206 - Exchange Storage for Insiders: It’s ESE

BRK4105 - Under the hood with DAGs

BRK4115 - Advanced Exchange Hybrid Topologies

Slide33

Pre-Release Programs

Be first in line!

Exchange & SharePoint On-Premises Programs

Customers get:

Early access to new features

Opportunity to shape features

Close relationship with the product teams

Opportunity to provide feedback

Technical conference calls with members of the product teams

Opportunity to review and comment on documentation

Get selected to be in a program:

Sign-up at Ignite at the Preview Program desk

OR

Fill out a nomination: http://aka.ms/joinoffice

Questions:

Visit the Preview Program desk in the Expo Hall

Contact us at: ignite2015taps@microsoft.com

Slide34

Evaluation

Please complete session evaluation!

Contact information

E-mail: borisl@microsoft.com

Profile: https://www.linkedin.com/in/borisl

Social: https://www.facebook.com/lokhvitsky

Slide35

Visit Myignite at http://myignite.microsoft.com or download and use the Ignite Mobile App with the QR code above.

Please evaluate this session

Your feedback is important to us!

Slide36

Thank You!