Operational Excellence in - PowerPoint Presentation

olivia-moreira
Uploaded On 2019-11-22

Operational Excellence in IT Service Management. Mehmet Özgür Depren, Technical Sales Manager, IBM Middleware. The Next IT Operations Focus: Big Data. Focus on operational objectives has seen significant uptick since 2013. ID: 766800

Presentation Transcript

Operational Excellence in IT Service Management
Mehmet Özgür Depren, Technical Sales Manager, IBM Middleware

The Next IT Operations Focus: Big Data
“Focus on operational objectives has seen significant uptick since 2013”

IBM Continues to Invest Heavily in Analytics
More than $17B in acquisitions since 2005, more than any other company.
Most comprehensive portfolio, from business to IT analytics, while most other vendors offer only point solutions.
C&SI’s suite of analytics products leverages best-of-breed capabilities from across all of IBM’s portfolio.
Capabilities shown on the 2005 to 2015 timeline: Autonomic Operations, Developer Productivity, Deep Compression, pureXML, pureScale, Pervasive Content, Stream Computing, Content Analytics, Advanced Case Management, Workload Optimized Systems, Social Analytics/Consumer Insight, Decision Management.

IT Operations Analytics Solves New Challenges
Reducing and preventing outages and slowdowns for the 24/7 application world.
IT Operations Analytics can help:
Isolate the problem through analysis of all your IT data.
Never set performance thresholds manually again.
Identify potential issues before customers are impacted.
Components shown: End users, Devices, Web Servers, App Servers, Databases, The Network.

Understanding IBM Operations Analytics
Layers shown: Business Outcome, Capabilities, IBM Big Data Platform, IBM or 3rd Party Solutions, Operational Environment.
Search: search quickly across massive amounts of data. Outcome: Faster Problem Resolution.
Predict: predict problems before they occur. Outcome: Proactive Outage Avoidance.
Optimize: optimize across your IT app infrastructure. Outcome: Optimized Performance.
Platform components named: Rave, SPSS, InfoSphere BigInsights, Watson, Streams, Cloud Insights.
Operations Analytics data sources named: Application Performance; Alerts, Alarms & Events; System & Log Monitoring; Documentation; Transactions; Assets & Workorders.
Operational environment: Applications | Systems | Workloads | Wireless | Network | Voice | Security | Mainframe | Storage | Assets

IBM Solution for IT Operations Analytics: Our Capabilities
Search: Resolve Problems Faster. Diagnose application and infrastructure issues using all your operational data. Barclays Bank was able to search and diagnose problems 60% faster to quickly resolve application and infrastructure issues. In addition, they identified customer patterns from log data and applied this to channel intelligence.
Predict: Avoid Outages While Reducing Threshold Management Costs. Predict problems before they become service impacting. Consolidated Communications detects 100 percent of their major incidents, including silent failures, and eliminated the human-intensive task of managing manual thresholds, saving $300,000 annually.
Optimize: Improve Operational Efficiency. Ensure your IT infrastructure is operating as efficiently as possible. Advanced event analytics has allowed Claranet to reduce the number of trouble tickets and focus more time and resources on what truly matters to their customers.
Why IBM?
20% reduction in storage requirements over competitive offerings.
#1 leadership position in Operations Management solutions.
60% faster creation of custom, high-impact, mobile-ready operations dashboards.
50% faster application diagnostics.
30% reduction in operator event load.

IBM Operations Analytics – Predictive Insights (Predict)
Challenge: Reacting to performance thresholds is not enough. IT staff must become proactive to ensure mission-critical apps never go down.
Anomaly Detection: Alerting before potential issues become service impacting, enabling IT to shift from reactive to proactive.
Automated Threshold Maintenance: No complex manual intervention to set up and maintain, with 5 times faster processing.
On-Prem and SaaS: Predictive Insights is now available as a service, providing additional value to our Performance Management solutions.
Supports Heterogeneous Environments: Out-of-the-box integrations to IBM APM/ITM or 3rd-party monitoring solutions.

Why aren’t operations teams proactive today?
If there is no ‘early detection’ before the outage, operations teams can only react while the outage is already in effect and already losing money...
Too much data to analyze manually.
Existing analytic techniques, such as standard thresholds, are not up to the task. They cannot detect problems while they are emerging (before business impact).
Set the performance threshold too high: insufficient warning before total failure.
Set the performance threshold too low: too much noise, and everything is ignored.

Learn relationships between metrics without static thresholds
Predictive Insights learns the normal historical range of a metric. It will alarm if the metric falls outside this range.
Watson DNA inside.
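The idea of learning a normal range from history, rather than setting a static threshold, can be sketched roughly as follows. This is only an illustration under stated assumptions: the k-sigma band, the function names `learn_normal_range` and `is_anomalous`, and the sample data are all hypothetical and are not taken from Predictive Insights, whose actual model the slides do not describe.

```python
from statistics import mean, stdev


def learn_normal_range(history, k=3.0):
    """Learn (low, high) bounds as mean +/- k standard deviations.

    Assumption: a simple k-sigma band stands in for the learned
    "normal historical range"; the product's real model is not shown.
    """
    if len(history) < 2:
        raise ValueError("need at least two samples to estimate spread")
    mu = mean(history)
    sigma = stdev(history)
    return mu - k * sigma, mu + k * sigma


def is_anomalous(value, bounds):
    """Return True when value falls outside the learned range."""
    low, high = bounds
    return value < low or value > high


# Hypothetical response-time samples (ms) from a healthy period.
history = [50, 52, 49, 51, 50, 48, 53, 50, 51, 49]
bounds = learn_normal_range(history)

print(is_anomalous(51, bounds))  # a value inside the learned range
print(is_anomalous(90, bounds))  # a value far above the learned range
```

Because the bounds come from the data, nobody has to pick or maintain a threshold by hand; re-running `learn_normal_range` on fresh history moves the band with the workload.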

European Telco – Flatline
Targeting situation detections.
Customer Relationship Management system for a large telco. 100 applications monitored by a Compuware system (40 million metrics).
In this example the regular load on one of the servers has changed, indicating an application problem.
Stopped (crashed) application: regular load absent.
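A flatline is the case an upper threshold never catches: the metric drops instead of rising. A minimal sketch of such a check, assuming a previously learned baseline mean and a hypothetical `min_fraction` cutoff (neither comes from the slides), might look like this:

```python
def is_flatlined(recent, baseline_mean, min_fraction=0.1):
    """Return True when every recent sample is far below the baseline.

    Assumptions: baseline_mean was learned from a healthy period, and
    "far below" means under min_fraction of that baseline.
    """
    if not recent:
        return False
    floor = baseline_mean * min_fraction
    return all(sample < floor for sample in recent)


# Hypothetical requests-per-minute baseline for one server.
baseline_mean = 120.0

print(is_flatlined([0.0, 0.0, 1.5], baseline_mean))     # load has vanished
print(is_flatlined([0.0, 95.0, 110.0], baseline_mean))  # only one dip
```

Requiring all recent samples to sit below the floor avoids alarming on a single momentary dip.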

European Gambling Website – Adaptive Threshold
Automated dynamic thresholds and early detection.
A gambling website application monitored by HP. In the run-up to a busy sporting event, traffic increased, causing stress on the system (high disk latency) and a negative customer experience.
With early detection from Predictive Insights (PI), the latency issue could have been tackled in time to avoid this.

Large US Bank – Adaptive Threshold
Automated dynamic thresholds and early detection.
These are WebSphere metrics taken from the CA Wily performance management system. The number of actual connections to the WebSphere application server has increased dramatically. The poolSize and bytesInUse metrics are also affected, indicating either increased demand or a problem with connections not being freed up.
Insight: poolSize and bytesInUse on the same node are also behaving anomalously at the same time and are related to each other.
Diagnosis: connection leak.
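One simple way to show that two metrics on the same node move together is a correlation coefficient. The sketch below uses Pearson correlation as a stand-in; the slides do not say how Predictive Insights relates metrics, and the sample values for `pool_size` and `bytes_in_use` are invented for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


# Hypothetical samples taken at the same timestamps on one node.
pool_size = [10, 10, 11, 10, 25, 40, 60]
bytes_in_use = [100, 102, 110, 101, 260, 410, 590]

r = pearson(pool_size, bytes_in_use)
print(f"correlation: {r:.3f}")
```

A coefficient close to 1 while both metrics are outside their normal ranges is the kind of evidence that points at a shared cause such as a connection leak.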

European Bank – Significant Trend
Targeting situation detections.
File server under stress as file control operations and bytes per second increase. This sudden change can be tracked back to a patch that was applied.
Diagnosis: disk thrashing.

A sample of technologies Predictive Insights integrates with
IBM ITM/TDD & IBM APM; HP BAC, Topaz; Aircom Optima; IBM TNPM; IBM OMEGAMON.

Predictive Insights as a Service
Performance Management + Predictive Insights.
Integrated threshold automation and maintenance.
Anomaly detection.
Get ahead of potential application and resource outages.
Learn, Explore, and Try.
Continuous Delivery.

IBM Operations Analytics – Log Analysis (Search)
Challenge: Diagnosing service problems in applications and the infrastructure supporting them involves quickly analyzing incredible amounts of both structured and unstructured data.
Expert Advice: Any competitor can isolate problems. IBM helps clients quickly resolve them.
Breadth of Searchable Data: Search across all of your IT operational data to quickly resolve issues.
Mainframe Support: Search System z (zLinux & zOS) logs in addition to all your other data.
Embedded Analytics: Out-of-the-box integrations to IBM APM/ITM or 3rd-party monitoring solutions.

IBM Operations Analytics – Log Analysis (Search)
Collects large volumes of structured and semi-structured data and transforms it through analytics into actionable intelligence.
Inputs: Logs, Metrics, Events, Documentation.
Stages shown: Collect, Consolidate, Normalize, Search and Visualize (with Insight Packs).
Users: IT Operations, App Support, Service Desk.

Current challenge
Application owner: I got a trouble ticket on my application. I want to quickly find the root cause, fix it, and restore the app/service ASAP.
Data to examine: Logs, Traces,..; Events; Metrics; Transactions; Config; Core files.
Sample log entry: [10/9/12 5:51:38:295 GMT+05:30] 0000006a servlet E com.ibm.ws.webcontainer.servlet.ServletWrapper service SRVE0068E:
There is a large volume of data to collect and analyze, and manual correlation takes hours or days to find the root cause of the problem. Logs for the problem window cannot always be found. The work is highly dependent on SME skills. It’s an art.

Solution: IBM Operations Analytics – Log Analysis can provide insights from all data in clicks. The app owner can search through the data and leverage dashboards to find the root cause in minutes.
Application owner: I got a trouble ticket on my app. I want to quickly find the root cause, fix it and restore service ASAP.
Inputs to IBM Operations Analytics – Log Analysis: events, logs, expert knowledge, metrics, tickets.
Sample log entry: [10/9/12 5:51:38:295 GMT+05:30] 0000006a servlet E com.ibm.ws.webcontainer.servlet.ServletWrapper service SRVE0068E: Uncaught exception created in one of the service methods of the servlet TradeAppServlet in application DayTrader2-EE5. Exception created : javax.servlet.ServletException : TradeServletAction.doSell (...)
Transaction details from App DB:
Tx#     date         status
108978  23-Jul-2013  started
108978  23-Jul-2013  To IN
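To make the sample log entry searchable, a tool first has to split it into fields. The following is a minimal sketch of that step using a regular expression written for the one sample line on the slide; the pattern, the field names, and the `parse_line` helper are assumptions for illustration and do not represent how Log Analysis or its Insight Packs are implemented.

```python
import re

# Assumed layout, inferred from the sample line only:
# [timestamp] thread-id component level message
LINE_RE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+"
    r"(?P<thread_id>[0-9a-fA-F]{8})\s+"
    r"(?P<component>\S+)\s+"
    r"(?P<level>[A-Z])\s+"
    r"(?P<message>.*)$"
)
# Message IDs such as SRVE0068E: four letters, four digits, one letter.
MESSAGE_ID_RE = re.compile(r"\b([A-Z]{4}\d{4}[A-Z]):")


def parse_line(line):
    """Return a dict of fields, or None when the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    id_match = MESSAGE_ID_RE.search(record["message"])
    record["message_id"] = id_match.group(1) if id_match else None
    return record


SAMPLE = (
    "[10/9/12 5:51:38:295 GMT+05:30] 0000006a servlet E "
    "com.ibm.ws.webcontainer.servlet.ServletWrapper service SRVE0068E: "
    "Uncaught exception created in one of the service methods of the "
    "servlet TradeAppServlet in application DayTrader2-EE5."
)

record = parse_line(SAMPLE)
print(record["timestamp"], record["level"], record["message_id"])
```

Once lines are reduced to fields like `level` and `message_id`, searching for every occurrence of one error across many servers becomes a simple filter rather than a manual read through raw logs.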

Out of the Box Insight Packs
IBM provided: IBM WebSphere Application Server, IBM DB2, Web Access Logs, Windows Events, Syslog, Java Core, IBM MQ Series, IBM Integration Bus (Message Broker), Delimiter Separated Value (DSV) log files.
Partner provided: Microsoft SharePoint, Microsoft Exchange, Microsoft SQL Server, Microsoft Active Directory.
Also listed: Tivoli Storage Manager, IBM Systems Disk Storage 8000, IBM AIX Errpt, IBM HTTP Server, HP LiveSite, HP TeamSite, Oracle Database, VMware ESXi, Oracle Siebel.
https://developer.ibm.com/itoa/

IBM Netcool Operations Insight (Optimize)
Modern dashboards, fully mobile: visualize the performance and health of your entire operations environment.
Out-of-the-box integration.
98% reduction in critical events: ~22 critical & ~100 major events per week.
Improved focus and utilization of first- and second-line staff.
Analytics to increase event value (v1.1, v1.2, v1.3):
30% reduction in events to Operations.
Almost 50% reduction in repeating events.
90% reduction for known event classes.

Event Analytics – Seasonal Event Identification
Improve efficiency by identifying and resolving recurring problems.
A report on event history identifies seasonal events, sorted by confidence level and frequency.
Drill-down shows the time distribution of events, so peaks can be investigated.
Thresholds can be better aligned to seasonal peaks, reducing events.
Large bank: 7% of Priority 1 tickets were raised by events that were highly seasonal; 30% of lower-severity tickets.

Seasonality Analysis of events
MS SCOM Health Service Heartbeat failures happen often on Sunday at 06:00, probably due to regular maintenance.
A specific Oracle database is not accessible every day at 21:00, probably due to a daily restart or backup.
A node is giving file system alerts every day around 01:00, probably due to a daily batch job.
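Patterns like "often on Sunday at 06:00" can be found by bucketing an event's history by weekday and hour and checking how concentrated it is. The sketch below is one simple way to do that; the bucketing scheme, the `seasonality` helper, and the timestamps are hypothetical, and the confidence score here is just the share of occurrences in the busiest bucket, not the product's confidence level.

```python
from collections import Counter
from datetime import datetime


def seasonality(timestamps):
    """Return (weekday, hour, share) for the busiest weekday/hour bucket.

    weekday follows datetime.weekday(): Monday is 0, Sunday is 6.
    share is the fraction of all occurrences that fall in that bucket.
    """
    if not timestamps:
        raise ValueError("no event occurrences to analyze")
    buckets = Counter((t.weekday(), t.hour) for t in timestamps)
    (weekday, hour), count = buckets.most_common(1)[0]
    return weekday, hour, count / len(timestamps)


# Hypothetical history of one heartbeat-failure event.
occurrences = [
    datetime(2024, 1, 7, 6, 2),    # Sunday
    datetime(2024, 1, 14, 6, 5),   # Sunday
    datetime(2024, 1, 21, 6, 1),   # Sunday
    datetime(2024, 1, 28, 6, 9),   # Sunday
    datetime(2024, 1, 10, 15, 30), # Wednesday, off-pattern
]

weekday, hour, share = seasonality(occurrences)
print(weekday, hour, share)
```

An event with a high share in one bucket is a candidate for a maintenance-window suppression rule or a threshold aligned to the recurring peak.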

Related Events Grouping: Relationships I know about
Known Event Analysis: grouping and correlation providing powerful situation management of active events.
Out-of-the-box domain expertise for known event relationships.
Vendor and technology dependent.
Significant reduction of incidents presented to the operator.
Extendable by Business Partners and clients with no coding required.

Event Analytics – Related Event Analytics: Relationships I don’t know about
Improve efficiency: reduce actionable events by grouping events that always occur together.
Automatic detection of event clusters.
Leverages machine learning to analyze the historical event archive and identify groups of events that always occur together.
Presents the identified relationship to the Administrator.
Presents proposed automated actions: Watch, Deploy, Archive or Do nothing.
Groups events in the Event Viewer.
“It is very beneficial to have a tool that can turn historical event data into an event group with a single root event. It helps us turn the data into logic”
Increase operator efficiency by up to 90% with out-of-the-box alert reduction and advanced alert analytics.
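Finding events that always occur together can be illustrated with a plain co-occurrence count over an event archive. This is a deliberately simple sketch, not the machine learning the slide refers to: the fixed time windows, the `always_together` helper, its parameters, and the event names are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def always_together(events, window_seconds=300, min_occurrences=3):
    """Return event-ID pairs that appear in exactly the same time windows.

    events is an iterable of (epoch_seconds, event_id) tuples.
    A pair qualifies when both events were seen in at least
    min_occurrences windows and never appeared without each other.
    """
    windows = defaultdict(set)
    for timestamp, event_id in events:
        windows[int(timestamp // window_seconds)].add(event_id)

    seen = defaultdict(int)   # windows containing each event
    both = defaultdict(int)   # windows containing each pair
    for ids in windows.values():
        for event_id in ids:
            seen[event_id] += 1
        for pair in combinations(sorted(ids), 2):
            both[pair] += 1

    return sorted(
        pair
        for pair, count in both.items()
        if count >= min_occurrences and count == seen[pair[0]] == seen[pair[1]]
    )


# Hypothetical event archive: (seconds since start, event ID).
archive = [
    (0, "link_down"), (10, "bgp_lost"),
    (400, "disk_full"),
    (600, "link_down"), (620, "bgp_lost"),
    (1200, "link_down"), (1230, "bgp_lost"), (1250, "disk_full"),
]

print(always_together(archive))
```

A pair found this way could be shown to an administrator as a proposed group with one root event. Fixed windows can split related events that straddle a boundary, so a real implementation would need something more robust.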

Future of Service Management
Themes shown: Data Correlation, Integration, Predictive Analytics, Real-time Analytics and Visualization, Visibility, Control, Automation, Problem Isolation, Optimization, Insight & Care, Outage avoidance.

Thank You

Consolidated Communications avoids network outages and improves customer service (Predict)
Need:
Monitoring a customer base of 250k access lines, 125k Internet, and 30k video is a challenge.
Managing manual thresholds within this networking environment is a nightmare.
Benefits:
Using SmartCloud Analytics, behavioral learning techniques generate alerts automatically when something is not normal.
Enable earlier detection and insight into issues not detected by existing monitoring systems.
Easily obtain impact analysis into how the network copes with various failure conditions.
“IBM SmartCloud Analytics helped detect 100 percent of the major incidents that occurred, including silent failures, and helped us eliminate manual thresholds, which will result in a cost avoidance of $300K USD annually” - Chris Smith, Director Tools and Automation, Consolidated Communications Holdings, Inc.

Telefonica de Peru performs faster root cause analysis while avoiding many service disruptions (Predict, Search & Optimize)
Need:
Customer satisfaction for this communications provider of mobile, fixed-line and television services was critical.
The company wanted to increase availability of services by reducing the overall time to resolve problems. They also wished to minimize service-impacting events.
Benefits:
Using IBM Netcool Operations Insight and Operations Analytics, Telefonica de Peru was able to predict problems before they impacted service.
Diagnostics associated with value-added services were also improved through detailed analysis of log sources.
Overall, the client increased service availability while reducing operating costs.

China Merchants Bank selects IBM Operations Analytics (Search & Predict)
The Need:
Suffering service outages on front-end transactions; each outage was costing the business significantly.
Needed a robust, scalable analytics platform to manage events from these transactions.
Solution:
IBM Operations Analytics - Log Analysis displaced Splunk to collect and analyze event and system resource data to reduce mean time to repair.
IBM Operations Analytics - Predictive Insights for self-learning behavior on numerous KPIs to prevent outages from occurring.
How we won:
Led with a solution approach for all operations analytics capabilities.
Deep understanding of the client’s goals to cater a specific solution.

Claranet – a Managed Service Provider
Implemented Netcool Operations Insight.
Centralized event & alert management.
Single support team across many solutions.
Greater visibility into all collected data: events, alerts, performance.
Improves customer efficiency with specific analytical views.