MIS 5208 Ed Ferrara, MSIA, CISSP - PowerPoint Presentation

Uploaded On 2019-12-02

Presentation Transcript

MIS 5208
Ed Ferrara, MSIA, CISSP
eferrara@temple.edu
Week 9: Big Data & Splunk

Agenda
Chapter 1
- Introduction / Splunk & Big Data
- What is Big Data?
- Alternate Data Processing Techniques
- Machine Data
- What is Splunk?
Chapter 2
- Variety of Data
- Dealing with Data
- File & Directories

What is Big Data? The Three Vs
Big Data are high-volume, high-velocity, high-variety information assets that require new forms of processing to enable:
- Enhanced decision making
- Insight discovery
- Process optimization
The three Vs:
- Volume: data measured in petabytes (highway sensors, data processing logs, Amazon purchase data)
- Velocity: speed of data generation and frequency of delivery
- Variety: the number of different data types

BIG DATA
- Facebook had more than 1B users, with more than 618M active on a daily basis
- LinkedIn had more than 200M members, with the service adding 2 new members every second
- Instagram members upload 40M photos per day
- Twitter has 500M users, with the service adding 150K per day
- WordPress has more than 40M new posts per day
- Pandora music streaming service has more than 13,700 years of music
- Etc.

Splunk and the Kill Chain
There are four classes of data that security teams need to leverage for a complete view:
- Log data
- Binary data (flow and PCAP)
- Threat intelligence data
- Contextual data
If any of these data types are missing, there's a higher risk that an attack will go unnoticed. These data types are the building blocks for knowing what's normal and what's not in your environment. This single question lies at the intersection of both system availability (IT operations and application) and security use cases.

Splunk and the Kill Chain
Effective data-driven security decisions require:
- Tens of terabytes of data per day without normalization
- Access to data anywhere in the environment, including:
  - Traditional security data sources
  - Personnel time management systems
  - HR databases
  - Industrial control systems
  - Hadoop data stores and custom enterprise applications that run the business
- Delivers fast time-to-answer for forensic analysis and can be quickly operationalized for security operations teams
- Makes data more available for analysis and helps staff view events in context
https://www.splunk.com/web_assets/pdfs/secure/Splunk_for_Security.pdf

Machine Data
Machine data contains a definitive record of all the activity and behavior of your customers, users, transactions, applications, servers, networks and mobile devices.
Machine data includes:
- Configurations
- API data
- Message queues
- Change events
- Diagnostic command output
- Call detail records
- Sensor data from industrial systems
Machine data comes in an array of unpredictable formats, and the traditional set of monitoring and analysis tools were not designed for the variety, velocity, volume or variability of this data. A new approach, one specifically architected for this unique class of data, is required to quickly diagnose service problems, detect sophisticated security threats, understand the health and performance of remote equipment and demonstrate compliance.

Splunk Data Sources

Machine Data
Application Logs
  Where: Local log files, log4j, log4net, WebLogic, WebSphere, JBoss, .NET, PHP
  What: User activity, fraud detection, application performance
Business Process Logs
  Where: Business process management logs
  What: Customer activity across channels, purchases, account changes, trouble reports
Call Detail Records
  Where: Call detail records (CDRs), charging data records, event data records logged by telecoms and network switches
  What: Billing, revenue assurance, customer assurance, partner settlements, marketing intelligence
Clickstream Data
  Where: Web server, routers, proxy servers, ad servers
  What: Usability analysis, digital marketing and general research
Configuration Files
  Where: System configuration files
  What: How an infrastructure has been set up, debugging failures, backdoor attacks, time bombs
Database Audit Logs
  Where: Database log files, audit tables
  What: How database data was modified over time and who made the changes
Filesystem Audit Logs
  Where: Sensitive data stored in shared filesystems
  What: Monitoring and auditing read access to sensitive data
Management and Logging APIs
  Where: Check Point firewalls log via the OPSEC Log Export API (OPSEC LEA) and other vendor-specific APIs from VMware and Citrix
  What: Management data and log events

Machine Data
Message Queues
  Where: JMS, RabbitMQ, and AquaLogic
  What: Debug problems in complex applications and as the backbone of logging architectures for applications
Operating System Metrics, Status and Diagnostic Commands
  Where: CPU and memory utilization and status information using command-line utilities like ps and iostat on Unix and Linux and performance monitor on Windows
  What: Troubleshooting, analyzing trends to discover latent issues and investigating security incidents
SCADA Data
  Where: Supervisory Control and Data Acquisition (SCADA)
  What: Identify trends, patterns, anomalies in the SCADA infrastructure and used to drive customer value
Packet/Flow Data
  Where: tcpdump and tcpflow, which generate pcap or flow data and other useful packet-level and session-level information
  What: Performance degradation, timeouts, bottlenecks or suspicious activity that indicates that the network may be compromised or the object of a remote attack

Module Quiz
Machine data is always structured.
- True
- False
Machine data makes up more than ___% of the data accumulated by organizations.
- 10%
- 25%
- 50%
- 90%
Answers: False; 90%

Module Quiz
Machine data can give you insights into:
- Application performance
- Security
- Hardware monitoring
- Sales
- User behavior
Machine data is only log files on web servers.
- True
- False
Answers: All of the Above; False

Splunk Components

Index Data
- Collects data from any source
- Data enters
- Inspectors decide how to process the data into a consistent format
- When the indexer finds a match, Splunk tags the data type for future use
- Events are then stored in the Splunk index
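The indexing steps on this slide can be sketched in a few lines of Python. This is a toy illustration only, not how Splunk is implemented: the rule table `SOURCETYPE_RULES`, the function `index_lines`, the two source type names, and the sample log lines are all assumptions made for the example. Each line is matched against known formats, tagged with the matching source type, given a timestamp in one consistent format (UTC epoch seconds), and stored as an event.

```python
import re
from datetime import datetime, timezone

# Hypothetical source-type rules: a pattern that recognizes the format and
# the strptime layout of its timestamp. Real source typing is far richer.
SOURCETYPE_RULES = {
    "access_combined": (
        re.compile(r'^\S+ \S+ \S+ \[(?P<ts>[^\]]+)\] "'),
        "%d/%b/%Y:%H:%M:%S %z",
    ),
    "iso_app_log": (
        re.compile(r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}[+-]\d{4}) "),
        "%Y-%m-%dT%H:%M:%S%z",
    ),
}


def index_lines(lines, host):
    """Tag each recognized line with a source type and store it as an event."""
    index = []
    for raw in lines:
        for sourcetype, (pattern, ts_format) in SOURCETYPE_RULES.items():
            match = pattern.match(raw)
            if match is None:
                continue
            parsed = datetime.strptime(match.group("ts"), ts_format)
            index.append({
                # Every timestamp is stored the same way: UTC epoch seconds.
                "_time": parsed.astimezone(timezone.utc).timestamp(),
                "host": host,
                "sourcetype": sourcetype,
                "_raw": raw,
            })
            break  # first matching rule wins; unmatched lines are skipped here
    index.sort(key=lambda event: event["_time"])
    return index


sample_lines = [
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    "2000-10-10T20:00:00+0000 ERROR payment service timeout",
    "free text with no recognizable timestamp",
]
events = index_lines(sample_lines, host="web01")
for event in events:
    print(event["sourcetype"], event["_time"])
```

Storing `_time` as a single numeric format is what lets events from differently formatted sources be sorted and searched on one timeline.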

Splunk Event Processing

Search & Investigate
- Enter a query into the Splunk search bar
- Run statistics using the Splunk search language
- Collects and indexes log and machine data from any source
- Powerful search, analysis and visualization capabilities

Add Knowledge
Data classification: Event types and transactions
  Event types and transactions group together interesting sets of similar events. Event types group together sets of events discovered through searches, while transactions are collections of conceptually-related events that span time.
Data interpretation: Fields and field extractions
  Fields and field extractions make up the first order of Splunk Enterprise knowledge. The fields that Splunk Enterprise automatically extracts from your IT data help bring meaning to your raw data, clarifying what can at first glance seem incomprehensible. The fields that you extract manually expand and improve upon this layer of meaning.
Data models
  Data models are representations of one or more datasets, and they drive the Pivot tool, enabling quick generation of useful tables, complex visualizations, and reports without needing to interact with the Splunk Enterprise search language. Data models are designed by knowledge managers who fully understand the format and semantics of their indexed data.
Knowledge Objects
- Event Types
- Transactions
- Tags
- Saved Searches
- Lookups
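To make field extraction and transactions concrete, here is a minimal Python sketch. It is not Splunk's extraction engine: the `key=value` pattern, the function names `extract_fields` and `group_transactions`, and the `session` field are assumptions chosen for illustration. The first function pulls fields out of raw text; the second groups events that share a field value and fall within a maximum time span.

```python
import re
from collections import defaultdict

# Assumed convention: fields appear in the raw text as key=value or key="quoted value".
FIELD_PATTERN = re.compile(r'(\w+)=("[^"]*"|\S+)')


def extract_fields(raw):
    """Return a dict of the key=value pairs found in a raw event string."""
    return {name: value.strip('"') for name, value in FIELD_PATTERN.findall(raw)}


def group_transactions(events, key, max_span):
    """Group events that share `key` and start within `max_span` seconds of each other."""
    by_key = defaultdict(list)
    for event in sorted(events, key=lambda e: e["_time"]):
        by_key[event[key]].append(event)

    transactions = []
    for key_events in by_key.values():
        current = [key_events[0]]
        for event in key_events[1:]:
            if event["_time"] - current[0]["_time"] <= max_span:
                current.append(event)
            else:
                # Too far from the start of the current transaction: open a new one.
                transactions.append(current)
                current = [event]
        transactions.append(current)
    return transactions


fields = extract_fields('user=alice action=login status="failed attempt"')
session_events = [
    {"_time": 0, "session": "a"},
    {"_time": 5, "session": "a"},
    {"_time": 100, "session": "a"},
    {"_time": 3, "session": "b"},
]
transactions = group_transactions(session_events, key="session", max_span=30)
print(fields)
print([len(t) for t in transactions])
```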

Add Knowledge
Data normalization: Tags and aliases
  Tags and aliases are used to manage and normalize sets of field information. You can use tags and aliases to group sets of related field values together, and to give extracted fields tags that reflect different aspects of their identity. For example, you can group events from a set of hosts in a particular location (such as a building or city) together: just give each host the same tag. Or maybe you have two different sources using different field names to refer to the same data: you can normalize your data by using aliases (by aliasing clientip to ipaddress, for example).
Data enrichment: Lookups and workflow actions
  Lookups and workflow actions are categories of knowledge objects that extend the usefulness of your data in various ways. Field lookups enable you to add fields to your data from external data sources such as static tables (CSV files) or Python-based commands. Workflow actions enable interactions between fields in your data and other applications or web resources, such as a WHOIS lookup on a field containing an IP address.
Data models (cont.)
  A typical data model makes use of other knowledge object types discussed in this manual, including lookups, transactions, search-time field extractions, and calculated fields.
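Aliases, tags, and lookups can be illustrated with a short Python sketch. Everything named here is hypothetical (the alias table, the host tags, the CSV contents, and the functions `normalize`, `load_lookup`, and `enrich`); it shows the idea of normalizing field names and enriching events from a static table, not Splunk's actual configuration or API.

```python
import csv
import io

# Hypothetical alias table: different sources name the same field differently.
FIELD_ALIASES = {"clientip": "ipaddress", "src_ip": "ipaddress"}

# Hypothetical tags: hosts in the same location share a tag.
HOST_TAGS = {"web01": ["philadelphia"], "web02": ["philadelphia"], "db01": ["new_york"]}


def normalize(event):
    """Rename aliased fields and attach the tags configured for the event's host."""
    normalized = {FIELD_ALIASES.get(name, name): value for name, value in event.items()}
    normalized["tags"] = HOST_TAGS.get(event.get("host"), [])
    return normalized


def load_lookup(csv_text, key_field):
    """Build a lookup table from CSV text, keyed on one column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key_field]: row for row in reader}


def enrich(event, lookup, key_field):
    """Add lookup columns to the event; fields already on the event win."""
    extra = lookup.get(event.get(key_field), {})
    return {**extra, **event}


owners = load_lookup("ipaddress,owner\n10.0.0.5,alice\n", key_field="ipaddress")
event = normalize({"host": "web01", "clientip": "10.0.0.5"})
enriched = enrich(event, owners, key_field="ipaddress")
print(enriched)
```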

Monitor & Alert
Type of alert: Alerts based on real-time searches that trigger every time the base search returns a result.
Base search is a...: Real-time search (runs over all time)
Description: Use this alert type if you need to know the moment a matching result comes in. This type is also useful if you need to design an alert for machine consumption (such as a workflow-oriented application). You can throttle these alerts to ensure that they don't trigger too frequently. Referred to as a "per-result alert."
Alert examples:
- Trigger an alert for every failed login attempt, but alert at most once an hour for any given username.
- Trigger an alert when a "file system full" error occurs on any host, but only send notifications for any given host once per 30 minutes.
- Trigger an alert when a CPU on a host sustains 100% utilization for an extended period of time, but only alert once every 5 minutes.
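The first example above (alert on every failed login, at most once an hour per username) can be sketched as follows. This is a minimal Python illustration of per-result alerting with throttling, not Splunk's alerting engine; the class name `ThrottledAlert` and the event fields `user`, `action`, and `_time` are assumptions.

```python
class ThrottledAlert:
    """Fire on every matching event, but at most once per throttle period per key."""

    def __init__(self, predicate, throttle_field, throttle_seconds):
        self.predicate = predicate
        self.throttle_field = throttle_field
        self.throttle_seconds = throttle_seconds
        self._last_fired = {}  # throttle key -> time of the last alert sent

    def process(self, event):
        """Return True if this event should trigger a notification."""
        if not self.predicate(event):
            return False
        key = event[self.throttle_field]
        last = self._last_fired.get(key)
        if last is not None and event["_time"] - last < self.throttle_seconds:
            return False  # suppressed: already alerted for this key recently
        self._last_fired[key] = event["_time"]
        return True


failed_login_alert = ThrottledAlert(
    predicate=lambda e: e["action"] == "failure",
    throttle_field="user",
    throttle_seconds=3600,
)
stream = [
    {"_time": 0, "user": "alice", "action": "failure"},
    {"_time": 60, "user": "alice", "action": "failure"},
    {"_time": 120, "user": "bob", "action": "failure"},
    {"_time": 3600, "user": "alice", "action": "failure"},
    {"_time": 3700, "user": "alice", "action": "success"},
]
fired = [failed_login_alert.process(event) for event in stream]
print(fired)
```

Keeping the throttle state per key is what lets one noisy username be suppressed without hiding failures for everyone else.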

Monitor & Alert
Type of alert: Alerts based on historical searches that run on a regular schedule.
Base search is a...: Historical search
Description: This alert type triggers whenever a scheduled run of a historical search returns results that meet a particular condition that you have configured in the alert definition. Best for cases where immediate reaction to an alert is not a priority. You can use throttling to reduce the frequency of redundant alerts. Referred to as a "scheduled alert."
Alert examples:
- Trigger an alert whenever the number of items sold in the previous day is less than 500.
- Trigger an alert when the number of 404 errors in any 1-hour interval exceeds 100.
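A scheduled alert amounts to running a search over stored events and checking a condition on the result. The Python sketch below approximates the 404 example by counting matching events in fixed clock-hour buckets, which is a simplification of "any 1-hour interval"; the function name `buckets_exceeding` and the event fields `status` and `_time` are assumptions for illustration.

```python
from collections import Counter


def buckets_exceeding(events, status, threshold, bucket_seconds=3600):
    """Return the start times of fixed-size buckets with more than `threshold` matches."""
    counts = Counter(
        int(event["_time"] // bucket_seconds) * bucket_seconds
        for event in events
        if event["status"] == status
    )
    return sorted(start for start, count in counts.items() if count > threshold)


# Hypothetical history: 101 errors in the first hour, exactly 100 in the second.
history = (
    [{"_time": second, "status": 404} for second in range(101)]
    + [{"_time": 3600 + second, "status": 404} for second in range(100)]
    + [{"_time": 10, "status": 200}]
)
alerting_hours = buckets_exceeding(history, status=404, threshold=100)
print(alerting_hours)
```

A scheduler would call such a function on each run and send a notification whenever the returned list is non-empty.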

Monitor & Alert
Type of alert: Alerts based on real-time searches that monitor events within a rolling time "window".
Base search is a...: Real-time search
Description: Use this alert type to monitor events in real time within a rolling time window of a width that you define, such as a minute, 10 minutes, or an hour. The alert triggers when its conditions are met by events as they pass through this window in real time. You can throttle these alerts to ensure that they don't trigger too frequently. Referred to as a "rolling-window alert."
Alert examples:
- Trigger an alert whenever there are three consecutive failed logins for a user between now and 10 minutes ago, but don't alert for any given user more than once an hour.
- Trigger an alert when a host is unable to complete an hourly file transfer to another host within the last hour, but don't alert more than once an hour for any particular host.
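The first rolling-window example can be sketched in Python as below. It is an illustration under stated assumptions rather than Splunk's implementation: events are assumed to arrive in time order per user, and the class name `RollingFailedLoginAlert` and the fields `user`, `action`, and `_time` are hypothetical. The alert fires on three consecutive failures inside a 10-minute window and is throttled to once an hour per user.

```python
from collections import defaultdict, deque


class RollingFailedLoginAlert:
    """Alert on N consecutive failed logins per user inside a rolling time window."""

    def __init__(self, window_seconds=600, required_failures=3, throttle_seconds=3600):
        self.window_seconds = window_seconds
        self.required_failures = required_failures
        self.throttle_seconds = throttle_seconds
        self._failures = defaultdict(deque)  # user -> times of consecutive failures
        self._last_fired = {}                # user -> time of the last alert

    def process(self, event):
        """Return True if this event should trigger a notification."""
        user, now = event["user"], event["_time"]
        failures = self._failures[user]
        if event["action"] != "failure":
            failures.clear()  # a successful login breaks the consecutive run
            return False
        failures.append(now)
        # Drop failures that have slid out of the rolling window.
        while failures and now - failures[0] > self.window_seconds:
            failures.popleft()
        if len(failures) < self.required_failures:
            return False
        last = self._last_fired.get(user)
        if last is not None and now - last < self.throttle_seconds:
            return False  # throttled: this user already alerted within the hour
        self._last_fired[user] = now
        return True


alert = RollingFailedLoginAlert()
login_stream = [
    {"_time": 0, "user": "alice", "action": "failure"},
    {"_time": 100, "user": "alice", "action": "failure"},
    {"_time": 200, "user": "alice", "action": "failure"},
    {"_time": 300, "user": "alice", "action": "failure"},
]
results = [alert.process(event) for event in login_stream]
print(results)
```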

Report & Analyze
When you create a search or a pivot that you would like to run again or share with others, you can save it as a report. This means that you can create reports from both the Search and the Pivot sides of Splunk Enterprise.
After you create a report you can:
- Run the report on an ad hoc basis to review the results it returns on the report viewing page. You can get to the viewing page for a report by clicking the report's name on the Reports listing page.
- Open the report and edit it so that it returns different data or displays its data in a different manner. Your report will open in either Pivot or Search, depending on how it was created.
This topic explains how you can create and edit reports.
In addition, if your permissions enable you to do so, you can:
- Change the report permissions to share it with other Splunk Enterprise users.
- Schedule the report so that it runs on a regular interval. Scheduled reports can be set up to perform actions each time they're run, such as sending the results of each report run to a set of stakeholders.
- Accelerate slow-completing reports built in Search.
- Add the report to a dashboard as a dashboard panel.
For more information about scheduling reports, see "Schedule reports," in this manual.
http://docs.splunk.com/Documentation/Splunk/6.0.2/Report/Createandeditreports

Splunk User Roles

Module 2 Quiz
Which of these is not a main component of Splunk?
- Collect and Index the data
- Search and Investigate
- Add knowledge
- Compress and Archive
The index does not play a major role in Splunk.
- True
- False
Answers: Compress and Archive; False

Module 2 Quiz
Data is broken into single events by:
- Sourcetype
- Host
- Number of files
- The “-” character
Time stamps are stored _____.
- In a consistent format
- Differently for each indexed item
- Differently for each year
- As image files
Answers: Sourcetype; In a consistent format

Module 2 Quiz
Which role defines what apps a user will see by default?
- Admin
- Power User
Which two apps ship with Splunk Enterprise?
- DB Connect
- Search & Reporting
- Sideview Utils
- Home App
Answers: Admin; Search & Reporting and Home App

Installing Splunk Demonstration

Thank you