
MIS 5208 - PowerPoint Presentation

Uploaded On 2017-05-15


Presentation Transcript

Slide1

MIS 5208

Ed Ferrara, MSIA, CISSP

eferrara@temple.edu

Week 9: Big Data & Splunk

Slide2

Agenda

Chapter 1: Introduction / Splunk & Big Data

What is Big Data?

Alternate Data Processing Techniques

Machine Data

What is Splunk?

Chapter 2

Variety of Data

Dealing with Data

Files & Directories

Slide3

What is Big Data? The Three Vs

Big Data are:

High volume

High velocity

High variety

Information assets that require new forms of processing to enable:

Enhanced decision making

Insight discovery

Process optimization

Volume – Data measured in petabytes

Highway sensors

Data processing logs

Amazon purchase data

Velocity – Speed of data generation and frequency of delivery

Variety – The number of different data types

Slide4

BIG DATA

Facebook had more than 1B users with more than 618M active on a daily basis

LinkedIn had more than 200M members – with the service adding 2 new members every second

Instagram members upload 40M photos per day

Twitter has 500M users – with the service adding 150K per day

WordPress has more than 40M new posts per day

Pandora music streaming service has more than 13,700 years of music

Etc.

Slide5

Splunk and the Kill Chain

There are four classes of data that security teams need to leverage for a complete view:

log data

binary data (flow and PCAP)

threat intelligence data

contextual data

If any of these data types is missing, there’s a higher risk that an attack will go unnoticed.

These data types are the building blocks for knowing what’s normal and what’s not in your environment.

This single question lies at the intersection of both system availability (IT operations and application) and security use cases.

Slide6

Splunk and the Kill Chain

Effective data-driven security decisions require a platform that:

Handles tens of terabytes of data per day without normalization

Accesses data anywhere in the environment, including:

Traditional security data sources

Personnel time management systems

HR databases

Industrial control systems

Hadoop data stores and custom enterprise applications that run the business

Delivers fast time-to-answer for forensic analysis and can be quickly operationalized for security operations teams

Makes data more available for analysis and helps staff view events in context

https://www.splunk.com/web_assets/pdfs/secure/Splunk_for_Security.pdf

Slide7

Machine Data

Machine data contains a definitive record of all the activity and behavior of your customers, users, transactions, applications, servers, networks and mobile devices.

Machine data includes:

Configurations

API data

Message queues

Change events

Diagnostic command output

Call detail records

Sensor data from industrial systems

Machine data comes in an array of unpredictable formats, and the traditional set of monitoring and analysis tools was not designed for the variety, velocity, volume or variability of this data. A new approach, one specifically architected for this unique class of data, is required to quickly diagnose service problems, detect sophisticated security threats, understand the health and performance of remote equipment and demonstrate compliance.

Slide8

Splunk Data Sources

Slide9

Machine Data

Data Type: Application Logs
Where: Local log files, log4j, log4net, WebLogic, WebSphere, JBoss, .NET, PHP
What: User activity, fraud detection, application performance

Data Type: Business Process Logs
Where: Business process management logs
What: Customer activity across channels, purchases, account changes, trouble reports

Data Type: Call Detail Records
Where: Call detail records (CDRs), charging data records, event data records logged by telecoms and network switches
What: Billing, revenue assurance, customer assurance, partner settlements, marketing intelligence

Data Type: Clickstream Data
Where: Web server, routers, proxy servers, ad servers
What: Usability analysis, digital marketing and general research

Data Type: Configuration Files
Where: System configuration files
What: How an infrastructure has been set up, debugging failures, backdoor attacks, time bombs

Data Type: Database Audit Logs
Where: Database log files, audit tables
What: How database data was modified over time and who made the changes

Data Type: Filesystem Audit Logs
Where: Sensitive data stored in shared filesystems
What: Monitoring and auditing read access to sensitive data

Data Type: Management and Logging APIs
Where: Check Point firewalls log via the OPSEC Log Export API (OPSEC LEA); other vendor-specific APIs from VMware and Citrix
What: Management data and log events

Slide10

Machine Data

Data Type: Message Queues
Where: JMS, RabbitMQ, and AquaLogic
What: Debugging problems in complex applications; message queues also serve as the backbone of logging architectures for applications

Data Type: Operating System Metrics, Status and Diagnostic Commands
Where: CPU and memory utilization and status information using command-line utilities like ps and iostat on Unix and Linux and Performance Monitor on Windows
What: Troubleshooting, analyzing trends to discover latent issues and investigating security incidents

Data Type: SCADA Data
Where: Supervisory Control and Data Acquisition (SCADA)
What: Identifying trends, patterns and anomalies in the SCADA infrastructure; also used to drive customer value

Data Type: Packet/Flow Data
Where: tcpdump and tcpflow, which generate pcap or flow data and other useful packet-level and session-level information
What: Performance degradation, timeouts, bottlenecks or suspicious activity that indicates that the network may be compromised or the object of a remote attack

Slide11

Module Quiz

Machine data is always structured?

True

False

Machine data makes up more than ___% of the data accumulated by organizations.

10%

25%

50%

90%

Answers: False; 90%

Slide12

Module Quiz

Machine data can give you insights into:

Application performance

Security

Hardware monitoring

Sales

User behavior

Machine data is only log files on web servers.

True

False

Answers: All of the Above; False

Slide13

Splunk Components

Slide14

Splunk Components

Slide15

Index Data

Collects data from any source

Data enters

Inspectors decide how to process the data into a consistent format

When the indexer finds a match, Splunk tags the data with its type for future use

Events are then stored in the Splunk index

Slide16
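The indexing flow on the Index Data slide can be illustrated with a small Python sketch. This is not Splunk's implementation: the log line format, the `guess_sourcetype` helper and the file-name rules are assumptions made for illustration. The field names `_time`, `host`, `source`, `sourcetype` and `_raw` mirror Splunk's default event fields.

```python
import re
from datetime import datetime, timezone

# Assumed line format for this sketch: "YYYY-MM-DD HH:MM:SS <message>"
TIMESTAMP_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\b")


def guess_sourcetype(source):
    """Pick an illustrative source type label from the file name (hypothetical rules)."""
    if source.endswith("access.log"):
        return "access_log"
    if source.endswith("secure.log"):
        return "linux_secure"
    return "unknown"


def index_lines(lines, host, source):
    """Break raw lines into events and store each timestamp as UTC epoch seconds."""
    events = []
    for line in lines:
        raw = line.strip()
        match = TIMESTAMP_RE.match(raw)
        if not match:
            continue  # this sketch simply skips lines without a leading timestamp
        when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        when = when.replace(tzinfo=timezone.utc)  # assume timestamps are UTC
        events.append({
            "_time": int(when.timestamp()),  # one consistent time format
            "host": host,
            "source": source,
            "sourcetype": guess_sourcetype(source),
            "_raw": raw,
        })
    return events
```

Storing every timestamp in one format is what lets later searches compare events from different sources on a single timeline.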

Splunk Event Processing

Slide17

Search & Investigate

Enter a query into the Splunk search bar

Run statistics using the Splunk search language

Collects and indexes log and machine data from any source

Powerful search, analysis and visualization capabilities

Slide18

Add Knowledge

Data classification: Event types and transactions

Event types and transactions group together interesting sets of similar events.

Event types group together sets of events discovered through searches, while transactions are collections of conceptually-related events that span time.
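To make the idea of a transaction concrete, here is a minimal Python sketch that groups events sharing a field value into one transaction as long as consecutive events are close together in time. The function name, the `max_pause` parameter and the event dictionaries are assumptions for illustration, not Splunk's own logic.

```python
from collections import defaultdict


def group_transactions(events, field, max_pause):
    """Group events that share `field` into transactions.

    A gap longer than `max_pause` seconds between consecutive events
    with the same field value starts a new transaction.
    """
    by_value = defaultdict(list)
    for event in sorted(events, key=lambda e: e["_time"]):
        by_value[event[field]].append(event)

    transactions = []
    for group in by_value.values():
        current = [group[0]]
        for event in group[1:]:
            if event["_time"] - current[-1]["_time"] > max_pause:
                transactions.append(current)  # close the previous transaction
                current = [event]
            else:
                current.append(event)
        transactions.append(current)
    return transactions
```

For example, web requests carrying the same session ID within a few minutes of each other would end up in one transaction, while a request hours later would start a new one.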

Data interpretation: Fields and field extractions

Fields and field extractions make up the first order of Splunk Enterprise knowledge.

The fields that Splunk Enterprise automatically extracts from your IT data help bring meaning to your raw data, clarifying what can at first glance seem incomprehensible.

The fields that you extract manually expand and improve upon this layer of meaning.
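A manual field extraction can be pictured as a regular expression with named groups applied to the raw event text. The sketch below uses Python's `re` module; the pattern and the field names `user` and `src_ip` are hypothetical and only cover a simple sshd-style failure message.

```python
import re

# Hypothetical pattern for a message such as:
#   "Failed password for alice from 10.0.0.7 port 22 ssh2"
FAILED_LOGIN_RE = re.compile(
    r"Failed password for (?P<user>\S+) from (?P<src_ip>\d{1,3}(?:\.\d{1,3}){3})"
)


def extract_fields(raw):
    """Return the named fields found in a raw event, or an empty dict."""
    match = FAILED_LOGIN_RE.search(raw)
    return match.groupdict() if match else {}
```

Each named group becomes a field that can then be searched, counted or charted, which is the "layer of meaning" the slide refers to.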

Data models

Data models are representations of one or more datasets, and they drive the Pivot tool, enabling quick generation of useful tables, complex visualizations, and reports without needing to interact with the Splunk Enterprise search language.

Data models are designed by knowledge managers who fully understand the format and semantics of their indexed data.

Knowledge Objects

Event Types

Transactions

Tags

Saved Searches

Lookups

Slide19

Add Knowledge

Data normalization: Tags and aliases

Tags and aliases are used to manage and normalize sets of field information.

You can use tags and aliases to group sets of related field values together, and to give extracted fields tags that reflect different aspects of their identity.

For example, you can group events from a set of hosts in a particular location (such as a building or city) together: just give each host the same tag.

Or maybe you have two different sources using different field names to refer to the same data. You can normalize your data by using aliases (by aliasing clientip to ipaddress, for example).
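As a rough illustration of tags and aliases, the Python sketch below copies an aliased field to a common name and attaches location tags by host. The alias mapping follows the slide's clientip/ipaddress example; the host names and tag values are made up for this sketch.

```python
# Alias -> canonical field name (from the slide's example).
FIELD_ALIASES = {"clientip": "ipaddress"}

# Hypothetical host-to-tag assignments: hosts in the same location share a tag.
HOST_TAGS = {
    "web01": ["philadelphia"],
    "web02": ["philadelphia"],
    "db01": ["new_york"],
}


def normalize(event):
    """Return a copy of the event with aliases applied and host tags attached."""
    out = dict(event)
    for alias, canonical in FIELD_ALIASES.items():
        # Keep the original field and add the canonical name alongside it.
        if alias in out and canonical not in out:
            out[canonical] = out[alias]
    out["tags"] = list(HOST_TAGS.get(out.get("host"), []))
    return out
```

After normalization, a single search on `ipaddress` or on the `philadelphia` tag covers events from every source that used a different field name or host.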

Data enrichment: Lookups and workflow actions

Lookups and workflow actions are categories of knowledge objects that extend the usefulness of your data in various ways.

Field lookups enable you to add fields to your data from external data sources such as static tables (CSV files) or Python-based commands.
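A CSV-based field lookup can be sketched in a few lines of Python. The table contents, the `status_description` column and the function names are assumptions for illustration; a real deployment would read the table from a file rather than a string.

```python
import csv
import io

# Hypothetical static lookup table mapping HTTP status codes to descriptions.
STATUS_CSV = """status,status_description
200,OK
404,Not Found
500,Internal Server Error
"""


def load_lookup(csv_text, key_field):
    """Index the CSV rows by the value of `key_field`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key_field]: row for row in reader}


def apply_lookup(event, table, key_field):
    """Return a copy of the event with any matching lookup fields added."""
    row = table.get(str(event.get(key_field)), {})
    enriched = dict(event)
    for name, value in row.items():
        enriched.setdefault(name, value)  # never overwrite existing fields
    return enriched
```

An event with `status` 404 gains a `status_description` field, while an event whose status is not in the table passes through unchanged.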

Workflow actions enable interactions between fields in your data and other applications or web resources, such as a WHOIS lookup on a field containing an IP address.

Data Models cont.

A typical data model makes use of other knowledge object types discussed in this manual, including lookups, transactions, search-time field extractions, and calculated fields.

Slide20

Monitor & Alert

Type of alert: Alerts based on real-time searches that trigger every time the base search returns a result.

Base search is a: Real-time search (runs over all time)

Description: Use this alert type if you need to know the moment a matching result comes in. This type is also useful if you need to design an alert for machine consumption (such as a workflow-oriented application). You can throttle these alerts to ensure that they don't trigger too frequently. Referred to as a "per-result alert."

Alert examples:

• Trigger an alert for every failed login attempt, but alert at most once an hour for any given username.

• Trigger an alert when a "file system full" error occurs on any host, but only send notifications for any given host once per 30 minutes.

• Trigger an alert when a CPU on a host sustains 100% utilization for an extended period of time, but only alert once every 5 minutes.

Slide21
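The per-result alert with throttling described on the preceding Monitor & Alert slide can be modeled with a short Python sketch. The class name, its parameters and the event dictionaries are hypothetical; the sketch only shows the suppression logic, not how Splunk implements it.

```python
class ThrottledAlert:
    """Fire on every matching result, at most once per period for each field value."""

    def __init__(self, throttle_field, suppress_seconds):
        self.throttle_field = throttle_field
        self.suppress_seconds = suppress_seconds
        self._last_fired = {}  # field value -> time the alert last fired

    def on_result(self, event):
        """Return True if the alert should fire for this result."""
        value = event[self.throttle_field]
        now = event["_time"]
        last = self._last_fired.get(value)
        if last is not None and now - last < self.suppress_seconds:
            return False  # still inside the suppression period for this value
        self._last_fired[value] = now
        return True
```

With `ThrottledAlert("user", 3600)`, a failed login for one username fires immediately, repeats for that username within the hour are suppressed, and a failure for a different username still fires.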

Monitor & Alert

Type of alert: Alerts based on historical searches that run on a regular schedule.

Base search is a: Historical search

Description: This alert type triggers whenever a scheduled run of a historical search returns results that meet a particular condition that you have configured in the alert definition. Best for cases where immediate reaction to an alert is not a priority. You can use throttling to reduce the frequency of redundant alerts. Referred to as a "scheduled alert."

Alert examples:

• Trigger an alert whenever the number of items sold in the previous day is less than 500.

• Trigger an alert when the number of 404 errors in any 1 hour interval exceeds 100.

Slide22
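The 404 example from the scheduled-alert slide amounts to counting events per hour and comparing each count with a threshold. The Python sketch below assumes events are dictionaries with `_time` (epoch seconds) and `status` fields, and it simplifies "any 1 hour interval" to fixed clock-hour buckets; both are assumptions for illustration.

```python
from collections import Counter


def hours_over_threshold(events, status="404", threshold=100):
    """Return the start times of clock hours whose matching event count exceeds the threshold."""
    per_hour = Counter(
        event["_time"] // 3600 * 3600  # truncate to the start of the hour
        for event in events
        if str(event.get("status")) == status
    )
    return sorted(hour for hour, count in per_hour.items() if count > threshold)
```

A scheduler would run this over the most recent batch of events at each scheduled time and send a notification whenever the returned list is not empty.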

Monitor & Alert

Type of alert: Alerts based on real-time searches that monitor events within a rolling time "window".

Base search is a: Real-time search

Description: Use this alert type to monitor events in real time within a rolling time window of a width that you define, such as a minute, 10 minutes, or an hour. The alert triggers when its conditions are met by events as they pass through this window in real time. You can throttle these alerts to ensure that they don't trigger too frequently. Referred to as a "rolling-window alert."

Alert examples:

• Trigger an alert whenever there are three consecutive failed logins for a user between now and 10 minutes ago, but don't alert for any given user more than once an hour.

• Trigger an alert when a host is unable to complete an hourly file transfer to another host within the last hour, but don't alert more than once an hour for any particular host.

Slide23
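The failed-login example from the rolling-window slide can be sketched in Python as a per-user queue of recent failures. The class, its parameters and the `user`/`action` fields are hypothetical, and the sketch assumes events arrive in time order.

```python
from collections import defaultdict, deque


class RollingWindowAlert:
    """Fire when a user has N consecutive failures inside a rolling window."""

    def __init__(self, window_seconds=600, failures_needed=3, suppress_seconds=3600):
        self.window_seconds = window_seconds
        self.failures_needed = failures_needed
        self.suppress_seconds = suppress_seconds
        self._failures = defaultdict(deque)  # user -> times of recent failures
        self._last_fired = {}                # user -> time the alert last fired

    def on_event(self, event):
        """Return True if the alert should fire after seeing this event."""
        user, now = event["user"], event["_time"]
        recent = self._failures[user]
        if event["action"] == "success":
            recent.clear()  # a success breaks the run of consecutive failures
            return False
        recent.append(now)
        while recent and now - recent[0] > self.window_seconds:
            recent.popleft()  # drop failures that have left the window
        if len(recent) < self.failures_needed:
            return False
        last = self._last_fired.get(user)
        if last is not None and now - last < self.suppress_seconds:
            return False  # throttled: already alerted for this user recently
        self._last_fired[user] = now
        return True
```

The window and the throttle are separate settings: the window decides which events count toward the condition, while the throttle limits how often a notification is sent once the condition holds.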

Report & Analyze

When you create a search or a pivot that you would like to run again or share with others, you can save it as a report. This means that you can create reports from both the Search and the Pivot sides of Splunk Enterprise.

After you create a report you can:

Run the report on an ad hoc basis to review the results it returns on the report viewing page. You can get to the viewing page for a report by clicking the report's name on the Reports listing page.

Open the report and edit it so that it returns different data or displays its data in a different manner. Your report will open in either Pivot or Search, depending on how it was created.

This topic explains how you can create and edit reports.

In addition, if your permissions enable you to do so, you can:

Change the report permissions to share it with other Splunk Enterprise users.

Schedule the report so that it runs on a regular interval. Scheduled reports can be set up to perform actions each time they're run, such as sending the results of each report run to a set of stakeholders.

Accelerate slow-completing reports built in Search.

Add the report to a dashboard as a dashboard panel.

For more information about scheduling reports, see "Schedule reports," in this manual.

http://docs.splunk.com/Documentation/Splunk/6.0.2/Report/Createandeditreports

Slide24

Splunk User Roles

Slide25

Module 2 Quiz

Which of these is not a main component of Splunk?

Collect and Index the data

Search and Investigate

Add knowledge

Compress and Archive

The index does not play a major role in Splunk

True

False

Answers: Compress and Archive; False

Slide26

Module 2 Quiz

Data is broken into single events by:

Sourcetype

Host

Number of files

The “-” character

Time stamps are stored _____.

In a consistent format

Differently for each indexed item

Differently for each year

As image files

Answers: Sourcetype; In a consistent format

Slide27

Module 2 Quiz

Which role defines what apps a user will see by default:

Admin

Power

User

Which two apps ship with Splunk Enterprise?

DB Connect

Search & Reporting

Sideview Utils

Home App

Answers: Admin; Search & Reporting and Home App

Slide28

Installing Splunk

Demonstration

Slide29

Thank you