Splunk quick start

Uploaded On 2017-10-14



Presentation Transcript

Slide1

Splunk quick start

Mark Runals
Sr Security Engineer

Slide2

About Me
- Have been using Splunk for ~2 years
- ArcSight admin for 3 years, medium-size deployment
- Motto: solve for 80% and move on

Slide3

Presentation Focus / Caveats
Focus:
- High-level tips on architecture and methodologies that have worked for OSU (potentially best practices)
- Get funding
- Getting started
- Specific use cases
- ROI
Caveats:
- I don't work for Splunk
- Everyone's environment is different
- This brief won't be sufficient to answer all questions =)

Slide4

Agenda
- Misc stuff
- How many FTEs are needed?
- General server architecture
- Premade content
- Commonly used config files
- Keeping configuration files updated
- Index creation strategy
- Misc stuff

Slide5

The value of visualization
[Mock dashboard: "External Threats!!!1!!1!1"]
- Panel: Top 5 Countries (China, United States, India, Brazil)
- Panel: Blocked IPs: action taken on 3,225 external IPs attacking us in the last <timeperiod>
- Panel: Alerts in the last <timeperiod> (Bro, Snort)
Are you doing this sort of reporting?

Slide6

The value of visualization
Blocked IPs: action taken on 3,225 addresses in the last <timeperiod>

Slide7

What is Splunk? / Why use Splunk?
Do we need to cover this?

Slide8

Internet2 Splunk Deal
- 3-year term license
- 1 TB max
- More information: http://www.internet2.edu/products-services/cloud-services-applications/splunk/#service-overview

Slide9

How many FTEs?
Little data vs. lots of data
Complexity:
- Data diversity
- Log volume
- Environmental complexity
- Who creates content
- User diversity
What's your end game?
Algorithms I've heard:
- 1 FTE per 7 servers
- 1 FTE per TB daily volume
(not 1:1)

Slide10

FTE Requirements
Centrally Managed Service - Large Environment
Service work list:
- New client interaction
- Onboard new data
- Data management
- Knowledge management
- Deploying apps
- Training
- Content creation
- Testing
- Tuning Splunk
- Customer interaction
- Deployment management
- Politics
- Data requests
- General program management
- Planning
- Services support
- Fixing stuff
- General & random BS
Work groupings: Program & Service Management, Content Creation, Care and Feeding
Staffing levels shown: 1 FTE, 2 FTE, 3 FTE

Slide11

Server Architecture
[Graphic from .conf2013 Best Practices: Deploying Splunk on Physical, Virtual and Cloud Infrastructure]

Slide12

Server Architecture: Functional Overview
- Search heads: where the user interacts with Splunk (searches, alerts, etc.)
- Indexers: ingest and store data, respond to queries
- Forwarders: collect and send data to indexers
Note: a single server can perform all three functions, depending on data volume

Slide13

Server Architecture: General Guidance
- CPUs / cores: 3 GHz, 12 – 20 total cores
- General rule of thumb for indexers: 1 indexer per 100 GB of logs (daily throughput)
- Physical or virtual
  - Virtual: 20 – 30% reduction in indexing performance
- Storage: local vs SAN vs NAS vs other
  - >> IOPS is a big performance constraint <<
  - Production: if IOPS < 800 you need a different solution
  - RAID 1+0 arrays
- Windows or Linux
  - Windows: 10 – 20% reduction in indexing performance

Slide14
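As a rough illustration of the General Guidance rule of thumb (1 indexer per 100 GB of daily logs, with the quoted virtualization and Windows penalties), the arithmetic might be sketched in Python as follows. The function name `estimate_indexers` and the choice to apply each penalty as a multiplier on per-indexer capacity are assumptions made for this sketch; the slides give only the raw percentages.

```python
import math

# Rule of thumb quoted in the slides; a starting point, not a guarantee.
GB_PER_INDEXER_PER_DAY = 100.0


def estimate_indexers(daily_gb, virtual_penalty=0.0, windows_penalty=0.0):
    """Return a rough indexer count for a daily log volume in GB.

    Penalties are fractions (0.25 means a 25% indexing slowdown).
    Treating them as multipliers on capacity is an assumption of this sketch.
    """
    if daily_gb < 0:
        raise ValueError("daily_gb must be non-negative")
    capacity = GB_PER_INDEXER_PER_DAY * (1.0 - virtual_penalty) * (1.0 - windows_penalty)
    if capacity <= 0:
        raise ValueError("penalties leave no indexing capacity")
    # Always plan for at least one indexer, and round partial indexers up.
    return max(1, math.ceil(daily_gb / capacity))


if __name__ == "__main__":
    print(estimate_indexers(250))                        # physical Linux hosts
    print(estimate_indexers(300, virtual_penalty=0.25))  # virtual hosts, 25% penalty
```

IOPS, search load and user concurrency are not modeled here, so the result is only a lower bound to start a sizing conversation.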

Server Architecture: Growth Factors
- 1:1 search-to-core ratio
- Add indexers before search heads
- More servers > fewer beefy servers
- How much incoming data?
- How many concurrent active users?
- Lots of real-time searches?
- What types of searches?
(similar to FTE questions)

Slide15

Content Development: SplunkBase
- SplunkBase: great place to get started
- An app can fulfill three types of functions
  - Data management (i.e. getting data in)
  - Knowledge management (i.e. defining fields)
  - Data visualization
- Suggested apps
  - Splunk on Splunk (SoS)
  - Fire Brigade
  - Windows Security Operations Center
  - Windows / Nix apps (at least the TAs)
  - Deployment Monitor (if using Deployment Server and on 5.x)

Slide16

Splunk Configs
Lots and lots; beyond the scope of this preso. Mostly use:
- inputs.conf: what is ingested (file paths, TCP/UDP ports, scripts). Typically lives on forwarders
- props.conf / transforms.conf: data management instructions (see the following slides). Live on indexers/search heads

Slide17

Splunk Configs: inputs.conf
Tells Splunk what data to collect:
- monitor: directories or specific files
- TCP/UDP: ports to listen on
- batch: read and then delete data
- script: run a local script
Common attributes and their general use:
- sourcetype: explicit sourcetyping
- host_segment: especially useful on syslog servers (path split by host)
- index: where should 'this' monitored data go
- disabled: some troubleshooting uses
- ignoreOlderThan: good for limiting system load
- crcSalt: read Splunk's doc; especially useful for small files

Slide18

Splunk Configs
Two main data management configs:
- props.conf
- transforms.conf
Capabilities (not a complete list):
- Timestamp recognition
- Line breaking
- Host override
- Sourcetype override
- Simple field extractions
- Complex field creation

Slide19

Splunk Configs: Props/Transforms Recommendations
For technology X, keep props.conf and transforms.conf together under …/deployment-apps/<group>_<technology>_TA
- Place both config files in the same folder (why? note DS slides)
- Use a common naming convention
  - Keep in mind alpha sorting
  - Way to ID the type of configs
  - Splunk uses 'TA' = Technology Add-on
- Examples: osu_shibboleth_props, osu_netflow_props

Slide20

Splunk Configs: Field Definitions – props.conf
Three options:
1. EXTRACT: relatively simple search-time field extractions via regex

    [my_sourcetype]
    EXTRACT-name_field = (?<name>\S+)
    EXTRACT-device = device_id=(?<device>\S+)

2. REPORT and 3. TRANSFORMS: both call transforms.conf
   - REPORT = search-time fields
   - TRANSFORMS = index-time fields

    [my_sourcetype]
    REPORT-<class> = <transforms_stanza_name>
    TRANSFORMS-<class> = <transforms_stanza_name>

Note: defining fields isn't required to search logs

Slide21

Splunk Configs: Field Extraction
Define fields inline (props):

    [my_sourcetype]
    EXTRACT-data_fields = user (?<user>\S+) logged in from (?<device>\S+)

OR with named groups (transforms):

    [sourcetype_stanza_1]
    REGEX = user (?<user>\S+) logged in from (?<device>\S+)

OR with numbered groups and FORMAT (transforms):

    [sourcetype_stanza_1]
    REGEX = user (\S+) logged in from (\S+)
    FORMAT = user::$1 device::$2

Pro tip: fields for a new data source
1. Create a search with rex
2. Email to SME for validation
3. Plug into configs
4. Profit

Slide22
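In the spirit of the pro tip from the Field Extraction slide (prototype the regex before plugging it into configs), the login pattern can also be tried outside Splunk. The sketch below uses Python's `re` module, which spells named groups `(?P<name>...)` rather than the `(?<name>...)` form used in the Splunk stanzas. The sample log line and the helper name `extract_fields` are hypothetical.

```python
import re

# Same pattern as the Field Extraction slide, rewritten with Python's
# (?P<name>...) named-group syntax.
LOGIN_PATTERN = re.compile(r"user (?P<user>\S+) logged in from (?P<device>\S+)")


def extract_fields(event):
    """Return the extracted fields as a dict, or an empty dict if nothing matches."""
    match = LOGIN_PATTERN.search(event)
    return match.groupdict() if match else {}


if __name__ == "__main__":
    # Hypothetical event, invented for illustration only.
    sample = "Oct 14 10:02:11 sshd: user alice logged in from lab-pc-07"
    print(extract_fields(sample))
```

Running a handful of real events through a helper like this is a quick way to show an SME what the extraction would produce before it goes into props.conf or transforms.conf.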

Splunk Configs: Use EXTRACT or REPORT?
Generally speaking, EXTRACT and REPORT do the same thing. However, there are times to use REPORT to call transforms.conf, or to use transforms.conf in general:
- Delimiter-based field definition
- Concatenate fields
- Reuse field extractions across multiple data sources/types
- Perform additional extraction within a particular field
- Set up configs for multi-value fields (requires use of fields.conf as well)

Slide23

Update Configs
Do you have anything in-house? Chef, Puppet, other?
Our challenges:
- Each college IT shop is autonomous
- Nothing is standard
- No centralized asset management
Splunk Deployment Server
At what point should you use an automated update mechanism?
- Forwarders on servers out of your direct control
- More than one indexer or search head
- More than a handful of forwarders

Slide24

Update Configs: What to manage with Deployment Server?
- Smaller environment (e.g. single server acting as indexer/search head)
  - More focused on forwarder inputs
- Medium to larger environment (e.g. multiple indexer or search head servers)
  - Forwarder inputs
  - Keep server configs in sync

Slide25

Update Configs: Setting up Deployment Server
1. Can be installed on any Splunk server (ideally not an indexer)
2. Put some content in SPLUNK_HOME/etc/deployment-apps
3. Create a serverclass.conf file in SPLUNK_HOME/etc/system/local
4. Create a deploymentclient.conf file on the local agent in SPLUNK_HOME/etc/system/local

Typical serverclass.conf* entry:

    [serverClass:some_servers]
    whitelist.0 = server_name
    restartSplunkd = true

    [serverClass:some_servers:app:some_content]

Typical deploymentclient.conf:

    [target-broker:deploymentServer]
    targetUri = splunk_ds.mycompany.com:8089

* $SPLUNK_HOME/etc/system/local/serverclass.conf

Slide26

Update Configs: Whitelisting Servers (serverclass.conf)
Options:
- Hostname
Considerations:
- Can use wildcards / regex
- Hostname collision (DC1)
- Requires upfront list of servers
- Did they use a (rational) naming convention?

    [serverClass:psychobotany_servers_win]
    whitelist.0 = psychobotany_dc01
    whitelist.n = random_server_name

    [serverClass:psychobotany_servers_win:app:win_inputs]

Slide27

Update Configs: Whitelisting Servers (serverclass.conf)
Options:
- Hostname
- IP address
Considerations:
- Can use wildcards / regex
- Doesn't support CIDR
- Multiple private IP space?

    [serverClass:psychobotany_servers_win]
    whitelist.0 = 10.10.10.*

    [serverClass:psychobotany_servers_win:app:win_inputs]

Slide28

Update Configs: Whitelisting Servers (serverclass.conf)
Options:
- Hostname
- IP address
- clientName string
Considerations:
- Can use wildcards / regex
- Key to rollout success at OSU

Local deploymentclient.conf:

    [deployment-client]
    clientName = psychobotany_win_dc01

serverclass.conf:

    [serverClass:psychobotany_servers_win]
    whitelist.0 = psychobotany_win_*

    [serverClass:psychobotany_servers_win:app:win_inputs]

Slide29
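To see why a clientName convention makes whitelisting easier than hostnames, it can help to check which names a wildcard entry such as `psychobotany_win_*` would select. The sketch below approximates `*` matching with Python's `fnmatch`. It is not Splunk's own matching code, and details such as case handling may differ. All client names other than `psychobotany_win_dc01` are made up for illustration.

```python
from fnmatch import fnmatchcase


def matching_clients(whitelist_pattern, client_names):
    """Return the names that a '*' wildcard whitelist entry would select."""
    return [name for name in client_names if fnmatchcase(name, whitelist_pattern)]


if __name__ == "__main__":
    # Hypothetical clientName values following a <group>_<os>_<host> convention.
    clients = [
        "psychobotany_win_dc01",
        "psychobotany_win_web02",
        "psychobotany_nix_db01",
        "xenopsychology_win_dc01",
    ]
    print(matching_clients("psychobotany_win_*", clients))
```

With a convention like this, one whitelist line per group and platform is enough, and two colleges that both name a host DC1 no longer collide.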

Update Configs: Random Deployment Server Tips
- One DS can manage ~3k check-ins per minute (Linux) or 500 check-ins per minute (Windows)
- Change the default phonehome interval via a Deployment Server package
  - Great for troubleshooting
  - Default is every 30 seconds
- Can use DS to manage the indexes.conf file on idx/sh
- Put technology X props/transforms in the same package; deploy to both idx/sh

Slide30
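The check-in capacities and the 30-second default from the Deployment Server tips imply a ceiling on how many clients one DS can serve at a given phonehome interval. A back-of-the-envelope sketch in Python follows. It assumes check-ins are spread evenly over time, which real deployments will not guarantee, and the function names are invented for this example.

```python
import math

# Check-ins per minute one Deployment Server can handle, as quoted in the slides.
DS_CAPACITY_PER_MINUTE = {"linux": 3000, "windows": 500}


def max_clients(interval_secs, ds_platform="linux"):
    """Clients one DS could serve if each phones home every interval_secs seconds."""
    capacity = DS_CAPACITY_PER_MINUTE[ds_platform]
    # Each client generates 60 / interval_secs check-ins per minute.
    return capacity * interval_secs // 60


def min_phonehome_interval_secs(client_count, ds_platform="linux"):
    """Shortest phonehome interval that keeps client_count clients within capacity."""
    capacity = DS_CAPACITY_PER_MINUTE[ds_platform]
    return math.ceil(client_count * 60 / capacity)


if __name__ == "__main__":
    print(max_clients(30, "linux"))
    print(max_clients(30, "windows"))
    print(min_phonehome_interval_secs(6000, "linux"))
```

If the client count is above the ceiling for the default interval, lengthening the phonehome interval in a deployed package is the lever the slide points to.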

Update Configs: Splunk Deployment Server
Why bundle props/transforms together?
- Both files have settings that might be applied at index or search time
- Easier to just send updates out once
- Set restartSplunkd to false to avoid inopportune service restarts
- If the initial point of entry is a heavy forwarder (e.g. a syslog server) and you need to change index-time fields, send the props/transforms files to it

    [serverClass:all_search_heads]
    whitelist.0 = search_head_0*
    restartSplunkd = false

    [serverClass:all_search_heads:app:company_sso_props]

    [serverClass:all_search_heads:app:company_firewall_props]

    [serverClass:all_indexers]
    whitelist.0 = indexer_0*
    restartSplunkd = false

    [serverClass:all_indexers:app:company_sso_props]

    [serverClass:all_indexers:app:company_firewall_props]

Slide31

Index Creation
splunk > index = ??

Slide32

Index Creation: General
- Don't send data to 'main'
  - Default out-of-the-box location for data
  - Create an alert to let you know when data IS in the main index
- Give some consideration to log volume
  - No need to be overly granular, but it can help search performance (e.g. finding rare events)
- Create indices with logical / role-based boundaries
  - Groups or units, technologies (e.g. database, web, etc.)
  - Easiest way to grant permissions to data
- Use to set retention
  - Age out data based on storage or date

Slide34

Index Creation: OSU's General Strategy
Colleges
- 1 – 5 admins for entire technology stack
- Primary focus: audit compliance
- Large variety of log sources
- Easy RBAC!
- [Diagram: per-college indexes (Psychobotany, Xenopsychology) holding sources such as Servers, IIS, Firewall x, Firewall y, Apache, IDS]
Office of the CIO
- Service organization
- Dedicated teams at various tiers
- RBAC about to become a PITA
- [Diagram labels: DC Firewalls, Server Management, Middleware, Basketweaving, Syslog]

Slide35

Miscellaneous: Random Thoughts
- Field creation
  - Can create fields using eval statements in props.conf (i.e. calculations, case statements, etc.)
- Shared resource for users?
  - Consider removing users' scheduled search and real-time search abilities
  - Something to consider based on size/complexity of environment
- Create an app for each group
  - Ability for each group to create and share content 'internally'
  - Gives the group a sense of ownership
- Lots of syslog data?
  - Don't send it directly to the indexers
  - Receive it on a server and ingest it with a local universal or heavy forwarder
    - Universal forwarder: more efficient with high loads
    - Heavy forwarder: can adjust index-time fields without restarting your indexers (e.g. the host field)

Slide36

Miscellaneous: Splunk Config Order of Precedence
On boot, configs are read in this order:
1. SPLUNK_HOME/etc/system/default/…
2. SPLUNK_HOME/etc/apps/<app>/default/… (apps named 0-9…)
3. SPLUNK_HOME/etc/apps/<app>/default/… (apps named a-z…)
4. SPLUNK_HOME/etc/apps/<app>/local/… (apps named 0-9…)
5. SPLUNK_HOME/etc/apps/<app>/local/… (apps named a-z…)
6. SPLUNK_HOME/etc/system/local/…

Quick Takeaways
- Upgrades overwrite ../default/.. files
- Make all modifications in ../local/.. (might mean making a file)
- The last attribute read in 'wins' if it exists in multiple config files

Slide37
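The "last attribute read in wins" takeaway from the order-of-precedence slide can be pictured as a layered merge in which each later layer overrides individual attributes from earlier ones. The Python sketch below is only a simplified model of that idea. Splunk's documented precedence rules cover more cases than four layers, and the stanza name and values here are invented for illustration (`TRUNCATE` and `SHOULD_LINEMERGE` are used merely as familiar props.conf attribute names).

```python
def merge_layers(layers):
    """Merge config layers in read order; a later layer overrides per attribute."""
    merged = {}
    for _layer_name, settings in layers:
        for stanza, attributes in settings.items():
            # Only the attributes a layer actually sets are overridden.
            merged.setdefault(stanza, {}).update(attributes)
    return merged


if __name__ == "__main__":
    # Hypothetical props.conf content at each layer, listed in read order.
    layers = [
        ("system default", {"my_sourcetype": {"TRUNCATE": "10000", "SHOULD_LINEMERGE": "true"}}),
        ("app default", {"my_sourcetype": {"SHOULD_LINEMERGE": "false"}}),
        ("app local", {"my_sourcetype": {"TRUNCATE": "20000"}}),
        ("system local", {}),
    ]
    print(merge_layers(layers))
```

Because the merge happens per attribute, a small file in ../local/.. that sets one value is enough to override it without copying the whole default file.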

Miscellaneous: Random Admin Queries

Check for agents phoning home (lots of troubleshooting opportunities):

    index=_internal source=*splunkd_access.log POST phonehome

Watch for packages being installed/uninstalled:

    index=_internal sourcetype=splunkd deployedapplication (removing OR installing OR uninstalling) NOT "removing app at location"
    | rex "DeployedApplication - (?<Action>\S+)\sapp(\=|\S+\s)(?<App>\S+)"
    | eval Action = case(Action="Removing" , "Removing" , Action="Uninstalling" , "Removing" , Action="Installing" , "Installing" , 1=1,"Fix me")
    | rex "(Removing|Installing) app=(?<Version>\S+)"
    | eval Version = if(isnull(Version),"5x","-= 6x =-")
    | dedup _time host Action App Version
    | table _time host Action App Version
    | sort -_time

Busy agent processing a lot of files:

    index=_internal "File descriptor cache is full"
    | rex "is full \((?<fd_limit>\d+)"
    | stats count by host, fd_limit
    | sort -fd_limit, -count

Slide38

Miscellaneous: Random Admin Queries

Check for agents pushing a lot of content:

    index=_internal "current data throughput"
    | rex "Current data throughput \((?<kb>\S+)"
    | eval rate=case(kb < 500, "256", kb >= 500 AND kb < 520, "512", kb >= 520 AND kb < 770, "768", kb >= 770 AND kb < 1210, "1024", 1=1, "Other")
    | stats count sparkline by host, rate
    | where count > 4
    | sort -rate, -count

Check for file/folder monitoring permission errors:

    index=_internal "permission denied"
    | stats count by host
    | sort -count

Alert on missing apps relative to serverclass.conf (i.e. spelling issues):

    index=_internal source=*splunkd.log (component=application OR component=serverclass) warn OR error

Slide39

Miscellaneous: Random Admin Queries

Events of interest (accounts created, deleted, delete command used, etc.):

    (index=_internal "No space left on device") OR (index=_audit "| delete" NOT "index=*_audit") OR (index=_audit action="login attempt" info=failed sourcetype="audittrail") OR (index=_internal source=*splunkd.log component=serverclass warn NOT "machineTypes in app * is deprecated") OR (index=_audit action=edit_user (operation=create OR operation=remove))
    | eval Alert = case(action="edit_user" AND operation="create", "User account created", action="edit_user" AND operation="remove", "User account deleted", match(_raw, "Unable to load application"), "Serverclass.conf issue", match(_raw, "delete"), "Delete used", action="login attempt" AND info="failed", "Failed local login", match(_raw,"No space left on device"), "No space on device", 1=1, "fix me" )
    | eval Message = case(Alert="User account deleted", "User: " .user. " Deleted: " .object, Alert="User account created", "User: " .user. " Created: " .object, Alert="Failed local login", "User: " .user, Alert="Delete used", "User: " .user. " Search: " .search, Alert="Serverclass.conf issue", message. " (Probably a spelling issue)", Alert="No space on device", "Diskspace or inodes issues", 1=1, "fix me")
    | eval a_time = strftime(_time,"%m/%d/%y %k %p")
    | stats count by a_time host Alert Message

Slide40

?

Slide41

Resources
- runals.3@osu.edu
- runals.blogspot.com
- SplunkBase: apps.splunk.com
- Splunk Forum: answers.splunk.com
- Splunk Installation Manual (reference architecture, supported OS, etc.): http://docs.splunk.com/Documentation/Splunk/latest/Installation/Whatsinthismanual