Splunk and ArcSight are two popular security information and event management (SIEM) tools. While Splunk offers simple dashboards, reports, and searches, ArcSight provides more powerful correlation and filtering capabilities. ArcSight also has stronger log management features, supports parsing over 400 log sources out of the box, and may have more cost-effective licensing compared to Splunk. However, Splunk indexing is faster for some data types and its interface is more intuitive for simple tasks. Overall, ArcSight appears to be the preferred option for more advanced security analytics and compliance needs.

Splunk VS ArcSight

I have used both quite extensively, and for a time I would have gone with Splunk as my
preferred option; however, that is no longer the case, as I explain thoroughly here. I have
used both Splunk and ArcSight in analyst, admin, content creation, and engineering roles:
Content Creation:
Splunk’s unstructured approach to data ingestion is marvelous for truly unique data sources (home-brew
applications, or really lazy system admins). Yet this presents challenges for correlation, which I
explain further down. The ability to create simple rules, simple reports, and simple interactive
dashboards is better in Splunk than in ArcSight, which is a great confidence booster. However, the
ability to correlate, or rather the simplicity with which ESM can create an advanced correlation
rule (advanced for Splunk, not so much for ESM), is where ESM truly shines.
ESM does employ a Java-based console, and while it may look like it’s straight out of 1997,
a task that in Splunk would require almost two paragraphs’ worth of a query can be
accomplished in ESM with about 7 to 8 mouse clicks. Don’t even get me started on active lists
and session lists, along with the ability to have rules dynamically update them. This is why the use
of filters in ESM is exponentially more powerful than in Splunk. I could create a search macro in Splunk
(akin to an ESM filter), but having to remember its name and view a totally separate web
page, when I could simply select “Filters” from my Navigator pane in ESM, only adds to the
time wasted creating content in Splunk. Plus, filters in ESM can be used in
interactive dashboards, reports, trends, etc.
Analyst:
Free-text, “Google-like” search capabilities actually exist in both Splunk and
ESM, albeit in ESM’s web interface (aka Command Center) and in Logger (which is more similar to
Splunk than ESM is). Splunk does pull ahead here, but IMHO not far beyond ArcSight. Also, in ESM, if
an item is of interest, I can simply right-click and investigate, or leverage an integration command to
conduct a whole host of activities from a single pane of glass. Since Logger is actually more similar
to Splunk, I would be doing you all a great disservice by not mentioning the speed of super-indexed
fields within Logger. I used to not be such a big Logger fan, but if I were stuck in the
trenches of a cyberwar, I would go with Logger hands down, simply for the super-indexed search.
Case management is built into ESM, so tracking the progress of an investigation is straightforward.
Splunk gets ZERO points here. The right-click investigate capability really allows an analyst to follow
the breadcrumbs during either the initial triage stages or even hunting. The integration commands
open up worlds of possibilities: PCAP retrieval, blocking of IPs, kicking off Cuckoo; hell, you could
even have an integration command to brew coffee!
Admin:
Both Splunk and ArcSight offer great administration capabilities, and they would be evenly split if it
weren’t for one huge fact: you can delete logs in Splunk. Granted, it takes an intent to reconfigure,
but this ability alone makes Splunk unworthy as a log management solution. The most
important aspect of any log management solution is admissibility in a court of law if necessary.
On that count, Splunk fails.
Engineering:
Finally, the title bout. Splunk does pull ahead in simplicity (because of its unstructured
data approach), which means fewer moving parts and thus fewer things to go
wrong. However, with ArcSight SmartConnectors you are able to successfully parse over 400
sources natively. Even if a SmartConnector hasn’t been designed for your log source, you
still have FlexConnectors that are able to ingest, parse, and normalize XML, CSV, JSON, syslog,
regex-based sources, and databases. Once that information is ingested and structured, searching such data
(especially on super-indexed fields in Logger) is at least 3 to 5 times faster than in Splunk. ArcSight
also has a product known as ArcMC, which is used to maintain, upgrade, and administer not just
all of these Smart/FlexConnectors but Logger as well. Want to send logs to a new destination in
structured CEF format? Just add it. In fact, this is the ONLY way we were able to get Splunk to
mirror a fraction of ESM’s correlation capabilities. Now, if I need ArcSight to make Splunk work
effectively, why the hell am I going to pay for both?!
Which brings me to cost. After looking at the licensing structure for both, ArcSight pulls ahead; this
is due to the duplicitous nature of Splunk’s licensing. I think we can agree that licensing should be
based on average usage, right? Why should I pay for my peak times? What happens if there is a
DDoS and my peak data ingest rate skyrockets? What if my log collection grows to a metric
fμck ton? Then I am going to be paying out the wazoo for Splunk. With ArcSight, at least I can
filter and aggregate on my SmartConnectors BEFORE the data is ingested. I can streamline my
data feeds, get rid of what I don’t want, and aggregate items to use less space and create
higher-fidelity alerts. Oh, and I can manage all of that from ArcMC. Just sayin’…

What is Splunk?
Splunk is ‘Google’ for our machine-generated data. It’s a software/engine that can be
used for searching, visualizing, monitoring, reporting, etc. of our enterprise data. Splunk
takes valuable machine data and turns it into powerful operational intelligence by
providing real-time insights into our data through charts, alerts, reports, etc.

What are the common port numbers used by Splunk?


Below are the common port numbers used by Splunk. However, we can change them if
required.
Service                          Port Number Used
Splunk Web port                  8000
Splunk Management port           8089
Splunk Indexing port             9997
Splunk Index Replication port    8080
Splunk Network port              514 (used to get data from the network port, i.e., UDP data)
KV Store                         8191
What are the components of Splunk? Explain Splunk architecture.
Below are the components of Splunk:

• Search Head: Provides the GUI for searching
• Indexer: Indexes the machine data
• Forwarder: Forwards logs to the Indexer
• Deployment Server: Manages Splunk components in a distributed environment

In a typical distributed architecture, Forwarders collect data and send it to Indexers, which
parse and store it; Search Heads then distribute searches across the Indexers, while the
Deployment Server pushes configuration to the other components.

Which is the latest Splunk version in use?


Splunk 8.0.4.1

What is Splunk Indexer? What are the stages of Splunk Indexing?


Splunk Indexer is the Splunk Enterprise component that creates and manages indexes.
The primary functions of an indexer are:

• Indexing incoming data
• Searching the indexed data

What is a Splunk Forwarder? What are the types of Splunk Forwarders?


There are two types of Splunk Forwarders:

• Universal Forwarder (UF): A lightweight Splunk agent installed on a non-Splunk system to
gather data locally; it cannot parse or index data.
• Heavy Forwarder (HWF): A full instance of Splunk with advanced functionality. It generally
works as a remote collector, intermediate forwarder, and possible data filter; because it
parses data, it is more resource-intensive and is generally not recommended for
production end systems.
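As a sketch of how a forwarder is pointed at an indexer, a Universal Forwarder's outputs.conf might look like the following (the group name and indexer hostnames are placeholders; the settings shown are standard outputs.conf attributes):

```
# outputs.conf on the Universal Forwarder
[tcpout]
defaultGroup = primary_indexers

[tcpout:primary_indexers]
# Indexers listening on the default Splunk indexing port (9997)
server = indexer01.example.com:9997, indexer02.example.com:9997
```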


Can you name a few most important configuration files in Splunk?


• props.conf
• indexes.conf
• inputs.conf
• transforms.conf
• server.conf

What are the types of Splunk Licenses?

• Enterprise license
• Free license
• Forwarder license
• Beta license
• Licenses for search heads (for distributed search)
• Licenses for cluster members (for index replication)

What is Splunk App?


A Splunk app is a container/directory of configurations, searches, dashboards, etc., in
Splunk.

Where is Splunk Default Configuration stored?


$SPLUNK_HOME/etc/system/default

What are the features not available in Splunk Free?


Splunk Free does not include the following features:

• Authentication and scheduled searches/alerting


• Distributed search
• Forwarding in TCP/HTTP (to non-Splunk)
• Deployment management

What happens if the License Master is unreachable?


If the license master is unreachable, the license slave starts a 24-hour timer. After 24
hours, search is blocked on the license slave (though indexing continues); users will not
be able to search data on that slave until it can reach the license master again.

What is Summary Index in Splunk?


A summary index stores the results of scheduled searches so that reports over large
volumes of data can run quickly. The default summary index is named ‘summary’; it is the
index Splunk Enterprise uses if we do not indicate another one.

If we plan to run a variety of summary index reports, we may need to create additional
summary indexes.


What is Splunk DB Connect?


Splunk DB Connect is a generic SQL database plugin for Splunk that allows us to easily
integrate database information with Splunk queries and reports.

Can you write down a general regular expression for extracting the IP
address from logs?
There are multiple ways in which we can extract the IP address from logs. Below are a
few examples:

By using a regular expression:


rex field=_raw "(?<ip_address>\d+\.\d+\.\d+\.\d+)"

OR
rex field=_raw "(?<ip_address>([0-9]{1,3}[\.]){3}[0-9]{1,3})"
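Once extracted, the field can be used like any other. As a minimal sketch (the index name web is a placeholder), the extraction can feed an aggregation:

```
index=web
| rex field=_raw "(?<ip_address>\d+\.\d+\.\d+\.\d+)"
| stats count by ip_address
| sort -count
```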

Explain Stats vs Transaction commands.


The transaction command is most useful in the following cases:

• When a unique ID (from one or more fields) alone is not sufficient to discriminate
between two transactions. This is the case when the identifier is reused, for
example, in web sessions identified by a cookie/client IP. Here, the time span
or pauses between events are also used to segment the data into transactions.
• When an identifier is reused, say in DHCP logs, and a particular message identifies the
beginning or end of a transaction.
• When it is desirable to see the raw text of the events combined rather than an analysis
of the constituent fields of the events.

In other cases, it is usually better to use stats:

• The stats command performs better, especially in a distributed search environment
• If there is a unique ID, the stats command can be used
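To illustrate the trade-off, here is a sketch of both approaches grouping web events into sessions (the index name web and the fields JSESSIONID and clientip are assumed examples). With transaction, events separated by pauses are stitched into one session and a duration field is produced:

```
index=web | transaction JSESSIONID clientip maxpause=30m | stats avg(duration)
```

With stats, which is faster when the session ID alone is sufficient:

```
index=web | stats count, min(_time) AS start, max(_time) AS end BY JSESSIONID
```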

How to troubleshoot Splunk performance issues?


The answer to this question would be very wide, but mostly an interviewer would be
looking for the following keywords:

• Check splunkd.log for errors


• Check server performance issues, i.e., CPU, memory usage, disk I/O, etc.
• Install the SOS (Splunk on Splunk) app and check for warnings and errors in its
dashboard
• Check the number of saved searches currently running and their consumption of
system resources
• Install and enable Firebug, a Firefox extension. Log into Splunk (using Firefox) and
open Firebug’s panels. Then, switch to the ‘Net’ panel (we will have to enable it).
The Net panel will show us the HTTP requests and responses, along with the time
spent in each. This will give us a lot of information quickly such as which requests
are hanging Splunk, which requests are blameless, etc.

What are Buckets? Explain Splunk Bucket Lifecycle.


Splunk places indexed data in directories, called ‘buckets.’ It is physically a directory
containing events of a certain period.
A bucket moves through several stages as it ages. Below are the various stages it goes
through:

• Hot: A hot bucket contains newly indexed data. It is open for writing. There can be
one or more hot buckets for each index.
• Warm: A warm bucket consists of data rolled out from a hot bucket. There are
many warm buckets.
• Cold: A cold bucket has data that is rolled out from a warm bucket. There are many
cold buckets.
• Frozen: A frozen bucket is comprised of data rolled out from a cold bucket. The
indexer deletes frozen data by default, but we can archive it. Archived data can
later be thawed (data in a frozen bucket is not searchable).

By default, the buckets are located in:


$SPLUNK_HOME/var/lib/splunk/defaultdb/db

We should see the hot-db there, and any warm buckets we have. By default, Splunk sets
the bucket size to 10 GB for 64-bit systems and 750 MB on 32-bit systems.
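The rolling behavior above can be tuned per index in indexes.conf. A minimal sketch (the index name and archive path are placeholders; the attributes shown are standard indexes.conf settings):

```
[my_index]
homePath   = $SPLUNK_DB/my_index/db
coldPath   = $SPLUNK_DB/my_index/colddb
thawedPath = $SPLUNK_DB/my_index/thaweddb

# Let Splunk choose the hot-to-warm roll size automatically
maxDataSize = auto

# Freeze (delete or archive) events older than ~90 days
frozenTimePeriodInSecs = 7776000

# Optional: archive frozen buckets here instead of deleting them
coldToFrozenDir = /data/archive/my_index
```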


What is the difference between stats and eventstats commands?

• The stats command generates summary statistics of all the existing fields in the
search results and saves them as values in new fields.
• Eventstats is similar to the stats command, except that the aggregation results are
added inline to each event and only if the aggregation is pertinent to that event. The
eventstats command computes requested statistics, like stats does, but aggregates
them to the original raw data.
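The difference is easiest to see side by side (the index name web and the fields bytes and host are assumed examples). With stats, only the aggregate rows survive:

```
index=web | stats avg(bytes) AS avg_bytes BY host
```

With eventstats, the aggregate is appended to every event, so the raw events can still be filtered against it, for example to find events well above their host's average:

```
index=web | eventstats avg(bytes) AS avg_bytes BY host | where bytes > 2 * avg_bytes
```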

Who are the top direct competitors to Splunk?


Logstash, Loggly, LogLogic, Sumo Logic, etc. are some of the top direct competitors to
Splunk.

What is the command for restarting Splunk web server?


We can restart Splunk web server by using the following command:
splunk start splunkweb

What is the command for restarting Splunk Daemon?


Splunk Daemon can be restarted with the below command:
splunk start splunkd

What is the command used to check the running Splunk processes on Unix/Linux?

If we want to check the running Splunk Enterprise processes on Unix/Linux, we can make
use of the following command:
ps aux | grep splunk

What is the command used for enabling Splunk to boot start?


To boot start Splunk, we have to use the following command:
$SPLUNK_HOME/bin/splunk enable boot-start

How to disable Splunk boot-start?


In order to disable Splunk boot-start, we can use the following:
$SPLUNK_HOME/bin/splunk disable boot-start


What is Source Type in Splunk?


Source type is Splunk’s way of identifying data; it tells Splunk how to process and format
incoming data during indexing.

How to reset Splunk Admin password?


Resetting Splunk Admin password depends on the version of Splunk. If we are using
Splunk 7.1 and above, then we have to follow the below steps:

• First, we have to stop our Splunk Enterprise


• Now, we need to find the ‘passwd’ file and rename it to ‘passwd.bk’
• Then, we have to create a file named ‘user-seed.conf’ in the below directory:

$SPLUNK_HOME/etc/system/local/

• In the file, add the following stanza (here, in place of ‘NEW_PASSWORD’, we will add
our own new password):

[user_info]
USERNAME = admin
PASSWORD = NEW_PASSWORD

• After that, we can just restart the Splunk Enterprise and use the new password to
log in

Now, if we are using the versions prior to 7.1, we will follow the below steps:

• First, stop the Splunk Enterprise


• Find the passwd file and rename it to ‘passwd.bk’
• Start Splunk Enterprise and log in using the default credentials of admin/changeme
• Here, when asked to enter a new password for our admin account, we will follow
the instructions

Note: In case we have created other users earlier and know their login details, copy and
paste their credentials from the passwd.bk file into the passwd file and restart Splunk.

How to disable Splunk Launch Message?


Set the value OFFENSIVE=Less in splunk-launch.conf


How to clear Splunk Search History?


We can clear the Splunk search history by deleting the following file from the Splunk server:
$SPLUNK_HOME/var/log/splunk/searches.log

What is Btool?/How will you troubleshoot Splunk configuration files?


Splunk Btool is a command-line tool that helps us troubleshoot configuration file issues
or just see what values are being used by our Splunk Enterprise installation in the existing
environment.
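For example, to see the effective inputs configuration merged across all configuration layers, with each value annotated with the file it came from, and to validate configuration file syntax:

```
$SPLUNK_HOME/bin/splunk btool inputs list --debug
$SPLUNK_HOME/bin/splunk btool check
```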

What is the difference between Splunk App and Splunk Add-on?


Both contain preconfigured configurations, reports, etc., but a Splunk add-on does not
have a visual app (no navigable UI), whereas a Splunk app comes with a preconfigured
visual app.

What is .conf files precedence in Splunk?


File precedence is as follows:

1. System local directory (highest priority)
2. App local directories
3. App default directories
4. System default directory (lowest priority)
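As a concrete sketch of this precedence, if the same setting appears in an app's default directory and in the system local directory, the system local copy wins (the stanza and value here are assumed examples):

```
# $SPLUNK_HOME/etc/apps/search/default/props.conf
[access_combined]
TRUNCATE = 10000

# $SPLUNK_HOME/etc/system/local/props.conf  (this value takes precedence)
[access_combined]
TRUNCATE = 20000
```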

What is Fishbucket? What is Fishbucket Index?


Fishbucket is a directory (an internal index) at the default location:
/opt/splunk/var/lib/splunk/fishbucket

It contains seek pointers and CRCs for the files we are indexing, so ‘splunkd’ can tell
whether it has already read them. We can access it through the GUI by searching for:
index=_thefishbucket


How do I exclude some events from being indexed by Splunk?


This can be done by defining a regex to match the necessary event(s) and sending
everything else to nullQueue. Here is a basic example that drops everything except
events that contain the string ‘login’:

In props.conf:

[source::/var/log/foo]
# Transforms must be applied in this order
# to make sure events are dropped on the
# floor prior to making their way to the
# index processor
TRANSFORMS-set = setnull,setparsing

In transforms.conf:

[setnull]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

[setparsing]
REGEX = login
DEST_KEY = queue
FORMAT = indexQueue
