Channel: Tips & Tricks – Splunk Blogs

One source, many use cases: How to deliver value right away by addressing different IT challenges with Splunk – Part 2


Do you remember this piece of raw data:

raw_data

I hope so, it was on the blog only last week … 😉

Today, let’s focus on the value we can extract and how we’ll be able to address some of the IT challenges related to the company strategy.

IT Ops

What kind of information would be relevant for the application manager?

I am sure he would be interested in:

  • Number of transactions during the last X minutes and the trend
  • Number of transactions in error during the last X minutes, and whether this number is growing compared to the last Y minutes
  • How long a transaction takes to complete for each customer
  • A geographic distribution of the transactions

“What? You said geographic distribution? But I don’t have any details about the transaction location in the logs …”

True. But don’t forget you can enrich your raw data with the lookup feature! You’ve probably seen that there is an IP address at the beginning of each transaction. We have a CSV file (it could just as well be a database) referencing all our customers’ point-of-sale IPs, with additional metadata like location, shop name, etc.
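For illustration, here is a minimal sketch of that enrichment in SPL. The sourcetype name and the lookup definition (a hypothetical pos_locations lookup built from the CSV, keyed on ip and carrying city, shop_name, lat and lon columns) are assumptions, not the exact names used in this demo:

sourcetype=pos_transactions
| lookup pos_locations ip OUTPUT city shop_name lat lon
| geostats latfield=lat longfield=lon count by shop_name

The same lookup feeds the per-customer panels; geostats simply aggregates the enriched coordinates for the map visualization.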

So here we are, a few queries later:

it_ops_dashboard

With this kind of dashboard, the application manager has a clear view of the application’s performance through a set of critical KPIs: error count and transaction duration. Since the company agrees on SLAs with its customers, the application manager needs to be able to meet them. Monitoring the response time per customer helps him:

  • confirm there is an issue if the customer calls (by the way, the customer names are randomly generated)
  • first and foremost, anticipate the call and solve the issue

Business Analytics

What about a business manager? What information would he like to see?

Hmmm, probably:

  • For each transaction, the company gets a percentage of the transaction amount, so he would probably be interested in the total amount in real time
  • The cities or customers that generate the biggest revenues
  • The revenue evolution during the day, as well as a prediction of where we’ll stand at the end of the day compared with our target

“The value of the transaction was not used for the Application Performance use case, does it mean we need to reindex our data to extract this field?!”

Of course not … Do you remember “Schema on the fly”? This is exactly its purpose: selecting the fields you want to extract from the raw data at search time. You don’t need to know, before indexing the data, all the different use cases you will address with it.

If it isn’t configured already, just set up the extraction of the amount value (“montant” in French) and build your dashboard (remember? just a few clicks)!
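As a hedged illustration of that search-time extraction, the sketch below assumes the amount appears as a montant=<value> pair in the raw event (the real pattern depends on your log format) and reuses the hypothetical sourcetype and lookup names from the earlier sketch:

sourcetype=pos_transactions
| rex "montant=(?<montant>[0-9.]+)"
| lookup pos_locations ip OUTPUT city
| stats sum(montant) AS total_amount by city
| sort -total_amount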

By the way, you heard right, I said “prediction”! Splunk Core has a standard command that leverages predictive algorithms: the predict command. Since last September, we also have a dedicated application called the “ML Toolkit” that can be used to explore data, and to fit and apply advanced statistical models. This application is free and can be downloaded here.
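A minimal sketch of such a prediction with the built-in predict command, again assuming the hypothetical montant extraction above, could look like this:

sourcetype=pos_transactions
| rex "montant=(?<montant>[0-9.]+)"
| timechart span=15m sum(montant) AS revenue
| predict revenue future_timespan=12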

business_dashboard

The Business Manager now has these KPIs in real time and can react quickly. Depending on your business, reacting in real time can be a game changer, as it is for Domino’s Pizza.

If your prediction tells you that you won’t reach your daily targets, there’s still time to see why and troubleshoot as soon as possible. Linking this business under-performance to a potential IT root cause can be done in a few clicks within Splunk.

Security

Finally, since these data are quite sensitive, a security analyst might want to monitor the transactions to see if there is an abnormal use of the service. What would be interesting for him?

Probably a single user making several payments within the same day at points of sale located in different regions of France. That would potentially mean their card has been compromised or duplicated.

OK, let’s build this dashboard! Just list all the cards (represented as custId in the raw data) used in different locations within the same day, and let the security officer drill down and interact to investigate and confirm whether the behavior is abnormal or not.
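A sketch of the underlying search, reusing the hypothetical pos_locations lookup for the location and assuming custId is already extracted, might look like this:

sourcetype=pos_transactions
| lookup pos_locations ip OUTPUT city
| bucket _time span=1d
| stats dc(city) AS nb_cities values(city) AS cities count AS nb_payments by custId _time
| where nb_cities > 1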

“Interactions in your dashboard?”

Yes, and it’s really powerful: you can very simply introduce interactions into your Splunk dashboards. Drill down by clicking, selecting, or even filling in a free-text area.

security_dashboard

The security analyst can now list all the potentially fraudulent transactions, then deep dive and analyze them to confirm the fraud before calling the card holder. That’s great, but it means the analyst would have to stay in front of his computer all the time … He probably has more value to deliver through other tasks! That’s Splunk’s next level of Operational Intelligence: being proactive. Splunk keeps watch for specific patterns, trends, and thresholds in your machine data so you don’t have to, and then sends you alerts on your favorite alerting channel (email, ServiceNow ticket, RSS feed, SNMP trap, etc.).
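For example, a saved search along these lines (the status field, the time window and the threshold are all assumptions) could be scheduled every few minutes and configured to alert whenever it returns results:

sourcetype=pos_transactions status=error earliest=-15m
| stats count AS nb_errors
| where nb_errors > 50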

That’s it: one piece of raw data, three different use cases! That’s Splunk.

If you have ideas, questions or feedback, tweet me @1rom1


Smart AnSwerS #66


Hey there community and welcome to the 66th installment of Smart AnSwerS.

Splunk HQ now has an open room with giant Lego-like blocks for Splunkers to take a creative escape from the daily grind. Some folks have already constructed some pretty epic stuff. In the first week, someone built a “conference room” with a fully functional table and bench seating that could be used for gaming, eating lunch, and quick meetings…possibly. When I checked out the space again last week, there was a 30+ foot long bridge and some sort of igloo maze fort. Who knows what architectural feats people will come up with next!

Check out this week’s featured Splunk Answers posts:

Splunk Add-on for Blue Coat ProxySG: Why am I getting error “Page not found!” trying to launch the app?

gdavid was trying to figure out why he was getting this error trying to display the UI for the Splunk Add-on for Blue Coat ProxySG. Senior technical writer rpille pointed out that this is an add-on, not an app, clarifying that the purpose of the add-on is to help parse and index data. There are no UI or setup configuration screens; however, she provided documentation on how to access prebuilt panels that are included for some add-ons that can be added to existing dashboards.
https://answers.splunk.com/answers/410859/splunk-add-on-for-blue-coat-proxysg-why-am-i-getti.html

Why is Splunk not parsing JSON data correctly with my current sourcetype configuration?

rusty009 wanted to index JSON objects in Splunk and configured a sourcetype for this, but the indexed data was not parsed as expected. After taking a look at rusty009’s configuration, Yorokobi identified a couple of issues and recommended several best practices on sourcetype naming, stanza syntax, and the relevant props.conf settings to place on the universal forwarder, search head, and indexers.
https://answers.splunk.com/answers/387602/why-is-splunk-not-parsing-json-data-correctly-with.html

Why is date_hour inconsistent with %H?

yuanliu read in documentation that the date_* fields are extracted from an event’s timestamp, but found that the automatically extracted date_hour field was inconsistent with the %H value extracted using the eval function strftime. sideview and lguinn teamed up to explain the instances where these inconsistencies can occur and to promote the best practice of not relying on date_* fields for accuracy in time-based reporting. Instead, they recommend extracting specific time fields using strftime.
https://answers.splunk.com/answers/387130/why-is-date-hour-inconsistent-with-h.html

Thanks for reading!

Missed out on the first sixty-five Smart AnSwerS blog posts? Check ‘em out here!
http://blogs.splunk.com/author/ppablo

DevOps Metrics: Measuring Team Productivity – Yes or No?


In my last blog post, I talked about the importance of measuring the business impact of DevOps-driven application delivery. At the DevOpsDays Seattle Open Space discussion on metrics, we also explored measuring DevOps teams’ performance and people productivity. I was glad to see that Nancy Gohring from 451 Research joined our session (check out her insights). Below are some of the key highlights from that Open Space discussion.

For DevOps leaders, knowing whether DevOps teams are making progress toward meeting their organizational goals is important. Often these teams seem to have conflicting objectives. And since DevOps practice involves a cultural shift, our discussion concluded that it is crucial for Dev and Ops teams to collaborate and define what data should be logged so that both sides can understand whether they are on track. This collaboration is even more relevant since Ops is often measured by service availability, while Dev teams are marching toward generating new features and driving new revenue.

IMG_4772

One participant in our discussion reported that for one DevOps team, once they started tracking the number of wake-up calls due to IT incidents, they managed to reduce the overall number of open tickets. With data-driven insights, they were able to tie the wake-up call to the piece of code and to the developer that committed that code.

And based on another IT practitioner’s experience, once DevOps teams adopted measuring QA test success/fail rates and tied them to code coverage, they were able to meet both internal and external customer SLAs. They were also able to understand whether those SLAs were correlated with the size and behavior of their technical debt over time.

We also discussed tracking the number of closed tickets per person, open ticket durations, and the number of story points for understanding app delivery velocity. These metrics should be analyzed in more detail and standardized across the organization. For instance, if one measures an IT team’s productivity, the complexity of tickets and story points should be included in the analysis.

Some folks reported that on occasion when people realized their productivity was measured, there were examples of “gaming the system”. Those included creating long readme pages instead of contributing to the actual code, or when measuring the number of bugs per developer, people started to be defensive and even question the nature of a particular bug. Is this a bug or perhaps a feature? Sound familiar?

When measuring productivity and performance, DevOps leads need to analyze how insights gleaned from data will be used. In real DevOps spirit, measuring people’s performance should be used for continuous improvement, allocating appropriate resources, and fostering increases in productivity or other goals, rather than for “punishing” or “shaming” people.

So the real question is not whether you should measure people’s or a team’s productivity. Rather, it’s about how you are using the data obtained from the analysis. Without data, you have no insight and no way to track progress. So to end on a philosophical note: it’s not about the tool, but how we humans are using it.
What DevOps metrics are you measuring? I am looking forward to hearing from you!
You can reach me on Twitter:

Smart AnSwerS #67


Hey there community and welcome to the 67th installment of Smart AnSwerS.

For folks who will be in the San Francisco Bay Area the first full week of July, you’re all welcome to join us at the SFBA User Group meeting on Wednesday, July 6th @ 6:30PM PDT. chuckers has graciously offered to host at Comcast in Sunnyvale, CA where we’ll be hearing some interesting talks by watkinst from Mastercard and Splunk Senior Director of Product Management, Gaurav Agarwal. If you can make it, be sure to visit the SFBA User Group page to RSVP!

Check out this week’s featured Splunk Answers posts:

What happens to my multisite indexer cluster when connectivity between sites dies?

davidpaper shares this question and answer to educate the community on what exactly happens with replication when connection between sites is lost in a multisite indexer cluster. He explains the difference between inter-indexer and forwarder acknowledgement and how it relates to a disaster recovery scenario, making for a very informative read.
https://answers.splunk.com/answers/390978/what-happens-to-my-multisite-indexer-cluster-when.html

What are best practices for handling data in a Splunk staging environment that needs to go to production?

jtacy had end users from different teams that wanted to search non-production data and wanted to get community input on different approaches for getting this data to production. Lucas K recommends making use of distributed search groups which would allow users to choose between different data sources from a single set of search heads. He shows a simple example configuration for distsearch.conf to show how this setup works.
https://answers.splunk.com/answers/389806/what-are-best-practices-for-handling-data-in-a-spl.html

How can I get the latitude and longitude range when I click on map markers and use those values for a drilldown to a panel in the same dashboard?

Javip was using the Cluster Map visualization on a dashboard and had working XML to create tokens for latitude and longitude values when clicking in the map, but needed a range of values instead for filtering table results. ziegfried gives an excellent solution with sample XML to meet this requirement, introducing Javip to a different set of tokens to use that denote the bounds of the cluster.
https://answers.splunk.com/answers/391998/how-can-i-get-the-latitude-and-longitude-range-whe.html

Thanks for reading!

Missed out on the first sixty-six Smart AnSwerS blog posts? Check ‘em out here!
http://blogs.splunk.com/author/ppablo

Spotting the Adversary… with Splunk


Howdy, y’all. Eventually there is a Rubicon to cross in every security professional’s life. With a satisfied sigh he’ll take a step back from the keyboard, wipe Dorito-dust-covered hands on khakis, take a long slug of Mountain Dew, gaze proudly at his Splunk instance, and utter the words “I’ve added all the data sources I can. The network is being ‘monitored’”. Then the smile will falter as his cyber demons claw their way up to the surface. He’ll hear them scream out “but WHAT am I supposed to look for??” He (and you) are not alone. Ever since time immemorial (or at least since I first began “practicing” the dark arts of cyber security) I have heard the question “but what should I look for?” [1]. Collecting data is great, but if you aren’t using it to find baddies, it’s just expensively corralled bits and bytes. In this brief missive, I’m going to tell you about a great document from the NSA that can set you on the road to “Adversary Detection” nirvana with Microsoft logs, and then give you a sample panel with ways to look at that data to get you on your way.

With that in mind, I thought I would take a moment and possibly give out some “Splunkspiration™” that I have shared several times over the last year but never committed to paper. For those of you who are not clued in by the vague title, the NSA IA team released a great white paper several years ago, recently updated, called “Spotting the Adversary with Windows Event Log Monitoring” [2]. I highly recommend reading the entire white paper, but my favorite section is on page 8, where these beautiful IA nerds at the NSA provide a table of all the event codes that they find interesting for detecting baddies in your Windows network.

spotting_adversary_table

 

So for those of you who are wondering what the hell to look for in those millions of Windows logs… these Event IDs are a great place to start!!! Since I’ve already done some of the work, I decided to share a little Splunk panel below that groups all of these Event IDs into selections based on the descriptions above. This assumes that you have your GPOs set up right to get these logs (see the aforementioned NSA document), that you have the Windows Infrastructure app installed, and that your CIM setup is correct. I’m not gonna lie to you: this panel is not going to find APT1337 right off the bat, but it should give you that bit of “Splunkspiration™” to go out there, make some correlation rules, add some saved searches, or go crazy on the SPL à la Splunker David Veuve, and get into the APT-squashing business.

spotting_adversary_panel

 I also HIGHLY recommend taking a peek at a talk (https://conf.splunk.com/session/2015/conf2015_MGough_MalwareArchaelogy_SecurityCompliance_FindingAdvnacedAttacksAnd.pdf) from .conf2015 given by Michael Gough that goes over very similar material! [3]

I hope this helps and as always… Happy Hunting :-)

    <panel>
      <input type="dropdown" token="nsasearch" searchWhenChanged="true">
        <label>Interesting Event IDs</label>
        <choice value=" source=WinEventLog:application">Executables</choice>
        <choice value=" source=WinEventLog:Security EventCode=4624 OR EventCode=4625 OR EventCode=4648 OR EventCode=4728 OR EventCode=4732 OR EventCode=4634 OR EventCode=4735 OR EventCode=4740 OR EventCode=4756">Account and Group Activities</choice>
        <choice value=" source=WinEventLog:Application EventCode=1000 OR EventCode=1002">Application Crashes and Hangs</choice>
        <choice value=" source=WinEventLog:Application EventCode=1001">Windows Error Reporting and BSOD</choice>
        <choice value=" source=WinEventLog:Application EventCode = 1005 OR EventCode = 1006 OR EventCode = 1008 OR EventCode = 1010 OR EventCode = 2001 OR EventCode = 2003 OR EventCode = 2004 OR EventCode = 3002 OR EventCode = 5008">Windows Defender Errors</choice>
        <choice value=" source=WinEventLog:Application EventCode = 3001 OR EventCode = 3002 OR EventCode = 3003 OR EventCode = 3004 OR EventCode = 3010 and 3023">Windows Integrity Errors</choice>
        <choice value=" source=WinEventLog:Application EventCode = 1 OR EventCode = 2">EMET Crash Logs</choice>
        <choice value=" source=WinEventLog:Security EventCode = 2004 OR EventCode = 2005 OR EventCode = 2006 OR EventCode = 2009 OR EventCode = 2033">Windows Firewall Logs</choice>
        <choice value=" source=WinEventLog:Application EventCode = 2 OR EventCode = 19">MSI Packages Installed</choice>
        <choice value=" source=WinEventLog:System EventCode = 7022 OR EventCode = 7023 OR EventCode = 7024 OR EventCode = 7026 OR EventCode = 7031 OR EventCode = 7032 OR EventCode = 7034">Windows Service Manager Errors</choice>
        <choice value=" source=WinEventLog:System EventCode = 1125 OR EventCode = 1127 OR EventCode = 1129">Group Policy
Errors</choice>
        <choice value=" source=WinEventLog:Application EventCode = 865 OR EventCode = 866 OR EventCode = 867 OR EventCode = 868 OR EventCode = 882 OR EventCode = 8003 OR EventCode = 8004 OR EventCode = 8006 OR EventCode = 8007">AppLocker and SRP Logs</choice>
        <choice value=" source=WinEventLog:System EventCode = 20 OR EventCode = 24 OR EventCode = 25 OR EventCode = 31 OR EventCode = 34 OR EventCode = 35">Windows Update Errors</choice>
        <choice value=" source=WinEventLog:System EventCode = 1009">Hotpatching Error</choice>
        <choice value=" source=WinEventLog:Security EventCode = 5038 OR EventCode = 6281 OR EventCode = 219">Kernel Driver
and Kernel Driver Signing Errors</choice>
        <choice value=" source=WinEventLog:System EventCode = 104 and 1102">Log Clearing</choice>
        <choice value=" source=WinEventLog:System EventCode = 7045">Windows Service Installed</choice>
        <choice value=" source=WinEventLog:Application EventCode = 800 OR EventCode = 903 OR EventCode = 904 OR EventCode = 905 OR EventCode = 906 OR EventCode = 907 OR EventCode = 908">Program
Inventory</choice>
        <choice value=" source=WinEventLog:Security EventCode = 8000 OR EventCode = 8001 OR EventCode = 8002 OR EventCode = 8003 OR EventCode = 8011 OR EventCode = 10000 OR EventCode = 10001 OR EventCode = 11000 OR EventCode = 11001 OR EventCode = 11002 OR EventCode = 11004 OR EventCode = 11005 OR EventCode = 11006 OR EventCode = 11010 OR EventCode = 12011 OR EventCode = 12012 OR EventCode = 12013">Wireless
Activities</choice>
        <choice value=" EventCode = 43 OR EventCode = 400 OR EventCode = 410">USB Activities</choice>
        <choice value=" source=WinEventLog:System EventCode = 307">Printing Activities</choice>
        <default> source=WinEventLog:application</default>
      </input>
      <html>
 These "interesting" Events are selected from the NSA's guide
to <a href="https://www.iad.gov/iad/library/reports/spotting-the-adversary-with-windows-event-log-monitoring.cfm">spotting
the adversary</a> 
      </html>
<table>
        <title>Event ID's of Interest</title>
        <search>
          <query>$nsasearch$ | table _time  EventCode Message ComputerName</query>
          <earliest>-7d@h</earliest>
          <latest>now</latest>
        </search>
<option name="list.drilldown">full</option>
<option name="list.wrap">1</option>
<option name="maxLines">5</option>
<option name="raw.drilldown">full</option>
<option name="rowNumbers">false</option>
<option name="table.drilldown">all</option>
<option name="table.wrap">1</option>
<option name="type">list</option>
<option name="wrap">true</option>
<option name="dataOverlayMode">none</option>
<option name="drilldown">cell</option>
<option name="count">7</option>
      </table>
    </panel>

[1]
And to be fair, I asked the question more than my fair share back when I was less experienced, less neckbeardy, and significantly less horizontally fleshed

[2]
https://www.iad.gov/iad/library/reports/spotting-the-adversary-with-windows-event-log-monitoring.cfm
[3]
https://conf.splunk.com/session/2015/conf2015_MGough_MalwareArchaelogy_SecurityCompliance_FindingAdvnacedAttacksAnd.pdf

Supporting a cycling world record attempt using analytics


It’s not every day that you get to be involved in a record attempt, but Splunk is currently supporting a team of four cyclists in their quest to break the world record for a team cycling from the West coast to the East coast of the US.

The Race Across America (RAAM) www.raceacrossamerica.org is an annual cycle race involving teams of four riders. The total race distance is 3,070 miles and it involves 55 stages between a series of waypoints – fixed coordinates on a route that starts at Oceanside, California and ends at Annapolis, Maryland.

The stages themselves can vary dramatically in length, terrain and altitude change. The weather conditions and wind speed will have a significant impact on the time it takes to complete the race and therefore the likelihood of any team breaking the record. Fair weather and favourable wind conditions will mean that the record attempt could be on!

The team can select any rider for every stage, so it is constantly evaluating which rider is best placed to ride the next stage. To help them select the right rider for each stage, the Splunk Business Analytics and IoT team have built a model using our Machine Learning Toolkit that predicts how long it will take each rider to complete each stage, based on their training data.

(For the more geeky among you, our model used linear regression to predict the cumulative distance travelled for any rider in the team in a fixed time, based on the terrain that they will encounter during the stage.)
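To give a flavour of how such a model can be built with the Machine Learning Toolkit, here is a minimal sketch using its fit command. The sourcetype, the field names (elapsed_time, elevation_gain, gradient, cumulative_distance) and the model name are purely illustrative assumptions, not the team’s actual data model:

sourcetype=rider_training rider=rider_1
| fit LinearRegression cumulative_distance from elapsed_time elevation_gain gradient into rider_1_distance_model

Applying the saved model to the profile of an upcoming stage (| apply rider_1_distance_model) then yields a predicted cumulative distance for that rider, which can be compared across the team.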

Below is example output from the model. It shows how far riders 1 and 2 will travel at the start of the race, assuming that they each ride for 30 mins.

RAAM

During the race, each of the four riders in the team has also been completing questionnaires after every stage to provide feedback on their health (heart rate, temperature, blood pressure), their recovery between stages (nutrition, thirst, sleep quality), and any injuries (soreness, fatigue, any treatment given).

The race team have been using the Splunk model in combination with data from the rider questionnaires to refine their strategy during the race. This has ensured that the team has the best possible chance of breaking the record.

So, how did they get on? Well, we’ll be able to let you know whether they broke the world record or not in a couple of days. In the meantime, you can follow the team’s progress here (team name RAAMIN4CHARITY):

http://www.uk.logicalis.com/solutions-and-services/information-insights/race-across-america/daily-dashboard/

Smart AnSwerS #68


Hey there community and welcome to the 68th installment of Smart AnSwerS.

It’s the week of LGBT Pride in San Francisco, so SplunQers and fellow allies came together yesterday afternoon for our second party ever in the new building at HQ. The courtyard was set up with rainbow themed decorations, treats, and libations (of course) to celebrate the many identities that make up the diversity of our company. The turnout was amazing as we filled the courtyard with lively energy and blaring music in true Splunk fashion. Big thanks to the SplunQers, Fun Council, and Facilities for organizing and promoting an open culture.

Check out this week’s featured Splunk Answers posts:

How to speed up LDAP / Active Directory searches, specifically Asset or Identity lookups?

SplunkTrust member rich7177 is always finding nifty solutions to make things more efficient in his environment, and when he finds one that could benefit the greater Splunk community, he generously shares his knowledge. Find out how he improved the speed of an Active Directory search for asset and identity lookups from 400 seconds to 40 seconds.
https://answers.splunk.com/answers/400373/how-to-speed-up-ldap-active-directory-searches-spe.html

Why are the counts inconsistent for metadata under Data Summary after using Delete?

jakewalter used the delete command to remove some data from being searchable, but didn’t understand why the metadata under Data Summary in the Search & Reporting app showed a count that sometimes included the deleted events, but other times not. Just over 1.5 years later, somesoni2 highlighted a snippet from documentation that explains this inconsistency, clearing up a common misunderstanding brought up on Answers that more folks should be aware of – an event’s metadata is still searchable until it has gone past its retention period.
https://answers.splunk.com/answers/187032/why-are-the-counts-inconsistent-for-metadata-under.html

Punct… good god ya’ll – what is it good for?

jplumsdaine22 had ANNOTATE_PUNCT disabled in his Splunk deployment for several years to save disk space, but was thinking of turning it on since more resources had become available. He was curious to know the benefits of the punct field, and how to estimate the disk space and performance impact if he re-enabled the setting. jkat54 shared his experience using punct to find anomalies in data, and suggested using a Splunk search to calculate the number of bytes used based on the number of characters the field adds per event.
https://answers.splunk.com/answers/399814/punct-good-god-yall-what-is-it-good-for.html

Thanks for reading!

Missed out on the first sixty-seven Smart AnSwerS blog posts? Check ‘em out here!
http://blogs.splunk.com/author/ppablo

Eureka! Extracting key-value pairs from JSON fields


With the rise of HEC (and with our new Splunk logging driver), we’re seeing more and more of you, our beloved Splunk customers, pushing JSON over the wire to your Splunk instances. One common question we keep hearing is: how can key-value pairs be extracted from fields within the JSON? For example, imagine you send an event like this:

{"event":{"name":"test", "payload":"foo=bar\r\nbar=\"bar bar\"\tboo.baz=boo.baz.baz"}}

This event has two fields, name and payload. Looking at the payload field, however, you can see that it contains additional fields as key-value pairs. Splunk will automatically extract name and payload, but it will not look further into payload to extract the fields within. That is, not unless we tell it to.

Field Extractions to the rescue

Splunk allows you to specify additional field extractions at index or search time which can extract fields from the raw payload of an event (_raw). Thanks to its powerful support for regexes, we can use some regex FU (kudos to Dritan Btincka for the help here on an ultra compact regex!) to extract KVPs from the “payload” specified above.

Setup

To specify the extractions, we will define a new sourcetype httpevent_kvp in %SPLUNK_HOME%/etc/system/local/props.conf by adding the entries below. This regex uses negated character classes to specify the key and values to match on. If you are not a regex guru, that last statement might have made you pop a blood vessel :-)

[httpevent_kvp]
KV_MODE=json
EXTRACT-KVPS = (?:\\[rnt])?(?<_KEY_1>[^="\\]+)=(?:\\")?(?<_VAL_1>[^="\\]+)

Next, configure your HEC token to use the httpevent_kvp sourcetype; alternatively, you can set the sourcetype in your JSON when you send your event.

Restart your Splunk instance, and you’re ready to test.

Testing it out

We’ll use curl to test if the new sourcetype is working.

curl -k https://localhost:8088/services/collector -H 'Authorization: Splunk 
16229CD8-BB6B-449E-BA84-86F9232AC3BC' -d '{"event":{"name":"test",
"payload":"foo=bar\r\nbar=\"bar bar\"\tboo.baz=boo.baz.baz"}}'

Heading to Splunk, we can see that the foo, bar and boo.baz fields were properly extracted as interesting fields.

Kvp interesting fields

Now heading to “All Fields” we can select each of the new fields.

Kvp select fields

And then see the values magically show up!

Kvp fields

Using this approach will allow you to easily extract KVPs residing within the values of your JSON fields. This is useful when using our Docker logging driver, and for general cases where you are sending JSON to Splunk.
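Once the extractions are in place, the new fields behave like any others. As a quick sanity check (the values below simply match the test event sent earlier):

sourcetype=httpevent_kvp foo=bar
| table _time name foo bar boo.baz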

 


Splunking a Microsoft Word document for metadata and content analysis


The Big Data ecosystem is nowadays often summarized with ‘V’s: the 3Vs of Big Data, the 4Vs of Big Data, even the 5Vs of Big Data! However many ‘V’s are used, two are always dedicated to Volume and Variety.

Recent news provides particularly rich examples with one being the Panama Papers. As explained by Wikipedia:

The Panama Papers are a leaked set of 11.5 million confidential documents that provide detailed information about more than 214,000 offshore companies listed by the Panamanian corporate service provider Mossack Fonseca. The documents […] totaled 2.6 terabytes of data.

This leak illustrates the following pretty well:

  • The need to process huge volume of data (2.6 TB of data in that particular case)
  • The need to process different kinds of data (emails, database dumps, PDF documents, Word documents, etc.)

So, let’s see what we could do to Splunk a Word document!

 

A Word document is a Zip file!

As illustrated by the results of the Linux file command, a Word document is a Zip archive.


# file document.docx
document.docx: Zip archive data, at least v2.0 to extract
#

Since Splunk is able to uncompress Zip files to read the logs they contain, let’s see what happens if we try to Splunk a Word document as is.

MS Word - 001

Pretty ugly. Unfortunately, Splunk 6.4 will only produce unintelligible results, as illustrated by the above screenshot, because it cannot index a Word document without prior preprocessing.

 

Word document format

The XML representation of Word documents was introduced by Microsoft with Word 2003, and it has since evolved into a multi-file representation (aggregated under the now familiar .docx extension). Because no functionality was lost in moving from a binary to an XML representation, the produced XML files can be intimidating: they contain a lot of information that relates not to the actual content of the file, but to the presentation of that content.

A Microsoft Word 2007 file format consists of a compressed ZIP file, called a package, which contains three major components:

  • Part items, the actual files
  • Content type items, the description of each part item (ex: file XYZ is an image/png)
  • Relationship items, which describe how everything fits together.

Readers expecting a complete and precise description of the format of a Word 2007 document are invited to go through the Walkthrough of Word 2007 XML Format from Microsoft.

 

Uncompress & Index

After using the regular unzip command to extract the files from the docx package into a directory named “document”, the listing of the files is as follows:


# find document/ -type f | sort
document/[Content_Types].xml
document/customXml/item1.xml
document/customXml/itemProps1.xml
document/customXml/_rels/item1.xml.rels
document/docProps/app.xml
document/docProps/core.xml
document/docProps/thumbnail.jpeg
document/_rels/.rels
document/word/document.xml
document/word/fontTable.xml
document/word/media/image1.emf
document/word/media/image2.emf
document/word/media/image3.emf
document/word/media/image4.png
document/word/media/image5.png
document/word/media/image6.png
document/word/media/image7.png
document/word/numbering.xml
document/word/_rels/document.xml.rels
document/word/settings.xml
document/word/stylesWithEffects.xml
document/word/styles.xml
document/word/theme/theme1.xml
document/word/webSettings.xml
#

As we can see, many of the files are XML, i.e., flat ASCII files that Splunk can ingest. To ingest that directory, a custom sourcetype has been created with the TRUNCATE property set to 0 (props.conf):

TRUNCATE = 0

The TRUNCATE option is required to make sure Splunk completely indexes all the files (except the binary ones like images; see the NO_BINARY_CHECK option for those).

After ingesting the whole directory, here is how one event looks in Splunk:

MS Word - 002

The resulting events are more user friendly, but not really operationally exploitable yet.

 

Content Types

At the root of our document directory, the file [Content_Types].xml contains the content type specifications. As this is a flat XML file, we can parse it with the Splunk spath command to visualize what kinds of content we have in our Word document, as illustrated by the following screenshot. In that example, we have two kinds of data: XML files and images.

 

The MSDN walkthrough details the construction of that file:

  • A typical content type begins with the word application and is followed by the vendor name.
  • The word vendor is abbreviated to vnd.
  • All content types that are specific to Word begin with application/vnd.ms-word.
  • If a content type is a XML file, then the URI ends with +xml. Other non-XML content types, such as images, do not have this addition.
  • etc…

 

So, using regular Splunk-fu, we can parse our content type file to get access to more usable fields:

MS Word - 003

The search is detailed hereafter:

source="*[Content_Types].xml"
| spath input=_raw
| rename Types.Override{@ContentType} AS ContentType Types.Override{@PartName} AS PartName
| fields PartName ContentType
| eval data = mvzip(ContentType, PartName)
| mvexpand data
| eval tmp = split(data, ",")
| eval ContentType = mvindex(tmp, 0)
| eval PartName = mvindex(tmp, 1)
| eval tmp=split(ContentType, "/")
| eval family_type=mvindex(tmp,0)
| eval part2=substr(ContentType,len(family_type)+2)
| rex field=part2 "vnd\.(?<vendor>[^.$]+)"
| eval part3=substr(part2, len(vendor)+6)
| eval isXML = if(match(part3, "\+xml$"),"Yes", "No")
| eval filetype = if(match(part3, "\+xml$"),substr(part3, 0, len(part3)-4), part3)
| table PartName family_type vendor isXML filetype ContentType
| sort PartName

 

Document Properties (Word metadata)

Two very interesting files exist within a Word 2007 package: core.xml and app.xml, in the docProps directory. A simple parse with the Splunk spath command can give us insight into the author of the document, the creation time, the modification time, the number of pages, the system on which the document was created, the number of characters, etc.

core.xml

MS Word - 004
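As a sketch, a search along these lines pulls the main metadata out of core.xml; the exact spath field names depend on the XML namespace prefixes used in the document, so treat them as assumptions to adjust against your own data:

source="*docProps/core.xml"
| spath
| rename cp:coreProperties.dc:creator AS author, cp:coreProperties.dcterms:created AS created, cp:coreProperties.dcterms:modified AS modified
| table source author created modified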

app.xml

MS Word - 005

 

Revision IDs (RSID)

To dive deeper into the actual content of such a file, one key mechanism to understand about Word documents is revision identifiers (RSIDs). It’s very well explained here:

Every time a document is opened and edited a unique ID is generated, and any edits that are made get labeled with that ID. This doesn’t track who made the edits, or what date they were made, but it does allow you to see what was done in a unique session. The list of RSIDS is stored at the top of the document, and then every piece of text is labeled with the RSID from the session that text was entered.

 

Practically speaking, this leads to something like this:

MS Word - 006

Note that the sentence in the analyzed Word document was “When a notable event is raised, a security analyst needs […] or identities. This manual task […]”.

Clearly, a lot of noise surrounds the real content of the document (this “noise” is there on purpose, but that level of detail isn’t appropriate in our case because we just want access to the words composing the document).

 

Accessing the content of the Word document

As the content is actually XML, it could be parsed the same way as the previous files with the Splunk spath command.

MS Word - 007

 

The problem with this method is that, first, some words or sentences are cut in the middle, and second, we need to know the exact path in the XML tree (here, <w:p><w:r><w:t> under the root <w:document><w:body>).

However, we know for sure that the actual content for the file will be within the boundaries <w:body>. The idea becomes then to extract the content within those boundaries, and remove the XML tags.

MS Word - 008

 

The Splunk search is presented hereafter. The result is one field containing the whole content of the file as illustrated above.

source="*/document.xml"
| rex field=_raw "\<w:body\>(?<wbody>.+)\</w:body\>"
| fields wbody
| rex field=wbody mode=sed "s/\<[^>]+\>/ /g"
| table wbody

That’s more practical, but what about searching for a term within that document, which is now basically contained in one single field?

One trick is to split the field into multiple values based on the punctuation. The output is similar to the first approach with the spath command, the big difference being that words are not cut in the middle!

MS Word - 009

source="*/document.xml"
| rex field=_raw "\<w:body\>(?<wbody>.+)\</w:body\>"
| fields wbody
| rex field=wbody mode=sed "s/\<[^>]+\>/ /g"
| rex field=wbody mode=sed "s/[[:punct:]]/#/g"
| eval wb = split(wbody, "#")
| mvexpand wb
| table wb

From there, we can easily search for simple terms by appending the following to the above search:

| search wb = "*notable*"

In this example, the word “notable” will be searched across the entire document.

 

 

Conclusion

This article only scratches the surface of the Microsoft Word 2007 format (now a worldwide standard under the references ECMA-376 and ISO/IEC 29500), and does not cover core components like relationship items, for example. While it is technically possible to Splunk Word documents, it is not an easy task and, as illustrated above, the result is operationally limited.

Now, one question remains: what are your use cases for such a feature? (-:

 

Smart AnSwerS #69


Hey there community and welcome to the 69th installment of Smart AnSwerS.

Time has been flying by with Splunkers working incredibly hard and adapting to new changes in our office space. It’s hard to believe that we’re halfway through 2016 already, but that’s what happens when you’re constantly focused and pushing through the daily grind. Luckily, HQ and other Splunkers in the US are getting a nice 5 day Summer break starting tomorrow for the 4th of July weekend. This is our chance in the middle of the year to refresh and recharge before finishing off strong with the next couple quarters ahead. Cheers!

Check out this week’s featured Splunk Answers posts:

How to add upper and lower boundaries to a sparkline?

Dohrendorf_Consist was using a sparkline to display a bar graph based on percentage values, and needed to know how to set fixed upper and lower boundaries in Simple XML to fix an issue with the visualization. A constant value of 100 was not displayed as expected as there were no changes or parameters set to determine the range. After a day’s research, Dohrendorf returned to answer the question with chartRangeMin and chartRangeMax, two undocumented options that rendered the sparklines properly. Senior Technical Writer frobinson caught wind of this Q&A and added the two parameters to the sparkline options in the Simple XML Reference documentation.
https://answers.splunk.com/answers/319548/how-to-add-upper-and-lower-boundaries-to-a-sparkli.html

When adding an indexer to a distributed environment, is there a configuration that makes indexers exchange events to auto load balance them?

adamguzek wanted to know if indexers could exchange events to balance the load automatically when a new indexer is added to an existing distributed search environment instead of making configuration changes to all syslog sources and forwarders. SplunkTrust member dwaddle gives a very comprehensive and concise answer, explaining that non-clustered indexers are not aware of one another, and although indexers in a cluster are given knowledge of each other, it is only for replication, not migration. In both cases, changes would still need to be done on forwarders, however, the indexer discovery feature introduced in Splunk 6.3 allows the cluster master to be a single point of communication with forwarders to know which indexers to connect to.
https://answers.splunk.com/answers/423939/when-adding-an-indexer-to-a-distributed-environmen.html

Why are search results cut off at 10,000 in Splunk Web and 10,000 or 20,000 results via REST API?

sjodle was getting a limited number of events when searching a large data set in Splunk Web and through the REST API, but didn’t know what was causing this or how to return all results. SplunkTrustee woodcock pinpoints the sort command as the culprit since he has dealt with search command result limits in his past experience.
https://answers.splunk.com/answers/423870/why-are-search-results-cut-off-at-10000-in-splunk.html

Thanks for reading!

Missed out on the first sixty-eight Smart AnSwerS blog posts? Check ‘em out here!
http://blogs.splunk.com/author/ppablo

Gaining clarity: adding a visual line break between events


(Hi all–welcome to the latest installment in the series of technical blog posts from members of the SplunkTrust, our Community MVP program. We’re very proud to have such a fantastic group of community MVPs, and are excited to see what you’ll do with what you learn from them over the coming months and years.
–rachel perkins, Sr. Director, Splunk Community)


 

Hi, I’m Mark Runals, Lead Security Engineer at The Ohio State University, and member of the SplunkTrust.

If your experience is anything like mine, there have been times when you’ve put together a query that has found events of interest to you–only to have to spend extra time scanning back and forth within the results to make sure you’re looking at the same related events. For example, let’s say you have identified a series of suspect accounts that have logged in from a series of IP addresses, and you’ve sorted the results based on a combination of time and IP. As you scan down the time column and relate it back to the user list, it can be easy to miss that the IP has changed, or vice versa. It would be nice to add a visual break to quickly inform the reader where a block of events stops or starts. While this example won’t use a query as dramatic as the above scenario, hopefully the methodology is presented clearly enough that you can adapt it as needed.

Let’s start with a base set of data. In this case, I’m going to get the top 3 longest running searches from my Splunk instance. That query might look something like this:

index=_audit action=search info=completed | sort -total_run_time | dedup 3 user | table _time user total_run_time scan_count event_count | sort user -_time

runals-1

The screenshot is trimmed to only the first three users, and there are only three results to review. Hopefully, though, you can see how, because the user names are similar and the results include several columns that just contain numbers, your eye might not immediately pick up on key elements of the result set, like a shift in time or user, as it roves around the data, especially if dedup or other limiting commands had not been used.

One key to introducing a break is to decide how you will end up sorting the data. I will continue with sorting by user and then time with the most recent events, by user, first. The command we’re going to use here is appendpipe. Under ‘normal’ use this command can be used for things like providing a subtotal in a running list. It is similar in usage to eventstats but unlike eventstats will only provide 1 result vs having the result attached to each applicable event.

In our case what we want to do is place the line break between users, so I will be taking the oldest event and subtracting some additional time from it so that when sorted it will be the last entry per user. Remember that data in the _time field is really stored in epoch, so manipulating it can be done with simple mathematical equations. In our case, my query now looks like this:

index=_audit action=search info=completed | sort -total_run_time | dedup 3 user | table _time user total_run_time scan_count event_count | appendpipe [ stats min(_time) as _time by user | eval _time = _time - 10 ] | sort user -_time

Note: I’ve moved the sorting command after the appendpipe, and the ‘blank’ event line is timestamped 10 seconds earlier than the oldest real event per user, so it sorts below that user’s block.

runals-2

You could stop there but I’d like to add something like a short string with all dashes to really break one result set group from the other. To make things easier I’m going to use the foreach command. This will allow me to loop over all of my columns at once instead of an eval command for each:

index=_audit action=search info=completed | sort -total_run_time | dedup 3 user | table _time user total_run_time scan_count event_count | appendpipe [ stats min(_time) as _time by user | eval _time = _time - 10 ] | sort user -_time | foreach * [ eval <<FIELD>> = if(len(<<FIELD>>) >= 1, <<FIELD>>, "----") ]

runals-3

To really clean up the results you’d likely want to take care of the _time and user fields. You could do this with one foreach command and add the dash line string or maybe get a little fancy and in the _time field say something like “New User”. A lot of this depends on personal taste and the data itself. Remember, part of this effort is to let the eye quickly pass over the event break line and more quickly identify the blocks of related events. You don’t want to add too much visual clutter. Either way the end result might look something like this:

index=_audit action=search info=completed | sort -total_run_time | dedup 3 user | table _time user total_run_time scan_count event_count | appendpipe [ stats min(_time) as _time by user | eval _time = _time - 10 ] | sort user -_time | foreach * [ eval <<FIELD>> = if(len(<<FIELD>>) >= 1, <<FIELD>>, "----") ] | eval user = if(total_run_time = "----", "----", user ) | eval _time = if(total_run_time = "----", "---- New User ----", _time )

runals-4

Hopefully you can see the advantage to leveraging this capability and with the use of the foreach command you really aren’t adding a whole lot of length to your base query. Have you come up with other ways to introduce breaks in your data? Let us know!

How to Pick a Threat Intelligence Provider (kind of…)


Over my last two years-ish at Splunk I’ve been asked the question “Which threat intelligence feed should I purchase?” along with “what’s the deal with the viking helmet?” and “what’s up with the Star Wars theme at Threatconnect?” (ಠ_ಠ at you @wadebaker) on a more than regular basis. And like anyone who is trying to get out of a binary question, I would respond with “it depends…” and then I’d mumble something about “threat data”. Finally I’d sigh and say, “All joking aside… it depends”. I just didn’t have a great answer. Don’t get me wrong, I have personal preferences based on my experiences, but I tend to know threat intelligence providers who focus on nation-state adversaries. If you work for an organization that is worried about crime-ware, my $TI_VENDOR_OF_CHOICE may not be appropriate. For months I was stymied and worried about what advice to give. Thankfully those days of uncertainty are past. This post will describe ways to find the right threat intelligence provider for your organization and which ones work best with Splunk at the time of publication.

Last week I saw a talk by Rebekah Brown (follow her on twitter @PDXbek … definitely worth your time) at the SANS DFIR conference in Austin. Although the talk had a much broader focus on threat intelligence, creating your own (possibly much more valuable) threat intelligence internally, and intelligence models… there was one slide that really resonated with me on the subject of threat intelligence vendors. In the slide, Rebekah proposed 4 questions that organizations could use to evaluate external threat intelligence providers.

Where does the info come from?

What types of threat groups does it cover?

What types of information does it include?

Primary source or enrichment?

I think these are great questions for an organization to ask threat intelligence vendors when they are shopping. But what do they mean:

  • Where does the info come from?
    • Is the threat intelligence derived from hundreds of sensors around the world or only from five sensors in southern California? Are they doing reverse malware engineering or outsourcing that work to a third party? Have you ever heard of any of their researchers? Are they going into the “dark web”? These are important questions because where the data that generates threat intelligence comes from will often determine the quality of the report and the accuracy of the threat data for IOCs.
  • What types of threat groups does it cover?
    • If a threat intelligence vendor’s researchers are focused on nation-state APTs, then their threat intelligence will be great for a company that builds fighter jets. However, it may not be the best fit for a retail organization that is being targeted by crime-ware. Make sure that the focus of the TI vendor’s research matches the vertical that your organization operates in.
  • What types of information does it include?
    • Is the vendor providing MD5s or are they providing TTPs (Tactics, Techniques, and Procedures)? Is it just threat data (lists of IOCs), or is the provider creating government-style intelligence reporting? If an org just wants a block list, there may be little to no value in paying for a thirty-page document that shows Weibo photos of the adversary in high school. The customer should determine what level of info they truly need and only pay for access to data they are going to use.
  • Primary source or enrichment?
    • Is this just regurgitated/duplicated information from someone else’s threat feeds? Is it primary research by a reverse engineer with a team of linguists? Or is the vendor repackaging other people’s original content several weeks later? Original content costs more, but it may be of more timely value to a company than IOCs for infrastructure that is months out of date.

Finally, since I am a Computer Network Defense (CND) oriented type of guy, I think there is one more question that should be added to Rebekah’s list:

How will your threat intelligence integrate with my $SIEM/$ANALYTIC tool? [1]

  • If you can’t action this information or quickly search for IOCs, will it be of value to you? How difficult will it be to incorporate their data into your SIEM/Toolsets? Are they only producing reports or is there a data feed? Is it in STIX?

 

With that final question in mind I had a quick conversation with the ever-so-smart Splunker Kyle Champlin [2] and we created this table of known threat intelligence providers that have prebuilt integration with Splunk and Splunk ES Threat Intel Framework:

Vendor | App | ES Compatible
Webroot Brightcloud | https://splunkbase.splunk.com/app/1929/ | Not at this time
Anomali | https://splunkbase.splunk.com/app/1723/ | Not at this time
Kaspersky Threat Feed App | https://splunkbase.splunk.com/app/3176/ | Not at this time
Symantec Deepsight | https://splunkbase.splunk.com/app/1734/ | Update coming soon
Recorded Future | https://splunkbase.splunk.com/app/3127/ | Yes
Looking Glass | https://splunkbase.splunk.com/app/2820/ | Not at this time
Phishme | https://splunkbase.splunk.com/app/3071/ | Yes
iSight/FireEye | https://splunkbase.splunk.com/app/2764/ | Possible

Also special mention to these threat intelligence-sharing vendors listed below. They may not specifically generate threat intelligence but they do allow you to manage and/or share threat intelligence in trusted (or heck, even untrusted if you want) communities:

Vendor | App | ES Compatible
ThreatConnect | https://splunkbase.splunk.com/app/1929/ | Yes
Facebook Threat Exchange | https://splunkbase.splunk.com/app/1723/ | Yes
FS-ISAC | Out of the box with Enterprise Security | Yes

 

So in conclusion:

    • Where does the info come from?
    • What types of threat groups does it cover?
    • What types of information does it include?
    • Primary source or enrichment?

      And…

    • How will your threat intelligence integrate with my $SIEM/Analytic tool?

~~~

While it’s easy to be swayed by what is trending on Twitter at the moment or by who is speaking at DEF CON this year, take the time instead to thoughtfully consider these questions. You may be surprised by the answers. Thanks again and Happy Hunting ☺


[1]
Question inspired by the brilliantly Columbo-esque (in personality not physical stature) James Brodsky (http://blogs.splunk.com/author/jbrodsky/).

[2]
This whole table was co-created by the cheeky Kyle Champlin (http://blogs.splunk.com/author/kchamplin/)

Best Practices in Protecting Splunk Enterprise


Splunk Enterprise helps companies collect, analyze, and act upon the data generated by their technology infrastructure, security systems and business applications. Customers use Splunk software to achieve operational visibility into critical information technology assets and drive operational performance and business results.

Splunk Apps enhance and extend the Splunk platform and deliver a user experience tailored to typical tasks and roles. Most customers make use of one or more of the 1000+ Apps available in Splunkbase.

While end-users are the main consumers of Apps, App installation requires full administrator access. We strongly discourage customers from granting this access to any user other than designated administrators.

Beyond restricting admin privileges, we recommend adopting the standard deployment and operation practices described briefly below and detailed in the Splunk Enterprise documentation and Securing Splunk section.

Protect your Splunk Instance

  • Treat your Splunk administrator accounts like any other Administrator or root account in your network.
  • Make sure to change the default user name and password on all Splunk software components and set a minimum password length.
  • Use accounts other than root to run Splunk Enterprise and Universal Forwarders.
  • Use non-administrator accounts for normal daily tasks such as searching and reporting.
  • Configure individual host firewalls to the minimum necessary network exposure.
  • Configure SSL/TLS to protect critical network communication paths.
  • Inventory local accounts and role capabilities on all Splunk components and remove unnecessary users.

List Users: $SPLUNK_HOME/bin/splunk list user

List Roles: $SPLUNK_HOME/bin/splunk btool authorize list

  • Backup Splunk Enterprise configurations on a regular basis.

Monitor your Splunk Instance

  • Review administrative access using audit logs (a sample audit search follows this list).
  • Consider using third party file monitoring solutions such as Tripwire to regularly review changes in Splunk configuration.
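As a minimal sketch of that audit-log review (the exact field values in the _audit index can vary between Splunk versions, so treat them as assumptions to verify against your own data):

index=_audit action="login attempt" info=failed
| stats count AS failed_logins by user
| sort -failed_logins

A scheduled version of this search, restricted to your administrator accounts, is an easy way to keep an eye on who is attempting to access your Splunk instance.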

Splunk also has an Application Certification Program as part of Splunkbase.  Customers can choose to use only apps that have been reviewed for technical settings including security.

If you find or suspect a vulnerability in Splunk Enterprise, we’ll be glad to investigate! Let us know via the Splunk Security Portal or submission form.

Docker? Amazon ECS? Splunk? How they now all seamlessly work together


Today the Amazon EC2 Container Service (ECS) team announced they have added the Splunk native logging driver to the newest version of the ECS agent. This means it’s now easier to implement a comprehensive monitoring solution for running your containers at scale. At Splunk, we’re incredibly excited about this integration because customers running containers in ECS can now receive all the benefits of the logging driver, like better data classification & searching, support for flexible RBAC, and easy and scalable data collection built on top of the Splunk HTTP Event Collector (HEC).

The following is a guest blog post by David Potes, AWS Solutions Architect:

Monitoring containers has been somewhat of a challenge in the past, but the ECS team has been hard at work making it easy to integrate your container logs and metrics into key monitoring ecosystems. Recently, they have added native logging to Splunk in the latest version of the ECS agent. In this article, we’ll look at how to get this up and running and present a few examples of how to get greater insight into your Docker containers on ECS.

If you don’t already have Splunk, that’s OK! You can download a 60-day trial of Splunk, or sign up for a Splunk Cloud trial.

How It Works

Using EC2 Container Service (ECS)? The Splunk logging driver is now a supported option. You can set the Splunk logging driver in your Task Definition per container under the “Log configuration” section. All log messages will be sent to Splunk, giving you additional access control, a more secure transport, and additional data classification options for logs collected from your Docker ecosystem.

Not Using ECS? No problem!

You can configure Splunk logging as the default logging driver by passing the correct options to the Docker daemon, or you can set it at runtime for a specific container.

The receiver will be the HTTP Event Collector (HEC), a highly scalable and secure engine built into Splunk 6.3.0 or later. Our traffic will be secured by both a security token and SSL encryption. One of the great things about HEC is that it’s simple to use with either Splunk Enterprise or Splunk Cloud. There’s no need to deploy a forwarder to gather data, since the logging driver handles all of this for you.

Setting Up the HTTP Event Collector

The first step is to set up the HEC and create a security token. In Splunk, select Settings > Data Inputs and click the “HTTP Event Collector” link to apply the configuration. For full instructions, please refer to our online docs.
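
Before wiring any containers up to it, it can be worth confirming the token works. Here is a minimal sketch using Python and the requests library; the splunkhost address and the token value are placeholders for your own environment, and this is just a smoke test, not part of the ECS setup itself:

# Placeholder host and token: substitute your own values.
import json
import requests

HEC_URL = "https://splunkhost:8088/services/collector/event"
HEC_TOKEN = "<your token>"

event = {
    "event": "HEC smoke test before enabling the Docker logging driver",
    "sourcetype": "httpevent",
}

resp = requests.post(
    HEC_URL,
    headers={"Authorization": "Splunk " + HEC_TOKEN},
    data=json.dumps(event),
    verify=False,   # only while testing against a self-signed certificate
)
print(resp.status_code, resp.text)

A 200 response means the token is live, and the same URL and token can then be handed to the logging driver.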

Configuring your Docker Containers

First, make sure your ECS agent is up to date. Run the following to check your agent version:

curl -s 127.0.0.1:51678/v1/metadata | python -mjson.tool

Please refer to the following site for other options on how to check your ECS Container Agent version.

From an Amazon Linux image, getting the latest ECS agent version is simple. To update your ECS Container Agent, you can follow the instructions available here.

Configuring Splunk logging driver in EC2 Container Services (ECS)

You can set up your “Log configuration” options in the AWS Console for your EC2 Container Service. Under your Task Definition, specify a new “Log configuration” within your existing “Container Definition” under the “STORAGE AND LOGGING” section.

  1. Set the “Log driver” option to splunk
  2. Specify the following mandatory log options (see the documentation for more details)
    1. splunk-url
    2. splunk-token
    3. splunk-insecureskipverify – set to “true” – required if you don’t specify the certificate options (splunk-capath, splunk-caname)
  3. Specify any other optional parameters (e.g., tag, labels, splunk-source, etc.)
  4. Click on the “Update” button to update your configurations

Figure 2: Sample “Log configuration” settings

Here’s a sample JSON Log configuration for a Task Definition:

"logConfiguration": {
    "logDriver": "splunk",
    "options": {
        "splunk-url": "https://splunkhost:8088",
        "splunk-token": "<your token>",
        "tag": "{{.Name}}",
        "splunk-insecureskipverify": "true"
    }
}
Configuring Splunk logging driver by overriding the docker daemon logging option

Now we will set our logging options in Docker daemon. We can set Splunk logging on a per-container basis or define it as a default in the Docker daemon settings. We will specify some additional details at runtime to be passed along with our JSON payloads to help identify the source data.

docker run --log-driver=splunk \
    --log-opt splunk-token=<your token> \
    --log-opt splunk-url=https://splunkhost:8088 \
    --log-opt splunk-capath=/path/to/cert/cacert.pem \
    --log-opt splunk-caname=SplunkServerDefaultCert \
    --log-opt tag="{{.Name}}/{{.ID}}" \
    --log-opt source=mytestsystem \
    --log-opt index=test \
    --log-opt sourcetype=apache \
    --log-opt labels=location \
    --log-opt env=TEST \
    --env "TEST=false" \
    --label location=us-west-2 \
    your/application

 

The splunk-token is the security token we generated when setting up the HEC.

The splunk-url is the target address of your Splunk Cloud or Splunk Enterprise system.

The next two lines define the name of and the local path to the Splunk CA certificate. If you would rather not deploy the CA certificate to your systems, you can set splunk-insecureskipverify to true instead; this is required whenever you don’t specify the certificate options (splunk-capath, splunk-caname), though it does reduce the security of your configuration.

The tag adds the container name and the container ID to each event. Using the ID option on its own would add only the first 12 characters of the container ID.

We can send labels and env values, if these are specified in the container. If there is a collision between a label and env value, the env value will take precedence.

Optionally, though it’s recommended, you can set the sourcetype, source, and target index for your Splunk implementation.

Now that we have started the container with Splunk logging options, events should appear in our Splunk searches shortly after the container is running. If you are using the default sourcetype and set the options as in the example above, you can use the following search to see your data: sourcetype=httpevent

Here’s a sample of a container log message logged by the splunk logging driver:

Figure 3

Figure 4

And there you have it. Container monitoring can bring additional complexity to your infrastructure, but it doesn’t have to bring complexity to your job. It’s that easy to configure Splunk to monitor your Docker containers on ECS and in your AWS infrastructure.

Thanks,
David Potes
AWS Solutions Architect

Smart AnSwerS #70


Hey there community and welcome to the 70th installment of Smart AnSwerS.

Since Splunk HQ expanded with the addition of the new building next door, things have been eerily quiet as you walk through each floor; everyone has been spread out, leaving many Splunkers feeling distant and empty. People have been missing the energy and lively vibe of having everyone together under one roof, so it was finally decided that everyone in the old building would be consolidated into the new building. In true Splunk fashion, we’ll be celebrating with a party tomorrow for one last hurrah in our 250 Brannan courtyard before the move, bidding farewell to the old building until it undergoes its new makeover!

Check out this week’s featured Splunk Answers posts:

How to create a search that shows a trending value based on the selected time range picker value?

Iranes needed to create a dashboard with a single value visualization and trending value that changed based on the time range picker, not the default timechart span values. SplunkTrust member MuS answers with a run anywhere dashboard example of Simple XML to get the solution started. With some back and forth discussion on syntax for the search, Iranes was able to find a working solution with MuS’ guidance.
https://answers.splunk.com/answers/390574/how-to-create-a-search-that-shows-a-trending-value.html

How to convert an IP address to binary?

Applegreengrape wanted to know if it was possible to convert an IP address to binary in a Splunk search. SPL can be very powerful, especially if you have a strong grasp on how you can manipulate your data with the right commands. Javiergn comes up with just the right search for this requirement using a combination of eval and stats to get the expected output.
https://answers.splunk.com/answers/396201/how-to-convert-an-ip-address-to-binary.html
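
For comparison only (the Answers post shows how to do it in SPL with eval and stats), the same octet-to-binary idea looks like this in a few lines of Python:

# Convert each octet of a dotted-quad IPv4 address to 8-bit binary.
def ip_to_binary(ip):
    return ".".join(format(int(octet), "08b") for octet in ip.split("."))

print(ip_to_binary("192.168.1.10"))   # 11000000.10101000.00000001.00001010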

How does Splunk assign thread_id for scheduled searches and alerts in scheduler.log?

AntonyPriwin noticed there were saved searches and alerts with the same scheduled_time and dispatch_time that had incremented thread_id values, but there were others that all had the same thread_id. He was interested in understanding the reason for this behavior, and jrodman gave a great explanation of how this value is assigned and what it’s used for.
https://answers.splunk.com/answers/372872/how-does-splunk-assign-thread-id-for-scheduled-sea.html

Thanks for reading!

Missed out on the first sixty-nine Smart AnSwerS blog posts? Check ‘em out here!
http://blogs.splunk.com/author/ppablo


If your plants could speak to you, what would they say?


unhappy_plant

I’m pretty sure mine would say “Hey Bozo, thanks for drowning me to death” or “Must… have… water… What is this, the Sahara?” Oh, and also “I hate it here, what’s it take to get some morning sun?”

I decided it was time to apply my inner nerd to reduce my plants’ suffering. That, and happier plants mean a happier fiancé. Enter Splunk! The goals were:

  1. Keep track of moisture level in the soil.
  2. Determine best location for light intake.
  3. Combine current weather data, future forecasts, and items 1 and 2 above to create some machine learning models that predict when it’s best to water. (I’m still working on this part.)

I shall call it… Operational Plantelligence! When first said aloud, I received a slow, sad head shake from the aforementioned fiancé. But alas, it did not deter me from the project.

I bought this kit to cultivate some technology-supported Redwood bonsais. The end result?

plantlink_splunk2

Let’s break this project apart and step through how I created this dashboard.

In this case, there are multiple IoT devices used to collect all of this machine generated data.

  • A PlantLink sensor to collect soil moisture levels
  • A Thingsee device to collect ambient light, indoor temperature, pressure and relative humidity
  • A weather feed from Weather Underground to get current, historical and future forecasts.   (Note there is an embedded iframe on the dashboard from io to display current weather, but that’s not the actual indexed weather data.)

So, why Splunk? Sure, each data source offers its own app to view and analyze its particular data. But Splunk lets me combine and join these disparate data sources (and more!) over time to gain additional insights. With a composite view, I can do deeper discovery on what’s happening to my precious seedlings – moment-by-moment or over any period of time.

I see this pretty commonly in the IoT world today — whether it’s energy, manufacturing, agriculture, robotics, or healthcare. Sensors are created and Apps built to analyze those sensors for a specific purpose or use case. Call it fate, fortune, luck or a combo — Splunk was built to fill this gap before we even knew it would exist and before the IoT acronym was “so hot right now”.

Our customers are starting to find their greatest value in combining data across their enterprise to create Operational Intelligence – and we are stoked to innovate with you.

Ok, enough of my absolute (yet totally reasonable) infatuation with Splunk….. let’s get to the fun stuff.

Starting with the PlantLink sensor: You can purchase these little guys from Amazon or maybe even make your own and save a buck. If you get a PlantLink, you’ll be able to obtain and view moisture sensor level readings on a continual basis and get reminders for when to water your plant with the included App.   This is great, but would be better with more data!

I used the Add-On Builder from Splunkbase to create a simple Technology Add-on (TA) to pull data from PlantLink’s API at MY preferred rate.
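
The Add-On Builder generates the input scaffolding for you; underneath, the idea is just a polling loop. As a rough sketch only (the endpoint URL, token, and response fields below are hypothetical placeholders, not PlantLink’s documented API), it looks something like this:

# Hypothetical polling sketch: URL, token, and fields are made up for illustration.
import json
import time
import requests

API_URL = "https://api.example.com/plantlink/v1/links"   # placeholder endpoint
API_TOKEN = "<your api token>"                            # placeholder credential
POLL_SECONDS = 300                                        # MY preferred rate

while True:
    resp = requests.get(API_URL, headers={"Authorization": "Bearer " + API_TOKEN})
    resp.raise_for_status()
    for reading in resp.json():
        # One JSON event per line; Splunk indexes the stdout of a scripted input.
        print(json.dumps(reading))
    time.sleep(POLL_SECONDS)

The actual TA generated by the Add-On Builder wraps this kind of loop in a proper modular input; the sketch just shows the core idea.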

Here is what that data looks like in Splunk:

plantlink_raw

Hooray for nice clean JSON. How about the Thingsee data?

thingsee_raw

Well it’s in JSON, but a bit cryptic. Perfect time for lookup tables to make machine data more human readable. Even better, the lookup tables were already created by the community in an App! The same app assists you with getting data in from your Thingsee device. Gotta love it when there is so much out-of-the-box content available. Thank you Splunkbase!

As you can see, the lookup table takes the machine data and adds additional context to it, in real-time, making it much more usable.  The “senses” are translated into fields such as “air_pressure”, “ambient_light”, “battery_level” and so on.  Note that these lookup tables can be .csv files or a direct connection to a database.

Last but not least, we have our weather data from another app TA-wunderground.

weather_underground_raw

Again, must be my lucky day as it’s in JSON and the fields are automatically extracted by Splunk. (See fields on the left)

Now for my favorite part – building out a dashboard. I wanted a way to visualize all of these data sources in a ‘single pane of glass’ (there I said it..) so that I could measure the individual sources AND start looking for meaningful correlations in and across the data.

Additionally, I could monitor the health of the sensor itself (such as Battery level, and wifi signal strength) to make sure it is working properly. If I had a bunch of these sensors, I could use Splunk to look for deviations between them and possibly predict failures before they happen. Multiple use cases using the same data!

And VOILA!

plantlink_splunk

I also wanted to be able to view current and historical values from the PlantLink and Thingsee over any time range and granularity (from dropdowns at the top). To visualize correlations, I created a timechart with the ability to choose an overlay field from any source. Lastly, I used the Lookup Editor App so I could enter my own comments in a table (top right) to keep track of progress. Know that I used no code or customizations to build this dashboard. So easy even a caveman (err… a middle school student?) could do it!

Question everything!

Now, I can start asking ANY question of my data (more so than what each individual app can offer). What is the current moisture level and how fast is it dropping? Is the plant receiving enough or too much ambient light (using the lux sensor from the thingsee)? Does pressure, temperature or humidity (inside and outside??) have any effects on moisture movement? Who knows just yet, but as more data is collected, I can look for correlations and possible predictors that are meaningful in creating models with the free Machine Learning App for Splunk. Then, I can start configuring Splunk alerts to warn me of changes and predictions I care about. I sure do love having all of these capabilities in just one platform!

I’m just getting started with this project and some of the next steps will be:

  1. Add more plant moisture sensors.
  2. Build a model to start predicting when to water given more conditions than what the commercial PlantLink App uses.
  3. Get my lawn connected.

Would love your ideas too – add comments!

Hope you enjoyed the post and as always Happy Splunking!

Smart AnSwerS #71


Hey there community and welcome to the 71st installment of Smart AnSwerS.

There’s a lot of hustle and bustle going on as Splunkers, partners, and customers are preparing and reviewing presentations for .conf2016 just two months away. As we all wait in anticipation for the annual worldwide user conference, come join the community in a sneak peek of one of the sessions at next week’s July SplunkTrust Virtual .conf session. On Friday, July 29th @ 3:00PM Pacific time, SplunkTrustee Mason Morales will be giving a preview of his .conf2016 talk: Architecting Splunk for Epic Performance. Visit the meetup page to RSVP and access the WebEx link for the event.

Check out this week’s featured Splunk Answers posts:

What causes “Too many search jobs found in the dispatch directory” and should Splunk be handling this on its own?

a212830 was seeing this message appear frequently on a search head, and could not find much material on why this happens. There have been several questions asked on related topics, but these have focused more on how to clean up the dispatch directory. sowings and yannK both contributed answers that addressed the underlying causes of this behavior. They educate the community on what the dispatch directory is, its purpose, the types of search artifacts that get stored there, and why the TTL (time to live) varies for each one.
https://answers.splunk.com/answers/213571/what-causes-too-many-search-jobs-found-in-the-disp.html

How does creating a data model affect storage and memory?

packet_hunter was concerned about predicting how much disk space would be consumed by creating and testing different data models, especially with little extra storage or license to work with. shaskell explains how this depends on the type of data model, acceleration, and the period of acceleration. He shares a lot of great resources from Splunk documentation on inspecting acceleration, precautions, differences between ad hoc versus persistent acceleration, and how to limit the amount of disk space used for data model summaries.
https://answers.splunk.com/answers/425565/how-does-creating-a-data-model-affect-storage-and.html

Why am I getting less fields returned from a search with the stats command compared to transaction?

Urias was told to use the stats command instead of transaction, but noticed there were fewer fields returned from the search. Stats was recommended for performance reasons, but Urias wasn’t sure if this was still the right way to go if it meant getting limited results. craigv covers the differences between the two commands, how they operate, and whether or not you can get the same functionality using one or the other based on your use case.
https://answers.splunk.com/answers/424769/why-am-i-getting-less-fields-returned-from-a-searc.html

Thanks for reading!

Missed out on the first seventy Smart AnSwerS blog posts? Check ‘em out here!
http://blogs.splunk.com/author/ppablo

Sending binary data to Splunk and preprocessing it


A while ago I released an App on Splunkbase called Protocol Data Inputs (PDI) that allows you to send text or binary data to Splunk via many different protocols and dynamically apply preprocessors to act on this data prior to indexing. You can read more about it here.

I thought I’d share this interesting use case that I was fiddling around with today. What if I wanted to send compressed data (which is a binary payload) to Splunk and index it? Well, this is trivial to accomplish with PDI.

Choose your protocol and binary data payload

PDI supports many different protocols, but for the purposes of this example I just rolled a die and chose HTTP POST. I could have chosen raw TCP, SockJS, or WebSockets; the steps in this blog for handling the binary data are the same.

Likewise for the binary payload. I chose compressed Gzip data (I could have chosen another compression algorithm) because, for the purposes of an example blog, more people can relate to it than to an industry proprietary binary protocol like ISO 8583 (financial services) or MATIP (aviation), or binary data encodings such as Avro or ProtoBuf.

Note that Splunk’s HTTP Event Collector can also accept a Gzip payload.

Set up a PDI stanza to listen for HTTP POST requests.

PDI has many options, but for this simple example you only need to choose the protocol and a port number.

Screen Shot 2016-07-28 at 3.09.30 PM

Declare the custom handler to apply to the received compressed data (a binary payload).

You can see this above in the Custom Data Handler section. I’ve bundled this custom handler in with the PDI v1.2 release for convenience. Here is the source if you are interested. Handlers can be written in numerous JVM languages and then applied by simply declaring them in your PDI stanza as above and putting the code in the protocol_ta/bin/datahandlers directory; there are more template examples here.

The GZipHandler will intercept the compressed binary payload and decompress it into text for indexing in Splunk.

Send some test data to Splunk.

I just wrote a simple Python script to HTTP POST a compressed payload to Splunk.

Screen Shot 2016-07-28 at 3.08.23 PM
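
The script in the screenshot isn’t reproduced here, but a minimal sketch along the same lines (the splunkhost address and port are placeholders for wherever your PDI stanza is listening) could look like this:

# Gzip a small text payload and HTTP POST it to the PDI listener.
# Host and port are placeholders; use the values from your own PDI stanza.
import gzip
import io
import requests

PDI_URL = "http://splunkhost:8090/"

payload = "hello compressed world\n" * 5

buf = io.BytesIO()
with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
    gz.write(payload.encode("utf-8"))

# POST the raw gzip bytes; the GZipHandler decompresses them before indexing.
resp = requests.post(PDI_URL, data=buf.getvalue())
print(resp.status_code)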

Search for the data in Splunk.

Screen Shot 2016-07-28 at 3.08.54 PM

Voila!

I hope this simple example can get you thinking about unleashing all that valuable binary data you have and sending it to Splunk.

Smart AnSwerS #72


Hey there community and welcome to the 72nd installment of Smart AnSwerS.

The “Where Will Your Karma Take You” contest has been underway for two weeks now on Splunk Answers, and there is just a little over 2 weeks left to go! From July 15 to August 15, the top 3 users that earn the most karma points within this period will each earn a free pass to .conf2016. Best of luck to everyone and finish off strong!

Also, next Wednesday, August 3rd @ 6:30PM, the San Francisco Bay Area user group will be meeting at Splunk HQ. If you happen to be in the area, come join us! Visit the SFBA user group page to see what’s in store for the agenda and RSVP.

Check out this week’s featured Splunk Answers posts:

Are data model summaries linked to the original events? Can tstats access them?

gabriel_vasseur couldn’t access original events from accelerated data models, and even running a tstats search in verbose mode only returned limited results. He found that data model summaries were stored in the same place as indexes, and wanted to know why tsidx files weren’t just pointing to the original events in the index. SplunkTrust member dshpritz gives a clear answer defining accelerated data models, what they contain, and what exactly happens when drilling down from accelerated data to actual events. He also includes helpful links to supporting documentation for additional reading.
https://answers.splunk.com/answers/431982/are-data-model-summaries-linked-to-the-original-ev.html

How do you manage the content for users’ Splunk apps in a Search Head Cluster?

twinspop was previously running search head pooling, but recently moved over to a new install of a search head cluster and didn’t understand how to manage knowledge objects of users’ apps. SplunkTrustee somesoni2 explains that user created objects need to be stored locally on search heads, and default configurations need to be pushed from the SHC deployer. In addition to these best practices, he shares the folder paths for migration in case other users in the community need guidance on moving from search head pooling to a search head clustering environment.
https://answers.splunk.com/answers/426842/how-do-you-manage-the-content-of-users-splunk-apps.html

How to filter out weekdays or weekends in one search while using timewrap?

penguin1725 was trying to use the timewrap command to compare current data to the last 7 days, but needed to figure out how to compare a weekday to only weekdays, and a weekend day to only Saturday and Sunday. somesoni2 strikes again with a search using eval to define both weekdays and weekend days to use as a filter in one search.
https://answers.splunk.com/answers/426673/how-to-filter-out-weekdays-or-weekends-in-one-sear.html

Thanks for reading!

Missed out on the first seventy-one Smart AnSwerS blog posts? Check ‘em out here!
http://blogs.splunk.com/author/ppablo

Send data to Splunk via an authenticated TCP Input


Wow, my second blog in 24 hours about Protocol Data Inputs (PDI), but sometimes you just get infected with ideas and have to roll with it.

So my latest headbump is about sending text or binary data to Splunk over raw TCP and authenticating access to that TCP input. This is simple to accomplish with PDI.

Set up a PDI stanza to listen for TCP requests

PDI has many options, but for this simple example you only need to choose the protocol (TCP) and a port number.


Screen Shot 2016-07-30 at 3.31.08 PM

Declare a custom handler to authenticate the received data

You can see this above in the Custom Data Handler section. I have declared the handler, and the authentication token that the handler should use, via a JSON properties string that gets passed to the handler when everything instantiates. This JSON properties string can be in any format you want, because the custom data handler that you code contains the logic for processing it.

The approach I used for the authentication is:

1) Received data is expected to be in the format: token=yourtoken,body=somedata
2) When data is received, the token is checked. If the token matches, the data from the body field is indexed; otherwise the data is dropped and an error is logged.

Here is the source if you are interested.

Handlers can be written in numerous JVM languages and then applied by simply declaring them in your PDI stanza as above and putting the code in the protocol_ta/bin/datahandlers directory; there are more template examples here.

Send some test data to Splunk

I just wrote a simple Python script to send some data to Splunk over raw TCP.

Screen Shot 2016-07-30 at 3.30.31 PM
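
Again, the screenshot shows the original script; a bare-bones sketch of the same idea (host, port, and token are placeholders for your own PDI stanza settings) would be:

# Send token-prefixed data to the PDI TCP listener.
# Host, port, and token are placeholders; use your own stanza settings.
import socket

HOST, PORT = "splunkhost", 8091
TOKEN = "yourtoken"

# Matches the expected format: token=yourtoken,body=somedata
message = "token=" + TOKEN + ",body=hello from an authenticated TCP client\n"

sock = socket.create_connection((HOST, PORT))
try:
    sock.sendall(message.encode("utf-8"))
finally:
    sock.close()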

 

Search for the data in Splunk

Screen Shot 2016-07-30 at 3.30.10 PM

 

If the token authentication fails, the data is dropped and an error is logged in Splunk.

Screen Shot 2016-07-30 at 4.06.39 PM

And that’s it. Pretty simple to roll your own token auth handler and make your TCP inputs that much more secure.

Note: TCP was used for this example, but this exact same handler will work with any of the PDI protocol options; just choose another protocol.

Screen Shot 2016-07-30 at 3.53.49 PM
