Channel: Tips & Tricks – Splunk Blogs

Splunking the World Cup 2014: Real Time Match Analysis


splunk-blog-world-cup-stadium-chart

As an Englishman I’ve been waiting months – with very high expectations – for the World Cup to come around. Reading fellow Splunker, Matt Davies’ blog post titled, “Splunking World Cup 2014. The winner will be…“, only heightened my excitement.

The tournament is now going into the second week and I’ve been starting to look at the teams, players, and tournament more closely. Which stadium holds the most people? Who’s the top scorer? Which referee hands out the most cards?

With these questions fresh in my mind I opened up Splunk and began to have a look at the huge amounts of information being streamed from the tournament. For this post I’m going to explore real-time match updates; including teams, scores, and match locations.

Prerequisites

Step 1: Choose the Data Sources You Want to Splunk

World Cup Match JSON Feed

There are lots of potential sources to grab World Cup data – from match reports to fan Twitter feeds. Software For Good have created a bunch of endpoints offering both match and team information.

For this project we’ll use their live match endpoint.

Step 2: Install the REST API Modular Input in Splunk

splunk-blog-dallimore-rest-ta

To get this feed into Splunk we’ll use Damien Dallimore’s REST API Modular input for Splunk. You can download the app here with full instructions on how to install it.

Step 3: Configure your RESTful Input

splunk-blog-rest-input-config

In Splunk navigate to “Settings > Data Inputs > REST”, and select “Add new”.

Configuration options:

REST API Input Name: WorldCupMatchData (optional)
Endpoint URL: http://worldcup.sfg.io/matches
Response Type: JSON
Set Sourcetype: Manual
Sourcetype: _json
Host: SFG (optional)

You will see we set the “Response Type” to JSON because the feed is returned in JSON format. It is also important to explicitly set the “Sourcetype” to “_json”. This ensures Splunk parses the JSON events correctly at search time. If your search returns grouped events, you’ve probably forgotten to set this.

Note, I have only included the fields that are essential to configure (unless stated). Everything else can be left blank or as default (unless you need to enter in a proxy to get out to the internet, etc).

Step 4: Let’s Play

splunk-blog-search-world-cup-data

Note, this data source also contains future match data. If you’re not interested in this information, just add NOT status="future" to your search string.

Where have most matches been played so far? (Maracanã – Estádio Jornalista Mário Filho – 7 / Estadio Nacional – 7)

host="SFGFeed" NOT status="future" | top location

How many goals have been scored? (49)

host="SFGFeed" NOT status="future" | stats sum(away_team.goals) AS TotalAwayGoals sum(home_team.goals) AS TotalHomeGoals | eval TotalGoals = TotalAwayGoals + TotalHomeGoals | fields TotalGoals

Average goals per game? (~3)

host="SFGFeed" NOT status="future" | stats sum(away_team.goals) AS TotalAwayGoals sum(home_team.goals) AS TotalHomeGoals dc(match_number) AS TotalMatches | eval TotalGoals = TotalAwayGoals + TotalHomeGoals | eval GoalsPerGame = TotalGoals / TotalMatches
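The goals-per-game logic above can be sketched outside Splunk too. This is a minimal Python sketch using records shaped like the SFG match feed (the field names come from the searches in this post; the sample data is made up for illustration):

```python
# Records mimicking the SFG live match feed; status/goals field names
# are taken from the searches above, the values are hypothetical.
sample_matches = [
    {"match_number": 1, "status": "completed",
     "home_team": {"goals": 3}, "away_team": {"goals": 1}},
    {"match_number": 2, "status": "completed",
     "home_team": {"goals": 1}, "away_team": {"goals": 0}},
    {"match_number": 3, "status": "future",
     "home_team": {"goals": None}, "away_team": {"goals": None}},
]

# Filter out future matches, just like NOT status="future" in the SPL.
played = [m for m in sample_matches if m["status"] != "future"]
total_goals = sum(m["home_team"]["goals"] + m["away_team"]["goals"]
                  for m in played)
goals_per_game = total_goals / len(played)
print(total_goals, goals_per_game)  # 5 2.5
```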

In which stadium were the most goals scored during the first round of matches? (Arena Fonte Nova)

host="SFGFeed" NOT status="future" match_number>=1 match_number<=16 | stats sum(away_team.goals) AS TotalAwayGoals sum(home_team.goals) AS TotalHomeGoals dc(match_number) AS TotalMatches by location | eval TotalGoals = TotalAwayGoals + TotalHomeGoals | sort - TotalGoals

Which teams won their opening games? (USA, Switzerland, Netherlands, Mexico, Ivory Coast, Italy, Germany, France, Costa Rica, Colombia)

host="SFGFeed" NOT status="future" | where winner!="Draw" | top winner | fields - percent

(Note, the numbers will be out of date by the time you read this! Maybe England have won!)

Step 5: Extra Time

splunk-blog-world-cup-winners

I’ve only started to scratch the surface here. Remember, this data source is streaming information into Splunk in real time as matches are being played. Why not get Splunk up on a second big screen while you’re watching the game to analyse the stats (too much)?

Correlating the data from the Software for Good endpoints with other sources may also prove interesting. Does the number of goals scored during the game have any correlation to the heat? Or distance travelled by teams before the match; how does this impact the final score?

Now I do believe there’s a soccer football match on…


Quantified Splunk: Tracking My Vital Signs


splunk-blog-blood-pressure-overview

Last year Splunker, Ed Hunsinger, wrote a great post titled, “Go Splunk Yourself“, in which he shows how he’s using Splunk to track data from devices including a Fitbit, a Nike Fuelband, a Basis Band, and a Garmin GPS watch to name just a few!

Like Ed, I use a number of tracking devices and I use Splunk to analyse the data they produce. Recently – as my friends and colleagues will tell you – I’ve taken this concept of self-tracking to the next level. This has included purchasing both a blood sugar and a blood pressure monitor.

After a few weeks collecting the data I’ve uncovered some interesting trends. If you’re interested in what I’ve found, or how you can do this at home, read on.

Prerequisites

Step 1: Purchasing the Devices

The great thing about buying tracking devices is that you can spend a lot of money, very quickly. For this experiment I wanted to keep costs down and opted for entry level kit. Here’s the equipment I bought:

Step 2: Recording the Data

bloodpressure

With simplicity in mind I used a very manual data entry process. I created two CSV files to record blood pressure and glucose respectively. In column A I included a timestamp with minute granularity. This was simple to log, yet precise enough for my level of analysis.
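As a sketch, such a CSV could be built with Python’s csv module. Note the column names here are my assumption, chosen to match the Systolic, Diastolic, and Pulse fields searched later in this post:

```python
import csv
import io

# Build a tiny blood pressure CSV in memory; the header layout is a
# hypothetical example, not the author's exact template.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["Timestamp", "Systolic", "Diastolic", "Pulse"])
writer.writerow(["2014-06-20 08:30", 118, 76, 54])  # minute-granularity timestamp

# Read it back the way Splunk would see the fields.
rows = list(csv.reader(io.StringIO(buf.getvalue())))
print(rows[1])  # ['2014-06-20 08:30', '118', '76', '54']
```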

You can download my CSV templates here:

Step 3: Indexing the CSVs

splunk-blog-qs-add-csv

In Splunk I navigated to: “Settings” > “Data Inputs” > “Files & Directories” > “New”

Then I found and selected the CSV and clicked “Data Preview”. The timestamps looked good and I hit “Save”.

On the next screen, I set the source as: “Continuously index data from a file or directory this Splunk instance can access”. This means every time I save a new measurement Splunk automatically reads and indexes the change.

Altering the other fields on this page from the defaults, like “source”, may be useful but is completely optional.

Step 4: Splunking the data

splunk-blog-blood-pressure-raw

If everything has worked properly you should be able to see the fields extracted from the data correctly when running a search:

source="/bloodpressure.csv"

Now for the fun to begin…

Step 5: Start searching yourself

splunk-blog-blood-pulse

Looking at my pulse over time:

source="/bloodpressure.csv" | timechart avg(Pulse)

Great! But what does this mean? To add some context to the data I began by looking at what is considered a “normal” resting pulse. For this I used the reference ranges on the Mayo Clinic site: a max pulse of 100 and a min of 60. Of course, these should only be used as guides. To visualise this information I ran the search:

source="/bloodpressure.csv" | eval pulseMax = 100 | eval pulseMin = 60 | timechart avg(Pulse) avg(pulseMax) avg(pulseMin)

Looking at the chart above you can clearly see my pulse is regularly below the “normal” values. I attribute this to my fitness level but it will certainly be something to discuss with my doctor next time I visit for a checkup.
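The same range check can be sketched in plain Python, using the Mayo Clinic reference limits quoted above (the readings here are hypothetical):

```python
# Reference resting pulse range from the Mayo Clinic figures in the post.
PULSE_MIN, PULSE_MAX = 60, 100

readings = [54, 58, 62, 71, 55]  # hypothetical pulse readings

# Average reading and the ones falling below the "normal" floor.
avg = sum(readings) / len(readings)
below = [r for r in readings if r < PULSE_MIN]
print(avg, below)  # 60.0 [54, 58, 55]
```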

What about changes in blood pressure over time?

splunk-blog-blood-pressure-overview

source="/bloodpressure.csv" | timechart avg(Systolic) avg(Diastolic)

Perhaps as expected, Systolic and Diastolic measurements correlate well. There are some spikes in Systolic readings, whereas my Diastolic measurements are fairly smooth. If you’re interested in learning more about these measurements for analysis, this site provides a great introduction.

Step 6: Life-logging

splunk-blog-foursquare-checkins-map

I plan on adding further context to this data by including other life logging devices.

Take my Fitbit data, for example. Do more steps during the day have a noticeable impact on blood pressure readings taken each evening?

Other things like travel. Working for Splunk I’m fortunate to travel to different parts of the world, but travelling through airports is stressful and staying in hotels can make it hard to eat well. If I compare my Foursquare check-ins to blood data I can test whether distance travelled directly impacts my blood pressure and blood glucose readings.

I’ll keep you posted.

Quick Tip: Upload Logs to Splunk from Windows PowerShell


I had a folder full of log files I wanted to index real quick in my local instance of Splunk. They won’t persist, so the right thing to do is to use the “oneshot” command (documented here). This can be done in the web UI, but I like doing stuff at the command line. I opened up PowerShell (elevated, as my Splunk instance runs as system) and tried this:

splunk add oneshot *.log

And this was the output:

In handler 'oneshotinput': unable to open file: path='C:\Users\Hal\temp\*.log' error='The filename, directory name, or volume label syntax is incorrect.'

It didn’t work! Ok, so my assumption was that Splunk would parse the wildcard and have at it. But no big deal, this is quick to solve with a PowerShell one-liner:

ls | % { splunk add oneshot $_ }

Or, properly expanded out to not use the built-in aliases:

Get-ChildItem | ForEach-Object { splunk add oneshot $_ }
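The same idea, expanding the wildcard yourself and issuing one oneshot command per file, can be sketched in Python. The commands are built but not executed here, and the scratch files are created purely for illustration:

```python
import glob
import os
import tempfile

# Create a scratch directory with a couple of log files to demonstrate.
tmp = tempfile.mkdtemp()
for name in ("a.log", "b.log"):
    open(os.path.join(tmp, name), "w").close()

# Expand the wildcard ourselves (as the PowerShell loop does) and build
# one `splunk add oneshot <file>` command per file. To actually run them,
# pass each list to subprocess.run().
commands = [["splunk", "add", "oneshot", path]
            for path in sorted(glob.glob(os.path.join(tmp, "*.log")))]
for cmd in commands:
    print(" ".join(cmd))
```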

Hope this helps!

Splunk Alerts: Using Gmail, Twitter, iOS, and Much More


splunk-blog-alerts-twitter

With no programming required!

One of the great features of Splunk is its built-in alerting functionality. You can configure Splunk alerts to do just about anything, from sending an SMS to integrating with another app, like ServiceNow for example.

Most Splunk users will probably want to configure alerts via email at some point. If you don’t have your own mail server you can use web based mail services like Gmail to do this. In this post we’ll explore how you can set this up and some neat ways in which you can extend upon native Splunk alerts.

Prerequisites

Step 1: Configure Email Server Settings in Splunk

splunk-blog-alerts-email-server-config

Configuring Splunk to connect into the Gmail (and other web based email) servers is very simple.

In Splunk, navigate to: “Settings > System Settings > Email Alert Settings”.

In this example we’ll use Gmail, but you can also grab mail server information from web based email services, like Yahoo or Outlook to name but two. You’ll need to fill out 4 fields for your mail server to work with Splunk. For Gmail this will be as follows:

Mail host = smtp.gmail.com:587
Email security = TLS
Username = <YOUR_GMAIL_ADDRESS>
Password = <YOUR_GMAIL_PASSWORD>

You’re then given the option in Splunk to make your email alerts look pretty using the formatting options. For now we’ll keep it quick and use the defaults by hitting “Save”.

Step 2: Configure Alerts

splunk-blog-alerts-config

Now all you need to do is create an alert, or edit an existing one, to set off your email trigger. To create an alert, first create a search with the criteria you want to be alerted on, then click: “Save As > Alert”.

Once you’ve named the alert, select the email recipient(s) by selecting “Send email”.

Step 3: Profit

splunk-blog-alerts-email-alert-triggered

Voilà, alerts delivered to your inbox.

As you can see the alert information is pretty basic in its default format. The important thing is that the alert has a link to jump straight into Splunk for a deeper look. You probably want to style your emails better than I have using Splunk’s native email formatting settings (Step 1).

Step 4: Taking Alerts to the Next Level

splunk-blog-alerts-ifttt-gmail

After integrating Splunk with Gmail you can start to connect your Splunk alerts to other services. By using apps like IFTTT (If This Then That) this can be done very quickly, and very simply.

For example, connect your Gmail and Twitter accounts using IFTTT so that when a Splunk email alert is received a Tweet is posted.

Another neat recipe I’ve played around with is triggering an iOS notification through IFTTT when a Splunk email alert is received. If you’re really keen you can also connect your phone number to place calls as alerts come in!

Or what about…

Quick PowerShell Script to Start Splunk


Got another quick PowerShell post for you. I have a copy of Splunk running locally on my Windows 8.1 workstation. I don’t always leave it running, for obvious resource reasons, therefore I end up starting and stopping it as needed. On Windows, there are two ways to control the Splunk services:

  • CLI splunk.exe start|stop|restart commands
  • Windows native service control methods (and there’s a half-dozen ways to do that)

So, in PowerShell, you can just do this:

Get-Service splunk* | Start-Service

The only minor problem is that I keep forgetting to elevate my PowerShell session, so I get an error message, then I have to open a new window and repeat the process. That’s no way to automate, I said to myself, so I made this quick Start-Splunk function:

Function Start-Splunk {
 try {
 Get-Service splunk* | Start-Service -ErrorAction Stop
 }
 catch [Microsoft.PowerShell.Commands.ServiceCommandException] {
 Write-Verbose "Command must be run in an elevated session, invoking new session."
 Start-Process -Verb Runas -FilePath powershell.exe { Get-Service splunk* | Start-Service -Verbose -ErrorAction Stop; Start-Sleep 5 }
 }
}

All I’m doing is catching the exception which is thrown when the call fails, and using the Start-Process cmdlet with a very useful trick to invoke PowerShell with “Run As”. That will do the right thing, and prompt you for elevation. Answer in the affirmative, and Splunk is started!

This function is also posted on gist.github.com for the cool stuff which that site provides.

Side note: you actually will have the same elevation issue if you try to start Splunk with its CLI commands. Technically, I could change the service to run as a non-system user, but that has other impacts and this is just a dev environment, so there’s no point.

Test-drive our new Splunk App for NetApp Bundle!


Do you like solving user and application problems and helping your customers, but lack adequate resources? We have made it super easy for you to accelerate your journey deep into storage space! Take our new Splunk App for NetApp Bundle for a spin and we will get you there. Download it for free here.

So what is it and where will it take you?

You are getting our free version of Splunk Enterprise packaged together with our free Splunk App for NetApp Data ONTAP. With this powerful combo you get an at-a-glance view of your entire NetApp Data ONTAP storage space. Quickly explore logs, storage performance and the system configuration of your NetApp environment. You also get both Cluster-Mode and 7-Mode in one central location with visibility into any storage entity, including controllers (filers), aggregates, disks, volumes and q-trees. And if you don’t have access to Lt. Commander Data to speed up your installation, just watch our bundle video installation tutorial or read the install guide.

Want to go beyond the storage universe and venture into application problems?

Storage exists for storing and retrieving application and user data and really, you need to be able to diagnose when storage problems are causing application response problems or failures. Use the powerful engine of Splunk Enterprise to understand how storage latency impacts the performance and response times of your critical applications.

Don’t stop there. Splunk Enterprise can get you deeper into virtualization space, illuminate dark unknown matter of your networks and enable you to cross-correlate your storage data with any other technology tier be it OS, security, etc. Splunk Enterprise enables your journey to the final IT frontier.

Live long and prosper!

Splunking Social Media: Tracking Tweets


splunk-blog-twitter-dashboard

So you use Twitter and have heard Splunk can do “Big Data”. By tapping into Twitter’s API you can use Splunk to investigate the stream of tweets being generated across the globe.

The great thing about using Splunk to do this is that you have complete control of the data, meaning it’s incredibly flexible as to what you can build. A few basic ideas I’ve had include tracking hashtags, following specific influencers, or tracking tweets by location in real time.

What’s more, it takes a matter of minutes before you can start analysing the wealth of data being generated. This post will show you how.

Prerequisites

Step 1: Create a Twitter App

splunk-blog-twitter-create-app

Go to: “dev.twitter.com” > “Sign in / up” > select “Create App”.

It doesn’t really matter what name you enter when creating the app (especially if it’s not going to be public) although I’d recommend using something you can remember. Same goes for description and website.

The callback field can be left blank. I won’t go into why or when this should be used in this post.

Step 2: Generate API Keys

splunk-blog-twitter-create-api-keys

Once your app has been created click the “API Keys” tab. You should see “Your access token” with a button “Create my access token”. Press this button.

You should now see your API keys. Don’t worry about noting them down, we can come back to this page at any time. You will want to keep them secret though (the app above will be deleted by the time you read this!).

Step 3: Install the REST API Modular Input in Splunk

splunk-blog-dallimore-rest-ta

To get this feed into Splunk we’ll use Damien Dallimore’s REST API Modular input for Splunk. You can download the app here with full instructions on how to install it.

Step 4: Configure Your RESTful Twitter Input

splunk-blog-twitter-rest-input

We’re on the home straight now! Now we just need to give Splunk the credentials to tap into the Twitter API.

In Splunk navigate to “Settings > Data Inputs > REST”, and select “Add new”.

4.1 OAuth settings:

REST API Input Name = TwitterFeed (optional)
Endpoint URL = https://stream.twitter.com/1.1/statuses/filter.json
HTTP Method: GET
Authentication Type = oauth1
OAUTH1 Client (Consumer) Key = <YOUR_CLIENT_KEY>
OAUTH1 Client (Consumer) Secret = <YOUR_CLIENT_SECRET>
OAUTH1 Access Token = <YOUR_ACCESS_KEY>
OAUTH1 Access Token Secret = <YOUR_ACCESS_SECRET>

If you need to retrieve your OAUTH keys created in Step 2 go to: “dev.twitter.com” > “My apps” > “[Your App]” > “API Keys” > “Test OAuth”.

Note, we will be using version 1.1 of Twitter’s API which imposes rate limitations on its endpoints. If you’re only collecting a small number of tweets every 15 minutes this shouldn’t be a problem. If you’re planning on polling thousands you should probably read this first.

4.2 Argument settings:

At this point you should read the Twitter API docs if you are unfamiliar with the arguments that can be passed.

Example 1

URL Arguments: track=#worldcup^stall_warnings=true

Here I am using the ‘track’ streaming API parameter. In this case, I am polling tweets that contain the hashtag #worldcup. Note that if you want to track multiple keywords, these are separated by a comma. However, the REST API configuration screen expects a comma delimiter between key=value pairs, so I have used a “^” delimiter instead, as I need the commas for my track values.
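To see why the “^” delimiter keeps commas free for track values, here is a sketch of how such a delimited argument string splits into key=value pairs (this illustrates the parsing idea, not the app’s actual code):

```python
# A "^"-delimited URL arguments string, as entered in the REST input;
# commas inside the track value survive because "^" separates the pairs.
raw = "track=#worldcup,#WorldCup2014^stall_warnings=true"

# Split pairs on the configured delimiter, then each pair on "=".
params = dict(pair.split("=", 1) for pair in raw.split("^"))
print(params)
# {'track': '#worldcup,#WorldCup2014', 'stall_warnings': 'true'}
```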

Example 2

URL Arguments: follow=21756213^stall_warnings=true

Now I am collecting Tweets using the “follow” streaming API parameter for the account @himynamesdave (that’s me). Note that when using the follow parameter you must use the user’s ID, not their username. If you’re unsure how to find a user ID, this site will help you.

4.3 Response settings:

Response Type = json
Streaming Request = True
Request Timeout = 86400 (optional)
Delimiter: ^ (or whatever delimiter you used in the URL arguments field)
Set Sourcetype: Manual
Sourcetype: tweets (optional)

Note, for steps 4.1 – 4.3, I have only included the fields that are essential to configure (unless stated). Everything else can be left blank or as default (unless you need to enter in a proxy to get out to the internet, etc).

4.4 inputs.conf:

For reference, your new REST input configuration can also be found in: “<SPLUNK_HOME>/etc/apps/launcher/local/inputs.conf”.

Step 5: Check Your Input is Working

splunk-blog-twitter-search

A quick Splunk search will let you check that your data is being received and indexed:

sourcetype="tweets"

Note, you will only start to see Tweets after your Input polls a new Twitter event (we will not be able to pull Tweets historically).

See the latest tweet:

sourcetype="tweets" | fields text | head 1

Look at Tweet volume over time:

sourcetype="tweets" | timechart count(_raw)

Or count the number of retweets:

sourcetype="tweets" | stats count(retweet_count)

… you get the idea.

Start polling other accounts or searches to build up a bigger picture of what’s happening by repeating the steps above.

Step 6: Enrich Your Tweets

splunk-blog-twitter-sentiment

Why not start by analysing the sentiment of your Tweets? Splunker, David Carasso, has built a Sentiment App for Splunk that will help you to do this.

Alternatively use the REST API Modular Input to bring other social media sources into Splunk. Foursquare, Facebook and LinkedIn are just a few others that spring to mind.

Let me know what mashups you dream up (and build!).

Big data just got its Tricorder


Tricorder

In Star Trek a Tricorder is described as:

“A Tricorder is a multifunction hand-held device useful for data sensing, analysis, and recording data, with many specialized abilities which make it an asset to crews aboard starships and space stations as well as on away missions”.

I’m happy to announce the launch of the Splunk Mobile App, which unofficially I’m calling the “Big Data Tricorder”. You can download it from here (iTunes).

The Splunk Mobile App allows you to take the Splunk (Starship) Enterprise platform and allows you to explore strange new insights, to seek out new data and new visualizations, to boldly go where no machine data has gone before.

You can find more in the official press release here, but I’ve been playing with the app for the last few weeks and it’s great. Splunk has always been about making machine data accessible, usable and valuable.

 

The Splunk Mobile App changes and extends that – we’re now making machine data accessible, usable and valuable, anywhere.

 

A lot of companies are now taking a “mobile first” strategy – engaging users with big data is key to its success and the visualization and analysis of your data needs to go mobile too.  The Splunk Mobile App is a native app to give the best possible experience for end users to help them engage with the data and the insights it gives them.

Look around the next conference you go to – how many people are using a web browser on a laptop and how many people are using a tablet or phone? With the rise of BYOD – employees will need an easy to use, flexible way to engage with operational intelligence. I was at the Gartner CIO conference recently where I heard the term “business moments” and a discussion on how informed decisions need to be made in real-time and hence they are increasingly mobile. That’s why the Splunk Mobile App was built – putting your data and operational intelligence in the most convenient place for people so they can take the right action in real time.

With this first release, the mobile app is focused on two audiences:

  • Subject matter experts such as IT analysts and line-of-business analytics users. These kinds of users will be able to securely view, interact with and share machine data insights on their mobile device.
  • Splunk users and administrators who will be able to make available Splunk Enterprise dashboards, data visualizations and Operational Intelligence that users can view, interact with and share in a native mobile format on their device.

I’ve been using the app for a week or so now and here’s a quick tour through some of the things I’ve been trying out:

First up – you open up the app, log in and you can see your applications, favorites and dashboards (in this case on an iPhone 5):

iPhone Apps

The integration with the device is pretty tight with alerts being pushed through the notification mechanism – for example in this case you can see an alert around “Server Load Capacity” and the ease of being able to click on it and drill down into the relevant data and/or Splunk dashboard.

iPhone Notification

 

You can also see all your alerts (in this case on an iPad):

iPad Alerts

 

Once you’ve opened up a dashboard, you can open an individual visualization or chart and choose to export it, make it a favorite or even edit and share it:

iPhone Visualization Share

I liked the collaborative elements and the ability to annotate it. You can highlight key points and then share it as you can see below (on an iPhone and then an iPad):

iPhone Edit

Drilldown

You can still search for information (in this case the ISDN NUMBER search box) and you can see this in the iPad dashboard below:

iPad Dashboard

The dashboards render quickly and I’d say the Splunk dashboards look even nicer on an iPad than a Mac or PC.

You can download the app now from here. I hope it is useful and helps engage your users with Splunk and Operational Intelligence. There are going to be ongoing enhancements to the mobile app so keep an eye out for new features and functionality over the rest of this year. Any feedback is most welcome in the comments below.

There aren’t any plans to add a teleport function to Splunk in the near future but if you’re using Splunk (or considering it) and want to give anyone the ability to alert, access and analyze their big data – check out the Splunk Mobile App.

Set big data to stun…

Big Data To Stun


Splunking web-pages


Have you ever had a situation where you found information on a webpage that you wanted to get into Splunk? I recently did and I wrote a free Splunk app called Website Input that makes it easy for everyone to extract information from web-pages and get it into a Splunk instance.

The Problem

There are many cases where web-pages include data that would be useful in Splunk but there is no API to get it. In my case, I needed to diagnose some networking problems that I suspected were related to my DSL connection. My modem has lots of details about the state of the connection, but only within the web interface. It supports a syslog feed, but the feed doesn’t include most of these messages. Thus, to get this information, I need to get it directly from the web interface.

Some other use cases might be:

  • Integrity analysis of a website (so that you could alert if something goes wrong or if the site is defaced)
  • Identify errors on pages (like PHP warnings)
  • Retrieve contextual information that would help you understand the relevance of events in Splunk (like correlating failures with weather conditions)

The Solution

I wrote an app that includes a modular input for getting data from web-pages. Basically, you tell the app what page you want to monitor and what data to get out of the page. It will retrieve the requested data so that it can be searched and reported in Splunk. You identify the data you want to obtain using a CSS selector. The app will then get all of the text from under the nodes matching the selector.

Getting the Data into Splunk

Getting the web-page data into Splunk is fairly easy once you know the URL and the CSS selector that you want to use. You can get the data into Splunk in four steps.

Step 1: identify the URL

You’ll need to identify the URL of the page containing the data. In my case, I wanted to get data from my DSL modem and the URL containing the data was at http://192.168.1.1/statsadsl.html:

adsl_details

Step 2: identify the data

After identifying the URL, you’ll next need to make a selector that matches the data you want to obtain. If you don’t know how to use CSS selectors, Google “jQuery selector” or “CSS selector”. Here are a couple of good places to start:

The selector indicates what parts of the page the app should import into Splunk. For each element the selector matches, the app will get the text from the matching node and the child-nodes. Consider the following example. Assume we are attempting to get information from a page containing the following HTML table:

<table>
	<tr>
		<td></td>
		<td>Downstream</td>
		<td>Upstream</td>
	</tr>
	<tr>
		<td>Rate:</td>
		<td>3008</td>
		<td>512</td>
	</tr>
	<tr>
		<td>Attainable Rate:</td>
		<td>5600</td>
		<td>1224</td>
	</tr>
</table>

The table would look something like this:

Downstream Upstream
Rate: 3008 512
Attainable Rate: 5600 1224

If I enter a selector of “table”, then the app will match once on the entire table and produce a single value for the match field like this:

1 Downstream Upstream Rate: 3008 512 Attainable Rate: 5600 1224

This could easily be parsed in Splunk, but it would be easier if the results were broken up a bit more. You can do this by changing the selector to make multiple matches. If I use a selector of “td”, then I will get one value per td node (one per cell):

1 Downstream
2 Upstream
3 Rate:
4 3008
5 512
6 Attainable Rate:
7 5600
8 1224

Note that the app will make a single field (called “match”) with values for each match. Empty strings will be ignored.

Matching “td” works OK, but I would like the field values to sit next to their descriptions. Thus, I would prefer to use a “tr” selector, which will produce a value for each row. That would yield:

1 Downstream Upstream
2 Rate: 3008 512
3 Attainable Rate: 5600 1224

This will be very easy to parse in Splunk. Once you get the selector and URL, you will be ready to make the input.
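The one-value-per-row behaviour above can be sketched with Python’s stdlib HTML parser. This is only an illustration of matching on “tr” and collecting the text beneath each match; the app itself uses CSS selectors and is not implemented this way:

```python
from html.parser import HTMLParser

class RowText(HTMLParser):
    """Collect the text under each <tr>, one joined string per row."""
    def __init__(self):
        super().__init__()
        self.rows, self._current, self._in_row = [], [], False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._in_row, self._current = True, []

    def handle_endtag(self, tag):
        if tag == "tr":
            text = " ".join(t for t in self._current if t)
            if text:
                self.rows.append(text)
            self._in_row = False

    def handle_data(self, data):
        if self._in_row:
            self._current.append(data.strip())

html = """<table><tr><td></td><td>Downstream</td><td>Upstream</td></tr>
<tr><td>Rate:</td><td>3008</td><td>512</td></tr></table>"""
parser = RowText()
parser.feed(html)
print(parser.rows)  # ['Downstream Upstream', 'Rate: 3008 512']
```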

Step 3: make the input

Make sure you have the Website Input app installed. Once you do, you can make a new input by going in the Splunk manager page for Data Inputs and selecting “Web-pages”:

inputs

Click “Add new” to make a new instance:

new_input

The configuration is straightforward once you know what page you are looking at and what selector you want to use. In my case, I needed to authenticate to my DSL modem so I had to provide credentials as well. Also, you will likely want to set the sourcetype manually, especially if you want to apply props and transforms to the data. Otherwise, the data will default to the sourcetype “web_input”. Below is my completed input, which grabs the data every minute and assigns it the sourcetype adsl_modem:

completed_input

Once the input is made, you should see the data in Splunk by running a search. In my case, I searched for “sourcetype=adsl_modem”:

data

The data is present in Splunk and is searchable, but it isn’t parsed. That leads to the last step.

Step 4: parsing

Finally, you will likely want to create props and transforms to extract the relevant data into fields that you could include on dashboards. I want to get the value for “Super frame errors” since I have determined it indicates when my DSL connection is having problems.

I can use rex in a search to parse out the information. The following extracts the fields “super_frame_errors_downstream” and “super_frame_errors_upstream”:

sourcetype=adsl_modem | head 5| rex field=_raw "Super Frame Errors: (?<super_frame_errors_downstream>\d*) (?<super_frame_errors_upstream>\d*)"

This gets me the information that I wanted in the appropriate fields:

results_rex_parsed

You may want to have the extractions done in props/transforms so that you don’t have to add rex to every search that needs the data parsed. In my case, I did this by adding the following to props.conf:

[adsl_modem]
EXTRACT-super-frame-errors = Super Frame Errors: (?<super_frame_errors_downstream>\d*) (?<super_frame_errors_upstream>\d*)
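If you want to sanity-check the pattern before deploying it, the same regex can be exercised outside Splunk. A minimal sketch in Python (note Python spells named groups `(?P<name>…)` where Splunk accepts `(?<name>…)`; the sample line is an assumption about the modem status page's format):

```python
import re

# Same pattern as the rex/props extraction, in Python named-group syntax.
pattern = re.compile(
    r"Super Frame Errors: (?P<super_frame_errors_downstream>\d*) "
    r"(?P<super_frame_errors_upstream>\d*)"
)

# Hypothetical sample line from the scraped status page.
sample = "Super Frame Errors: 1474 0"
match = pattern.search(sample)
if match:
    print(match.group("super_frame_errors_downstream"))  # downstream counter
    print(match.group("super_frame_errors_upstream"))    # upstream counter
```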

With the data extracted, I could make a chart to illustrate the errors over time:

chart

Getting the app

If you want to use the app, go to apps.splunk.com and download it (it’s free). If you need help, ask a question on answers.splunk.com.

Limitations

The app currently only supports HTTP authentication which means you cannot use it to capture data from web-pages that require you to authenticate via a web-form (might be supported in a later version). Also, you need to be careful pulling data from others’ websites without approval. Some websites have terms of use that disallow web-scraping.

Monitoring Local Administrators on Windows Hosts


It is always gratifying when one of my readers comes to me with a problem. I love challenges. This one had to do with one of my old posts on listing Local Administrators remotely. Of course, the way to do that is via WMI. However, it doesn’t quite work the same way locally. This is because the WMI call to Win32_Group.GetRelated() returns other stuff as well. So the question posed was “how do I get the list of Local Administrators locally?” More specifically, I want to monitor the local Administrators group.

I look at this two ways. Firstly, I want to get a regular list of names in the Administrators group and secondly, I want to monitor for changes to the Administrators group during the day. Let’s start with the first one. My favorite tool for this is, of course, PowerShell. After a bit of back and forth, I came up with a script that works on both Windows Server 2012 and Windows Server 2008R2 – there is a little bit of syntax change in the Where-Object between PowerShell 2 (used on Windows Server 2008R2) and PowerShell 3 (used on Windows Server 2012 and above).

(Get-WMIObject Win32_Group | Where-Object { $_.Name -eq 'Administrators' }).GetRelated() | Where-Object { $_.__CLASS -eq "Win32_UserAccount" -or $_.__CLASS -eq "Win32_Group" } | Select-Object __CLASS,Caption,SID

The script-block form of Where-Object has always been available and hence is suitable here. My original script has just Where-Object Name -eq ‘Administrators’ which doesn’t work in PowerShell 2. After we have the GetRelated() information, we can search for the information we are really after – the users (which has a __CLASS of Win32_UserAccount) and the groups (which has a __CLASS of Win32_Group). Finally, we select the information we are interested in. As you are probably aware if you are a constant reader of the blog, we use SA-ModularInput-PowerShell to run our PowerShell for us. This module requires us to Select-Object the fields prior to finishing. We can now put this script into our inputs.conf file:

[powershell://LocalAdmins]
script = (Get-WMIObject Win32_Group | Where-Object { $_.Name -eq 'Administrators' }).GetRelated() | Where-Object { $_.__CLASS -eq "Win32_UserAccount" -or $_.__CLASS -eq "Win32_Group" } | Select-Object __CLASS,Caption,SID
schedule = 0 30 2 ? * *
sourcetype = PowerShell:LocalAdmins
source = PowerShell
disabled = false

[powershell2://LocalAdmins]
script = (Get-WMIObject Win32_Group | Where-Object { $_.Name -eq 'Administrators' }).GetRelated() | Where-Object { $_.__CLASS -eq "Win32_UserAccount" -or $_.__CLASS -eq "Win32_Group" } | Select-Object __CLASS,Caption,SID
schedule = 0 30 2 ? * *
sourcetype = PowerShell:LocalAdmins
source = PowerShell
disabled = false

Notice that I have two versions – one is PowerShell 2 and the other is PowerShell 3. On Windows Server 2012 and above, only PowerShell 3 is available – you have to install PowerShell 2 separately. Thus the second one doesn’t get run. On Windows Server 2008 R2 the reverse is true – you don’t get PowerShell 3 – only PowerShell 2 is installed by default. If you have standardized on PowerShell 3 already (and you really should have done so by now), only enter the first stanza. There isn’t a Common Information Model for group membership inventory, but I tend to assume there will be in the future. In keeping with the CIM, let’s define our field extractions properly and ensure there are tags. This would fall under the Inventory Data Model and be a “Group” type. Our membership can be either a Group or a UserAccount and will have a domain and a username and a security ID. Start with props.conf:

[PowerShell:LocalAdmins]
REPORT-cim = localadmins-type, localadmins-userdom, localadmins-sid

Then define the extractions in transforms.conf:

[localadmins-type]
REGEX=(?ms)__CLASS=Win32_(?<member_type>.*?)\n

[localadmins-userdom]
REGEX=(?ms)Caption=(?<member_domain>[^\\]+)\\(?<member_name>.*?)\n

[localadmins-sid]
REGEX=(?ms)SID=(?<member_sid>.*?)\n

Finally, our tags.conf looks like this:

[sourcetype=PowerShell:LocalAdmins]
inventory = enabled
group = enabled

Now that we have our list of local administrators, how do we get the changes? The changes are recorded in the Windows Security event log. The input for this is defined in Splunk_TA_windows, and I highly recommend distributing that add-on with the common inputs enabled – one of which is the WinEventLog://Security input. Don’t want everything? You can limit the event codes you collect with a whitelist and blacklist. The relevant event codes are:

  • NT5 (Windows Server 2003 and before): 635-639
  • NT6 (Windows Server 2008 and beyond): 4731-4735
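A sketch of what that filtering might look like in the endpoint’s inputs.conf (the event-ID whitelist syntax here assumes Splunk 6.x; verify against your version before deploying):

```
[WinEventLog://Security]
disabled = 0
# collect only the group-membership change events listed above
whitelist = 635-639,4731-4735
```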

The Splunk App for Windows Infrastructure contains specific field extractions for these events, allowing you to easily incorporate them into your own dashboards. Now you can truly monitor the local Administrators group on all your servers.

Protected: Deploying Splunk Securely with Ansible Config Management – Part 1


This content is password protected. To view it please enter your password below:

Calling Mobile App Builders: Bugsense is for you


A few months ago, Splunk acquired a tiny, fast growing company, Bugsense, and its talented team, including the founders Panos and Jon. Over the last few months, this team has been acclimatizing to the San Francisco weather and our crazy obsession with ponies, ninjas, and such.

panosjonatsplunk1.jpg

So What Is Bugsense?

If you don’t know what Bugsense does – here’s a quick primer. You have a mobile app or many mobile apps. You want to know when your users are experiencing crashes. You want to know what’s causing those crashes. You want to know about handled and unhandled exceptions. You want to know this by app version, device version, OS version so you can see if your fixes worked.

Why? Because without this visibility, you are floating in a sea of chaos – you get hard-to-diagnose reviews like “your app keeps crashing”, you get poor ratings on app stores, and the worst of all…no one uses your mobile app anymore.

To fix all this, you go to the Bugsense website – you pick the right SDK for your platform (Android, iOS, HTML5, Windows, and Xamarin are supported), you drop in the single line of code necessary for data collection, and then you are off to the races!

The Bugsense service gives you all types of analytics: not just how many users on each version of your app are having problems, but also which specific OS or device types are experiencing more problems and what effect this is having on customer retention.


Screen Shot 2014-06-19 at 3.03.33 PM
Screen Shot 2014-06-19 at 3.37.12 PM
Screen Shot 2014-06-19 at 3.03.43 PM

The really cool features of course have to do with more fuzzy measurements. Bugsense has a quality score for your apps – called MOBDEX. MOBDEX takes into account sessions, errors, users, crashes, retention and assigns a qualitative score to how good your app is. This is particularly useful when you need a high level glance at “Did my new version improve things for my users or not?”

Screen Shot 2014-06-19 at 3.03.18 PM

Another cool thing you can do is perform custom event tracking with Bugsense. This is particularly useful when you wish to establish how often users perform an action (like, say, “click on a button”). In scenarios where there are many calls to action, or many possible routes for users to interact with the content, it is extremely useful to log which thing users picked most often. Knowing what they didn’t choose to do at all allows you to eliminate clutter and make for a cleaner, easier-to-use app! To have the event logged within Bugsense, all you have to do is create a custom event hook.

What’s New

We are currently previewing a new feature in Bugsense called “Advanced Events” – for paid users only.

What does this feature do? Here’s an example:

You want to track “User clicked ok”. You add the event in your AppVersion 1.1, you go to your dashboard, and you see cool real-time analytics and graphs on the Bugsense “Insights” page. But then you release a new version of your app (1.2) in which you have optimized the flow or made some changes. And now you don’t know which events came from version 1.1 and which came from 1.2.

One way is to create a new event, “User clicked ok but this is another version”. With the new “Advanced Events”, you can skip this step!

Now, you can filter events by AppVersion, see the trend of a specific app, and compare how an event is doing across all your releases in a slick, real-time, responsive UI.

We also provide you a bar with top weekly events, most trending events and other neat information in order to always be on top of your game.

If you see a change in top events from one app version to another, you’re getting instant visibility into what customers like or where they are getting tripped up!

 So what does this have to do with Splunk?

Well, imagine what you could do with all this data about your mobile apps in Splunk!

You could perform all types of analytics – not just which events caused crashes but maybe a path of events. The most frequent, least frequent flows through your apps leading to successful outcomes for the customer. You could also compare mobile app response times to different levels of infrastructure capacity. Or track how different response times from third party APIs or your own back-end application components impact users.

Last, but not least, imagine the insights you could get from comparing how mobile users use your apps vs. how they behave on your website or while physically interacting with your business. Sound cool? If it does and you would like to stay in the loop on follow-on plans, follow us on Twitter @bugsense and @splunk.

Deploying Splunk Securely with Ansible Config Management – Part 1

Johnny Automation

Automation Johnny

Intro

More often than not I have seen corporations struggle with config management, yet it is key to a concise mitigation and remediation plan. Interfacing with a variety of Splunk customers, the corporations that do implement a config management system usually each have a different tactic on how to manage Splunk while doing it in a secure fashion. This series of blog posts will hopefully walk you through everything from a simple deployment of Ansible all the way to the most complex use-cases I have seen. I will first cover how Ansible can be leveraged to manage a simple Splunk deployment on your own hosts. In part 2 we will cover how this can be done at larger scale on EC2, utilizing a dynamically changing inventory of hosts for deployments that need to scale in a cloud environment. Finally, part 3 will cover how to manage a giant deployment of Splunk with multi-tenancy requirements, where there are various “customers” or business units with different Splunk config needs. The idea is to impart the necessary knowledge to deploy not only Splunk but anything else using Ansible as your config management system.

Why Ansible?

There are a few config management systems out there. The most common ones I have seen deployed are Chef and Puppet. In my previous job we evaluated several of them, including Puppet and Chef, and we ended up choosing Ansible. Here are the reasons why I have seen organizations make that choice:

  • No Agent Required – this is awesome! Personally, I think the fewer agents a system has, the easier it is to manage and the smaller your attack surface becomes. Ansible also has the ability to deploy an agent, especially in a scenario where your hosts pull configs instead of a master server pushing them.
  • Uses SSH as Transport – No need to deal with custom communication protocols; everything is encrypted natively. Keys are used instead of passwords. SSH is also pretty ubiquitous across the various Linux distributions, and more often than not you are already talking to your servers via SSH.
  • Easy to pick up – I found that the learning curve for Ansible is tremendously smaller than for any other config management system. This is primarily because playbooks read easily, they are YAML based, and the project has great documentation.
  • Low overhead and scales to huge deployments – There is no need to run a dedicated Ansible master server; the application has very low resource requirements. Ansible has also been shown to scale: I have seen deployments as large as 100,000+ endpoints.
  • Python Based – Don’t like something? Want to integrate with something else? Ansible is Python based and very easily extendable. If you can code in Python, this is the config management system for you; it speaks your language!
  • One caveat: Ansible is not for organizations whose server base is largely Windows, although Windows client support is in the works.

    Installing Ansible

    I suggest you grab Ansible from the stable repo on GitHub instead of your distribution’s repository. The stable version on GitHub has welcome updates and additions like ansible-vault, which we will cover later in further detail.

    cd /opt/
    git clone https://github.com/ansible/ansible

    Now let’s install the playbooks and configuration files we will use for deployment.

    cd /etc/
    git clone https://github.com/divious1/ansible-splunk-simple
    mv ansible-splunk-simple /etc/ansible

    Now you have a copy of the ansible application under /opt/ansible and a collection of configuration files under /etc/ansible. Below is a logical diagram which represents a high level Ansible application structure.

    Ansible Structure

    Ansible Structure Diagram
    Explanation of what the different pieces are:

    • ansible-playbook: Ansible executable which runs the playbooks etc..
    • hosts: INI file which contains the role/group and host mapping
    • playbooks: Ties in Roles, host groups and task together to create orchestrated actions on target hosts
    • roles: contains the actions each group will complete (this is where the deployment logic lives).

    Roles in details

    Let’s walk through the structure of a role in detail. I will start with the common role. The common role should be run no matter what kind of role the host has, as it performs common functions that you would want on every host. If we look at the main.yml under tasks for this role, we can see all the tasks it performs.

    ansible$cat roles/common/tasks/main.yml
    ---
    # This playbook contains common tasks in this role
    - include: apt.yml
    - include: users.yml
    - include: files.yml
    - include: cron.yml
    - include: time.yml

    The role contains 5 playbooks. Let’s figure out what they are and what they do. Let’s look at apt.yml in detail:

    ansible$cat roles/common/tasks/apt.yml
    ---
    # This playbook installs the apps required on a server
    - name: install security controls
      tags:
        - configuration
        - security
      apt: name={{ item }} state=present
      with_items:
        - chkrootkit
        - rkhunter
        - clamav
        - fail2ban

    - name: install basic utilities
      tags:
        - configuration
      apt: name={{ item }} state=present
      with_items:
        - vim
        - screen
        - iotop
        - htop
        - ioping
        - ntp

    The description of this is at the top as a comment. Using the “apt” Ansible module (follow the link for information on modules) we install a variety of software on the server. The first batch of software is tagged as “configuration” and “security”, while the last is tagged just “configuration”. The first batch installs chkrootkit, rkhunter, clamav, and fail2ban.
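Those tags also let you run just a slice of a role. For example, to apply only the security-tagged tasks from the search head playbook, something like this should work (the playbook path assumes the repo layout shipped above):

```
ansible-playbook playbooks/search_heads.yml --tags security
```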

    Running Ansible for the First Time

    Requirements

    • Ansible Installed
    • splunk-admin user updated with your keys under /playbooks/splunk_creds/splunk-admin.pub
    • Root password of hosts to run Ansible in
    • Make sure you have ssh keys generated for root
    • Hosts inventory updated

    Before running Ansible, make sure that your environment is set correctly. Run . /opt/ansible/hacking/env-setup to have the necessary paths and environment activated for Ansible. Verify it worked by running:
    which ansible-playbook
    /opt/ansible/bin/ansible-playbook
    The env-setup call can also be added to your bash profile.

    To build a Splunk server from scratch just run:
    /etc/ansible#ansible-playbook /etc/ansible/playbooks/search_heads.yml

    Make sure that you have hosts defined in the hosts inventory file.
    If the host you are targeting is brand new (nothing has been done to it), you will want to run Ansible with -k so that it prompts for the password during the first run. You will only need this once, as Ansible has a task under the common role that copies your user's SSH key to that host.

    You should see a similar output to this:


    /etc/ansible#ansible-playbook playbooks/search_heads.yml
    PLAY [apply common configuration to all nodes] ********************************
    GATHERING FACTS ***************************************************************
    ok: [162.243.231.42]
    TASK: [common | install security controls] ************************************
    ok: [162.243.231.42] => (item=chkrootkit,rkhunter,clamav,fail2ban)
    TASK: [common | install basic utilities] **************************************
    ok: [162.243.231.42] => (item=vim,screen,iotop,htop,ioping,ntp)
    TASK: [common | create splunk-admin] ******************************************
    ok: [162.243.231.42]
    TASK: [common | copy splunk-admin bash_profile] *******************************
    ok: [162.243.231.42]

    Closing Thoughts

    This should have armed you with the basic tools to do config management and orchestration on Splunk hosts as well as the rest of your infrastructure.

    I encourage you to walk through each role's tasks to verify what they do. The GitHub README also has more information on how Splunk is configured using the shipped playbooks. Feel free to drop me a line with any questions, or if something is not working, at jose at splunk.

    Thank you Mike Regan, Mike Baldwin, and the rest of the Splunk Cloud team for your contributions and for working with me on the playbooks.

Splunk + Cloudera for Hadoop–Better Together


This is a guest post contributed by Amr Awadallah, Ph.D., Co-Founder and Chief Technology Officer, Cloudera

On July 23, my friend Todd Papaioannou and I are co-hosting a webinar on a subject that’s very important to me. As co-founder and CTO of Cloudera and a long-time Hadoop user dating back to my days at Yahoo, I recognize that big data, for all its promise, also comes with its share of challenges, a central one being how to make data exploration and analysis on petabyte-scale datasets across distributed systems accessible to people without advanced data science backgrounds.

That’s one of the things I really like about Hunk, Splunk’s analytics and visualization solution for Hadoop. It’s a powerful platform that allows you to search and explore endless volumes of unstructured data, with an easy to learn and use user interface (UI). Another reason Hunk is a great platform for Hadoop users is that the data can remain right where it is, in Hadoop. Hadoop brings the compute to the data, so there’s no need to move your data into an in-memory analytics store to run queries against it.

If we’re going to make big data exploration available to a larger user base, it’s important that we close the skills gap. That’s the focus of the webinar I mentioned above. Todd and I will talk about big data, what it is, where it’s going, and how our companies are collaborating to make it easier for all knowledge workers to use Hadoop.

The webinar is this Wednesday, July 23, at 9:00 am PT – register now (or watch the recording from this link). I hope to see you there.

 

Updating the iplocation db


When Splunk added the new version of the iplocation command in v6.0, it added the ability to add location info without the need for internet connectivity. We did this by shipping a custom version of the MaxMind DB in the 6.0.x release. However, because we used a Splunk-specific version of the DB, you still had to wait for a new version of Splunk to get a new copy of the DB.

In 6.1 we added support for using the native MaxMind DB (.mmdb), allowing you to update the DB yourself at anytime! It looks like some of you have already figured this out (Go George go!), but I figured I would add some additional info about this new feature.

As George states, you can replace the GeoLite2-City.mmdb file under $SPLUNK_HOME/share/ with a copy of the paid version or with a monthly update of the free version, but there is another way! You can change the path to the MMDB file in limits.conf, making it Splunk-upgrade safe. From the limits.conf.spec file:

[iplocation]
db_path =
* Location of GeoIP database in MMDB format
* If not set, defaults to database included with splunk

I downloaded the July 2014 update to test it out:

kbains:local kbains$ cat limits.conf
[inputproc]
file_tracking_db_threshold_mb = 500
[iplocation]
db_path = /Applications/splunk612/share/GeoLite2-City-201407.mmdb

kbains:share kbains$ ls -l GeoLite2-City-201407.mmdb
-rw-r--r-- 1 kbains SPLUNK\Domain Users 30300878 Jul 22 15:06 GeoLite2-City-201407.mmdb
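To confirm Splunk is reading the new file, any search that calls iplocation will do; a minimal sketch (the sourcetype and the clientip field are assumptions about your data):

```
sourcetype=access_combined | iplocation clientip | stats count by Country
```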

And of course it worked as expected =)
Screen Shot 2014-07-22 at 4.32.27 PM

Happy Splunking!


Tracking calls and SMS with Splunk


splunk-app-for-twilio

The first thing I think of when someone mentions a call centre: “Those guys that call me at 2300 trying to sell things I didn’t even know existed”.

That’s a little unfair. Call centres and telecommunication systems are vital to all of us around the world, though rarely do we look deeply into the vast amounts of valuable data being generated. I want to change this.

In this post we’ll examine data generated by Twilio, a service that allows you to bake voice and SMS capabilities into your apps.

But remember, Splunk is a machine data platform. If you’re not using Twilio, this data could be taken from any other voice or SMS management system.

splunk-app-for-twilio-machine-data copy

Plugging Twilio data into Splunk is super simple. The company offers a flexible API from which Splunk can poll data in real time, or you can export data manually via CSV from their web interface.

The Twilio app for Splunk comes with prewritten Twilio API calls to collect the data if you want a little help getting started (disclosure: I developed it).

splunk-app-for-twilio-1

Now that the messy, unstructured data has been indexed, let’s start making sense of it using Splunk searches.

Top 10 outbound call destinations:

Sid="CA*" Direction="outbound*" | top limit=10 To

Outbound call success:

Sid="CA*" Direction="outbound*" | stats count by Status

Average length of outbound calls:

Sid="CA*" Direction="outbound*" | eval DurationMins = round(Duration/60,2) | stats sum(DurationMins) AS TotalDuration count(Duration) as NumberOfCalls | eval AveCallLength = round(TotalDuration/NumberOfCalls,2) | fields AveCallLength

splunk-twilio-app-sms

Last SMS received:

Sid="SM*" Direction="outbound*" Body!="New Voicemail*" | head 1 | fields Body

Total number of SMS received:

Sid="SM*" Direction="inbound*" | stats count

splunk-app-for-twilio-billing

And for the finance managers out there; how much are these guys spending?

Sid="CA*" | eval Cost = Price*-1 | timechart sum(Cost) as Cost by PriceUnit

splunk-app-for-twilio-customer-care

It becomes even more exciting as you plug more and more data into Splunk. Enriching call data with customer activity across your infrastructure – things like web or access logs – is a great example. When a customer calls a rep because they are having a problem, Splunk could immediately show the rep the errors that customer encountered, as well as things like purchases, tweets, pages visited, and any other data linked to the customer.

Don’t have access to this type of data but want to explore? Most network providers give customers a monthly breakdown of their usage – data that is ripe for Splunking.

Splunk Command> Cluster


Being a Splunk sales engineer is incredible. I get to talk to customers about their use cases, ‘Splunk’ their data, and together discover the insight Splunk provides them. Initial demos typically start with the search bar, looking for keywords in their data. It usually doesn’t take long before the “Ah Hah!” moment comes – either by using Splunk’s intuitive GUI to interact with extracted fields of interest or by employing a very small subset of the 130+ search commands within the search bar to gain operational intelligence not readily seen before. At a recent customer visit I employed the Splunk on Splunk (S.o.S.) App, explored some of the underlying searches, and noticed the cluster command, which I had never used before. So I dug a little deeper into some Nagios data with the cluster command.

Nagios is an industrial monitoring tool that monitors your entire IT infrastructure. Capturing these events with Splunk allows you to perform historical diagnostics of problems that occurred within your environment. The cluster command is used to find common and/or rare events within your data. A quick search, organizing results in a table with a descending sort by time, shows 9189 events for a given day.

index=mmo_ops | table _time, _raw | sort - _time

raw_events

Adding the cluster command groups events together based on how similar they are to each other. Two new fields, cluster_count and cluster_label, get appended to each new event created by cluster.  [An option of showcount=t must be added for cluster_count to be shown in results.] The cluster_count represents the number of original _raw events grouped together, cluster_label is a unique label given to each new event. So after cluster I create a table then sort by cluster_count.

index=mmo_ops | cluster showcount=t | table _time, cluster_count, cluster_label, _raw | sort cluster_count

clustered_events_2

Immediately you see cluster isolates four rare, notable events – one in which a critical system shutdown occurred. Pretty cool, right? No heavy lifting. No prior knowledge of what we might want to look for. The cluster command simply did it for us!

The main option for this command, t, sets a threshold that controls the “sensitivity” of the clustering, with values ranging from 0 to 1.0. The closer t is to 1.0, the more similar events must be to get clustered together (increased resolution). By default t=0.8 (used for the previous search), so decreasing t should “zoom out” and decrease the number of clustered events. With my data, a value of t=0.7 produced fewer clustered events, three unique values with a cluster_count of one, and still showed the system shutdown.

index=mmo_ops | cluster showcount=t t=0.7 | table _time, cluster_count, cluster_label, _raw | sort cluster_count

clustered_events_3
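To build intuition for what the threshold does, here is a toy sketch of threshold-based event clustering in Python – not Splunk's actual algorithm, just an illustration of the idea that events whose term overlap meets the threshold t land in the same group (the sample events are invented):

```python
import re

def tokens(event):
    """Break an event into a set of lowercase terms."""
    return set(re.split(r"\W+", event.lower())) - {""}

def similarity(a, b):
    """Jaccard similarity between two events' term sets."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb)

def cluster(events, t=0.8):
    """Greedily assign each event to the first cluster whose
    representative is at least t similar; otherwise start a new cluster."""
    clusters = []  # list of (representative, members)
    for event in events:
        for rep, members in clusters:
            if similarity(rep, event) >= t:
                members.append(event)
                break
        else:
            clusters.append((event, [event]))
    return clusters

events = [
    "SERVICE ALERT: web01;HTTP;OK;HARD;1",
    "SERVICE ALERT: web02;HTTP;OK;HARD;1",
    "SYSTEM SHUTDOWN: critical failure on db01",
]
for rep, members in cluster(events, t=0.6):
    print(len(members), rep)  # cluster_count and a representative event
```

Raising t toward 1.0 requires near-identical events to merge (more clusters); lowering it merges looser matches (fewer clusters), which mirrors the behaviour shown in the searches above.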

Still want to see the original events but know which clustered event they belong to? No problem! Use labelonly=t with cluster.

index=mmo_ops | cluster showcount=t t=0.7 labelonly=t | table _time, cluster_count, cluster_label, _raw | sort cluster_count

clustered_events_4

Extending this search string one more time with the dedup (de-duplication) command allows me to see the most recent grouped events within each clustered event. In the search below I limit it to the latest five events within each clustered event.

index=mmo_ops | cluster showcount=t t=0.7 labelonly=t | table _time, cluster_count, cluster_label, _raw | dedup 5 cluster_label | sort cluster_count, cluster_label, - _time

clusterd_events_9

Awesome stuff, huh? Can’t wait to “plunge” into other commands – way too cool! Till then, the Force be with you, fellow padawans.


*I highly encourage you to download Splunk Enterprise for FREE and try it out! Nagios is free but you can “Splunk” any data.

Updated Keyword App


Last year I created a simple app called Keyword that consists of a series of form search dashboards that perform Splunk searches in the background without the user having to know the Splunk search language. You can read about the original app here and see how easy it is to use. This year, I added some dashboards for the Rare command, but I didn’t think it was newsworthy enough to blog about.

Then, Joe Welsh wrote a blog entry about using the cluster command in Splunk, which allows you to find anomalies using a log reduction approach. Joe’s example using Nagios is easy to follow and gives the novice a useful approach to finding rare events. So, using this approach, I decided to update the Keyword app to add a Cluster dashboard where the user simply puts in a search filter (something to search for), a threshold for matching like events, and a time range to get results. This should work on any data and allow you to quickly see grouped anomalous events without having to know the search language. As I wrote before, a picture is worth more than a description. Here’s an example using SSH logs:

Cluster Dashboard

Cluster Dashboard

 

It follows the same pattern as Joe’s blog entry. For completeness, I’ll include a picture of the Rare dashboard that shows you counts of rare sources, hosts, and sourcetypes for a keyword search:

Rare Sources, Sourcetypes, and Hosts

Rare Sources, Sourcetypes, and Hosts

Finally, you can also split each rare result by the punctuation of the result and either its source, sourcetype, or host. As Splunk automatically captures the punctuation of each event, as usual, all you have to do is search by a keyword or set of keywords separated by OR or implicit AND.

Rare Punctuation

Rare Punctuation

This could really help in your IT and security use cases. Enjoy the update.

Indexing data from SaaS solutions running on relational databases


As we began work on building the Salesforce.com app, I was again face to face with a familiar challenge – one that you would encounter any time you want to ingest structured data coming from a SaaS-based application that runs on a back-end relational database. In such a SaaS environment, the data is usually exposed via a REST or web services API or similar. As you know, in a typical relational database, all data is stored in multiple tables and records are linked across tables using IDs. For instance, the Incident table in ServiceNow does not have the Username that created a ticket, but has a User Identifier (a long cryptic string) referencing another record in the “Users” table that includes the Username, First Name, Last Name, and other information.

The problem that you would face in this scenario would be: how do you ingest that data in order to make it easier to search? How do you build these lookups in Splunk knowing that the data is distributed across multiple tables?

The first solution that would come to mind would be to have the same python script that polls the data from SFDC or ServiceNow (as a scripted input) save the data in a CSV format directly and store it under the lookup directory of the app. The app would then use that CSV file as an automatic lookup on the searches that apply. However, this approach poses some limitations:

  • In a distributed environment, the lookup CSV files have to be stored on the search head. This means that either you have to come up with a mechanism to periodically copy the CSV files from the forwarder to the search head, or use the search head to pull the data in directly. Neither is a good option.
  • The python code has to handle updates to the lookup tables – in other words, update existing records in the CSV file with newly polled data that matches. This is doable but can be a bit tricky and would require some python coding skills.
  • What if I want to track changes to that lookup tables from an auditing perspective? Take the example of the Users table in ServiceNow, if someone changes the First Name of a given user in ServiceNow, the python script will poll the changes and update the existing record in the CSV lookup file erasing the previous information.

After many discussions, an alternative solution presented itself: index the lookup data instead of storing it as a CSV. The app then runs periodic searches to build the lookup table in Splunk and keep it up to date.

A good example of such a search relies on the "inputlookup" command to read in the existing lookup data and the "outputlookup" command to write the merged results back to the CSV file.
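As a sketch of that pattern (the sourcetype, field names, and lookup name here are assumptions for illustration):

```
index=servicenow sourcetype="snow:users"
| table sys_id, user_name, first_name, last_name
| inputlookup append=t snow_users_lookup
| dedup sys_id
| outputlookup snow_users_lookup
```

Because fresh search results arrive ahead of the appended lookup rows, dedup keeps the newest record for each sys_id, while the older values remain searchable in the index for auditing.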

What’s new in TA-windows 4.7.0?

If you are a Windows admin and use Splunk then you’ve likely deployed Splunk_TA_windows on your endpoints. It’s a central method for handling Windows data and has all the extractions you need to handle Windows event logs. We’ve just released version 4.7.0. So what’s new and should you upgrade?

The first thing we did was organize the data. The well-established best practice is not to put data in the default index, yet that is exactly what we were doing. That has now changed. By default, we create three indices for you:

  • perfmon is used for performance data
  • wineventlog is used for event logs
  • windows is used for everything else

This change will not affect you if you've been using a local inputs.conf file (as you should). However, new installations beware: this change requires that the indices exist on your indexing tier, or you will get messages about missing indices.
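If you are creating the indices manually on the indexing tier, a minimal indexes.conf sketch looks like this (the paths follow Splunk's default layout; tune retention settings to your needs):

```
[perfmon]
homePath   = $SPLUNK_DB/perfmon/db
coldPath   = $SPLUNK_DB/perfmon/colddb
thawedPath = $SPLUNK_DB/perfmon/thaweddb

[wineventlog]
homePath   = $SPLUNK_DB/wineventlog/db
coldPath   = $SPLUNK_DB/wineventlog/colddb
thawedPath = $SPLUNK_DB/wineventlog/thaweddb

[windows]
homePath   = $SPLUNK_DB/windows/db
coldPath   = $SPLUNK_DB/windows/colddb
thawedPath = $SPLUNK_DB/windows/thaweddb
```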

The second thing we did was turn off all the inputs. That's right – Splunk_TA_windows no longer gathers data by default. There are many reasons for this, but the primary one is that Splunk licenses by data volume, and we didn't want you blowing out your license because we turned something on by default. Make a conscious decision about what to gather based on what you want to show off.
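To opt back in, enable just the stanzas you want in a local inputs.conf on the endpoint. A sketch (the stanza names below mirror common defaults in the TA – check the app's default/inputs.conf for the exact names in your version):

```
# $SPLUNK_HOME/etc/apps/Splunk_TA_windows/local/inputs.conf
[WinEventLog://Security]
disabled = 0

[perfmon://CPU]
disabled = 0
```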

The third thing we did was make the data more CIM compliant. CIM compliance has always been part of Splunk_TA_windows for use with Enterprise Security. If you've looked at the Common Information Model recently, you will have noticed that it now encompasses more than just security – it adds many data models for IT and operations use. We've embraced this, and as a result Windows data now automagically appears in Enterprise Security and in the data models implemented in the Common Information Model app. You can now use Windows data in standardized data models like this:

datamodel
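As a concrete sketch of querying one of these models with an accelerated tstats search (the dataset and field names assume the CIM Performance data model – verify them against your CIM app version):

```
| tstats avg(All_Performance.cpu_load_percent) AS avg_cpu
    from datamodel=Performance
    where nodename=All_Performance.CPU
    by All_Performance.dest
```

Because tstats runs against the data model's accelerated summaries, this is typically much faster than the equivalent raw-event search.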

Finally, we incorporated a lot of customer feedback. We can’t test on every Windows variant out there (although we do try to get as many common variants as we can), so sometimes we miss something and you tell us about it.

There is one thing to take note of going forward. Microsoft is marking Windows Server 2003, Vista and before as end of life on July 14th, 2015. As a result, we are deprecating the field extractions for 3-digit Windows security event logs with this release and will de-support them later. What does this mean? We will move the 3-digit field extractions into a separate app – we'll call it TA-legacywindows or something like that. If you install TA-legacywindows, everything works as before; the extractions just live in a different app. We will remove the affected field extractions from Splunk_TA_windows some time after the end-of-life date and after TA-legacywindows has been released. We will also bump the second digit of the version again so you know something major has happened.

Needless to say, future versions of the Windows Infrastructure app, Exchange app and other apps for Microsoft technologies will rely on the updated TA.

Got requests, comments, or bugs for Splunk_TA_windows? Drop us an email at microsoft@splunk.com.
