Tips & Tricks – Splunk Blogs

Smart AnSwerS #56


Hey there community and welcome to the 56th installment of Smart AnSwerS.

We just hosted the March SF Bay Area User Group meeting last night at Splunk HQ and had a great conversation about various real and hypothetical security scenarios in the spirit of RSA. It was awesome to hear a mix of experiences and lessons from Splunkers, partners, and customers. If you want to learn about all the juicy details from the meeting, visit the #sfba channel in our Splunk User Group Slack Chat where smoir (thank you!) “liveslacked” all the key topics discussed. It will only be available to view for a limited time, so act fast! Otherwise, feel free to hang out in that channel during our next meeting on Tuesday, April 19th @ 6:00PM at Yahoo! HQ in Sunnyvale, CA, hosted by Becky Burwell.

Smart AnSwerS will be taking a break for the next 3 weeks as I’ll be on PTO in a land far far away, but will jump back into action at the end of March. Until I return, enjoy this week’s featured Splunk Answers posts:

As part of a Splunk alert, is it possible to include 100 lines from the log prior to the event that triggered the alert?

cybrainx wanted to set up and trigger an alert when an ERROR string was found, but also include 100 lines from the log prior to the trigger event in the results. SplunkTrust member rich7177, with an assist from fellow member MuS, came up with a search to capture all necessary raw data before using a combination of eval, streamstats with window=100, and transaction to make this alert requirement possible.
https://answers.splunk.com/answers/310019/as-part-of-a-splunk-alert-is-it-possible-to-includ.html
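
For reference, here is a minimal sketch of the streamstats idea behind that answer; the index, sourcetype, and ERROR matching below are placeholders, and the accepted answer additionally uses eval and transaction to group the captured lines:

index=app_logs sourcetype=my_app
| eval is_error=if(like(_raw, "%ERROR%"), 1, 0)
| streamstats window=101 current=true max(is_error) as error_nearby
| where error_nearby=1

Because events are returned newest first, each event’s 101-event window covers the trigger event plus the 100 lines logged just before it.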

What is the recommended compatibility sequence of upgrading instances in my environment from Splunk 6.2.7 to 6.3.2?

rcreddy06 had an environment with a search head cluster, indexer cluster, deployment server, heavy forwarders, and universal forwarders running Splunk 6.2.7, and wanted to upgrade everything to 6.3.2. To tackle this properly, rcreddy06 needed to know in what order and how to upgrade each instance or group of instances for a smooth transition. esix breaks down the upgrade process in phases with things to look out for and references the relevant documentation.
https://answers.splunk.com/answers/340953/what-is-the-recommended-compatibility-sequence-of-1.html

How to make a world map dashboard using logs from an email server with no IP addresses?

emixam3 was looking for a way to use logs from an email server to plot dots on different countries in a world map based on the domain of receiver email addresses, but had trouble figuring out how to do this without associated IP data. yannK points out that map tools rely on longitude and latitude coordinates, and geoip tools rely on IPs to convert them to coordinates, but gives emixam3 another approach. He suggests creating a lookup with domain, country, lat, and long fields to use for searches in combination with the geostats command to create map visualizations.
https://answers.splunk.com/answers/343068/how-to-make-a-world-map-dashboard-using-logs-from.html
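
A hedged sketch of that approach might look like the following, assuming the lookup has been uploaded as email_domain_geo.csv with domain, country, lat, and long columns, and that a recipient field holds the receiver address (all of these names are placeholders to adapt to your data):

sourcetype=mail_server_logs
| rex field=recipient "@(?<rcpt_domain>\S+)$"
| lookup email_domain_geo.csv domain AS rcpt_domain OUTPUT country lat long
| geostats latfield=lat longfield=long count by country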

Thanks for reading!

Missed out on the first fifty-five Smart AnSwerS blog posts? Check ‘em out here!
http://blogs.splunk.com/author/ppablo


Using Syslog-ng with Splunk


Overview

A Splunk instance can listen on any port for incoming syslog messages. While this is easy to configure, it’s not considered best practice for getting syslog messages into Splunk. If the splunkd process stops, all syslog messages sent during the downtime are lost. Additionally, all syslog traffic streams to a single Splunk instance instead of being spread across all of the indexers, which is usually not what you want.

What is the best practice for getting syslog data into Splunk? The answer is a dedicated syslog server.

Below we discuss the installation, configuration and utilization of syslog-ng as the syslog server for Splunk.

Syslog-ng:

syslog-ng is an open source implementation of the syslog protocol for Unix and Unix-like systems. It extends the original syslogd model with content-based filtering, flexible configuration options, and important additions to syslog such as TCP transport. Today, syslog-ng is developed by Balabit IT Security Ltd. It has two editions with a common codebase: the Open Source Edition (OSE), licensed under the LGPL, and the Premium Edition (PE), which adds plugins (modules) under a proprietary license.

Installation:

Syslog-ng is pre-packaged with some Linux distributions. On CentOS/RHEL 6 it can also be installed from the EPEL repository by downloading the epel-release package with wget, installing it, and then installing syslog-ng with yum, as shown below.

# wget http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
# rpm -ivh epel-release-6-8.noarch.rpm
# yum install --enablerepo=epel syslog-ng

Yum will resolve any required dependencies, then download and install syslog-ng 3.2.5-3.el6.
The syslog-ng service will start, but it may give a warning message about a missing module as shown below.

Plugin module not found in 'module-path'; module-path='/lib64/syslog-ng', module='afsql'
Starting syslog-ng: Plugin module not found in 'module-path'; module-path='/lib64/syslog-ng', module='afsql'

Although syslog-ng works without the syslog-ng-libdbi module, install it to prevent the warning message from appearing each time syslog-ng is started.

# rpm -i libdbi-0.8.3-4.el6.x86_64.rpm
# yum install syslog-ng-libdbi

Disabling rsyslog

Turn off rsyslog and disable the rsyslog service from starting at boot time

# service rsyslog stop
# chkconfig rsyslog off

Enabling syslog-ng

Enable syslog-ng to start at boot and start syslog-ng service

# service syslog-ng start
# chkconfig syslog-ng on

 

Modifying IPTables to allow UDP traffic

Check iptables to determine which ports are open (-L lists the current rules, -n shows ports and addresses numerically instead of resolving names).

# iptables -L -n

We need UDP port 514 (the default syslog port, which requires root privileges to bind) to be allowed through iptables.
To add the rule, use the following command.

# iptables -A INPUT -p udp -m udp --dport 514 -j ACCEPT
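
Note that iptables -A only changes the running ruleset; on CentOS/RHEL 6 you can persist it to /etc/sysconfig/iptables with:

# service iptables save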

 

Modifying syslog-ng.conf
Copy the existing syslog-ng.conf file to syslog-ng.conf.sav before editing it. The syslog-ng.conf example file below was used with Splunk 6. Each unique data source type had a directory created under /home/syslog/logs. This was done using destination options with the create_dirs attribute set to yes.

@version:3.2

# syslog-ng configuration file.
#
#
options {
chain_hostnames(no);
create_dirs (yes);
dir_perm(0755);
dns_cache(yes);
keep_hostname(yes);
log_fifo_size(2048);
log_msg_size(8192);
perm(0644);
time_reopen (10);
use_dns(yes);
use_fqdn(yes);
};

source s_network {
udp(port(514));
};

#Destinations
destination d_cisco_asa { file("/home/syslog/logs/cisco/asa/$HOST/$YEAR-$MONTH-$DAY-cisco-asa.log" create_dirs(yes)); };
destination d_palo_alto { file("/home/syslog/logs/paloalto/$HOST/$YEAR-$MONTH-$DAY-palo.log" create_dirs(yes)); };
destination d_all { file("/home/syslog/logs/catch_all/$HOST/$YEAR-$MONTH-$DAY-catch_all.log" create_dirs(yes)); };

# Filters
filter f_cisco_asa { match("%ASA" value("PROGRAM")) or match("%ASA" value("MESSAGE")); };
filter f_palo_alto { match("009401000570" value("PROGRAM")) or match("009401000570" value("MESSAGE")); };
filter f_all { not (
filter(f_cisco_asa) or
filter(f_palo_alto)
);
};
# Log
log { source(s_network); filter(f_cisco_asa); destination(d_cisco_asa); };
log { source(s_network); filter(f_palo_alto); destination(d_palo_alto); };
log { source(s_network); filter(f_all); destination(d_all); };

 

Restarting syslog-ng
Syslog-ng can be restarted by executing the init script in /etc/init.d or by issuing the service syslog-ng stop | start | restart commands shown below.

# service syslog-ng stop
Stopping syslog-ng: [ OK ]

# service syslog-ng start
Starting syslog-ng: [ OK ]

# /etc/init.d/syslog-ng restart
Stopping syslog-ng: [ OK ]
Starting syslog-ng: [ OK ]

 

Configuring SELinux
In some cases, syslog-ng may not write any files out to the destination directories. The SELinux (Security-Enhanced Linux) module can block the syslog daemon from writing. If this happens, the corresponding denial entries can be found in /var/log/audit/audit.log. Run the getenforce command to check the SELinux status/mode.

# /usr/sbin/getenforce
Enforcing

Note: Although SELinux can be disabled or set to “Permissive”, check with the sysadmin first.
The sysadmin may want to add exceptions to the SELinux policy instead.

Edit the selinux config file (vi /etc/selinux/config) and change the mode to Permissive.
Run sestatus to ensure config file edit was successful (should read “permissive” as shown below).

# sestatus

SELinux status:                 enabled

SELinuxfs mount:                /selinux

Current mode:                   enforcing

Mode from config file:         permissive

Policy version:                 24

Policy from config file:       targeted

Change the current mode from enforcing to permissive using the setenforce command as shown below. After this command, syslog-ng was able to write to /home/syslog/logs.

# setenforce 0
# sestatus

SELinux status:                 enabled

SELinuxfs mount:               /selinux

Current mode:                   permissive

Mode from config file:         permissive

Removing old log files from syslog-ng server
To ensure syslog-ng doesn’t fill the filesystem with log files, create a cron job that removes log files older than a given number of days. The example below runs every morning at 5 AM and removes files older than 7 days.

# crontab -e
0 5 * * * /bin/find /home/syslog/logs/ -type f -name \*.log -mtime +7 -exec rm {} \;

Use the crontab -l command to list existing cron jobs and verify that the new job is scheduled correctly.

UF collection on syslog-ng server

Install a Universal Forwarder (UF) on the machine where the syslog-ng server is installed.
The UF on the syslog-ng server can collect events from the log files written for the Cisco ASA and Palo Alto firewall devices. The monitor stanza below will monitor everything under the listed filesystem path.

Notice the attribute host_segment is used to identify the position of the hostname relative to the full path from the left.

# Cisco ASA
[monitor:///home/syslog/logs/cisco/asa/*/*.log]
sourcetype = cisco:asa
index = cisco
disabled = false
host_segment = 6

Splunk walks the filesystem path to the sixth field and sets the hostname for the events to the value found.

/home/syslog/logs/cisco/asa/<hostname>/2014-09-10-cisco-asa.log
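
For the Palo Alto logs written by the syslog-ng configuration above, a corresponding stanza might look like the following; the sourcetype and index names are examples to adjust to whatever your Palo Alto add-on expects. Here host_segment = 5, because the hostname is the fifth path segment under /home/syslog/logs/paloalto/.

# Palo Alto
[monitor:///home/syslog/logs/paloalto/*/*.log]
sourcetype = pan:log
index = pan
disabled = false
host_segment = 5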

Another Update to Keyword App


It’s been three years since I first released the relatively simple Keyword app on Splunkbase and wrote an initial blog entry describing it, followed by an updated entry. In summary, the Keyword app is a series of form search dashboards designed for Splunk 6.x and later that allow a relatively new user to type in keywords (e.g., error, success, fail*) and get quick analytical results such as baselines, prediction, outliers, etc. Splunk administrators can give this app to their users as is, use the app as a template to write their own keyword dashboards, or take the searches in the app to create new views.

For this update, I’ve used fellow Splunker Hutch’s icons to update the display. I also removed the quotes around the token in the search so that users can now type things like

index=_internal err*

or anything else that you would put before the first pipe symbol in a search. Finally, I added a new dashboard using the abstract command. The abstract command in Splunk produces a summary of multi-line events using a scoring mechanism, which saves you from having to read each whole event; this is handy for things like stack traces. A quick example is below, followed by a screenshot of the form search dashboard.
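
This run-anywhere sketch shows the command in action (your own multi-line events, like stack traces, are a better target than the mostly single-line _internal logs used here):

index=_internal log_level=ERROR
| abstract maxlines=5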

Abstract Form Search

Using Splunk to Monitor Changes to PowerShell Scripts


I had a question this morning from a customer who was looking for ways to monitor changes made to PowerShell scripts in their environment. They wanted to know who made the changes, but also what changes were made. Well, I thought to myself–that’s a great excuse for a blog post!

Let’s break this down into two separate requirements:

  1. I want to know when a PowerShell script has been modified
  2. I want to know the changes between two versions of a file that has been modified

Who changed a file and when?

Requirement #1 is not hard to do using Splunk in combination with some native Windows file auditing features. In fact, it’s such a common use case that we’ve documented all the steps in the core docs. In short, this technique involves: in Windows, enabling NTFS filesystem auditing for the path in question; and then in Splunk, enabling security event log monitoring.

The end result will be audit events in Splunk that will look similar to the below. (Sorry, I could only find a group policy change quickly; a filesystem audit event will have a few differences.)

2016-03-16_11-02-34

The lower body of this event even shows who made the change, for example:

Subject:
  Security ID: SEATTLE\Administrator
  Account Name: Administrator
  Account Domain: SEATTLE
  Logon ID: 0x55e65c847

Of course, you can create an alert or make a dashboard or report to then surface these events up to those who may need to take action.

But *what* changed?

Now we come to the reason that I wanted to break this into two separate requirements. I think that a best practice for authoring scripts used in production scenarios is to check them into a source code repository like git. That may still be a new concept to many sysadmins, but I hope that this changes, because there are many excellent reasons for admins to use some tools which used to be only for developers. (And hey, that’s what DevOps is all about–one of my favorite topics.) Some reasons for using source control for config files and scripts include:

  • track changes in files over time
  • rich “diff” tools to visualize these changes
  • package an approved set of changes as a “release”, maybe even tag the release with a change control ID if you’re into that
  • source control systems produce data that can be Splunked!

Speaking of visualizing diffs, here’s a quick screenshot of viewing a change in my favorite editor, Visual Studio Code (which is great for editing PowerShell files, by the way). Here is the list of changes in my project:

2016-03-16_11-34-35

And here’s the change in detail, highlighting the addition of a paragraph:

2016-03-16_10-38-12

You might be thinking by now, “ok, looks cool, but show me how to do this in a Windows environment with PowerShell scripts”. To answer that, I’m going to send you to a fellow PowerShell MVP named Stephane van Gulick. Stephane has written a comprehensive post on this very topic that I liked: Embrace version control using powershell git and github.

What’s next?

There are some related topics that I wanted to mention before letting you go. If you go back to the original question, one assumption that came to my mind was that this was being done not just for auditing purposes, but also for security and stability. In that case, what you really want to do is prevent all modified scripts from being executed, thus preventing any damage which might occur. PowerShell has a feature for this called the Execution Policy. Be sure to research how you can sign PowerShell scripts, and going one step further, prevent scripts that aren’t signed, or that have an invalid signature, from being executed. Or maybe you just want to verify the signatures, and for auditing purposes you might Splunk the output of that check.

And lastly, I should mention that you ought to follow such changes as I’ve described above with policy and training. Once everyone is on the source control train, it will be much easier to have quality discussion around a change control process, and how it can contribute to a tighter IT shop all around.

The Value of Hybrid Highlighted as Splunk is Honored by SC Magazine Awards


Every organization has a cloud strategy. It’s a journey, but the destination is clear. And, it’s my experience that regardless of what mile marker organizations have just passed on the road to cloud, they’re likely operating a hybrid environment. This means they are running solutions both on-premises and in the cloud. To support this strategy, we offer hybrid delivery options – supporting both on-premises and cloud solutions – one of the differentiated values we provide our Splunk customers.

And today, I’m excited to share that our differentiated value just got a big boost as Splunk Enterprise won a Trust award for “Best Fraud Prevention Solution” and Splunk Enterprise Security won a Trust award for “Best SIEM Solution” from SC Magazine. Not only does it feel good to be recognized for our innovation, but it’s great that our customers with a hybrid strategy can now feel even more confident with their Splunk solutions.

Customers who have integrated both Splunk Enterprise on-premises and Splunk Cloud software-as-a-service have the unique ability to search data across both platforms – with two of these award-winning solutions at the core of the platform. Some customers are using Splunk Enterprise on-premises and in the cloud as their Fraud Prevention Solution and others are integrating Splunk Enterprise Security into their Splunk Cloud platform to create a SIEM-in-the-Cloud.

What customers gain from this combination of deployment options is visibility. Splunk enables its customers to ingest, search, visualize and analyze machine data of any type from any environment. And, with these two award-winning solutions in a hybrid delivery model, users gain instant, real-time security, operational, and customer insights across their entire environment.

So, in the end, it’s a balance. Something we like to call hybrid harmony – that place where on-premises and cloud environments can operate in sync as long as there is the right level of visibility, monitoring and insight.

Cheers!

Marc Olesen
SVP & GM, Cloud Solutions
Splunk Inc.

Related Reads:
Splunk Security Shines at RSA 2016
Splunk Security Takes Double Honors at SC Magazine 2016 Awards

Creating the optimal customer journey using analytics


Understanding the customer journey is currently a hot topic. This is because being able to deliver the right information and messaging to all consumers at every touch point is now a critical element of brand success.

Tracking customer journeys is an omni-channel challenge, but in many journeys there is some form of online interaction – which makes this channel a pivotal element in the process.

For me, the most interesting thing about online customer journeys is the fact that there is typically a divergence between how brands design and perceive the customer journey and the actual route taken by consumers. This is why being able to accurately track each interaction during the customer journey is critical.

The work that Splunk is undertaking for Kurt Geiger is typical of our work in online customer journey mapping. Kurt Geiger is looking to monitor customer activity with the aim of optimising the experience for everyone who visits its website; the ultimate objective being to improve the customer experience and drive sales.

Read more details of how Kurt Geiger is using Splunk.

So how can brands optimise the customer journey?

It is possible to understand the details of any journey on any website by analysing the metadata contained in weblogs. Using this rich, machine-generated data we can construct a detailed map of every journey, based on an agreed definition of what constitutes a ‘session’.

By aggregating qualifying sessions to understand the route that a sub-group of consumers takes, we can establish the information and messaging they will have experienced during their journey. It will also tell us how many consumers ultimately completed a purchase or experienced some sort of application error.

Arguably the greatest value from this type of work comes from the ability to understand differences in the journey for those in key consumer segments, which look to take account of differences in, for example: the route taken to arrive at a website; the activity undertaken while on the website; and (via relational data correlation) wider purchasing behaviour or value. The benefit of overlaying consumer segments is that this enables communications to be tailored for each segment, based on their needs and their typical journey.

An increasingly important consideration for brands conducting customer journey mapping is the ability to align outputs with any Business Information / KPIs reported internally. By ensuring that the data used to generate journey maps are complementary with internal reporting, it is possible to provide insights that align with existing perceptions of business performance. This alignment is particularly important when measuring the impact of any changes to a website layout that are designed to improve the customer experience – as it provides a ready-made evaluation framework.

In conclusion, by using information contained in weblogs, we can build a detailed picture of online customer journeys that can be used as a basis for improving the customer experience, as well as for monitoring the impact of these improvements on business performance.

Further insights relating to how analytics is being used to improve the customer experience are contained in the following articles:

Operational Intelligence a Necessity for John Lewis

Splunk for Customer Experience Analytics

Masters of Machines

Your Splunk Sandbox


When I was an admin, sometimes I wanted to Splunk things, but not in my production environment. Maybe I wanted to add data and define the corresponding sourcetype. Maybe I wanted to mess with some backend conf files. Maybe I wanted to muck around with a new version of a search or dashboard. Whatever the reason, I learned a few approaches that may be obvious for the Splunk Ninjas out there, but not so much for our adorable n00bs. Either way, if you find yourself hesitating to try something Splunky, then this post is for you.

Build a Splunk Sandbox

Ideally, you’re installing Splunk on your local workstation (desktop/laptop), but if your company hasn’t given you access rights to install Splunk, then see if there’s a cloud or virtual machine you can get your hands on. Something internal to your organization is ideal so you don’t have to worry about your data leaving the company firewalls. Other approaches include VirtualBox or Docker (for which you’ll find many blog posts -> http://blogs.splunk.com/?s=docker). The key is, you want something that is going to be easy to access, and easy to rebuild when you break it (yes, “when”…if you’re not breaking it, you’re not taking enough risks). Of course, to install, just follow your handy Installation Manual, which will even walk you through the download.

Once Splunk Enterprise is installed you’ll want to convert it to the free license so you can use it perpetually without any hassle. Don’t worry about pointing it to any indexers or anything like that. You can certainly configure that later using the Distributed Search Manual, but remember that the more you configure your environment, the more work it is to rebuild when you break it, and therefore the less likely you are to take risks.

Play in your Splunk Sandbox

Looking for sandbox data? Check out the free example data available in the Search Tutorial. In fact, if you haven’t taken the tutorial yet, I can’t stress enough how great it is, so stop reading this and take the free Search Tutorial. I’ll wait…..back? Oh, welcome back? Your hair looks nice, did you get it done? I noticed ;).

Ok, back on topic. What if you want to play with data you see in your production instance of Splunk? That’s easy too! Simply export the results of a search you’ve run. There are some gotchas to keep in mind.

  • If the search returned a huge number of events, then it might take a long time to export and later add to your sandbox. Therefore, try reducing the Number of Results to make things more manageable.
  • If you want raw events, then select that option when exporting. Raw events are good if you’re trying to define new elements of the sourcetype in your sandbox before putting it in production. Conversely, if you export in one of the structured formats (CSV, XML, or JSON), you can play with your data without having to worry about the sourcetype’s field extractions. Using a text editor, take a peek at the results of the different options to see what I mean.

Once you’ve exported the data, add it to your local instance! Now you’re cooking…err, I mean Splunking!
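
One way to add an exported file to your sandbox is the add oneshot CLI command; the file path, sourcetype, and index below are placeholders, and the Add Data UI works just as well:

$SPLUNK_HOME/bin/splunk add oneshot ~/Downloads/export.csv -sourcetype my_sandbox_data -index main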

When to Avoid the Sandbox

There are situations in which you won’t need a sandbox but you can still play around without much impact to others.

For example, if you’re creating a cool new search, it might be burdensome to build it in your sandbox. In that case, embrace your time selector and the head command. Whenever I work on a new search, I change my time selection to something very small so I know my searches won’t bog down the system. Depending on your data, Last 15 minutes is a good choice, but check out the Search Manual’s section on this topic to gain more control over your search window. Sometimes, though, the data is not evenly distributed, so the last 15 minutes may have sparse results while a few hours ago is too voluminous. Since a wider search window could slam the Splunk system, check out the head command. The head command is awesome because you can do a search over a long time period, but trust that only a specified number of results will be returned. In other words, by using head, I can search over something as extreme as All Time but feel safe that as soon as, say, ten events are returned, my search will terminate and no longer impact the system.
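
As a sketch (the index, sourcetype, and keyword here are placeholders), a search like this can safely run over something as wide as All Time because it stops once ten matching events have been returned:

index=web sourcetype=access_combined error | head 10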

Another scenario where a Sandbox may not be a good fit is if you want to tweak a knowledge object such as a saved search or a dashboard. In those circumstances, fear not! There’s a great way for you to mess with a copy of the object without impacting the original. It’s a feature called Cloning. Once you make a clone of an object, your clone will appear private to your account. That means no one else knows it even exists! Using your cloned copy, you can immediately edit and play with the object knowing that you’re not impacting the system and no one will mistake your copy for their own. If you want to share your clone with your peers, check out the Manual on Managing Knowledge Object Permissions OR, if you have permission from the author/owner of the original object, you can replace the original with your cloned copy and then delete your cloned copy.

Apply Config without Restart

Sometimes, even in our sandbox, we get tired of restarting Splunk. Fortunately, there are a couple of interesting tricks you can use to get your configuration reloaded without a restart. Be aware that some of these aren’t necessarily formal or supported means to reload Splunk, so if something looks fishy your best bet is to revert back to the trusty ol’ splunk restart.

  • If a sourcetype’s props or transforms have been edited, try refreshing with the extract command’s reload parameter (see the examples after this list).
  • If you’re playing with conf files, you can often reload the configuration using the /debug/refresh URI. Check out the Admin Manual for more details. Be careful: this trick might force SplunkWeb users to re-login.
  • If it’s static files on the backend that you’re playing with, then see the Advanced Dev Manual for how the _bump URI can save you some restarts.
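
To make those concrete: run this from the search bar to reload props and transforms for the search,

| extract reload=true

and hit these URLs in a browser (substitute your own Splunk Web host and port) to reload conf-backed endpoints and to bump the static asset cache, respectively:

http://<your-splunk-host>:8000/debug/refresh
http://<your-splunk-host>:8000/_bump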

Hesitate No More!

As you can see, there’s a number of different ways for you to get yo’ Splunk knowledge on. The last tip I can give is that if you find yourself avoiding something in Splunk, take a step back and think about what’s intimidating you. You may find a solution is as simple as a sandbox.

I hope this helped. Good luck, have fun, and happy Splunking!

Splunking 1 million URLs


Do you love URLs? I do! This is a great way to have insight about behaviors, catch malware, and help to classify what is going on in a network.

I also have a secret: I collect them. The more I have, the happier I am! So what’s better than Splunk to analyze them?

This is the first post of a bunch on what one can do with URLs and Splunk. Please share your war stories, or anything you are doing with Splunk and URLs, in the comments so I can enrich the upcoming posts.

First, you need to grab the Alexa list, which contains the top 1 million URLs in a CSV you can download.

We add the new data source to Splunk:

blog-data1

Splunk automagically discovers the CSV type, and we can start searching for our URLs right away.

Now we need an app to parse our URLs properly; fortunately, Splunkbase has many:

blog-app1 blog-app2


If we start looking at our data, we can run a search such as

source="top-1m.csv.zip:./top-1m.csv" | rex field=_raw "\d+,(?<url>\S+)"

We create a field url using our regex, and then use the lookup to parse those URLs and extract useful new fields:

blog-fields1

We can now look at the top count for domains without the attached TLD:

blog-search1

Which shows in this case Google, with a count of 145. That means Google appears in the top 1 million most visited URLs multiple times under various TLDs, such as:

google.com, google.om, google.li, google.co.ls, google.so, google.co.uk, etc.

If we now look at the top TLDs, it is easy to see com as a top TLD:

blog-tlds1
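
For reference, the searches behind these screenshots look roughly like this; the webfaup lookup comes from the app installed above, and the url_domain_without_tld and url_tld field names are assumptions that may differ depending on which parsing app you chose:

source="top-1m.csv.zip:./top-1m.csv"
| rex field=_raw "\d+,(?<url>\S+)"
| lookup webfaup url
| top limit=20 url_domain_without_tld

Swapping the final field for url_tld produces the TLD breakdown shown above.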

 

Amongst elements extracted, we have one field “url_url_type”, which can give various data, such as ipv6, ipv4, no_tld, unknown_tld, mozilla_tld.

blog-type1

The Mozilla TLD value only indicates presence in the Public Suffix List. So whenever an entry is flagged as “unknown_tld” yet appears in Alexa’s top 1 million URLs, it starts to get interesting:

blog-unknown1

This is actually a TLD romanized as rf, according to what Wikipedia says about it, and it does appear in the Mozilla Public Suffix List as follows:

// xn--p1ai (“rf”, Russian-Cyrillic) : RU

// http://www.cctld.ru/en/docs/rulesrf.php

рф

But it does not have the same encoding, which points to an improvement that could be made in the lookup: adding a Unicode to Punycode conversion, perhaps?

 

Splunk offers a variety of apps, among them several that can help analysts get more out of the great insight URLs provide. Happy Splunking!


Announcing Splunk Add-On for Google Cloud Platform (GCP) at GCPNEXT16!


This week Splunk is thrilled to be speaking and exhibiting at GCPNEXT16 to announce the availability of a Splunk Add-On for Google Cloud Platform. This free add-on, available on Splunkbase, provides IT Ops teams with secure access to GCP Pub/Sub events that you can collect, search, analyze and monitor in Splunk to maintain the security and reliability of mission critical services. This includes any logs from GCP services such as App Engine, Compute Engine, Container Engine, BigQuery, etc. that have been exported to Pub/Sub through Stackdriver Logging. The add-on also includes secure access to GCP’s Stackdriver Monitoring API, which allows you to collect time series performance metrics from App Engine, Compute Engine, Cloud SQL, etc. in Splunk.

Using the prebuilt panels, IT Ops team members can build dashboards and reports that let you quickly drill into GCP events:

GCPIntegrationAnnouncement-Logs

 

Who CHANGED the firewall rule? Why NOW? You’re a click away from finding out what changed, who changed it, when the change was made, and what project may be impacted!

GCPIntegrationAnnouncement-ActivityLogs-nonames

 

GCPIntegrationAnnouncement-FirewallUpdate

 

Here’s our sample prebuilt panel used to monitor CPU Utilization from the Cloud Monitor API metric subscription:

GCPIntegrationAnnouncement-CloudMonitoring-CPU

 

Will you be at GCPNEXT16? If yes, I’d like to invite you to attend my talk on Wednesday, March 23rd from 5:30pm – 6:15pm as part of the Infrastructure & Operations track, Enterprise Ops: Managing your business on GCP and beyond (https://cloudplatformonline.com/NEXT2016-schedule.html).  To learn more, ask questions, and demo Splunk’s Add-On for GCP, please schedule some time to stop by the Splunk booth.  We are #39!  We look forward to meeting you!

Smart AnSwerS #57


Hey there community and welcome to the 57th installment of Smart AnSwerS.

Feels good to be back in action after a 3 week break, minus coming down with the flu, but that hasn’t completely stopped me from shifting my brain back into Splunk mode. Even though I’ve had to spend recovery time working from home, I was still able to join in on the SplunkTrust Virtual .conf March Session on “Grouping with stats: practical concerns and best practices” presented by Nick Mealy, aka sideview. You can visit the Meetup page to find the link to the recording in case you missed out and stay tuned for the next session.

Check out this week’s featured Splunk Answers posts:

How to write a search to only show the latest contents of a lookup file on a dashboard?

kuga_mbsd had an external program creating a lookup table every night, but needed an easy way to search and display the latest contents of the file on a dashboard rather than manually checking it every time. Lucas K gives a nifty solution defining a macro to simplify and automate the process in combination with an inputlookup scheduled search to pull the latest data.
https://answers.splunk.com/answers/330443/how-to-write-a-search-to-only-show-the-latest-cont.html

Is it possible to have your sourcetype be determined at index-time based on host?

cmeyers wanted the sourcetype for his data to be the type of device and wanted this to be based on host as data is indexed. lguinn provides an answer that cautions against using sourcetype for another purpose other than grouping data based on the actual data format and fields. She instead suggests creating a CSV file of host names with other necessary information such as devicetype, and upload it as an automatic lookup to use the devicetype field in searches. This method is easier and more flexible as the CSV file can be updated and reloaded as needed.
https://answers.splunk.com/answers/334331/is-it-possible-to-have-your-sourcetype-be-determin-1.html
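
lguinn’s suggestion boils down to something like the following sketch, assuming a lookup file named device_info.csv with host and devicetype columns (hypothetical names); once it works, the same lookup can be configured as an automatic lookup in Settings so devicetype shows up in searches without the explicit lookup command:

index=network_data
| lookup device_info.csv host OUTPUT devicetype
| stats count by devicetype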

How to work out the age of a user based on date of birth?

A similar question was featured before, but this run anywhere search example by SplunkTrust member somesoni2 is a great learning opportunity for other users. Amohlmann had a search to calculate a person’s age based on a dateofbirth field, but was having trouble figuring out how to make it work for birth dates before 1970. Level up your SPL fu with somesoni2’s answer using rex and eval.
https://answers.splunk.com/answers/338613/how-to-work-out-the-age-of-a-user-based-on-date-of.html
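
This is not necessarily somesoni2’s exact answer, but a run-anywhere sketch of the rex-plus-eval idea (assuming dateofbirth is formatted YYYY-MM-DD) looks like this:

| makeresults
| eval dateofbirth="1965-07-20"
| rex field=dateofbirth "(?<b_y>\d{4})-(?<b_m>\d{2})-(?<b_d>\d{2})"
| eval age=tonumber(strftime(now(),"%Y")) - tonumber(b_y) - if(tonumber(strftime(now(),"%m%d")) < tonumber(b_m.b_d), 1, 0)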

Thanks for reading!

Missed out on the first fifty-six Smart AnSwerS blog posts? Check ‘em out here!
http://blogs.splunk.com/author/ppablo

Hunting that evil typosquatter


We are continuing our URL investigations, S1 episode 2. If you missed the first episode, you can go and read the blog post Splunking 1 million URLs first.

One of the well-known security problems is typosquatting. What would happen if someone registers www.yahoo.om? Knowing this is one of the most popular websites, there is a high chance a small percentage of users would type this instead of the legitimate www.yahoo.com.

.om is the ccTLD for the country of Oman

I encourage you to read a thorough analysis on this problem from Endgame “What does Oman, the House of Cards, and Typosquatting Have in Common? The .om Domain and the Dangers of Typosquatting”. They even published a list on pastebin, which as of today looks like this:

ctrip.om
dangdang.om
directv.om
douban.om
drugstore.om
dubizzle.om
eastmoney.om
enterprise.om
etao.om
fiverr.om
htc.om
huffingtonpost.om
nbc.om
one.om
qqc.om
qvc.om
si.om
sogou.om
tuniu.om
usaa.om
weatherc.om
weiboc.om
y8.om
yatra.om

As a Splunk user, it is great to know the impact of those domains in your network.

While some bash users like to say “Go away or I will replace you with a very small shell script”, at Splunk we like to say “I will find you with a very small Splunk search!”.

First of all, let’s create a CSV from this list. It will look like this:

typodomain, typotype
ctrip.om, typo squatting
...

Then, in Splunk, go to Settings/Lookups to add a Lookup table file. Tune the permissions properly afterwards and let’s get started!

We make sure the CSV works properly first:

typolookup

Now we can apply the lookup to our data. We happen to have bluecoat proxy data, to which we apply the URL schema on the fly as seen in the first post:

sourcetype="bluecoat*" | lookup webfaup url

Then we write the subsearch that returns the typo domains, so only matching events will appear:

| inputlookup typosquatting.csv | rename typodomain as url_domain | fields + url_domain

Now we can combine both:

sourcetype="bluecoat*" | lookup webfaup url | search [| inputlookup typosquatting.csv | rename typodomain as url_domain | fields + url_domain]

And we can check if one of the domains did match that request:

searchtypo

 

Voilà: with a small Splunk search and some external intelligence gathered from a pastebin, this is how one can painlessly catch something you would definitely rather be aware of if it happens in your organization.


Building add-ons has never been easier

$
0
0

Speaking from personal experience, building add-ons had never been the easiest task for me. There are numerous steps required, and each step may come with its own challenges. Worse, I might spend time on a solution just to hear it wasn’t best practice.

Wouldn’t it be great if there was a way to make this process easier by equipping developers, consultants, and Splunk Admins with the right tool to build their own add-ons? To take it a step further, wouldn’t it be even better if this tool actually helps you build the add-on by following tried and true best practices?

Allow me to introduce the Splunk Add-on Builder, which helps address the challenges highlighted above. Splunk Add-on Builder V1 was released on April 1st, 2016. In this release, the Add-on Builder assists with building the basic components of add-ons. Namely:

UI based creation of the add-on and its folder structure:

Screen Shot 2016-04-01 at 5.12.33 PM

Intuitive add-on setup page creation: No need to write xml files, just select the fields you want your add-on setup to expose. Multiple accounts and custom fields are easy to support now:

Screen Shot 2016-04-04 at 2.47.28 PM

Building data collection: in this release, the Add-on Builder helps you build your modular input, supporting various mechanisms such as REST APIs, shell commands, or your own Python code to pull data from third-party systems. If you have a REST API, let us generate the code and modular input for you. Just input the API URL and parameters and hit save:

REST

If you need a modular input that requires you to write your own Python code or run a system command, you can use the Add-on Builder to interactively validate the output:

Modinputvalidation

Interactive fields extraction: the Add-on Builder uses a machine learning clustering algorithm to classify data ingested by the add-on into groups that share the same format structure. That means it can automatically generate the field extractions for each group, letting you skip the grunt work and go straight to recognizing event types.

Fieldextraction

Mapping to CIM made easy:

CIM2

Last but not least, the Add-on Builder offers validation for best practices so you can see if you’re going to run into trouble before you post your Add-on on Splunkbase:

Score

Please give Splunk Add-on Builder a try and provide us with your feedback. We’re very excited to hear how the first version works for you, and we are looking forward to your help to take this to the next level.

Smart AnSwerS #58


Hey there community and welcome to the 58th installment of Smart AnSwerS.

It’s officially baseball season and today is opening day for the San Francisco Giants. With Splunk HQ just two blocks away from the ballpark, traffic and parking mayhem has commenced! Our next SF Bay Area Splunk User Group meeting also falls on a game day, but luckily one of our awesome members Becky Burwell offered to host the meeting at her office away from the madness :) If you happen to be in the area, come join us at 6:00 PM on Tuesday, April 19th, 2016 at Yahoo! HQ in Sunnyvale, CA. Visit the SFBA Splunk User Group page to view the agenda and RSVP.

Check out this week’s featured Splunk Answers posts:

Can a dashboard map’s center location change based on a drop-down token?

Phil219 had a drop-down form on a dashboard for users to display US states on a map and needed a way to change the center of the map based on the selected token. talla_ranjith had implemented something similar in the past and shared his XML code, explaining exactly how it works with screenshots of what the final dashboard should look like.
https://answers.splunk.com/answers/349998/can-a-dashboard-maps-center-location-change-based.html

Is it possible to show a custom tooltip whenever a user hovers over a slice of a pie chart or column in a bar chart using Simple XML with Splunk JS?

lyndac wanted to show a custom tooltip whenever a user hovers over a slice of a pie chart to display certain fields, counts, and percentages. jeffland came around with some JavaScript he had used for this exact case in an older version of Splunk to get lyndac started. Through some back and forth discussion, they hashed out the kinks together and got the final working code.
https://answers.splunk.com/answers/337329/is-it-possible-to-show-a-custom-tooltip-whenever-a.html

After copying the folder for an app I created in Splunk 6.1.3 to 6.2.1, why can’t I see any of the dashboards?

shrirangphadke created an app in Splunk 6.1.8, but couldn’t see any of the dashboards after copying the folder to 6.2.1. Splunktrust member alacercogitatus gave some possible troubleshooting tips to try to deduce the problem. After some additional information from shrirangphadke about how the dashboards were originally created, fellow Splunktrustee MuS hit the nail on the head, explaining that dashboards created by users are first placed under the user directory and have to be shared in the app to move them to the app directory.
https://answers.splunk.com/answers/250152/after-copying-the-folder-for-an-app-i-created-in-s.html

Thanks for reading!

Missed out on the first fifty-seven Smart AnSwerS blog posts? Check ‘em out here!
http://blogs.splunk.com/author/ppablo

Developing Correlation Searches Using Guided Search


Guided Search was released in Splunk Enterprise Security 3.1, nearly two years ago, but it is often an overlooked feature. In reality, it is an excellent tool for streamlining the development of correlation searches. The goal of this blog is to provide a better understanding of how this capability can be used to create correlation searches, above and beyond what Enterprise Security ships with, that meet your unique security requirements.

So what is Guided Search?

It’s a “wizard”-like process to gather the key attributes that make up a correlation search. Essentially, there are five elements to Guided Search:

  • Identify the data set to search
  • Apply a time boundary
  • Filter the data set (optional)
  • Apply statistics (optional)
  • Establish thresholds (optional)

Along the way, Guided Search provides search syntax to validate the filters and thresholds to ensure the output meets your needs.

How does this work?

Let’s take the following example:

We have recently deployed an intrusion detection system (IDS) and we would like to tune the signatures to ensure that the IDS is not overly “chatty” as it pertains to a set of systems within a specific part of our network. Our analysts need to be notified when more than 20 IDS events are triggered by the same signature against any host in that network range within an hour.

Splunk uses data models to create search time mappings of datasets to specific security domains. A data model does not group events by vendor or by network but by the type of event. Examples of data models provided in the Splunk Common Information Model (CIM) include authentication, malware, intrusion detection, and network traffic.

Because we want to identify IDS events, we’ll choose the Intrusion_Detection data model and its IDS_Attacks object. Click Next.

GuidedSearchCreation

We will need to define a time range for the correlation search to run against. This range will bound the events. Because we are focusing on the previous hour, we select Last 60 minutes.

GSC_60minutes

Now that we have a data set and a time frame for our data, we can use Guided Search to filter our search results. If we don’t need all of the IDS data, then why search all of it? Because we are focused on a specific network range, we could use the LIKE operator and the % wildcard to look for any destination IP addresses that start with 192.168.1.x like this: dest_ip LIKE "192.168.1.%". Note that the filter will be implemented using Splunk’s where command so the asterisk (*) wildcard character that you may be familiar with won’t work here. Alternatively, we could use CIDR notation in this way: cidrmatch("192.168.1.0/24", dest_ip).

GSC_Filter

At this point, our search is created and the syntax is displayed. We could click Run search to test our search to ensure that it is collecting the data we are expecting. Once we are satisfied with our data set, we can proceed to applying statistics to the data. Click Next.

While we can create multiple aggregates for our search, we only need a single aggregate and that is a count. Click Add a new aggregate.

GSC_stats

The Function drop-down has a number of statistical functions including count, distinct count, average, standard deviation and more. These functions can be applied to any of the fields that are available in the Attribute drop-down. Because we want a count of events, we are going to count the values in the _raw field and then assign an alias of count to the statistic.

GSC_stats2

Click Next. If we needed to add additional aggregates, we do that now. Click Next.

GSCStats_3

In most cases, we want to use Split-By to select fields to group our data. If we had asked for a count as our aggregate, but did not apply at least one field in the Split-By, we would have ended up with a number that would have represented the count of IDS alerts during the past 60 minutes, making the search fairly useless to an analyst. In our case, we want to know the count of alerts by dest (destination) and by signature, so our Split-By will be both of those fields. Click Next.

GSC_signature

We can create aliases for these fields if we want. If we do not create an alias, it will default to the field name, in this case dest and signature.

GSC_field

The last step we need to take before we are finished is to determine a threshold for the search. If you recall, our requirement was to trigger this correlation search when a host saw the same signature 20 times in one hour so our analysts could review and tune the signature. Based on that, we will set our count Attribute threshold by selecting Greater than Operation and Value of 20. Click Next.

GSC_operationvalue

At this point our search syntax is available for review. We can click Run search to review the output and to ensure the results are what we would expect.

GSC_ready

GSC_newsearch
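
For readers who want a feel for the result, a hand-written rough equivalent of the search Guided Search produces for this example might look like the following; the generated SPL differs in detail, and the field names assume the CIM Intrusion_Detection data model:

| datamodel Intrusion_Detection IDS_Attacks search
| where cidrmatch("192.168.1.0/24", 'IDS_Attacks.dest_ip')
| stats count(_raw) as count by IDS_Attacks.dest, IDS_Attacks.signature
| where count > 20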

Once we are happy with the results, we can click Save to write the generated search processing language (SPL) into the Correlation Search. From there, we can configure the rest of our correlation search, but we will save that for another post!

John Stoner
Federal Security Strategist
Splunk Inc.

Show Me Your Viz!


Have you just downloaded Splunk 6.4 and asked yourself what’s new and awesome? Have you ever built a dashboard with a custom visualization and wanted to share it with someone or easily replicate it somewhere else? Have Splunk’s core visualizations dulled your senses?

Reader, please meet Splunk 6.4 Custom Visualizations. Are you besties yet? If not, you two will be making sweet love by the end of this article.

I’m going to walk you through a Custom Visualization app I recently wrote and lay it all out there. I’m going to talk about why building these visualizations in Simple XML and HTML is a pain in your ass and how the APIs make your life easier. I’m going to show you techniques I learned the hard way so you can accelerate the creation of your app. Does all that sound good? Great, let’s get started.

First, a little backstory…

I recently presented a Developer session at Splunk Live San Francisco this past March. After the session I had a customer come up to me and ask a really simple question: how do you plot single values on a Splunk map? The simple answer is you don’t. But that’s a really shitty answer, especially since there is no way to natively do it. Our maps were built to leverage aggregate statistics clustered into geographic bins from the geostats command. The output of that is bubbles on a map that optionally show multiple series of data (like a pie chart) if you use a split by clause in your search. If you’ve ever tried to plot massive numbers of categories using the argument globallimit=0, your system will come to a grinding halt as it spirals into JavaScript-induced browser death. I thought about the problem for a while and figured there had to be a better way.

Less than a month later, in the span of a week, I built an app that solves this problem. The best part is it’s distributable on Splunkbase and can be used by anyone running a Splunk 6.4 search head. If you’re a Javascript pro then I’m confident you can build a visualization app in a couple days. Here’s how I did it and what I learned.

Step 0 – Follow The Docs

The docs are quite good for a new feature. Use them as a definitive reference on how to build a Custom Visualization app. Use this article to supplement the docs for some practical tips and tricks.

Step 1 – Put Splunk Into Development Mode

This will prevent Splunk from caching JavaScript and won’t minify your JS or CSS. It allows you to just refresh the browser to load any changes you made without having to restart Splunk.

Create $SPLUNK_HOME/etc/system/local/web.conf with the following settings and restart Splunk.

[settings]
minify_js = False
minify_css = False
js_no_cache = True
cacheEntriesLimit = 0
cacheBytesLimit = 0
enableWebDebug = True

Step 2 – Create The App Structure

Follow this section from the docs. There’s a link to a template that has the requisite app structure and files you’ll need to be successful. The only thing it’s missing is the static directory to house your app icons if you decide to post the app on Splunkbase and want it to look pretty.

Step 3 – Create The Visualization Logic

Step 3a – Include Your Dependencies

Here’s where we’ll spend the bulk of our time. This step is where you install any dependencies that your visualization relies on and write your code to display your viz.

You may be asking yourself when I’m going to get to some of the relevant points I mentioned above, specifically how using this API makes your life easier. Here’s where the rubber meets the road. We’re managing dependencies a little differently this time around. If you’ve built a custom visualization in Splunk 6.0-6.3 you’ve done it using one of two methods. The first is converting your Simple XML dashboard into HTML. This works well but isn’t guaranteed to be upgrade friendly. The second method is loading JavaScript code from Simple XML. If you’ve used either method you’ll likely have run into RequireJS. We use RequireJS under the covers in Splunk to manage the loading of JavaScript libraries. It works, but it’s a major pain in the ass and it’s a nightmare when you have non-AMD compliant modules or conflicting version dependencies for modules that Splunk provides. I come from a Python world where importing dependencies (modules) is easy. Call me simplistic or call me naive, but why shouldn’t JavaScript be so simple?

The Custom Visualization framework makes dealing with dependencies a lot easier by leveraging npm and webpack. This makes maintaining and building your application a lot easier than trying to do things in RequireJS. Use npm to download dependencies with a package.json (or manually install with npm install) and webpack will build your code and all the dependencies into a single visualization.js file that the custom viz leverages. This code will integrate smoothly with any dashboard and you won’t run into conflicts like you may have in the past with RequireJS.

The visualization I built requires a couple of libraries: Leaflet and a plugin called Leaflet.markercluster.

Here’s what it looked like to load these libraries using RequireJS in an HTML dashboard within an app called ‘leaflet_maps’. Luckily, Leaflet doesn’t require any newer versions of jQuery or Underscore than are provided by Splunk. I’ve had to shelve an app I want to build because of RequireJS and the need for newer versions of jQuery and Lodash (modified Underscore). If you’re a RequireJS pro you may be screaming at me to use Multiversion support in RequireJS. I’ve tried it unsuccessfully. If you can figure it out, please let me know what you did to get it working.

require.config({
    baseUrl: "{{SPLUNKWEB_URL_PREFIX}}/static/js",
    paths: {
        'leaflet': '/static/app/leaflet_maps/js/leaflet-src',
        'markercluster': '/static/app/leaflet_maps/js/leaflet.markercluster-src',
        'async': '/static/app/leaflet_maps/js/async',
    },
    shim: {
        leaflet: {
            exports: 'L'
        },
        markercluster: {
            deps: ['leaflet']
        }
    }
});

This piece of code literally took me half a day to figure out. Things are easy in RequireJS if your module is AMD compliant. If it isn’t, like Leaflet.markercluster, you have to shim it. The bottom line is it’s a pain in the ass and difficult to get working. It took a lot of Google searching and digging through docs.

Here’s what it looks like using npm and webpack.

npm config – package.json

{
  "name": "leaflet_maps_app",
  "version": "1.0.0",
  "description": "Leaflet maps app with Markercluster plugin functionality.",
  "main": "visualization.js",
  "scripts": {
    "build": "node ./node_modules/webpack/bin/webpack.js",
    "devbuild": "node ./node_modules/webpack/bin/webpack.js --progress",
    "watch": "node ./node_modules/webpack/bin/webpack.js -d --watch --progress"
  },
  "author": "Scott Haskell",
  "license": "End User License Agreement for Third-Party Content",
  "devDependencies": {
    "imports-loader": "^0.6.5",
    "webpack": "^1.12.6"
  },
  "dependencies": {
    "jquery": "^2.2.0",
    "underscore": "^1.8.3",
    "leaflet": "~1.0.0-beta.2"
  }
}

This is the same package.json provided in the sample app template. The only things I modified were the name, author, license, devDependencies and dependencies. The important dependencies are imports-loader and leaflet. Leaflet.markercluster is available via npm but it’s an older version that was missing some features I needed so I couldn’t include it here. Now all I need to do is have nodejs and npm installed and run ‘npm install’ in the same directory as the package.json ($SPLUNK_HOME/etc/apps/leaflet_maps_app/appserver/static/visualizations/leaflet_maps). This creates a node_modules directory with your dependencies code.

webpack config – webpack.config.js

var webpack = require('webpack');
var path = require('path');

module.exports = {
    entry: 'leaflet_maps',
    resolve: {
        root: [
            path.join(__dirname, 'src'),
        ]
    },
    output: {
        filename: 'visualization.js',
        libraryTarget: 'amd'
    },
    module: {
        loaders: [
            {
                test: /leaflet\.markercluster-src\.js$/,
                loader: 'imports-loader?L=leaflet'
            }
        ]
    },
    externals: [
        'vizapi/SplunkVisualizationBase',
        'vizapi/SplunkVisualizationUtils'
    ]
};

Again, this is the same file provided in the template app. The difference here is the ‘loaders’ section of the ‘module’ definition. I’m using the webpack imports-loader to shim the Leaflet.markercluster module since it’s not AMD compliant. This is analogous to the RequireJS shim code I provided above. The difference here is that it’s much more intuitive (once you figure out you need imports-loader) to shim in webpack. The test key is a regex that matches the Leaflet.markercluster source file. The loader key defines the dependency on the function ‘L’ which is exported in the leaflet library.

Lastly, here’s the one small portion of RequireJS that you have to touch in your source.

define([
            'jquery',
            'underscore',
            'leaflet',
            'vizapi/SplunkVisualizationBase',
            'vizapi/SplunkVisualizationUtils',
            '../contrib/leaflet.markercluster-src'
        ],
        function(
            $,
            _,
            L,
            SplunkVisualizationBase,
            SplunkVisualizationUtils
        ) {

I’ve created a contrib directory and added some supporting Javascript and CSS files. I’ve listed my leaflet module and its exported L function, as well as the leaflet.markercluster source location in contrib. Notice that since leaflet.markercluster is not AMD compliant, I don’t need to define a corresponding function argument for it.

Now all you have to do is build the code using npm.

bash-3.2$ cd $SPLUNK_HOME/etc/apps/leaflet_maps_app/appserver/static/visualizations/leaflet_maps
bash-3.2$ npm run build

> leaflet_maps_app@1.0.0 build /opt/splunk/etc/apps/leaflet_maps_app/appserver/static/visualizations/leaflet_maps
> node ./node_modules/webpack/bin/webpack.js

Hash: 9ea37b6ef76197f0a3b7
Version: webpack 1.12.14
Time: 511ms
           Asset    Size  Chunks             Chunk Names
visualization.js  649 kB       0  [emitted]  main
   [0] ./src/leaflet_maps.js 7.39 kB {0} [built]
    + 6 hidden modules

Any time you make subsequent changes to your source, just re-run the build and refresh Splunk.
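
During development, the watch script defined in package.json above saves you the manual rebuild step; something like this keeps webpack rebuilding visualization.js whenever a source file changes (you still need to refresh Splunk to pick up the new bundle):

bash-3.2$ cd $SPLUNK_HOME/etc/apps/leaflet_maps_app/appserver/static/visualizations/leaflet_maps
bash-3.2$ npm run watch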

I built this app on CentOS 7 in a Docker image. I have npm and node installed in the Docker image, but it’s also possible to leverage the node binary that ships with Splunk. You’d just tweak this section of your package.json.

"scripts": {
    "build": "$SPLUNK_HOME/bin/splunk cmd node ./node_modules/webpack/bin/webpack.js",
    "devbuild": "$SPLUNK_HOME/bin/splunk cmd node ./node_modules/webpack/bin/webpack.js --progress",
    "watch": "$SPLUNK_HOME/bin/splunk cmd node ./node_modules/webpack/bin/webpack.js -d --watch --progress"
  },

Then export the path to your Splunk 6.4 install as SPLUNK_HOME.

bash-3.2$ export SPLUNK_HOME=/opt/splunk

Your visualization code and all dependencies are now built into a single file called visualization.js.

Step 3b – Write The Code

If you’re using the app template and following the docs then you’ll be modifying the file /appserver/static/visualizations//src/visualization_source.js to place your code. There are a bunch of methods in the API that will be relevant.

The first method is updateView

The updateView method is where you stick your custom code to create your visualization. There’s nothing super fancy going on and the docs do a great job explaining what needs to be done here. One important thing to cover is how to handle searches that return > 50,000 results. If you’ve worked with the REST API or written dashboards using the SplunkJS stack you’ll know that you can only get 50,000 results at a time. Things are no different here. It just wasn’t obvious how to do it. I had to dig through the API to figure it out. Here’s how I did it so you don’t have to waste your time.

    updateView: function(data, config) {
        // get data
        var dataRows = data.rows;

        // check for data
        if (!dataRows || dataRows.length === 0 || dataRows[0].length === 0) {
            return this;
        }

        if (!this.isInitializedDom) {
	    // more initialization code here
	    this.chunk = 50000;
	    this.offset = 0;
	}

	// the rest of your code logic here

	// Chunk through data 50k results at a time
	if(dataRows.length === this.chunk) {
	    this.offset += this.chunk;
	    this.updateDataParams({count: this.chunk, offset: this.offset});
	}
    }

I initialize a couple of variables: offset and chunk. I then check to see if I get a full 50k events back. If so, I increment my offset by the chunk size and update my data params. This will continue to page through my result set, calling updateView each time and synchronously running back through the code, until I get < 50k events. It's straightforward but not documented anywhere.

This leads us to the second method getInitialDataParams

This is where you set the output format of your data and how many results the search is limited to.

        // Search data params
        getInitialDataParams: function() {
            return ({
                outputMode: SplunkVisualizationBase.ROW_MAJOR_OUTPUT_MODE,
                count: 0
            });
        },

I set the count to 0, which returns an unlimited number of results. This can be dangerous and could potentially overwhelm your visualization, so be sure it can handle the volume before you go down this route. Here are the available options and output modes.

/**
         * Override to define initial data parameters that the framework should use to
         * fetch data for the visualization.
         *
         * Allowed data parameters:
         *
         * outputMode (required) the data format that the visualization expects, one of
         * - SplunkVisualizationBase.COLUMN_MAJOR_OUTPUT_MODE
         *     {
         *         fields: [
         *             { name: 'x' },
         *             { name: 'y' },
         *             { name: 'z' }
         *         ],
         *         columns: [
         *             ['a', 'b', 'c'],
         *             [4, 5, 6],
         *             [70, 80, 90]
         *         ]
         *     }
         * - SplunkVisualizationBase.ROW_MAJOR_OUTPUT_MODE
         *     {
         *         fields: [
         *             { name: 'x' },
         *             { name: 'y' },
         *             { name: 'z' }
         *         ],
         *         rows: [
         *             ['a', 4, 70],
         *             ['b', 5, 80],
         *             ['c', 6, 90]
         *         ]
         *     }
         * - SplunkVisualizationBase.RAW_OUTPUT_MODE
         *     {
         *         fields: [
         *             { name: 'x' },
         *             { name: 'y' },
         *             { name: 'z' }
         *         ],
         *         results: [
         *             { x: 'a', y: 4, z: 70 },
         *             { x: 'b', y: 5, z: 80 },
         *             { x: 'c', y: 6, z: 90 }
         *         ]
         *     }
         *
         * count (optional) how many rows of results to request, default is 1000
         *
         * offset (optional) the index of the first requested result row, default is 0
         *
         * sortKey (optional) the field name to sort the results by
         *
         * sortDirection (optional) the direction of the sort, one of:
         * - SplunkVisualizationBase.SORT_ASCENDING
         * - SplunkVisualizationBase.SORT_DESCENDING (default)
         *
         * search (optional) a post-processing search to apply to generate the results
         *
         * @param {Object} config The initial config attributes
         * @returns {Object}
         *
         */

Some other methods you may want to look into are initialize, setupView, formatData and drilldown. If you want to look at all the methods take a look at $SPLUNK_HOME/share/splunk/search_mrsparkle/exposed/js/vizapi/SplunkVisualizationBase.js
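
As a rough illustration of where those hooks fit, here’s a minimal sketch; the method names come from SplunkVisualizationBase, but the bodies are placeholders of my own, not code from my app.

    // Illustrative skeleton only
    initialize: function() {
        SplunkVisualizationBase.prototype.initialize.apply(this, arguments);
        // one-time setup, e.g. grab a handle to the container element
        this.$el = $(this.el);
    },

    formatData: function(data, config) {
        // reshape the raw search results here before updateView receives them
        return data;
    },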

Steps 4-7 – Add CSS, Add Configuration Settings, Try Out The Visualization, Handle Data Format Errors

Refer to the docs.

Step 8 – Add User Configurable Properties

You’ll most likely want to give your user an interface to tweak the parameters of the visualization. You HAVE to define these properties in default/savedsearches.conf and README/savedsearches.conf.spec. Follow the docs here and don’t skip this step!
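
As a hedged, hypothetical example for the ‘cluster’ property used in the format menu example below, the entries would look roughly like this (check the docs for the exact stanza layout):

default/savedsearches.conf

[default]
display.visualizations.custom.leaflet_maps_app.leaflet_maps.cluster = true

README/savedsearches.conf.spec

[default]
display.visualizations.custom.leaflet_maps_app.leaflet_maps.cluster = <bool>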

Step 9 – Implement a format Menu

Refer to the docs. If you want to add a default value, here’s an example of how you’d do it. Quick warning: I had to strip out the HTML so WordPress wouldn’t mangle it. Have a look at the code in my app if you want a full example.

splunk-radio-input name="display.visualizations.custom.leaflet_maps_app.leaflet_maps.cluster" value="true"

That’s all there is to it! I’m not a Javascript developer and this was pretty damn simple to figure out. I hope you find the experience as enjoyable as I did. If you build something cool, please contribute it back to the community and post the app on Splunkbase.com.


Smart AnSwerS #59


Hey there community and welcome to the 59th installment of Smart AnSwerS.

There’s a tradition at Splunk where “something” happens to or around your desk if you take PTO for at least 2-3 weeks. When piebob left for the UK late last year, she returned to Splunk HQ with a completely homemade replica of the cruise ship she took on her trip abroad which spanned the entire length of her desk. This week, support engineer DerekB just came back from paternity leave to find a hybrid Audi baby stroller made entirely out of cardboard with fully functional wheels. To top it off, it’s parked right behind me and Derek’s (pouty) face was printed out and tacked on to a dilapidated baby doll that stares into my very soul. *shudder*

Check out this week’s featured Splunk Answers posts:

Diagram of Splunk Common Network Ports

This post is over 2 years old, but still very useful and pretty to look at! rob_jordan asked and answered this question to share a diagram he created to help the Splunk community understand what network ports commonly used in Splunk Enterprise environments need to be open to allow traffic through a firewall. He also shares a link in a second answer for anyone interested in downloading the source Visio diagrams.
https://answers.splunk.com/answers/118859/diagram-of-splunk-common-network-ports.html

Is it possible to create a batch data input via the REST API?

Sometimes SplunkTrust members need to ask questions on Answers too…and they figure out the solution and answer their own questions to educate other Splunk users :) sideview had an app with a data input wizard which used the REST API to list and create monitor data inputs, but wanted it to also do the same for batch inputs. With some input from jkat54 and help from Splunk Support, sideview was able to figure out the correct REST API endpoint for the job.
https://answers.splunk.com/answers/335081/is-it-possible-to-create-a-batch-data-input-via-th.html

How to troubleshoot why my universal forwarder is not phoning home?

w0lverineNOP was pinging a Splunk Enterprise server from a universal forwarder, but was not getting a response and needed to figure out how to successfully set up forwarding. aljohnson laid out a clear and concise process on how to make sure everything was set up on the forwarder and indexer side before defining inputs. After w0lverineNOP went down the sanity checklist, the issue ended up being resolved in the very last step which was actually defining inputs.conf on the forwarder.
https://answers.splunk.com/answers/351117/how-to-troubleshoot-why-my-universal-forwarder-is-1.html
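
If you’re working through a similar checklist, the forwarder-side sanity check boils down to something like this hypothetical example (hostname, port, and monitored path are placeholders, not details from the original answer):

# On the indexer: enable receiving (9997 is the conventional port)
bash-3.2$ $SPLUNK_HOME/bin/splunk enable listen 9997

# On the universal forwarder: point it at the indexer and confirm
bash-3.2$ $SPLUNK_HOME/bin/splunk add forward-server indexer.example.com:9997
bash-3.2$ $SPLUNK_HOME/bin/splunk list forward-server

# ...and the step that actually tripped things up: define an input, e.g. in
# $SPLUNK_HOME/etc/system/local/inputs.conf on the forwarder:
# [monitor:///var/log/messages]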

Thanks for reading!

Missed out on the first fifty-eight Smart AnSwerS blog posts? Check ‘em out here!
http://blogs.splunk.com/author/ppablo

HTTP Event Collector and sending from the browser


Recently we’ve been seeing a bunch of questions coming in related to errors when folks try to send events to HEC (HTTP Event Collector) from the browser and the requests are denied. One reason you might want to send from the browser is to capture errors or logs within your client-side applications. Another is to capture telemetry about how the application is being used. Both are a great match for HEC; however…

Making calls from a browser to Splunk gets you into the world of cross-domain requests and CORS. In this post I’ll quickly describe what CORS (Cross Origin Resource Sharing) is and how you can enable your browsers to take advantage of HEC.

Problem

Browser clients are trying to send events to HEC from Javascript and the requests are denied. The issue is related to CORS. By default, most browsers (Chrome, Safari) will not allow cross-domain requests, which includes requests to HEC, unless the target server authorizes them. A cross-domain call is when a page served from one domain (website) tries to make a request from a script to another domain. The browser will first make a pre-flight request asking the target server which origins are allowed to access it and which methods are supported. The server may respond with an Access-Control-Allow-Origin header that includes either a wildcard (*) or a list of acceptable domains. Assuming the browser gets a response indicating its origin is permitted, it will allow the request to go through.
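
To make that concrete, a simplified preflight exchange against HEC might look roughly like this (the host, origin, and header lists are illustrative, not an exact capture):

OPTIONS /services/collector/event HTTP/1.1
Host: splunk.example.com:8088
Origin: https://www.example.com
Access-Control-Request-Method: POST
Access-Control-Request-Headers: Authorization, Content-Type

HTTP/1.1 200 OK
Access-Control-Allow-Origin: https://www.example.com
Access-Control-Allow-Methods: GET, POST, OPTIONS
Access-Control-Allow-Headers: Authorization, Content-Type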

Note: For Splunk Cloud customers, you will need to work with support to get this enabled.

Solution

Splunk supports CORS, and it can be enabled in conf. Where you enable it depends on the version of Splunk. In Splunk 6.4, it is enabled in the [http] stanza of inputs.conf, which is specific to HEC; you’ll see the crossOriginSharingPolicy setting there.

If you are using Splunk 6.3, the setting lives in server.conf under [httpServer] and applies to the REST API in general as well. Once the policy is properly configured, browsers will be able to make cross-domain requests.
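
As a hedged example, the configuration might look something like this; the origin value is a placeholder, and the exact accepted syntax is documented in the corresponding .conf.spec files:

# Splunk 6.4 - inputs.conf, wherever your HEC [http] stanza lives
[http]
crossOriginSharingPolicy = https://www.example.com

# Splunk 6.3 - server.conf
[httpServer]
crossOriginSharingPolicy = https://www.example.com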

Note: If you are in Splunk Cloud trial or Single Instance this may be an issue as the cert is self-signed.

Caveats on SSL and CORS

There is one big caveat though: the SSL cert on the Splunk side MUST be a valid cert. This is not a Splunk constraint, it is a constraint imposed by browsers like Chrome, Firefox, etc. Without a valid SSL cert, the request will not complete and you will get an error. The only way to work around this is to not use SSL (which I am guessing you don’t want to do).

Again, which version of Splunk you are using determines where to configure the valid SSL cert. If you are on Splunk 6.4, this is also in inputs.conf. For Splunk 6.3 it is in server.conf under [sslConfig].

Enjoy having fun with HEC and the browser!

Downtime Got You Down? Webinar: Getting Started With Splunk for Application Management


Your applications are often the most important part of your business, and poor performing apps can be extremely detrimental to your bottom line as well as your company reputation. At Splunk, we help you provide the best end user experience to your customers. Whether it’s ensuring the availability of critical services, improving response time, or reducing MTTR, Splunk can help you monitor and measure the key inputs that affect customer experience (C/X).


For instance, think about response time. At 1/10th of a second, an application’s response feels nearly seamless to the end user. As response time creeps up to 1 second or longer, that’s enough of a mental break for an end user to notice the lag in your website or application. If response time stretches to 5 or 10 seconds, you’ve lost that user and potentially inspired them to write a negative review on social media. To that end, customers are starting to measure response time down to the millisecond to see how it affects conversion and customer satisfaction. Hence, improving uptime and performance is a critical piece of providing a stellar customer experience.

The tenets of ensuring good C/X are by no means new. So why don’t we live in a world free of outages and downtime? Well, today’s application environments are extremely complex. Applications and business services typically span multiple components, and with the advent of virtualization, containerization, and the cloud, traditional tools are not sufficient to manage such a distributed and constantly changing environment. Fortunately, Splunk software can help remove the blind spots and eliminate the silos.

Join the Getting Started with Splunk for Application Management Webinar at 9am PT on April 27th to learn how to:

  • Use Splunk software to monitor and measure the key areas that impact Customer Experience
  • Index and analyze data across your application stack for better insight on performance and availability
  • Improve application uptime and deliver a better customer experience.

We look forward to seeing you there!

When entropy meets Shannon


This is the third post on URL analysis; please have a look at the two other posts for more context about what can be done with Splunk to analyze URLs.

In this article you will find information on how to detect DNS tunnels. While there are lots of very useful apps on Splunkbase to help you analyze DNS data, it is always good for curious individuals to understand some of the techniques being used underneath.

A lot of captive portals are bypassed every day by anyone able to run a DNS request. If someone can run the following command on their machine without being authenticated on the captive portal:

$ host splunk.com
splunk.com has address 54.69.58.243
...

then they can use any service on the internet through a DNS tunnel. There are a lot of tools out there to create those tunnels, and for a great paper on the topic, I encourage you to read Detecting DNS Tunneling from the SANS Institute.

Claude Shannon to the rescue!


A long time ago, the venerable Claude E. Shannon wrote the paper “A Mathematical Theory of Communication”, which I strongly encourage you to read for its clarity and wealth of information.

He defined a great measure known as the Shannon entropy, which is useful for discovering the statistical structure of a word or message.

Consider a word as a discrete source that emits characters from a finite alphabet. Each character occurs with some probability, and the rarer a character is, the more information its appearance carries. The Shannon entropy of the word is the average information per character, weighted by each character’s probability of occurrence: H = -Σ p(c) × log2 p(c), summed over the distinct characters c in the word.
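
To make that concrete with a tiny worked example: the word “aaaa” contains a single character with probability 1, so its entropy is 0 bits, while in “abcd” each of the four characters has probability 1/4 and contributes -(1/4) × log2(1/4) = 0.5 bits, for a total of 2 bits. Long, random-looking strings, like encoded data stuffed into a subdomain, push that average up.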

The previous paragraph can easily be translated into the following Python code (taken from the excellent URL Toolbox app on Splunkbase):

import math

def shannon(word):
    entropy = 0.0
    length = len(word)
    occ = {}
    # count the occurrences of each character
    for c in word:
        if c not in occ:
            occ[c] = 0
        occ[c] += 1

    # average the information of each character, weighted by its probability
    for (k, v) in occ.items():
        p = float(v) / float(length)
        entropy -= p * math.log(p, 2)  # log base 2
    return entropy

This can be run directly on any word you have in Splunk:

[Screenshot: ut_shannon scores calculated on raw events]

As you can see, the score is pretty high, which makes sense since there is a high variety of frequency over those data. If we click on the ut_shannon field to sort in reverse order, this is what you could get:

[Screenshot: results sorted by ut_shannon in reverse order]

As you can see, words with a low variety of characters get a low score.

Catching DNS tunnels from subdomains in URLs

If we run the following query, interesting results are shown:
sourcetype="isc:bind:query" | eval list="mozilla" | `ut_parse(query, list)` | `ut_shannon(ut_subdomain)` | table ut_shannon, query | sort ut_shannon desc

[Screenshot: ut_shannon scores for DNS query subdomains]

As you can see in the results, the highest scores come from tunnels made to the domain ip-dns.info, as well as something unknown that could also be a tunnel: traffic towards greencompute.org.

I hope this post helps you see the tools and methodologies one can use to spot unusual activity based strictly on DNS traffic. More to come…

Smart AnSwerS #60


Hey there community and welcome to the 60th installment of Smart AnSwerS.

Hot off the press! The next SplunkTrust Virtual .conf Session has been scheduled for next Thursday, April 28th, 2016 @ 9:00AM PST. Duane Waddle and George Starcher will be giving their popular talk “Avoid the SSLippery Slope of Default SSL”, which has been used and referenced far and wide among the Splunk community in the past couple years. See what the hype is all about by visiting the Meetup page to RSVP and find the WebEx link to join us next week!

Check out this week’s featured Splunk Answers posts:

How to put an expiration date on a set of saved searches or alerts so after a specified date, they will no longer run?

daniel333 was looking for a way to set an expiration on a set of saved searches so they will no longer run after a certain date, and also provide a warning when a job’s expiration date is approaching. SplunkTrust member somesoni2 created the exact search with all requirements daniel333 was looking for. sk314 thought this was just plain sorcery, but somesoni2 left a follow up comment to break down exactly what was happening in each part of the search for an excellent lesson in SPL.
https://answers.splunk.com/answers/366096/how-to-put-an-expiration-date-on-a-set-of-saved-se.html

Where should I run my report that populates a summary index?

markwymer had a scheduled search populating a summary index, but wasn’t sure if it would be better to run it from two load balanced indexers, or if it had to be run and stored on a search head. Jeremiah gave a great explanation of what happens when scheduling and running a summary search and addressed the various scenarios brought up by markwymer. The best approach, however, was the best practice of running it from a search head configured to forward its data to the load balanced indexers. This way, the summary data is evenly distributed across the tier of indexers instead of being indexed locally on the search head, avoiding unnecessary storage and scaling issues.
https://answers.splunk.com/answers/363868/where-should-i-run-my-report-that-populates-a-summ.html

How to calculate the factorial of a number in a Splunk search?

shrirangphadke wanted to know if it was possible to calculate the factorial of a number in a Splunk search using eval. javiergn took a stab at this by constructing 2 possible searches based on the natural logarithm, using a combination of Splunk search commands to generate the same logic. This worked for shrirangphadke who also took the suggestion by stanwin to create a custom command with Python.
https://answers.splunk.com/answers/369054/how-to-calculate-the-factorial-of-a-number-in-a-sp.html
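
Their exact searches are in the linked answer, but the general log-sum trick looks something like this sketch (illustrative only, not javiergn’s verbatim SPL):

| makeresults
| eval n=5
| eval seq=mvrange(1, n+1)
| mvexpand seq
| eval l=ln(seq)
| stats sum(l) as logsum
| eval factorial=round(exp(logsum))

For n=5 this returns 120; for larger values, keep an eye on floating point precision before trusting the rounded result.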

Thanks for reading!

Missed out on the first fifty-nine Smart AnSwerS blog posts? Check ‘em out here!
http://blogs.splunk.com/author/ppablo
