Set up Splunk for AWS monitoring and analysis with a lambda
Splunk is a tool for system monitoring and data analysis. You can send all of your logs to Splunk and then search and sort through them. My main experience with Splunk has been with dashboards and alerts. A dashboard can be set up as a one-stop overview of your application's health and other important information. Alerts can be set up to notify someone when something goes wrong, through Teams, PagerDuty, email, and a number of other programs. In this post I will discuss how to send logs from CloudWatch to Splunk using a Lambda function, how to create a dashboard, and how to create an alert.
The Lambda
A Lambda function is what actually sends the logs from CloudWatch to Splunk. The Lambda is very easy to set up because there is a blueprint for it, and no modification to the body of the function is necessary. However, you must configure two environment variables: SPLUNK_HEC_TOKEN and SPLUNK_HEC_URL. These are used to send logs to the Splunk HTTP Event Collector (HEC). The trigger for the Lambda will be CloudWatch log events from the log groups you subscribe in the next step.
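To make it clearer what the function does with each invocation, here is a minimal Python sketch of the same idea. It is not the blueprint's code, only an illustration. It relies on the fact that CloudWatch Logs delivers its payload base64-encoded and gzip-compressed under awslogs.data, and it assumes SPLUNK_HEC_URL holds the full collector endpoint URL. The helper names decode_cloudwatch_event and build_hec_body are made up for this example.

```python
import base64
import gzip
import json
import os
import urllib.request


def decode_cloudwatch_event(event):
    """Unpack the base64-encoded, gzip-compressed payload from CloudWatch Logs."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(compressed))


def build_hec_body(payload, function_name):
    """Build one HEC JSON object per log event, joined by newlines."""
    lines = []
    for log_event in payload.get("logEvents", []):
        lines.append(json.dumps({
            # CloudWatch timestamps are in milliseconds; HEC expects seconds.
            "time": log_event["timestamp"] / 1000.0,
            "source": "lambda:" + function_name,
            "event": log_event["message"],
        }))
    return "\n".join(lines)


def lambda_handler(event, context):
    payload = decode_cloudwatch_event(event)
    body = build_hec_body(payload, context.function_name)
    if not body:
        return {"sent": 0}  # nothing to forward
    request = urllib.request.Request(
        os.environ["SPLUNK_HEC_URL"],  # assumed to be the full HEC endpoint URL
        data=body.encode("utf-8"),
        headers={"Authorization": "Splunk " + os.environ["SPLUNK_HEC_TOKEN"]},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        response.read()
    return {"sent": len(payload["logEvents"])}
```

The sketch sends everything in a single request and does no retrying, so treat it as a way to understand the data flow rather than a replacement for the blueprint.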
Cloudwatch
Here you must decide which logs you would like to send to Splunk. On each log group that you want to send, set up a subscription filter. When you set up the subscription filter, simply select your new Lambda function and your log format. Be sure to look up the costs before you start sending logs out: there is a cost for data transfer out as well as a cost for storage in CloudWatch. Be selective about what you send. You can also save money by setting a retention (expiration) period on your log groups so that logs are deleted after a certain age and no longer cost anything to store.
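If you would rather script these two settings than click through the console, something like the following boto3 sketch should work. The filter name send-to-splunk, the 30-day retention, and the helper names are placeholders I picked for illustration. An empty filter pattern forwards every event, so pass a pattern if you want to be selective.

```python
def subscription_filter_args(log_group, lambda_arn, pattern=""):
    """Arguments for put_subscription_filter; an empty pattern matches all events."""
    return {
        "logGroupName": log_group,
        "filterName": "send-to-splunk",  # placeholder name
        "filterPattern": pattern,
        "destinationArn": lambda_arn,
    }


def retention_args(log_group, days=30):
    """Arguments for put_retention_policy; events older than `days` are deleted."""
    return {"logGroupName": log_group, "retentionInDays": days}


def apply_settings(log_group, lambda_arn, pattern="", days=30):
    # Imported here so the helpers above can be used without the AWS SDK installed.
    import boto3

    logs = boto3.client("logs")
    logs.put_retention_policy(**retention_args(log_group, days))
    logs.put_subscription_filter(
        **subscription_filter_args(log_group, lambda_arn, pattern)
    )
```

Note that when you script this yourself, CloudWatch Logs also needs permission to invoke the Lambda, so you may have to grant that separately before the subscription filter can be created.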
Search
Now that your logs are being sent to Splunk, it is time to do something with them. A convenient tip for searching in Splunk is to specify the name of your Lambda as the source. This will cut down on your search time if you have multiple Lambdas sending data to Splunk, for example one each for dev, QA/test, and production. Ex. source="lambda:my-splunk-lambda-dev"
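If you find yourself typing the same source filter for each environment, a tiny helper can build the search string for you. This is only a convenience sketch: build_search is a made-up function, and the "connection error" term and 24-hour window are example values.

```python
def build_search(function_name, terms="", earliest="-24h"):
    """Build a Splunk search string scoped to one Lambda's source."""
    parts = ['source="lambda:{}"'.format(function_name)]
    if terms:
        parts.append(terms)
    parts.append("earliest=" + earliest)  # limit the time range searched
    return " ".join(parts)


print(build_search("my-splunk-lambda-dev", '"connection error"'))
# prints: source="lambda:my-splunk-lambda-dev" "connection error" earliest=-24h
```

Swapping the function name for the QA or production Lambda gives the same search against that environment.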
Dashboard
Each search can be turned into a dashboard panel. Panels can display data in different formats, such as a table, a graph, or a single number. The dashboard is a great place to put simple visual representations of your application. One panel that I have created is a health check panel that plots incoming health check logs on a line graph. This is useful because it shows the current health as well as any recent downtime. Another panel displays the last time a particular process executed successfully. Yet another displays all of the database connection errors from the last 24 hours.
Alerts
To set up an alert, create a search for whatever you want your alert condition to be and then click Save As. One of the options will be to save it as an alert; click this. From there you will be prompted for a title, a description, a schedule, and a few other details. This is how you configure your alert. The last thing to configure is the trigger actions. Using a variety of webhooks and extensions, you can send an alert just about anywhere, such as Teams, PagerDuty, or email.
I hope that this has been helpful. Thanks for reading!