Splunk and Kafka: Forwarding Kafka Topics to Splunk Indexes

Vaibhav Mishra
2 min read · Aug 26, 2024


Splunk

Splunk is a tool that analyzes machine data and builds graphs and dashboards from it.

Splunk is commonly used for:

  • System performance monitoring
  • Failure detection and troubleshooting
  • Visualization
  • Search and investigation
  • Data storage

Components of Splunk:

Forwarders, Indexers, and Search Heads

  • Forwarders: Collect data and forward it to other Splunk instances.
  • Indexers: Store and index the incoming data.
  • Search Heads: Search, analyze, visualize, and report on the data.

Kafka to Splunk communication

Data Flow:

  • Step 1: Data is produced to Kafka topics by various producers (applications, services); see the example after this list.
  • Step 2: Kafka Connect consumes data from the configured Kafka topics.
  • Step 3: The Splunk Sink Connector running in Kafka Connect pushes the data to Splunk over the HTTP Event Collector (HEC).
  • Step 4: Splunk indexes the incoming data, making it available for search, analysis, and visualization in dashboards and reports.
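
For example, once the stack below is running, a test event can be published with the console producer that ships in the Kafka image. This is a sketch: my-topic is a placeholder for whichever topic the connector will consume, and kafka is the Compose service name from the file below.

# Publish a test event to a Kafka topic (my-topic is a placeholder)
docker compose exec kafka kafka-console-producer \
  --bootstrap-server kafka:9092 \
  --topic my-topic
# Type one or more lines, then press Ctrl+C to exit.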

Installation: all services in one docker-compose file

version: '3'

services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    ports:
      - "2181:2181"

  kafka:
    image: confluentinc/cp-kafka:latest
    depends_on:
      - zookeeper
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
    ports:
      - "9092:9092"

  splunk:
    image: splunk/splunk:latest
    environment:
      SPLUNK_START_ARGS: "--accept-license"
      SPLUNK_PASSWORD: "admin@123"
    ports:
      - "8000:8000" # Web UI
      - "8088:8088" # HEC port
      - "8089:8089" # Management port

  kafka-connect:
    image: custom-kafka-connect-splunk:latest # Connect image with the Splunk Sink Connector installed (see below)
    depends_on:
      - kafka
      - splunk
    environment:
      CONNECT_BOOTSTRAP_SERVERS: kafka:9092
      CONNECT_REST_PORT: 8083
      CONNECT_GROUP_ID: "connect-cluster"
      CONNECT_CONFIG_STORAGE_TOPIC: "connect-configs"
      CONNECT_OFFSET_STORAGE_TOPIC: "connect-offsets"
      CONNECT_STATUS_STORAGE_TOPIC: "connect-statuses"
      CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_KEY_CONVERTER: "org.apache.kafka.connect.storage.StringConverter"
      CONNECT_VALUE_CONVERTER: "org.apache.kafka.connect.storage.StringConverter"
      CONNECT_REST_ADVERTISED_HOST_NAME: kafka-connect
      CONNECT_PLUGIN_PATH: "/usr/share/java,/usr/share/confluent-hub-components"
      CONNECT_HEAP_OPTS: "-Xms512M -Xmx1G"
    ports:
      - "8083:8083"

volumes:
  splunk-data: # declared but not mounted; attach it to the splunk service to persist data across restarts
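
Note that custom-kafka-connect-splunk is not an official image; it stands for any Kafka Connect image with the Splunk Sink Connector plugin installed. A minimal sketch of such a Dockerfile, assuming the connector is pulled from Confluent Hub:

# Base Connect image plus the Splunk Sink Connector from Confluent Hub
FROM confluentinc/cp-kafka-connect:latest
RUN confluent-hub install --no-prompt splunk/kafka-connect-splunk:latest

Build it with docker build -t custom-kafka-connect-splunk:latest . and then start everything with docker compose up -d.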

Step-by-Step Configuration

  • Create a Splunk index.
  • Create an HEC (HTTP Event Collector) token under Settings → Data Inputs and copy it for the connector configuration.
  • Register the Splunk Sink Connector by calling the POST API below.

curl --location --request POST 'http://localhost:8083/connectors' \
--header 'Content-Type: application/json' \
--data-raw '{
  "name": "splunk-sink-connector-splunk-name",
  "config": {
    "connector.class": "com.splunk.kafka.connect.SplunkSinkConnector",
    "tasks.max": "1",
    "topics": "<kafka-topic>",
    "splunk.hec.uri": "http://splunk:8088",
    "splunk.hec.token": "<hec-token>",
    "splunk.indexes": "<index>",
    "splunk.hec.ack.enabled": "true",
    "splunk.hec.raw": "false",
    "splunk.hec.event.timeout": "60000",
    "splunk.hec.retry.interval.ms": "10000",
    "splunk.hec.total.channels": "2",
    "splunk.hec.pool.size": "10"
  }
}'
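
Before registering the connector, the index can also be created from the command line, and the HEC token smoke-tested directly against Splunk. This is a sketch: admin@123 is the password from the compose file, and <index> and <hec-token> are the placeholders used above.

# Create an index via the Splunk management API (port 8089, self-signed cert)
curl -k -u admin:admin@123 https://localhost:8089/services/data/indexes -d name=<index>

# Send one test event straight to HEC; a healthy endpoint answers {"text":"Success","code":0}
curl http://localhost:8088/services/collector/event \
  -H "Authorization: Splunk <hec-token>" \
  -d '{"event": "hec smoke test", "index": "<index>"}'

If HEC has SSL enabled (the default in many Splunk deployments), use https and curl -k instead.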

At this point, Kafka is synced with Splunk: events published to the configured Kafka topic are forwarded to the Splunk index. Use the Search & Reporting app to see the data in Splunk, as shown below.
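
To confirm the pipeline end to end, the Kafka Connect REST API can report the connector state; the name matches the one registered above:

# The connector and its task should both report state RUNNING
curl http://localhost:8083/connectors/splunk-sink-connector-splunk-name/status

In Splunk's Search & Reporting app, a simple search such as index=<index> should then show the forwarded events.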

Note: As a prerequisite for this page, the reader should know some basics of Splunk and Kafka.
