Components and Buckets in Splunk

Components:

The primary components in the Splunk architecture are the Forwarders, the Indexers and the Search Head.

Forwarders:
The forwarder is an agent you deploy on IT systems; it collects logs and sends them to the indexers. Splunk has two types of forwarders:
* Universal Forwarders: forward the data without any prior treatment. This is faster and requires fewer resources on the host, but results in a larger quantity of data being sent to the indexer.
* Heavy Forwarders: parse the data (and can optionally index it) at the source on the host machine, then send the parsed events to the indexer.
Indexers:
The indexer transforms the data into events, stores them on disk and adds them to an index, which makes the data searchable. It performs generic event processing on log data, such as applying a timestamp and source, and can also execute user-defined transformations to extract specific information.
Search Head:
The search head provides the UI through which users interact with Splunk. It lets users search and query data in Splunk, and it interacts with the indexers to retrieve the specific data they request.
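
To make the division of labour concrete, here is a minimal toy model of the three components in Python. It is only a sketch and has nothing to do with Splunk's real internals: the class names, the in-memory "index" and the assumed log format (an ISO-8601 timestamp followed by a message) are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Event:
    timestamp: datetime
    source: str
    raw: str


@dataclass
class Indexer:
    """Toy indexer: turns raw lines into events and stores them per index."""
    indexes: dict = field(default_factory=dict)

    def ingest(self, index: str, source: str, raw_line: str) -> Event:
        # Assumed log format: "<ISO-8601 timestamp> <message>"
        stamp, _, _message = raw_line.partition(" ")
        event = Event(datetime.fromisoformat(stamp), source, raw_line)
        self.indexes.setdefault(index, []).append(event)
        return event


@dataclass
class UniversalForwarder:
    """Toy forwarder: ships raw lines to the indexer without treating them."""
    indexer: Indexer
    index: str

    def forward(self, source: str, lines: list) -> None:
        for line in lines:
            self.indexer.ingest(self.index, source, line)


@dataclass
class SearchHead:
    """Toy search head: asks every indexer for the events matching a term."""
    indexers: list

    def search(self, index: str, term: str) -> list:
        hits = []
        for indexer in self.indexers:
            hits.extend(e for e in indexer.indexes.get(index, []) if term in e.raw)
        # Newest events first
        return sorted(hits, key=lambda e: e.timestamp, reverse=True)
```

In this sketch the forwarder only moves data, the indexer is the only place where timestamps and sources get attached, and the search head never stores anything itself.
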

Buckets:

A bucket is a directory on the file system that Splunk itself creates at indexing time. When new data comes into Splunk, it is stored in buckets. There are 4 different types of buckets:

HOT: 
While data is being indexed, it is stored in a Hot bucket, which is writable and readable at the same time. The indexer writes the data it is currently indexing into Hot buckets, and users can already fetch that data through the search head.

Criteria for Hot Buckets
-------------------------
A Hot bucket rolls to Warm when any of the following happens:
* The Hot bucket is full (maxDataSize: 750 MB by default, or 10 GB on 64-bit systems with the high-volume setting).
* The maximum time span of the bucket is reached (90 days by default).
* The maximum Hot bucket count is reached (3 Hot buckets per index by default).
* The Hot bucket has not received data for a long time.
* Splunk gets restarted.
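
The criteria above can be sketched as a small Python check. This is an illustration, not Splunk code: `HotBucket`, `HotLimits` and `roll_reasons` are hypothetical names, the default numbers are simply the 10 GB, 90 day and 3 bucket figures from the list, and the idle check is assumed to be switched off unless a limit is given.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HotBucket:
    size_mb: float
    span_secs: int   # newest event time minus oldest event time
    idle_secs: int   # seconds since the bucket last received data


@dataclass(frozen=True)
class HotLimits:
    max_size_mb: float = 10 * 1024           # 10 GB, the larger size limit above
    max_span_secs: int = 90 * 24 * 3600      # 90 days
    max_hot_buckets: int = 3
    max_idle_secs: Optional[int] = None      # None = idle check off (assumption)


def roll_reasons(bucket, hot_count, limits=HotLimits(), restarting=False):
    """Return the reasons why this hot bucket should roll to warm, if any.

    hot_count is the number of hot buckets open in the index, including
    this one (a simplification of how the count limit is enforced).
    """
    reasons = []
    if bucket.size_mb >= limits.max_size_mb:
        reasons.append("size")
    if bucket.span_secs >= limits.max_span_secs:
        reasons.append("span")
    if hot_count > limits.max_hot_buckets:
        reasons.append("count")
    if limits.max_idle_secs is not None and bucket.idle_secs >= limits.max_idle_secs:
        reasons.append("idle")
    if restarting:
        reasons.append("restart")
    return reasons
```

An empty list means the bucket stays Hot; any single reason is enough to roll it.
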
WARM: 
Once data rolls to a Warm bucket, it becomes read-only. Incoming data is never written to Warm buckets.

Criteria for Warm Buckets
--------------------------
* A new index starts with 0 Warm buckets.
* A Warm bucket is created each time a Hot bucket rolls to Warm.
* An index can hold a maximum of 300 Warm buckets by default; beyond that, the oldest Warm bucket rolls to Cold.
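
One way to picture the 300-bucket limit is a queue: every bucket that rolls from Hot joins the end, and once the queue is over the limit the oldest bucket at the front moves on to Cold. The sketch below assumes exactly that behaviour; `roll_to_warm` and the bucket ids are hypothetical.

```python
from collections import deque


def roll_to_warm(warm, cold, bucket_id, max_warm=300):
    """Add a bucket rolled from hot to the warm queue.

    warm is a deque ordered oldest first, cold is a plain list.
    Returns the ids that were pushed on to cold, if any.
    """
    warm.append(bucket_id)
    moved = []
    # Over the limit: the oldest warm buckets roll to cold
    while len(warm) > max_warm:
        oldest = warm.popleft()
        cold.append(oldest)
        moved.append(oldest)
    return moved
```

For example, with `max_warm=2` and four buckets rolled in order, the two oldest end up in `cold` and the two newest stay in `warm`.
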
COLD: 
Once data reaches a Cold bucket, it stays read-only, just like in Warm buckets. Since this data is accessed less often, it can be kept on low-cost disk storage.

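Criteria for Cold Buckets
--------------------------
Cold buckets roll to Frozen when either of the following happens:
* The maximum total size of the index is reached (maxTotalDataSizeMB, 500000 MB by default); the oldest buckets are frozen first.
* The maximum age of the data is reached (about 6 years by default).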
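
The two criteria above can be sketched as follows. This is a simplified illustration with hypothetical names (`ColdBucket`, `buckets_to_freeze`): it applies the size limit only to the buckets passed in, takes that limit as an explicit argument, and approximates the age limit as 6 years of 365 days.

```python
from typing import List, NamedTuple

SIX_YEARS_SECS = 6 * 365 * 24 * 3600  # approximation of the 6-year limit


class ColdBucket(NamedTuple):
    bucket_id: str
    newest_event_epoch: int  # timestamp of the newest event in the bucket
    size_mb: float


def buckets_to_freeze(cold: List[ColdBucket], now: int, max_size_mb: float,
                      max_age_secs: int = SIX_YEARS_SECS) -> List[str]:
    """Return the ids of the buckets that should roll to frozen.

    cold must be ordered oldest first.
    """
    frozen = []
    remaining = []
    # 1. Age: a bucket is too old once its newest event is past the limit
    for bucket in cold:
        if now - bucket.newest_event_epoch > max_age_secs:
            frozen.append(bucket.bucket_id)
        else:
            remaining.append(bucket)
    # 2. Size: freeze the oldest buckets until the rest fits under the limit
    total = sum(b.size_mb for b in remaining)
    while remaining and total > max_size_mb:
        oldest = remaining.pop(0)
        frozen.append(oldest.bucket_id)
        total -= oldest.size_mb
    return frozen
```

Note that the age test looks at the newest event in a bucket, so a bucket is only frozen once everything in it is older than the limit.
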
FROZEN: 
After rolling from Cold, data reaches the Frozen state, where the indexer deletes it by default. If we want, we can choose to archive the data instead; archived data can later be brought back into a THAWED bucket and rebuilt so that it is searchable again, which doesn't affect the license.
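
The delete-or-archive choice and the way back through THAWED can be pictured with plain directory operations. The sketch below is not how Splunk implements freezing: `freeze_bucket` and `thaw_bucket` are hypothetical helpers, the directories are whatever paths you pass in, and the rebuild step that makes a thawed bucket searchable again is deliberately left out.

```python
import shutil
from pathlib import Path
from typing import Optional


def freeze_bucket(bucket_dir: Path, archive_dir: Optional[Path] = None) -> Optional[Path]:
    """Freeze a cold bucket: delete it (default) or move it to an archive.

    Returns the archived path, or None if the bucket was deleted.
    """
    if archive_dir is None:
        shutil.rmtree(bucket_dir)  # default behaviour: the data is gone
        return None
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / bucket_dir.name
    shutil.move(str(bucket_dir), str(target))
    return target


def thaw_bucket(archived_bucket: Path, thawed_dir: Path) -> Path:
    """Copy an archived bucket into a thawed directory.

    The archive copy is kept; rebuilding the bucket is a separate step
    that this sketch does not perform.
    """
    thawed_dir.mkdir(parents=True, exist_ok=True)
    target = thawed_dir / archived_bucket.name
    shutil.copytree(archived_bucket, target)
    return target
```

Copying rather than moving on thaw keeps the archive intact, so the thawed copy can be removed later without losing the data.
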