Restore Archived Data in Splunk Enterprise

In our previous blog, How data ages in Splunk, we discussed the stages that data goes through: hot, warm, cold, frozen, and thawed. Data in the frozen stage can either be archived or deleted permanently. We covered archiving in How to archive indexed data in Splunk Enterprise. Data that has been archived can be returned to the index by thawing it. This blog covers how to restore archived data in Splunk Enterprise.

To restore archived data, you move it to the thawed directory. The default location for thawed data is:

$SPLUNK_HOME/var/lib/splunk/defaultdb/thaweddb/*

Thawed data doesn’t go through the aging process (hot, warm, cold, frozen) and is never deleted by default. If you no longer need the thawed data, you can delete it manually or move it out of the thawed directory.

It’s important to remember that the process for restoring archived data differs depending on the version of Splunk Enterprise in which the data was originally indexed. We discussed this in our blog titled How to archive indexed data in Splunk Enterprise.

Broadly speaking, you can restore an archive to any instance of the indexer. However, there are some restrictions and factors that govern this.

Splunk Enterprise version restrictions:

· Because the bucket data format changed between 4.1 and 4.2, you cannot restore a bucket indexed in Splunk Enterprise 4.2 or later to a pre-4.2 version.

· You can restore 4.2+ buckets to any 4.2+ instance.

· Aside from a few OS-related issues described next, you can restore a pre-4.2 bucket to indexers running either pre-4.2 or 4.2+ versions.

Operating system restrictions:

· You can restore a 4.2+ bucket to an indexer running any operating system.

· You can usually restore a pre-4.2 bucket across operating systems too, but there are restrictions. For instance, data generated on 64-bit systems is not likely to work well on 32-bit systems, and data cannot be moved from SPARC or PowerPC systems to x86 or x86-64 systems, or vice versa.
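
As a rough illustration, the version rules above can be written as a small shell check. This is only a sketch: the function name can_restore_version and its arguments are hypothetical (they are not part of Splunk), it expects the target version in major.minor form, and it deliberately ignores the operating system caveats.

```shell
# Hypothetical helper: models only the version rule described above.
# Usage: can_restore_version <bucket_format> <target_version>
#   bucket_format:  "pre-4.2" or "4.2+"
#   target_version: version of the target indexer, e.g. "4.1.8" or "6.5.0"
can_restore_version() {
    bucket_format=$1
    target_version=$2

    # Split the target version into its major and minor components.
    major=${target_version%%.*}
    rest=${target_version#*.}
    minor=${rest%%.*}

    # Is the target indexer running 4.2 or later?
    if [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 2 ]; }; then
        target_is_42_plus=yes
    else
        target_is_42_plus=no
    fi

    case $bucket_format in
        "4.2+")
            # 4.2+ buckets can only go to 4.2+ indexers.
            [ "$target_is_42_plus" = yes ]
            ;;
        "pre-4.2")
            # Pre-4.2 buckets can go to either (OS caveats not modeled here).
            return 0
            ;;
        *)
            echo "unknown bucket format: $bucket_format" >&2
            return 2
            ;;
    esac
}
```

For example, can_restore_version "4.2+" "4.1.8" fails, while can_restore_version "pre-4.2" "6.5.0" succeeds.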

So how do you identify whether an archived bucket contains pre-4.2 or 4.2+ data?

Before you thaw data, you need to identify the version under which it was archived. If you archived the data with your own custom script, the resulting bucket could contain just about anything.
Assuming you archived the buckets using coldToFrozenDir or the example script provided, a 4.2+ bucket directory contains only a rawdata directory, which holds journal.gz. A pre-4.2 bucket directory, on the other hand, contains gzipped versions of its .tsidx and .data files, plus a rawdata directory containing files named <int>.gz. Depending on which format you have, use one of the following procedures to thaw the data.
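
The check described above can be sketched as a shell function for *nix users. The function name classify_bucket is hypothetical, and the sketch assumes the bucket was archived with coldToFrozenDir or the example script; a bucket produced by a custom script may not match either pattern.

```shell
# Hypothetical helper: guess the archive format of a bucket directory
# from the files it contains.
# Usage: classify_bucket <bucket_dir>
classify_bucket() {
    bucket_dir=$1

    # A 4.2+ archive contains rawdata/journal.gz.
    if [ -f "$bucket_dir/rawdata/journal.gz" ]; then
        echo "4.2+"
        return 0
    fi

    # A pre-4.2 archive contains gzipped .tsidx and .data files,
    # and rawdata files named <int>.gz.
    for f in "$bucket_dir"/*.tsidx.gz "$bucket_dir"/*.data.gz \
             "$bucket_dir"/rawdata/[0-9]*.gz; do
        if [ -f "$f" ]; then
            echo "pre-4.2"
            return 0
        fi
    done

    # Neither pattern matched, for example with a custom archiving script.
    echo "unknown"
    return 1
}
```

Running classify_bucket against the archived copy of a bucket (for example, classify_bucket db_1181756456_1162600547_1005) prints 4.2+, pre-4.2, or unknown. An unknown result means the contents should be inspected by hand before choosing a procedure.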

THAWING DATA ARCHIVED IN 4.2+

Example for *nix users

First, copy your archived bucket into the thawed directory.
Example:  
cp -r db_1181756456_1162600547_1005 $SPLUNK_HOME/var/lib/splunk/defaultdb/thaweddb
Note: the bucket ID (‘1005’ in this example) must be unique within the index. If it conflicts with an existing bucket, choose a different ID.

Next, rebuild the indexes and associated files by executing the splunk rebuild command on the archive bucket.
Example:  
splunk rebuild  $SPLUNK_HOME/var/lib/splunk/defaultdb/thaweddb/db_1181756456_1162600547_1005

Finally, restart the indexer.
Example:  splunk restart
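
Before copying a bucket into the thawed directory, it can help to confirm that its ID is not already in use. The following is a minimal sketch, not a Splunk command: the function names bucket_id and bucket_id_is_free are hypothetical, and the check only looks at directories named db_*, so hot buckets and any other naming schemes are not covered.

```shell
# Hypothetical helper: print the trailing ID of a bucket directory name
# such as db_1181756456_1162600547_1005.
bucket_id() {
    name=$(basename "$1")
    echo "${name##*_}"
}

# Hypothetical helper: succeed only if no db_* bucket in the given
# directories already uses the ID.
# Usage: bucket_id_is_free <id> <dir> [<dir> ...]
bucket_id_is_free() {
    wanted=$1
    shift
    for dir in "$@"; do
        for existing in "$dir"/db_*; do
            # Skip patterns that matched nothing.
            [ -d "$existing" ] || continue
            if [ "$(bucket_id "$existing")" = "$wanted" ]; then
                echo "ID $wanted is already used by $existing" >&2
                return 1
            fi
        done
    done
    return 0
}
```

For the default index, the check could be run against the db, colddb, and thaweddb directories under $SPLUNK_HOME/var/lib/splunk/defaultdb, for example: bucket_id_is_free 1005 "$SPLUNK_HOME"/var/lib/splunk/defaultdb/db "$SPLUNK_HOME"/var/lib/splunk/defaultdb/colddb "$SPLUNK_HOME"/var/lib/splunk/defaultdb/thaweddb. If it fails, rename the bucket with a different ID before copying it.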

Example for Windows users

First, copy your archived bucket into the thawed directory.
Example:  
xcopy D:\MyArchive\db_1181756456_1162600547_1005 %SPLUNK_HOME%\var\lib\splunk\defaultdb\thaweddb\db_1181756456_1162600547_1005 /s /e /v
Note: the bucket ID (‘1005’ in this example) must be unique within the index. If it conflicts with an existing bucket, choose a different ID.

Next, rebuild the indexes and associated files by executing the splunk rebuild command on the archive bucket.
Example:
splunk rebuild  %SPLUNK_HOME%\var\lib\splunk\defaultdb\thaweddb\db_1181756456_1162600547_1005

Finally, restart the indexer.
Example:  splunk restart

THAWING DATA ARCHIVED IN PRE-4.2

Example for *nix users

First, copy your archived bucket into a temporary location in the thawed directory.
Example:  
# cp -r db_1181756456_1162600547_0  $SPLUNK_HOME/var/lib/splunk/defaultdb/thaweddb/temp_db_1181756456_1162600547_0

Next, if the archived bucket was originally compressed, uncompress its contents in the thawed directory.

Then rename the temporary bucket to a name that the indexer recognizes.
Example:
# cd $SPLUNK_HOME/var/lib/splunk/defaultdb/thaweddb/
# mv temp_db_1181756456_1162600547_0 db_1181756456_1162600547_1005
Note: the bucket ID (‘1005’ in this example) must be unique within the index. If it conflicts with an existing bucket, choose a different ID.

Finally, refresh the manifests.
Example:  
# cd $SPLUNK_HOME/bin
# ./splunk login
# ./splunk _internal call /data/indexes/main/rebuild-metadata-and-manifests

Your thawed bucket should be searchable after a few moments.
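
The uncompress step in the procedure above is not shown as a command, so here is one possible sketch for *nix users. It assumes the archive looks the way this blog describes a pre-4.2 bucket: gzipped .tsidx and .data files at the top level of the bucket. The function name uncompress_bucket_index_files is hypothetical, and the choice to leave the rawdata/<int>.gz files untouched is an assumption. If your archiving script compressed the bucket differently, adjust accordingly.

```shell
# Hypothetical helper: gunzip the top-level .tsidx.gz and .data.gz files
# of a pre-4.2 bucket that was copied into the thawed directory.
# Files under rawdata/ are intentionally left as they are (assumption).
# Usage: uncompress_bucket_index_files <bucket_dir>
uncompress_bucket_index_files() {
    bucket_dir=$1
    for f in "$bucket_dir"/*.tsidx.gz "$bucket_dir"/*.data.gz; do
        # Skip patterns that matched nothing.
        [ -f "$f" ] || continue
        # gunzip replaces file.gz with file; stop on the first failure.
        gunzip "$f" || return 1
    done
    return 0
}
```

In the example above, this would be run on the temporary copy before renaming it: uncompress_bucket_index_files "$SPLUNK_HOME/var/lib/splunk/defaultdb/thaweddb/temp_db_1181756456_1162600547_0".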

Example for Windows users

First, copy your archived bucket into a temporary location in the thawed directory.
Example:
> xcopy D:\MyArchive\db_1181756456_1162600547_0 %SPLUNK_HOME%\var\lib\splunk\defaultdb\thaweddb\temp_db_1181756456_1162600547_0 /s /e /v

Next, if the archived bucket was originally compressed, uncompress its contents in the thawed directory.

Then rename the temporary bucket to a name that the indexer recognizes.
Example:
> cd %SPLUNK_HOME%\var\lib\splunk\defaultdb\thaweddb
> move temp_db_1181756456_1162600547_0 db_1181756456_1162600547_1005
Note: the bucket ID (‘1005’ in this example) must be unique within the index. If it conflicts with an existing bucket, choose a different ID.

Finally, refresh the manifests.
Example:  
> cd %SPLUNK_HOME%\bin
> splunk login
> splunk _internal call /data/indexes/main/rebuild-metadata-and-manifests

Your thawed bucket should be searchable after a few moments.

Thawing clustered data

This was a simple rundown of how to restore archived data in Splunk Enterprise. A final word of caution about thawing clustered data: as with archiving clustered data, thawing it results in multiple copies of the thawed data on your cluster. Archiving just a single copy of clustered data is both difficult and time-consuming. Partnering with a professional services firm is highly recommended, and a Splunk professional can guide you through these complexities. Cyber Chasse can provide personalized solutions to suit your requirements and make the process seamless. Call us today.