The join command in Splunk combines the results of a subsearch with the results of the main search. To do this, the two result sets need one or more fields in common. To join a result set to itself, use the selfjoin command instead. The join command is a centralized streaming command when a defined set of fields to join on is available. Use join when the subsearch result set is small, roughly under 50,000 rows. You can specify which fields to join on; if you specify none, all fields common to both result sets are used.
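As a rough sketch of the general shape, the search below joins a main search to a subsearch on one explicitly named field. The index names, sourcetypes, and the user_id and account_tier fields are placeholders invented for illustration, not names from any real deployment.

```spl
index=example_web sourcetype=example_access
| join user_id
    [ search index=example_crm sourcetype=example_accounts
      | fields user_id, account_tier ]
| table _time, user_id, account_tier
```

Naming the join field keeps the match predictable, and trimming the subsearch with fields helps keep its result set small.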
To limit the impact of the join command on performance and resource consumption, Splunk imposes limits on the subsearch. These limits are documented in limits.conf.spec and set in limits.conf. They cover the maximum search time for the subsearch, the maximum number of subsearch rows to join against, and the maximum time to wait for the subsearch to finish.
Types of joins:
- inner join
- left or outer join
Inner Join:
An inner join keeps only the events that have a match in both the main search and the subsearch.
Let’s consider an example to understand the inner join concept better.
The following picture depicts a lookup of cities and their pin codes.
Let’s match it against the indexed data using an inner join.
In the picture below, only the matching rows appear in the result. Mumbai is excluded: it is present in the lookup but has no match in the indexed data.
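A search along these lines could produce that kind of inner-join result. It is a sketch, not the exact search behind the screenshots: the lookup file example_city_pincodes.csv, the index example_index, and the field names city, pincode, and event_count are all assumed names.

```spl
| inputlookup example_city_pincodes.csv
| join type=inner city
    [ search index=example_index
      | stats count AS event_count BY city ]
| table city, pincode, event_count
```

Since inner is the default join type, leaving out the type argument gives the same result.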
Left or Outer Join:
A left (or outer) join keeps every event from the main search and adds the subsearch fields wherever a match exists.
Let’s consider an example to understand the outer join concept better.
In the above picture, Mumbai is included in the result because it is present in the lookup, even though it has no match in the indexed data.
That is the only difference between the two: an inner join drops main search results that have no match in the subsearch, while a left or outer join keeps them.
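The left-join variant can be sketched by changing only the type argument. As before, example_city_pincodes.csv, example_index, and the field names are hypothetical stand-ins for your own lookup and index.

```spl
| inputlookup example_city_pincodes.csv
| join type=left city
    [ search index=example_index
      | stats count AS event_count BY city ]
| table city, pincode, event_count
```

With type=left (type=outer is a synonym), lookup rows that have no matching indexed events are kept, and event_count is simply empty for them.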