Forums: SplunkAdministration: Splunk misreporting indexed events?


I run a fairly small network of about 70 users, and I have exceeded the 500MB limit 3 times: once by about a megabyte, once by about 100MB, and once by 500MB.

At first I figured it was due to my turning on auditing on our file server. It generates a lot of events, which is great. But then I added a couple of fire plotter logs and noticed the indexing go through the roof. Here are the stats:

File 1: 589KB - Splunk reports 1.4 million indexes, but the file is only 20,109 lines long.
File 2: 24,872KB - Splunk reports 225,000 indexes, and the file is 864,264 lines long.
File 3: 21,188KB - Splunk reports 192,402 indexes, and the file is 736,691 lines long.

See where I am going with this? These logs all have similar data in them, but the indexes are all over the place. I understand that the number of indexes is not necessarily proportional to file size. But I would like to know what is causing the bloat in my logging.
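The mismatch is easier to see as a ratio of reported count to line count. A minimal Python sketch using only the figures quoted above; it assumes "1.4 million" is a rounded number and that the File 3 count is 192,402:

```python
# (name, size in KB, count reported by Splunk, lines in the file)
# Figures are copied from the post; 1.4 million is assumed to be rounded.
files = [
    ("File 1", 589, 1_400_000, 20_109),
    ("File 2", 24_872, 225_000, 864_264),
    ("File 3", 21_188, 192_402, 736_691),
]


def events_per_line(reported, lines):
    """Return how many reported events correspond to one line of the file."""
    return reported / lines


ratios = {name: events_per_line(reported, lines)
          for name, _size_kb, reported, lines in files}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f} reported events per line")
```

Files 2 and 3 come out at about 0.26 reported events per line, so they agree with each other, while File 1 comes out at roughly 70 per line. That makes File 1 the outlier and probably the first place to look.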

I am up to 14 million indexes and have been running for barely a month on a small LAN. Splunk now seems to be struggling to return reports. It is a fairly beefy server, a 2.0GHz Xeon with 1.5GB of RAM, which I can upgrade if necessary, but my peak usage hasn't come close to the amount of RAM I have.

Can someone tell me what I am doing wrong?

I think by indexes you mean events? I think you need to check out the sections in the admin manual on line breaking, segmentation, and time stamp recognition. These basic tuning elements will help you get Splunk indexing your data properly.

Oh, I assumed that since the numbers were so high it was counting the indexes themselves; if it is counting events, the numbers really seem off. The events seemed to show up properly, e.g. not being cut off or anything in the results.
I will look into tuning. Thank you for that bit of info.
This still doesn't account for the massive 1000MB in a day of data that is being reported as indexed. I added 100MB and suddenly the server indexed a gig? I am trying to run a query to show where the spikes are, or anything else that could be causing this. Is there a place I can drill down and look at where all the size is coming from?

Oh and if it helps I am only monitoring 6 hosts.

To investigate what is causing your overages, the following searches may be useful for you:

Indexing volume by host:
index=_internal group=per_host_thruput | timechart sum(kb) by series

by source:
index=_internal group=per_source_thruput | timechart sum(kb) by series

by sourcetype:
index=_internal group=per_sourcetype_thruput | timechart sum(kb) by series

You can tailor the time constraint to meet your needs.
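For anyone curious what `sum(kb) by series` is adding up, here is a rough Python sketch of the same aggregation outside Splunk. It assumes the `_internal` events are comma-separated key=value lines of the kind Splunk writes to metrics.log; the sample lines and all values in them are invented for illustration, and `sum_kb_by_series` is a hypothetical helper, not part of Splunk. Unlike `timechart`, it does not bucket by time.

```python
import re
from collections import defaultdict

# Invented sample lines, shaped like metrics.log throughput entries.
# The hosts, sources, and numbers are illustrative only.
SAMPLE_LINES = [
    'INFO Metrics - group=per_host_thruput, series="localhost", kbps=1.5, eps=4.0, kb=45.0, ev=120',
    'INFO Metrics - group=per_host_thruput, series="fileserver", kbps=0.5, eps=1.0, kb=15.0, ev=30',
    'INFO Metrics - group=per_host_thruput, series="localhost", kbps=2.0, eps=5.0, kb=60.0, ev=150',
    'INFO Metrics - group=per_source_thruput, series="/var/log/messages", kbps=0.1, eps=0.2, kb=3.0, ev=6',
]

# Matches key=value pairs where the value is either quoted or runs to the next comma.
PAIR = re.compile(r'(\w+)=("[^"]*"|[^,\s]+)')


def sum_kb_by_series(lines, group):
    """Total the kb field per series for one metrics group."""
    totals = defaultdict(float)
    for line in lines:
        fields = {key: value.strip('"') for key, value in PAIR.findall(line)}
        if fields.get("group") == group and "kb" in fields:
            totals[fields["series"]] += float(fields["kb"])
    return dict(totals)


host_totals = sum_kb_by_series(SAMPLE_LINES, "per_host_thruput")
for series, kb in sorted(host_totals.items(), key=lambda item: -item[1]):
    print(f"{series}: {kb:.1f} KB ({kb / 1024:.2f} MB)")
```

Sorting by total puts the heaviest series first, which is the one to compare against the 500MB daily limit.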

ya it just says localhost.

Ahhh, now those were interesting. They took about 4 minutes apiece to run but gave me a neat graph ;) I need to fine-tune them a bit, but they should be just about what I need.
The localhost was being reported in the admin section under view usage report.
