
We find ourselves surrounded by software applications practically all the time. Their primary job is to make life easier and help us accomplish certain tasks. However, these applications depend on a lot of data, and developing them requires a systematic approach to managing that data and the activities around it.
But that’s not a simple process.
What happens if these applications stop running? How do you figure out what caused the problem? How do you resolve incidents in an effective way?
(Looking for solutions from Splunk? Explore our product portfolio, covering enterprise needs for security, monitoring and observability and all things data.)
The answers to all these questions are important, especially for the IT professionals responsible for keeping these applications running smoothly and resolving any errors or failures that occur. This is where logs come into the picture. They are vital to ensuring application security, and poorly managed logs can have many negative effects.
But what, exactly, is a log? How do you manage them? In this roundup, I’ll break down log management for you so you understand what it means and how you can make the most out of it.
What is a log?
Before we dive into log management, let’s first understand logs.
A log is a type of machine data that is particularly significant for developers and IT professionals. In some cases, a log could be in the form of a text file created by various software applications and operating systems. It contains specific information about the activities that happen during the execution of an application or operating system.
Realistically, you can use logs to:
- Record all the activities performed by the application.
- Automate the documentation of errors, messages, file transfers, etc.
Both logs and log management are essential to the overarching practice of observability.
Different types of logs
Let’s look at the criteria for categorizing different types of logs.
A log is classified according to the format or data types it handles, as well as how it is processed and which protocols it follows. Logs that follow the same protocols fall under one classification. Here are a few common types of logs:
- Event log: This log records activity occurring on the network, such as user logins, authentication attempts and other credential-related events.
- System log: A system log records all the operations and activities performed by the operating system.
- Server log: This is a type of text file that keeps a record of the activities performed by the server, along with when each activity occurred.
Now that we understand what a log can do, let's move on to log management.
What is log management?
Log management is a process that handles huge piles of logs. These logs are generated internally in a system or from software applications. Log management consists of four major phases:
- Collecting the logs from various sources.
- Storing the collected logs in a central location. The main motivation here is to make it easy for IT professionals to access, encrypt and process them, depending on the application.
- Indexing each log in the records, which enables professionals to find the logs they need quickly.
- Correlating related logs with one another. Establishing relationships between logs makes it easier to follow activity across applications. For example, when an application sends an email, logs are generated for the sender, the sending process and the receipt of the message; correlating those logs lets you trace the whole transaction. (A minimal sketch of these four phases follows this list.)
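To make these phases concrete, here’s a minimal sketch in Python. It’s illustrative only: the sample log lines, the in-memory “store” and the request_id field are assumptions made for this example, not part of any particular log management product.

import json
from collections import defaultdict

# 1. Collect: gather raw log lines from several sources.
#    (Hardcoded here; in practice these would come from files or agents.)
collected = [
    {"source": "web.log", "raw": "INFO request_id=42 user=alice action=login"},
    {"source": "worker.log", "raw": "ERROR request_id=42 task=send_email failed"},
    {"source": "web.log", "raw": "INFO request_id=43 user=bob action=logout"},
]

# 2. Store: keep everything in one central place (a single list here).
central_store = list(collected)

# 3. Index: build a simple keyword index so specific logs can be found quickly.
index = defaultdict(list)
for position, entry in enumerate(central_store):
    for token in entry["raw"].split():
        index[token.lower()].append(position)

# 4. Correlate: group entries that share a request ID so one transaction
#    can be traced end to end.
correlated = defaultdict(list)
for entry in central_store:
    for token in entry["raw"].split():
        if token.startswith("request_id="):
            correlated[token.split("=", 1)[1]].append(entry)

print(json.dumps(correlated, indent=2))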
(Explore cloud log management.)
The benefits of log management
The significance of logs is quite clear, but what happens when we want to analyze and inspect them for bugs or system failures? We could manually check multiple logs to identify errors during troubleshooting, but this approach is tedious, labor-intensive and definitely not scalable.
The best way to deal with this situation is to have a systematic way to manage these logs.
This is where log management provides real-time insights into various areas and operations, such as the health of your application. All the required logs are collected and stored in one place, an approach known as centralized log management. Centralizing logs makes analysis and error detection much easier for IT professionals.
Log management also strengthens security and makes troubleshooting more effective.
Structured logging
Sometimes, the text in a log file is unstructured, which makes it tedious to run queries against it to extract the required information.
The idea is to make this work easier and more flexible by converting unstructured data into structured data. This is called structured logging. It works by converting the unstructured text into a machine-readable format such as JSON or XML, which lets machines read and parse the log file quickly and easily.
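To see the difference, here’s a minimal sketch using Python’s standard logging module with a small JSON formatter. The field names (level, logger, message) are choices made for this example, not a required schema.

import json
import logging

# A minimal JSON formatter: every record becomes one structured line.
class JsonFormatter(logging.Formatter):
    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("flights")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.warning("Lufthansa 820 DEL->FRA delayed by 5 hours, 22 minutes")
# {"level": "WARNING", "logger": "flights", "message": "Lufthansa 820 DEL->FRA delayed by 5 hours, 22 minutes"}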
Best practices for log management
Now let’s improve the process of log management by implementing some essential best practices:
- Always implement structured logging before analysis. This saves time when dealing with large numbers of logs.
- When you log an event, attach extra context messages to it. These can act as alerts for the monitoring team.
- Determine what should be logged and what shouldn’t. Never log sensitive data, as that information could leak to the outside world (see the redaction sketch after this list).
- Collect logs from various sources. This helps in understanding the subject matter more deeply for error correction and management.
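As one way to apply the “never log sensitive data” practice, here’s a minimal sketch using Python’s standard logging module. The password field and the regular expression are assumptions made for this example; extend them to whatever secrets your own logs might contain.

import logging
import re

# A filter that masks anything that looks like "password=<value>"
# before the record is emitted.
class RedactSecrets(logging.Filter):
    def filter(self, record):
        record.msg = re.sub(r"password=\S+", "password=***", str(record.msg))
        return True

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("auth")
logger.addFilter(RedactSecrets())

logger.info("login attempt user=alice password=hunter2")
# Logged as: login attempt user=alice password=***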
Log management example
Now let’s see how log management works in practice.
Use case: key pairs
Let’s take an example of a log written in a string format. It contains some basic information about airline flights.
WARNING:__main__:Lufthansa airlines 820 from Indira Gandhi International airport, New Delhi(DEL), India to Frankfurt International Airport, Frankfurt(FRA), is delayed by approximately 5 hours, 22 minutes
INFO:__main__:Air India flight 120 from Indira Gandhi International airport, New Delhi(DEL), India to Frankfurt International Airport, Frankfurt(FRA), Germany has departed at 12:20:18
The content is understandable and readable enough, so it shouldn’t be a problem for someone to extract important information. But if this task is assigned to a machine, how will it understand and identify the appropriate information? What if we have a collection of similar log data?
This situation requires that the logs be structured for the machines. How do we do this? Let’s begin.
The logs must be written in a different format, not the string format above. Instead, the data will be stored in a dictionary (i.e., as key-value pairs) that can then be serialized.
Let’s do this task in Python. We’ll use a Python package called structlog for structured logging.
from structlog import get_logger

log = get_logger("Structured Logger")
# Example record built from the "delayed" Lufthansa entry above,
# with origin and destination simplified to plain strings
status = "delayed"
flight = {"airline": "Lufthansa airlines 820", "flight_id": 820, "stops": 1,
          "origin": "New Delhi (DEL)", "destination": "Frankfurt (FRA)",
          "delay_duration": "5 hours 22 mins"}
# Pick a log level that matches the flight's status
if status in ["departed", "landed"]:
    log.info(status, **flight)
elif status == "delayed":
    log.warning(status, **flight)
else:
    log.critical(status, **flight)
The result generated will be in the form of key-value pairs, which allows machines to understand and extract the information and helps manage the log file. With the full flight record, the output looks like this:
[warning ] delayed airline=Lufthansa airlines 820 delay_duration= 5 hour 22 mins destination={'airport': 'Frankfurt International Airport', 'iata': '', 'icao': '', 'city': 'Frankfurt', 'state': '', 'country': 'Germany'} flight_id=820 origin={'airport': 'Indira Gandhi International Airport', 'iata': '', 'icao': '', 'city': 'New Delhi', 'state': '', 'country': 'India'} stops=1
As you can see, key-value pairs have been created that make it possible to run queries and extract information. This is what structured logging looks like. As discussed, many formats can be used, such as XML, JSON, etc.
Logging vs. instrumentation
There are two terms that often create confusion when we talk about log management:
- Logging
- Instrumentation
Logging is the process of collecting various logs, and it’s the first step in implementing log management. But inspecting a huge volume of logs ourselves is a challenge that consumes a lot of time and effort. A smarter choice is to log from the important sources.
The solution to this problem is instrumentation. This is a technique that makes logging smarter by adding extra code to the application, which enriches the logs with useful information such as runtime context. This way, IT professionals gain a deeper, internal understanding of the application.
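To illustrate the idea, here’s a minimal instrumentation sketch in Python. The instrument decorator is a name and approach invented for this example; it simply wraps a function and enriches each log entry with runtime context such as arguments and execution time.

import logging
import time
from functools import wraps

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("instrumented")

# Hypothetical helper: wraps a function and logs runtime context
# (function name, arguments and duration) around each call.
def instrument(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            duration_ms = (time.perf_counter() - start) * 1000
            logger.info("call=%s args=%s kwargs=%s duration_ms=%.2f",
                        func.__name__, args, kwargs, duration_ms)
    return wrapper

@instrument
def send_email(recipient):
    time.sleep(0.1)  # stand-in for real work

send_email("ops@example.com")
# e.g. INFO:instrumented:call=send_email args=('ops@example.com',) kwargs={} duration_ms=100.35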
Now let’s turn to best practices for instrumentation.
Best practices for instrumentation
Since instrumentation goes one step beyond ordinary logging, let’s discuss some of its best practices:
- Collect as much information as you can from your resources and pinpoint which parts could create an issue. This is where deep instrumentation is useful.
- Inspect logs even if they aren’t considered problematic. Assuming something won’t be a problem doesn’t mean it can’t create a huge one.
- If you need to track the performance of a system or application, always log the relevant metrics so you can respond when they go out of range.
- Detailed logs with full context are always beneficial.
- OpenTelemetry tracing can also be used to instrument your applications (see the sketch after this list).
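As a rough illustration of that last point, here’s a minimal OpenTelemetry tracing sketch in Python. It assumes the opentelemetry-sdk package is installed, and the span and attribute names are made up for this example.

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

# Export spans to the console so the example is self-contained.
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("flight-service")

# Each span carries runtime context (attributes), much like an enriched log entry.
with tracer.start_as_current_span("send_status_update") as span:
    span.set_attribute("flight.id", 820)
    span.set_attribute("flight.status", "delayed")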
Log management is a crucial IT practice
Logs are an essential part of the job of any developer, IT professional or system administrator. They also help ensure that customers can enjoy software applications without disruption. Log management plays an important role in keeping these applications working efficiently and solves many issues that can occur at the execution stage.
This article was written by Siddhant Varma. Siddhant is a full stack JavaScript developer with expertise in frontend engineering. He’s worked with scaling multiple startups and has experience building products in the EdTech and healthcare industries. Siddhant has a passion for teaching and a knack for writing.
This posting does not necessarily represent Splunk's position, strategies or opinion.