Playing with Python Structured Logs
Recently I’ve been experimenting with the Python project structlog to add structured logging to our in-house applications. The eventual plan is to emit the logs to an ELK stack with JSON parsing, instead of the much more complicated set of custom parsing rules we have to define for each type of log file ingested today.
This post is my current understanding of structlog, reordered from how the docs are arranged to make a bit more sense to me. If you’re familiar with the library and have something to add, let me know; I’d love to hear from you.
Basic Usage
Structlog is installed with pip install structlog, optionally adding colorama for color output in a terminal.
# Old code
import logging
log = logging.getLogger()
# New code
import structlog
log = structlog.getLogger() # or structlog.get_logger()
At this point the logging usage is identical, but you can add key-value pairs to the log calls.
log.info("Hello World", a=1)
# 2018-08-23 17:59.51 Hello World a=1
Already this is an improvement for tools like awk or sed.
Powerful Structured Logging
Configuring structlog
While the basic usage is great for getting started, it still has a lot of training wheels attached.
The power of structlog comes from the concept of “processors”, which are a pipeline of functions that modify the event (the event_dict in all of the example code). Since event_dict is a dictionary, keys can be modified as needed in each function before the next function is called.
The last step in the processor function chain is the Renderer function, which takes in the event_dict and “renders” it appropriately for the output handler. By default, and with most of the examples, this is the logging library from Python’s stdlib. So effectively there’s an arbitrary dict that gets massaged into the kwargs for logging.getLogger().log().
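Concretely, a processor is just a callable that takes the wrapped logger, the name of the log method being called (“info”, “error”, and so on), and the event_dict, and returns the dict for the next step in the pipeline. Here’s a minimal sketch of a custom processor; the add_hostname name is mine, not part of structlog.

import socket

# A toy processor: receives the wrapped logger, the log method name,
# and the event_dict, mutates the dict, and hands it to the next processor.
def add_hostname(logger, method_name, event_dict):
    event_dict["host"] = socket.gethostname()
    return event_dict

Dropping a function like that anywhere before the renderer means every event picks up a host key without touching any individual log call.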
import structlog

# This configures the module, needed only once per process.
# Any subsequent code only calls structlog.get_logger()
structlog.configure(processors=[
    structlog.stdlib.add_log_level,
    structlog.processors.KeyValueRenderer(sort_keys=True)
])
log = structlog.get_logger()
log.info("Hello World")
# This outputs
# event='Hello World' level='info'
However, the basic usage comes with a bunch of default processors that have to be explicitly re-added once processors is overridden, since it’s a plain list rather than a numerically ordered set of middlewares like Scrapy’s. This is the documented suggested configuration for use with logging:
import structlog

structlog.configure(
    processors=[
        structlog.stdlib.filter_by_level,
        structlog.stdlib.add_logger_name,
        structlog.stdlib.add_log_level,
        structlog.stdlib.PositionalArgumentsFormatter(),
        structlog.processors.StackInfoRenderer(),
        structlog.processors.format_exc_info,
        structlog.processors.UnicodeDecoder(),
        structlog.stdlib.render_to_log_kwargs,
    ],
    context_class=dict,
    logger_factory=structlog.stdlib.LoggerFactory(),
    wrapper_class=structlog.stdlib.BoundLogger,
    cache_logger_on_first_use=True,
)
At this point I came to a screeching halt and had to take a step back to understand what was going on.
Breaking Down The Processors
The processors in question do the normal pipeline of operations that logging does, but I’ve never gone deep into logging except to know that:
- logging will only output when the desired log level meets the logging level threshold.
- I should use %-formatted strings and pass the substitutions as args to log.log()
These are optimizations in the logging library to minimize unnecessary processing. The first rule allows us to entirely skip DEBUG messages when we’re not logging DEBUG. The second rule allows us to bypass string interpolation on those calls.
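For a quick illustration of the second rule with plain stdlib logging (nothing structlog-specific, and the values are made up):

import logging

log = logging.getLogger(__name__)
records, elapsed = 10000, 1.2  # example values

# Preferred: interpolation only happens if DEBUG is actually enabled
log.debug("processed %d records in %.1fs", records, elapsed)

# Works, but the f-string is built even when DEBUG messages are filtered out
log.debug(f"processed {records} records in {elapsed:.1f}s")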
So similarly, our processors do this work.
import structlog

structlog.configure(
    processors=[
        # This performs the initial filtering, so we don't
        # evaluate e.g. DEBUG when unnecessary
        structlog.stdlib.filter_by_level,
        # Adds logger=module_name (e.g. __main__)
        structlog.stdlib.add_logger_name,
        # Adds level=info, debug, etc.
        structlog.stdlib.add_log_level,
        # Performs the % string interpolation as expected
        structlog.stdlib.PositionalArgumentsFormatter(),
        # Include the stack when stack_info=True
        structlog.processors.StackInfoRenderer(),
        # Include the exception when exc_info=True
        # e.g. log.exception() or log.warning(exc_info=True)'s behavior
        structlog.processors.format_exc_info,
        # Decodes the unicode values in any kv pairs
        structlog.processors.UnicodeDecoder(),
        # Creates the necessary args, kwargs for log()
        structlog.stdlib.render_to_log_kwargs,
    ],
    # Our "event_dict" is explicitly a dict
    # There's also structlog.threadlocal.wrap_dict(dict) in some examples,
    # which keeps global context as well as thread locals
    context_class=dict,
    # Provides the logging.Logger for the underlying log call
    logger_factory=structlog.stdlib.LoggerFactory(),
    # Provides predefined methods - log.debug(), log.info(), etc.
    wrapper_class=structlog.stdlib.BoundLogger,
    # Caching of our logger
    cache_logger_on_first_use=True,
)
Other batteries included
Other nifty processors that come out of the box include:
- structlog.processors.ExceptionPrettyPrinter(), which will pop the exception information off the event_dict and print the familiar “Traceback (most recent call last):” text.
- structlog.dev.ConsoleRenderer(), which provides some nicer terminal layouts than the unconfigured structlog.
- structlog.processors.JSONRenderer(), for JSON output. Call with sort_keys=True for human readability (see the sketch after this list).
- structlog.processors.TimeStamper(), which adds a timestamp key, defaulting to UTC.
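As a rough sketch of the JSON option, swapping the renderer is just a matter of changing the last processor (the exact timestamp value will obviously differ):

import structlog

structlog.configure(processors=[
    structlog.stdlib.add_log_level,
    structlog.processors.TimeStamper(fmt="iso"),        # ISO-8601 instead of the UNIX-time default
    structlog.processors.JSONRenderer(sort_keys=True),  # keyword args pass through to json.dumps
])

log = structlog.get_logger()
log.info("Hello World", a=1)
# {"a": 1, "event": "Hello World", "level": "info", "timestamp": "2018-08-23T..."}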
The structlog processor library is a mix of classes and functions. The classes initially tripped me up with an error message about __init__ positional arguments before I realized not everything was a function. Any classes have to be instantiated (generally with a configuration) before being included in the processor pipeline. The pipeline then calls the resulting object, treating it as a function. This works because all of these included classes define the magic method __call__.
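To make the distinction concrete, here’s a small sketch mixing the two kinds in one pipeline:

import structlog

structlog.configure(processors=[
    structlog.stdlib.add_log_level,           # a plain function, passed as-is
    structlog.processors.TimeStamper(),       # a class, so it must be instantiated
    structlog.processors.KeyValueRenderer(),  # also a class; note the parentheses
])
# Passing one of the classes without the parentheses can fail with exactly
# the kind of __init__ positional-arguments TypeError mentioned above.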
Custom Processors
This idea of processors opens up significant power to customize the logging pipeline globally. As mentioned, the pipeline is a list of Python functions.
Beyond enforcing tags, like adding a timestamp and host info, or modifying the pipeline based on a deployment target, these custom functions allow for event sampling, which is essential to avoid data collection overload.
Any processor can prevent an event from continuing through the processing chain by raising the exception structlog.DropEvent.
Here’s an example with the ever-popular FizzBuzz that will drop every “Fizz”, as well as every other “Fizz Buzz”. While just a toy example, this could be extended to anything: log every delete but only 1 in 100 search queries.
import structlog
from structlog import get_logger


class FizzDropper(object):
    def __call__(self, logger, method_name, event_dict):
        # An example of keying off the log message
        if event_dict.get('event') == "Fizz":
            raise structlog.DropEvent
        # An example of keying off additional data
        count = event_dict.get('count')
        if count and count % 3 == 0 and count % 5 != 0:
            raise structlog.DropEvent
        return event_dict


class FizzBuzzDropper(object):
    counter = 0

    def __call__(self, logger, method_name, event_dict):
        if event_dict.get('event') == "Fizz Buzz":
            self.counter += 1
            if self.counter % 2 == 0:
                raise structlog.DropEvent
        return event_dict


structlog.configure(
    processors=[
        FizzDropper(),
        FizzBuzzDropper(),
        structlog.dev.ConsoleRenderer(pad_event=15)
    ],
)
log = get_logger()

for i in range(1, 100):
    if i % 3 == 0 and i % 5 == 0:
        log.error("Fizz Buzz", count=i)
    elif i % 3 == 0:
        # This should go missing!
        log.info("Fizz", count=i)
    elif i % 5 == 0:
        log.info("Buzz", count=i)
    else:
        log.info(str(i), count=i)
Simple sampling like this is easy, but effective sampling is a hard problem. I recommend reading Baron Schwartz’s blog post on selecting representative samples if you’re interested in learning more about sampling.
Migrating to structlog
If I could update all the code in my environment at once, switching to structlog could be done in an afternoon. However, I have multiple projects with their own update schedules. I also have Sentry to contend with, which is tied into the logging library.
Sentry
I found one GitHub issue relating to configuring structlog with Sentry, but it ended up confusing me more than helping. So here’s what did work for me in testing:
import logging

import structlog
from raven import Client
from raven.handlers.logging import SentryHandler

sentry_handler = SentryHandler(Client(SENTRY_DSN))  # SENTRY_DSN is the project's DSN string
sentry_handler.setLevel(logging.ERROR)
logging.getLogger().addHandler(sentry_handler)

log = structlog.get_logger()
try:
    1 / 0
except ZeroDivisionError:
    log.exception("Div by Zero")
Conceptually what’s happening is we configure a logging handler, attach it to the root logger, and then let structlog fetch the root logger via the default logger_factory.
External libraries
Without changing everything to structlog at once, I also have to contend with loggers in imported libraries that have not yet been updated or are outside my control; otherwise the logs will be half JSON, half traditional Python messages. That’s a sure guarantee to break any parser reading these logs.
The structlog docs suggest python-json-logger.
Pieced together from a few code snippets in the examples, we get a setup that looks like this:
import logging
import sys

import structlog
from pythonjsonlogger import jsonlogger
from raven import Client
from raven.handlers.logging import SentryHandler
from structlog import configure, get_logger

# Setup jsonlogger to print JSON
json_handler = logging.StreamHandler(sys.stdout)
json_handler.setFormatter(jsonlogger.JsonFormatter())

# Setup Sentry to only report on errors
sentry_handler = SentryHandler(Client(SENTRY_DSN))
sentry_handler.setLevel(logging.ERROR)

# Add both handlers to logging
logging.basicConfig(
    format="%(message)s",
    handlers=[json_handler, sentry_handler],
    level=logging.INFO,  # Or whatever the general level should be
)

# The processor list
processors = [
    structlog.stdlib.filter_by_level,                 # First step: filter by log level
    structlog.stdlib.add_logger_name,                 # module name
    structlog.stdlib.add_log_level,                   # log level
    structlog.stdlib.PositionalArgumentsFormatter(),  # % formatting
    structlog.processors.StackInfoRenderer(),         # adds stack if stack_info=True
    structlog.processors.format_exc_info,             # Formats exc_info
    structlog.processors.UnicodeDecoder(),            # Decodes all bytes in dict to unicode
    structlog.processors.TimeStamper(),               # Because timestamps! UTC by default
    structlog.stdlib.render_to_log_kwargs,            # Preps for the stdlib logging call
]

# Configure structlog
configure(
    processors=processors,
    context_class=dict,
    logger_factory=structlog.stdlib.LoggerFactory(),
    wrapper_class=structlog.stdlib.BoundLogger,
    cache_logger_on_first_use=True,
)
log = get_logger()

# App code
log.info("Starting app")

from module import foo

foo()  # function with logging.getLogger().info("foo")
Now any modules I import in my application will log with JSON, albeit without my pipeline additions. My text in the log.log() becomes the "message" key of the JSON.
{"message": "Starting app", "logger": "__main__", "level": "info", "timestamp": 1535063721.076024}
{"message": "foo"}
If I used a renderer other than render_to_log_kwargs, there’s a possibility the entire event will instead be encoded as the value of the message key, like {"message": "event='Starting' level='info' logger='__main__' timestamp=1535058769.622922"}.
It’s important to double-check the format of the output while testing, and to verify that Sentry is correctly grouping the log messages it receives.
Now I can gradually migrate my internal libraries over to structlog, while all of the application’s logs switch to JSON at once.