Traditionally, to monitor your servers, your company would set up in-house monitoring with something like Nagios, and that was how things were done.
With the rise of cloud services came a rise in hosted monitoring. If you opt to host on a platform like Heroku or AWS Lambda, you won’t even have a datacenter to self-host in!
Continue reading “Hosted vs Self-Hosted Monitoring”
Datadog is a great service I’ve used for monitoring. Since the agent is Python-based it’s very extensible through a collection of pip installable libraries, but the documentation is limited on how to handle these libraries.
If you use the provided datadog-agent package, Datadog comes with its own set of embedded applications to monitor your server, including python for the agent, supervisord to manage the Datadog processes, and pip. Since this is all just Python, surely this can lead to something. Can’t we import our own custom libraries in our custom checks? Yes we can.
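As a rough sketch of the idea, assuming your extra libraries live in a virtualenv whose site-packages directory you know, a custom check module could extend the import path before importing them. The path below is hypothetical, and add_library_path is an illustrative helper, not part of Datadog's API:

```python
import os
import site
import sys

# Hypothetical location of a virtualenv's site-packages; adjust to your setup.
CUSTOM_SITE_PACKAGES = "/opt/my-checks/venv/lib/python3/site-packages"


def add_library_path(path):
    """Make packages under `path` importable; return True if it is on sys.path."""
    if not os.path.isdir(path):
        # Nothing to add; leave sys.path untouched.
        return False
    # addsitedir appends the directory to sys.path and processes any .pth files in it.
    site.addsitedir(path)
    return os.path.abspath(path) in sys.path
```

Calling add_library_path(CUSTOM_SITE_PACKAGES) at the top of a check file, before the imports that depend on it, has much the same effect as exporting that directory through PYTHONPATH for the agent process.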
Continue reading “Using virtualenv and PYTHONPATH with Datadog”
Elasticsearch allows you to set up heterogeneous clusters, that is, nodes with different configurations within the same cluster. Elastic (the company) refers to this architecture as “Hot-Warm”, but it’s called tiered storage if you come from a storage background.
The canonical example is that you have a bunch of data you want to keep online and able to query, but it becomes less relevant over time. You want to cut costs, so you have your “Hot” data that is written and/or read frequently, most likely on SSD, and then “Warm” data accessed less frequently on less expensive nodes, most likely on spinning disks. But it doesn’t end there: this architecture can be used and extended in different ways.
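To make the Hot-Warm idea concrete, here is a small sketch that builds the index settings body used to pin an index to one tier. It assumes each node was started with a custom attribute (node.attr.box_type set to hot or warm); the attribute name box_type is only a common convention, and allocation_settings is a hypothetical helper:

```python
import json


def allocation_settings(tier, attribute="box_type"):
    """Build an index settings body that requires nodes tagged with `tier`."""
    if tier not in ("hot", "warm"):
        raise ValueError("tier must be 'hot' or 'warm'")
    # index.routing.allocation.require.<attribute> restricts shard placement
    # to nodes whose custom attribute matches the given value.
    return {"index.routing.allocation.require." + attribute: tier}


# Body you might send to an index's _settings endpoint to relocate it to warm nodes.
print(json.dumps(allocation_settings("warm")))
# prints {"index.routing.allocation.require.box_type": "warm"}
```

Applying the "warm" body to an aging index asks Elasticsearch to migrate its shards off the SSD-backed nodes, while new indices keep the "hot" requirement.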
Continue reading “Tiered Storage With Elasticsearch”
tcpdump is a great utility for debugging and profiling network applications. It’s near universal: some tcpdump variant exists on nearly every platform. You can even run tcpdump on some higher-end switches and routers. Packet dumps can be saved for later analysis or sent off to vendors for debugging.
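As a hedged illustration of saving a dump for later analysis, the sketch below only assembles tcpdump argument lists using the standard -i (interface), -w (write to file), -c (packet count), and -r (read from file) options. The helper names, interface, and file names are made up for the example, and nothing is executed:

```python
import shlex


def capture_command(interface, outfile, count=None, bpf_filter=None):
    """Return the argv for capturing packets on `interface` into `outfile`."""
    argv = ["tcpdump", "-i", interface, "-w", outfile]
    if count is not None:
        # Stop after this many packets instead of running until interrupted.
        argv += ["-c", str(count)]
    if bpf_filter:
        # The filter expression goes last, e.g. "port 443".
        argv += shlex.split(bpf_filter)
    return argv


def read_command(infile):
    """Return the argv for replaying a saved capture file."""
    return ["tcpdump", "-r", infile]


print(capture_command("eth0", "capture.pcap", count=100, bpf_filter="port 443"))
```

The resulting list can be handed to subprocess.run (usually with root privileges for live captures), and the saved file can be replayed locally or passed along to a vendor.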
Continue reading “A Primer on TCPDump”
I authored this as part of SysAdvent, which publishes one system administration-related post each day in December, ending on the 25th. You can find the original post here: http://sysadvent.blogspot.com/2016/12/day-15-take-that-vacation-eliminate.html
It’s mid afternoon and you just sat down for that holiday meal with your family and friends. Your phone goes off and you look at the number. Work, again.
Before you even read the text or answer the call with the robotic voice telling you about the latest problem, you’re wondering to yourself, “How long will it take?” Your relatives are only in town for another day or two before you have to take them to the airport. What if it goes off again later? A holiday potentially ruined.
You read the text. Maybe it’s a false alarm. Maybe it’s not. Either way you’re out of the moment, worrying about work and whether things are going to break over the holidays.
Don’t Be Your Own Grinch
It’s possible to engineer yourself and your environment for success.
Continue reading “Take That Vacation: Eliminate Alerts Dragging You Back to the Office”