Quick Note on GnuRadio on Pentoo

Not a big blog, but a quick problem I got solved on IRC that I thought might help others.

I have a Gateway LT4009u with an Atom N2600. It's my "hacker/workshop" laptop. The Atom N processors are a bit gimpy, so sometimes things don't run right. One of those things is GNURadio on Pentoo. Pentoo runs hardened, and this pisses off the Atom N.

So if you get the following error:

LLVM ERROR: Allocation failed when allocating new memory in the JIT
Can't Allocate RWX Memory: Operation not permitted

Then you need to soft-disable hardened with the following command:

sudo toggle_hardened

I hope that helps anyone else on the internet.

Thanks to Zero_Chaos in #pentoo on irc.freenode.net for the fix (and pentoo)

Quick Update: This also happens when running in VirtualBox 5 on my 2015 MacBook i7, but the fix is the same.


Monitoring Chef runs without Chef

I, like many sysadmins, really want to monitor all the things I actually care about. Monitoring in general is hard: not because it's hard to set up, but because it's hard to get right. It's really easy to monitor ALL THE THINGS and then just end up with pager fatigue. It's all about figuring out what you need to know and when you need to know it.

So in this case I really need to know that my machines are staying in compliance with chef.

There are a few ways you can do this. The first thought I had was adding a hook into all of my runs and having them report in on failure. This is mostly because I'm always looking for another way to hack on Chef and work on my ruby. The big problems with this are:

  • What if the node is offline?
  • What if the cron doesn’t fire?
  • What if chef or ruby is so borked it can't even fire the app?
  • What if someone disabled chef?

I need a better solution.

Knife Status

Knife status is just awesome; it has some awesome flags, and generally I run it far more than I should. The great part about this query-the-server approach is that it lets me know:

  1. The server is still happy and spitting out cookbooks to nodes
  2. The status of ALL of my runs from the “source of truth” for runs

Not making my chef test rely on chef

But I'm not going to shell out to knife status. I'm a damn code snob, and something about having the chef test rely on the chef client status didn't seem right.

Instead I wrote a nagios script that I am not going to share in its entirety here because $WORK_CODE¹ (insert sad face), but I will tell you exactly how I did it.

How to python your chef, or how I stopped worrying and learned to love that I can still use python to do anything.

I'm most experienced in python, and almost all of our internal nagios checks are written in python. So this is in python.

Step one

Use pynagioscheck and pychef. Seriously. Don’t reinvent the wheel here.

Step two

Create a knife object. Have it take all your settings on initialize; then you can create functions for all the different knife commands to recreate them with pychef.

You really only need status for this one. The meat of status is this snippet, which coderanger dropped on me in IRC:

for row in chef.Search('node', '*:*'):
    nodes[row.object['machine name']] = datetime.fromtimestamp(row.object['ohai_time'])
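
Put together with pychef's ChefAPI, the wrapper ends up looking roughly like this. This is a sketch, not my exact class: the names are illustrative, and here I key the dict on the node object's name.

# Rough sketch of the knife-style wrapper -- class and attribute names are
# illustrative, not lifted verbatim from my script.
from datetime import datetime

import chef


class Knife(object):
    def __init__(self, server_url, client_name, key_path):
        # pychef's ChefAPI takes the server URL, the path to the client .pem,
        # and the client name
        self.api = chef.ChefAPI(server_url, key_path, client_name)

    def status(self):
        """Return {node name: datetime of last converge} for every node."""
        nodes = {}
        for row in chef.Search('node', '*:*', api=self.api):
            # ohai_time is the epoch timestamp recorded on the last converge
            nodes[row.object.name] = datetime.fromtimestamp(
                row.object['ohai_time'])
        return nodes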

Step three

Now from here I created a TimeChecker object. It takes the dictionary of { server: datetimeObj } in its init. For consistency's sake I also init self.now = datetime.now(). Then I have a TimeChecker.runs_not_in_the_last() that just takes an int.

The magic of runs_not_in_the_last I will also share with you, because I'm proud of this damn script and want to share it with the world:

diff = timedelta(hours=hours)
return [k for k in self.runtimes.keys() if self.now - self.runtimes[k] > diff]
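
Put together, the whole object is roughly this (a sketch; the real class has a few more helpers):

from datetime import datetime, timedelta


class TimeChecker(object):
    def __init__(self, runtimes):
        # runtimes is the {server: datetime} dict that the status call returned
        self.runtimes = runtimes
        # Pin "now" once so every comparison in this check uses the same instant
        self.now = datetime.now()

    def runs_not_in_the_last(self, hours):
        """Return the nodes whose last converge is older than `hours` hours."""
        diff = timedelta(hours=hours)
        return [k for k in self.runtimes.keys()
                if self.now - self.runtimes[k] > diff]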


Step four

Now just extend NagiosCheck with KnifeStatusCheck, make all your options and other goodies in your init, and then write your check().

In the check you make a Knife, make a TimeChecker with the status return… then all you have to do is see if you have any runs_not_in_the_last for critical and then for warning.
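
The shape of the check class is roughly this. Fair warning: the option names are made up, and the NagiosCheck plumbing (add_option, check(opts, args), run()) is from memory, so double-check it against pynagioscheck's docs.

from nagioscheck import NagiosCheck, Status


class KnifeStatusCheck(NagiosCheck):
    def __init__(self):
        NagiosCheck.__init__(self)
        # Option names here are illustrative
        self.add_option('s', 'server', 'server', 'Chef server URL')
        self.add_option('u', 'user', 'user', 'API client name')
        self.add_option('k', 'key', 'key', 'Path to the client .pem')
        self.add_option('w', 'warning', 'warning',
                        'Warn after this many hours without a run')
        self.add_option('c', 'critical', 'critical',
                        'Critical after this many hours without a run')

    def check(self, opts, args):
        knife = Knife(opts.server, opts.user, opts.key)   # wrapper from step two
        checker = TimeChecker(knife.status())             # object from step three

        crit = checker.runs_not_in_the_last(int(opts.critical))
        if crit:
            raise Status('critical', '%d nodes out of compliance' % len(crit))

        warn = checker.runs_not_in_the_last(int(opts.warning))
        if warn:
            raise Status('warning', '%d nodes out of compliance' % len(warn))

        raise Status('ok', 'all nodes have converged recently')


if __name__ == '__main__':
    KnifeStatusCheck().run()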

Gotchas and cleanup notes

Exceptions

Seriously, this script can and will raise them, so catch them properly and return errors. You will need to catch and handle at least:

  • URLError
  • Status
  • UsageError
  • ChefError
  • At least two of your own exceptions
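
The error handling ends up looking something like this. Which module each exception lives in depends on your Python and library versions, so treat this as a sketch:

# Inside check(): turn library failures into a sane nagios state
from urllib2 import URLError           # urllib.error.URLError on Python 3

from chef.exceptions import ChefError
from nagioscheck import Status

try:
    runtimes = knife.status()
except URLError as e:
    # Chef server unreachable -- report it instead of blowing up the check
    raise Status('unknown', 'could not reach the Chef server: %s' % e)
except ChefError as e:
    raise Status('unknown', 'chef API error: %s' % e)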

SSL errors

So there is no trusted_certs here. You need to either give your server a working cert, install the snake-oil cert as trusted on the nagios server, or do the dirtiest of monkey patches:

# Dirty Monkeypatch
if sys.version_info >= (2, 7, 9):
    import ssl
    ssl._create_default_https_context = ssl._create_unverified_context

But before you do this think of the children!!!

Weird ass errors with join

I need to maybe open a ticket and patch pynagioscheck, but I had the weirdest bug when raising a critical. It would die in the super's check on "".join(bt) or something of that ilk.

My workaround was to not just pass msg to the Status exception but to make msg a list, put the main message in msg[0], and put the comma-joined list of servers out of compliance in msg[1]. This means the standard message comes up on normal returns, but if you run the check with -v it will give you a list of servers out of compliance for troubleshooting or debugging. Not bad.
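
Concretely, the workaround is something along these lines (names are illustrative):

bad_nodes = checker.runs_not_in_the_last(int(opts.critical))
if bad_nodes:
    # msg[0] shows on a normal run; msg[1] only shows when the check runs with -v
    raise Status('critical',
                 ['%d nodes out of compliance' % len(bad_nodes),
                  ', '.join(bad_nodes)])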

Handling the pem file

Eeeeehhhh. This may be my one cop-out in the whole script. Basically I created a nagios user in chef with an insane, never-to-be-used-again and promptly lost password, and put the nagios.pem file alongside the check script. Then I let the script optionally take a pem name, and it just checks that the pem file is alongside the check script. I was considering letting you specify a pem path somewhere on the server or in the Nagios user's home directory, but decided to skip that and take the simplest route.
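
The path handling itself is nothing fancy, roughly this (the default name and the option are hypothetical):

import os

# Look for the .pem next to the check script itself
pem_name = opts.key or 'nagios.pem'
pem_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), pem_name)
if not os.path.isfile(pem_path):
    raise Status('unknown', 'cannot find %s alongside the check script' % pem_path)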

Don’t destroy your nagios server

Seriously. Did you see this code? It runs a search on all nodes and then returns an attribute for every node to your nagios server. This is not the world's fastest check script.

Unless you dedicate some serious power to the solr service on your chef server, you should make sure to only check this service once every ten minutes, tops. I only check once an hour normally and then follow up with 10-minute checks on failure. Since I only do converges every four hours, an "out of compliance" warning for me would be at the 12-hour mark and critical at 24 hours.²

  1. I don't yet have any clearance to post or share anything I write for, while, at, or around work. The company owns all that, but we are currently working on getting to the point where we can share some stuff, especially things not so related to our IP, like infrastructure code, cookbooks, checks, etc. 

  2. The reason I picked these numbers is that I don't want to know the FIRST time a converge fails. I use the omnibus_updater in my runs (pinned version in attributes, of course), so a failed run can be normal. Plus, if I am deploying something that important, I am going to spot-check runs and verify everything gets run with knife ssh. I mostly just want to know if a machine is out of the loop for more than a day, because that's a node that needs to get shot. 


An Open Year

It's been about a year since my last post, which was mostly me being frustrated with Chef as a beginner. Now I spend most of my day writing cookbooks and recipes. In fact, I am even helping the lead dev at work learn Chef, and I just got back from the Chef conference, where I met a lot of amazing people and even offered to help maintain BSD support in Chef.

This post isn't about that so much. It's mostly about a behavior I noticed I'd picked up. When I worked for Stephens Media I spent a lot of my energy trying to contribute: in posts, open source, pull requests, etc. Then when I moved to Slickdeals.net my time really got sucked up. I drifted from working on Pelican and stopped doing as many pull requests. At some point I set up a personally hosted Stash instance. Then I locked that Stash instance off behind a login. Then I started writing in my private confluence instead of here. Now all my projects these days are All Rights Reserved, I noticed... hmph.

I don't know exactly what triggered this sharephobia but it needs to stop. I almost think it's some weird greed involving my personal time and effort but if I was greedy wouldn't I want people fixing up my code for me? Is there some revolutionary private research in all this that makes me more valuable? I think showing off my abilities and progress makes me more valuable.

I'm currently working on pulling all my code out of my Stash and putting it onto GitHub, with a much better BSD license. I'm remembering what the subtitle of my blog really means.

I've spent a lot of time studying Ruby since I finished my DBA course. There are still a lot of areas where Chef could use improvement, and I plan to do a lot about it. We are going to make BSD a first-class citizen with Chef, and hopefully with many of its tools and cookbooks too.¹

Remember when I used to post monthly? Hahahaha. I don't want to use this as a journal (I already have one of those), but I wanted to give a bigger-picture life update since I am updating pages and testing my Jenkins build trigger with GitHub ;p

  1. I have always preferred UNIX to Linux. My first sysadmin job was as a Solaris admin, a job I did for a long time. With the advent of SystemD I've gone back to my love in the form of BSD.