Push, pull?

Push, or pull? That is the question of the day.

What is push and what is pull? These are simple principles, but they can be difficult to explain, so I am going to let monitoring do that for me.

Let’s start with Nagios. Nagios is a common (ok, it’s over-adopted) monitoring (and alerting) tool that allows you to query the state of things in your environment. Its purpose? Detect the state of your environment, and notify your engineers when something breaks.

So how does Nagios work? Well, Nagios uses a bunch of config files and plugins to interact with your environment. The configuration files specify hostnames, IP addresses, commands, good values, bad values, etc. The plugins are the applications that take the values from the configuration and use them to query your environment, making sure everything is ok.
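To make the plugin idea concrete, here is a minimal sketch of a pull-style check in Python. It is not taken from any real Nagios plugin: the function name check_tcp and its parameters are made up for illustration, and it only uses the standard Nagios plugin convention that exit code 0 means OK and 2 means CRITICAL.

```python
import socket

# Standard Nagios plugin exit codes: 0 = OK, 2 = CRITICAL.
OK, CRITICAL = 0, 2


def check_tcp(host, port, timeout=5.0):
    """Hypothetical pull-style check: go ask the service if it is there."""
    try:
        # The monitor initiates the conversation; the service only answers.
        with socket.create_connection((host, port), timeout=timeout):
            return OK, f"OK - {host}:{port} is accepting connections"
    except OSError as exc:
        # Refused connections and timeouts both end up here.
        return CRITICAL, f"CRITICAL - {host}:{port} unreachable ({exc})"
```

A real plugin would print the message and exit with the returned code, and the host and port would come from the configuration files rather than being hard-coded.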

Great, so all I have to do to monitor my environment is tell Nagios to issue an RFI (a request for information) to my service and, based on the answer, tell me if it’s good or bad. That is pull: Nagios asks, and the service answers. Well, sure, this sounds too good to be true, and in some cases, it is.

What if a service is batch-based? Batch processes tend to break pull-based monitoring pretty quickly. That’s because batch processes come and go. We may not know when the process is going to run the next batch. So how do we fix this?

In the case of an application that comes and goes, why not do something like:

  • Create a database and table
  • Insert values into the table
    • Process Name (varchar 100?)
    • Process Start Time (timestamp)
    • Process Expected Completion (timestamp)
    • Process Status (bool)

Now, we can have our application, upon launch of a job, insert something like:

insert into batch_status values ('Development Test Query', '2013-12-19 14:29:00', '2013-12-19 14:39:00', 1);

Where:

  • Process Name: Development Test Query
  • Process Start Time: 2013-12-19 14:29:00
  • Process Expected Completion: 2013-12-19 14:39:00
  • Process Status: 1 (0 finished, 1 running)

As each job completes, its last step would be to update batch_status, setting process_status to 0 (finished).
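The push side described above might be sketched like this in Python. It uses an in-memory SQLite database only so the example runs anywhere; the column names (other than process_status) and the helper names create_table, start_job and finish_job are assumptions made for illustration, not part of any existing tool.

```python
import sqlite3


def create_table(conn):
    # Column names are assumed; the types follow the list above.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS batch_status (
               process_name        VARCHAR(100),
               process_start       TIMESTAMP,
               expected_completion TIMESTAMP,
               process_status      INTEGER  -- 1 = running, 0 = finished
           )"""
    )


def start_job(conn, name, start, expected_completion):
    """Called by the batch job at launch: push a 'running' row."""
    cur = conn.execute(
        "INSERT INTO batch_status VALUES (?, ?, ?, 1)",
        (name, start, expected_completion),
    )
    conn.commit()
    return cur.lastrowid  # remember which row belongs to this run


def finish_job(conn, rowid):
    """Called as the job's last step: mark the row as finished."""
    conn.execute(
        "UPDATE batch_status SET process_status = 0 WHERE rowid = ?",
        (rowid,),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
create_table(conn)
job = start_job(conn, "Development Test Query",
                "2013-12-19 14:29:00", "2013-12-19 14:39:00")
# ... the batch work itself would happen here ...
finish_job(conn, job)
```

Keeping the row id from the insert means two runs of the same job name do not overwrite each other's status.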

Now, we can have Nagios search the database for any process whose status is not 0 and whose expected completion time is before now(). Where it finds one, it can issue an alert, because Nagios is now able to pull the information.
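The pull side could look something like the sketch below. Again this is only an illustration under assumptions: the function name check_batch_jobs, the column names, and the SQLite database are made up, and the current time is passed in as a string so the example is repeatable. A real check would use the actual clock and the real database.

```python
import sqlite3

# Standard Nagios plugin exit codes: 0 = OK, 2 = CRITICAL.
OK, CRITICAL = 0, 2


def check_batch_jobs(conn, now):
    """Hypothetical check: find jobs still running past their deadline."""
    rows = conn.execute(
        "SELECT process_name, expected_completion FROM batch_status "
        "WHERE process_status != 0 AND expected_completion < ?",
        (now,),
    ).fetchall()
    if rows:
        names = ", ".join(name for name, _ in rows)
        return CRITICAL, f"CRITICAL - {len(rows)} overdue job(s): {names}"
    return OK, "OK - no overdue batch jobs"


# Small demo table with one job that is still marked as running.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE batch_status (process_name VARCHAR(100), "
    "process_start TIMESTAMP, expected_completion TIMESTAMP, "
    "process_status INTEGER)"
)
conn.execute(
    "INSERT INTO batch_status VALUES ('Development Test Query', "
    "'2013-12-19 14:29:00', '2013-12-19 14:39:00', 1)"
)
code, message = check_batch_jobs(conn, now="2013-12-19 14:45:00")
print(code, message)
```

Comparing the timestamps as strings works here only because they are all in the same YYYY-MM-DD HH:MM:SS format, which sorts in time order.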

Interesting, we just turned a push-based process into one that can be consumed by pull. Nifty.

Technology groups are like monitoring and processes. The managers are like Nagios, and your engineers can be either push or pull responsive.

What do you do when you are a manager and need to get information from an engineer, but they are constantly in flux, and busy adjusting things, wearing their many hats? Slow down. Stop. Remember, not everyone works the same way.

I am a push engineer. I do not do well with being interrupted on your schedule to relay information to you. I find a time in my day when I can sit down and dump that information. Unfortunately, managers are not often trained to handle this.

I came up with an interesting idea, because of monitoring. What if you created a method by which a manager can still query the data as they want, and as often as they want? Wouldn’t that make for a happy manager? Now what if the engineer doesn’t have to worry about being asked for information, and can provide it at their leisure? How would you go about doing this? Email is an ok medium, but it gets sloppy over time.

If you have an engineer who works best when they can push information, try setting up a blog. A space where they can dump their status, and you can go and consume it. It gives you the best of both worlds. You are able to get the updates you so need, but you don’t create the friction with your engineer who is trying to keep your processes moving.

-Villain
