Geo::PostalCode::NoDB 0.01

Geo::PostalCode is a great Perl module. It lets you find the postal areas (zip codes) surrounding a given one within an amount of miles (radius), calculate the distance between them, among other nice features. Sadly, I couldn’t get it to work with updated data, because the file its Berkeley DB installer was producing was not being recognized by its parser, which is based on DB_File. Since I was able to find a working data source of zip codes, I ended up hacking the module and producing a version with no Berkeley DB support.

So basically, and taken from the POD:

RATIONALE BEHIND NO BERKELEY DB
On a busy day at work, I couldn’t get Geo::PostalCode to work with newer data (the data source TJMATHER points to is no longer available): the tests shipped with his module pass, but using real data no longer seems to work. DB_File marked the Geo::PostalCode::InstallDB output file as an invalid type or format. If you use the original module and don’t run into that issue, please drop me a note! I would love to learn how other people made it work.

So, in order to get my shit done, I decided to create this module. Loading the whole data set into memory from the class constructor has proven to be enough for massive usage (citation needed) on a Dancer application where this module is instantiated only once.

$ sudo cpanm Geo::PostalCode::NoDB now!


Feedbag released under MIT license

I was contacted by Pivotal Labs regarding the licensing of Feedbag. I guess releasing open source software as GPL only makes sense if you continue to live under a rock. I’ve bumped the version to 0.9 and released it under the MIT license.

Feedbag 1.0, which I plan to work on during the following days, will bring a shiny new backend powered by Nokogiri instead of Hpricot (I mean, give me a break, I’m trying to catch up with the Ruby community; after all, I’m primarily a Perl guy :D), and hopefully I will be able to recreate most of the feed auto-discovery test suite that Mark Pilgrim retired (410 Gone) when he committed infosuicide.
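For the curious, feed auto-discovery is simple at heart: scan the page for <link rel="alternate"> tags with a feed MIME type and resolve their hrefs against the page URL. Here’s a rough illustrative sketch of the idea; this is not Feedbag’s actual code, the names are mine, and a real implementation should use a proper HTML parser (like the Nokogiri backend mentioned above) instead of a regex:

```ruby
require 'uri'

# MIME types worth discovering, per the common autodiscovery convention
FEED_TYPES = %w[application/rss+xml application/atom+xml].freeze

# Naive sketch: find <link> tags that declare an alternate feed
# representation and resolve their hrefs against the page URL.
def discover_feeds(html, base_url)
  html.scan(/<link\s+[^>]*>/i).filter_map do |tag|
    next unless tag =~ /rel=["']alternate["']/i
    type = tag[/type=["']([^"']+)["']/i, 1]
    href = tag[/href=["']([^"']+)["']/i, 1]
    next unless href && FEED_TYPES.include?(type)
    URI.join(base_url, href).to_s
  end
end
```

Given a page with `<link rel="alternate" type="application/rss+xml" href="/feed.xml">` in its head, this returns the absolute feed URL; the regex approach breaks on enough real-world HTML that a parser-backed version is the way to go.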

Have a good weekend!


Is GitHub really necessary for your startup engineering team?

For the last few months, I’ve been ranting a little too much to our engineering team about GitHub. These rants, of course, didn’t come out of the blue. I haven’t been too satisfied with GitHub’s reliability lately: we use it for our daily development and constant deployment to our different staging environments, and our own operation has been affected by their recent multiple technical difficulties. This has led me to analyze whether we really need them, why we started paying for their service in the first place, and whether it’s really necessary for our product development at all.

Being a startup that built its main product on Ruby and Ruby on Rails, some of our initial engineers leaned towards Ruby and open source technologies. We started using GitHub because it was the natural, de facto choice. Creating repositories and adding private contributors was easy and got you up to speed with collaboration. As we grew in developer count, so did our repositories and projects. We even brought in a system administrator. Do we still need GitHub right now? I won’t be dropping Git as our main version control system, but what has GitHub provided my company and project in exchange for the $200 we pay them every month? Is it only hype for a company to use it for their internal repositories? Or is it hype to use them for their open repositories?

The lack of site reliability on their side is what got me thinking about whether we should still use them. What really concerns me is that they seem to have operational issues on a daily basis; they’re constantly announcing new hires, new pub meetings and new Octocat designs, yet little effort shows on the technical operations front. It sometimes amazes me how much people are in love with GitHub, to the point where I even find “liked” articles in their feed on Google Reader whenever they announce downtimes and br0kenage (now +1’s). It’s like people enjoy what’s wrong.

Are we using their issues functionality? The nice comment threads per commit, per line? Are we using pull requests? Are we using any of their actual offering compared to what a native Git server setup provides? No. Pretty much the only thing is collaboration management, and even then, the “Groups” feature per organization is still lacking. The one main feature we use is web browsing of the repos, which we could do with open source tools anyway. In the end, there are other great GUIs for Git that we also use on a daily basis, such as GitX or Tower (and for now, yes, GitHub’s GUI client).

The attack they suffered last week makes me wonder whether they are the right organization to provide us version control services, or whether we’re just using them because we haven’t known any better or even looked at our own resources (people and machines). Someone got commit access to an open source repository. What if it had been a malicious user getting access to private repositories where passwords and other sensitive data are stored? How serious is this company? $200 is not much for a startup to spend on a neat service, but when that provider seems more concerned about stickers or beer than about fixing their tech operations from the inside out, it gets worrying. I like to see how other, similar services we use compare to GitHub, how stable they feel and how visible that reliability is. For example, New Relic, which we also use, seems rock solid, and they don’t show issues at all. What’s the difference here, anyway?

Site reliability is tricky anyway: when it’s faulty, it shows and screams, because it affects business operations and reaches the entire company; when it’s working perfectly, the way it should, you barely realize people are doing a stupendous job. GitHub, however, doesn’t seem to have this clear.


Lessons learned and real experience

I’ve been busy.

We’ve been building a next-generation video ad serving platform with all of the lessons and experience we’ve gained over the last couple of years, and especially over the last one. In an industry where there’s plenty of client demand, there are only a few key supply players doing a good job, and we’ve identified our strengths, weaknesses, opportunities and threats. We’ve kept ourselves busy.

Over the course of last year, especially during the second half, and as far into this first quarter of 2012 as we’ve come, the push for a new, better product has made me reconsider what I thought I knew, and what I thought I had to learn, from an engineering point of view about how to build not only regular or good software, but enterprise software. Coming from a startup background where dynamics are key, we’ve shifted to a more concise, long-term, sustainable technology where pieces don’t need to change easily, but at the same time provide flexibility and customizable solutions for clients’ needs that allow us to continue growing.

The question for me has been whether all of what I thought I knew about Web dynamics, engineering and technology really is what I thought it was. How do you get to the point where you start using one product to enable you to build another? How is that decision made in your head? What business decisions are made that affect the technology stack your peers are building on, the product you are taking care of, to produce something usable? How are those decisions weighed from a technology point of view: are they just passed along and executed, or do they really carry considerations from all parts of the company?

A modern Web engineer does not only write software. That person should also understand how the entire business stack works, why the positions around them exist, and how those dots connect to provide a good solution for someone who pays to use it. The engineer has to understand that very well.

It’s amazing the number of strategies I’ve personally encountered and thought of over time. Real-life experience is invaluable, but a basic understanding of how things work is what enables it, and some people sometimes seem to forget that. Are you paying overpriced developers because you’re using hyped technology? Are you putting enough thought into what you are building and what it could lead to, in terms of those bricks?

It is not a myth that when you learn and increase your knowledge of something, something else seems to open up and leave your mind with more questions. Learning more makes you realize how ignorant you are. And it’s only the ability and willingness to continue learning that will lead you to a much better thought process, one that could eventually make you realize the answer to the question was right there from the very beginning, in the basics.


RVM + Rake tasks on Cron jobs

RVM hates my guts. And it doesn’t matter, because I hate RVM back even more. Since I was technologically raised by aging wolves, I have strong reasons to believe that you just shouldn’t have mixed versions of software on your production systems, especially when a lot of things, like most Ruby libraries, are poorly tested and aren’t backward compatible. I was raised in a world where everything worked greatly, because the good folks at projects like Debian or Perl have some of the greatest QA processes of all time, hands down. So, when someone introduces a thing like RVM, which not only promotes having hundreds of different versions of the same software across development, testing and production environments, but also encourages poor quality looking back and looking forward, there isn’t anything else to do but lose faith in humanity.

But enough ranting.

I had to deliver this side project that works with the Twitter API, and pretty much the only requirement was that it had to be runnable from the shell but also loadable as a class within a Ruby script. And so I did everything locally with my great local installation of Ruby 1.8.7. When the time came to deploy on the testing/production server, I found myself in a situation where the pathetic RVM was installed. After spending hours trying to accommodate my changes to run properly with Ruby 1.9.2, I set up a cron job using crontab to run my shit every now and then. And the shit wasn’t even running properly. Basically, my crontab line looked something like this:

*/30 * * * * cd /home/david/myproject && rake awesome_task

And that was failing; crontab was returning some crazy shit like “Ruby and/or RubyGems cannot find the rake gem”. Seriously? Then I thought, well, maybe my environment needs to be loaded or whatever, so I made a Bash script with something like this:

#!/bin/bash
cd /home/david/myproject
/full/path/to/rvm/bin/rake -f Rakefile awesome_task

And that was still failing with the same error. So, after trying to find out how cron jobs and crontab load Bash source files, I took a look at how Debian starts its shell upon login. And while that didn’t tell me much I didn’t already know, I went to look at the system-wide /etc/profile and found a gem: a wonderful directory, /etc/profile.d/, where a single shitty file was sitting, smiling back at me, as if waiting for me to find it and swear at all the problems in life: rvm.sh. /etc/profile is not loaded when my crappy script just runs /bin/bash, only when I log in; I should’ve known this. Doesn’t RVM solve the issue of system-wide installations so the user doesn’t have to deal with, you know, anything outside of their own /home?

So I had to go ahead and do:

#!/bin/bash
source /etc/profile
cd /home/david/myproject
/full/path/to/rvm/bin/rake -f Rakefile awesome_task

And hours later I was able to continue with work. Maybe this will help some poor bastard like myself in a similar situation in the future.
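For the record, an alternative that skips the wrapper script entirely would be to have cron invoke a login shell itself, since bash -l sources /etc/profile (and therefore /etc/profile.d/rvm.sh) before running the command. A sketch of what that crontab entry would look like:

```
*/30 * * * * /bin/bash -lc 'cd /home/david/myproject && rake awesome_task'
```

Same effect, one less file lying around.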

Of course, one can argue that I could’ve installed my own RVM and its Ruby versions, but why, oh why, if it was, apparently, already there? Why would I have to fiddle with the Ruby installations if all I want is to get my shit done and head to City Bakery, where I can spend the money I just earned on chocolate cookies? My work is pretty simple to run with pretty much any ancient version of Ruby, nothing fancy (unless you call MongoMapper fancy). RVM is a great project that doesn’t solve an issue, but just hides some really fucked up shit in the Ruby community.


DebConf 11

Last year, I had all the intention in the world to go to Bosnia and Herzegovina this summer to attend DebConf 11, but given my non-existent involvement in the project for the last couple of years, there’s no reason for me to do it. Even last year, when the event was held in town, I was not entirely thrilled about it. Life has changed quite a bit in the last few years, allowing me to find my priorities and put them in their right places.

I hope everybody attending has a great time and snaps lots of photos for me to see and regret not being there.


Disable Nginx logging

This is something that is specified clearly in the Nginx manual, but it’s nice to have it as a quick reference.

The access_log and error_log directives in Nginx belong to separate modules (the HTTP Log and Core modules, respectively), and they don’t behave the same way when all you want is to disable all logging on your server (in our case, we serve a gazillion static files and perform a lot of reverse proxying, and we’re not interested in tracking any of that). It’s a common misconception that you can set error_log to off, because that’s how you disable access_log; if you do that, the server will still log to the file $nginx_path/off. Instead, you have to point error_log at the always mighty black hole /dev/null, using a severity level that triggers the fewest events, crit:

http {
  server {
    # ...
    access_log off;
    error_log /dev/null crit;
    # ...
  }
  #...
}

If you’re the possessor of the blingest of bling-bling, you can disable all logging (not only for a server block) by putting error_log at the root of the configuration and access_log within your http block, making sure you don’t override either in any of the inner blocks. And you’re good to go.
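A sketch of that global setup, showing only the logging directives (everything else elided):

```nginx
# main context: kill error logging globally
error_log /dev/null crit;

http {
  # kill access logging for everything under http
  access_log off;

  server {
    # no access_log or error_log overrides here (or in location blocks)
  }
}
```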


GitHub (and open source) on LinkedIn

I am a huge fan of LinkedIn; I love it. Handled properly, it can help you establish great opportunities that eventually lead to either work or paid gigs, and we all know how good money is, right? I have spent time over the last couple of weeks making sure my own LinkedIn profile is complete and fully up to date, and that I’m taking full advantage of all the features the site has to offer. Even though I’m not actively looking for a job, it’s always nice to get contacted by other developers or directors (heck, even recruiters) inviting you to consider other job opportunities. On a personal level, it makes me more confident that I can still be of great value to other enterprises. It’s like feeling you could date again, if you wanted to.

Last night, they announced that they had enabled GitHub as an application on users’ profiles. This is great for developers. I think LinkedIn still needs to implement ways for users to track their not-so-formal activities, and this is kind of a big step, to some extent.

For instance, I’d like to list some projects I’ve worked on for companies that didn’t fully employ me but hired me as a contractor for a few hours. They need to implement some sort of portfolio for the side business you do.

After all, a LinkedIn profile is based on the professional work you do, not entirely on the commits you make to a fork or to your never-ending project. There’s still the question of how you can show off additional open source experience that doesn’t come in the form of commits on a GitHub repo: for instance, the work you do on a project that doesn’t have a Git repository and uses something else; or the translation work you’ve done for years; or the documentation you’ve contributed here and there. But then again, if you’re depending on LinkedIn to show off your open source expertise, you’re probably not using the right tool. While I applaud the GitHub integration on LinkedIn, I do think it still needs more work to cover other open source contributions, which can take up significant amounts of time in a developer’s skill set.

Go add the GitHub application to your LinkedIn profile here.