Daniel Doubrovkine (aka dB.) is one of the tallest engineers at Art.sy. He founded and exited a successful Swiss start-up in the 90s, worked for Microsoft Corp. in Redmond, specializing in security and authentication, dabbled in large scale social networking and ran a big team that developed an expensive Enterprise product in NYC. After turning open-source cheerleader a few years ago in the worlds of C++, Java and .NET, he converted himself to Ruby and has been slowly unlearning everything he learned in the last 15 years of software practice. Daniel has posted 46 posts at DZone. You can read more from them at their website. View Full User Profile

GitHub Hacked: How to Protect Your Code

03.05.2012
| 7575 views |
  • submit to reddit

You should take a serious look at your application and write some tests, first thing Monday.

I would write integration tests with real data that attempt to exploit the issues that were exposed by the Github hack. Even if you’re sure of your code, sit down and write a few tests, just to be double-sure. Don’t do a code review, write some code that will tell you, 100%, whether you have problems or you don’t.

I see two major attack vectors.

Mass Assignment

Read Homakov’s post. If it’s not clear, read it again until it’s clear.

Given models Parent and Child where children belong to parents - can I post a parent’s ID to a form that updates a child and therefore change which parent a child belongs to? If so, you have a problem. Go fix it first thing in the morning in a systematic way, by writing a test that reproduces the issue, then by protecting the attributes with an attr_accessible method. This will filter out everything that’s not in the list when you call update_attributes. Make sure you just use this on all models, all the time.

A variation of this problem is garbage in, garbage out. This affects systems backed by NoSQL document databases. Make sure you aren’t writing random attributes that come from a form into your model. In a relational database you get an exception because the field doesn’t match the schema. In a document store you have just stored junk. It may be harmless or harmful, but you’d rather not find out the hard way.

We use a home grown hash map to whitelist attributes for historical reasons, but attr_accessible does the job just fine.

Identity Confusion

Whitelisting attributes only works when you actually don’t need to assign relationships. Do you pass an identity for a Widget as a parameter, maybe in a URL? Do widgets belong to different users? If so, write a test that ensures that a user that doesn’t have access to this Widget cannot modify it.

My recommendation is to use something like CanCan and to check authorization in a single layer. You spell out who can create/retrieve/update/delete models and enforce this with a single has_authorization_to?. We do this in our API layer systematically. We also learned to key off current_user as much as possible. So if you’re modifying widgets that belong to current_user, you won’t find a widget with a rogue ID by doing current_user.widgets.find(someone_elses_widget_id).

Dear Github

I still love you. This happens to the best people out there. Shameless plug for my former Team SHATTER, if you want a list. Move on and learn from it.



Published at DZone with permission of its author, Daniel Doubrovkine. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)