Now a part of the BuzzData team

Today is my first official work day at BuzzData, a startup devoted to the humanization of data. That means we facilitate the process of transforming raw data ore into the all-powerful fuel of knowledge. We are the context. Although our clients include newspapers, independent journalists, government agencies, and data miners, BuzzData is perfect for anyone with good will and an interesting dataset. Come and try it, it's free.

There were a lot of factors contributing to my decision to join BuzzData, but here is the most important one: the opportunity to join a team of highly skilled professionals who are also amazingly friendly people. Also, it is no coincidence that I revived my interest in statistics, optimization, machine learning and related things not so long ago.

The first day was spent on figuring out an algorithm for computing distances in a multidimensional non-isomorphic space. Abstract algebra and all that stuff... Looks like there is a lot of fun there.

I also want to thank Shane Caraveo and David Eaves for giving me the heads-up. It wouldn't have happened without their support.

How much can be done in four hours

I had an awesome day today at the first OpenDataBC hackathon, which took place at Mozilla Labs Vancouver.

Tara Gibbs pitched this wonderful idea of consolidating shelter availability data and showing it on a few window displays, so that homeless people living in the DTES (Downtown Eastside) would not waste their time going from one shelter to another just to find a free spot.

This doesn't solve all the problems, of course, but it does solve a small yet very annoying one.

So... At 11:30 we had nothing but an idea. We discussed possible approaches for a while, then David Eaves came by and suggested using Twitter as a message queue service.

At approximately 12:00 we still had nothing but a piece of paper covered with boxes and arrows, then we started coding. Tara did the frontend, I was busy hacking the backend and the Twitter stuff.

Four hours later we had a fully functional, production-ready system: https://github.com/mikeivanov/vanshelter

How it is supposed to work:

  1. Shelters tweet their availability data (they all have internet access)
  2. VanShelter monitors -- each of them independently -- receive Twitter updates and
  3. Refresh their displays when something changes.
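To make the flow above a bit more concrete, here is a minimal Python sketch of what a single monitor does with incoming messages. It is not the actual VanShelter code: the tweet format (`#vanshelter <shelter> beds=<n>`), the `Monitor` class and the shelter names are all assumptions made up for illustration, and the Twitter part is left out entirely.

```python
import re
from typing import Dict, Optional, Tuple

# Hypothetical tweet format, e.g. "#vanshelter ShelterA beds=5".
UPDATE_RE = re.compile(r"#vanshelter\s+(\S+)\s+beds=(\d+)")


def parse_update(text: str) -> Optional[Tuple[str, int]]:
    """Extract (shelter, free beds) from a tweet, or None if it is unrelated."""
    match = UPDATE_RE.search(text)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


class Monitor:
    """One independent display: keeps its own copy of the availability data."""

    def __init__(self) -> None:
        self.beds: Dict[str, int] = {}

    def receive(self, text: str) -> bool:
        """Apply a tweet; return True only if the display needs a refresh."""
        update = parse_update(text)
        if update is None:
            return False
        shelter, beds = update
        if self.beds.get(shelter) == beds:
            return False  # nothing changed, keep the screen as it is
        self.beds[shelter] = beds
        return True

    def render(self) -> str:
        """Produce the text shown on the display, one shelter per line."""
        return "\n".join(f"{name}: {n} free" for name, n in sorted(self.beds.items()))
```

Each monitor keeps its own state, so a display that misses a tweet simply catches up on the next one from the same shelter.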

For displays we can use cheap LCD monitors, probably even donated ones. The software will run on those amazing Raspberry thingies (http://www.raspberrypi.org/), $25 each. This brings the full cost of installing 10 displays down to $250+.

Thank you Tara and David. Also, thank you Jeff and all the people who made this hackathon possible.

How I learned to like Ruby

The point of this rant is to annoy people, I mean, share my experiences with Ruby.

As strange as it seems, I develop deep emotional relationships with programming languages. I love Python. I truly do. I totally irrationally hate Java (though I'm ok with the JVM).

Now, I like Ruby. 

Well, it has some little warts, but generally it is a very enjoyable language.

I didn't "get" Ruby for quite a long time because I tried to wrap my head around it from the wrong end. Ruby didn't like me, I didn't like Ruby -- it lasted until I realized a very simple thing: Ruby is not "like Perl".

The same happened many years ago with JavaScript. I disliked it so much that I couldn't make myself write code in it. I hated it until it occurred to me that JavaScript is not "like Java". JavaScript is a Lisp in disguise. Once I realized that, I stopped worrying and quickly found myself in an intimate, romantic relationship with JavaScript. It is still one of my favourite languages.

Then came Ruby. I had to use it because it was a part of my job. At first Ruby felt like a Perl with broken legs. I tried to make it run and it crawled. I missed one important detail: it didn't have legs at all -- it had wings. Ruby flies.

Here comes my little revelation: Ruby is a Smalltalk. Ruby has much more in common with Smalltalk than with anything else. 

Actually it's a better Smalltalk. Since I realized that, I started to enjoy this beautiful language. Hey, peace and happiness -- welcome back.

Conclusion: it is often our own distorted perception that makes good things look ugly.

Rant mode off.

Simple things

It was a wonderful week. I didn't realize how much I needed to step away from the daily routine. This week was completely spent on (get jealous!):

  • sleep
  • beer
  • football (Lions vs Winnipeg; it was an exciting game, yet a little bit disappointing -- we lost 17:30)
  • slacking on the beach
  • some hacking (just a little bit)
  • meeting some new people (really nice ones)

The stuff I definitely didn't miss:

  • noisy office
  • commuting to work (2 hours every day)
  • stress
  • coffee

I'm happy now, at last.

 

Pure Python Paillier Homomorphic Cryptosystem Implementation

What

This is a very basic Paillier Homomorphic Cryptosystem implemented in pure Python.

The idea is, in short, to encrypt two numbers, perform an "add" operation on the ciphertexts, decrypt the result, and find it to be the sum of the original plaintext numbers.
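Here is a minimal, self-contained sketch of that idea. It is not the code from the repository: the function names are made up for illustration, it assumes the common simplification g = n + 1, and it uses tiny primes that are fine for a demo but offer no security at all. It needs Python 3.8+ for the modular inverse via `pow`.

```python
import math
import secrets


def keygen(p, q):
    """Build a toy key pair from two distinct primes (demo sizes, not secure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # inverse of lam mod n; this shortcut relies on g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt 0 <= m < n as c = g^m * r^n mod n^2, with g = n + 1."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random r in [1, n - 1]
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    """Recover m = L(c^lam mod n^2) * mu mod n, where L(x) = (x - 1) // n."""
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add(n, c1, c2):
    """Homomorphic addition: multiplying ciphertexts adds the plaintexts."""
    return (c1 * c2) % (n * n)


n, priv = keygen(293, 433)
total = decrypt(n, priv, add(n, encrypt(n, 1234), encrypt(n, 4321)))
print(total)  # 5555
```

Note that the "add" is really a multiplication modulo n squared, and the sum only comes out right while it stays below n.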

How

The code is loosely based on the thep project and a few ActiveState recipes. The code is pure Python and all objects are serializable.

Where

Here: https://github.com/mikeivanov/paillier

Why

I was bored.

 

Different emails for different Git repositories

On my laptop I have two directories:

  • ~/activestate, where my work projects reside, and
  • ~/me -- for my personal stuff.

The problem is, when I commit changes I want them to be properly attributed. More specifically:

  1. everything that belongs to ActiveState should be checked in using my work email,
  2. everything else should be checked in using my private address,
  3. I don't want to run `git config user.email <...>` each time I clone a new repository ('cause I do it a lot).

Here's what I've done:

  • in my home directory I created a file called ~/.gitemail containing just my private email; this address is going to be the default,
  • to the ~/activestate directory I added another .gitemail with my work address,
  • finally, I added this snippet to ~/.bashrc:
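    # Walk up from the current directory and print the first .gitemail found;
    # stop after checking $HOME (or / when outside the home directory).
    alias git='GIT_AUTHOR_EMAIL=$(
          p=$(pwd)
          while :; do
            [ -e "$p/.gitemail" ] && cat "$p/.gitemail" && break
            if [ "$p" = "$HOME" ] || [ "$p" = "/" ]; then break; fi
            p=$(dirname "$p")
          done) /usr/bin/git'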

The alias scans all the directories up to the home dir looking for a file called .gitemail. When found, it sets the GIT_AUTHOR_EMAIL variable to the file's content. This effectively makes the actual git command use the subtree-specific email. Now, when I'm working e.g. in ~/activestate/stackato, it will automatically pick up my work email from ~/activestate/.gitemail.

No extra effort, fewer things to remember, and no more history rewriting.
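If the shell loop is hard to read, the same lookup can be sketched in Python. This is only an illustration of the directory walk, not something git or the alias uses; `find_gitemail` is a made-up helper name. The sketch also looks in the home directory itself, so that the default address in ~/.gitemail is found, and it gives up at the filesystem root when started outside the home directory.

```python
from pathlib import Path
from typing import Optional


def find_gitemail(start: Path, home: Path) -> Optional[str]:
    """Walk up from `start` towards `home`; return the first .gitemail content."""
    current = start
    while True:
        candidate = current / ".gitemail"
        if candidate.is_file():
            return candidate.read_text().strip()
        # Stop once the home directory (or the filesystem root) has been checked.
        if current == home or current == current.parent:
            return None
        current = current.parent
```

For example, `find_gitemail(Path.cwd(), Path.home())` would return the address that applies to the current directory, or None if no .gitemail file exists along the way.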

 

How to mount an NTFS-formatted USB drive in read-write mode on Mac OS X

Actually, it's very easy. No additional software is required. Just seven easy steps:

  1. Attach your USB drive
  2. Open the Terminal app (Command-Space, then type "Terminal", hit Enter)
  3. Type or copy/paste this command:
    sudo sh -c "mkdir -p /mnt $(mount | grep ntfs | head -n 1 \
       | awk '{ print "&& umount " $3 " && mount_ntfs -o nosuid,rw " $1 " /mnt" }')"
  4. Locate your drive in Finder
  5. Drag/drop files there
  6. Unmount the drive as usual
  7. DONE!

The command breakdown, if you're interested:

  1. mkdir -p /mnt creates a mount point -- a place in the file system where we will attach the drive
  2. the mount command without parameters gives you a list of the currently attached drives
  3. grep ntfs filters non-ntfs drives out of the list
  4. head -n 1 grabs the first line (we're assuming only one ntfs drive can be attached at a time)
  5. the awk part produces two commands:
    • umount /Volumes/<name> -- unmounts the drive from its original place
    • mount_ntfs -o nosuid,rw /dev/<device> /mnt -- mounts the drive again, but this time in read-write mode
  6. now, the sudo sh -c "..." thing allows code execution with superuser privileges.
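For illustration, here is a small Python sketch of how the pipeline assembles the final command string. It only builds the text and runs nothing. The function name and the sample `mount` output line are assumptions (the device and volume names are made up), and like the one-liner it breaks on volume names containing spaces.

```python
def build_remount_command(mount_output: str) -> str:
    """Mimic `mount | grep ntfs | head -n 1 | awk ...` and return the command text."""
    ntfs_lines = [line for line in mount_output.splitlines() if "ntfs" in line]
    if not ntfs_lines:
        return "mkdir -p /mnt"  # no NTFS drive attached, nothing to remount
    fields = ntfs_lines[0].split()  # awk-style whitespace splitting
    device, mount_point = fields[0], fields[2]  # awk's $1 and $3
    return (
        f"mkdir -p /mnt && umount {mount_point}"
        f" && mount_ntfs -o nosuid,rw {device} /mnt"
    )


# Assumed sample of what `mount` prints for an NTFS drive.
sample = "/dev/disk2s1 on /Volumes/UNTITLED (ntfs, local, nodev, nosuid, read-only, noowners)"
print(build_remount_command(sample))
```

The printed string is what ends up inside the double quotes passed to `sudo sh -c`.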

That's it.