Nov 9, 2013

Installing SBCL 1.1+ on RHEL/CentOS systems

The version of SBCL available on Red Hat Enterprise Linux 6.4 (and CentOS) is 1.0.38, which is quite old. If your project requires a newer SBCL, it has to be installed manually.

Although sbcl.org offers some Linux binaries, those are incompatible with RHEL/CentOS 6.4. Compiling from source, unfortunately, is the only option.

This tutorial assumes a 64-bit system (x86_64). Compiling SBCL on a 32-bit platform might or might not work — I never tried it.

  1. The first step is to make sure you can compile programs:

    sudo yum groupinstall "Development Tools"
    
  2. Then enable EPEL; it is needed for the next step:

    wget http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
    sudo rpm -Uvh epel-release-6*.rpm 
    
  3. Now, let’s install the old SBCL. We need it because SBCL’s Lisp compiler is written in Lisp, so it requires a working Lisp compiler to compile itself. This older SBCL binary can be safely removed later.

    sudo yum install -y sbcl.x86_64
    
  4. Download the SBCL source code. At the time of writing this post, the latest version was 1.1.13:

    wget http://downloads.sourceforge.net/project/sbcl/sbcl/1.1.13/sbcl-1.1.13-source.tar.bz2
    tar xfj sbcl-1.1.13-source.tar.bz2
    cd sbcl-1.1.13
    
  5. Compile the sources. Expect to see a lot of diagnostic messages.

    ./make.sh
    
  6. Install the compiled binary. The warnings about a missing doc directory can be safely ignored. By default, the binary is installed into /usr/local/bin:

    sudo sh install.sh
    
  7. Make sure it works. You should see “SBCL 1.1.13” in the response:

    sbcl --version
    
  8. Remove the old SBCL:

    sudo yum remove -y sbcl
    
  9. Optional: install Quicklisp. This is not strictly necessary, but having a CPAN-like Lisp package manager around will definitely make your life easier:

    wget http://beta.quicklisp.org/quicklisp.lisp
    sbcl --load quicklisp.lisp \
         --eval '(quicklisp-quickstart:install)' \
         --eval '(ql:add-to-init-file)' \
         --eval '(quit)' 
    

Enjoy your new SBCL.

Aug 29, 2013
"Catching Salmon" by Daniela Ivanova - https://www.facebook.com/photo.php?fbid=430329697086800

"Catching Salmon" by Daniela Ivanova - https://www.facebook.com/photo.php?fbid=430329697086800

Jun 27, 2013

MySQL Connector: inherited transactions

Here is what I noticed today: if a process opens a MySQL connection and then forks, the child process inherits not just the open connection but also the transaction state. The current transaction becomes shared between the child and the parent. That is, if the child rolls back, the parent gets rolled back too. And since it is the same transaction, a lock taken by one process has no effect on the other.

Here is a proof of concept:

"""
Create and populate a database before running this script:

create database mytest;
grant all on mytest.* to ''@'localhost';
flush privileges;
create table foo(a int);
insert into foo (a) values (0);
"""

import time
from multiprocessing import Process
import _mysql

reconnect = False  # change to True to make the child process block on the lock (as it should)

conn = _mysql.connect("localhost", user="mike", db="mytest", passwd="")

def sub():
    if reconnect:
        sub_conn = _mysql.connect("localhost", user="mike", db="mytest", passwd="")
    else:
        sub_conn = conn
    print "SUB: start", sub_conn.thread_id()
    print "SUB: do this to get the number of connections -> sudo lsof | grep mysql.sock"
    sub_conn.query('begin')
    sub_conn.query('select * from foo for update')
    if not reconnect:
        print "SUB: NOT BLOCKED, sleeping for 30 sec to hold the conneciton open"
        time.sleep(30)
    print "SUB: result", sub_conn.use_result().fetch_row()
    sub_conn.query('rollback')
    print "SUB: end"

print "HOST: start", conn.thread_id()
conn.query('begin')
conn.query('select * from foo for update')
print "HOST: result", conn.use_result().fetch_row()

process = Process(target=sub)
print "HOST: start sub"
process.start()
process.join()
print "HOST: sub joined"

conn.query('rollback')
print "HOST: end"

When reconnect is set to False, the parent's thread id will be the same as the child's. The reason is that MySQL uses server-side thread ids as connection identifiers. Here's the mysql_thread_id function (mysql-connector-c-6.1.0-src/libmysql/libmysql.c:1070):

ulong STDCALL mysql_thread_id(MYSQL *mysql)
{
  ......
  return (mysql)->thread_id;
}

And this is how it is set in CLI_MYSQL_REAL_CONNECT (mysql-connector-c-6.1.0-src/sql-common/client.c:3613):

......
server_version_end= end= strend((char*) net->read_pos+1);
mysql->thread_id=uint4korr(end+1);
end+=5;
......

The direct consequence is that child processes, created for example with the multiprocessing module, must close the inherited MySQL connections and reopen them to avoid surprises.
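
For instance, one way to stay out of trouble in a multiprocessing worker is to open connections only after forking, so that no connection is ever shared. Here is a minimal sketch of that pattern, reusing the _mysql calls from the script above (the connection parameters are placeholders):

from multiprocessing import Process
import _mysql

def connect():
    return _mysql.connect("localhost", user="mike", db="mytest", passwd="")

def worker():
    # Opened after the fork: a separate server-side thread id, and
    # therefore a transaction completely independent of the parent's.
    conn = connect()
    conn.query('begin')
    conn.query('update foo set a = a + 1')
    conn.query('commit')
    conn.close()

if __name__ == '__main__':
    workers = [Process(target=worker) for _ in range(4)]
    for p in workers:
        p.start()
    for p in workers:
        p.join()
    # The parent connects only now, when no more forking will happen.
    conn = connect()
    conn.query('select a from foo')
    print conn.use_result().fetch_row()
    conn.close()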

When I discovered this, I immediately thought of Django management commands that split their workload between child processes.

Open questions:

  1. Are Celery tasks affected by this? — probably yes.
  2. What happens when two processes sharing a transaction update data at the same time?

Jun 6, 2013

Multimethods in Python

So, PEP-443 aka Single-dispatch generic functions has made it into Python. There is a nice writeup of the singledispatch package features by Łukasz Langa.

Although I'm glad that Python is evolving in the right direction, I can't see how single dispatch alone could be enough. In essence, PEP-443 defines a way of dynamically extending existing types with externally defined generic functions, which is nice, of course, but too limited.
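
For comparison, here is roughly what that looks like with the stdlib decorator (functools.singledispatch, Python 3.4+; the same API ships in the singledispatch backport package on PyPI):

from functools import singledispatch  # or: from singledispatch import singledispatch

@singledispatch
def cupcakes(thing):
    return "You can make cupcakes out of ANYTHING! Even out of {0}!".format(thing)

@cupcakes.register(str)
def _(ingredient):
    return "Delicious {0} cupcakes".format(ingredient)

@cupcakes.register(int)
def _(number):
    return "Integer cupcakes, anyone? I've got {0} of them.".format(number)

print(cupcakes("bacon"))
print(cupcakes(4))
print(cupcakes(4.2))  # no match for float, falls back to the base implementation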

What is really interesting is multiple dispatch. There are a few packages bringing multimethods to Python; all of them are too complicated for my taste.

Here's my take on it. I won't talk much; it's better to just show you the code.

This is the complete implementation:

# multidispatch.py

import operator
from collections import OrderedDict

class DuplicateCondition(Exception): pass

class NoMatchingMethod(Exception): pass

class defmulti(object):
    def __init__(self, predicate):
        self.registry = OrderedDict()
        self.predicate = predicate

    def __call__(self, *args, **kw):
        method = self.dispatch(*args, **kw)
        return method(*args, **kw)

    def dispatch(self, *args, **kw):
        for condition, method in self.registry.items():
            if self.predicate(condition, *args, **kw):
                return method
        return self.notfound

    def notfound(self, *args, **kw):
        raise NoMatchingMethod()

    def when(self, condition):
        if condition in self.registry:
            raise DuplicateCondition()
        def deco(fn):
            self.registry[condition] = fn
            return fn
        return deco

    def default(self, fn):
        self.notfound = fn
        return fn

    @classmethod
    def typedispatch(cls):
        return cls(lambda type, first, *rest, **kw: isinstance(first, type))

And here’s how to use it:

import types
from multidispatch import defmulti, NoMatchingMethod

# Exhibit A: Dispatch on the type of the first parameter.
#            Equivalent to `singledispatch`.

cupcakes = defmulti.typedispatch()

@cupcakes.when(types.StringType)
def str_cupcakes(ingredient):
    return "Delicious {0} cupcakes".format(ingredient)

@cupcakes.when(types.IntType)
def int_cupcakes(number):
    return "Integer cupcakes, anyone? I've got {0} of them.".format(number)

@cupcakes.default
def any_cupcakes(thing):
    return ("You can make cupcakes out of ANYTHING! "
            "Even out of {0}!").format(thing)

print cupcakes("bacon")
print cupcakes(4)
print cupcakes(cupcakes)


# Exhibit B: dispatch on the number of args, no default

@defmulti
def jolly(num, *args):
    return len(args) == num

@jolly.when(1)
def single(a):
    return "For {0}'s a jolly old fellow!".format(a)

@jolly.when(2)
def couple(a, b):
    return "{0} and {1} are such a jolly couple!".format(a, b)

print jolly("Lukasz")
print jolly("Fish", "Chips")
try:
    jolly("Good", "Bad", "Ugly")
except NoMatchingMethod:
    print "Noo! Angel Eyes!"

Apr 17, 2013

I will never CNAME my root domain again.

I will never CNAME my root domain again.

I will never CNAME my root domain again.

NEVER. EVER.

Apr 13, 2013

Hello Tumblr

Goodbye, Posterous. Rest in peace.

Jun 6, 2012

A simple callback chain macro for elisp

The Problem

As usual, it started with a tiny piece of ugly code:

(bd-create-stage datafile-id
                 (lambda (stage-id)
                   (bd-insert-rows stage-id 
                                   [[10 20 30] [40 50 60]]
                                   (lambda (stage-id)
                                     (bd-commit-stage stage-id 
                                                      #'ignore)))))

The snippet above is basically a callback chain. When bd-create-stage finishes its work, it calls the first lambda, which calls bd-insert-rows with the second lambda as its callback argument and so on, until it all stops at the ignore function.

I wanted to rewrite it as something like this:

(=> datafile-id
    (bd-create-stage it next)
    (bd-insert-rows  it [[1 2 3 4 5] [6 7 8 9 0]] next)
    (bd-commit-stage it next))

Where the it variable would represent the current callback’s parameter and next would refer to the next callback in the chain. As with the -> macro, I wanted explicit anaphoric variables.

The Idea

Each line in the snippet above could be wrapped in a lambda, just like this:

(=> datafile-id
    (lambda (next it)
      (bd-create-stage it next))
    (lambda (next it)
      (bd-insert-rows it [[1 2 3 4 5] [6 7 8 9 0]] next))
    (lambda (next it)
      (bd-commit-stage it next)))

Then the macro should somehow call each function in the list, passing the subsequent function as the first parameter and the result of the previous function as the second.

The Solution

This function chaining thing looks a lot like a binary function fold:

(defun chain2 (f1 f2)
  (apply-partially f1 f2))

(defun chain (&rest fns)
  (if fns
      (reduce #'chain2 fns :from-end t)
    #'identity))

Applying chain to a function list creates a new function taking one parameter and passing it through the whole function list, much like the -> macro does.

In fact, this is enough to start working on the macro.

(defmacro => (initial &rest forms)
  `(funcall ,(build-form-chain forms) ,initial))

The build-form-chain function wraps each form into a lambda and then chains them together:

(defun build-form-chain (forms)
  `(apply #'chain 
          (list ,@(mapcar #'build-form-link forms) #'ignore)))

At the end it adds ignore as a terminator. The terminator is necessary because the last form still needs something to use as next, and its result is almost always ignored anyway.

The implementation of build-form-link is trivial:

(defun build-form-link (form)
  `(lambda (next it) ,form))

Done! Here’s the full source for your convenience:

(require 'cl)  ; `reduce' comes from the cl library

(defun chain2 (f1 f2)
  (apply-partially f1 f2))

(defun chain (&rest fns)
  (if fns
      (reduce #'chain2 fns :from-end t)
    #'identity))

(defun build-form-link (form)
  `(lambda (next it) ,form))

(defun build-form-chain (forms)
  `(apply #'chain 
          (list ,@(mapcar #'build-form-link forms) #'ignore)))

(defmacro => (initial &rest forms)
  `(funcall ,(build-form-chain forms) ,initial))

Now let’s see how the macro expands:

ELISP> (macroexpand
     '(=> datafile-id
          (bd-create-stage it next)
          (bd-insert-rows  it [[1 2 3 4 5] [6 7 8 9 0]] next)
          (bd-commit-stage it next)))

(funcall (apply (function chain) 
                (list (lambda (next it) 
                        (bd-create-stage it next))
                      (lambda (next it)
                        (bd-insert-rows it [[1 2 3 4 5] [6 7 8 9 0]] next))
                      (lambda (next it)
                        (bd-commit-stage it next))
                      (function ignore)))
          datafile-id)

Exactly as intended.

This macro covers 95% of my callback chaining needs. For the remaining 5% there is the all-powerful deferred.el library.

May 9, 2012

Thread operator in Elisp

TL;DR

ELISP> (-> 1
           (+ 2 it)
           (* 3 it))
9
ELISP> (macroexpand
           '(-> 1
                (+ 2 it)
                (* 3 it)))
(let* ((it 1) (it (+ 2 it)) (it (* 3 it))) it)

Implementation:

(defmacro -> (arg &rest forms)
  `(let* ((it ,arg)
          ,@(mapcar (lambda (form) `(it ,form))
                    forms))
     it))

The Long Story

When I see code like this, I frown:

(defun bd-search (api-key query callback)
  (send-request "GET"
                (format "search?%s"
                        (make-query-string `(("api_key" . ,api-key)
                                             ("query" . ,query))))
                callback))

It's a very simple case, yet the parameter list is already at the fourth level of indentation. When it gets really ugly, I usually wrap the whole thing in a let form and start moving the inner parts into variables.

What I have noticed, however, is that constructs like this are almost always sequential by nature: the output of the innermost form serves as input for the form one level up, and so on. This is the very reason Clojure has had its threading macro from the beginning.

Remembering that, I started morphing my bd-search function into something prettier and came up with this variant:

(-> `(("api_key" . ,api-key)
      ("query" . ,query))
    (make-query-string it)
    (format "search?%s" it)
    (send-request "GET" it callback))

Then I put together the -> macro and that was it.

I decided to make the macro anaphoric instead of implicitly injecting an extra parameter as in Clojure. This allowed me to put the threaded parameter at any place, not just at the beginning or at the end of the parameter list.

Aug 27, 2011

How much can be done in four hours

Today I had an awesome day at the first OpenDataBC hackathon which took place at Mozilla Labs Vancouver.

Tara Gibbs pitched this wonderful idea of consolidating shelter availability data and showing it on a few window displays, so that homeless people living in the DTES would not waste their time going from one shelter to another just to find a free spot.

This doesn’t solve all the problems of course, but it does solve a little yet very annoying one.

So… At 11:30 we had nothing but an idea. We discussed possible approaches for a while, then David Eaves came along and suggested using Twitter as a message queue service.

At approximately 12:00 we still had nothing but a piece of paper covered with boxes and arrows, then we started coding. Tara did the frontend, I was busy hacking the backend and the Twitter stuff.

Four hours later we had a fully functional, production-ready system - https://github.com/mikeivanov/vanshelter

How it is supposed to work:

  1. Shelters tweet their availability data (they all have internet access)
  2. VanShelter monitors — each of them independently — receive Twitter updates and
  3. Refresh their displays when something changes.

For displays we can use cheap LCD monitors, probably even donated ones. The software will run on those amazing Raspberry Pi thingies - http://www.raspberrypi.org/, $25 each. This brings the cost of installing 10 displays down to roughly $250 plus the monitors themselves.

Thank you Tara and David. Also, thank you Jeff and all the people who made this hackathon possible.

Jun 28, 2011

Pure Python Paillier Homomorphic Cryptosystem Implementation

What

This is a very basic Paillier Homomorphic Cryptosystem implemented in pure Python.

The idea is, in short, to encrypt two numbers, perform an “add” operation on the ciphertexts, decrypt the result and find it to be the sum of the original plaintext numbers.
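
Just to illustrate the property, here is a self-contained toy sketch of the additive homomorphism. It is not the project's code: the primes are tiny, there is no security whatsoever, and it needs Python 3.8+ for pow(x, -1, n).

# Toy Paillier: encrypt, multiply ciphertexts, decrypt the sum.
import random
from math import gcd

p, q = 293, 433                      # toy primes, never use sizes like this
n, n2 = p * q, (p * q) ** 2
g = n + 1                            # the standard simple choice of generator
lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)   # lcm(p - 1, q - 1)
mu = pow(lam, -1, n)                 # with g = n + 1, mu is just lam^-1 mod n

def encrypt(m):
    r = random.randrange(1, n)
    while gcd(r, n) != 1:            # r must be invertible mod n
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    # L(x) = (x - 1) // n
    return ((pow(c, lam, n2) - 1) // n) * mu % n

a, b = encrypt(42), encrypt(17)
print(decrypt((a * b) % n2))         # 59: multiplying ciphertexts adds plaintexts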

How

The code is loosely based on the thep project and a few ActiveState recipes. It is pure Python and all objects are serializable.

Where

Here: https://github.com/mikeivanov/paillier

Why

I was bored.
