Wednesday, May 12, 2010

Python Crypto: State of the Art (Part 3)

The third part of my Python crypto saga is on the ActiveState blog.

Cryptlib: Peter Gutmann’s cryptlib library -- a powerful security toolkit that allows even inexperienced crypto programmers to easily add encryption and authentication services to their software. The high-level interface provides anyone with the ability to add strong security capabilities to an application in as little as half an hour, without needing to know any of the low-level details that make the encryption or authentication work...

Read the whole article here: http://blogs.activestate.com/2010/05/python-crypto-state-of-the-art-part-3/

Thursday, May 06, 2010

How to make cryptlib_py work on 64-bit platforms

Cryptlib is a totally fascinating cross-platform crypto library with Python bindings. Unfortunately, the current version (3.3.3) is unusable on 64-bit platforms. The good news, though, is that it is easily fixable.

Let's look at what's going on:
$ python
Python 2.6.1 (r261:67515, Feb 11 2010, 00:51:29) 
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 
>>> from cryptlib_py import *
>>> cryptInit()
>>> sess = cryptCreateSession(CRYPT_UNUSED, CRYPT_SESSION_SSH)
>>> sess.CRYPT_SESSINFO_SERVER_NAME = "myserver.com"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 32, in __setattr__
cryptlib_py.CryptException: (-2, 'Bad argument, parameter 2')
>>> 
No good.

The problem is the way PyObject_AsWriteBuffer (and PyObject_AsCharBuffer) is called. The last parameter is declared as a pointer to Py_ssize_t type, which can be either 32 or 64 bit depending on the platform. Here is the actual declaration in the abstract.h file:
PyAPI_FUNC(int) PyObject_AsWriteBuffer(PyObject *obj,
                      void **buffer,
                      Py_ssize_t *buffer_len);
Now, here is how this function is used in cryptlib_py (python.c, line 21):
if (PyObject_AsWriteBuffer(objPtr, bytesPtrPtr, lengthPtr) == -1)
Where lengthPtr is declared as "int* lengthPtr". See the problem?

This is what happens: the function expects a pointer to an 8-byte value, but it is given a pointer to a 4-byte int. Unaware of that, the function writes all 8 bytes and smashes the variable next to the one pointed to by lengthPtr.
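The size mismatch can be reproduced from Python itself with ctypes. This is only an illustrative sketch, not cryptlib code: the TwoInts structure and the write_length_as_ssize_t helper are hypothetical names, and the two adjacent ints merely stand in for the variable behind lengthPtr and whatever happens to sit next to it. It assumes Python 3 and a platform where int is 4 bytes.

```python
import ctypes


class TwoInts(ctypes.Structure):
    # Two adjacent 4-byte ints: the intended target and its neighbour.
    _fields_ = [("length", ctypes.c_int), ("neighbour", ctypes.c_int)]


def write_length_as_ssize_t(value):
    """Store `value` through a Py_ssize_t-sized pointer aimed at an int."""
    pair = TwoInts(0, 0x1234)
    # Reinterpret the address of `length` as a pointer to Py_ssize_t,
    # which is what the callee assumes it was given.
    wide = ctypes.cast(ctypes.pointer(pair), ctypes.POINTER(ctypes.c_ssize_t))
    wide[0] = value  # writes sizeof(Py_ssize_t) bytes, not sizeof(int)
    return pair.length, pair.neighbour


if __name__ == "__main__":
    print("sizeof(int) =", ctypes.sizeof(ctypes.c_int))
    print("sizeof(Py_ssize_t) =", ctypes.sizeof(ctypes.c_ssize_t))
    print("length, neighbour after write:", write_length_as_ssize_t(7))
```

On a 32-bit build the neighbour keeps its value of 0x1234; on a 64-bit build it gets overwritten, which is the same kind of damage the extension module suffers.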

The solution is to patch cryptlibConverter.py, the script generating bindings/python.c. Although this problem is very likely to be fixed in the next cryptlib release, here is what you do if you can't wait:
mkdir cryptlib; cd cryptlib
curl -O ftp://ftp.franken.de/pub/crypt/cryptlib/cl333.zip
unzip cl333.zip
curl -O http://mikeivanov.com/pc/cryptlibConverter.py.patch
patch -p0 < cryptlibConverter.py.patch
python tools/cryptlibConverter.py cryptlib.h bindings python
make
cd bindings
python setup.py build
sudo python setup.py install
Fixed! The patched version is working just fine:
$ python
Python 2.6.1 (r261:67515, Feb 11 2010, 00:51:29) 
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 
>>> from cryptlib_py import *
>>> cryptInit()
>>> sess = cryptCreateSession(CRYPT_UNUSED, CRYPT_SESSION_SSH)
>>> sess.CRYPT_SESSINFO_SERVER_NAME = "myserver.com"
>>> sess.CRYPT_SESSINFO_USERNAME = "mike"
>>> sess.CRYPT_SESSINFO_PASSWORD = raw_input("pwd=")
pwd=mypassword
>>> sess.CRYPT_SESSINFO_ACTIVE = 1
>>> data = array('c', '\0' * 1024)
>>> recv = cryptPopData(sess, data, 1024)
>>> print data.tostring()[:recv]
Linux myserver.com 2.6 XXXXXXXXX x86_64 GNU/Linux
Ubuntu 10.04 LTS
.......
>>> cryptPushData(sess, "uptime\n")
7
>>> cryptFlushData(sess)
>>> recv = cryptPopData(sess, data, 1024)
>>> print data.tostring()[:recv]
uptime
 18:17:15 up 6 days,  1:54, 13 users,  load average: 0.30, 0.27, 0.26
mike@myserver:~$ 
>>> 

Tuesday, April 06, 2010

Clouds and entropy

In a post titled "A Trusted Cloud Entropy Authority" Reuven Cohen writes:
"...maybe there an opportunity to create a trusted cloud authority to provide signed verified and certified entropy. Think of it like a certificate authority (CA) but for chaos. Actually, Amazon Web Service itself could act as this entropy authority via a simple encrypted web service call. I even have a name for it, Simple Entropy Service (SES)."
This is a really good idea. Amazon should have provided such a service a long time ago.

When an SSL connection is being established, a browser and a server perform the Handshake protocol. This protocol involves exchanging random bits between the parties. The important thing is that security depends on how random those bits are. If they are not, the connection is effectively insecure.
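To make it concrete where the randomness enters, here is a minimal sketch of the "random" value each side sends in its hello message, following the layout used by TLS 1.0 through 1.2 (a 4-byte Unix timestamp followed by 28 random bytes). The function name make_hello_random is made up for illustration, and os.urandom stands in for whatever generator a real SSL library uses.

```python
import os
import struct
import time


def make_hello_random():
    """Build a 32-byte hello random: 4-byte timestamp + 28 random bytes."""
    timestamp = struct.pack(">I", int(time.time()) & 0xFFFFFFFF)
    # Everything after the timestamp is only as unpredictable as the
    # system's random pool behind os.urandom.
    return timestamp + os.urandom(28)
```

If the pool behind os.urandom is predictable, so are 28 of these 32 bytes, and so are the keys that are later derived from similar random values.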

In the case of AWS, there is no source of true randomness; therefore, SSL on AWS is inherently insecure. Moreover, instances running on the same physical machine can affect each other's security by draining the shared random pool in the host system.

Further he writes:
"a website called http://random.org [is] a true random number service that generates randomness via atmospheric noise. Looks cool, maybe this may help solve the problem."
I think that random.org is not a good choice for several reasons.

One problem is the connection to such a service. It should be as secure as the most secure secret handled on your system. If the random bit connection is encrypted with 256-bit AES (and it actually is), this is the highest level of security your system can provide. Plus, there should be a guarantee that no unencrypted random bits are stored anywhere. The same is true for the proposed SES service, too.

Another problem with random.org is... well, randomness is a matter of perspective. What you see as "random" can be quite deterministic to the people who run the random.org service. Even though they might not store anything, their present is your future--just think about relativistic effects. The temptation to tamper with someone's future can be, you know, very strong.

The overall quality of the service is not known. There is no guarantee it is random at all. A quote from their FAQ: "Q1.2: Is the source code for the generator available? -- Not currently, no. Maybe I'll make it available as open source some day."

Even though the Whois database indicates that the domain name's registrant is located in France, the SSL certificate owner is not specified. I have no reason not to believe the guy running the service, but I would not entrust my customers' data to a total stranger, even if he or she seems to be a nice person.

So the conclusion is: as long as there is no trusted entropy generator on the AWS side, we, the AWS customers, are on our own.

Here is a hint: entropy seeds can be generated in-house and smuggled into instances over a secure channel. Those seeds could then be fed to a cryptographically secure RNG like ISAAC to produce the actual "random" bits. I think there should be a way of injecting those into the instance's random pool.
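A rough sketch of that idea, under several assumptions: SHAKE-256 from hashlib is used here purely as a stand-in for ISAAC, the instance is assumed to run Linux, and the function names expand_seed and mix_into_pool are made up. On Linux, writing to /dev/urandom mixes the bytes into the kernel pool but does not raise the kernel's entropy estimate; crediting entropy requires a privileged ioctl, which this sketch does not attempt. Requires Python 3.6 or later.

```python
import hashlib


def expand_seed(seed, nbytes):
    """Stretch a secret seed into nbytes of output (SHAKE-256, not ISAAC)."""
    return hashlib.shake_256(seed).digest(nbytes)


def mix_into_pool(data, device="/dev/urandom"):
    """Write bytes to the random device so the kernel mixes them in.

    The device path is a parameter so the function can be pointed
    at a scratch file when trying it out.
    """
    with open(device, "wb") as pool:
        pool.write(data)
    return len(data)
```

Something like mix_into_pool(expand_seed(seed, 512)) would then be run on the instance, with seed delivered over the secure channel and never written to disk.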

Thursday, March 18, 2010

Concurrent processes

This is interesting and inspiring: more and more people are starting to say that threading is not a programming model.

Joe Gregorio from Google had a talk at PyCon 2010 on this topic. Watch the video at http://python.mirocommunity.org/video/1600/pycon-2010-threading-is-not-a-

Roberto Ierusalimschy said it in his Stanford University talk: "concurrency should follow the model of communicating processes [instead of threads which were designed as an OS-level construct]". The video is here: http://stanford-online.stanford.edu/courses/ee380/100310-ee380-300.asx

I think we are approaching a critical point after which the multithreading non-model will start disappearing from the application programming landscape. It's really time for threads to go where they belong: the under-the-hood realm of implementation artefacts, along with garbage collectors and I/O caches.

Processes FTW!

Tuesday, March 02, 2010

Sunday, February 21, 2010

Python Crypto: State of the Art (Part 1)

There is a lot of interest in doing cryptography using Python these days, which has resulted in quite an impressive number of cryptography-related Python modules out there. The PyPI package index alone has about 50 ‘cryptography’-tagged entries....

Read the rest of the article here:
http://blogs.activestate.com/2010/02/python-crypto-state-of-the-art-part-1/

Monday, August 24, 2009

Fun with backups

Backing up a large database can be tricky.

The default Postgres backup facility works very well on relatively small databases. However, as the complexity increases and the number of tables grows, it becomes slower and slower. On a 10,000-table database, pg_dumpall can spend several hours just gathering database structure information.

One particular reason for that is locking: pg_dump sets a lock on every single table it backs up, and those locks are expensive. Disabling table locks would be a remedy if pg_dump supported such an option. Unfortunately, it does not, so there is no choice but brute force.

Grab the sources, find pg_dump.c, locate and comment out this fragment completely:
if (tblinfo[i].dobj.dump && tblinfo[i].relkind == RELKIND_RELATION)
{
    resetPQExpBuffer(query);
    appendPQExpBuffer(query,
                      "LOCK TABLE %s IN ACCESS SHARE MODE",
                      fmtQualifiedId(tblinfo[i].dobj.namespace->dobj.name,
                                     tblinfo[i].dobj.name));
    do_sql_command(g_conn, query->data);
}
Then rebuild the whole source tree without installing it (just make should be enough), locate the pg_dump executable, rename it to something like pg_dump_nolock and place it under /usr/local/bin or a similar location.

The performance gain depends on the schema size; in my case it was more than 100%.

This approach is not for everybody, though. Since no locks are applied to the tables, backup consistency is not guaranteed. It has to be ensured by other means, such as time-split backup/upgrade procedures or filesystem-level locks. This, however, is rarely an issue: production database schemas don't change often.