Sunday, August 23, 2009

Fun with backups

Backing a large database up can be tricky.

The default Postgres backup facility works very well on relatively small databases. However, as the complexity increases and the number of tables grows, it becomes slower and slower. On a 10,000-table database pg_dumpall could spend several hours just on gathering database structure information.

One particular reason for that is locking: pg_dump sets a lock on every single table it backs up, and those locks are expensive. A remedy to this problem could be disabling table locks if pg_dump supported such an option. Unfortunately, it does not, so there is no choice but brute force.

Grab the sources, find pg_dump.c, locate and comment out this fragment completely:
if (tblinfo[i].dobj.dump && tblinfo[i].relkind == RELKIND_RELATION)
         {
             resetPQExpBuffer(query);
             appendPQExpBuffer(query,
                               "LOCK TABLE %s IN ACCESS SHARE MODE",
                          fmtQualifiedId(tblinfo[i].dobj.namespace->dobj.name,
                                         tblinfo[i].dobj.name));
             do_sql_command(g_conn, query->data);
         }
Then rebuild the whole source tree without installing it (just make should be enough), locate pg_dump executable, rename it to something like pg_dump_nolock and place under /usr/local/bin or similar location.

The performance gain depends on the schema size, in my case it was more than 100%.

This approach is not for everybody, though. As no locks are applied to the tables, no backup consistency is guaranteed. It has to be ensured by some other means like time-split backup/upgrade procedures, filesystem-level locks, etc. This, however, is rarely an issue: production database schemas don't change often.

Thursday, July 16, 2009

Dynamic queries: the Postgres way

Have you ever been in a situation when you needed a query to be generated and executed dynamically as a result of another query?

If yes, read further: there is a simple and elegant way to achieve that.

First, create this function:
CREATE OR REPLACE FUNCTION dselect(varchar)
RETURNS SETOF record AS $$
DECLARE rec record;
BEGIN
FOR rec IN EXECUTE $1 LOOP
RETURN NEXT rec;
END LOOP;
END
$$ LANGUAGE 'plpgsql';

Second, umm.. well, that's it.

Anything you pass as a parameter will be interpreted and executed as an SQL statement.

The function is SELECTable. That is, you can use it in SELECTs:
SELECT * FROM dselect('SELECT id, name FROM users') AS t(id int, name varchar);

Please note the "AS t(id int, name varchar)" part. Postgres has no idea about what this function returns, so a column definition list should be provided. If not, Postgres will complain:
SELECT * FROM dselect('SELECT id, name FROM users');
ERROR: a column definition list is required for functions returning "record"

Of course, the column definition list depends on the query and should match the actual query results.

So why this function is needed at all?

Because in situations like this:
CREATE SCHEMA sc_foo;
CREATE TABLE sc_foo.activity (date date, descr text);
INSERT INTO sc_foo.activity (date, descr) VALUES ('2009-08-07', 'went there');
INSERT INTO sc_foo.activity (date, descr) VALUES ('2009-06-04', 'hanging around');

CREATE SCHEMA sc_bar;
CREATE TABLE sc_bar.activity (date date, descr text);
INSERT INTO sc_bar.activity (date, descr) VALUES ('2009-10-11', 'came here');

You can do this:
SELECT 
nspname
FROM
pg_namespace
WHERE
nspname LIKE 'sc_%' AND
(SELECT date FROM dselect('SELECT max(date) FROM ' || nspname || '.activity') AS t(date date)) > '2009-09-09';

nspname
---------
sc_bar
(1 row)

In one pass this query goes over all the schemas whose names start with 'sc_', grabs the latest date from the schema's activity tables and matches the result against the provided date.

Of course, this approach is quite ineffective. Each time the function is called, a query is parsed and executed using a separate plan. A simple UNION of two queries would do the same, but... what if there are ten schemas? How about a hundred? I'm actually working with a database containing thousands of them.

I use this function when I need to collect statistics or do some db administration tasks, it saves me a lot of time.

Tuesday, June 30, 2009

So you want it fast

Postgres can be slow sometimes. There could be a whole bunch of reasons why: a slow disk system, suboptimal SQL queries or even more subtle things like inefficient schema layout.

But before diving into the realm of fine SQL tuning it is always worth to make one simple check. It costs nothing yet it can save countless hours of chasing non-existing issues in non-relevant areas.

Where is my cache, dude?

Postgres is sensitive to the filesystem cache. To be completely precise, any RDBMS that stores data in the file system (as opposed to storing it in a separate partition) suffers from the file system cache issues.

Please note, this is in no way related to the Postgres shared buffers, which is a totally different thing. What I am talking about is the operating system file cache, an area in the main memory where the operating system stores often read and written files for quicker access. The size of that area is extremely important for the normal Postgres work.

So, here is the very first step towards blazing fast Postgres:

$ cat /proc/meminfo | grep -P '(^Mem|^Cached)'

You will see something like this:

MemTotal: 509884 kB
MemFree: 177016 kB
Cached: 254920 kB


This is an example of a healthy, albeit small system. In fact, it is one of my virtual machines. It has enough free memory for emergencies like vacuums; it has about 250 megabytes of the file system cache, everything is fine.

Now look at this:

$ cat /proc/meminfo | grep -P '(^Mem|^Cached)'
MemTotal: 1747764 kB
MemFree: 14376 kB
Cached: 7944 kB


See the problem? Nearly all the memory is taken, apparently by some processes. Because no memory is left for caching, there is nothing to speed up file access with. How can Postgres possibly be fast in this environment?

By the way, the same information (and a whole lot more) can be obtained with the top utility. It's an interactive program, but it can be run in the batch mode: top -b -n 1

Here is how the top's output looks like:
top - 18:24:50 up  4:56,  1 user,  load average: 0.00, 0.00, 0.00
Tasks: 89 total, 1 running, 88 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.1%sy, 0.1%ni, 99.6%id, 0.2%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 509884k total, 333180k used, 176704k free, 11976k buffers
Swap: 548568k total, 0k used, 548568k free, 255152k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1 root 20 0 10308 716 600 S 0.0 0.1 0:01.18 init
2 root 15 -5 0 0 0 S 0.0 0.0 0:00.00 kthreadd
3 root RT -5 0 0 0 S 0.0 0.0 0:00.00 migration/0
.....

The memory usage information is in the top part.

The line starting with "Mem:" contains some essential memory information. This line is self-explanatory:
Mem:    509884k total,   333180k used,   176704k free,    11976k buffers

The "buffers" field shows how much memory is allocated for... well... the system buffers. It is a specialized area in the memory being kept just in case. When something happens, like a program allocates a lot of memory, this area is used first. The system slices an appropriate chunk of memory off the buffers and gives it to the program. To remedy the loss it usually drops a part of the file system cache.

When there is no cache left, the system starts unloading inactive processes to the disk. This process is called swapping and is very hurtful for the system performance. By any means swapping should be avoided.

The line coming after the memory status is about Swap--the area on the disk where swapped processed are loaded out. It looks like this:
Swap:  548568k total,        0k used,  548568k free,   255152k cached

Zero swap usage, all the swap area is free. That's fine, this is how it supposed to be.

Some people suggest to turn swapping off completely by disabling the swap area. It's not a good idea. Swap is a safety measure designed to prevent system crashes in situations where no physical memory is left. It is pretty much like a safety belt: having it on will not prevent a system from accidents, but will let its data survive.

The last number in the line is for the file system cache. Oddly enough, it appears here among swap metrics, which confuses a lot of people, including myself.

Now a few words on how to make the file system cache bigger. Unfortunately there is no magical config parameter for that. The system will make the decision itself based on how much free memory is available. It's a wise idea not to allow processes take more than 50% of the physical memory. Say there is 1G of RAM. Very roughly the postmaster process takes about 30M, ten connections about 10M each, various Postgres services 50M altogether, then a 150M shared memory area for the Postgres shared buffers. Add 150-200M for various operating system services. All this gives about 500M of the process memory while the rest half of the gigabyte is available for caches and buffers. An ideal situation for a database under 500M.

If processes take more than 1/2 of the main memory, the situation can be improved by eliminating unnecessary ones (slocate is the candidate #1), decreasing the Postgres shared buffers and so on. It's a good idea not to have Postgres and Apache running on the same machine for this very reason.

Surprisingly often, this is all you need to make Postgres a happier, faster program.

Tuesday, May 12, 2009

Ongoing

Hooray! Another round of our apartment renovation epopea is over. Now as I have more free time I can start writing again. 

A lot of things happened to us lately, mostly good things. 

First I've got a new job. I'm with ActiveState now. It's a long story how this happened and what's going on there. I promise to tell more later. 

Soon after that we've been granted Canadian citizenship. A whole period of our lives which started ten years ago--yes, it took more than ten years from the beginning to the end--is over. That's a really good thing and a precious gift, long time awaited and very appreciated.

That's all for now.

Friday, March 13, 2009

The Adventure of Life

Somebody asked me recently: 'what really ticks you?'

A simple question that supposes a simple answer, isn't it? I talked about my family of course, and having created something useful, then couple of other things... Well it was just a friendly conversation but I really struggled finding the right word and I couldn't. I couldn't articulate a simple and precise answer.

Suddenly a word popped up in my mind, just one word describing everything that matters: adventure.

I really do like this word, semantically it is close to words like 'advance' and 'advantage'. And to 'advent', too.

Really, for me the whole life is a big, wild ever-going adventure.

Moving to Canada--oh yes, that was an adventure! My current job--for sure, it's quite an adventure, too. Software development is an adventure on itself, what a fun is to take a part in it! I still enjoy adventures of learning new things, getting to new places, discovering what is out there.

My favorite is the adventure of having a family which started many years ago as an adventure of marriage. A true adventure of everlasting love. A limited in time, but really fun adventure of growing the kids, followed by another adventure of seeing them going their own ways.

A joyful adventure of meeting other people, knowing them, learning from them. A sometimes sorrowful adventure of knowing problems and needs of others, feeling their pain and helping them out. A really difficult adventure of becoming a better human being.

A truly thriving adventure of meeting and understanding God, learning His paths. The most important adventure of journeying to His Kingdom.

The greatest adventure of what lies ahead.

That's the adventure of my life, and I must tell you I'm really having fun going down the road!

Friday, March 6, 2009

Code quality and SCC systems

Well, I've just got another crazy idea...

From days of my programming in Java I remember an IDE called Together. One of its features I really liked was the code analysis tool. The thing would scan you code, collect metrics like LOC/method, cyclomatic complexity, number of methods per class, test coverage and so on and give you an impression of how quality your code is.

If I recall properly there was something similar in the Idea IDE. Also, I believe MS Visual Studio does code analysis, too.

Anyways, I thought: well, why this should be an IDE feature at all, when it really belongs to source code control systems?

Just imagine GitHub or BitBucket telling you "you know what, this file you have just uploaded is a total crap." Neat, huh?

Seriously, I would really appreciate an ability to browse my source code repository and actually see quality metrics along with the files/directories. It could be as simple as just one more column with code quality stats.

What really amazes me is why nobody came up with such a thing yet.

A web DVD editor

No, I can't post a link to that, it's just a raw idea.

Imagine a service where you log in and somehow upload your movies or just provide links to Youtube or PhotoBucket stuff. Then you put together a DVD menu composed of previews generated from your clips and some text you entered in.

Then you press the big red "Burn" button and--voila--you have a DVD ISO ready to download.

Or even better you press another big green "Send" button, enter your friend's or mum's address and a couple days later they receive a shiny new DVD in a nice colorful envelope.

Apparently a lot of people would wishfully pay a few bucks for such a convenience.

So since I really have no time to implement it, the idea is yours. If you like it--go ahead, but don't forget about me when it starts paying off :-)