PostgreSQL streaming replication

If you read the PostgreSQL streaming replication documentation it sounds really, really complicated, but it isn't. There is a clever little tool called repmgr that can set up multiple slaves for you quite easily. repmgr can also take care of promoting a slave to master automatically if needed, but you still have to tell the database clients that the server has changed.

Getting started

You need two servers with PostgreSQL installed. For this discussion we will use 9.3 but you can use 9.4 as well. If you have an RPM based system such as CentOS 7 it goes like this:

 yum -y install
yum -y groupinstall "PostgreSQL Database Server 9.3 PGDG" 
yum -y install repmgr93 


You will need a pair of SSH keys, preferably without a passphrase, because repmgr uses rsync to copy a snapshot of the master server onto the slave. The key should belong to the postgres user, which means it should be in the /var/lib/pgsql/.ssh/ folder (a folder that you will most probably have to create). The public key should be in the authorized_keys file on the master. Don't forget to check the permissions and ownership of these files; SSH is very particular about them (chmod 700 .ssh and chmod 600 .ssh/*).
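If you would rather check those permissions than trust your memory, something along these lines could do it. This is only a rough sketch: the function name is made up for illustration, and it simply flags anything under the folder that is accessible to group or others, which is the sort of thing SSH tends to complain about.

```python
import os
import stat


def ssh_permission_problems(ssh_dir):
    """Return a list of human-readable permission problems under ssh_dir."""
    problems = []
    dir_mode = stat.S_IMODE(os.stat(ssh_dir).st_mode)
    if dir_mode != 0o700:
        problems.append("%s is %o, expected 700" % (ssh_dir, dir_mode))
    for name in sorted(os.listdir(ssh_dir)):
        path = os.path.join(ssh_dir, name)
        mode = stat.S_IMODE(os.stat(path).st_mode)
        # Any group/other bit set means the file is looser than chmod 600.
        if mode & 0o077:
            problems.append("%s is %o, expected 600" % (path, mode))
    return problems
```

Run as the postgres user, ssh_permission_problems("/var/lib/pgsql/.ssh") should come back as an empty list once everything is in order.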

Setting up the master

Yum placed a file named repmgr.conf in /etc/repmgr/9.3/repmgr/ for me. I decided the hell with it and created a file like this:

conninfo='host= user=repmgr dbname=repmgr'

Note that in some places in the repmgr documentation the username is given as repmgr_usr and in other places it's repmgr; similarly the database name is given as repmgr_db or repmgr. It doesn't matter what you choose for the username or database name, but make sure that it's the same everywhere.

Now register the master with repmgr:

repmgr -f /etc/repmgr/9.3/repmgr.conf --verbose master register

A minor detail is that you will need to run this command as the postgres user, and repmgr will probably need to be added to the path.

The slave

If you were to do a manual replication this is the tricky part. But thanks to repmgr it's a one liner (run as postgres).

repmgr -d repmgr -U repmgr --force --verbose standby clone

If you didn't do an initdb you don't need the --force option. Now here is the gotcha: the above command copies over the postgresql.conf and pg_hba.conf files from the master. That means you will need to edit postgresql.conf and change the listen_addresses directive. Without that the server will try to listen on the wrong IP address.
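If you have more than one slave to set up, that edit can be scripted. Here is a minimal sketch, assuming the setting appears at most once in the file (commented out or not); the function name and the IP address in the usage note are invented for the example.

```python
import re


def set_listen_addresses(conf_text, addresses):
    """Return conf_text with listen_addresses set to the given value."""
    new_line = "listen_addresses = '%s'" % addresses
    # Match the setting whether or not it is commented out.
    pattern = re.compile(r"^[ \t]*#?[ \t]*listen_addresses[ \t]*=.*$", re.MULTILINE)
    if pattern.search(conf_text):
        # Only the first match is rewritten; any trailing comment on it is dropped.
        return pattern.sub(lambda match: new_line, conf_text, count=1)
    # Setting not present at all: append it at the end.
    return conf_text.rstrip("\n") + "\n" + new_line + "\n"
```

You would read postgresql.conf on the slave, pass the text through set_listen_addresses(text, "10.0.0.2") with the slave's own address, and write the result back before starting the server.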

The next steps are straightforward. Create the repmgr.conf file:

conninfo='host= user=repmgr dbname=repmgr'

Start the server (systemctl start postgresql-9.3.service in CentOS 7, /etc/init.d/postgresql start in older versions) and finally register the slave (run this command as postgres).

repmgr -f /path/to/repmgr/node2/repmgr.conf --verbose standby register

Now you can go and put the kettle on or whatever else it is you like to wear. The rest of this post is about what led to my writing about PostgreSQL replication.

For nearly two years we operated with just log shipping of WAL archives. We took a pg_basebackup about once a month (alright, alright, whenever I remembered to). All the log files were rsynced to another machine but not applied. A warm standby it certainly wasn't. The logs just sat there in a folder waiting to be applied if needed. We did test it once in a while, though, and the last time the process took hours and hours, so after the carpool was launched I decided that streaming replication was the way to go.

On Monday the site was offline for nearly an hour. It was due to the primary database server crashing. No data was lost, but the automatic failover hadn't kicked in. To make the failover automatic, you need to change the /etc/hosts file on the server and send a HUP signal to gunicorn, but the tiny bash script that I wrote for this had a bug. It's just my luck that the outage occurred the very moment I stepped out, and I am the only one with the keys to the server.
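For what it's worth, the two steps in that failover script can be sketched like this. It is not the script that had the bug, just an illustration of the idea: the hostname, IP addresses and pidfile location are all assumptions, and it presumes gunicorn was started with its --pid option so that the master's pid is on disk.

```python
import os
import signal


def repoint_host(hosts_text, hostname, new_ip):
    """Return hosts_text with hostname mapped to new_ip."""
    lines = []
    replaced = False
    for line in hosts_text.splitlines():
        # Ignore anything after a '#' when looking for the hostname.
        body = line.split("#", 1)[0].split()
        if len(body) >= 2 and hostname in body[1:]:
            # Every alias on this line moves to the new IP, comment dropped.
            line = " ".join([new_ip] + body[1:])
            replaced = True
        lines.append(line)
    if not replaced:
        lines.append("%s %s" % (new_ip, hostname))
    return "\n".join(lines) + "\n"


def reload_gunicorn(pidfile):
    """Send SIGHUP to the gunicorn master whose pid is stored in pidfile."""
    with open(pidfile) as handle:
        os.kill(int(handle.read().strip()), signal.SIGHUP)
```

The hosts rewrite is a plain function on text, so it can be tried out on a copy of /etc/hosts long before a real outage, which is exactly the kind of testing my bash script never got.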

May 8, 2015, 5:48 a.m. » Tagged: postgresql , streaming , replication , repmgr , roadlk

Taking a side in the vim vs emacs debate

It started with a casual comment on twitter. Then at Gaveen's suggestion I decided to finally take a side, and had to decide which side! After using a plethora of IDEs all these years I had very little knowledge of vim, and what I knew of emacs could be written on the back of a stamp. The little bit I knew about vim was thanks to all the times I needed to edit something on a remote server. Even then, for heavy work I prefer to edit locally and rsync.

These days most of my code is python with a bit of Java on Android, but surely vim isn't suitable for the latter? As for the former, python-mode seems to be highly recommended so I decided to install it using vundle. This is what the plugins section of my .vimrc looks like:

Plugin 'gmarik/vundle'
Plugin 'tpope/vim-fugitive'
Plugin 'scrooloose/nerdtree'
Plugin 'klen/python-mode'
Plugin 'majutsushi/tagbar'
Plugin 'davidhalter/jedi-vim'
Plugin 'joonty/vdebug'

As you can see I have decided not to rely on rope but to use jedi-vim instead. Even then I had to make a slight change, because autocomplete would automatically select the first item before I had time to select anything. For example, when I typed the name of a Django model followed by the '.', jedi would append __str__ in the blink of an eye. That behaviour can be changed with the following settings in .vimrc:

let g:jedi#popup_select_first = 0
let g:jedi#popup_on_dot = 0

It's only been a few days but I've already found Nerdtree to be indispensable, while tagbar I haven't really made much use of yet. The biggest issue that I need to sort out, though, is debugging. So far I have tried vdebug without much success. Well, I can in fact get it to stop at breakpoints, but at that stage the vim tab with the code stops responding.

May 4, 2015, 6:30 a.m. » Tagged: python , vim , emacs

Making inroads into city traffic

Making inroads into city traffic: that was the title of a Sunday Times article nearly two months ago. Why the hell am I writing about it now? I am a firm believer in the old adage, better late than never.

I do have a very good excuse for the delay; I was kept really busy by the carpool project, which is what the story is all about. Besides, this is the first time in many years that my ugly mug has appeared in the newspapers. The last time was when I was an undergrad and our chess team was featured in the same paper. Oh hang on a sec, my photo did appear a few times when my mother in law (a veteran singer) gave out a few family photos when journos asked her to share them. But that doesn't count, right?

What you know for is not exactly what Raditha Dissanayake had in mind for the site. After a fair share of glowering in traffic, his intention was to reduce the number of vehicles on the streets.

In the article we talk about how the site became known as a crowd sourced traffic alert system, though my original plan was to make that a sub project. Even 6 weeks after the launch of the carpool service it's still better known for that, but hopefully things will change.

As to questions if he takes after his father and sister known for their penmanship, his response has long been “yes, I write code, which is sort of like poetry.”

Well I actually said code is poetry and it's something I often say whenever I am introduced to one of my father's friends.

Raditha likes to think of himself as being a “serial entrepreneur” and in fact one of his first innovations was a programme making online transactions possible. Finding a method of online payment accepted in Sri Lanka, because “systems like Paypal are still not functional here” has been one of the reasons for the hold-up

This is actually a misquote in what is otherwise a very well written article. (Thank you Venusha). I spoke of an online bookshop that I started along with three friends in the late nineties. There we processed online transactions. Since then I've seen many Sri Lankan websites claim to have been the first to have done online credit card processing in Sri Lanka but I seriously doubt if any of them were doing anything like that in the 90s.

Rather curiously, though, getting online payment facilities for the site proved to be a very big challenge. None of the service providers that I worked with in the past supported the travel industry (and according to them carpooling fell into the rather broad travel category). Thankfully HNB stepped in to fill the vacuum and here we are!

April 24, 2015, 4:41 a.m. » Tagged: carpool , sri lanka , roadlk

Dell B1160 'Held'

TL;DR: If you find that your B1160 jobs are being marked as 'held' in the printer status applet instead of being sent to the printer, you need to add your printer to CUPS after you have installed the Dell unified printer driver.


I hardly print anything at all, so when my ancient HP p1105 printer died (or the toner was used up, not sure which), it was time to shop around for a new budget printer, and the Dell B1160 was what I chose. This will probably last a few years too, and by the time the cartridge is used up it will be cheaper to replace the whole printer rather than the toner. But first let's try to get at least the test page printed, because whatever you do, nothing gets sent to the printer.

print job held

Dell has very kindly created a driver for this device instead of passing the buck to the open source community. The only trouble with the driver is that it looks for the SANE libraries during the installation. I can see why you would need them for a multifunction printer, but the B1160 doesn't have a built in scanner.

SANE for a printer?

So it turns out that using the Dell unified driver installer is not sufficient; you need to add the printer manually to CUPS as well. It so happens that CUPS may tell you that the printer has already been added, but if you try to remove it you will be told that the printer does not exist! So use system-config-printer or the CUPS web interface ( http://localhost:631 ) and after that you will find that items are no longer being 'held'.

Aug. 13, 2014, 11:39 p.m. » Tagged: dell , linux , printer , hp , cups

R In 24 Hours

I had previously made a couple of half hearted attempts to learn the R language without much progress. Then, inspired by my own Python in 48 hours and Ruby in 24 hours efforts, I decided to go on a crash course in R today. My interest in R is mostly because we have a lot of data that I would like to look at through different eyes. I also have access to a lot of scrabble tournament data that needs analysis. The latter can certainly be done most easily with python, but it never does any harm to have a lot of different weapons in your armoury. So it's 7:31 in the morning of August 7th. Let's start.


The first stop was the Coursera R lang course, but lectures aren't my thing. So the next stop was the Datacamp courses, but they seem to be too trivial. Can't blame them, because their target audience probably isn't hard core programmers but data scientists. The next stop was the r-project documentation, but that's where I got lost in the maze. There were too many documents to choose from. So as of 5:49 am on the 9th of August, I haven't gone anywhere. The fact that I couldn't devote my full attention to R didn't help either. So this is going to need another effort next week.

Aug. 7, 2014, 1:58 a.m. » Tagged: R , coursera

ISC on Linux

Internet Scrabble Club is the cool place to play scrabble online. Some of the world's top players hang around at the ISC. Playing online doesn't mean you can play in the browser; rather, you need to download and install one of their clients. Fortunately they have clients available for Linux, Mac and Windows. The only trouble is that the Linux client freezes at the slightest excuse. You can never seem to play more than one move before it either disconnects or stops responding altogether. The obvious solution then is to try running it under wine.


ISC on Linux WINE


The Windows exe running under wine doesn't even get as far as that. It fails at the login screen. The dialog box to enter the username and password is permanently damaged and wants repainting, but that never seems to happen. So it was time to look back at the java version of wordbuff, and it wasn't long before I figured out that the problem was that OpenJDK and wordbuff don't see eye to eye. It runs easily with the Oracle JVM.


export JAVA_HOME=/usr/local/java/jdk1.7.0_65/
/usr/local/java/jdk1.7.0_65/bin/java -jar wordbiz.jar


Ok a few footnotes:

The client software is named wordbuff. Does that have anything to do with Derek McKenzie who runs the popular website?

Wordbuff seems to save the username and password in a clear text file named Config

ISC seems to save the password in clear text on their server (if you try the password reset, they send you your old password back in a plain text email)


July 20, 2014, 5:28 a.m. » Tagged: Scrabble , ISC , wordbuff , java

NDB Queries

This blog now runs on Google App Engine with NDB serving as the storage backend. In the process of writing the (python) code I ran into some complications regarding how it uses indexes. First we will consider a query on one of the WordPress tables in a MySQL database.

explain select * from wp_posts where post_type='post' and comment_count = 0;

explain select * from wp_posts where comment_count = 0 and post_type='post';

The comment_count column does not have an index, so the MySQL query optimizer is smart enough to figure out that the 'type_status_date' index is the right one to use for both queries.


| id | select_type | table    | type | possible_keys    | key              | key_len | ref   | rows | Extra                              |
|  1 | SIMPLE      | wp_posts | ref  | type_status_date | type_status_date | 62      | const |  974 | Using index condition; Using where |
1 row in set (0.38 sec)


Now what if you tried to do a similar query on NDB? You would need indexes on both columns to begin with, and you would also find that NDB expects two different indexes for these queries. One would take the form Index(comment_count, post_type) while the other would be Index(post_type, comment_count). Welcome to the world of NoSQL, where no one really cares too much about storage. But if you are a GAE user you should care, because you are paying for it as well as for the read and write operations to the datastore. The more indexes you have, the higher the write cost. Unless you plan your queries judiciously you will find that the indexes soon balloon out of hand, and you are limited to 200 indexes max!
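To see why the two orderings count as two separate indexes, here is a toy model in plain python. It is emphatically not the NDB API and says nothing about how the datastore actually plans queries; it only treats a composite index as a sorted list of tuples, which is enough to show that each ordering is a separately stored (and separately written) copy of the data. The function names and sample rows are made up.

```python
from bisect import bisect_left


def build_index(rows, properties):
    """Return a sorted list of (value, ..., key) tuples, one entry per row."""
    return sorted(tuple(row[p] for p in properties) + (row["key"],) for row in rows)


def prefix_scan(index, prefix):
    """Return the keys of entries whose leading values equal prefix."""
    prefix = tuple(prefix)
    keys = []
    # Jump to the first entry that could match, then walk until the prefix changes.
    for entry in index[bisect_left(index, prefix):]:
        if entry[:len(prefix)] != prefix:
            break
        keys.append(entry[-1])
    return keys


# Hypothetical sample data standing in for blog posts.
rows = [
    {"key": 1, "post_type": "post", "comment_count": 0},
    {"key": 2, "post_type": "page", "comment_count": 0},
    {"key": 3, "post_type": "post", "comment_count": 4},
]
by_count_then_type = build_index(rows, ["comment_count", "post_type"])
by_type_then_count = build_index(rows, ["post_type", "comment_count"])
```

Both lists hold one entry per post, so every extra index is another entry to write each time a post is saved. A scan can only use the leading columns of whichever list it is given: prefix_scan(by_type_then_count, ("post",)) finds all the posts, but there is no equivalent shortcut for that question in by_count_then_type.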

June 1, 2014, 11:39 p.m. » Tagged: GAE , NDB ,

Flipped the switch.

Flipped the switch. Finally. WordPress has been switched off, as I have threatened to do so many times before and in fact claimed to have done at least twice! But this time it's real. This blog post is being written with the aid of a home made blogging system running on Google App Engine (python). The data is stored in NDB and the editor is CKEditor, with the comments being powered by Disqus.

So what about the previous claim of switching to Jekyll? Well, I did do so for the photoblog and I did do all the hard work for this one as well, but then for no apparent reason the whole damn thing stopped working.

I couldn't be bothered to find out what caused it, particularly since Ruby isn't in my repertoire. So I said the hell with it and stopped blogging for a few months, and then suddenly decided to come up with this system. It's said that the path to Django mastery lies through building your own blogging platform. Well, I can well and truly say that I have done that now.

The system hasn't been without a few teething problems. The feed link broke and resulted in a lot of people being sent a bunch of notifications about new posts even when there weren't any (very sorry about that). And there are still a few 404 errors scattered about here and there (these will be fixed soon).

May 29, 2014, 1:32 a.m. » Tagged: Jekyll , Ruby , GAE , Python , Django