Saturday, 23 June 2012

DIY Image Search: Part 1 - Introduction

I'm sure everyone has had an awesome picture lying around on their hard drive for what seems like forever, with a gibberish file name and no clues as to the source.  Most people would simply navigate to a site like TinEye and use their excellent service to identify the image and find a source.  Not you: you're interested in building some kind of software hacked together with duct tape and string, or else why would you be reading this?

I'm going to be flying by the seat of my pants for this series, and I'm giving very little forethought to future posts.  Things will probably change dramatically as I progress through this project, so I'm going to apologise in advance if everything isn't as coherent as it should be.

The Tools

To build this monstrosity, I'm going to use a couple of interesting projects: pHash and py-pHash, django and django-orm, and PostgreSQL with the smlar extension.  I won't be going over django or PostgreSQL, so if they are foreign to you, take some time to learn about them.

What is pHash?


To understand pHash, you'll need a basic understanding of hashes.  If you have no idea what they are, go and check out the hash function page on Wikipedia.  The Coles Notes version is that a hash function transforms an arbitrarily-sized input into a smaller fixed-size output.
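To make the "arbitrary input, fixed-size output" idea concrete, here is a minimal sketch using SHA-256 from Python's standard library.  SHA-256 is my choice for illustration only; it is an ordinary cryptographic hash and has nothing to do with pHash, and digest_hex is just a helper name I made up.

```python
import hashlib


def digest_hex(data):
    """Return the SHA-256 digest of a byte string as 64 hex characters."""
    return hashlib.sha256(data).hexdigest()


short = digest_hex(b"a")
long_ = digest_hex(b"a" * 1000000)

# Inputs of very different sizes produce outputs of the same size.
print(len(short))
print(len(long_))

# A one-character change in the input gives an unrelated-looking digest.
print(digest_hex(b"tree"))
print(digest_hex(b"tref"))
```

Note that with a hash like this, similar inputs do not give similar outputs, which is exactly the property a perceptual hash changes.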

There is a staggering variety of hashes, and an even more astounding number of applications for them.  The type of hash we're using, a perceptual hash, maps visually (or acoustically) similar input to similar output hashes.  Take a look at the following example:

Tree Picture by Calibas / Cervus Elaphus Picture by Luc Viatour

On the left we have our images, and on the right, the pHash for the image.  Each red digit indicates a nibble that differs from the pHash of the original colour picture.  Notice how the colour and greyscale images have very similar hashes, while the third picture has a dramatically different hash.

By now, you can probably see how a perceptual hash would be useful: to find images similar to A, we compute the hash for A and search for all images within a threshold distance.
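The search described above can be sketched in a few lines, assuming the hashes are 64-bit integers and that "distance" means Hamming distance (the number of differing bits).  The function names and the hash values below are invented for illustration; they are not real pHash output.

```python
def hamming_distance(hash_a, hash_b):
    """Count the bits that differ between two integer hashes."""
    return bin(hash_a ^ hash_b).count("1")


def find_similar(target, candidates, threshold):
    """Return (name, distance) pairs within `threshold` bits, closest first."""
    matches = []
    for name, value in candidates.items():
        distance = hamming_distance(target, value)
        if distance <= threshold:
            matches.append((name, distance))
    return sorted(matches, key=lambda pair: pair[1])


# Made-up 64-bit values for illustration, not real pHash output.
library = {
    "tree_colour": 0xF0F0F0F0F0F0F0F0,
    "tree_grey": 0xF0F0F0F0F0F0F0F1,
    "deer": 0x0F0F0F0F0F0F0F0F,
}

print(find_similar(0xF0F0F0F0F0F0F0F0, library, threshold=8))
# → [('tree_colour', 0), ('tree_grey', 1)]
```

This is a linear scan over every stored hash, which is fine for a handful of pictures but is the part a database index is meant to replace.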

py-pHash is a Python wrapper for the pHash library, which will make integrating it with django much easier later on.

What is django-orm?


Django-orm is a collection of third-party extensions to django's already pretty awesome database system.  It adds support for a metric tonne of new database features, like negated F expressions and full text search, but the most interesting feature it adds to django is the PostgreSQL-specific ArrayField.

The smlar extension uses arrays extensively, so having support for them in django is essential.

What is smlar?


Smlar is an extension for PostgreSQL built by Oleg Bartunov and Teodor Sigaev.  It allows you to make effective similarity searches in PostgreSQL databases on pretty much any kind of data, as long as you can put it in an array.  Alexey Vasiliev goes into a lot more depth on similarity searches on his blog, including some example code, but basically the relevant information is that smlar adds efficient similarity searches and indexes.

We'll be using smlar to find similar hashes, which will let us find similar images very quickly.
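To give a feel for what "similarity between arrays" can mean, here is a rough sketch of a set-based cosine measure: the number of shared unique elements divided by the square root of the product of the unique element counts.  As far as I understand it this matches smlar's default cosine mode, but treat that as my assumption rather than a description of the extension's internals; the function name is mine.

```python
import math


def set_cosine_similarity(array_a, array_b):
    """Cosine similarity treating each array as a set of unique elements."""
    unique_a, unique_b = set(array_a), set(array_b)
    if not unique_a or not unique_b:
        return 0.0  # nothing to compare against
    shared = len(unique_a & unique_b)
    return shared / math.sqrt(len(unique_a) * len(unique_b))


print(set_cosine_similarity([1, 2, 3, 4], [3, 4, 5, 6]))
print(set_cosine_similarity([1, 2, 3], [7, 8, 9]))
```

A score of 1.0 means the two arrays contain the same unique elements, and 0.0 means they share none.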

The Setup

I'm running Ubuntu 12.04, so my instructions are going to be very biased towards (read: only for) Linux.  Convert the instructions to use the package management system of your choice! I'm using the standard Python that ships with Ubuntu, which at the time of writing is 2.7.3.

PostgreSQL & smlar


# Leave a comment if I missed something,
# which is more than probable...

sudo apt-get install postgresql-9.1 \
                     postgresql-server-dev-9.1 \
                     git build-essential

# Clone smlar and build/install it
git clone git://sigaev.ru/smlar
cd smlar
make && sudo make install

# Setup django user and database
sudo -u postgres psql -c 'CREATE DATABASE lostpic'
sudo -u postgres psql -c "CREATE USER lostpic WITH PASSWORD 'password1'"
sudo -u postgres psql -c 'GRANT ALL PRIVILEGES ON DATABASE lostpic TO lostpic'

# Enable the extension in the lostpic database
# (extensions are installed per database)
sudo -u postgres psql -d lostpic -c 'CREATE EXTENSION smlar'


pHash


sudo apt-get install libphash0 libphash0-dev

Virtualenv


I'll be using virtualenv to manage a separate Python environment to make deployment easier.  This step is optional, but recommended.

# Install virtualenv
sudo apt-get install python-virtualenv

cd /path/to/project

# Make the virtual environment
virtualenv pyenv

# Activate it (it's 'deactivate' to exit)
. ./pyenv/bin/activate

Django & Dependencies


pip install psycopg2 django django-orm
pip install git+https://github.com/polachok/py-phash.git

django-admin.py startproject lostpic

cd lostpic/lostpic

Edit settings.py with your favourite editor and make the following changes:
  • Set the database backend to postgresql_psycopg2
  • The database user is lostpic, the password is password1
  • Uncomment the admin application in INSTALLED_APPS
Don't forget to edit urls.py to enable the admin.
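For reference, the database part of those changes might look roughly like the fragment below.  The engine, user and password come from the list above and the database name from the setup commands; the HOST value is my assumption (PostgreSQL running on the same machine and reached over TCP so that password login is used), so adjust it to your setup.

```python
# Fragment of lostpic/settings.py, a sketch rather than a complete file.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql_psycopg2",
        "NAME": "lostpic",        # database created during the PostgreSQL setup
        "USER": "lostpic",
        "PASSWORD": "password1",
        "HOST": "localhost",      # assumption: PostgreSQL is on this machine
        "PORT": "",               # empty string means the default port
    }
}
```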

Next Up

In the next instalment, we'll be building the models to store picture information and writing the functions to look up pictures by similarity.

Sunday, 10 June 2012

Robots!

If you've been around me in the past couple of days, you've probably heard me talking endlessly about robots, and possibly PyPy, and my trials and tribulations with RPython.  I've been working pretty hard on building something fun that isn't a website, and I think I've succeeded, at least at entertaining myself.

A little bit of back story would probably be useful here.  I first got interested in programming when I read about Java, back in the early 2000s.  I saw a magazine on a newsstand proclaiming the awesomeness of Sun's baby, and had to buy it.  Thankfully, as a ten-year-old, my money was practically non-existent and I was unable to acquire it.  I forgot about programming for months, until Christmas rolled around and my parents bought me Borland's C++ with a copy of Sams Teach Yourself C++ in 21 Days.

After miserably failing to wrap my brain around pointers, I began looking at other ways to program, and I eventually discovered AT Robots, Corewars, and other awesome games.  These games, especially Robocom, are what inspired tonight's project.

I give you Robots!  This is my way of saying thank you to those who got me interested in programming (I hope no one ever finds this post, since it's a fairly terrible thank you, but I digress).

In Robots! you program a single robot whose goal is to eliminate the other Robot teams on the field.  Robots can move, scan, replicate and transfer instructions between each other.  The user interface is limited to a grid with coloured circles, or an even worse command line interface, but I think it gets the point across:


The game is played on a 100 by 100 grid where the edges wrap around, torus style.
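Wrapping around the edges comes down to taking coordinates modulo the grid size.  Here is a minimal sketch of the idea; the names are hypothetical and this is not lifted from the Robots! source.

```python
GRID_SIZE = 100  # the board is 100 by 100


def wrap(x, y, size=GRID_SIZE):
    """Map any coordinate back onto the board, torus style."""
    # Python's % always returns a non-negative result for a positive size,
    # so stepping off the left or top edge lands on the opposite side.
    return x % size, y % size


def step(x, y, dx, dy, size=GRID_SIZE):
    """Move by (dx, dy) and wrap around the edges."""
    return wrap(x + dx, y + dy, size)


print(step(99, 0, 1, 0))    # off the right edge
print(step(0, 0, -1, -1))   # off the top-left corner
```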

Robots are programmed in a very simple assembly-like language that is interpreted by an RPython program.  The game state is transferred to the user interface, which is written in full Python with wxPython and Cairo.

A sample Robot program:

:start
build $left

:program
set L1 :end
sub L1 1

:program-loop
if $lt L1 :new
jump (:start)       ' Exit the loop when the counter reaches :new

set L0 L1           ' Copy source location to L0
sub L0 :new         ' Make relative to :start

xfer $left L1 L0    ' Transfer the new instruction from L1 to L0
sub L1 1
jump (:program-loop)

:new
go $up
jump (:new)
:end

This simple program builds copies of itself and sends them on their merry way around the board.

Feel free to take a look at the code over on GitHub and play around: