James Little

Archive for the ‘MySQL’ Category

Detect visitor’s country with PHP & MySQL

Of course you don’t have to use PHP, or MySQL for that matter, but it’s my method of choice for most web apps, and it’s also a pretty common one. The general gist is to take your visitor’s IP address from the PHP superglobal array $_SERVER and look it up in a database that maps IP address ranges to geographical locations. Yes, there are caveats: the database is not 100% complete or accurate, and some ISPs (like AOL!) use proxies across different countries, so the user will appear to come from somewhere other than their true country of origin. Boo hoo, let’s do it anyway; according to MaxMind, their free(!) GeoLite Country database is 99.3% accurate, and their licensed version 99.8%.

The database is released monthly in CSV format, so I’ll have to import it into MySQL using mysqlimport, or LOAD DATA INFILE. I prefer the first option. Those of you that are MySQL fans probably know that there is a CSV storage engine available, but that’s only in version 5.1 which is still in Release Candidate stage, so I’ll stick to mysqlimport.

Download the GeoLite database from MaxMind, extract the CSV file and rename it to something more convenient, because mysqlimport uses the filename as the name of the MySQL table it imports into:
root@jim-desktop:/home/jim/data# wget http://www.maxmind.com/download/geoip/database/GeoIPCountryCSV.zip
root@jim-desktop:/home/jim/data# unzip GeoIPCountryCSV.zip
root@jim-desktop:/home/jim/data# mv GeoIPCountryWhois.csv geo_csv.csv

Before we import the data into MySQL we need to create a table for it to go into. The following DDL accurately describes the structure of the data. Obviously create a new database if you want; here I have one called geo_ip:
CREATE TABLE `geo_ip`.`geo_csv` (
  `start_ip` char(15) NOT NULL,
  `end_ip` char(15) NOT NULL,
  `start` int(10) unsigned NOT NULL,
  `end` int(10) unsigned NOT NULL,
  `cc` char(2) NOT NULL,
  `cn` varchar(50) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

If you look at the data in the CSV file you’ll see it’s delimited by commas and the text is qualified by double quotes. With that in mind, we use the following statement to import the data:
root@jim-desktop:/home/jim/data# mysqlimport --fields-terminated-by=","  --fields-optionally-enclosed-by="\"" --lines-terminated-by="\n" geo_ip /home/jim/data/geo_csv.csv
geo_ip.geo_csv: Records: 102957  Deleted: 0  Skipped: 0  Warnings: 0

If the mysqlimport binary is not in your environment path, use locate to find it. If you don’t have it at all then use LOAD DATA INFILE.
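
For anyone taking the LOAD DATA INFILE route, the mysqlimport options used above translate fairly directly into a statement. The following is only a sketch, assuming the same file path, database and table names as before; load_data_sql is a hypothetical helper name that just prints the statement so it can be piped into the mysql client.

```shell
# Hypothetical helper: prints a LOAD DATA INFILE statement that mirrors the
# mysqlimport options used above (comma-separated, optional double quotes).
# $1 = path to the CSV file (as seen by the MySQL server), $2 = table name.
load_data_sql() {
  file=$1
  table=$2
  cat <<EOF
LOAD DATA INFILE '$file'
INTO TABLE $table
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n';
EOF
}

# Usage (assumes the mysql client is installed and the server can read the file):
#   load_data_sql /home/jim/data/geo_csv.csv geo_csv | mysql geo_ip
```

Printing the statement first makes it easy to inspect before anything touches the database.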

So we now have the raw data imported into MySQL, but how do we use it? First let’s take a look at the data:
mysql> select * from geo_csv order by rand() limit 10;
+---------------+----------------+------------+------------+----+----------------+
| start_ip      | end_ip         | start      | end        | cc | cn             |
+---------------+----------------+------------+------------+----+----------------+
| 207.209.7.0   | 207.209.7.255  | 3486582528 | 3486582783 | AU | Australia      |
| 79.99.200.0   | 79.99.207.255  | 1331939328 | 1331941375 | BE | Belgium        |
| 217.27.192.0  | 217.27.207.255 | 3642474496 | 3642478591 | DE | Germany        |
| 194.59.180.0  | 194.59.180.255 | 3258692608 | 3258692863 | FR | France         |
| 81.16.160.0   | 81.16.175.255  | 1360044032 | 1360048127 | SE | Sweden         |
| 62.23.198.192 | 62.23.198.207  | 1041745600 | 1041745615 | GB | United Kingdom |
| 64.49.231.240 | 64.49.232.15   | 1077012464 | 1077012495 | US | United States  |
| 83.217.68.32  | 83.217.68.95   | 1406747680 | 1406747743 | BE | Belgium        |
| 91.193.20.0   | 91.193.23.255  | 1539380224 | 1539381247 | CH | Switzerland    |
| 194.37.249.0  | 194.37.249.255 | 3257268480 | 3257268735 | SE | Sweden         |
+---------------+----------------+------------+------------+----+----------------+
10 rows in set (0.22 sec)

The table is essentially a big list (~103k records) of IP ranges, given in both dot-decimal and decimal form. The decimal form is simply the four octets read as one 32-bit number: a.b.c.d becomes a×256³ + b×256² + c×256 + d. It is the more efficient form to search on, as the int datatype requires less storage than char, and with integers we can reliably use operators such as greater than, less than and BETWEEN. Exactly how you use the data will depend on your scenario. I began looking into this when I was working on a German website that wanted to know when a visitor was from Switzerland, so it could display prices in CHF rather than euros. All I needed to know was whether the visitor was Swiss; visitors from any other country would see euros. The only columns I need from the table are therefore start and end, and the only rows are those belonging to Switzerland, or ‘CH’. To make searches more efficient I’ll grab only the data I need and put it in a new table called ch_ip:
mysql> create table ch_ip as select start,end from geo_csv where cc='CH';
Query OK, 2023 rows affected (0.05 sec)
Records: 2023  Duplicates: 0  Warnings: 0

Great, that’s cut the data from nearly 103 thousand records to just over 2 thousand, and we’ve also lopped off four columns. I’ll now be searching on a table that’s 18K in size, rather than the original 5.3MB. Maybe you need every row of data in your scenario, but in many cases you only need a fraction. And in any case, you really don’t need the start_ip and end_ip columns (as you will see shortly). You could also split off the country names (cn column) into another table so that cc becomes a foreign key. Or you could ditch the country names completely and create an array of CC => CN in your application; there are only 239 unique CCs after all:
mysql> select count(distinct cc) from geo_csv;
+--------------------+
| count(distinct cc) |
+--------------------+
|                239 |
+--------------------+
1 row in set (0.05 sec)

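So I have my table of 2,023 Swiss IP ranges. Now I need to grab a visitor’s IP address and convert it into decimal notation. For this we can use PHP’s built-in function ip2long. On 32-bit platforms ip2long returns a signed integer, so addresses from 128.0.0.0 upwards come out negative; we use sprintf with %u to ensure the result is always unsigned: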
<?php
$ip_num = sprintf("%u", ip2long($_SERVER['REMOTE_ADDR']));
?>
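
To see what ip2long is doing, the conversion can be reproduced by hand: each of the four octets is weighted by a power of 256. Here is a rough shell sketch; ip_to_int is a hypothetical helper name, and it assumes a well-formed dot-decimal IPv4 address with no leading zeros in any octet.

```shell
# Hypothetical helper: convert a dot-decimal IPv4 address to its decimal form.
# Assumes four plain decimal octets (no validation, no leading zeros).
ip_to_int() (
  IFS=.        # split the argument on dots (subshell body, so IFS is not leaked)
  set -- $1    # the four octets become $1..$4
  echo $(( $1 * 16777216 + $2 * 65536 + $3 * 256 + $4 ))
)

ip_to_int 91.193.20.0   # prints 1539380224, the start of a Swiss range in the sample rows
```

The result matches the start column for 91.193.20.0 in the sample output earlier, which is a handy sanity check on the imported data.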

Once we have $ip_num we can create our MySQL query:
$qry = "SELECT '' FROM ch_ip WHERE $ip_num BETWEEN start AND end";
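
BETWEEN is inclusive at both ends, so the query matches whenever start <= $ip_num <= end holds for some row. The same comparison can be tried outside the database against a plain file of start,end pairs. This is purely illustrative: the in_ranges helper and the ranges file are hypothetical and not part of the setup described here.

```shell
# Hypothetical helper: succeed (exit 0) if the number $1 lies inside any
# "start,end" range listed in the file $2, mirroring BETWEEN start AND end.
in_ranges() {
  awk -F, -v n="$1" '
    ($1 + 0) <= (n + 0) && (n + 0) <= ($2 + 0) { found = 1; exit }
    END { exit !found }
  ' "$2"
}

# Usage, with one of the Swiss ranges from the sample output above:
#   printf '1539380224,1539381247\n' > ranges.csv
#   in_ranges 1539380500 ranges.csv && echo "Swiss" || echo "not Swiss"
```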

All we need to know is whether the query returns at least one row. If it does, the visitor is Swiss and we’ll set their locale accordingly. Obviously we don’t want to run this query on every page, so after the first lookup we’ll store the result in a session variable. The final code looks like this:
<?php
session_start();
// isset() replaces session_is_registered(), which was removed in PHP 5.4
if (!isset($_SESSION['locale'])) { // check if the session variable has already been set first
    // mysqli_* replaces the old mysql_* functions, which were removed in PHP 7
    mysqli_report(MYSQLI_REPORT_OFF); // return false on errors instead of throwing exceptions
    $con = mysqli_connect('localhost', 'geo_user', 'geo_password', 'geo_ip');
    if ($con) {
        $ip_num = sprintf("%u", ip2long($_SERVER['REMOTE_ADDR']));
        $result = mysqli_query($con, "SELECT '' FROM ch_ip WHERE $ip_num BETWEEN start AND end");

        // mysqli_query() returns false on failure, so check it before counting rows
        if ($result && mysqli_num_rows($result) > 0) {
            $_SESSION['locale'] = "ch";
        }
        else { $_SESSION['locale'] = "de"; }
    }
    else { $_SESSION['locale'] = "de"; } // If no db connection can be made then set their locale to German
}
?>

Written by James

June 7th, 2008 at 6:17 pm

Posted in MySQL, PHP

mysqlslap for MySQL 5.0

mysqlslap is a very useful tool for emulating client load, something that would normally be very difficult in the real world (until you go live!). The binary is bundled into the MySQL 5.1 releases (still at Release Candidate stage) but not 5.0, so the only option is to compile from 5.1 source and then you can use it with your 5.0 server installation.

Fortunately, when compiling you can save some time by configuring with the --without-server option, which will compile just the client tools (mysqldump, mysqlbinlog, mysql CLI, etc.). The following worked for me on an installation of CentOS 5 (64-bit), on the same machine that runs my 5.0 server.

1. Download the MySQL 5.1 source code in compressed tar format (.tar.gz). Go to the download page, or do a wget (in my case from the Mirror Service from the University of Kent, UK). Version 5.1.23-rc was current at the time of writing:
wget http://dev.mysql.com/get/Downloads/MySQL-5.1/mysql-5.1.23-rc.tar.gz/from/http://www.mirrorservice.org/sites/ftp.mysql.com/

2. Unpack the archive:
tar -xvvzf mysql-5.1.23-rc.tar.gz

3. Install required development packages and compile:
cd mysql-5.1.23-rc
yum install glibc gcc libtool ncurses-devel gcc-c++
./configure --without-server --disable-shared
make
make install

The ‘make’ stage will take some time. Why configure with --disable-shared? I wanted my binary to be portable so I could share amongst a group of similar-spec machines, and I can’t be sure that the shared libraries are all in the same location. See the MySQL Installation pages for more details on configuration options; you may want to make use of --libdir=... instead.

4. Copy/move the mysqlslap binary (it will be in /usr/local/bin by default) to wherever the rest of your v5.0 client binaries are. For example:
cp /usr/local/bin/mysqlslap /usr/local/mysql/bin/mysqlslap

5. Test it out! Run something like the following as a quick test, and check out the MySQL Documentation for a more detailed usage guide:
./mysqlslap --user=root --auto-generate-sql --concurrency=100 --iterations=5
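
A single run only shows how the server copes at one load level. One possible way to compare several levels is to loop over a list of concurrency values, reusing only the options from the quick test. This is a sketch rather than a recommended benchmark: the slap_series function name and the MYSQLSLAP variable (which lets you point at the binary's full path) are hypothetical conveniences.

```shell
# Hypothetical wrapper: run the quick test above once per concurrency level.
# MYSQLSLAP may be set to the full path of the binary; it defaults to "mysqlslap".
slap_series() {
  for c in "$@"; do
    echo "== concurrency: $c =="
    "${MYSQLSLAP:-mysqlslap}" --user=root --auto-generate-sql \
      --concurrency="$c" --iterations=5 || return 1
  done
}

# Usage:
#   MYSQLSLAP=/usr/local/mysql/bin/mysqlslap
#   slap_series 10 50 100
```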

Since compiling I have thrown the binary around various other Intel machines (and VMs) running CentOS, without any problems.

Written by James

April 2nd, 2008 at 3:18 pm

Posted in Linux, MySQL

How to Access MySQL with an SSH Tunnel

This is a particularly useful method for gaining access to your remote MySQL databases, such as those held on a web hosting account where the MySQL port may not be open. You can use this method to gain access to other services too (SMTP, IMAP, FTP), but in this post I’ll explain how I use it in combination with MySQL Query Browser to administer my databases with a GUI. You need to have SSH access to your remote server (normally over port 22) for this to work. My instructions are for Ubuntu but they’re easily adapted to other distros, Mac OS X, and Windows (just download an SSH client).

Run sudo apt-get install ssh if it isn’t installed already, which will install several SSH connectivity tools (more info here). Query Browser is an excellent tool with which to run queries, updates, create views and stored procedures, and loads more besides. Run sudo apt-get install mysql-query-browser. Now to create the SSH tunnel by using port forwarding; here’s how I access a MySQL instance on my local network:
james@james-laptop:~$ ssh -L 3307:localhost:3306 root@192.168.1.211
root@192.168.1.211's password:

Essentially this forwards all traffic on port 3307 on the local machine (james-laptop) to port 3306 on 192.168.1.211. The general format is:
ssh -L localport:host:hostport user@remotehost
Note that in my example I used localhost as the host, but this name is resolved on the remote side, after the connection has been made to 192.168.1.211, and so it refers to that machine rather than to james-laptop.

Effectively we can now access port 3306 (the default MySQL port) on 192.168.1.211 via port 3307 on james-laptop, even though port 22 (the SSH port) is the only port open on 192.168.1.211. Keep the connection open (i.e. don’t close the terminal) and open Query Browser. In the connection dialogue set the hostname to 127.0.0.1 and the port to 3307 and enter a username/password as required. Hit connect and you should see a graphical representation of your database(s). Note that on *nix OSs (including Mac OS X) you must use 127.0.0.1 rather than ‘localhost’, otherwise the MySQL client will connect via the local Unix socket file rather than TCP and bypass the tunnel.
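
The tunnel can also be checked without a GUI. Here is a minimal sketch, assuming the mysql command-line client is installed on the local machine. The function names and the SSH/MYSQL override variables are hypothetical; -N (do not run a remote command) and -f (go to the background after authentication) are standard OpenSSH options.

```shell
# Hypothetical helpers around the commands from this post.
# open_tunnel LOCALPORT HOSTPORT USER@REMOTE: forward a local port to the
# remote machine's own MySQL port, without opening a remote shell.
open_tunnel() {
  "${SSH:-ssh}" -f -N -L "$1:localhost:$2" "$3"
}

# mysql_via_tunnel LOCALPORT USER: connect through the tunnel over TCP.
# 127.0.0.1 is used on purpose; "localhost" would not go through the tunnel.
mysql_via_tunnel() {
  "${MYSQL:-mysql}" --host=127.0.0.1 --port="$1" --user="$2" -p
}

# Usage:
#   open_tunnel 3307 3306 root@192.168.1.211
#   mysql_via_tunnel 3307 root
```

Because -f puts ssh in the background, the tunnel stays up after the terminal returns; kill the ssh process to close it.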

Written by James

September 10th, 2007 at 10:18 pm

Posted in Linux, MySQL
