James Little

Detect visitor’s country with PHP & MySQL

14 comments

Last update: December 24th, 2011


{lang: 'en-GB'}

Of course you don’t have to use PHP, or MySQL for that matter. But it’s my method of choice for most web apps, and it’s also a pretty common one. The general gist is to do a lookup on a database of geographical locations for IP addresses, having taken your visitor’s IP address from the PHP superglobal array $_SERVER. Yes there are caveats: the database is not 100% complete/accurate, and some ISPs (like AOL!) use proxies across different countries so the user will appear to come from somewhere other than their true country of origin. Boo hoo, let’s do it anyway; according to MaxMind, their free(!) GeoLite Country database is 99.3% accurate, and their licensed version, 99.8%.

The database is released monthly in CSV format, so I’ll have to import it into MySQL using mysqlimport, or LOAD DATA INFILE. I prefer the first option. Those of you that are MySQL fans probably know that there is a CSV storage engine available, but that’s only in version 5.1 which is still in Release Candidate stage, so I’ll stick to mysqlimport.

Download the GeoLite database from Maxmind, extract the CSV file and rename it to something more convenient; mysqlimport uses the filename for the name of the MySQL table it imports into:

root@jim-desktop:/home/jim/data# wget http://www.maxmind.com/download/geoip/database/GeoIPCountryCSV.zip
root@jim-desktop:/home/jim/data# unzip GeoIPCountryCSV.zip
root@jim-desktop:/home/jim/data# mv GeoIPCountryWhois.csv geo_csv.csv

Before we import the data into MySQL we need to create a table for it to go into. The following DDL accurately describes the structure of the data. Obviously create a new database if you want; here I have one called geo_ip:

CREATE TABLE  `geo_ip`.`geo_csv` (
 `start_ip` char(15) NOT NULL,
 `end_ip` char(15) NOT NULL,
 `start` int(10) unsigned NOT NULL,
 `end` int(10) unsigned NOT NULL,
 `cc` char(2) NOT NULL,
 `cn` varchar(50) NOT NULL
 ) ENGINE=MyISAM DEFAULT CHARSET=latin1

If you look at the data in the CSV file you’ll see it’s delimited by commas and the text is qualified by double quotes. With that in mind, we use the following statement to import the data:

root@jim-desktop:/home/jim/data# mysqlimport --fields-terminated-by=","  --fields-optionally-enclosed-by="\"" --lines-terminated-by="\n" geo_ip /home/jim/data/geo_csv.csv
geo_ip.geo_csv: Records: 102957  Deleted: 0  Skipped: 0  Warnings: 0

If the mysqlimport binary is not in your environment path, use locate to find it. If you don’t have it at all then use LOAD DATA INFILE.

So we now have the raw data imported into MySQL, but how do we use it? First let’s take a look at the data:

mysql> select * from geo_csv order by rand() limit 10;
 +---------------+----------------+------------+------------+----+----------------+
 | start_ip      | end_ip         | start      | end        | cc | cn             |
 +---------------+----------------+------------+------------+----+----------------+
 | 207.209.7.0   | 207.209.7.255  | 3486582528 | 3486582783 | AU | Australia      |
 | 79.99.200.0   | 79.99.207.255  | 1331939328 | 1331941375 | BE | Belgium        |
 | 217.27.192.0  | 217.27.207.255 | 3642474496 | 3642478591 | DE | Germany        |
 | 194.59.180.0  | 194.59.180.255 | 3258692608 | 3258692863 | FR | France         |
 | 81.16.160.0   | 81.16.175.255  | 1360044032 | 1360048127 | SE | Sweden         |
 | 62.23.198.192 | 62.23.198.207  | 1041745600 | 1041745615 | GB | United Kingdom |
 | 64.49.231.240 | 64.49.232.15   | 1077012464 | 1077012495 | US | United States  |
 | 83.217.68.32  | 83.217.68.95   | 1406747680 | 1406747743 | BE | Belgium        |
 | 91.193.20.0   | 91.193.23.255  | 1539380224 | 1539381247 | CH | Switzerland    |
 | 194.37.249.0  | 194.37.249.255 | 3257268480 | 3257268735 | SE | Sweden         |
 +---------------+----------------+------------+------------+----+----------------+
 10 rows in set (0.22 sec)

The table is essentially a big list (~103k records) of IP ranges, given in both dot-decimal and decimal form. The decimal form is the most efficient to search on as the datatype int requires less memory than char, and with integers we can reliably make use of operators such as greater than, less than, BETWEEN, etc. Exactly how you will use the data will depend on your scenario. I began looking into this when I was working on a German website that wanted to know when a visitor was from Switzerland, so it could display prices in CHF rather than Euros. So in fact, all I needed to know was whether the visitor was Swiss, and if they were from any other country, they would see Euros. So the only columns I’ll need from the table are start and end, and all the rows belonging to Switzerland, or ‘CH’. So to make searches more efficient I’ll grab only the data I need and put it in a new table called ch_ip:

mysql> create table ch_ip as select start,end from geo_csv where cc='CH';
Query OK, 2023 rows affected (0.05 sec)
Records: 2023  Duplicates: 0  Warnings: 0

Great, that’s cut the data from nearly 103 thousand records to just over 2 thousand, and we’ve also lopped off four columns. I’ll now be searching on a table that’s 18K in size, rather than the original 5.3MB. Maybe you need every row of data in your scenario, but in many cases you only need a fraction. And in any case, you really don’t need the start_ip and end_ip columns (as you will see shortly). You could also split off the country names (cn column) into another table so that cc becomes a foreign key. Or you could ditch the country names completely and create an array of CC => CN in your application; there are only 239 unique CCs after all:

mysql> select count(distinct cc) from geo_csv;
 +--------------------+
 | count(distinct cc) |
 +--------------------+
 |                239 |
 +--------------------+
 1 row in set (0.05 sec)

So I have my table of 2,023 Swiss IP ranges. Now I need to grab a visitor’s IP address and convert it into decimal notation. For this we can use PHP’s built-in function ip2long. We use sprintf to ensure the result is always unsigned:

<?php  $ip_num = sprintf("%u", ip2long($_SERVER['REMOTE_ADDR']));  ?>

Once we have $ip_num we can create our MySQL query:

$qry = "SELECT '' FROM ch_ip WHERE $ip_num BETWEEN start AND end";

All we need to know is whether the query returns > 0 rows. If it does, then the visitor is Swiss and we’ll set their locale appropriately. Obviously we don’t want to be performing this query on every page, so once it has been performed once for the visitor we’ll set a session variable. So the final code looks like this:

<?php 

session_start();
if (!session_is_registered("locale")) { //check if the session variable has already been set first
    $con = mysql_connect('localhost', 'geo_user', 'geo_password');
    if ($con) {
        $ip_num = sprintf("%u", ip2long($_SERVER['REMOTE_ADDR']));
        mysql_select_db("geo_ip", $con);
        $result = mysql_query( "SELECT '' FROM ch_ip WHERE $ip_num BETWEEN start AND end" );
        $num_rows = mysql_num_rows($result);
        if ($num_rows > 0) {
            $_SESSION['locale'] = "ch";
        }
        else { $_SESSION['locale'] = "de"; }
    }
    else { $_SESSION['locale'] = "de"; //If no db connection can be made then set their locale to German }
};

?>
{lang: 'en-GB'}

Written by James Little

June 7th, 2008 at 6:17 pm

Posted in MySQL,PHP

Tagged with , ,

  • http://steven.macintyre.name Steven Macintyre

    Good post mate!

  • patriot

    great post thats what i looking for but still have a question there is a way to extract the city and stats?

  • http://www.jameslittle.me.uk James Little

    In this example I only needed country info, but Maxmind do have a city
    database as well. Check 'csv_samples' and 'database' in http://www.maxmind.com/download/geoip/
    for an idea of what the data looks like.

  • Sammy

    This has been so helpful, thank you so much for all your work.

  • http://buildegg.com/ be

    Excellent post. Now I can cater the site to my northern friends, eh?

  • bruno

    nice!

  • rendra_mm2

    Gud

  • http://twitter.com/abyss__ Muneeb Shahid

    wow nice exactly what i needed thanks

  • Gamble Simon

    In case you just want to handle any country….

    $result = mysql_query(“SELECT * FROM geo_csv WHERE $ip_num BETWEEN start AND end”);
    $num_rows = mysql_num_rows($result);
    while($row = mysql_fetch_array($result)) {echo “Country: “.$row['cn'];}

  • Koen

    Thanks m8 very usefull stuff!

  • Guest

    Thanks, this really helped me.

  • Nazmul5050

    there are only few countries list.  how will i get all the ip list of all the counties?

  • http://www.seekphp.com/country-culture-flag-database-export/ country culture flag database export? | SeekPHP.com

    [...] a look at the FamFamFam set found here. These flags fit in perfectly with the country codes that GeoIP has, however these are two character codes (GB, US, FR, etc), not exactly what you were looking [...]

  • Ecoluxhotels

    You sir, rule the school.