Philipp's Computing Blog

Success is about speed and efficiency

Piwik Plugin IntranetSubNetwork: Show IPv4 vs. IPv6 Statistics

I found an interesting plugin called IntranetSubNetwork on the Piwik trac ticket #1054. I was thinking that this would be perfect to adopt it to my needs to determine between IPv4 and IPv6 users on my site. Those needs were already specified in a feature suggestion for Piwik.

So I took the plugin and put it on a Github repository: pklaus / IntranetSubNetwork.

With the plugin you're able to assign each visitor to a 'network category' according to the range his or her IP belongs to. The network category name will then be stored in the Piwik database in the column location_ip of the table piwik_log_visit.

Here is a screenshot of the Piwik dashboard widget:
Screenshot of the Piwik Plugin IntranetSubNetwork showing IPv4 vs. IPv6 visits

More information (incl. on how to install) can be found in the readme file of the plugin.

Updating previous visits

To assign network categories to the previous visits (the plugin just does this for new visitors), you have to use SQL command to update them. To achieve this, you can use this Python script. It asks you for a 'Name' and for the networks to label with this name and returns two pieces of information:

  • It gives you the lines of PHP code for the plugin itself (to match new visits)
  • and it gives you the SQL command to update previous visits.

Before writing this script, I was playing with the following SQL commands on my installation:

-- Show all IPv4 addresses in the Piwik logs:
SELECT INET_NTOA(conv(hex(location_ip), 16,10) ) FROM `piwik_log_visit`
LIMIT 150000,50;
-- 33.26.129.188
-- 236.100.156.17
-- NULL
-- 246.171.79.197
-- ...

-- OK, some of the entries were NULL (could not be converted using INET_NTOA)
--- what are they like:
SELECT hex(location_IP) FROM piwik_log_visit
WHERE INET_NTOA(conv(hex(location_ip), 16,10) ) IS NULL -- only the case for IPv6
LIMIT 1,50;
-- 7FFFFFFFFFFF0000
-- 7FFFFFFFFFFF0000
-- ...
-- I think this is a result of me messing up the DB previously.
-- ( I was trying to anonymize IPs but messed up the IPv6 entries. )

-- All right, so do we also have some valid IPv6 addresses?
SELECT hex(location_IP) FROM piwik_log_visit
WHERE INET_NTOA(conv(hex(location_ip), 16,10) ) IS NULL
  AND hex(location_IP) NOT LIKE '7FFFFFFFFFFF0000'
LIMIT 1,50;
-- 20018C01AD010000FE23723EF24DA2FF
-- 2A0280006B00040FFE63123A8129ADDF
-- 200180108DFE00013FB6C10219B9FFFE
-- ...

-- OK. And how many valid IPv6 addresses are there in Piwik's visits table?
SELECT count(location_ip) FROM piwik_log_visit
WHERE INET_NTOA(conv(hex(location_ip), 16,10) ) IS NULL
  AND hex(location_IP) NOT LIKE '7FFFFFFFFFFF0000';
-- -> 3591 rows

-- And how many invalid IPv6 addresses?
SELECT count(location_ip) FROM piwik_log_visit
WHERE hex(location_IP)='7FFFFFFFFFFF0000';
-- -> 338 rows

-- OK. Now let's set the network name to 'Global IPv6' for
-- those visiting with an IPv6 address:
UPDATE `piwik_log_visit`
SET `location_IntranetSubNetwork`='Global IPv6'
WHERE INET_NTOA(conv(hex(`location_ip`), 16,10) ) IS NULL;
-- -> 3929 rows affected


-- and set the network name to 'Global IPv4' for
-- those visiting with an IPv4 address:
UPDATE `piwik_log_visit`
SET `location_IntranetSubNetwork`='Global IPv4'
WHERE INET_NTOA(conv(hex(`location_ip`), 16,10) ) IS NOT NULL;
-- -> 172837 rows affected

-- If you want to match the network 141.2.0.0/16 (University of Frankfurt)
-- You can check how many rows will be updated and if your query is fine:
SELECT count(location_ip) FROM piwik_log_visit
WHERE INET_NTOA(conv(hex(location_ip), 16,10) ) IS NOT NULL
AND conv(hex(location_ip), 16,10) >= INET_ATON('141.2.0.0')
AND conv(hex(location_ip), 16,10) <= INET_ATON('141.2.255.255');
-- -> 16 rows affected

-- And then assign the network name 'University Frankfurt' to those:
UPDATE `piwik_log_visit`
SET `location_IntranetSubNetwork`='University Frankfurt'
WHERE INET_NTOA(conv(hex(location_ip), 16,10) ) IS NOT NULL
AND conv(hex(location_ip), 16,10) >= INET_ATON('141.2.0.0')
AND conv(hex(location_ip), 16,10) <= INET_ATON('141.2.255.255');

-- If you have a couple of (missed) values where the column
-- location_IntranetSubNetwork is NULL, you can set them via
UPDATE `piwik_log_visit`
SET `location_IntranetSubNetwork`='Global IPv4'
WHERE INET_NTOA(conv(hex(`location_ip`), 16,10) ) IS NOT NULL
AND `location_IntranetSubNetwork` IS NULL;
UPDATE `piwik_log_visit`
SET `location_IntranetSubNetwork`='Global IPv6'
WHERE INET_NTOA(conv(hex(`location_ip`), 16,10) ) IS NULL
AND `location_IntranetSubNetwork` IS NULL;

OK, now that we've updated the visits table, we need to to delete Piwik's archive tables in order to get the updated values displayed. (Read How do I force the reports to be re-processed from the logs?. )

SHOW TABLES LIKE 'piwik_archive_%';

-- we can use SQL to help us prepare the statements to delete the tables:
SELECT concat('DROP TABLE ', table_name,';') FROM information_schema.tables
WHERE table_name LIKE 'piwik_archive_%';
-- copy the output, and run it again as SQL to delete all archive tables:
DROP TABLE piwik_archive_blob_2012_01;
DROP TABLE piwik_archive_blob_2012_02;
-- ... and so on

Then let Piwik recalculate the archived reports using the cronjob (or by visiting the Piwik site if you haven't set up a cron job). If you're using the cronjob, the command should look like this (where 4838400 are 8 weeks in seconds):

/usr/bin/php5 /var/www/path/to/piwik/misc/cron/archive.php --url=http://piwik.example.com/ --force-all-periods=4838400

Developer Notes

Here are some notes for those who want to work on this plugin:

If you messed up something and want to start from scratch with the data this plugin produced, you can find all the values added to Piwik:

SELECT `location_IntranetSubNetwork` FROM `piwik_log_visit`
WHERE `location_IntranetSubNetwork` IS NOT NULL LIMIT 0,1000;

To remove all network entries, run:

UPDATE `piwik_log_visit`
SET `location_IntranetSubNetwork`=NULL
WHERE `location_IntranetSubNetwork` IS NOT NULL;

References