Pigeon rank plugin for WordPress – part 2

In the previous post, I created a basic plugin, that sets a cookie to track the user’s whereabouts (on the site). It then shows the user id, URL and time at the bottom of every post on the site.

This time I will add code for:

  • Logging all user visits
  • Convert visits into trails
  • Compute and log usefull statistics based on trails

Database

In the init() function of the plugin, I check a custom WordPress variable to check the current version of the plugin. If the version is lower than the current version, then the plugin should upgrade the database, or create if the version is 0.0.

$version = get_option('pigeon_rank_version', '0.0');
if ($version == '0.0') {
	$sql = "CREATE TABLE $landings_table_name (
	uuid varchar(64) NOT NULL,
	id int NOT NULL,
	time int NOT NULL,
	KEY (uuid),
	KEY (time)
	) $charset_collate;";
	
	dbDelta($sql);

When that is done, the version can be updated.

$version = '0.01';
update_option('pigeon_rank_version', $version);	

The main loop, which runs when content is loaded, adds the visit to the landings table. It also does some cleanup to keep the table as small as practically possible. This is the code to insert the row in the database.

$wpdb->insert($landings_table_name, array('uuid' => $uuid, 
										  'id' => $id,
										  'time' => $now));

Then when looking at the table, after a couple of page visits from a single user, the data looks like this. You can see the cookie ID, the WordPress page numbers and the time in seconds since epoch.

uuid,id,time
62fe93bdcf60f,75,1660851134
62fe93bdcf60f,60,1660851134
62fe93bdcf60f,54,1660851134
62fe93bdcf60f,52,1660851134
62fe93bdcf60f,50,1660851134
62fe93bdcf60f,8,1660851134

The trail – core functionality

Now I can see which pages a user visits in succesion and how much time they spend. Then, with enough samples, I can figure out which page I can present to the visitor that is most likely to keep them interrested. Because they are most relevant to each other.

For this I need to know how long they spend on a page, and most important, which page send them there.

First I create a table for this, in the same init() function with version checking etc.

$sql = "CREATE TABLE $grid_table_name (
id int NOT NULL,
src int NOT NULL,
time float NOT NULL,
datapoints int NOT NULL,
KEY (id, src)
) $charset_collate;";

dbDelta($sql);

Then when visiting a post, I retrieve all previous visits from this visitor. Also, I put the ID’s in a separate array for easy checking.

// Retrieve previous landings from this uuid
$sql = "SELECT `id`, `time` FROM `$landings_table_name` WHERE `uuid` = '$uuid' ORDER BY `time` DESC;";
$results = $wpdb->get_results($wpdb->prepare($sql));

// Create list of posts already visited
$visited = array();
foreach ($results as $v) {
	$visited[] = $v->id;
}

Then I check if we have at least three visits. This is important because this visit and the previous give a time difference of time spend at the previous page. And the page before that is the page that send them.

// If we have three results then we can add time to our grid
if (count($results) >= 3) {
	$spend = min($results[0]->time - $results[1]->time, $max_time);
	$here = $results[0]->id;
	$id = $results[1]->id;
	$src = $results[2]->id;

Then I need to check if all the page ID’s are unique, otherwise the user hit F5 or navigated backwards.

if ($id != $src && $id != $here && $src != $here) {	# User probably hit F5 or something

We are keeping statistics for the combination of which page send you where, so I get this combination if it is already there. If not, I create it with default values.

// Fetch gridpoint if already exists
$sql = "SELECT `time`, `datapoints` FROM `$grid_table_name` WHERE `id` = $id AND `src` = $src;";
$rresults = $wpdb->get_results($wpdb->prepare($sql));
		
$time = 0;
$datapoints = 0;

if (count($rresults) == 1) {
	$time = $rresults[0]->time;
	$datapoints = $rresults[0]->datapoints;
} else {
	$wpdb->insert($grid_table_name, array('time' => $time, 
									  'datapoints' => $datapoints,
									  'src' => $src,
									  'id' => $id)); 
}

Because pages on the site can change, and user interest can also, I decided to keep a moving average of time spend on the page. Normally, with a moving average, one has to keep track of all the previous values withing the moving window. I don’t intend to store that much data, so I created a sort-of-moving-average. After that, I update the table.

// Do a moving average addition
$time *= $datapoints;
$time += $spend;
$datapoints += 1;
$time /= $datapoints;

$datapoints = min($datapoints, $moving_avgerage_length);

// Create or update in grid
$sql = "UPDATE `$grid_table_name` SET `time`=$time, `datapoints`=$datapoints WHERE `id` = $id AND `src` = $src;";
$wpdb->query($wpdb->prepare($sql));

That’s it for now, the plugin can now track user trails and keep track of what page after the current one leads to how-much-time-spend-more on the site.

More on this in a future post.

Leave a Reply

Your email address will not be published. Required fields are marked *