Archive for September, 2004

Sep 28 2004

Image Metadata Tagging Patent

Published by Ian Davis under Uncategorized

Hmmm. Just stumbled across United States Patent Application 0030033296 which claims:

Methods and apparatus for managing, finding and displaying objects such as digital images. Objects are tagged (“associated”) with descriptive textual and numeric data (“metadata”), and stored in a relational database from which they can be selected, sorted, and found. Tags can be defined by name, tag type, and associated attributes. Objects can be tagged by dropping a tag onto the object, or relating a database record for the tag to a database record for the object. Tagged objects can be searched for and displayed according to the degree to which their metadata matches the search criteria. Visual cues can indicate whether displayed objects match all, some but not all, or none of the search criteria. Database object distributions can be displayed as histograms or scatter plots, including timelines, calendars or maps. Object distributions can be used to search for objects or to limit search results for a previous search.

The claimants appear to be ex-Apple and now part of Fotiva. This patent application appears to tread heavily on the toes of Flickr and Picasa.

Comments Off

Sep 27 2004

Zero Comment Spam

Published by Ian Davis under Uncategorized

Yes it’s true, I’ve had zero comment spam since implementing the keyword scheme. I’ve had quite a few comments, even one from my Dad, which demonstrates that entering the keyword isn’t too onerous. As promised, here’s what I did.

First of all I added the ordinalSuffix function below at the end of wp-includes/template-functions-comments.php (the first three lines are the existing end of the file, shown for context):


	echo '</rdf:RDF>';
	}
}

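function ordinalSuffix($number) {
    $suffixes = array("th", "st", "nd", "rd");

    // The last digit picks the suffix; 0 and 4-9 all take "th".
    $suffixIndex = $number % 10;
    $lastTwo = $number % 100;
    // 11, 12 and 13 (and 111, 112, ...) are exceptions that also take "th".
    if ($suffixIndex > 3 || ($lastTwo >= 11 && $lastTwo <= 13)) {
        $suffixIndex = 0;
    }

    return $suffixes[$suffixIndex];
}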

?>

Then, I modified wp-comments.php by adding the phrase-word paragraph below, between the existing comment textarea and the submit button:

<p>
  <label for="comment"><?php _e("Your Comment"); ?></label>
  <br />
  <textarea name="comment" id="comment" cols="70" rows="4" tabindex="4"></textarea>
</p>

<p>
  <label for="phraseword">As a comment spam precaution, please type the
    <?php _e( (1 + ($id  % 16)) . ordinalSuffix(1 + ($id  % 16)) ) ?>
    word of the following phrase: <br />
    <q>I know a bank where the wild thyme blows,
    where oxlips and the nodding violet grows</q>
  </label>
  <br />
  <input type="text" name="phraseword" id="phraseword" size="28" tabindex="5" />
</p>

<p>
  <input name="submit" type="submit" tabindex="6" value="<?php _e("Say It!"); ?>" />
</p>

Feel free to choose your own pass phrase. Finally, I added the following to wp-comments-post.php:

$comment = trim($_POST['comment']);
$comment_post_ID = intval($_POST['comment_post_ID']);
$user_ip = $_SERVER['REMOTE_ADDR'];

$phrase = 'I know a bank where the wild thyme blows, where oxlips and the nodding violet grows';
// Split on runs of whitespace, commas and full stops to get the individual words.
$keywords = preg_split('/[\s,.]+/', $phrase);

$phraseword = trim($_POST['phraseword']);
if ( empty($phraseword) ) {
	die( __("Sorry, you didn't enter the phrase word.") );
}
else {
	// The expected word is chosen by post ID, matching the prompt shown in wp-comments.php.
	if ($phraseword != $keywords[$comment_post_ID % count($keywords)]) {
		die( __("Sorry, you didn't enter the correct phrase word.") );
	}
}

if ( 'closed' ==  $wpdb->get_var("SELECT comment_status FROM $tableposts WHERE ID = '$comment_post_ID'") )

Make sure that your phrase doesn’t have any punctuation at the end; otherwise the code that splits the phrase into words will add an extra empty word at the end and confuse the check that the poster entered the correct word.
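One way to sidestep the trailing-punctuation problem, offered as a minimal sketch rather than the code running on this site, is to pass PREG_SPLIT_NO_EMPTY to preg_split so that empty pieces are dropped. The helper names phrase_words and expected_word are made up for illustration, and the example phrase deliberately ends with a full stop.

```php
<?php
// Hypothetical helpers for illustration; they are not part of WordPress.

// Split a pass phrase into words, dropping any empty pieces that
// leading or trailing punctuation would otherwise produce.
function phrase_words($phrase) {
    return preg_split('/[\s,.]+/', $phrase, -1, PREG_SPLIT_NO_EMPTY);
}

// Pick the word a commenter must type for a given post ID.
function expected_word($phrase, $post_id) {
    $words = phrase_words($phrase);
    return $words[$post_id % count($words)];
}

$phrase = 'I know a bank where the wild thyme blows, where oxlips and the nodding violet grows.';

// Without the flag, the trailing full stop yields an extra empty word.
$naive = preg_split('/[\s,.]+/', $phrase);
echo count($naive), "\n";                 // 17, the last element is ''
echo count(phrase_words($phrase)), "\n";  // 16
echo expected_word($phrase, 810), "\n";   // 810 % 16 = 10, so "oxlips"
```

With the flag in place the word count stays at 16 whether or not the phrase ends in punctuation, so the prompt and the check keep agreeing.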

There it is. Zero comment spam. For now at least.

3 responses so far

Sep 27 2004

Dad on the Great Wall of China

Published by Ian Davis under Personal

Dad on the Great Wall of China

That’s my Dad on the Great Wall of China. He’s just completed a five-day charity walk of the wall, sleeping in tiny tents in local school playgrounds and climbing gradients of up to 70%. My Dad, in his sixties and the oldest member of the party, has never been what you would call fit. However, over the past six months he walked and walked in preparation and is now probably the fittest he’s ever been. Well done Dad, I’m proud of what you’ve achieved! This is what he had to say about the hike:

The walk itself was gruelling, much harder than I expected. Every day the temperature was about 24 degrees C and water consumption was averaging five litres a day per person. There were two groups of twelve people led by two experienced leaders and accompanied by a British doctor, local trackers and translators. We began each day at dawn and after breakfast had warm up exercises and some Tai Chi. The local Chinese would stand and stare, and usually giggle, at our motley group made up of all shapes and sizes. Then we would begin our trekking, carrying rucksacks heavy with litres of water, and take a slow steady pace along the Great Wall.

The terrain was remote and hilly and gradients of 70 degrees were not uncommon. The steps of the wall were not consistent, sometimes very high at other times very low. A large part of the wall was derelict and like rubble, and along other parts we had to make detours across remote countryside. Several trekkers became dehydrated or suffered exhaustion and had to bail out. We covered approximately 75 kilometres in five days and climbed the highest tower, Wanjing Lou (960 metres). The challenge was worth it in many ways, the spectacular views, meeting the local Chinese people and the camaraderie of the fellow trekkers.

The main purpose of the walk was to raise vitally needed funds for the British Red Cross and this was also successfully accomplished. Afterwards, there was time to relax with the sightseeing that included Mao’s mausoleum in Tiananmen Square where we lined up with the Chinese visitors. Was it him or was it wax? That appeared to be the question on everyone’s mind. On leaving, we encountered rows of stall holders all selling Mao memorabilia, a nice little earner. I was particularly impressed with the Forbidden City which is unique and covers a vast area.

It was a fantastic trip and most rewarding. And I feel fitter for having done it.

One response so far

Sep 27 2004

WordPress Hack for Slim Pages

Published by Ian Davis under Uncategorized

Here’s the PHP file I use to generate my slim page. It’s called wp-slim.php and lives in the same directory as index.php.


<?php
if (!isset($feed)) {
    $blog = 1;
    $doing_rss = 1;
    require('wp-blog-header.php');
}
$more = 1;
$charset = get_settings('blog_charset');
if (!$charset) $charset = 'UTF-8';
header('Content-type: text/html', true);

?>
<?php echo '<?xml version="1.0" encoding="' . $charset . '"?'.'>'; ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <base href="<?php bloginfo_rss('url') ?>"/>
    <title><?php bloginfo_rss('name') ?></title>
    <meta name="author" content="Ian Davis" />
    <meta name="copyright" content="Copyright (c) 1999-<?php _e(gmdate("Y")) ?> Ian Davis" />
    <meta name="description" content="<?php bloginfo_rss("description") ?>" />
  </head>
  <body>
    <?php $items_count = 0; if ($posts) { foreach ($posts as $post) { start_wp(); ?>
    <div class="entry" id="entry<?php _e($post->ID) ?>">
      <h1><a href="<?php permalink_single_rss() ?>"><?php the_title_rss() ?></a></h1>
<?php if (get_settings('rss_use_excerpt')) : ?>
      <div class="content"><?php the_excerpt_rss(get_settings('rss_excerpt_length'), 2) ?></div>
<?php else : ?>
      <div class="content"><?php the_content('', 0, '') ?></div>
<?php endif; ?>

    </div>
    <?php $items_count++; if (($items_count == get_settings('posts_per_rss')) && empty($m)) { break; } } } ?>
  </body>
</html>

I also changed wp-feed.php to dispatch requests for slim pages (the 'slim' case is the addition; the 'rss2' case is shown for context):

    case 'rss2':
        require('wp-rss2.php');
        break;
    case 'slim':
        require('wp-slim.php');
        break;
    }
}

I changed the rewrite rule in my .htaccess to map index.slim to wp-feed.php:

RewriteRule ^index\.(feed|rdf|rss|rss2|atom|slim)$ /2004/09/wordpress/wp-feed.php?feed=$1 [QSA]

Note: I’m not using the standard WordPress rewrite rule set. For backwards and future compatibility with other weblog systems I prefer to use file extensions for the various formats of a document.
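To see which requests the rule catches, here is a rough PHP sketch that applies the same pattern with preg_match. It only mimics the pattern and substitution; on the server the work is done by mod_rewrite, and the function name rewrite_target is invented for this example. The dot is escaped so that it matches only a literal "." rather than any character.

```php
<?php
// Hypothetical helper: mimics the RewriteRule pattern above in PHP.
// Returns the rewritten target, or null when the path does not match.
function rewrite_target($path) {
    $pattern = '/^index\.(feed|rdf|rss|rss2|atom|slim)$/';
    if (preg_match($pattern, $path, $matches)) {
        // $1 in the RewriteRule corresponds to $matches[1] here.
        return '/2004/09/wordpress/wp-feed.php?feed=' . $matches[1];
    }
    return null;
}

var_dump(rewrite_target('index.slim'));  // rewritten with feed=slim
var_dump(rewrite_target('index.rss2'));  // rewritten with feed=rss2
var_dump(rewrite_target('index.html'));  // NULL, left for other rules
```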

One response so far

Sep 27 2004

Slim Pages

Published by Ian Davis under Uncategorized

I’m experimenting with a slim page version of this site. Slim Pages are slimmed down versions of standard web pages. The basic rules are:

  1. Slim Pages are a subset of XHTML strict. They cannot contain script/noscript elements, style attributes nor any of the event attributes such as onclick.
  2. The body tag can only contain div tags of class “entry” and an id attribute providing a site-unique identifier for the entry the div contains.
  3. The entry divs contain a h1 heading as the first tag. The heading contains a link to the permalink for the entry. The link text is the title for the entry.
  4. The heading is followed by another div with a class of “content” which contains the content of the entry.

That’s it. Here’s an example:


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
 <head>
  <base href="http://internetalchemy.org/" />
  <title>Internet Alchemy</title>
  <meta name="author" content="Ian Davis" />
  <meta name="copyright" content="Copyright (c) 1999-2004 Ian Davis" />
  <meta name="description" content="Digital explorations and experiments" />
 </head>
 <body>
  <div class="entry" id="entry810">
   <h1>
    <a href="http://internetalchemy.org/2004/09/more-comment-spam">More Comment Spam</a>
   </h1>
   <div class="content">
    <p>
     I'm still getting comment spam, despite the posting timeslot
     idea. Obviously the assumptions I made there were unsound.
     So, here's the supplement that might raise the bar a little. If
     you want to comment on an entry here, you now have to
     enter a particular word of a well known quotation. If you
     don't or enter the wrong word then you get locked out for 10
     seconds (a standard WordPress feature).
    </p>
   </div>
  </div>
 </body>
</html>

Because it’s XHTML it has all kinds of nice properties such as being viewable on smartphones and PDAs. It prints nicely if all you want is the content and it can be styled using CSS. It’s readable by ordinary people with a web browser. The tags it uses are pretty well known by every web developer so it’s quite easy to write, perhaps even using an off-the-shelf authoring program. All the meta and link conventions in HTML headers such as geo location work too. It can be transformed using XSLT into any flavour of RSS or Atom although there are fewer programs that understand those formats than understand Slim Pages.

It has a regular entry structure, which means you could aggregate it, and because the id attributes are site-unique, the aggregator can work out when something new is posted.
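As a rough illustration of that aggregation idea, the sketch below reads a Slim Page with PHP's DOM extension and reports entries whose ids have not been seen before. It is an assumption-laden example, not an existing tool: the function name slim_entries and the $seen array are hypothetical, the sample page is cut down from the example above, and the page is assumed to be well-formed XML.

```php
<?php
// Hypothetical aggregator helper; assumes PHP's DOM extension is available.
// Returns an array of entry id => array('title', 'link', 'content').
function slim_entries($xhtml) {
    $doc = new DOMDocument();
    $doc->loadXML($xhtml);
    $xpath = new DOMXPath($doc);
    $xpath->registerNamespace('x', 'http://www.w3.org/1999/xhtml');

    $entries = array();
    // Rule 2: the body holds only divs of class "entry" with a unique id.
    foreach ($xpath->query('/x:html/x:body/x:div[@class="entry"]') as $div) {
        // Rules 3 and 4: an h1 with the permalink, then a "content" div.
        $link = $xpath->query('x:h1/x:a', $div)->item(0);
        $content = $xpath->query('x:div[@class="content"]', $div)->item(0);
        $entries[$div->getAttribute('id')] = array(
            'title'   => trim($link->textContent),
            'link'    => $link->getAttribute('href'),
            'content' => trim($content->textContent),
        );
    }
    return $entries;
}

$page = <<<XHTML
<html xmlns="http://www.w3.org/1999/xhtml">
 <head><title>Internet Alchemy</title></head>
 <body>
  <div class="entry" id="entry810">
   <h1><a href="http://internetalchemy.org/2004/09/more-comment-spam">More Comment Spam</a></h1>
   <div class="content"><p>I'm still getting comment spam.</p></div>
  </div>
 </body>
</html>
XHTML;

$seen = array('entry809' => true);   // ids recorded on a previous fetch
$fresh = array_diff_key(slim_entries($page), $seen);
foreach ($fresh as $id => $entry) {
    echo $id, ': ', $entry['title'], ' <', $entry['link'], ">\n";
}
```

Keying the result by id means spotting new posts is a plain array comparison against whatever ids were stored last time.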

Slim Pages are just the content without the candy. Some people like candy, some don’t. Now you can choose :)

3 responses so far

Sep 26 2004

More Comment Spam

Published by Ian Davis under Uncategorized

I’m still getting comment spam, despite the posting timeslot idea. Obviously the assumptions I made there were unsound. So, here’s the supplement that might raise the bar a little. If you want to comment on an entry here, you now have to enter a particular word of a well-known quotation. If you don’t, or you enter the wrong word, then you get locked out for 10 seconds (a standard WordPress feature).

(code to follow if it’s successful…)

One response so far

Sep 25 2004

BCG

Published by Ian Davis under Personal

BCG Team 24 Sep 2004

I went to Weybridge today for a friend’s leaving lunch. He was the first person we hired into the newly created BCG team at Sony way back in 1997. This was back when the people at Sony UK had barely heard of the web and sony.co.uk was owned by some guy in one of the factories in Wales. We built the team, educated the company about the Internet and then the time came for me to say goodbye – I’d got the startup bug. The team grew, changed and shrank again and when I went back these were the people left. Things were very different the second time round: Sony was in a vastly different economic situation and morale was at an all-time low. Now the end has come for BCG: I departed again ages ago, Kerry and Dave (on the right) have gone, Mark (on the left) went today, Dan (standing at the back) goes in 6 weeks. I’m going to miss them all . . . they gave me some of the best days of my life.

Comments Off

Sep 24 2004

Comment Posting Timeslots

Published by Ian Davis under Uncategorized

Ok, I have comment moderation but I don’t want to waste time deleting crap comments. So I need to do something different to combat the spam. My assumptions are: most comment spam is automated; the spammers have specific applications that understand common weblog software characteristics; they find pages to spam through Google. Given the first and second assumptions, I can prevent automated spam by changing the posting URL. But in case their tools are configurable on a site-by-site basis, I need to change the URL according to some algorithm, such as the current time. So instead of posting to a URL like wordpress/wp-comments-post.php, I’ll make my forms point to wordpress/wp-comments-post.php/1096063037. I can then check that the post was submitted within a given timeslot.

So, I’ve made the following change to my wp-comments.php file, appending the current time to the form action:

<form action="<?php echo get_settings('siteurl'); ?>/wp-comments-post.php/<?php _e( time() )?>" method="post" id="commentform">

Then, to validate the time I’ve added the following to my wp-comments-post.php file:

<?php
require( dirname(__FILE__) . '/wp-config.php' );

if (! empty($_SERVER['PATH_INFO']) && preg_match("!^/([0-9]+)$!", $_SERVER['PATH_INFO'], $timeslot_matches) ) {
  $timeslot_start = $timeslot_matches[1];
  $current_time = time();
  if  ($current_time < $timeslot_start || $current_time - $timeslot_start > 3600) { // choose a number of seconds 3600 ==  1 hour
     die( __('Error: Posting timeslot has expired. '  ) );
  }
}
else {
  die( __('Error: No timeslot specified.') );
}

function add_magic_quotes($array) {

So, the form only works within one hour of loading the comments page. It won’t defeat a human spammer, but might reduce the amount of automated spam I get. Let’s see…
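The same rule can be written as a small function that takes the clock as a parameter, which makes it easy to try out without a web request. This is only a sketch under that assumption: timeslot_error and its $max_age parameter are hypothetical names, not part of WordPress or of the change above.

```php
<?php
// Hypothetical helper: the timeslot check as a pure function, so the
// current time can be passed in rather than read from the clock.
// Returns an error message, or null when the post is allowed.
function timeslot_error($path_info, $now, $max_age = 3600) {
    if (!preg_match('!^/([0-9]+)$!', (string) $path_info, $m)) {
        return 'Error: No timeslot specified.';
    }
    $start = (int) $m[1];
    // Reject timestamps from the future as well as stale ones.
    if ($now < $start || $now - $start > $max_age) {
        return 'Error: Posting timeslot has expired.';
    }
    return null;
}

$loaded = 1096063037;  // the example timestamp used in the URL above
var_dump(timeslot_error('/' . $loaded, $loaded + 60));    // NULL: within the hour
var_dump(timeslot_error('/' . $loaded, $loaded + 7200));  // expired
var_dump(timeslot_error('', $loaded));                    // no timeslot
```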

One response so far

Sep 24 2004

Comment Spam

Published by Ian Davis under Uncategorized

One day after enabling comments and I’ve had two genuine comments and 5 spam. Think nice thoughts… think nice thoughts…

Comments Off

Sep 23 2004

WordPress Hacks

Published by Ian Davis under Uncategorized

I forgot to mention a couple of hacks I used to build this WordPress installation. Both are by Morten Frederiksen, who was generous enough to document exactly what he did with each one:
adding RSS 1.0 comment feeds

Comments Off
