Jun 26

The Internet Corporation for Assigned Names and Numbers (Icann) has just decided that the web is going to get a whole host of new domain names — which could be as varied as .bank, .sport, .shopping or pretty much anything else you can dream up.

The article I’ve linked to cites fears of cybersquatting (sitting on a domain name that you shouldn’t really have, like if I bought amazon.biz and tried to basically blackmail Amazon into buying it from me), as well as general disorganization on the web brought about by these changes.

Sound scary? Well, there’s no reason to be afraid — many have tried and failed to organize the web, and it hasn’t ground to a halt yet. Funny enough, if we look at a bit of history, we can see that things weren’t really supposed to be this way in the first place.

Continue reading »

Dec 04

For those of you who don’t know, I live in Toronto, Ontario, Canada (there are approximately two and a half million people that live in Toronto, making it the largest city in Canada).

A few days ago, I heard about the very bizarre story of an art project gone horribly wrong, where a Toronto art student placed a ‘sculpture’ (which closely resembled a bomb) in the Royal Ontario Museum (for readers who are not from Toronto, the ROM is one of the biggest museums here in Canada).

To make the story even stranger, the art student recorded videos of both his placing of the bomb in the ROM and a fake ‘explosion’ for his art project, and posted them on YouTube. Understandably, not everyone thought this was very artistic or, frankly, very smart. You can watch an interview with the art student here.

Continue reading »

Jul 25

While wading through my daily dose of spam (which originally never plagued my Gmail account, but has now risen to approximately 50 spam messages a day), I started thinking about the quality of information. How do we gauge the value of something that there is so much of out there?

How can we filter out the good from the bad — and prevent users from missing that one important message that went into the junk mail folder? First though, I decided to look at some statistics about spam.

The famous Akismet spam filter (which filters Wordpress spam) has caught a total of 2,171,905,896 spam comments so far. According to Akismet, this accounts for 94% of all comments left on blogs. Taking a look at the diagram below (which is borrowed from the good folks at Akismet), we can see that legitimate messages make up a tiny portion of all comments.

Spam.

Continue reading »

Jun 25

Well, my other somewhat-experimental site, Knowledge Cog, has now reached 734 posts. A few of these belong to me, but the majority of them are the works that other people have created — and I’ve used Knowledge Cog as a public medium in order for other people to see the stuff that I like to read.

On a related note to this, I’ve heard some talk lately about the potential for a passive medium like television and an active medium like the web to intersect and eventually merge into one medium. While I don’t doubt that this is a possibility, I don’t think we’re as far along in a combination of these two media as some might think.

Continue reading »

Mar 26

On every post I write, I use Ultimate Tag Warrior to insert some Technorati tags, which show up at the bottom of the post. I’ve been thinking lots about metadata lately — and started wondering how useful those tags actually are (and how much traffic they actually generate).

Wikipedia describes metadata as “‘data about data.’ It can generally be thought of as information that describes, or supplements, the central data. For example, metadata produced by digital still cameras describe the settings used for the picture, such as exposure value or flash intensity. In such cases, the metadata can be considered as extra data, which merely add information, and is not critical to the functions of the main data.”

More and more often however, metadata doesn’t end up being a few keywords or even supplemental at all. For example, if I want to provide some metadata on the poet T.S. Eliot, I might include his date of birth, birthplace and even some of the titles of stuff that he’s written.

But what if I wanted to provide all of the text he ever wrote as metadata about him? Maybe I also want to provide the entire history of St. Louis, Missouri as well (where Eliot was born)? It would seem that I am no longer providing metadata about another data set, but rather I’m just linking two sets of data together (Eliot’s biography and the text of his works).

In reality, it would appear I’m just providing mesodata (OK, I just made up that word, but I think it accurately describes what I’m talking about). For example, a link on the Web that goes from one body of data to another (from the Eliot biography page to the Eliot writing text page), is really just a small piece of data that indicates to the user how the two things are linked together.

This seems pretty obvious doesn’t it? Well, yes, I wouldn’t say this should come as a great revelation to anyone. But when talking about metadata, too often the assumption is that we’re extracting or creating a small set of data that can be used to describe the broader data set — when it can often be the opposite.

So how does this relate to Technorati? Well, I want to know how useful these tags are that I’m sticking on every post on this site. Does this user-created metadata really improve the findability of the information I’m posting (on Technorati, but also on Google or other search engines as well)?

Here’s the experiment: I’m going to tag this post with the top 10 search terms on Technorati and see how much (if any) traffic comes to this site through searches on those terms on Technorati, Google or other search engines. I don’t really have any prediction for what will happen when I do this — I just hope to get some indication of how useful this metadata actually is.
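For anyone who wants to try the same experiment, here is a rough sketch of how the counting could be done in Python. It is not what I actually ran: the tag list, the example referrer URLs and the query parameter names (`q`, `query`, `p`) are all assumptions, and real search engines may carry the search terms differently.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs

# Assumed names of the URL parameter that carries the search terms.
QUERY_KEYS = ("q", "query", "p")


def search_terms(referrer):
    """Return the lowercased search query found in a referrer URL, or None."""
    params = parse_qs(urlparse(referrer).query)
    for key in QUERY_KEYS:
        if key in params:
            return params[key][0].lower()
    return None


def count_tag_hits(referrers, tags):
    """Count how many search referrals contained each tag as a whole word."""
    hits = Counter()
    for referrer in referrers:
        terms = search_terms(referrer)
        if terms is None:
            continue  # not a search referral (or an unrecognized format)
        words = set(terms.split())
        for tag in tags:
            if tag in words:
                hits[tag] += 1
    return hits
```

Feeding this the referrer column from a server log, along with the set of tags on the post, gives a per-tag count of search visits. It only matches single-word tags, so multi-word tags would need a different comparison.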

Mar 17

After clearing out my comment spam filter on this site this morning, I decided it was time to attempt to curb the amount of spam this site is getting. In the first few months of this blog being around, I got around 1000 spam comments. As of today, the site’s up to about 100 spam comments per day. Which would be fine if it were all spam, but unfortunately I have to go through all that spam to make sure that there aren’t legitimate comments buried in there. This is a tedious task that I don’t really like doing.

While I can still remember the first spam comment I received (which was not as long ago as John Chow’s first spam comment), I knew that eventually, as this blog got more popular, spam would become a bigger and bigger problem. So my first step to combat spam is to introduce a challenge question (which I achieved by installing the Wordpress Challenge plugin), which appears at the bottom of the comment box. The question is pretty simple: what year is it?
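To show the idea behind a challenge question, here is a minimal Python sketch of the check. This is not the plugin’s actual code (I haven’t looked at how it works internally); the function name `passes_challenge` and its parameters are made up for illustration.

```python
from datetime import date


def passes_challenge(answer, today=None):
    """Return True if the submitted answer is the current four-digit year.

    `today` is optional so the check can be tested against a fixed date.
    """
    today = today or date.today()
    return answer.strip() == str(today.year)
```

A comment form would call this with whatever the visitor typed and hold back or reject the comment when it returns False. The bet is that automated spam scripts fill in the form without reading the question.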

Hopefully this will reduce the spam comments and keep me from having to clean out the spam filter on a daily basis. If this one measure doesn’t stem the tide of spam, then there are some other measures I’ll try to reduce the junk that makes it into the filter (Akismet, Wordpress’ native spam filter, is very good at picking up spam comments; I’m just tired of cleaning out the filter). If you have problems, comments or questions about this new spam-fighting measure, leave a comment below. Thanks again for reading.

Update: The spam kept coming in, and we’re now up to 1590 spam comments (if you’re keeping track, that’s 58 in the last few hours). I decided to take another step to block spam at its source by installing Bad Behaviour. So far, I haven’t had a single spam comment make it past Bad Behaviour, which puts it at the top of my list of ways I’d suggest to stop spam. I’ll keep you updated on how the battle with spam is going.

Mar 16

Did you know that you can combine multiple RSS feeds together into one feed and have them sent by email to you or anyone who signs up for that combined feed? But first, let me answer the question you’re asking yourself as you’re reading this: why would I want to do that?

One reason might be that you can’t read feeds on a mobile device, and would prefer seeing them in an email sent from a single address. Or maybe, like me, you just don’t really feel like installing an RSS reader (or using one built into Internet Explorer), and would just find it simpler if everything came to your email inbox like a regular newsletter.

This can also be a great way to distribute information to employees where everyone might not understand (or care) what RSS is — just put some feeds together into one feed, have them distributed every so often by email, and you have automatic updates for whatever industry or field you’re in.

Combine.

First, let’s find some feeds to combine. Syndic8 and RSS Feeds have a ton of feeds you can search or browse through (you might also have some feeds in mind already that you want to combine). Combining the feeds together is about as simple as going to RSS Mix, dropping the addresses of the feeds in the box, and clicking the ‘Create!’ button. You will then get a feed ID number that you can use to access your feed.
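If you’re curious what combining feeds involves under the hood, here is a small Python sketch using only the standard library. It is not how RSS Mix does it (I have no idea what their code looks like). It assumes plain RSS 2.0 where every item has a `pubDate` with a timezone, and the function names are my own.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def parse_items(rss_xml):
    """Extract title, link and publication time from each <item> in an RSS string."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            # Assumes every item carries an RFC 822 style pubDate.
            "published": parsedate_to_datetime(item.findtext("pubDate")),
        })
    return items


def combine_feeds(feeds):
    """Merge the items of several RSS documents, newest first."""
    merged = [item for rss_xml in feeds for item in parse_items(rss_xml)]
    merged.sort(key=lambda item: item["published"], reverse=True)
    return merged
```

You would pass `combine_feeds` a list of feed documents (as strings) that you have already downloaded. Sorting by date is just one reasonable way to interleave the items from different sources.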

Email.

Now, if you want to send out this feed by email, you can go to FeedBlitz, create an account, and subscribe to your feed by email. You will then get emailed (I believe it’s on a daily basis by default) the feed items from all of the feeds you entered into RSS Mix. Other people can also subscribe to your feeds this way. Create as many feeds as you want using RSS Mix, and you can syndicate them all to email using FeedBlitz. Cool huh?
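For the do-it-yourself crowd, the email step can be sketched in Python as well. Again, this is only an illustration and not how FeedBlitz works. The function name `build_digest`, the addresses, and the shape of the items (dictionaries with `title` and `link` keys) are assumptions of mine.

```python
from email.message import EmailMessage


def build_digest(items, sender, recipient, subject="Feed digest"):
    """Build a plain-text email listing each feed item's title and link."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # One short block per item: title on one line, link on the next.
    blocks = [f"{item['title']}\n{item['link']}\n" for item in items]
    msg.set_content("\n".join(blocks))
    return msg
```

The message this returns can be handed to `smtplib.SMTP.send_message` to actually send it, provided you have a mail server you’re allowed to use. Running that once a day would give you something like the daily digest described above.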

Mar 13

According to my site statistics, there are several dozen people reading this site with Internet Explorer 6. With Internet Explorer 7 being only a few months old, this is certainly to be expected (my stats also tell me that almost no one visiting this site, around 2% of visitors, is using Windows Vista). I talked a bit about information sharing and compatibility before, but Internet Explorer 6 presented a whole new challenge for me with the design of this site.

When adapting the theme of this site, I checked for compatibility (i.e. the site not looking like crap) on both Firefox and Internet Explorer 7 — but then started to get messages from people saying that things were unaligned and looking just generally messy on Internet Explorer 6. So then I had the problem of how to test on Internet Explorer 6, since I already have Internet Explorer 7 at home (I need to also state here that I am not a designer, and anyone who is a designer is probably having a chuckle at my introduction to a problem they face every day).

Continue reading »

Feb 13

The semantic web will allow us to recombine information in ways we never thought would be possible. But how do we get all of that information in a form that all of us can use?

I was asked recently to explain how Web 2.0, the semantic web, and metadata are related to each other, and what these concepts might mean for the future of the web.

Here’s my (not overly brief) take on it.

If you haven’t heard the phrase ‘the semantic web’ before, this is what it is in a nutshell (thanks Wikipedia):

“…an evolution of the World Wide Web in which information is machine processable (rather than being only human oriented), thus permitting browsers or other software agents to find, share and combine information more easily.”

Obviously, most of the information that’s on the web today got there by being marked up in HyperText Markup Language (a.k.a. HTML). Yet HTML has been around a long time (since around 1993), and has some inherent problems.

We can all remix whatever we need to — but how?

The problem with HTML however, is that it statically describes the content on a page, as opposed to really tagging that content in a way that can be reused and recombined with other content (and metadata).

New services like Yahoo Pipes, which I’ve talked about previously, have brought the ability to recombine information to pretty much everyone who needs that recombination (but of course, it’s not without its problems).

Continue reading »

Feb 09

With all the talk about Web 2.0 applications and sites, I don’t hear much discussion about how to redesign our most common form of electronic communication: email.

TechCrunch compared the major email offerings yesterday, and found that Gmail was your best bet (a similar comparison was done quite a while ago, with the same offerings being considered).

But what is really new in these email offerings? Access to email through a browser is certainly nothing new (and if we look at the characteristics of Web 2.0, is only a small part of the equation).

New services like Yahoo Pipes will allow users to participate in the creation of new data from existing data sets (which is something like Michael Wesch’s database-backed web from yesterday’s post).

The issue with email is that it’s largely an isolated, one-way and very non-collaborative approach to communication.

So when we start asking where we’re going with Web 2.0, we must eventually start to question whether email is a communication medium that is going to survive.

More and more, people are looking for ways to not only expedite communication, but also to develop more meaningful communication that deepens existing relationships or forges new ones.

Of course, with all of these added features comes the burden of having to ensure that things are done securely. When Web 2.0 features are included in an environment that is as potentially sensitive as many email inboxes are, there is a whole new set of security challenges (not to mention new rules for those starting new online ventures).

With email being so entrenched in the way we do things (especially at work), it’s tough to see it disappearing any time soon.

But before email, someone might have said the same thing about paper memos.