Jul 03 2009

The Future of Mobile Browsers

Category: Ideas, Mobile, Mobile web, Semantic Web
Aleksander Kmetec @ 4:39 pm

Not that long ago, while my coworkers were attending a conference in London, I was spending my time wandering around the city in full tourist mode, trying to find a shop I had read about on the Internet but that didn’t seem to actually exist in real life. Although I was pretty sure I was on the right street I couldn’t find it and nobody I asked about it had the slightest idea about where it could be located.

This problem should be easily solvable in our age of ubiquitous connectivity, right? All I had to do was take out my mobile phone, fire up the web browser, check the shop’s website for the address and locate it using Google maps. How hard could that be? Very hard, as it turned out.

The experience of trying to perform this task was so horribly bad that in the end I wound up walking to an Apple store a couple of streets away and waiting in line for a display iMac so that I could find the information I needed, copy it to a piece of paper and finally walk back. Yes, that was in fact a better alternative to trying to accomplish anything using a mobile browser.

A year and a half has passed since, but things haven’t changed much for the better. Even though connection speeds have improved and hardware is now much faster, the gap between mobile devices and computers is still huge and browsing the web on a mobile device can still be a rather frustrating experience. Unlike computers, which these days have large displays running at high resolutions and are coupled with mice which give users pixel-accurate cursor control, mobile devices still have tiny touch screens and pointer accuracy of around 400 square pixels… if you’re lucky enough to have pointy fingers.

And despite their differences we use both types of devices for accessing the same websites.

So you’d probably think that browsers running on mobile devices with their tiny screens, clumsy keyboards and imprecise pointer control would be the ones getting all the usability improvements and fancy automated features. But you’d be wrong. What’s curious to me is that despite the fact that mobile devices are the ones with highly obvious usability problems, desktop browsers – which are already much easier to use – are the ones getting all the improvements. Autopager, Adblock, Greasemonkey, Stylish, Readability/Tidyread, Bookmarklets, Web slices, Accelerators and many others are features mobile users can only wish for [1].

What about usability features specific to mobile browsers? Apart from the “.com” button on iPhone’s virtual keyboard I really can’t think of any. As a matter of fact, I’m having a difficult time thinking of a single user-facing feature available in today’s mobile browsers that wasn’t already present in Netscape Navigator in the mid 90s [2].

So what can we do to bring mobile browsers into the modern times?

1. The single most important feature needed by mobile browsers is support for extensions.

When was the last time you tried out an interesting mobile browser extension? I’d say never. Since most mobile browsers have no support for extensions whatsoever, you couldn’t do it even if you wanted to.

Once it’s possible to easily extend the browser anyone can start experimenting and adding features, but without that the existing browser is a dead end. Developers who want to add even the tiniest feature need to re-implement the whole browser functionality around the page rendering component. And I know from personal experience that doing this and then sneaking code for additional functionality in through the back door is a truly awful way of adding features. Not only is it a lot of work for the developer, but it’s also inconvenient for users, who are unlikely to try out a whole new browser just for one feature.

2. Mobile browsers need to become content aware.

Now that the miracle of copy&paste has finally arrived in the world of mobile phones, performing some tasks like moving an address from the browser into the maps application has become slightly easier. But is this really enough to keep us happy for the next few years? Even on desktop computers where performing such a task takes only a second, copying from the browser and pasting into another app is no longer considered good enough.

Internet Explorer 8 now supports accelerators (so does Firefox, via IE8 Activities add-on) which perform such tasks for you. They are not perfect, of course. One problem is that all accelerators are offered every time, even if they make no sense in the given context:

[Image: IE8 Accelerators. Caption: Suggested accelerators don't always make sense.]

This same problem will be bigger on mobile devices – if and when they gain support for accelerators. I predict that more accelerators will be needed due to the limited nature of the devices, but displaying them all won’t work very well on small screens. Which is where content awareness comes into play. If the browser knew what kind of content you just selected it could present you only with options which make sense at that moment. Even better – most of the time selecting the text could be skipped altogether since the browser would already know where something starts and ends and could attach accelerators to that piece of content in advance. It could even grab new accelerators online if it determines using them would make sense.

There are several ways content awareness could be implemented. Microformats and RDFa immediately spring to mind, but since they are about as common in the wild as pink flying giraffes, some other solutions would need to be found. Alternatives might include external content descriptions and extraction rules (possibly crowdsourced), downloadable sets of algorithms for detection, data extracted by services like YQL open data tables or NLP services… But more about that some other time.

What I believe is important is that this content recognition should be performed by a centralized framework and available to all extensions so they don’t need to do their own parsing. Some basic work in this direction has been done with the Operator toolbar and Firefox’s support for Microformats, but in my opinion such functionality is much more needed in mobile browsers.
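To make the idea a bit more concrete, here is a minimal Python sketch of such a shared registry. It assumes the framework has already recognized the type of a piece of content; the accelerator names, the content type labels and the `accelerators_for` function are all made up for illustration and do not correspond to any real browser API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Accelerator:
    name: str
    accepts: frozenset  # content types this accelerator makes sense for


# Hypothetical accelerators, as they might be registered by extensions.
ACCELERATORS = [
    Accelerator("Show on map", frozenset({"address"})),
    Accelerator("Add to calendar", frozenset({"event"})),
    Accelerator("Find on Amazon", frozenset({"book"})),
    Accelerator("Translate", frozenset({"text", "address", "event", "book"})),
]


def accelerators_for(content_type: str) -> list[str]:
    """Return only the accelerators relevant to the detected content type."""
    return [a.name for a in ACCELERATORS if content_type in a.accepts]


print(accelerators_for("address"))
```

Because the recognition happens once in the framework, each extension only declares which content types it cares about and never has to parse the page itself.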

3. Browsers need to start detecting and exposing available extensions

You’re visiting a page that contains several events which can be added to your calendar and a table which can be displayed as a chart. Someone has created an alternative stylesheet which removes the huge header and there is also a mobile widget which allows you to interact with the same service in a more mobile-friendly way. But you’re never going to know about them and use them if your browser doesn’t notify you about their existence and make it easy to install and use them.

4. We need support for client-side content modification

Some might disagree, but I believe we do.

How many times have you had to zoom in just so you could click the link to the next page in a sequence of pages, or pan around to find where that huge header ends and content starts? You don’t need to put up with any of this if you’re browsing the web on a computer. All sorts of common annoyances can be solved by simple tweaks, and extensions for desktop browsers have been making it possible to do this for years. By augmenting, modifying and filtering content they not only make it easier to access content you’re interested in, but also make it possible to ignore irrelevant parts of pages.
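As a rough illustration of the kind of tweak I have in mind, the Python sketch below removes unwanted blocks from a page before it is displayed. It is only a sketch under generous assumptions: the page is well-formed XML, the unwanted blocks can be identified by their `id` attribute, and the function name `strip_blocks` and the sample page are invented for this example.

```python
import xml.etree.ElementTree as ET


def strip_blocks(page_xml, unwanted_ids):
    """Remove every element whose id attribute is listed in unwanted_ids."""
    root = ET.fromstring(page_xml)
    # Take a snapshot of all elements first, since we modify the tree as we go.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.get("id") in unwanted_ids:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


# Made-up sample page with a large header in front of the real content.
PAGE = (
    '<html><body><div id="header">Huge banner</div>'
    '<div id="content">Article text</div></body></html>'
)

print(strip_blocks(PAGE, {"header"}))
```

A real extension would work on the browser's live DOM instead of a string, but the principle is the same: a small user-supplied rule decides what gets filtered out.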

These are just some ideas about what the future of mobile browsers could look like. You might not agree with me on whether content awareness and even content modification are going to be important factors in the future of mobile browsing, but there’s one thing which is difficult to disagree with: a lot of work still needs to be done in order to turn mobile browsing into a user-friendly experience.

  1. While iPhone’s browser does support bookmarklets, they’re still a royal pain to set up.
  2. I wanted to list pinch-to-zoom here but I realized it’s more of an OS feature than a browser feature.


Apr 18 2009

Crowdsourcing the semantic web

Category: Ideas, Semantic Web
Aleksander Kmetec @ 4:08 am

This is getting a bit old, isn’t it? Even after years and years of hearing about the semantic web, the actual semantic metadata is still an extremely rare occurrence on the web. It’s obvious that our current approach to building the linked data cloud is just not working.

Currently, all attempts at providing semantic metadata require server-side changes which means that we need to rely on page authors to implement them. This, of course, is a major obstacle. But what if we could change that? What if we could bypass page authors and have the crowd add semantic metadata to existing pages?

I believe that this is more than possible.

Semantic metadata would usually be added to web pages by adding additional attributes to HTML elements or creating new elements where needed. But as it turns out, existing web pages are already broken up into a surprising number of elements, so why not just use these?

Let’s take this search result from Amazon as an example:

[Image: Block tagging example. Caption: Relax, it's just a mock-up. Let's please resist arguing about the correctness of the labels.]

Everything in the above image you see having a blue border is already a separate HTML block, addressable using an XPath expression. In order to attach a meaning to a block, we could use this XPath expression and associate it with a meaning, like this:

//div[@class="productData"]/div[@class="productTitle"]/a = “Book:Title”

Behind the scenes URIs from an OWL based ontology would be used instead of simple text labels. These mappings could then be uploaded to a server and made available to anyone.
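Here is a minimal Python sketch of how such a mapping could be applied. It assumes the page is well-formed XML, since the standard library's `xml.etree.ElementTree` only parses XML and supports a limited XPath subset (real pages would need an HTML parser). The sample markup, the `extract` function and the `http://example.org/...` ontology URI are placeholders invented for this example.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed sample markup shaped like the search result above.
PAGE = """
<html><body>
  <div class="productData">
    <div class="productTitle"><a href="/item/1">Example Book One</a></div>
  </div>
  <div class="productData">
    <div class="productTitle"><a href="/item/2">Example Book Two</a></div>
  </div>
</body></html>
""".strip()

# Crowd-contributed rules: XPath expression -> URI describing the meaning.
# The URI is a hypothetical placeholder, not a term from a real ontology.
RULES = {
    ".//div[@class='productData']/div[@class='productTitle']/a":
        "http://example.org/ontology/book#title",
}


def extract(page_xml, rules):
    """Return (meaning, text) pairs for every block matched by a rule."""
    root = ET.fromstring(page_xml)
    found = []
    for xpath, meaning in rules.items():
        for element in root.findall(xpath):
            found.append((meaning, "".join(element.itertext()).strip()))
    return found


for meaning, text in extract(PAGE, RULES):
    print(meaning, "=", text)
```

Note that the rules live entirely outside the page, which is the whole point: nobody has to touch the server for this to work.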

So let’s forget about convincing thousands of web developers to learn and start using semantic markup. Forget Microformats. Forget RDFa. Everything about the semantic web boils down to one thing: reusable data. And existing data already published on the web can be made reusable by the crowd, using simple tools. Within a year we could have thousands of reusable semantic data sources!

Possible uses

Once we have a large collection of rules for determining what the data inside a certain HTML block represents, what can we do with them?

We can begin by creating an application which knows how to extract data from web pages and transform it into various formats like the original HTML with RDFa mixed in, pure RDF, RSS and others. That’s right – we can still have RDF and therefore compatibility with existing and future data published in the same format! This would also create a business opportunity for hosted RDF services, similar to how Feedburner is hosting RSS feeds. A service could even be created for hosting SPARQL endpoints.

Diagram 1
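As a small sketch of the transformation step, the Python function below turns extracted (meaning, value) pairs into N-Triples, one of the simplest RDF serializations. The subject and predicate URIs are hypothetical placeholders and `to_ntriples` is a name invented for this example. It handles only plain string literals, with no datatypes or language tags.

```python
def to_ntriples(subject_uri, facts):
    """Serialize (predicate_uri, literal_value) pairs as N-Triples lines."""
    lines = []
    for predicate_uri, value in facts:
        # Escape the characters that may not appear raw in an N-Triples literal.
        escaped = (
            value.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
            .replace("\r", "\\r")
        )
        lines.append(f'<{subject_uri}> <{predicate_uri}> "{escaped}" .')
    return "\n".join(lines)


# Hypothetical URIs, for illustration only.
print(to_ntriples(
    "http://example.org/item/1",
    [("http://example.org/ontology/book#title", "Example Book One")],
))
```

The same list of pairs could just as easily be fed into an RSS or RDFa writer, which is why keeping extraction and output separate seems like a sensible design.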

What about some uses that would benefit the regular people instead of just linked data nerds?

I believe that one area which desperately needs semantic metadata is mobile web browsing. Limited screen sizes and tiny or even virtual keyboards make tasks which are trivial to perform using a desktop computer a real chore on mobile devices. With semantic metadata, mobile browsers could be much more context-aware and could offer a better browsing experience. Please see my experimental browser Mosembro for more information on that topic.

Diagram 2

Some other things which would suddenly become easy to implement:

  • Automatic browser-generated mashups. Put a map next to any address, a “find on Amazon” link next to any book, etc.
  • Ad-hoc personal search and comparison engines: search all international Amazon stores, search used car ads on multiple sites or find all the 26″ screens costing less than 400€ by pulling in data from several electronics retailers and then filtering it.
  • Autonomous agents that notify you when one of your favorite travel agencies posts a last minute offer that matches your criteria.
  • Site-level search integrated into the browser (by using semantically tagged search forms). I never want to use browser’s “find in page” functionality again just to locate the search form! And since there would also be semantic metadata for search results available, local search results could then be combined with Google search results for that same domain.
  • Autofill for all kinds of forms.
  • Custom RSS feeds from any source, filtered by any set of criteria. (show Hacker news feed filtered by your favorite posters, or only posts with more than 5 comments)
  • Backup tool for your data stored in web apps: store tagged content in a reusable format; then import it into another app (which can also be automated using tagged input forms).
  • Data portability
  • Pretty much anything else promised by Microformats and RDFa advocates. The list goes on and on.
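To show how little code some of these would need once the data is available in structured form, here is a Python sketch of the filtered feed idea from the list above. The `Post` fields, the `filter_posts` function and the sample posts are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Post:
    title: str
    author: str
    comments: int


def filter_posts(posts, favorite_authors=None, min_comments=0):
    """Keep posts with more than min_comments comments, optionally by author."""
    return [
        p for p in posts
        if (favorite_authors is None or p.author in favorite_authors)
        and p.comments > min_comments
    ]


# Made-up sample data standing in for items extracted from a news site.
POSTS = [
    Post("First story", "alice", 12),
    Post("Second story", "bob", 3),
    Post("Third story", "alice", 1),
]

for post in filter_posts(POSTS, min_comments=5):
    print(post.title)
```

The hard part is obviously getting the structured posts in the first place, which is exactly what the shared extraction rules are for.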

I suggest you also have a look at demo videos for the Aurora browser concept, as they are full of great examples of what would be possible if lots of semantic metadata was available.

At this point you might be thinking: “This sounds nice and all, but it’s never going to work”, so here’s a closer look at two existing browser extensions which are based on similar approaches.

Extension 1: Autopager

Autopager is a Firefox extension which automatically loads the next page of a site inline when you reach the end of the current page. It’s interesting for us because it uses XPath expressions for addressing HTML blocks, user generated rules and a central server for sharing those rules.

In order for it to do its thing, Autopager only needs to know two things:

  • which page element is the link to the next page
  • which element represents the contents of the page.

This is done using XPath expressions, as shown in the image below:


Autopager is very popular and is an excellent proof that this approach works – at least for simple tasks such as sharing rules for locating the “next page” link.
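The Python sketch below imitates this two-expression approach. It is not Autopager's actual rule format or code: the rule dictionary, the `load_all` function and the in-memory pages (standing in for real HTTP requests) are assumptions made for this example, and the pages are well-formed XML so that the standard library can parse them.

```python
import xml.etree.ElementTree as ET

# A hypothetical rule: one XPath for the "next" link, one for the content.
RULE = {
    "next_link": ".//a[@class='next']",
    "content": ".//div[@id='results']",
}

# Made-up pages keyed by URL, used here in place of network access.
PAGES = {
    "/page1": "<html><body><div id='results'><p>Item A</p></div>"
              "<a class='next' href='/page2'>Next</a></body></html>",
    "/page2": "<html><body><div id='results'><p>Item B</p></div></body></html>",
}


def load_all(start_url, pages, rule):
    """Follow "next" links and collect the text of each page's content block."""
    collected, url, seen = [], start_url, set()
    while url is not None and url not in seen:
        seen.add(url)  # guard against pages that link back to themselves
        root = ET.fromstring(pages[url])
        content = root.find(rule["content"])
        if content is not None:
            collected.append("".join(content.itertext()).strip())
        link = root.find(rule["next_link"])
        url = link.get("href") if link is not None else None
    return collected


print(load_all("/page1", PAGES, RULE))
```

Everything site-specific is contained in the rule, so sharing a rule through a central server is enough to make a new site work for everyone.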

Extension 2: Intel Mash Maker


Intel’s Mash Maker is also a Firefox extension and it comes remarkably close to the idea described above. Poking at the JSON encoded data returned by its web service reveals that it also uses XPath expressions for addressing HTML elements. It also makes it possible to further narrow down the selection using regular expressions to only capture a part of the element’s contents. The part where it falls short, though, is that it only uses plain text labels instead of URIs for defining the meaning of blocks. This unfortunately makes the definitions pretty much unusable outside of their mashup platform.
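The combination of an XPath expression and a regular expression can be sketched in a few lines of Python. This is not Mash Maker's actual data format or code, just an illustration of the technique; the `extract_part` function, the sample markup and the pattern are made up, and the page is assumed to be well-formed XML.

```python
import re
import xml.etree.ElementTree as ET


def extract_part(page_xml, xpath, pattern):
    """Select elements by XPath, then keep only the first regex group."""
    root = ET.fromstring(page_xml)
    results = []
    for element in root.findall(xpath):
        text = "".join(element.itertext())
        match = re.search(pattern, text)
        if match:
            results.append(match.group(1))
    return results


# Made-up sample: the element holds more text than just the price.
PAGE = (
    "<html><body>"
    "<span class='price'>Price: EUR 349.00 (incl. VAT)</span>"
    "</body></html>"
)

print(extract_part(PAGE, ".//span[@class='price']", r"(\d+\.\d{2})"))
```

The XPath expression finds the block and the regular expression trims it down to the value, which helps on pages where a single element mixes several pieces of information.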

Unfortunately, the project appears to be dead or on hold. The newest user contributed mashup is more than two months old and the latest blog entry was posted more than a year ago. But it still is worth checking out and it does prove that this approach can be implemented to handle more complex situations than those from the Autopager example.


There you have it, folks: a simple and straightforward way of solving the semantic web’s chicken and egg problem is perfectly within our reach.

The question now is who would be willing to sponsor such a project?
