
Henri Bergius, a.k.a. Bergie, is a former Viking based in the Nordic country of Finland. When he is not exploring Georgia’s cave cities or running with the bulls in Pamplona, Bergie works on web services built on top of the Midgard toolkit. His company, Nemein, provides web and mobile solutions for several major companies in Finland and abroad. He has been actively working on integrating standards like RDFa into the system and traveling the world advocating interoperation between open-source CMSs. Much of his latest work involves building web services in CoffeeScript and doing data integration with the NoFlo flow-based programming toolkit. Henri is a DZone MVB, is not an employee of DZone, and has posted 31 posts at DZone.

Automated linking with rich text editors


The web is built of links, of pages linking to other resources on the internet. But making those links manually is tedious. This is another area where modern inline editors could do better.

Yesterday on Hacker News, there was a thread about Wikidata, the Wikimedia Foundation's new knowledge base. This comment struck me especially:

I was using Wikipedia the other day and it occurred to me how primitive it is to have all the inner links to other Wikipedia articles defined manually, surely these should have been automated by now (i.e., marking a word or two would link you to the relevant article).

And indeed, this is a usability problem that can already be fixed with the Semantic Interaction stack underneath Create.js.

Introducing Annotate.js

Annotate.js is a VIE widget built by Szaby Grünwald. It works much like a spell checker in traditional text editors: you write text, and it highlights the potential entities you might want to link to. You can then either accept or decline these link suggestions by clicking them. If there are multiple potential matches, you can also disambiguate between them by selecting from an offered list.
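To make the spell-checker analogy concrete, here is a minimal sketch of that suggestion workflow in Python. It is not how Annotate.js is implemented: the `ENTITY_INDEX` dictionary, the `Suggestion` class, and the `suggest` function are all hypothetical names invented for illustration, and a real setup would get its candidates from an enhancement service rather than a hard-coded lookup.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical entity index: lowercased surface form -> candidate entity URIs.
# A real editor would ask an enhancement service instead of a local dict.
ENTITY_INDEX = {
    "paris": [
        "http://dbpedia.org/resource/Paris",
        "http://dbpedia.org/resource/Paris,_Texas",
    ],
    "finland": ["http://dbpedia.org/resource/Finland"],
}


@dataclass
class Suggestion:
    start: int                 # offset where the mention begins
    end: int                   # offset just past the mention
    text: str                  # the mention as the author wrote it
    candidates: List[str]      # possible entity URIs to choose from
    accepted_uri: Optional[str] = None
    declined: bool = False

    def accept(self, index: int = 0) -> None:
        """Accept the suggestion, picking one candidate to disambiguate."""
        self.accepted_uri = self.candidates[index]
        self.declined = False

    def decline(self) -> None:
        """Reject the suggestion so the text stays unlinked."""
        self.accepted_uri = None
        self.declined = True


def suggest(text: str, index=ENTITY_INDEX) -> List[Suggestion]:
    """Scan the text word by word and flag words found in the index."""
    found = []
    for match in re.finditer(r"\w+", text):
        candidates = index.get(match.group().lower())
        if candidates:
            found.append(
                Suggestion(match.start(), match.end(), match.group(), list(candidates))
            )
    return found
```

Calling `suggest("Bergie flew from Finland to Paris.")` flags two words. The "Paris" suggestion carries two candidates, so the writer would pick one with `accept(0)` or `accept(1)`, or dismiss it with `decline()`.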

Here is a quick video of Annotate.js in action:

You can also try it yourself with an online demo.

Currently Annotate.js has been integrated with the Hallo rich text editor, but it would be easy to do the same with other popular editors like Aloha and CKEditor.

Connecting to entities

The big question with automatic linking is where the entities come from. There are services like OpenCalais that can provide these suggestions for your content, but most of them are focused only on shared knowledge bases of big companies, famous people, and major cities.

Unless you're running a newspaper, it is unlikely that these are the things your content is about.

Apache Stanbol is an open source engine that can provide the enhancements for you. Out of the box it provides suggestions based on the Wikipedia knowledge repository. But more importantly, you can feed it with your own entities.
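As a rough sketch of what talking to Stanbol looks like, the snippet below prepares a request to the enhancer's REST endpoint using only the Python standard library. The URL assumes a Stanbol launcher running locally on port 8080, and `build_enhancer_request` is a hypothetical helper name; adjust both to your own installation.

```python
import urllib.request

# Assumption: a Stanbol launcher running locally on port 8080.
STANBOL_ENHANCER_URL = "http://localhost:8080/enhancer"


def build_enhancer_request(text, url=STANBOL_ENHANCER_URL):
    """Prepare (but do not send) a POST of plain text to the enhancer."""
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        headers={
            # The content to analyze is sent as plain text.
            "Content-Type": "text/plain; charset=utf-8",
            # Ask for the enhancement results serialized as Turtle.
            "Accept": "text/turtle",
        },
        method="POST",
    )


request = build_enhancer_request("Henri Bergius lives in Finland.")

# With a Stanbol instance actually running, the request could be sent like this:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

The response describes the entities Stanbol found in the text as RDF, which a widget like Annotate.js can then turn into link suggestions.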

This way the enhancements you get for your content can be tuned to be meaningful to your content and your audience. If you write about medicine, they could be about symptoms and diseases, or if you're writing about technology, they could be specific open source projects and their contributors. With Stanbol, the choice is yours.

The current downside of Stanbol is that you'll have to run it yourself, but there may be solutions coming for that as well.

Beyond editing

Like never losing your content or managing collections, Annotate.js shows what we can do to improve the editing experience when we interact with content in a semantically meaningful manner.

Annotate.js does not merely create links. It also marks up the machine-readable relationship between the linked entities and the HTML content being edited. This can then be used by yet another set of tools, like search engines, to understand and organize the content better.
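To illustrate the difference between a plain link and a machine-readable one, here is a small sketch that wraps a mention in an anchor carrying RDFa attributes. The exact markup Annotate.js emits may differ; the `annotate` function and the choice of the `about` and `typeof` attributes are assumptions made for this example.

```python
from html import escape


def annotate(text, start, end, entity_uri, entity_type):
    """Wrap text[start:end] in a link that also names the entity and its type."""
    mention = text[start:end]
    link = '<a href="{uri}" about="{uri}" typeof="{type}">{label}</a>'.format(
        uri=escape(entity_uri, quote=True),    # the entity being linked to
        type=escape(entity_type, quote=True),  # what kind of thing it is
        label=escape(mention),
    )
    # Escape the surrounding text too, so the result is safe to embed in HTML.
    return escape(text[:start]) + link + escape(text[end:])


html = annotate(
    "Bergie lives in Finland.",
    16,
    23,
    "http://dbpedia.org/resource/Finland",
    "http://dbpedia.org/ontology/Country",
)
```

A human reader sees an ordinary link on the word "Finland", while a tool that understands RDFa can also tell that the link refers to a specific entity of type Country.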

It is easy to see Create.js (like Drupal did, unfortunately) as just an easy way to add nice inline editing features to your CMS. However, while that is a good initial step, being able to interact with your content on the semantic level can do a lot more. Automated linking is just another demonstration of that.

As the ecosystem around Create.js and VIE matures, and it ships in more systems, people will build things on the stack that we can't even imagine now.

If your CMS is properly decoupled, you can benefit from that immediately.

Read more Decoupled CMS posts.

Published at DZone with permission of Henri Bergius, author and DZone MVB. (source)
