Archive for April, 2004

DMOZ URIs for dc:subject

Tuesday, April 20th, 2004

As a first step, using DMOZ categories for dc:subject is very simple. In Movable Type, just place the dmoz URI (eg, http://dmoz.org/Computers/Internet/On_the_Web/Weblogs/ for blogging) as a category in your blog configuration.

dc:subject wants a Literal as a value, so the URI from dmoz will show up as a Literal. I believe the Semantic Blog Demonstrator has a fuller schema for blogging. Once it starts to work again, I’ll see how best to indicate a blog entry’s subject in a more RDF-friendly way (ie, make the DMOZ URI a Resource and not a Literal).
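To make the Literal versus Resource difference concrete, here is a minimal Python sketch that builds both forms of the dc:subject element. The helper names are made up for illustration, and this is not what Movable Type itself emits; it only shows the two shapes the RSS could take.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DMOZ_WEBLOGS = "http://dmoz.org/Computers/Internet/On_the_Web/Weblogs/"

# Use the conventional prefixes when serializing.
ET.register_namespace("dc", DC)
ET.register_namespace("rdf", RDF)


def subject_as_literal(uri):
    """dc:subject carrying the URI as plain text, i.e. a Literal."""
    element = ET.Element(f"{{{DC}}}subject")
    element.text = uri
    return element


def subject_as_resource(uri):
    """dc:subject pointing at the URI through rdf:resource, i.e. a Resource."""
    element = ET.Element(f"{{{DC}}}subject")
    element.set(f"{{{RDF}}}resource", uri)
    return element


literal_xml = ET.tostring(subject_as_literal(DMOZ_WEBLOGS), encoding="unicode")
resource_xml = ET.tostring(subject_as_resource(DMOZ_WEBLOGS), encoding="unicode")
print(literal_xml)
print(resource_xml)
```

An RDF parser reads the first form as a string that merely looks like a URI, and the second as a node that other triples can link to.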

Just adding more semantic web friendly triples.

A Trust Web Might Help

Saturday, April 17th, 2004

phil ringnalda dot com: TypeKey from a different angle

Phil Ringnalda writes:

Now, I’m not so sure that’s the right angle to take. If instead of seeing TypeKey as a slightly more tolerable (because you only have to lie about your email address once instead of many times) way to implement registration, you look at it as way to implement comment moderation, it begins to look a tad bit better.

He’s right about this. TypeKey alone won’t stop any problems. We need authorization controls, which are above authentication. TypeKey is only a first step.

I am investigating using Semantic Web technologies to help with the authorization problem. There’s a concept of building a trust web that can likely help here.

  1. Use an authentication service (TypeKey or PGP signatures) to prove Seth is Seth.
  2. Give Seth a URI. A foaf:mbox_sha1sum might be good here.
  3. Put blog comment triples into RSS feeds. The aggregators can pick these up and start to collect where Seth has posted, when, etc.
  4. Some concepts of trust begin to form. For instance, if Seth has posted to many blogs, and those posts are old (haven’t been deleted), then they might not be content spam. Also, RSS feeds can include any moderation points given to posts. The white lists and black lists created by blog authors can also easily be captured inside the RSS.
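For step 2, the FOAF property foaf:mbox_sha1sum is the SHA-1 hash of the mailbox URI, including the mailto: prefix. A minimal Python sketch, with a made-up example address:

```python
import hashlib


def mbox_sha1sum(email):
    """Return the foaf:mbox_sha1sum value for an email address.

    The hash is taken over the full mailbox URI ("mailto:" plus the
    address), not over the bare address.
    """
    mbox_uri = "mailto:" + email.strip()
    return hashlib.sha1(mbox_uri.encode("utf-8")).hexdigest()


# "seth@example.org" is a placeholder address, not a real one.
print(mbox_sha1sum("seth@example.org"))
```

The same address always hashes to the same 40-character hex string, so aggregators can match a commenter across feeds without the address itself being published.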

What’s nice about putting all that raw information into the RSS is the aggregators can make their own assumptions. Isn’t this what the semantic web is all about? Don’t like how one aggregator’s algorithm for trust is working? Upload your own OWL file, or use a different aggregator. The important part is that "who to trust" is decided at runtime.
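As a rough illustration of an aggregator making its own assumptions, here is a Python sketch of one possible trust heuristic. The record fields, the 30-day threshold, and the scoring formula are all invented for this example; a different aggregator could weigh the same raw data in a completely different way.

```python
from dataclasses import dataclass


@dataclass
class CommentRecord:
    """One comment as an aggregator might collect it from RSS (hypothetical shape)."""
    author: str              # e.g. a foaf:mbox_sha1sum value
    blog: str                # the blog the comment was posted to
    age_days: int            # how long the comment has survived undeleted
    moderation_points: int = 0


def trust_score(records, author, min_age_days=30):
    """Score an author: one point per distinct blog with an old surviving
    comment, plus the sum of moderation points on all their comments."""
    mine = [r for r in records if r.author == author]
    blogs_with_old_comments = {r.blog for r in mine if r.age_days >= min_age_days}
    points = sum(r.moderation_points for r in mine)
    return len(blogs_with_old_comments) + points


records = [
    CommentRecord("seth", "blog-a", age_days=90, moderation_points=2),
    CommentRecord("seth", "blog-b", age_days=45),
    CommentRecord("seth", "blog-a", age_days=5, moderation_points=1),
    CommentRecord("spammer", "blog-c", age_days=1, moderation_points=-3),
]
print(trust_score(records, "seth"))
print(trust_score(records, "spammer"))
```

Because the threshold is a parameter and the formula lives in the aggregator, "who to trust" stays a runtime decision rather than something baked into the feeds.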

As Phil points out in a blog post to me, the above is not the hard part. The hard part is getting everyone to put these types of triples into their RSS feeds. It’s a chicken and egg problem. No one will put the triples in until they see a killer app. But it’s hard to write a good killer app without all the triples to prove it’s killer.

Slight Change to RSS Blog Comments Module

Friday, April 16th, 2004

RSS 1.0 and 2.0 Blog Comments Module

The current blog comments module does not render as RDF. The current usage:

<blogcomments:comments>

needs to become

<blogcomments:comments rdf:parseType="Collection">

I have made this change, and along with the MTEntryIfComments plugin, I now have validated RSS 1.0 as RDF!
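For anyone wondering what rdf:parseType="Collection" actually buys: an RDF parser expands the child elements into a linked list of blank nodes joined by rdf:first and rdf:rest and terminated by rdf:nil. The Python sketch below mimics that expansion. The example.org URIs, including the one standing in for the blogcomments:comments property, are placeholders and not the module's real namespace.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def collection_triples(subject, predicate, members):
    """Expand a list of members into the triples of an RDF collection."""
    if not members:
        # An empty collection is just rdf:nil.
        return [(subject, predicate, RDF + "nil")]
    # One blank node per list cell.
    nodes = [f"_:c{i}" for i in range(len(members))]
    triples = [(subject, predicate, nodes[0])]
    for i, member in enumerate(members):
        rest = nodes[i + 1] if i + 1 < len(members) else RDF + "nil"
        triples.append((nodes[i], RDF + "first", member))
        triples.append((nodes[i], RDF + "rest", rest))
    return triples


# Placeholder URIs for illustration only.
triples = collection_triples(
    "http://example.org/entry/1",
    "http://example.org/blogcomments#comments",
    ["http://example.org/entry/1#c1", "http://example.org/entry/1#c2"],
)
for triple in triples:
    print(triple)
```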

Just doing my part to add more triples to the world.

Of course, the question remains: how best to promote these (and other) triples to other blog authors?

Add Comment Triples To RSS

Friday, April 16th, 2004

MT Extensions: MTEntryIfComments 2.0

MovableType has conditional tags that are true when entries and pings are enabled. The following tags allow you to test whether comments or pings exist.

MTEntryIfComments:
A conditional tag that is true if the current entry has one or more (or a specific number of) comments. Useful if you want to add content before/after a list of comments only if the list is not empty, or if you want to display different text based on the number of comments.

This is useful for rendering comment triples if they exist for a particular blog entry.

An External OWL Ontology Creates Order, Promotes Freedom

Thursday, April 15th, 2004

SemErgence: Comment on Subject metadata to Blog Posts

CaptSolo writes:

Regarding the hard part to make the UI for selecting categories - I think that most blog engines already have UI for entering the blog post categories.

Therefore, instead of adding semantic category to each post and having to create a new UI for that, we may add the semantic meta-information to the existing categories.

This is the route I initially travelled. I have a post that shows how to use OWL to map a blog post that has a particular dc:subject Literal to an OWL Class. For instance, declaring that posts that have a dc:subject of "blogging" are members of the :PostAboutBlogging class.

This leaves each blog owner the freedom to declare any dc:subject they want (literal or Resource) and to use an external OWL ontology to bring some order.

Add dcterms:references to RSS, Help The World

Thursday, April 15th, 2004

SemErgence: How To Make Blogging More SemWeb Friendly
Kasei writes:


While the dcterms:references might be easily obtainable from the HTML, I don’t see any reason to make people parse the HTML over and over again, when it could easily be done once, and be immediately available as RDF. So here’s a Movable Type plugin which does just that; just drop the <$MTRDFReferences$> tag into the RSS 1.0 template.

Thanks Kasei! This is great. I’ll add this in tonight.

If people do this, it’ll make semweb applications that compile and operate on Resources that people are blogging about much easier. How does the semantic web community evangelize movements such as this? My guess would be to get the tool manufacturers to agree to modify their templates. Most blog owners wouldn’t know how to manipulate their blog backend.
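To show the "parse once, publish as RDF" idea outside of Movable Type, here is a small Python sketch that pulls link targets out of an entry body and prints one dcterms:references element per link. It is not the plugin's implementation, and the function names and sample HTML are made up.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import quoteattr


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, in order, without duplicates."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value not in self.links:
                self.links.append(value)


def references_in(html_body):
    """Return the distinct link targets found in an HTML fragment."""
    collector = LinkCollector()
    collector.feed(html_body)
    collector.close()
    return collector.links


def references_as_rdf(html_body):
    """Render each link as a dcterms:references element for an RSS 1.0 item."""
    return [
        f"<dcterms:references rdf:resource={quoteattr(uri)}/>"
        for uri in references_in(html_body)
    ]


# Sample entry body, invented for this example.
body = (
    '<p>See <a href="http://dmoz.org/">dmoz</a> and '
    '<a href="http://www.w3.org/RDF/">RDF</a>, then '
    '<a href="http://dmoz.org/">dmoz</a> again.</p>'
)
for line in references_as_rdf(body):
    print(line)
```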

There is a Semantic Web Best Practices and Deployment Working Group at the W3C. Maybe I’ll ask them for ideas and help.

Hacking MovableType, Here I Come

Wednesday, April 14th, 2004

movabletype.org : FAQ


Q: May I modify the Movable Type source code?
A: Yes, as long as you do not redistribute the modified code without permission. In addition, if you modify any of the code and feel that your changes would be useful to all Movable Type users, we encourage you to make your work available to the community of users by sharing it on the independent plug-ins site, http://www.mt-plugins.org.

Looks like Six Apart doesn’t mind too much if we hack Movable Type. Sounds good to me. I think we have enough ideas to keep us busy.

I think first up will be to include dmoz and/or yahoo category forests as selectable Categories. The modification of the RSS template to include the dmoz URI as an rdf:resource of dc:subject will be next. If I can do all that as a plugin, then all the better!

Oh, and I’m focusing on Movable Type because that seems to be pretty popular.

Subject metadata to Blog Posts

Wednesday, April 14th, 2004

SemErgence: How To Make Blogging More SemWeb Friendly

Morten Frederiksen writes:


While it would certainly be nice, I think you’ll find that the hardest part of (3) is agreeing on a taxonomy of subjects (or mapping between differing taxonomies). For this reason, something like dmoz:category or yahoo:index might be interesting.

I agree that it would be hard to map all that data. His suggestion of using dmoz and yahoo is great. The easiest thing would be to take any URIs found in the blog post (perhaps via dcterms:references?) and query for them in dmoz or yahoo. Of course, that would only be correct about 5% of the time.

It would be nice, then, for Movable Type and other blog posting software to offer a quick tree view of dmoz and yahoo. Good blog posting software would save the favorite categories for quick retrieval (a sort of hot subjects collection).

I just upgraded to Movable Type 2.661. I don’t believe it’s Open Source, but wonder how much I can hack in?

OWL Shows Off

Wednesday, April 14th, 2004

Re: Does Euler support owl:hasValue ? from Jos De_Roo on 2004-04-14 (www-rdf-logic@w3.org from April 2004)

I was very lucky to get help from Benjamin Nowack and Mr. Euler, Jos De Roo. I asked the www-rdf-logic list about how to declare that an object is a member of an owl:Class based on a Literal value of a Property. I was close, only backwards. The correct OWL would be something like:


[ a owl:Restriction ; owl:onProperty dc:subject ; owl:hasValue "blogging"^^xsd:string ] rdfs:subClassOf :PostAboutBlogging .
:PostAboutBlogging a owl:Class .
dc:subject a owl:DatatypeProperty ; rdfs:range xsd:string .

Then, with a fact like the following (easily taken from some RSS):


:blogitem a owl:Thing ; dc:subject "blogging"^^xsd:string .

I could tell:


:blogitem a :PostAboutBlogging .

I hope this at least illustrates my goal: to map different dc:subjects from different blogs into a locally understood set of owl:Classes. Then, provide a way to create Planet*.com sites with ad hoc aggregation based on queries given by users. And the first query: "Collect all blog entries that are of type XXX".
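Without a reasoner at hand, the effect of that restriction can be imitated with a plain lookup. The Python sketch below is not OWL inference and not how Euler works; it only stands in for the result, and the extra "weblogs" mapping and the sample items are assumptions added for illustration.

```python
# Local mapping from dc:subject Literals seen in the wild to local classes.
# "blogging" comes from the example above; "weblogs" is an invented extra.
SUBJECT_TO_CLASS = {
    "blogging": ":PostAboutBlogging",
    "weblogs": ":PostAboutBlogging",
}


def classify(items):
    """Turn (item, dc:subject literal) pairs into (item, 'a', class) triples."""
    inferred = []
    for item_id, subject in items:
        cls = SUBJECT_TO_CLASS.get(subject)
        if cls is not None:
            inferred.append((item_id, "a", cls))
    return inferred


def entries_of_type(items, cls):
    """Answer the query: collect all blog entries that are of the given type."""
    return [item_id for item_id, _, inferred in classify(items) if inferred == cls]


# Sample facts as they might be read from several RSS feeds.
items = [
    (":blogitem", "blogging"),
    (":otheritem", "weblogs"),
    (":unrelated", "gardening"),
]
print(entries_of_type(items, ":PostAboutBlogging"))
```

Each aggregator would keep its own mapping, which is what lets blog owners keep using whatever dc:subject values they like.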

OWL is nice, and Euler is nice. Now, to plug them into the planetrdf.com code.

Thoughts or comments?

How To Make Blogging More SemWeb Friendly

Wednesday, April 14th, 2004

After thinking about this for a little while, and with the discussions on this blog, I offer some suggestions to make blogs more friendly and useful to the semantic web.

  1. Put triples describing comments and comment authors inside RSS - This would help with the comment spam problem and start to build a web of trust wrt posting to blogs. Movable Type’s TypeKey system solves the authentication problem. By finding triples of the blog comments, I feel we can help with the second level to the problem: authorization.
  2. RSS feeds should populate dcterms:references with URIs used in blog entry body - Typically, blog entries include one or more URIs in the body of the entry. By making those URIs explicitly available via triples inside the RSS, more accurate and interesting information becomes available. Specifically, the question of "What are people blogging about?" becomes easier to answer.
  3. Encourage Blog authors to include dc:subject, and help define what might go in there - By starting to label blog entries with information on their subject (what the entry might be about), ad hoc planet* sites can be built. These planet* sites scan all the RSS looking for entries about particular subject matters and aggregate the entries. For this to work, there would be a way for an aggregator to declare a mapping between the values of dc:subject found in the wild, and the recognized values of the aggregator (OWL helps here a lot).

Thoughts or comments?