<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Crowdsourcing the semantic web</title>
	<atom:link href="http://lexandera.com/2009/04/crowdsourcing-the-semantic-web/feed/" rel="self" type="application/rss+xml" />
	<link>http://lexandera.com/2009/04/crowdsourcing-the-semantic-web/</link>
	<description>A blog about the web, mobile web, semantic web and mobile semantic web.</description>
	<lastBuildDate>Fri, 19 Feb 2010 23:47:09 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Rachel</title>
		<link>http://lexandera.com/2009/04/crowdsourcing-the-semantic-web/comment-page-1/#comment-2059</link>
		<dc:creator>Rachel</dc:creator>
		<pubDate>Thu, 29 Oct 2009 23:03:22 +0000</pubDate>
		<guid isPermaLink="false">http://lexandera.com/?p=329#comment-2059</guid>
		<description>I thought the post made some good points on web scrapers. I use Python for simple HTML web scrapers, but for larger projects involving the web, files, or documents I tried &lt;a href=&quot;http://www.extractingdata.com/web%20scraper.htm&quot; rel=&quot;nofollow&quot;&gt;web scrapers&lt;/a&gt;, which worked great. They build quick custom screen scrapers, web scrapers, and data parsing programs.</description>
		<content:encoded><![CDATA[<p>I thought the post made some good points on web scrapers. I use Python for simple HTML web scrapers, but for larger projects involving the web, files, or documents I tried <a href="http://www.extractingdata.com/web%20scraper.htm" rel="nofollow">web scrapers</a>, which worked great. They build quick custom screen scrapers, web scrapers, and data parsing programs.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Fuller</title>
		<link>http://lexandera.com/2009/04/crowdsourcing-the-semantic-web/comment-page-1/#comment-417</link>
		<dc:creator>Fuller</dc:creator>
		<pubDate>Tue, 21 Apr 2009 14:18:45 +0000</pubDate>
		<guid isPermaLink="false">http://lexandera.com/?p=329#comment-417</guid>
		<description>The MetaSeeker toolkit and its web-based services are another example. The MetaSeeker toolkit is an HTML wrapper factory. The generated HTML wrappers, also called scrapers, are coded in XML, XSLT and XPath, and are shared and collaboratively maintained on the web-based MetaSeeker server. Compared to RDF, XML is more lightweight and pragmatic for defining the semantic data structures of Web pages.

Please check it at: http://www.gooseeker.com</description>
		<content:encoded><![CDATA[<p>The MetaSeeker toolkit and its web-based services are another example. The MetaSeeker toolkit is an HTML wrapper factory. The generated HTML wrappers, also called scrapers, are coded in XML, XSLT and XPath, and are shared and collaboratively maintained on the web-based MetaSeeker server. Compared to RDF, XML is more lightweight and pragmatic for defining the semantic data structures of Web pages.</p>
<p>Please check it at: <a href="http://www.gooseeker.com" rel="nofollow">http://www.gooseeker.com</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Aleksander Kmetec</title>
		<link>http://lexandera.com/2009/04/crowdsourcing-the-semantic-web/comment-page-1/#comment-414</link>
		<dc:creator>Aleksander Kmetec</dc:creator>
		<pubDate>Sun, 19 Apr 2009 13:58:40 +0000</pubDate>
		<guid isPermaLink="false">http://lexandera.com/?p=329#comment-414</guid>
		<description>Wow. RDF-EASE looks like something that would be easily understandable to anyone familiar with CSS and would cut down on the number of new acronyms that would need to be mastered by developers. But still... Remember the CSS &quot;movement&quot; from not that long ago? It took a dozen high-profile bloggers and an active community something like 5 years, lots of hard work and some well-crafted stories to push CSS into the mainstream. All that for one technology. The semantic community, on the other hand, currently has no leaders and no infrastructure for spreading ideas at all. All we have is a big bowl of acronym soup and we&#039;re completely baffled by the fact that (almost) nobody wants to eat it.</description>
		<content:encoded><![CDATA[<p>Wow. RDF-EASE looks like something that would be easily understandable to anyone familiar with CSS and would cut down on the number of new acronyms that would need to be mastered by developers. But still&#8230; Remember the CSS &#8220;movement&#8221; from not that long ago? It took a dozen high-profile bloggers and an active community something like 5 years, lots of hard work and some well-crafted stories to push CSS into the mainstream. All that for one technology. The semantic community, on the other hand, currently has no leaders and no infrastructure for spreading ideas at all. All we have is a big bowl of acronym soup and we&#8217;re completely baffled by the fact that (almost) nobody wants to eat it.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dan Brickley</title>
		<link>http://lexandera.com/2009/04/crowdsourcing-the-semantic-web/comment-page-1/#comment-413</link>
		<dc:creator>Dan Brickley</dc:creator>
		<pubDate>Sun, 19 Apr 2009 12:01:34 +0000</pubDate>
		<guid isPermaLink="false">http://lexandera.com/?p=329#comment-413</guid>
		<description>s/SQL/HTML/ in my previous comment, i.e. &quot;add RDFa / microformats into HTML&quot;</description>
		<content:encoded><![CDATA[<p>s/SQL/HTML/ in my previous comment, i.e. &#8220;add RDFa / microformats into HTML&#8221;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dan Brickley</title>
		<link>http://lexandera.com/2009/04/crowdsourcing-the-semantic-web/comment-page-1/#comment-412</link>
		<dc:creator>Dan Brickley</dc:creator>
		<pubDate>Sun, 19 Apr 2009 12:00:17 +0000</pubDate>
		<guid isPermaLink="false">http://lexandera.com/?p=329#comment-412</guid>
		<description>A lot of RDF folk are more pragmatic than is sometimes assumed. RDFa is nice, but we also made GRDDL, a system that has much in common with your (perfectly sensible) suggestions, and which uses XSLT as the language for describing such extractors. You might also look at http://buzzword.org.uk/2008/rdf-ease/spec which does something similar using a CSS-based notation.

Your positive case here stands alone; there is no need to beat up on strawmen (&quot;It’s obvious that our current approach to building the linked data cloud is just not working.&quot;) to make it. The &quot;current approach&quot; to growing the linked data cloud is that various of us do whatever it takes to get the data out there. Sometimes this is tweaking a PHP script to have an RDF/XML mode or add RDFa / microformats into SQL. Sometimes this is a massive download, clean and republish exercise like DBPedia; sometimes it is conducted by transforming from SQL (D2RQ etc) or XML (GRDDL) or JSON sources. Sometimes data is created afresh, or reworked from other systems (Semantic Mediawiki, MusicBrainz, ...). The only thing that binds it all together is the shared use of standards: RDF for the data model, RDFS/OWL for vocabulary description, URIs for identifiers, SPARQL for querying, SKOS for simple categories, ...</description>
		<content:encoded><![CDATA[<p>A lot of RDF folk are more pragmatic than is sometimes assumed. RDFa is nice, but we also made GRDDL, a system that has much in common with your (perfectly sensible) suggestions, and which uses XSLT as the language for describing such extractors. You might also look at <a href="http://buzzword.org.uk/2008/rdf-ease/spec" rel="nofollow">http://buzzword.org.uk/2008/rdf-ease/spec</a> which does something similar using a CSS-based notation.</p>
<p>Your positive case here stands alone; there is no need to beat up on strawmen (&#8220;It’s obvious that our current approach to building the linked data cloud is just not working.&#8221;) to make it. The &#8220;current approach&#8221; to growing the linked data cloud is that various of us do whatever it takes to get the data out there. Sometimes this is tweaking a PHP script to have an RDF/XML mode or add RDFa / microformats into SQL. Sometimes this is a massive download, clean and republish exercise like DBPedia; sometimes it is conducted by transforming from SQL (D2RQ etc) or XML (GRDDL) or JSON sources. Sometimes data is created afresh, or reworked from other systems (Semantic Mediawiki, MusicBrainz, &#8230;). The only thing that binds it all together is the shared use of standards: RDF for the data model, RDFS/OWL for vocabulary description, URIs for identifiers, SPARQL for querying, SKOS for simple categories, &#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Oleksandr Shturmov</title>
		<link>http://lexandera.com/2009/04/crowdsourcing-the-semantic-web/comment-page-1/#comment-411</link>
		<dc:creator>Oleksandr Shturmov</dc:creator>
		<pubDate>Sun, 19 Apr 2009 04:16:35 +0000</pubDate>
		<guid isPermaLink="false">http://lexandera.com/?p=329#comment-411</guid>
		<description>*they can crowdsource this*</description>
		<content:encoded><![CDATA[<p>*they can crowdsource this*</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Oleksandr Shturmov</title>
		<link>http://lexandera.com/2009/04/crowdsourcing-the-semantic-web/comment-page-1/#comment-410</link>
		<dc:creator>Oleksandr Shturmov</dc:creator>
		<pubDate>Sun, 19 Apr 2009 04:16:03 +0000</pubDate>
		<guid isPermaLink="false">http://lexandera.com/?p=329#comment-410</guid>
		<description>Google. Just as they crowdsourced their image search optimization, they can crowdsource, and thus sponsor, something like this. However, semantics is only half of the NLP problem.</description>
		<content:encoded><![CDATA[<p>Google. Just as they crowdsourced their image search optimization, they can crowdsource, and thus sponsor, something like this. However, semantics is only half of the NLP problem.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: alex</title>
		<link>http://lexandera.com/2009/04/crowdsourcing-the-semantic-web/comment-page-1/#comment-409</link>
		<dc:creator>alex</dc:creator>
		<pubDate>Sun, 19 Apr 2009 01:51:42 +0000</pubDate>
		<guid isPermaLink="false">http://lexandera.com/?p=329#comment-409</guid>
		<description>I just love the patent application language. I could swear that individual words are in English, but sentences are somehow not. :)</description>
		<content:encoded><![CDATA[<p>I just love the patent application language. I could swear that individual words are in English, but sentences are somehow not. :)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Stephen Arnold</title>
		<link>http://lexandera.com/2009/04/crowdsourcing-the-semantic-web/comment-page-1/#comment-408</link>
		<dc:creator>Stephen Arnold</dc:creator>
		<pubDate>Sun, 19 Apr 2009 01:26:11 +0000</pubDate>
		<guid isPermaLink="false">http://lexandera.com/?p=329#comment-408</guid>
		<description>The April 16, 2009 Google patent US20090100036.</description>
		<content:encoded><![CDATA[<p>The April 16, 2009 Google patent US20090100036.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: alex</title>
		<link>http://lexandera.com/2009/04/crowdsourcing-the-semantic-web/comment-page-1/#comment-407</link>
		<dc:creator>alex</dc:creator>
		<pubDate>Sun, 19 Apr 2009 00:13:59 +0000</pubDate>
		<guid isPermaLink="false">http://lexandera.com/?p=329#comment-407</guid>
		<description>@Vincent Murphy: Solvent looks very interesting. I never tried it out because it requires PiggyBank, which in turn requires an older version of Firefox... But yes, as far as the element tagging part goes, it&#039;s almost exactly like what I described in the post. It could be a very useful starting point for a project. Adding more than just Dublin Core support to it might be a good place to start (at least the screencast makes it look like they only support DC).</description>
		<content:encoded><![CDATA[<p>@Vincent Murphy: Solvent looks very interesting. I never tried it out because it requires PiggyBank, which in turn requires an older version of Firefox&#8230; But yes, as far as the element tagging part goes, it&#8217;s almost exactly like what I described in the post. It could be a very useful starting point for a project. Adding more than just Dublin Core support to it might be a good place to start (at least the screencast makes it look like they only support DC).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: alex</title>
		<link>http://lexandera.com/2009/04/crowdsourcing-the-semantic-web/comment-page-1/#comment-406</link>
		<dc:creator>alex</dc:creator>
		<pubDate>Sun, 19 Apr 2009 00:06:39 +0000</pubDate>
		<guid isPermaLink="false">http://lexandera.com/?p=329#comment-406</guid>
		<description>@tim finin: Just had a quick look at Annotea. It appears to be similar from the technical point of view, but they didn&#039;t take the idea beyond attaching notes to elements. Their main focus seems to be the notes you can attach, not the HTML fragments you&#039;re referencing when you&#039;re attaching the note. Maybe there&#039;s some potential for &quot;abusing&quot; this by putting URIs in the notes, but that wouldn&#039;t be exactly user-friendly.</description>
		<content:encoded><![CDATA[<p>@tim finin: Just had a quick look at Annotea. It appears to be similar from the technical point of view, but they didn&#8217;t take the idea beyond attaching notes to elements. Their main focus seems to be the notes you can attach, not the HTML fragments you&#8217;re referencing when you&#8217;re attaching the note. Maybe there&#8217;s some potential for &#8220;abusing&#8221; this by putting URIs in the notes, but that wouldn&#8217;t be exactly user-friendly.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Luis Pereira</title>
		<link>http://lexandera.com/2009/04/crowdsourcing-the-semantic-web/comment-page-1/#comment-398</link>
		<dc:creator>Luis Pereira</dc:creator>
		<pubDate>Sat, 18 Apr 2009 18:46:19 +0000</pubDate>
		<guid isPermaLink="false">http://lexandera.com/?p=329#comment-398</guid>
		<description>Stumpedia is a social semantic project and community effort that relies on human participation and folksonomies to index, organize, and review the world wide web. Their aim is to help build Natural Language Processing and the Semantic Web.</description>
		<content:encoded><![CDATA[<p>Stumpedia is a social semantic project and community effort that relies on human participation and folksonomies to index, organize, and review the world wide web. Their aim is to help build Natural Language Processing and the Semantic Web.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Kyle Maxwell</title>
		<link>http://lexandera.com/2009/04/crowdsourcing-the-semantic-web/comment-page-1/#comment-397</link>
		<dc:creator>Kyle Maxwell</dc:creator>
		<pubDate>Sat, 18 Apr 2009 18:09:06 +0000</pubDate>
		<guid isPermaLink="false">http://lexandera.com/?p=329#comment-397</guid>
		<description>Have you seen parselets.com?  It&#039;s almost exactly this.</description>
		<content:encoded><![CDATA[<p>Have you seen parselets.com?  It&#8217;s almost exactly this.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: tim finin</title>
		<link>http://lexandera.com/2009/04/crowdsourcing-the-semantic-web/comment-page-1/#comment-395</link>
		<dc:creator>tim finin</dc:creator>
		<pubDate>Sat, 18 Apr 2009 15:30:37 +0000</pubDate>
		<guid isPermaLink="false">http://lexandera.com/?p=329#comment-395</guid>
		<description>How does this compare to the W3C&#039;s Annotea project (http://www.w3.org/2001/Annotea/)? Somehow that never seemed to find widespread use and was, apparently, abandoned.</description>
		<content:encoded><![CDATA[<p>How does this compare to the W3C&#8217;s Annotea project (<a href="http://www.w3.org/2001/Annotea/" rel="nofollow">http://www.w3.org/2001/Annotea/</a>)? Somehow that never seemed to find widespread use and was, apparently, abandoned.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Vincent Murphy</title>
		<link>http://lexandera.com/2009/04/crowdsourcing-the-semantic-web/comment-page-1/#comment-394</link>
		<dc:creator>Vincent Murphy</dc:creator>
		<pubDate>Sat, 18 Apr 2009 15:11:01 +0000</pubDate>
		<guid isPermaLink="false">http://lexandera.com/?p=329#comment-394</guid>
		<description>How does Solvent (http://simile.mit.edu/wiki/Solvent) fit into your scheme? I think all that Solvent is missing is a website where you can upload and share your scraper.</description>
		<content:encoded><![CDATA[<p>How does Solvent (<a href="http://simile.mit.edu/wiki/Solvent" rel="nofollow">http://simile.mit.edu/wiki/Solvent</a>) fit into your scheme? I think all that Solvent is missing is a website where you can upload and share your scraper.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
