related: Linked data

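  • Semantic analysis technology originally developed by ClearForest, a start-up founded in 1998. It was acquired by Reuters (now Thomson Reuters) in 2007.
  • Uses natural language processing (NLP) to read web documents and tag, in real time, the people, places, companies, facts and events they mention, which improves search and navigation and makes it easier for users to explore related information. 1)
  • OpenCalais output consists of three parts:
    • Categorization of the content using IPTC news codes and social tags
    • Topic tags for the people, events, places and organizations in the content, returned as RDF
    • A document identifier, returned so that content can be shared with others, for example through the Linking Open Data (LOD) cloud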
Founded in 1998, ClearForest was previously an independent software start-up. It was acquired by Reuters in 2007 and is now part of the Markets division of Thomson Reuters. 2)

OpenCalais categorizes each piece of content using both IPTC news codes and ‘social tags.’ (For instance, if a story compares the racing performance of Ferraris vs. Porsches, it will suggest auto racing, motorsport and sports cars.)
It then identifies and tags the people, places, companies, facts and events in content, and returns those tags in the official W3C Semantic Web specification for metadata, Resource Description Framework (RDF).
It also returns a unique document identifier that makes it easy to share content with others, as well as links to related assets in the Linking Open Data (LOD) cloud – a rapidly growing ecosystem of open data that includes Wikipedia, The CIA World Fact Book, GeoNames, BBC News, The New York Times and more. 3)
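
The three kinds of output described above (categories, typed entity tags, and a document identifier, all carried in RDF) can be illustrated with a small parsing sketch. It is not the actual OpenCalais response format: the `ex:` vocabulary, the `http://example.org/...` identifiers and the `summarize` helper are assumptions made up for illustration, and only the RDF namespace URI is the real W3C one.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"  # real W3C RDF namespace
EX_NS = "http://example.org/sketch#"  # hypothetical vocabulary, not OpenCalais's

# Hand-written sample in RDF/XML; the structure is assumed for illustration only.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:ex="{EX_NS}">
  <rdf:Description rdf:about="http://example.org/doc/1">
    <ex:category>auto racing</ex:category>
    <ex:category>sports cars</ex:category>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/entity/ferrari">
    <ex:type>Company</ex:type>
    <ex:name>Ferrari</ex:name>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/entity/porsche">
    <ex:type>Company</ex:type>
    <ex:name>Porsche</ex:name>
  </rdf:Description>
</rdf:RDF>"""


def summarize(rdf_xml):
    """Return (categories, entities grouped by type) from the sample RDF/XML."""
    root = ET.fromstring(rdf_xml)
    categories = []
    entities = defaultdict(list)
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        # Collect any category labels attached to this resource.
        categories.extend(c.text for c in desc.findall(f"{{{EX_NS}}}category"))
        # A resource with both a type and a name is treated as an entity tag.
        etype = desc.find(f"{{{EX_NS}}}type")
        name = desc.find(f"{{{EX_NS}}}name")
        if etype is not None and name is not None:
            entities[etype.text].append(name.text)
    return categories, dict(entities)


categories, entities = summarize(SAMPLE)
print(categories)
print(entities)
```

Grouping entities by type mirrors the people / places / companies split in the description; a real consumer would more likely use a full RDF library and the service's published vocabulary.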


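  • A data interface (API) for accessing linked data in the cloud, offering metadata generation services:
  • Web CMS tools: currently available for Drupal, WordPress (Tagaroo) and Microsoft Office SharePoint Server 2007 (MOSS 2007 OpenCalais Integration)
  • The Calais add-on for Firefox
  • Developer tools and libraries
    • Marmoset: automatic microformat generation service (PHP only)
  • Submission Tool: a publishing tool that uploads documents in xml, txt or htm/html format from a local machine; once Calais has finished processing them, the resulting XML/RDF documents are saved back to the local machine.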
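
As a rough illustration of the Submission Tool workflow described above, the sketch below picks out the local files in the accepted formats and pairs each with the path where a returned XML/RDF result could be stored. The `plan_outputs` helper and the `.rdf` output suffix are assumptions for illustration; the upload and processing step itself is left out because its interface is not described here.

```python
from pathlib import Path

# Input formats named in the description of the Submission Tool.
ACCEPTED = {".xml", ".txt", ".htm", ".html"}


def plan_outputs(folder, out_suffix=".rdf"):
    """Pair each accepted input file with a hypothetical output path.

    The naming scheme (same stem, `.rdf` suffix) is an assumption,
    not the tool's documented behaviour.
    """
    pairs = []
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() in ACCEPTED:
            pairs.append((path, path.with_name(path.stem + out_suffix)))
    return pairs


if __name__ == "__main__":
    for source, target in plan_outputs("."):
        print(f"{source.name} -> {target.name}")
```

Keeping the output suffix outside the accepted input set means a second run over the same folder does not pick up earlier results as new input.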

OpenCalais uses natural language processing (NLP) to “read” a document, instantly identifying and tagging the relevant people, places, companies, facts and events for improved search and navigation. This will make it easy for users to explore connections between newsmakers, corporations and events across documents and across the full collection of source materials. 4)

The OpenCalais initiative supports the interoperability of content and advances the Thomson Reuters mission to deliver intelligent information. It offers free metadata generation services, developer tools and an automatic connection to the Linked Data cloud. The free OpenCalais service and open API make it easy to automate content operations, enhance content, increase audience engagement and extend distribution across the content ecosystem. 5)


The proof-of-concept services offered by OpenCalais are one realization of Tim Berners-Lee's vision of the Semantic Web.

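Gnosis is a tool installed in the browser and sits at the cutting edge of real-time semantic processing on the web. It evaluates the web pages you read as you read them, immediately locating key information such as people, organizations, companies, products and geographies in the text. Hover the mouse over any highlighted topic and Gnosis shows a small pop-up, which varies by entity type (person, company, place and so on), with links to relevant news, blog posts, maps, company information and Wikipedia entries. The browser sidebar also shows the complete list of entities Gnosis found on the page, giving the user a bird's-eye view of the topics the page covers.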

Gnosis is the cutting edge of real-time semantic processing for the web. Gnosis evaluates web pages you read as you read them, immediately locating key information such as people, organizations, companies, products and geographies within the text. Simply hover over any of the identified topics, and immediately locate relevant news, blog entries, maps, company information and Wikipedia entries. 6)