The IKS Project aims to bring semantic capabilities to traditional content management systems (CMSes). IKS offers functionality across all CMS layers, from the graphical interface down to the database layer. This functionality is provided by several standalone software tools, the most important of which are Apache Stanbol and Vienna IKS Editables (VIE).
Apache Stanbol provides a set of reusable components for semantic content management. Its intended use is to extend traditional content management systems with semantic services. As its name indicates, Apache Stanbol is developed under the umbrella of the Apache Software Foundation (ASF), where it has graduated to a top-level project.
Stanbol offers semantic enhancement of plain text content through various entity recognition algorithms. After extraction, the recognized entities (places, persons, organizations, etc.) are linked with the corresponding real-world entities in the Linked Open Data (LOD) cloud or in local ontologies introduced to the system. In the persistence layer, Stanbol offers semantic indexing and search over the content and over the additional information retrieved from the LOD cloud about the entities the content mentions.
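The enhancement step described above is typically reached over HTTP. The following is a minimal sketch of how a client might submit plain text for enhancement. It assumes a Stanbol instance running locally at `http://localhost:8080` with the enhancer exposed under `/enhancer`; the URL, the chosen `Accept` format and the helper names `build_enhancer_request` and `enhance` are illustrative assumptions, not part of the original description.

```python
import urllib.request

# Assumption: a locally running Stanbol instance; adjust host, port and path
# to match the actual deployment.
STANBOL_ENHANCER_URL = "http://localhost:8080/enhancer"


def build_enhancer_request(text, url=STANBOL_ENHANCER_URL):
    """Build (but do not send) a POST request carrying plain text to enhance."""
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        headers={
            # The content to analyse is sent as plain text.
            "Content-Type": "text/plain; charset=utf-8",
            # Ask for the enhancement results as RDF in Turtle syntax (assumed format).
            "Accept": "text/turtle",
        },
        method="POST",
    )


def enhance(text, url=STANBOL_ENHANCER_URL):
    """Send the text to the enhancer and return the raw RDF response body."""
    with urllib.request.urlopen(build_enhancer_request(text, url)) as response:
        return response.read().decode("utf-8")
```

Calling `enhance("Paris is the capital of France.")` against a running server would return an RDF description of the recognized and linked entities. Building the request separately from sending it keeps the sketch inspectable without a server.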
The IKS applications aim to address the shortcomings that arise because existing CMSes do not take the semantics of content items into consideration. For instance, traditional CMSes lack advanced semantic lifting and tagging capabilities, which would enable semantically enriched text editors, similarity search, automatic categorization, etc. Beyond semantic tagging, content items could be linked with each other based on their semantic characteristics, which would enable semantic navigation between them. Similarly, existing CMSes lack search capabilities that are integrated with the underlying content semantics. These semantics-related shortcomings are the main motivation behind the technologies provided by the IKS Project.
We took part in the implementation of Stanbol, where we implemented the CMS Adapter, Contenthub and Ontology Manager modules.
The CMS Adapter communicates with CMSes in a standard way using the JCR and CMIS specifications. It retrieves the actual content together with the associated CMS-specific metadata so that both can be semantically indexed by the Contenthub.
The Contenthub allows semi-automatic creation of domain-specific indexes on top of Apache Solr. These indexes cover the plain content, the CMS-related metadata and the semantic enhancements of each document retrieved from the LOD cloud. On top of the indexed content, the Contenthub provides search functionality such as faceted search and SPARQL search.
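To illustrate what faceted search over semantically enriched content means, here is a small in-memory sketch. It does not use Solr or the Contenthub API; the sample documents, the field names (`places`, `organizations`) and the functions `search` and `facet_counts` are hypothetical and only mimic the idea of filtering by, and counting, entity-derived fields.

```python
from collections import Counter

# Hypothetical index entries: plain text plus fields derived from recognized entities.
INDEX = [
    {"id": "doc1", "text": "Semantic search in Paris",
     "places": ["Paris"], "organizations": ["UNESCO"]},
    {"id": "doc2", "text": "Semantic indexing workshop in Berlin",
     "places": ["Berlin"], "organizations": []},
    {"id": "doc3", "text": "Museums of Paris",
     "places": ["Paris"], "organizations": []},
]


def search(index, keyword, filters=None):
    """Return documents whose text contains the keyword and that match every facet filter."""
    filters = filters or {}
    hits = []
    for doc in index:
        if keyword.lower() not in doc["text"].lower():
            continue
        if all(value in doc.get(field, []) for field, value in filters.items()):
            hits.append(doc)
    return hits


def facet_counts(hits, field):
    """Count how often each value of a facet field occurs in the result set."""
    return Counter(value for doc in hits for value in doc.get(field, []))
```

A keyword query such as `search(INDEX, "semantic")` yields a result set whose `places` facet can be counted with `facet_counts`, and a follow-up call with `filters={"places": "Paris"}` narrows the results to documents that mention that entity.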
The Ontology Manager provides an abstraction for managing the enhancement ontologies of content items across different types of triple stores.
In the project, we built a demonstration specific to the health domain. We configured the named entity recognizers of Stanbol to detect health-related entities such as drugs, diseases, adverse events, etc. within documents. Thanks to this configuration, we were able to index health-related documents located in a specific CMS while knowing which diseases or drugs each one mentions. As a result, we were able to provide advanced search functionality that takes this metadata into account.
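The idea behind the health demonstration can be sketched with a simple dictionary-based tagger. This is not how Stanbol's recognizers are configured or implemented; it is a simplified stand-in, and the vocabulary entries and the function `tag_entities` are hypothetical.

```python
import re

# Hypothetical mini-vocabulary; a real setup would draw on a controlled
# health vocabulary rather than a hand-written dictionary.
VOCABULARY = {
    "aspirin": "drug",
    "ibuprofen": "drug",
    "asthma": "disease",
    "nausea": "adverse event",
}


def tag_entities(text, vocabulary=VOCABULARY):
    """Return (surface form, entity type, start offset) for each vocabulary match."""
    mentions = []
    for term, entity_type in vocabulary.items():
        # Match whole words only, ignoring case.
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            mentions.append((match.group(0), entity_type, match.start()))
    # Report mentions in the order they appear in the text.
    return sorted(mentions, key=lambda mention: mention[2])
```

The entity types returned for each document are the kind of metadata that can then be stored alongside the content and used as search filters.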
|2.||Consiglio Nazionale delle Ricerche||Italy|
|4.||Deutsches Forschungszentrum für Künstliche Intelligenz||Germany|
|12.||University of Paderborn||Germany|
|13.||University of St.Gallen||Switzerland|