Weiand, Klara (2011): Keyword-Based Querying for the Social Semantic Web: The KWQL Language: Concept, Algorithm and System. Dissertation, LMU München: Faculty of Mathematics, Computer Science and Statistics |
Preview |
PDF
Weiand_Klara.pdf 7MB |
Abstract
Enabling non-experts to publish data on the web is an important achievement of the social web and one of the primary goals of the social semantic web. Making the data easily accessible in turn has received only little attention, which is problematic from the point of view of incentives: users are likely to be less motivated to participate in the creation of content if the use of this content is mostly reserved to experts. Querying in semantic wikis, for example, is typically realized in terms of full text search over the textual content and a web query language such as SPARQL for the annotations. This approach has two shortcomings that limit the extent to which data can be leveraged by users: combined queries over content and annotations are not possible, and users either are restricted to expressing their query intent using simple but vague keyword queries or have to learn a complex web query language. The work presented in this dissertation investigates a more suitable form of querying for semantic wikis that consolidates two seemingly conflicting characteristics of query languages, ease of use and expressiveness. This work was carried out in the context of the semantic wiki KiWi, but the underlying ideas apply more generally to the social semantic and social web. We begin by defining a simple modular conceptual model for the KiWi wiki that enables rich and expressive knowledge representation. A component of this model are structured tags, an annotation formalism that is simple yet flexible and expressive, and aims at bridging the gap between atomic tags and RDF. The viability of the approach is confirmed by a user study, which finds that structured tags are suitable for quickly annotating evolving knowledge and are perceived well by the users. The main contribution of this dissertation is the design and implementation of KWQL, a query language for semantic wikis. KWQL combines keyword search and web querying to enable querying that scales with user experience and information need: basic queries are easy to express; as the search criteria become more complex, more expertise is needed to formulate the corresponding query. A novel aspect of KWQL is that it combines both paradigms in a bottom-up fashion. It treats neither of the two as an extension to the other, but instead integrates both in one framework. The language allows for rich combined queries of full text, metadata, document structure, and informal to formal semantic annotations. KWilt, the KWQL query engine, provides the full expressive power of first-order queries, but at the same time can evaluate basic queries at almost the speed of the underlying search engine. KWQL is accompanied by the visual query language visKWQL, and an editor that displays both the textual and visual form of the current query and reflects changes to either representation in the other. A user study shows that participants quickly learn to construct KWQL and visKWQL queries, even when given only a short introduction. KWQL allows users to sift the wealth of structure and annotations in an information system for relevant data. If relevant data constitutes a substantial fraction of all data, ranking becomes important. To this end, we propose PEST, a novel ranking method that propagates relevance among structurally related or similarly annotated data. Extensive experiments, including a user study on a real life wiki, show that pest improves the quality of the ranking over a range of existing ranking approaches.
Item Type: | Theses (Dissertation, LMU Munich) |
---|---|
Keywords: | wiki, querying, social web, web search, keyword search, fuzzy matching, ranking |
Subjects: | 000 Computers, Information and General Reference > 004 Data processing computer science 000 Computers, Information and General Reference |
Faculties: | Faculty of Mathematics, Computer Science and Statistics |
Language: | English |
Date of oral examination: | 8. February 2011 |
1. Referee: | Bry, François |
MD5 Checksum of the PDF-file: | c4d9d5bfd143d07b41f0369356b1a8bb |
Signature of the printed copy: | 0001/UMC 19255 |
ID Code: | 12671 |
Deposited On: | 21. Feb 2011 13:20 |
Last Modified: | 24. Oct 2020 04:07 |