Design and Implementation of Wiki Content Transformations and Refactorings

Abstract: The organic growth of wikis requires constant attention by contributors who are willing to patrol the wiki and improve its content structure. However, most wikis still only offer textual editing and even wikis which offer WYSIWYG editing do not assist the user in restructuring the wiki. Therefore, “gardening” a wiki is a tedious and error-prone task. One of the main obstacles to assisted restructuring of wikis is the underlying content model which prohibits automatic transformations of the content. Most wikis use either a purely textual representation of content or rely on the representational HTML format. To allow rigorous definitions of transformations we use and extend a Wiki Object Model. With the Wiki Object Model installed we present a catalog of transformations and refactorings that helps users to easily and consistently evolve the content and structure of a wiki. Furthermore we propose XSLT as language for transformation specification and provide working examples of selected transformations to demonstrate that the Wiki Object Model and the transformation framework are well designed. We believe that our contribution significantly simplifies wiki “gardening” by introducing the means of effortless restructuring of articles and groups of articles. It furthermore provides an easily extensible foundation for wiki content transformations.

Keywords: Wiki, Wiki Markup, WM, Wiki Object Model, WOM, Transformation, Refactoring, XML, XSLT, Sweble.

Reference: Hannes Dohrn and Dirk Riehle. “Design and Implementation of Wiki Content Transformations and Refactorings.” In Proceedings of the 9th International Symposium on Open Collaboration (WikiSym + OpenSym 2013). ACM, 2013.

The paper is available as a PDF file.

Design and Implementation of the Sweble Wikitext Parser: Unlocking the Structured Data of Wikipedia

Abstract: The heart of each wiki, including Wikipedia, is its content. Most machine processing starts and ends with this content. At present, such processing is limited, because most wiki engines today cannot provide a complete and precise representation of the wiki’s content. They can only generate HTML. The main reason is the lack of well-defined parsers that can handle the complexity of modern wiki markup. This applies to MediaWiki, the software running Wikipedia, and most other wiki engines. This paper shows why it has been so difficult to develop comprehensive parsers for wiki markup. It presents the design and implementation of a parser for Wikitext, the wiki markup language of MediaWiki. We use parsing expression grammars where most parsers used no grammars or grammars poorly suited to the task. Using this parser it is possible to directly and precisely query the structured data within wikis, including Wikipedia. The parser is available as open source from

Keywords: Wiki, Wikipedia, Wiki Parser, Wikitext Parser, Parsing Expression Grammar, PEG, Abstract Syntax Tree, AST, WYSIWYG, Sweble.

Reference: Hannes Dohrn and Dirk Riehle. “Design and Implementation of the Sweble Wikitext Parser: Unlocking the Structured Data of Wikipedia.” In Proceedings of the 7th International Symposium on Wikis and Open Collaboration (WikiSym 2011). ACM Press, 2011. Pages 72-81.

The paper is available as a PDF file (preprint).

Technical Report on WOM: An Object Model for Wikitext

Abstract: Wikipedia is a rich encyclopedia that is not only of great use to its contributors and readers but also to researchers and providers of third party software around Wikipedia. However, Wikipedia’s content is only available as Wikitext, the markup language in which articles on Wikipedia are written, and whoever needs to access the content of an article has to implement their own parser or has to use one of the available parser solutions. Unfortunately, those parsers which convert Wikitext into a high-level representation like an abstract syntax tree (AST) define their own format for storing and providing access to this data structure. Further, the semantics of Wikitext are only defined implicitly in the MediaWiki software itself. This situation makes it difficult to reason about the semantic content of an article or exchange and modify articles in a standardized and machine-accessible way. To remedy this situation we propose a markup language, called XWML, in which articles can be stored and an object model, called WOM, that defines how the contents of an article can be read and modified.

Keywords: Wiki, Wikipedia, Wikitext, Wikitext Parser, Open Source, Sweble, MediaWiki, MediaWiki Parser, XWML, HTML, WOM

Reference: Hannes Dohrn and Dirk Riehle. WOM: An Object Model for Wikitext. University of Erlangen, Technical Report CS-2011-05 (July 2011).

The technical report is available as a PDF file.

Simultaneous Incremental Reconstruction of Object Geometry and Appearance for Interactive 3-D Model Acquisition

Abstract: While the creation of three-dimensional models from real-life objects is a commonly applied process, the reconstruction of complete datasets from such objects is still a delicate task. In this paper, a simple yet powerful framework is proposed that is able to reconstruct both geometry and appearance of a given object interactively. Working in an incremental mode of operation, it enables the user to reconstruct a given scene with full visual feedback during the progress of the reconstruction process. For geometry reconstruction, an existing surface reconstruction algorithm has been investigated and adjusted to the needs of the framework. Furthermore, a hardware-accelerated surface light field algorithm has been integrated into the framework that performs appearance reconstruction of the object.

The target application of our framework is the reconstruction of real-life objects using mobile acquisition devices. We demonstrate the performance and usefulness of our framework by reconstructing models from previously acquired datasets of real-life objects. Furthermore, we provide results of experiments run in our own model acquisition setup.

Keywords: Image-Based, Acquisition, Light fields, Modeling, Reconstruction, Interaction

Reference: Hannes Dohrn, Hannes Stadler, Marco Winter, and Günther Greiner. “Simultaneous Incremental Reconstruction of Object Geometry and Appearance for Interactive 3-D Model Acquisition”. In Proceedings of the 18th International Conference in Central Europe on Computer Graphics, Visualization and Computer Vision (WSCG 2010). Vaclav Skala – Union Agency, 2011. Pages 97-104.

The paper is available as a PDF file (preprint).

Incremental surface reconstruction methods

Abstract: The goal of this work is to study existing approaches to surface reconstruction and surface light field creation for their suitability for incremental surface reconstruction using a TOF camera, which provides both geometry and color information. The emphasis is put on incremental surface reconstruction techniques, for which two distinct algorithms by Ohtake et al. and Süssmuth et al. will be enhanced and evaluated. The surface light field is built using a method presented by Coombe et al.

Furthermore, an application is implemented which allows for rapid processing of 3D point data, thus creating an accurate 3D model which is used to build a surface light field incrementally. The application is interactive, and new information incorporated into the reconstruction becomes immediately available to the user.

Keywords: Image-Based, Acquisition, Light fields, Modeling, Reconstruction, Interaction

Reference: Hannes Dohrn. “Incremental surface reconstruction methods”. Diplomarbeit, Friedrich-Alexander-Universität Erlangen-Nürnberg, 2009.

The Diplomarbeit is available as a PDF file.

Oberflächenlichtfelder für moderne Grafikhardware (Surface Light Fields for Modern Graphics Hardware)

Abstract: The goal of this work is to implement a system that realizes the basic algorithms needed for interactively extending and rendering a surface light field. In addition, some of the algorithms used are ported to graphics hardware in order to accelerate the creation of a surface light field on modern hardware. Finally, the results achieved are discussed.

Keywords: Image-Based, Acquisition, Light fields

Reference: Hannes Dohrn. “Oberflächenlichtfelder für moderne Grafikhardware”. Studienarbeit, Friedrich-Alexander-Universität Erlangen-Nürnberg, 2008.

The Studienarbeit is available as a PDF file.

Phase-based Gesture Motion Parametrization and Transitions for Conversational Agents with MPML3D

Abstract: We present a method to produce smooth transitions between arbitrary pieces of character animation, which is based on the application of dynamic transition curves. Unlike other approaches, we achieve anytime interruptibility for body expressions, that is, gestures can be changed anytime during execution while maintaining naturalness of motion transition. To obtain highly natural skeletal movement, our approach is integrated with motion parametrization, as proposed in the “Verbs and Adverbs” technique, and further methods of fuzzy motion blending. We will demonstrate how the latest version of the Multimodal Presentation Markup Language (MPML3D) integrates parameterized agent behavior, and can support the incorporation of personality and emotional attentiveness in a straightforward way.


Reference: Klaus Brügmann, Hannes Dohrn, Helmut Prendinger, Marc Stamminger, and Mitsuru Ishizuka. “Phase-based Gesture Motion Parametrization and Transitions for Conversational Agents with MPML3D.” In Proceedings of the 2nd international conference on INtelligent TEchnologies for interactive enterTAINment (INTETAIN ’08). ICST, 2007. Pages 10:1-10:6.

The paper is available as a PDF file.