Journal article

Unifying dimensions in coherence relations: How various annotation frameworks are related

  • Sanders, Ted J.M. Utrecht Institute of Linguistics OTS, Utrecht University, Utrecht, Netherlands
  • Demberg, Vera Computer Science, Saarland Informatics Campus, Saarbrücken, Germany
  • Hoek, Jet Utrecht Institute of Linguistics OTS, Utrecht University, Utrecht, Netherlands
  • Scholman, Merel C.J. Language Science & Technology, Saarland University, Saarbrücken, Germany
  • Asr, Fatemeh Torabi Discourse Processing Lab, Department of Linguistics, Simon Fraser University, Burnaby, Canada
  • Zufferey, Sandrine Institut de Langue et de Littérature françaises, University of Bern, Bern, Switzerland
  • Evers-Vermeul, Jacqueline Utrecht Institute of Linguistics OTS, Utrecht University, Utrecht, Netherlands
Published in:
  • Corpus Linguistics and Linguistic Theory. - Walter de Gruyter GmbH. - 2018, vol. 0, no. 0
English Abstract
In this paper, we show how three often used and seemingly different discourse annotation frameworks – Penn Discourse Treebank (PDTB), Rhetorical Structure Theory (RST), and Segmented Discourse Representation Theory – can be related by using a set of unifying dimensions. These dimensions are taken from the Cognitive approach to Coherence Relations and combined with more fine-grained additional features from the frameworks themselves to yield a posited set of dimensions that can successfully map three frameworks. The resulting interface will allow researchers to find identical or at least closely related relations within sets of annotated corpora, even if they are annotated within different frameworks. Furthermore, we tested our unified dimension (UniDim) approach by comparing PDTB and RST annotations of identical newspaper texts and converting their original end label annotations of relations into the accompanying values per dimension. Subsequently, rates of overlap in the attributed values per dimension were analyzed. Results indicate that the proposed dimensions indeed create an interface that makes existing annotation systems “talk to each other.”
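The evaluation step described in the abstract (converting end labels into values per dimension, then measuring overlap per dimension) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the article's actual mapping: the lookup tables, the dimension names (`polarity`, `basic_operation`, `order`), the values assigned to each label, and the function `overlap_per_dimension` are all hypothetical and chosen only to show the shape of the computation.

```python
from collections import Counter

# ASSUMPTION: illustrative lookup tables only. The labels are drawn from the
# PDTB and RST inventories, but the dimension values assigned here are
# placeholders and do not reproduce the mapping published in the article.
PDTB_TO_DIMS = {
    "Contingency.Cause.Result": {"polarity": "positive", "basic_operation": "causal", "order": "basic"},
    "Comparison.Concession": {"polarity": "negative", "basic_operation": "causal", "order": "basic"},
    "Expansion.Conjunction": {"polarity": "positive", "basic_operation": "additive", "order": "n/a"},
}
RST_TO_DIMS = {
    "Result": {"polarity": "positive", "basic_operation": "causal", "order": "basic"},
    "Explanation": {"polarity": "positive", "basic_operation": "causal", "order": "non-basic"},
    "Concession": {"polarity": "negative", "basic_operation": "causal", "order": "basic"},
    "List": {"polarity": "positive", "basic_operation": "additive", "order": "n/a"},
}


def overlap_per_dimension(pairs):
    """Return, per dimension, the share of label pairs whose values agree.

    `pairs` is an iterable of (pdtb_label, rst_label) tuples, each pair
    annotating the same text span in the two frameworks.
    """
    agree = Counter()
    total = Counter()
    for pdtb_label, rst_label in pairs:
        pdtb_vals = PDTB_TO_DIMS[pdtb_label]
        rst_vals = RST_TO_DIMS[rst_label]
        # Compare only the dimensions that both frameworks specify.
        for dim in pdtb_vals.keys() & rst_vals.keys():
            total[dim] += 1
            if pdtb_vals[dim] == rst_vals[dim]:
                agree[dim] += 1
    return {dim: agree[dim] / total[dim] for dim in total}


if __name__ == "__main__":
    example_pairs = [
        ("Contingency.Cause.Result", "Result"),
        ("Contingency.Cause.Result", "Explanation"),
        ("Expansion.Conjunction", "List"),
        ("Comparison.Concession", "Concession"),
    ]
    for dim, rate in sorted(overlap_per_dimension(example_pairs).items()):
        print(f"{dim}: {rate:.2f}")
```

Comparing values dimension by dimension, rather than comparing end labels directly, is what lets two differently named labels count as partially matching: in the toy data above, a pair can agree on polarity and basic operation while still differing on order.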
Language
  • English
Open access status
green
Identifiers
Persistent URL
https://sonar.ch/global/documents/192783