Sequence-to-sequence networks learn the meaning of reflexive anaphora

Robert Frank* and Jackson Petty*

[arXiv] [ACL] [presentation]

Abstract

Reflexive anaphora present a challenge for semantic interpretation: their meaning varies depending on context in a way that appears to require abstract variables. Past work has raised doubts about the ability of recurrent networks to meet this challenge. In this paper, we explore this question in the context of a fragment of English that incorporates the relevant sort of contextual variability. We consider sequence-to-sequence architectures with recurrent units and show that such networks are capable of learning semantic interpretations for reflexive anaphora which generalize to novel antecedents. We explore the effect of attention mechanisms and different recurrent unit types on the type of training data that is needed for success as measured in two ways: how much lexical support is needed to induce an abstract reflexive meaning (i.e., how many distinct reflexive antecedents must occur during training) and what contexts must a noun phrase occur in to support generalization of reflexive interpretation to this noun phrase?
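To make the first of these measures concrete, the sketch below builds a toy version of the kind of train/test split the abstract describes: sentences from a small English fragment are paired with predicate-style meanings, and one name is withheld as a reflexive antecedent during training so that generalization can be tested on exactly those held-out sentences. The lexicon, meaning format, and function names here are illustrative assumptions, not the paper's actual grammar or evaluation code.

# Illustrative sketch (assumed lexicon and meaning format, not the paper's exact grammar):
# a toy English fragment with transitive verbs and reflexive objects, mapped to
# predicate-style meanings, split so one antecedent never appears with a reflexive
# in training.
import random

NAMES = ["Alice", "Claire", "Eliza", "Grace"]   # assumed example lexicon
VERBS = ["sees", "hears", "praises"]            # assumed example lexicon

def meaning(verb, subject, obj):
    """Map a sentence to a simple logical-form string, e.g. SEE(ALICE, ALICE)."""
    return f"{verb[:-1].upper()}({subject.upper()}, {obj.upper()})"

def generate_pairs():
    """Enumerate (sentence, meaning) pairs for the toy fragment."""
    pairs = []
    for subj in NAMES:
        for verb in VERBS:
            # Non-reflexive objects: any name, including the subject itself.
            for obj in NAMES:
                pairs.append((f"{subj} {verb} {obj}", meaning(verb, subj, obj)))
            # Reflexive object: 'herself' corefers with the subject.
            pairs.append((f"{subj} {verb} herself", meaning(verb, subj, subj)))
    return pairs

def split_holding_out_reflexive(pairs, held_out_name="Alice"):
    """Train/test split: the held-out name occurs in training, but never as a
    reflexive antecedent; all of its reflexive sentences go to the test set."""
    train, test = [], []
    for sentence, mng in pairs:
        if sentence.startswith(held_out_name) and sentence.endswith("herself"):
            test.append((sentence, mng))
        else:
            train.append((sentence, mng))
    return train, test

if __name__ == "__main__":
    random.seed(0)
    train, test = split_holding_out_reflexive(generate_pairs())
    print(f"{len(train)} training pairs, {len(test)} held-out reflexive pairs")
    print("example test item:", random.choice(test))

The second measure from the abstract would be probed by additionally varying which non-reflexive contexts (subject position, object position, etc.) the held-out name is allowed to occur in during training; a seq2seq model with recurrent units is then trained on the training pairs and scored on the held-out reflexive sentences.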

Bib(La)TeX Citation

@inproceedings{frank-petty-2020-sequence,
    title = "Sequence-to-Sequence Networks Learn the Meaning of Reflexive Anaphora",
    author = "Frank, Robert  and
      Petty, Jackson",
    editor = "Ogrodniczuk, Maciej  and
      Ng, Vincent  and
      Grishina, Yulia  and
      Pradhan, Sameer",
    booktitle = "Proceedings of the Third Workshop on Computational Models of Reference, Anaphora and Coreference",
    month = dec,
    year = "2020",
    address = "Barcelona, Spain (online)",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.crac-1.16",
    pages = "154--164",
    abstract = "Reflexive anaphora present a challenge for semantic interpretation: their meaning varies depending on context in a way that appears to require abstract variables. Past work has raised doubts about the ability of recurrent networks to meet this challenge. In this paper, we explore this question in the context of a fragment of English that incorporates the relevant sort of contextual variability. We consider sequence-to-sequence architectures with recurrent units and show that such networks are capable of learning semantic interpretations for reflexive anaphora which generalize to novel antecedents. We explore the effect of attention mechanisms and different recurrent unit types on the type of training data that is needed for success as measured in two ways: how much lexical support is needed to induce an abstract reflexive meaning (i.e., how many distinct reflexive antecedents must occur during training) and what contexts must a noun phrase occur in to support generalization of reflexive interpretation to this noun phrase?",
}