Characterizing Algebraic Generalization in Linguistic Neural Networks
• Jackson Petty
[PDF]
Abstract
To learn an unbounded problem is to generalize well from a limited set of training data. In humans, robust language acquisition requires language learners to form strong generalizations on the basis of very limited evidence (Chomsky 1980). These generalizations seem to require the acquisition of functional abstractions of some sort. Various analysis of these abstractions have been put forth in the context of replicating this generalization in artificial settings (Fodor & Pylyshyn 1988, G. F. Marcus 1998a, Lake & Baroni 2018). But what connects these descriptions of generalization, and how do they characterize the difficulty of a particular linguistic phenomena in a way which informs our expectations of whether it can be learned by an artificial neural network?
This thesis takes a broad approach to answering this question. First, we explore a formalism to unite the various descriptions of linguistic generalization in a way which establishes a complexity hierarchy for tasks requiring generalization. We then use this typology to explore whether a given task is learnable by simple recurrent networks lacking explicit inductive biases for such generalization. We present constructive evidence that such simple models are in fact capable of learning stronger generalizations than previously thought, raising important questions about the mechanisms by which generalizations can be learned by neural network models of language.