Research interests

Keywords: hyperdimensional computing, vector symbolic architectures, tensor product representations, learning as Bayesian program induction, program synthesis, empiricism vs nativism, embodied learning, theoretical models of memory.

My research seeks to develop cognitively and biologically plausible, mathematically-principled deep learning models. These efforts are currently organised around three interconnected research themes:

1. The Paradox of Cognition & Distributed Compositional Architectures: The enduring connectionist/classicist debate in cognitive science centres on a core challenge: how can the fluid, probabilistic, and graded dimensions of human cognition be reconciled with its structured, rule-governed, and compositional aspects? These two cognitive paradigms - one grounded in continuous, statistical processes (the neural), the other in discrete, algebraic operations (the symbolic) - must ultimately be unified within a single cognitive architecture to account for the full richness of human cognition (see Smolensky’s Paradox of Cognition). Inspired by the insights of the Integrated Connectionist/Symbolic (ICS) Theory developed by Smolensky, Legendre & Miyata, I have focussed on distributed compositional frameworks - particularly Tensor Product Representations (TPRs) and Vector Symbolic Architectures (VSAs) - as a viable means of resolving this paradox. Grounded in fully continuous mathematics and thus “connectionist all the way down”, these frameworks reliably exhibit emergent symbolic behaviour at higher levels of abstraction by virtue of their carefully orchestrated mathematical design. This phenomenon finds an elegant parallel in physics: while the probabilistic wavefunctions of quantum mechanics provide the most accurate micro-level description of reality, the same principles can be approximated at macroscopic scales by the stable, deterministic laws of classical mechanics.
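As a concrete illustration, the core TPR operations - role-filler binding via the outer product, superposition, and unbinding - can be sketched in a few lines (a minimal sketch assuming NumPy; the role and filler names and dimensionalities are illustrative, not drawn from any specific model):

```python
# Minimal Tensor Product Representation (TPR) sketch; illustrative only.
import numpy as np

rng = np.random.default_rng(0)
d_f, d_r = 64, 64                      # filler and role dimensionalities

# Random filler vectors (the "what") and orthonormal role vectors (the "where").
fillers = {s: rng.standard_normal(d_f) for s in ("john", "mary")}
roles_q, _ = np.linalg.qr(rng.standard_normal((d_r, d_r)))
roles = {"agent": roles_q[:, 0], "patient": roles_q[:, 1]}

# Bind each filler to its role via the outer product, then superpose:
# T = sum_i f_i (x) r_i encodes the structure "john loves mary".
T = (np.outer(fillers["john"], roles["agent"])
     + np.outer(fillers["mary"], roles["patient"]))

# Because the roles are orthonormal, unbinding is exact: T @ r_i recovers f_i.
recovered = T @ roles["agent"]
assert np.allclose(recovered, fillers["john"])
```

Note how the bound structure lives in a d_f × d_r tensor - the exponential dimensional growth over nesting depth that VSAs are designed to avoid.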

In the initial stages of my MPhil, I conducted an extensive investigation of the TPR framework, which culminated in the development of the Soft TPR framework (a first-authored publication at NeurIPS). This work introduces a novel TPR-based compositional representation, while also providing a concrete implementational strategy for learning such representations. Specifically, the Soft TPR framework: 1) formalises and learns flexible, quasi-compositional representations (termed Soft TPRs) that approximate the precise analytical form prescribed by classical TPRs, 2) generates these representations in a single step, thus obviating the need to token individual constituents to form the compositional representation (once the network is fully trained), and 3) extends TPR-based representation learning to weakly-supervised, non-formal data domains.

More recently, my attention has turned to VSAs, which retain the benefits of distributed compositionality characteristic of TPRs, while mitigating exponential dimensional growth, enhancing representational flexibility, and offering greater neural plausibility compared to their TPR predecessor. Taken together, these investigations aim to bridge the gap between the neural and symbolic paradigms of cognition, moving us closer to a truly unified theory of mind.
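To make the contrast with TPRs concrete, here is a minimal sketch of VSA-style binding via circular convolution, in the spirit of Plate's Holographic Reduced Representations (assuming NumPy; all names and dimensionalities are illustrative):

```python
# VSA binding via circular convolution (HRR-style); illustrative sketch.
import numpy as np

rng = np.random.default_rng(1)
d = 1024
new_vec = lambda: rng.standard_normal(d) / np.sqrt(d)   # ~unit-norm atoms

agent, patient, john, mary = new_vec(), new_vec(), new_vec(), new_vec()

def bind(a, b):
    # Circular convolution: the bound pair remains d-dimensional,
    # unlike the TPR outer product (d -> d^2).
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=d)

def unbind(t, a):
    # Correlate with the involution a* (approximate inverse of a).
    return bind(t, np.concatenate(([a[0]], a[:0:-1])))

T = bind(agent, john) + bind(patient, mary)

# Unbinding is approximate: the result is noisy, but a clean-up memory
# (nearest neighbour over the stored fillers) recovers the right atom.
noisy = unbind(T, agent)
best = max({"john": john, "mary": mary}.items(),
           key=lambda kv: noisy @ kv[1])[0]
assert best == "john"
```

The fixed dimensionality under binding, together with the reliance on a clean-up memory, is precisely what buys the representational flexibility and neural plausibility mentioned above.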

2. Bayesian Program Induction for AGI: A second focal point of my research addresses a fundamental question in cognitive science: How do humans, even in infancy, rapidly acquire complex concepts, causal relationships, and intuitive theories about their physical and social environments despite highly limited data, which renders the learning problem statistically underconstrained? The Child-as-Scientist framework provides an illuminating perspective, conceptualising human learning as an active process of theory formation and revision. Of particular interest to me is the computational instantiation of this framework provided by the Bayesian program induction approach, which formalises theories as probabilistic programs that maximise posterior probability given the observed data. Although the use of probabilistic programs confers notable expressive power - capturing causal, counterfactual and explanatory dimensions of cognition - such approaches face significant computational challenges due to the combinatorially vast space of possible theories.
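The core logic of posterior-guided program search can be sketched with a toy example - a hand-written DSL of three primitives, a simplicity prior, and exhaustive enumeration (purely illustrative; real systems search vastly larger, probabilistic spaces):

```python
# Toy Bayesian program induction: score candidate programs by prior x likelihood.
import itertools, math

# Hypothesis space: compositions of primitive integer functions (a tiny DSL).
primitives = {"inc": lambda x: x + 1,
              "double": lambda x: 2 * x,
              "square": lambda x: x * x}

def programs(max_len):
    for n in range(1, max_len + 1):
        yield from itertools.product(primitives, repeat=n)

def run(names, x):
    for name in names:
        x = primitives[name](x)
    return x

data = [(1, 4), (2, 6), (3, 8)]        # observations of the latent program

def log_posterior(names):
    log_prior = -len(names) * math.log(len(primitives))   # simplicity prior
    log_lik = sum(0.0 if run(names, x) == y else -math.inf for x, y in data)
    return log_prior + log_lik

# The shortest program consistent with the data maximises the posterior.
best = max(programs(3), key=log_posterior)
print(best)   # -> ('inc', 'double'), i.e. f(x) = 2(x + 1)
```

Even here the candidate count grows exponentially with program length - exactly the intractability that the prior-based and embodied strategies below aim to tame.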

I am interested in scaling computational implementations of Child-as-Scientist by drawing on insights from both the empiricist and nativist traditions in cognitive science. First, I am interested in exploring how innate concepts and core knowledge - informed by Elizabeth Spelke’s work on domain-specific cognitive structures (e.g., object, cardinality) - can function as strong priors that effectively constrain the hypothesis space, rendering the search for the Bayes-optimal hypothesis (theory) more tractable. Second, I am interested in the potential of embodied, egocentric data generation, where active interaction with the environment (modulated by internal goals, states, and evolving hypotheses) adaptively refines priors in real time and directs data collection toward the most relevant portions of the search space. These strategies offer promising avenues for overcoming the computational intractability that currently limits large-scale implementations of the Child-as-Scientist paradigm.

3. Integrating Distributed Compositional Frameworks with Deep Learning: A key objective of my current research agenda is to translate the mathematical and theoretical formalism of VSAs and TPRs into modern deep learning architectures. By embedding the mathematics of these distributed compositional frameworks directly into neural networks, it becomes possible to encode symbolic structures as high-dimensional vectors and implement symbolic computations (e.g., binary tree manipulations) through parallelisable, continuous operations. This approach holds the promise of preserving interpretability and compositionality - hallmarks of symbolic methods - while benefitting from the scalability, adaptiveness, gradedness, and continuous optimisation strategies intrinsic to distributed systems. Several promising directions naturally follow:

  • Combinatorial Search for Program Synthesis: By leveraging the equivalence between a) symbolic binary trees (i.e., program parse trees) and their VSA encodings, and b) sequential symbolic manipulations (e.g., compositions of car/cdr/cons operations) and parallelisable linear transformations on these VSA encodings, distributed compositional frameworks - when integrated into neural networks - may help mitigate the complexity of searching over combinatorial, discrete program spaces. Moreover, specialised VSA-based optimisation strategies, such as resonator networks, could further improve the efficiency of combinatorial search.
  • Neural Abstract Machines: Although some program synthesis approaches exploit the Church-Turing equivalence to induce neural networks capable of representing algorithmic procedures, conventional neural implementations of abstract machines (e.g., the Neural Turing Machine, neural pushdown automata) often depend on symbolic memory structures, such as matrix-based memory and symbolic stacks. Replacing these components with VSA-based representations not only expands storage capacity - enabling exponential growth in the number of storable patterns with respect to dimensionality - but may also enhance the stability of gradient-based training and representational flexibility, thus offering a more robust and scalable pathway for learning implicit programs.
  • Compositional Concept Learning: An additional avenue of investigation involves the possibility of using an integrated VSA/deep learning system as an implementational framework for conceptual role semantics. Although one might initially regard the randomly initialised, high-dimensional “seed primitives” of VSAs as a limitation - given their lack of intrinsic semantic content - this very characteristic can be viewed differently in light of the conceptual role semantics framework. Specifically, these primitives can be understood as arbitrary states in a dynamical system, such that their systematic combination engenders a network of vector-symbolic structures. Within this network, meaning would thus emerge from the functional interrelationships among these seed vectors, rather than from any predefined semantics at the level of individual primitives.
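To illustrate the first of these directions, a binary tree (cons cell) can be encoded in a VSA using fixed random permutations as the left/right roles, so that cons and car become purely linear, parallelisable operations (a minimal sketch assuming NumPy; this encoding is one simple choice among many, not a specific published scheme):

```python
# Encoding binary trees in a VSA with permutation-based role binding.
import numpy as np

rng = np.random.default_rng(2)
d = 2048
perm_l, perm_r = rng.permutation(d), rng.permutation(d)
inv_l = np.argsort(perm_l)             # inverse of the left-role permutation

def cons(a, b):
    # Tree node: permute each child into its role, then superpose.
    # Both operations are linear maps, hence fully parallelisable.
    return a[perm_l] + b[perm_r]

def car(t):
    # Apply the inverse left permutation; result is approximate and
    # requires a clean-up memory to snap back to a stored atom.
    return t[inv_l]

atoms = {name: rng.standard_normal(d) for name in ("x", "y", "z")}
tree = cons(atoms["x"], cons(atoms["y"], atoms["z"]))   # (x . (y . z))

noisy = car(tree)                       # = x + permuted-subtree "noise"
best = max(atoms, key=lambda k: noisy @ atoms[k])
assert best == "x"
```

Because car/cdr reduce to fixed linear transformations, sequences of such structural manipulations compose into a single matrix - the property the combinatorial-search direction above seeks to exploit.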

4. Theoretical Models of Memory: I have recently become captivated by theoretical models of memory, including Sparse Distributed Memory and Hopfield Networks.
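As a minimal illustration of the latter, a classical Hopfield network stores ±1 patterns with the Hebbian rule and retrieves them as attractors of the recurrent dynamics (a sketch assuming NumPy; the parameters are illustrative, and real analyses use asynchronous updates and capacity bounds around 0.138d patterns):

```python
# Classical Hopfield network: Hebbian storage, attractor-based recall.
import numpy as np

rng = np.random.default_rng(3)
d, n_patterns = 200, 5
patterns = rng.choice([-1, 1], size=(n_patterns, d))

# Hebbian outer-product storage; zero the diagonal (no self-connections).
W = patterns.T @ patterns / d
np.fill_diagonal(W, 0)

def recall(state, steps=10):
    for _ in range(steps):             # synchronous updates, for brevity
        state = np.sign(W @ state)
        state[state == 0] = 1
    return state

# Corrupt 10% of a stored pattern, then let the dynamics clean it up.
probe = patterns[0].copy()
flip = rng.choice(d, size=d // 10, replace=False)
probe[flip] *= -1

overlap = (recall(probe) == patterns[0]).mean()   # fraction of bits restored
```

Well below capacity, as here, the corrupted probe falls back into the stored attractor - the same content-addressable, clean-up behaviour that VSAs lean on for robust retrieval.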