Mixture of Cognitive Reasoners

Modular Reasoning with Brain-Like Specialization

EPFL

Abstract

Human intelligence emerges from the interaction of specialized brain networks, each dedicated to distinct cognitive functions such as language processing, logical reasoning, social understanding, and memory retrieval. Inspired by this biological observation, we introduce the Mixture of Cognitive Reasoners (MiCRo) architecture and training paradigm: a modular transformer-based language model with a training curriculum that encourages the emergence of functional specialization among its modules. Drawing on studies in neuroscience, we partition the layers of a pretrained transformer into four expert modules, each corresponding to a well-studied cognitive brain network. Our brain-like model offers three key benefits over the state of the art. First, the specialized experts are highly interpretable and functionally critical: removing a module significantly impairs performance on domain-relevant benchmarks. Second, our model outperforms comparable baselines that lack specialization on seven reasoning benchmarks. Third, the model's behavior can be steered at inference time by selectively emphasizing certain expert modules (e.g., favoring social over logical reasoning), enabling fine-grained control over the style of its responses. Our findings suggest that inductive biases inspired by the functional organization of human cognition lead to significant modeling gains in interpretability, performance, and controllability.

Preliminaries

The Human Language Network

Fedorenko, E., Ivanova, A.A. & Regev, T.I. The language network as a natural kind within the broader landscape of the human brain. Nat. Rev. Neurosci. 25, 289-312 (2024).

Our model includes four expert modules inspired by how the human brain organizes thinking into specialized systems. The Language expert reflects areas that process written and spoken words and are not involved in tasks like math or music. The Logic expert mimics regions responsible for reasoning and problem-solving, which become more active during difficult tasks and are linked to general intelligence. The Social expert is based on brain systems that help us understand others’ thoughts and emotions, such as interpreting sarcasm or showing empathy. Finally, the World expert is modeled after the brain's "default mode network," which is active during daydreaming, recalling memories, and imagining future events. This system helps connect ideas over time, enabling the model to understand longer stories or events. Together, these modules help the model reflect core aspects of human thought.
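To make the modular design concrete, below is a minimal PyTorch sketch of how one MiCRo-style layer could be organized: four named expert blocks and a linear router that sends each token to its top-scoring expert. The expert names mirror the four modules above, but the class, dimensions, and routing details are illustrative assumptions rather than the released implementation.

import torch
import torch.nn as nn

EXPERT_NAMES = ["language", "logic", "social", "world"]  # the four cognitive experts

class CognitiveExpertLayer(nn.Module):
    """Sketch of one MiCRo-style layer: a router picks one expert per token."""

    def __init__(self, d_model: int = 512, n_heads: int = 8):
        super().__init__()
        # Each expert is a full transformer block (attention + MLP).
        self.experts = nn.ModuleDict({
            name: nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            for name in EXPERT_NAMES
        })
        # The router scores the four experts for every token.
        self.router = nn.Linear(d_model, len(EXPERT_NAMES))

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq_len, d_model)
        scores = self.router(hidden)              # (batch, seq_len, 4)
        choice = scores.argmax(dim=-1)            # top-1 expert per token
        out = torch.zeros_like(hidden)
        for idx, name in enumerate(EXPERT_NAMES):
            mask = choice == idx                  # tokens routed to this expert
            if mask.any():
                out[mask] = self.experts[name](hidden)[mask]
        return out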


Training Curriculum for Inducing Specialization

Our brain-inspired Mixture of Cognitive Reasoners (MiCRo) model contains four experts per layer, each aligned with a distinct cognitive network in the brain. In Stage 1, we train only the expert modules using a small, curated dataset $D_{\text{experts}}$ (see example on the left), providing each expert with an initial inductive bias. In Stage 2, we freeze the experts and train the router on the same dataset to learn expert selection. In Stage 3, we fine-tune the entire model end-to-end on a large-scale instruction-tuning dataset.
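A minimal sketch of how the three-stage curriculum could be wired up, assuming each layer exposes `experts` and `router` submodules as in the layer sketch above; the helper names and optimizer setup are placeholders, not the paper's training code.

import torch

def set_trainable(module, flag: bool):
    for p in module.parameters():
        p.requires_grad = flag

def configure_stage(model, stage: int):
    """Freeze/unfreeze parameter groups for the three-stage curriculum (sketch).

    Stage 1: train only the expert modules on the curated dataset D_experts.
    Stage 2: freeze the experts and train only the routers on D_experts.
    Stage 3: unfreeze everything for end-to-end instruction tuning.
    """
    for layer in model.layers:  # assumes each layer exposes .experts and .router
        set_trainable(layer.experts, stage in (1, 3))
        set_trainable(layer.router, stage in (2, 3))

# Example: rebuild the optimizer after each stage change.
# configure_stage(model, stage=1)
# optimizer = torch.optim.AdamW(
#     (p for p in model.parameters() if p.requires_grad), lr=1e-5)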


Experiments & Findings

Token Routing Patterns Across Experts and Layers. (a-d) Proportion of tokens routed to each expert on four datasets: (a) MATH, (b) Empathy, (c) MMLU (Humanities), and (d) GSM8K. For each token, we report the most frequently selected expert across layers. (e-f) Layer-wise token routing distributions on the Empathy and GSM8K datasets, showing a strong early preference for the Language expert and specialization to the relevant cognitive expert in later layers. (g) Example prompt with routed expert assignments for each generated token, illustrating that routing is semantically coherent. Results are from the MiCRo-Llama model; as in (a-d), each token is labeled with its most frequently selected expert across layers.

Token Routing Patterns Across Experts and Layers
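The per-token expert labels in panels (a-d) and (g) are obtained by taking, for each token, the expert selected most often across layers. The snippet below sketches that aggregation, assuming router scores have been collected into a (n_layers, seq_len, n_experts) tensor (e.g., via forward hooks); this data layout is an assumption for illustration.

import torch

EXPERT_NAMES = ["language", "logic", "social", "world"]

def dominant_expert_per_token(router_scores: torch.Tensor) -> list:
    """Map each token to its most frequently selected expert across layers.

    router_scores: (n_layers, seq_len, n_experts) scores collected during a
    forward pass (hypothetical hook output, not the released API).
    """
    choices = router_scores.argmax(dim=-1)                     # (n_layers, seq_len)
    # Count, for every token, how often each expert wins across layers.
    counts = torch.stack([(choices == e).sum(dim=0)
                          for e in range(len(EXPERT_NAMES))])  # (n_experts, seq_len)
    dominant = counts.argmax(dim=0)                            # (seq_len,)
    return [EXPERT_NAMES[i] for i in dominant.tolist()]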

Comparing Modular Architectures with Varying Degrees of Specialization. Evaluation of models trained on the same data but differing in architectural design, across seven benchmarks and two base models. No Experts refers to a baseline fine-tuned without any modularity. General uses random expert assignments during Stage 1, making all experts general-purpose. Specialized is our proposed MiCRo model, which routes each sentence to a pseudo-labeled expert aligned with a specific cognitive domain. Specialized* denotes the same model with one irrelevant expert removed for each benchmark to isolate the impact of specialization. Specialized** increases test-time compute by selecting the top-2 experts instead of the top-1 for the same model. Best results for models with comparable compute are shown in bold; best results overall are underlined.

Table with Results of Modular Architectures
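The Specialized** setting spends more test-time compute by routing each token to its top-2 experts instead of the single best one. Below is a sketch of one way such top-k routing could be implemented, mixing the selected experts' outputs with their softmax router weights; the experts and router arguments follow the earlier layer sketch and are illustrative, not the paper's code.

import torch
import torch.nn.functional as F

def route_top_k(hidden, experts, router, k: int = 2):
    """Top-k routing sketch: mix the outputs of the k best experts per token,
    weighted by the router's softmax scores. `experts` is an ordered list of
    expert modules and `router` a linear scorer (illustrative interface)."""
    scores = router(hidden)                              # (batch, seq, n_experts)
    topk_scores, topk_idx = scores.topk(k, dim=-1)       # (batch, seq, k)
    weights = F.softmax(topk_scores, dim=-1)
    # Run every expert, then gather the k selected outputs for each token.
    expert_outs = torch.stack([e(hidden) for e in experts], dim=-2)
    index = topk_idx.unsqueeze(-1).expand(*topk_idx.shape, hidden.size(-1))
    chosen = torch.gather(expert_outs, dim=-2, index=index)   # (batch, seq, k, d)
    return (weights.unsqueeze(-1) * chosen).sum(dim=-2)       # (batch, seq, d)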

Steering Model Behavior via Expert Ablation. This figure shows the effect of removing individual expert modules on performance, compared to the full model using all four experts. The Language expert is excluded from the ablation analysis because removing it leads to a substantial performance drop across all benchmarks, as shown in Appendix A.3. MiCRo's absolute performance without any ablation is reported at the bottom of each subplot. (a) On MATH, removing the Logic expert significantly reduces performance, while removing the Social expert leads to a slight improvement. (b) GSM8K exhibits a similar pattern. (c-d) For MMLU and BBH, which cover a broad range of subtasks, we use GPT-4o to cluster the subtasks into four groups aligned with our expert domains (see Table 6 in the Appendix) and report performance drops within each cluster to show how expert ablation impacts different task types.

Steering Model Behavior via Expert Ablation
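This kind of ablation can be reproduced at inference time by preventing the router from ever selecting a given expert. The sketch below masks that expert's router score to negative infinity with a forward hook, so its tokens are re-routed to the remaining experts; the model.layers and router attributes are assumptions carried over from the earlier sketches.

EXPERT_NAMES = ["language", "logic", "social", "world"]

def ablate_expert(model, expert_name: str):
    """Disable one expert at inference by masking its router score to -inf,
    so tokens are re-routed to the remaining experts (sketch)."""
    idx = EXPERT_NAMES.index(expert_name)

    def mask_logits(module, inputs, output):
        output = output.clone()
        output[..., idx] = float("-inf")   # this expert can never win the argmax
        return output                      # returned tensor replaces the router output

    handles = [layer.router.register_forward_hook(mask_logits)
               for layer in model.layers]
    return handles  # call h.remove() on each handle to restore the full model

# Example: handles = ablate_expert(model, "social")  # steer away from social reasoning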

Neuroscience Localizers Identify Functionally Specialized Experts. (a) MiCRo-Llama and (b) MiCRo-OLMo. For each model, we apply three neuroscience-inspired localizers—Language, Multiple Demand (MD), and Theory of Mind (ToM)—to examine how localized units are distributed across experts and layers. Each plot shows the percentage of the top 1% most selective units, as identified by each localizer, within each expert module across layers. The language localizer consistently highlights units in the Language expert, while the MD localizer mainly identifies the Logic expert, confirming that these modules capture distinct cognitive functions aligned with their intended roles. However, the ToM localizer does not identify units in the Social expert, likely due to the limited size of the localization dataset.

Neuroscience Localizers Identify Functionally Specialized Experts
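The localizers follow the standard functional-localization recipe from neuroscience: present stimuli from a preferred condition and a control condition, score each unit by how strongly it distinguishes the two, and keep the top 1% most selective units. The sketch below illustrates this with a two-sample t-test over per-unit activations (e.g., sentences vs. strings of non-words for the language localizer); the array layout and function name are assumptions, not the exact localizer code used in the paper.

import numpy as np
from scipy.stats import ttest_ind

def localize_top_units(act_preferred, act_control, top_frac=0.01):
    """Return indices of the top `top_frac` most condition-selective units.

    act_preferred, act_control: arrays of shape (n_stimuli, n_units) holding
    one activation value per unit per stimulus (assumed data layout).
    """
    t_values, _ = ttest_ind(act_preferred, act_control, axis=0)
    n_top = max(1, int(top_frac * act_preferred.shape[1]))
    return np.argsort(t_values)[::-1][:n_top]  # most selective units first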

BibTeX


        @article{alkhamissi2025mixturecognitivereasoners,
              title={Mixture of Cognitive Reasoners: Modular Reasoning with Brain-Like Specialization}, 
              author={Badr AlKhamissi and C. Nicolò De Sabbata and Zeming Chen and Martin Schrimpf and Antoine Bosselut},
              year={2025},
              eprint={2506.13331},
              archivePrefix={arXiv},
              primaryClass={cs.LG},
              url={https://arxiv.org/abs/2506.13331}, 
        }
      

Acknowledgement

This website is adapted from LLaVA-VL, Nerfies, and VL-RewardBench, licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.