MeTHanol: Modularized Thinking Language Models

🧪 MeTHanol: Modularized Thinking Language Models with Intermediate Layer Thinking, Decoding and Bootstrapping Reasoning

Ningyuan Xi¹, Xiaoyu Wang², Yetao Wu³, Teng Chen³, Qingqing Gu³, Luo Ji³

¹Beihang University, Beijing, China ²Beijing Institute of Technology, Beijing, China ³Geely AI Lab, Beijing, China

2025 International Joint Conference on Neural Networks (IJCNN)

Abstract

Current research efforts focus on enhancing the thinking and reasoning capabilities of large language models (LLMs) through prompting, data-driven emergence, and inference-time computation. In this study, we consider stimulating language models' thinking and cognitive abilities from a modular perspective that mimics the human brain architecture. We select a specific intermediate attention layer and attach newly implemented language heads to it. We conduct dual-layer fine-tuning on annotated (query, thought, response) samples and show that the intermediate layer can also learn to decode fluent and reasonable language tokens. A two-pass inference mechanism is designed to generate thoughts first and formal responses second. The entire framework, called the modularized thinking language model (MeTHanol), can enhance an LLM's cognitive behaviors, as indicated by Theory of Mind (ToM) and Vignette-based experiments. Case studies also show that MeTHanol can plan, self-reflect, and generate human-like thoughts and answers, even on unseen and open-domain tasks. MeTHanol can also adapt to a personalized prompt and behave as the specified character. Our study holds promise for significant cognitive gains from a modular perspective.
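The two-pass inference described in the abstract can be sketched in plain Python. This is only a toy illustration of the control flow, not the authors' implementation: `TwoHeadModel`, `greedy_decode`, `two_pass_generate`, the `<thought>` delimiters, and the scripted heads are all hypothetical names, and the example tokens are made up. In particular, conditioning the second pass on the query plus the decoded thought is an assumption, since this page does not spell out how the thought is fed back into the model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Token = str
EOS = "<eos>"

# A head maps (prompt, tokens generated so far) to the next token.
Head = Callable[[List[Token], List[Token]], Token]


@dataclass
class TwoHeadModel:
    thought_head: Head   # stand-in for the language head on the intermediate layer
    response_head: Head  # stand-in for the usual language head on the final layer


def greedy_decode(head: Head, prompt: List[Token], max_new_tokens: int = 16) -> List[Token]:
    """Call the head repeatedly until it emits EOS or the budget runs out."""
    generated: List[Token] = []
    for _ in range(max_new_tokens):
        token = head(prompt, generated)
        if token == EOS:
            break
        generated.append(token)
    return generated


def two_pass_generate(model: TwoHeadModel, query: List[Token]) -> Tuple[List[Token], List[Token]]:
    # Pass 1: decode a thought from the intermediate-layer head.
    thought = greedy_decode(model.thought_head, query)
    # Pass 2 (assumption): condition the final head on the query plus the thought.
    prompt = query + ["<thought>"] + thought + ["</thought>"]
    response = greedy_decode(model.response_head, prompt)
    return thought, response


def scripted_head(script: List[Token]) -> Head:
    """Toy head that replays a fixed token sequence, then emits EOS."""
    def head(prompt: List[Token], generated: List[Token]) -> Token:
        return script[len(generated)] if len(generated) < len(script) else EOS
    return head


def toy_response_head(prompt: List[Token], generated: List[Token]) -> Token:
    """Toy head whose one-token answer depends on whether the thought is in the prompt."""
    if generated:
        return EOS
    return "basket" if "unaware" in prompt else "box"


# Made-up false-belief style example, for illustration only.
model = TwoHeadModel(
    thought_head=scripted_head(["Sally", "is", "unaware", "of", "the", "move"]),
    response_head=toy_response_head,
)
thought, response = two_pass_generate(model, ["Where", "will", "Sally", "look", "?"])
print("thought:", " ".join(thought))
print("response:", " ".join(response))
```

In a real model, both heads would read hidden states from the same transformer stack, with the thought head attached to the chosen intermediate layer. The scripted functions here only stand in for those heads so that the two-pass structure can be run end to end.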

Benchmark

Fine-tuned results of Sally-Anne false-belief experiments. Values are percentages.

Zero-shot results of Vignette-based experiments. Values are percentages.

BibTeX

@INPROCEEDINGS{11229297,
  author={Xi, Ningyuan and Wang, Xiaoyu and Wu, Yetao and Chen, Teng and Gu, Qingqing and Ji, Luo},
  booktitle={2025 International Joint Conference on Neural Networks (IJCNN)},
  title={MeTHanol: Modularized Thinking Language Models with Intermediate Layer Thinking, Decoding and Bootstrapping Reasoning},
  year={2025},
  volume={},
  number={},
  pages={1-9},
  keywords={Training;Adaptation models;Inference mechanisms;Large language models;MIMICs;Computer architecture;Oral communication;Cognition;Decoding;Methanol;modularity;LLM;latent space;reasoning},
  doi={10.1109/IJCNN64981.2025.11229297}
}

Overview

An overview of MeTHanol with modular correspondence to human brain architecture.

Framework

Comparison of the MeTHanol framework to standard LLM fine-tuning.

Training Result

Training loss curves and special case performance across training steps.
