Adaptive Reasoning Compression
Balancing Short and Long Chains of Thought for Improved Overthinking LLM Reasoning
DOI: https://doi.org/10.64059/eiu.v4i4.90

Keywords: Adaptive Reasoning, Reasoning Compression, Chain-of-Thought, Overthinking Mitigation, Computational Efficiency, LLM Reasoning Optimization

Abstract
Large Language Models (LLMs) have shown remarkable capabilities in reasoning and problem solving. However, one emerging phenomenon is overthinking, in which a model spends unnecessary steps reasoning about problems that could be solved directly. While deeper reasoning can sometimes improve accuracy on complex tasks, excessive reasoning often increases computational cost without significant gains. This simulation study examines the tradeoff between direct answering and overthinking in LLMs. The research builds on the idea that "less is more" when it comes to reasoning in LLMs. By developing adaptive and compressed reasoning strategies, we aim to optimize the balance between brevity and accuracy, making LLMs both smarter and more efficient. We simulate the proposed approach (Adaptive + Compressed Reasoning for LLMs) and propose several strategies to mitigate overthinking: Self-Braking Tuning (SBT), Certainty-Guided Reflection Suppression, Long-Short Chain-of-Thought Mixtures, and Framework-Based Orchestration.
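To make the certainty-guided idea concrete, the following is a minimal sketch of how reflection suppression could work in principle: reasoning steps are kept only until the model's answer confidence stays above a threshold for a few consecutive steps, after which further reflection is cut off. All names here (`Step`, `compress_chain`, the threshold and patience values) are illustrative assumptions, not the paper's actual method or API.

```python
# Hypothetical sketch of certainty-guided early stopping for chain-of-thought.
# Each reasoning step carries a confidence score for the current answer; once
# confidence stays above a threshold, remaining reflection steps are suppressed.

from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Step:
    text: str          # one reasoning step produced by the model
    confidence: float  # model's certainty in its current answer, in [0, 1]


def compress_chain(steps: Iterable[Step],
                   threshold: float = 0.9,
                   patience: int = 2) -> Tuple[List[str], bool]:
    """Keep steps until confidence >= `threshold` for `patience`
    consecutive steps; later steps are suppressed (early exit)."""
    kept: List[str] = []
    streak = 0
    for step in steps:
        kept.append(step.text)
        streak = streak + 1 if step.confidence >= threshold else 0
        if streak >= patience:
            return kept, True   # early exit: further reflection suppressed
    return kept, False          # chain exhausted without early exit


# Usage: a mock 6-step chain where the model becomes certain at step 3.
chain = [Step(f"step {i}", c) for i, c in
         enumerate([0.3, 0.6, 0.92, 0.95, 0.96, 0.97], start=1)]
kept, exited = compress_chain(chain)
print(len(kept), exited)  # → 4 True (steps 5–6 suppressed)
```

In this toy run the last two steps are dropped, illustrating the intended cost saving; a real implementation would derive the confidence signal from the model itself (e.g., answer-token probabilities), which this sketch leaves abstract.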
References
Aggarwal, P., Kim, S., Lanchantin, J., Welleck, S., Weston, J., Kulikov, I., & Saha, S. (2025). OptimalThinkingBench: Evaluating Over and Underthinking in LLMs. http://arxiv.org/abs/2508.13141
Chen, Z., Ma, X., Fang, G., Yu, R., & Wang, X. (2025). VeriThinker: Learning to Verify Makes Reasoning Model Efficient. http://arxiv.org/abs/2505.17941
DeepSeek-AI, Guo, D., Yang, D., Zhang, Haowei, Song, J., Wang, P., Zhu, Q., Xu, R., Zhang, Ruoyu, Ma, S., Bi, X., Zhang, X., Yu, X., Wu, Y., Wu, Z. F., Gou, Z., Shao, Z., Li, Z., Gao, Z., … Zhang, Z. (2025). DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning. Nature, 645(8081), 633–638. https://doi.org/10.1038/s41586-025-09422-z
Gan, Z., Yi, H., & Liu, Y. (2025). CoT-Space: A Theoretical Framework for Internal Slow-Thinking via Reinforcement Learning. https://arxiv.org/pdf/2509.04027v2
Guo, Z., Chen, T., Meng, W., Gong, C., Yu, X., Wei, C., & Chen, W. (2026). Dynamic Thinking-Token Selection for Efficient Reasoning in Large Reasoning Models. http://arxiv.org/abs/2601.18383
Qu, X., Li, Y., Su, Z.-C., Sun, W., Yan, J., Liu, Dongrui, Cui, G., Liu, Daizong, Liang, S., He, J., Li, P., Wei, W., Shao, J., Lu, C., Zhang, Y., Hua, X.-S., Zhou, B., & Cheng, Y. (2025). A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond. http://arxiv.org/abs/2503.21614
Shao, Z., Wang, P., Zhu, Q., Xu, R., Song, J., Bi, X., Zhang, H., Zhang, M., Li, Y. K., Wu, Y., & Guo, D. (2024). DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models. https://arxiv.org/pdf/2402.03300
Su, J., Healey, J., Nakov, P., & Cardie, C. (2025). Between Underthinking and Overthinking: An Empirical Study of Reasoning Length and Correctness in LLMs. http://arxiv.org/abs/2505.00127
Sui, Y., Chuang, Y.-N., Wang, G., Zhang, J., Zhang, T., Yuan, J., Liu, H., Wen, A., Zhong, S., Zou, N., Chen, H., & Hu, X. (2025). Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models. Transactions on Machine Learning Research, 2025-August. http://arxiv.org/abs/2503.16419
Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E., Le, Q., & Zhou, D. (2022). Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. Advances in Neural Information Processing Systems, 35. http://arxiv.org/abs/2201.11903
Yang, C., Si, Q., Duan, Y., Zhu, Z., Zhu, C., Li, Q., Chen, M., Lin, Z., & Wang, W. (2025). Dynamic Early Exit in Reasoning Models. http://arxiv.org/abs/2504.15895
Yu, Q., Zhang, Z., Zhu, R., Yuan, Y., Zuo, X., Yue, Y., Dai, W., Fan, T., Liu, G., Liu, L., Liu, X., Lin, H., Lin, Z., Ma, B., Sheng, G., Tong, Y., Zhang, C., Zhang, M., Zhang, W., … Wang, M. (2025). DAPO: An Open-Source LLM Reinforcement Learning System at Scale. https://arxiv.org/pdf/2503.14476
Yue, Y., Yuan, Y., Yu, Q., Zuo, X., Zhu, R., Xu, W., Chen, J., Wang, C., Fan, T., Du, Z., Wei, X., Yu, X., Liu, G., Liu, J., Liu, L., Lin, H., Lin, Z., Ma, B., Zhang, C., … Yan, L. (2025). VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks. http://arxiv.org/abs/2504.05118
Zhao, H., Yan, Y., Shen, Y., Xu, H., Zhang, W., Song, K., Shao, J., Lu, W., Xiao, J., & Zhuang, Y. (2025). Let LRMs Break Free from Overthinking via Self-Braking Tuning. http://arxiv.org/abs/2505.14604
License
Copyright (c) 2025 the Author(s).

This work is licensed under a Creative Commons Attribution 4.0 International License.