NeurIPS 2025 Workshop on Efficient Reasoning
Daniel Kaiser, Arnoldo Frigessi, Ali Ramezani-Kebrya, Benjamin Ricaud
Building on the CogniLoad benchmark, this work introduces an efficiency metric for LLMs, the number of tokens generated per solved puzzle (including thinking traces), to evaluate computational cost alongside accuracy. It also establishes a token-efficiency leaderboard aimed at real-world deployment.
arXiv:2509.18458 (under review at ICLR 2026)
Daniel Kaiser, Arnoldo Frigessi, Ali Ramezani-Kebrya, Benjamin Ricaud
A synthetic benchmark grounded in Cognitive Load Theory (CLT) that generates natural-language logic puzzles with independently tunable parameters (intrinsic difficulty, distractor density, task length) to precisely diagnose LLM reasoning bottlenecks and failure modes.
SSRN 3520684
Amir Amel-Zadeh, Jan-Peter Calliess, Daniel Kaiser, Stephen Roberts
Authors are listed in alphabetical order. Please refer to my thesis instead, which contains all the details.
Investigates the application of machine learning methods to forecasting stock movements and reports abnormal returns over multiple decades. It is the first study to successfully apply machine learning to the quantitative data in financial statements.