Please add these papers to your list if you see fit.

1. [LogicBench: Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models](https://aclanthology.org/2024.acl-long.739/)
2. [Multi-LogiEval: Towards Evaluating Multi-Step Logical Reasoning Ability of Large Language Models](https://arxiv.org/abs/2406.17169)
3. [Step-by-Step Reasoning to Solve Grid Puzzles: Where do LLMs Falter?](https://arxiv.org/abs/2407.14790)

Thanks,
Mihir Parmar