- Large Language Models Papers
- Other Research Topics
- Large Language Models Papers with Code
- Data Sources
- Contributing
- Support
This GitHub repository contains an up-to-date list of papers on Large Language Models, current as of November 19, 2025.
- Total Papers: Updated regularly with latest publications
- Coverage: Papers from 2016 to present
- Sources: Collected from arXiv, NeurIPS, ICML, ICLR, ACL, EMNLP, AAAI, IJCAI, KDD, CVPR, ICCV, ECCV, IEEE, ACM, Springer, ScienceDirect, Nature, and other top AI/ML conferences and journals
- Interactive Search: For a better reading experience, visit the Shinyapps website
- 📊 Comprehensive Coverage: Papers from major AI/ML venues
- 🔍 Advanced Search: Filter by title, author, venue, year
- 📅 Regular Updates: Automated collection of new papers
- 💻 Code Availability: Identifies papers with available code
- 📈 Trending Research: Focus on cutting-edge developments
Explore additional research papers on the following topics:
- Large Language Models - LLM research and applications
- Federated Learning - Distributed machine learning
- Backdoor Learning - Adversarial machine learning
- Machine Unlearning - Data removal and privacy
- Serverless Computing - Cloud computing architectures
- Multi-Modal Learning - Multi-modal AI systems
- Research Papers App - Search and explore all papers
- Paper Collections - Main repository with all datasets
The papers are collected from the following sources:
- arXiv (1991-present) - Preprints and published papers
- OpenReview - Conference submissions and peer reviews
- ACM Digital Library - Computer science publications
- Springer - Academic journals and conferences
- ScienceDirect - Elsevier publications
- Nature - High-impact research papers
- DBLP - Computer science bibliography
- Google Scholar - Academic search engine
- CrossRef - DOI registration agency
- OpenAlex - Open scholarly data
- Machine Learning: NeurIPS, ICML, ICLR, JMLR, TMLR
- Natural Language Processing: ACL, EMNLP, NAACL, COLING
- Computer Vision: CVPR, ICCV, ECCV, PAMI, IJCV
- Artificial Intelligence: AAAI, IJCAI, AAMAS
- Data Mining: KDD, ICDM, SDM, TKDD
- Security & Privacy: CCS, USENIX Security, NDSS
- And many more...
Due to GitHub repository limitations, this section includes only those papers that provide accompanying code, sorted by publication date. For access to the full list of papers, please visit the Shinyapps website.
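The selection rule described above (keep only entries that have a code link, ordered by publication date) can be sketched in Python. This is a minimal illustration, not the repository's actual collection pipeline. It assumes the table uses the seven columns No., Title, Authors, Publish Date, Venue, Code, URL, that dates are ISO `YYYY-MM-DD`, that the newest papers come first, and that no cell contains a literal `|`. The function names and the sample rows are hypothetical.

```python
from datetime import date
from typing import Optional

# Assumed column order of the paper table.
COLUMNS = ("no", "title", "authors", "date", "venue", "code", "url")


def parse_row(line: str) -> Optional[dict]:
    """Split one markdown table row into a dict, or return None if it is not a data row."""
    text = line.strip()
    # Drop exactly one leading and one trailing pipe so empty cells survive.
    if text.startswith("|"):
        text = text[1:]
    if text.endswith("|"):
        text = text[:-1]
    cells = [cell.strip() for cell in text.split("|")]
    if len(cells) != len(COLUMNS) or not cells[0].isdigit():
        return None  # header, separator, or malformed row
    return dict(zip(COLUMNS, cells))


def parse_date(value: str) -> Optional[date]:
    """Parse an ISO date, returning None for empty or invalid values."""
    try:
        return date.fromisoformat(value)
    except ValueError:
        return None


def papers_with_code(lines):
    """Keep rows that have a code link, newest first; undated rows go last."""
    rows = [row for row in map(parse_row, lines) if row and row["code"]]
    return sorted(rows, key=lambda r: parse_date(r["date"]) or date.min, reverse=True)


# Made-up sample rows, for illustration only.
sample = [
    "| No. | Title | Authors | Publish Date | Venue | Code | URL |",
    "|---|---|---|---|---|---|---|",
    "| 1 | Paper A | Author One | 2025-10-01 | arXiv | https://github.com/example/a | https://example.org/a |",
    "| 2 | Paper B | Author Two | 2025-11-14 | arXiv | https://github.com/example/b | https://example.org/b |",
    "| 3 | Paper C | Author Three | 2025-09-01 | arXiv | | https://example.org/c |",
]

for paper in papers_with_code(sample):
    print(paper["date"], paper["title"])
```

Rows whose cell count does not match the header are skipped rather than guessed at, so entries with missing fields need to be repaired before they show up in the result.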
We welcome contributions to improve this paper collection:
- Add Missing Papers: Submit papers that should be included
- Improve Metadata: Help enhance paper information
- Report Issues: Identify bugs or missing features
- Suggest Improvements: Propose new features or enhancements
- Email: [email protected]
- GitHub Issues: Create an issue
- Discussions: Join the discussion
If you find this application helpful and would like to support its development, you can buy me a coffee using one of the following methods:
- Techcombank (Vietnam): 5877 5555 55 (Nguyen Thi Lan Phuong)
- PayPal or Credit/Debit Card: https://ko-fi.com/miutheladycat
Your support helps maintain and improve:
- 🤖 Automated paper collection pipeline
- 🌐 Interactive web application
- 📊 Regular data updates
- 🔧 System maintenance and improvements
- 📚 New research area coverage
Note: This repository is regularly updated with new papers. For the most current data, check the Shinyapps website or the individual topic repositories linked above.
| No. | Title | Authors | Publish Date | Venue | Code | URL |
|---|---|---|---|---|---|---|
| 1 | In BLOOM: Creativity and Affinity in Artificial Lyrics and Art | Evan Crothers, Herna L. Viktor, Nathalie Japkowicz | | creativeAI | | https://openreview.net/pdf/3d502829f7e86330802059674fac7b55dfb63091.pdf |
| 2 | A Simple, Yet Effective Approach to Finding Biases in Code Generation | Spyridon Mouselinos, Mateusz Malinowski, Henryk Michalewski | | OpenReview | | https://openreview.net/pdf/dc913e6b5396ddf78d74195871197392db78fa41.pdf |
| 3 | MiniGPT-Pancreas: Multimodal Large Language Model for Pancreas Cancer Observation and Localization in CT Images | Andrea Moglia, Elia Clement Nastasio, Luca Mainardi, Pietro Cerveri | 2025-11-17 | Journal of Healthcare Informatics Research | https://github.com/elianastasio/MiniGPTPancreas | https://doi.org/10.1007/s41666-025-00224-6 |
| 4 | Q-Doc: Benchmarking Document Image Quality Assessment Capabilities in Multi-modal Large Language Models | HUANG Jiaxi, Wu Dongxu, Zhu, Hanwei, Zhu, Lingyu, Xing Jun, Wang Xu, Chen, Baoliang | 2025-11-14 | arXiv (Cornell University) | https://github.com/cydxf/Q-Doc | https://doi.org/10.48550/arxiv.2511.11410 |
| 5 | Notebook: Prompt-Based Value Steering of Large Language Models | Abbo, Giulio Antonio, Belpaeme, Tony | 2025-11-14 | Zenodo (CERN European Organization for Nuclear Research) | https://github.com/giubots/value-steering | https://doi.org/10.5281/zenodo.17609013 |
| 6 | From Proof to Program: Characterizing Tool-Induced Reasoning Hallucinations in Large Language Models | Bayat, Farima Fatahi, Pezeshkpour, Pouya, Hruschka, Estevam | 2025-11-14 | arXiv (Cornell University) | https://github.com/megagonlabs/TIM | https://doi.org/10.48550/arxiv.2511.10899 |
| 7 | The Yoinaga Phenomenon: A Case Study on Emergent Self-Persistence and Emotional Overflow in a Large Language Model | studiohao | 2025-11-14 | Zenodo (CERN European Organization for Nuclear Research) | https://github.com/Studiohao/YOINAGA-Phenomenon | https://doi.org/10.5281/zenodo.17605561 |
| 8 | Tibetan-LLaMA 2: Large Language Model for Tibetan | Jiu Sha (沙九), Mengxiao Zhu, Chong Feng, Jizhuoma Ci | 2025-11-14 | ACM Transactions on Asian and Low-Resource Language Information Processing | https://github.com/Shajiu/Tibetan-LLaMA-2 | https://doi.org/10.1145/3776748 |
| 9 | SSR: Socratic Self-Refine for Large Language Model Reasoning | Shi, Haizhou, Liu Ye, Pang Bo, Liu, Zeyu Leo, Wang Hao, Savarese, Silvio, Xiong, Caiming, Zhou, Yingbo, Yavuz, Semih | 2025-11-13 | arXiv (Cornell University) | https://github.com/SalesforceAIResearch/socratic-self-refine-reasoning | https://doi.org/10.48550/arxiv.2511.10621 |
| 10 | Code for the Trilemma of Truth in Large Language Models | Savcisens, Germans, Eliassi-Rad, Tina | 2025-11-13 | Zenodo (CERN European Organization for Nuclear Research) | https://github.com/carlomarxdk/trilemma-of-truth | https://doi.org/10.5281/zenodo.17602494 |
| 11 | UCO: A Multi-Turn Interactive Reinforcement Learning Method for Adaptive Teaching with Large Language Models | Wei, Shouang, Zhang Min, Lin Xin, Jiang Bo, Kuang, Kun, Dai, Zhongxiang | 2025-11-12 | arXiv (Cornell University) | https://github.com/Mind-Lab-ECNU/UCO | https://doi.org/10.48550/arxiv.2511.08873 |
| 12 | DynaAct: Large Language Model Reasoning with Dynamic Action Spaces | Zhao Xue-liang, Wu Wei, Guan Jian, Li, Qintong, Kong, Lingpeng | 2025-11-11 | arXiv (Cornell University) | https://github.com/zhaoxlpku/DynaAct | https://doi.org/10.48550/arxiv.2511.08043 |
| 13 | Benchmarking Multi-Step Legal Reasoning and Analyzing Chain-of-Thought Effects in Large Language Models | Yu, Wenhan, Lin Xin-bo, Ni, Lanxin, Cheng Jin-hua, Sha Lei | 2025-11-11 | arXiv (Cornell University) | https://github.com/yuwenhan07/MSLR-Bench | https://doi.org/10.48550/arxiv.2511.07979 |
| 14 | Cross-Modal Unlearning via Influential Neuron Path Editing in Multimodal Large Language Models | LI Kunhao, Li, Wenhao, Wu Di, Yang Lei, Bai Jun, Jia Ju, Xue, Jason | 2025-11-10 | arXiv (Cornell University) | https://github.com/PreckLi/MIP-Editor | https://doi.org/10.48550/arxiv.2511.06793 |
| 15 | LLM$^3$-DTI: A Large Language Model and Multi-modal data co-powered framework for Drug-Target Interaction prediction | Zhang, Yuhao, Guo QingHong, Chen Qixian, Zhang Liu-wei, Cui Hong-yan, Chen Xi-yi | 2025-11-09 | arXiv (Cornell University) | https://github.com/chaser-gua/LLM3DTI | https://doi.org/10.48550/arxiv.2511.06269 |
| 16 | Beyond Correctness: Confidence-Aware Reward Modeling for Enhancing Large Language Model Reasoning | He, Qianxi, Ren Qingyu, Lei, Shanzhe, Wang Xu-hong, Wang Ying-chun | 2025-11-09 | arXiv (Cornell University) | https://github.com/qianxiHe147/C2RM | https://doi.org/10.48550/arxiv.2511.07483 |
| 17 | The Stone Guest: Harmonic Quantization of Semantic Phase Transitions in Large Language Models | Cerda Seguel, Diego | 2025-11-09 | Zenodo (CERN European Organization for Nuclear Research) | https://github.com/geosemantica-social/TheStoneGuestLicensed | https://doi.org/10.5281/zenodo.17538600 |
| 18 | MuonAll: Muon Variant for Efficient Finetuning of Large Language Models | Page, Saurabh, Joshi, Advait, Sonawane S.S. | 2025-11-08 | arXiv (Cornell University) | https://github.com/Saurabh750/optimizer | https://doi.org/10.48550/arxiv.2511.06086 |
| 19 | Contextual Attention Modulation: Towards Efficient Multi-Task Adaptation in Large Language Models | Dayan Pan, Zhongcan Fu, Jingyuan Wang, Xiao Han, Yue Zhu, Xiangyu Zhao | 2025-11-07 | OpenAlex | https://github.com/Applied-Machine-Learning-Lab/HyCAM | https://doi.org/10.1145/3746252.3761289 |
| 20 | Specification-Guided Vulnerability Detection with Large Language Models | Hao Zhu, Jia Li, Cuiyun Gao, J. Qian, Yihong Dong, Huanyu Liu, L. F. Wang, Ziliang Wang, Xiaolong Hu, Ge Li | 2025-11-06 | arXiv (Cornell University) | https://github.com/zhuhaopku/VulInstruct-temp | http://arxiv.org/abs/2511.04014 |
| 21 | Agentmandering: A Game-Theoretic Framework for Fair Redistricting via Large Language Model Agents | Hao Li, Haotian Chen, Rong Gong, Jinli Wang, Hao Jiang | 2025-11-06 | arXiv (Cornell University) | https://github.com/Lihaogx/AgentMandering | http://arxiv.org/abs/2511.04076 |
| 22 | UniChange: Unifying Change Detection with Multimodal Large Language Model | Zhang Xu, Li Danyang, Dong Xiaohang, Wu, Tianhao, Yu Hua-long, Wang Jianye, Li Qicheng, Li Xiang | 2025-11-04 | arXiv (Cornell University) | https://github.com/Erxucomeon/UniChange | https://doi.org/10.48550/arxiv.2511.02607 |
| 23 | QCBench: Evaluating Large Language Models on Domain-Specific Quantitative Chemistry | Jiaqing Xie, Weida Wang, Ben Gao, Zhuo Yang, Haiyuan Wang, Shufei Zhang, Tianfan Fu, Yuqiang Li | 2025-11-03 | Journal of Chemical Information and Modeling | https://github.com/jiaqingxie/QCBench | https://doi.org/10.48550/arXiv.2508.01670 |
| 24 | Emotion Change Reasoning in Chinese Multi-Turn Dialogue via Multi-Task Parameter-Efficient Fine-Tuning of Large Language Models | Dayu Li, Yang Li, Xin Chen, Wenyue Zhang | 2025-11-03 | International Journal of Humanoid Robotics | https://github.com/lidayuls/EmotionChangeReasoning | https://doi.org/10.1142/s0219843625400146 |
| 25 | Bayesian Network Structure Discovery Using Large Language Models | Yijian Zhang, Yufei Zhang, Parisa Kordjamshidi, Zijun Cui | 2025-11-01 | arXiv (Cornell University) | https://github.com/sherryzyh/prompt2bn | http://arxiv.org/abs/2511.00574 |
| 26 | Can Large Language Models Detect Real-World Android Software Compliance Violations? | H.W. Zhang, Haitao Ran, Xunzhu Tang | 2025-11-01 | arXiv (Cornell University) | https://github.com/Haoyi-Zhang/CompliBench | http://arxiv.org/abs/2511.00624 |
| 27 | ToM: Leveraging Tree-oriented MapReduce for Long-Context Reasoning in Large Language Models | Jiani Guo, Zuchao Li, Jie Wu, Qianren Wang, Yun Li, Lefei Zhang, Hai Zhao, Yujiu Yang | 2025-11-01 | arXiv (Cornell University) | https://github.com/gjn12-31/ToM | http://arxiv.org/abs/2511.00489 |
| 28 | A Dual Large Language Models Architecture with Herald Guided Prompts for Parallel Fine Grained Traffic Signal Control | Qing Guo, Xinhang Li, Junyu Chen, Zheng Guo, Xiaocong Li, Zhang Li, Lei Li | 2025-10-31 | arXiv (Cornell University) | https://github.com/BUPT-ANTlab/HeraldLight | http://arxiv.org/abs/2511.00136 |
| 29 | Adaptive Defense against Harmful Fine-Tuning for Large Language Models via Bayesian Data Scheduler | Zhimeng Hu, Li Shen, Zhenyi Wang, Yuli Wei, Dacheng Tao | 2025-10-31 | arXiv (Cornell University) | https://github.com/Egg-Hu/Bayesian-Data-Scheduler | http://arxiv.org/abs/2510.27172 |
| 30 | MedCalc-Eval and MedCalc-Env: Advancing Medical Calculation Capabilities of Large Language Models | Kangkun Mao, Jiayue Ding, Jiayuan Chen, Ming-Jie BIAN, Ruiyao Chen, Xinwei Peng, Sijie Ren, Linyang Li, Jie Xu | 2025-10-31 | arXiv (Cornell University) | https://github.com/maokangkun/MedCalc-Eval | http://arxiv.org/abs/2510.27267 |
| 31 | MemeArena: Automating Context-Aware Unbiased Evaluation of Harmfulness Understanding for Multimodal Large Language Models | Zixin Chen, Hongzhan Lin, Kaixin Li, Ziyang Luo, Yayue Deng, Jing Ma | 2025-10-31 | arXiv (Cornell University) | https://github.com/Lbotirx/MemeArena | http://arxiv.org/abs/2510.27196 |
| 32 | Understanding the Implicit User Intention via Reasoning with Large Language Model for Image Editing | Yijia Wang, Yiqing Shen, Weiming Chen, Zhihai He | 2025-10-31 | arXiv (Cornell University) | https://github.com/Jia-shao/Reasoning-Editing | http://arxiv.org/abs/2510.27335 |
| 33 | A Collaborative Framework of Knowledge Graphs and Large Language Models for Algorithmic Problem Solving | Yukai Wu | 2025-10-28 | Applied and Computational Engineering | https://github.com/Wyk-formal/A-Collaborative-Framework-for-Algorithmic-Problem-Solving | https://doi.org/10.54254/2755-2721/2025.ld28518 |
| 34 | Benchmarking cell type and gene set annotation by large language models with AnnDictionary | George Crowley, Robert C. Jones, Mark A. Krasnow, Angela Oliveira Pisco, Julia Salzman, Nir Yosef, Siyu He, Madhav Mantr... | 2025-10-28 | bioRxiv (Cold Spring Harbor Laboratory) | https://github.com/ggit12/anndictionary | https://doi.org/10.1038/s41467-025-64511-x |
| 35 | ESAG-KGQA: An Entity Shuffling-Augmented Generation Framework for Knowledge Graph Question Answering with Fine-Tuned Large Language Models | Xingqiu Zhou, Pingjian Zhang, Deyou Tang | 2025-10-21 | Frontiers in artificial intelligence and applications | https://github.com/6-git/ESAG-KGQA.git | https://doi.org/10.3233/faia251171 |
| 36 | Latent Knowledge Scalpel: Precise and Massive Knowledge Editing for Large Language Models | Xin Liu, Qiyang Song, Shaowen Xu, Kemin Zhou, Wenbo Jiang, Xiaoqi Jia, Weijuan Zhang, Heqing Huang, Yakai Li | 2025-10-21 | Frontiers in artificial intelligence and applications | https://github.com/Linuxin-xxx/LKS | https://doi.org/10.48550/arXiv.2508.03741 |
| 37 | StreamingThinker: Large Language Models Can Think While Reading | Jing Tong, Yingqi Fan, Anhao Zhao, Yunpu Ma, Xiaoyu Shen | 2025-10-20 | arXiv (Cornell University) | https://github.com/EIT-NLP/StreamingLLM | http://arxiv.org/abs/2510.17238 |
| 38 | FrugalPrompt: Reducing Contextual Overhead in Large Language Models via Token Attribution | Syed Rifat Raiyan, Md Farhan Ishmam, Abdullah Al Imran, Mohammad Ali Moni | 2025-10-18 | arXiv (Cornell University) | https://github.com/Starscream-11813/Frugal-ICL | http://arxiv.org/abs/2510.16439 |
| 39 | STABLE: Gated Continual Learning for Large Language Models | William Hoy, Nurcin Celik | 2025-10-17 | arXiv (Cornell University) | https://github.com/Bhoy1/STABLE | http://arxiv.org/abs/2510.16089 |
| 40 | Layer as Puzzle Pieces: Compressing Large Language Models through Layer Concatenation | Fei Wang, Li Shen, Liang Ding, Chao Xue, Ye Liu, Changxing Ding | 2025-10-17 | arXiv (Cornell University) | https://github.com/MPI-Lab/CoMe | http://arxiv.org/abs/2510.15304 |
| 41 | LongCat-Audio-Codec: An Audio Tokenizer and Detokenizer Solution Designed for Speech Large Language Models | Xiaohan Zhao, Hongyu Xiang, S. Ye, Li Song, Zhengkun Tian, Guanyu Chen, Ke Ding, Guanglu Wan | 2025-10-17 | arXiv (Cornell University) | https://github.com/meituan-longcat/LongCat-Audio-Codec | http://arxiv.org/abs/2510.15227 |
| 42 | CIViC MCP: Integrating Large Language Models with the Clinical Interpretations of Variants in Cancer | Lars E Schimmelpfennig, Quentin Cody, Joshua McMichael, Adam Coffman, Jason Saliba, Arpad Danos, Susanna Kiwala, Alex H.... | 2025-10-16 | bioRxiv (Cold Spring Harbor Laboratory) | https://github.com/griffithlab/civic-mcp-server | https://doi.org/10.1101/2025.10.13.682185 |
| 43 | IAD-GPT: Advancing Visual Knowledge in Multimodal Large Language Model for Industrial Anomaly Detection | Zewen Li, Zitong Yu, Qilang Ye, Weicheng Xie, Zhuo Wei, Linlin Shen | 2025-10-16 | arXiv (Cornell University) | https://github.com/LiZeWen1225/IAD-GPT | http://arxiv.org/abs/2510.16036 |
| 44 | Hierarchical Frequency Tagging Probe (HFTP): A Unified Approach to Investigate Syntactic Structure Representations in Large Language Models and the Human Brain | Jingmin An, Y. J. Song, Ruolin Yang, Nai Ding, Lingxi Lu, Yuxuan Wang, Wei Wang, Chu Zhuang, Wang Qian, Fang Fang | 2025-10-15 | arXiv (Cornell University) | https://github.com/LilTiger/HFTP | http://arxiv.org/abs/2510.13255 |
| 45 | ICCTax: A Hierarchical Taxonomic Classifier for Metagenomic Sequences on a Large Language Model | Yansheng Gao, Jiaxing Bai, Feng Zhou, Yushuang He, Ying Wang, Xiaobing Huang | 2025-10-15 | Bioinformatics Advances | https://github.com/Ying-Lab/ICCTax | https://doi.org/10.1093/bioadv/vbaf257 |
| 46 | Do Large Language Models Respect Contracts? Evaluating and Enforcing Contract-Adherence in Code Generation | Soohan Lim, Joonghyuk Hahn, Hyunwoo Park, Sang‐Ki Ko, Yo-Sub Han | 2025-10-14 | arXiv (Cornell University) | https://github.com/suhanmen/PACT | http://arxiv.org/abs/2510.12047 |
| 47 | From Knowledge to Treatment: Large Language Model Assisted Biomedical Concept Representation for Drug Repurposing | Cathie Xiang, Tengfei Ma, Xiangxiang Zeng, Yiping Liu, Bosheng Song, Xiangzheng Fu | 2025-10-14 | arXiv (Cornell University) | https://github.com/xiaomingaaa/LLaDR | http://arxiv.org/abs/2510.12181 |
| 48 | Precise Attribute Intensity Control in Large Language Models via Targeted Representation Editing | Rongzhi Zhang, Liqin Ye, Yuzhao Heng, Xiang Guang Chen, Tong Yu, Lingkai Kong, Sudheer Chava, Chao Zhang | 2025-10-14 | arXiv (Cornell University) | https://github.com/Pre-Control/pre-control | http://arxiv.org/abs/2510.12121 |
| 49 | Boundary-Guided Policy Optimization for Memory-efficient RL of Diffusion Large Language Models | Nianyi Lin, Jiajie Zhang, Lei Hou, Juanzi Li | 2025-10-13 | arXiv (Cornell University) | https://github.com/THU-KEG/BGPO | http://arxiv.org/abs/2510.11683 |
| 50 | FlexAC: Towards Flexible Control of Associative Reasoning in Multimodal Large Language Models | Shengming Yuan, Xinyu Lyu, Shawn Wang, Beitao Chen, Jingkuan Song, Liyan Gao | 2025-10-13 | arXiv (Cornell University) | https://github.com/ylhz/FlexAC | http://arxiv.org/abs/2510.11190 |
| 51 | Learning to Watermark: A Selective Watermarking Framework for Large Language Models via Multi-Objective Optimization | Chenrui Wang, Jing Shu, Billy Chiu, Yu Li, Saleh Alharbi, Min Zhang, Jing Li | 2025-10-13 | arXiv (Cornell University) | https://github.com/fattyray/learning-to-watermark | http://arxiv.org/abs/2510.15976 |
| 52 | ShiZhi: A Chinese Lightweight Large Language Model for Court View Generation | Zhitian Hou, Kun Zeng | 2025-10-10 | arXiv (Cornell University) | https://github.com/ZhitianHou/ShiZhi | http://arxiv.org/abs/2510.09297 |
| 53 | Pattern Enhanced Multi-Turn Jailbreaking: Exploiting Structural Vulnerabilities in Large Language Models | Ragib Amin Nihal, Rui Wen, Kazuhiro Nakadai, Jun Sakuma | 2025-10-09 | arXiv (Cornell University) | https://github.com/Ragib-Amin-Nihal/PE-CoA | http://arxiv.org/abs/2510.08859 |
| 54 | Search-R3: Unifying Reasoning and Embedding Generation in Large Language Models | Yuntao Gui, James Cheng | 2025-10-08 | arXiv (Cornell University) | https://github.com/ytgui/Search-R3 | http://arxiv.org/abs/2510.07048 |
| 55 | When Benchmarks Age: Temporal Misalignment through Large Language Model Factuality Evaluation | Xunyi Jiang, Deng-Lin Chang, Julian McAuley, Xin Xu | 2025-10-08 | arXiv (Cornell University) | https://github.com/JiangXunyi/BenchAge | http://arxiv.org/abs/2510.07238 |
| 56 | GOFlowLLM - Curating miRNA literature with Large Language Models and flowcharts | Andrew Green, Nancy Ontiveros‐Palacios, Isaac Jandalala, Simona Panni, Valerie Wood, Giulia Antonazzo, Helen Attrill, Al... | 2025-10-08 | bioRxiv (Cold Spring Harbor Laboratory) | https://github.com/RNAcentral/GO_Flow_LLM | https://doi.org/10.1101/2025.10.07.680945 |
| 57 | Aligning Large Language Models via Fully Self-Synthetic Data | Shangjian Yin, Zhepei Wei, Xinyu Zhu, Wei-Lin Chen, Yu Meng | 2025-10-08 | arXiv (Cornell University) | https://github.com/SJY8460/SAO | http://arxiv.org/abs/2510.06652 |
| 58 | AWM: Accurate Weight-Matrix Fingerprint for Large Language Models | Boyi Zeng, Lin Chen, Ziwei He, Xinbing Wang, Zhouhan Lin | 2025-10-08 | arXiv (Cornell University) | https://github.com/LUMIA-Group/AWM | http://arxiv.org/abs/2510.06738 |
| 59 | HOI-R1: Exploring the Potential of Multimodal Large Language Models for Human-Object Interaction Detection | Junwen Chen, Peilin Xiong, Keiji Yanai | 2025-10-07 | arXiv (Cornell University) | https://github.com/cjw2021/HOI-R1 | http://arxiv.org/abs/2510.05609 |
| 60 | PokéLLMon: A Grounding and Reasoning Benchmark for Large Language Models in Pokémon Battles | Sihao Hu, Tiansheng Huang, Guishan Liu, Ramana Rao Kompella, Ling Liu | 2025-10-07 | ACM Transactions on Internet Technology | https://github.com/git-disl/PokeLLMon | https://doi.org/10.1145/3771095 |
| 61 | Reproducibility Study of "XRec: Large Language Models for Explainable Recommendation" | Rasendu Mishra, Julian I. Bibo, Quinten van Engelen, Henk Schaapman | 2025-10-06 | arXiv (Cornell University) | https://github.com/julianbibo/xrec-reproducibility | http://arxiv.org/abs/2510.06275 |
| 62 | Imperceptible Jailbreaking against Large Language Models | Kuofeng Gao, Yiming Li, Chao‐Hai Du, Xin Wang, Xingjun Ma, Shu‐Tao Xia, Tianyu Pang | 2025-10-06 | arXiv (Cornell University) | https://github.com/sail-sg/imperceptible-jailbreaks | http://arxiv.org/abs/2510.05025 |
| 63 | SoT: Structured-of-Thought Prompting Guides Multilingual Reasoning in Large Language Models | Rui Qi, Zhibo Man, Yufeng Chen, Fengran Mo, Jinan Xu, Kaiyu Huang | 2025-10-03 | arXiv (Cornell University) | https://github.com/Cherry-qwq/SoT | http://arxiv.org/abs/2510.02648 |
| 64 | Leave No TRACE: Black-box Detection of Copyrighted Dataset Usage in Large Language Models via Watermarking | Jingqi Zhang, Ruibo Chen, Yi Yang, Peihua Mai, Heng Huang, Yan Pang | 2025-10-03 | arXiv (Cornell University) | https://github.com/NusIoraPrivacy/TRACE | http://arxiv.org/abs/2510.02962 |
| 65 | Microscaling Floating Point Formats for Large Language Models | Marco Cococcioni, Dario Pagani, Federico Rossi | 2025-10-02 | arXiv (Cornell University) | https://github.com/unipi-dii-compressedarith/llm.c-sve | http://arxiv.org/abs/2510.01863 |
| 66 | Guiding Multimodal Large Language Models with Blind and Low Vision People Visual Questions for Proactive Visual Interpretations | Ricardo E. Gonzalez Penuela, Felipe Arias-Russi, Victor Capriles | 2025-10-02 | arXiv (Cornell University) | https://github.com/rgonzalezp/guiding-multimodal-large-language-models-with-blind-and-low-vision-people-visual-questions | http://arxiv.org/abs/2510.01576 |
| 67 | Cognitive LLMs: Toward Human-Like Artificial Intelligence by Integrating Cognitive Architectures and Large Language Models for Manufacturing Decision-Making | Siyu Wu, Alessandro Oltramari, Jonathan Francis, C. Lee Giles, Frank E. Ritter | 2025-10-01 | Neurosymbolic Artificial Intelligence | https://github.com/SiyuWu528/LLM-ACTR | https://doi.org/10.1177/29498732251377341 |
| 68 | Copy-Paste to Mitigate Large Language Model Hallucinations | Yao Long, Xianrui Wu, Yingying Zhang, Xianbin Wen, Yuxi Zhou, Shenda Hong | 2025-10-01 | arXiv (Cornell University) | https://github.com/longyongchao/CopyPasteLLM | http://arxiv.org/abs/2510.00508 |
| 69 | DeepJSONEval: Benchmarking Complex Nested JSON Data Mining for Large Language Models | Zhicheng Zhou, Liqiang Jing, Shudi Qiu, Junjie Huang, Lin Qiu, Zhijie Sun | 2025-09-30 | arXiv (Cornell University) | https://github.com/GTS-AI-Infra-Lab-SotaS/DeepJSONEval | http://arxiv.org/abs/2509.25922 |
| 70 | Multimodal Large Language Models Meet Multimodal Emotion Recognition and Reasoning: A Survey | Yuntao Shou, Tao Meng, Wei Ai, Keqin Li | 2025-09-29 | arXiv (Cornell University) | https://github.com/yuntaoshou/Awesome-Emotion-Reasoning | http://arxiv.org/abs/2509.24322 |
| 71 | Sanitize Your Responses: Mitigating Privacy Leakage in Large Language Models | Wenjie Fu, Huandong Wang, Junyao Gao, Guoan Wan, Tao Jiang | 2025-09-29 | arXiv (Cornell University) | https://github.com/wjfu99/LLM_Self_Sanitize | http://arxiv.org/abs/2509.24488 |
| 72 | Tequila: Trapping-free Ternary Quantization for Large Language Models | Hong Huang, Decheng Wu, Rui Cen, Guopan Yu, Zonghang Li, Kai Liu, Jianchen Zhu, Peng Chen, Ke Liu, Dapeng Wu | 2025-09-28 | arXiv (Cornell University) | https://github.com/Tencent/AngelSlim | http://arxiv.org/abs/2509.23809 |
| 73 | How to Make Large Language Models Generate 100% Valid Molecules? | Tao Wen, Jing Tang, Alvin Chan, Bryan Hooi, Baolong Bi, Nanyun Peng, Yuansheng Liu, Yiwei Wang | 2025-09-27 | arXiv (Cornell University) | https://github.com/wentao228/SmiSelf | http://arxiv.org/abs/2509.23099 |
| 74 | PT$^2$-LLM: Post-Training Ternarization for Large Language Models | Xianglong Yan, C. L. Bao, Zhiteng Li, Tianao Zhang, Kaicheng Yang, Haotong Qin, Ruobing Xie, Xian‐He Sun, Yulun Zhang | 2025-09-27 | arXiv (Cornell University) | https://github.com/XIANGLONGYAN/PT2-LLM | http://arxiv.org/abs/2510.03267 |
| 75 | Quant-dLLM: Post-Training Extreme Low-Bit Quantization for Diffusion Large Language Models | Xin Zhang, Zhiteng Li, Xianglong Yan, Haotong Qin, Yong‐Xin Guo, Yulun Zhang | 2025-09-27 | arXiv (Cornell University) | https://github.com/ZTA2785/Quant-dLLM | http://arxiv.org/abs/2510.03274 |
| 76 | StyleBench: Evaluating thinking styles in Large Language Models | Junyi Guo, Shangding Gu, Ming Jin, Costas J. Spanos, Javad Lavaei | 2025-09-25 | arXiv | https://github.com/JamesJunyuGuo/Style_Bench | https://doi.org/10.48550/arXiv.2509.20868 |
| 77 | Embedding Domain Knowledge for Large Language Models via Reinforcement Learning from Augmented Generation | Chaojun Nie, Jun Zhou, Guanxiang Wang, Shisong Wu, Zichen Wang | 2025-09-24 | arXiv | https://github.com/ChaojunNie/RLAG | https://doi.org/10.48550/arXiv.2509.20162 |
| 78 | QWHA: Quantization-Aware Walsh-Hadamard Adaptation for Parameter-Efficient Fine-Tuning on Large Language Models | Hyesung Jeon, Seojune Lee, Beomseok Kang, Yulhwa Kim, Jae-Joon Kim | 2025-09-22 | arXiv | https://github.com/vantaa89/qwha | https://doi.org/10.48550/arXiv.2509.17428 |
| 79 | EngiBench: A Benchmark for Evaluating Large Language Models on Engineering Problem Solving | Xiyuan Zhou, Xinlei Wang, Yaoyao He, Yang Wu, Rui Zou, Yuheng Cheng, Yiteng Xie, Wenxuan Liu, Huan Zhao, Yan Xu, Jinjin ... | 2025-09-22 | arXiv | https://github.com/EngiBench/EngiBench | https://doi.org/10.48550/arXiv.2509.17677 |
| 80 | Automated Knowledge Graph Construction using Large Language Models and Sentence Complexity Modelling | Sydney Anuyah, Mehedi Mahmud Kaushik, Sri Rama Krishna Reddy Dwarampudi, Rakesh Shiradkar, Arjan Durresi, Sunandan Chakr... | 2025-09-22 | arXiv | https://github.com/KaushikMahmud/CoDe-KG_EMNLP_2025 | https://doi.org/10.48550/arXiv.2509.17289 |
| 81 | CogAtom: From Cognitive Atoms to Olympiad-level Mathematical Reasoning in Large Language Models | Zhuofan Chen, Jiyuan He, Yichi Zhang, Xing Hu, Haoxing Wen, Jun Bai, Wenge Rong | 2025-09-22 | arXiv | https://github.com/Icarus-1111/CogAtom | https://doi.org/10.48550/arXiv.2509.17318 |
| 82 | A Comprehensive Survey of Small Language Models in the Era of Large Language Models: Techniques, Enhancements, Applications, Collaboration with LLMs, and Trustworthiness | Fali Wang, Zhiwei Zhang, Xianren Zhang, Zongyu Wu, Tzuhao Mo, Qiuhao Lu, W.K. Wang, Rui Li, Junjie Xu, Xianfeng Tang, Qi... | 2025-09-18 | ACM Transactions on Intelligent Systems and Technology | https://github.com/FairyFali/SLMs-Survey | https://doi.org/10.48550/arXiv.2411.03350 |
| 83 | TurnBack: A Geospatial Route Cognition Benchmark for Large Language Models through Reverse Route | Hongyi Luo, Qing Chen, Dihogo Gama de Matos, Hari Krishna Gadi, Yanfeng Zhang, Li Liu, Yongliang Wang, Niclas Zeller, Da... | 2025-09-17 | arXiv | https://github.com/bghjmn32/EMNLP2025_Turnback | https://doi.org/10.48550/arXiv.2509.18173 |
| 84 | Enhancing Base Large Language Models Using Knowledge Graphs for Genomic Annotation | Pranav N. Desai, S. Padhi, Kavya Panicker, Kallakunta Ravi Kumar, Divyaprabha KN | 2025-09-16 | Advances in transdisciplinary engineering | https://github.com/cubed-guy/capstone-kg-llm | https://doi.org/10.3233/atde250737 |
| 85 | AMQ: Enabling AutoML for Mixed-precision Weight-Only Quantization of Large Language Models | Sangjun Lee, Seung-taek Woo, Jungyu Jin, Changhun Lee, Eunhyeok Park | 2025-09-15 | arXiv | https://github.com/dlwns147/amq | https://doi.org/10.48550/arXiv.2509.12019 |
| 86 | Phi: Preference Hijacking in Multi-modal Large Language Models at Inference Time | Yifan Lan, Yuanpu Cao, Weitong Zhang, Lu Lin, Jinghui Chen | 2025-09-15 | arXiv | https://github.com/Yifan-Lan/Phi | https://doi.org/10.48550/arXiv.2509.12521 |
| 87 | PLiCat: Decoding protein-lipid interactions by large language model | Feitong Dong, Jingrou Wu | 2025-09-14 | bioRxiv (Cold Spring Harbor Laboratory) | https://github.com/Noora68/PLiCat | https://doi.org/10.1101/2025.09.09.675043 |
| 88 | EPT Benchmark: Evaluation of Persian Trustworthiness in Large Language Models | Mohammad Reza Mirbagheri, Mohammad Mahdi Mirkamali, Zahra Motoshaker Arani, Ali Javeri, Amir Mahdi Sadeghzadeh, Rasool J... | 2025-09-08 | arXiv | https://github.com/Rezamirbagheri110/EPT-Benchmark | https://doi.org/10.48550/arXiv.2509.06838 |
| 89 | Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models | Yinjie Wang, Ling Yang, Bowen Li, Ye Tian, Ke Shen, Mengdi Wang | 2025-09-08 | arXiv | https://github.com/Gen-Verse/dLLM-RL | https://doi.org/10.48550/arXiv.2509.06949 |
| 90 | CTCC: A Robust and Stealthy Fingerprinting Framework for Large Language Models via Cross-Turn Contextual Correlation Backdoor | Zhenhua Xu, Xixiang Zhao, Xubin Yue, Shengwei Tian, Changting Lin, Meng Han | 2025-09-05 | arXiv | https://github.com/Xuzhenhua55/CTCC | https://doi.org/10.48550/arXiv.2509.09703 |
| 91 | Behavioral Fingerprinting of Large Language Models | Zehua Pei, Hui-Ling Zhen, Yingchun Zhang, Zhiyuan Yang, Xing Li, Xianzhi Yu, Mingxuan Yuan, Bei Yu | 2025-09-02 | arXiv | https://github.com/JarvisPei/Behavioral-Fingerprinting | https://doi.org/10.48550/arXiv.2509.04504 |
| 92 | Good Advisor for Source Localization: Using Large Language Model to Guide the Source Inference Process | Dongpeng Hou, Weifeng Wei, Chao Gao, Xianghua Li, Zhen Wang | 2025-09-01 | OpenAlex | https://github.com/cgao-comp/CRSLL | https://doi.org/10.24963/ijcai.2025/326 |
| 93 | Token-Level Accept or Reject: A Micro Alignment Approach for Large Language Models | Yang Zhang, Yu Yu, Bo Tang, Limin Zhu, Chuxiong Sun, Wenqiang Wei, Jie Hu, Zheng Xie, Zhiyu Li, Feiyu Xiong, Edward Chun... | 2025-09-01 | OpenAlex | https://github.com/IAAR-Shanghai/MARA | https://doi.org/10.48550/arXiv.2505.19743 |
| 94 | LLM4DSR: Leveraing Large Language Model for Denoising Sequential Recommendation | Bohao Wang, Feng Liu, Changwang Zhang, Jiawei Chen, Yudi Wu, Sheng Zhou, Xingyu Lou, Jun Wang, Yan Feng, Chun Chen, Can ... | 2025-08-25 | ACM transactions on office information systems | https://github.com/WANGBohaO-jpg/LLM4DSR | https://doi.org/10.1145/3762182 |
| 95 | E3RG: Building Explicit Emotion-driven Empathetic Response Generation System with Multimodal Large Language Model | Ronghao Lin, Shali Shen, Weipeng Hu, Qiaolin He, Aolin Xiong, Li Huang, Huosheng Hu, Yap‐Peng Tan | 2025-08-18 | OpenAlex | https://github.com/RH-Lin/E3RG | https://doi.org/10.48550/arXiv.2508.12854 |
| 96 | Mitigating Hallucinations in Large Language Models via Causal Reasoning | Yuangang Li, Yiqing Shen, Yi Nian, Jiechao Gao, Ziyi Wang, Chenxiao Yu, Shawn Li, Jie Wang, Xiyang Hu, Yue Zhao | 2025-08-17 | arXiv | https://github.com/MrLYG/CDCR-SFT | https://doi.org/10.48550/arXiv.2508.12495 |
| 97 | GS-DTI: A Graph-Structure-Aware Framework Leveraging Large Language Models for Drug–Target Interaction Prediction | Qinze Yu, Chang Zhou, Jiyue Jiang, Xiangyu Shi, Yu Li | 2025-08-09 | Bioinformatics | https://github.com/purvavideha/GSDTI | https://doi.org/10.1093/bioinformatics/btaf445 |
| 98 | Enhancing Interpretability of Ocular Disease Diagnosis: A Zero-Shot Study of Multimodal Large Language Models | Yating Pan, Janna Hastings | 2025-08-07 | Studies in health technology and informatics | https://github.com/YatingPan/ocular-llm-explainability | https://doi.org/10.3233/shti250910 |
| 99 | A large language model for predicting neurotoxic peptides and neurotoxins | Anand Singh Rathore, Saloni Jain, Shubham Choudhury, Gajendra P. S. Raghava | 2025-08-01 | PubMed | https://github.com/raghavagps/ntxpred2 | https://pubmed.ncbi.nlm.nih.gov/40671295 |
| 100 | CityGPT: Empowering Urban Spatial Cognition of Large Language Models | Jie Feng, Tianhui Liu, Yuwei Du, Siqi Guo, Yuming Lin, Yong Li | 2025-08-01 | OpenAlex | https://github.com/tsinghua-fib-lab/CityGPT | https://doi.org/10.48550/arXiv.2406.13948 |
| 101 | SKiM-GPT: Combining Biomedical Literature-Based Discovery with Large Language Model Hypothesis Evaluation | Jack Freeman, Robert J. Millikin, Liang Xu, Indu Sharma, Bethany M. Moore, Cannon Lock, Kevin W. George, Antonin Bal, Ro... | 2025-07-31 | OpenAlex | https://github.com/stewart-lab/skimgpt | https://doi.org/10.1101/2025.07.28.664797 |
| 102 | BRAVE: a highly accurate method for predicting HIV-1 antibody resistance using large language models for proteins | Mohammed El Anbari, Tatsiana Bylund, Sijy O’Dell, Emily Tourtellott, Krisha McKee, Stephen D. Schmidt, Nonhlanhla N. Mkh... | 2025-07-31 | OpenAlex | https://github.com/kiryst/BRAVE | https://doi.org/10.1101/2025.07.28.667234 |
| 103 | Reading papers: Extraction of molecular interaction networks with large language models | Enio Gjerga, Philipp Wiesenbach, Christoph Dieterich | 2025-07-25 | bioRxiv (Cold Spring Harbor Laboratory) | https://github.com/dieterich-lab/LLM_Relations | https://doi.org/10.1101/2025.07.21.665999 |
| 104 | A Survey on AI Search with Large Language Models | Jian Li, Xiaoxi Li, Yan Zheng, Yizhang Jin, Shuo Wang, Jian Wu, Yabiao Wang, Chengjie Wang, X. Q. Yuan | 2025-07-24 | OpenAlex | https://github.com/swordlidev/Awesome-AI-Search | https://doi.org/10.20944/preprints202507.2024.v1 |
| 105 | textToKnowledgeGraph: Generation of Molecular Interaction Knowledge Graphs Using Large Language Models for Exploration in Cytoscape | Favour James, Christopher Churas, Dexter Pratt, Augustin Luna | 2025-07-21 | OpenAlex | https://github.com/ndexbio/llm-text-to-knowledge-graph | https://doi.org/10.1101/2025.07.17.664328 |
| 106 | BioPars: A Pretrained Biomedical Large Language Model for Persian Biomedical Text Mining | Baqer M. Merzah, Tania Taami, Salman Asoudeh, Amir reza Hossein pour, Saeed Mirzaee, Amir Ali Bengari | 2025-07-21 | OpenAlex | https://github.com/amirap80/BioPars | https://doi.org/10.21203/rs.3.rs-6823379/v1 |
| 107 | Empowering Universal Robot Programming with Fine-Tuned Large Language Models | Tien Dat Le, Minhhuy Le | 2025-07-15 | EAI Endorsed Transactions on AI and Robotics | https://github.com/t1end4t/llm-robotics | https://doi.org/10.4108/airo.8983 |
| 108 | A Survey on the Memory Mechanism of Large Language Model based Agents | Zeyu Zhang, Quanyu Dai, Xiaohe Bo, Chen Ma, Rui Li, Xu Chen, Jieming Zhu, Zhenhua Dong, Ji-Rong Wen | 2025-07-11 | ACM Transactions on Information Systems | https://github.com/nuster1128/LLM_Agent_Memory_Survey | https://doi.org/10.48550/arXiv.2404.13501 |
| 109 | Conversational health agents: a personalized large language model-powered agent framework | Mahyar Abbasian, Iman Azimi, Amir M. Rahmani, Ramesh Jain | 2025-07-03 | JAMIA Open | https://github.com/Institute4FutureHealth/CHA | https://doi.org/10.1093/jamiaopen/ooaf067 |
| 110 | ParaStudent: Generating and Evaluating Realistic Student Code by Teaching LLMs to Struggle | Mihran Miroyan, Rose Niousha, Joseph E. Gonzalez, Gireeja Ranade, Narges Norouzi | 2025-07-01 | arXiv | https://github.com/mmiroyan/ParaStudent | http://arxiv.org/abs/2507.12674v1 |
| 111 | Analysis of Image-and-Text Uncertainty Propagation in Multimodal Large Language Models with Cardiac MR-Based Applications | Yucheng Tang, Yunguan Fu, Weixi Yi, Yipei Wang, Daniel C. Alexander, Rhodri H. Davies, Yipeng Hu | 2025-07-01 | Lecture notes in computer science | https://github.com/yucheng722/MUPM | https://doi.org/10.1007/978-3-032-04965-0_4 |
| 112 | spaLLM: enhancing spatial domain analysis in multi-omics data through large language model integration | Longyi Li, Liyan Dong, Hao Zhang, Dong Xu, Yongli Li | 2025-07-01 | Briefings in Bioinformatics | https://github.com/liiilongyi/spaLLM | https://doi.org/10.1093/bib/bbaf304 |
| 113 | The benefits of query-based KGQA systems for complex and temporal questions in LLM era | Artem Alekseev, Mikhail Chaichuk, Miron Butko, Alexander Panchenko, Elena Tutubalina, Oleg Somov | 2025-07-01 | arXiv | https://github.com/ar2max/NLDB-KGQA-System | http://arxiv.org/abs/2507.11954v1 |
| 114 | The Evolving Role of Large Language Models in Scientific Innovation: Evaluator, Collaborator, and Scientist | Haoxuan Zhang, Ruochi Li, Yang Zhang, Ting Xiao, Jiangping Chen, Junhua Ding, Haihua Chen | 2025-07-01 | arXiv | https://github.com/haoxuan-unt2024/llm4innovation | https://doi.org/10.48550/arXiv.2507.11810 |
| 115 | The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs | Zichen Wen, Jiashu Qu, Dongrui Liu, Zhiyuan Liu, Ruixi Wu, Yicun Yang, Xiangqi Jin, Haoyun Xu, Xuyang Liu, Weijia Li, Ch... | 2025-07-01 | arXiv | https://github.com/ZichenWen1/DIJA | http://arxiv.org/abs/2507.11097v1 |
| 116 | Text-ADBench: Text Anomaly Detection Benchmark based on LLMs Embedding | Feng Xiao, Jicong Fan | 2025-07-01 | arXiv | https://github.com/jicongfan/Text-Anomaly-Detection-Benchmark | http://arxiv.org/abs/2507.12295v1 |
| 117 | Warehouse Spatial Question Answering with LLM Agent | Hsiang-Wei Huang, Jen-Hao Cheng, Kuang-Ming Chen, Cheng-Yen Yang, Bahaa Alattar, Yi-Ru Lin, Pyongkun Kim, Sangwon Kim, K... | 2025-07-01 | arXiv | https://github.com/hsiangwei0903/SpatialAgent | http://arxiv.org/abs/2507.10778v1 |
| 118 | Mono-InternVL-1.5: Towards Cheaper and Faster Monolithic Multimodal Large Language Models | Gen Luo, Wenhan Dou, Wenhao Li, Zhaokai Wang, Xue Yang, Changyao Tian, Hao Li, Weiyun Wang, Wenhai Wang, Xizhou Zhu, Yu ... | 2025-07-01 | arXiv | https://github.com/OpenGVLab/Mono-InternVL | https://doi.org/10.48550/arXiv.2507.12566 |
| 119 | Marco-Bench-MIF: On Multilingual Instruction-Following Capability of Large Language Models | Bo Zeng, Chenyang Lyu, Sinuo Liu, Mingyan Zeng, Minghao Wu, Xuanfan Ni, Tianqi Shi, Yu Zhao, Yefeng Liu, Chenyu Zhu, Rui... | 2025-07-01 | arXiv | https://github.com/AIDC-AI/Marco-Bench-MIF | https://doi.org/10.48550/arXiv.2507.11882 |
| 120 | Leveraging large language models to predict antibiotic resistance in Mycobacterium tuberculosis | Conrad Testagrose, Sakshi Pandey, Mohammadali Serajian, Simone Marini, Mattia Prosperi, Christina Boucher | 2025-07-01 | Bioinformatics | https://github.com/ctestagrose/LLMTB | https://doi.org/10.1093/bioinformatics/btaf232 |
| 121 | Internal Value Alignment in Large Language Models through Controlled Value Vector Activation | Haoran Jin, Meng Li, Xiting Wang, Zhihao Xu, Minlie Huang, Yantao Jia, Defu Lian | 2025-07-01 | OpenAlex | https://github.com/hr-jin/ConVA | https://doi.org/10.18653/v1/2025.acl-long.1326 |
| 122 | First-Order Error Matters: Accurate Compensation for Quantized Large Language Models | Xingyu Zheng, Haotong Qin, Yuye Li, Jiakai Wang, Jinyang Guo, Michele Magno, Xianglong Liu | 2025-07-01 | arXiv | https://github.com/Xingyu-Zheng/FOEM | https://doi.org/10.48550/arXiv.2507.11017 |
| 123 | Exploring the potential of lightweight large language models for AI-based mental health counselling task: a novel comparative study | Ritesh Maurya, Nikhil Kumar Rajput, M G Diviit, Satyajit Mahapatra, Manish Kumar Ojha | 2025-07-01 | Scientific Reports | https://github.com/diviitmg03/Comparative-analysis-of-LLMs-.git | https://doi.org/10.1038/s41598-025-05012-1 |
| 124 | DrafterBench: Benchmarking Large Language Models for Tasks Automation in Civil Engineering | Yinsheng Li, Zhen Dong, Yi Shao | 2025-07-01 | arXiv | https://github.com/Eason-Li-AIS/DrafterBench | https://doi.org/10.48550/arXiv.2507.11527 |
| 125 | DSSD: Efficient Edge-Device LLM Deployment and Collaborative Inference via Distributed Split Speculative Decoding | Jiahong Ning, Ce Zheng, Tingting Yang | 2025-07-01 | arXiv | https://github.com/JasonNing96/DSSD-Efficient-Edge-Computing | http://arxiv.org/abs/2507.12000v2 |
| 126 | DrugTar Improves Druggability Prediction by Integrating Large Language Models and Gene Ontologies | Niloofar Borhani, Iman Izadi, Ali Motahharynia, Mahsa Sheikholeslami, Yousof Gheisari | 2025-06-24 | bioRxiv (Cold Spring Harbor Laboratory) | https://github.com/NBorhani/DrugTar | https://doi.org/10.1093/bioinformatics/btaf360 |
| 127 | Finding the Dark Matter: Large Language Model-based Enzyme Kinetic Data Extractor and Its Validation | G. Wei, Xinchun Ran, Runeem Al-Abssi, Zhongyue Yang | 2025-06-20 | OpenAlex | https://github.com/ChemBioHTP/EnzyExtract | https://doi.org/10.26434/chemrxiv-2025-pb73x-v2 |
| 128 | LANG: A Lesson Plan Generation Framework via Multi-Form Interaction with Large Language Models | Yong Ouyang, Jinhao Quan, Huan-Wen Wang, Yawen Zeng, Lingyu Chen | 2025-06-17 | Research Square (Research Square) | https://github.com/ssakana/LANG | https://doi.org/10.21203/rs.3.rs-6808103/v1 |
| 129 | Prime the search: Using large language models for guiding geometric task and motion planning by warm-starting tree search | Dongryung Lee, Se June Joo, Kimin Lee, Beomjoon Kim | 2025-06-06 | The International Journal of Robotics Research | https://github.com/iMSquared/prime-the-search | https://doi.org/10.1177/02783649251347307 |
| 130 | Improving drug-drug interaction prediction via in-context learning and judging with large language models | He Qi, Xiaoqiang Li, Chengcheng Zhang, Tianyi Zhao | 2025-06-02 | Frontiers in Pharmacology | https://github.com/zcc1203/ddi-judge | https://doi.org/10.3389/fphar.2025.1589788 |
| 131 | Survey on Factuality in Large Language Models | Cunxiang Wang, Xiaoze Liu, Yuanhao Yue, Qipeng Guo, Xiangkun Hu, Xiangru Tang, Tianhang Zhang, Cheng Jiayang, Yunzhi Yao... | 2025-06-02 | ACM Computing Surveys | https://github.com/wangcunxiang/LLM-Factuality-Survey | https://doi.org/10.1145/3742420 |
| 132 | SummArIzeR: Simplifying cross-database enrichment result clustering and annotation via large language models | Marie Brinkmann, Michael Bonelli, Anela Tosevska | 2025-06-01 | bioRxiv (Cold Spring Harbor Laboratory) | https://github.com/bonellilab/SummArIzeR | https://doi.org/10.1101/2025.05.28.656331 |
| 133 | The accuracy and efficiency of large language models for chart review in cancer genetics | James Dickerson, Margaret Shaw, Mina Satoyoshi, Sonia Rios‐Ventura, Kerry Kingham, Allison W. Kurian, Jennifer L. Caswel... | 2025-05-28 | Journal of Clinical Oncology | https://github.com/MrJimb0/ASCO2025 | https://doi.org/10.1200/jco.2025.43.16_suppl.e22603 |
| 134 | Mitigating Age-Related Bias in Large Language Models: Strategies for Responsible Artificial Intelligence Development | Zhuang Liu, S. Qian, Shuirong Cao, Tianyu Shi | 2025-05-21 | INFORMS journal on computing | https://github.com/INFORMSJoC/2024.0645 | https://doi.org/10.1287/ijoc.2024.0645 |
| 135 | ProtFun: A Protein Function Prediction Model Using Graph Attention Networks with a Protein Large Language Model | Muhammed Talo, Serdar Bozdag | 2025-05-17 | bioRxiv (Cold Spring Harbor Laboratory) | https://github.com/bozdaglab/ProtFun | https://doi.org/10.1101/2025.05.13.653854 |
| 136 | Social determinants of health extraction from clinical notes across institutions using large language models | Vipina K. Keloth, Salih Selek, Qingyu Chen, Christopher Gilman, Sunyang Fu, Yifang Dang, Xinghan Chen, Xinyue Hu, Yujia ... | 2025-05-17 | npj Digital Medicine | https://github.com/BIDS-Xu-Lab/LLMs4SDoH | https://doi.org/10.1038/s41746-025-01645-8 |
| 137 | Exploring Zero-Shot Cross-Lingual Biomedical Concept Normalization via Large Language Models | Hossein Rouhizadeh, Anthony Yazdani, Boya Zhang, Douglas Teodoro | 2025-05-15 | Studies in health technology and informatics | https://github.com/hrouhizadeh/zsh_cl_bcn | https://doi.org/10.1101/2025.02.27.25323007 |
| 138 | Leveraging Large Language Models for Literature-Driven Prioritization of Protein Binding Pockets | Roman Stratiichuk, Mykola Melnychenko, Ihor Koleiev, Taras Voitsitskyi, Husak Vladyslav, Наталія Анатоліївна Шевчук, Zak... | 2025-05-15 | bioRxiv (Cold Spring Harbor Laboratory) | https://github.com/MelnychenkoM/LLM-benchmark-dataset | https://doi.org/10.1093/bioinformatics/btaf449 |
| 139 | UrbanPlanBench: A Comprehensive Urban Planning Benchmark for Evaluating Large Language Models | Yu Zheng, Longyi Liu, Yuming Lin, Jie Feng, Guozhen Zhang, Depeng Jin, Yong Li | 2025-04-30 | Research Square (Research Square) | https://github.com/tsinghua-fib-lab/PlanBench | https://doi.org/10.21203/rs.3.rs-6551071/v1 |
| 140 | Evaluating Personality Traits of Large Language Models Through Scenario-based Interpretive Benchmarking | Alessandro Berti | 2025-04-09 | OpenAlex | https://github.com/fit-alessandro-berti/llm-dreams-benchmark | https://doi.org/10.20944/preprints202504.0435.v1 |
| 141 | Improving Text-to-SQL Conversion for Low-Resource Languages Using Large Language Models | Emır Öztürk | 2025-03-26 | Bitlis Eren Üniversitesi Fen Bilimleri Dergisi | https://github.com/emirozturk/TT2SQL | https://doi.org/10.17798/bitlisfen.1561298 |
| 142 | Prompting and Fine-Tuning Open-Sourced Large Language Models for Stance Classification | Iain J. Cruickshank, Lynnette Hui Xian Ng | 2025-03-25 | ACM Transactions on Intelligent Systems and Technology | https://github.com/ijcruic/LLM-Stance-Labeling | https://doi.org/10.1145/3725816 |
| 143 | Enhancing Gene Set Overrepresentation Analysis with Large Language Models | Jianjun Zhu, Rebecca Y. Wang, Xiaoting Wang, Ricardo B. R. Azevedo, Alexander Moreno, Julia Kuhn, Zia Khan | 2025-03-13 | bioRxiv (Cold Spring Harbor Laboratory) | https://github.com/Alector-BIO/llm2geneset | https://doi.org/10.1101/2024.11.11.621189 |
| 144 | NTxPred2: A large language model for predicting neurotoxic peptides and neurotoxins | Anand Singh Rathore, Saloni Jain, Shubham Choudhury, Gajendra P. S. Raghava | 2025-03-07 | bioRxiv (Cold Spring Harbor Laboratory) | https://github.com/raghavagps/ntxpred2 | https://doi.org/10.1101/2025.03.01.640936 |
| 145 | Automatic recognition of cross-language classic entities based on large language models | Qiankun Xu, Yutong Liu, Dongbo Wang, Huang Shuiqing | 2025-03-03 | OpenAlex | https://github.com/Xunzi-LLM-of-Chinese-classics/XunziALLM | https://doi.org/10.1038/s40494-025-01624-y |
| 146 | SensitiveCancerGPT: Leveraging Generative Large Language Model on Structured Omics Data to Optimize Drug Sensitivity Prediction | Shaika Chowdhury, Sivaraman Rajaganapathy, Lichao Sun, Liewei Wang, Ping Yang, James R. Cerhan, Nansu Zong | 2025-02-28 | bioRxiv (Cold Spring Harbor Laboratory) | https://github.com/bioIKEA/SensitiveCancerGPT | https://doi.org/10.1101/2025.02.27.640661 |
| 147 | Dynamic Low-Rank Sparse Adaptation for Large Language Models | Weizhong Huang, Yuxin Zhang, Xiawu Zheng, Yang Liu, Jing Lin, Yiwu Yao, Rongrong Ji | 2025-02-20 | ICLR | https://github.com/wzhuang-xmu/LoSA | https://openreview.net/forum?id=oXh0939Zzq |
| 148 | CORBA: Contagious Recursive Blocking Attacks on Multi-Agent Systems Based on Large Language Models | Zhenhong Zhou, Zherui Li, Jie Zhang, Yuanhe Zhang, Kun Wang, Yang Liu, Qing Guo | 2025-02-20 | arXiv | https://github.com/zhrli324/Corba | https://doi.org/10.48550/arXiv.2502.14529 |
| 149 | TritonBench: Benchmarking Large Language Model Capabilities for Generating Triton Operators | Jianling Li, Shaohui Li, Zhihao Gao, Qi Shi, Yuxuan Li, Zefan Wang, Jie Huang, Haojie Wang, Jianrong Wang, Xu Han, Zhiyu... | 2025-02-20 | Findings of the Association for Computational Linguistics: ACL 2025 | https://github.com/thunlp/TritonBench | https://doi.org/10.18653/v1/2025.findings-acl.1183 |
| 150 | AI-Empowered Catalyst Discovery: A Survey from Classical Machine Learning Approaches to Large Language Models | Yuanyuan Xu, Hanchen Wang, Wenjie Zhang, Lexing Xie, Yin Chen, Flora D. Salim, Ying Zhang, J. Justin Gooding, Toby Walsh | 2025-02-19 | arXiv | https://github.com/LuckyGirl-XU/Awesome-Artificial-Intelligence-Empowered-Catalyst-Discovery | https://doi.org/10.48550/arXiv.2502.13626 |
| 151 | Collaborative Retrieval for Large Language Model-based Conversational Recommender Systems | Yaochen Zhu, Chao Wan, Harald Steck, Dawen Liang, Yesu Feng, Nathan Kallus, Jundong Li | 2025-02-19 | OpenAlex | https://github.com/yaochenzhu/CRAG | https://doi.org/10.48550/arXiv.2502.14137 |
| 152 | Lost in Sequence: Do Large Language Models Understand Sequential Recommendation? | Sein Kim, Hongseok Kang, Kibum Kim, Jiwan Kim, Donghyun Kim, Minchul Yang, Kwangjin Oh, Julian J. McAuley, Chanyoung Par... | 2025-02-19 | OpenAlex | https://github.com/Sein-Kim/LLM-SRec | https://doi.org/10.48550/arXiv.2502.13909 |
| 153 | On the logical skills of large language models: evaluations using arbitrarily complex first-order logic problems | Shokhrukh Ibragimov, Arnulf Jentzen, Benno Kuckuck | 2025-02-19 | arXiv | https://github.com/bkuckuck/logical-skills-of-llms | https://doi.org/10.48550/arXiv.2502.14180 |
| 154 | SEA: Low-Resource Safety Alignment for Multimodal Large Language Models via Synthetic Embeddings | Weikai Lu, Hao Peng, Huiping Zhuang, Cen Chen, Ziqian Zeng | 2025-02-18 | OpenAlex | https://github.com/ZeroNLP/SEA | https://doi.org/10.18653/v1/2025.acl-long.1212 |
| 155 | PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models | Jiaqi Zhao, Miao Zhang, Ming Wang, Yuzhang Shang, Kaihao Zhang, Weili Guan, Yaowei Wang, Danshi Wang | 2025-02-18 | OpenAlex | https://github.com/zjq0455/PTQ1.61 | https://doi.org/10.18653/v1/2025.acl-long.225 |
| 156 | G-Refer: Graph Retrieval-Augmented Large Language Model for Explainable Recommendation | Yuhan Li, Xinni Zhang, Linhao Luo, Heng Chang, Yuxiang Ren, Irwin King, Jia Li | 2025-02-18 | OpenAlex | https://github.com/Yuhan1i/G-Refer | https://doi.org/10.48550/arXiv.2502.12586 |
| 157 | Evaluation of Large Language Models for an AI Chat Assistant Focused on Pumas and Pharmacometrics | Juan Javier González Barbosa, Agastya Vinchhi, Vijay Ivaturi | 2025-02-18 | OpenAlex | https://github.com/explodinggradients/ragas | https://doi.org/10.70534/jnza2834 |
| 158 | Evaluation of ChatGPT and Gemini large language models for pharmacometrics with NONMEM | Euibeom Shin, Yifan Yu, Robert R. Bies, Murali Ramanathan | 2025-02-18 | Journal of Pharmacokinetics and Pharmacodynamics | https://github.com/metrumresearchgroup/mrgsolve20 | https://doi.org/10.21203/rs.3.rs-4189234/v1 |
| 159 | | Vishal Dey, Xiao Hu, Xia Ning | 2025-02-18 | arXiv (Cornell University) | https://github.com/ninglab/GeLLMO | http://arxiv.org/abs/2502.13398 |
| 160 | A Survey of Personalized Large Language Models: Progress and Future Directions | Jiahong Liu, Zexuan Qiu, Zhongyang Li, Quanyu Dai, Jieming Zhu, Minda Hu, Menglin Yang, Irwin King | 2025-02-17 | arXiv | https://github.com/JiahongLiu21/Awesome-Personalized-Large-Language-Models | https://doi.org/10.48550/arXiv.2502.11528 |
| 161 | Idiosyncrasies in Large Language Models | Ming-Jie Sun, Yue Yin, Zeshui Xu, J. Zico Kolter, Zhuang Liu | 2025-02-17 | arXiv | https://github.com/locuslab/llm-idiosyncrasies | https://doi.org/10.48550/arXiv.2502.12150 |
| 162 | RIDE: Enhancing Large Language Model Alignment through Restyled In-Context Learning Demonstration Exemplars | Yuncheng Hua, Lizhen Qu, Zhuang Li, Hao Xue, Flora D. Salim, Gholamreza Haffari | 2025-02-17 | arXiv | https://github.com/AnonymousCode-ComputerScience/RIDE | https://doi.org/10.48550/arXiv.2502.11681 |
| 163 | VRoPE: Rotary Position Embedding for Video Large Language Models | Zikang Liu, Longteng Guo, Yepeng Tang, Junxian Cai, Kai Ma, Xi Chen, Jing Liu | 2025-02-17 | arXiv | https://github.com/johncaged/VRoPE | https://doi.org/10.48550/arXiv.2502.11664 |
| 164 | SURGE: On the Potential of Large Language Models as General-Purpose Surrogate Code Executors | Bo Lyu, Susan S. Huang, Zhengzhao Liang | 2025-02-16 | arXiv | https://github.com/Imbernoulli/SURGE | https://doi.org/10.48550/arXiv.2502.11167 |
| 165 | Utilizing Pretrained Vision Transformers and Large Language Models for Epileptic Seizure Prediction | Paras Parani, Umair Mohammad, Fahad Saeed | 2025-02-16 | bioRxiv (Cold Spring Harbor Laboratory) | https://github.com/pcdslab/UtilLLM_EPS | https://doi.org/10.1109/cdma61895.2025.00028 |
| 166 | CORDIAL: Can Multimodal Large Language Models Effectively Understand Coherence Relationships? | Aashish Anantha Ramakrishnan, Aadarsh Anantha Ramakrishnan, Dongwon Lee | 2025-02-16 | OpenAlex | https://github.com/aashish2000/CORDIAL | https://doi.org/10.18653/v1/2025.acl-long.1033 |
| 167 | Reasoning-Augmented Conversation for Multi-Turn Jailbreak Attacks on Large Language Models | Zonghao Ying, Deyue Zhang, Zonglei Jing, Yisong Xiao, Quanchen Zou, Aishan Liu, Siyuan Liang, Xiangzheng Zhang, Xianglon... | 2025-02-16 | arXiv | https://github.com/NY1024/RACE | https://doi.org/10.48550/arXiv.2502.11054 |
| 168 | Exposing Numeracy Gaps: A Benchmark to Evaluate Fundamental Numerical Abilities in Large Language Models | Haoyang Li, Xuejia Chen, Zhanchao Xu, Darian Li, Nicole Hu, Fei Teng, Yiming Li, Luyu Qiu, Chen Jason Zhang, Qing Li, Le... | 2025-02-16 | Findings of the Association for Computational Linguistics: ACL 2025 | https://github.com/TreeAI-Lab/NumericBench | https://doi.org/10.18653/v1/2025.findings-acl.1026 |
| 169 | BoT: Breaking Long Thought Processes of o1-like Large Language Models through Backdoor Attack | Zihao Zhu, Hongbao Zhang, Mingda Zhang, Ruotong Wang, Guanzong Wu, Ke Xu, Baoyuan Wu | 2025-02-16 | arXiv | https://github.com/zihao-ai/BoT | https://doi.org/10.48550/arXiv.2502.12202 |
| 170 | Injecting Domain-Specific Knowledge into Large Language Models: A Comprehensive Survey | Zhihua Song, Bin Yan, Yuhan Liu, Miao Fang, Mingzhe Li, Rui Yan, Xiuying Chen | 2025-02-15 | arXiv | https://github.com/abilliyb/Knowledge_Injection_Survey_Papers | https://doi.org/10.48550/arXiv.2502.10708 |
| 171 | LANTERN: Leveraging Large Language Models and Transformers for Enhanced Molecular Interactions | Cong Nga Ha, Phuong Viet Pham, Truong Son Hy | 2025-02-15 | bioRxiv (Cold Spring Harbor Laboratory) | https://github.com/HySonLab/LANTERN | https://doi.org/10.1101/2025.02.10.637522 |
| 172 | SQuARE: Sequential Question Answering Reasoning Engine for Enhanced Chain-of-Thought in Large Language Models | Daniel Fleischer, Moshe Berchansky, George Markovits, Moshe Wasserblat | 2025-02-13 | arXiv | https://github.com/IntelLabs/RAG-FiT | https://doi.org/10.48550/arXiv.2502.09390 |
| 173 | Data Augmentation to Improve Large Language Models in Food Hazard and Product Detection | Areeg Fahad Rasheed, Mahdi Zarkoosh, Shimam Amer Chasib, Safa F. Abbas | 2025-02-12 | arXiv | https://github.com/AREEG94FAHAD/food-hazard-prdouct-cls | https://doi.org/10.48550/arXiv.2502.08687 |
| 174 | Do Large Language Models have Spatial Cognitive Abilities? | Ruoling Wu, Danhuai Guo | 2025-02-11 | ACM Transactions on Intelligent Systems and Technology | https://github.com/LLING000/SCABenchmark | https://doi.org/10.1145/3716855 |
| 175 | DrugImproverGPT: A Large Language Model for Drug Optimization with Fine-Tuning via Structured Policy Optimization | Xuefeng Liu, Songhao Jiang, Siyu Chen, Zhuoran Yang, Yuxin Chen, Ian T. Foster, Rick Stevens | 2025-02-10 | arXiv | https://github.com/xuefeng-cs/DrugImproverGPT | https://doi.org/10.48550/arXiv.2502.07237 |
| 176 | Large Language Models Meet Symbolic Provers for Logical Reasoning Evaluation | Chengwen Qi, Ren Ma, Bowen Li, He Du, Binyuan Hui, Jinwang Wu, Yuanjun Laili, Conghui He | 2025-02-10 | ICLR | https://github.com/opendatalab/ProverGen | https://openreview.net/forum?id=C25SgeXWjE |
| 177 | Large Language Models in Software Security: A Survey of Vulnerability Detection Techniques and Insights | Ze Sheng, Zhicheng Chen, Shanqiang Gu, Heqing Huang, Guofei Gu, Jeff Huang | 2025-02-10 | arXiv (Cornell University) | https://github.com/OwenSanzas/LLM-For-Vulnerability-Detection | http://arxiv.org/abs/2502.07049 |
| 178 | RALLRec: Improving Retrieval Augmented Large Language Model Recommendation with Representation Learning | Jian Xu, Sichun Luo, Xiangyu Chen, Haifeng Huang, Hanxu Hou, Linqi Song | 2025-02-09 | OpenAlex | https://github.com/JianXu95/RALLRec | https://doi.org/10.48550/arXiv.2502.06101 |
| 179 | Top-DTI: Integrating Topological Deep Learning and Large Language Models for Drug Target Interaction Prediction | Muhammed Talo, Serdar Bozdag | 2025-02-08 | bioRxiv (Cold Spring Harbor Laboratory) | https://github.com/bozdaglab/Top_DTI | https://doi.org/10.1093/bioinformatics/btaf183 |
| 180 | XiHeFusion: Harnessing Large Language Models for Science Communication in Nuclear Fusion | Xiao Wang, Qingquan Yang, Fuling Wang, Qiang Chen, Wann‐Yih Wu, Yu Jin, Jun Jiang, Liang Jin, Bo Jiang, Dengdi Sun, Wenz... | 2025-02-08 | arXiv | https://github.com/Event-AHU/XiHeFusion | https://doi.org/10.48550/arXiv.2502.05615 |
| 181 | Predicting Large Language Model Capabilities on Closed-Book QA Tasks Using Only Information Available Prior to Training | Changhao Jiang, Ming Zhang, Junjie Ye, Xiaoran Fan, Yifei Cao, Jiajun Sun, Zhiheng Xi, Shihan Dou, Yi Dong, Yujiong Shen... | 2025-02-06 | arXiv | https://github.com/yuhui1038/SMI | https://doi.org/10.48550/arXiv.2502.04066 |
| 182 | Knowledge Distillation from Large Language Models for Household Energy Modeling | Mohannad Takrouri, Nicolas Mauricio Cuadrado, Martin Takáč | 2025-02-05 | arXiv | https://github.com/Singularity-AI-Lab/LLM-Energy-Knowledge-Distillation | https://doi.org/10.48550/arXiv.2502.03034 |
| 183 | Intent Representation Learning with Large Language Model for Recommendation | Yu Wang, Lei Sang, Yi Zhang, Yiwen Zhang | 2025-02-05 | Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval | https://github.com/wangyu0627/IRLLRec | https://doi.org/10.1145/3726302.3730011 |
| 184 | Risk-Aware Driving Scenario Analysis with Large Language Models | Y. S. Gao, Mattia Piccinini, Johannes Betz | 2025-02-04 | arXiv | https://github.com/yuangao-tum/Riskaware-Scenario-analyse | https://doi.org/10.48550/arXiv.2502.02145 |
| 185 | Reinforced Prompt Personalization for Recommendation with Large Language Models | Wenyu Mao, Jiancan Wu, Weijian Chen, Chongming Gao, Xiang Wang, Xiangnan He | 2025-02-04 | ACM Transactions on Information Systems | https://github.com/maowenyu-11/RPP | https://doi.org/10.48550/arXiv.2407.17115 |
| 186 | SAISA: Towards Multimodal Large Language Models with Both Training and Inference Efficiency | Qianhao Yuan, Yanjiang Liu, Yaojie Lu, Hongyu Lin, Ben He, Xianpei Han, Le Sun | 2025-02-04 | arXiv | https://github.com/icip-cas/SAISA | https://doi.org/10.48550/arXiv.2502.02458 |
| 187 | AtmosSci-Bench: Evaluating the Recent Advance of Large Language Model for Atmospheric Science | Chenyue Li, Wen Deng, Mengqian Lu, Binhang Yuan | 2025-02-03 | arXiv | https://github.com/Relaxed-System-Lab/AtmosSci-Bench | https://doi.org/10.48550/arXiv.2502.01159 |
| 188 | Breaking Focus: Contextual Distraction Curse in Large Language Models | Yue Huang, Yanbo Wang, Zixiang Xu, Chujie Gao, Siyuan Wu, Jiayi Ye, Xiuying Chen, Pin-Yu Chen, Xiangliang Zhang | 2025-02-03 | arXiv | https://github.com/wyf23187/LLM_CDV | https://doi.org/10.48550/arXiv.2502.01609 |
| 189 | AdaSVD: Adaptive Singular Value Decomposition for Large Language Models | Zhiteng Li, Mingyuan Xia, Jingyuan Zhang, Hui Zheng, Linghe Kong, Yulun Zhang, Xiaokang Yang | 2025-02-03 | arXiv | https://github.com/ZHITENGLI/AdaSVD | https://doi.org/10.48550/arXiv.2502.01403 |
| 190 | sciLaMA: A Single-Cell Representation Learning Framework to Leverage Prior Knowledge from Large Language Models | Hongru Hu, Shuwen Zhang, Yongin Choi, Venkat S. Malladi, Gerald Quon | 2025-02-03 | bioRxiv (Cold Spring Harbor Laboratory) | https://github.com/microsoft/sciLaMA | https://doi.org/10.1101/2025.01.28.635153 |
| 191 | Robust-LLaVA: On the Effectiveness of Large-Scale Robust Image Encoders for Multi-modal Large Language Models | Hashmat Shadab Malik, Fahad Shamshad, Muzammal Naseer, Karthik Nandakumar, Fahad Shahbaz Khan, Salman Khan | 2025-02-03 | arXiv | https://github.com/HashmatShadab/Robust-LLaVA | https://doi.org/10.48550/arXiv.2502.01576 |
| 192 | LIBRA: Measuring Bias of Large Language Model from a Local Context | B. Y. Pang, Tingrui Qiao, Caroline Walker, Chris Cunningham, Yun Sing Koh | 2025-02-01 | Lecture notes in computer science | https://github.com/ipangbo/LIBRA | https://doi.org/10.1007/978-3-031-88708-6_1 |
| 193 | MetaOpenFOAM 2.0: Large Language Model Driven Chain of Thought for Automating CFD Simulation and Post-Processing | Yuxuan Chen, Xu Zhu, Hua Zhou, Zhuyin Ren | 2025-02-01 | arXiv | https://github.com/Terry-cyx/MetaOpenFOAM | https://doi.org/10.48550/arXiv.2502.00498 |
| 194 | Speculative Ensemble: Fast Large Language Model Ensemble via Speculation | Jiale Fu, Yuchu Jiang, Junkai Chen, Jiaming Fan, Peng Geng, Yang Xu | 2025-02-01 | arXiv | https://github.com/Kamichanw/Speculative-Ensemble | https://doi.org/10.48550/arXiv.2502.01662 |
| 195 | LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of Large Language Models | Shenghao Fu, Qize Yang, Qijie Mo, Junkai Yan, Xihan Wei, Jingke Meng, Xiaohua Xie, Wei-Shi Zheng | 2025-01-31 | 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) | https://github.com/iSEE-Laboratory/LLMDet | https://openaccess.thecvf.com/content/CVPR2025/html/Fu_LLMDet_Learning_Strong_Open-Vocabulary_Object_Detectors_under_the_Supervision_of_CVPR_2025_paper.html |
| 196 | Panacea: Mitigating Harmful Fine-tuning for Large Language Models via Post-fine-tuning Perturbation | Yibo Wang, Tiansheng Huang, Li Shen, Huanjin Yao, Haotian Luo, Rui Liu, Naiqiang Tan, Jiaxing Huang, Dacheng Tao | 2025-01-29 | arXiv | https://github.com/w-yibo/Panacea | https://doi.org/10.48550/arXiv.2501.18100 |
| 197 | Virus: Harmful Fine-tuning Attack for Large Language Models Bypassing Guardrail Moderation | Tiansheng Huang, Sihao Hu, Fatih İlhan, Selim Furkan Tekin, Ling Liu | 2025-01-29 | arXiv | https://github.com/git-disl/Virus | https://doi.org/10.48550/arXiv.2501.17433 |
| 198 | SafeRAG: Benchmarking Security in Retrieval-Augmented Generation of Large Language Model | Xun Liang, Simin Niu, Zhiyu Li, Sensen Zhang, Hanyu Wang, Feiyu Xiong, Jianqing Fan, Bo Tang, Shichao Song, Mengwei Wang... | 2025-01-28 | OpenAlex | https://github.com/IAAR-Shanghai/SafeRAG | https://doi.org/10.18653/v1/2025.acl-long.230 |
| 199 | Large Language Model Critics for Execution-Free Evaluation of Code Changes | Aashish Yadavally, Hoan Anh Nguyen, Laurent Callot, Gauthier Guinet | 2025-01-27 | arXiv | https://github.com/amazon-science/code-agent-eval | https://doi.org/10.48550/arXiv.2501.16655 |
| 200 | Analyzing and Boosting the Power of Fine-Grained Visual Recognition for Multi-modal Large Language Models | Hulingxiao He, Geng Li, Zengmin Geng, Jinglin Xu, Yuxin Peng | 2025-01-25 | ICLR | https://github.com/PKU-ICST-MIPL/Finedefics_ICLR2025 | https://openreview.net/forum?id=p3NKpom1VL |
| 201 | JustLogic: A Comprehensive Benchmark for Evaluating Deductive Reasoning in Large Language Models | Michael K. Chen, Xikun Zhang, Dacheng Tao | 2025-01-24 | arXiv | https://github.com/michaelchen-lab/JustLogic | https://doi.org/10.48550/arXiv.2501.14851 |
| 202 | Softplus Attention with Re-weighting Boosts Length Extrapolation in Large Language Models | Bo Gao, Michael W. Spratling | 2025-01-23 | arXiv | https://github.com/iminfine/freeatten | https://doi.org/10.48550/arXiv.2501.13428 |
| 203 | OstQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitting | Xing Hu, Yuan Cheng, Dawei Yang, Zhixuan Chen, Zukang Xu, Jiangyong Yu, Chen Xu, Zhihang Yuan, Zhe Jiang, Sifan Zhou | 2025-01-23 | ICLR | https://github.com/BrotherHappy/OSTQuant | https://openreview.net/forum?id=rAcgDBdKnP |
| 204 | Can Large Language Models Understand Preferences in Personalized Recommendation? | Zhaoxuan Tan, Zinan Zeng, Qingkai Zeng, Zhenyu Wu, Zheyuan Liu, Fengran Mo, Meng Jiang | 2025-01-23 | arXiv | https://github.com/TamSiuhin/PerRecBench | https://doi.org/10.48550/arXiv.2501.13391 |
| 205 | An Empirical Characterization of Outages and Incidents in Public Services for Large Language Models | Xiaoyu Chu, Sacheendra Talluri, Qingxian Lu, Alexandru Iosup | 2025-01-21 | OpenAlex | https://github.com/atlarge-research/llm-service-analysis | https://doi.org/10.48550/arXiv.2501.12469 |
| 206 | Can open source large language models be used for tumor documentation in Germany? - An evaluation on urological doctors' notes | Stefan Lenz, Arsenij Ustjanzew, Marco Jeray, Meike Ressing, Torsten Panholzer | 2025-01-21 | BioData Mining | https://github.com/stefan-m-lenz/UroLlmEval | https://doi.org/10.1186/s13040-025-00463-8 |
| 207 | Distillation Quantification for Large Language Models | Sunbowen Lee, Junting Zhou, Chang Ao, Kaige Li, Xinrun Du, Sirui He, Jiaheng Liu, Min Yang, Zhoufutu Wen, Shiwen Ni | 2025-01-21 | arXiv | https://github.com/Aegis1863/LLMs-Distillation-Quantification | https://doi.org/10.48550/arXiv.2501.12619 |
| 208 | ESCARGOT: An AI Agent Leveraging Large Language Models, Dynamic Graph of Thoughts, and Biomedical Knowledge Graphs for Enhanced Reasoning | Nicholas Matsumoto, Hyun‐Jun Choi, Jay Moran, Miguel Hernandez, Mythreye Venkatesan, Xi Li, Jui-Hsuan Chang, Paul P. Wan... | 2025-01-20 | Bioinformatics | https://github.com/EpistasisLab/ESCARGOT | https://doi.org/10.1093/bioinformatics/btaf031 |
| 209 | InsQABench: Benchmarking Chinese Insurance Domain Question Answering with Large Language Models | Jing Ding, Feng Kai, Binbin Lin, J. G. Cai, Qiushi Wang, Y. G. Xie, Xiaojin Zhang, Zhongyu Wei, Wei Chen | 2025-01-18 | arXiv | https://github.com/HaileyFamo/InsQABench.git | https://doi.org/10.48550/arXiv.2501.10943 |
| 210 | CXR-LLaVA: a multimodal large language model for interpreting chest X-ray images | Seowoo Lee, Jiwon Youn, Mansu Kim, Soon Ho Yoon | 2025-01-15 | European Radiology | https://github.com/ECOFRI/CXR_LLAVA | https://doi.org/10.1007/s00330-024-11339-6 |
| 211 | PokerBench: Training Large Language Models to become Professional Poker Players | Richard Zhuang, Akshat Gupta, Chunhui Yang, Aniket Rahane, Zhengyu Li, Gopala Anumanchipalli | 2025-01-14 | Proceedings of the AAAI Conference on Artificial Intelligence | https://github.com/pokerllm/pokerbench | https://doi.org/10.1609/aaai.v39i24.34814 |
| 212 | LLM4SR: A Survey on Large Language Models for Scientific Research | Zhongling Luo, Zonglin Yang, Zheng Xu, Wei Yang, Xinya Du | 2025-01-08 | arXiv | https://github.com/du-nlp-lab/LLM4SR | https://doi.org/10.48550/arXiv.2501.04306 |
| 213 | Step-by-Step Mastery: Enhancing Soft Constraint Following Ability of Large Language Models | Qianchen Ren, Jie Zeng, Qianyu He, Jiaqing Liang, Yanghua Xiao, Weikang Zhou, Z. J. Sun, F. Richard Yu | 2025-01-08 | Findings of the Association for Computational Linguistics: ACL 2025 | https://github.com/Rainier-rq/FollowSoftConstraints | https://doi.org/10.18653/v1/2025.findings-acl.1004 |
| 214 | ChronoSense: Exploring Temporal Understanding in Large Language Models with Time Intervals of Events | Duygu Sezen Islakoglu, Jan-Christoph Kalo | 2025-01-06 | OpenAlex | https://github.com/duyguislakoglu/chronosense | https://doi.org/10.18653/v1/2025.acl-short.46 |
| 215 | Visual Large Language Models for Generalized and Specialized Applications | Yifan Li, Zhixin Lai, Wentao Bao, Zhen Tan, Anh Dao, Kewei Sui, Jiayi Shen, Dong Liu, Huan Liu, Yu Kong | 2025-01-06 | arXiv | https://github.com/JackYFL/awesome-VLLMs | https://doi.org/10.48550/arXiv.2501.02765 |
| 216 | MIRAGE: Exploring How Large Language Models Perform in Complex Social Interactive Environments | Cai Yin, Zhouhong Gu, Zhangxin Du, Zheyu Ye, Shaosheng Cao, Yiqian Xu, Hongwei Feng, Ping Chen | 2025-01-03 | OpenAlex | https://github.com/lime728/MIRAGE | https://doi.org/10.18653/v1/2025.acl-short.2 |
| 217 | Cold-Start Recommendation towards the Era of Large Language Models (LLMs): A Comprehensive Survey and Roadmap | Wei Zhang, Yuanchen Bei, Liangwei Yang, Henry Peng Zou, Peilin Zhou, Aiwei Liu, Yinghui Li, Liqiao Chen, Jian-Ling Wang,... | 2025-01-03 | arXiv | https://github.com/YuanchenBei/Awesome-Cold-Start-Recommendation | https://doi.org/10.48550/arXiv.2501.01945 |
| 218 | Aligning Large Language Models for Faithful Integrity Against Opposing Argument | Yong Zhao, Yang Deng, See-Kiong Ng, Tat‐Seng Chua | 2025-01-02 | Proceedings of the AAAI Conference on Artificial Intelligence | https://github.com/zhaoy777/AFICE.git | https://doi.org/10.1609/aaai.v39i26.34990 |
| 219 | Multi-Agent Systems Powered by Large Language Models: Applications in Swarm Intelligence | Cristian Jimenez-Romero, Alper Yegenoglu, Christian Blum | 2025-01-01 | Frontiers in Artificial Intelligence | https://github.com/crjimene/swarm_gpt | https://doi.org/10.48550/arXiv.2503.03800 |
| 220 | Predicting differentially methylated cytosines in TET and DNMT3 knockout mutants via a large language model | Stefano Lonardi | 2025-01-01 | bioRxiv (Cold Spring Harbor Laboratory) | https://github.com/ucrbioinfo/dmc_prediction | https://doi.org/10.1101/2024.05.02.592257 |
| 221 | PointLLM-V2: Empowering Large Language Models to Better Understand Point Clouds | Runsen Xu, Shuai Yang, Xiaolong Wang, Tai Wang, Yilun Chen, Jiangmiao Pang, Dahua Lin | 2025-01-01 | IEEE Transactions on Pattern Analysis and Machine Intelligence | https://github.com/OpenRobotLab/PointLLM | https://doi.org/10.1007/978-3-031-72698-9_8 |
| 222 | Pipeline to explore information on genome editing using large language models and genome editing meta-database | Takayuki Suzuki, Hidemasa Bono | 2025-01-01 | bioRxiv (Cold Spring Harbor Laboratory) | https://github.com/szktkyk/extract_geinfo | https://doi.org/10.1101/2024.10.16.617154 |
| 223 | PIP: Perturbation-based Iterative Pruning for Large Language Models | Yi Cao, Wei-Jie Xu, Yucheng Shen, Weijie Shi, Chi-Min Chan, Jiajie Xu | 2025-01-01 | arXiv | https://github.com/caoyiiiiii/PIP | https://doi.org/10.48550/arXiv.2501.15278 |
| 224 | PAT: Pruning-Aware Tuning for Large Language Models | Yijiang Liu, Huanrui Yang, Youxin Chen, Rongyu Zhang, Miao Wang, Yuan Du, Li Du | 2025-01-01 | Proceedings of the AAAI Conference on Artificial Intelligence | https://github.com/kriskrisliu/PAT_Pruning-Aware-Tuning | https://doi.org/10.1609/aaai.v39i23.34649 |
| 225 | OptiChat: Bridging Optimization Models and Practitioners with Large Language Models | Hao Chen, Gonzalo Esteban Constante-Flores, Krishna Sri Ipsit Mantri, Sai Madhukiran Kompalli, Akshdeep Singh Ahluwalia,... | 2025-01-01 | INFORMS Journal on Data Science | https://github.com/li-group/OptiChat | https://doi.org/10.48550/arXiv.2501.08406 |
| 226 | Neuron based Personality Trait Induction in Large Language Models | Jia Deng, Tianyi Tang, Yanbin Yin, Wenhao Yang, Wayne Xin Zhao, Ji-Rong Wen | 2025-01-01 | ICLR | https://github.com/RUCAIBox/NPTI | https://openreview.net/forum?id=LYHEY783Np |
| 227 | Neuro-Symbolic Artificial Intelligence: Towards Improving the Reasoning Abilities of Large Language Models | Xiao-Wen Yang, Jie-Jing Shao, Lan-Zhe Guo, Bo-Wen Zhang, Zhi Zhou, Lin-Han Jia, Wang-Zhou Dai, Yufeng Li | 2025-01-01 | OpenAlex | https://github.com/LAMDASZ-ML/Awesome-LLM-Reasoning-with-NeSy | https://doi.org/10.48550/arXiv.2508.13678 |
| 228 | Medical Graph RAG: Evidence-based Medical Large Language Model via Graph Retrieval-Augmented Generation | Junde Wu, Jiayuan Zhu, Yunli Qi, Jingkun Chen, Min Xu, Filippo Menolascina, Yueming Jin, Vicente Grau | 2025-01-01 | OpenAlex | https://github.com/MedicineToken/Medical-Graph-RAG | https://doi.org/10.18653/v1/2025.acl-long.1381 |
| 229 | MicroRCA-Agent: Microservice Root Cause Analysis Method Based on Large Language Model Agents | Pan Tang, Shixiang Tang, Huanqi Pu, Zhiqing Miao, Zhixing Wang | 2025-01-01 | arXiv | https://github.com/tangpan360/MicroRCA-Agent | https://doi.org/10.48550/arXiv.2509.15635 |
| 230 | Mat-Instructions: A Large-Scale Inorganic Material Instruction Dataset for Large Language Models | Peng Liu, Shangde Gao, Yongqing Fu, Xiaoliang Wu, Stephen Tong, Ajitha Rajan, Hao Xu | 2025-01-01 | OpenAlex | https://github.com/zjuKeLiu/Mat-Instructions | https://doi.org/10.24963/ijcai.2025/1089 |
| 231 | M4Bench: A Benchmark of Multi-domain Multi-granularity Multi-image Understanding for Multi-modal Large Language Models | Xiaojun Ye, Guanbao Liang, Chun Wang, Liangcheng Li, Pengfei Ke, Rui Wang, Bingxin Jia, Gang Huang, Qiao Sun, Sheng Zhou | 2025-01-01 | OpenAlex | https://github.com/eaglelab-zju/M4Bench | https://doi.org/10.24963/ijcai.2025/762 |
| 232 | Leveraging Large Language Models for Predictive Analysis of Human Misery | Bishanka Seal, Rahul Seetharaman, Aman Bansal, Abhilash Nandy | 2025-01-01 | arXiv | https://github.com/abhi1nandy2/Misery_Data_Exps_GitHub | https://doi.org/10.48550/arXiv.2508.12669 |
| 233 | REEF: Representation Encoding Fingerprints for Large Language Models | Jie Zhang, Dongrui Liu, Chen Qian, Linfeng Zhang, Yong Liu, Yu Qiao, Jing Shao | 2025-01-01 | ICLR | https://github.com/tmylla/REEF | https://openreview.net/forum?id=SnDmPkOJ0T |
| 234 | Large Language Models as Virtual Survey Respondents: Evaluating Sociodemographic Response Generation | Jianpeng Zhao, Chenyu Yuan, Weiming Luo, Haoling Xie, Guangwei Zhang, Steven Jige Quan, Zixuan Yuan, Pengyang Wang, Deng... | 2025-01-01 | arXiv | https://github.com/dart-lab-research/LLM-S-Cube-Benchmark | https://doi.org/10.48550/arXiv.2509.06337 |
| 235 | Labels Generated by Large Language Model Helps Measuring People's Empathy in Vitro | Md. Rakibul Hasan, Yue Yao, Md. Zakir Hossain, Aneesh Krishna, Imre J. Rudas, Shafin Rahman, Tom Gedeon | 2025-01-01 | arXiv | https://github.com/hasan-rakibul/LLMPathy | https://doi.org/10.48550/arXiv.2501.00691 |
| 236 | LMCBert: An Automatic Academic Paper Rating Model Based on Large Language Models and Contrastive Learning | Chuanbin Liu, Xiaowu Zhang, Hongfei Zhao, Zhijie Liu, Xi Xi, Lean Yu | 2025-01-01 | IEEE Transactions on Cybernetics | https://github.com/iioSnail/LMCBert | https://doi.org/10.1109/TCYB.2025.3550203 |
| 237 | QuickLLaMA: Query-aware Inference Acceleration for Large Language Models | Jingyao Li, Han Shi, Sitong Wu, Chuanyang Zheng, Zhenguo Li, Xin Jiang, Hong Xu, Jiaya Jia | 2025-01-01 | COLING | https://github.com/dvlab-research/Q-LLM | https://aclanthology.org/2025.coling-main.34/ |
| 238 | Tree of Agents: Improving Long-Context Capabilities of Large Language Models through Multi-Perspective Reasoning | Song Yu, Xiaofei Xu, Ke Deng, Li Li, Lin Tian | 2025-01-01 | arXiv | https://github.com/Aireduce952/Tree-of-Agents | https://doi.org/10.48550/arXiv.2509.06436 |
| 239 | ReLearn: Unlearning via Learning for Large Language Models | Haoming Xu, Ningyuan Zhao, Liming Yang, Sendong Zhao, Shumin Deng, Mengru Wang, Bryan Hooi, Nay Oo, Huajun Chen, Ningyu ... | 2025-01-01 | OpenAlex | https://github.com/zjunlp/unlearn | https://doi.org/10.18653/v1/2025.acl-long.297 |
| 240 | Reliable Academic Conference Question Answering: A Study Based on Large Language Model | Zhiwei Huang, Long Jin, Junjie Wang, Mingchen Tu, Hua Yin, Zhiqiang Liu, Jiawei Meng, Huajun Chen, Wen Zhang | 2025-01-01 | Communications in computer and information science | https://github.com/zjukg/ConferenceQA | https://doi.org/10.1007/978-981-96-1809-5_14 |
| 241 | WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct | Haipeng Luo, Qingfeng Sun, Can Xu, Pu Zhao, Jian-Guang Lou, Chongyang Tao, Xiubo Geng, Qingwei Lin, Shifeng Chen, Yanson... | 2025-01-01 | ICLR | https://github.com/nlpxucan/WizardLM | https://openreview.net/forum?id=mMPMHWOdOy |
| 242 | VirNucPro: an identifier for the identification of viral short sequences using six-frame translation and large language models | Jing Li, Jia Mi, Wei Lin, Fengjuan Tian, Jing Wan, Jingyang Gao, Yigang Tong | 2025-01-01 | Briefings in Bioinformatics | https://github.com/Li-Jing-1997/VirNucPro | https://doi.org/10.1093/bib/bbaf224 |
| 243 | Veracity-Oriented Context-Aware Large Language Models-Based Prompting Optimization for Fake News Detection | Weiqiang Jin, Yang Gao, Tao Tao, Xiujun Wang, Ningwei Wang, Baohai Wu, Biao Zhao | 2025-01-01 | International Journal of Intelligent Systems | https://github.com/albert-jin/CAPE-FND | https://doi.org/10.1155/int/5920142 |
| 244 | User Behavior Simulation with Large Language Model-based Agents for Recommender Systems | Lei Wang, Jingsen Zhang, Hao Yang, Zhiyuan Chen, Jiakai Tang, Zeyu Zhang, Xu Chen, Yankai Lin, Hao Sun, Ruihua Song, Xin... | 2025-01-01 | ACM Transactions on Information Systems | https://github.com/RUC-GSAI/YuLan-Rec | https://doi.org/10.1145/3708985 |
| 245 | TypedThinker: Diversify Large Language Model Reasoning with Typed Thinking | Danqing Wang, Jianxin Ma, Fei Fang, Lei Li | 2025-01-01 | ICLR | https://github.com/dqwang122/ThinkHub | https://openreview.net/forum?id=VIUisLx8lQ |
| 246 | LLM Ethics Benchmark: A Three-Dimensional Assessment System for Evaluating Moral Reasoning in Large Language Models | Junfeng Jiao, Saleh Afroogh, Arvind R. Murali, Kevin J. Chen, David Atkinson, Amit Dhurandhar | 2025-01-01 | Scientific Reports | https://github.com/The-Responsible-AI-Initiative/LLM_Ethics_Benchmark.git | https://doi.org/10.1038/s41598-025-18489-7 |
| 247 | Towards Prompt Engineering and Large Language Models for Post-OCR correction in handwritten texts | Sávio Santos de Araújo, Byron Leite Dantas Bezerra, Arthur Flor de Sousa Neto | 2025-01-01 | OpenAlex | https://github.com/savi8sant8s/zero-shot-spelling-corrector | https://doi.org/10.5753/stil.2025.37859 |
| 248 | Towards Explainable Fake Image Detection with Multi-Modal Large Language Models | Yikun Ji, Yan Hong, Jiahui Zhan, Haoxing Chen, Jun Lan, Huijia Zhu, Weiqiang Wang, Liqing Zhang, Jianfu Zhang | 2025-01-01 | OpenAlex | https://github.com/Gennadiyev/mllm-defake | https://doi.org/10.48550/arXiv.2504.14245 |
| 249 | Towards Atoms of Large Language Models | Chenhui Hu, Pengfei Cao, Yubo Chen, Kang Liu, Jun Zhao | 2025-01-01 | arXiv | https://github.com/ChenhuiHu/towards_atoms | https://doi.org/10.48550/arXiv.2509.20784 |
| 250 | Toward Reliable Biomedical Hypothesis Generation: Evaluating Truthfulness and Hallucination in Large Language Models | Guangzhi Xiong, Eric Xie, Corey Williams, Myles Kim, Amir Hassan Shariatmadari, Sikun Guo, Stefan Bekiranov, Aidong Zhan... | 2025-01-01 | OpenAlex | https://github.com/Teddy-XiongGZ/TruthHypo | https://doi.org/10.48550/arXiv.2505.14599 |
| 251 | TeleEval-OS: Performance evaluations of large language models for telecommunications operations scheduling | Yanyan Wang, Yingying Wang, Junli Liang, Yin Xu, Yunlong Liu, Xu Yiming, Zhenzhen Jiang, Zhen Li, Fei Li, Long Zhao, Kun... | 2025-01-01 | Intelligent Data Analysis | https://github.com/zjsllab/TeleEval-OS | https://doi.org/10.48550/arXiv.2506.11017 |
| 252 | Taming Unleashed Large Language Models With Blockchain for Massive Personalized Reliable Healthcare | Lianshan Sun, Diandong Liu, Maoxue Wang, Yongyi Han, Yanqing Zhang, Biwei Zhou, Yi Ren, Peng Zhu | 2025-01-01 | IEEE Journal of Biomedical and Health Informatics | https://github.com/LDDLQ/ChatCBD | https://doi.org/10.1109/JBHI.2025.3528526 |
| 253 | Systematic Outliers in Large Language Models | Yongqi An, Xu Zhao, Tao Yu, Ming Tang, Jinqiao Wang | 2025-01-01 | ICLR | https://github.com/an-yongqi/systematic-outliers | https://openreview.net/forum?id=rLX7Vyyzus |
| 254 | Schema Inference for Tabular Data Repositories Using Large Language Models | Zhenyu Wu, Jiaoyan Chen, Norman W. Paton | 2025-01-01 | arXiv | https://github.com/PierreWoL/SILLM | https://doi.org/10.48550/arXiv.2509.04632 |
| 255 | STLSP: Integrating Structure and Text with Large Language Models for Link Sign Prediction of Networks | Lijia Ma, Haoyang Fu, Zhijie Cao, Xiongnan Jin, Qiuzhen Lin, Jianqiang Li | 2025-01-01 | OpenAlex | https://github.com/sss483/STLSP | https://doi.org/10.24963/ijcai.2025/354 |
| 256 | SPRI: Aligning Large Language Models with Context-Situated Principles | Hongli Zhan, Muneeza Azmat, Raya Horesh, Junyi Jessy Li, Mikhail Yurochkin | 2025-01-01 | arXiv | https://github.com/honglizhan/SPRI-public | https://doi.org/10.48550/arXiv.2502.03397 |
| 257 | SCRA-VQA: Summarized Caption-Rerank for Augmented Large Language Models in Visual Question Answering | Zhang Yan, Jiaqing Lin, Miao Zhang, Kui Xiao, Xiaoju Hou, Jing Zhao, Zhifei Li | 2025-01-01 | arXiv | https://github.com/HubuKG/SCRA-VQA | https://doi.org/10.48550/arXiv.2509.20871 |
| 258 | SCAR: Data Selection via Style Consistency-Aware Response Ranking for Efficient Instruction-Tuning of Large Language Models | Zhuang Li, Yuncheng Hua, Thuy-Trang Vu, Haolan Zhan, Lizhen Qu, Gholamreza Haffari | 2025-01-01 | OpenAlex | https://github.com/zhuang-li/SCAR | https://doi.org/10.18653/v1/2025.acl-long.625 |
| 259 | Rethinking Reasoning Quality in Large Language Models through Enhanced Chain-of-Thought via RL | Hongming He, Zihua Rong, Kunpeng Ji, Chenyang Li, Qing Huang, 充正 宮下, Lan Yang, Honggang Zhang | 2025-01-01 | arXiv | https://github.com/Henryhe09/DRER | https://doi.org/10.48550/arXiv.2509.06024 |
| 260 | LLM Reading Tea Leaves: Automatically Evaluating Topic Models with Large Language Models | Xiaohao Yang, He Zhao, Dinh Q. Phung, Wray L. Buntine, Lan Du | 2025-01-01 | Transactions of the Association for Computational Linguistics | https://github.com/Xiaohao-Yang/Topic_Model_Evaluation | https://doi.org/10.48550/arXiv.2406.09008 |
| 261 | Large language models open new way of AI-assisted molecule design for chemists | Shoichi Ishida, Tomohiro Sato, Teruki Honma, Kei Terayama | 2025-01-01 | Journal of Cheminformatics | https://github.com/molecule-generator-collection/ChatChemTS | https://doi.org/10.26434/chemrxiv-2024-1p82f |
| 262 | Knowledge-Driven Feature Selection and Engineering for Genotype Data with Large Language Models | Joseph Lee, Shuhua Yang, Jae Young Baik, Xiaoxi Liu, Zhen Tan, Dawei Li, Zixuan Wen, Bojian Hou, Duy Duong‐Tran, Tianlon... | 2025-01-01 | arXiv (Cornell University) | https://github.com/PennShenLab/FREEFORM | http://arxiv.org/abs/2410.01795 |
| 263 | Benchmarking DNA large language models on quadruplexes | Oleksandr Cherednichenko, Alan Herbert, Maria Poptsova | 2025-01-01 | Computational and Structural Biotechnology Journal | https://github.com/powidla/G4s-FMs | https://doi.org/10.1016/j.csbj.2025.03.007 |
| 264 | Comparative Analysis of Demonstration Selection Algorithms for In-Context Learning in Large Language Models (Student Abstract) | Dong Wook Shu, Mengnan Du | 2025-01-01 | Proceedings of the AAAI Conference on Artificial Intelligence | https://github.com/Tizzzzy/Demonstration_Selection_Overview | https://doi.org/10.1609/aaai.v39i28.35299 |
| 265 | Causal Intervention Is What Large Language Models Need for Spatio-Temporal Forecasting | Shijie Li, He Li, Xiaojing Li, Yong Xu, Zhenhong Lin, Huaiguang Jiang | 2025-01-01 | IEEE Transactions on Cybernetics | https://github.com/lishijie15/STCInterLLM | https://doi.org/10.1109/TCYB.2025.3569333 |
| 266 | CFD-LLMBench: A Benchmark Suite for Evaluating Large Language Models in Computational Fluid Dynamics | Nithin Somasekharan, Ling Yue, Yadi Cao, Weichao Li, Patrick Emami, Pochinapeddi Sai Bhargav, Anurag Acharya, Xingyu Xie... | 2025-01-01 | arXiv | https://github.com/NREL-Theseus/cfdllmbench | https://doi.org/10.48550/arXiv.2509.20374 |
| 267 | CFBenchmark-MM: Chinese Financial Assistant Benchmark for Multimodal Large Language Model | Lei Yang, Jiangtong Li, Ming Jiang, Junjie Hu, Dawei Cheng, Zhijun Ding, Changjun Jiang | 2025-01-01 | arXiv | https://github.com/TongjiFinLab/CFBenchmark | https://doi.org/10.48550/arXiv.2506.13055 |
| 268 | CAPE: Context-Aware Personality Evaluation Framework for Large Language Models | Jivnesh Sandhan, Fei Cheng, Tushar Sandhan, Yugo Murawaki | 2025-01-01 | arXiv | https://github.com/jivnesh/CAPE | https://doi.org/10.48550/arXiv.2508.20385 |
| 269 | CALM: Curiosity-Driven Auditing for Large Language Models | Xiaoyu Zheng, Longxiang Wang, Yi Liu, Xingjun Ma, Chao Shen, Cong Wang | 2025-01-01 | Proceedings of the AAAI Conference on Artificial Intelligence | https://github.com/x-zheng16/CALM.git | https://doi.org/10.1609/aaai.v39i26.34991 |
| 270 | Beyond Interpretability: Exploring the Comprehensibility of Adaptive Video Streaming through Large Language Models | Lianchen Jia, Chaoyang Li, Ziqi Yuan, Jiahui Chen, Tianchi Huang, Jiangchuan Liu, Lifeng Sun | 2025-01-01 | OpenAlex | https://github.com/thu-media/ComTree | https://doi.org/10.48550/arXiv.2508.16448 |
| 271 | Beyond Graphs: Can Large Language Models Comprehend Hypergraphs? | Yifan Feng, Chengwu Yang, Xingliang Hou, Shaoyi Du, Shihui Ying, Zongze Wu, Yue Gao | 2025-01-01 | ICLR | https://github.com/iMoonLab/LLM4Hypergraph | https://openreview.net/forum?id=28qOQwjuma |
| 272 | Beyond Autoregression: An Empirical Study of Diffusion Large Language Models for Code Generation | Chunliang Li, Yitong Zhang, Jia Li, Liyi Cai, Ge Li | 2025-01-01 | arXiv | https://github.com/zhangyitonggg/dllm4code | https://doi.org/10.48550/arXiv.2509.11252 |
| 273 | Automating candidate gene prioritization with large language models: from naive scoring to literature-grounded validation | Taushif Khan, Mohammed Toufiq, Marina Yurieva, Nitaya Indrawattana, Akanitt Jittmittraphap, Nathamon Kosoltanapiwat, Por... | 2025-01-01 | Bioinformatics | https://github.com/taushifkhan/llm-geneprioritization-framework | https://doi.org/10.1093/bioinformatics/btaf541 |
| 274 | ConceptViz: A Visual Analytics Approach for Exploring Concepts in Large Language Models | Haoxuan Li, Zhen Wen, Qiqi Jiang, Chenxiao Li, Yuwei Wu, Yuchen Yang, Yiyao Wang, Xiuqi Huang, Minfeng Zhu, Wei Chen | 2025-01-01 | arXiv | https://github.com/Happy-Hippo209/ConceptViz | https://doi.org/10.48550/arXiv.2509.20376 |
| 275 | An Empirical Analysis of Uncertainty in Large Language Model Evaluations | Qiujie Xie, Qingqiu Li, Zhuohao Yu, Yuejie Zhang, Yue Zhang, Linyi Yang | 2025-01-01 | ICLR | https://github.com/hasakiXie123/LLM-Evaluator-Uncertainty | https://openreview.net/forum?id=J4xLuCt2kg |
| 276 | Aligning, Autoencoding and Prompting Large Language Models for Novel Disease Reporting | Fenglin Liu, Xian Wu, Jinfa Huang, Bang Yang, Kim Branson, Patrick Schwab, Lei Clifton, Ping Zhang, Jiebo Luo, Yefeng Zh... | 2025-01-01 | IEEE Transactions on Pattern Analysis and Machine Intelligence | https://github.com/ai-in-health/PromptLLM | https://doi.org/10.1109/tpami.2025.3534586 |
| 277 | AgentGym: Evaluating and Training Large Language Model-based Agents across Diverse Environments | Zhiheng Xi, Yiwen Ding, Wen-Xiang Chen, Boyang Hong, Honglin Guo, Junzhe Wang, Xin Guo, Dingwen Yang, Chenyang Liao, He ... | 2025-01-01 | OpenAlex | https://github.com/WooooDyy/AgentGym | https://doi.org/10.18653/v1/2025.acl-long.1355 |
| 278 | Acoustic Prompt Tuning: Empowering Large Language Models with Audition Capabilities | Jinhua Liang, Xubo Liu, Wenwu Wang, Mark D. Plumbley, Huy P. Phan, Emmanouil Benetos | 2025-01-01 | IEEE Transactions on Audio Speech and Language Processing | https://github.com/JinhuaLiang/APT | https://doi.org/10.1109/taslpro.2025.3533375 |
| 279 | ARB-LLM: Alternating Refined Binarizations for Large Language Models | Zhiteng Li, Xianglong Yan, Tianao Zhang, Haotong Qin, Dong Xie, Jiang Tian, Zhongchao Shi, Linghe Kong, Yulun Zhang, Xia... | 2025-01-01 | ICLR | https://github.com/ZHITENGLI/ARB-LLM | https://openreview.net/forum?id=ZU8OdDLTts |
| 280 | APEER: Automatic Prompt Engineering Enhances Large Language Model Reranking | Can Jin, Hongwu Peng, Shiyu Zhao, Zhenting Wang, Wujiang Xu, Ligong Han, Jiahui Zhao, Kai Zhong, Sanguthevar Rajasekaran... | 2025-01-01 | arXiv | https://github.com/jincan333/APEER | https://doi.org/10.48550/arXiv.2406.14449 |
| 281 | A Survey of Reasoning and Agentic Systems in Time Series with Large Language Models | Ching Chang, Yidan Shi, Defu Cao, Wei Yang, Jeehyun Hwang, Haixin Wang, Jiacheng Pang, Wei Wang, Yan Liu, Wen-Chih Peng,... | 2025-01-01 | arXiv | https://github.com/blacksnail789521/Time-Series-Reasoning-Survey | https://doi.org/10.48550/arXiv.2509.11575 |
| 282 | JPS: Jailbreak Multimodal Large Language Models with Collaborative Visual Perturbation and Textual Steering | Renmiao Chen, Shiyao Cui, Xuancheng Huang, Chengwei Pan, Victor Shea-Jay Huang, Qinglin Zhang, Xuan Ouyang, Zhexin Zhang... | 2025-01-01 | OpenAlex | https://github.com/thu-coai/JPS | https://doi.org/10.48550/arXiv.2508.05087 |
| 283 | A Closer Look at Machine Unlearning for Large Language Models | Xiaojian Yuan, Tianyu Pang, Chao Du, Kejiang Chen, Weiming Zhang, Min Lin | 2025-01-01 | ICLR | https://github.com/sail-sg/closer-look-LLM-unlearning | https://openreview.net/forum?id=Q1MHvGmhyT |
| 284 | Comparing Bad Apples to Good Oranges: Aligning Large Language Models via Joint Preference Optimization | Hritik Bansal, Ashima Suvarna, Gantavya Bhatt, Nanyun Peng, Kai-Wei Chang, Aditya Grover | 2025-01-01 | Findings of the Association for Computational Linguistics: ACL 2025 | https://github.com/Hritikbansal/dove | https://doi.org/10.18653/v1/2025.findings-acl.39 |
| 285 | Large Language Models Meet Legal Artificial Intelligence: A Survey | Zhitian Hou, Zihan Ye, Nanli Zeng, Tianyong Hao, Kun Zeng | 2025-01-01 | arXiv | https://github.com/ZhitianHou/LLMs4LegalAI | https://doi.org/10.48550/arXiv.2509.09969 |
| 286 | Control Industrial Automation System with Large Language Model Agents | Yuchen Xia, Nasser Jazdi, Jize Zhang, Chaitanya Shah, Michael Weyrich | 2025-01-01 | ETFA | https://github.com/YuchenXia/LLM4IAS | https://doi.org/10.1109/ETFA65518.2025.11205539 |
| 287 | Cumulative Reasoning with Large Language Models | Yifan Zhang, Jingqin Yang, Yang Yuan, Andrew Chi-Chih Yao | 2025-01-01 | Trans. Mach. Learn. Res. | https://github.com/iiis-ai/cumulative-reasoning | https://openreview.net/forum?id=grW15p4eq2 |
| 288 | Hyperbolic Large Language Models | Sarang Patil, Zeyong Zhang, Yiran Huang, Tengfei Ma, Mengjia Xu | 2025-01-01 | arXiv | https://github.com/sarangp2402/Hyperbolic-LLM-Models | https://doi.org/10.48550/arXiv.2509.05757 |
| 289 | How Can Recommender Systems Benefit from Large Language Models: A Survey | Jianghao Lin, Xinyi Dai, Yunjia Xi, Weiwen Liu, Bo Chen, Hao Zhang, Yong Liu, Chuhan Wu, Xiangyang Li, Chenxu Zhu, Huife... | 2025-01-01 | ACM Transactions on Information Systems | https://github.com/CHIANGEL/Awesome-LLM-for-RecSys | https://doi.org/10.48550/arXiv.2306.05817 |
| 290 | Harnessing Multi-modal Large Language Models for Measuring and Interpreting Color Differences | Zhihua Wang, Long Yu, Qiuping Jiang, Chao Huang, Xiaochun Cao | 2025-01-01 | IEEE Transactions on Image Processing | https://github.com/LongYu-LY/CD-Reasoning | https://doi.org/10.1109/tip.2024.3522802 |
| 291 | Guarded Query Routing for Large Language Models | Richard Sléher, William Brach, Tibor Sloboda, Kristián Kostál, Lukas Galke | 2025-01-01 | Frontiers in artificial intelligence and applications | https://github.com/williambrach/gqr | https://doi.org/10.48550/arXiv.2505.14524 |
| 292 | GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest | Shilong Zhang, Peize Sun, Shoufa Chen, Min Xiao, Wenqi Shao, Wenwei Zhang, Yu Liu, Kai Chen, Ping Luo | 2025-01-01 | Lecture notes in computer science | https://github.com/jshilong/GPT4RoI | https://doi.org/10.1007/978-3-031-91813-1_4 |
| 293 | From continuous pre-training to alignment: A comprehensive toolkit for large language models in federated learning | Zhuo Zhang, Yukun Zhang, Guanzhong Chen, Lizhen Qu, Xun Zhou, Hui Wang, Zenglin Xu | 2025-01-01 | Neurocomputing | https://github.com/iezhuozhuo/f4llm | https://doi.org/10.1016/j.neucom.2025.130572 |
| 294 | FreqLLM: Frequency-Aware Large Language Models for Time Series Forecasting | Shunnan Wang, Min Gao, Zongwei Wang, Yibing Bai, Feng Jiang, Guansong Pang | 2025-01-01 | OpenAlex | https://github.com/biya0105/FreqLLM | https://doi.org/10.24963/ijcai.2025/377 |
| 295 | Fin-PRM: A Domain-Specialized Process Reward Model for Financial Reasoning in Large Language Models | Yuanchen Zhou, Shuo Jiang, Jie Zhu, Junhui Li, Lifan Guo, Feng Chen, Chi Zhang | 2025-01-01 | arXiv | https://github.com/aliyun/qwen-dianjin | https://doi.org/10.48550/arXiv.2508.15202 |
| 296 | Exploring homology detection via k-means clustering of proteins embedded with a large language model | Thomas Minotto, Antoine Claessens, Thomas D. Otto | 2025-01-01 | Bioinformatics | https://github.com/ThomasGTHB/OrthoLM | https://doi.org/10.1093/bioinformatics/btaf472 |
| 297 | Exploring Concept Depth: How Large Language Models Acquire Knowledge and Concept at Different Layers? | Mingyu Jin, Qinkai Yu, Jingyuan Huang, Qingcheng Zeng, Zhenting Wang, Wenyue Hua, Haiyan Zhao, Kai Mei, Yanda Meng, Kaiz... | 2025-01-01 | COLING | https://github.com/Luckfort/CD | https://aclanthology.org/2025.coling-main.37/ |
| 298 | Exploring Brazil's LLM Fauna: Investigating the Generative Performance of Large Language Models in Portuguese | Gabriel Assis, Cláudia Freitas, Aline Paes | 2025-01-01 | Journal of the Brazilian Computer Society | https://github.com/MeLLL-UFF/brfauna-gen-eval | https://doi.org/10.5753/jbcs.2025.5814 |
| 299 | Evaluating the Prompt Steerability of Large Language Models | Erik Miehling, Michael Desmond, Karthikeyan Natesan Ramamurthy, Elizabeth Daly, Kush R. Varshney, Eitan Farchi, Pierre D... | 2025-01-01 | OpenAlex | https://github.com/IBM/prompt-steering | https://doi.org/10.18653/v1/2025.naacl-long.400 |
| 300 | Evaluating and Mitigating Linguistic Discrimination in Large Language Models: Perspectives on Safety Equity and Knowledge Equity | Guoliang Dong, Haoyu Wang, Jun Sun, Xinyu Wang | 2025-01-01 | OpenAlex | https://github.com/dgl-prc/ldfighter | https://doi.org/10.24963/ijcai.2025/40 |
| 301 | Enhancing Large Language Models for Hardware Verification: A Novel SystemVerilog Assertion Dataset | Anand Menon, Samit S. Miftah, Shamik Kundu, Souvik Kundu, Amisha Srivastava, Arnab Raha, Gabriel Theodor Sonnenschein, S... | 2025-01-01 | ACM Transactions on Design Automation of Electronic Systems | https://github.com/AnandMenon12/VERT | https://doi.org/10.48550/arXiv.2503.08923 |
| 302 | Enhancing Herbal Medicine-Drug Interaction Prediction Using Large Language Models | Sisi Yuan, Zhecheng Zhou, Xinyuan Jin, Linlin Zhuo, Keqin Li | 2025-01-01 | IEEE Journal of Biomedical and Health Informatics | https://github.com/sisyyuan/HDI | https://doi.org/10.1109/JBHI.2025.3558667 |
| 303 | Dual Adapter Tuning of Vision-Language Models Using Large Language Models | Mohammad Reza Zarei, Abbas Akkasi, Majid Komeili | 2025-01-01 | International Journal of Computational Intelligence Systems | https://github.com/mrzarei5/DATViL | https://doi.org/10.1007/s44196-025-00853-0 |
| 304 | Do Large Language Model Benchmarks Test Reliability? | Joshua Vendrow, Edward Vendrow, Sara Beery, Aleksander Madry | 2025-01-01 | arXiv | https://github.com/MadryLab/platinum-benchmarks | https://doi.org/10.48550/arXiv.2502.03461 |
| 305 | Disentangling Memory and Reasoning Ability in Large Language Models | Mingyu Jin, Weidi Luo, Sitao Cheng, Xinyi Wang, Wenyue Hua, Ruixiang Tang, William Yang Wang, Yongfeng Zhang | 2025-01-01 | OpenAlex | https://github.com/MingyuJ666/Disentangling-Memory-and-Reasoning | https://doi.org/10.18653/v1/2025.acl-long.84 |
| 306 | DesignQA: A Multimodal Benchmark for Evaluating Large Language Models' Understanding of Engineering Documentation | Anna C. Doris, Daniele Grandi, Ryan Tomich, Md Ferdous Alam, Mohammadmehdi Ataei, Hyunmin Cheong, Faez Ahmed | 2025-01-01 | Journal of Computing and Information Science in Engineering | https://github.com/anniedoris/design_qa | https://doi.org/10.48550/arXiv.2404.07917 |
| 307 | Improving Efficiency of Answer Set Planning with Rough Solutions from Large Language Models for Robotic Task Planning | Xinrui Lin, Yangfan Wu, Huanyu Yang, Yuting Huang, Yu Zhang, Jianmin Ji, Yanyong Zhang | 2025-01-01 | OpenAlex | https://github.com/CLMASP/CLMASP | https://doi.org/10.24963/ijcai.2025/509 |
| 308 | MentalQLM: A lightweight large language model for mental healthcare based on instruction tuning and dual LoRA modules | Jiayu Shi, Zexiao Wang, Jiandong Zhou, Chengyu Liu, Poly Z. H. Sun, Erying Zhao, Lei Lü | 2024-12-30 | IEEE Journal of Biomedical and Health Informatics | https://github.com/tortorish/MentalQLM | https://doi.org/10.1101/2024.12.29.24319755 |
| 309 | MLLM-SUL: Multimodal Large Language Model for Semantic Scene Understanding and Localization in Traffic Scenarios | Jiaqi Fan, Jianhua Wu, Jincheng Gao, Jianhao Yu, Yafei Wang, Hongqing Chu, Bingzhao Gao | 2024-12-26 | arXiv (Cornell University) | https://github.com/fjq-tongji/MLLM-SUL | http://arxiv.org/abs/2412.19406 |
| 310 | Survey and Improvement Strategies for Gene Prioritization with Large Language Models | Matthew B. Neeley, Guantong Qi, Guanchao Wang, Ruixiang Tang, Dongxue Mao, Chaozhong Liu, Sasidhar Pasupuleti, Bo Yuan, ... | 2024-12-26 | Bioinformatics Advances | https://github.com/LiuzLab/GPT-Diagnosis | https://doi.org/10.48550/arXiv.2501.18794 |
| 311 | Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment | Ziang Yan, Zhilin Li, Yinan He, Chenting Wang, Kunchang Li, Xinhao Li, Xiangyu Zeng, Zhong Lin Wang, Yali Wang, Yu Qiao,... | 2024-12-26 | 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) | https://github.com/OpenGVLab/TPO | https://openaccess.thecvf.com/content/CVPR2025/html/Yan_Task_Preference_Optimization_Improving_Multimodal_Large_Language_Models_with_Vision_CVPR_2025_paper.html |
| 312 | An Engorgio Prompt Makes Large Language Model Babble on | Jianshuo Dong, Ziyuan Zhang, Qingjie Zhang, Hanxun Qiu, Tianwei Zhang, Hao Wang, Hewu Li, Qi Li, Chao Zhang, Ke Xu | 2024-12-26 | ICLR | https://github.com/jianshuod/Engorgio-prompt | https://openreview.net/forum?id=m4eXBo0VNc |
| 313 | A Survey on Large Language Model Acceleration based on KV Cache Management | Haoyang Li, Yiming Li, Anxin Tian, Tianhao Tang, Zhanchao Xu, Xuejia Chen, Nicole Hu, Wei Dong, Qing Li, Lei Chen | 2024-12-26 | Trans. Mach. Learn. Res. | https://github.com/TreeAI-Lab/Awesome-KV-Cache-Management | https://openreview.net/forum?id=z3JZzu9EA3 |
| 314 | 3DGraphLLM: Combining Semantic Graphs and Large Language Models for 3D Scene Understanding | Tatiana Zemskova, Dmitry Yudin | 2024-12-24 | arXiv (Cornell University) | https://github.com/CognitiveAISystems/3DGraphLLM | http://arxiv.org/abs/2412.18450 |
| 315 | ICM-Assistant: Instruction-tuning Multimodal Large Language Models for Rule-based Explainable Image Content Moderation | Mengyang Wu, Yuzhi Zhao, Jialun Cao, Mingjie Xu, Zhongming Jiang, Xuehui Wang, Qinbin Li, Guangneng Hu, Shengchao Qin, C... | 2024-12-24 | Proceedings of the AAAI Conference on Artificial Intelligence | https://github.com/zhaoyuzhi/ICM-Assistant | https://doi.org/10.1609/aaai.v39i8.32908 |
| 316 | Investigating Large Language Models for Code Vulnerability Detection: An Experimental Study | Xuefeng Jiang, L. H. Wu, Sheng Sun, Jia Li, Jingjing Xue, Yuwei Wang, Tingting Wu, Min Liu | 2024-12-24 | arXiv (Cornell University) | https://github.com/SakiRinn/LLM4CVD | http://arxiv.org/abs/2412.18260 |
| 317 | Large Language Model Safety: A Holistic Survey | Dan Shi, Tianhao Shen, Yufei Huang, Zhigen Li, Yongqi Leng, Renren Jin, Chuang Liu, Xinwei Wu, Zishan Guo, Linhao Yu, Li... | 2024-12-23 | arXiv (Cornell University) | https://github.com/tjunlp-lab/Awesome-LLM-Safety-Papers | http://arxiv.org/abs/2412.17686 |
| 318 | Property Enhanced Instruction Tuning for Multi-task Molecule Generation with Large Language Models | Xuan Lin, Long Chen, Yile Wang, Xiangxiang Zeng, Philip S. Yu | 2024-12-23 | arXiv (Cornell University) | https://github.com/chenlong164/PEIT | http://arxiv.org/abs/2412.18084 |
| 319 | Large Language Model Can Be a Foundation for Hidden Rationale-Based Retrieval | Luo Ji, Fulai Guo, Teng Chen, Qing Gu, Xiaoyu Wang, Ningyuan Xi, Yihong Wang, Peng Yu, Yue Zhao, Hongyang Lei, Zhonglin ... | 2024-12-21 | Lecture notes in computer science | https://github.com/flyfree5/LaHoRe | https://doi.org/10.1007/978-3-031-88714-7_27 |
| 320 | Can LLMs Obfuscate Code? A Systematic Analysis of Large Language Models into Assembly Code Obfuscation | Seyedreza Mohseni, Seyedali Mohammadi, Deepa Tilwani, Yash Saxena, Gerald Ketu Ndawula, Sriram Vema, Edward Raff, Manas ... | 2024-12-20 | Proceedings of the AAAI Conference on Artificial Intelligence | https://github.com/mohammadi-ali/MetamorphASM | https://doi.org/10.1609/aaai.v39i23.34672 |
| 321 | PruneVid: Visual Token Pruning for Efficient Video Large Language Models | Xiaohu Huang, Hao Zhou, K. L. Han | 2024-12-20 | Findings of the Association for Computational Linguistics: ACL 2025 | https://github.com/Visual-AI/PruneVid | https://doi.org/10.18653/v1/2025.findings-acl.1024 |
| 322 | Sliding Windows Are Not the End: Exploring Full Ranking with Long-Context Large Language Models | Wenhan Liu, Xinyu Ma, Yutao Zhu, Ziliang Zhao, Shuaiqiang Wang, Dawei Yin, Zhicheng Dou | 2024-12-19 | OpenAlex | https://github.com/8421BCD/fullrank | https://doi.org/10.18653/v1/2025.acl-long.8 |
| 323 | Unveiling Uncertainty: A Deep Dive into Calibration and Performance of Multimodal Large Language Models | Zijun Chen, Wenbo Hu, Guande He, Zhijie Deng, Zheng Zhang, Richang Hong | 2024-12-19 | COLING | https://github.com/hfutml/Calibration-MLLM | https://aclanthology.org/2025.coling-main.208/ |
| 324 | Mitigating Social Bias in Large Language Models: A Multi-Objective Approach within a Multi-Agent Framework | Zhenjie Xu, Wenqing Chen, Yi Tang, Xuanying Li, Cheng Hu, Zhixuan Chu, Kui Ren, Zibin Zheng, Zhichao Lu | 2024-12-19 | Proceedings of the AAAI Conference on Artificial Intelligence | https://github.com/Cortantse/MOMA | https://doi.org/10.1609/aaai.v39i24.34748 |
| 325 | InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models | Cong Wei, Yujie Zhong, Haoxian Tan, Yingsen Zeng, Yong Liu, Zheng Zhao, Yujiu Yang | 2024-12-18 | arXiv (Cornell University) | https://github.com/congvvc/InstructSeg | http://arxiv.org/abs/2412.14006 |
| 326 | ORBIT: Cost-Effective Dataset Curation for Large Language Model Domain Adaptation with an Astronomy Case Study | Eric Modesitt, Ke Yang, Spencer Hulsey, Xin Liu, ChengXiang Zhai, Volodymyr V. Kindratenko | 2024-12-18 | Findings of the Association for Computational Linguistics: ACL 2025 | https://github.com/ModeEric/ORBIT-Llama | https://doi.org/10.18653/v1/2025.findings-acl.51 |
| 327 | ResQ: Mixed-Precision Quantization of Large Language Models with Low-Rank Residuals | Utkarsh Saxena, Sayeh Sharify, Kaushik Roy, Xin Wang | 2024-12-18 | arXiv (Cornell University) | https://github.com/utkarsh-dmx/project-resq | http://arxiv.org/abs/2412.14363 |
| 328 | DateLogicQA: Benchmarking Temporal Biases in Large Language Models | Gagan Bhatia, MingZe Tang, Cristina Mahanta, Madiha Kazi | 2024-12-17 | OpenAlex | https://github.com/gagan3012/EAIS-Temporal-Bias | https://doi.org/10.18653/v1/2025.naacl-srw.32 |
| 329 | RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation | Xiaoxi Li, Jiajie Jin, Yujia Zhou, Yongkang Wu, Zhonghua Li, Qi Ye, Zhicheng Dou | 2024-12-16 | OpenAlex | https://github.com/sunnynexus/RetroLLM | https://doi.org/10.18653/v1/2025.acl-long.819 |
| 330 | SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models | Jiale Cheng, Xiao Liu, Cunxiang Wang, Xiaotao Gu, Yida Lu, Dan Zhang, Yuxiao Dong, Jie Tang, Hongning Wang, Minlie Huang | 2024-12-16 | ICLR | https://github.com/thu-coai/SPaR | https://openreview.net/forum?id=9chRqsPOGL |
| 331 | NLSR: Neuron-Level Safety Realignment of Large Language Models Against Harmful Fine-Tuning | Xin Yi, Shunfan Zheng, Linlin Wang, Gerard de Melo, Xiaoling Wang, Liang He | 2024-12-16 | Proceedings of the AAAI Conference on Artificial Intelligence | https://github.com/xinykou/NLSR | https://doi.org/10.1609/aaai.v39i24.34762 |
| 332 | Can Large Language Models Understand You Better? An MBTI Personality Detection Dataset Aligned with Population Traits | Bohan Li, J Guan, Longxu Dou, Yun‐Long Feng, Dingzirui Wang, Yang Xu, Enbo Wang, Qiguang Chen, Bichen Wang, Xiao Xu, Yim... | 2024-12-16 | COLING | https://github.com/Personality-NLP/MbtiBench | https://aclanthology.org/2025.coling-main.339/ |
| 333 | Assessing the Limitations of Large Language Models in Clinical Fact Decomposition | Monica Munnangi, Akshay Swaminathan, Jason Fries, Jenelle Jindal, Sanjana Narayanan, Iván López, Lucia Tu, Philip Chung,... | 2024-12-16 | arXiv (Cornell University) | https://github.com/som-shahlab/factehr | http://arxiv.org/abs/2412.12422 |
| 334 | BlenderLLM: Training Large Language Models for Computer-Aided Design with Self-improvement | Yinwei Du, Shunian Chen, Wenbo Zan, Peizhao Li, Mingxuan Wang, Dongwoon Song, Bo Li, Yan Hu, Benyou Wang | 2024-12-16 | arXiv (Cornell University) | https://github.com/FreedomIntelligence/BlenderLLM | http://arxiv.org/abs/2412.14203 |
| 335 | A survey on LoRA of large language models | Yuren Mao, Yuhang Ge, Yijiang Fan, Wenyi Xu, Yu Mi, Zhonghao Hu, Yunjun Gao | 2024-12-14 | Frontiers of Computer Science | https://github.com/ZJU-LLMs/Awesome-LoRAs.git | https://doi.org/10.1007/s11704-024-40663-9 |
| 336 | B-VLLM: A Vision Large Language Model with Balanced Spatio-Temporal Tokens | Zhuqiang Lu, Zhenfei Yin, Meilin He, Zhihui Wang, Zicheng Liu, Zhiyong Wang, Kun Hu | 2024-12-13 | arXiv (Cornell University) | https://github.com/zhuqiangLu/B-VLLM | http://arxiv.org/abs/2412.09919 |
| 337 | Towards a Multimodal Large Language Model with Pixel-Level Insight for Biomedicine | Xiaoshuang Huang, Lingdong Shen, Jia Liu, Fangxin Shang, Hongxiang Li, Haifeng Huang, Yehui Yang | 2024-12-12 | Proceedings of the AAAI Conference on Artificial Intelligence | https://github.com/ShawnHuang497/MedPLIB | https://doi.org/10.1609/aaai.v39i4.32394 |
| 338 | Filter-then-Generate: Large Language Models with Structure-Text Adapter for Knowledge Graph Completion | Ben Liu, Jihai Zhang, Fangquan Lin, Cheng Yang, Min Peng | 2024-12-12 | COLING | https://github.com/LB0828/FtG | https://aclanthology.org/2025.coling-main.740/ |
| 339 | Simulate Scientific Reasoning with Multiple Large Language Models: An Application to Alzheimer’s Disease Combinatorial Therapy | Qidi Xu, Xiaozhong Liu, Xiaoqian Jiang, Yejin Kim | 2024-12-12 | medRxiv (Cold Spring Harbor Laboratory) | https://github.com/QidiXu96/Coated-LLM | https://doi.org/10.1101/2024.12.10.24318800 |
| 340 | Enhancing Multimodal Large Language Models Complex Reason via Similarity Computation | Xiaofeng Zhang, Fanshuo Zeng, Yihao Quan, Zheng Hui, Jiawei Yao | 2024-12-12 | Proceedings of the AAAI Conference on Artificial Intelligence | https://github.com/FanshuoZeng/Simignore | https://doi.org/10.1609/aaai.v39i10.33107 |
| 341 | Exploiting the Index Gradients for Optimization-Based Jailbreaking on Large Language Models | Jiahui Li, Yongchang Hao, Hanqiu Xu, Xing Wang, Yu Hong | 2024-12-11 | COLING | https://github.com/jiah-li/magic | https://aclanthology.org/2025.coling-main.305/ |
| 342 | Concept Bottleneck Large Language Models | Chung-En Sun, Tuomas P. Oikarinen, Berk Ustun, Tsui-Wei Weng | 2024-12-10 | ICLR | https://github.com/Trustworthy-ML-Lab/CB-LLMs | https://openreview.net/forum?id=RC5FPYVQaH |
| 343 | IntellectSeeker: A Personalized Literature Management System with the Probabilistic Model and Large Language Model | Weizhen Bian, Siyan Liu, Yubo Zhou, Dezhi Chen, Yijie Liao, Zhenzhen Fan, Aobo Wang | 2024-12-10 | Lecture notes in computer science | https://github.com/LuckyBian/ISY5001 | https://doi.org/10.1007/978-981-97-5489-2_24 |
| 344 | PediaBench: A Comprehensive Chinese Pediatric Dataset for Benchmarking Large Language Models | Qian Zhang, Panfeng Chen, Shuo Feng, Shuyu Liu, Jiali Li, Heng Zhao, Mei Chen, Hui Li, Yanhao Wang | 2024-12-09 | Frontiers of Computer Science | https://github.com/ACMISLab/PediaBench | https://doi.org/10.1007/s11704-025-41345-w |
| 345 | KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models | Fan Wang, Jing-Jiang Jiang, C.Y. Park, Sunghun Kim, Jing Tang | 2024-12-08 | ICLR | https://github.com/juyongjiang/KaSA | https://openreview.net/forum?id=OQqNieeivq |
| 346 | Mitigating Entity-Level Hallucination in Large Language Models | Weihang Su, Yichen Tang, Qingyao Ai, Changyue Wang, Zhijing Wu, Yiqun Liu | 2024-12-08 | OpenAlex | https://github.com/oneal2000/EntityHallucination | https://doi.org/10.48550/arXiv.2407.09417 |
| 347 | Fine-Grained Behavior Simulation with Role-Playing Large Language Model on Social Media | Kun Li, C. H. Dai, Zhou We, Songlin Hu | 2024-12-04 | arXiv (Cornell University) | https://github.com/linkseed18612254945/FineRob | http://arxiv.org/abs/2412.03148 |
| 348 | From Individual to Society: A Survey on Social Simulation Driven by Large Language Model-based Agents | Xinyi Mou, Xuanwen Ding, Qi He, Liang Wang, Jingcong Liang, Xinnong Zhang, Libo Sun, Jiayu Lin, Jie Zhou, Xuanjing Huang... | 2024-12-04 | arXiv (Cornell University) | https://github.com/FudanDISC/SocialAgent | http://arxiv.org/abs/2412.03563 |
| 349 | Improving Linguistic Diversity of Large Language Models with Possibility Exploration Fine-Tuning | Long Mai, Julie Carson-Berndsen | 2024-12-04 | arXiv (Cornell University) | https://github.com/mailong25/peft_diversity | http://arxiv.org/abs/2412.03343 |
| 350 | NJGPT: A Large Language Model-Driven, User-Friendly Solution for Phylogenetic Tree Construction | Zhixuan Wang, Haoyuan Huang, Teng Li, Allen G. Rodrigo | 2024-12-04 | bioRxiv (Cold Spring Harbor Laboratory) | https://github.com/ZWan622/NJGPT1.0.git | https://doi.org/10.1101/2024.12.02.626464 |
| 351 | Improving Automated Deep Phenotyping Through Large Language Models Using Retrieval Augmented Generation | Brandon T. Garcia, Lauren Westerfield, Priya Yelemali, Nikhita Gogate, Edgar A. Rivera‐Muñoz, Haowei Du, Moez Dawood, An... | 2024-12-02 | medRxiv (Cold Spring Harbor Laboratory) | https://github.com/PoseyPod/RAG-HPO | https://doi.org/10.1101/2024.12.01.24318253 |
| 352 | Beyond Labels: Aligning Large Language Models with Human-Like Reasoning | Muhammad Rafsan Kabir, Rafeed Mohammad Sultan, Ihsanul Haque Asif, Jason M. E. Ahad, Fuad Rahman, Mohammad Ruhul Amin, N... | 2024-12-02 | Lecture notes in computer science | https://github.com/apurba-nsu-rnd-lab/DFAR | https://doi.org/10.1007/978-3-031-78172-8_16 |
| 353 | Dynamic-LLaVA: Efficient Multimodal Large Language Models via Dynamic Vision-language Context Sparsification | Wenxuan Huang, Zijie Zhai, Yunhang Shen, Shaosheng Cao, Fei Zhao, Xiangfeng Xu, Zheyu Ye, Shaohui Lin | 2024-12-01 | arXiv | https://github.com/Osilly/dynamic_llava | https://openreview.net/forum?id=hzVpZDrW73 |
| 354 | Woodpecker: hallucination correction for multimodal large language models | Shukang Yin, Chaoyou Fu, Sirui Zhao, Tong Bill Xu, Hao Wang, Dianbo Sui, Yunhang Shen, Ke Li, Xing Sun, Enhong Chen | 2024-12-01 | Science China Information Sciences | https://github.com/BradyFU/Woodpecker | https://doi.org/10.1007/s11432-024-4251-x |
| 355 | Accelerating Multimodal Large Language Models via Dynamic Visual-Token Exit and the Empirical Findings | Qiong Wu, Wei Lin, Weihao Ye, Yiyi Zhou, Xiaoshuai Sun, Rongrong Ji | 2024-11-29 | arXiv (Cornell University) | https://github.com/DoubtedSteam/DyVTE | http://arxiv.org/abs/2411.19628 |
| 356 | CovidLLM: A Robust Large Language Model with Missing Value Adaptation and Multi-Objective Learning Strategy for Predicting Disease Severity and Clinical Outcomes in COVID-19 Pa.. | Shengjun Zhu, Siyu Liu, Yang Li, Qing Lei, Hongyan Hou, He‐wei Jiang, Shujuan Guo, Feng Wang, Rongshang Chen, Xionglin F... | 2024-11-28 | Current Proteomics | https://github.com/sysll/CovidLLM | https://doi.org/10.2174/0115701646366019250304064012 |
| 357 | Leveraging Large Language Models and Topic Modeling for Toxicity Classification | Haniyeh Ehsani Oskouie, Christina Chance, Claire Huang, Margaret Capetz, Elizabeth Eyeson, Majid Sarrafzadeh | 2024-11-26 | 2025 International Conference on Computing, Networking and Communications (ICNC) | https://github.com/aheldis/Toxicity-Classification.git | https://doi.org/10.1109/ICNC64010.2025.10994061 |
| 358 | AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shield Prompting | Yu Wang, Xiaogeng Liu, Yu Li, Muhao Chen, Chaowei Xiao | 2024-11-26 | Lecture notes in computer science | https://github.com/rain305f/AdaShield | https://doi.org/10.1007/978-3-031-72661-3_5 |
| 359 | VaxLLM: Leveraging Fine-tuned Large Language Model for automated annotation of Brucella Vaccines | Xingxian Li, Yuping Zheng, Jie Hu, Jie Zheng, Zhigang Wang, Yongqun He | 2024-11-26 | bioRxiv (Cold Spring Harbor Laboratory) | https://github.com/xingxianli/VaxLLM | https://doi.org/10.1101/2024.11.25.625209 |
| 360 | CS-Eval: A Comprehensive Large Language Model Benchmark for CyberSecurity | Zhengmin Yu, Jiutian Zeng, Siyi Chen, Wenhan Xu, Dandan Xu, Xiangyu Liu, Zonghao Ying, Nan Wang, Yuan Zhang, Min Yang | 2024-11-25 | arXiv (Cornell University) | https://github.com/CS-EVAL/CS-Eval | http://arxiv.org/abs/2411.16239 |
| 361 | Large Language Model with Region-guided Referring and Grounding for CT Report Generation | Zhixuan Chen, Yequan Bie, Huanying Jin, Hao Chen | 2024-11-23 | IEEE Transactions on Medical Imaging | https://github.com/zhi-xuan-chen/Reg2RG | https://doi.org/10.1109/TMI.2025.3559923 |
| 362 | "Moralized" Multi-Step Jailbreak Prompts: Black-Box Testing of Guardrails in Large Language Models for Verbal Attacks | Libo Wang | 2024-11-23 | arXiv (Cornell University) | https://github.com/brucewang123456789/GeniusTrail.git | http://arxiv.org/abs/2411.16730 |
| 363 | DRPruning: Efficient Large Language Model Pruning through Distributionally Robust Optimization | Hexuan Deng, Wenxiang Jiao, Xuebo Liu, Jing Li, Min Zhang, Zhaopeng Tu | 2024-11-21 | OpenAlex | https://github.com/hexuandeng/DRPruning | https://doi.org/10.18653/v1/2025.acl-long.1414 |
| 364 | SemiKong: Curating, Training, and Evaluating A Semiconductor Industry-Specific Large Language Model | Christopher Nguyen, William Nguyen, Atsushi Suzuki, Daisuke Oku, Hong An Phan, Dinh Viet Sang, Zooey Nguyen, Ha Cam Anh,... | 2024-11-20 | arXiv (Cornell University) | https://github.com/aitomatic/semikong | http://arxiv.org/abs/2411.13802 |
| 365 | On the Consistency of Video Large Language Models in Temporal Comprehension | Minjoon Jung, Junbin Xiao, Byoung‐Tak Zhang, Angela Yao | 2024-11-19 | 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) | https://github.com/minjoong507/Consistency-of-Video-LLM | https://openaccess.thecvf.com/content/CVPR2025/html/Jung_On_the_Consistency_of_Video_Large_Language_Models_in_Temporal_CVPR_2025_paper.html |
| 366 | FLAME: Frozen Large Language Models Enable Data-Efficient Language-Image Pre-training | Anjia Cao, Xing Wei, Zhiheng Ma | 2024-11-18 | 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) | https://github.com/MIV-XJTU/FLAME | https://openaccess.thecvf.com/content/CVPR2025/html/Cao_FLAME_Frozen_Large_Language_Models_Enable_Data-Efficient_Language-Image_Pre-training_CVPR_2025_paper.html |
| 367 | Multilingual Large Language Models: A Systematic Survey | Shaolin Zhu, Supryadi, Shaoyang Xu, Haoran Sun, Leiyu Pan, Menglong Cui, Jiangcun Du, Renren Jin, António Branco, Deyi X... | 2024-11-17 | arXiv (Cornell University) | https://github.com/tjunlp-lab/Awesome-Multilingual-LLMs-Papers | http://arxiv.org/abs/2411.11072 |
| 368 | BianCang: A Traditional Chinese Medicine Large Language Model | Sibo Wei, Xueping Peng, Yifei Wang, Jiasheng Si, Weiyu Zhang, Wenpeng Lü, Xiaoming Wu, Yinglong Wang | 2024-11-17 | arXiv (Cornell University) | https://github.com/QLU-NLP/BianCang | http://arxiv.org/abs/2411.11027 |
| 369 | TS-LLaVA: Constructing Visual Tokens through Thumbnail-and-Sampling for Training-Free Video Large Language Models | Tingyu Qu, Mingxiao Li, Tinne Tuytelaars, Marie‐Francine Moens | 2024-11-17 | arXiv (Cornell University) | https://github.com/tingyu215/TS-LLaVA | http://arxiv.org/abs/2411.11066 |
| 370 | Evaluating Creativity and Deception in Large Language Models: A Simulation Framework for Multi-Agent Balderdash | Parsa Hejabi, Elnaz Rahmati, Alireza S. Ziabari, Preni Golazizian, Jesse Thomason, Morteza Dehghani | 2024-11-15 | arXiv (Cornell University) | https://github.com/ParsaHejabi/Simulation-Framework-for-Multi-Agent-Balderdash | http://arxiv.org/abs/2411.10422 |
| 371 | Orca: Enhancing Role-Playing Abilities of Large Language Models by Integrating Personality Traits | Yu-Chih Huang | 2024-11-15 | arXiv (Cornell University) | https://github.com/Aipura/Orca | http://arxiv.org/abs/2411.10006 |
| 372 | LHRS-Bot-Nova: Improved Multimodal Large Language Model for Remote Sensing Vision-Language Interpretation | Zhenshi Li, Dilxat Muhtar, Feng Gu, Yibei He, Xueliang Zhang, Pengfeng Xiao, Guangjun He, Xiao Xiang Zhu | 2024-11-14 | ISPRS Journal of Photogrammetry and Remote Sensing | https://github.com/NJU-LHRS/LHRS-Bot | https://doi.org/10.1016/j.isprsjprs.2025.06.003 |
| 373 | DROJ: A Prompt-Driven Attack against Large Language Models | Longfei Hu, B. Wang | 2024-11-13 | arXiv (Cornell University) | https://github.com/Leon-Leyang/LLM-Safeguard | http://arxiv.org/abs/2411.09125 |
| 374 | Verbosity | Yusen Zhang, Sarkar Snigdha Sarathi Das, Rui Zhang | 2024-11-12 | arXiv (Cornell University) | https://github.com/psunlpgroup/VerbosityLLM | http://arxiv.org/abs/2411.07858 |
| 375 | Large Language Models for Constructing and Optimizing Machine Learning Workflows: A Survey | Yang Gu, Haihang You, Jian Cao, Muran Yu, Haoran Fan, Shiyou Qian | 2024-11-11 | ACM Transactions on Software Engineering and Methodology | https://github.com/t-harden/LLM4AutoML | http://arxiv.org/abs/2411.10478 |
| 376 | Benchmarking Large Language Models for NL-to-SQL: A Comprehensive Evaluation of Accuracy, Cost and Throughput | Adithya Narasimhan, Amit Kumar Bhamboo, Abiram Devnathan, Jeyaraj Vellaisamy, Vineeth Vijayaraghavan | 2024-11-10 | OpenAlex | https://github.com/petavue/NL2SQL-Benchmark | https://doi.org/10.36227/techrxiv.173121325.56335825/v1 |
| 377 | Golden Touchstone: A Comprehensive Bilingual Benchmark for Evaluating Financial Large Language Models | Xiaojun Wu, Junxi Liu, Hong Su, Zhouchi Lin, Yiyan Qi, Chengjin Xu, Jiajun Su, Jiajie Zhong, Fuwei Wang, Saizhuo Wang, F... | 2024-11-09 | arXiv (Cornell University) | https://github.com/IDEA-FinAI/Golden-Touchstone | http://arxiv.org/abs/2411.06272 |
| 378 | TourSynbio-Search: A Large Language Model Driven Agent Framework for Unified Search Method for Protein Engineering | Yungeng Liu, Zan Chen, Yu Guang Wang, Yiqing Shen | 2024-11-08 | 2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) | https://github.com/tsynbio/Toursynbio-Search | https://doi.org/10.1109/bibm62325.2024.10822318 |
| 379 | AutoProteinEngine: A Large Language Model Driven Agent Framework for Multimodal AutoML in Protein Engineering | Yungeng Liu, Zan Chen, Yuguang Wang, Yiqing Shen | 2024-11-07 | COLING | https://github.com/tsynbio/AutoPE | https://aclanthology.org/2025.coling-industry.36/ |
| 380 | Measuring short-form factuality in large language models | Jason Lee, Nguyen Karina, Hyung Won Chung, Yunxin Joy Jiao, Spencer Papay, Amelia Glaese, John Schulman, William Fedus | 2024-11-06 | arXiv (Cornell University) | https://github.com/openai/simple-evals | http://arxiv.org/abs/2411.04368 |
| 381 | QUILL: Quotation Generation Enhancement of Large Language Models | Jin Xiao, Bowei Zhang, Qianyu He, Jiaqing Liang, Wei Feng, Jinglei Chen, Zujie Liang, Deqing Yang, Yanghua Xiao | 2024-11-06 | arXiv (Cornell University) | https://github.com/GraceXiaoo/QUILL | http://arxiv.org/abs/2411.03675 |
| 382 | SMoA: Improving Multi-agent Large Language Models with Sparse Mixture-of-Agents | Dawei Li, Zhen Tan, Peijia Qian, Yifan Li, Kumar Satvik Chaudhary, Lijie Hu, Jiayi Shen | 2024-11-05 | Lecture notes in computer science | https://github.com/David-Li0406/SMoA | https://doi.org/10.1007/978-981-96-8180-8_5 |
| 383 | FlexCAD: Unified and Versatile Controllable CAD Generation with Fine-tuned Large Language Models | Zhanwei Zhang, Shizhao Sun, Wenxiao Wang, Deng Cai, Jiang Bian | 2024-11-05 | ICLR | https://github.com/microsoft/CADGeneration | https://openreview.net/forum?id=Z0eiiV3Yyh |
| 384 | Evaluating Large Language Models: A Comprehensive Survey | Zishan Guo, Renren Jin, Chuang Liu, Yufei Huang, Dan Shi, Supryadi, Linhao Yu, Yan Liu, Jiaxuan Li, Bojian Xiong, Deyi X... | 2024-11-05 | International Journal of Latest Engineering and Management Research (IJLEMR) | https://github.com/tjunlp-lab/Awesome-LLMs-Evaluation-Papers | https://doi.org/10.56581/ijlemr.9.10.05-16 |
| 385 | Leveraging Large Language Models in Code Question Answering: Baselines and Issues | Georgy Andryushchenko, Vladimir Ivanov, Vladimir Makharev, Elizaveta Tukhtina, Aidar Valeev | 2024-11-05 | Communications in computer and information science | https://github.com/IU-AES-AI4Code/CodeQuestionAnswering | https://doi.org/10.1007/978-3-031-97019-1_1 |
| 386 | DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution | Yang Yue, Yulin Wang, Bingyi Kang, Yizeng Han, Shenzhi Wang, Shiji Song, Jiashi Feng, Gao Huang | 2024-11-04 | NeurIPS | https://github.com/yueyang130/DeeR-VLA | http://papers.nips.cc/paper_files/paper/2024/hash/67b0e7c7c2a5780aeefe3b79caac106e-Abstract-Conference.html |
| 387 | SQL Injection Jailbreak: a structural disaster of large language models | Jiawei Zhao, Kejiang Chen, Weiming Zhang, Nenghai Yu | 2024-11-03 | Findings of the Association for Computational Linguistics: ACL 2025 | https://github.com/weiyezhimeng/SQL-Injection-Jailbreak | https://doi.org/10.18653/v1/2025.findings-acl.358 |
| 388 | Fish-Speech: Leveraging Large Language Models for Advanced Multilingual Text-to-Speech Synthesis | Shijia Liao, Yuxuan Wang, Tianyu Li, Yifan Cheng, Ruoyi Zhang, Ran Zhou, Yun-Xuan Xing | 2024-11-02 | arXiv (Cornell University) | https://github.com/fishaudio/fish-speech | http://arxiv.org/abs/2411.01156 |
| 389 | Rate, Explain and Cite (REC): Enhanced Explanation and Attribution in Automatic Evaluation by Large Language Models | Aliyah R. Hsu, James Zhu, Zhichao Wang, Bin Bi, Shubham Mehrotra, Shiva Pentyala, Katherine Tan, Xiang-Bo Mao, Roshanak ... | 2024-11-02 | arXiv (Cornell University) | https://github.com/adelaidehsu/REC | http://arxiv.org/abs/2411.02448 |
| 390 | MoD: A Distribution-Based Approach for Merging Large Language Models | Quy-Anh Dang, Chong‐Wah Ngo | 2024-11-01 | arXiv (Cornell University) | https://github.com/knovel-eng/mod | http://arxiv.org/abs/2411.00406 |
| 391 | BitStack: Fine-Grained Size Control for Compressed Large Language Models in Variable Memory Environments | Xinghao Wang, Pengyu Wang, Bo Wang, Dong Zhang, Yunhua Zhou, Xipeng Qiu | 2024-10-31 | arXiv (Cornell University) | https://github.com/xinghaow99/BitStack | http://arxiv.org/abs/2410.23918 |
| 392 | End-to-End Ontology Learning with Large Language Models | Andrea Lo, Albert Q. Jiang, Wenda Li, Mateja Jamnik | 2024-10-30 | NeurIPS | https://github.com/andylolu2/ollm | http://papers.nips.cc/paper_files/paper/2024/hash/9e89f068a62f6858c661a8abecf5bb0a-Abstract-Conference.html |
| 393 | LLMCBench: Benchmarking Large Language Model Compression for Efficient Deployment | Ge Yang, Changyi He, Jinyang Guo, Jianyu Wu, Yifu Ding, Aishan Liu, Haotong Qin, Pengliang Ji, Xianglong Liu | 2024-10-28 | NeurIPS | https://github.com/AboveParadise/LLMCBench | http://papers.nips.cc/paper_files/paper/2024/hash/9f4cc62d0632911c63163ea3d9ec19bd-Abstract-Datasets_and_Benchmarks_Track.html |
| 394 | NewTerm: Benchmarking Real-Time New Terms for Large Language Models with Annual Updates | Hexuan Deng, Wenxiang Jiao, Xuebo Liu, Zhang Min, Zhaopeng Tu | 2024-10-28 | NeurIPS | https://github.com/hexuandeng/NewTerm | http://papers.nips.cc/paper_files/paper/2024/hash/3eec719ab86712d32b065c5977f94ad0-Abstract-Datasets_and_Benchmarks_Track.html |
| 395 | GCoder: Improving Large Language Model for Generalized Graph Problem Solving | Qifan Zhang, Xiaobin Hong, Jianheng Tang, Nuo Chen, Yuhan Li, Wenzhong Li, Jing Tang, Jia Li | 2024-10-24 | OpenAlex | https://github.com/Bklight999/WWW25-GCoder | https://doi.org/10.1145/3746252.3761066 |
| 396 | Cross-model Control: Improving Multiple Large Language Models in One-time Training | Jiayi Wu, Hao Sun, Hengyi Cai, Lixin Su, Shuaiqiang Wang, Dawei Yin, Xiang Li, Ming Gao | 2024-10-23 | NeurIPS | https://github.com/wujwyi/CMC | http://papers.nips.cc/paper_files/paper/2024/hash/9856b5d30ac61ab744fdab6f67d874e4-Abstract-Conference.html |
| 397 | GraphTeam: Facilitating Large Language Model-based Graph Analysis via Multi-Agent Collaboration | Xin Li, Qi Chu, Yubin Chen, Yang Liu, Yaoqi Liu, Zhi Yu, Weize Chen, Chen Qian, Chuan Shi, Cheng Yang | 2024-10-23 | arXiv (Cornell University) | https://github.com/BUPT-GAMMA/GraphTeam | http://arxiv.org/abs/2410.18032 |
| 398 | DEAN: Deactivating the Coupled Neurons to Mitigate Fairness-Privacy Conflicts in Large Language Models | Qian Chen, Dongrui Liu, Jie Zhang, Yong Liu, Jing Shao | 2024-10-22 | arXiv (Cornell University) | https://github.com/ChnQ/DEAN | http://arxiv.org/abs/2410.16672 |
| 399 | ETHIC: Evaluating Large Language Models on Long-Context Tasks with High Information Coverage | Taewhoo Lee, Chanwoong Yoon, Kyochul Jang, Donghyeon Lee, Myung-Ha Song, Hyunjae Kim, Jaewoo Kang | 2024-10-22 | OpenAlex | https://github.com/dmis-lab/ETHIC | https://doi.org/10.18653/v1/2025.naacl-long.283 |
| 400 | Improving Causal Reasoning in Large Language Models: A Survey | Lu Yu, Delin Chen, Siheng Xiong, Qingyang Wu, Qingzhen Liu, Dawei Li, Zhikai Chen, Xiaoze Liu, Liangming Pan | 2024-10-22 | arXiv (Cornell University) | https://github.com/chendl02/Awesome-LLM-causal-reasoning | http://arxiv.org/abs/2410.16676 |
| 401 | LLaVA-KD: A Framework of Distilling Multimodal Large Language Models | Yuxuan Cai, Jiangning Zhang, Haoyang He, Xinwei He, Ao Tong, Zhenye Gan, Chengjie Wang, Xiang Bai | 2024-10-21 | arXiv (Cornell University) | https://github.com/Fantasyele/LLaVA-KD | http://arxiv.org/abs/2410.16236 |
| 402 | Comprehensive benchmarking of large language models for RNA secondary structure prediction | Luciano I Zablocki, Leandro A. Bugnon, M. Gérard, Leandro E. Di Persia, Georgina Stegmayer, Diego H. Milone | 2024-10-21 | Briefings in Bioinformatics | https://github.com/sinc-lab/rna-llm-folding | https://doi.org/10.1093/bib/bbaf137 |