This repository automatically fetches and displays relevant papers from arXiv based on configured criteria.
You can fork this repository to deploy your own instance (see the setup steps below).
- Last Updated: 2025-09-16 12:51:36 UTC
- Total Papers Found: 30
- Categories Monitored: cs.AI, cs.CL, cs.DC, cs.LG
Authors: Mohamed Wahib, Muhammed Abdullah Soyturk, Didem Unat
Category: cs.AI
Published: 2025-09-16
Score: 13.0
arXiv:2505.14864v2 Announce Type: replace-cross Abstract: To reduce the computational and memory overhead of Large Language Models, various approaches have been proposed. These include a) Mixture of Experts (MoEs), where token routing affects compute balance; b) gradual pruning of model parameters;...
Authors: Haiduo Huang, Fuwei Yang, Zhenhua Liu, Xuanwu Yin, Dong Li, Pengju Ren, Emad Barsoum
Category: cs.AI
Published: 2025-09-16
Score: 10.5
arXiv:2509.11815v1 Announce Type: cross Abstract: Speculative decoding is a powerful way to accelerate autoregressive large language models (LLMs), but directly porting it to vision-language models (VLMs) faces unique systems constraints: the prefill stage is dominated by visual tokens whose count ...
Authors: Swapnil Gandhi, Christos Kozyrakis
Category: cs.DC
Published: 2025-09-16
Score: 9.5
arXiv:2412.15411v3 Announce Type: replace Abstract: As large language models scale, training them requires thousands of GPUs over extended durations--making frequent failures an inevitable reality. While checkpointing remains the primary fault-tolerance mechanism, existing methods fall short when a...
Authors: Ruiqi Wang, Jing Ren, Tongyu Song, Wenjun Li, Xiong Wang, Sheng Wang, Shizhong Xu
Category: cs.AI
Published: 2025-09-16
Score: 9.0
arXiv:2509.10493v1 Announce Type: cross Abstract: The deployment of large-scale LoRaWAN networks requires jointly optimizing conflicting metrics like Packet Delivery Ratio (PDR) and Energy Efficiency (EE) by dynamically allocating transmission parameters, including Carrier Frequency, Spreading Fact...
Authors: Pedro Savarese
Category: cs.AI
Published: 2025-09-16
Score: 9.0
arXiv:2509.00174v2 Announce Type: replace-cross Abstract: Recent progress in deep learning has been driven by increasingly larger models. However, their computational and energy demands have grown proportionally, creating significant barriers to their deployment and to a wider adoption of deep lear...
Authors: Chien-Yu Lin, Keisuke Kamahori, Yiyu Liu, Xiaoxiang Shi, Madhav Kashyap, Yile Gu, Rulin Shao, Zihao Ye, Kan Zhu, Stephanie Wang, Arvind Krishnamurthy, Rohan Kadekodi, Luis Ceze, Baris Kasikci
Category: cs.DC
Published: 2025-09-16
Score: 9.0
arXiv:2502.20969v2 Announce Type: replace Abstract: Retrieval-augmented generation (RAG) extends large language models (LLMs) with external data sources to enhance factual correctness and domain coverage. Modern RAG pipelines rely on large datastores, leading to system challenges in latency-sensiti...
Authors: Yuhang Zhou, Zhibin Wang, Peng Jiang, Haoran Xia, Junhe Lu, Qianyu Jiang, Rong Gu, Hengxi Xu, Xinjing Huang, Guanghuan Fang, Zhiheng Hu, Jingyi Zhang, Yongjin Cai, Jian He, Chen Tian
Category: cs.DC
Published: 2025-09-16
Score: 9.0
arXiv:2508.21613v2 Announce Type: replace Abstract: Training large language models faces frequent interruptions due to various faults, demanding robust fault-tolerance. Existing backup-free methods, such as redundant computation, dynamic parallelism, and data rerouting, each incur performance penal...
Authors: Weihao Zhu, Long Shi, Kang Wei, Zhen Mei, Zhe Wang, Jiaheng Wang, Jun Li
Category: cs.DC
Published: 2025-09-16
Score: 8.5
arXiv:2509.12141v1 Announce Type: new Abstract: As an enabling architecture of Large Models (LMs), Mixture of Experts (MoE) has become prevalent thanks to its sparsely-gated mechanism, which lowers computational overhead while maintaining learning performance comparable to dense LMs. The essence of...
9. FineServe: Precision-Aware KV Slab and Two-Level Scheduling for Heterogeneous Precision LLM Serving
Authors: Kyungmin Bin, Seungbeom Choi, Jimyoung Son, Jieun Choi, Daseul Bae, Daehyeon Baek, Kihyo Moon, Minsung Jang, Hyojung Lee
Category: cs.DC
Published: 2025-09-16
Score: 8.5
arXiv:2509.06261v2 Announce Type: replace Abstract: Recent advances in Post-Training Quantization (PTQ) techniques have significantly increased demand for serving quantized large language models (LLMs), enabling higher throughput and substantially reduced memory usage with minimal accuracy loss. Qu...
Authors: Santhosh G S, Saurav Prakash, Balaraman Ravindran
Category: cs.AI
Published: 2025-09-16
Score: 8.0
arXiv:2509.11155v1 Announce Type: cross Abstract: The quadratic complexity of the attention mechanism remains a fundamental barrier to scaling Large Language Models (LLMs) to longer contexts, creating a critical bottleneck in both computation and memory. To address this, we introduce AQUA (Attentio...
Authors: David Schiff, Ofir Lindenbaum, Yonathan Efroni
Category: cs.AI
Published: 2025-09-16
Score: 8.0
arXiv:2509.11259v1 Announce Type: cross Abstract: Gradient based optimization is fundamental to most modern deep reinforcement learning algorithms, however, it introduces significant sensitivity to hyperparameters, unstable training dynamics, and high computational costs. We propose TabPFN RL, a no...
12. EfficientUICoder: Efficient MLLM-based UI Code Generation via Input and Output Token Compression
Authors: Jingyu Xiao, Zhongyi Zhang, Yuxuan Wan, Yintong Huo, Yang Liu, Michael R. Lyu
Category: cs.AI
Published: 2025-09-16
Score: 8.0
arXiv:2509.12159v1 Announce Type: cross Abstract: Multimodal Large Language Models have demonstrated exceptional performance in UI2Code tasks, significantly enhancing website development efficiency. However, these tasks incur substantially higher computational overhead than traditional code generat...
Authors: Xinzhe Zheng, Zhen-Qun Yang, Haoran Xie, S. Joe Qin, Arlene Chen, Fangzhen Lin
Category: cs.AI
Published: 2025-09-16
Score: 8.0
arXiv:2509.03054v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) have demonstrated remarkable performance across a wide range of Natural Language Processing (NLP) tasks, but require substantial memory and computational resources. Binary quantization, which compresses model wei...
Authors: Mingxiao Huo, Jiayi Zhang, Hewei Wang, Jinfeng Xu, Zheyu Chen, Huilin Tai, Yijun Chen
Category: cs.CL
Published: 2025-09-16
Score: 8.0
arXiv:2509.11961v1 Announce Type: new Abstract: Vision-Language Models (VLMs) enable powerful multimodal reasoning but suffer from slow autoregressive inference, limiting their deployment in real-time applications. We introduce Spec-LLaVA, a system that applies speculative decoding to accelerate VL...
Authors: Rahma Nouaji, Stella Bitchebe, Ricardo Macedo, Oana Balmau
Category: cs.DC
Published: 2025-09-16
Score: 8.0
arXiv:2509.10712v1 Announce Type: new Abstract: Data loaders are used by Machine Learning (ML) frameworks like PyTorch and TensorFlow to apply transformations to data before feeding it into the accelerator. This operation is called data preprocessing. Data preprocessing plays an important role in t...
16. LogGuardQ: A Cognitive-Enhanced Reinforcement Learning Framework for Cybersecurity Anomaly Detection in Security Logs
Authors: Umberto Gonçalves de Sousa
Category: cs.AI
Published: 2025-09-16
Score: 7.5
arXiv:2509.10511v1 Announce Type: cross Abstract: Reinforcement learning (RL) has transformed sequential decision-making, but traditional algorithms like Deep Q-Networks (DQNs) and Proximal Policy Optimization (PPO) often struggle with efficient exploration, stability, and adaptability in dynamic e...
17. Application of Machine Learning for Correcting Defect-induced Neuromorphic Circuit Inference Errors
Authors: Vedant Sawal, Hiu Yung Wong
Category: cs.AI
Published: 2025-09-16
Score: 7.5
arXiv:2509.11113v1 Announce Type: cross Abstract: This paper presents a machine learning-based approach to correct inference errors caused by stuck-at faults in fully analog ReRAM-based neuromorphic circuits. Using a Design-Technology Co-Optimization (DTCO) simulation framework, we model and analyz...
Authors: Alejandro Dopico-Castro, Oscar Fontenla-Romero, Bertha Guijarro-Berdiñas, Amparo Alonso-Betanzos
Category: cs.AI
Published: 2025-09-16
Score: 7.5
arXiv:2509.11285v1 Announce Type: cross Abstract: Incremental learning remains a critical challenge in machine learning, as models often struggle with catastrophic forgetting -- the tendency to lose previously acquired knowledge when learning new information. These challenges are even more pronounced...
Authors: Sangjun Lee, Seung-taek Woo, Jungyu Jin, Changhun Lee, Eunhyeok Park
Category: cs.AI
Published: 2025-09-16
Score: 7.5
arXiv:2509.12019v1 Announce Type: cross Abstract: To enable broader deployment of Large Language Models (LLMs), it is essential to identify the best-performing model under strict memory constraints. We present AMQ, Automated Mixed-Precision Weight-Only Quantization, a framework that assigns layer-w...
20. Hide-and-Shill: A Reinforcement Learning Framework for Market Manipulation Detection in Symphony -- a Decentralized Multi-Agent System
Authors: Ronghua Shi, Yiou Liu, Xinyu Ying, Yang Tan, Yuchun Feng, Lynn Ai, Bill Shi, Xuhui Wang, Zhuang Liu
Category: cs.AI
Published: 2025-09-16
Score: 7.5
arXiv:2507.09179v2 Announce Type: replace Abstract: Decentralized finance (DeFi) has introduced a new era of permissionless financial innovation but also led to unprecedented market manipulation. Without centralized oversight, malicious actors coordinate shilling campaigns and pump-and-dump schemes...
21. D$^2$HScore: Reasoning-Aware Hallucination Detection via Semantic Breadth and Depth Analysis in LLMs
Authors: Yue Ding, Xiaofang Zhu, Tianze Xia, Junfei Wu, Xinlong Chen, Qiang Liu, Liang Wang
Category: cs.CL
Published: 2025-09-16
Score: 7.5
arXiv:2509.11569v1 Announce Type: new Abstract: Although Large Language Models (LLMs) have achieved remarkable success, their practical application is often hindered by the generation of non-factual content, which is called "hallucination". Ensuring the reliability of LLMs' outputs is a critical ch...
Authors: Minxuan Lv, Zhenpeng Su, Leiyu Pan, Yizhe Xiong, Zijia Lin, Hui Chen, Wei Zhou, Jungong Han, Guiguang Ding, Cheng Luo, Di Zhang, Kun Gai, Songlin Hu
Category: cs.CL
Published: 2025-09-16
Score: 7.5
arXiv:2502.12455v3 Announce Type: replace Abstract: As large language models continue to scale, computational costs and resource consumption have emerged as significant challenges. While existing sparsification methods like pruning reduce computational overhead, they risk losing model knowledge thr...
Authors: M. Z. Haider, M. Dias de Assuncao, Kaiwen Zhang
Category: cs.DC
Published: 2025-09-16
Score: 7.5
arXiv:2509.11006v1 Announce Type: cross Abstract: Blockchain technology offers decentralization and security but struggles with scalability, particularly in enterprise settings where efficiency and controlled access are paramount. Sharding is a promising solution for private blockchains, yet existi...
Authors: Han Liang, Jiahui Zhou, Zicheng Zhou, Xiaoxi Zhang, Xu Chen
Category: cs.DC
Published: 2025-09-16
Score: 7.5
arXiv:2509.04719v2 Announce Type: replace Abstract: The escalating adoption of diffusion models for applications such as image generation demands efficient parallel inference techniques to manage their substantial computational cost. However, existing diffusion parallelism inference schemes often u...
25. Holographic Knowledge Manifolds: A Novel Pipeline for Continual Learning Without Catastrophic Forgetting in Large Language Models
Authors: Justin Arndt
Category: cs.LG
Published: 2025-09-16
Score: 7.5
arXiv:2509.10518v1 Announce Type: new Abstract: We introduce the Holographic Knowledge Manifold (HKM), a four-phase pipeline that achieves zero catastrophic forgetting in AI knowledge representation while maintaining minimal memory growth and high efficiency. Leveraging fractal quantization, probab...
Authors: Sangjoon Park, Yeonjong Shin, Jinhyun Choo
Category: cs.LG
Published: 2025-09-16
Score: 7.5
arXiv:2509.11966v1 Announce Type: new Abstract: Poroelasticity -- coupled fluid flow and elastic deformation in porous media -- often involves spatially variable permeability, especially in subsurface systems. In such cases, simulations with random permeability fields are widely used for probabilis...
Authors: Zhoujun Cheng, Richard Fan, Shibo Hao, Taylor W. Killian, Haonan Li, Suqi Sun, Hector Ren, Alexander Moreno, Daqian Zhang, Tianjun Zhong, Yuxin Xiong, Yuanzhe Hu, Yutao Xie, Xudong Han, Yuqi Wang, Varad Pimpalkhute, Yonghao Zhuang, Aaryamonvikram Singh, Xuezhi Liang, Anze Xie, Jianshu She, Desai Fan, Chengqian Gao, Liqun Ma, Mikhail Yurochkin, John Maggs, Xuezhe Ma, Guowei He, Zhiting Hu, Zhengzhong Liu, Eric P. Xing
Category: cs.LG
Published: 2025-09-16
Score: 7.5
arXiv:2509.07604v3 Announce Type: replace Abstract: K2-Think is a reasoning system that achieves state-of-the-art performance with a 32B parameter model, matching or surpassing much larger models like GPT-OSS 120B and DeepSeek v3.1. Built on the Qwen2.5 base model, our system shows that smaller mod...
28. Think Small, Plan Smart: Minimalist Symbolic Abstraction and Heuristic Subspace Search for LLM-Guided Task Planning
Authors: Junfeng Tang, Yuping Yan, Zihan Ye, Zhenshou Song, Zeqi Zheng, Yaochu Jin
Category: cs.LG
Published: 2025-09-16
Score: 7.5
arXiv:2501.15214v2 Announce Type: replace-cross Abstract: Reliable task planning is pivotal for achieving long-horizon autonomy in real-world robotic systems. Large language models (LLMs) offer a promising interface for translating complex and ambiguous natural language instructions into actionable...
29. SABR: A Stable Adaptive Bitrate Framework Using Behavior Cloning Pretraining and Reinforcement Learning Fine-Tuning
Authors: Pengcheng Luo, Yunyang Zhao, Bowen Zhang, Genke Yang, Boon-Hee Soong, Chau Yuen
Category: cs.AI
Published: 2025-09-16
Score: 7.0
arXiv:2509.10486v1 Announce Type: cross Abstract: With the advent of 5G, the internet has entered a new video-centric era. From short-video platforms like TikTok to long-video platforms like Bilibili, online video services are reshaping user consumption habits. Adaptive Bitrate (ABR) control is wid...
Authors: Yuwen Cao, Guijun Liu, Tomoaki Ohtsuki, Howard H. Yang, Tony Q. S. Quek
Category: cs.AI
Published: 2025-09-16
Score: 7.0
arXiv:2509.10490v1 Announce Type: cross Abstract: The deep autoencoder (DAE) framework has turned out to be efficient in reducing the channel state information (CSI) feedback overhead in massive multiple-input multiple-output (mMIMO) systems. However, these DAE approaches presented in prior works re...
This bot is configured to look for papers containing the following keywords:
- LLM, RL, RLHF, Inference, Training, Attention, Pipeline, MOE, Sparse, Quantization, Speculative, Efficient, Efficiency, Framework, Parallel, Distributed, Kernel, Decode, Decoding, Prefill, Throughput, Fast, Network, Hardware, Cluster, FP8, FP4, Optimization, Scalable, Communication
The bot runs daily at 12:00 UTC via GitHub Actions to fetch the latest papers.
- Fork this repository to your GitHub account
- Customize the configuration by editing `config.json` (see the example below):
  - Add/remove arXiv categories (e.g., `cs.AI`, `cs.LG`, `cs.CL`)
  - Modify keywords to match your research interests
  - Adjust the `max_papers` and `days_back` settings
- Enable GitHub Actions in your repository settings
- The bot will automatically run daily and update the README.md
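For reference, a minimal `config.json` might look like the sketch below. Only `max_papers` and `days_back` are named in the steps above; the `categories` and `keywords` key names are assumptions, so check the actual `config.json` in the repository for the exact schema.

```json
{
  "categories": ["cs.AI", "cs.CL", "cs.DC", "cs.LG"],
  "keywords": ["LLM", "Inference", "Quantization", "Speculative"],
  "max_papers": 30,
  "days_back": 1
}
```

Assuming `days_back` controls how far back the arXiv query reaches, a value of 1 matches the daily 12:00 UTC schedule, so each run only picks up papers announced since the previous run.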
Common categories include:
- `cs.AI` - Artificial Intelligence
- `cs.LG` - Machine Learning
- `cs.CL` - Computation and Language
- `cs.CV` - Computer Vision
- `cs.NE` - Neural and Evolutionary Computing
- `stat.ML` - Machine Learning (Statistics)
Add keywords that match your research interests. The bot will search for these terms in paper titles and abstracts.
Add terms to exclude certain types of papers (e.g., "survey", "review", "tutorial").
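As a sketch, exclusions would likely sit alongside the keyword list in `config.json`; the `excluded_keywords` key name below is hypothetical, so verify it against the repository's configuration file:

```json
{
  "keywords": ["LLM", "RLHF", "Speculative", "Quantization"],
  "excluded_keywords": ["survey", "review", "tutorial"]
}
```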
You can manually trigger the bot by:
- Going to the "Actions" tab in your repository
- Selecting "arXiv Bot Daily Update"
- Clicking "Run workflow"
Generated automatically by arXiv Bot