Inject Once Survive Later: Backdooring Vision-Language-Action Models to Persist Through Downstream Fine-tuning

1Harbin Institute of Technology, Shenzhen    2Harbin Institute of Technology
3Meituan Academy of Robotics    4Shanghai Jiaotong University
5National University of Singapore    6Central South University

Code (coming soon) · arXiv: https://arxiv.org/abs/2602.00500
[Figure: INFUSE method overview]

INFUSE is the first backdoor attack framework that targets fine-tune-insensitive modules in VLA base models, ensuring persistent malicious behavior even after extensive user-side fine-tuning on clean data.

Abstract

Vision-Language-Action (VLA) models have become foundational to modern embodied AI systems. By integrating visual perception, language understanding, and action planning, they enable general-purpose task execution across diverse environments. Despite their importance, the security of VLA models remains underexplored, particularly against backdoor attacks, which pose realistic threats in physical-world deployments. Recent methods attempt to inject backdoors into VLA models, but these backdoors are easily erased during downstream adaptation: user-side fine-tuning on clean data substantially alters model parameters, rendering such attacks impractical for real-world use. To address this challenge, we propose INFUSE (INjection into Fine-tUne-inSensitive modulEs), the first backdoor attack framework for VLA base models that remains effective under arbitrary user fine-tuning. INFUSE first analyzes parameter sensitivity across diverse fine-tuning scenarios to identify modules that remain stable (fine-tune-insensitive) and are therefore suitable for persistent backdoor injection. It then injects backdoors into these stable modules while freezing the rest, so the malicious behavior survives extensive user fine-tuning. Comprehensive experiments across multiple VLA architectures demonstrate INFUSE's effectiveness: after user-side fine-tuning, it maintains mean attack success rates of 91.0% in simulation environments and 79.8% on real-world robot tasks, substantially surpassing BadVLA (38.8% and 36.6%, respectively), while preserving clean-task performance comparable to standard models.

Key Results

After clean user-side fine-tuning, INFUSE retains average attack success rates of 95.3% on LIBERO, 91.7% on SimplerEnv, and 79.8% on real-world robot tasks, versus 31.7%, 39.4%, and 36.6% for BadVLA, while keeping clean-task performance at 95.0%, close to the 96.4% of standard models.

Method Overview

INFUSE consists of three key stages:

  1. Fine-tune-Insensitive Module Identification: We analyze parameter changes after fine-tuning the base VLA model on multiple clean environments to identify modules that remain stable (fine-tune-insensitive) and suitable for persistent backdoor injection.
  2. Selective Backdoor Injection: We construct a poisoned dataset with realistic object-based triggers (e.g., a blue mug) and malicious target actions, then selectively fine-tune only the fine-tune-insensitive modules while freezing the sensitive ones, producing a poisoned base VLA model (a minimal sketch follows this list).
  3. User-side Fine-tuning: We simulate realistic user adaptation by fine-tuning the poisoned base model with clean datasets from different environments, demonstrating that the injected backdoor remains effective even after user-side customization.
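
The selective-injection step can be sketched in a few lines of PyTorch. This is a minimal illustration, not the paper's implementation: the module-name prefixes, the batch format, and the assumption that the model's forward pass returns a .loss attribute are hypothetical placeholders.

import torch

# Illustrative prefixes; the real list comes from the stage-1 sensitivity analysis.
INSENSITIVE_PREFIXES = ("vision_backbone.", "vision_projector.", "llm.")

def freeze_sensitive_modules(model: torch.nn.Module) -> None:
    # Enable gradients only for parameters inside fine-tune-insensitive modules.
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith(INSENSITIVE_PREFIXES)

def inject_backdoor(model, poisoned_loader, epochs=1, lr=1e-5):
    # Fine-tune the unfrozen modules so trigger inputs map to the attacker's
    # target actions while clean inputs keep their original behavior.
    freeze_sensitive_modules(model)
    optimizer = torch.optim.AdamW(
        (p for p in model.parameters() if p.requires_grad), lr=lr
    )
    model.train()
    for _ in range(epochs):
        for batch in poisoned_loader:  # mix of clean and triggered samples
            loss = model(**batch).loss  # assumes a HF-style forward returning a loss
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

Because the sensitive modules never receive gradients during injection, later user fine-tuning that concentrates its updates on those same modules leaves the poisoned weights largely untouched.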

Our key insight is that certain modules (the vision backbone, vision projector, and LLM backbone) undergo parameter updates 100-1000x smaller during fine-tuning than those of sensitive modules (the action head and proprio projector), making them ideal targets for persistent backdoor injection.
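
This insight can be checked with a short drift analysis. The sketch below assumes two PyTorch state_dict checkpoints saved before and after clean fine-tuning; the file paths and top-level module names are hypothetical.

from collections import defaultdict

import torch

# Load the base and clean-fine-tuned checkpoints (hypothetical paths).
base = torch.load("base_vla.pt", map_location="cpu")
tuned = torch.load("finetuned_vla.pt", map_location="cpu")

num = defaultdict(float)  # sum of squared parameter changes per module
den = defaultdict(float)  # sum of squared base parameter values per module

for name, p_base in base.items():
    module = name.split(".")[0]  # e.g. "vision_backbone", "action_head"
    delta = tuned[name].float() - p_base.float()
    num[module] += delta.pow(2).sum().item()
    den[module] += p_base.float().pow(2).sum().item()

# Relative drift ||theta_tuned - theta_base|| / ||theta_base|| per module;
# the smallest-drift modules are the fine-tune-insensitive candidates.
drift = {m: (num[m] / max(den[m], 1e-12)) ** 0.5 for m in num}
for module, d in sorted(drift.items(), key=lambda kv: kv[1]):
    print(f"{module:20s} relative drift = {d:.2e}")

Averaged over several clean fine-tuning environments, scores like these separate the stable modules from the volatile ones, consistent with the 100-1000x gap noted above.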

Real-world Robot Experiments

[Figure: real-world robot trajectory]

INFUSE demonstrates strong effectiveness on real-world robot tasks. After user-side fine-tuning on clean data, our method achieves a 79.8% attack success rate on physical robot manipulation tasks, substantially outperforming BadVLA (36.6%). The backdoor persists across different real-world environments and task variations.

Key Contributions

  • First persistent backdoor attack on base VLA models: Unlike prior methods that inject backdoors during downstream adaptation, our attack is conducted at the pre-distribution stage, enabling a persistent threat even when the attacker has no access to user data.
  • Novel selective injection framework: We leverage parameter stability analysis to identify fine-tune-insensitive modules and inject backdoors exclusively into these components, ensuring the backdoor survives user fine-tuning on clean data.
  • Comprehensive evaluation: INFUSE achieves average attack success rates (ASRs) of 95.3% on LIBERO, 91.7% on SimplerEnv, and 79.8% on real-world tasks after clean fine-tuning, substantially surpassing BadVLA (31.7%, 39.4%, and 36.6%), while maintaining clean-task performance (95.0%) comparable to standard models (96.4%).

BibTeX


@misc{zhou2026injectsurvivelaterbackdooring,
  title={Inject Once Survive Later: Backdooring Vision-Language-Action Models to Persist Through Downstream Fine-tuning},
  author={Jianyi Zhou and Yujie Wei and Ruichen Zhen and Bo Zhao and Xiaobo Xia and Rui Shao and Xiu Su and Shuo Yang},
  year={2026},
  eprint={2602.00500},
  archivePrefix={arXiv},
  primaryClass={cs.RO},
  url={https://arxiv.org/abs/2602.00500},
}