CtrlRAG: Black-box Adversarial Attacks Based on Masked Language Models in Retrieval-Augmented Language Generation

Document ID: 9803925057007599265
Author: Sui R
Publication year: 2025
Publication venue: arXiv preprint arXiv:2503.06950

Snippet

Retrieval-Augmented Generation (RAG) systems enhance Large Language Models (LLMs) by integrating external knowledge bases. However, this integration introduces a new security threat: adversaries can exploit the retrieval mechanism to inject malicious content …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30861—Retrieval from the Internet, e.g. browsers
- G06F17/30864—Retrieval from the Internet, e.g. browsers by querying, e.g. search engines or meta-search engines, crawling techniques, push systems
- G06F17/30867—Retrieval from the Internet, e.g. browsers by querying, e.g. search engines or meta-search engines, crawling techniques, push systems with filtering and personalisation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30943—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
- G06F17/30946—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type indexing structures
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2221/00—Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Systems or methods specially adapted for a specific business sector, e.g. utilities or tourism
- G06Q50/01—Social networking

Similar Documents

Publication	Publication Date	Title
Arditi et al.	2024	Refusal in language models is mediated by a single direction
Omar et al.	2022	Robust natural language processing: Recent advances, challenges, and future directions
Kaiser et al.	2021	Adapting security warnings to counter online disinformation
Hosseini et al.	2017	Deceiving Google's Perspective API built for detecting toxic comments
US10742605B2 (en)	2020-08-11	Context-based firewall for learning artificial intelligence entities
Hui Kyong Chun	2011	Crisis, crisis, crisis, or sovereignty and networks
US20200004882A1 (en)	2020-01-02	Misinformation detection in online content
Dugan et al.	2024	RAID: A shared benchmark for robust evaluation of machine-generated text detectors
Kucharavy et al.	2023	Fundamentals of generative large language models and perspectives in cyber-defense
Kwon et al.	2021	Textual backdoor attack for the text classification system
Hadi et al.	2023	Introduction to ChatGPT: A new revolution of artificial intelligence with machine learning algorithms and cybersecurity
Huang et al.	2025	Authorship attribution in the era of LLMs: Problems, methodologies, and challenges
Ahmed et al.	2024	ChatGPT versus Bard: A comparative study
Chen et al.	2024	Black-box opinion manipulation attacks to retrieval-augmented generation of large language models
Zhao et al.	2024	WildHallucinations: Evaluating long-form factuality in LLMs with real-world entity queries
Foley et al.	2023	Matching pairs: Attributing fine-tuned models to their pre-trained large language models
Gan et al.	2024	Navigating the risks: A survey of security, privacy, and ethics threats in LLM-based agents
Bajaj et al.	2023	Exposing the vulnerabilities of deep learning models in news classification
Boumber et al.	2024	LLMs for explainable few-shot deception detection
Hossain et al.	2024	Securing vision-language models with a robust encoder against jailbreak and adversarial attacks
Li et al.	2024	QuickLLaMA: Query-aware inference acceleration for large language models
US10621261B2 (en)	2020-04-14	Matching a comment to a section of a content item based upon a score for the section
Chen et al.	2024	The dark side of human feedback: Poisoning large language models via user inputs
Keluskar et al.	2024	Do LLMs Understand Ambiguity in Text? A Case Study in Open-world Question Answering
Kwon et al.	2022	Textual adversarial training of machine learning model for resistance to adversarial examples