Sui, 2025 - Google Patents
CtrlRAG: Black-box Adversarial Attacks Based on Masked Language Models in Retrieval-Augmented Language Generation
Sui, 2025
- Document ID
- 9803925057007599265
- Author
- Sui R
- Publication year
- 2025
- Publication venue
- arXiv preprint arXiv:2503.06950
Snippet
Retrieval-Augmented Generation (RAG) systems enhance Large Language Models (LLMs) by integrating external knowledge bases. However, this integration introduces a new security threat: adversaries can exploit the retrieval mechanism to inject malicious content …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30861—Retrieval from the Internet, e.g. browsers
- G06F17/30864—Retrieval from the Internet, e.g. browsers by querying, e.g. search engines or meta-search engines, crawling techniques, push systems
- G06F17/30867—Retrieval from the Internet, e.g. browsers by querying, e.g. search engines or meta-search engines, crawling techniques, push systems with filtering and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30943—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
- G06F17/30946—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2221/00—Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Systems or methods specially adapted for a specific business sector, e.g. utilities or tourism
- G06Q50/01—Social networking
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Arditi et al. | Refusal in language models is mediated by a single direction | |
Omar et al. | Robust natural language processing: Recent advances, challenges, and future directions | |
Kaiser et al. | Adapting security warnings to counter online disinformation | |
Hosseini et al. | Deceiving Google's Perspective API built for detecting toxic comments | |
US10742605B2 (en) | Context-based firewall for learning artificial intelligence entities | |
Hui Kyong Chun | Crisis, crisis, crisis, or sovereignty and networks | |
US20200004882A1 (en) | Misinformation detection in online content | |
Dugan et al. | RAID: A shared benchmark for robust evaluation of machine-generated text detectors | |
Kucharavy et al. | Fundamentals of generative large language models and perspectives in cyber-defense | |
Kwon et al. | Textual backdoor attack for the text classification system | |
Hadi et al. | Introduction to ChatGPT: A new revolution of artificial intelligence with machine learning algorithms and cybersecurity | |
Huang et al. | Authorship attribution in the era of LLMs: Problems, methodologies, and challenges | |
Ahmed et al. | ChatGPT versus Bard: A comparative study | |
Chen et al. | Black-box opinion manipulation attacks to retrieval-augmented generation of large language models | |
Zhao et al. | WildHallucinations: Evaluating long-form factuality in LLMs with real-world entity queries | |
Foley et al. | Matching pairs: Attributing fine-tuned models to their pre-trained large language models | |
Gan et al. | Navigating the risks: A survey of security, privacy, and ethics threats in LLM-based agents | |
Bajaj et al. | Exposing the vulnerabilities of deep learning models in news classification | |
Boumber et al. | LLMs for explainable few-shot deception detection | |
Hossain et al. | Securing vision-language models with a robust encoder against jailbreak and adversarial attacks | |
Li et al. | QuickLLaMA: Query-aware inference acceleration for large language models | |
US10621261B2 (en) | Matching a comment to a section of a content item based upon a score for the section | |
Chen et al. | The dark side of human feedback: Poisoning large language models via user inputs | |
Keluskar et al. | Do LLMs Understand Ambiguity in Text? A Case Study in Open-world Question Answering | |
Kwon et al. | Textual adversarial training of machine learning model for resistance to adversarial examples |