[go: up one dir, main page]

Sui, 2025 - Google Patents

CtrlRAG: Black-box Adversarial Attacks Based on Masked Language Models in Retrieval-Augmented Language Generation

Sui, 2025

View PDF
Document ID
9803925057007599265
Author
Sui R
Publication year
Publication venue
arXiv preprint arXiv:2503.06950

External Links

Snippet

Retrieval-Augmented Generation (RAG) systems enhance Large Language Models (LLMs) by integrating external knowledge bases. However, this integration introduces a new security threat: adversaries can exploit the retrieval mechanism to inject malicious content …
Continue reading at arxiv.org (PDF) (other versions)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30634Querying
    • G06F17/30657Query processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30861Retrieval from the Internet, e.g. browsers
    • G06F17/30864Retrieval from the Internet, e.g. browsers by querying, e.g. search engines or meta-search engines, crawling techniques, push systems
    • G06F17/30867Retrieval from the Internet, e.g. browsers by querying, e.g. search engines or meta-search engines, crawling techniques, push systems with filtering and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • G06F17/27Automatic analysis, e.g. parsing
    • G06F17/2765Recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30943Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
    • G06F17/30946Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type indexing structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55Detecting local intrusion or implementing counter-measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computer systems utilising knowledge based models
    • G06N5/02Knowledge representation
    • G06N5/022Knowledge engineering, knowledge acquisition
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00Subject matter not provided for in other groups of this subclass
    • G06N99/005Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F2221/00Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06QDATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for a specific business sector, e.g. utilities or tourism
    • G06Q50/01Social networking

Similar Documents

Publication Publication Date Title
Arditi et al. Refusal in language models is mediated by a single direction
Omar et al. Robust natural language processing: Recent advances, challenges, and future directions
Kaiser et al. Adapting security warnings to counter online disinformation
Hosseini et al. Deceiving google's perspective api built for detecting toxic comments
US10742605B2 (en) Context-based firewall for learning artificial intelligence entities
Hui Kyong Chun Crisis, crisis, crisis, or sovereignty and networks
US20200004882A1 (en) Misinformation detection in online content
Dugan et al. Raid: A shared benchmark for robust evaluation of machine-generated text detectors
Kucharavy et al. Fundamentals of generative large language models and perspectives in cyber-defense
Kwon et al. Textual backdoor attack for the text classification system
Hadi et al. Introduction to ChatGPT: A new revolution of artificial intelligence with machine learning algorithms and cybersecurity
Huang et al. Authorship attribution in the era of llms: Problems, methodologies, and challenges
Ahmed et al. ChatGPT versus Bard: A comparative study
Chen et al. Black-box opinion manipulation attacks to retrieval-augmented generation of large language models
Zhao et al. Wildhallucinations: Evaluating long-form factuality in llms with real-world entity queries
Foley et al. Matching pairs: Attributing fine-tuned models to their pre-trained large language models
Gan et al. Navigating the risks: A survey of security, privacy, and ethics threats in llm-based agents
Bajaj et al. Exposing the vulnerabilities of deep learning models in news classification
Boumber et al. LLMs for explainable few-shot deception detection
Hossain et al. Securing vision-language models with a robust encoder against jailbreak and adversarial attacks
Li et al. Quickllama: Query-aware inference acceleration for large language models
US10621261B2 (en) Matching a comment to a section of a content item based upon a score for the section
Chen et al. The dark side of human feedback: Poisoning large language models via user inputs
Keluskar et al. Do LLMs Understand Ambiguity in Text? A Case Study in Open-world Question Answering
Kwon et al. Textual adversarial training of machine learning model for resistance to adversarial examples