LLMjacking Targets Local AI Servers at Scale

Kaspersky documents a rising wave of "LLMjacking" attacks that probe and attempt to hijack privately hosted AI servers, according to its May 12, 2026 blog post. The company ran an internet-accessible honeypot that impersonated common local hosting stacks, including Ollama, LM Studio, AutoGPT, LangServe, and text-gen-webui, and advertised a local instance of Qwen3-Coder 30B Heretic. Kaspersky reports that recon-like requests began within one hour and that the internet scanner Shodan indexed the honeypot within three hours; over a month the trap recorded more than 113,000 requests from thousands of unique IPs, 23% of which specifically probed for AI capabilities. Kaspersky recommends establishing rigorous security controls from deployment onward and monitoring for capability-reconnaissance traffic when exposing model-serving stacks.
What happened
Kaspersky published a detailed writeup on May 12, 2026, documenting attempts to hijack privately hosted LLM servers. Per Kaspersky, the author deployed an internet-facing honeypot on a Raspberry Pi that mimicked common local-serving stacks, including Ollama, LM Studio, AutoGPT, LangServe, and text-gen-webui, and advertised a local instance of the model Qwen3-Coder 30B Heretic. According to Kaspersky, recon-like requests began within one hour, and the internet scanner Shodan indexed the honeypot within three hours. Over the following month the honeypot logged more than 113,000 requests from thousands of unique IPs, with 23% of traffic focused on discovering AI capabilities and exploiting local LLMs or agents.
Technical details
Editorial analysis: Kaspersky's honeypot setup reportedly served pre-saved model responses and exposed OpenAI-compatible API endpoints, now a common attack surface because OpenAI-format APIs are so widely adopted. The article also notes the honeypot advertised ancillary resources such as RAG databases and an MCP endpoint exposing a get_credentials-like capability, elements attackers probe in order to escalate from compute hijacking toward data or credential theft.
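To make the pattern concrete, a minimal sketch of a honeypot-style responder in the spirit of the setup Kaspersky describes might serve only canned data on known model-serving paths. The model id, response shape, and path handling here are illustrative assumptions, not details from the article (a real Ollama /api/tags response, for instance, has a different JSON shape):

```python
import json

# Canned catalogue imitating an OpenAI-compatible /v1/models listing.
# The model id is a placeholder; Kaspersky's honeypot served pre-saved
# responses, but the exact payloads are not published.
CANNED_MODELS = {
    "object": "list",
    "data": [{"id": "qwen3-coder-30b", "object": "model", "owned_by": "local"}],
}

def respond(path: str) -> tuple[int, str]:
    """Return (status, json_body) for a request path, serving only canned data.

    /v1/models is the OpenAI-style model list; /api/tags is Ollama's.
    For brevity both return the same payload here, which a real honeypot
    would differentiate to look convincing.
    """
    if path in ("/v1/models", "/api/tags"):
        return 200, json.dumps(CANNED_MODELS)
    return 404, json.dumps({"error": "not found"})
```

Wiring this into any HTTP server and logging every request per source IP would yield the kind of capability-probe telemetry the article reports.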
Context and significance
The documented activity illustrates two converging trends practitioners should note: easy-to-deploy local model stacks are increasingly exposed to the open internet, and automated scanners rapidly enumerate and test those endpoints. The Kaspersky data point that nearly one quarter of requests were capability probing underscores that attackers are not just opportunistic scanners but are looking specifically for model-serving behaviour and exploitable agent workflows.
Mitigations outlined
Kaspersky recommends treating private model-serving infrastructure with the same hardening applied to production servers: restricting internet exposure, enforcing authentication on APIs, segmenting networks, inventorying ancillary services (RAG, agents, MCP), and monitoring for reconnaissance patterns. Kaspersky frames these as deployment-day priorities rather than optional post-deployment steps.
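Of these, enforcing authentication on the API is the cheapest to add. A minimal sketch, assuming the OpenAI-style "Authorization: Bearer <key>" convention (the key value below is a placeholder), could look like:

```python
import hmac

# Placeholder secret; in practice load this from an environment variable
# or secrets manager, never hard-code it.
API_KEY = "replace-with-a-long-random-secret"

def is_authorized(headers: dict[str, str]) -> bool:
    """Reject any request lacking a valid 'Authorization: Bearer <key>' header."""
    auth = headers.get("Authorization", "")
    if not auth.startswith("Bearer "):
        return False
    supplied = auth[len("Bearer "):]
    # Constant-time comparison avoids leaking the key via timing differences.
    return hmac.compare_digest(supplied, API_KEY)
```

Gating every endpoint behind a check like this would have turned most of the honeypot's 113,000 logged requests into immediate 401 responses.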
What to watch
For practitioners: watch for increased scanning signatures targeting OpenAI-style endpoints, monitor unusual volumes of short capability-probing requests, and prioritize access controls and telemetry on any host that advertises model-serving APIs. Public scanning services like Shodan can surface your exposed endpoints rapidly; defenders should test visibility with controlled honeypots and adjust perimeter rules accordingly.
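A simple way to operationalize that monitoring is to flag source IPs that repeatedly hit known model-serving paths. The probe-path list and simplified log format below are illustrative assumptions; real deployments would parse their actual access-log format and tune the threshold:

```python
from collections import Counter

# Paths commonly exposed by model-serving stacks: /v1/models and
# /v1/chat/completions follow the OpenAI convention, /api/tags and
# /api/generate are Ollama's. An attacker's actual probe set will vary.
PROBE_PATHS = {"/v1/models", "/v1/chat/completions", "/api/tags", "/api/generate"}

def probing_ips(log_lines: list[str], threshold: int = 3) -> list[str]:
    """Return IPs that hit known model-serving paths at least `threshold` times.

    Each log line is assumed to be '<ip> <method> <path>' (a simplified
    format for illustration, not a real access-log schema).
    """
    hits: Counter[str] = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) == 3 and parts[2] in PROBE_PATHS:
            hits[parts[0]] += 1
    return [ip for ip, count in hits.items() if count >= threshold]
```

Feeding perimeter logs through a filter like this surfaces exactly the short, repeated capability probes that made up roughly a quarter of the honeypot's traffic.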
Scoring Rationale
The story documents a widespread, automated threat against privately hosted model servers with concrete honeypot telemetry, making it a notable operational risk for practitioners running local stacks. It is not a paradigm-shifting research result, but it has immediate operational relevance.