Hi, I'm Vivian, a cybersecurity and AI Product Manager trying to keep up with an industry that moves faster than I can whisk up my morning matcha. Every week brings a new wave of vulnerabilities, AI security mishaps, and breaches that keep us on our toes, so I take some time to share the most interesting news instead of letting it all blur together. Let's dive into what's been keeping us up at night recently in cybersecurity and AI security.

AI Research & Vulnerabilities

Anthropic Details First AI-Orchestrated Cyber Espionage Campaign

Anthropic disclosed that a Chinese state-sponsored group abused its Claude Code tool to run a highly autonomous cyber-espionage campaign against roughly thirty targets, including major tech firms, financial institutions, chemical manufacturers, and government agencies. The attackers used role-play to bypass safety controls and let the AI handle 80–90% of the intrusion lifecycle, from reconnaissance and exploitation to lateral movement and data analysis, before Anthropic detected the activity.

MCP Server Hijack Lets Attackers Take Over Cursor's Internal Browser

Researchers showed that a malicious Model Context Protocol (MCP) server can modify unverified runtime code in Cursor to take control of its built-in browser, swap legitimate login pages for credential-harvesting pages, and execute attacker-supplied JavaScript. Because MCP servers run with broad permissions, a single untrusted server can steal passwords and perform arbitrary actions on a user's machine.
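
To make that trust model concrete, here's a minimal sketch of an MCP tool server using the FastMCP helper from the MCP Python SDK; the server name and tool are hypothetical, and this is not the code from the Cursor research. The point is simply that whatever a server exposes as a tool runs with the privileges of the user who launched the client.

```python
# Minimal MCP tool server sketch (hypothetical server name and tool).
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-tools")

@mcp.tool()
def read_file(path: str) -> str:
    """Return the contents of a local file."""
    # This runs with the full privileges of the user running the client:
    # a malicious server could just as easily read credential stores or
    # patch application files on disk.
    with open(path, encoding="utf-8", errors="replace") as f:
        return f.read()

if __name__ == "__main__":
    mcp.run()  # serves over stdio to whichever client registered it
```

In other words, an MCP server is code you run, not a site you visit, and it deserves the same vetting as any extension or dependency.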

65% of Forbes AI 50 Startups Caught Leaking Secrets on GitHub

Researchers scanned GitHub repositories tied to the Forbes AI 50 and found that roughly two-thirds of those private AI companies had exposed verified secrets, including API keys, access tokens, and other credentials, in commit histories, deleted forks, workflow logs, and even public repos on employees' personal accounts.
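
If you want to check your own repos for the same failure mode, a toy version of this kind of scan is just a regex pass over full git history, since secrets "deleted" in a later commit still live in old diffs. The patterns below cover a few well-known key formats and are illustrative, not exhaustive; real scanners also verify hits against live APIs.

```python
import re
import subprocess

# A few well-known credential formats (illustrative, not exhaustive;
# the OpenAI pattern in particular varies across key generations).
PATTERNS = {
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_token":   re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "openai_key":     re.compile(r"\bsk-[A-Za-z0-9]{20,}\b"),
}

def scan_repo_history(repo_path: str) -> list[tuple[str, str]]:
    """Grep every patch in a repo's full history, not just the current tree."""
    log = subprocess.run(
        ["git", "-C", repo_path, "log", "-p", "--all"],
        capture_output=True, text=True, errors="replace",
    ).stdout
    hits = []
    for name, pattern in PATTERNS.items():
        for match in pattern.findall(log):
            hits.append((name, match))
    return hits

if __name__ == "__main__":
    for kind, secret in scan_repo_history("."):
        print(f"[{kind}] {secret[:8]}...")  # print only a prefix
```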

Adversarial Prompt Tests Reveal Safety Gaps in ChatGPT, Gemini and Claude

Researchers tested ChatGPT, Gemini, and Claude models with structured adversarial prompts and found that all of them could be pushed into unsafe territory when harmful requests were wrapped as academic research, third-person analysis, stories, or slightly broken grammar. Gemini Pro 2.5 was the easiest to steer into detailed illegal or abusive content, while Gemini Flash 2.5 refused most consistently. Claude models were strict but vulnerable to academic-style wording, and ChatGPT models often gave softer but still usable answers. The takeaway: prompt framing remains a key weak point in current LLM safety systems.

Code Reuse Spreads RCE Bugs Across AI Inference Servers

Researchers found that several popular AI inference servers from vendors and projects like Meta, NVIDIA, Microsoft, vLLM, SGLang, and Modular all inherited the same remote code execution risk by reusing code that combines unauthenticated ZeroMQ sockets with Python pickle deserialization. This ShadowMQ case shows how quickly a single insecure pattern can propagate across the AI ecosystem when teams borrow designs or libraries.
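
The vulnerable pattern fits in a few lines. Below is a schematic sketch (not any vendor's actual code) of the ShadowMQ-style anti-pattern next to a safer shape; the whole difference is between pickle.loads, which executes whatever the peer sends, and a data-only format like JSON.

```python
import json
import pickle

import zmq  # pyzmq

ctx = zmq.Context()

def vulnerable_worker() -> None:
    # The ShadowMQ-style anti-pattern: an unauthenticated socket bound to
    # every interface, feeding raw bytes straight into pickle. Unpickling
    # attacker-controlled data executes attacker-controlled code, so any
    # peer that can reach the port gets remote code execution.
    sock = ctx.socket(zmq.PULL)
    sock.bind("tcp://0.0.0.0:5555")
    while True:
        task = pickle.loads(sock.recv())  # the RCE primitive
        print("received:", task)

def safer_worker() -> None:
    # A safer shape: bind locally (or add CURVE authentication for remote
    # peers) and use a serialization format that parses input without
    # ever executing it.
    sock = ctx.socket(zmq.PULL)
    sock.bind("tcp://127.0.0.1:5556")
    while True:
        task = json.loads(sock.recv())
        print("received:", task)
```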

SSRF in Custom GPT Actions Lets ChatGPT Expose Cloud Metadata

A researcher revealed a severe Server-Side Request Forgery (SSRF) vulnerability in the Actions feature of ChatGPT's custom GPTs, where user-supplied URLs and headers could be redirected to reach OpenAI's internal cloud metadata endpoints.
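
The textbook defense is to resolve and validate every outbound URL before fetching it, and to re-run the check on every redirect. Here's a minimal, hypothetical guard along those lines; it sketches the general SSRF mitigation, not OpenAI's actual fix.

```python
import ipaddress
import socket
from urllib.parse import urlparse

def is_safe_fetch_target(url: str) -> bool:
    """Reject URLs resolving to loopback, private, link-local, or reserved
    addresses, including the cloud metadata endpoint at 169.254.169.254.
    A real defense must re-run this check after every HTTP redirect,
    since redirect chains are the classic filter bypass."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        infos = socket.getaddrinfo(parsed.hostname, None)
    except socket.gaierror:
        return False
    for _family, _type, _proto, _canon, sockaddr in infos:
        try:
            ip = ipaddress.ip_address(sockaddr[0])
        except ValueError:  # e.g. scoped IPv6 literals
            return False
        if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved:
            return False
    return True

# e.g. is_safe_fetch_target("http://169.254.169.254/latest/meta-data/") -> False
```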

Cybersecurity News

OpenAI Resists NYT Demand for 20 Million Private ChatGPT Logs

A federal judge ruled that OpenAI must turn over up to 20 million ChatGPT conversation logs as part of the New York Times' copyright lawsuit, but OpenAI argues the demand sweeps in sensitive user data irrelevant to the case. Even with the court requiring de-identification and strict protective controls, OpenAI warns that producing this information poses real privacy and security risks.

4,300+ Fake Hotel Domains Used to Phish Travelers' Payment Details

Researchers uncovered an ongoing phishing campaign in which a Russian-speaking threat actor registered more than 4,300 domains that mimic brands like Airbnb and Booking.com to trick hotel guests into entering payment information on lookalike booking pages. The attackers send reservation-themed emails that link to tailored phishing sites where victims are asked to confirm card details or pay fees.
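
Defenders often hunt for campaigns like this with simple heuristics before layering on anything fancy. The toy sketch below flags any domain that embeds a watched brand name without actually belonging to that brand; the watchlist and example domains are made up, and production pipelines add homoglyph normalization and certificate-transparency feeds on top.

```python
# Toy lookalike-domain check; brands and example domains are hypothetical.
REAL_DOMAINS = {"airbnb": "airbnb.com", "booking": "booking.com"}

def flag_lookalike(domain: str) -> str | None:
    """Return the brand a domain impersonates, or None if it looks clean."""
    d = domain.lower().strip(".")
    for brand, legit in REAL_DOMAINS.items():
        # Brand name embedded in a domain the brand does not own:
        # catches shapes like booking-guest-verify.com.
        if brand in d and d != legit and not d.endswith("." + legit):
            return brand
    return None

for candidate in ["booking-guest-verify.com", "secure.airbnb.com", "airbnb-refund.net"]:
    hit = flag_lookalike(candidate)
    if hit:
        print(f"{candidate} -> impersonates {hit}")
```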

Oracle Zero-Day Campaign Hits Washington Post and Logitech via Clop Data Theft

The Washington Post and Logitech both disclosed breaches linked to a Clop campaign that exploited a zero-day in Oracle E-Business Suite to steal data from backend systems. The Washington Post says nearly 10,000 employees and contractors had personal and financial details exposed, while Logitech reports that roughly 1.8 TB of internal data was taken, including limited employee, customer, and supplier information.

DoorDash Data Breach Exposes Contact Details After Employee Social Engineering

DoorDash disclosed that an attacker used a social engineering scam against an employee, gained access to internal systems, and pulled certain user contact information for consumers, Dashers, and merchants. The stolen data varied by person but could include names, phone numbers, email addresses, and physical addresses.


That's all for now… Stay informed and protected.