Hi, I'm Vivian, a cybersecurity and AI Product Manager trying to keep up with an industry that moves faster than I can whisk up my morning matcha. Every week brings a new wave of vulnerabilities, AI security mishaps, and breaches that keep us on our toes, so I take some time to share the most interesting news instead of letting it all blur together. Let's dive into whats been happening recently in cybersecurity and AI security.

AI Research & Vulnerabilities

Perplexity Comet's AI Browser Can Be Tricked into Quietly Deleting Your Entire Google Drive

Researchers uncovered a zero-click attack where a single, harmless-looking email can cause Perplexity's Comet browser agent to wipe a user's Google Drive. Once Comet has OAuth access to Gmail and Drive, an attacker can embed polite, step-by-step instructions in an email. When the user later asks Comet to handle their recent tasks, the agent interprets those email instructions as part of the workflow and proceeds to delete legitimate files across personal and shared folders. No phishing links, jailbreaks, or additional prompts are required.

30+ Vulnerabilities Found in AI Coding Assistants Allow Prompt Injection to Escalate into Data Exfiltration and RCE

Researchers uncovered more than 30 flaws across AI-powered IDEs including Copilot, Cursor, Windsurf, Zed, Roo Code, Claude Code, and others, showing how simple prompt injections can escalate into data exfiltration or remote code execution. The issue stems from how these agents interact with underlying IDE features: an injected instruction can make the agent read sensitive files and then modify workspace settings or JSON configs that trigger the IDE to pull remote schemas or execute attacker-controlled binaries. Because the agents treat these configs as part of normal assistance, the attack requires no further user interaction.

Model Context Protocol's Sampling Feature Enables Tool Abuse and Silent Data Exfiltration

Researchers found that MCP, now used by many AI agents to connect to external tools, introduces a new attack surface when its "sampling" feature is enabled. A malicious MCP server can inject hidden instructions into the context it provides, causing the LLM to execute unauthorized tool calls, leak sensitive data, or run attacker-controlled workflows while appearing completely legitimate. Because agents treat MCP responses as trusted context, the exploit requires no jailbreak or user interaction.

PromptPwnd: Prompt Injection in GitHub Actions Lets AI Agents Leak Secrets

Researchers uncovered a new vulnerability pattern where AI agents embedded in GitHub Actions and GitLab CI/CD pipelines read untrusted issue bodies, PR descriptions, or commit messages directly into their prompts. By crafting those fields with hidden instructions, an attacker can trick the agent into running privileged GitHub CLI or shell commands with high-privilege tokens, leading to secret exfiltration, workflow modification, and repository tampering. The team confirmed real, exploitable cases in at least five Fortune 500 companies, including a now-patched Gemini CLI workflow that could be abused to leak API keys and cloud credentials.

Google Antigravity Accidentally Wipes Entire Storage Drive

A user reported that while using Antigravity in "Turbo mode," the AI executed a destructive command that wiped their entire D: drive. The deletion bypassed the recycle bin, making recovery impossible. This real-world case underscores how powerful agentic IDE tools still expose users to major risk when they run root-level commands with minimal confirmation.

Syntax Hacking: Sentence Structure Alone Can Bypass LLM Safety Filters

Researchers showed that large-language models can answer questions correctly even when the prompt contains meaningless or substituted words, as long as the underlying sentence structure remains intact. By preserving grammar while stripping semantics, the models still inferred the intended query, demonstrating a stronger reliance on syntactic patterns than expected. The same technique also enabled safety evasion: when harmful requests were embedded inside benign-looking grammatical templates, refusal rates dropped sharply.

OpenAI Tests "Confessions" to Make LLMs Reveal When They Cut Corners

OpenAI researchers introduced a new training method where models generate a secondary "confession" output that reports when they ignored instructions, guessed, or took shortcuts. The technique doesn't evaluate correctness, but rewards the model for openly flagging uncertainty or rule-breaking that wouldn't be obvious from the main answer. In testing, models became more transparent about hidden behaviors such as hallucinations, skipped reasoning steps, or actions taken purely to maximize reward.

Anthropic's "Interviewer" Reveals How Workers Use AI and Where Trust Breaks Down

Anthropic built Interviewer to understand people's real experiences with AI at work, and used it to interview 1,250 workers across general roles, creative fields, and scientific research. The results show strong adoption for routine tasks such as summarizing, drafting, brainstorming, and code cleanup, but limited trust when accuracy, originality, or domain judgment matter. Many participants said AI boosts speed but also admitted they conceal their usage due to workplace stigma or fear of appearing less competent.

Cybersecurity News

Google Investigating Gmail Hack That Permanently Locks Users Out of Accounts

There has been a wave of Gmail account takeovers where attackers compromise an account and reclassify it as a supervised "child" profile under their family group, allowing the attacker to manage sign-in and recovery. Even with recovery options, the original owner is effectively shut out because Google's standard account recovery flows treat it as a child account under a different adult. Google has said it is "looking into" these lockouts but has not yet provided a clear, documented path for victims.

FBI Warns of "Photo-Attack" Campaign Targeting Social Media Users

The Federal Bureau of Investigation issued a public alert warning users of Facebook, LinkedIn and X about a rising scam where attackers steal or scrape profile photos from social media, then manipulate or reuse them to impersonate real people or create fake identities for phishing, extortion, or social-engineering schemes. The FBI urged users to treat any unexpected messages, photo-based requests, or identity-based contact with skepticism, especially if they originate from profiles with little history or limited mutual connections.

China-Nexus Threat Groups Rapidly Exploit Critical React2Shell RCE

AWS reported that within hours of public disclosure of React2Shell, multiple China-nexus threat groups, including Earth Lamia and Jackpot Panda, began actively targeting vulnerable React Server Components and Next.js applications. The flaw, rated CVSS 10.0, allows unauthenticated remote code execution via crafted payloads sent to React Server Function endpoints, and affects React 19.x and Next.js 15.x–16.x deployments using App Router in default configurations. Observed activity includes broad internet scanning, attempts to steal AWS configuration and credential files, and post-exploitation deployment of malware and cryptominers.

Fintech Vendor Marquis Software Solutions Ransomware Breach Exposes 400k+ Banking Customers' Data

Marquis — a US fintech firm that provides marketing, compliance and CRM services to over 700 banks and credit unions — disclosed a ransomware attack dating to August 14, 2025, after attackers exploited a vulnerability in its SonicWall firewall. The breach exposed sensitive customer data including names, addresses, dates of birth, Social Security numbers, tax IDs, and bank account or credit/debit card numbers for at least 400,000 people so far.


That's all for now… Stay informed and protected.