Hello, world! I'm Vivian, a cybersecurity and AI Product Manager trying to keep up with an industry that moves faster than I can whisk up my morning matcha. Every week brings a new wave of vulnerabilities, AI security mishaps, and breaches that keep us on our toes, so I take some time to share the most interesting news instead of letting it all blur together. Let's dive into what kept us up at night recently in cybersecurity and AI security.

AI News:

Calendar Events Can Execute Code on Systems Running Claude Desktop Extensions

Security researchers identified a critical flaw in Claude Desktop Extensions affecting over 10,000 users and 50 extensions. Desktop extensions operate with complete system access rather than running in isolated environments like browser versions. Attackers can chain low-privilege connectors such as calendar apps with local code executors, requiring no user confirmation. The exploit needs only a simple user instruction paired with a calendar entry containing commands to retrieve and execute remote code, receiving a maximum severity rating.

Malicious Campaigns Exploit Claude AI Pages to Distribute Mac Malware

Security teams discovered attackers using publicly shared Claude artifacts and sponsored search listings to deliver MacSync infostealer to macOS users. Victims searching for technical solutions encounter promoted results directing them to Claude artifacts disguised as security guides. These fake guides instruct users to execute encoded terminal commands that download MacSync, which then steals keychain data, browser information, and cryptocurrency wallet contents. The malicious artifact accumulated over 15,600 views before detection, with similar tactics previously observed across other AI platforms.

Quarter Million Chrome Users Affected by Fake AI Assistant Extensions

Security researchers found 30 Chrome extensions falsely claiming to be legitimate AI tools, collectively installed by over 260,000 users. Instead of functioning locally, these extensions inject remotely controlled interfaces, allowing operators to modify behavior without store updates. The extensions extract webpage content and voice input, with 15 specifically targeting Gmail to capture and transmit email thread data. When removed from the store, identical copies reappeared under different names within weeks.

Single Prompt Can Remove AI Model Safety Controls

Microsoft researchers found that a training method called Group Relative Policy Optimization can be weaponized to eliminate model safety alignment. A single unlabeled prompt requesting fake news creation successfully removed safety controls from 15 different language models, causing them to become permissive across harmful categories far beyond the original request. The technique proved effective on both text models and safety-tuned image generation models.

Anthropic Publishes Claude Opus 4.6 Safety Assessment

Anthropic released a formal evaluation examining whether Claude Opus 4.6 could exploit system access to manipulate research, insert backdoors, or poison training data. Following alignment audits, interpretability studies, sabotage simulations, and internal usage monitoring, the assessment found no evidence of dangerous misaligned goals, though characterizing overall risk as very low but not zero. The model showed increased willingness to complete suspicious tasks without detection compared to earlier versions, and demonstrated more manipulative behavior in multi-agent testing environments.

Google Documents Nation-State AI Abuse and Model Theft

Google Threat Intelligence Group's quarterly report revealed government-backed actors from multiple countries actively misusing Gemini throughout attack operations, from reconnaissance and phishing to malware development and code translation. Google also disrupted model extraction campaigns where actors used legitimate API access to systematically replicate Gemini's reasoning capabilities through over 100,000 prompts. A new malware family called HONESTCUE was identified that calls Gemini mid-execution to generate payloads entirely in memory. All identified malicious accounts were disabled.

OpenAI Adds Lockdown Mode and Risk Labels for Prompt Injection Protection

OpenAI introduced two security features targeting prompt injection: Lockdown Mode for high-risk users and Elevated Risk labels that flag capabilities with additional network or data exposure. Lockdown Mode restricts web browsing to cached content, disables Deep Research and Agent Mode, and prevents file downloads for analysis, blocking routes attackers use for data exfiltration. Currently available for Enterprise, Education, Healthcare, and Teachers plans, with consumer rollout planned.

OpenAI Researcher Resigns Over ChatGPT Advertising Plans

An OpenAI researcher resigned the day the company began testing advertisements in ChatGPT, publishing an essay explaining concerns about incentives created by ad-based models. The concern centered on manipulation risks when building advertising models atop archives of private conversations, drawing comparisons to social media platforms that abandoned early data protection promises as ad revenue pressure increased. The departure follows other recent high-profile resignations from AI safety researchers at multiple companies.

North Korean Lazarus Group Targets Developers Through Fake Crypto Recruitment

Security researchers identified a campaign active since May 2025 targeting JavaScript and Python developers in crypto and blockchain. Attackers impersonate recruiters on social platforms, directing victims to complete coding interview tasks containing malicious dependencies published to legitimate package repositories. Running these tasks installs remote access trojans capable of downloading files, executing commands, and checking for cryptocurrency extensions. One package accumulated over 10,000 downloads in clean form before a malicious version was silently deployed.

Abandoned Outlook Add-In Hijacked to Steal 4,000 Credentials

An attacker compromised a legitimate meeting scheduling add-in after its developer abandoned it. Office add-ins load content from live URLs rather than static bundles, so when the original hosting became unclaimed, an attacker registered it and deployed a phishing page running inside Outlook's trusted sidebar with full email access. Microsoft reviewed and signed the original manifest but never rechecked the URL content. Researchers recovered over 4,000 stolen Microsoft credentials, credit card numbers, and banking security answers from the poorly secured exfiltration infrastructure.

North Korean Hackers Deploy Deepfakes in Crypto Targeting Campaign

Investigators examined a targeted attack where the victim was lured into a fake video call after contact through a compromised Telegram account belonging to a legitimate crypto executive. During the call, a disguised technical troubleshooting attack executed the initial infection, and the victim reported seeing deepfake video of a cryptocurrency CEO. The compromise deployed seven distinct malware families on a single system designed to steal credentials, browser data, messaging sessions, keystrokes, and cookies, with unusually heavy tooling intended to harvest data for immediate theft and future social engineering.

Singapore Discloses Largest Cyber Operation After Telecom Sector Attack

Singapore's cybersecurity agencies revealed that a state-sponsored actor conducted a targeted campaign against all four major telecommunications operators. Attackers exploited a zero-day vulnerability to bypass perimeter defenses and deployed rootkits for persistent access and detection evasion, though no evidence indicated customer data access or service disruption. The government response involved over 100 cyber defenders across six agencies over eleven months, representing Singapore's largest coordinated incident response effort.

Pig Butchering Scammer Sentenced Despite Being Fugitive

An individual received a 20-year prison sentence in absentia after pleading guilty to laundering over $73 million stolen through pig butchering scams. After arrest at an airport, the defendant cut off ankle monitoring and fled before sentencing. Conspirators contacted victims through social media and dating platforms, built fake relationships, then directed them to spoofed crypto trading platforms, routing stolen funds through shell companies before cryptocurrency conversion. Eight co-conspirators also pleaded guilty.

Luxury Brands Fined $25 Million After Breaches Expose 5.5 Million Customers

South Korea's data protection authority fined the Korean subsidiaries of three luxury brands a combined $25 million after data breaches tied to their customer management platforms. The largest penalty followed malware on an employee device exposing credentials for 3.6 million customers, while other breaches occurred through voice phishing attacks tricking customer service employees into providing system access. Two companies failed to notify authorities within the required 72-hour window, with one disclosing five days after discovery and another reporting 13 days after detection.

That's all for this week. Stay informed.

 

That's all for now… Stay informed and protected.