AI News: Agents Get Smarter, Faster, and More Secure
Today's top stories focus on practical AI applications: refining agent workflows with new tools and architectures, optimizing model performance with speculative decoding, and enhancing security protocols for AI systems.
Tools & Products
Grok Build is a coding agent that runs from the terminal
Grok Build is a coding agent now in early beta for SuperGrok Heavy subscribers. AGENTS.md, plugins, hooks, skills, and MCP servers all work out of the box. It supports subagents for larger tasks and deep worktree integration, so users can launch subagents in their own worktrees. A headless mode makes it easy to run agents inside scripts and automations.
Hermes Agent Masterclass
Hermes Agent is a personal AI agent that learns workflows, remembers context, and runs 24/7. Its learning loop remembers across sessions, writes its own reusable skills, prunes them in the background, and validates them offline through an evolutionary engine called GEPA. This masterclass explains what the agent is and how to customize it to your needs.
How OpenAI Built the Codex Windows Sandbox
OpenAI detailed the engineering behind Codex's Windows sandbox, which constrains local commands, file access, and network permissions while still allowing coding agents to operate effectively on developer machines. Sandboxing of this kind is a key component of building secure agents.
Big Tech
Anthropic Launches Claude Platform on AWS for Native Cloud Integration
Claude Platform on AWS aims to make it easier for businesses to use Anthropic's AI models by offering direct access through AWS, so teams can build on their existing AWS infrastructure. For teams already heavily invested in AWS, this could streamline AI deployment and management by providing a familiar environment for working with Claude models.
Research
Thinking Machines Lab Unveils Interaction Models for Real-Time Human-AI Collaboration
Thinking Machines Lab's new interaction models focus on real-time collaboration across multiple data streams, moving beyond traditional turn-based interaction. The goal is to create AI systems that respond faster and more intelligently in dynamic environments. This is worth watching because it could lead to more seamless and intuitive AI-driven tools, especially in fields like robotics and customer service.
Daily Dose of Data Science Explains Speculative Decoding
Daily Dose of Data Science explains speculative decoding, a useful technique for accelerating LLM inference. A smaller draft model predicts several tokens in advance, and the larger model then verifies those predictions in a single pass, keeping the ones it agrees with. According to the explainer, Google, Anthropic, and Meta get 2-3x more tokens per second this way. With a correct acceptance rule, the technique preserves the larger model's output distribution, so it makes inference cheaper and faster to scale without changing what the model produces.
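To make the draft-then-verify loop concrete, here is a minimal sketch of the greedy variant in Python. It is an illustration under stated assumptions, not any vendor's implementation: `target_next` and `draft_next` are hypothetical toy functions standing in for the large and small models, and the verification step calls the target once per position, where a real system would score all drafted positions in one batched forward pass.

```python
from typing import Callable, List, Tuple

Token = int
NextToken = Callable[[List[Token]], Token]


def target_next(context: List[Token]) -> Token:
    """Hypothetical 'large' model: a deterministic greedy next-token rule."""
    return (context[-1] * 7 + len(context)) % 50


def draft_next(context: List[Token]) -> Token:
    """Hypothetical 'small' model: matches the target except at some positions."""
    guess = target_next(context)
    return (guess + 1) % 50 if len(context) % 4 == 0 else guess


def greedy_decode(next_token: NextToken, prompt: List[Token], n_new: int) -> List[Token]:
    """Baseline: one model call per generated token."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(next_token(out))
    return out


def speculative_decode(
    target: NextToken, draft: NextToken, prompt: List[Token], n_new: int, k: int = 4
) -> Tuple[List[Token], int]:
    """Greedy speculative decoding; returns the tokens and the number of verify rounds."""
    out = list(prompt)
    goal = len(prompt) + n_new
    verify_rounds = 0
    while len(out) < goal:
        # 1. The draft model proposes k tokens autoregressively.
        proposal: List[Token] = []
        for _ in range(k):
            proposal.append(draft(out + proposal))

        # 2. The target checks the proposal (one batched pass in a real system).
        verify_rounds += 1
        accepted: List[Token] = []
        for tok in proposal:
            expected = target(out + accepted)
            if tok != expected:
                # First mismatch: keep the target's token and drop the rest.
                accepted.append(expected)
                break
            accepted.append(tok)
        else:
            # Every drafted token matched, so the same pass yields a bonus token.
            accepted.append(target(out + accepted))

        out.extend(accepted)
    return out[:goal], verify_rounds


if __name__ == "__main__":
    prompt = [1, 2, 3]
    baseline = greedy_decode(target_next, prompt, 20)
    tokens, rounds = speculative_decode(target_next, draft_next, prompt, 20)
    print("identical to baseline:", tokens == baseline)
    print("verify rounds:", rounds, "vs baseline target calls:", 20)
```

Every token that reaches the output is one the target itself would have chosen, which is why the result matches plain greedy decoding. The speedup comes from how many drafted tokens are accepted per verification round, so it depends on how often the draft agrees with the target. Sampling-based decoding needs a probabilistic acceptance rule instead of the exact-match check used here.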