-
New Jun 4, 2026
Model Evaluations: Prove Your Routing Policy Actually Works
Most teams running inference at scale do not fail because they cannot find a "good" model. They fail because they ship a routing policy that looks fine in a playground, but drifts the moment it se...
-
New Jun 3, 2026
The Team Behind Deploy: Shipping AI, the DigitalOcean Way
Deploy 2026 came and went, and we're still buzzing. For one day at Convene 100 Stockton in San Francisco, developers, startup founders, customers, and partners filled the room to talk about a shared c...
-
New Jun 3, 2026
Powering the Inference Era: Inside the DigitalOcean Data & Learning Layer
Building an AI-native application requires a data layer that can do two things at once: handle the structured, transactional queries your application runs on, and understand meaning well enough to po...
-
New Jun 2, 2026
Open by Design: How NVIDIA and DigitalOcean Are Building the Stack for the Always-On Agentic Era
The growth of generative AI isn't driven solely by AI companies with proprietary models. Open-source AI is reshaping the developer ecosystem, fueled by a growing community of builders. But what do...
-
New Jun 1, 2026
The Inference Tax: How Prefix-Aware Routing Eliminates the Hidden Cost of LLMs at Scale
Introduction Inference demand is growing fast, and it's only accelerating. By 2030, inference is expected to account for the majority of AI compute globally. But scaling inference i...
-
New Jun 1, 2026
DigitalOcean Serverless Inference: A Deep Dive
The Problem: Inference Gets Hard at Scale If you've shipped an AI feature to production, you already know: the hard part isn't making a model resp...
-
New May 29, 2026
AI Disruptors: How the Next Generation of Business is Being Built
Getting your hands on a capable AI model is the easy part now. Every team can reach the same frontier models through an API, so a strong model is not what sets a product apart. What separates a wo...
-
New May 28, 2026
OpenCode Now Supports DigitalOcean Inference Router for Intelligent Model Routing
Coding agents today have a massive spending problem. Every request, whether you're designing system architecture or writing a single-line docstring, often gets routed to the same expensive frontier mo...
-
New May 27, 2026
Scalable, Cost-Efficient AI: Introducing Unified Batch Inference on DigitalOcean
At Deploy 2026, we introduced the DigitalOcean AI-Native Cloud, built for the inference era. Batch Inference on the DigitalOcean Inference Engine enables high-volume asynchronous workloads. As develop...
-
New May 22, 2026
Request-Based Autoscaling Is Now Generally Available on App Platform
Traffic doesn't spike on a schedule. A product launch, a viral moment, or a flash sale can send request volume through the roof in seconds, long before your CPU metrics catch up. That gap is where pe...
-
New May 20, 2026
How We Built DigitalOcean Inference Router
Most teams building on LLMs today make a single model decision and apply it uniformly across every request. They reach for a frontier model not because every task demands it, but because building th...
-
New May 13, 2026
Your Model Doesn't Matter. Your Infrastructure Does.
Everyone calling an LLM API has access to the same models. So what actually sets technical teams apart? It's everything around the model, like the routing logic, the live data pipelines, and the abili...
-
New May 4, 2026
Powering the Inference Era: Inside the DigitalOcean AI-Native Cloud
I've spent the last fifteen years building cloud services: early days of AWS building S3 and EBS, helping launch Oracle Cloud Infrastructure from inception, and now building the agentic cloud at Di...
-
New Apr 28, 2026
Introducing DigitalOcean AI-Native Cloud for Production AI Workloads
The AI industry has a compounding bottleneck, and it isn't the models. It's inference. What used to be a single model call has become a system of continuous interaction. Applications now orchestra...
-
New Apr 28, 2026
How we built the most performant DeepSeek V3.2, MiniMax-M2.5 and Qwen 3.5 397B on DigitalOcean Serverless Inference
Today at Deploy, we are announcing the general availability of DeepSeek V3.2, MiniMax-M2.5, and Qwen 3.5 397B on DigitalOcean Serverless Inference. On DeepSeek V3.2 and Qwen 3.5 397B, we deliver #1 ou...
-
New Apr 25, 2026
DigitalOcean Dedicated Inference: A Technical Deep Dive
Getting a model to answer 10 inference requests concurrently is tricky but simple enough; getting it to handle 2,000 engineers hitting a coding assistant with long contexts, all day, without ru...
-
New Apr 23, 2026
Beyond the Abyss: Project Poseidon's Quest for Zero-Downtime Reliability
In large-scale cloud environments, unpredictable hypervisor crashes carry real operational cost. While traditional reactive monitoring that relies on static thresholds and post-hoc alerts were on...
-
New Apr 23, 2026
From Incident Counting to SLIs: How DigitalOcean Rethought Availability
Our journey to truly understand our customer experience began with a hard look at our internal availability numbers at the start of 2025. We saw something uncomfortable: the numbers didn't match ou...
-
New Apr 22, 2026
The LLM Inference Trilemma: Throughput, Latency, Cost
We know how to scale traditional web services: throw a load balancer in front of stateless microservices and horizontally scale your CPU instances as traffic grows. Large Language Models break this pl...
-
New Apr 21, 2026
Mastering the 600B+ Frontier: Optimizing Large Model Deployments on the Inference Cloud
We have moved past the point where a 70GB model was considered "heavy." With the rise of models like DeepSeek-V3, the GLM series, and other massive Mixture-of-Experts (MoE) architectur...
-
New Apr 17, 2026
The Inference Cloud Memory Layer: A Technical Dive into DigitalOcean Managed Databases
As AI moves from experimental chat interfaces to production-grade agents, the need for a foundational memory layer to transform these AI-powered tasks into stateful models is apparent. The absence o...
-
New Apr 15, 2026
Load Balancing and Scaling LLM Serving
Load balancing for LLMs is fundamentally different from load balancing for traditional services like web servers, APIs, or databases. Prompt caching is the reason. Prompt caching typically cuts inp...
-
New Apr 13, 2026
Building a Robust Documentation Agent with DigitalOcean Gradient AI Platform
At DigitalOcean, documentation has always been a priority. Developers come to our docs to get unstuck, and the faster they find what they need, the better. Traditional docs pages work, but they re...
-
New Apr 7, 2026
Advanced Prompt Caching at Scale
Introduction Prompt caching is the process of reusing already computed KV states across inference requests in order to save money and reduce latency. Within a single replica, m...
-
New Apr 3, 2026
The Hidden Cost of Complex AI Platforms: Why Developer Experience Matters
The cloud AI platform ecosystem today looks more powerful than ever, with access to powerful GPUs like NVIDIA H100 and H200, massive libraries of pre-trained models, and full pipelines for fine-tu...
-
New Apr 2, 2026
The Glue Problem in Modern AI Development
AI is now central to modern software development. Teams across industries are turning to AI to solve product and workflow problems in software. But building production systems is still complex. The ha...
-
New Apr 2, 2026
The Agentic Era Demands a New Class of Infrastructure: DigitalOcean Acquires Katanemo Labs
At DigitalOcean, we have been vocal about our strategic shift: we are building the world's premier Agentic Inference Cloud. Our mission is to provide the foundation where AI-native enterprises bu...
-
New Apr 1, 2026
Run Advanced Reasoning on DigitalOcean with Arcee AI's Trinity Large-Thinking
Today, we're announcing that Arcee AI's Trinity Large-Thinking is now available in Public Preview on DigitalOcean's Agentic Inference Cloud, giving developers the ability to run frontier-class reaso...
-
New Apr 1, 2026
Now Available: DigitalOcean Cloud Security Posture Management (CSPM)
Keeping cloud infrastructure secure at scale is challenging. Infrastructure drift, exposed services, and sprawling identities create risk, and teams don't always have the time or expertise to ma...
-
New Mar 27, 2026
NVIDIA GTC 2026 Confirmed It: The Inference Era Is Here
Last week at NVIDIA GTC 2026, one message was clear: AI has moved beyond the training era and into the era of production inference. The conversation was no longer just about building faster chips an...
-
New Mar 24, 2026
DigitalOcean India: Inside Our Growing Hub for AI and Cloud Innovation
At DigitalOcean, our philosophy, "We ship in hours and days, not months or quarters," delivers real results for our customers. This is the exact velocity that has defined DigitalOcean's India re...
-
New Mar 23, 2026
Enhancing Security with User-Specific Access Keys for DigitalOcean Functions
As teams grow and scale their serverless workloads, managing security postures becomes just as critical as managing code. Our goal at DigitalOcean is to support your growth at every stage. One way we...
-
New Mar 19, 2026
Meet the New Standard for High-Performance, Low-Cost Inference: NVIDIA Dynamo 1.0 is now available to DigitalOcean Customers
NVIDIA Dynamo 1.0, which was released on Monday at NVIDIA GTC, is now available to DigitalOcean customers to help drive performance enhancements and cost efficiency. NVIDIA Dynamo 1.0 offers a 7x in...
-
New Mar 17, 2026
Prompt Caching for Anthropic and OpenAI Models: Building Cost-Efficient AI Systems
Large Language Models (LLMs) have become a foundational component for modern AI applications, from developer copilots and documentation assistants to advanced troubleshooting tools. As these ap...
-
New Mar 16, 2026
DigitalOcean at NVIDIA GTC 2026: Building the AI Factory for the Agentic Era
A seamless path for builders: Start building on build.nvidia.com, Deploy to DigitalOcean The landscape of artificial intelligence has shifted from static models to dyna...
-
New Mar 16, 2026
Deploy Smarter with AI: Introducing App Platform Skills on DigitalOcean
AI coding assistants have fundamentally changed how developers write software. Tools like Claude Code, Codex, GitHub Copilot, Gemini, and Cursor can scaffold an entire application in minutes. But as...
-
New Mar 13, 2026
Scaling Autonomous Site Reliability Engineering: Architecture, Orchestration, and Validation for a 90,000+ Server Fleet
As Cloudways scaled from a bootstrapped startup to a leading managed PHP hosting service, one of the biggest challenges we encountered was the growing support load. Managing a fleet of over 90,000 se...
-
New Mar 5, 2026
Native .NET Buildpack Support is Now Available on App Platform
The .NET ecosystem continues to power a significant share of enterprise and cloud-native applications, from web APIs and microservices to full-stack applications built with ASP.NET Core. Developers bu...
-
New Mar 3, 2026
How DigitalOcean's Agentic Inference Cloud powered by NVIDIA GPUs Achieved 67% Lower Inference Costs for Workato
Workato's AI Research Lab is focused on helping customers extend their production automation with agentic AI capabilities, systems that can reason, act, and orchestrate work across the business. At Wo...
-
New Feb 26, 2026
Supabase Template is Now Available on DigitalOcean App Platform
Modern applications need more than just a database. They need authentication, auto-generated APIs, file storage, and real-time subscriptions. Supabase is a powerful open-source Firebase alternative th...
-
New Feb 25, 2026
Zero to Deploy: Launching Your Career at DigitalOcean
Diving into the professional world is a big moment for recent graduates. At DigitalOcean, we believe the best way to become a world-class builder is to build. Our entry-level roles are designed to tr...
-
New Feb 19, 2026
DigitalOcean Gradient™ AI GPU Droplets Optimized for Inference: Increasing Throughput at Lower the Cost
Production-grade LLM inference demands more than just access to GPUs; it requires deep optimization across the entire serving stack, from quantization and attention kernels to memory management and pa...
-
New Feb 19, 2026
Expanding our Agentic Inference Cloud: Introducing GPU Droplets Powered by AMD Instinct™ MI350X GPUs
As our Agentic Inference Cloud continues to grow, we're excited to announce the availability of new, high-performance GPU D...
-
New Feb 18, 2026
DigitalOcean Gradient™ AI Platform Now Integrates with LlamaIndex
We're excited to announce that DigitalOcean Gradient™ AI Platform now integrates natively with LlamaIndex - one of the most popular frameworks for building RAG applications. This means you can now c...
-
New Feb 10, 2026
The Container paradox: Why the Inference Cloud Demands a "Decoupled" Database
Kubernetes has won the cloud-native war for a reason: it's one of, if not the most powerful tool we have for scaling applications and ensuring they stay up when unexpected things happen. But as we mo...
-
New Feb 9, 2026
Heroku's Next Chapter Is Maintenance. Yours Shouldn't Be
Heroku's move to a "sustaining engineering" model was carefully worded. It avoids the term end-of-life. It reassures existing customers that nothing changes immediately. It emphasizes stability and su...