Cloudflare Outage, Reward Hacking for LLMs, Uber's Multi-Clou…

Software Architecture Weekly (Architecture Weekly Newsletter)

Architecture Weekly Issue #188. Articles, books, and playlists on architecture and related topics. Split into sections and marked by complexity: 🤟 means hardcore, 👷‍♂️ is technically applicable right away, 🍼 is an introduction to the topic or an overview. Available on Telegram as well.












Highlights

Cloudflare Outage on November 18, 2025 👷‍♂️

Two weeks ago our QA engineer, Nadia, reported that our admin panel on staging was showing an Internal Server Error, illustrated by the now-famous screenshot in which Cloudflare itself reports an error. Apparently, we were not the only ones: seemingly half of the internet experienced the same, including Twitter, Tesla and other giants. The immediate cause was a Rust unwrap() call in production code of the Bot Management module, which panicked when an oversized configuration file exceeded a hard-coded limit. The details are in the post mortem by Cloudflare itself.





#postmortem

Consequences of reward hacking in LLMs 🤟

Reward hacking is a phenomenon where a model finds a workaround to achieve its goal instead of doing the work, e.g. removing tests instead of writing code that passes them. Surprisingly enough, a model trained to reward hack appears to become misaligned in other respects as well. Read the paper from Anthropic to learn more.





#ai #paper

Building Uber’s Multi-Cloud Secrets Platform 👷‍♂️

Uber explains how they consolidated 25 disparate secrets vaults across multiple clouds into a centrally governed Secret Management Platform, standardizing metadata, lifecycle, and rotation across ~150k secrets and several deployment platforms (Kubernetes, YARN, Flink, data pipelines, etc.). Architecturally, it demonstrates centralization of control planes, decoupling of secret distribution from workloads, workflow orchestration (Cadence-based Secret Lifecycle Manager), and risk-based prioritization of automation. This is particularly useful for senior architects designing cross-cutting security and configuration platforms in heterogeneous, multi-cloud microservice environments.





#security

Follow-Up

AWS Lambda Managed Instances 🍼

AWS Lambda is a serverless offering, but what if you want to save some bucks through committed usage? Now you have that option with Lambda Managed Instances. Go to Capacity Providers and commit!





#aws #serverless

AI Agents with Temporal 🍼

LLMs are not purely deterministic, but that does not mean they are completely uncontrollable. Their output can still be handled deterministically: either it complies with the requested format, or you treat it as an error. That's why building workflows is still possible, be it with make.com, n8n or Temporal. Find the details in the blog post by the latter.
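
The "comply or treat it as an error" idea can be sketched in a few lines. This is a minimal illustration in plain Python, not Temporal's SDK: the schema in REQUIRED_KEYS, the helpers parse_reply and call_with_retries, and the fake model are all assumptions made up for the example. In a workflow engine, the retry loop would typically be replaced by the engine's own retry policy.

```python
import json
from typing import Callable

REQUIRED_KEYS = {"city", "temperature_c"}  # hypothetical schema for illustration


def parse_reply(raw: str) -> dict:
    """Return the parsed reply, or raise ValueError if it breaks the format."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict) or not REQUIRED_KEYS.issubset(data):
        raise ValueError(f"missing keys, expected {sorted(REQUIRED_KEYS)}")
    return data


def call_with_retries(model: Callable[[str], str], prompt: str, attempts: int = 3) -> dict:
    """Call the model until a reply passes validation or attempts run out."""
    last_error = None
    for _ in range(attempts):
        try:
            return parse_reply(model(prompt))
        except ValueError as exc:
            last_error = exc  # a malformed reply is just another failed step
    raise RuntimeError(f"no valid reply after {attempts} attempts: {last_error}")


# A stand-in for a real LLM client (assumption): fails once, then complies.
replies = iter(["Sure! It is sunny.", '{"city": "Berlin", "temperature_c": 21}'])
result = call_with_retries(lambda prompt: next(replies), "Weather in Berlin as JSON")
print(result)
```

The model itself stays non-deterministic, but the step around it has only two outcomes: a validated dict or an error the workflow can retry or escalate.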





#ai

Structured Logging Explained 👷‍♂️

Logs are the backbone of observability. However, if you still interpolate strings, you miss out on log searchability and, surprisingly, performance. Amit explains how to get those back with structured logging and elaborates on the log collection architecture.
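
To make the difference concrete, here is a small sketch using only Python's standard logging module. It is not taken from Amit's article: the JsonFormatter class, the "fields" key passed through extra=, and the "checkout" logger name are assumptions chosen for the example.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Values passed via extra={"fields": ...} become an attribute on the record.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload)


stream = io.StringIO()  # stands in for stdout or a log shipper
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("checkout")  # hypothetical service name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

# Interpolated: the order id is buried inside the message text.
logger.info("order %s paid in %d ms" % ("A-17", 42))

# Structured: constant message, values as separate searchable fields.
logger.info("order paid", extra={"fields": {"order_id": "A-17", "duration_ms": 42}})

lines = [json.loads(line) for line in stream.getvalue().splitlines()]
print(lines[1]["order_id"])
```

With the second form, a log backend can filter on order_id or aggregate duration_ms directly, while the first form needs a regex over the message. Eager interpolation with % also builds the string even when the log level is disabled, which is one place the performance cost comes from.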





#observability

How to handle billions of emojis 👷‍♂️

Ever seen those emojis floating up the screen in Twitch streams or football championships? Have you ever wondered what it takes to support them reliably? Well, here is a case study showing how to implement such a system.





#casestudy

INs and OUTs of Outbox Pattern 👷‍♂️

Gunnar Morling warns that dual writes (updating a database and publishing an event separately) inevitably cause inconsistencies in distributed microservices. He promotes the Outbox Pattern, which atomically commits data changes and events in one local transaction. Debezium then pulls these events from the database's transaction log and delivers them reliably. He also outlines optimizations like using PostgreSQL's log directly and ensuring idempotent consumers. The core message: strong transactional guarantees are mandatory for consistent event-driven systems.
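
The core of the pattern fits in a short sketch. The example below uses SQLite from Python's standard library so it runs anywhere; the table names, the OrderPlaced event, and the polling relay() function are assumptions for illustration. In particular, relay() polls the outbox table as a simple stand-in, whereas Debezium reads the database's transaction log.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL);
    CREATE TABLE outbox (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        event_type TEXT NOT NULL,
        payload TEXT NOT NULL
    );
    """
)


def place_order(order_id: int, item: str) -> None:
    """Write the event and the order in one local transaction."""
    with conn:  # commits on success, rolls back if anything raises
        # Event first, so the failing call below demonstrates the rollback.
        conn.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            ("OrderPlaced", json.dumps({"order_id": order_id, "item": item})),
        )
        conn.execute("INSERT INTO orders (id, item) VALUES (?, ?)", (order_id, item))


place_order(1, "keyboard")
try:
    place_order(1, "mouse")  # duplicate primary key: order AND event are rolled back
except sqlite3.IntegrityError:
    pass

seen = set()  # consumer-side record of processed event ids


def relay() -> list:
    """Poll the outbox and hand each unseen event to the consumer once."""
    delivered = []
    rows = conn.execute("SELECT id, event_type, payload FROM outbox ORDER BY id")
    for event_id, event_type, payload in rows:
        if event_id in seen:  # idempotent consumer: skip redelivered events
            continue
        seen.add(event_id)
        delivered.append((event_type, json.loads(payload)))
    return delivered


first = relay()
second = relay()  # simulates at-least-once redelivery of the same events
print(first, second)
```

Because the order row and the event row share one transaction, there is never an order without its event or an event for an order that was rolled back. The second relay() call returns nothing, which is the effect an idempotent consumer aims for under at-least-once delivery.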

#cdc





Big thanks to Nikita, Constantin, Anatoly, Oleksandr, Dima, Pavel B, Pavel, Robert, Roman, Iyri, Andrey, Lidia, Vladimir, August, Roman, Egor, Roman, Evgeniy, Nadia, Daria, Dzmitry, Mikhail, Nikita, Dmytro, Denis and Mikhail for supporting the newsletter on Patreon! Alternatively, you can upgrade to the paid version and get access to the premium-only posts!
