Shopify Open-Sources ML Platform That Has Already Saved Them…
Analytics India Magazine (Supreeth Koundinya)

Shopify has open-sourced Tangle, an internal machine-learning experimentation platform designed to cut repetition, enforce reproducibility, and accelerate development cycles.
The system emerged from challenges faced by search and discovery teams within the company, which routinely train and evaluate models against millions of products and billions of queries.
Before Tangle, engineers often rebuilt identical datasets, reran long preprocessing steps, and struggled to reproduce historical results.
Shopify reports the platform has already saved more than a year of compute time internally by eliminating redundant work. “The CPU time savings alone are ridiculous,” said Mikhail Parakhin, the CTO of Shopify.
Tangle addresses six standard failure modes in ML development — scattered queries, unstructured notebooks, repeated data preparation, irreproducible results, slow deployment, and limited collaboration.
As Shopify puts it, “Machine learning development shouldn’t work this way, but it does. 80% of development time is spent on data engineering, not algorithms.”
Its core mechanism is a visual pipeline interface backed by content-based caching. Developers assemble pipelines as directed acyclic graphs composed of “components”—YAML-defined, language-agnostic units that wrap arbitrary CLI programs.
“Think of Tangle as the glue that connects everything in your workflow, no matter how mismatched.”
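To make the graph idea concrete, here is a minimal Python sketch of a pipeline expressed as a directed acyclic graph and ordered so that each step runs after the steps it depends on. The step names are hypothetical and the code uses only Python's standard `graphlib` module; it does not reproduce Tangle's YAML component format or its scheduler.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each key is a step, each value is the set of
# steps whose outputs it consumes. These names are illustrative only.
pipeline = {
    "extract_queries": set(),
    "extract_products": set(),
    "build_dataset": {"extract_queries", "extract_products"},
    "train_model": {"build_dataset"},
    "evaluate": {"train_model", "build_dataset"},
}


def execution_order(dag):
    """Return one valid order in which every step follows its dependencies.

    TopologicalSorter raises graphlib.CycleError if the graph has a cycle,
    which is why a pipeline has to be acyclic to be schedulable.
    """
    return list(TopologicalSorter(dag).static_order())


order = execution_order(pipeline)
```

Steps with no dependency between them, such as the two extract steps here, could run in parallel; the sketch only establishes a valid ordering.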
Each task runs in isolation inside a container, which supports deterministic behaviour and enables automatic artefact reuse.
Shopify describes components this way: “Components are designed as pure functions: deterministic… and free from side effects.”
Because caching is based on output content rather than lineage, Tangle reuses identical intermediate results even when only part of a pipeline changes or when another user has already run equivalent steps.
According to Shopify, this leads to significant real-world gains: “A 10-hour pipeline completes in 20 minutes when only one component changes.” It also applies globally: “Tangle’s cache operates globally across all users… all three pipelines share the artefact—even for still-running executions.”
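The reuse described above can be illustrated with a small content-addressed cache. This is a sketch under stated assumptions, not Tangle's implementation: it assumes the cache key is a hash of a component's definition plus the content of the artefacts it consumes, and the names `cache_key`, `SharedCache`, and the component specs are all hypothetical.

```python
import hashlib
import json


def cache_key(component_spec, input_digests):
    """Derive a key from what a step is and what it reads (assumed scheme)."""
    payload = json.dumps(
        {"component": component_spec, "inputs": input_digests}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class SharedCache:
    """In-memory stand-in for a cache shared by every user and pipeline."""

    def __init__(self):
        self._store = {}
        self.executions = 0  # counts steps that actually had to run

    def run(self, component_spec, inputs, fn):
        # Hash the content of each input, not the path that produced it.
        digests = {
            name: hashlib.sha256(data).hexdigest() for name, data in inputs.items()
        }
        key = cache_key(component_spec, digests)
        if key not in self._store:
            self.executions += 1
            self._store[key] = fn(**inputs)
        return self._store[key]


# Hypothetical two-step pipeline: clean the data, then train on it.
def clean(raw):
    return raw.strip().lower()


def train(data):
    return b"model:" + data


cache = SharedCache()
clean_spec = {"name": "clean", "version": 1}
train_v1 = {"name": "train", "learning_rate": 0.1}
train_v2 = {"name": "train", "learning_rate": 0.01}
raw = b"  Some RAW data  "

# First run executes both steps.
cleaned = cache.run(clean_spec, {"raw": raw}, clean)
cache.run(train_v1, {"data": cleaned}, train)

# A second run changes only the training component: cleaning is reused.
cleaned_again = cache.run(clean_spec, {"raw": raw}, clean)
cache.run(train_v2, {"data": cleaned_again}, train)
```

Four steps were requested but only three executed, because the unchanged cleaning step resolved to an existing key. The sketch relies on components being deterministic, and it does not model sharing the results of executions that are still running.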
The platform is designed to work across any language, cloud provider, or on-prem environment. Components can be written in Python, JavaScript, Rust, or anything capable of reading and writing files. This neutrality allows teams to integrate existing code without refactoring.
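As a rough illustration of a program that only reads and writes files, the Python sketch below takes an input path and an output path on the command line. The flag names and the deduplication task are assumptions made for the example; the article does not specify how Tangle passes paths to a component.

```python
import argparse
from pathlib import Path


def main(argv=None):
    """Read lines from one file, write the normalised unique lines to another."""
    parser = argparse.ArgumentParser(
        description="Lowercase and deduplicate lines of text."
    )
    # Hypothetical flag names; a real component would follow its own spec.
    parser.add_argument("--input-path", required=True)
    parser.add_argument("--output-path", required=True)
    args = parser.parse_args(argv)

    lines = Path(args.input_path).read_text(encoding="utf-8").splitlines()
    unique = sorted({line.strip().lower() for line in lines if line.strip()})

    # All results go to the output file; nothing else is modified, so the
    # same input always yields the same output.
    out = Path(args.output_path)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text("\n".join(unique) + "\n", encoding="utf-8")
```

A wrapper script would call `main()` with no arguments so that `argparse` reads the process command line. Because the program communicates only through files, the same contract could be met by a tool written in any other language.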
The visual editor provides real-time visibility into execution status, cached steps, logs, and performance bottlenecks, while every run is stored with complete lineage for reproducibility.
“Tangle is a major piece of our Shopify data and ML system,” said Tobi Lütke, the CEO of Shopify.
“It makes complex things easy and automatically avoids doing things more than once, saving an insane amount of waste.”
The post Shopify Open-Sources ML Platform That Has Already Saved Them Over a Year of Compute appeared first on Analytics India Magazine.