Replicate
Run AI with an API.
Run and fine-tune open-source models. Deploy custom models at scale. All with one line of code.
import replicate

output = replicate.run(
    "stability-ai/sdxl:39ed52f2a78e934b3ba6e2a89f5b1c712de7dfea535525255b1aa35c5565e08b",
    input={
        "prompt": "An astronaut riding a rainbow unicorn, cinematic, dramatic"
    }
)
print(output)

import Replicate from "replicate";

const replicate = new Replicate();
const output = await replicate.run(
  "stability-ai/sdxl:39ed52f2a78e934b3ba6e2a89f5b1c712de7dfea535525255b1aa35c5565e08b",
  {
    input: {
      prompt: "An astronaut riding a rainbow unicorn, cinematic, dramatic"
    }
  }
);
console.log(output);

curl -s -X POST \
  -H "Authorization: Token $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d $'{
    "version": "39ed52f2a78e934b3ba6e2a89f5b1c712de7dfea535525255b1aa35c5565e08b",
    "input": {
      "prompt": "An astronaut riding a rainbow unicorn, cinematic, dramatic"
    }
  }' \
  https://api.replicate.com/v1/predictions

Run stability-ai/sdxl with an API
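The same HTTP request can be assembled with nothing but Python's standard library. The sketch below builds the POST request from the curl example without sending it, so you can inspect the endpoint, headers, and JSON body. The helper name `build_prediction_request` is hypothetical (it is not part of the replicate client), and the token is assumed to live in the `REPLICATE_API_TOKEN` environment variable as in the curl example.

```python
import json
import os
import urllib.request

API_URL = "https://api.replicate.com/v1/predictions"
SDXL_VERSION = "39ed52f2a78e934b3ba6e2a89f5b1c712de7dfea535525255b1aa35c5565e08b"


def build_prediction_request(token, version, model_input):
    """Build (but do not send) the POST request shown in the curl example.

    Hypothetical helper for illustration; not part of the replicate client.
    """
    body = json.dumps({"version": version, "input": model_input}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
    )


request = build_prediction_request(
    # Falls back to a placeholder so the sketch runs without a real token.
    os.environ.get("REPLICATE_API_TOKEN", "<your-token>"),
    SDXL_VERSION,
    {"prompt": "An astronaut riding a rainbow unicorn, cinematic, dramatic"},
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

Nothing is sent until you call `urllib.request.urlopen(request)` with a valid token.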
Thousands of models contributed by our community
All the latest open-source models are on Replicate. They’re not just demos — they all actually work and have production-ready APIs.
AI shouldn’t be locked up inside academic papers and demos. Make it real by pushing it to Replicate.
How it works
You can get started with any open-source model with just one line of code. As your needs grow more complex, you can fine-tune models or deploy your own custom code.
Run open-source models
Our community has already published thousands of models that are ready to use in production. You can run these with one line of code.
import replicate

output = replicate.run(
    "stability-ai/sdxl:39ed52f2a78e934b3ba6e2a89f5b1c712de7dfea535525255b1aa35c5565e08b",
    input={
        "width": 768,
        "height": 768,
        "prompt": "An astronaut riding a rainbow unicorn, cinematic, dramatic",
        "refine": "expert_ensemble_refiner",
        "scheduler": "K_EULER",
    }
)
print(output)

Fine-tune models with your own data
You can improve open-source models with your own data to create new models that are better suited to specific tasks.
Image models like SDXL can generate images of a particular person, object, or style.
Language models like Llama 2 can generate text in a specific style or get better at a particular task.
Train a model:
import replicate

training = replicate.trainings.create(
    version="stability-ai/sdxl:c221b2b8ef527988fb59bf24a8b97c4561f1c671f73bd389f866bfb27c061316",
    input={
        "input_images": "https://my-domain/my-input-images.zip",
    },
    destination="mattrothenberg/sdxl-fine-tuned"
)
print(training)

This will result in a new model:

mattrothenberg/sdxl-fine-tuned
A very special, fine-tuned version of SDXL
0 runs
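Training takes a while, so in practice you wait for it to finish before running the new model. The sketch below polls until a terminal status is reached. It assumes the training object exposes a `.status` attribute and a `.reload()` method, as the replicate Python client's objects do; `wait_for_training` and `FakeTraining` are hypothetical names, and the fake object only exists so the example runs offline.

```python
import time

# Statuses after which the training will not change any further.
TERMINAL_STATUSES = {"succeeded", "failed", "canceled"}


def wait_for_training(training, poll_seconds=10.0, sleep=time.sleep):
    """Poll a training-like object until it reaches a terminal status.

    Hypothetical helper: assumes `training` has `.status` and `.reload()`.
    """
    while training.status not in TERMINAL_STATUSES:
        sleep(poll_seconds)
        training.reload()  # refresh the status from the server
    return training.status


class FakeTraining:
    """Offline stand-in that reports success after two reloads."""

    def __init__(self):
        self._statuses = iter(["processing", "succeeded"])
        self.status = "starting"

    def reload(self):
        self.status = next(self._statuses)


final_status = wait_for_training(FakeTraining(), poll_seconds=0)
print(final_status)
```

With the real client you would pass the `training` object returned by `replicate.trainings.create(...)` instead of the stand-in.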

Then, you can run it with one line of code:
import replicate

output = replicate.run(
    "mattrothenberg/sdxl-fine-tuned:abcde1234...",
    input={"prompt": "a photo of TOK riding a rainbow unicorn"},
)

Deploy custom models
You aren’t limited to the models on Replicate: you can deploy your own custom models using Cog, our open-source tool for packaging machine learning models.
Cog takes care of generating an API server and deploying it on a big cluster in the cloud. We scale up and down to handle demand, and you only pay for the compute that you use.
First, define the environment your model runs in with cog.yaml:
build:
  gpu: true
  system_packages:
    - "libgl1-mesa-glx"
    - "libglib2.0-0"
  python_version: "3.10"
  python_packages:
    - "torch==1.13.1"
predict: "predict.py:Predictor"

Next, define how predictions are run on your model with predict.py:
from cog import BasePredictor, Input, Path
import torch


class Predictor(BasePredictor):
    def setup(self):
        """Load the model into memory to make running multiple predictions efficient"""
        self.model = torch.load("./weights.pth")

    # The arguments and types the model takes as input
    def predict(self,
          image: Path = Input(description="Grayscale input image")
    ) -> Path:
        """Run a single prediction on the model"""
        # preprocess() and postprocess() stand in for your own helper functions
        processed_image = preprocess(image)
        output = self.model(processed_image)
        return postprocess(output)

Scale on Replicate
Thousands of businesses are building their AI products on Replicate. Your team can deploy an AI feature in a day and scale to millions of users, without having to be machine learning experts.

Automatic scale
If you get a ton of traffic, Replicate scales up automatically to handle the demand. If you don't get any traffic, we scale down to zero and don't charge you a thing.
- CPU $0.000100/sec
- Nvidia T4 GPU $0.000225/sec
- Nvidia A40 GPU $0.000575/sec
- Nvidia A40 (Large) GPU $0.000725/sec
- Nvidia A100 (40GB) GPU $0.001150/sec
- Nvidia A100 (80GB) GPU $0.001400/sec
- 8x Nvidia A40 (Large) GPU $0.005800/sec
- Learn more about pricing
Pay for what you use
Replicate only bills you for how long your code is running. You don't pay for expensive GPUs when you're not using them.
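Because billing is per second of run time, a rough cost estimate is just price times duration times number of runs. The sketch below uses the per-second prices from the list above. The dictionary keys and the `estimate_cost` helper are hypothetical names chosen for this example, and the run time and run count are made-up inputs.

```python
# Per-second prices in USD, copied from the price list above.
# The keys are hypothetical labels, not official hardware identifiers.
PRICE_PER_SECOND = {
    "cpu": 0.000100,
    "t4": 0.000225,
    "a40": 0.000575,
    "a40-large": 0.000725,
    "a100-40gb": 0.001150,
    "a100-80gb": 0.001400,
    "8x-a40-large": 0.005800,
}


def estimate_cost(hardware, seconds_per_run, runs):
    """Estimate the total cost in USD for `runs` predictions."""
    return PRICE_PER_SECOND[hardware] * seconds_per_run * runs


# Example: 1,000 predictions of 10 seconds each on an A40 (Large).
cost = estimate_cost("a40-large", seconds_per_run=10, runs=1000)
print(f"${cost:.2f}")
```

The estimate covers only the time your code is running; zero runs cost nothing.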
Forget about infrastructure
Deploying machine learning models at scale is hard. If you've tried, you know. API servers, weird dependencies, enormous model weights, CUDA, GPUs, batching.
Prediction throughput (requests per second)
Logging & monitoring
Metrics let you keep an eye on how your models are performing, and logs let you zoom in on particular predictions to debug how your model is behaving.
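As a sketch of what zooming in on one prediction could look like in code, the helper below pulls the status, run time, and last few log lines out of a prediction-like object. It assumes the object exposes `status`, `logs` (a string), and `metrics` (a mapping with a `predict_time` entry); verify those fields against the objects your client version returns. `summarize_prediction` is a hypothetical helper, and the example data is invented so the code runs offline.

```python
from types import SimpleNamespace


def summarize_prediction(prediction, tail_lines=3):
    """Return status, run time, and the last log lines of a prediction.

    Hypothetical helper: assumes `status`, `logs`, and `metrics` attributes.
    """
    metrics = getattr(prediction, "metrics", None) or {}
    logs = getattr(prediction, "logs", None) or ""
    return {
        "status": prediction.status,
        "predict_time": metrics.get("predict_time"),
        "log_tail": logs.splitlines()[-tail_lines:],
    }


# Invented stand-in for a finished prediction.
example = SimpleNamespace(
    status="succeeded",
    metrics={"predict_time": 4.2},
    logs="loading weights\nstep 1/2\nstep 2/2\ndone",
)
summary = summarize_prediction(example)
print(summary)
```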
Imagine what you can build
- Autonomous Robots: Zero-shot autonomous robots with open source models
- Paint with AI: An iPad app that lets you paint with AI
- emojis.sh: AI Emojis
- Replicover: Find the hottest AI models on Replicate
- Language Model CLI: Language model command line interface
With Replicate and tools like Next.js and Vercel, you can wake up with an idea and watch it hit the front page of Hacker News by the time you go to bed.
Get started
Machine learning doesn’t need to be so hard.