metadata
license: apache-2.0
library_name: diffusers
pipeline_tag: image-to-image
tags:
- optical-flow prediction
- motion prediction
- diffusion
FOFPred: Language-Driven Future Optical Flow Prediction
FOFPred is a diffusion-based model that predicts future optical flow from a single image, guided by a natural language instruction. Given an input image and a text prompt describing the desired action (e.g., "Moving the water bottle from right to left"), it generates four sequential optical flow frames that show how objects in the scene would move.
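For readers new to optical flow, the sketch below illustrates the general idea of a flow field as a per-pixel displacement, using a toy array. It is a generic illustration, not FOFPred's output format: the `[H, W, 2]` layout with `(dx, dy)` in pixels and the helper name `flow_magnitude_and_angle` are assumptions made for this example.

```python
import numpy as np


def flow_magnitude_and_angle(flow):
    """Split a flow field into per-pixel speed and direction.

    flow: array of shape [H, W, 2] holding (dx, dy) displacements in pixels.
    This layout is an assumption for illustration only.
    """
    dx, dy = flow[..., 0], flow[..., 1]
    magnitude = np.hypot(dx, dy)   # displacement length in pixels
    angle = np.arctan2(dy, dx)     # direction in radians
    return magnitude, angle


# Toy field: every pixel moves 3 px towards negative x,
# loosely matching a "right to left" instruction.
flow = np.zeros((4, 4, 2), dtype=np.float32)
flow[..., 0] = -3.0

magnitude, angle = flow_magnitude_and_angle(flow)
```

A sequence of such fields, one per future time step, describes how the scene is expected to move over time.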
Usage
import torch
from PIL import Image

from fofpred.pipelines.fofpred.pipeline_fofpred import FOFPredPipeline
from fofpred.schedulers.scheduling_flow_match_euler_discrete import FlowMatchEulerDiscreteScheduler

# Load the pipeline in bfloat16 and move it to the GPU.
pipeline = FOFPredPipeline.from_pretrained(
    "Salesforce/FOFPred",
    torch_dtype=torch.bfloat16,
).to("cuda")

# Use the flow-matching Euler scheduler.
pipeline.scheduler = FlowMatchEulerDiscreteScheduler()

results = pipeline(
    prompt="Moving the water bottle from right to left.",
    input_images=[Image.open("your_image.jpg")],
    width=256,
    height=256,
    num_inference_steps=1,
    num_images_per_prompt=4,
    frame_count=4,
    # Fixed seed for reproducible outputs.
    generator=torch.Generator(device="cuda").manual_seed(42),
    # Return torch tensors rather than PIL images.
    output_type="pt",
)

flow_frames = results.images  # [B, F, C, H, W]
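To inspect or save the predicted frames, the `[B, F, C, H, W]` tensor can be rearranged into channels-last `uint8` images. The sketch below is one way to do this with NumPy. It assumes the returned values lie in `[0, 1]`, which this model card does not state, and `frames_to_uint8` is a hypothetical helper, not part of the FOFPred package.

```python
import numpy as np


def frames_to_uint8(frames):
    """Convert a [B, F, C, H, W] float array to [B, F, H, W, C] uint8.

    Assumes values are in [0, 1]; anything outside that range is clipped.
    """
    frames = np.asarray(frames, dtype=np.float32)
    if frames.ndim != 5:
        raise ValueError(f"expected [B, F, C, H, W], got shape {frames.shape}")
    frames = np.clip(frames, 0.0, 1.0)
    frames = np.transpose(frames, (0, 1, 3, 4, 2))  # move channels last
    return np.round(frames * 255.0).astype(np.uint8)


# Stand-in for the pipeline output: 1 sample, 4 frames, 3 channels, 8x8 pixels.
dummy = np.random.default_rng(0).random((1, 4, 3, 8, 8), dtype=np.float32)
images = frames_to_uint8(dummy)
```

With the real output, the tensor would first need to be moved off the GPU and cast, for example `frames_to_uint8(flow_frames.float().cpu().numpy())`, since bfloat16 tensors cannot be converted to NumPy directly. If the frames have three channels, each `images[b, f]` can then be passed to `PIL.Image.fromarray` for saving.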
Architecture
| Component | Model |
|---|---|
| V-LLM | Qwen2.5-VL-3B-Instruct |
| DiT | OmniGen2Transformer3DModel |
| VAE | FLUX.1-dev AutoencoderKL |
| Scheduler | FlowMatchEulerDiscreteScheduler |
Acknowledgements
License
Apache 2.0 — Copyright (c) 2025 Salesforce, Inc.