Independent guide

SANA-WM

Minute-scale world modeling with camera-controlled 720p video.

A fast answer to what SANA-WM is, what it can do, and where to find the official demos, paper, and code.

2.6B parameters
720p minute-scale video
6-DoF camera control
34s distilled 60s clip on RTX 5090

What is SANA-WM?

A world model built for longer, controllable video instead of short visual tricks.

SANA-WM is an open-source world model from NVIDIA Research that turns one image plus a camera trajectory into minute-scale video. Its core promise is not just longer generation, but longer generation that still respects spatial structure and camera motion.

If you searched the term because it suddenly appeared in research news, the useful answer is simple: this is a model aimed at minute-long 720p worlds with precise camera control, not another ordinary short-form video generator.

Why people are searching for it now

A new kind of search result

People are not only asking what the model is. They want to know whether the long-video claim is real, where the examples are, and what to read next.

Minute-scale output is the headline

SANA-WM targets a full minute of controllable video, which is materially different from the short clips most users associate with video generation.

The hardware claim is unusually concrete

The paper reports single-GPU generation and a distilled variant that denoises a 60-second 720p clip in about 34 seconds on one RTX 5090.
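To put the distilled figure in perspective, the short calculation below converts it into a real-time factor. It uses only the two numbers quoted above (60 seconds of video, about 34 seconds of denoising). The function name `realtime_factor` is made up for this guide, and because the quoted figure covers denoising, end-to-end time may differ.

```python
def realtime_factor(clip_seconds: float, wall_seconds: float) -> float:
    """Seconds of video produced per second of compute time."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return clip_seconds / wall_seconds


# Figures quoted in this guide: a 60 s clip denoised in about 34 s.
factor = realtime_factor(60.0, 34.0)
print(f"{factor:.2f}x real time")  # prints "1.76x real time"
```

A factor above 1 means the denoising step finishes faster than the clip takes to play.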

What it can do

Long-horizon worlds

Generate minute-scale scenes that stay coherent across a longer camera path.

Precise camera motion

Follow 6-DoF trajectories instead of only producing unconstrained cinematic motion.

Higher-throughput evaluation

The paper reports visual quality comparable to industrial baselines at 36x higher throughput on its benchmark.

Open research footing

The project page, paper, and code repository are already public, making the model easier to inspect than a closed demo.

Official examples

See the model before reading the paper.

Lantern forest
Snowy shrine
Desert canyon
Rain-soaked city

Media on this page is adapted from the official SANA-WM project gallery and re-hosted here for faster browsing.

How it works

One image, one camera path, then a long controlled rollout.

  1. Start with a still image

    The model takes an initial frame as the visual anchor for the world.

  2. Add a camera path

    A 6-DoF trajectory tells the model where the virtual camera should move.

  3. Roll out the world

    Hybrid linear attention keeps the long sequence tractable while preserving scene continuity.

  4. Refine the result

    A second-stage long-video refiner improves texture, motion, and later-frame quality.
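Step 3 mentions linear attention. As a rough illustration of why that family of methods keeps long sequences tractable, here is a generic causal linear attention sketch in plain Python. It is not SANA-WM's implementation, and how the model's hybrid design combines linear attention with other components is not covered here. The `elu(x) + 1` feature map and all function names are assumptions chosen for illustration.

```python
import math


def feature_map(vec):
    """elu(x) + 1: a common positive feature map (an assumption here)."""
    return [x + 1.0 if x > 0 else math.exp(x) for x in vec]


def causal_linear_attention(queries, keys, values):
    """Attend over past tokens with a fixed-size running state.

    queries, keys: lists of T vectors of length d.
    values: list of T vectors of length d_v.
    Cost grows linearly with T because each step only updates and reads
    a (d x d_v) state, instead of revisiting every earlier token.
    """
    d, d_v = len(keys[0]), len(values[0])
    state = [[0.0] * d_v for _ in range(d)]  # running sum of k_t (outer) v_t
    norm = [0.0] * d                         # running sum of k_t
    outputs = []
    for q, k, v in zip(queries, keys, values):
        q, k = feature_map(q), feature_map(k)
        for i in range(d):
            norm[i] += k[i]
            for j in range(d_v):
                state[i][j] += k[i] * v[j]
        denom = sum(q[i] * norm[i] for i in range(d))  # positive by construction
        outputs.append(
            [sum(q[i] * state[i][j] for i in range(d)) / denom for j in range(d_v)]
        )
    return outputs


# Tiny usage example with three tokens.
qs = [[0.1, -0.2], [0.3, 0.4], [-0.5, 0.6]]
ks = [[0.2, 0.1], [-0.3, 0.5], [0.7, -0.1]]
vs = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
out = causal_linear_attention(qs, ks, vs)
print(len(out), len(out[0]))
```

The memory used per step stays constant no matter how many frames came before, which is the property that matters for minute-scale rollouts. Real implementations use a tensor library rather than Python lists.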

Efficiency

Why the hardware claim attracted attention.

Official SANA-WM efficiency chart comparing latency and GPU memory scaling, from the SANA-WM project page.

Hardware and official resources

Use the official sources for the latest release state.

Paper

Read the full method, benchmark, and hardware details in the official paper.

Project page

See the official hero reel, galleries, figures, and citation information.

Code

Track the official SANA repository for implementation details and future updates.

FAQ

Answers to the first questions people usually ask.

What is SANA-WM?

SANA-WM is an open-source world model from NVIDIA Research built for minute-scale, camera-controlled video generation.

Can SANA-WM generate one-minute videos?

Yes. The paper describes native one-minute generation at 720p resolution.

What makes SANA-WM different from ordinary video generators?

Its focus is not only visual quality. It is designed around long-horizon consistency and precise 6-DoF camera control over a minute-scale rollout.

Does SANA-WM support camera control?

Yes. The model uses dual-branch camera control to follow metric 6-DoF trajectories.
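For readers new to the term, a 6-DoF pose has three translation components and three rotation components. The sketch below shows one simple way to represent such a trajectory in Python. It is purely illustrative: the `CameraPose` class, the `linear_path` helper, the yaw/pitch/roll convention, and the units (meters and radians) are assumptions made for this guide, not the input format SANA-WM expects.

```python
from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class CameraPose:
    """One 6-DoF pose: 3 translation values plus 3 rotation values."""
    x: float      # meters (assumed unit)
    y: float
    z: float
    yaw: float    # radians (assumed convention)
    pitch: float
    roll: float


def linear_path(start: CameraPose, end: CameraPose, num_poses: int):
    """Evenly interpolate every component from start to end."""
    if num_poses < 2:
        raise ValueError("num_poses must be at least 2")
    a, b = astuple(start), astuple(end)
    path = []
    for i in range(num_poses):
        t = i / (num_poses - 1)
        path.append(CameraPose(*(p + t * (q - p) for p, q in zip(a, b))))
    return path


# Hypothetical move: travel 6 m forward while panning 0.5 rad to one side.
start = CameraPose(0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
end = CameraPose(0.0, 0.0, 6.0, 0.5, 0.0, 0.0)
path = linear_path(start, end, 5)
print(path[2])  # midpoint pose: z=3.0, yaw=0.25
```

Interpolating angles component by component is only reasonable for small, simple moves like this one. Production camera tools typically use rotation matrices or quaternions instead.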

Can SANA-WM run on a single GPU?

The paper reports that each 60-second clip can be generated on a single GPU, and that a distilled variant denoises a 60-second 720p clip in about 34 seconds on one RTX 5090.

Is SANA-WM open source?

The paper describes SANA-WM as open source, and the broader SANA codebase is public on GitHub. For the latest release state, use the official project links on this page.

Are the model weights available now?

Availability can change quickly for new research releases. Check the official project page and repository for the current model-release status.

Where should I start if I want the official details?

Open the official project page for demos, the arXiv paper for the method, and the NVlabs/Sana repository for code updates.

Sources and attribution

Independent guide, official sources.

This site is not affiliated with NVIDIA Research. It summarizes public information and points visitors back to the original project materials.

Last updated: May 17, 2026