Independent guide
Minute-scale world modeling with camera-controlled 720p video.
A fast answer to what SANA-WM is, what it can do, and where to find the official demos, paper, and code.
What is SANA-WM?
SANA-WM is an open-source world model from NVIDIA Research that turns one image plus a camera trajectory into minute-scale video. Its core promise is not just longer generation, but longer generation that still respects spatial structure and camera motion.
If you searched the term because it suddenly appeared in research news, the useful answer is simple: this is a model aimed at minute-long 720p worlds with precise camera control, not another ordinary short-form video generator.
Why people are searching for it now
People are not only asking what the model is. They want to know whether the long-video claim is real, where the examples are, and what to read next.
SANA-WM targets a full minute of controllable video, which is materially different from the short clips most users associate with video generation.
The paper reports single-GPU generation and a distilled variant that denoises a 60-second 720p clip in about 34 seconds on one RTX 5090.
What it can do
Generate minute-scale scenes that stay coherent across a longer camera path.
Follow 6-DoF trajectories instead of only producing unconstrained cinematic motion.
The paper reports comparable visual quality to industrial baselines with 36x higher throughput on its benchmark.
The project page, paper, and code repository are already public, making the model easier to inspect than a closed demo.
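The official input format is not documented on this page, but the "6-DoF trajectory" idea itself is simple: a sequence of camera poses, each with three translation and three rotation components. The sketch below is purely illustrative and uses hypothetical names, not the SANA-WM API.

```python
from dataclasses import dataclass

@dataclass
class CameraPose:
    """One 6-DoF camera pose: translation (x, y, z) plus rotation
    (roll, pitch, yaw) in radians."""
    x: float
    y: float
    z: float
    roll: float
    pitch: float
    yaw: float

def interpolate(a: CameraPose, b: CameraPose, t: float) -> CameraPose:
    """Linearly interpolate between two keyframe poses, a simple stand-in
    for however a real trajectory would be densified per frame."""
    lerp = lambda u, v: u + (v - u) * t
    return CameraPose(*(lerp(getattr(a, f), getattr(b, f))
                        for f in ("x", "y", "z", "roll", "pitch", "yaw")))

# A two-keyframe dolly-forward trajectory, sampled at its midpoint.
start = CameraPose(0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
end = CameraPose(0.0, 0.0, 4.0, 0.0, 0.0, 0.0)
mid = interpolate(start, end, 0.5)
print(mid.z)  # 2.0
```

The point is only that "metric 6-DoF control" means the model consumes explicit pose values like these, rather than a free-text motion prompt.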
Official examples
Media on this page is adapted from the official SANA-WM project gallery and re-hosted here for faster browsing.
How it works
The model takes an initial frame as the visual anchor for the world.
A 6-DoF trajectory tells the model where the virtual camera should move.
Hybrid linear attention keeps the long sequence tractable while preserving scene continuity.
A second-stage long-video refiner improves texture, motion, and later-frame quality.
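The four steps above can be read as a staged pipeline. The toy sketch below only mirrors that structure; every function name is hypothetical, and none of the real model's internals are reproduced.

```python
# Illustrative staging only: anchor frame -> pose-conditioned rollout -> refiner.

def encode_anchor(frame):
    """Step 1: the initial frame anchors the world state."""
    return {"world": frame, "frames": []}

def rollout(state, trajectory):
    """Steps 2-3: generate one frame per 6-DoF pose. (In the real model this
    is where hybrid linear attention keeps the long sequence tractable.)"""
    for pose in trajectory:
        state["frames"].append((state["world"], pose))
    return state

def refine(state):
    """Step 4: a second-stage pass over the draft frames to improve texture,
    motion, and later-frame quality."""
    return [("refined", frame) for frame in state["frames"]]

video = refine(rollout(encode_anchor("frame0"), ["pose0", "pose1", "pose2"]))
print(len(video))  # 3
```

The structural takeaway is that camera control and long-horizon consistency are handled during the rollout, while visual quality gets a dedicated second stage.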
Efficiency
The efficiency story has two parts. Hybrid linear attention keeps per-step cost from growing quadratically with sequence length, which is what makes a minute-long 720p rollout practical at all. On top of that, the paper reports a distilled variant that denoises a 60-second 720p clip in about 34 seconds on one RTX 5090, with roughly 36x the throughput of the industrial baselines on its benchmark.
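To see why linear attention matters for minute-long sequences, here is a minimal kernelized-attention sketch. It uses the common elu(x)+1 feature map as an assumption; SANA-WM's hybrid variant is not reproduced here.

```python
import numpy as np

def linear_attention(Q, K, V, eps=1e-6):
    """Kernelized (linear) attention: compute phi(Q) @ (phi(K).T @ V).
    The (d, d_v) summary phi(K).T @ V is independent of sequence length n,
    so cost scales as O(n * d * d_v) rather than the O(n^2 * d) of
    materializing the full n x n attention matrix."""
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # elu(x) + 1 > 0
    Qf, Kf = phi(Q), phi(K)
    kv = Kf.T @ V                      # fixed-size summary of keys/values
    z = Qf @ Kf.sum(axis=0) + eps      # per-query normalizer, shape (n,)
    return (Qf @ kv) / z[:, None]

rng = np.random.default_rng(0)
n, d = 8, 4
Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
out = linear_attention(Q, K, V)
print(out.shape)  # (8, 4)
```

Because the feature map is positive, each output row is a convex combination of value rows, just like softmax attention, but without ever forming the quadratic attention matrix. That constant-size key-value summary is the basic reason linear-attention variants stay tractable at minute-scale sequence lengths.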
Hardware and official resources
Read the full method, benchmark, and hardware details in the official paper.
See the official hero reel, galleries, figures, and citation information on the project page.
Track the official SANA repository for implementation details and future updates.
FAQ
What is SANA-WM?
SANA-WM is an open-source world model from NVIDIA Research built for minute-scale, camera-controlled video generation.
Can it really generate minute-long video?
Yes. The paper describes native one-minute generation at 720p resolution.
How is it different from other video generators?
Its focus is not only visual quality. It is designed around long-horizon consistency and precise 6-DoF camera control over a minute-scale rollout.
Does it support camera control?
Yes. The model uses dual-branch camera control to follow metric 6-DoF trajectories.
What hardware does it need?
The paper reports that each 60-second clip can be generated on a single GPU, and that a distilled variant denoises a 60-second 720p clip in about 34 seconds on one RTX 5090.
Is it open source?
The paper describes SANA-WM as open source, and the broader SANA codebase is public on GitHub. For the latest release state, use the official project links on this page.
Are the model weights available now?
Availability can change quickly for new research releases. Check the official project page and repository for the current model-release status.
Where should I start?
Open the official project page for demos, the arXiv paper for the method, and the NVlabs/Sana repository for code updates.
Sources and attribution
This site is not affiliated with NVIDIA Research. It summarizes public information and points visitors back to the original project materials.
Last updated: May 17, 2026