DragNUWA Online


Frequently asked questions
If you can't find what you're looking for, email our support team and we'll get back to you as soon as we can.
What are the key innovations of DragNUWA compared to previous controllable video generation models?
DragNUWA introduces simultaneous text, image, and trajectory control to enable more fine-grained control over video content from semantic, spatial, and temporal perspectives. It also enables open-domain trajectory control through innovations like the Trajectory Sampler, Multiscale Fusion, and Adaptive Training.
How does DragNUWA achieve fine-grained trajectory control in videos?
DragNUWA's Trajectory Sampler supports arbitrary, open-domain trajectories; Multiscale Fusion controls trajectories at different granularities; and the Adaptive Training strategy keeps the generated videos consistent with the input trajectories.
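As a rough illustration of the trajectory idea, a sparse user-drawn drag path can be densified and then resampled at several granularities for fusion at different scales. This is a minimal sketch with hypothetical names (`densify`, `multiscale`), not DragNUWA's actual implementation or API:

```python
# Hypothetical sketch: densify a sparse (x, y) drag path, then
# produce versions at several granularities for multiscale fusion.
# Function names and shapes are illustrative, not DragNUWA's real code.

def densify(points, n):
    """Linearly interpolate a sparse (x, y) path to n points."""
    if len(points) == 1:
        return points * n
    out = []
    for i in range(n):
        t = i * (len(points) - 1) / (n - 1)
        j = min(int(t), len(points) - 2)
        frac = t - j
        (x0, y0), (x1, y1) = points[j], points[j + 1]
        out.append((x0 + frac * (x1 - x0), y0 + frac * (y1 - y0)))
    return out

def multiscale(traj, scales=(16, 8, 4)):
    """Return the trajectory at several granularities (fine to coarse)."""
    return {n: densify(traj, n) for n in scales}

drag = [(10, 20), (40, 35), (90, 30)]   # sparse points from a user drag
levels = multiscale(densify(drag, 16))
```

The coarse levels would condition early, low-resolution stages of generation, while the fine levels condition later stages.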
What datasets was DragNUWA trained and evaluated on?
DragNUWA was trained and evaluated on complex open-domain image datasets, overcoming the limitations of previous works that relied on simpler datasets like Human3.6M.
Does DragNUWA allow interactive editing of generated videos?
Yes, DragNUWA enables intuitive interactive editing of generated videos by modifying the text prompts, input images, and trajectories. This allows fine-grained control over the video content.
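Conceptually, interactive editing means each conditioning input can be swapped independently and the video regenerated. The sketch below uses a hypothetical `generate_video` stand-in to show the control flow; it is not DragNUWA's real interface:

```python
# Illustrative sketch of interactive editing: text, image, and
# trajectory conditioning can each be changed independently.
# `generate_video` is a hypothetical stand-in, not DragNUWA's API.

def generate_video(text, image, trajectory):
    # A real model would return video frames; here we just summarize
    # the conditioning so the edit-and-regenerate loop is visible.
    return f"video(text={text!r}, image={image}, {len(trajectory)} traj points)"

conditioning = {
    "text": "a boat sailing on a lake",
    "image": "lake.png",
    "trajectory": [(0.1, 0.5), (0.5, 0.5), (0.9, 0.4)],
}

v1 = generate_video(**conditioning)

# Edit only the motion: keep the text and image, change the drag path.
conditioning["trajectory"] = [(0.1, 0.5), (0.5, 0.2), (0.9, 0.5)]
v2 = generate_video(**conditioning)
```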
How realistic and high-quality are the videos generated by DragNUWA?
DragNUWA produces realistic, high-quality videos that closely follow the given text, image, and trajectory inputs. Both quantitative metrics and human evaluations in the paper validate the realism of its video generation.