Wan 2.1 is a video foundation model aimed at raising the bar for AI-assisted video production. It combines a 3D VAE architecture with a diffusion transformer and is designed to run well on consumer-grade GPUs. The model handles both text-to-video and image-to-video generation, and it is among the first video models to support generating English and Chinese text within the video itself.