When discussing generative AI, there is just one topic that dominates the news: ChatGPT. But the field of generative AI has much more to offer than ChatGPT-like language models. Text-to-video generative AI is cooking in the background, while text-to-image generation is already a common topic of discourse.
AI that converts text to videos.
Simply put, you can produce AI-generated videos using nothing but your own words. Yes, it is precisely what it sounds like: you enter text, and an AI model generates a video based on it. US-based firm Runway has demonstrated its Gen-2 model, which can do this, with some limitations.
Is this something “new”?
Not entirely. It is similar to DALL-E, the text-to-image model from OpenAI, the developer of ChatGPT: both use generative AI to turn a text prompt into visual output. The results are compelling enough to pique the interest of many people throughout the world.
Is “Big Tech” involved in text-to-video?
It certainly is. Back in September 2022, Meta debuted a tool aptly titled Make-A-Video. Make-A-Video generates videos from a few words or lines of text using generative AI, although the videos are silent. Mark Zuckerberg, CEO of Meta, stated, “It’s significantly more difficult to create video than still images because, in addition to precisely creating each pixel, the system must also forecast how they’ll change over time.”
A week later, on cue, Google launched a comparable model. Google’s generative AI model is dubbed Imagen Video. Google characterized it as follows: “Given a text input, Imagen Video makes high definition films using a basic video production model and a sequence of interleaved spatial and temporal video super-resolution models.”
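To make Google's description more concrete, here is a minimal sketch of how such a cascade might grow a video stage by stage. It only tracks the shape of the video (frame count, height, width) and contains no actual model. The base resolution, the upscaling factors, the stage order, and every name in the code are illustrative assumptions, not Imagen Video's real configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoShape:
    """Size of a video clip: number of frames and pixel dimensions."""
    frames: int
    height: int
    width: int


def spatial_sr(shape: VideoShape, factor: int) -> VideoShape:
    # Spatial super-resolution: same frame count, sharper (larger) frames.
    return VideoShape(shape.frames, shape.height * factor, shape.width * factor)


def temporal_sr(shape: VideoShape, factor: int) -> VideoShape:
    # Temporal super-resolution: same frame size, more frames in between.
    return VideoShape(shape.frames * factor, shape.height, shape.width)


def run_cascade(base: VideoShape, stages: list[tuple[str, int]]) -> list[VideoShape]:
    """Apply each stage in order and return the shape after every step."""
    shapes = [base]
    for kind, factor in stages:
        if kind == "spatial":
            shapes.append(spatial_sr(shapes[-1], factor))
        elif kind == "temporal":
            shapes.append(temporal_sr(shapes[-1], factor))
        else:
            raise ValueError(f"unknown stage kind: {kind}")
    return shapes


# Hypothetical numbers chosen for illustration only.
BASE = VideoShape(frames=16, height=24, width=40)
STAGES = [("spatial", 2), ("temporal", 2), ("spatial", 2), ("temporal", 2)]

if __name__ == "__main__":
    for step in run_cascade(BASE, STAGES):
        print(f"{step.frames} frames at {step.width}x{step.height}")
```

The point of the interleaving is that a small base model only has to produce a short, low-resolution clip, while later stages alternately add detail within each frame and add frames between existing ones.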
Google also presented the Phenaki model, which is designed to generate long-form films based on text inputs.
What obstacles does text-to-video AI face?
Many, ranging from operational to ethical. Perhaps this is one of the reasons why text-to-video generative AI models exist only as demonstrations so far. At first glance, creating a video from text may appear absurdly simple and equally exciting, but consider what it means to describe an entire video with text alone. Prompts need to be quite specific, or else the model may output visual nonsense.
Then come the ethical difficulties. Videos made by AI might be the new weapon in the armory of disinformation. Deepfakes might become an even greater issue than they are today.
With the rapid advancements in AI, it may only be a matter of time until text-to-video moves out of the experimental phase and into the mainstream.