QuickVid uses AI to generate short-form videos, complete with voiceovers

Generative AI is coming for videos. A new website, QuickVid, combines several generative AI systems into a single tool for automatically creating short-form YouTube, Instagram, TikTok and Snapchat videos.

Given as little as a single word, QuickVid chooses a background video from a library, writes a script and keywords, overlays images generated by DALL-E 2 and adds a synthetic voiceover and background music from YouTube’s royalty-free music library. QuickVid’s creator, Daniel Habib, says that he’s building the service to help creators meet the “ever-growing” demand from their fans.

“By providing creators with tools to quickly and easily produce quality content, QuickVid helps creators increase their content output, reducing the risk of burnout,” Habib told TechCrunch in an email interview. “Our goal is to empower your favorite creator to keep up with the demands of their audience by leveraging advancements in AI.”

But depending on how they’re used, tools like QuickVid threaten to flood already-crowded channels with spammy and duplicative content. They also face potential backlash from creators who opt not to use the tools, whether because of cost ($10 per month) or on principle, but who might still have to compete with a raft of new AI-generated videos.

Going after video

Habib, a self-taught developer who previously worked at Meta on Facebook Live and video infrastructure, built QuickVid in a matter of weeks and launched it on December 27. It’s relatively bare bones at present (Habib says that more personalization options will arrive in January), but QuickVid can cobble together the components that make up a typical informational YouTube Short or TikTok video, including captions and even avatars.

It’s easy to use. First, a user enters a prompt describing the subject matter of the video they want to create. QuickVid uses the prompt to generate a script, leveraging the generative text powers of GPT-3. From keywords either extracted from the script automatically or entered manually, QuickVid selects a background video from the royalty-free stock media library Pexels and generates overlay images using DALL-E 2. It then outputs a voiceover via Google Cloud’s text-to-speech API — Habib says that users will soon be able to clone their voice — before combining all these elements into a video.
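To make that flow concrete, here is a minimal sketch of how such a pipeline might be wired together. It is an illustration under stated assumptions, not QuickVid's actual code: every name in it (`VideoPlan`, `Stages`, `build_plan`, `demo_stages`) is hypothetical, and the stage functions are offline stand-ins for the external services named above rather than real API calls. The final step of rendering the pieces into a video file is left out.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class VideoPlan:
    """The assets gathered for one short video, before rendering."""
    script: str
    keywords: List[str]
    background_video: str
    overlay_images: List[str]
    voiceover: str


@dataclass
class Stages:
    """Hypothetical stage functions; each would wrap an external service."""
    write_script: Callable[[str], str]               # text model
    extract_keywords: Callable[[str], List[str]]     # keyword extraction
    pick_background: Callable[[List[str]], str]      # stock video lookup
    make_overlays: Callable[[List[str]], List[str]]  # image generation
    synthesize_voice: Callable[[str], str]           # text-to-speech


def build_plan(
    prompt: str,
    stages: Stages,
    manual_keywords: Optional[List[str]] = None,
) -> VideoPlan:
    """Run the stages in the order the article describes."""
    script = stages.write_script(prompt)
    # Keywords are either entered manually or extracted from the script.
    keywords = manual_keywords if manual_keywords else stages.extract_keywords(script)
    return VideoPlan(
        script=script,
        keywords=keywords,
        background_video=stages.pick_background(keywords),
        overlay_images=stages.make_overlays(keywords),
        voiceover=stages.synthesize_voice(script),
    )


def demo_stages() -> Stages:
    """Offline stand-ins so the sketch runs without any network access."""
    return Stages(
        write_script=lambda prompt: f"Three facts about {prompt}.",
        # Toy extraction: treat the last word of the script as the keyword.
        extract_keywords=lambda s: [w.strip(".,").lower() for w in s.split()][-1:],
        pick_background=lambda kws: f"stock/{kws[0]}.mp4",
        make_overlays=lambda kws: [f"overlay/{k}.png" for k in kws],
        synthesize_voice=lambda s: f"voice/{len(s)}-chars.wav",
    )


if __name__ == "__main__":
    print(build_plan("cats", demo_stages()))
```

Passing the stages in as plain functions keeps the ordering logic separate from any one vendor, so a different text model or speech service could be swapped in without touching `build_plan`.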