OpenAI intros Sora, a new text-to-video AI model

Johannesburg, 16 Feb 2024

ChatGPT creator OpenAI yesterday unveiled its new text-to-video artificial intelligence (AI) model, Sora.

According to OpenAI, Sora uses a transformer architecture, similar to the GPT models, and can generate videos up to one minute long with a focus on both visual quality and adherence to user prompts.

Early demonstrations on social media platform X by OpenAI CEO Sam Altman showcase Sora’s ability to bring written descriptions to life in various styles, from photorealistic scenes to cartoons.

OpenAI notes Sora can generate videos featuring complex scenes and multiple characters. The AI tool can incorporate specific types of motion and render accurate details of both the subject and background, says the company.

Additionally, it can create multiple shots within a single video and generate new videos or augment existing AI-generated ones.

“Sora is a diffusion model, which generates a video by starting off with one that looks like static noise and gradually transforms it,” says OpenAI. “The model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world.”

Sora joins a growing suite of text-to-video AI models, including Google's Imagen Video and Meta's Make-A-Video.

Despite the excitement Sora has generated, OpenAI acknowledges weaknesses in the model. Challenges include accurately simulating the physics of complex scenes and understanding specific instances of cause and effect. For example, a person might take a bite out of a cookie, but the cookie may afterwards show no bite mark.

The model may also confuse spatial details of a prompt, such as left and right, and struggle with precise descriptions of events that unfold over time, like following a specific camera trajectory.

Addressing safety concerns, OpenAI says it is currently in a research phase, using Sora to explore potential benefits and ethical concerns associated with the AI model.

Furthermore, it says it is collaborating with cyber security experts, also referred to as red teamers, drawing on safety protocols built into products like DALL·E 3, and engaging with artists worldwide to understand their concerns and identify positive applications for the technology.

“Despite extensive research and testing, we cannot predict all of the beneficial ways people will use our technology or all the ways people will abuse it. That’s why we believe that learning from real-world use is a critical component of creating and releasing increasingly safe AI systems over time,” says OpenAI.

Sora is currently available to selected visual artists, designers and filmmakers, who will provide feedback on how to advance the model.