OpenAI’s AI Video Platform Sora Is Finally Here: Details


OpenAI finally launched Sora, its artificial intelligence (AI) video generation model, on Monday. The company previewed Sora to select individuals in February, and it has now released a different variant of the model dubbed Sora Turbo. Sora can generate videos of up to 20 seconds at up to 1080p resolution. The AI model has been deployed on a standalone platform, which is currently available as a website. Notably, Sora is currently available only to paid ChatGPT subscribers, with specified rate limits.

OpenAI’s Sora AI Video Generation Model

In a blog post, the AI firm announced the launch of Sora and detailed the capabilities of the model. Sora was first unveiled earlier this year, and its release has since been repeatedly delayed. The company had attributed the delay to work on strengthening the model's safety and privacy parameters.

Now, after a delay of nearly nine months, OpenAI has launched Sora as a standalone platform accessible through a dedicated website. It is currently available only to ChatGPT Plus and Pro subscribers, and those without a subscription cannot currently create a new account on the website. Plus users are limited to 50 videos at 480p resolution, or fewer videos at 720p, every month.

The ChatGPT Pro subscription, which was recently introduced at $200 (roughly Rs. 16,970) a month, lets users generate videos with “10x more usage, higher resolutions, and longer durations.” However, as with the “fewer videos” limit, the company did not quantify what the higher resolutions and longer durations would be.

Sora can currently generate videos in widescreen, vertical, and square aspect ratios. Users can also upload their videos and images to extend, remix, and blend the content into generated videos. The AI model also allows generating videos from scratch using text prompts. Additionally, a storyboard interface lets users set particular inputs for each frame.

On the technical side, OpenAI explained that Sora is a diffusion model that has foresight of many frames at a time, which helps keep the content consistent over the 20-second period. The AI model uses a transformer architecture and borrows the recaptioning technique from DALL-E 3.

OpenAI also shared details about the data behind the model. The company claimed that it sourced a wide range of data from publicly available sources, through its data partnerships, and from people working with the model. The public data was said to be collected from machine learning datasets and web crawls.

The company also partnered with Shutterstock Pond5 and commissioned datasets to obtain proprietary data for the AI model. Finally, data for Sora was collected from AI trainers, red teamers, and employees.

To minimise the risks associated with a realistic AI video generation model, OpenAI is adding both a visible watermark and metadata that follows the standards set by the Coalition for Content Provenance and Authenticity (C2PA). The company also claimed that it has added protections in the model for media uploads that include people.

The AI firm also stated that Sora will be blocked from generating videos containing damaging forms of abuse, such as child sexual abuse material and sexual deepfakes. Additionally, the number of uploads people can make will be limited at launch.