Meta just launched its own Sora before OpenAI – Meta Movie Gen


Meta Movie Gen has everything that Sora has, including the ability to create long HD videos in different aspect ratios, with support for 1080p resolution, 16-second clips, and 16 frames per second.
It also does what Sora doesn’t: generating accompanying background music and sound effects, editing videos based on text commands, and generating personalised videos based on images uploaded by the user.

Let’s see what Meta can do:

 The camera is behind a man. The man is shirtless, wearing a green cloth around his waist. He is barefoot. With a fiery object in each hand, he creates wide circular motions. A calm sea is in the background. The atmosphere is mesmerizing, with the fire dance.

A fluffy koala bear with grey and white fur and a round nose is surfing on a yellow surfboard. The koala is holding onto the surfboard with its paws and has a focused facial expression as it rides the waves. The sun is shining.
A red-faced monkey with white fur is enjoying a soak in a natural hot spring. The playful monkey is entertaining itself with a miniature wooden sailboat, complete with a white sail and a small rudder. The hot spring is nestled amidst lush greenery, surrounded by rocks and trees.

A simple prompt like “put the light on the bubbles in the sky” creates beautiful visual effects while keeping the objects in the scene intact, and the sky is reflected convincingly, making the result more expressive.

Thunder cracks loudly, accompanied by an orchestral music track.

The character consistency is very strong.

You can edit videos directly, just by typing in text.

Create sound effects and soundtracks just by inputting text.

Use video and text input to generate audio for your video. Movie Gen lets you create and extend sound effects, background music, or entire soundtracks.


Meta says these are “the most advanced media foundation models to date”.



Some say it’s hard to imagine what long- and short-form videos will look like in a few years, as large numbers of creators learn to use AI video editing tools.
This time, unlike Sora, which only has a demo and an official blog, Meta has made the architecture and training details public in a 92-page paper.

https://arxiv.org/pdf/2410.02746


But the model itself is not yet open source, which prompted Hugging Face engineers to show up in the comments section, expectantly dropping a link to Meta’s open-source homepage:
We’ll be waiting for it here.


In its paper, Meta specifically emphasises that scaling data size, model size, and training compute is critical for training large-scale media generation models. By systematically scaling these dimensions, it is possible to build such a powerful media generation system.
One of the most notable points is that this time they dropped the diffusion model and diffusion loss function entirely, using a Transformer as the backbone network and Flow Matching as the training objective.
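For context, here is a minimal sketch of what a Flow Matching training objective looks like in PyTorch. This is a generic illustration, not Meta’s actual implementation: the model signature `model(x_t, t, cond)`, the video-latent shapes, and the linear noise-to-data path are all assumptions made for the example.

```python
# Generic Flow Matching training step (illustration only, not Meta's code).
# Assumes `model(x_t, t, cond)` predicts a velocity field, and `latents` are
# clean video latents, e.g. shape (B, C, T, H, W), from some autoencoder.
import torch
import torch.nn.functional as F

def flow_matching_loss(model, latents, cond, sigma_min=1e-5):
    b = latents.shape[0]
    x1 = latents                                   # data sample
    x0 = torch.randn_like(x1)                      # Gaussian noise sample
    t = torch.rand(b, device=x1.device)            # uniform time in [0, 1]
    t_ = t.view(b, *([1] * (x1.dim() - 1)))        # broadcast over latent dims

    # Linear (optimal-transport-style) path from noise x0 to data x1
    x_t = (1.0 - (1.0 - sigma_min) * t_) * x0 + t_ * x1
    # Constant target velocity along that path
    v_target = x1 - (1.0 - sigma_min) * x0

    v_pred = model(x_t, t, cond)                   # model predicts the velocity
    return F.mse_loss(v_pred, v_target)            # regress onto the target
```

Instead of a diffusion loss over predicted noise, the network simply regresses the velocity of a straight path between noise and data; sampling then amounts to integrating that learned velocity field with an ODE solver.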




The AI video generation space has been buzzing with activity over the past couple of days.


Shortly before Meta released Movie Gen, Tim Brooks, one of the creators of OpenAI Sora, jumped to Google DeepMind to continue his work on video generation and world simulators.
This got a lot of people thinking of the parallel with Google, which was slow to ship large-model products while the eight authors of the Transformer paper left one after another.
Now OpenAI is late in releasing Sora, and one of its main authors has also left.
But others believe that Tim Brooks’ choice to leave now may indicate that his main work at OpenAI is done, which has led to speculation:
Will Meta’s launch force OpenAI to release Sora in response?
(As of this writing, Sora’s other creator, Bill Peebles, has yet to speak out.)
Now Meta has released models with video editing capabilities, and the October 1 Pika 1.5 update focuses on adding physics effects like melting, expanding, and squeezing to objects in videos.
It’s not hard to see that the second half of the AI video generation race is shifting towards AI video editing.