There's another important contender in the space: Hunyuan model from Tencent My ...

chrismorgan · on Dec 17, 2024

> A whimsical pelican, adorned in oversized sunglasses and a vibrant, patterned scarf, gracefully balances on a vintage bicycle, its sleek feathers glistening in the sunlight. As it pedals joyfully down a scenic coastal path, colorful wildflowers sway gently in the breeze, and azure waves crash rhythmically against the shore. The pelican occasionally flaps its wings, adding a playful touch to its enchanting ride. In the distance, a serene sunset bathes the landscape in warm hues, while seagulls glide gracefully overhead, celebrating this delightful and lighthearted adventure of a pelican enjoying a carefree day on two wheels.

What does it produce for “A pelican riding a bicycle along a coastal path overlooking a harbor”?

Or, what do Sora and Veo produce for your verbose prompt?

whywhywhywhy · on Dec 17, 2024

If Sora is anything like Dall-e a prompt like "A pelican riding a bicycle along a coastal path overlooking a harbor" will be extended into something like the longer prompt behind the scenes. OpenAI has been augmenting image prompts from day 1.

sashank_1509 · on Dec 17, 2024

Hard to say about SORA but the video you shared is most definitely worse than Veo.

The Pelican is doing some weird flying motion, motion blur is hiding a lack of detail, cycle is moving fast so background is blurred etc. I would even say SORA is better because I like the slow-motion and detail but it did do something very non physical.

Veo is clearly the best in this example. It has high detail but also feels the most physically grounded among the examples.

sfjailbird · on Dec 17, 2024

The prompt asks that it flaps its wings. So it's actually really impressive how closely it adheres (including the rest of the little details in the prompt, like the scarf). Definitely the best of the three, in my opinion.

dyauspitr · on Dec 17, 2024

Pretty good except the backwards body and the strange wing movement. The feeling of motion is fantastic though.

arjie · on Dec 17, 2024

I was curious how it would perform with prompt enhancement turned off. Here's a single attempt (no regenerations etc.): https://www.youtube.com/watch?v=730cb2qozcM

If you'd like to replicate, the sign-up process was very easy and I was easily able to run a single generation attempt. Maybe later when I want to generate video I'll use prompt enhancement. Without it, the video appears to have lost a notion of direction. Most image-generation models I'm aware of do prompt-enhancement. I've seen it on Grok+Flow/Aurora and ChatGPT+DallE.

    Prompt
    A pelican riding a bicycle along a coastal path overlooking a harbor
    Seed
    15185546
    Resolution
    720×480

taneq · on Dec 17, 2024

I mean, you didn’t SAY riding forwards…

TZubiri · on Dec 17, 2024

I suppose if you reverse it would look okish

gcr · on Dec 17, 2024

FYI your website shows me a static image on iOS 18.2 Safari. Strangely, the progress bar still appears to “loop,” but the bird isn’t moving at all.

Turning content blockers off does not make a difference.

theWreckluse · on Dec 17, 2024

Fwiw, it is finicky but the video played after a couple seconds (iOS 18.2 Safari).

dr_kiszonka · on Dec 17, 2024

Reddit says it is much better than Sora. Are you hosting the full version of Nunyuan? (Your video looks great.)

echelon · on Dec 17, 2024

HunYuan is also open source / source available unless you have 100M DAU.

Then there's Lightricks LTX-1 model and Genmo's Mochi-1. Even the research CogVideoX is making progress.

Open source video AI is just getting started, but it's off to a strong start.

yurylifshits · on Dec 17, 2024

Our limited tests show that yes, Hunyuan is comparable or better than Sora on most prompts. Very promising model

prometheon1 · on Dec 17, 2024

Is it still better if you copy his whole prompt instead of half of it?

c0brac0bra · on Dec 17, 2024

I mean, the pelican's body is backwards...