As an occupational hazard, I have been experimenting with various GenAI video generation models. Here are a number of experiments run in June 2025, each repeated with the same prompt across Midjourney, Google Veo 3, Kling 2.1 Master, Sora, and Hailuo 02.

If I had to make one general observation, it would be that as of June 2025, Sora – which used to be the leading model – is now woefully behind the competition. Judge for yourself:

“A man in elaborate robes does a backflip while holding two pool noodles”, inspired by Ethan Mollick’s LinkedIn post
“People walking in a modern subway station”, using my photo taken at Sydney’s Gadigal metro station as a starting point
“Sunset as seen through an aircraft window”, also using a real photo as a starting point, plus a real video as a bonus
“Two friends walking by the pool. Suddenly, one pushes the other into the pool.”, with an AI-generated starting image

“A Boeing 747 taking off from Sydney Airport”, pure text-to-video
“A Boeing 747 taking off from Sydney Airport”, using a real photograph as the starting frame. As somewhat of an aviation professional, I find most of the previous ones above pretty painful to watch.
“Camera following a businessman walking through Central London during daytime, shot handheld with a mobile phone”, pure text-to-video