
"SIRENA"

"Sirena" AI music video by Mark DK Berry.

WORKFLOWS & PROJECT DETAILS

🎥 Description

"In Latin, "sirena" translates to siren. The word originates from the Greek word Σειρήν (Seirḗn), which refers to the mythical creatures known for their enchanting voices and ability to lure sailors to their doom."


🧜‍♀️ About the Project

"Sirena" was my seventh AI music video, and for this one I deliberately stepped out of my comfort zone to tackle something different: an underwater romance.

My main goal was to improve image and animation quality across the board. Unfortunately, despite giving myself extra time and effort, I didn’t quite reach the level I’d hoped for. While hardware played a part, character consistency and learning all the workflow settings were the real bottlenecks.


⚠️ Key Challenges


🔧 Workflows & Tools Used On "Sirena"


⏱️ Time & Energy Investment

→ Total: 18 working days, plus some nights running batch renders and a number of additional days spent installing, reinstalling, and fixing broken installs, workflows, and nodes.

I didn't properly track electricity use this time, but I think it was around 40 kWh. I will track it in the future. There may come a point where it is cheaper and faster to rent powerful servers and batch process everything than to run a GPU locally for days and nights burning kWh.
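A rough sketch of how that local-versus-rental comparison could be worked out once the numbers are tracked. Only the 40 kWh figure comes from this project, and it is itself an estimate. The electricity rate, rental hours, and hourly rental price are placeholder assumptions to be replaced with real values, and the function names are made up for illustration.

```python
def local_cost(energy_kwh: float, price_per_kwh: float) -> float:
    """Electricity cost of rendering on the home PC."""
    return energy_kwh * price_per_kwh


def rental_cost(hours: float, price_per_hour: float) -> float:
    """Cost of renting a GPU server for the same batch of renders."""
    return hours * price_per_hour


def cheaper_option(local: float, rental: float) -> str:
    """Return which option costs less: 'local', 'rental', or 'tie'."""
    if local < rental:
        return "local"
    if rental < local:
        return "rental"
    return "tie"


# Rough estimate from this project (not measured).
ENERGY_KWH = 40.0

# Placeholder assumptions: swap in your own tariff and rental quote.
PRICE_PER_KWH = 0.30          # currency units per kWh
RENTAL_HOURS = 20.0           # hours a faster rented GPU might need
RENTAL_PRICE_PER_HOUR = 1.50  # currency units per hour

local = local_cost(ENERGY_KWH, PRICE_PER_KWH)
rental = rental_cost(RENTAL_HOURS, RENTAL_PRICE_PER_HOUR)

print(f"Local electricity: {local:.2f}")
print(f"Server rental:     {rental:.2f}")
print(f"Cheaper on cost alone: {cheaper_option(local, rental)}")
```

This only compares running costs. It leaves out the time saved by faster hardware and the purchase cost of the local GPU, either of which could change the answer.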


💻 Hardware

All of it was done on a regular home PC.


🧰 Software Stack


🎨 Loras Used & Trained


📺 Resolution & Rendering Details

I need to work on this for the next project, because it didn't go according to plan.


😵‍💫 Final Thoughts

The biggest hurdle was still character consistency. I trained Loras, tested face-swapping, tried everything, but nothing quite nailed it. Underwater scenes and low-res footage made things harder.

Prompting and camera direction were another headache. Wan 2.1 is better than Hunyuan, but not exactly "obedient." I tried short prompts, long prompts, and "3-sentence" tricks, all with mixed results.

By the end, I was feeling frustrated. I had hoped for more photorealism and tighter characters. Instead, the video still felt cartoonish (though that was partly intentional). I haven't fully mastered this — or maybe the tech just isn't quite there yet.

There were many challenges and frustrations. By the end, it was more about getting it finished and learning from the experience.


🙏 Extra Thanks To:

