"The spirit of open source community is about sharing knowledge"
Q ≈ T + E (Quality ≈ Time + Energy)
RESEARCH & DEVELOPMENT (2026)
(FREE WORKFLOWS)
For previous research posts and workflows visit Research (2025)
Topics are presented below in alphabetical order, not in the order the videos were made. To speed up page caching, videos are hidden behind clickable grey windows.
ABOUT
This page collects research begun in 2026 and includes free workflows. 2026 brings a change of focus toward content creation, but I will continue adding and updating research and workflows here when I get time. All previous research can be found on the Research (2025) page and is still valid.
To keep up to date with new research and content creation, join my Patreon (free tier) or subscribe to the YouTube channel.
💻 MY 2026 HARDWARE & SOFTWARE
- GPU: RTX 3060 (12GB VRAM)
- RAM: 32GB
- +32GB static swap file on SSD (this helps avoid OOMs when everything else fills up)
- OS: Windows 10
- ComfyUI portable version using Python 3.12, PyTorch 2.7, Sage Attention 2.1, CUDA 12.8
- Software switches for ComfyUI: --windows-standalone-build --lowvram --disable-smart-memory --disable-pinned-memory
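For reference, here is a minimal sketch of how those switches might be passed in the portable build's launch .bat file. This assumes the standard ComfyUI portable folder layout; adjust the paths to your own install.

    .\python_embeded\python.exe -s ComfyUI\main.py --windows-standalone-build --lowvram --disable-smart-memory --disable-pinned-memory
    pause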
NOTE: See the 2025 Research page video on Memory Management for further info on tweaks to get the most out of your setup
All 2025 research was done on a regular home PC using OSS software and models. It often ran 24/7, and is still going in January 2026.
My current ethos is to work with affordable low-VRAM equipment so that the workflows remain usable for people who cannot afford expensive hardware.
back to top of page
BASE IMAGE PIPELINE
QWEN IMAGE EDIT 2511, Z-IMAGE, SEEDVR2 (4K)
Click to load video
Base Image Pipeline (Qwen Image Edit 2511, Z-Image, SeedVR2 (4K))
Date: 20th January 2026.
About: This is my base image pipeline for getting from an idea to a base image. I discuss how and why I use each tool. It may change as new tools develop, and I will present any changes here when that happens.
This base image pipeline is good for character creation, camera angles for Lora training, realism, fixing plastic faces, fixing issues in shots, moving around shots, and developing a shot from an idea to a First Frame and Last Frame.
It is everything you need to take an idea to the point where it is ready to convert to video form.
This is the first time I have been able to get to 4K with ease, and I only bother because the process is so fast that it is worth it.
The process outlined is fairly simple:
- QWEN IMAGE EDIT to create an image or adjust one.
- Z-IMAGE with low denoise to add realism.
- SeedVR2 to upscale to 4K.
I also talk about Krita and the ACLY plugin in the video, but only briefly. The ACLY plugin gives Krita access to ComfyUI models.
Links to Krita are in the video text or in the Useful Software section at the bottom of this page.
Workflows: To download the workflows shown in this video, right click here and download the zip file. The workflows are json files; drop them into ComfyUI and they will load up. The zip contains the following:
MBEDIT - i2i-ZIMAGE_add-realism_vrs1.json
MBEDIT - Lanpaint-Z Image Turbo Fun 2_1 InPainting 4.0.json
MBEDIT - QWEN-2511-ref-image-restyle_vrs2.json
MBEDIT - QWEN-Benji_image_edit_2511 (multiangles)-V2.json
MBEDIT - SeedVR2_4K_image_upscale.json
Workflows like these are available in some form everywhere; nothing here is new other than the models I use being quantized. The difference is really in the approach, and that is discussed in the video. I haven't shared info on where to get the models, but notes in the video text explain that further. It is recommended you get the models that best suit your hardware setup.
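As a side note, if you want to batch stages like these overnight, below is a minimal Python sketch of queueing one of the workflows through the ComfyUI HTTP API. It assumes ComfyUI is running locally on the default port and that the json has been re-exported from ComfyUI in API format; the filename is only an example from the zip above.

    # Minimal sketch: queue an API-format workflow on a local ComfyUI instance.
    # Assumes ComfyUI is running at 127.0.0.1:8188 and the json was exported
    # in API format from ComfyUI. The filename is an example only.
    import json
    import urllib.request

    with open("MBEDIT - i2i-ZIMAGE_add-realism_vrs1.json") as f:
        workflow = json.load(f)

    payload = json.dumps({"prompt": workflow}).encode("utf-8")
    req = urllib.request.Request(
        "http://127.0.0.1:8188/prompt",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(resp.read().decode())  # returns a prompt_id you can check later in /history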
back to top of page
DETAILERS
These are much the same as those found in Research 2025, but renewed for 2026 here. They differ from upscalers, though they can be used for upscaling.
HuMO DETAILER
Date: 11th February 2026.
About: Detailers are v2v workflows. For example, when you have a finished video out of LTX-2 and need to add detail or fix minor issues, this remains one of the best ways to "polish" or fix blemishes, effectively giving it a second pass through an alternative model for a touch-up.
At the moment I cannot get this workflow above 480p, but it has one very interesting feature I was not aware HuMO had: character consistency (like the Phantom or MAGREF models).
This workflow from AbleJones (a.k.a. Droz) is cunningly designed to use the first frame of your inbound video to inform the rest of the video about your characters. Thus you can force your characters back into the video based on the first frame (if you use FFLF, that frame will usually be high grade).
This is fantastic, but to really make use of it you need to get to 720p or more, which I currently cannot do with this workflow. Why? Probably because it also features the equally fantastic ClownShark sampling, which on my RTX 3060 weighs heavily on the time it takes to complete. Above 480p, it OOMs. So I need to tweak this workflow to reach 720p before it is of value to me, but you might find it useful as is.
Workflows: To download the HuMO detailer workflow shown in this video, right click here and download the json file. The workflow can be dropped into ComfyUI.
back to top of page
WAN DETAILER
Date: 11th February 2026.
About: Detailers are v2v workflows. For example, when you have a finished video out of LTX-2 and need to add detail or fix minor issues, this remains one of the best ways to "polish" or fix blemishes, effectively giving it a second pass through a WAN model for a touch-up. I'm using this to fix issues with LTX where some things don't work.
A good example: an Eastern Brown snake in LTX-2 ended up with a head more like the snake in the Jungle Book cartoons, despite a good first-frame image of the snake being provided in the FFLF workflow. Running the final LTX-2 video through WAN at a low denoise setting with a simple prompt ("high quality, photorealistic. An Australian eastern brown snake.") and a denoise of 0.78 adjusted the snake's head to a more correct look and tidied up the entire video.
Denoise levels between 0.3 and 0.79 can polish, or drive stronger fixes, without losing the content of the original video. From 0.8 denoise and above you will end up with completely different results to what you put in. Those are often still good, and driven by the prompt, if that is what you want.
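For intuition only, here is a tiny Python sketch of how I think about the denoise value. This is a simplification rather than the sampler node's exact internals: denoise is roughly the fraction of the clip that gets re-generated, so 1 minus denoise is roughly how much of the original survives.

    # Loose intuition only, not the sampler's exact maths:
    # denoise ~ fraction of the clip that gets re-generated,
    # 1 - denoise ~ fraction of the original structure that survives.
    for denoise in (0.30, 0.78, 0.85):
        kept = 1.0 - denoise
        print(f"denoise {denoise:.2f}: roughly {kept:.0%} of the original look retained")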
The issue with this method is that WAN is 16fps, but I find 157 frames (the 24fps output of LTX-2 reduced to 16fps) works at 480p for this approach without issues. Going for higher resolutions at 157 frames runs into problems on my RTX 3060. (Thankfully that is now resolved by taking 720p out of LTX-2 and running it through FlashVSR instead, but this WAN "detailer" approach still has its place, so I share it here.)
You can use any model you like, but t2v models are generally best; WAN or any other type will work. I use the WAN 2.2 t2v Low Noise model in this example workflow just because it has always served me well. Using smaller GGUF quantizations might allow for 720p, but I haven't tried.
Workflows: To download the WAN detailer workflow shown in this video, right click here and download the json file. The workflow can be dropped into ComfyUI.
back to top of page
EXTENDING VIDEOS
This includes extending from an image as a starting point, and extending an existing video
LTX-2 EXTENDING FROM IMAGE OR VIDEO CLIP
Includes two workflows using different approaches to extend a video.
Click to load video
LTX-2 Extending Videos - Two Approaches
Date (Original): 24th January 2026.
Updated Workflow 1: 11th February 2026.
The Kijai extension workflow originally used in the video above for the Mel Gibson Braveheart scene has been updated to include some extra features, mostly memory improvement nodes. I share it here. The older versions below are just as usable, and you can fall back on them if this one does not work for you.
I have disabled the extensions and the audio-in by default in this one because I sometimes use it for fast i2v (image to video). (Most of the time I use the FFLF workflow, which has also had an update with a similar setup.)
About (Original): Two workflows are provided. (You will need to update ComfyUI to the 23rd January 2026 (AEST) build, and the Kijai nodes as well, to make use of all the nodes featured.)
In this video I discuss two different approaches to extending videos and provide both workflows in the links below.
- Using an existing video clip to drive a v2v output. This takes the original clip and blends it with whatever you prompt for action and dialogue. The result is an extended video. (MBEDIT-LTX-2_V2V_Extend_RuneXX_Vrs5.json)
- Using a masked base image but driving "infinite" extension with an audio file for the lipsync dialogue. In this case I use a 28-second audio file. (MBEDIT - KJ-wip-ltx2_extension_testing_vrs5.json)
The result can now get me to 720p on my low-VRAM GPU (3060) at "infinite" length (28 seconds took 40 minutes), thanks to additional nodes shown in the i2v (Kijai-based) workflow. These are recent additions to ComfyUI from Kijai, along with a VAE memory improvement from another dev, Rattus (available in ComfyUI updates after 23rd Jan 2026 AEST).
The power of the LTX model is only just starting to be understood and is rapidly evolving with all the dev attention. So this is a work in progress for both workflows, and both could be adapted to take alternative approaches as well.
Workflows: To download the two (original) workflows shown in this video, right click here and download the zip file. The workflows are json files; drop them into ComfyUI and they will load up.
Updated Workflow: To download the updated workflow, right click here and download the json file. You can drop it into ComfyUI. Your ComfyUI will need to be up to date (Kijai custom nodes as of 13th January) for the additional nodes to work.
back to top of page
FFLF (First Frame, Last Frame)
There are other FFLF workflows available to download on the Research (2025) page. But for 2026 I will be using the LTX-2 version below.
LTX-2 FFLF
Click to load video
LTX-2 FFLF (First Frame, Last Frame)
Date: 21st January 2026.
Updated Workflow: 11th February 2026.
New nodes and loras have been added in this updated workflow and offer improvements over the original. Nothing else has changed, so you can use either the original or this one.
About: I really like this workflow and the look of LTX-2. The FFLF works well in this one, and I tested a lot of workflows before finding it. It's also easy and quick to get to 720p, and that is where the "Blancmange" effect starts to disappear, especially in this workflow.
Since making the video I have added NAG, which now provides negative prompting (negative prompting won't work with cfg 1 without it, and cfg 1 is needed for the distilled models I use).
More notes in the workflow will help you know how to use it.
I haven't included lipsync in this one because there are problems with the lipsync audio-file-in approach; that will be the subject of another workflow and video. The issue has a solution, but it didn't work well in this workflow. Generally my dialogue shots won't be FFLF, so it's not something I bothered to address here. One workflow for each task suits me fine.
This workflow comes from Phr00t; I didn't design it. He has done a great job. A link to his GitHub is in the workflow.
Workflows: To download the original workflow shown in this video, right click here and download the png file. The workflow is in the metadata of the png file; drop it into ComfyUI and it will load up.
Updated Workflow: This updated workflow has the latest memory improvement nodes, loras, and tweaks. If it doesn't work for you, the original above will work just fine, but this one has improved features worth using. Right click here to download the updated LTX-2 FFLF workflow and drop it into ComfyUI. You will need the Kijai custom nodes to be updated (after 13th Jan 2026) for the additional nodes to work.
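A side note on the png downloads (here and in the Lipsync section below): ComfyUI embeds the workflow graph as PNG text metadata, so if you want to peek at it before loading, a few lines of Python will pull it out. The filename below is only an example.

    # Read the ComfyUI workflow embedded in a png's text metadata.
    # Requires Pillow (pip install pillow). The filename is an example only.
    from PIL import Image

    img = Image.open("ltx2_fflf_workflow.png")
    workflow_json = img.info.get("workflow")  # ComfyUI stores the editable graph here
    prompt_json = img.info.get("prompt")      # API-format graph, also embedded
    print(workflow_json[:300] if workflow_json else "No workflow metadata found")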
back to top of page
LIPSYNC
There are other videos on lipsync, all still valid, on the Research (2025) page. But for 2026 I will add new lipsync research below.
LTX-2 LIPSYNC (Using Audio File)
Click to load video
LTX-2 Lipsync Audio-In (Includes fixes for frozen-frame output)
Date: 22nd January 2026.
About: The most important thing is that this workflow fixes the "frozen frame" issue. It also does away with the fiddly tweaking otherwise required when using the static camera lora (which I left in, as some might find it useful).
The best solution is adding the distilled lora and setting it to -0.3; once I did that I no longer had to keep tweaking multiple settings. The distilled lora is 7.5GB but weirdly still runs fine even on my RTX 3060 (12GB VRAM) with 32GB system RAM. I haven't fully tested this workflow beyond running it at various resolutions and no longer seeing the problems show up. Consider it a WIP that might end up being the final copy. If I find anything extra I will add it, update the workflow below, and include the date of upload.
Notes in the workflow should get you started. As always, I don't provide exact locations for models, as I assume you will want different ones based on your hardware. Most models come from three or four sources: Kijai, QuantStack, City96, and Unsloth, all found on Hugging Face. (Those are the GGUFs for us low-VRAM peasants; if you are using the big boi GPUs, you will likely have other sources.)
The base workflow came from a guy called Abyss on Discord; I have no idea where he got it from, and I have adapted it to suit my purposes.
Workflow: To download the workflow shown in this video, right click here and download the png file. The workflow is in the metadata of the png file; drop it into ComfyUI and it will load up.
back to top of page
UPSCALERS (1080p)
This is different to Detailers in that it is specifically for upscaling
FLASH-VSR
Date: 11th February 2026
About: How did this thing slip under my radar? I'll tell you: because I had installed a version that was not this good. I am impressed by this upscaler.
My 3060 can push LTX-2 output video at 720p, 24fps, 241 frames (10 seconds long) up to 1080p in just 10 minutes with great clarity. I have to experiment further to find the sweet spot between over-clarifying and blending in what I need.
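For a sense of throughput, the numbers above work out like this (simple arithmetic, nothing more):

    # 241 frames upscaled from 720p to 1080p in roughly 10 minutes
    frames = 241
    minutes = 10
    print(minutes * 60 / frames)  # ~2.5 seconds per frame on the RTX 3060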
This version is from "naxci" - https://github.com/naxci1/ComfyUI-FlashVSR_Stable (I have used others before and didn't get results as good as this).
Also, don't download the models linked in the repo above; download the version 1.1 models linked from within the workflow. They are named the same and come from the same Hugging Face account, but annoyingly I downloaded 6GB of the original models before realising the new ones were out (for some reason naxci missed updating that info). Follow the other instructions in the naxci repo to get it up and running.
Workflow: To download the workflow, right click here and download the json file. Drop the workflow into ComfyUI and it will load up. You will need to install the relevant custom node first.
back to top of page
USEFUL SOFTWARE
100% open source software (actually DaVinci Resolve is not OSS, but it is free with licensing caveats)
- ComfyUI (Flux, Wan 2.1, VACE, FFLF, inpainting models, upscalers, interpolation). I use the portable version.
- RVC for narration voice training and swapping my voice. Note: I have since started using Chatterbox and VibeVoice, which are excellent text-to-speech tools, and RVC may no longer be needed because of them.
- VibeVoice is the fastest free, MIT-licensed TTS I have ever tried. It's amazing. I used it on all my videos in 2025. This is the Enemyx-Net version, which is superior to the others I tried. You only need 10 to 30 seconds of voice audio, and with it you can create a believable podcast with that voice. It also does multiple voices and can create from text files. Incredible.
- Krita + ACLY plugin – inpainting and upscaling base images, and image editing tasks.
- Reaper DAW – storyboarding with shot names and timecode rendered to MP4. I also used it for narration, foley, and music, and for mixing levels before taking things to DaVinci for the final cut. Reaper gave me far more granular power over the audio mix and is good for basic video duties.
- Audacity – general audio file management duties: chopping out silence, converting wav to mp3, etc.
- Shotcut – great as a fast fix for individual video clips: stabilising, cropping, editing, masking, mirroring, reversing, colour and light control.
- DaVinci Resolve 20 – final cut and colour grade.
- LibreOffice – tracking shot names, prompts, colour themes, fixes, takes, etc.
- Notepad++ – with the markdown plugin. I use markdown for project tracking and cross-platform compatibility. (I use Kate for editing markdown in Linux, and Markor and ReadEra on my Android phone.)
- Kopia for auto-backups on Windows.
- FreeFileSync for manual backups.
- Syncthing for sharing data between computers and phones. This is useful for project development as an alternative to Dropbox, but it needs careful management because it works differently. (I left Dropbox due to concerns about their AI data-sharing policy.)
- Python – I use VSCode for Python coding of overnight batch runs of ComfyUI API workflows. I used Eleventy & Tailwind to develop this website. (I recently migrated away from React and Next.js.)
- Blender – I use Blender for animation duty and camera movement, and I have plans to use the "grease pencil" feature for making 3D camera motion tracks, but I haven't had time to try that yet. I like it, but it is yet another thing I have to learn to use, and my feeling is that in time it won't be needed; AI prompting will do everything. We shall see.
- Unreal Engine – I wanted to use UE for environment locations, as I really liked what it did for me in the Fallen Angel music video, but... isn't there always a but... Firstly, UE is not good with animations imo; it is hard work. Secondly, it is just so bloated and large to install that I can't even fit it on my drives any more. So I won't be using UE, though I would if I could. Again, I think it will become obsolete in time as AI replaces it with a prompt. That is just my view.
- Cascadeur – this software is amazing for animating bodies using human physics. It even has gravity physics, and you can pull a person around realistically by single points on their body; the rest of the body responds as a human would. It is by far the most useful human-movement animation tool I have used. But it has licensing constraints, so although it is free to use, it is licensed. Something to be aware of, but I highly recommend checking it out.
back to top of page
VIBEVOICE
VIBE-VOICE & AUDIO SEPERATION
Both Vibe-Voice and Audio Seperation feature in the same workflow.
I did a recent TTS shoot out in the Patreon Free Tier between QWEN 3 TTS, VibeVoice TTS, Chatterbox TTS. I expected Chatterbox to win, it didn't. I remain a fan of Vibevoice at this time.
Date: 11th February 2026
About: This is not a unique workflow; it is the VibeVoice version from Enemyx-Net with the MelBandRoFormer nodes added in (from the Kijai custom nodes, found in the Manager) for audio separation (voice vs instruments).
I use MelBandRoFormer for two things:
- Cleaning up the background noises VibeVoice tends to make.
- Separating music tracks into vocals and instrumental for analysis, or for "narration" over the top with the distracting vocals removed. (Do not use copyrighted songs for YouTube unless you know what you are doing. I use this for testing and research, or to analyse music tracks and vocal production methods.)
I hadn't realised how good MelBandRoFormer was until I tried it in this setting. Now I use it in all my VibeVoice TTS runs, as it also normalises the output levels.
Workflow: It's best to install it from the Enemyx-Net repo linked above and follow their instructions for models and so on. But to download the workflow for inspection or use, right click here and download the json file. Drop the workflow into ComfyUI and it will load up.
back to top of page