
GLOSSARY & EXPLANATIONS

“He's the kind of guy that when you ask him the time, he'll tell you how to make a wrist-watch.”

3D MODELLING

3D models of items, and even heads and bodies, can be made fairly easily in ComfyUI. These can then be used in other 3D modelling software like Blender if you know how (I don't), or simply used to get camera angles of a thing, which you then restyle in ComfyUI using a styling workflow (I do this). But there is more...

fSpy and Blender can also be used together to create realistic 3D models from 2D photos. fSpy determines the camera parameters (position, orientation, focal length) from a photo, and the fSpy-Blender importer add-on brings those parameters into Blender. This allows you to build a 3D scene that accurately matches the perspective and appearance of the photo. That scene can then be used to create angled camera shots, which provide the background set and environment of a video clip, ready for characters to be added later using image editing software or ComfyUI.
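The core of what fSpy recovers is ordinary pinhole-camera maths. As a rough sketch (the function and variable names here are my own, not fSpy's), the focal length, sensor width, and horizontal field of view are tied together like this:

```python
import math

def focal_length_mm(sensor_width_mm: float, horizontal_fov_deg: float) -> float:
    """Pinhole-camera relation: focal length from sensor width and horizontal FOV."""
    fov_rad = math.radians(horizontal_fov_deg)
    return sensor_width_mm / (2.0 * math.tan(fov_rad / 2.0))

# A full-frame sensor (36 mm wide) with a ~39.6 degree horizontal FOV
# works out close to a classic 50 mm lens.
print(round(focal_length_mm(36.0, 39.6), 1))
```

Given any two of those three values, the third is fixed, which is why a single photo with known perspective lines is enough for fSpy to pin down the camera.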

BSOD

Black, or Blue, Screen Of Death. A black screen usually implies a hardware failure somewhere in your computer system; a blue screen generally implies a software failure. Neither is fun, and both are considered the worst possible event of their kind. Often they lead to needing to either rebuild your computer, or buy new parts.

COMFYUI MODELS

ComfyUI is open-source software used to run AI workflows that make images and videos. The creativity comes from pre-trained "models". In ComfyUI, "models" are the AI models used to generate the resulting media (images, videos, etc.). They are large pre-trained data files that encode the relationships between text prompts (or an input image) and the generated content, making it possible to translate words or images into pictures or video. Models are the essential building blocks of ComfyUI workflows, and users combine and customise them to achieve different creative effects. The model types relevant to the work on this site are: text-to-image, text-to-video, image-to-video, and video-to-video.

GAUSSIAN SPLATTING

Gaussian splatting is a 3D reconstruction and rendering technique that uses tiny, translucent ellipsoids called "splats" to represent 3D scenes. These splats, with their position, color, size, and transparency data, blend together to create realistic 3D models from a set of images or videos.
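Each splat is really just a small bundle of numbers. A toy sketch of the per-splat data (the field names here are my own, not from any particular implementation):

```python
from dataclasses import dataclass

@dataclass
class Splat:
    """One translucent ellipsoid in a Gaussian-splat scene."""
    position: tuple[float, float, float]          # where it sits in 3D space
    scale: tuple[float, float, float]             # ellipsoid radii along each axis
    rotation: tuple[float, float, float, float]   # orientation as a quaternion
    color: tuple[float, float, float]             # RGB, 0.0 to 1.0
    opacity: float                                # how translucent the splat is

# A real scene is simply millions of these, blended together when rendered.
scene = [Splat((0.0, 1.0, -2.0), (0.1, 0.1, 0.05),
               (1.0, 0.0, 0.0, 0.0), (0.8, 0.2, 0.2), 0.6)]
print(len(scene))
```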

INTERPOLATE AND UPSCALE

Interpolating is adding new frames based on the existing frames. For example, if your video is 16fps (frames per second), you "interpolate" to achieve 32fps, which is smoother on the eye. This is done by blending between each pair of existing frames, doubling the number of frames in the clip. I do this twice, because I aim for 64fps. It's buttery-smooth on the eye. The Wan 2.1 model makes video clips at 16fps whether you like it or not, so interpolating is necessary if you want to get rid of the judder of 16fps motion that the eye detects. We stop noticing it at around 25fps.
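The blending idea is simple enough to sketch. This toy example treats each frame as a list of pixel values and inserts an averaged frame between each existing pair (real interpolators like RIFE estimate motion rather than doing a plain average, so treat this as the idea only):

```python
def interpolate_frames(frames):
    """Double the frame count by inserting a 50/50 blend between each pair."""
    out = []
    for a, b in zip(frames, frames[1:]):
        out.append(a)
        blended = [(pa + pb) / 2 for pa, pb in zip(a, b)]
        out.append(blended)
    out.append(frames[-1])  # the last frame has no successor to blend with
    return out

# Two passes take a 16fps clip towards 64fps, as described above.
clip = [[0, 0], [8, 4], [16, 8]]     # three tiny "frames" of two pixels each
once = interpolate_frames(clip)      # 5 frames (~32fps)
twice = interpolate_frames(once)     # 9 frames (~64fps)
print(len(once), len(twice))
```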

Upscaling is when you take, say, a 1024 x 576 video and increase it to 1920 x 1080. You increase the resolution. This does not increase the quality, only the size. You can add steps into the process to increase the quality as well, but I don't at this stage.
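In its simplest form, upscaling just maps each output pixel back to a source pixel. A bare-bones nearest-neighbour sketch (real upscalers interpolate between pixels, or use AI models to invent detail; this one only makes the grid bigger):

```python
def upscale_nearest(pixels, new_w, new_h):
    """Nearest-neighbour upscale of a 2D grid of pixel values."""
    old_h, old_w = len(pixels), len(pixels[0])
    return [
        [pixels[y * old_h // new_h][x * old_w // new_w] for x in range(new_w)]
        for y in range(new_h)
    ]

small = [[1, 2],
         [3, 4]]
big = upscale_nearest(small, 4, 4)   # 2x2 -> 4x4: same picture, more pixels
for row in big:
    print(row)
```

Notice the output contains no new information, which is exactly why resolution goes up but quality does not.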

I consider both these tasks to be the very final stage before going to post-production, where everything is turned into the final video.

LORAS

LoRAs in ComfyUI are a way to fine-tune an existing model for specific styles or characteristics without retraining the entire base model. They work by adding low-rank matrices to the pre-trained model, adjusting only a portion of its parameters. This allows for a more efficient way to adapt a model to specific tasks, like adding a particular artistic style or character detail.
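The low-rank trick is compact enough to show in a few lines. Using plain Python lists as stand-in matrices (real LoRAs operate on the model's weight tensors, and the scaling factor is often called alpha):

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def apply_lora(W, A, B, alpha=1.0):
    """W' = W + alpha * (B @ A): a low-rank update to a frozen weight matrix."""
    delta = matmul(B, A)
    return [[w + alpha * d for w, d in zip(wr, dr)] for wr, dr in zip(W, delta)]

# A 3x3 weight matrix, adapted by a rank-1 LoRA (B is 3x1, A is 1x3).
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
B = [[0.5], [0.0], [0.0]]
A = [[1.0, 2.0, 3.0]]
W_adapted = apply_lora(W, A, B, alpha=0.1)
print(W_adapted[0])   # only a small nudge to the original weights
```

The point of the rank-1 factors is size: B and A together hold 6 numbers here, while a full update to W would need 9, and the gap grows dramatically at model scale.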

OPEN SOURCE SOFTWARE

Open-source software (OSS) is computer software whose source code is made available for public use, modification, and redistribution. This means anyone can access, examine, change, and share the code, fostering collaboration and innovation. It is made by geeks working for free on their passion. Generally it is distributed through outlets like GitHub or GitLab, and can be downloaded from those sources by anyone for free.

PROMPT ENGINEERING

Prompt Engineering is a black art. It seems easy until you try to do it. I believe that in the future the only job left for humans will be "prompt engineering": telling AI what you want from it.

SEED

In the context of ComfyUI and workflow models, the "seed" is a number like 495182539294 that initialises the random noise the model starts generating from. (It is not a pointer to a particular piece of training data; it just fixes the starting randomness.) Every seed produces slightly different creative results from the AI model. The range of seeds to choose from is huge, though it depends on the model.

When trying to get consistent results, sticking to a good seed that produced what you wanted will help, but other factors will also affect results, like resolution or other settings in the workflow. Don't expect the seed to always do what you wanted. For example, if a person is looking in a different direction in your image than in the previous one, the same seed might use a completely different structure for their face. There is no strict rule, and it is impossible to predict, but a lot of time will be spent looking for "good" seeds for your particular need in that particular moment.

One thing I do look for in seeds, and make a note of, is action and motion, more so than features, since features are more easily trained and controlled by LoRAs. Training LoRAs for "action" requires training on videos, and that is a bigger task than my machine can deal with. But change the input or output resolution, and you might find that seed no longer provides the action it did. Training LoRAs is definitely a better approach than seed hunting, if you have the hardware, or the time and energy.
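The behaviour of a seed is easy to demonstrate with any random generator. Here it is with Python's standard library (diffusion models do the same thing, just for the initial noise tensor; `fake_generate` is a made-up stand-in, not a real ComfyUI call):

```python
import random

def fake_generate(seed):
    """Stand-in for a diffusion model: the seed fixes the 'noise' it starts from."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(4)]

a = fake_generate(495182539294)
b = fake_generate(495182539294)   # same seed
c = fake_generate(495182539295)   # seed off by one
print(a == b, a == c)             # same seed reproduces, different seed diverges
```

This is also why the same seed only reproduces a result when everything else in the workflow is held constant: change the resolution and the noise is consumed differently, so the outcome changes too.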

WORKFLOWS AND NODES

In ComfyUI, workflows are essentially a visual representation of how different AI processes connect and interact to generate media like images, videos, or audio. They are built from nodes (individually coded program objects) connected into a network, or graph. These connections define the flow of data and instructions, allowing users to create complex and customised generative AI pipelines. Into a workflow you can add whatever ComfyUI models you wish, driving the creation - through the workflow - of stylised images, videos, or other kinds of media. Workflows can be adapted to do almost anything a computer can come up with. Workflow design is how we streamline our processes, and it is in the hands of the user to be creative with it, since the design dictates the output.
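A workflow is, at heart, a directed graph that data is pulled through. A toy sketch of that idea (the node names and the `Node` class are invented for illustration, not actual ComfyUI internals):

```python
class Node:
    """A tiny stand-in for a ComfyUI node: a named function with wired inputs."""
    def __init__(self, name, fn, *inputs):
        self.name, self.fn, self.inputs = name, fn, inputs

    def run(self):
        # Evaluate upstream nodes first, then feed their outputs into this one.
        return self.fn(*(node.run() for node in self.inputs))

# Wire up a miniature "workflow": prompt -> encode -> sample -> decode.
prompt = Node("prompt", lambda: "a cat in a hat")
encode = Node("encode", lambda text: f"embedding({text})", prompt)
sample = Node("sample", lambda emb: f"latent({emb})", encode)
decode = Node("decode", lambda lat: f"image({lat})", sample)

print(decode.run())
```

Swapping a node, or rewiring an input, changes the whole pipeline without touching the other nodes, which is exactly what makes workflow design so flexible.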
