Consistent storytelling





Keeping characters and environments consistent is essential in visual storytelling, and generative models are steadily getting better at maintaining that consistency.

I wanted to walk through some ways to achieve consistency by preparing a few key frames that I would later use to generate videos.

The short story:
A documentarian travels through Scotland following old folk tales and finally captures the first sighting of the mythical unicorn.

I started with a prompt to establish the look of the documentarian and his surroundings:

Photoreal image of a man in his forties dressed for rainy weather, sitting on a wooden log with a camping tent behind him. He sits at the edge of the Scottish Wistman’s Wood Dartmoor moss covered forest. The forest consists of very ancient thick moss covered oak trees. Overcast sky, distant fog rolling through rugged highland hills and forest, damp earthy textures, muted greens and browns, moody and atmospheric outdoor camping scene. Shot with a Red camera for documentary style and hyper-detailed.

After a few tries, I was happy with the look of the documentarian, and the environment was exactly what I was aiming for.

This image was generated using a style reference of the Scottish woods and Leonardo's model Lucid Origin:



One thing to note is that prompt structure differs between a generative model and an editing model. I usually use ChatGPT or a similar LLM to adapt a prompt from one type of model to the other.
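As a rough illustration of that difference, the sketch below keeps the scene details in one place and renders them in two styles: a full description for a generative model and a reference-based instruction for an editing model. Everything here is an assumption for illustration only: the `Shot` structure, both function names, and the exact wording are hypothetical and not part of any model's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shot:
    """Scene details shared by both prompt styles (hypothetical structure)."""
    subject: str
    action: str
    environment: str
    style: str


def generation_prompt(shot: Shot) -> str:
    # A generative model gets a full description of the whole image.
    return f"{shot.style} of {shot.subject} {shot.action} in {shot.environment}."


def edit_prompt(shot: Shot) -> str:
    # An editing model gets an instruction relative to a supplied reference image.
    return (
        "Keep the same person from the reference image. "
        f"Show him {shot.action} in {shot.environment}."
    )


shot = Shot(
    subject="a man in his forties dressed for rainy weather",
    action="crouching behind some rocks",
    environment="a moss covered ancient oak forest under an overcast sky",
    style="Photoreal image",
)

print(generation_prompt(shot))
print(edit_prompt(shot))
```

Keeping the subject and environment text in a single place means every frame reuses the same wording, which is one small way to reduce drift between prompts.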

Before creating other frames, I wanted a close-up of the documentarian so that I would have more detailed facial features to use in future generations. For this, editing models like Nano Banana do a great job:



Next, I started creating key frames from the close-up. For example, here is a frame of our documentarian crouching behind some rocks as he sees the unicorn for the first time.

 


Since he would excitedly narrate to the camera at this point, I created another frame of him looking into the camera using Nano Banana:



Other shots show the documentarian journeying along narrow paths across Scotland. Using the close-up as an image reference in Nano Banana, I can place him in the desired environment: a narrow path through the ancient forest.




Final frame (path01):



In the next shot, he treks among ancient rock formations. Continuing with Nano Banana, I focused on creating the environment first:



I then placed the documentarian on the path using the image from path01, because I wanted him to hold his trekking stick in the same manner.
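With frames building on earlier frames like this, it can help to track which image served as the reference for each one. The snippet below is only a bookkeeping sketch: apart from path01, the frame names are made up for illustration, and the mapping reflects the reference choices described in this post.

```python
# Hypothetical bookkeeping: each key frame records which earlier frame
# was used as its image reference. Only "path01" is a name from the post.
references = {
    "base": None,                   # first generated image of the documentarian
    "closeup": "base",              # close-up made with an editing model
    "unicorn_sighting": "closeup",  # crouching behind the rocks
    "path01": "closeup",            # narrow path through the ancient forest
    "rocks_path": "path01",         # reuses the trekking-stick pose from path01
}


def lineage(frame: str) -> list[str]:
    """Return the chain of frames from the given frame back to the original."""
    chain = []
    current = frame
    while current is not None:
        chain.append(current)
        current = references[current]
    return chain


print(lineage("rocks_path"))
# ['rocks_path', 'path01', 'closeup', 'base']
```

If a later frame starts to drift, the chain shows which earlier image to go back to and regenerate from.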



Zooming in to a mid-shot, I have my frame:

























 
