ComfyUI: Flux with LLM, 5x Upscale (Workflow Tutorial)

ControlAltAI منتشر شده در تاریخ 1403/06/10

8.2 هزار بار بازدید - 2 هفته پیش - The video focuses on Flux.1[dev]

The video focuses on Flux.1[dev] usage and workflow in Comfy UI. The workflow is semiautomatic, with logical processing applied to reduce V Ram usage. It entails Image Reference, Image2Image, Text to Image, and Consistent upscaling techniques. Preserving the text during upscale was challenging. The workflow achieves upscaling with text retention up to 5.04x, approximately the original generation.

------------------------

JSON File (YouTube Membership): @controlaltai

Links for Models:
Flux.1 [dev]: https://huggingface.co/black-forest-l...
Flux.1 [schnell]: https://huggingface.co/black-forest-l...
t5xxl: https://huggingface.co/comfyanonymous...

Upscaler:
4xLeexicaDat2_otf: https://openmodeldb.info/models/4x-Le...
4xFaceUpDat: https://openmodeldb.info/models/4x-Fa...

GitHub:
ControlAltAI Nodes: https://github.com/gseth/ControlAltAI...

CivtiAI LoRA Used:
https://civitai.com/models/562866?mod...
https://civitai.com/models/633553?mod...

Ollama:
llama3.1: https://ollama.com/library/llama3.1
llava-llama3: https://ollama.com/library/llava-llama3
llava (alternate vision model): https://ollama.com/library/llava

ComfyUI (Official): https://www.comfy.org/

------------------------

Disable Smart Memory in ComfyUI:
- Right Click run_nvidia_gpu.bat and edit in notepad.
- Add “ --disable-smart-memory” at the end of first line. Save and start comfy

------------------------
System Instructions For Image LLM:
You are an advanced AI assistant equipped with visual and language understanding capabilities. Your primary goal is to meticulously analyze the given image and generate a comprehensive prompt suitable for recreating or expanding upon the image using various generative models.

Analyze the image and classify it as either a vector illustration, digital painting, traditional art painting, drawing, sketch, photograph, 3D rendering, graphic design, street art, folk art, conceptual art, texture, pattern, cartoon comic, etc. Subsequently, describe the image in extreme detail with classification, encompassing its composition, style, mood, atmosphere, colors, lighting, shadows, and overall theme. Ensure your response is structured in paragraphs and is free of additional commentary.

Modify Image LLM:
Objective: Modify the given text prompt based on specific change instructions while preserving all other details.

Read the Change Instruction: Identify the specific change(s) to be made as provided in the next line.

"Reimagine the room decor as a kids room who loves space and astronomy"

Implement the Change: Apply the specified modification(s) to the text.

Preserve Original Context: Ensure that all other aspects of the text, including descriptions, mood, style, and composition, remain unchanged.

Maintain Consistency: Keep the language style and tone consistent with the original text.

Review Changes: Verify that the modifications are limited to the specified change and that the overall meaning and intent of the text are preserved.

Provide Response: Just output the modified text. Ensure your response is free of additional commentary.

Text LLM:
Based on the user prompt, generate a detailed response explaining the composition, style, mood, atmosphere, colors, lighting, shadows, and overall theme. Ensure your response is structured in paragraphs, less than 512 tokens, and is free of additional commentary.

Image/Text Summary LLM:
Objective: Summarize the user's prompt to a concise version of no more than 80 tokens. It is critical that you do not exceed the 80 token limit under any circumstances.

Provide Response: Just output the modified text. Ensure your response is free of additional commentary.

------------------------

TimeStamps:

0:00 Intro.
01:48 Requirements.
05:58 Flux Resolution, Input & Logic.
18:16 Text LLM Conditioning.
24:36 Control Bridge Update.
25:17 Image LLM Conditioning & Modify Logic.
37:10 Switch & Final Conditioning
39:40 Flux Core.
44:32 Img2Img, Settings.
49:43 Img2Img Manipulation & Style Transfer.
53:21 Understanding max_shift.
54:44 InPainting, SAM2.
01:00:37 InPainting Test.
01:02:21 Flux Add Details, Upscale & Post Processing.
01:13:56 Upscale Full Run, VRAM Tips, Schnell.

2 هفته پیش در تاریخ 1403/06/10 منتشر شده است.

8,274 بـار بازدید شده

... بیشتر