I don't think there is anything Forge specific here.
Go to the Extensions tab, then Install from URL, and enter the URL for this repository.
almost current UI, imagine a 'D' icon next to the 'K'
The easiest way to ensure the necessary diffusers release is installed is to edit requirements_versions.txt in the webUI folder so that it contains:
diffusers>=0.28.1
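To confirm which diffusers release the webUI environment actually picked up, a small check like the following can be run with the webUI's Python. This is only a sketch: `version_tuple` and `meets_minimum` are hypothetical helper names, not part of the extension, and the comparison looks only at the first three numeric components of the version string.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def version_tuple(text):
    """Turn '0.28.1' or '0.30.0.dev0' into a 3-tuple of ints (assumed format)."""
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum="0.28.1"):
    """True if the installed version is at least the minimum from this README."""
    return version_tuple(installed) >= version_tuple(minimum)


if __name__ == "__main__":
    try:
        found = version("diffusers")
    except PackageNotFoundError:
        print("diffusers is not installed in this environment")
    else:
        status = "OK" if meets_minimum(found) else "too old, needs >=0.28.1"
        print(f"diffusers {found}: {status}")
```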
- added option to caption using Florence-2, in image to image section. 'P' button toggles overwriting prompt, results always written to console.
- minor code improvements
- minor addition to save noise colour settings to infotext
- added v1.1. Enabled by default, but optional by using the obvious button. I think only the transformer has changed, so 5.64GB extra download (same for Distilled, if used). From brief tests, it does seem to be a step up.
- option to not use T5 text encoder
- settings to colourize the initial noise. This offers some extra control over the output and is near-enough free. Leave strength at 0.0 to bypass it.
- experimental double prompting - subprompts for each text encoder. Split prompts with '|', first subprompt for CLIP, second for T5. If not used: same prompt sent to both, same as previous behaviour.
- code cleanup, handles the text encoders manually, better for VRAM usage. In good conditions, no speed up; but bad conditions are harder to hit.
- moved styles to unique file
- added support for the distilled version, which is better when using fewer steps. I download only the distilled transformer, so the cost is ~5.6GB rather than another 13GB. Toggle the D icon, top-right of left column: lit up means using the distilled version. Downloaded on demand, cached locally. Should this be new default?
- !! don't apply i2i denoise strength when not doing i2i, late night me forgot to copy that over from the PixArt implementation
- enabled guidance rescale for testing. It's good, very similar method used in my cfgFade extension.
- reduced VRAM, no longer flirting with shared memory
- caching of prompt embeds to avoid text encoder processing if prompt and negative not changed
- img2img, same method as used with PixArt
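The double prompting entry above describes splitting the prompt on '|', with the first subprompt going to CLIP and the second to T5. A minimal sketch of that rule follows. `split_prompt` is a hypothetical name, and both the whitespace stripping and the handling of more than one '|' (everything after the first goes to T5) are assumptions rather than the extension's confirmed behaviour.

```python
def split_prompt(prompt):
    """Return (clip_prompt, t5_prompt) for a prompt that may contain '|'."""
    if "|" not in prompt:
        # No separator: the same prompt is sent to both text encoders.
        whole = prompt.strip()
        return whole, whole
    # Split only on the first '|'; anything after it is treated as the T5 part.
    clip_part, t5_part = prompt.split("|", 1)
    return clip_part.strip(), t5_part.strip()


if __name__ == "__main__":
    print(split_prompt("kintsugi bowl | photograph of a kintsugi bowl of steaming dumplings"))
    print(split_prompt("photograph of a kintsugi bowl"))
```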
Initial release, dips into shared memory too easily. K icon (top-right of left column) toggles use of Karras sigmas for the samplers. Seemed useful with PixArt + Cascade, so why not here?
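For reference, the Karras schedule spaces the noise levels so that more steps are spent at low sigma. The sketch below computes it in plain Python using the commonly cited formula with rho = 7. The sigma range in the usage line is an illustrative assumption, and in practice the diffusers schedulers that support this option compute it internally.

```python
def karras_sigmas(n, sigma_min, sigma_max, rho=7.0):
    """Return n noise levels from sigma_max down to sigma_min (Karras spacing)."""
    if n < 2:
        raise ValueError("need at least two sigmas")
    min_inv_rho = sigma_min ** (1.0 / rho)
    max_inv_rho = sigma_max ** (1.0 / rho)
    # Interpolate linearly in sigma^(1/rho) space, then raise back to the rho power.
    return [
        (max_inv_rho + i / (n - 1) * (min_inv_rho - max_inv_rho)) ** rho
        for i in range(n)
    ]


if __name__ == "__main__":
    # Illustrative range only; real values depend on the model's noise schedule.
    for sigma in karras_sigmas(10, sigma_min=0.03, sigma_max=14.6):
        print(f"{sigma:.4f}")
```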
Of course, always same prompt, seed, sampler. Non-distilled version.
From left to right:
- cfg: 2, 4, 8, 8
- steps: 20, 20, 20, 40

Top row: 0 rescale; bottom row: 0.75 rescale
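As a rough illustration of what the rescale value does, the sketch below applies classifier-free guidance and then pulls the standard deviation of the result back towards that of the conditional prediction. It follows the usual published formulation of guidance rescale, written here on flat Python lists instead of tensors. `cfg_with_rescale` is a hypothetical name, and this is not taken from the extension's code.

```python
from statistics import stdev


def cfg_with_rescale(uncond, cond, guidance_scale, rescale=0.0):
    """Classifier-free guidance followed by optional std rescaling."""
    guided = [u + guidance_scale * (c - u) for u, c in zip(uncond, cond)]
    if rescale == 0.0:
        return guided  # plain CFG, matches the "0 rescale" row
    # Scale the guided prediction so its spread matches the conditional one.
    # Assumes the guided values are not all identical (non-zero std).
    factor = stdev(cond) / stdev(guided)
    # Blend between the rescaled and the original guided prediction.
    return [rescale * (g * factor) + (1.0 - rescale) * g for g in guided]


if __name__ == "__main__":
    uncond = [0.0, 0.1, -0.2, 0.3]
    cond = [0.5, -0.4, 0.2, 0.1]
    for phi in (0.0, 0.75):
        out = cfg_with_rescale(uncond, cond, guidance_scale=8.0, rescale=phi)
        print(phi, round(stdev(out), 4))
```

High CFG inflates the spread of the prediction, which tends to show up as over-saturated output; the rescale term counteracts that while keeping the guidance direction.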
Generating with 8GB VRAM is possible. Using CFG 1 saves some VRAM and is considerably faster, but still slower than equivalent resolutions with SDXL or PixArt. Small resolutions (768x768) seem to give very poor or broken results. Resolution binning (which would automatically adjust width/height to 'supported' values) is NOT enabled, as it seems to cause issues along borders.
prompt: photograph of a kintsugi bowl of steaming dumplings on a rustic wooden table