
SVD error #132

Open
Tanghonghan opened this issue Jan 22, 2024 · 19 comments

@Tanghonghan

I can generate images using Enfugue, but I couldn't generate an animation with it.

(screenshot of the error attached)

Whenever I try to use SVD, this error appears. I don't know whether my settings are wrong or it's something else.
Thanks!

@painebenjamin
Owner

Hello @Tanghonghan!

I'm not sure if you're the same user who came to report this in the Discord or not, but I believe this is caused by selecting the SVD model in the model picker. I was not clear enough on this, but you do not need to select SVD to use it. Simply enabling animation and setting the engine to "Stable Video Diffusion" will tell enfugue to load the SVD motion module when it is necessary to do so, and enfugue will load it alongside your chosen stable diffusion checkpoint when it is relevant to do so.

@Tanghonghan
Author

Thank you for your response! I followed your advice exactly and that problem is solved, but another one arose: I couldn't generate a normal animation using SVD. Either the image stood still (no animation, just an image) or some error occurred. Could you please make a video on how to generate animations with Enfugue and release it on YouTube? Especially on how to use DragNUWA in Enfugue, because I found that very hard to do. Thanks again for your effort! Much appreciated!

@painebenjamin
Owner

painebenjamin commented Jan 24, 2024

Hey @Tanghonghan,

Glad the first problem is solved, sorry you hit another issue! Did you follow the instructions from this video? If you did and still experienced errors, I would love to know a bit more about what you were trying to do. Were there a lot of motion vectors involved? What resolution was the image, and how many frames did you try to do?

297960767-ae28ac55-2eba-4315-9362-29dc41cdd8d4.mp4

If you want to share all the settings from the front-end, you can go to File > Save and upload the configuration as a .json file, and I would be able to try and reproduce the error using the same settings.
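If it helps, here is a minimal sketch of inspecting a saved project file before sharing it. It only uses Python's standard json module and makes no assumptions about Enfugue's actual project schema; it just lists whatever top-level sections happen to be in the file:

```python
import json

# Load the project file saved via File > Save and list its top-level sections,
# so you can confirm what it contains before uploading it.
with open("Enfugue Project.json", "r", encoding="utf-8") as handle:
    project = json.load(handle)

for key, value in project.items():
    print(key, "->", type(value).__name__)
```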

@Tanghonghan
Author

Hi @painebenjamin

I actually watched the video you linked several times, and I successfully generated an animation using the method it shows, so thank you very much. But I ran into some more problems (they seem to keep popping up no matter what, haha). One of them is that I don't know which image I am currently working on. I thought I was working on picture A, but when I ran Enfugue it turned out I was actually working on picture B, which is very confusing. I guess it has something to do with layers, which I am not very familiar with. But even when I deleted all the layers shown in the bottom-right corner (keeping only one), the result still showed I was working on some other picture instead. I found that a bit frustrating.

Another question: when I drag a picture onto the canvas, only part of it shows, and the buttons at the top right of the picture seem to do nothing; I don't know why. I couldn't get the whole picture to show no matter what I tried, and the animation likewise covers only part of the picture.

Those are the two main problems I have been encountering.

Thank you for your patience and time!

@Tanghonghan
Author

Enfugue Project.json (attached)

Oh, I almost forgot: I've uploaded my .json for your reference. Thanks!

@Tanghonghan
Author

(screenshot of the out-of-memory error attached)

I tried DragNUWA but always ran out of GPU memory. How can that be? 24 GB is not that small. The image size is 1792x1024, and it's 14 frames.

@painebenjamin
Owner

Hey @Tanghonghan! Very glad you got farther!

I'll answer your questions in order:

  1. Thank you for sharing your feelings on the interface. I believe you might find the "split" layout to be more intuitive for you - under Layout in the menu you can split the main window horizontally or vertically, making one side always the input and one side always the output. I understand your frustrations though, I know it is not a perfect experience. 0.3.0 had a massive front-end redesign, and 0.4.0 will have another (with some advice from expert designers this time.) If you have any other suggestions about the UI I'd love to hear them.
  2. I think you're saying you can't see the entire image at once, so I'm guessing you're having difficulties with the pan/zoom features of the canvas. There is some text in the bottom-left-hand corner of the panel that, when hovered over, gives the controls for the canvas - you can also go to Help > Controls to see the same information. I'll copy the note below.
  3. You are correct that 24 GB is not small, but unfortunately you are still asking too much of DragNUWA anyway; those dimensions are huge. The original model was trained at a tiny 576x320, and the official repository uses 16 GB for that! I've managed to significantly reduce the VRAM usage to the single digits for that size, but it is still the biggest VRAM hog of all the supported video models. I also have 24 GB, and 1024x1024 is about the maximum it will go. You may also find that animations that large do not move very much; again this is because of the difference between the inference resolution and the trained resolution. My go-to resolution has been 1024x576 or 576x1024, and then upscaling from there.
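As a rough back-of-the-envelope illustration of why those dimensions are so demanding (my own arithmetic, not anything Enfugue computes), you can compare the requested pixel count against each model's trained resolution, since memory use grows roughly with the number of pixels per frame:

```python
# Rough proxy for relative load: requested pixels vs. trained pixels per frame.
trained = {
    "DragNUWA": (576, 320),
    "SVD / SVD-XT": (1024, 576),
}
requested = (1792, 1024)  # the size from the report above

for model, (tw, th) in trained.items():
    ratio = (requested[0] * requested[1]) / (tw * th)
    print(f"{model}: {requested[0]}x{requested[1]} is ~{ratio:.1f}x the trained pixel count ({tw}x{th})")
# DragNUWA: ~10.0x, SVD / SVD-XT: ~3.1x
```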

Controls:

General

  1. Move the entire canvas (pan) by placing your cursor over it then holding down Middle-Mouse-Button, or alternatively Ctrl+Left-Mouse-Button or Alt+Left-Mouse-Button (Option⌥+Left-Mouse-Button on MacOS), and move the canvas around.
  2. Zoom in and out using the scroll wheel or scroll gestures. You can also click the + and - icons in the bottom-right-hand corner. Click 'RESET' at any time to bring the canvas back to the initial position.

Painting

  1. Use the scroll wheel or scroll gestures to increase or decrease the size of your brush.
  2. Hold Ctrl when scrolling up/down to stop this behavior and instead perform the general behavior of zooming in/out.
  3. Use Left-Mouse-Button to draw, or Alt+Left-Mouse-Button to erase.
  4. After painting and releasing Left-Mouse-Button, hold Shift when you begin painting again to draw a straight line between the previous final position and your current position using the current brush.

Motion Vectors

  1. Click the Left-Mouse-Button on an empty portion of the canvas to start selecting existing points with a rectangular selector. Hold shift while doing this to add the selected points to your current selection, instead of replacing it. When you left-click on a point on the canvas, it will be grabbed and moved, optionally with shift as well to move all points at once. Left-clicking on a spline instead will select and move the entire spline.
  2. Click Alt+Left-Mouse-Button on an empty section of the canvas to draw a new linear motion vector. Move your mouse and release it to draw a line between those points. When you use Alt and left-click on a point instead of the canvas, the point will be deleted. Alt-left-clicking a spline will add a new point along the segment your mouse is over.
  3. Hold Ctrl+Shift and click the left-mouse-button anywhere on the canvas to rotate all selected points about their center.
  4. When points are selected, press Delete on your keyboard to delete them.
  5. When points are selected, press Ctrl+C on your keyboard to copy them. You must select at least two points on a spline for it to be copied.
  6. Double-Click an anchor point to convert it back-and-forth between linear and bezier. When a point is in bezier mode, there will be one or two control points that control the curvature and can be moved in the same manner as other points.
  7. Holding Ctrl while left-clicking will resume the previous behavior of moving the canvas.
  8. Press Ctrl+Z to undo an action, and Ctrl+Y to redo an action after undoing it.

@Tanghonghan
Author

Hi @painebenjamin

Thank you for the detailed explanation; I think I'm starting to get hold of what you're saying. Last time, when I thought I was working on picture A, I was actually working on picture B on the canvas, and the "samples" view is clearly not the canvas. That is probably what confused me.

Something still bothers me: if I use a picture at 1792x1024, how can I generate a DragNUWA SVD animation with it? Or maybe I can't right now? I tried adjusting the resolution, but as soon as I change it, only part of the original picture is shown and I cannot see the whole picture (and the animation follows suit). If I want to generate a DragNUWA SVD animation, is the maximum resolution I can use 1024x1024, as you said in your last reply? That puts a lot of limitations on the animations we can generate. For example, if I have pictures from Midjourney at a resolution of 1456x816, I will have to lower the resolution before I bring them into Enfugue? That's really a bummer...

Wish you a happy day!

best,

Tanghonghan

@painebenjamin
Owner

Images and video are still very different when it comes to AI generation, I'm afraid. Runway's Gen-2 model runs on big server GPUs and only generates at 768x448, for example, relying on upscaling to go higher. Generating coherent animation is still very expensive in terms of memory, even after all of the optimizations I and others have been able to make.

There are many features in ENFUGUE to make up for these limitations, though. I decided to record a video of me generating an image at the MJ resolution of 1456x816, using that to generate an animation with DragNUWA at 896x512, then upscaling the result to 1792x1024 using AnimateDiff and interpolating 14 frames to 112 frames, all without leaving ENFUGUE, taking 20 minutes total (including working time and render time) on a 3090 Ti. Here is the result of that:
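For reference, the scale factors involved in that workflow (just my own arithmetic, not an Enfugue feature):

```python
# Working out the upscale and interpolation factors used in the recording.
animate = (896, 512)      # DragNUWA animation size
final = (1792, 1024)      # upscaled output size
frames_in, frames_out = 14, 112

print("upscale factor:", final[0] / animate[0], "x", final[1] / animate[1])  # 2.0 x 2.0
print("interpolation factor:", frames_out // frames_in)                      # 8x more frames
```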

b24a7bcb70bf4fa681322ba77b3e1e06.mp4

And here is the video:

anim.mp4

After the video was done I decided to go back and give it another pass just to show you that more can be done with tweaking. For this final animation, I interpolated once to a total of 28 frames prior to upscaling, then instead of doing an upscale step with AnimateDiff, I did it with HotshotXL and the same OpenDallE model that I made the image with. I prefer this to the first:

d9e83c46e55f4003857a127b69c93c55.mp4

@Tanghonghan
Author

Tanghonghan commented Jan 26, 2024

@painebenjamin
Awesome work! I learned a lot in your video, thank you!
I understand there are still a lot of limits on SVD these days, and I try to lower my expectations, but I'm looking forward to progress.
When Enfugue is dealing with more complex images, is it possible to bring more motion to the subjects in the image, not just camera movement? Especially when there are multiple people or creatures in the image.

(screenshot of the image attached)

For example, I found it very hard to get the people in this image moving. Is there any way to achieve that?

@painebenjamin
Owner

Absolutely!

DragNUWA is great for camera motion, with maybe one or two subjects in frame doing the action you wanted. The AI will fill in the remainder of the movement in the frame, and as you've discovered it won't always result in as much motion as you want.

So my first tip is that if you can forego some control, you can give some more power back to the AI to fill in the gaps and it can identify more motion in frame. This is the result of taking that image and running it through SVD-XT without any further processing:

walking.mp4

If you want to interpolate and then add more background motion to the video, doing video-to-video with AnimateDiff and a high Motion Attention Scale (this is 2.75) can add a lot of varying movements in the scene.

f8e76c28e59744a5a866b22500edd901.mp4

@Tanghonghan
Author

@painebenjamin
Those two outcomes are great! Could you teach me how to achieve such wonderful results, please?

@painebenjamin
Owner

@Tanghonghan,

Certainly, it was very simple! I merely brought in the image, scaled it to the ideal SVD dimensions of 1024x576, and selected the SVD-XT model. All of the other settings were default!

Here is a video recording of me setting up a similar run, and here is the result of that run:

000bd2ad47d64eb6879bd7c6bec8bb12

For the second of the two videos, I followed the same upscale steps I showed in the other screen recording!

@Tanghonghan
Author

@painebenjamin
Yeah!
I tried skipping DragNUWA and just using SVD, and got a result similar to yours.

SVD_01098.mp4

But yesterday, when I tried to use DragNUWA in Enfugue, I only got something like this (no subjects moving, just some trash flying in the air), and that is what confused me.

3776b89fee774095bea79590d04def88.mp4

Not using DragNUWA actually generates a better outcome than using it? I still don't know why.

Second question: could you share the workflow for interpolation and video-to-video using AnimateDiff? And what exactly is the Motion Attention Scale (the 2.75 you mentioned)? I only know about motion bucket ID.

Last but not least, I promoted Enfugue in several of my chat groups, which total about 1,000 Chinese ComfyUI users! After a couple of days of use, Enfugue strikes me as an awesome UI under your guidance. I hope it gains more and more users thanks to your continued hard work, and that Enfugue evolves into an even greater version over time!

Best,

Tanghonghan

@painebenjamin
Owner

@Tanghonghan There are a number of reasons base SVD outperforms DragNUWA at this task.

  1. The trained resolution of DragNUWA is 576x320. The trained resolution of SVD is 1024x576. The images we are testing on are much closer to the training resolution of base SVD than DragNUWA.
  2. We aren't sure of all of the training data that went into NUWA, but the examples they showed us generally had one or two subjects, combined with camera motion. The farther we get from the data the AI knows, the worse it performs. SVD's training data was much more diverse.

Overall, NUWA's current power does not lie in its ability to create high-quality video. Instead, its power is in creating highly controllable video, for when you want a specific shot with specific camera language or subject movements. At the moment our best bet for producing high-quality AI video is a combination of methods with SVD, DragNUWA, AnimateDiff and HotshotXL, using the strengths of each to make up for the shortcomings in the others.

In this video, I take the video I made, upscale it with ESRGAN, and interpolate it with FILM. This does not do video-to-video; it is just upscale/interpolate.

0001-0991.mp4

Here is the result:

moved.mp4

I then wanted to take you through a video-to-video workflow. Using the same video, I produced an anime version:

7eb0558ba73446cd8d02552dd71b9498.mp4

This requires a bit more configuration; it is using many different techniques in concert with one another. Here is how I produced that video:

0001-3389.mp4

I know I moved fairly quickly in setting that up so I wanted to write out all the configuration - it is below.

A quick note about that line in the corner you see in the workflow video: this is an artifact that can be produced by both AnimateDiff and HotshotXL when creating animations that are not at the trained resolution of those models, which is 512x512. It appears in most motion modules, but not all. I typically work around it by planning to crop off an edge of the animation (it is not always the same corner). You can also work around it by enabling tiled diffusion, but that makes the render take longer, so it's a trade-off. In the above video, I cropped it out.

Model Configuration

If you weren't aware, going to Configuration Manager under Models lets you create pre-configured groups of models. I used one I have for anime; these are the settings:

  1. Base checkpoint: Counterfeit V3
  2. LoRA: SD 1.5 DPO V1.0 @ 0.99 scale
  3. Textual Inversion: EasyNegative
  4. VAE: sd-vae-ft-mse.safetensors (auto-downloads when selected)
  5. Prompt: anime style, sharp lines, vector art, high-quality digital drawing, japanese cartoon
  6. Negative Prompt: easynegative, shading, smooth shadows, photography, photorealistic, photorealism

Inference Configuration

Global (Left-Hand Side)

  1. Size: 1024x576
  2. Denoising Strength: 0.75

Under 'Tweaks'

  1. Guidance Scale: 8
  2. Inference Steps: 35
  3. Scheduler: Euler Discrete Scheduler Karras
  4. CLIP Skip: 1

Under 'Animation'

  1. Animation: Enabled
  2. Animation Frames: 20

Under 'AnimateDiff and HotshotXL'

  1. Use Motion Attention Scaling: Enabled
  2. Motion Attention Scale Multiplier: 1.25

Prompt

  1. Positive: a group of women walking down the street, traditional japanese clothing, confetti, celebration

Layer Options (Right-Hand Side)

Under 'Image Visibility'

  1. Visibility Mode: Denoised (Image to Image)

Under 'Image Roles'

  1. ControlNet: Enabled

Under 'ControlNet Units'

  1. Unit 1: Pose, conditioning @ 0.9 from 0.05 to 0.9
  2. Unit 2: Depth, conditioning @ 0.8 from 0.05 to 0.7
  3. Unit 3: Soft Edge, conditioning @ 0.5 from 0.05 to 0.65
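For convenience, here are the same settings gathered into a single Python dictionary. This is just my own note-taking shorthand; the structure and key names are mine, not Enfugue's actual configuration schema:

```python
# All values copied from the lists above; structure and key names are illustrative only.
anime_vid2vid_settings = {
    "model": {
        "checkpoint": "Counterfeit V3",
        "lora": {"SD 1.5 DPO V1.0": 0.99},
        "textual_inversion": ["EasyNegative"],
        "vae": "sd-vae-ft-mse.safetensors",
        "prompt": "anime style, sharp lines, vector art, high-quality digital drawing, japanese cartoon",
        "negative_prompt": "easynegative, shading, smooth shadows, photography, photorealistic, photorealism",
    },
    "global": {"size": (1024, 576), "denoising_strength": 0.75},
    "tweaks": {"guidance_scale": 8, "inference_steps": 35,
               "scheduler": "Euler Discrete Scheduler Karras", "clip_skip": 1},
    "animation": {"enabled": True, "frames": 20,
                  "use_motion_attention_scaling": True,
                  "motion_attention_scale_multiplier": 1.25},
    "prompt": "a group of women walking down the street, traditional japanese clothing, confetti, celebration",
    "layer": {"visibility_mode": "Denoised (Image to Image)", "controlnet": True},
    "controlnet_units": [
        {"type": "Pose", "conditioning": 0.9, "start": 0.05, "end": 0.9},
        {"type": "Depth", "conditioning": 0.8, "start": 0.05, "end": 0.7},
        {"type": "Soft Edge", "conditioning": 0.5, "start": 0.05, "end": 0.65},
    ],
}
```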

@Tanghonghan
Author

@painebenjamin
I tried the upscaling and interpolation process, but it always reports this error, even though I did input a prompt.
(screenshots of the error and settings attached)

@Tanghonghan
Author

@painebenjamin
Under 'AnimateDiff and HotshotXL'
Use Motion Attention Scaling: Enabled
Motion Attention Scale Multiplier: 1.25

Layer Options (Right-Hand Side)
Under 'Image Visibility'
Visibility Mode: Denoised (Image to Image)

Two questions:

1. "Motion Attention Scaling": this factor only matters when I want to use AnimateDiff/HotshotXL for a vid2vid process, right? If I only use SVD plus the upscaling/interpolation process, there is no "Motion Attention Scaling", right?
2. Much the same: choosing "Visibility Mode: Denoised (Image to Image)" is only for the vid2vid process. When doing image-to-video, such as SVD or NUWA, I should just choose "Visible (Inpainting/Outpainting)", right?

@painebenjamin
Owner

painebenjamin commented Jan 28, 2024

Regarding your question about the prompt requirement: you need to use the global prompt (located at the bottom of the global options on the left sidebar), not the detail prompt, which is applicable only during the upscaling step. I understand this is confusing - the upscaling interface is slated for a complete redesign to address this issue.

Concerning your other two questions:

  1. Motion Attention Scaling is relevant for AD/HSXL, but not only for video-to-video - it is also used in text-to-video and image-to-video. Think of video generation as involving two types of attention: 'spatial' attention, which ensures consistency in the visible structures of a single frame, and 'temporal' attention, which maintains consistent motion across subsequent frames. Adjusting the motion attention independently allows AD/HSXL to preserve the structure of individual frames while altering the motion between frames by a certain factor. A factor above 1 generally amplifies motion, while a factor below 1 reduces it. (A toy sketch follows this list.)
  2. This aspect is another source of confusion and is being revamped in the upcoming iteration. When you select an option in this context, you are essentially instructing ENFUGUE on the operation to perform during the primary inference stage. This concept became more complex with the introduction of SVD. The 'primary inference' includes the initial processes like text-to-image, text-to-video, image-to-image, image-to-video, video-to-video, etc., followed by post-processing steps such as detailing, upscaling, or interpolation. When SVD first came along it only supported one modality (image-to-video), so it was not suitable as a primary inference engine and was reclassified as a post-processing step. Here's the key point: selecting "Denoised" sets the primary inference step to image-to-image, followed by the image-to-video post-processing step. Choosing "Visible," on the other hand, skips the primary inference step as there is no need to process the image further.
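To make the "temporal attention times a factor" idea concrete, here is a toy PyTorch sketch of the concept. It is my own illustration under assumed mechanics, not Enfugue's or the motion modules' actual code:

```python
import torch

# Conceptual sketch only: temporal attention mixes information across frames, and a
# motion attention scale multiplies that temporal contribution while the per-frame
# (spatial) features themselves are left untouched.
def scaled_temporal_mix(frames: torch.Tensor, motion_attention_scale: float) -> torch.Tensor:
    # frames: (num_frames, channels) toy per-frame feature vectors
    weights = torch.softmax(frames @ frames.T, dim=-1)  # toy frame-to-frame attention
    temporal = weights @ frames - frames                # what attention adds beyond each frame itself
    return frames + motion_attention_scale * temporal   # >1 amplifies motion, <1 dampens it

frames = torch.randn(20, 8)
calmer = scaled_temporal_mix(frames, 0.75)
livelier = scaled_temporal_mix(frames, 2.75)  # the 2.75 value used earlier in this thread
```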

I know that the last point is complex and is indeed one of ENFUGUE's weaker aspects. It's been long overdue for an overhaul, and I'm now dedicating time to improve it for version 0.4.0. If you know any human interface designers or user experience experts interested in contributing to an open-source project, their expertise would be immensely valuable. 😄 Thank you so much for your support and for spreading the word!

@Tanghonghan
Author

@painebenjamin

Hi, I read your explanation a few times and still find those two points a little hard to understand, so let me rephrase them in my own words:
Concerning Motion Attention Scaling, the main point is: above 1 means more motion, while below 1 means less motion but maybe more consistency?
The second question is about Image Visibility: "Denoised" means the process goes through latent space, doing image-to-image before the post-processing step, right? And "Visible" skips the latent-space step and goes straight to post-processing.
That is my understanding of the two; please feel free to correct me if I'm wrong.

I have shared your call for human interface designers and user experience experts in my chat groups; hopefully someone will want to contribute to Enfugue in the days to come.

best,

Tanghonghan
