
I’ve been making a lot of stuff for my D&D buddies using Stable Diffusion. With hands, I basically brute force it. Using an A100 40GB on Colab I can generate ~28 batches in about a minute (depending on prompt size; Automatic1111 allows prompts above the 75-token limit at the expense of more vRAM per image), filter those, pick the one with the best hands, then feed it back in using inpainting (regenerating just that small region, not the whole image). Eventually I end up with one set of good hands and 100 sets of bad ones. If you’ve got a mysterious sixth finger, you just inpaint it off: mask it, fill the masked area with latent noise instead of the original picture (just a checkbox in the UI), set your denoising to 0.80+, and it’ll replace the finger with the background pretty consistently.
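In case it helps anyone scripting this outside the UI: the "regenerate just that small space" step boils down to handing the model a white-on-black mask over the bad region. A minimal sketch with Pillow (the box coordinates here are made up for illustration; in diffusers you'd pass this mask plus a high `strength` value to an inpainting pipeline to get the same effect as the 0.80+ denoising setting):

```python
from PIL import Image, ImageDraw

def make_inpaint_mask(image_size, region_box):
    """Build the mask inpainting expects: white pixels get regenerated,
    black pixels are kept from the original image."""
    mask = Image.new("L", image_size, 0)                  # keep everything by default
    ImageDraw.Draw(mask).rectangle(region_box, fill=255)  # regenerate only this box
    return mask

# Example: mask a 128x128 patch around a bad hand in a 512x512 render.
mask = make_inpaint_mask((512, 512), (300, 350, 428, 478))
```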


Yeah, I fiddle with it locally, and img2img/inpainting is very helpful for these kinds of touch-ups. Currently playing with LoRA training to put my friends into pictures, but I haven't figured it out well enough to get it working with inpainting; it's still easier to Photoshop their face in and use inpainting to merge everything together.
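The "Photoshop their face in, then let inpainting merge it" trick works fine without Photoshop too: paste the face crop with a feathered alpha mask so the seam is already soft before the model blends it. A rough sketch with Pillow (function name and the feather width are my own choices, not anything from a specific tool):

```python
from PIL import Image, ImageFilter

def paste_face(base, face, position, feather=8):
    """Paste a face crop onto the base image with a feathered edge.
    A blurred alpha mask softens the border so inpainting only has to
    clean up a gentle seam, not a hard rectangle."""
    mask = Image.new("L", face.size, 0)
    solid = Image.new("L", (face.size[0] - 2 * feather, face.size[1] - 2 * feather), 255)
    mask.paste(solid, (feather, feather))                 # solid core, dark border
    mask = mask.filter(ImageFilter.GaussianBlur(feather)) # feather the transition
    out = base.copy()
    out.paste(face, position, mask)
    return out
```

After pasting, you'd run inpainting with a mask covering just the seam region at a moderate denoising strength, so the face itself stays put while the edges get regenerated.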



