- Announcement:
- - Intro to diffusion models for text-to-image and image editing
- - DreamFusion — the power of pretrained diffusion models for 3D synthesis, and some following works
- - InstructNeRF2NeRF — prompt-based editing of 3D models
- - APAP — dragging manipulations with diffusion models as priors
- Plan:
- - About
- - I'll go over a few things without dwelling too long on any single one, just hopping between cool ideas.
- - This is tough, but we have a whiteboard for explanations, so we can detour at any point
- - Context
- - reminder: ASK PEOPLE WHETHER THEY KNOW THIS STUFF
- - Everyone knows about DALL-E / Midjourney / Stable Diffusion
- - There's stuff like ComfyUI and other community efforts for making Stable Diffusion smarter
- - Like taking in human pose, or normals, or depth
- - There have also been attempts at generating multi-view images of the same object with Stable Diffusion, for gaming assets
- - Well, why not use the power of diffusion models for actually creating 3D objects?
- - Intro to diffusion
- - REMINDER TO ASK WHETHER ANYONE KNOWS ABOUT DIFFUSION OR NEURAL NETWORKS
- - The base diagram – forward process and reverse process
- - Funny formula slides
- - Btw, all this math is bullshit, researchers intentionally complicate this to make their papers look smarter
- - Here's the formula (with all the notation, still looks tough)
- - Let's simplify step-by-step
- - Take an image
- - Add noise
- - Ask the model to predict the noise
- - Loss is MSE(true_noise, predicted_noise)
- - Now let's put the notation back in and expand add_noise/reduce_noise, and we get exactly that formula (written out below)
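For reference, the simplified DDPM objective those steps build up to (standard notation from the Ho et al. paper: x_0 is the image, epsilon the sampled noise, epsilon_theta the network, alpha-bar_t the cumulative noise schedule):

```latex
L_{\text{simple}} = \mathbb{E}_{x_0,\; \epsilon \sim \mathcal{N}(0, I),\; t}
  \left[ \left\| \epsilon - \epsilon_\theta\!\left( \sqrt{\bar\alpha_t}\, x_0
  + \sqrt{1 - \bar\alpha_t}\, \epsilon,\; t \right) \right\|^2 \right]
```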
- - During sampling, to spread the work over multiple steps, we start from pure noise, remove only part of the predicted noise, add a bit of fresh noise back, and repeat (sketched below)
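A toy PyTorch sketch of both pieces, assuming a network `model(noisy, t)` that predicts the noise and precomputed `alpha`/`alpha_bar` schedule tensors; all names here are illustrative, not any particular library's API:

```python
import torch
import torch.nn.functional as F

def train_step(model, optimizer, image, alpha_bar):
    # Take an image, add noise at a random timestep t,
    # ask the model to predict the noise;
    # loss is MSE(true_noise, predicted_noise)
    t = torch.randint(0, len(alpha_bar), (image.shape[0],))
    noise = torch.randn_like(image)
    ab = alpha_bar[t].view(-1, 1, 1, 1)
    noisy = ab.sqrt() * image + (1 - ab).sqrt() * noise
    loss = F.mse_loss(model(noisy, t), noise)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

@torch.no_grad()
def sample(model, alpha, alpha_bar, shape):
    # Start from pure noise and walk backwards through the steps,
    # removing the predicted noise only partially and re-injecting
    # a bit of fresh noise at every step except the last
    x = torch.randn(shape)
    for t in reversed(range(len(alpha_bar))):
        tt = torch.full((shape[0],), t)
        eps = model(x, tt)
        x = (x - (1 - alpha[t]) / (1 - alpha_bar[t]).sqrt() * eps) / alpha[t].sqrt()
        if t > 0:
            x = x + (1 - alpha[t]).sqrt() * torch.randn_like(x)
    return x
```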
- - For editing
- - Basic idea from SDEdit — just add noise to the image and denoise with another prompt (sketched after this block)
- - InstructPix2Pix — fine-tune the model to also take the source image as input, so the edit preserves its geometry
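A minimal sketch of the SDEdit idea, reusing the toy sampler's conventions from above; prompt conditioning is assumed to be baked into `model` to keep it short:

```python
import torch

@torch.no_grad()
def sdedit(model, alpha, alpha_bar, image, t0):
    # SDEdit: noise the source image only up to an intermediate step t0
    # (so its coarse structure survives), then denoise as usual while
    # `model` is conditioned on the *new* prompt
    x = alpha_bar[t0].sqrt() * image + (1 - alpha_bar[t0]).sqrt() * torch.randn_like(image)
    for t in reversed(range(t0 + 1)):
        tt = torch.full((image.shape[0],), t)
        eps = model(x, tt)
        x = (x - (1 - alpha[t]) / (1 - alpha_bar[t]).sqrt() * eps) / alpha[t].sqrt()
        if t > 0:
            x = x + (1 - alpha[t]).sqrt() * torch.randn_like(x)
    return x
```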
- - For 3D generation
- - REMINDER TO ASK ABOUT WHETHER ANYONE KNOWS ABOUT 3D
- - Once again NeRFs — WE TRAIN THEM ONCE PER SCENE
- - Usually we have photos, train the NeRF on them, and get the 3D object that way
- - Now what if we take the rendered image from some viewpoint, ask the diffusion model to "make it more like prompt T", and use that for training?
- - That is the idea behind DreamFusion (show images; loss sketched after this block)
- - Follow-up works do the same but fine-tune the diffusion model on multi-view renders of Objaverse (MVDream)
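A rough sketch of DreamFusion's Score Distillation Sampling step, assuming a differentiable `nerf.render(camera)` and a text-conditioned noise predictor `diffusion(noisy, t, text_emb)`; both are hypothetical names standing in for the real components:

```python
import torch

def sds_step(diffusion, nerf, optimizer, camera, text_emb, alpha_bar):
    # Render the current NeRF from a random viewpoint, noise the render,
    # and use the diffusion model's noise-prediction error as a gradient
    # pushed back through the differentiable renderer
    image = nerf.render(camera)
    t = torch.randint(20, 980, (1,))
    noise = torch.randn_like(image)
    ab = alpha_bar[t]
    noisy = ab.sqrt() * image + (1 - ab).sqrt() * noise
    with torch.no_grad():  # SDS skips the diffusion model's own Jacobian
        eps_pred = diffusion(noisy, t, text_emb)
    grad = eps_pred - noise  # optionally scaled by a weight w(t)
    # Surrogate loss: its gradient with respect to `image` is exactly `grad`
    loss = (grad * image).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```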
- - For 3D editing
- - Apply the same idea, but starting from a NeRF we already have, and use InstructPix2Pix so the edits preserve the geometry of the object
- - This gets us InstructNeRF2NeRF (update loop sketched below)
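A simplified sketch of InstructNeRF2NeRF's iterative dataset update; `nerf.render`, `ip2p_edit`, and the dataset layout are hypothetical placeholders:

```python
import random
import torch
import torch.nn.functional as F

def in2n_round(nerf, optimizer, cameras, images, ip2p_edit, instruction):
    # Periodically replace one training view with an InstructPix2Pix edit
    # of the current render (conditioned on the original photo, so the
    # geometry is preserved), then keep fitting the NeRF to the dataset
    i = random.randrange(len(cameras))
    with torch.no_grad():
        render = nerf.render(cameras[i])
    images[i] = ip2p_edit(render, original=images[i], prompt=instruction)
    for cam, img in zip(cameras, images):
        loss = F.mse_loss(nerf.render(cam), img)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```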
- - For 3D editing with dragging
- - For simplicity assume we're working with a mesh
- - On it we can define a differential-geometry method that minimizes distortion of local neighborhoods after vertices are moved around, resulting in a nice deformation (ARAP, As-Rigid-As-Possible; energy below)
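For the whiteboard, the ARAP energy being minimized (standard form from Sorkine and Alexa; v_i are rest positions, v'_i deformed ones, R_i a per-vertex rotation, w_ij cotangent weights):

```latex
E(V') = \sum_i \sum_{j \in \mathcal{N}(i)} w_{ij}
  \left\| (v'_i - v'_j) - R_i (v_i - v_j) \right\|^2
```

It is minimized by alternating between fitting the best rotations R_i (local step) and solving a linear system for the positions (global step).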
- - Now let's add a diffusion model on top of that as a prior; this becomes APAP (sketched below)
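And a very loose sketch of the APAP-style combination, just to show where the diffusion prior plugs in; `arap_energy`, `render`, and `sds_grad` are hypothetical helpers, with `sds_grad` producing the SDS-style gradient as in the DreamFusion sketch above:

```python
import torch

def apap_step(optimizer, verts, rest_verts, faces, handle_idx, handle_pos,
              render, arap_energy, sds_grad):
    # `verts` is the optimized tensor (requires_grad=True, held by the optimizer).
    # Satisfy the dragged handles with an ARAP-style deformation energy,
    # while an SDS-style gradient from a 2D diffusion prior on the render
    # keeps the deformed shape looking plausible
    e_arap = arap_energy(verts, rest_verts, faces)
    e_handle = ((verts[handle_idx] - handle_pos) ** 2).sum()
    image = render(verts, faces)
    e_prior = (sds_grad(image) * image).sum()  # same surrogate trick as in SDS
    loss = e_arap + e_handle + e_prior
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```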