If the people behind DragGAN have their way, we may soon say goodbye to photo editing as we know it. We have featured many Photoshop tutorials in the past, and many of them focus on taking the complex “photoshopping” part out and providing a recipe for the editing process.
But this new editing workflow from DragGAN (available via Hugging Face) takes even what little Photoshop is left out of the equation, replacing it with a simple click-and-drag interface.
The DragGAN concept
DragGAN is not about tools, brushes, and layers. Instead, it lets you click on strategic places in a photo to create points, and then executes your “intent” as you drag those points.
According to the model developers:
Through DragGAN, anyone can deform an image with precise control over where pixels go, thus manipulating the pose, shape, expression, and layout of diverse categories such as animals, cars, humans, landscapes, etc. As these manipulations are performed on the learned generative image manifold of a GAN, they tend to produce realistic outputs even for challenging scenarios such as hallucinating occluded content and deforming shapes that consistently follow the object’s rigidity.
If this sounds like gibberish to you, just imagine that you can now edit an element in a photo with a drag of the mouse. For example, change eyes from closed to open, make a skirt longer, or make a car smaller or bigger, and so on. And all that while retaining a realistic look and without opening Photoshop.
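For the technically curious, here is a rough sketch of what “manipulating on the learned image manifold” means in practice: instead of moving pixels directly, the latent code feeding a pretrained GAN is optimized so that the image feature at a handle point shifts one small step toward a target point. This is only an illustrative sketch under assumptions, not the authors’ code: the generator `G` (assumed to return both an image and an intermediate feature map) and all names are hypothetical.

```python
# Illustrative sketch only, not the DragGAN authors' code.
# Assumes a pretrained StyleGAN-style generator `G` whose forward pass
# returns (image, feature_map); that interface and the names below are
# hypothetical, chosen to show the idea of point-based latent editing.
import torch
import torch.nn.functional as nnf

def drag_step(G, w, handle, target, step=2.0, lr=2e-3):
    """One 'motion supervision' step: optimize the latent code `w` so the
    feature found at `handle` moves a small step toward `target`."""
    w = w.clone().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)

    _, feat = G(w)               # assumed: generator returns (image, features)
    feat_ref = feat.detach()     # frozen reference features

    # Unit direction from the handle point toward the target, scaled to one step
    d = target - handle
    d = d / (d.norm() + 1e-8) * step

    def sample(fmap, xy):
        # Bilinearly read one feature vector at pixel coordinates (x, y)
        h, fw = fmap.shape[-2:]
        gx = 2 * xy[0] / (fw - 1) - 1
        gy = 2 * xy[1] / (h - 1) - 1
        grid = torch.stack([gx, gy]).view(1, 1, 1, 2).to(fmap)
        return nnf.grid_sample(fmap, grid, align_corners=True)

    # The feature one step toward the target should match the reference
    # feature at the handle, so content gets "dragged" in that direction.
    loss = (sample(feat, handle + d) - sample(feat_ref, handle)).abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return w.detach()
```

A call like `w = drag_step(G, w, torch.tensor([230.0, 180.0]), torch.tensor([260.0, 180.0]))` would then be repeated until the handle reaches the target (the real method also re-tracks the handle point between steps). The key point is that only the latent code changes, which is why the GAN keeps the result looking like a plausible photo.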
And this is quite amazing. Amazing to the point that interest crashed the DragGAN website.
DragGAN is not another MidJourney or DALL-E AI
But you might be wondering: how is this tool different from other photo editing tools that can change facial expressions and other features? Or, for that matter, from your “regular” image-generation AI?
Well, to start with, DragGAN is not generating images. It is editing them. And doing it amazingly well for a first-generation, still-in-research tool. In fact, if you look at the original and the edited photo side by side, it would be quite hard to figure out which is which.
But this model can do things that other editing software simply can’t, like change the angle of an object, not just its perspective, or “invent” the details it needs to make a size change look more realistic.
On the way to computerless editing?
So, I want to play a little bit with LEGOs, because I can. Adobe Firefly already has tools that can take textual instructions (a.k.a. prompts) and use them to transform videos, images, and sounds. We also have Whisper, which can understand pretty much anything that anyone says in almost any language. And DragGAN feels like the first piece in a puzzle where you will be able to manipulate an image without any expertise.
So, it feels to me like editing in the future is going to look more like “Hi Siri, please change this shirt into a red dress and make the model wear a hat” than actually using a mouse and keyboard.
Of course, you can pair this with AI image generators like Stable Diffusion or MidJourney and skip the whole taking-a-photo part altogether.
[via Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold]