SONY

Fine-grained Image Editing by Pixel-wise Guidance Using Diffusion Models

Date
2023
Academic Conference
AI for Content Creation Workshop, CVPR 2023 (IEEE/CVF Conference on Computer Vision and Pattern Recognition)
Authors
Naoki Matsunaga (Sony Group Corporation)
Masato Ishii (Sony Group Corporation)
Akio Hayakawa (Sony Group Corporation)
Kenji Suzuki (Sony Group Corporation)
Takuya Narihira (Sony Group Corporation)
Research Areas
AI & Machine Learning

Abstract

Our goal is to develop fine-grained real-image editing methods suitable for real-world applications. In this paper, we first summarize four requirements for these methods and propose a novel diffusion-based image editing framework with pixel-wise guidance that satisfies these requirements. Specifically, we train pixel classifiers on a small amount of annotated data and then infer the segmentation map of a target image. Users then manipulate the map to specify how the image should be edited. We use a pre-trained diffusion model with pixel-wise guidance to generate edited images aligned with the user's intention. Combining the proposed guidance with other techniques enables highly controllable editing while preserving the region outside the edited area, thereby meeting our requirements. The experimental results demonstrate that our proposal outperforms the GAN-based method in both editing quality and speed.
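The idea of pixel-wise guidance can be illustrated with a toy sketch. The code below is not the authors' implementation: it assumes a hypothetical one-parameter logistic "pixel classifier" on a four-pixel grayscale image, omits the diffusion denoising step entirely, and uses made-up names (`pixel_log_prob_grad`, `guided_step`, `infer_map`) and constants. It only shows the general pattern the abstract describes: infer a segmentation map, let the user edit it, push the edited pixels along the gradient of the classifier's log-probability for the target label, and leave everything outside the edited area unchanged.

```python
import math

# Hypothetical toy pixel classifier: p(class 1 | x) = sigmoid(W * x + B).
# W and B are illustrative constants, not values from the paper.
W, B = 4.0, -2.0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def infer_map(image):
    """Infer a binary segmentation map, one label per pixel."""
    return [1 if sigmoid(W * x + B) > 0.5 else 0 for x in image]


def pixel_log_prob_grad(x, target):
    """Gradient of log p(target | x) with respect to the pixel value x."""
    p = sigmoid(W * x + B)
    return (target - p) * W


def guided_step(image, original, target_map, edit_mask, scale=0.05):
    """Apply one guidance step inside the mask; copy the original outside it."""
    out = []
    for x, x0, t, m in zip(image, original, target_map, edit_mask):
        if m:
            out.append(x + scale * pixel_log_prob_grad(x, t))
        else:
            out.append(x0)  # preserve pixels outside the edited area
    return out


original = [0.1, 0.2, 0.8, 0.9]
inferred = infer_map(original)          # map predicted for the input image
target = [0, 1, 1, 1]                   # user relabels the second pixel
edit_mask = [a != b for a, b in zip(inferred, target)]

image = list(original)
for _ in range(200):
    image = guided_step(image, original, target, edit_mask)

edited_map = infer_map(image)
```

In this sketch the edit mask is simply the set of pixels whose label the user changed. A real system would interleave such guidance with the sampling steps of a pre-trained diffusion model, which this example deliberately leaves out.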
