Photo animation transforms photos of real-world scenes into anime-style images, a challenging task in AIGC (AI-Generated Content). Although previous methods have achieved promising results, they often introduce noticeable artifacts or distortions. In this paper, we propose a novel double-tail generative adversarial network (DTGAN) for fast photo animation. DTGAN is the third version of the AnimeGAN series and is therefore also called AnimeGANv3. The generator of DTGAN has two output tails: a support tail that outputs coarse-grained anime-style images and a main tail that refines these coarse-grained images. In DTGAN, we propose a novel learnable normalization technique, termed linearly adaptive denormalization (LADE), to prevent artifacts in the generated images. To improve the visual quality of the generated anime-style images, two novel loss functions suited to photo animation are proposed: 1) the region smoothing loss, which weakens the texture details of the generated images to achieve anime effects with abstract details; and 2) the fine-grained revision loss, which eliminates artifacts and noise in the generated anime-style images while preserving clear edges. Furthermore, the generator of DTGAN is a lightweight framework with only 1.02 million parameters in the inference phase. The proposed DTGAN can easily be trained end-to-end with unpaired training data. Extensive experiments demonstrate, both qualitatively and quantitatively, that our method produces high-quality anime-style images from real-world photos and outperforms state-of-the-art models.
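The abstract names several components (the LADE layer and the region smoothing loss) without detail. As a rough illustration only, the following PyTorch sketch shows one plausible reading of each: LADE as parameter-free instance normalization whose scale and shift are predicted linearly (1x1 convolutions) from the input features, and the region smoothing loss as an L1 pull toward a box-blurred copy of the generated image. All names, module structure, and the blur-based target are assumptions made for this sketch, not the authors' implementation; consult the paper (doi: 10.1587/transinf.2023EDP7061) for the exact formulations.

import torch
import torch.nn as nn
import torch.nn.functional as F

class LADE(nn.Module):
    # Hypothetical LADE-style layer: parameter-free instance norm,
    # denormalized with per-pixel scale/shift that are linear (1x1 conv)
    # functions of the input feature map itself.
    def __init__(self, channels: int):
        super().__init__()
        self.norm = nn.InstanceNorm2d(channels, affine=False)
        self.to_gamma = nn.Conv2d(channels, channels, kernel_size=1)
        self.to_beta = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        gamma = self.to_gamma(x)   # adaptive scale, derived from x itself
        beta = self.to_beta(x)     # adaptive shift, derived from x itself
        return self.norm(x) * gamma + beta

def region_smoothing_loss(fake: torch.Tensor, k: int = 5) -> torch.Tensor:
    # Hypothetical region smoothing loss: pull the generated image toward
    # a locally averaged (box-blurred) copy of itself, weakening fine
    # textures to mimic the abstract, flat regions typical of anime.
    blurred = F.avg_pool2d(fake, kernel_size=k, stride=1, padding=k // 2)
    return F.l1_loss(fake, blurred)

# Shape check on random data.
if __name__ == "__main__":
    feat = torch.randn(1, 64, 32, 32)
    print(LADE(64)(feat).shape)               # torch.Size([1, 64, 32, 32])
    img = torch.rand(1, 3, 256, 256)
    print(region_smoothing_loss(img).item())  # scalar loss value

In a double-tail setup, a loss like this would plausibly supervise the support tail (coarse, smoothed output), while the fine-grained revision loss supervises the main tail; this division of labor is inferred from the abstract, not a statement of the paper's architecture.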
Gang LIU
Hubei University of Technology
Xin CHEN
Wuhan TianYu Information Industry CO., LTD.
Zhixiang GAO
Wuhan College
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Gang LIU, Xin CHEN, Zhixiang GAO, "A Novel Double-Tail Generative Adversarial Network for Fast Photo Animation" in IEICE TRANSACTIONS on Information and Systems,
vol. E107-D, no. 1, pp. 72-82, January 2024, doi: 10.1587/transinf.2023EDP7061.
Abstract: Photo animation transforms photos of real-world scenes into anime-style images, a challenging task in AIGC (AI-Generated Content). Although previous methods have achieved promising results, they often introduce noticeable artifacts or distortions. In this paper, we propose a novel double-tail generative adversarial network (DTGAN) for fast photo animation. DTGAN is the third version of the AnimeGAN series and is therefore also called AnimeGANv3. The generator of DTGAN has two output tails: a support tail that outputs coarse-grained anime-style images and a main tail that refines these coarse-grained images. In DTGAN, we propose a novel learnable normalization technique, termed linearly adaptive denormalization (LADE), to prevent artifacts in the generated images. To improve the visual quality of the generated anime-style images, two novel loss functions suited to photo animation are proposed: 1) the region smoothing loss, which weakens the texture details of the generated images to achieve anime effects with abstract details; and 2) the fine-grained revision loss, which eliminates artifacts and noise in the generated anime-style images while preserving clear edges. Furthermore, the generator of DTGAN is a lightweight framework with only 1.02 million parameters in the inference phase. The proposed DTGAN can easily be trained end-to-end with unpaired training data. Extensive experiments demonstrate, both qualitatively and quantitatively, that our method produces high-quality anime-style images from real-world photos and outperforms state-of-the-art models.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2023EDP7061/_p
@ARTICLE{e107-d_1_72,
author={Gang LIU and Xin CHEN and Zhixiang GAO},
journal={IEICE TRANSACTIONS on Information and Systems},
title={A Novel Double-Tail Generative Adversarial Network for Fast Photo Animation},
year={2024},
volume={E107-D},
number={1},
pages={72-82},
abstract={Photo animation transforms photos of real-world scenes into anime-style images, a challenging task in AIGC (AI-Generated Content). Although previous methods have achieved promising results, they often introduce noticeable artifacts or distortions. In this paper, we propose a novel double-tail generative adversarial network (DTGAN) for fast photo animation. DTGAN is the third version of the AnimeGAN series and is therefore also called AnimeGANv3. The generator of DTGAN has two output tails: a support tail that outputs coarse-grained anime-style images and a main tail that refines these coarse-grained images. In DTGAN, we propose a novel learnable normalization technique, termed linearly adaptive denormalization (LADE), to prevent artifacts in the generated images. To improve the visual quality of the generated anime-style images, two novel loss functions suited to photo animation are proposed: 1) the region smoothing loss, which weakens the texture details of the generated images to achieve anime effects with abstract details; and 2) the fine-grained revision loss, which eliminates artifacts and noise in the generated anime-style images while preserving clear edges. Furthermore, the generator of DTGAN is a lightweight framework with only 1.02 million parameters in the inference phase. The proposed DTGAN can easily be trained end-to-end with unpaired training data. Extensive experiments demonstrate, both qualitatively and quantitatively, that our method produces high-quality anime-style images from real-world photos and outperforms state-of-the-art models.},
keywords={},
doi={10.1587/transinf.2023EDP7061},
ISSN={1745-1361},
month={January},}
TY - JOUR
TI - A Novel Double-Tail Generative Adversarial Network for Fast Photo Animation
T2 - IEICE TRANSACTIONS on Information and Systems
SP - 72
EP - 82
AU - Gang LIU
AU - Xin CHEN
AU - Zhixiang GAO
PY - 2024
DO - 10.1587/transinf.2023EDP7061
JO - IEICE TRANSACTIONS on Information and Systems
SN - 1745-1361
VL - E107-D
IS - 1
JA - IEICE TRANSACTIONS on Information and Systems
Y1 - January 2024
AB - Photo animation transforms photos of real-world scenes into anime-style images, a challenging task in AIGC (AI-Generated Content). Although previous methods have achieved promising results, they often introduce noticeable artifacts or distortions. In this paper, we propose a novel double-tail generative adversarial network (DTGAN) for fast photo animation. DTGAN is the third version of the AnimeGAN series and is therefore also called AnimeGANv3. The generator of DTGAN has two output tails: a support tail that outputs coarse-grained anime-style images and a main tail that refines these coarse-grained images. In DTGAN, we propose a novel learnable normalization technique, termed linearly adaptive denormalization (LADE), to prevent artifacts in the generated images. To improve the visual quality of the generated anime-style images, two novel loss functions suited to photo animation are proposed: 1) the region smoothing loss, which weakens the texture details of the generated images to achieve anime effects with abstract details; and 2) the fine-grained revision loss, which eliminates artifacts and noise in the generated anime-style images while preserving clear edges. Furthermore, the generator of DTGAN is a lightweight framework with only 1.02 million parameters in the inference phase. The proposed DTGAN can easily be trained end-to-end with unpaired training data. Extensive experiments demonstrate, both qualitatively and quantitatively, that our method produces high-quality anime-style images from real-world photos and outperforms state-of-the-art models.
ER -