
Journal of Shandong University (Natural Science) ›› 2026, Vol. 61 ›› Issue (6): 95-106. doi: 10.6040/j.issn.1671-9352.0.2026.045


面向身份信息保持的肖像发型移除研究

姚勋祥1*,徐华2,徐英城3,张鹏2,赵建敏4   

  1.山东财经大学计算机科学与人工智能学院, 山东 济南 250014;2.山东科技大学计算机科学与工程学院, 山东 青岛 266590;3.山东财经大学管理科学与工程学院, 山东 济南 250014;4.潍坊市融媒体中心, 山东 潍坊 261000
  • 发布日期:2026-06-04
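  • About the author: YAO Xunxiang (b. 1989), male, lecturer, Ph.D.; research interests: image super-resolution and fractals, object detection. E-mail: Xunxiang.Yao@sdufe.edu.cn
  • Funding:
    National Natural Science Foundation of China (62506209); Natural Science Foundation of Shandong Province (ZR2024QF016, ZR2023QF161); Youth Innovation Team Project of Higher Education Institutions of Shandong Province (2022KJ185)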

Research on identity-preserving portrait hairstyle removal

YAO Xunxiang1, XU Hua2, XU Yingcheng3, ZHANG Peng2, ZHAO Jianmin4   

  1. School of Computer Science and Artificial Intelligence, Shandong University of Finance and Economics, Jinan 250014, Shandong, China;
    2. School of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, Shandong, China;
    3. School of Management Science and Engineering, Shandong University of Finance and Economics, Jinan 250014, Shandong, China;
    4. Weifang Financial Media Center, Weifang 261000, Shandong, China
  • Published:2026-06-04

摘要: 肖像发型移除技术能高效移除现有发型,生成高保真度的光头图像,为用户提供便捷的虚拟发型更换体验。该技术同时可为3D人脸重建提供无遮挡面部纹理数据,提升3D人脸模型的真实感和细节表现力。然而,由于发型几何结构复杂多变、存在帽饰等物品的遮挡干扰,以及缺乏成对训练数据集,实现高质量的肖像发型移除仍面临重大挑战。现有方法往往难以兼顾身份信息保持和遮挡物去除的双重需求。因此,本文提出一种面向身份信息保持的肖像发型移除框架,用于从肖像图像中移除发型和帽饰等遮挡物,生成自然真实的光头图像。该框架首先采用SegFace人脸语义分割模型获取头发与帽子的掩膜区域,随后训练一个光头生成器专注于掩膜区域内容生成,确保新生成的内容在肤色、阴影效果及语义等方面与原始面部和背景高度兼容,通过增加身份损失约束,在实现发型移除的同时保持身份一致性。针对发饰遮挡这一技术难点(包括长度可变性和样式多样性),本文方法结合面部关键点与Bézier曲线对眉毛下方区域进行拟合,从而减少对身份相关面部区域的干扰。实验结果表明,本文方法能够高效去除各类发型和帽饰遮挡,提升发型迁移效果。

关键词: 发型移除, 扩散模型, 遮挡, 语义分割, Bézier曲线

Abstract: Portrait hairstyle removal aims to eliminate existing hairstyles from portrait images and generate high-fidelity bald images. It not only provides users with a flexible tool for virtual hairstyle editing, but also supplies unobstructed facial texture information for 3D face reconstruction, thereby improving the realism and geometric detail of reconstructed face models. However, achieving high-quality hairstyle removal remains challenging due to the complex and highly variable geometry of hairstyles, interference from occlusions such as hats and hair accessories, and the scarcity of paired training data. Existing methods often struggle to balance effective occlusion removal with faithful identity preservation. To address these issues, this paper proposes an identity-preserving portrait hairstyle removal framework for removing hairstyles and hat-related occlusions while generating natural and realistic bald portraits. First, the SegFace semantic segmentation model is employed to extract mask regions corresponding to hair and hats. A bald generator is then trained to focus on content synthesis within the masked regions, so that the generated content remains consistent with the original face and background in terms of skin tone, illumination, and semantic continuity. In addition, an identity loss is introduced to preserve facial identity during hairstyle removal. To further handle hair accessory occlusions with diverse shapes and spatial extents, facial landmarks are combined with Bézier curve fitting to refine the region below the eyebrows, thereby reducing interference with identity-related facial areas. Experimental results demonstrate that the proposed method effectively removes a wide range of hairstyles and hat-related occlusions while maintaining natural visual quality and identity consistency.

Key words: hairstyle removal, diffusion models, occlusion, semantic segmentation, Bézier curve
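
The abstract names three ingredients: content generated only inside a hair/hat mask, an identity loss, and a Bézier curve built from facial landmarks to protect identity-related regions. The following is a minimal sketch of how such pieces could look, not the authors' implementation. All function names are hypothetical. It assumes the identity loss is one minus the cosine similarity of two face embeddings, that the mask is a binary grid, and that the curve's control points come from eyebrow landmarks so that mask pixels at or below the curve are cleared. Note that a Bézier curve passes through its end control points only and merely approximates the interior ones.

```python
import math


def identity_loss(emb_src, emb_gen):
    """Assumed identity loss: 1 - cosine similarity of two embedding vectors."""
    dot = sum(a * b for a, b in zip(emb_src, emb_gen))
    norm_src = math.sqrt(sum(a * a for a in emb_src))
    norm_gen = math.sqrt(sum(b * b for b in emb_gen))
    return 1.0 - dot / (norm_src * norm_gen)


def composite(original, generated, mask):
    """Use generated pixels where mask == 1 and original pixels where mask == 0."""
    return [
        [m * g + (1 - m) * o for o, g, m in zip(row_o, row_g, row_m)]
        for row_o, row_g, row_m in zip(original, generated, mask)
    ]


def bezier_point(ctrl, t):
    """Evaluate a Bézier curve at t in [0, 1] with de Casteljau's algorithm."""
    pts = [tuple(p) for p in ctrl]
    while len(pts) > 1:
        # Repeated linear interpolation between neighbouring points.
        pts = [
            ((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
            for (x0, y0), (x1, y1) in zip(pts, pts[1:])
        ]
    return pts[0]


def clip_mask_below_curve(mask, ctrl, samples=200):
    """Clear mask pixels at or below the curve (row index grows downward)."""
    height, width = len(mask), len(mask[0])
    boundary = [height] * width  # columns the curve never reaches stay unclipped
    for i in range(samples + 1):
        x, y = bezier_point(ctrl, i / samples)
        col = int(round(x))
        if 0 <= col < width:
            boundary[col] = min(boundary[col], int(round(y)))
    return [
        [mask[r][c] if r < boundary[c] else 0 for c in range(width)]
        for r in range(height)
    ]
```

In this sketch, `clip_mask_below_curve` would run on the segmentation mask before generation, `composite` would merge the generator output back into the portrait, and `identity_loss` would be added to the training objective with a weight that the abstract does not specify.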

CLC number:

  • TP391