Image Generation Techniques Based on Deep Learning (基于深度学习的图像生成技术)
李红运
Subtype: Master's
2019-05-22
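Degree Grantor: University of Chinese Academy of Sciences
Place of Conferral: Institute of Optics and Electronics, Chinese Academy of Sciences
Degree Name: Master of Engineering
Keyword: data augmentation; image generation; adversarial networks; matting algorithm; semantic segmentation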
Abstract

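Deep learning builds on deep neural networks to find and establish mappings between data and tasks, and has reached state-of-the-art performance on many computer vision tasks such as object recognition, detection, tracking, and segmentation. It has likewise been widely applied, and has drawn considerable attention, in other areas of artificial intelligence such as natural language processing (NLP) and intelligent robotics.
Despite these successes, as networks grow deeper and structurally more complex, training them requires large amounts of labeled data in order to preserve generalization and avoid overfitting. This means both collecting large amounts of data and spending substantial human and material resources on annotation, which makes deep learning considerably harder and limits its application to some extent. This thesis therefore studies techniques for generating the data that deep learning requires.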
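Based on the assumption that images transition linearly and smoothly in the high-dimensional pixel space, a mixed-superposition image generation algorithm is proposed. It constructs the convex set supported by the existing training images and obtains new training images by random sampling from that set, which effectively expands the training set. Experiments show that networks trained with images generated this way perform better than networks trained on the original images alone.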
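The convex-set sampling idea can be illustrated with a minimal NumPy sketch. The abstract does not state how the convex combinations are drawn, so everything specific here is an assumption: the function name `sample_convex_images`, the choice of mixing `k` randomly picked images, and the Dirichlet distribution for the weights are all hypothetical and not taken from the thesis.

```python
import numpy as np

def sample_convex_images(images, n_new, k=2, alpha=1.0, rng=None):
    """Draw n_new points from the convex hull of a training image set.

    Assumed scheme (not from the thesis): each new image mixes k distinct
    training images with Dirichlet(alpha) weights, so the weights are
    non-negative and sum to 1.
    """
    rng = np.random.default_rng(rng)
    images = np.asarray(images, dtype=np.float64)
    n = images.shape[0]
    new_images = np.empty((n_new,) + images.shape[1:], dtype=np.float64)
    weights = np.zeros((n_new, n))
    for i in range(n_new):
        idx = rng.choice(n, size=k, replace=False)   # which images to mix
        w = rng.dirichlet(alpha * np.ones(k))        # convex weights
        weights[i, idx] = w
        # Weighted sum over the k selected images.
        new_images[i] = np.tensordot(w, images[idx], axes=1)
    return new_images, weights
```

Calling `sample_convex_images(train_images, n_new=1000, k=2)` on an array of shape `(N, H, W, C)` returns the mixed images together with the weight matrix used to build them. The weights are returned so that a caller could also mix one-hot labels with them; how labels are assigned to generated images is not described in the abstract.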
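To address the tendency of data-generating adversarial networks to collapse and fail to converge, this thesis analyzes the causes of unstable adversarial training and proposes a weight gradient penalty method to stabilize it. The method constrains the discriminator's gradients during adversarial training so that discriminator and generator updates stay matched, which overcomes training collapse. Experiments show that the method stabilizes training effectively while improving the quality of the generated images.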
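The abstract does not give the formulation of the weight gradient penalty, so it cannot be reproduced here. As a stand-in, the sketch below shows a different but well-known way of constraining a discriminator's gradients: a WGAN-GP-style penalty on the norm of the input gradient, evaluated at points interpolated between real and generated samples. The toy one-hidden-layer critic, the function names, and the default `lam=10.0` are assumptions chosen for illustration only.

```python
import numpy as np

def critic(x, W, b, v):
    """Toy critic D(x) = v . tanh(W x + b) for a batch x of shape (n, d)."""
    return np.tanh(x @ W.T + b) @ v

def critic_input_grad(x, W, b, v):
    """Analytic gradient of D with respect to each input row, shape (n, d)."""
    h = np.tanh(x @ W.T + b)            # hidden activations, shape (n, m)
    return ((1.0 - h ** 2) * v) @ W     # chain rule through tanh

def gradient_penalty(real, fake, W, b, v, lam=10.0, rng=None):
    """WGAN-GP-style penalty: lam * mean((||grad_x D(x_mix)|| - 1)^2).

    This is a swapped-in technique, not the thesis's own penalty.
    """
    rng = np.random.default_rng(rng)
    eps = rng.uniform(size=(real.shape[0], 1))
    mixed = eps * real + (1.0 - eps) * fake     # random interpolates
    norms = np.linalg.norm(critic_input_grad(mixed, W, b, v), axis=1)
    return lam * np.mean((norms - 1.0) ** 2)
```

In a training loop, a term like this would be added to the discriminator loss so that its updates cannot outpace the generator's. In a real framework the gradient would come from automatic differentiation rather than the hand-derived formula used for this toy critic.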
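To generate more diverse data, this thesis studies an image generation algorithm based on patch matting. To address the performance drop of matting algorithms in similarly colored image regions, the algorithm uses SLIC to segment the image into superpixels and computes the color-similarity relations between superpixels in order to locate similarly colored regions. It then uses semantic feature vectors to strengthen discrimination within those regions, so that ordinary regions and similarly colored regions are handled separately in a divide-and-conquer manner. Experiments show that new images generated with the patch matting algorithm look better than those generated with the original matting algorithm, and that they effectively improve a deep model's ability to separate target from background.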
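The step of locating similarly colored superpixels can be sketched as follows. The sketch assumes a superpixel label map is already available (for example from a SLIC implementation, relabeled so that labels run from 0 to n-1). The Gaussian affinity on mean colors, the `sigma` and `threshold` values, and all function names are assumptions, since the abstract does not specify how color similarity is measured.

```python
import numpy as np

def superpixel_mean_colors(image, labels):
    """Mean color of each superpixel; labels are assumed to be 0..n-1."""
    image = np.asarray(image, dtype=np.float64)
    flat_labels = np.asarray(labels).ravel()
    flat_pixels = image.reshape(-1, image.shape[-1])
    n = int(flat_labels.max()) + 1
    counts = np.bincount(flat_labels, minlength=n)
    sums = np.stack(
        [np.bincount(flat_labels, weights=flat_pixels[:, c], minlength=n)
         for c in range(flat_pixels.shape[1])],
        axis=1,
    )
    return sums / np.maximum(counts, 1)[:, None]

def color_affinity(mean_colors, sigma=0.1):
    """Pairwise Gaussian affinity between superpixel mean colors."""
    diff = mean_colors[:, None, :] - mean_colors[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def similar_pairs(affinity, threshold=0.5):
    """Index pairs (i, j), i < j, whose affinity exceeds the threshold."""
    i, j = np.where(np.triu(affinity, k=1) > threshold)
    return list(zip(i.tolist(), j.tolist()))
```

The pairs returned by `similar_pairs` mark the regions where color alone is ambiguous. According to the abstract, these are the regions where semantic feature vectors are brought in; that part is not sketched here.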
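Compared with traditional data augmentation algorithms, the image generation methods proposed in this thesis not only expand the training set effectively but also increase data diversity, rather than producing data similar to the originals. In addition, these deep-learning-based image generation techniques can generate images for specific scenes as the application requires, which is significant for deploying deep learning models in different scenarios.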

Other Abstract

Deep learning is a widely used method in computer vision and pattern recognition. It uses deep neural networks as the base model and builds a mapping between data and tasks. It has achieved state-of-the-art results on many computer vision tasks, such as recognition, detection, tracking, and segmentation, and is also widely used in fields such as natural language processing (NLP) and robotics.
Although deep learning is among the most successful methods in artificial intelligence, its drawbacks are also obvious. Deep neural networks need a lot of training data: for a specialized industrial task, a usable model generally needs billions of samples, which implies a large amount of manual data annotation.
This thesis introduces three new methods for generating training images: image generation based on the MT method, image generation based on CycleGAN, and image generation based on matting.
The MT method assumes a linearly smooth transition in pixel space and constructs the convex set supported by the original training data. It generates new images by sampling from this convex set and uses them to train deep models. Experiments showed that these new images improved the performance of deep models.
To address the training problems of CycleGAN, this thesis analyzes the causes of unstable gradients during GAN training and proposes the gradient penalty of weights (GPW) method. GPW restricts the discriminator's gradients to balance the training speed of the discriminator and the generator. Experimental results showed that GPW stabilized network training and improved the quality of the generated images.
To relieve the performance degradation of matting methods in similarly colored image regions, the Patch Alpha Matting method uses SLIC to divide an image into superpixels and analyzes the affinity between every pair of superpixels to find the similarly colored regions. It then uses semantic feature vectors to strengthen the discrimination of the matting method in those regions, which yields a natural divide-and-conquer scheme. Experimental results showed that new images generated by Patch Alpha Matting achieved superior synthetic quality and improved the performance of deep models.
Because the methods in this thesis generate new images with diversified semantic objects and backgrounds, rather than images similar to the original data, they can be regarded as data augmentation methods that differ fundamentally from classical data augmentation. They can also generate images of specific scenes according to application requirements, which is meaningful for applying deep learning in such scenes.

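MOST Discipline Catalogue: Engineering::Information and Communication Engineering
Language: Chinese
Document Type: Thesis
Identifier: http://ir.ioe.ac.cn/handle/181551/9117
Collection: Master's and Doctoral Theses of the Institute of Optics and Electronics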
Recommended Citation
GB/T 7714
李红运. 基于深度学习的图像生成技术[D]. 中国科学院光电技术研究所. 中国科学院大学,2019.
Files in This Item:
File Name/Size | DocType | Version | Access | License
基于深度学习的图像生成技术-李红运.pd (3937KB) | Thesis | | Open Access | CC BY-NC-SA

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.