Zero-shot Image Recognition Based on Redundancy Reduction and Dual-perception Constraints

张桂梅; 闫文尚; 黄军阳; 王远宁

doi:10.3969/j.issn.2096-8566.2024.04.001

Abstract

Abstract: Zero-shot image recognition aims to recognize images when samples of the target classes are unavailable. To address the insufficient discriminability of target-sample features and the domain shift problem in zero-shot image recognition, a method based on redundancy reduction and dual-perception constraints is proposed, built on a generative model. To improve feature discriminability, a visual center loss is first introduced at the generator so that the generated pseudo-features within each class are more compact, which improves the inter-class separability of the pseudo-features. A redundancy reduction module is then added after the generator to reduce interference from redundant information and highlight the discriminative information in the features. To address domain shift, a dual-perception regularization constraint is applied to both the real visual features and the pseudo-visual features, so that the generated pseudo-features lie closer to the real ones. A cycle consistency loss is then used to perform semantic reconstruction on the pseudo-visual features generated by the semantic decoder, further alleviating domain shift. Experiments on four datasets (AWA, CUB, SUN and FLO) verify the effectiveness of the proposed method. The method is also applied to the zero-shot image retrieval task, and the results show that it generalizes well and is easy to extend to other applications.
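
The abstract does not give the formula for the visual center loss, so the following is only an illustrative sketch of the general idea, assuming a common center-loss form: the mean squared Euclidean distance between each generated pseudo-feature and the center of the real features of its class. The function names `class_centers` and `center_loss` are hypothetical and are not taken from the paper.

```python
def class_centers(features, labels):
    """Average the real feature vectors of each class to get one center per class."""
    sums, counts = {}, {}
    for feat, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(feat)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], feat)]
        counts[label] += 1
    return {k: [s / counts[k] for s in v] for k, v in sums.items()}


def center_loss(pseudo_features, labels, centers):
    """Assumed form: mean squared distance from each pseudo-feature to its class center."""
    total = 0.0
    for feat, label in zip(pseudo_features, labels):
        center = centers[label]
        total += sum((f - c) ** 2 for f, c in zip(feat, center))
    return total / len(pseudo_features)


# Toy data: two real features for class 0, one for class 1.
centers = class_centers([[0.0, 0.0], [2.0, 0.0], [5.0, 5.0]], [0, 0, 1])
loss = center_loss([[1.0, 0.0], [5.0, 6.0]], [0, 1], centers)
```

Minimizing a term of this kind pulls the pseudo-features of a class toward a shared center, which is one way to obtain the intra-class compactness the abstract describes. The actual loss used in the paper may differ.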
