Spatial multi-scale feature transformer network for fine-grained few-shot image classification

Authors

DOI:

https://doi.org/10.18488/76.v12i3.4439

Abstract

Recent advances in deep learning have driven substantial progress in fine-grained few-shot image classification (FGFSIC). FGFSIC faces two key challenges, high intra-class variance and low inter-class variance, both of which hinder accurate classification when only limited data is available. Despite considerable efforts to extract more discriminative features using powerful networks, few studies have specifically addressed these challenges. This paper proposes a Spatial Multi-Scale Feature Transformer Network to address them. The approach first modifies the backbone network to extract multi-scale features, and derives classification results by comparing these multi-scale representations. In addition, a Spatial Feature Transformer network is introduced to adjust the spatial positions of the multi-scale features, which helps to reduce intra-class variance. Experiments were conducted on three widely used datasets: CUB-200-2011, Stanford Cars, and Stanford Dogs. The results demonstrate that both components of the proposed model significantly enhance FGFSIC performance, with final accuracies surpassing those of most existing methods. These findings indicate that the proposed approach is effective against both high intra-class variance and low inter-class variance, making it a promising solution for fine-grained image classification when only limited data is available, including real-world applications that require precise few-shot learning in fine-grained domains.
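The two ideas named in the abstract, comparing features at several scales and spatially aligning features before comparison, can be illustrated with a small sketch. This is not the authors' network: the pooling scales, the cosine similarity, the function names, and the brute-force search over integer shifts (a stand-in for a learned Spatial Feature Transformer) are all assumptions chosen to keep the example self-contained.

```python
import math

def avg_pool(fmap, k):
    """Average-pool a square 2D map with non-overlapping k x k windows.
    Assumes the side length is divisible by k."""
    n = len(fmap) // k
    return [[sum(fmap[i * k + a][j * k + b] for a in range(k) for b in range(k)) / (k * k)
             for j in range(n)] for i in range(n)]

def flatten(fmap):
    return [v for row in fmap for v in row]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def shift(fmap, dy, dx):
    """Translate a map by (dy, dx) with zero padding."""
    n = len(fmap)
    return [[fmap[i - dy][j - dx] if 0 <= i - dy < n and 0 <= j - dx < n else 0.0
             for j in range(n)] for i in range(n)]

def align(query, support, max_shift=1):
    """Hypothetical stand-in for a learned spatial transform:
    pick the integer shift of `query` that best matches `support`."""
    candidates = [(dy, dx)
                  for dy in range(-max_shift, max_shift + 1)
                  for dx in range(-max_shift, max_shift + 1)]
    target = flatten(support)
    best = max(candidates,
               key=lambda s: cosine(flatten(shift(query, s[0], s[1])), target))
    return shift(query, best[0], best[1])

def classify(query, prototypes, scales=(1, 2)):
    """Score each class by summing cosine similarities over pooling scales
    (assumed scales), after aligning the query to that class prototype."""
    scores = {}
    for label, proto in prototypes.items():
        aligned = align(query, proto)
        scores[label] = sum(
            cosine(flatten(avg_pool(aligned, k)), flatten(avg_pool(proto, k)))
            for k in scales)
    return max(scores, key=scores.get), scores
```

In this toy setting, a query whose discriminative region is displaced by one cell relative to its class prototype is first shifted back into place and then compared at each scale, which is the intuition behind reducing intra-class variance through spatial adjustment.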

Keywords:

Few-shot learning, Fine-grained few-shot image classification, Fine-grained image classification, Multi-scale features, Spatial transformer network.

Published

2025-10-02

How to Cite

Guo, L., & Marlisah, E. (2025). Spatial multi-scale feature transformer network for fine-grained few-shot image classification. Review of Computer Engineering Research, 12(3), 195–205. https://doi.org/10.18488/76.v12i3.4439
