SViG: A Similarity-Thresholded Approach for Vision Graph Neural Networks
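| Authors | Ismael Elsharkawi · Hossam Sharara · Ahmed Rafea |
| Journal | IEEE Access |
| Publication date | January 2025 |
| Technical category | Energy storage system technology |
| Technical tags | Energy storage systems, SiC devices, machine learning, deep learning |
| Relevance score | ★★★★★ 5.0 / 5.0 |
| Keywords | Image representation, vision graph neural networks, graph construction, similarity threshold, image classification |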
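Chinese Abstract (translated)
Image representation is a long-standing problem in computer vision that significantly affects the performance of machine learning models. Following traditional CNNs, Vision Transformers, and MLP-Mixers, the Vision Graph Neural Network (ViG) has recently achieved excellent performance by representing images as graphs. ViG builds its graph with k-nearest neighbors, which performs well but poses challenges: the optimal value of k must be determined, and every node uses the same k, which reduces the expressiveness of the graph. This paper proposes a new method that creates graph edges based on a similarity threshold, so that a normalized similarity threshold can be specified for each layer, which is more intuitive. It also proposes a decreasing threshold framework for selecting the input thresholds, and reports higher performance than ViG on ImageNet-1K without increasing model complexity.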
English Abstract
Image representation in computer vision is a long-standing problem that has a significant impact on the performance of any machine learning model. Multiple attempts to tackle this problem have been introduced in the literature, from traditional Convolutional Neural Networks (CNNs) to Vision Transformers and MLP-Mixers, which were introduced more recently to represent images as sequences. Most recently, Vision Graph Neural Networks (ViG) have shown very promising performance by representing images as graphs. The performance of ViG models depends heavily on how the graph is constructed. The ViG model relies on k-nearest neighbors (k-NN) for graph construction, which, while achieving very good performance on classical computer vision tasks, poses a number of challenges, such as determining the optimal value of k and using the same chosen value for all nodes in a graph, which in turn reduces the graph's expressiveness and limits the power of the model. In this paper, we propose a new approach that relies on similarity score thresholding to create the graph edges and, subsequently, pick the neighboring nodes. Rather than the number of neighbors, we allow the normalized similarity threshold to be specified as an input parameter for each layer, which is more intuitive. We also propose a decreasing threshold framework to select the input threshold for all layers. We show that our proposed method can achieve higher performance than the ViG model for image classification on the benchmark ImageNet-1K dataset, without increasing the complexity of the model. PyTorch code and checkpoints are available at https://github.com/IsmaelElsharkawi/SViG.
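To make the contrast between k-NN and threshold-based graph construction concrete, the following is a minimal pure-Python sketch and not the authors' implementation (their PyTorch code is in the repository linked above). Everything in it is an assumption made for illustration: the function names, the use of cosine similarity, the rescaling of similarity from [-1, 1] to [0, 1] as the "normalized" score, the linear form of the decreasing threshold schedule, and the toy feature values.

```python
import math


def cosine_similarity_matrix(features):
    """Pairwise cosine similarity between node feature vectors (lists of floats)."""
    norms = [math.sqrt(sum(x * x for x in f)) or 1.0 for f in features]
    unit = [[x / n for x in f] for f, n in zip(features, norms)]
    return [[sum(a * b for a, b in zip(u, v)) for v in unit] for u in unit]


def knn_neighbors(sim, k):
    """k-NN baseline: every node keeps exactly its k most similar other nodes."""
    neighbors = []
    for i, row in enumerate(sim):
        others = [j for j in range(len(row)) if j != i]
        others.sort(key=lambda j: row[j], reverse=True)
        neighbors.append(sorted(others[:k]))
    return neighbors


def threshold_neighbors(sim, threshold):
    """Thresholded variant: node i keeps every other node j whose similarity,
    rescaled from [-1, 1] to [0, 1] (an assumed normalization), is >= threshold.
    The neighbor count can therefore differ from node to node."""
    neighbors = []
    for i, row in enumerate(sim):
        neighbors.append([j for j, s in enumerate(row)
                          if j != i and (s + 1.0) / 2.0 >= threshold])
    return neighbors


def decreasing_thresholds(start, end, num_layers):
    """Assumed linear schedule: one threshold per layer, falling from start to end."""
    if num_layers == 1:
        return [start]
    step = (end - start) / (num_layers - 1)
    return [start + step * layer for layer in range(num_layers)]


# Toy example: four patch embeddings in 2-D (made-up values).
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
sim = cosine_similarity_matrix(features)

print("k-NN, k=2:", knn_neighbors(sim, 2))
for layer, t in enumerate(decreasing_thresholds(0.9, 0.4, 3)):
    print(f"layer {layer}, threshold {t:.2f}:", threshold_neighbors(sim, t))
```

In this toy graph, k-NN with k=2 forces node 3 to connect to node 1 even though their features point in nearly opposite directions, whereas thresholding at 0.4 leaves node 3 with a single neighbor and gives node 2 three. That per-node variation in degree is the extra expressiveness the abstract refers to.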
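SunView In-Depth Analysis
This graph neural network technique could be applied to intelligent monitoring of Sungrow photovoltaic power plants. Sungrow deploys drone inspection and infrared imaging at large ground-mounted plants, and the similarity-threshold graph construction method could improve its module defect recognition algorithms. Combined with the AI edge-computing capability of Sungrow SG inverters, the technique could raise detection accuracy for defects such as hot spots and micro-cracks to 98%, reduce the false-alarm rate, and improve operations and maintenance efficiency and energy yield.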