Update code description

main
rzzn 2024-07-07 14:48:34 +08:00
parent 4a9a683850
commit ed6e78ba09
1 changed file with 6 additions and 49 deletions

@@ -1,62 +1,27 @@
## The official PyTorch code for paper ["Remote Sensing Cross-Modal Text-Image Retrieval Based on Global and Local Information", TGRS 2022.](https://doi.org/10.1109/TGRS.2022.3163706)
# The PyTorch code for ACF
# GAC
##### Author: Zhiming Wang
##### Author: Zhiqiang Yuan
<a href="https://github.com/xiaoyuan1996/retrievalSystem"><img src="https://travis-ci.org/Cadene/block.bootstrap.pytorch.svg?branch=master"/></a>
![Supported Python versions](https://img.shields.io/badge/python-3.7-blue.svg)
![Supported OS](https://img.shields.io/badge/Supported%20OS-Linux-yellow.svg)
![npm License](https://img.shields.io/npm/l/mithril.svg)
<a href="https://pypi.org/project/mitype/"><img src="https://img.shields.io/pypi/v/mitype.svg"></a>
### -------------------------------------------------------------------------------------
### Welcome :+1:_<big>`Fork and Star`</big>_:+1:, then we'll let you know when we update
```bash
#### News:
#### 2021.9.26: ---->Under update ...<----
```
### -------------------------------------------------------------------------------------
## INTRODUCTION
This is GAC, a cross-modal retrieval method for remote sensing images.
A cross-modal remote sensing image retrieval algorithm based on attention correction and filtering
We use the MIDF module to fuse multi-level RS image features, and add the DREA mechanism to improve the performance of local features.
In addition, a multivariate rerank algorithm is designed to make full use of the information in the similarity matrix during testing.
Our method achieved state-of-the-art performance (as of 2021.10) on the RS cross-modal retrieval task across multiple RS image-text datasets.
### Network Architecture
![arch image](./figure/GAC.jpg)
The proposed RSCTIR framework based on global and local information. Compared with the retrieval models constructed using only global features, GAC incorporates optimized local features in the visual encoding considering the target redundancy of RS. The multi-level information dynamic fusion module is designed to fuse the two types of information, using the global information to supplement the local information and utilizing the latter to correct the former. The suggested multivariate rerank algorithm as a post-processing method further improves the retrieval accuracy without extra training.
### DREA
To alleviate the pressure on the model from redundant target relations and increase the model's focus on salient instances, we propose a denoised representation matrix and an enhanced adjacency matrix to assist the GCN in producing better local representations.
DREA filters out redundant features with high similarity and enhances the features of salient targets, which enables GAC to obtain a stronger visual representation.
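The filtering idea can be illustrated with a minimal sketch. This is not the DREA implementation from the paper or the repository: it only shows one simple way to drop near-duplicate feature vectors by cosine similarity, and it does not cover the enhancement of salient targets or the adjacency matrix. The function names `cosine` and `filter_redundant` and the `threshold` value are assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def filter_redundant(features, threshold=0.95):
    """Greedily keep indices of features that are not near-duplicates.

    A feature is kept only if its cosine similarity to every feature
    already kept is below `threshold` (a hypothetical cut-off).
    """
    kept = []
    for idx, feat in enumerate(features):
        if all(cosine(feat, features[j]) < threshold for j in kept):
            kept.append(idx)
    return kept


# Three toy local features; the second is almost identical to the first.
features = [[1.0, 0.0], [0.99, 0.01], [0.0, 1.0]]
print(filter_redundant(features))
```

In this toy input the second vector points in nearly the same direction as the first, so only the first and third indices are kept.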
### MIDF
<img src="https://github.com/xiaoyuan1996/GAC/blob/main/figure/MIDF.jpg" width="600" alt="MIDF"/>
The proposed multi-level information dynamic fusion module. The method consists of two stages: feature retransformation and dynamic fusion. MIDF first uses SA and GA modules to retransform features, then uses global information to supplement local information and leverages the latter to correct the former. The multi-level features are then fused by the dedicated dynamic fusion module.
### Multivariate Rerank
<img src="https://github.com/xiaoyuan1996/GAC/blob/main/figure/similartiy.jpg" width="600" alt="similarity"/>
The proposed multivariate rerank algorithm. To make full use of the similarity matrix, we use k candidates for a reverse search and optimize the similarity results by considering multiple ranking factors. The figure illustrates multivariate rerank when k = 3, using image i for retrieval.
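The reverse-search idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the repository's rerank code: the function name `rerank_image_query`, the `weight` parameter, and the scoring rule (forward similarity plus a bonus that shrinks with the reverse rank) are all hypothetical, and the actual ranking factors used by the method are not listed here.

```python
def rerank_image_query(sim, i, k=3, weight=1.0):
    """Rerank the top-k texts retrieved for image i using a reverse search.

    sim[j][t] is the similarity between image j and text t.
    Returns all text indices, with the top-k candidates reordered.
    """
    n_images, n_texts = len(sim), len(sim[0])

    # Forward search: texts ordered by similarity to image i, best first.
    forward = sorted(range(n_texts), key=lambda t: sim[i][t], reverse=True)
    candidates, rest = forward[:k], forward[k:]

    def reverse_rank(t):
        # Reverse search: position of image i when text t retrieves images.
        order = sorted(range(n_images), key=lambda j: sim[j][t], reverse=True)
        return order.index(i)

    # Hypothetical scoring rule: a candidate text gains more if it also
    # ranks image i highly in the reverse direction.
    score = {t: sim[i][t] + weight / (1 + reverse_rank(t)) for t in candidates}
    reranked = sorted(candidates, key=lambda t: score[t], reverse=True)
    return reranked + rest


# Toy similarity matrix: rows are images, columns are texts.
sim = [
    [0.90, 0.85, 0.10],
    [0.95, 0.20, 0.30],
    [0.10, 0.30, 0.80],
]
print(rerank_image_query(sim, 0, k=2))
```

In the toy matrix, text 0 is the best forward match for image 0, but text 0 itself prefers image 1, while text 1 prefers image 0. The reverse search therefore moves text 1 ahead of text 0. No extra training is involved, since the function only reads the similarity matrix.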
### Performance
![performance](./figure/performance.jpg)
Comparison of retrieval performance on the RSICD and RSITMD test sets.
### -------------------------------------------------------------------------------------
## IMPLEMENTATION
The code implementation is based on GaLR
```bash
Installation
@@ -156,11 +121,3 @@ Step3:
Note: We use k-fold validation for a fair comparison. For other details, please see the code itself.
```
## Citation
If you find this code helpful, or use this code or dataset, please cite it as
```
Z. Yuan et al., "Remote Sensing Cross-Modal Text-Image Retrieval Based on Global and Local Information," in IEEE Transactions on Geoscience and Remote Sensing, doi: 10.1109/TGRS.2022.3163706.
```