463465810cz committed on
Commit 0a4be84 · 1 Parent(s): e5c4654

Former-commit-id: 0f02d4ec4c7da26edff8a98e7fedb1006cc9d334

README.md CHANGED
@@ -1,6 +1,23 @@
 # Dual Aggregation Transformer for Image Super-Resolution

- This repository is for DAT introduced in the paper.

 ## Dependencies
@@ -9,32 +26,125 @@ This repository is for DAT introduced in the paper.
 - NVIDIA GPU + [CUDA](https://developer.nvidia.com/cuda-downloads)

 ```bash
- # Cd to the default directory 'DAT'
 pip install -r requirements.txt
 python setup.py develop
 ```

- ## Test

- - Download the pre-trained [models](https://ufile.io/rf58x0s9) and place them in `experiments/pretrained_models/`.

-   We provide DAT with scale factors: x2, x3, x4.

- - Download [testing](https://ufile.io/6ek67nf8) (Set5, Set14, BSD100, Urban100, Manga109) datasets, place them in `datasets/`.

- - Run the following scripts. The testing configuration is in `options/Test/`. More detail about YML, please refer to [Configuration](https://github.com/XPixelGroup/BasicSR/blob/master/docs/Config.md).

- **You can change the testing configuration in YML file, like 'test_DAT_x2.yml'.**

 ```shell
 # No self-ensemble
 # DAT, reproduces results in Table 2 of the main paper
 python basicsr/test.py -opt options/Test/test_DAT_x2.yml
 python basicsr/test.py -opt options/Test/test_DAT_x3.yml
 python basicsr/test.py -opt options/Test/test_DAT_x4.yml
 ```
-
- - The output is in `results`.

 ## Acknowledgements
 
 # Dual Aggregation Transformer for Image Super-Resolution

+ [Zheng Chen](https://scholar.google.com/citations?user=zssRkBAAAAAJ), [Yulun Zhang](http://yulunzhang.com/), [Jinjin Gu](https://www.jasongt.com/), [Linghe Kong](https://www.cs.sjtu.edu.cn/~linghe.kong/), [Xiaokang Yang](https://scholar.google.com/citations?user=yDEavdMAAAAJ&hl), and [Fisher Yu](https://www.yf.io/), "Dual Aggregation Transformer for Image Super-Resolution", ICCV, 2023
+
+ [[arXiv]]() [[visual results](https://drive.google.com/drive/folders/1ZMaZyCer44ZX6tdcDmjIrc_hSsKoMKg2?usp=drive_link)] [[pretrained models](https://drive.google.com/drive/folders/1iBdf_-LVZuz_PAbFtuxSKd_11RL1YKxM?usp=drive_link)]
+
+ ---
+
+ > **Abstract:** *Transformer-based methods have recently been widely used in low-level vision tasks, including image super-resolution (SR). These networks utilize self-attention along different dimensions, spatial or channel, and achieve impressive performance. This inspires us to combine the two dimensions in Transformer for a more powerful representation capability. Based on the above idea, we propose a novel Transformer model, Dual Aggregation Transformer (DAT), for image SR. Our DAT aggregates features across spatial and channel dimensions, in the inter-block and intra-block dual manner. Specifically, we alternately apply spatial and channel self-attention in consecutive Transformer blocks. The alternate strategy enables DAT to capture the global context and realize inter-block feature aggregation. Furthermore, we propose the adaptive interaction module (AIM) and the spatial-gate feed-forward network (SGFN) to achieve intra-block feature aggregation. AIM complements two self-attention mechanisms from corresponding dimensions. Meanwhile, SGFN introduces additional non-linear spatial information in the feed-forward network. Extensive experiments show that our DAT surpasses current state-of-the-art methods.*
+ >
+ > <p align="center">
+ > <img width="800" src="figs/DAT.png">
+ > </p>
+
+ ---
+
+ | HR | LR | [SwinIR](https://github.com/JingyunLiang/SwinIR) | [CAT](https://github.com/zhengchen1999/CAT) | DAT (ours) |
+ | :---: | :---: | :---: | :---: | :---: |
+ | <img src="figs/img_059_HR_x4.png" height=80> | <img src="figs/img_059_Bicubic_x4.png" height=80> | <img src="figs/img_059_SwinIR_x4.png" height=80> | <img src="figs/img_059_CAT_x4.png" height=80> | <img src="figs/img_059_DAT_x4.png" height=80> |
+ | <img src="figs/img_049_HR_x4.png" height=80> | <img src="figs/img_049_Bicubic_x4.png" height=80> | <img src="figs/img_049_SwinIR_x4.png" height=80> | <img src="figs/img_049_CAT_x4.png" height=80> | <img src="figs/img_049_DAT_x4.png" height=80> |
 ## Dependencies

 - NVIDIA GPU + [CUDA](https://developer.nvidia.com/cuda-downloads)

 ```bash
+ # Clone the GitHub repo and enter the default directory 'DAT'.
+ git clone https://github.com/zhengchen1999/DAT.git
+ cd DAT
 pip install -r requirements.txt
 python setup.py develop
 ```
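+
+ `python setup.py develop` installs the repository in editable mode, so later changes to the source take effect without reinstalling.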

+ ## Contents
+
+ 1. [Datasets](#Datasets)
+ 1. [Models](#Models)
+ 1. [Training](#Training)
+ 1. [Testing](#Testing)
+ 1. [Results](#Results)
+ 1. [Citation](#Citation)
+ 1. [Acknowledgements](#Acknowledgements)
+
+ ---
+
+ ## Datasets
+
+ The training and testing sets used in this work can be downloaded as follows:
+
+ | Training Set | Testing Set | Visual Results |
+ | :--- | :---: | :---: |
+ | [DIV2K](https://data.vision.ee.ethz.ch/cvl/DIV2K/) (800 training images) + [Flickr2K](https://cv.snu.ac.kr/research/EDSR/Flickr2K.tar) (2650 images) [complete training dataset [DF2K](https://drive.google.com/file/d/1TubDkirxl4qAWelfOnpwaSKoj3KLAIG4/view?usp=share_link)] | Set5 + Set14 + BSD100 + Urban100 + Manga109 [complete testing dataset [download](https://drive.google.com/file/d/1yMbItvFKVaCT93yPWmlP3883XtJ-wSee/view?usp=sharing)] | [here](https://drive.google.com/drive/folders/1ZMaZyCer44ZX6tdcDmjIrc_hSsKoMKg2?usp=drive_link) |
+
+ Download the training and testing datasets and put them into the corresponding folders of `datasets/` (a sketch of the expected layout follows). See [datasets](datasets/README.md) for details of the directory structure.
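+
+ A sketch of the expected layout, inferred from the `dataroot_gt`/`dataroot_lq` paths in the YAML configs below (only Set5 is shown; the other benchmarks follow the same pattern):
+
+ ```
+ datasets/
+ ├── DF2K/
+ │   ├── HR/
+ │   └── LR_bicubic/
+ │       ├── X2/
+ │       ├── X3/
+ │       └── X4/
+ └── benchmark/
+     └── Set5/
+         ├── HR/
+         └── LR_bicubic/
+             ├── X2/
+             ├── X3/
+             └── X4/
+ ```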
+
+ ## Models
+
+ | Method | Params (M) | FLOPs (G) | Dataset | PSNR (dB) | SSIM | Model Zoo | Visual Results |
+ | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
+ | DAT-S | 11.21 | 203.3 | Urban100 | 27.68 | 0.8300 | [Google Drive](https://drive.google.com/drive/folders/1hb77nOTpCo9iU_jmg_izHOPRvPJujRiL?usp=drive_link) | [Google Drive](https://drive.google.com/file/d/1W-CeN2Z0e1r0rOdc3t-GcGrRV-qTGdub/view?usp=drive_link) |
+ | DAT | 14.80 | 275.8 | Urban100 | 27.87 | 0.8343 | [Google Drive](https://drive.google.com/drive/folders/1eZqgQEBQ69Vzf8afrPkvL27JHubW6o0t?usp=drive_link) | [Google Drive](https://drive.google.com/file/d/1B4zJsZaiVsu009ilTh81BV7-8Hr98BI2/view?usp=drive_link) |
+
+ Performance is reported on Urban100 (x4 SR). FLOPs are measured with a 128 x 128 input.
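+
+ As a sanity check, the parameter counts above can presumably be reproduced by instantiating the network with the `network_g` settings from `options/Test/test_DAT_x4.yml`. A minimal sketch, assuming the architecture is importable from `basicsr/archs/dat_arch.py` (the usual BasicSR layout; adjust the import if it differs):
+
+ ```python
+ from basicsr.archs.dat_arch import DAT  # assumed module path
+
+ # Hyper-parameters mirror network_g in options/Test/test_DAT_x4.yml.
+ model = DAT(upscale=4, in_chans=3, img_size=64, img_range=1.,
+             split_size=[8, 32], depth=[6, 6, 6, 6, 6, 6], embed_dim=180,
+             num_heads=[6, 6, 6, 6, 6, 6], expansion_factor=4,
+             resi_connection='1conv')
+
+ # Count trainable parameters; the table above reports 14.80 M for DAT.
+ n_params = sum(p.numel() for p in model.parameters())
+ print(f'Params: {n_params / 1e6:.2f} M')
+ ```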
+
+ ## Training
+
+ - Download [training](https://drive.google.com/file/d/1TubDkirxl4qAWelfOnpwaSKoj3KLAIG4/view?usp=share_link) (DF2K, already processed) and [testing](https://drive.google.com/file/d/1yMbItvFKVaCT93yPWmlP3883XtJ-wSee/view?usp=sharing) (Set5, Set14, BSD100, Urban100, Manga109, already processed) datasets and place them in `datasets/`.
+
+ - Run the following scripts. The training configuration is in `options/Train/`.
+
+ ```shell
+ # DAT-S, input=64x64, 4 GPUs
+ python -m torch.distributed.launch --nproc_per_node=4 --master_port=4321 basicsr/train.py -opt options/Train/train_DAT_S_x2.yml --launcher pytorch
+ python -m torch.distributed.launch --nproc_per_node=4 --master_port=4321 basicsr/train.py -opt options/Train/train_DAT_S_x3.yml --launcher pytorch
+ python -m torch.distributed.launch --nproc_per_node=4 --master_port=4321 basicsr/train.py -opt options/Train/train_DAT_S_x4.yml --launcher pytorch
+
+ # DAT, input=64x64, 4 GPUs
+ python -m torch.distributed.launch --nproc_per_node=4 --master_port=4321 basicsr/train.py -opt options/Train/train_DAT_x2.yml --launcher pytorch
+ python -m torch.distributed.launch --nproc_per_node=4 --master_port=4321 basicsr/train.py -opt options/Train/train_DAT_x3.yml --launcher pytorch
+ python -m torch.distributed.launch --nproc_per_node=4 --master_port=4321 basicsr/train.py -opt options/Train/train_DAT_x4.yml --launcher pytorch
+ ```
+
+ - Training logs and checkpoints are saved in `experiments/`.
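+
+   For a quick single-GPU run, BasicSR-style entry points can usually be launched without the distributed wrapper, e.g. `python basicsr/train.py -opt options/Train/train_DAT_x2.yml`; the configs set `num_gpu: auto`, so the GPU count should be picked up automatically.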

+ ## Testing
+
+ - Download the pre-trained [models](https://drive.google.com/drive/folders/1iBdf_-LVZuz_PAbFtuxSKd_11RL1YKxM?usp=drive_link) and place them in `experiments/pretrained_models/`.
+
+   We provide pre-trained models for image SR: DAT-S and DAT (x2, x3, x4).
+
+ - Download the [testing](https://ufile.io/6ek67nf8) datasets (Set5, Set14, BSD100, Urban100, Manga109) and place them in `datasets/`.
+
+ - Run the following scripts. The testing configuration is in `options/Test/`.
+
 ```shell
 # No self-ensemble
+ # DAT-S, reproduces results in Table 2 of the main paper
+ python basicsr/test.py -opt options/Test/test_DAT_S_x2.yml
+ python basicsr/test.py -opt options/Test/test_DAT_S_x3.yml
+ python basicsr/test.py -opt options/Test/test_DAT_S_x4.yml
+
 # DAT, reproduces results in Table 2 of the main paper
 python basicsr/test.py -opt options/Test/test_DAT_x2.yml
 python basicsr/test.py -opt options/Test/test_DAT_x3.yml
 python basicsr/test.py -opt options/Test/test_DAT_x4.yml
 ```
+
+ - The output is in `results/`.
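+
+ To super-resolve a single image without a YML config, a minimal sketch is given below. It assumes the `DAT` class at `basicsr/archs/dat_arch.py`, an x2 checkpoint at the path shown, and that the released checkpoints store weights under BasicSR's usual `'params'` key; all three are assumptions, not guarantees of this repo:
+
+ ```python
+ import cv2
+ import numpy as np
+ import torch
+ from basicsr.archs.dat_arch import DAT  # assumed module path
+
+ model = DAT(upscale=2, in_chans=3, img_size=64, img_range=1.,
+             split_size=[8, 32], depth=[6, 6, 6, 6, 6, 6], embed_dim=180,
+             num_heads=[6, 6, 6, 6, 6, 6], expansion_factor=4,
+             resi_connection='1conv').eval()
+ ckpt = torch.load('experiments/pretrained_models/DAT/DAT_x2.pth', map_location='cpu')
+ model.load_state_dict(ckpt['params'], strict=True)  # 'params' key is an assumption
+
+ # Read LR image, convert BGR -> RGB (BasicSR models expect RGB), scale to [0, 1].
+ lr = cv2.cvtColor(cv2.imread('lr.png'), cv2.COLOR_RGB2BGR).astype(np.float32) / 255.
+ x = torch.from_numpy(lr.transpose(2, 0, 1)).unsqueeze(0)  # HWC -> NCHW
+ with torch.no_grad():
+     sr = model(x).clamp(0, 1).squeeze(0).numpy().transpose(1, 2, 0)
+ cv2.imwrite('sr.png', cv2.cvtColor((sr * 255.).round().astype(np.uint8),
+                                    cv2.COLOR_RGB2BGR))
+ ```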
+
+ ## Results
+
+ We achieve state-of-the-art performance on image SR. Detailed results can be found in the paper. All visual results of DAT can be downloaded [here](https://drive.google.com/drive/folders/1SIQ342yyrlHTCxINf9wYNchOa5eOw_7s?usp=sharing).
+
+ <details>
+ <summary>Image SR (click to expand)</summary>
+
+ - Results in Table 2 of the main paper
+
+ <p align="center">
+ <img width="900" src="figs/Table-1.png">
+ </p>
+
+ - Visual comparison (x4) in the main paper
+
+ <p align="center">
+ <img width="900" src="figs/Figure-1.png">
+ </p>
+
+ </details>
+
+ ## Citation
+
+ If you find the code helpful in your research or work, please cite the following paper.
+
+ ```
+ @inproceedings{chen2023dual,
+     title={Dual Aggregation Transformer for Image Super-Resolution},
+     author={Chen, Zheng and Zhang, Yulun and Gu, Jinjin and Kong, Linghe and Yang, Xiaokang and Yu, Fisher},
+     booktitle={ICCV},
+     year={2023}
+ }
+ ```

 ## Acknowledgements

figs/.gitattributes ADDED
@@ -0,0 +1 @@
+ *.png filter=lfs diff=lfs merge=lfs -text
figs/DAT.png ADDED

Git LFS Details

  • SHA256: 39fd2add39fb54231203ea4208e69ed7f734653724f0911b378e45f51c0913ab
  • Pointer size: 131 Bytes
  • Size of remote file: 474 kB
figs/Figure-1.png ADDED

Git LFS Details

  • SHA256: 6421c15a4025ec957764f6cb2266c57599606791a7bd8f46c62d25bf78ea49a0
  • Pointer size: 132 Bytes
  • Size of remote file: 2.07 MB
figs/Table-1.png ADDED

Git LFS Details

  • SHA256: 20e74e18b57965867cbab8577e2d94c008a70d9cbe0e8d5bd48cc1b293deaa75
  • Pointer size: 131 Bytes
  • Size of remote file: 470 kB
figs/img_049_Bicubic_x4.png ADDED

Git LFS Details

  • SHA256: 3996380256f52615ce2f84600324f0433b00dbea264c046f050b57e7949fd408
  • Pointer size: 129 Bytes
  • Size of remote file: 5.29 kB
figs/img_049_CAT_x4.png ADDED

Git LFS Details

  • SHA256: f3d7632543eaf421549d5c8802707f85446e38a30f8ff15921e0d65a9548b7b9
  • Pointer size: 129 Bytes
  • Size of remote file: 6.28 kB
figs/img_049_DAT_x4.png ADDED

Git LFS Details

  • SHA256: 5baa481452f89c8cbceafbf8fe0e1961981c419a45e543fa8e9387d233fee144
  • Pointer size: 129 Bytes
  • Size of remote file: 7.53 kB
figs/img_049_HR_x4.png ADDED

Git LFS Details

  • SHA256: 52ba1f51d25bf7b6b31faea72be5a635bee9787ddd65f866fb41d0c5aceb8154
  • Pointer size: 130 Bytes
  • Size of remote file: 10.1 kB
figs/img_049_SwinIR_x4.png ADDED

Git LFS Details

  • SHA256: aa8157d69d9fd0717e04a1155c2ac3de4a2106e43606bb176508f3b550bd328e
  • Pointer size: 129 Bytes
  • Size of remote file: 6.28 kB
figs/img_059_Bicubic_x4.png ADDED

Git LFS Details

  • SHA256: cd5bfc697784b24e7b6de830ff9a8e9ba15137579c99b19fdadb3d62497ad572
  • Pointer size: 129 Bytes
  • Size of remote file: 6.53 kB
figs/img_059_CAT_x4.png ADDED

Git LFS Details

  • SHA256: b433449418f16a1f47f68410c8ec6a3c86d3c35a7731393f87b49d63c7f509ba
  • Pointer size: 130 Bytes
  • Size of remote file: 12.2 kB
figs/img_059_DAT_x4.png ADDED

Git LFS Details

  • SHA256: 95114ee9f05bc8def69543e1c356bc716c9eb2f034d971581c808bd3e03eb3d0
  • Pointer size: 130 Bytes
  • Size of remote file: 12.9 kB
figs/img_059_HR_x4.png ADDED

Git LFS Details

  • SHA256: 6c5a46a3e801daa31f1950fb8173649708ddc11b3db095a0568876f4b7d7e353
  • Pointer size: 130 Bytes
  • Size of remote file: 14.3 kB
figs/img_059_SwinIR_x4.png ADDED

Git LFS Details

  • SHA256: 2be3b9f741b517a7fea3a4998cbc9203fe16693302cd3a4b349fac107bb0e531
  • Pointer size: 130 Bytes
  • Size of remote file: 11.5 kB
options/Test/test_DAT_S_x2.yml ADDED
@@ -0,0 +1,93 @@
+ # general settings
+ name: test_DAT_S_x2
+ model_type: SRModel
+ scale: 2
+ num_gpu: 1
+ manual_seed: 10
+
+ datasets:
+   test_1:  # the 1st test dataset
+     task: SR
+     name: Set5
+     type: PairedImageDataset
+     dataroot_gt: datasets/benchmark/Set5/HR
+     dataroot_lq: datasets/benchmark/Set5/LR_bicubic/X2
+     filename_tmpl: '{}x2'
+     io_backend:
+       type: disk
+
+   test_2:  # the 2nd test dataset
+     task: SR
+     name: Set14
+     type: PairedImageDataset
+     dataroot_gt: datasets/benchmark/Set14/HR
+     dataroot_lq: datasets/benchmark/Set14/LR_bicubic/X2
+     filename_tmpl: '{}x2'
+     io_backend:
+       type: disk
+
+   test_3:  # the 3rd test dataset
+     task: SR
+     name: B100
+     type: PairedImageDataset
+     dataroot_gt: datasets/benchmark/B100/HR
+     dataroot_lq: datasets/benchmark/B100/LR_bicubic/X2
+     filename_tmpl: '{}x2'
+     io_backend:
+       type: disk
+
+   test_4:  # the 4th test dataset
+     task: SR
+     name: Urban100
+     type: PairedImageDataset
+     dataroot_gt: datasets/benchmark/Urban100/HR
+     dataroot_lq: datasets/benchmark/Urban100/LR_bicubic/X2
+     filename_tmpl: '{}x2'
+     io_backend:
+       type: disk
+
+   test_5:  # the 5th test dataset
+     task: SR
+     name: Manga109
+     type: PairedImageDataset
+     dataroot_gt: datasets/benchmark/Manga109/HR
+     dataroot_lq: datasets/benchmark/Manga109/LR_bicubic/X2
+     filename_tmpl: '{}_LRBI_x2'
+     io_backend:
+       type: disk
+
+
+ # network structures
+ network_g:
+   type: DAT
+   upscale: 2
+   in_chans: 3
+   img_size: 64
+   img_range: 1.
+   split_size: [8,16]
+   depth: [6,6,6,6,6,6]
+   embed_dim: 180
+   num_heads: [6,6,6,6,6,6]
+   expansion_factor: 2
+   resi_connection: '1conv'
+
+ # path
+ path:
+   pretrain_network_g: experiments/pretrained_models/DAT/DAT_S_x2.pth
+   strict_load_g: True
+
+ # validation settings
+ val:
+   save_img: False
+   suffix: ~  # add suffix to saved images, if None, use exp name
+   use_chop: False
+
+   metrics:
+     psnr:  # metric name, can be arbitrary
+       type: calculate_psnr
+       crop_border: 2
+       test_y_channel: True
+     ssim:
+       type: calculate_ssim
+       crop_border: 2
+       test_y_channel: True
options/Test/test_DAT_S_x3.yml.yml ADDED
@@ -0,0 +1,92 @@
+ # general settings
+ name: test_DAT_S_x3
+ model_type: SRModel
+ scale: 3
+ num_gpu: 1
+ manual_seed: 10
+
+ datasets:
+   test_1:  # the 1st test dataset
+     task: SR
+     name: Set5
+     type: PairedImageDataset
+     dataroot_gt: datasets/benchmark/Set5/HR
+     dataroot_lq: datasets/benchmark/Set5/LR_bicubic/X3
+     filename_tmpl: '{}x3'
+     io_backend:
+       type: disk
+
+   test_2:  # the 2nd test dataset
+     task: SR
+     name: Set14
+     type: PairedImageDataset
+     dataroot_gt: datasets/benchmark/Set14/HR
+     dataroot_lq: datasets/benchmark/Set14/LR_bicubic/X3
+     filename_tmpl: '{}x3'
+     io_backend:
+       type: disk
+
+   test_3:  # the 3rd test dataset
+     task: SR
+     name: B100
+     type: PairedImageDataset
+     dataroot_gt: datasets/benchmark/B100/HR
+     dataroot_lq: datasets/benchmark/B100/LR_bicubic/X3
+     filename_tmpl: '{}x3'
+     io_backend:
+       type: disk
+
+   test_4:  # the 4th test dataset
+     task: SR
+     name: Urban100
+     type: PairedImageDataset
+     dataroot_gt: datasets/benchmark/Urban100/HR
+     dataroot_lq: datasets/benchmark/Urban100/LR_bicubic/X3
+     filename_tmpl: '{}x3'
+     io_backend:
+       type: disk
+
+   test_5:  # the 5th test dataset
+     task: SR
+     name: Manga109
+     type: PairedImageDataset
+     dataroot_gt: datasets/benchmark/Manga109/HR
+     dataroot_lq: datasets/benchmark/Manga109/LR_bicubic/X3
+     filename_tmpl: '{}_LRBI_x3'
+     io_backend:
+       type: disk
+
+ # network structures
+ network_g:
+   type: DAT
+   upscale: 3
+   in_chans: 3
+   img_size: 64
+   img_range: 1.
+   split_size: [8,16]
+   depth: [6,6,6,6,6,6]
+   embed_dim: 180
+   num_heads: [6,6,6,6,6,6]
+   expansion_factor: 2
+   resi_connection: '1conv'
+
+ # path
+ path:
+   pretrain_network_g: experiments/pretrained_models/DAT/DAT_S_x3.pth
+   strict_load_g: True
+
+ # validation settings
+ val:
+   save_img: False
+   suffix: ~  # add suffix to saved images, if None, use exp name
+   use_chop: False
+
+   metrics:
+     psnr:  # metric name, can be arbitrary
+       type: calculate_psnr
+       crop_border: 3
+       test_y_channel: True
+     ssim:
+       type: calculate_ssim
+       crop_border: 3
+       test_y_channel: True
options/Test/test_DAT_S_x4.yml ADDED
@@ -0,0 +1,93 @@
+ # general settings
+ name: test_DAT_S_x4
+ model_type: SRModel
+ scale: 4
+ num_gpu: 1
+ manual_seed: 10
+
+ datasets:
+   test_1:  # the 1st test dataset
+     task: SR
+     name: Set5
+     type: PairedImageDataset
+     dataroot_gt: datasets/benchmark/Set5/HR
+     dataroot_lq: datasets/benchmark/Set5/LR_bicubic/X4
+     filename_tmpl: '{}x4'
+     io_backend:
+       type: disk
+
+   test_2:  # the 2nd test dataset
+     task: SR
+     name: Set14
+     type: PairedImageDataset
+     dataroot_gt: datasets/benchmark/Set14/HR
+     dataroot_lq: datasets/benchmark/Set14/LR_bicubic/X4
+     filename_tmpl: '{}x4'
+     io_backend:
+       type: disk
+
+   test_3:  # the 3rd test dataset
+     task: SR
+     name: B100
+     type: PairedImageDataset
+     dataroot_gt: datasets/benchmark/B100/HR
+     dataroot_lq: datasets/benchmark/B100/LR_bicubic/X4
+     filename_tmpl: '{}x4'
+     io_backend:
+       type: disk
+
+   test_4:  # the 4th test dataset
+     task: SR
+     name: Urban100
+     type: PairedImageDataset
+     dataroot_gt: datasets/benchmark/Urban100/HR
+     dataroot_lq: datasets/benchmark/Urban100/LR_bicubic/X4
+     filename_tmpl: '{}x4'
+     io_backend:
+       type: disk
+
+   test_5:  # the 5th test dataset
+     task: SR
+     name: Manga109
+     type: PairedImageDataset
+     dataroot_gt: datasets/benchmark/Manga109/HR
+     dataroot_lq: datasets/benchmark/Manga109/LR_bicubic/X4
+     filename_tmpl: '{}_LRBI_x4'
+     io_backend:
+       type: disk
+
+
+ # network structures
+ network_g:
+   type: DAT
+   upscale: 4
+   in_chans: 3
+   img_size: 64
+   img_range: 1.
+   split_size: [8,16]
+   depth: [6,6,6,6,6,6]
+   embed_dim: 180
+   num_heads: [6,6,6,6,6,6]
+   expansion_factor: 2
+   resi_connection: '1conv'
+
+ # path
+ path:
+   pretrain_network_g: experiments/pretrained_models/DAT/DAT_S_x4.pth
+   strict_load_g: True
+
+ # validation settings
+ val:
+   save_img: False
+   suffix: ~  # add suffix to saved images, if None, use exp name
+   use_chop: False
+
+   metrics:
+     psnr:  # metric name, can be arbitrary
+       type: calculate_psnr
+       crop_border: 4
+       test_y_channel: True
+     ssim:
+       type: calculate_ssim
+       crop_border: 4
+       test_y_channel: True
options/Test/test_DAT_x2.yml CHANGED
@@ -64,11 +64,11 @@ network_g:
   in_chans: 3
   img_size: 64
   img_range: 1.
-  split_size: [8,16]
+  split_size: [8,32]
   depth: [6,6,6,6,6,6]
   embed_dim: 180
   num_heads: [6,6,6,6,6,6]
-  expansion_factor: 2
+  expansion_factor: 4
   resi_connection: '1conv'

 # path
options/Test/test_DAT_x3.yml CHANGED
@@ -63,11 +63,11 @@ network_g:
   in_chans: 3
   img_size: 64
   img_range: 1.
-  split_size: [8,16]
+  split_size: [8,32]
   depth: [6,6,6,6,6,6]
   embed_dim: 180
   num_heads: [6,6,6,6,6,6]
-  expansion_factor: 2
+  expansion_factor: 4
   resi_connection: '1conv'

 # path
options/Test/test_DAT_x4.yml CHANGED
@@ -64,11 +64,11 @@ network_g:
   in_chans: 3
   img_size: 64
   img_range: 1.
-  split_size: [8,16]
+  split_size: [8,32]
   depth: [6,6,6,6,6,6]
   embed_dim: 180
   num_heads: [6,6,6,6,6,6]
-  expansion_factor: 2
+  expansion_factor: 4
   resi_connection: '1conv'

 # path
options/Train/train_DAT_S_x2.yml ADDED
@@ -0,0 +1,106 @@
+ # general settings
+ name: train_DAT_S_x2
+ model_type: SRModel
+ scale: 2
+ num_gpu: auto
+ manual_seed: 10
+
+ # dataset and data loader settings
+ datasets:
+   train:
+     task: SR
+     name: DF2K
+     type: PairedImageDataset
+     dataroot_gt: datasets/DF2K/HR
+     dataroot_lq: datasets/DF2K/LR_bicubic/X2
+     filename_tmpl: '{}x2'
+     io_backend:
+       type: disk
+
+     gt_size: 128
+     use_hflip: True
+     use_rot: True
+
+     # data loader
+     use_shuffle: True
+     num_worker_per_gpu: 12
+     batch_size_per_gpu: 8
+     dataset_enlarge_ratio: 100
+     prefetch_mode: ~
+
+   val:
+     task: SR
+     name: Set5
+     type: PairedImageDataset
+     dataroot_gt: datasets/benchmark/Set5/HR
+     dataroot_lq: datasets/benchmark/Set5/LR_bicubic/X2
+     filename_tmpl: '{}x2'
+     io_backend:
+       type: disk
+
+ # network structures
+ network_g:
+   type: DAT
+   upscale: 2
+   in_chans: 3
+   img_size: 64
+   img_range: 1.
+   split_size: [8,16]
+   depth: [6,6,6,6,6,6]
+   embed_dim: 180
+   num_heads: [6,6,6,6,6,6]
+   expansion_factor: 2
+   resi_connection: '1conv'
+
+ # path
+ path:
+   pretrain_network_g: ~
+   strict_load_g: True
+   resume_state: ~
+
+ # training settings
+ train:
+   optim_g:
+     type: Adam
+     lr: !!float 2e-4
+     weight_decay: 0
+     betas: [0.9, 0.99]
+
+   scheduler:
+     type: MultiStepLR
+     milestones: [250000, 400000, 450000, 475000]
+     gamma: 0.5
+
+   total_iter: 500000
+   warmup_iter: -1  # no warm up
+
+   # losses
+   pixel_opt:
+     type: L1Loss
+     loss_weight: 1.0
+     reduction: mean
+
+ # validation settings
+ val:
+   val_freq: !!float 5e3
+   save_img: False
+
+   metrics:
+     psnr:  # metric name, can be arbitrary
+       type: calculate_psnr
+       crop_border: 2
+       test_y_channel: True
+
+ # logging settings
+ logger:
+   print_freq: 200
+   save_checkpoint_freq: !!float 5e3
+   use_tb_logger: True
+   wandb:
+     project: ~
+     resume_id: ~
+
+ # dist training settings
+ dist_params:
+   backend: nccl
+   port: 29500
options/Train/train_DAT_S_x3.yml.yml ADDED
@@ -0,0 +1,109 @@
+ # general settings
+ name: train_DAT_S_x3
+ model_type: SRModel
+ scale: 3
+ num_gpu: auto
+ manual_seed: 10
+
+ # dataset and data loader settings
+ datasets:
+   train:
+     task: SR
+     name: DF2K
+     type: PairedImageDataset
+     dataroot_gt: datasets/DF2K/HR
+     dataroot_lq: datasets/DF2K/LR_bicubic/X3
+     filename_tmpl: '{}x3'
+     io_backend:
+       type: disk
+
+     gt_size: 192
+     use_hflip: True
+     use_rot: True
+
+     # data loader
+     use_shuffle: True
+     num_worker_per_gpu: 12
+     batch_size_per_gpu: 8
+     dataset_enlarge_ratio: 100
+     prefetch_mode: ~
+
+   val:
+     task: SR
+     name: Set5
+     type: PairedImageDataset
+     dataroot_gt: datasets/benchmark/Set5/HR
+     dataroot_lq: datasets/benchmark/Set5/LR_bicubic/X3
+     filename_tmpl: '{}x3'
+     io_backend:
+       type: disk
+
+ # network structures
+ network_g:
+   type: DAT
+   upscale: 3
+   in_chans: 3
+   img_size: 64
+   img_range: 1.
+   split_size: [8,16]
+   depth: [6,6,6,6,6,6]
+   embed_dim: 180
+   num_heads: [6,6,6,6,6,6]
+   expansion_factor: 2
+   resi_connection: '1conv'
+
+ # path
+ path:
+   pretrain_network_g: experiments/pretrained_models/DAT-S/DAT_S_x2.pth  # finetuning from the x2 model (with halved initial lr) saves half of the training time
+   strict_load_g: False
+   resume_state: ~
+
+ # training settings
+ train:
+   optim_g:
+     type: Adam
+     # lr: !!float 2e-4
+     lr: !!float 1e-4
+     weight_decay: 0
+     betas: [0.9, 0.99]
+
+   scheduler:
+     type: MultiStepLR
+     # milestones: [250000, 400000, 450000, 475000]
+     milestones: [125000, 200000, 225000, 237500]
+     gamma: 0.5
+
+   # total_iter: 500000
+   total_iter: 250000
+   warmup_iter: -1  # no warm up
+
+   # losses
+   pixel_opt:
+     type: L1Loss
+     loss_weight: 1.0
+     reduction: mean
+
+ # validation settings
+ val:
+   val_freq: !!float 5e3
+   save_img: False
+
+   metrics:
+     psnr:  # metric name, can be arbitrary
+       type: calculate_psnr
+       crop_border: 4
+       test_y_channel: True
+
+ # logging settings
+ logger:
+   print_freq: 200
+   save_checkpoint_freq: !!float 5e3
+   use_tb_logger: True
+   wandb:
+     project: ~
+     resume_id: ~
+
+ # dist training settings
+ dist_params:
+   backend: nccl
+   port: 29500
options/Train/train_DAT_S_x4.yml ADDED
@@ -0,0 +1,110 @@
+ # general settings
+ name: train_DAT_S_x4
+ model_type: SRModel
+ scale: 4
+ num_gpu: auto
+ manual_seed: 10
+
+ # dataset and data loader settings
+ datasets:
+   train:
+     task: SR
+     name: DF2K
+     type: PairedImageDataset
+     dataroot_gt: datasets/DF2K/HR
+     dataroot_lq: datasets/DF2K/LR_bicubic/X4
+     filename_tmpl: '{}x4'
+     io_backend:
+       type: disk
+
+     gt_size: 256
+     use_hflip: True
+     use_rot: True
+
+     # data loader
+     use_shuffle: True
+     num_worker_per_gpu: 12
+     batch_size_per_gpu: 8
+     dataset_enlarge_ratio: 100
+     prefetch_mode: ~
+
+   val:
+     task: SR
+     name: Set5
+     type: PairedImageDataset
+     dataroot_gt: datasets/benchmark/Set5/HR
+     dataroot_lq: datasets/benchmark/Set5/LR_bicubic/X4
+     filename_tmpl: '{}x4'
+     io_backend:
+       type: disk
+
+
+ # network structures
+ network_g:
+   type: DAT
+   upscale: 4
+   in_chans: 3
+   img_size: 64
+   img_range: 1.
+   split_size: [8,16]
+   depth: [6,6,6,6,6,6]
+   embed_dim: 180
+   num_heads: [6,6,6,6,6,6]
+   expansion_factor: 2
+   resi_connection: '1conv'
+
+ # path
+ path:
+   pretrain_network_g: experiments/pretrained_models/DAT-S/DAT_S_x2.pth  # finetuning from the x2 model (with halved initial lr) saves half of the training time
+   strict_load_g: False
+   resume_state: ~
+
+ # training settings
+ train:
+   optim_g:
+     type: Adam
+     # lr: !!float 2e-4
+     lr: !!float 1e-4
+     weight_decay: 0
+     betas: [0.9, 0.99]
+
+   scheduler:
+     type: MultiStepLR
+     # milestones: [250000, 400000, 450000, 475000]
+     milestones: [125000, 200000, 225000, 237500]
+     gamma: 0.5
+
+   # total_iter: 500000
+   total_iter: 250000
+   warmup_iter: -1  # no warm up
+
+   # losses
+   pixel_opt:
+     type: L1Loss
+     loss_weight: 1.0
+     reduction: mean
+
+ # validation settings
+ val:
+   val_freq: !!float 5e3
+   save_img: False
+
+   metrics:
+     psnr:  # metric name, can be arbitrary
+       type: calculate_psnr
+       crop_border: 4
+       test_y_channel: True
+
+ # logging settings
+ logger:
+   print_freq: 200
+   save_checkpoint_freq: !!float 5e3
+   use_tb_logger: True
+   wandb:
+     project: ~
+     resume_id: ~
+
+ # dist training settings
+ dist_params:
+   backend: nccl
+   port: 29500
options/Train/train_DAT_x2.yml ADDED
@@ -0,0 +1,106 @@
+ # general settings
+ name: train_DAT_x2
+ model_type: SRModel
+ scale: 2
+ num_gpu: auto
+ manual_seed: 10
+
+ # dataset and data loader settings
+ datasets:
+   train:
+     task: SR
+     name: DF2K
+     type: PairedImageDataset
+     dataroot_gt: datasets/DF2K/HR
+     dataroot_lq: datasets/DF2K/LR_bicubic/X2
+     filename_tmpl: '{}x2'
+     io_backend:
+       type: disk
+
+     gt_size: 128
+     use_hflip: True
+     use_rot: True
+
+     # data loader
+     use_shuffle: True
+     num_worker_per_gpu: 12
+     batch_size_per_gpu: 8
+     dataset_enlarge_ratio: 100
+     prefetch_mode: ~
+
+   val:
+     task: SR
+     name: Set5
+     type: PairedImageDataset
+     dataroot_gt: datasets/benchmark/Set5/HR
+     dataroot_lq: datasets/benchmark/Set5/LR_bicubic/X2
+     filename_tmpl: '{}x2'
+     io_backend:
+       type: disk
+
+ # network structures
+ network_g:
+   type: DAT
+   upscale: 2
+   in_chans: 3
+   img_size: 64
+   img_range: 1.
+   split_size: [8,32]
+   depth: [6,6,6,6,6,6]
+   embed_dim: 180
+   num_heads: [6,6,6,6,6,6]
+   expansion_factor: 4
+   resi_connection: '1conv'
+
+ # path
+ path:
+   pretrain_network_g: ~
+   strict_load_g: True
+   resume_state: ~
+
+ # training settings
+ train:
+   optim_g:
+     type: Adam
+     lr: !!float 2e-4
+     weight_decay: 0
+     betas: [0.9, 0.99]
+
+   scheduler:
+     type: MultiStepLR
+     milestones: [250000, 400000, 450000, 475000]
+     gamma: 0.5
+
+   total_iter: 500000
+   warmup_iter: -1  # no warm up
+
+   # losses
+   pixel_opt:
+     type: L1Loss
+     loss_weight: 1.0
+     reduction: mean
+
+ # validation settings
+ val:
+   val_freq: !!float 5e3
+   save_img: False
+
+   metrics:
+     psnr:  # metric name, can be arbitrary
+       type: calculate_psnr
+       crop_border: 2
+       test_y_channel: True
+
+ # logging settings
+ logger:
+   print_freq: 200
+   save_checkpoint_freq: !!float 5e3
+   use_tb_logger: True
+   wandb:
+     project: ~
+     resume_id: ~
+
+ # dist training settings
+ dist_params:
+   backend: nccl
+   port: 29500
options/Train/train_DAT_x3.yml ADDED
@@ -0,0 +1,109 @@
+ # general settings
+ name: train_DAT_x3
+ model_type: SRModel
+ scale: 3
+ num_gpu: auto
+ manual_seed: 10
+
+ # dataset and data loader settings
+ datasets:
+   train:
+     task: SR
+     name: DF2K
+     type: PairedImageDataset
+     dataroot_gt: datasets/DF2K/HR
+     dataroot_lq: datasets/DF2K/LR_bicubic/X3
+     filename_tmpl: '{}x3'
+     io_backend:
+       type: disk
+
+     gt_size: 192
+     use_hflip: True
+     use_rot: True
+
+     # data loader
+     use_shuffle: True
+     num_worker_per_gpu: 12
+     batch_size_per_gpu: 8
+     dataset_enlarge_ratio: 100
+     prefetch_mode: ~
+
+   val:
+     task: SR
+     name: Set5
+     type: PairedImageDataset
+     dataroot_gt: datasets/benchmark/Set5/HR
+     dataroot_lq: datasets/benchmark/Set5/LR_bicubic/X3
+     filename_tmpl: '{}x3'
+     io_backend:
+       type: disk
+
+ # network structures
+ network_g:
+   type: DAT
+   upscale: 3
+   in_chans: 3
+   img_size: 64
+   img_range: 1.
+   split_size: [8,32]
+   depth: [6,6,6,6,6,6]
+   embed_dim: 180
+   num_heads: [6,6,6,6,6,6]
+   expansion_factor: 4
+   resi_connection: '1conv'
+
+ # path
+ path:
+   pretrain_network_g: experiments/pretrained_models/DAT/DAT_x2.pth  # finetuning from the x2 model (with halved initial lr) saves half of the training time
+   strict_load_g: False
+   resume_state: ~
+
+ # training settings
+ train:
+   optim_g:
+     type: Adam
+     # lr: !!float 2e-4
+     lr: !!float 1e-4
+     weight_decay: 0
+     betas: [0.9, 0.99]
+
+   scheduler:
+     type: MultiStepLR
+     # milestones: [250000, 400000, 450000, 475000]
+     milestones: [125000, 200000, 225000, 237500]
+     gamma: 0.5
+
+   # total_iter: 500000
+   total_iter: 250000
+   warmup_iter: -1  # no warm up
+
+   # losses
+   pixel_opt:
+     type: L1Loss
+     loss_weight: 1.0
+     reduction: mean
+
+ # validation settings
+ val:
+   val_freq: !!float 5e3
+   save_img: False
+
+   metrics:
+     psnr:  # metric name, can be arbitrary
+       type: calculate_psnr
+       crop_border: 4
+       test_y_channel: True
+
+ # logging settings
+ logger:
+   print_freq: 200
+   save_checkpoint_freq: !!float 5e3
+   use_tb_logger: True
+   wandb:
+     project: ~
+     resume_id: ~
+
+ # dist training settings
+ dist_params:
+   backend: nccl
+   port: 29500
options/Train/train_DAT_x4.yml ADDED
@@ -0,0 +1,110 @@
+ # general settings
+ name: train_DAT_x4
+ model_type: SRModel
+ scale: 4
+ num_gpu: auto
+ manual_seed: 10
+
+ # dataset and data loader settings
+ datasets:
+   train:
+     task: SR
+     name: DF2K
+     type: PairedImageDataset
+     dataroot_gt: datasets/DF2K/HR
+     dataroot_lq: datasets/DF2K/LR_bicubic/X4
+     filename_tmpl: '{}x4'
+     io_backend:
+       type: disk
+
+     gt_size: 256
+     use_hflip: True
+     use_rot: True
+
+     # data loader
+     use_shuffle: True
+     num_worker_per_gpu: 12
+     batch_size_per_gpu: 8
+     dataset_enlarge_ratio: 100
+     prefetch_mode: ~
+
+   val:
+     task: SR
+     name: Set5
+     type: PairedImageDataset
+     dataroot_gt: datasets/benchmark/Set5/HR
+     dataroot_lq: datasets/benchmark/Set5/LR_bicubic/X4
+     filename_tmpl: '{}x4'
+     io_backend:
+       type: disk
+
+
+ # network structures
+ network_g:
+   type: DAT
+   upscale: 4
+   in_chans: 3
+   img_size: 64
+   img_range: 1.
+   split_size: [8,32]
+   depth: [6,6,6,6,6,6]
+   embed_dim: 180
+   num_heads: [6,6,6,6,6,6]
+   expansion_factor: 4
+   resi_connection: '1conv'
+
+ # path
+ path:
+   pretrain_network_g: experiments/pretrained_models/DAT/DAT_x2.pth  # finetuning from the x2 model (with halved initial lr) saves half of the training time
+   strict_load_g: False
+   resume_state: ~
+
+ # training settings
+ train:
+   optim_g:
+     type: Adam
+     # lr: !!float 2e-4
+     lr: !!float 1e-4
+     weight_decay: 0
+     betas: [0.9, 0.99]
+
+   scheduler:
+     type: MultiStepLR
+     # milestones: [250000, 400000, 450000, 475000]
+     milestones: [125000, 200000, 225000, 237500]
+     gamma: 0.5
+
+   # total_iter: 500000
+   total_iter: 250000
+   warmup_iter: -1  # no warm up
+
+   # losses
+   pixel_opt:
+     type: L1Loss
+     loss_weight: 1.0
+     reduction: mean
+
+ # validation settings
+ val:
+   val_freq: !!float 5e3
+   save_img: False
+
+   metrics:
+     psnr:  # metric name, can be arbitrary
+       type: calculate_psnr
+       crop_border: 4
+       test_y_channel: True
+
+ # logging settings
+ logger:
+   print_freq: 200
+   save_checkpoint_freq: !!float 5e3
+   use_tb_logger: True
+   wandb:
+     project: ~
+     resume_id: ~
+
+ # dist training settings
+ dist_params:
+   backend: nccl
+   port: 29500