GDRNPP for FruitBin
This repository is a fork of the official GDRNPP repository, adapted for the FruitBin dataset.
Preprocessing of FruitBin dataset
In order to train the GDRNPP model on a new dataset, the dataset must be in BOP format. Several scripts were created to convert the FruitBin dataset to BOP format.
The first step is to resize the bounding boxes: they need to be shrunk for the pipeline to work correctly. Here is the script; the paths to the input and output folders are hardcoded and can easily be changed.
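The core of the operation is a simple per-box shrink; a minimal sketch of the idea, assuming axis-aligned boxes stored as (x, y, width, height) and a purely illustrative scale factor (the actual factor and I/O paths are defined in the script):

def shrink_bbox(x, y, w, h, scale=0.9):
    """Shrink an axis-aligned (x, y, w, h) box around its center by `scale`.

    Hypothetical illustration only; not the actual resize script.
    """
    new_w, new_h = w * scale, h * scale
    return x + (w - new_w) / 2.0, y + (h - new_h) / 2.0, new_w, new_h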
Then run the script that does the main part of the preprocessing. It creates the necessary directories, copies the required files into them, and generates the ground-truth json files in the required format. The command (an example invocation follows the argument list below) is:
python /gdrnpp_bop2022/preprocessing/preprocess_fruitbin.py --src_directory PATH_TO_SRC_DIRECTORY --dst_directory PATH_TO_DST_DIRECTORY --scenario SCENARIO
- --src_directory - the input directory with the folders of all the fruits;
- --dst_directory - the output directory;
- --scenario - the scenario for splitting data in the dataset. Basic dataset splitting scenarios:
_world_occ_07.txt, _world_occ_05.txt, _world_occ_03.txt, _world_occ_01.txt, _camera_occ_07.txt, _camera_occ_05.txt, _camera_occ_03.txt, _camera_occ_01.txt
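For example (the source path below is a placeholder, substitute your own; the destination matches the dataset path used elsewhere in this README):
python /gdrnpp_bop2022/preprocessing/preprocess_fruitbin.py --src_directory /data/FruitBin --dst_directory /gdrnpp_bop2022/datasets/BOP_DATASETS/fruitbin --scenario _world_occ_05.txt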
In order to use GT bounding boxes with the FruitBin benchmark, a script was written that generates a .json file with the ground truth in the required format. The command is:
python /gdrnpp_bop2022/preprocessing/generate_gt.py
After creating the file, make sure that the path to it is correct in the main config file in GDRNPP.
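For orientation, GDRNPP consumes detection/GT bounding boxes as a json keyed by scene/image id; the field names below follow the test_bboxes files used by GDRNPP, but treat them as an assumption and check them against the file produced by generate_gt.py:

# Assumed layout of the generated GT-bbox json (illustrative values):
# keys are "scene_id/image_id", values are lists of per-object entries.
gt_bboxes = {
    "000001/000000": [
        {"obj_id": 1, "bbox_est": [120.0, 85.0, 64.0, 58.0], "score": 1.0, "time": 0.0},
    ],
}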
The generate_image_sets_file and generate_test_targets_file scripts create two files required for testing:
python /gdrnpp_bop2022/preprocessing/generate_image_sets_file.py
python /gdrnpp_bop2022/preprocessing/generate_test_targets_file.py
The paths to the input and output directories are hardcoded in these scripts as well. If necessary, the data-splitting scenario can also be changed there. An illustrative excerpt of the two generated files is shown below.
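The test targets follow the BOP convention (a list of per-image object entries), while the image set file lists one scene/image id per line; the values here are made up for the example:

# test_targets file: BOP-style list of per-image test targets (made-up values).
test_targets = [
    {"im_id": 0, "inst_count": 1, "obj_id": 1, "scene_id": 1},
    {"im_id": 1, "inst_count": 1, "obj_id": 2, "scene_id": 1},
]
# image set file: one "scene_id/im_id" entry per line.
image_set_lines = ["000001/000000", "000001/000001"]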
Model evaluation
The ADD metric is used to evaluate the accuracy of the models trained on the FruitBin dataset. The evaluation script estimates the pose accuracy for a single fruit, so it has to be run separately for each fruit (a sketch of the ADD/ADD-S computation follows the argument list below). The command:
python /gdrnpp_bop2022/core/gdrn_modeling/tools/fruitbin/eval_pose.py --path_data=/gdrnpp_bop2022/datasets/BOP_DATASETS/fruitbin/ --pred_path=/gdrnpp_bop2022/output/gdrn/fruitbin/convnext_a6_AugCosyAAEGray_BG05_mlL1_DMask_amodalClipBox_classAware_fruitbin/inference_$MODEL/fruitbin_test/convnext-a6-AugCosyAAEGray-BG05-mlL1-DMask-amodalClipBox-classAware-fruitbin-test-iter0_fruitbin-test.csv --class_name=apple2 --symmetry=True
- --path_data - path to the dataset
- --pred_path - path to .csv file with detected positions created after testing the model
- --class_name - the fruit for which the pose accuracy is estimated
- --symmetry - a boolean indicating whether the fruit is symmetric. In the FruitBin dataset, only the banana and the pear are asymmetric.
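For reference, a minimal sketch of the ADD and ADD-S metrics that such a script computes (standard definitions, not the exact code in eval_pose.py):

import numpy as np
from scipy.spatial import cKDTree

def add_metric(model_pts, R_gt, t_gt, R_pred, t_pred, symmetric=False):
    """Mean distance between model points under the GT and predicted poses.

    model_pts: (N, 3) object model points; R_*: (3, 3) rotations; t_*: (3,) translations.
    symmetric=True gives ADD-S (closest-point distance), used for symmetric objects.
    """
    pts_gt = model_pts @ R_gt.T + t_gt
    pts_pred = model_pts @ R_pred.T + t_pred
    if symmetric:
        # ADD-S: for each GT point, distance to the nearest predicted point.
        dists, _ = cKDTree(pts_pred).query(pts_gt, k=1)
    else:
        # ADD: point-to-point distance with fixed correspondences.
        dists = np.linalg.norm(pts_gt - pts_pred, axis=1)
    return dists.mean()

# A pose is commonly counted as correct when ADD(-S) < 0.1 * object diameter.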
Docker
To facilitate reproducibility, a Docker container with GDRNPP set up for the FruitBin dataset is provided.
- Loading the docker image:
sudo docker pull guillaume0477/6d_pose:gdrnpp_fruitbin
- Creating a container:
xhost +
sudo docker run -it --env="DISPLAY=$DISPLAY" --env="QT_X11_NO_MITSHM=1" --volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" --env="XAUTHORITY=$XAUTH" --volume="$Path_to_dataset/Datasets/BOP_format:/gdrnpp_bop2022/datasets/BOP_DATASETS/fruitbin" --volume="$XAUTH:$XAUTH" --net=host --gpus all --privileged --runtime=nvidia --shm-size 48G guillaume0477/6d_pose:gdrnpp_fruitbin
In the example above, the volume parameter mounts the FruitBin dataset from the local machine into the container. The FruitBin dataset can be downloaded from this link.
Original README of GDRNPP for BOP2022
This repo provides code and models for GDRNPP_BOP2022, winner (most of the awards) of the BOP Challenge 2022 at ECCV'22 [slides].
Path Setting
Dataset Preparation
Download the 6D pose datasets from the BOP website and VOC 2012 for background images.
Please also download the test_bboxes from OneDrive (password: groupji) or BaiDuYunPan (password: vp58).
The structure of the datasets folder should look like below:
datasets/
├── BOP_DATASETS  # https://bop.felk.cvut.cz/datasets/
│   ├── tudl
│   ├── lmo
│   ├── ycbv
│   ├── icbin
│   ├── hb
│   ├── itodd
│   └── tless
└── VOCdevkit
Models
Download the trained models at OneDrive (password: groupji) or BaiDuYunPan (password: 10t3) and put them in the folder ./output.
Requirements
- Ubuntu 18.04/20.04, CUDA 10.1/10.2/11.6, python >= 3.7, PyTorch >= 1.9, torchvision
- Install detectron2 from source: sh scripts/install_deps.sh
- Compile the cpp extensions for farthest points sampling (fps), flow, uncertainty pnp, ransac_voting, chamfer distance, and the egl renderer: sh ./scripts/compile_all.sh
Detection
We adopt YOLOX as the detection method, with stronger data augmentation and the Ranger optimizer.
Training
Download the pretrained model at OneDrive (password: groupji) or BaiDuYunPan (password: aw68) and put it in the folder pretrained_models/yolox. Then use the following command:
./det/yolox/tools/train_yolox.sh <config_path> <gpu_ids> (other args)
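For example (the config name below is illustrative; pick an actual file from the configs/yolox/ folder):
./det/yolox/tools/train_yolox.sh configs/yolox/bop_pbr/yolox_x_640_augCozyAAEhsv_ranger_30_epochs_ycbv_real_pbr_ycbv_bop_test.py 0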
Testing
./det/yolox/tools/test_yolox.sh <config_path> <gpu_ids> <ckpt_path> (other args)
Pose Estimation
The main differences between this repo and GDR-Net (CVPR 2021) include:
- Domain Randomization: We used stronger domain randomization operations than the conference version during training.
- Network Architecture: We used a more powerful backbone (ConvNeXt rather than ResNet-34) and two mask heads that predict the amodal mask and the visible mask separately.
- Other training details, such as learning rate, weight decay, visible threshold, and bounding box type.
Training
./core/gdrn_modeling/train_gdrn.sh <config_path> <gpu_ids> (other args)
For example:
./core/gdrn_modeling/train_gdrn.sh configs/gdrn/ycbv/convnext_a6_AugCosyAAEGray_BG05_mlL1_DMask_amodalClipBox_classAware_ycbv.py 0
Testing
./core/gdrn_modeling/test_gdrn.sh <config_path> <gpu_ids> <ckpt_path> (other args)
For example:
./core/gdrn_modeling/test_gdrn.sh configs/gdrn/ycbv/convnext_a6_AugCosyAAEGray_BG05_mlL1_DMask_amodalClipBox_classAware_ycbv.py 0 output/gdrn/ycbv/convnext_a6_AugCosyAAEGray_BG05_mlL1_DMask_amodalClipBox_classAware_ycbv/model_final_wo_optim.pth
Pose Refinement
We utilize depth information to further refine the estimated pose. We provide two types of refinement: fast refinement and iterative refinement.
For fast refinement, we compare the rendered object depth and the observed depth to refine translation. Run
./core/gdrn_modeling/test_gdrn_depth_refine.sh <config_path> <gpu_ids> <ckpt_path> (other args)
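Conceptually, the fast refinement shifts the estimated translation along the viewing ray so that the rendered depth agrees with the observed depth inside the object mask; a minimal sketch of that idea (an assumed helper, not the exact GDRNPP implementation):

import numpy as np

def refine_translation_with_depth(t_est, depth_rendered, depth_observed, mask):
    """Refine translation from the median depth discrepancy inside the mask.

    t_est: (3,) estimated translation; depth_*: (H, W) depth maps in the same units;
    mask: (H, W) boolean visible-object mask. Conceptual sketch only.
    """
    valid = mask & (depth_rendered > 0) & (depth_observed > 0)
    if not valid.any():
        return t_est
    # Median offset between the observed and rendered depth of the object.
    dz = np.median(depth_observed[valid] - depth_rendered[valid])
    # Move the object along the ray through its center so the depths match;
    # scaling x and y keeps the reprojection of the center unchanged.
    return t_est * (t_est[2] + dz) / t_est[2]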
For iterative refinement, please check out the pose_refine branch for details.
Citing GDRNPP
If you use GDRNPP in your research, please use the following BibTeX entries.
@misc{liu2022gdrnpp_bop,
author = {Xingyu Liu and Ruida Zhang and Chenyangguang Zhang and
Bowen Fu and Jiwen Tang and Xiquan Liang and Jingyi Tang and
Xiaotian Cheng and Yukang Zhang and Gu Wang and Xiangyang Ji},
title = {GDRNPP},
howpublished = {\url{https://github.com/shanice-l/gdrnpp_bop2022}},
year = {2022}
}
@InProceedings{Wang_2021_GDRN,
title = {{GDR-Net}: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation},
author = {Wang, Gu and Manhardt, Fabian and Tombari, Federico and Ji, Xiangyang},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2021},
pages = {16611-16621}
}