
Extended version of the DEArt dataset for object detection in artworks

Description

ECCV: European Conference on Computer Vision
AI4DH 2024: 3rd International Workshop on Artificial Intelligence for Digital Humanities

This repository contains the materials presented in the paper 'An approach for dataset extension for object detection in artworks using open-vocabulary models'

Tetiana Yemelianenko, Iuliia Tkachenko, Tess Masclef, Mihaela Scuturici, Serge Miguet

Pipeline

We provide a link to download the extended version of the DEArt dataset, together with the code used for dataset extension.

Dataset

The extended dataset can be downloaded here; a subset containing only the new classes can be downloaded from the same page. The original DEArt dataset can be found here: DEArt.

The extended dataset is annotated in YOLO format, so to use the original DEArt dataset together with the extended version you should convert the DEArt annotations to YOLO format as well.
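The conversion can be sketched as follows. This is a minimal example, not the authors' script: it assumes the original annotations are stored as Pascal VOC-style XML (corner coordinates in pixels), and the function names and the `class_to_id` mapping are hypothetical. YOLO label lines hold a class index followed by the box center, width and height, all normalized by the image size.

```python
import xml.etree.ElementTree as ET


def voc_box_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert pixel corner coordinates to normalized (x_center, y_center, w, h)."""
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


def voc_xml_to_yolo_lines(xml_text, class_to_id):
    """Turn one VOC-style annotation (assumed layout) into YOLO label lines."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = float(size.find("width").text)
    img_h = float(size.find("height").text)

    lines = []
    for obj in root.iter("object"):
        name = obj.find("name").text.strip()
        if name not in class_to_id:
            continue  # skip classes that are missing from the mapping
        bndbox = obj.find("bndbox")
        corners = [float(bndbox.find(tag).text)
                   for tag in ("xmin", "ymin", "xmax", "ymax")]
        xc, yc, w, h = voc_box_to_yolo(*corners, img_w, img_h)
        lines.append(f"{class_to_id[name]} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}")
    return lines
```

Each returned line would be written to a `.txt` file with the same base name as the image. The class indices must match those used in the extended dataset, so the mapping should be built from its class list rather than invented.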

The new version of the dataset contains images from the 12th to the 20th centuries, whereas the original DEArt dataset covers the 12th to the 18th centuries. If necessary, the period of the paintings can be restricted by filtering the images in the WikiArt dataset before dataset extension. Because the extended version uses images from WikiArt, it contains not only paintings from European collections but also works from other traditions, such as Ukiyo-e, a traditional genre of Japanese art. If needed, you can create your own version of the dataset, filtered by style, using the shared dataset-creation code.
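A period or style filter of this kind might look like the sketch below. It is only an illustration: the record fields (`file`, `year`, `style`) and the function name are assumptions, not the actual WikiArt metadata schema or the repository's code.

```python
def filter_artworks(records, min_year=None, max_year=None, excluded_styles=()):
    """Keep records inside the year range whose style is not excluded.

    `records` is a list of dicts with hypothetical keys "file", "year", "style".
    """
    excluded = {style.lower() for style in excluded_styles}
    kept = []
    for rec in records:
        year = rec.get("year")
        style = (rec.get("style") or "").lower()
        if style in excluded:
            continue
        if min_year is not None or max_year is not None:
            if year is None:
                continue  # undated works cannot be checked against the range
            if min_year is not None and year < min_year:
                continue
            if max_year is not None and year > max_year:
                continue
        kept.append(rec)
    return kept
```

For example, calling `filter_artworks(records, min_year=1101, max_year=1800)` would keep only works dated to the 12th through 18th centuries, matching the period of the original DEArt dataset.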

Steps

First, prepare two datasets: a small one with image-level annotations for the classes you plan to extend or add, and a large non-annotated one from which images are collected and annotated at the object level using the proposed approach. Next, train a YOLO model on the original dataset you want to extend, calculate the objectnesses of the objects in the images of the large non-annotated dataset using OWL-ViT2, and create an index file for the objectnesses using ANNOY.
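The role of the index in the last step can be illustrated with a small sketch. It is not the repository's code and does not call ANNOY: it performs the same kind of nearest-neighbor lookup exhaustively, using only the standard library, whereas ANNOY answers such queries approximately and much faster on large collections. It assumes each indexed item is represented by a feature vector; the function names and toy vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_items(query, indexed, k=3):
    """Return the k (item_id, similarity) pairs most similar to `query`.

    `indexed` maps an item id (e.g. an image region) to its vector.
    Brute-force stand-in for an approximate index lookup.
    """
    scored = [(cosine_similarity(query, vec), item_id)
              for item_id, vec in indexed.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(item_id, score) for score, item_id in scored[:k]]
```

In this picture, a query vector taken from the small annotated dataset retrieves the most similar items from the large non-annotated one, which then become candidates for object-level annotation.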

To reproduce our steps, you need the YOLO model fine-tuned on the original DEArt dataset, the file with the objectnesses calculated for the images from the WikiArt dataset, and the ANNOY index. These files are available upon request.

Citation

@InProceedings{Yemelianenko_2024_ECCV,
    author    = {Yemelianenko, Tetiana and Tkachenko, Iuliia and Masclef, Tess and Scuturici, Mihaela and Miguet, Serge},
    title     = {An approach for dataset extension for object detection in artworks using open-vocabulary models},
    booktitle = {},
    month     = {September},
    year      = {2024},
    pages     = {}
}

License

The dataset is available under the Creative Commons Attribution-NonCommercial-ShareAlike (CC-BY-NC-SA) license LiceRI.

Acknowledgments

This work was funded by the French National Research Agency under grant ANR-20-CE38-0017. We would like to thank the PAUSE ANR-Program (Ukrainian scientists support) for supporting the scientific stay of T. Yemelianenko at the LIRIS laboratory.
