publications

A full paper list is available on my Google Scholar page.

2024

  1. Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection
    Tianhe Ren*, Qing Jiang*, Shilong Liu*, and 13 more authors
    arXiv:2405.10300, 2024
    Grounding DINO 1.5 Pro — our most capable model for open-set object detection.
  2. LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
    Shilong Liu, Hao Cheng, Haotian Liu, and 10 more authors
    To appear in ECCV, 2024
    Equip multimodal large language models with tools to create multimodal agents.

2023

  1. Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection
    Shilong Liu, Zhaoyang Zeng, Tianhe Ren, and 8 more authors
    arXiv:2303.05499, 2023; to appear in ECCV, 2024
    SOTA open-set object detector: 52.5 AP on COCO without any COCO training data.
  2. Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation
    Feng Li, Hao Zhang, Huaizhe Xu, and 4 more authors
    In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
    SOTA object detection and segmentation model.
  3. DQ-DETR: Dual Query Detection Transformer for Phrase Extraction and Grounding
    Shilong Liu, Yaoyuan Liang, Shijia Huang, and 5 more authors
    In Proceedings of the AAAI Conference on Artificial Intelligence, 2023
    A comparison of object detection, REC, and phrase grounding tasks.
  4. DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection
    Hao Zhang*, Feng Li*, Shilong Liu*, and 5 more authors
    In International Conference on Learning Representations, 2023
    The first DETR-based object detector to rank first on the COCO detection leaderboard.

2022

  1. DN-DETR: Accelerate DETR Training by Introducing Query Denoising
    Feng Li*, Hao Zhang*, Shilong Liu, and 3 more authors
    In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
    A novel denoising training strategy for DETR, achieving faster convergence and better performance.
  2. DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR
    Shilong Liu, Feng Li, Hao Zhang, and 5 more authors
    In International Conference on Learning Representations, 2022
    A deep understanding of DETR’s query, and formulating queries as anchor boxes.

2021

  1. Query2Label: A Simple Transformer Way to Multi-Label Classification
    Shilong Liu, Lei Zhang, Xiao Yang, and 2 more authors
    arXiv:2107.10834, 2021
    A novel transformer-based multi-label classification model, achieving SOTA on four benchmarks.