VideoChat: Chat-centric video understanding KC Li, Y He, Y Wang, Y Li, W Wang, P Luo, Y Wang, L Wang, Y Qiao arXiv preprint arXiv:2305.06355, 2023 | 394 | 2023 |
TubeTK: Adopting Tubes to Track Multi-Object in a One-Step Training Model B Pang, Y Li, Y Zhang, M Li, C Lu Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2020 | 318 | 2020 |
InternVideo: General Video Foundation Models via Generative and Discriminative Learning Y Wang, K Li, Y Li, Y He, B Huang, Z Zhao, H Zhang, J Xu, Y Liu, Z Wang, ... arXiv preprint arXiv:2212.03191, 2022 | 256 | 2022 |
InternVid: A large-scale video-text dataset for multimodal understanding and generation Y Wang, Y He, Y Li, K Li, J Yu, X Ma, X Chen, Y Wang, P Luo, Z Liu, ... arXiv preprint arXiv:2307.06942, 2023 | 129 | 2023 |
HOI Analysis: Integrating and Decomposing Human-Object Interaction YL Li, X Liu, X Wu, Y Li, C Lu Advances in Neural Information Processing Systems 33, 2020 | 129 | 2020 |
UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer K Li, Y Wang, Y He, Y Li, Y Wang, L Wang, Y Qiao arXiv preprint arXiv:2211.09552, 2022 | 109 | 2022 |
Unmasked teacher: Towards training-efficient video foundation models K Li, Y Wang, Y Li, Y Wang, Y He, L Wang, Y Qiao arXiv preprint arXiv:2303.16058, 2023 | 104 | 2023 |
Test-time personalization with a transformer for human pose estimation Y Li, M Hao, Z Di, NB Gundavarapu, X Wang Advances in Neural Information Processing Systems 34, 2583-2597, 2021 | 46 | 2021 |
InternVideo-Ego4D: A Pack of Champion Solutions to Ego4D Challenges G Chen, S Xing, Z Chen, Y Wang, K Li, Y Li, Y Liu, J Wang, YD Zheng, ... arXiv preprint arXiv:2211.09529, 2022 | 40 | 2022 |
HAKE: a knowledge engine foundation for human activity understanding YL Li, X Liu, X Wu, Y Li, Z Qiu, L Xu, Y Xu, HS Fang, C Lu IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022 | 26 | 2022 |
PGT: A Progressive Method for Training Models on Long Videos B Pang, G Peng, Y Li, C Lu Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021 | 12 | 2021 |
Unsupervised representation for semantic segmentation by implicit cycle-attention contrastive learning B Pang, Y Li, Y Zhang, G Peng, J Tang, K Zha, J Li, C Lu Proceedings of the AAAI Conference on Artificial Intelligence 36 (2), 2044-2052, 2022 | 11 | 2022 |
TDAF: Top-down attention framework for vision tasks B Pang, Y Li, J Li, M Li, H Cao, C Lu Proceedings of the AAAI Conference on Artificial Intelligence 35 (3), 2384-2392, 2021 | 11 | 2021 |
Harvest Video Foundation Models via Efficient Post-Pretraining Y Li, K Li, Y He, Y Wang, Y Wang, L Wang, Y Qiao, P Luo arXiv preprint arXiv:2310.19554, 2023 | 2 | 2023 |