Bowen Zhang
Title
Cited by
Year
Real-time action recognition with enhanced motion vector CNNs
B Zhang, L Wang, Z Wang, Y Qiao, H Wang
CVPR, 2718-2726, 2016
Cited by: 529; Year: 2016
Ferret: Refer and ground anything anywhere at any granularity
H You, H Zhang, Z Gan, X Du, B Zhang, Z Wang, L Cao, SF Chang, ...
arXiv preprint arXiv:2310.07704, 2023
Cited by: 200; Year: 2023
Real-time action recognition with deeply transferred motion vector CNNs
B Zhang, L Wang, Z Wang, Y Qiao, H Wang
IEEE Transactions on Image Processing 27 (5), 2326-2339, 2018
Cited by: 186; Year: 2018
Cross-Modal and Hierarchical Modeling of Video and Text
B Zhang, H Hu, F Sha
Proceedings of the European Conference on Computer Vision (ECCV), 374-390, 2018
Cited by: 164; Year: 2018
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
B McKinzie, Z Gan, JP Fauconnier, S Dodge, B Zhang, P Dufter, D Shah, ...
arXiv preprint arXiv:2403.09611, 2024
Cited by: 138; Year: 2024
CUHK & ETHZ & SIAT submission to ActivityNet challenge 2016
Y Xiong, L Wang, Z Wang, B Zhang, H Song, W Li, D Lin, Y Qiao, ...
CVPR'16 ActivityNet workshop, 2016
Cited by: 138; Year: 2016
Weakly supervised patchnets: Describing and aggregating local patches for scene recognition
Z Wang, L Wang, Y Wang, B Zhang, Y Qiao
IEEE Transactions on Image Processing 26 (4), 2028-2041, 2017
Cited by: 102; Year: 2017
Co-training Transformer with Videos and Images Improves Action Recognition
B Zhang, J Yu, C Fifty, W Han, AM Dai, R Pang, F Sha
arXiv preprint arXiv:2112.07175, 2021
Cited by: 52; Year: 2021
CUHK & ETHZ & SIAT submission to ActivityNet challenge 2017
Y Zhao, B Zhang, Z Wu, S Yang, L Zhou, S Yan, L Wang, Y Xiong, D Lin, ...
CVPR'17 ActivityNet workshop 8, 8, 2017
Cited by: 48; Year: 2017
VeCLIP: Improving CLIP training via visual-enriched captions
Z Lai, H Zhang, B Zhang, W Wu, H Bai, A Timofeev, X Du, Z Gan, J Shan, ...
European Conference on Computer Vision, 111-127, 2025
Cited by: 35*; Year: 2025
A Hierarchical Multi-Modal Encoder for Moment Localization in Video Corpus
B Zhang, H Hu, J Lee, M Zhao, S Chammas, V Jain, E Ie, F Sha
arXiv preprint arXiv:2011.09046, 2020
Cited by: 30; Year: 2020
Compressing LLMs: The Truth is Rarely Pure and Never Simple
A Jaiswal, Z Gan, X Du, B Zhang, Z Wang, Y Yang
arXiv preprint arXiv:2310.01382, 2023
Cited by: 25; Year: 2023
Learning to Represent Image and Text with Denotation Graph
B Zhang, H Hu, V Jain, E Ie, F Sha
EMNLP'20, 823-839, 2020
Cited by: 24; Year: 2020
Systematic Generalization on gSCAN: What is Nearly Solved and What is Next?
L Qiu, H Hu, B Zhang, P Shaw, F Sha
EMNLP'21, 2021
Cited by: 22; Year: 2021
Topic Augmented Generator for Abstractive Summarization
M Ailem, B Zhang, F Sha
BayLearn, 2019
Cited by: 22; Year: 2019
Less is More: Removing Text-regions Improves CLIP Training Efficiency and Robustness
L Cao, B Zhang, C Chen, Y Yang, X Du, W Zhang, Z Lu, Y Zheng
arXiv preprint arXiv:2305.05095, 2023
Cited by: 20; Year: 2023
MIC-TJU at MediaEval Violent Scenes Detection (VSD) 2014
B Zhang, Y Yi, H Wang, J Yu
MediaEval, 2014
Cited by: 19; Year: 2014
STAIR: Learning Sparse Text and Image Representation in Grounded Tokens
C Chen, B Zhang, L Cao, J Shen, T Gunter, AM Jose, A Toshev, J Shlens, ...
arXiv preprint arXiv:2301.13081, 2023
Cited by: 15; Year: 2023
Apple Intelligence Foundation Language Models
T Gunter, Z Wang, C Wang, R Pang, A Narayanan, A Zhang, B Zhang, ...
arXiv preprint arXiv:2407.21075, 2024
Cited by: 14; Year: 2024
MOFI: Learning Image Representations from Noisy Entity Annotated Images
W Wu, A Timofeev, C Chen, B Zhang, K Duan, S Liu, Y Zheng, J Shlens, ...
arXiv preprint arXiv:2306.07952, 2023
Cited by: 9; Year: 2023