Title | Authors | Venue | Cited by | Year |
RLAIF vs. RLHF: Scaling Reinforcement Learning from Human Feedback with AI Feedback | H Lee, S Phatale, H Mansoor, T Mesnard, J Ferret, K Lu, C Bishop, E Hall, ... | Proceedings of the 41st International Conference on Machine Learning 235 …, 2023 | 331* | 2023 |
Description-Driven Task-Oriented Dialog Modeling | J Zhao, R Gupta, Y Cao, D Yu, M Wang, H Lee, A Rastogi, I Shafran, Y Wu | arXiv preprint arXiv:2201.08904, 2022 | 50 | 2022 |
Show, Don't Tell: Demonstrations Outperform Descriptions for Schema-Guided Task-Oriented Dialogue | R Gupta, H Lee, J Zhao, A Rastogi, Y Cao, Y Wu | Proceedings of the 2022 Conference of the North American Chapter of the …, 2022 | 30 | 2022 |
SGD-X: A Benchmark for Robust Generalization in Schema-Guided Dialogue Systems | H Lee, R Gupta, A Rastogi, Y Cao, B Zhang, Y Wu | Proceedings of the AAAI Conference on Artificial Intelligence 36 (10), 10938 …, 2022 | 29 | 2022 |
AnyTOD: A Programmable Task-Oriented Dialog System | J Zhao, Y Cao, R Gupta, H Lee, A Rastogi, M Wang, H Soltau, I Shafran, ... | Proceedings of the 2023 Conference on Empirical Methods in Natural Language …, 2022 | 8 | 2022 |
RLPF: Reinforcement Learning from Prediction Feedback for User Summarization with LLMs | J Wu, L Ning, L Liu, H Lee, N Wu, C Wang, S Prakash, S O'Banion, ... | arXiv preprint arXiv:2409.04421, 2024 | | 2024 |
Conversational Recommendation as Retrieval: A Simple, Strong Baseline | R Gupta, R Aksitov, S Phatale, S Chaudhary, H Lee, A Rastogi | Proceedings of the 5th Workshop on NLP for Conversational AI (NLP4ConvAI 2023), 2023 | | 2023 |