Wesley A Suttle
U.S. Army Research Laboratory
Verified email at army.mil
Title
Cited by
Year
A multi-agent off-policy actor-critic algorithm for distributed reinforcement learning
W Suttle, Z Yang, K Zhang, Z Wang, T Başar, J Liu
IFAC-PapersOnLine 53 (2), 1549-1554, 2020
Cited by 75, 2020
Beyond exponentially fast mixing in average-reward reinforcement learning via multi-level Monte Carlo actor-critic
WA Suttle, A Bedi, B Patel, BM Sadler, A Koppel, D Manocha
International Conference on Machine Learning, 33240-33267, 2023
Cited by 12, 2023
Reinforcement learning for cost-aware Markov decision processes
W Suttle, K Zhang, Z Yang, J Liu, D Kraemer
International Conference on Machine Learning, 9989-9999, 2021
Cited by 9, 2021
Global Optimality without Mixing Time Oracles in Average-reward RL via Multi-level Actor-Critic
B Patel, WA Suttle, A Koppel, V Aggarwal, BM Sadler, AS Bedi, ...
arXiv preprint arXiv:2403.11925, 2024
Cited by 3, 2024
Deceptive Path Planning via Reinforcement Learning with Graph Neural Networks
MY Fatemi, WA Suttle, BM Sadler
arXiv preprint arXiv:2402.06552, 2024
Cited by 3, 2024
Ada-NAV: Adaptive trajectory-based sample efficient policy learning for robotic navigation
B Patel, K Weerakoon, WA Suttle, A Koppel, BM Sadler, AS Bedi, ...
arXiv preprint arXiv:2306.06192, 2023
Cited by 3, 2023
Reinforcement learning based distributed control of dissipative networked systems
KC Kosaraju, S Sivaranjani, W Suttle, V Gupta, J Liu
IEEE Transactions on Control of Network Systems 9 (2), 856-866, 2021
Cited by 3, 2021
PIPER: Primitive-Informed Preference-based Hierarchical Reinforcement Learning via Hindsight Relabeling
U Singh, WA Suttle, BM Sadler, VP Namboodiri, AS Bedi
arXiv preprint arXiv:2404.13423, 2024
Cited by 1, 2024
Occupancy Information Ratio: Infinite-Horizon, Information-Directed, Parameterized Policy Search
WA Suttle, A Koppel, J Liu
arXiv preprint arXiv:2201.08832, 2022
Cited by 1, 2022
A Convergence Result for Regularized Actor-Critic Methods
W Suttle, Z Yang, K Zhang, J Liu
arXiv preprint arXiv:1907.06138, 2019
Cited by 1, 2019
Towards Global Optimality for Practical Average Reward Reinforcement Learning without Mixing Time Oracles
B Patel, WA Suttle, A Koppel, V Aggarwal, BM Sadler, D Manocha, A Bedi
Forty-first International Conference on Machine Learning
Cited by 1
DIPPER: Direct Preference Optimization to Accelerate Primitive-Enabled Hierarchical Reinforcement Learning
U Singh, S Chakraborty, WA Suttle, BM Sadler, VP Namboodiri, AS Bedi
arXiv preprint arXiv:2406.10892, 2024
2024
Sampling-based Safe Reinforcement Learning for Nonlinear Dynamical Systems
W Suttle, VK Sharma, KC Kosaraju, S Seetharaman, J Liu, V Gupta, ...
International Conference on Artificial Intelligence and Statistics, 4420-4428, 2024
2024
Ada-NAV: Adaptive Trajectory Length-Based Sample Efficient Policy Learning for Robotic Navigation
B Patel, K Weerakoon, WA Suttle, A Koppel, BM Sadler, T Zhou, ...
arXiv e-prints, arXiv:2306.06192, 2023
2023
Information-Directed Policy Search in Sparse-Reward Settings via the Occupancy Information Ratio
WA Suttle, A Koppel, J Liu
2023 57th Annual Conference on Information Sciences and Systems (CISS), 1-6, 2023
2023
Policy Gradient for Ratio Optimization: A Case Study
WA Suttle, A Koppel, J Liu
2022 56th Annual Conference on Information Sciences and Systems (CISS), 281-286, 2022
2022
Reinforcement Learning for Ratio Optimization Problems
WA Suttle
State University of New York at Stony Brook, 2022
2022