Optimistic Thompson sampling-based algorithms for episodic reinforcement learning B Hu, TH Zhang, N Hegde, M Schmidt Uncertainty in Artificial Intelligence, 890-899, 2023 | 3 | 2023 |
On PI controllers for updating Lagrange multipliers in constrained optimization M Sohrabi, J Ramirez, TH Zhang, S Lacoste-Julien, J Gallego-Posada arXiv preprint arXiv:2406.04558, 2024 | | 2024 |
Efficient and Adaptive Posterior Sampling Algorithms for Bandits B Hu, Z Huang, TH Zhang, M Lécuyer, N Hegde arXiv preprint arXiv:2405.01010, 2024 | | 2024 |
Optimistic Thompson sampling: strategic exploration in bandits and reinforcement learning TH Zhang University of British Columbia, 2023 | | 2023 |
From 6235149080811616882909238708 to 29: Vanilla Thompson Sampling Revisited B Hu, TH Zhang OPT 2023: Optimization for Machine Learning | | |