ONLINE COURSE ON REINFORCEMENT LEARNING (3:0)
Source: https://cce.iisc.ac.in/cce-proficience/reinforcement-learning-mj-2026/ Parent: https://cce.iisc.ac.in/cce-proficience/
CCE-PROFICIENCE MAY – JULY 2026
Duration
3 months (May to July 2026)
Schedule
Every Saturday
Saturdays, 2 P.M. to 5:30 P.M., with a 30-minute break in between.
Course offered
Online
Exam Duration
31 July to 9 August 2026
Classes Start
~4 May 2026
Objectives of the Course
Reinforcement learning refers to a class of techniques that combine aspects of optimal control, simulation- and data-driven optimization, and approximation methods to solve problems of dynamic decision making under uncertainty when the model of the underlying system and its processes is unknown. Many of the algorithms and techniques used here are model-free and therefore need no knowledge of the system dynamics or the protocols used. Reinforcement learning thus finds applications in diverse areas such as Adaptive Control, Signal Processing, Manufacturing, Communication and Wireless Networks, Autonomous Systems, and Data Mining. The objective of this course is to provide a strong foundation in reinforcement learning through its tools, techniques, and algorithms, and to cover state-of-the-art algorithms in Deep Reinforcement Learning involving simulation-based neural network methods.
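To make the term "model-free" concrete, the following is a minimal sketch of tabular Q-learning (a syllabus topic), not course material. The five-state chain environment, the `step` function, and all hyperparameter values are hypothetical choices made for illustration. The learner only observes sampled transitions and rewards and never reads the transition rule inside `step`.

```python
import random


def step(state, action, n_states=5):
    """Hypothetical chain environment: action 1 moves right, action 0 moves left.

    Reaching the last state yields reward 1 and ends the episode.
    """
    if action == 1:
        next_state = min(state + 1, n_states - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == n_states - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(n_states=5, n_actions=2, episodes=500,
               alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (illustrative values)."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Explore: pick a uniformly random action.
                action = rng.randrange(n_actions)
            else:
                # Exploit: pick a greedy action, breaking ties at random.
                best = max(q[state])
                action = rng.choice(
                    [a for a in range(n_actions) if q[state][a] == best]
                )
            next_state, reward, done = step(state, action, n_states)
            # Bootstrapped target built from the sampled transition only.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


if __name__ == "__main__":
    for s, row in enumerate(q_learning()):
        print(s, [round(v, 3) for v in row])
```

In this toy setting, the learned values for "move right" should end up higher than those for "move left" in every non-terminal state, which is the behaviour one would expect from the optimal policy on this chain.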
Syllabus
Introduction to Reinforcement Learning; Multi-armed Bandits; Markov Decision Processes; Dynamic Programming: Value and Policy Iteration Methods; Model-Free Learning Approaches; Monte-Carlo Methods; Temporal Difference Learning; Q-learning; SARSA; Double Q-learning; Value Function Approximation Methods: TD Learning with Linear Function Approximation; Neural Network Architectures; Deep Q-Network Algorithm; Policy Gradient Methods; Actor-Critic Algorithms.
Minimum Qualification required by the candidates
B.Tech (any discipline); B.Sc in Mathematics / Statistics / Computer Science / Physics / Data Science.
Course Plan for Reinforcement Learning:
Week 1: Introduction to Reinforcement Learning – examples and applications
Week 2: Multi-armed Bandits – action selection strategies
Week 3: Multi-armed Bandits – algorithms; Introduction to Markov Decision Processes
Week 4: Markov Decision Processes – examples, formulations
Week 5: Numerical approaches for Markov Decision Processes
Week 6: Monte-Carlo model-free Reinforcement Learning algorithms for prediction
Week 7: Monte-Carlo algorithms for control; Temporal Difference methods
Week 8: One-step and n-step Temporal Difference Learning, Q-learning, SARSA, Expected SARSA, Double Q-learning
Week 9: Function Approximation methods, TD Learning/SARSA with Linear Function Approximation
Week 10: Neural network architectures, Deep Q-learning
Week 11: Introduction to policy gradient methods – basic principles and results
Week 12: Policy gradient algorithms – REINFORCE, Actor-Critic
Reference Books
- R. Sutton and A. Barto, Reinforcement Learning: An Introduction, 2018 (MIT Press)
- Recent papers (to be shared in class)
Know The Facilitators
Shalabh Bhatnagar
Professor
Dept of Computer Science and Automation,
Indian Institute of Science.
Course Fee
| Particulars | Amount (INR) |
| --- | --- |
| Course Fee | 15,000 |
| Application Fee | 300 |
| GST @ 18% | 2,754 |
| Total | 18,054 |
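The figures in the fee table are consistent with GST being charged on the sum of the course fee and the application fee. That basis is inferred from the numbers and is not stated on the page. A short sketch of the arithmetic, with illustrative variable names:

```python
# Amounts taken from the fee table.
course_fee = 15_000
application_fee = 300
gst_rate_percent = 18

# Assumption: GST applies to course fee plus application fee,
# since 18% of 15,300 matches the 2,754 shown in the table.
subtotal = course_fee + application_fee
gst = subtotal * gst_rate_percent // 100  # integer arithmetic, no rounding needed here
total = subtotal + gst

print(f"Subtotal: {subtotal:,}")  # Subtotal: 15,300
print(f"GST:      {gst:,}")       # GST:      2,754
print(f"Total:    {total:,}")     # Total:    18,054
```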