Research & Publications Office to host seminar on ‘Decentralised Multi-Agent Natural Actor-Critic Reinforcement Learning Algorithms’ on 6th December

Prof. N. Hemachandra

The talk will be delivered by Prof. N. Hemachandra, IIT Bombay

30 November, 2022, Bengaluru: The Office of Research and Publications (R&P) at IIM Bangalore will host a seminar titled ‘Decentralised Multi-Agent Natural Actor-Critic Reinforcement Learning Algorithms’ at 4:00 pm on Tuesday, 6th December 2022, in Classroom K21. The talk will be delivered by Prof. N. Hemachandra, IIT Bombay, from the Production & Operations Management area.

Abstract: Multi-agent actor-critic algorithms are an important part of the Reinforcement Learning (RL) paradigm. This research proposes three fully decentralized multi-agent natural actor-critic (MAN) algorithms. The objective is to collectively find a joint policy that maximizes the average long-term return of the agents. In the absence of a central controller, and to preserve privacy, agents communicate some information to their neighbors via a time-varying communication network. The research proves that all three MAN algorithms, which use linear function approximations, converge to a globally asymptotically stable set of the ODE corresponding to the actor update. It also shows that the Kullback–Leibler divergence between policies of successive iterates is proportional to the objective function’s gradient. It is observed that the minimum singular value of the Fisher information matrix is well within the reciprocal of the policy parameter dimension. Using this, the researchers theoretically show that the optimal value of the deterministic variant of the MAN algorithm at each iterate dominates that of the standard gradient-based multi-agent actor-critic (MAAC) algorithm. This may be the first such result in multi-agent reinforcement learning (MARL). To illustrate the usefulness of the proposed algorithms, the researchers implement them on a bi-lane traffic network to reduce the average network congestion. They observe an almost 25% reduction in average congestion with two of the MAN algorithms; the average congestion with the third MAN algorithm is on par with that of the MAAC algorithm.
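
For readers unfamiliar with the terms in the abstract, the following is a minimal background sketch of a natural policy gradient step. It is not the MAN algorithms from the talk: it assumes a single agent, a single state, and a softmax policy over three actions, and every name and number in it (`rewards`, `step`, the helper functions) is a hypothetical choice made only for illustration. It shows how the Fisher information matrix rescales the ordinary gradient, and how the Kullback–Leibler divergence between two successive policies can be approximated by a quadratic form in the parameter change.

```python
import numpy as np

def softmax(theta):
    # Numerically stable softmax over action logits.
    z = np.exp(theta - theta.max())
    return z / z.sum()

def fisher_matrix(pi):
    # Fisher information of a softmax policy with respect to its logits:
    # E[grad log pi * grad log pi^T] = diag(pi) - pi pi^T.
    return np.diag(pi) - np.outer(pi, pi)

def policy_gradient(pi, rewards):
    # Ordinary gradient of the expected reward J = sum_a pi_a * r_a
    # with respect to the logits.
    return pi * (rewards - pi @ rewards)

def natural_gradient(pi, rewards):
    # The Fisher matrix is singular here (shifting all logits by a constant
    # leaves the policy unchanged), so a pseudo-inverse is used.
    fisher_pinv = np.linalg.pinv(fisher_matrix(pi), rcond=1e-10)
    return fisher_pinv @ policy_gradient(pi, rewards)

def kl_divergence(p, q):
    # KL(p || q) for two discrete distributions with full support.
    return float(np.sum(p * np.log(p / q)))

# Hypothetical toy problem: three actions with fixed rewards.
rewards = np.array([1.0, 0.0, 0.5])
theta = np.zeros(3)   # start from the uniform policy
step = 0.01           # assumed step size

pi = softmax(theta)
delta = step * natural_gradient(pi, rewards)
pi_next = softmax(theta + delta)

# Exact KL between successive policies and its second-order approximation.
kl = kl_divergence(pi, pi_next)
kl_quadratic = 0.5 * delta @ fisher_matrix(pi) @ delta

print("natural gradient direction:", natural_gradient(pi, rewards))
print("KL between successive policies:", kl)
print("quadratic approximation:", kl_quadratic)
```

In this toy setting the natural gradient direction reduces to the rewards centred by their mean, independent of the current policy, which is one way to see why natural gradients are less sensitive to how the policy is parameterised. The multi-agent, decentralised setting described in the abstract adds function approximation and communication between agents, none of which is modelled here.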

About the speaker: Dr. N. Hemachandra is a Professor of Industrial Engineering and Operations Research at IIT Bombay. His academic interests include Operations Research and Machine Learning, including RL and Bandit problems, and their applications to problems arising in Supply Chains, Communication Networks, Logistics and Financial Engineering. He has an M Tech (Control Systems Engg., EE) from IIT Kharagpur and a PhD from the Dept. of Computer Science and Automation, IISc.

Webpage link: https://www.ieor.iitb.ac.in/~nh

