Abstract: A multiagent reinforcement learning approach based on case-based reasoning (CBR) is proposed. A system policy case library is built, and the relevant subset of policy cases is selected by judging the cooperation relationships between the agents. Simulated annealing is used to find the best-fitting reusable case policy, and the agents then choose their actions based on that case. If the case library contains no applicable case, the agents carry out joint action learning (JAL). The system policy case library can be updated in real time based on the learning results. Detailed simulation results on the pursuit problem are presented to show the superiority of the proposed method in learning speed and convergence.
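The retrieve-or-learn loop described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the case representation, the similarity measure, the annealing schedule, or the JAL update, so everything here is an assumption made for illustration: the names `similarity`, `retrieve_case`, and `choose_joint_action`, the dictionary layout of a case, the distance-based similarity, the threshold, and the cooling parameters are all hypothetical, and the cooperation-based selection of a case subset is omitted.

```python
import math
import random


def similarity(a, b):
    """Hypothetical similarity in (0, 1]: 1 for identical feature vectors."""
    return 1.0 / (1.0 + math.dist(a, b))


def retrieve_case(library, state, threshold=0.5, steps=200,
                  t0=1.0, cooling=0.97, rng=None):
    """Simulated-annealing search over the cases in `library`.

    Returns the most similar case found, or None when even the best
    case falls below `threshold` (i.e. no applicable case exists).
    """
    if not library:
        return None
    rng = rng or random.Random(0)  # fixed seed keeps the sketch reproducible
    current = rng.randrange(len(library))
    current_score = similarity(library[current]["state"], state)
    best, best_score = current, current_score
    temp = t0
    for _ in range(steps):
        candidate = rng.randrange(len(library))
        cand_score = similarity(library[candidate]["state"], state)
        delta = cand_score - current_score
        # Always accept improvements; accept worse cases with
        # Boltzmann probability exp(delta / temp).
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, cand_score
        if current_score > best_score:
            best, best_score = current, current_score
        temp = max(temp * cooling, 1e-6)  # geometric cooling (assumed)
    return library[best] if best_score >= threshold else None


def choose_joint_action(library, state, jal_fallback):
    """Reuse a stored case if one applies; otherwise defer to JAL."""
    case = retrieve_case(library, state)
    if case is not None:
        return case["joint_action"], "case"
    return jal_fallback(state), "jal"


# Usage: a one-case library and a placeholder standing in for the JAL learner.
library = [{"state": (0.0, 0.0), "joint_action": ("N", "S")}]
fallback = lambda s: ("E", "W")
near = choose_joint_action(library, (0.0, 0.0), fallback)
far = choose_joint_action(library, (100.0, 100.0), fallback)
```

In this sketch a result learned by the fallback could be appended to `library` as a new case, which is one simple way to read the real-time library update mentioned in the abstract.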