Communications of the Korean Institute of Information Scientists and Engineers (정보과학회지), v.36 no.1 = no.344, 2018, pp.8 - 16
R. S. Sutton and A. G. Barto, Reinforcement Learning. MIT press, 1998.
V. Mnih et al., "Human-level control through deep reinforcement learning," Nature, vol. 518, no. 7540, pp. 529-533, Feb. 2015.
D. Silver et al., "Mastering the game of Go with deep neural networks and tree search," Nature, vol. 529, no. 7587, pp. 484-489, Jan. 2016.
D. Silver et al., "Mastering the game of Go without human knowledge," Nature, vol. 550, no. 7676, pp. 354-359, Oct. 2017.
J. P. O'Doherty, S. W. Lee, and D. McNamee, "The structure of reinforcement-learning mechanisms in the human brain," Curr. Opin. Behav. Sci., vol. 1, pp. 94-100, Oct. 2014.
D. P. Bertsekas, Dynamic programming and optimal control. Athena Scientific, 2005.
M. L. Puterman, Markov decision processes : discrete stochastic dynamic programming. Wiley-Interscience, 2005.
R. S. Sutton, D. A. McAllester, S. P. Singh, and Y. Mansour, "Policy Gradient Methods for Reinforcement Learning with Function Approximation," in Advances in Neural Information Processing Systems (NIPS), 2000, pp. 1057-1063.
D. Silver, G. Lever, N. Heess, T. Degris, D. Wierstra, and M. Riedmiller, "Deterministic policy gradient algorithms," in Proceedings of the 31st International Conference on Machine Learning - Volume 32. JMLR.org, p. I-387, 2014.
W. Schultz, P. Dayan, and P. R. Montague, "A neural substrate of prediction and reward," Science, vol. 275, pp. 1593-1599, 1997.
C. D. Fiorillo, P. N. Tobler, and W. Schultz, "Discrete coding of reward probability and uncertainty by dopamine neurons," Science, vol. 299, no. 5614, pp. 1898-902, Mar. 2003.
B. W. Balleine and J. P. O'Doherty, "Human and rodent homologies in action control: corticostriatal determinants of goal-directed and habitual action," Neuropsychopharmacology, vol. 35, no. 1, pp. 48-69, Jan. 2010.
N. D. Daw, Y. Niv, and P. Dayan, "Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control," Nat. Neurosci., vol. 8, pp. 1704-1711, 2005.
M. Lengyel and P. Dayan, "Hippocampal Contributions to Control: The Third Way," in Advances in Neural Information Processing Systems (NIPS), 2008, pp. 889-896.
S. A. Sheth et al., "Human dorsal anterior cingulate cortex neurons mediate ongoing behavioural adaptation," Nature, pp. 3-7, Jun. 2012.
J. Gläscher, N. Daw, P. Dayan, and J. P. O'Doherty, "States versus Rewards: Dissociable neural prediction error signals underlying model-based and model-free reinforcement learning," Neuron, vol. 66, no. 4, pp. 585-95, May 2010.
N. D. Daw, S. J. Gershman, B. Seymour, P. Dayan, and R. J. Dolan, "Model-based influences on humans' choices and striatal prediction errors," Neuron, vol. 69, no. 6, pp. 1204-15, Mar. 2011.
S. W. Lee, S. Shimojo, and J. P. O'Doherty, "Neural Computations Underlying Arbitration between Model-Based and Model-free Learning," Neuron, vol. 81, no. 3, pp. 687-699, Feb. 2014.
E. Tricomi, B. W. Balleine, and J. P. O'Doherty, "A specific role for posterior dorsolateral striatum in human habit learning," Eur. J. Neurosci., vol. 29, pp. 2225-2232, 2009.
K. Wunderlich, P. Dayan, and R. J. Dolan, "Mapping value based planning and extensively trained choices in the human brain," Nat. Neurosci., vol. 15, pp. 786-791, 2012.
E. D. Boorman, T. E. Behrens, M. W. Woolrich, and M. F. S. Rushworth, "How Green Is the Grass on the Other Side? Frontopolar Cortex and the Evidence in Favor of Alternative Courses of Action," Neuron, vol. 62, pp. 733-743, 2009.
T. A. Hare, C. F. Camerer, and A. Rangel, "Self-control in decision-making involves modulation of the vmPFC valuation system," Science, vol. 324, pp. 646-648, 2009.
M. F. S. Rushworth, M. P. Noonan, E. D. Boorman, M. E. Walton, and T. E. Behrens, "Frontal Cortex and Reward-Guided Learning and Decision-Making," Neuron, vol. 70, pp. 1054-1069, 2011.