{"product_id":"approximate-dynamic-programming-isbn-9780470604458","title":"Approximate Dynamic Programming","description":"\u003cb\u003ePraise for the \u003ci\u003eFirst Edition\u003c\/i\u003e\u003c\/b\u003e  \u003cp\u003e\"Finally, a book devoted to dynamic programming and written using the language of operations research (OR)! This beautiful book fills a gap in the libraries of OR specialists and practitioners.\"\u003cbr\u003e —\u003cb\u003e\u003ci\u003eComputing Reviews\u003c\/i\u003e\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e\u003cb\u003eThis new edition showcases a focus on modeling and computation for complex classes of approximate dynamic programming problems\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003eUnderstanding approximate dynamic programming (ADP) is vital in order to develop practical and high-quality solutions to complex industrial problems, particularly when those problems involve making decisions in the presence of uncertainty. \u003ci\u003eApproximate Dynamic Programming\u003c\/i\u003e, Second Edition uniquely integrates four distinct disciplines—Markov decision processes, mathematical programming, simulation, and statistics—to demonstrate how to successfully approach, model, and solve a wide range of real-life problems using ADP.\u003c\/p\u003e \u003cp\u003eThe book continues to bridge the gap between computer science, simulation, and operations research and now adopts the notation and vocabulary of reinforcement learning as well as stochastic search and simulation optimization. The author outlines the essential algorithms that serve as a starting point in the design of practical solutions for real problems. The three curses of dimensionality that impact complex problems are introduced and detailed coverage of implementation challenges is provided. 
The \u003ci\u003eSecond Edition\u003c\/i\u003e also features:\u003c\/p\u003e \u003cul\u003e \u003cli\u003e \u003cp\u003eA new chapter describing four fundamental classes of policies for working with diverse stochastic optimization problems: myopic policies, look-ahead policies, policy function approximations, and policies based on value function approximations\u003c\/p\u003e \u003c\/li\u003e \u003cli\u003e \u003cp\u003eA new chapter on policy search that brings together stochastic search and simulation optimization concepts and introduces a new class of optimal learning strategies\u003c\/p\u003e \u003c\/li\u003e \u003cli\u003e \u003cp\u003eUpdated coverage of the exploration-exploitation problem in ADP, now including a recently developed method for doing active learning in the presence of a physical state, using the concept of the knowledge gradient\u003c\/p\u003e \u003c\/li\u003e \u003cli\u003e \u003cp\u003eA new sequence of chapters describing statistical methods for approximating value functions, estimating the value of a fixed policy, and value function approximation while searching for optimal policies\u003c\/p\u003e \u003c\/li\u003e \u003c\/ul\u003e \u003cp\u003eThe presented coverage of ADP emphasizes models and algorithms, focusing on related applications and computation while also discussing the theoretical side of the topic that explores proofs of convergence and rate of convergence. A related website features an ongoing discussion of the evolving fields of approximate dynamic programming and reinforcement learning, along with additional readings, software, and datasets.\u003c\/p\u003e \u003cp\u003eRequiring only a basic understanding of statistics and probability, \u003ci\u003eApproximate Dynamic Programming\u003c\/i\u003e, Second Edition is an excellent book for industrial engineering and operations research courses at the upper-undergraduate and graduate levels. 
It also serves as a valuable reference for researchers and professionals who utilize dynamic programming, stochastic programming, and control theory to solve problems in their everyday work.\u003c\/p\u003e  \u003cb\u003ePreface to the Second Edition xi\u003c\/b\u003e  \u003cp\u003e\u003cb\u003ePreface to the First Edition xv\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e\u003cb\u003eAcknowledgments xvii\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e\u003cb\u003e1 The Challenges of Dynamic Programming 1\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e1.1 A Dynamic Programming Example: A Shortest Path Problem, 2\u003c\/p\u003e \u003cp\u003e1.2 The Three Curses of Dimensionality, 3\u003c\/p\u003e \u003cp\u003e1.3 Some Real Applications, 6\u003c\/p\u003e \u003cp\u003e1.4 Problem Classes, 11\u003c\/p\u003e \u003cp\u003e1.5 The Many Dialects of Dynamic Programming, 15\u003c\/p\u003e \u003cp\u003e1.6 What Is New in This Book?, 17\u003c\/p\u003e \u003cp\u003e1.7 Pedagogy, 19\u003c\/p\u003e \u003cp\u003e1.8 Bibliographic Notes, 22\u003c\/p\u003e \u003cp\u003e\u003cb\u003e2 Some Illustrative Models 25\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e2.1 Deterministic Problems, 26\u003c\/p\u003e \u003cp\u003e2.2 Stochastic Problems, 31\u003c\/p\u003e \u003cp\u003e2.3 Information Acquisition Problems, 47\u003c\/p\u003e \u003cp\u003e2.4 A Simple Modeling Framework for Dynamic Programs, 50\u003c\/p\u003e \u003cp\u003e2.5 Bibliographic Notes, 54\u003c\/p\u003e \u003cp\u003eProblems, 54\u003c\/p\u003e \u003cp\u003e\u003cb\u003e3 Introduction to Markov Decision Processes 57\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e3.1 The Optimality Equations, 58\u003c\/p\u003e \u003cp\u003e3.2 Finite Horizon Problems, 65\u003c\/p\u003e \u003cp\u003e3.3 Infinite Horizon Problems, 66\u003c\/p\u003e \u003cp\u003e3.4 Value Iteration, 68\u003c\/p\u003e \u003cp\u003e3.5 Policy Iteration, 74\u003c\/p\u003e \u003cp\u003e3.6 Hybrid Value-Policy Iteration, 75\u003c\/p\u003e \u003cp\u003e3.7 Average Reward Dynamic 
Programming, 76\u003c\/p\u003e \u003cp\u003e3.8 The Linear Programming Method for Dynamic Programs, 77\u003c\/p\u003e \u003cp\u003e3.9 Monotone Policies*, 78\u003c\/p\u003e \u003cp\u003e3.10 Why Does It Work?**, 84\u003c\/p\u003e \u003cp\u003e3.11 Bibliographic Notes, 103\u003c\/p\u003e \u003cp\u003eProblems, 103\u003c\/p\u003e \u003cp\u003e\u003cb\u003e4 Introduction to Approximate Dynamic Programming 111\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e4.1 The Three Curses of Dimensionality (Revisited), 112\u003c\/p\u003e \u003cp\u003e4.2 The Basic Idea, 114\u003c\/p\u003e \u003cp\u003e4.3 \u003ci\u003eQ\u003c\/i\u003e-Learning and SARSA, 122\u003c\/p\u003e \u003cp\u003e4.4 Real-Time Dynamic Programming, 126\u003c\/p\u003e \u003cp\u003e4.5 Approximate Value Iteration, 127\u003c\/p\u003e \u003cp\u003e4.6 The Post-Decision State Variable, 129\u003c\/p\u003e \u003cp\u003e4.7 Low-Dimensional Representations of Value Functions, 144\u003c\/p\u003e \u003cp\u003e4.8 So Just What Is Approximate Dynamic Programming?, 146\u003c\/p\u003e \u003cp\u003e4.9 Experimental Issues, 149\u003c\/p\u003e \u003cp\u003e4.10 But Does It Work?, 155\u003c\/p\u003e \u003cp\u003e4.11 Bibliographic Notes, 156\u003c\/p\u003e \u003cp\u003eProblems, 158\u003c\/p\u003e \u003cp\u003e\u003cb\u003e5 Modeling Dynamic Programs 167\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e5.1 Notational Style, 169\u003c\/p\u003e \u003cp\u003e5.2 Modeling Time, 170\u003c\/p\u003e \u003cp\u003e5.3 Modeling Resources, 174\u003c\/p\u003e \u003cp\u003e5.4 The States of Our System, 178\u003c\/p\u003e \u003cp\u003e5.5 Modeling Decisions, 187\u003c\/p\u003e \u003cp\u003e5.6 The Exogenous Information Process, 189\u003c\/p\u003e \u003cp\u003e5.7 The Transition Function, 198\u003c\/p\u003e \u003cp\u003e5.8 The Objective Function, 206\u003c\/p\u003e \u003cp\u003e5.9 A Measure-Theoretic View of Information**, 211\u003c\/p\u003e \u003cp\u003e5.10 Bibliographic Notes, 213\u003c\/p\u003e \u003cp\u003eProblems, 214\u003c\/p\u003e 
\u003cp\u003e\u003cb\u003e6 Policies 221\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e6.1 Myopic Policies, 224\u003c\/p\u003e \u003cp\u003e6.2 Lookahead Policies, 224\u003c\/p\u003e \u003cp\u003e6.3 Policy Function Approximations, 232\u003c\/p\u003e \u003cp\u003e6.4 Value Function Approximations, 235\u003c\/p\u003e \u003cp\u003e6.5 Hybrid Strategies, 239\u003c\/p\u003e \u003cp\u003e6.6 Randomized Policies, 242\u003c\/p\u003e \u003cp\u003e6.7 How to Choose a Policy?, 244\u003c\/p\u003e \u003cp\u003e6.8 Bibliographic Notes, 247\u003c\/p\u003e \u003cp\u003eProblems, 247\u003c\/p\u003e \u003cp\u003e\u003cb\u003e7 Policy Search 249\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e7.1 Background, 250\u003c\/p\u003e \u003cp\u003e7.2 Gradient Search, 253\u003c\/p\u003e \u003cp\u003e7.3 Direct Policy Search for Finite Alternatives, 256\u003c\/p\u003e \u003cp\u003e7.4 The Knowledge Gradient Algorithm for Discrete Alternatives, 262\u003c\/p\u003e \u003cp\u003e7.5 Simulation Optimization, 270\u003c\/p\u003e \u003cp\u003e7.6 Why Does It Work?**, 274\u003c\/p\u003e \u003cp\u003e7.7 Bibliographic Notes, 285\u003c\/p\u003e \u003cp\u003eProblems, 286\u003c\/p\u003e \u003cp\u003e\u003cb\u003e8 Approximating Value Functions 289\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e8.1 Lookup Tables and Aggregation, 290\u003c\/p\u003e \u003cp\u003e8.2 Parametric Models, 304\u003c\/p\u003e \u003cp\u003e8.3 Regression Variations, 314\u003c\/p\u003e \u003cp\u003e8.4 Nonparametric Models, 316\u003c\/p\u003e \u003cp\u003e8.5 Approximations and the Curse of Dimensionality, 325\u003c\/p\u003e \u003cp\u003e8.6 Why Does It Work?**, 328\u003c\/p\u003e \u003cp\u003e8.7 Bibliographic Notes, 333\u003c\/p\u003e \u003cp\u003eProblems, 334\u003c\/p\u003e \u003cp\u003e\u003cb\u003e9 Learning Value Function Approximations 337\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e9.1 Sampling the Value of a Policy, 337\u003c\/p\u003e \u003cp\u003e9.2 Stochastic Approximation Methods, 347\u003c\/p\u003e \u003cp\u003e9.3 Recursive 
Least Squares for Linear Models, 349\u003c\/p\u003e \u003cp\u003e9.4 Temporal Difference Learning with a Linear Model, 356\u003c\/p\u003e \u003cp\u003e9.5 Bellman’s Equation Using a Linear Model, 358\u003c\/p\u003e \u003cp\u003e9.6 Analysis of TD(0), LSTD, and LSPE Using a Single State, 364\u003c\/p\u003e \u003cp\u003e9.7 Gradient-Based Methods for Approximate Value Iteration*, 366\u003c\/p\u003e \u003cp\u003e9.8 Least Squares Temporal Differencing with Kernel Regression*, 371\u003c\/p\u003e \u003cp\u003e9.9 Value Function Approximations Based on Bayesian Learning*, 373\u003c\/p\u003e \u003cp\u003e9.10 Why Does It Work*, 376\u003c\/p\u003e \u003cp\u003e9.11 Bibliographic Notes, 379\u003c\/p\u003e \u003cp\u003eProblems, 381\u003c\/p\u003e \u003cp\u003e\u003cb\u003e10 Optimizing While Learning 383\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e10.1 Overview of Algorithmic Strategies, 385\u003c\/p\u003e \u003cp\u003e10.2 Approximate Value Iteration and \u003ci\u003eQ\u003c\/i\u003e-Learning Using Lookup Tables, 386\u003c\/p\u003e \u003cp\u003e10.3 Statistical Bias in the Max Operator, 397\u003c\/p\u003e \u003cp\u003e10.4 Approximate Value Iteration and \u003ci\u003eQ\u003c\/i\u003e-Learning Using Linear Models, 400\u003c\/p\u003e \u003cp\u003e10.5 Approximate Policy Iteration, 402\u003c\/p\u003e \u003cp\u003e10.6 The Actor–Critic Paradigm, 408\u003c\/p\u003e \u003cp\u003e10.7 Policy Gradient Methods, 410\u003c\/p\u003e \u003cp\u003e10.8 The Linear Programming Method Using Basis Functions, 411\u003c\/p\u003e \u003cp\u003e10.9 Approximate Policy Iteration Using Kernel Regression*, 413\u003c\/p\u003e \u003cp\u003e10.10 Finite Horizon Approximations for Steady-State Applications, 415\u003c\/p\u003e \u003cp\u003e10.11 Bibliographic Notes, 416\u003c\/p\u003e \u003cp\u003eProblems, 418\u003c\/p\u003e \u003cp\u003e\u003cb\u003e11 Adaptive Estimation and Stepsizes 419\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e11.1 Learning Algorithms and Stepsizes, 420\u003c\/p\u003e 
\u003cp\u003e11.2 Deterministic Stepsize Recipes, 425\u003c\/p\u003e \u003cp\u003e11.3 Stochastic Stepsizes, 433\u003c\/p\u003e \u003cp\u003e11.4 Optimal Stepsizes for Nonstationary Time Series, 437\u003c\/p\u003e \u003cp\u003e11.5 Optimal Stepsizes for Approximate Value Iteration, 447\u003c\/p\u003e \u003cp\u003e11.6 Convergence, 449\u003c\/p\u003e \u003cp\u003e11.7 Guidelines for Choosing Stepsize Formulas, 451\u003c\/p\u003e \u003cp\u003e11.8 Bibliographic Notes, 452\u003c\/p\u003e \u003cp\u003eProblems, 453\u003c\/p\u003e \u003cp\u003e\u003cb\u003e12 Exploration Versus Exploitation 457\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e12.1 A Learning Exercise: The Nomadic Trucker, 457\u003c\/p\u003e \u003cp\u003e12.2 An Introduction to Learning, 460\u003c\/p\u003e \u003cp\u003e12.3 Heuristic Learning Policies, 464\u003c\/p\u003e \u003cp\u003e12.4 Gittins Indexes for Online Learning, 470\u003c\/p\u003e \u003cp\u003e12.5 The Knowledge Gradient Policy, 477\u003c\/p\u003e \u003cp\u003e12.6 Learning with a Physical State, 482\u003c\/p\u003e \u003cp\u003e12.7 Bibliographic Notes, 492\u003c\/p\u003e \u003cp\u003eProblems, 493\u003c\/p\u003e \u003cp\u003e\u003cb\u003e13 Value Function Approximations for Resource Allocation Problems 497\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e13.1 Value Functions versus Gradients, 498\u003c\/p\u003e \u003cp\u003e13.2 Linear Approximations, 499\u003c\/p\u003e \u003cp\u003e13.3 Piecewise-Linear Approximations, 501\u003c\/p\u003e \u003cp\u003e13.4 Solving a Resource Allocation Problem Using Piecewise-Linear Functions, 505\u003c\/p\u003e \u003cp\u003e13.5 The SHAPE Algorithm, 509\u003c\/p\u003e \u003cp\u003e13.6 Regression Methods, 513\u003c\/p\u003e \u003cp\u003e13.7 Cutting Planes*, 516\u003c\/p\u003e \u003cp\u003e13.8 Why Does It Work?**, 528\u003c\/p\u003e \u003cp\u003e13.9 Bibliographic Notes, 535\u003c\/p\u003e \u003cp\u003eProblems, 536\u003c\/p\u003e \u003cp\u003e\u003cb\u003e14 Dynamic Resource Allocation Problems 
541\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e14.1 An Asset Acquisition Problem, 541\u003c\/p\u003e \u003cp\u003e14.2 The Blood Management Problem, 547\u003c\/p\u003e \u003cp\u003e14.3 A Portfolio Optimization Problem, 557\u003c\/p\u003e \u003cp\u003e14.4 A General Resource Allocation Problem, 560\u003c\/p\u003e \u003cp\u003e14.5 A Fleet Management Problem, 573\u003c\/p\u003e \u003cp\u003e14.6 A Driver Management Problem, 580\u003c\/p\u003e \u003cp\u003e14.7 Bibliographic Notes, 585\u003c\/p\u003e \u003cp\u003eProblems, 586\u003c\/p\u003e \u003cp\u003e\u003cb\u003e15 Implementation Challenges 593\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e15.1 Will ADP Work for Your Problem?, 593\u003c\/p\u003e \u003cp\u003e15.2 Designing an ADP Algorithm for Complex Problems, 594\u003c\/p\u003e \u003cp\u003e15.3 Debugging an ADP Algorithm, 596\u003c\/p\u003e \u003cp\u003e15.4 Practical Issues, 597\u003c\/p\u003e \u003cp\u003e15.5 Modeling Your Problem, 602\u003c\/p\u003e \u003cp\u003e15.6 Online versus Offline Models, 604\u003c\/p\u003e \u003cp\u003e15.7 If It Works, Patent It!, 606\u003c\/p\u003e \u003cp\u003e\u003cb\u003eBibliography 607\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e\u003cb\u003eIndex 623\u003c\/b\u003e\u003c\/p\u003e \u003cb\u003eWARREN B. POWELL\u003c\/b\u003e, PhD, is Professor of Operations Research and Financial Engineering at Princeton University, where he is founder and Director of CASTLE Laboratory, a research unit that works with industrial partners to test new ideas found in operations research. The recipient of the 2004 INFORMS Fellow Award, Dr. Powell has authored more than 160 published articles on stochastic optimization, approximate dynamic programming, and dynamic resource management.","brand":"Wiley","offers":[{"title":"Default Title","offer_id":47988755923173,"sku":"NP9780470604458","price":155.95,"currency_code":"USD","in_stock":false}],"thumbnail_url":"\/\/cdn.shopify.com\/s\/files\/1\/1842\/7735\/files\/9780470604458.jpg?v=1761781467","url":"https:\/\/k12savings.com\/es\/products\/approximate-dynamic-programming-isbn-9780470604458","provider":"K12savings","version":"1.0","type":"link"}