<!DOCTYPE HTML>
<!--
Solarize by TEMPLATED
templated.co @templatedco
Released for free under the Creative Commons Attribution 3.0 license (templated.co/license)
-->
<html>
<head>
<title>Deep Reinforcement Learning</title>
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
<meta name="description" content="" />
<meta name="keywords" content="" />
<!--[if lte IE 8]><script src="css/ie/html5shiv.js"></script><![endif]-->
<script src="js/jquery.min.js"></script>
<script src="js/jquery.dropotron.min.js"></script>
<script src="js/skel.min.js"></script>
<script src="js/skel-layers.min.js"></script>
<script src="js/init.js"></script>
<noscript>
<link rel="stylesheet" href="css/skel.css" />
<link rel="stylesheet" href="css/style.css" />
</noscript>
<!--[if lte IE 8]><link rel="stylesheet" href="css/ie/v8.css" /><![endif]-->
</head>
<!-- Header Wrapper -->
<div class="wrapper style1">
<!-- Header -->
<div id="header">
<!-- Logo -->
<h1><a id="logo">Goldenberg Lab</a></h1>
<div class="container">
<!-- Nav -->
<nav id="nav">
<ul>
<li class="active"><a href="index.html">Home</a></li>
<li><a href="current.html">Current Research</a>
</li>
<li><a href="people.html">The Team</a>
</li>
<li><a href="Publications.html">Publications</a></li>
<li><a href="contact.html">Contact</a></li>
</ul>
</nav>
</div>
</div>
<!-- Main -->
<!-- Section Three -->
<div class="wrapper style6">
<section class="container">
<header class="major">
<h2>Deep Reinforcement Learning</h2>
</header>
<div class="12u">
<img src="images/portfolio/8_big.jpg" alt="">
</div>
<div class="text-center" style="color:#000">
Imagine a patient in critical condition. What should be measured, and when, to forecast detrimental events, especially under budget constraints? We answer this question with deep reinforcement learning (RL) that jointly minimizes measurement cost and maximizes predictive gain by scheduling strategically timed measurements. The learned policy depends dynamically on the patient's health history. To scale the framework to an exponentially large action space, we distribute the reward in a sequential setting, which makes learning easier. In simulation, our policy outperforms heuristic-based scheduling, achieving higher predictive gain at lower cost. On a real-world ICU mortality prediction task (MIMIC3), our policies reduce the total number of measurements by 31% or improve predictive gain by a factor of 3 compared to physicians, as estimated by off-policy policy evaluation.
</div>
</section>
<a href="https://arxiv.org/abs/1901.09699">Read the paper on arXiv</a>
</div>
<!-- Footer -->
<div id="footer">
<section class="container">
<header class="major">
<h3><a href="contact.html">Contact Us</a></h3>
</header>
</section>
</div>
</div>
</html>