Theories of consequences in ambient control

It turns out that ambient dependencies can be generalized in an
interesting way to give a notion of "logical consequences" of a
hypothetical decision, with preference comparing "theories of
consequences" rather than individual outcomes for the environment.
This raises the issue that most of these theories contain false
statements (assumptions that the agent does Y when in fact it will
turn out to do X). This doesn't necessarily make these theories
inconsistent, and the need to compare them in a non-trivial way draws
attention to the strength of the underlying theory: naive all-piercing
logical transparency is ruled out, and not just because it isn't
achievable.

== Summary ==

In "Basic concepts in ADT", I recap the main concepts of ambient
decision theory (as described in the previous post); in "The decision
problem", I recap the setting in which the agent must make a decision;
and in "Ambient dependencies", how ambient dependencies can be used to
infer preference about the agent's strategies from preference about
the environment's strategies. (If the general picture of ADT was clear
from the previous post, these sections can be skipped.)

"Conflicting dependencies" describes a problem with ambient
dependencies, explained more clearly from the "theories of
consequences" point of view in the next two sections. In "Theories of
consequences", I describe a more general way of looking at the process
of inferring preference for the agent's strategies, and in
"Consequences and inconsistency" I discuss some of its properties.

== Basic concepts in ADT ==

Program - a lambda term. Agent and environment are two specific lambda
terms. No notion of program execution is assumed; termination is not
required of the agent or the environment. Notation: A, B, X, Y, etc.

Strategy - an extensional equivalence class of programs (beta-eta
equivalence). Each program implements some strategy (is an element of
some equivalence class), and a strategy can be given by some program
that implements it. For two given programs, the question of whether
they implement the same strategy may be undecidable, although a
sequence of alpha-, beta- or eta-conversions can prove two programs
equivalent. Notation: the strategy of A is [[A]]; if A and B are
equivalent, I write A~B, which is the same as [[A]]=[[B]].

Ambient dependence - for given programs B and C, A is an ambient
dependence of C on B if C~(A B) (where (A B) is application, passing B
as a parameter to A). C doesn't have to have B as an explicit part of
its definition, or look anything like (A B); it just has to be
equivalent to that program.

Preference relation - for two pairs of programs, A,B and X,Y, a
preference relation (A,B)>(X,Y) specifies that the agent prefers A~B
to be true more than it prefers X~Y to be true. In particular, if E is
the environment program, (E,P)>(E,Q) specifies that the agent prefers
the environment to implement strategy [[P]] more than it prefers it to
implement strategy [[Q]]. Since preference is about strategies, not
specific programs, for any A~X and B~Y, (A,B)=(X,Y)=(Y,X).
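
As a concrete (if toy) reading of these definitions, here is a minimal
Python sketch of my own: programs are de Bruijn-indexed lambda terms,
and "implements the same strategy" is approximated by comparing
beta-normal forms found within a step budget. Equivalence is
undecidable in general and eta-conversion is omitted, so the check can
only answer "provably equivalent" or "don't know". All names here are
illustrative, not part of ADT.

    from dataclasses import dataclass
    from typing import Optional, Union

    @dataclass(frozen=True)
    class Var:
        i: int                     # de Bruijn index

    @dataclass(frozen=True)
    class Lam:
        body: "Term"

    @dataclass(frozen=True)
    class App:
        fun: "Term"
        arg: "Term"

    Term = Union[Var, Lam, App]

    def shift(t: Term, d: int, cutoff: int = 0) -> Term:
        if isinstance(t, Var):
            return Var(t.i + d) if t.i >= cutoff else t
        if isinstance(t, Lam):
            return Lam(shift(t.body, d, cutoff + 1))
        return App(shift(t.fun, d, cutoff), shift(t.arg, d, cutoff))

    def subst(t: Term, j: int, s: Term) -> Term:
        if isinstance(t, Var):
            return s if t.i == j else t
        if isinstance(t, Lam):
            return Lam(subst(t.body, j + 1, shift(s, 1)))
        return App(subst(t.fun, j, s), subst(t.arg, j, s))

    def step(t: Term) -> Optional[Term]:
        """One normal-order beta step, or None if t is in normal form."""
        if isinstance(t, App):
            if isinstance(t.fun, Lam):   # beta redex
                return shift(subst(t.fun.body, 0, shift(t.arg, 1)), -1)
            r = step(t.fun)
            if r is not None:
                return App(r, t.arg)
            r = step(t.arg)
            if r is not None:
                return App(t.fun, r)
        if isinstance(t, Lam):
            r = step(t.body)
            if r is not None:
                return Lam(r)
        return None

    def normalize(t: Term, budget: int = 1000) -> Optional[Term]:
        for _ in range(budget):
            nxt = step(t)
            if nxt is None:
                return t
            t = nxt
        return None                      # gave up within the budget

    def same_strategy(a: Term, b: Term) -> Optional[bool]:
        """True if A~B is established within the budget, None if unknown."""
        na, nb = normalize(a), normalize(b)
        if na is None or nb is None:
            return None
        return na == nb

    def is_ambient_dependence(a: Term, b: Term, c: Term) -> Optional[bool]:
        """Is A an ambient dependence of C on B, i.e. does (A B)~C hold?"""
        return same_strategy(App(a, b), c)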

== The decision problem ==

Agent A and environment E are two particular programs, where the
environment is the only program the agent cares about in itself. If
the agent cares about multiple programs, take as the environment a
single program that enumerates them. The agent will then
instrumentally care about the constituent programs it enumerates,
provided preference about the compound environment is set up
accordingly. In particular, we could take a universal machine as the
environment: this makes it possible to specify preference for all
programs, each of which is a part of the universal machine, obtained
by passing it an index (a syntactic representation) of the given
program.

All programs are isolated; there are no explicit dependencies set up
between them. In particular, where the agent sits in the environment
is not specified, and the agent might not be explicitly present in the
environment at all.

Whatever is relevant about the environment has to be represented in
such a way that it's reflected in the environment's strategy.
Preference is about the environment's strategy, not syntactic details.
The environment's program (like all other programs) is fixed by its
definition, but strategies are not transparently seen through the
programs, even though they are determined by them.

Preference for the environment is given as a collection of preference
relations for the environment E: statements of the form (E,X)>(E,Y),
where X and Y are some programs, assumed non-equivalent.

The agent is assumed to know its own program (or in any case some
program that implements the same strategy), the environment's program,
and its preference about the environment. The process of
decision-making consists in the agent determining its strategy,
without (obviously) changing its own program, or any other program.
The agent has to set its own strategy in such a way as to make the
environment's strategy better.

== Ambient dependencies ==

The method for inferring instrumental preference I suggested in the
previous post is based on ambient dependencies. Given two programs, B
and C, A is an ambient dependence of C on B if (A B)~C. For any two
programs X and Y, if (C,(A X))>(C,(A Y)), then (B,X)>(B,Y).

In other words, (B,X) is the event of B implementing strategy X, and
(B,Y) the event of B implementing strategy Y. The dependence (A B)~C
tells us that if B~X, then also C~(A B)~(A X), so the event (C,(A X))
follows from (B,X); and similarly for Y. Thus, the choice between B~X
and B~Y is also the choice between C~(A X) and C~(A Y), and so if we
have the preference (C,(A X))>(C,(A Y)), it follows that we have the
preference (B,X)>(B,Y). (Or so it seems; see the rest of the post.)

Given two dependencies, say of D on C and of C on B, we can obtain a
dependence of D on B by composition. This creates a setting for
inferring dependencies of the environment on the agent by exploring
dependencies between other programs.
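
Building on the toy term representation sketched earlier, composing
two dependencies is just composition at the term level. The
Church-numeral usage below is my own illustrative example, not from
the original post.

    def compose_dependencies(q: Term, p: Term) -> Term:
        """Given closed terms with (P B)~D and (Q D)~C, return
        A = \\x. Q (P x), an ambient dependence of C on B: (A B)~C."""
        return Lam(App(q, App(p, Var(0))))

    # Usage with Church numerals: zero = \f.\x. x, succ = \n.\f.\x. f (n f x).
    ZERO = Lam(Lam(Var(0)))
    SUCC = Lam(Lam(Lam(App(Var(1), App(App(Var(2), Var(1)), Var(0))))))
    ONE = normalize(App(SUCC, ZERO))
    TWO = normalize(App(SUCC, ONE))

    A = compose_dependencies(SUCC, SUCC)        # composes two "successor" steps
    assert is_ambient_dependence(A, ZERO, TWO)  # (A ZERO) ~ TWO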

== Conflicting dependencies ==

Notice that ambient dependencies are very weakly restricted by their
definition: a program B, to qualify as a dependence of C on A, only
needs to satisfy (B A)~C; that is, only its value on the argument A is
required to be equivalent to C, while its values on other arguments
are not fixed. Thus, it should be relatively easy to find two
dependencies, B and B', that produce non-equivalent values for some
argument other than A: (B X)~Y and (B' X)~Y', where Y is not
equivalent to Y'. Following the preference inference method from the
previous section, if Y and Y' are valued differently as strategies for
C, say (C,Y)>(C,Y'), then B and B' send conflicting suggestions about
the value of X as A's strategy: it would seem that A~X implies C~Y
according to dependence B, and also implies C~Y' according to
dependence B'.

For example, consider an integer-valued agent A and environment E=A*A.
The goal of the agent is to minimize E. The agent is only able to come
up with the dependencies D1=\x.x*A and D2=\x.x*x, for which it holds
that (D1 A)~E and (D2 A)~E. D2 is helpful for finding the solution
A~0, which gives E~0 as well. But now that we know that A~E~0, it's
possible to find different dependencies D satisfying the condition
(D A)~E, for example D=\x.((x-1)*(x-1)-1). It is still true that
(D A)~E, but it suggests that A~1 gives a better (smaller) environment
value E~-1.
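
A direct numeric rendering of this example, as a sketch of my own (the
function names are illustrative):

    # The agent's strategy is an integer a, the environment's value is a*a,
    # and the agent prefers smaller environment values.
    def rank_strategies(dependence, candidates):
        """Order candidate agent strategies by the environment value the
        given dependence predicts for them (smaller is better)."""
        return sorted(candidates, key=dependence)

    D2 = lambda x: x * x                      # a genuine dependence: E = A*A
    candidates = [-2, -1, 0, 1, 2]
    print(rank_strategies(D2, candidates))    # [0, -1, 1, -2, 2]: A~0 ranks first

    # Suppose the agent settles on A~0, so in fact E~0. In retrospect the
    # spurious dependence D also satisfies (D A)~E, since D(0) == 0:
    D = lambda x: (x - 1) * (x - 1) - 1
    print(rank_strategies(D, candidates))     # [1, 0, 2, -1, -2]: D "prefers"
                                              # A~1, falsely predicting E~-1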

This is in conflict with the preference given by D2: (D2 1)~1, while
(D 1)~-1, and so D2 suggests that (A,0)>(A,1), given that (E,0)>(E,1),
but D suggests that (A,0)<(A,1), given that (E,0)<(E,-1). In
retrospect, the decision to make A~0 seems suboptimal, and many
dependencies that weren't provable before the decision was made
suddenly become obvious. These dependencies-in-retrospect also give
bad ideas about preferred decisions. Why is that?

== Theories of consequences ==

The essential step in using an ambient dependence (A B)~C is that it
tells us that B~X => C~(A X), for each X. Then, given two statements
B~X and B~Y, we can use the dependence to show B~X => C~(A X) and
B~Y => C~(A Y), and if we prefer the consequence (outcome) C~(A X) to
C~(A Y), this allows us to judge the choice B~X to be preferable to
the choice B~Y.

Two dependencies (P B)~D and (Q D)~C can be combined into
A=\x.(Q (P x)), for which it holds that (A B)~C. Such dependencies can
be combined directly this way, or we can look at their consequences
for specific assumptions of the form B~X. Then B~X => D~(P X), and
D~(P X) => (Q (P X))~C, and so B~X => (A X)~C, all without directly
using the dependence (A B)~C. The assumption B~X is then judged by the
fact that (A X)~C is among its logical consequences.

This suggests abandoning the mechanism of ambient dependencies for a
more general mechanism of theories of consequences. Instead of
inferring instrumental preference for the agent's strategies in one
step, using an appropriate dependence, start with an assumption of the
form B~X (the agent B's strategy is equivalent to X), and see what
logically follows in some fixed theory. Then compare the theories of
consequences for the various assumptions with each other, based on
what consequences for the environment they prove (and choose the
strategy which has the best-ranked theory of consequences).
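
As a toy model of this rule (my own sketch, with a trivial stand-in
for proof search in the fixed theory): represent each theory of
consequences as the set of statements derived from the assumption,
rank the theories by the environment consequences they contain, and
pick the best-ranked assumption.

    from typing import Callable, FrozenSet, Sequence, Tuple

    Statement = Tuple[str, int]      # stand-in for statements like "E ~ Y"
    Theory = FrozenSet[Statement]

    def decide(candidates: Sequence[int],
               consequences: Callable[[int], Theory],
               value: Callable[[Theory], float]) -> int:
        """For each candidate strategy X, form the theory of consequences
        of the assumption B~X (here just a set of derived statements),
        rank the theories by `value`, and return the best-ranked candidate."""
        return max(candidates, key=lambda x: value(consequences(x)))

    # Usage in the numeric setting above, where the only consequence the
    # bounded search finds for the assumption A~x is the statement E~x*x:
    def toy_consequences(x: int) -> Theory:
        return frozenset([("E", x * x)])

    def toy_value(theory: Theory) -> float:
        # The agent prefers smaller proven values of E.
        return -min(v for (name, v) in theory if name == "E")

    print(decide(range(-2, 3), toy_consequences, toy_value))   # -> 0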

Let the underlying theory (without specifying exactly what it is) be
called EQ. This theory deals with equivalence of programs, and
statements of the form A~B, "A is equivalent to B", are its basic
building blocks. Then knowing an ambient dependence (A B)~C means that
EQ |- (A B)~C, from which it should follow that EQ |- (B~X) => (A X)~C,
or in other words EQ+(B~X) |- (A X)~C. Here, EQ+(B~X) is the theory of
consequences of the decision B~X, and the statements provable in this
theory determine the value of this decision. In particular, if (A X)~C
is known to be good, its presence in the theory of consequences of B~X
gives value to B~X.

Using theories of consequences for comparing strategies suggests
extending the notion of preference to compare whole theories, and not
just individual program equivalence statements. Theories can be seen
as collections of statements, and given that the strength of EQ is
necessarily limited, even equivalent strategies can fail to be
provably equivalent, and can have non-trivially overlapping consistent
theories of consequences. This suggests treating theories as "events",
collections of elements from a set of possible consequences, and
possibly employing something like an expected utility calculation to
compute their value.
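
A very rough sketch of the "theories as events" idea (my own
illustration; the probability and utility assignments are assumed
inputs, not something ADT specifies): score a theory by the expected
utility of the possible consequences it proves.

    def event_value(theory, prob, util):
        """theory: a set of statements; prob, util: dicts over a fixed set
        of possible consequence statements. Treats the theory as the event
        made up of the consequences it contains."""
        return sum(prob[s] * util[s] for s in theory if s in prob)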

== Consequences and inconsistency ==

If we use an enumeration (not necessarily exhaustive) of non-equivalent
programs X, Y, ... as candidates for agent B's strategy, it will be
true that B~X for at most one such X, and false for all others. How
can we consider the theories EQ+(B~Y) then, with B~Y false? The trick
is that even where B~Y is false, it is by no means necessarily
provably false, and so EQ+(B~Y) is by no means necessarily
inconsistent. This allows us to have all these non-trivial
(consistent) theories, almost all of which have false consequences,
and so gives us something to have preference about.

The process of decision-making doesn't prove B~X in EQ when X is
chosen as B's strategy; this step is carried out at a different level,
after comparing the theories of consequences extending EQ. This
process, living outside EQ, is exactly what makes B~X true and B~Y
false, and this knowledge is inaccessible to EQ itself, whose weakness
shapes the decision. It is exactly because EQ is weak enough to leave
the false theories of consequences consistent that the agent has the
ability to compare these consequences and choose the one it likes
best, thus making it true.

Inconsistent theories of consequences may well be considered
undesirable, according to the preference over consequences, but in
general it's hopeless to demand that preference disavow inconsistent
theories of consequences, because that would just amount to trying to
prove the consistency of the outcome. Inconsistent theories of
consequences are all alike, which makes them just one point among many
others, most of those other theories proving false statements. It's
not the greatest worry that we have a provable statement that is
apparently false, when we expect statements that are false but not
provably false at every corner.

(Exercise: Explain the spurious ambient dependencies seen in
retrospect in terms of theories of consequences. What is the source of
inconsistency that enables all those dependencies to be proven?)