- It turns out that ambient dependencies can be generalized in an
- interesting way to give a notion of "logical consequences" of a
- hypothetical decision, with preference comparing "theories of
- consequences" rather than individual outcomes for the environment.
- This raises the issue that most of these theories contain false
- statements (assumptions that the agent does Y when in fact it will
- turn out to do X). This doesn't necessarily mean these theories are
- inconsistent, and the need to compare them non-trivially draws
- attention to the strength of the underlying theory, ruling out naive
- all-piercing logical transparency for reasons beyond its being
- unachievable.
- == Summary ==
- In "Basic concepts in ADT", I recap the main concepts in ambient
- decision theory (as described in the previous post), in "The decision
- problem" I recap the setting in which the agent must make a decision,
- and in "Ambient dependencies" how ambient dependencies can be used to
- infer preference about agent's strategies from preference about
- environment's strategies. (If the general picture of ADT was clear
- from the previous post, these sections can be skipped.)
- "Conflicting dependencies" describes a problem with ambient
- dependencies, explained more clearly from the "theories of
- consequences" point of view in the next two sections. In "Theories of
- consequences", I describe a more general way of looking at the process
- of inferring preference for agent's strategies, and in "Consequences
- and inconsistency" discuss some of its properties.
- == Basic concepts in ADT ==
- Program - a lambda term. Agent and environment are two specific lambda
- terms. No notion of program execution is assumed, and termination is
- not required of the agent or the environment. Notation: A, B, X, Y, etc.
- Strategy - an extensional equivalence class of programs (beta-eta
- equivalence). Each program implements some strategy (the equivalence
- class it belongs to), and a strategy can be given by some program
- that implements it. For two given programs, the question of whether
- they implement the same strategy may be undecidable. A sequence of
- alpha-, beta-, or eta-conversions can prove two programs to be
- equivalent. Notation: the strategy of A is [[A]]; if A and B are
- equivalent, I write A~B, which is the same as [[A]]=[[B]].
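- As a rough illustration (a sketch in Python rather than the untyped
- lambda calculus used here, so only an analogy): two syntactically
- different programs can implement the same strategy, and deciding
- whether they do is the hard part in general.

    # Two syntactically different "programs".
    A = lambda x: x + x
    B = lambda x: 2 * x

    # Extensionally they agree on integers, so in this analogy they
    # implement the same "strategy". In general such equivalence is
    # undecidable; here we can only spot-check a few arguments.
    assert all(A(n) == B(n) for n in range(-5, 6))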
- Ambient dependence - for given programs B and C, A is an ambient
- dependence of C on B if C~(A B) (where (A B) is application, passing B
- as a parameter to A). C doesn't have to have B as an explicit part of
- its definition, or look anything like (A B); it just has to be
- equivalent to that program.
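- A minimal Python sketch of the same idea (the particular programs are
- my own made-up examples, not anything from the post): C is not written
- as an application of A to B, yet it is equivalent to one.

    B = lambda: 3                        # some program B
    A = lambda b: (lambda: b() * b())    # candidate dependence of C on B
    C = lambda: 9                        # C is not written as (A B)...

    # ...but C ~ (A B): both return 9, so A qualifies as an ambient
    # dependence of C on B (up to this extensional check).
    assert C() == A(B)()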
- Preference relation - for two pairs of programs, (A,B) and (X,Y), a
- preference relation (A,B)>(X,Y) specifies that the agent prefers A~B
- to be true more than it prefers X~Y to be true. In particular, if E is
- the environment program, (E,P)>(E,Q) specifies that the agent prefers
- the environment to implement strategy [[P]] more than it prefers it to
- implement strategy [[Q]]. Since preference is about strategies, not
- specific programs, for any A~X and B~Y, (A,B)=(X,Y)=(Y,X).
- == The decision problem ==
- Agent A and environment E are two particular programs, where the
- environment is the only program that the agent cares about in itself.
- If the agent cares about multiple programs, consider as the
- environment a single program which enumerates them. Then, the agent
- will instrumentally care about the constituent programs enumerated by
- it, if preference about the compound environment is set up
- accordingly. In particular, we could take a universal machine as the
- environment: this makes it possible to specify preference for all
- programs, which are all parts of the universal machine, obtained by
- passing it an index (syntactic representation) of any given program.
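- As a toy sketch of the universal-machine idea (the dictionary-based
- "machine" below is my own illustration, not a construction from the
- post), every program of interest appears as a part of the environment,
- reachable by passing an index:

    # Toy "universal" environment: given an index (standing in for a
    # syntactic representation of a program), run that program.
    programs = {0: lambda: 1 + 1, 1: lambda: 6 * 7}
    U = lambda index: programs[index]()

    assert U(0) == 2 and U(1) == 42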
- All programs are isolated; there are no explicit dependencies set up
- between them. In particular, where the agent is in the environment is
- not specified, and the agent might not be present explicitly in the
- environment at all.
- Whatever is relevant about the environment has to be represented in
- such a way that it's reflected in its strategy. Preference is about
- strategies of the environment, not syntactic details. The program for
- the environment (as well as every other program) is fixed by its
- definition, but strategies are not transparently seen through the
- programs, even though they are determined by them.
- Preference for the environment is given as a collection of preference
- relations for the environment E: statements like (E,X)>(E,Y), where X
- and Y are some programs, assumed non-equivalent.
- The agent is assumed to know its own program (or in any case some
- program which implements the same strategy), the environment's
- program, and its preference about the environment. The process of
- decision-making consists in the agent determining its strategy,
- without (obviously) changing its own program, or any other program.
- The agent has to set its own strategy in such a way as to make the
- environment's strategy better.
- == Ambient dependencies ==
- The method for inferring instrumental preference I suggested in the
- previous post is based on ambient dependencies. Given two programs, B
- and C, A is an ambient dependence of C on B if (A B)~C. For any two
- programs X and Y, if (C,(A X))>(C,(A Y)), then (B,X)>(B,Y).
- In other words, (B,X) is the event of B implementing strategy [[X]]
- and (B,Y) of B implementing strategy [[Y]]. The dependence (A B)~C
- tells us
- that if B~X, then also C~(A B)~(A X), so the event (C,(A X)) follows
- from (B,X); and similarly for Y. Thus, the choice between B~X and B~Y
- is also the choice between C~(A X) and C~(A Y), and so if we have
- preference (C,(A X))>(C,(A Y)), it follows that we have preference
- (B,X)>(B,Y). (Or so it seems, see the rest of the post.)
- Given two dependencies, say of D on C, and of C on B, we can obtain a
- dependence of D on B by composition. This creates a setting for
- inferring dependencies of the environment on the agent by exploring
- dependencies between other programs.
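- A quick Python sketch of composition in the same analogy (the
- particular programs are made up for illustration): from a dependence P
- of C on B and a dependence Q of D on C, the composite \x.(Q (P x)) is
- a dependence of D on B.

    B = lambda: 2
    P = lambda b: (lambda: b() + 1)      # (P B) ~ C: a dependence of C on B
    C = lambda: 3
    Q = lambda c: (lambda: c() * c())    # (Q C) ~ D: a dependence of D on C
    D = lambda: 9

    A = lambda x: Q(P(x))                # A = \x.(Q (P x))

    # The composite A is a dependence of D on B: (A B) ~ D.
    assert P(B)() == C() and Q(C)() == D() and A(B)() == D()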
- == Conflicting dependencies ==
- Notice that ambient dependencies are very weakly restricted by their
- definition: a program B, to qualify as a dependence of C on A, only
- needs to satisfy (B A)~C, that is, only its value on argument A is
- required to be equivalent to C, while its values on other arguments
- are not fixed. Thus, it should be relatively easy to find two
- dependencies, B and B', that produce non-equivalent values for some
- argument other than A: (B X)~Y, (B' X)~Y', where Y is not equivalent
- to Y'. Following the preference inference method from the previous
- section, if Y and Y' are valued differently as strategies for C, say
- (C,Y)>(C,Y'), then B and B' send conflicting suggestions about the
- value of X as A's strategy: it would seem that A~X implies C~Y
- according to dependence B, and also implies C~Y' according to
- dependence B'.
- For example, consider an integer-valued agent A and environment E=A*A.
- The goal of the agent is to minimize E. The agent is only able to come
- up with dependencies D1=\x.x*A and D2=\x.x*x, for which it holds that
- (D1 A)~E and (D2 A)~E. D2 is helpful for finding a solution A~0 which
- gives E~0 as well. But now that we know that A~E~0, it's possible to
- find different dependencies D satisfying the condition (D A)~E, for
- example D=\x.((x-1)*(x-1)-1). It is still true that (D A)~E, but it
- suggests that A~1 gives a better (smaller) environment value E~-1.
- This is in conflict with the preference given by D2: (D2 1)~1, while
- (D 1)~-1, and so D2 suggests that (A,0)>(A,1), given that
- (E,0)>(E,1), but D suggests that (A,0)<(A,1), given that
- (E,0)<(E,-1). In retrospect, the decision to make A~0 seems
- suboptimal, and many dependencies not provable before the decision
- was made suddenly become obvious. These dependencies-in-retrospect
- also give bad ideas about preferred decisions. Why is that?
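- To make the arithmetic in this example explicit, here is a small
- Python check (only a sketch: the agent and environment are represented
- by the integer values of their strategies):

    A = 0                         # the decision actually made: A ~ 0
    E = A * A                     # so E ~ 0

    D1 = lambda x: x * A          # (D1 A) ~ E
    D2 = lambda x: x * x          # (D2 A) ~ E
    D  = lambda x: (x - 1) * (x - 1) - 1   # (D A) ~ E, but only because A ~ 0

    assert D1(A) == E and D2(A) == E and D(A) == E

    # The dependencies disagree about the hypothetical decision A ~ 1:
    # D2 says it would give E ~ 1 (worse than 0), while the
    # in-retrospect dependence D says it would give E ~ -1 (better).
    print(D2(1), D(1))            # prints: 1 -1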
- == Theories of consequences ==
- The essential step in using an ambient dependence (A B)~C is that it
- tells us that B~X => C~(A X), for each X. Then, given two statements
- B~X and B~Y, we can use the dependence to show B~X => C~(A X) and
- B~Y => C~(A Y), and if we prefer the consequence (outcome) C~(A X) to
- C~(A Y), this allows us to judge the choice B~X to be preferable to
- the choice B~Y.
- Two dependencies (P B)~D and (Q D)~C can be combined into A=\x.(Q (P
- x)), for which it holds that (A B)~C. Such dependencies can be
- combined directly this way, or we can look at their consequences for
- specific assumptions of the form B~X. Then, B~X => D~(P X), and D~(P
- X) => (Q (P X))~C, and so B~X => (A X)~C, all without directly using
- the dependence (A B)~C. The assumption of B~X is then judged by the
- fact that (A X)~C is among its logical consequences.
- This suggests abandoning the mechanism of ambient dependencies for a
- more general mechanism of theories of consequences. Instead of
- inferring instrumental preference for the agent's strategies in one
- step, using an appropriate dependence, start with an assumption of the
- form B~X (agent B's strategy is equivalent to X), and see what
- logically follows, in some fixed theory. Then, compare the theories of
- consequences for various assumptions with each other, based on what
- consequences for the environment they prove (and choose the strategy
- which has the best-ranked theory of consequences).
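- As a very rough Python sketch of this procedure (everything below -
- the string encoding of statements, the single toy inference rule, the
- utility function - is an assumption made up for illustration, not part
- of ADT itself), a "theory of consequences" is just a set of statements
- closed under a fixed, deliberately weak inference step:

    # A statement is encoded as a string; a theory is a set of strings.
    # consequences(assumption, rules) stands in for closing
    # EQ + assumption under the (weak) underlying theory EQ.
    def consequences(assumption, rules):
        theory = {assumption}
        changed = True
        while changed:
            changed = False
            for rule in rules:
                for stmt in list(theory):
                    new = rule(stmt)
                    if new is not None and new not in theory:
                        theory.add(new)
                        changed = True
        return theory

    # Toy inference rule: from "B~x" infer "E~x*x", pretending EQ
    # proves the dependence (D2 B)~E with D2 = \x.x*x.
    def dependence_rule(stmt):
        if stmt.startswith("B~"):
            x = int(stmt[2:])
            return "E~" + str(x * x)
        return None

    def utility(theory):
        # Prefer theories proving a smaller value for the environment.
        values = [int(s[2:]) for s in theory if s.startswith("E~")]
        return -min(values) if values else float("-inf")

    candidates = ["B~-2", "B~-1", "B~0", "B~1", "B~2"]
    best = max(candidates,
               key=lambda a: utility(consequences(a, [dependence_rule])))
    print(best)   # "B~0" under these toy assumptions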
- Let the underlying theory (without specifying exactly what it is) be
- called EQ. This theory deals with equivalence of programs, and
- statements of the form A~B, "A is equivalent to B", are its basic
- building blocks. Then knowing an ambient dependence (A B)~C means that
- EQ |- (A B)~C, from which it should follow that EQ |- (B~X) => (A
- X)~C, or in other words EQ+(B~X) |- (A X)~C. Here, EQ+(B~X) is the
- theory of consequences of the decision B~X, and the statements
- provable in this theory determine the value of this decision. In
- particular, if (A X)~C is known to be good, its presence in the theory
- of consequences of B~X gives value to B~X.
- Using theories of consequences for comparing strategies suggests
- extending the notion of preference to compare whole theories and not
- just individual program equivalence statements. Theories can be seen
- as collections of statements, and given that the strength of EQ is
- necessarily limited, even equivalent strategies may not be provably
- equivalent, and can have non-trivially overlapping consistent theories
- of consequences. This suggests treating theories as "events",
- collections of elements from a set of possible consequences, and
- possibly employing something like an expected utility calculation to
- compute their value.
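- One way such a valuation might look (purely my own assumption about
- the "expected utility over theories" idea, not something the post
- specifies) is to score a theory by a weighted sum over a fixed set of
- possible consequence statements:

    # Hypothetical valuation: each possible consequence statement gets
    # a (probability, utility) pair, and a theory is scored by the
    # weighted utility of the consequences it actually proves.
    possible = {"E~0": (0.6, 10.0), "E~1": (0.3, 4.0), "E~-1": (0.1, 12.0)}

    def value(theory):
        return sum(p * u for stmt, (p, u) in possible.items() if stmt in theory)

    print(value({"B~0", "E~0"}))   # 6.0
    print(value({"B~1", "E~1"}))   # 1.2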
- == Consequences and inconsistency ==
- If we use an enumeration (not necessarily exhaustive) of
- non-equivalent programs X,Y, ... as candidates for agent B's strategy,
- it will be true that B~X for at most one such X, and false for all
- others. How can we consider theories EQ+(B~Y) then, with B~Y false?
- The trick is that even where B~Y is false, it's not at all necessarily
- provably false, and so EQ+(B~Y) is not at all necessarily
- inconsistent. This makes it possible to have all these non-trivial
- (consistent) theories, almost all of which have false consequences,
- and so gives us something to have preference about.
- The process of decision-making doesn't prove that B~X in EQ when X is
- chosen as B's strategy; this step is carried out at a different level,
- after comparing the theories of consequences extending EQ. This
- process, living outside EQ, is exactly what makes B~X true and B~Y
- false, and this knowledge is inaccessible to EQ itself, whose weakness
- shapes the decision. It is exactly because EQ is weak enough to make
- the false theories of consequences consistent that the agent has the
- ability to compare these consequences and choose the one it likes
- best, thus making it true.
- Inconsistent theories of consequences may well be considered
- undesirable, according to the preference over consequences, but in
- general it's hopeless to demand that preference disavow inconsistent
- theories of consequences, because that would just be trying to prove
- the consistency of the outcome. Inconsistent theories of consequences
- are all alike, which makes them just one point among many others, most
- of those other theories proving false statements. It's not the
- greatest worry that we have an apparently false provable statement
- when it's expected that we have non-provably false statements around
- every corner.
- (Exercise: Explain the spurious ambient dependencies seen in
- retrospect in terms of theories of consequences. What is the source of
- inconsistency that enables all those dependencies to be proven?)