\documentclass[titlepage, letterpaper, fleqn]{article}
\usepackage[utf8]{inputenc}
\usepackage{fancyhdr}
\usepackage{amsmath}
\usepackage{extramarks}
\usepackage{enumitem}
\usepackage{amssymb}
\usepackage{booktabs}
\usepackage{tcolorbox}
\usepackage{gensymb}
\usepackage{graphicx}
\usepackage{caption}

\topmargin=-0.45in
\evensidemargin=0in
\oddsidemargin=0in
\textwidth=6.5in
\textheight=9.0in
\headsep=0.25in
\setlength{\parskip}{1ex}
\setlength{\parindent}{0ex}

%
% You should change these things~
%

\newcommand{\mahclass}{Multi-Agent Systems}
\newcommand{\mahtitle}{Study Guide}
\newcommand{\mahdate}{\today}
\newcommand{\spacepls}{\vspace{5mm}}

%
% Header markings
%

\pagestyle{fancy}
\lhead{Study Guide}
\chead{}
\rhead{}
\lfoot{}
\rfoot{}


\renewcommand\headrulewidth{0.4pt}
\renewcommand\footrulewidth{0.4pt}
\renewcommand{\familydefault}{\sfdefault} % The sans-serif font and the like

% Alias for the Solution section header
\newcommand{\solution}{\textbf{\large Solution}}

% Alias for the new step section
\newcommand{\steppy}[1]{\textbf{\large #1}}

%
% My actual info
%

\title{
\vspace{1in}
\textbf{Tecnológico de Monterrey} \\
\vspace{0.5in}
\textmd{\mahclass} \\
\vspace{0.5in}
\textsc{\mahtitle}
}
\author{01170065 - MIT \\
Xavier Fernando Cuauhtémoc Sánchez Díaz \\
\texttt{xavier.sanchezdz@gmail.com}}
\date{\mahdate}

\begin{document}

\begin{titlepage}
   \maketitle
\end{titlepage}

%
% Actual document starts here~
%

\section{Agent Architectures}

\subsection{What is an Agent and Environment types}

An \textbf{Agent} is a computer system capable of autonomous action in some \textbf{Environment} in order to meet its design objectives.

\textbf{Environments} can be classified in many ways:

\begin{itemize}
   \item \textbf{Accessible} or \textbf{inaccessible}.
   \item \textbf{Deterministic} or \textbf{non-deterministic}.
   \item \textbf{Static} or \textbf{dynamic}.
   \item \textbf{Discrete} or \textbf{continuous}.
\end{itemize}

An \textbf{accessible} environment is one in which the agent can obtain complete, accurate, up-to-date information about the environment's state.
On the other hand, an \textbf{inaccessible} environment is one in which the agent can't obtain all this info. E.g. the Internet is an inaccessible environment for all of us!

An environment is \textbf{deterministic} if any action (of the agents) has a single guaranteed effect: there is no uncertainty about the state that will result from performing an action.
This means that for each action there is a single (JUST ONE) resulting state of the environment. If there can be more than one resulting state, then it is \textbf{non-deterministic}.
Remember the waves of the sea or the braking example: even if the agent performs an action, there could be MORE THAN ONE final result.

A \textbf{static} environment is one that changes its state ONLY by the action of an agent. A Tetris game is static, because the environment only changes with our actions.

A \textbf{dynamic} environment is the opposite: an environment which changes EVEN IF an agent does nothing.

Another way to classify environments is by the \texttt{cardinality} of their set of possible states (the \textit{size} of the set, how many elements it has).
If the environment has a finite number of states, then it is considered \textbf{discrete}. A Rubik's cube has TRILLIONS of combinations, but that is still a finite number. Thus, the Rubik's cube is a discrete environment.

BUT IF AN ENVIRONMENT'S SET OF STATES IS INFINITE (like an interval of the real numbers) then this environment is a \textbf{continuous} environment.

\subsection{How to describe a set}

Set notation is useful for describing an agent's actions, an environment's states or state transformers. To describe a set, you usually use CAPITAL LETTERS: \(A\) if it's a set of agent actions, or \(E\) if it's a set of environment states.
After that, you write \{ and then list all elements using lower-case letters (use \(a, \alpha\) or \(e\) with a subscript if there's no need to describe the actual action/state). Put \} at the end to mark the END of the elements, thus delimiting your set. Like this:
\[A=\{a_0, a_1, a_2, a_3, \dots, a_n\}\]
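
For instance (the action names are made up for illustration, borrowing the Tetris example from earlier), the actions of a Tetris-playing agent could be written as:

\[A = \{a_0, a_1, a_2, a_3\} = \{\texttt{left}, \texttt{right}, \texttt{rotate}, \texttt{drop}\}\]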

\pagebreak

\subsection{How to describe a run}

Runs are described as a SEQUENCE of environment STATES that are altered via an agent's ACTIONS. To describe a run, we usually use (yeah, we usually use) a lowercase \(r\), followed by the \texttt{such that} operator (a colon, :). After that, we write a state, then an arrow labeled with the action that modifies that state to get to ANOTHER state, and so on. Like this:

\[r : e_0 \xrightarrow{\alpha_0} e_1 \xrightarrow{\alpha_1} e_2 \xrightarrow{\alpha_2} e_3 \xrightarrow{\alpha_3} \dots \xrightarrow{\alpha_{n-1}} e_n\]

\begin{tcolorbox}
   REMEMBER, KIDDO: the number of iterations depends on the number of ACTIONS, not states. If you're asked to do a 5 second run using an action per second, you should use 5 actions and thus end up with 6 environment states.
\end{tcolorbox}
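
For instance, that 5-action run would look like this (count the states: \(e_0\) through \(e_5\), six of them):

\[r : e_0 \xrightarrow{\alpha_0} e_1 \xrightarrow{\alpha_1} e_2 \xrightarrow{\alpha_2} e_3 \xrightarrow{\alpha_3} e_4 \xrightarrow{\alpha_4} e_5\]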

\subsection{How to describe a State Transformer Function}

Now this is a little bit trickier. First of all, you should know what a state transformer function is. It is a FUNCTION that represents the BEHAVIOR of the environment. It looks something like this when you define it:
\[\tau : \mathcal{R}^{Ac} \rightarrow \wp (E)\]

NOW WAIT A SECOND. This is perhaps too much to handle, but no, it isn't.
\begin{itemize}
   \item \(\tau\) is TAU, a Greek letter that refers to the state transformer function.
   \item \(:\) means that this is a function MAPPING what is on the left of the arrow to what is on the right of the arrow (yes, : meant something else for runs; the notation is overloaded).
   \item \(\mathcal{R}^{Ac}\) is the left side, which is the SET OF RUNS that end with an action.
   \item \(\wp\) means the POWER SET of whatever is next to \(\wp\) (to the right). A power set is A SET which contains ALL POSSIBLE SUBSETS of whatever is next. By definition, its cardinality is \(2^n\), where \(n\) is the number of elements in the original set (whatever is next to \(\wp\)).
   \item \(E\) is the SET of Environment states.
\end{itemize}

SO ALL OF THIS MEANS that you should write down a FUNCTION which RECEIVES a RUN ENDING IN AN ACTION and then RETURNS A SET, which contains ALL POSSIBLE STATES FOLLOWING THAT ACTION.
If \texttt{drinking water} is an action done by me, what could happen to the \texttt{water} is to be \texttt{drunk}, or to be \texttt{spilled} on my keyboard.

To write this function, we usually write (yeah, we don't use \textit{use} this time) it like this:

\[\tau (\dots , \sigma, \alpha) = \{\sigma_1, \sigma_2, \dots\}\]

Where \(\sigma\) is an environment state, and \(\alpha\) is an action. \(\sigma_1\) is one of the possible states that COULD HAPPEN after the \(\alpha\) action (and so on with 2, 3...). You could use any variable instead of alpha. Actually you should use whatever the actions are called in your exam!

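Using the drinking-water example above, the state transformer would look like this (the \(\sigma\) and \(\alpha\) names are made up for illustration):

\[\tau (\dots, \sigma_{\texttt{water}}, \alpha_{\texttt{drink}}) = \{\sigma_{\texttt{drunk}}, \sigma_{\texttt{spilled}}\}\]
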
\pagebreak

\section{Game Theory}

\subsection{Expected Value and Expected Utility}

The \textbf{expected value} is (you can guess, I guess!) the value expected of a transaction or \textit{agreement} between the agents.
The expected value is the sum of the products of the value of each possible outcome by its probability.
Remember the Deal or No Deal TV show: 5M USD if you guess right, or 0 if you don't; or instead take 2M NOW and you don't get to choose any of the suitcases.
This would be calculated as:

\begin{align*}
   EV_1 & = (0.5)(5{,}000{,}000) + (0.5)(0) = 2{,}500{,}000 \\
   EV_2 & = (1)(2{,}000{,}000) + (0)(0) = 2{,}000{,}000
\end{align*}

But the expected value is kinda useless on its own, so we often use the \textbf{expected utility}, which is kinda the same, but different:
each additional unit is worth less than the previous one.
Remember the ice cream example: 1 ice cream is incredible on a summer day. 2 are good. 3 are too much. 4 is horrible, my stomach can't handle it, please stop.

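One common way to model this diminishing value (an assumption on my part; any concave function works) is a square-root utility, \(u(x) = \sqrt{x}\). Redoing the Deal or No Deal numbers with it:

\begin{align*}
   EU_1 & = (0.5)\sqrt{5{,}000{,}000} + (0.5)\sqrt{0} \approx 1{,}118 \\
   EU_2 & = (1)\sqrt{2{,}000{,}000} \approx 1{,}414
\end{align*}

So even though the risky option has the higher expected \textit{value}, the sure 2M has the higher expected \textit{utility}, which matches what most people would actually do.
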
\subsection{Representing games}

To summarize a story (a game) we used two different approaches.

The first approach is the \textbf{game tree} (also called the \textbf{extensive form}).
The game tree uses nodes and branches, and lists (extensively) all the outcomes (utilities for each agent).
To get the solution of the game, one should prune terminal nodes which have less expected utility than the others in the same branch,
and then keep comparing between the non-pruned branches until reaching the solution.
As you can see, this approach is valid IF we know what the other agent is going to do.

However, in a simultaneous world, the approach we used was the \textbf{game table} (also called the \textbf{strategic form}).
In this form, you use columns to describe all possible actions for one agent and rows for all possible actions of the other agent (or all other agents that are not me).
Each cell (a row, column tuple) contains the expected utility for each agent.
This is the representation we used the most.

\subsection{Some already defined behaviors in games}

There are some games already defined. For example, there are situations in which both agents hate each other, and in which what an agent chooses will affect the other agent.
This game is called a \textbf{zero-sum game}, because if you sum all the utilities in a single cell you should get 0 lol.

The \textbf{Prisoner's Dilemma} is another game already defined, in which defecting is each agent's dominant strategy,
yet if BOTH agents defect they end up worse off than if both had cooperated. That's the dilemma (see the table below).
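
One standard payoff table (the exact numbers are made up, as usual for this game; any numbers with the same ordering work; C = cooperate, D = defect):

\begin{table}[h!]
\centering
\begin{tabular}{@{}ccc@{}}
\toprule
                       & C                        & D                        \\ \midrule
\multicolumn{1}{c|}{C} & \multicolumn{1}{c|}{3,3} & \multicolumn{1}{c|}{0,5} \\ \cmidrule(l){2-3}
\multicolumn{1}{c|}{D} & \multicolumn{1}{c|}{5,0} & \multicolumn{1}{c|}{1,1} \\ \bottomrule
\end{tabular}
\end{table}

D strictly dominates C for both players, so they land on (1,1), even though (3,3) would have been better for both.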

We also reviewed some other games like the \textbf{Battle of the Sexes}, the \textbf{Chicken} and the \textbf{Battle of the Bismarck Sea}. Refresh your memory, young Padawan.

\subsection{Dominance}

A strategy is dominant IFF, once we choose it, there is no other strategy better than it. There are two \textit{kinds} of dominance: weak and strict.

A \textbf{strictly dominant strategy} is a strategy in which ALL my decisions (not considering what the other agent will do) are the best decisions I could take.
If you're the row player, check each cell against its corresponding column (in another row) to see if the strategy REALLY is better.
If my strategy is ALWAYS better, then it's a strictly dominant strategy.

If SOMETIMES it's better and some other times it is the same, then it is \textbf{weakly dominant}.

BUT IF SOMETIMES IT'S BETTER AND SOME OTHER TIMES IT'S WORSE THEN IT IS NOT A DOMINANT STRATEGY.

If a strategy is not dominant, and there EXISTS a dominant strategy, then that non-dominant strategy is also called a \textbf{dominated strategy}.
The same concepts (weakly and strictly) apply: if there is a weakly dominant strategy, then this strategy is weakly dominated by that dominant strategy.

Now, the fun part is when THERE'S NO DOMINANT STRATEGY for EITHER of the agents.
If that happens, you should ELIMINATE dominated strategies until you get a rational solution! A small worked example follows.
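
Here's a tiny made-up game to practice elimination (payoffs are row,column):

\begin{table}[h!]
\centering
\begin{tabular}{@{}ccc@{}}
\toprule
                       & L                        & R                        \\ \midrule
\multicolumn{1}{c|}{U} & \multicolumn{1}{c|}{3,1} & \multicolumn{1}{c|}{0,0} \\ \cmidrule(l){2-3}
\multicolumn{1}{c|}{D} & \multicolumn{1}{c|}{1,2} & \multicolumn{1}{c|}{2,1} \\ \bottomrule
\end{tabular}
\end{table}

The row player has no dominant strategy (3 > 1 against L, but 0 < 2 against R).
The column player does: L strictly dominates R (1 > 0 and 2 > 1), so eliminate R.
With only L left, the row player picks U (3 > 1), and the rational solution is (U, L).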

\subsection{Nash Equilibrium}

A \textbf{Nash Equilibrium} is reached IFF no agent can increase its pay-off by changing strategy, GIVEN that the other agents stick with their strategies.
The safest way to determine a Nash Equilibrium is by \textbf{cell-by-cell} inspection.

\begin{tcolorbox}
   REMEMBER KIDDO: ask yourself (as an agent): IF the other agent doesn't move, should I move? IF YES, then it is not a Nash Equilibrium point.
   IF NO, then ask the same question for the other agent. IF the answer is NO for both agents, then the cell is a Nash Equilibrium point.
\end{tcolorbox}

The \textbf{min-max method} is useful to get the Nash Equilibrium point BUT ONLY FOR ZERO-SUM GAMES.

\subsection{Pareto Optimality}

A cell is \textbf{Pareto optimal} if, to the eyes of an outsider, there is no other cell that would make some agent better off without hurting another agent.
The easiest way to determine if a cell is Pareto optimal is the cell-by-cell method.

\begin{tcolorbox}
REMEMBER KIDDO: ask yourself (as a non-agent): IF we change the cell, does any agent get hurt? IF YES, then don't change cell!
IF after asking the same for ALL CELLS you're on the same spot, then it is Pareto Optimal!
\end{tcolorbox}

\subsection{Mixed Strategies (in Game Theory)}

In a mixed strategy, agents have a probability distribution over their actions.

\subsubsection{Expected Utility of a Strategy}

To calculate the \textbf{expected utility} of an agent's strategy,
you should take the pay-off of each outcome and multiply it by its probability (as you did with the expected value before).

\begin{table}[h!]
\centering
\begin{tabular}{@{}ccc@{}}
\toprule
                       & F (0.5)                    & B (0.5)                    \\ \midrule
\multicolumn{1}{c|}{F} & \multicolumn{1}{c|}{90,10} & \multicolumn{1}{c|}{20,80} \\ \cmidrule(l){2-3}
\multicolumn{1}{c|}{B} & \multicolumn{1}{c|}{30,70} & \multicolumn{1}{c|}{60,40} \\ \bottomrule
\end{tabular}
\end{table}

The expected utility for the row player is \((0.5)(90)+(0.5)(20)=55\) for F, and \((0.5)(30)+(0.5)(60)=45\) for B,
so the row player should always choose F, GIVEN THAT the column player maintains its probability distribution.

So, because the row player will always choose F, the expected utility of the Server is BASED ON the fact that the row player's strategy is F:

\[(0.5)(10) + (0.5)(80) = 45\]

This is already better than the 40 of (B,B), so it is a preferable mixed strategy.

\subsubsection{Finding the best probability distribution}

\begin{table}[h!]
\centering
\begin{tabular}{@{}ccc@{}}
\toprule
                       & F (q)                      & B (1-q)                    \\ \midrule
\multicolumn{1}{c|}{F} & \multicolumn{1}{c|}{90,10} & \multicolumn{1}{c|}{20,80} \\ \cmidrule(l){2-3}
\multicolumn{1}{c|}{B} & \multicolumn{1}{c|}{30,70} & \multicolumn{1}{c|}{60,40} \\ \bottomrule
\end{tabular}
\end{table}

In order to get the best probability distribution for the server (column player),
we first calculate the expected utility of each strategy of the receiver (row player).

\begin{align*}
   EU_{RF} & = (q)(90) + (1-q)(20) \\
   & = 90q + 20 - 20q \\
   & = 20 + 70q \\[2ex]
   EU_{RB} & = (q)(30) + (1-q)(60) \\
   & = 30q + 60 - 60q \\
   & = 60 - 30q
\end{align*}

We now solve by setting both expressions equal (yeah boi), so we end up with this:

\begin{align*}
   & 20 + 70q = 60 - 30q \\
   & 70q + 30q = 60 - 20 \\
   & 100q = 40 \therefore q = 0.4
\end{align*}

Now, why is equating them the right move? Because making both expected utilities EQUAL means the receiver gets the same pay-off either way: EACH STRATEGY IS INDIFFERENT.
That's right. IF \(q = 0.4\), then the receiver is indifferent between F and B.
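
Quick check, plugging \(q = 0.4\) into both expressions:

\[EU_{RF} = 20 + 70(0.4) = 48 = 60 - 30(0.4) = EU_{RB}\]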

Now, onto the server side (the server's own pay-off against each of the receiver's pure strategies, as a function of \(q\)). The equations end up like this:

\begin{align*}
   & 80 - 70q = 40 + 30q \\
   & 80 - 40 = 100q \\
   & q = \dfrac{40}{100} \therefore q = 0.4
\end{align*}

SO, YES, \(q\) is 0.4. That means \(1-q\) is 0.6, and thus the receiver is, JUST AS PLANNED, indifferent between both strategies.
But we haven't found the probability \(p\) with which the receiver plays F (and, of course, \(1-p\) for B).

For that, remember that the receiver should mix so that the SERVER's alternatives become indifferent.
So calculate for which probability \(p\) the server will be indifferent:

\begin{align*}
   & 10p + 70(1-p) = 80p + 40(1-p) \\
   & 10p + 70 - 70p = 80p + 40 - 40p \\
   & 70 - 40 = 80p + 70p - 40p - 10p \\
   & 30 = 100p \therefore p = 0.3
\end{align*}

So, \(p=0.3\) makes the server indifferent between its alternatives, given MY (the receiver's) mix. You could try to double-check using MY (the receiver's) pay-off equations to find \(p\), but it's no use.
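
Sanity check, plugging \(p = 0.3\) into the server's two expressions (calling them \(EU_{SF}\) and \(EU_{SB}\) by analogy with the receiver's):

\[EU_{SF} = 10(0.3) + 70(0.7) = 52 = 80(0.3) + 40(0.7) = EU_{SB}\]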

\begin{tcolorbox}
   REMEMBER KIDDO: If you want to find the probability distribution of an agent, use THE OPPOSING AGENT'S EXPECTED UTILITY equations, not yours!
   E.g. If I want to get \(q\) for the column player, then I should use the row player's expected utility equations.
\end{tcolorbox}

\section{Contract Net}

A \textbf{Contract Net} is a network of contracts! Guess you didn't expect that, did you?
Since we're dealing with self-interested agents, it's important to note that each agent will act to further its own interests,
possibly at the expense of others. This means there is potential for \textbf{conflict}.

Contract Net can be summarized in these 5 steps:

\begin{enumerate}
   \item Recognition
   \item Announcement
   \item Bidding
   \item Awarding
   \item Expediting
\end{enumerate}

\subsection{Recognition}

In the first stage, the agent recognizes (heh) it needs help to solve a problem. It could either be because the agent can't achieve its goal alone, or
because working with the help of other agents will yield a better utility.

\subsection{Announcement}

The agent with the task sends an \textit{announcement} (heheh) which includes a specification of the task to be achieved.
This announcement should encode a \texttt{description of the task}, any \texttt{constraints}, and some additional information
(often referred to as \texttt{meta-task info}).

\subsection{Bidding}

Ok, you guessed that this step is where every other agent bids, right? That's right.
Those agents that received the announcement decide whether or not they wish to bid for the task.
This decision depends on many factors, such as whether the agent is actually capable of satisfying the constraints of the \textbf{manager} (the agent who sent the announcement).
It can also depend on price information (if relevant).

THEN THEY CHOOSE whether TO BID, and those that DO submit a \textbf{tender}.

\subsection{Awarding and Expediting}

These two happen more or less at the same time. The manager needs to decide whom to award the contract to.
Then, the result of this process is communicated to all agents that submitted a bid. The successful contractor then expedites the task, and a contract is made! Voila!

\section{Negotiation}

Negotiation is a way to coordinate selfish interests. It solves characteristic-form games and more complex versions.
It can also be seen as \textbf{bargaining}. This bargaining problem comes with some already defined features:

\begin{itemize}
\item The \textbf{utility of a deal} is a function mapping a \textbf{deal} to a \textbf{real value}.
\item One of the possible deals is the \textbf{no-deal}.
\item The utility of the no-deal is zero, always.
\end{itemize}

ONWARDS TO THE MATHEMATICAL NOTATION! To represent the utility we use a lower-case \(u\) with a subscript \(i\) for the sake of generalization, like this: \(u_i\)

As we previously stated, the utility of a deal can be seen as a FUNCTION. So we define it as follows:

\[u_i : \Delta \rightarrow \mathbb{R}\]

Where \(u_i\) is the expected utility of a deal, \(\Delta\) is the set of all possible deals, and \(\mathbb{R}\) is the set of all real numbers
(that means, any non-imaginary numerical value). Remember that the arrow maps what is on its left (deals) to what is on its right (real values).

Because \(\Delta\) is the set of all deals, \(\delta_i\) (or \(\delta'\)) is any possible deal. Using \(d\) instead of the lower-case delta is also common.
To refer to the no-deal, we use \(\delta^-\).

\begin{tcolorbox}
   REMEMBER KIDDO: \(u_i(\delta^-) = 0\), ALWAYS, FOR BOTH AGENTS.
\end{tcolorbox}

\subsection{Axiomatic Solution Concepts}

\subsubsection{Pareto Optimal deals}

As well as in games (since this is also kinda like a game theory analysis from hell), Negotiation deals (no pun intended) with Pareto Optimality.
A deal \(\delta\) is \textbf{Pareto optimal} if there is NO OTHER DEAL that everyone prefers over \(\delta\).
This means that there is no \(\delta'\) such that

\[\forall_i\; u_i(\delta') > u_i(\delta)\]

which reads as: FOR ALL AGENTS \(i\), the utility of \(\delta'\) is better than the utility of the deal we're currently analyzing. THAT'S WHAT WE DON'T WANT.
IF there exists AT LEAST ONE deal which is better FOR BOTH AGENTS, then ours is not Pareto optimal.

\begin{tcolorbox}
   REMEMBER KIDDO: Ask yourself (as a non-agent): does moving to another deal hurt anyone? IF YES, then don't change deal!
   Keep asking the same question for ALL OTHER DEALS. If at the end you're still on the same deal, then it is Pareto Optimal!
\end{tcolorbox}

\subsubsection{Pareto Frontier}

There's also another concept involving Pareto optimal deals, which is the \textbf{Pareto frontier}.
The Pareto frontier is the set of ALL DEALS which are Pareto optimal.
When plotted, these look like a line on top of the other deals: the upper-rightmost set of points. Like a frontier, you know.

This set of deals is also called the \textbf{negotiation set}, since its elements are the best deals available and all rational agents will look for
the \textit{very best} of these deals.

\subsubsection{Symmetry}

The concept of \textbf{Symmetry} applies to negotiation protocols. The protocol is said to be symmetric if the solution remains the same as long
as the set of utility functions \(U\) remains the same, regardless of which agent has which utility function.

\begin{tcolorbox}
   REMEMBER KIDDO: a protocol is symmetric IF and ONLY IF, when you switch the utility functions between agents, the solution stays the same.
   Think about equality. All of us citizens have the same RIGHTS, FREEDOM, 'MURICA.
\end{tcolorbox}


\subsubsection{Individual Rationality}

The concept of \textbf{individual rationality} applies to DEALS. A deal \(\delta\) is individually rational if:

\[\forall_i\; u_i(\delta) \ge u_i(\delta^-)\]

This mathematically beautiful and complex assumption means that a deal is individually rational if, by itself, it is rational (duh).
What we consider rational here is a utility greater than or equal to the utility of the no-deal, which is always 0.
So ANY DEAL with non-negative utility is individually rational. You can also say that \(u_i(\delta) \ge 0\),
but the easiest way to remember all of this is that any deal yielding a non-negative utility is individually rational.

\subsubsection{Independence from Irrelevant Alternatives}

This one is tricky. The \textbf{independence from irrelevant alternatives} applies to negotiation protocols.
A negotiation protocol is independent from irrelevant alternatives if it's true that when given \(\Delta\) it chooses \(\delta\),
AND when given \(\Delta' \subset \Delta\) where \(\delta \in \Delta'\),
the protocol STILL chooses \(\delta\), assuming \(U\) stays the same.

This is kinda confusing at first, but gets easy if you think about it:
\(\delta\) is the deal, \(\Delta\) is the set of deals, and \(U\) is the set of the agents' utility functions.
If you take a sample of the deals, another set of deals (which we call \(\Delta'\)) which also contains the deal we're analyzing,
the protocol will still choose that deal as the winning deal. BUT since \(\Delta'\) can be any subset, this condition should hold TRUE
for ANY subset of \(\Delta\) (that still contains \(\delta\)).

\begin{tcolorbox}
   REMEMBER KIDDO: a protocol is independent of irrelevant alternatives if,
   after choosing the best deal, you remove a \textit{bad} deal and the solution is the same.
   Keep removing bad deals until you're left with only the best deal, the winning one.
   If the winning deal is ALWAYS the same, then the protocol is independent of irrelevant alternatives!
\end{tcolorbox}

\subsubsection{Egalitarian Solution}

An \textbf{egalitarian solution} is formally defined as follows:

\[\delta = \arg \max_{\delta' \in E} \sum_i u_i(\delta')\]

where \(E\) is the set of all deals in which all agents receive the same utility, namely

\[E = \{\delta \mid \forall_{ij}\; u_i(\delta) = u_j(\delta)\}\]

Blah blah, read the kiddo tip below.

\begin{tcolorbox}
   REMEMBER KIDDO: an egalitarian solution is a solution where ALL AGENTS receive the SAME utility.
   When plotted, the solution should be on a (theoretical) line that looks like \(f(x)=x\) (that is, a straight 45\degree{} line).
\end{tcolorbox}

\subsubsection{Egalitarian Social Welfare Solution}

An \textbf{egalitarian social welfare solution} is formally defined as follows:

\[\delta = \arg \max_{\delta} \min_{i} u_i(\delta)\]

This means: PICK the DEAL whose MINIMUM UTILITY is BIGGEST. Does this make sense?
\begin{tcolorbox}
   REMEMBER KIDDO: Check all the deals, and pick the one with the GREATEST MINIMUM UTILITY.
   That is: in each deal, some agent wins less than the other. Look at that worst-off agent in every deal, and pick the deal where that agent gets the most.
\end{tcolorbox}

\subsubsection{Utilitarian Solution}

This solution is based on the total utility of the deals, really easy to grasp. Formally, it looks like this:

\[\delta = \arg \max_{\delta} \sum_i u_i (\delta)\]

which means that you should pick the deal with the greatest total utility.
It doesn't matter how it's split: 100 + 1 for agents 1 and 2 respectively beats 50 + 50, since 101 is greater than 100, so pick the first one.
No kiddo tip for you here, this is too easy.

\begin{tcolorbox}
   Just kidding.
   REMEMBER KIDDO: the \textbf{utilitarian solution} can be easily checked in a plot with a \textit{moving} diagonal line with a
   slope of -45\degree{} (\(f(x) = -x\)).
   Keep moving this theoretically invisible line towards the origin (0,0) until you touch a deal. That first deal you just touched is the utilitarian solution.
\end{tcolorbox}

\subsubsection{Nash Bargaining Solution}

This solution is interesting. It is formally defined as follows:

\[\delta = \arg \max_{\delta'} \prod_i u_i (\delta')\]

The definition of the \textbf{Nash Bargaining Solution} described in common words is
`pick the solution in which the product of the utilities of all players is the greatest'.
This approach addresses the issue of lopsided deals: if one agent gets very little utility, the product drops, so the solution tends to balance the agents.

Some things to remember are described in the kiddo tip below.

\begin{tcolorbox}
   A Nash bargaining solution is also Pareto efficient, independent of utility units, symmetric and also independent of irrelevant alternatives.
   Cool, isn't it?
\end{tcolorbox}
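
A tiny made-up example to compare the solution concepts. Say there are three deals with utility pairs \(\delta_1 = (3,3)\), \(\delta_2 = (7,1)\) and \(\delta_3 = (2,5)\):

\begin{itemize}
   \item \textbf{Egalitarian}: only \(\delta_1\) gives both agents the same utility, so \(\delta_1\) wins.
   \item \textbf{Egalitarian social welfare}: the minimum utilities are 3, 1 and 2, so \(\delta_1\) wins again.
   \item \textbf{Utilitarian}: the sums are 6, 8 and 7, so \(\delta_2\) wins.
   \item \textbf{Nash bargaining}: the products are 9, 7 and 10, so \(\delta_3\) wins.
\end{itemize}

Same three deals, three different winners. That's why you need to know which solution concept the exam is asking for.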

\subsection{Rubinstein's Alternating Offers}

This model features two agents (\(i, j\)). At each time step \(t\), one agent proposes a deal \(\delta\).
After that, the other agent can either accept or reject that \(\delta\) deal.
BUT, utilities decrease over time! So you better accept quickly, you greedy seller! YOU IMPERIALIST!
But that's all for \textbf{Rubinstein's Alternating Offers} model.

\subsection{Monotonic-concession}

A \textbf{monotonic-concession} protocol has some interesting steps:

\begin{enumerate}
   \item First, both agents propose the deal which maximizes their own utility, that is \(\delta_i \leftarrow \arg \max_{\delta} u_i(\delta)\).
   \item After that, each agent receives the other agent's proposal.
   \item THEN each agent CHECKS if the other agent's proposal is better than its own proposal, \(u_i(\delta_j) > u_i(\delta_i)\):
   \begin{enumerate}
       \item IF YES, ACCEPT OF COURSE!
       \item IF NOT, then propose another deal. Of course, this deal has to yield a better utility for the opposing agent, and also needs to be better (for me) than not making a deal at all.
   \end{enumerate}
   \item If there is no agreement yet, go to number 2 and try again!
\end{enumerate}

Mathematically speaking, a \textit{better deal} \(\delta'_i\) would be one such that
\[u_j(\delta'_i) \ge \epsilon + u_j (\delta_i), \qquad u_i(\delta'_i) \ge u_i(\delta^-)\]

\subsection{Zeuthen Strategy}

The \textbf{Zeuthen strategy} works as follows:

As an agent:

\begin{enumerate}
   \item Propose my best deal.
   \item Receive a proposal.
   \item Calculate the risk of creating conflict by not accepting the other agent's proposal.
   \item If my risk is less than the other agent's risk, then I must concede JUST ENOUGH so that in the next round I don't have to concede again.
   \item Of course, if no agreement was reached, keep proposing deals that are slightly better for the other agent (goto 2).
\end{enumerate}

How do you calculate risk? It's the quotient of the utility loss from ACCEPTING the other's proposal, over the utility loss from not conceding and causing conflict. Like this:

\[risk_i = \frac{u_i(\delta_i) - u_i(\delta_j)}{u_i(\delta_i)}\]

Be aware that you need to calculate the risk for BOTH AGENTS and then compare the two.
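
For example (made-up numbers): my proposal gives me \(u_1(\delta_1) = 10\) and your proposal gives me \(u_1(\delta_2) = 4\), while your proposal gives you \(u_2(\delta_2) = 8\) and mine gives you \(u_2(\delta_1) = 6\). Then

\[risk_1 = \frac{10 - 4}{10} = 0.6, \qquad risk_2 = \frac{8 - 6}{8} = 0.25\]

Since \(risk_2 < risk_1\), agent 2 is the one who must concede.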

There are some interesting features of this strategy: it is not guaranteed to succeed or to maximize social welfare.
It is, however, guaranteed to terminate, and any agreement will be individually rational and also Pareto optimal. The strategy pair is also in a Nash equilibrium.

\section{Negotiation in Task Oriented Domains}

\subsection{Basic concepts}

In \textbf{Task Oriented Domains} (abbreviated as TOD), agents have TASKS to achieve, and they look to redistribute those tasks.

A TOD is a triple of the form \(\langle T, Ag, c \rangle\), where \(T\) is the finite set of all possible tasks,
\(Ag\) is the set of participating agents, and \(c : \wp (T) \rightarrow \mathbb{R^+}\) is the cost function (the power set of the set of tasks mapped to a positive real number).

Additionally, an \textbf{encounter} is a collection of tasks, of the form \(\langle T_1, \dots , T_n \rangle\) where \(T_i \subseteq T\) for each \(i \in Ag\)
(\(T_i\) is a subset of \(T\), including \(T\) itself, for each agent).

A \textbf{deal} in TOD is a pair of the form \((D_1, D_2) : D_1 \cup D_2 = T_1 \cup T_2\)
(such that the union of \(D_1\) and \(D_2\) is the same as the union of \(T_1\) and \(T_2\)).

The \textbf{utility} of a deal is calculated as \(c(T_i) - c(D_i)\): the cost of the tasks you were originally given, minus the cost of the tasks the deal assigns to you.
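
For example (made-up costs): if my original tasks cost \(c(T_1) = 8\) and the deal leaves me with tasks costing \(c(D_1) = 5\), my utility for that deal is \(8 - 5 = 3\). The no-deal leaves me with my own tasks, so its utility is \(8 - 8 = 0\), as always.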

\subsection{Deception in TODs}

\textbf{Deception} occurs because it can benefit the agent in two ways: pretending that you have been given tasks that you don't really have,
and pretending NOT to have a task you really do have.
The former is called a \textbf{phantom or decoy} task, while the latter is a \textbf{hidden} task.

\subsection{Mixed Deals in TODs}

A \textbf{mixed deal} is, again, a pair, now of the form \((D_1, D_2) : p\).
The agents will perform \((D_1, D_2)\) with probability \(p\), and the symmetric deal \((D_2, D_1)\) with probability \(1-p\).

For this, we will simplify by assuming both agents agree to the GO HAM OR GO HOME principle:
\textbf{all-or-nothing} means that if I win, you will do all your homework AND MINE, but if I lose I'll have to do yours TOO.

The next examples will be hard to explain, hold your horses, KIDDO!

\pagebreak

\subsubsection{Hiding with mixed all-or-nothing deals}

This example shows how to get to that \(\dfrac{3}{8}\) which they agree upon in the figure below.

\begin{figure}[h!]
\centering
\includegraphics[width=0.5\textwidth]{hiding}
\end{figure}

First of all, we need to assume that the postmen are going to flip a coin, so the expected utility of each agent
is the probability of NOT doing all the job (0.5, because of the coin) multiplied by the cost of the job you actually had:
\(EU_1 = (0.5)(8) = 4\).

Then, we can analyze what the supposed expected cost and expected utility (in case of not doing the deal) are.
The expected utility is calculated as the cost of what I'm supposed to do, minus the expected cost of doing all the job.

\begin{align*}
   EC_1 & = 8p\\[1ex]
   EU_1 & = 6 - 8p\\[2ex]
   EC_2 & = 8(1-p)\\[1ex]
   EU_2 & = 8 - 8(1-p)\\[2ex]
   6-8p & = 8 - 8(1-p)\\
   6-8p & = 8 - 8 + 8p\\
   6 & = 16p \therefore p = \dfrac{6}{16} = \dfrac{3}{8}
\end{align*}

Notice how we used \(1-p\) to denote the probability of the other agent doing all the work.

The next step consists in getting the expected utility (in case of doing all the work, including what he lied about) using the probability we just got:

\[EU_1 = 6 - 8\left(\dfrac{3}{8}\right) = 6 - 3 = 3\]

The expected utility when hiding (3) is LESS than the original expected utility (4). It really makes no sense to hide your tasks!

\subsubsection{Phantom letters with mixed deals}

This example shows how to get to that \(\dfrac{3}{4}\) which they agree upon in the figure below.

\begin{figure}[h!]
\centering
\includegraphics[width=0.5\textwidth]{phantom}
\end{figure}

Same here, we need to calculate the expected utility of each agent (which is the same for both).
Remember that EU is the probability of not doing all the job multiplied by the cost of doing all the (claimed) job:
\(EU_1 = (0.5)(12) = 6\).

Now we can analyze what the supposed expected cost and expected utility (in case of not doing the deal) are.
Remember, EU is now the cost of what I'm supposed to do, minus the expected cost of doing all the work.

\begin{align*}
   EC_1 & = 12p\\[1ex]
   EU_1 & = 12 - 12p\\[2ex]
   EC_2 & = 12(1-p)\\[1ex]
   EU_2 & = 6 - 12(1-p)\\[2ex]
   12-12p & = 6 - 12(1-p)\\
   12 -12p & = 6 - 12 + 12p\\
   12 + 6 & = 12p + 12p\\
   18 &= 24p \therefore p = \dfrac{18}{24} = \dfrac{3}{4}
\end{align*}

Now, let's check against the ACTUAL, REAL expected utility in case the lying agent does all the work.
Notice how if agent 1 does all the work, the actual work would be LESS than what he claimed it to be
(as opposed to hiding letters).

\[EU_1 = 6 - 6\left(\dfrac{3}{4}\right) = 6 - \dfrac{18}{4} = 1.5\]

Again, lying doesn't pay off in mixed deals in TOD!

\begin{tcolorbox}
   REMEMBER KIDDO: \(EC\) is the cost of doing all the job; \(EU\) is the COST of doing MY SUPPOSED JOB minus \(EC\).
   Then equate, and get \(p\) (or \(1-p\)). Then use \(p\) to get the supposed EU; it should always be LESS than what it was originally!
   If you HIDE info and you lose, you'll have to do more work than what you said you'd do. \\
   If you create PHANTOM info and you lose, you'll do less than what you said you'd do.
\end{tcolorbox}

\end{document}