
There is an interesting trade-off between running time and imputation accuracy when dealing with tuples that have two or more missing attributes. Suppose that the missing attributes are $X = \{x_1, x_2, \ldots, x_n\}$ and the known attributes are $A = \{a_1, a_2, \ldots, a_m\}$. One approach is to find, for each $x_i$ individually, the value that maximizes $P(x_i|A)$. This ``most likely estimation'' approach is the fastest method; however, it is also the least accurate, since it ignores all interactions among the attributes in $X$. A slightly slower ``greedy search'' approach finds the value of one variable in $X$ using $\arg\max_{x_i} P(x_i|A)$ and then sets it as evidence, so that the values of the remaining unknown attributes are informed by $A \cup \{x_i\}$. An even slower, but far more accurate, approach is ``most probable explanation'', where we find the joint assignment to all attributes in $X$ that maximizes the joint conditional probability $P(X|A)$. This is the approach that we use in this paper.
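The difference between the three strategies can be illustrated with a minimal sketch. It assumes a hypothetical posterior $P(x_1, x_2|A)$ over two binary missing attributes, stored as an explicit table with made-up probabilities. The function names, the table, and the fixed left-to-right order used by the greedy search are all assumptions for illustration, not the implementation used in this paper. In practice the posterior would come from inference in a learned model rather than from an enumerated table.

```python
from itertools import product

# Hypothetical posterior P(x1, x2 | A) over two binary missing attributes,
# already conditioned on the known attributes A. Values are illustrative only.
joint = {
    (0, 0): 0.25,
    (0, 1): 0.35,
    (1, 0): 0.40,
    (1, 1): 0.00,
}
domains = [(0, 1), (0, 1)]  # possible values of x1 and x2


def score(index, value, fixed):
    """Unnormalised P(x_index = value | A, fixed).

    Sums the joint entries consistent with `value` and with the
    already-fixed attributes; the normaliser is the same for every
    candidate value, so it does not affect the argmax.
    """
    return sum(
        p
        for assignment, p in joint.items()
        if assignment[index] == value
        and all(assignment[i] == v for i, v in fixed.items())
    )


def most_likely_estimation():
    """Maximise each marginal P(x_i | A) independently."""
    return tuple(
        max(domain, key=lambda v: score(i, v, {}))
        for i, domain in enumerate(domains)
    )


def greedy_search():
    """Fix attributes one at a time, adding each choice to the evidence.

    Assumption: attributes are processed in index order.
    """
    fixed = {}
    for i, domain in enumerate(domains):
        fixed[i] = max(domain, key=lambda v: score(i, v, fixed))
    return tuple(fixed[i] for i in range(len(domains)))


def most_probable_explanation():
    """Maximise the joint P(X | A) over all complete assignments."""
    return max(product(*domains), key=lambda assignment: joint[assignment])


for method in (most_likely_estimation, greedy_search, most_probable_explanation):
    result = method()
    print(method.__name__, result, joint[result])
```

On this toy table the three strategies return three different assignments: $(0, 0)$ for most likely estimation, $(0, 1)$ for greedy search, and $(1, 0)$ for most probable explanation, with joint probabilities 0.25, 0.35, and 0.40 respectively. The exhaustive enumeration in `most_probable_explanation` grows exponentially with the number of missing attributes, which is the source of the extra cost in this sketch.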