Recommendations are commonly used to modify
users' natural behavior, for example, to increase
product sales or the time spent on a website. This
results in a gap between the ultimate business ob-
jective and the classical setup where recommenda-
tions are optimized to be coherent with past user be-
havior. To bridge this gap, we propose a new learn-
ing setup for recommendation that optimizes for the
Incremental Treatment Effect (ITE) of the policy.
We show that this is equivalent to learning to predict
recommendation outcomes under a fully random
recommendation policy. We then propose a new domain
adaptation algorithm that learns from logged data
containing outcomes from a biased recommendation
policy and predicts recommendation outcomes
under random exposure. We compare our
method against state-of-the-art factorization methods
and against new approaches to causal recommendation,
and show significant improvements.
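
To make the gap between biased logs and random exposure concrete, the following is a minimal sketch under invented assumptions: one user, three items, and made-up outcome probabilities and policies. It uses inverse propensity scoring (IPS), a standard reweighting baseline, and not the domain adaptation algorithm proposed here. All names (TRUE_CLICK_PROB, LOGGING_POLICY, simulate_logs, and so on) are hypothetical.

```python
import random

# Hypothetical toy setup: one user, three items. All numbers are invented.
TRUE_CLICK_PROB = [0.1, 0.5, 0.9]        # outcome probability if the item is recommended
LOGGING_POLICY = [0.7, 0.2, 0.1]         # biased policy that produced the logs
UNIFORM_POLICY = [1 / 3, 1 / 3, 1 / 3]   # fully random recommendation policy


def simulate_logs(n, seed=0):
    """Draw n (item, reward) pairs under the biased logging policy."""
    rng = random.Random(seed)
    logs = []
    for _ in range(n):
        item = rng.choices(range(3), weights=LOGGING_POLICY)[0]
        reward = 1 if rng.random() < TRUE_CLICK_PROB[item] else 0
        logs.append((item, reward))
    return logs


def naive_estimate(logs):
    """Average logged reward; reflects the logging policy, not random exposure."""
    return sum(reward for _, reward in logs) / len(logs)


def ips_estimate(logs):
    """Reweight each logged reward by uniform / logging propensity (IPS)."""
    total = sum(
        reward * UNIFORM_POLICY[item] / LOGGING_POLICY[item]
        for item, reward in logs
    )
    return total / len(logs)


logs = simulate_logs(200_000)
naive = naive_estimate(logs)
ips = ips_estimate(logs)
# Ground-truth expected outcome under the uniform policy, known only in simulation.
target = sum(p * q for p, q in zip(TRUE_CLICK_PROB, UNIFORM_POLICY))

print(f"naive={naive:.3f}  ips={ips:.3f}  target={target:.3f}")
```

In this toy setting the naive average tracks the logging policy, while the reweighted estimate approaches the outcome expected under random exposure. IPS needs the logging propensities to be known and can have high variance when they are small.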