Following recent proposals to handle natural language generation (NLG) in spoken
dialogue systems with long short-term memory recurrent neural network
models~\citep{Wen2016a}, we first investigate a variant thereof with the objective of
better integrating the attention subnetwork. Our next objective is to propose
and evaluate a framework to adapt the NLG module online through direct interactions with
the users. The basic approach is to ask the user to utter an alternative
sentence expressing a particular dialogue act. The system then has to decide between
using an automatic transcription and asking for a manual one. To make this decision, a
reinforcement learning approach based on an adversarial bandit scheme is adopted. We
show that by appropriately defining the rewards as a linear combination of expected
payoffs and costs of acquiring the new data provided by the user, a system design can
balance between improving the system's performance towards a better match with the
user's preferences and the burden associated with it. The actual benefits of this
system are then assessed with a human evaluation, showing that the addition of more
diverse utterances allows the system to produce sentences that are more satisfying for
the user.