Attention-based multimodal neural machine translation
Conference Paper, Proceedings of the 1st Conference on Machine Translation (WMT '16), Vol. 2, pp. 639-645, August 2016
Abstract
We present a novel neural machine translation (NMT) architecture that associates visual and textual features for translation tasks with multiple modalities. Transformed global and regional visual features are concatenated with the text to form attendable sequences, which are distributed over parallel long short-term memory (LSTM) threads to assist the encoder in generating a representation for attention-based decoding. Experiments show that the proposed NMT architecture outperforms the text-only baseline.
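To make the described encoder concrete, below is a minimal sketch in PyTorch. It is not the authors' implementation: the feature dimensions, the number of parallel LSTM threads, the use of linear projections as the "transform", and the summation of thread outputs are all illustrative assumptions. It only shows the shape of the idea: project global and regional visual features into the word-embedding space, concatenate them with the text as pseudo-tokens, and run parallel LSTMs to produce a sequence a decoder could attend over.

import torch
import torch.nn as nn

class MultimodalEncoder(nn.Module):
    # Hypothetical module, loosely following the abstract's description.
    def __init__(self, vocab_size, embed_dim=256, hidden_dim=256,
                 global_feat_dim=4096, regional_feat_dim=4096, n_threads=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Transform global and regional visual features into the word
        # embedding space so they can be concatenated with the text.
        self.global_proj = nn.Linear(global_feat_dim, embed_dim)
        self.regional_proj = nn.Linear(regional_feat_dim, embed_dim)
        # Parallel LSTM "threads" over the visually augmented sequence.
        self.threads = nn.ModuleList(
            nn.LSTM(embed_dim, hidden_dim, batch_first=True)
            for _ in range(n_threads)
        )

    def forward(self, tokens, global_feat, regional_feats):
        # tokens: (B, T); global_feat: (B, Dg); regional_feats: (B, R, Dr)
        text = self.embed(tokens)                       # (B, T, E)
        g = self.global_proj(global_feat).unsqueeze(1)  # (B, 1, E)
        r = self.regional_proj(regional_feats)          # (B, R, E)
        # Visual pseudo-tokens + text form one attendable sequence.
        seq = torch.cat([g, r, text], dim=1)            # (B, 1+R+T, E)
        outputs = [lstm(seq)[0] for lstm in self.threads]
        # Merge thread outputs (here: a simple sum, an assumption) into
        # the representation an attention-based decoder would attend over.
        return torch.stack(outputs).sum(dim=0)          # (B, 1+R+T, H)

# Usage with random inputs standing in for pre-extracted CNN features:
enc = MultimodalEncoder(vocab_size=10000)
out = enc(torch.randint(0, 10000, (4, 12)),   # token ids
          torch.randn(4, 4096),               # global image feature
          torch.randn(4, 5, 4096))            # 5 regional features
print(out.shape)  # torch.Size([4, 18, 256]) -> 1 global + 5 regions + 12 words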
BibTeX
@conference{Huang-2016-113124,
author = {P.-Y. Huang and F. Liu and S.-R. Shiang and J. Oh and C. Dyer},
title = {Attention-based multimodal neural machine translation},
booktitle = {Proceedings of 1st Conference on Machine Translation (WMT '16)},
year = {2016},
month = {August},
volume = {2},
pages = {639--645},
}
Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.