Bibliography

Published online by Cambridge University Press:  30 May 2020

Philipp Koehn
Affiliation:
The Johns Hopkins University

Type: Chapter
Publisher: Cambridge University Press
Print publication year: 2020

References

Abdou, Mostafa, Gloncak, Vladan, and Bojar, Ondřej. 2017. Variable mini-batch sizing and pre-trained embeddings. In Proceedings of the Second Conference on Machine Translation. Volume 2: Shared Task Papers. Association for Computational Linguistics, Copenhagen, pages 680–686. www.aclweb.org/anthology/W17-4780.
Aharoni, Roee and Goldberg, Yoav. 2017. Towards string-to-tree neural machine translation. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. Volume 2: Short Papers. Association for Computational Linguistics, Vancouver, BC, pages 132–140. http://aclweb.org/anthology/P17-2021.
Aharoni, Roee, Johnson, Melvin, and Firat, Orhan. 2019. Massively multilingual neural machine translation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long and Short Papers. Association for Computational Linguistics, Minneapolis, MN, pages 3874–3884. www.aclweb.org/anthology/N19-1388.
Al-Shedivat, Maruan and Parikh, Ankur. 2019. Consistency by agreement in zero-shot neural machine translation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long and Short Papers. Association for Computational Linguistics, Minneapolis, MN, pages 1184–1197. www.aclweb.org/anthology/N19-1121.
Alaux, Jean, Grave, Edouard, Cuturi, Marco, and Joulin, Armand. 2019. Unsupervised hyper-alignment for multilingual word embeddings. In International Conference on Learning Representations (ICLR). New Orleans, LA. http://arxiv.org/pdf/1811.01124.pdf.
Alinejad, Ashkan, Siahbani, Maryam, and Sarkar, Anoop. 2018. Prediction improves simultaneous neural machine translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, pages 3022–3027. www.aclweb.org/anthology/D18-1337.
Alkhouli, Tamer, Bretschner, Gabriel, and Ney, Hermann. 2018. On the alignment problem in multi-head attention-based neural machine translation. In Proceedings of the Third Conference on Machine Translation. Volume 1: Research Papers. Association for Computational Linguistics, Brussels, pages 177–185. www.aclweb.org/anthology/W18-6318.
Alkhouli, Tamer, Bretschner, Gabriel, Peter, Jan-Thorsten, Hethnawi, Mohammed, Guta, Andreas, and Ney, Hermann. 2016. Alignment-based neural machine translation. In Proceedings of the First Conference on Machine Translation. Association for Computational Linguistics, Berlin, pages 54–65. www.aclweb.org/anthology/W/W16/W16-2206.
Alkhouli, Tamer and Ney, Hermann. 2017. Biasing attention-based recurrent neural networks using external alignment information. In Proceedings of the Second Conference on Machine Translation. Volume 1: Research Papers. Association for Computational Linguistics, Copenhagen, pages 108–117. www.aclweb.org/anthology/W17-4711.
Allen, Robert B. 1987. Several studies on natural language and back-propagation. Proceedings of the IEEE First International Conference on Neural Networks 2(5):335–341. http://boballen.info/RBA/PAPERS/NL-BP/nl-bp.pdf.
Alqaisi, Taghreed and O’Keefe, Simon. 2019. En-ar bilingual word embeddings without word alignment: Factors effects. In Proceedings of the Fourth Arabic Natural Language Processing Workshop. Association for Computational Linguistics, Florence, pages 97–107. www.aclweb.org/anthology/W19-4611.
Alvarez-Melis, David and Jaakkola, Tommi. 2018. Gromov-Wasserstein alignment of word embedding spaces. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, pages 1881–1890. https://doi.org/10.18653/v1/D18-1214.
Anastasopoulos, Antonios, Lui, Alison, Nguyen, Toan Q., and Chiang, David. 2019. Neural machine translation of text from non-native speakers. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long and Short Papers. Association for Computational Linguistics, Minneapolis, MN, pages 3070–3080. www.aclweb.org/anthology/N19-1311.
Anderson, Peter, Fernando, Basura, Johnson, Mark, and Gould, Stephen. 2017. Guided open vocabulary image captioning with constrained beam search. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Copenhagen, pages 936–945. www.aclweb.org/anthology/D17-1098.
Antonova, Alexandra and Misyurev, Alexey. 2011. Building a web-based parallel corpus and filtering out machine-translated text. In Proceedings of the 4th Workshop on Building and Using Comparable Corpora: Comparable Corpora and the Web. Association for Computational Linguistics, Portland, OR, pages 136–144. www.aclweb.org/anthology/W11-1218.
Argueta, Arturo and Chiang, David. 2019. Accelerating sparse matrix operations in neural networks on graphics processing units. In Proceedings of the 57th Conference of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, pages 6215–6224. www.aclweb.org/anthology/P19-1626.
Arivazhagan, Naveen, Cherry, Colin, Macherey, Wolfgang, Chiu, Chung-Cheng, Yavuz, Semih, Pang, Ruoming, Li, Wei, and Raffel, Colin. 2019. Monotonic infinite lookback attention for simultaneous machine translation. In Proceedings of the 57th Conference of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, pages 1313–1323. www.aclweb.org/anthology/P19-1126.
Artetxe, Mikel, Labaka, Gorka, and Agirre, Eneko. 2016. Learning principled bilingual mappings of word embeddings while preserving monolingual invariance. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Austin, TX, pages 2289–2294. https://aclweb.org/anthology/D16-1250.
Artetxe, Mikel, Labaka, Gorka, and Agirre, Eneko. 2017. Learning bilingual word embeddings with (almost) no bilingual data. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. Volume 1: Long Papers. Association for Computational Linguistics, Vancouver, BC, pages 451–462. https://doi.org/10.18653/v1/P17-1042.
Artetxe, Mikel, Labaka, Gorka, and Agirre, Eneko. 2018a. Unsupervised statistical machine translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, pages 3632–3642. www.aclweb.org/anthology/D18-1399.
Artetxe, Mikel, Labaka, Gorka, and Agirre, Eneko. 2019a. Bilingual lexicon induction through unsupervised machine translation. In Proceedings of the 57th Conference of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, pages 5002–5007. www.aclweb.org/anthology/P19-1494.
Artetxe, Mikel, Labaka, Gorka, and Agirre, Eneko. 2019b. An effective approach to unsupervised machine translation. In Proceedings of the 57th Conference of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, pages 194–203. www.aclweb.org/anthology/P19-1019.
Artetxe, Mikel, Labaka, Gorka, Agirre, Eneko, and Cho, Kyunghyun. 2018b. Unsupervised neural machine translation. In International Conference on Learning Representations. Vancouver, BC. https://openreview.net/forum?id=Sy2ogebAW.
Artetxe, Mikel and Schwenk, Holger. 2018. Massively multilingual sentence embeddings for zero-shot cross-lingual transfer and beyond. Ithaca, NY: Cornell University, abs/1812.10464. http://arxiv.org/abs/1812.10464.
Artetxe, Mikel and Schwenk, Holger. 2019. Margin-based parallel corpus mining with multilingual sentence embeddings. In Proceedings of the 57th Conference of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, pages 3197–3203. www.aclweb.org/anthology/P19-1309.
Arthur, Philip, Neubig, Graham, and Nakamura, Satoshi. 2016a. Incorporating discrete translation lexicons into neural machine translation. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Austin, TX, pages 1557–1567. https://aclweb.org/anthology/D16-1162.
Arthur, Philip, Neubig, Graham, and Nakamura, Satoshi. 2016b. Incorporating discrete translation lexicons into neural machine translation. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Austin, TX, pages 1557–1567. https://aclweb.org/anthology/D16-1162.
Ataman, Duygu and Federico, Marcello. 2018a. Compositional representation of morphologically-rich input for neural machine translation. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Volume 2: Short Papers. Association for Computational Linguistics, Melbourne, pages 305–311. http://aclweb.org/anthology/P18-2049.
Ataman, Duygu and Federico, Marcello. 2018b. An evaluation of two vocabulary reduction methods for neural machine translation. In Annual Meeting of the Association for Machine Translation in the Americas (AMTA). Boston, MA. www.aclweb.org/anthology/W18-1810.
Ataman, Duygu, Di Gangi, Mattia Antonino, and Federico, Marcello. 2018. Compositional source word representations for neural machine translation. In Proceedings of the 21st Annual Conference of the European Association for Machine Translation. Melbourne. https://arxiv.org/pdf/1805.02036.pdf.
Ataman, Duygu, Negri, Matteo, Turchi, Marco, and Federico, Marcello. 2017. Linguistically motivated vocabulary reduction for neural machine translation from Turkish to English. The Prague Bulletin of Mathematical Linguistics 108:331–342. https://ufal.mff.cuni.cz/pbml/108/art-ataman-negri-turchi-federico.pdf.
Axelrod, Amittai, He, Xiaodong, and Gao, Jianfeng. 2011. Domain adaptation via pseudo in-domain data selection. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Edinburgh, pages 355–362. www.aclweb.org/anthology/D11-1033.
Bahdanau, Dzmitry, Cho, Kyunghyun, and Bengio, Yoshua. 2015. Neural machine translation by jointly learning to align and translate. In ICLR. San Diego, CA. http://arxiv.org/pdf/1409.0473v6.pdf.
Baltescu, Paul and Blunsom, Phil. 2015. Pragmatic neural language modelling in machine translation. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, Denver, CO, pages 820–829. www.aclweb.org/anthology/N15-1083.
Baltescu, Paul, Blunsom, Phil, and Hoang, Hieu. 2014. OxLM: A neural language modelling framework for machine translation. The Prague Bulletin of Mathematical Linguistics 102:81–92. http://ufal.mff.cuni.cz/pbml/102/art-baltescu-blunsom-hoang.pdf.
Banerjee, Tamali and Bhattacharyya, Pushpak. 2018. Meaningless yet meaningful: Morphology grounded subword-level NMT. In Proceedings of the Second Workshop on Subword/Character Level Models. Association for Computational Linguistics, New Orleans, LA, pages 55–60. https://doi.org/10.18653/v1/W18-1207.
Bapna, Ankur, Chen, Mia, Firat, Orhan, Cao, Yuan, and Wu, Yonghui. 2018. Training deeper neural machine translation models with transparent attention. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, pages 3028–3033. www.aclweb.org/anthology/D18-1338.
Bapna, Ankur and Firat, Orhan. 2019. Non-parametric adaptation for neural machine translation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long and Short Papers. Association for Computational Linguistics, Minneapolis, MN, pages 1921–1931. www.aclweb.org/anthology/N19-1191.
Barbu, Eduard, Escartín, Carla Parra, Bentivogli, Luisa, Negri, Matteo, Turchi, Marco, Orasan, Constantin, and Federico, Marcello. 2016. The first automatic translation memory cleaning shared task. Machine Translation 30(3):145–166. https://doi.org/10.1007/s10590-016-9183-x.
Belinkov, Yonatan and Bisk, Yonatan. 2017. Synthetic and natural noise both break neural machine translation. Ithaca, NY: Cornell University, abs/1711.02173. http://arxiv.org/abs/1711.02173.
Belinkov, Yonatan, Durrani, Nadir, Dalvi, Fahim, Sajjad, Hassan, and Glass, James. 2017a. What do neural machine translation models learn about morphology? In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. Volume 1: Long Papers. Association for Computational Linguistics, Vancouver, BC, pages 861–872. http://aclweb.org/anthology/P17-1080.
Belinkov, Yonatan, Màrquez, Lluís, Sajjad, Hassan, Durrani, Nadir, Dalvi, Fahim, and Glass, James. 2017b. Evaluating layers of representation in neural machine translation on part-of-speech and semantic tagging tasks. In Proceedings of the Eighth International Joint Conference on Natural Language Processing. Volume 1: Long Papers. Asian Federation of Natural Language Processing, Taipei, pages 1–10. www.aclweb.org/anthology/I17-1001.
Bengio, Yoshua, Ducharme, Réjean, Vincent, Pascal, and Jauvin, Christian. 2003. A neural probabilistic language model. Journal of Machine Learning Research 3:1137–1155.
Bentivogli, Luisa, Bisazza, Arianna, Cettolo, Mauro, and Federico, Marcello. 2016. Neural versus phrase-based machine translation quality: A case study. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Austin, TX, pages 257–267. https://aclweb.org/anthology/D16-1025.
Bentivogli, Luisa, Bisazza, Arianna, Cettolo, Mauro, and Federico, Marcello. 2018. Neural versus phrase-based MT quality: An in-depth analysis on English–German and English–French. Computer Speech and Language 49:52–70. https://doi.org/10.1016/j.csl.2017.11.004.
Bicici, Ergun and Yuret, Deniz. 2011. Instance selection for machine translation using feature decay algorithms. In Proceedings of the Sixth Workshop on Statistical Machine Translation. Association for Computational Linguistics, Edinburgh, pages 272–283. www.aclweb.org/anthology/W11-2131.
Blackwood, Graeme, Ballesteros, Miguel, and Ward, Todd. 2018. Multilingual neural machine translation with task-specific attention. In Proceedings of the 27th International Conference on Computational Linguistics. Association for Computational Linguistics, Santa Fe, NM, pages 3112–3122. www.aclweb.org/anthology/C18-1263.
Blain, Frédéric, Specia, Lucia, and Madhyastha, Pranava. 2017. Exploring hypotheses spaces in neural machine translation. In Machine Translation Summit XVI. Nagoya, Japan. www.doc.ic.ac.uk/pshantha/papers/mtsummit17.pdf.
Blei, David M., Ng, Andrew Y., and Jordan, Michael I. 2003. Latent Dirichlet allocation. Journal of Machine Learning Research 3:993–1022.
Bojar, Ondřej, Chatterjee, Rajen, Federmann, Christian, Graham, Yvette, Haddow, Barry, Huck, Matthias, Yepes, Antonio Jimeno, Koehn, Philipp, Logacheva, Varvara, Monz, Christof, Negri, Matteo, Neveol, Aurelie, Neves, Mariana, Popel, Martin, Post, Matt, Rubino, Raphael, Scarton, Carolina, Specia, Lucia, Turchi, Marco, Verspoor, Karin, and Zampieri, Marcos. 2016. Findings of the 2016 conference on machine translation. In Proceedings of the First Conference on Machine Translation. Association for Computational Linguistics, Berlin, pages 131–198. www.aclweb.org/anthology/W/W16/W16-2301.
Braune, Fabienne, Hangya, Viktor, Eder, Tobias, and Fraser, Alexander. 2018. Evaluating bilingual word embeddings on the long tail. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 2: Short Papers. Association for Computational Linguistics, New Orleans, LA, pages 188–193. http://aclweb.org/anthology/N18-2030.
Bulte, Bram and Tezcan, Arda. 2019. Neural fuzzy repair: Integrating fuzzy matches into neural machine translation. In Proceedings of the 57th Conference of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, pages 1800–1809. www.aclweb.org/anthology/P19-1175.
Burchardt, Aljoscha, Macketanz, Vivien, Dehdari, Jon, Heigold, Georg, Peter, Jan-Thorsten, and Williams, Philip. 2017. A linguistic evaluation of rule-based, phrase-based, and neural MT engines. The Prague Bulletin of Mathematical Linguistics 108:159–170. https://doi.org/10.1515/pralin-2017-0017.
Burlot, Franck, García-Martínez, Mercedes, Barrault, Loïc, Bougares, Fethi, and Yvon, François. 2017. Word representations in factored neural machine translation. In Proceedings of the Second Conference on Machine Translation. Volume 1: Research Papers. Association for Computational Linguistics, Copenhagen, pages 20–31. www.aclweb.org/anthology/W17-4703.
Burlot, Franck and Yvon, François. 2017. Evaluating the morphological competence of machine translation systems. In Proceedings of the Second Conference on Machine Translation. Volume 1: Research Papers. Association for Computational Linguistics, Copenhagen, pages 43–55. www.aclweb.org/anthology/W17-4705.
Burlot, Franck and Yvon, François. 2018. Using monolingual data in neural machine translation: A systematic study. In Proceedings of the Third Conference on Machine Translation: Research Papers. Association for Computational Linguistics, Brussels, pages 144–155. www.aclweb.org/anthology/W18-6315.
Callison-Burch, Chris, Koehn, Philipp, Monz, Christof, Peterson, Kay, Przybocki, Mark, and Zaidan, Omar. 2010. Findings of the 2010 joint workshop on statistical machine translation and metrics for machine translation. In Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR. Association for Computational Linguistics, Uppsala, pages 17–53. www.aclweb.org/anthology/W10-1703.
Carpuat, Marine, Vyas, Yogarshi, and Niu, Xing. 2017. Detecting cross-lingual semantic divergence for neural machine translation. In Proceedings of the First Workshop on Neural Machine Translation. Association for Computational Linguistics, Vancouver, BC, pages 69–79. www.aclweb.org/anthology/W17-3209.
Castaño, M. Asunción, Casacuberta, Francisco, and Vidal, Enrique. 1997. Machine translation using neural networks and finite-state models. In Theoretical and Methodological Issues in Machine Translation, Santa Fe, NM, pages 160–167. www.mt-archive.info/TMI-1997-Castano.pdf.
Castilho, Sheila and Guerberof, Ana. 2018. Reading comprehension of machine translation output: What makes for a better read? In Proceedings of the 21st Annual Conference of the European Association for Machine Translation. Melbourne. https://rua.ua.es/dspace/bitstream/10045/76032/1/EAMT2018-Proceedings10.pdf.
Castilho, Sheila, Moorkens, Joss, Gaspari, Federico, Calixto, Iacer, Tinsley, John, and Way, Andy. 2017a. Is neural machine translation the new state of the art? The Prague Bulletin of Mathematical Linguistics 108:109–120. https://doi.org/10.1515/pralin-2017-0013.
Castilho, Sheila, Moorkens, Joss, Gaspari, Federico, Sennrich, Rico, Sosoni, Vilelmini, Georgakopoulou, Panayota, Lohar, Pintu, Way, Andy, Barone, Antonio Valerio Miceli, and Gialama, Maria. 2017b. A comparative quality evaluation of PBSMT and NMT using professional translators. In Machine Translation Summit XVI. Nagoya, Japan.
Caswell, Isaac, Chelba, Ciprian, and Grangier, David. 2019. Tagged back-translation. In Proceedings of the Fourth Conference on Machine Translation. Association for Computational Linguistics, Florence, pages 53–63. www.aclweb.org/anthology/W19-5206.
Cettolo, Mauro, Federico, Marcello, Bentivogli, Luisa, Niehues, Jan, Stüker, Sebastian, Sudoh, Katsuhito, Yoshino, Koichiro, and Federmann, Christian. 2017. Overview of the IWSLT 2017 evaluation campaign. In International Workshop on Spoken Language Translation. Tokyo, pages 2–14.
Chatterjee, Rajen, Negri, Matteo, Turchi, Marco, Federico, Marcello, Specia, Lucia, and Blain, Frédéric. 2017. Guiding neural machine translation decoding with external knowledge. In Proceedings of the Second Conference on Machine Translation. Volume 1: Research Papers. Association for Computational Linguistics, Copenhagen, pages 157–168. www.aclweb.org/anthology/W17-4716.
Chen, Boxing, Cherry, Colin, Foster, George, and Larkin, Samuel. 2017. Cost weighting for neural machine translation domain adaptation. In Proceedings of the First Workshop on Neural Machine Translation. Association for Computational Linguistics, Vancouver, BC, pages 40–46. www.aclweb.org/anthology/W17-3205.
Chen, Jianmin, Monga, Rajat, Bengio, Samy, and Jozefowicz, Rafal. 2016a. Revisiting distributed synchronous SGD. In International Conference on Learning Representations Workshop Track. https://arxiv.org/abs/1604.00981.
Chen, Mia Xu, Firat, Orhan, Bapna, Ankur, Johnson, Melvin, Macherey, Wolfgang, Foster, George, Jones, Llion, Schuster, Mike, Shazeer, Noam, Parmar, Niki, Vaswani, Ashish, Uszkoreit, Jakob, Kaiser, Lukasz, Chen, Zhifeng, Wu, Yonghui, and Hughes, Macduff. 2018. The best of both worlds: Combining recent advances in neural machine translation. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Volume 1: Long Papers. Association for Computational Linguistics, Melbourne, pages 76–86. http://aclweb.org/anthology/P18-1008.
Chen, Wenhu, Matusov, Evgeny, Khadivi, Shahram, and Peter, Jan-Thorsten. 2016b. Guided alignment training for topic-aware neural machine translation. Ithaca, NY: Cornell University, abs/1607.01628. https://arxiv.org/pdf/1607.01628.pdf.
Chen, Xilun and Cardie, Claire. 2018. Unsupervised multilingual word embeddings. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, pages 261–270. www.aclweb.org/anthology/D18-1024.
Cheng, Yong, Tu, Zhaopeng, Meng, Fandong, Zhai, Junjie, and Liu, Yang. 2018. Towards robust neural machine translation. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Volume 1: Long Papers. Association for Computational Linguistics, Melbourne, pages 1756–1766. http://aclweb.org/anthology/P18-1163.
Chinea-Rios, Mara, Peris, Álvaro, and Casacuberta, Francisco. 2017. Adapting neural machine translation with parallel synthetic data. In Proceedings of the Second Conference on Machine Translation. Volume 1: Research Papers. Association for Computational Linguistics, Copenhagen, pages 138–147. www.aclweb.org/anthology/W17-4714.
Cho, Kyunghyun. 2016. Noisy parallel approximate decoding for conditional recurrent language model. Ithaca, NY: Cornell University, abs/1605.03835. http://arxiv.org/abs/1605.03835.
Cho, Kyunghyun and Esipova, Masha. 2016. Can neural machine translation do simultaneous translation? Ithaca, NY: Cornell University, abs/1606.02012. http://arxiv.org/abs/1606.02012.
Cho, Kyunghyun, van Merrienboer, Bart, Bahdanau, Dzmitry, and Bengio, Yoshua. 2014. On the properties of neural machine translation: Encoder–decoder approaches. In Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation. Association for Computational Linguistics, Doha, Qatar, pages 103–111. www.aclweb.org/anthology/W14-4012.
Choi, Heeyoul, Cho, Kyunghyun, and Bengio, Yoshua. 2018. Fine-grained attention mechanism for neural machine translation. Neurocomputing 284:171–176.
Chorowski, Jan and Jaitly, Navdeep. 2017. Towards better decoding and language model integration in sequence to sequence models. In Interspeech. Stockholm, pages 523–527.
Chu, Chenhui, Dabre, Raj, and Kurohashi, Sadao. 2017. An empirical comparison of domain adaptation methods for neural machine translation. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. Volume 2: Short Papers. Association for Computational Linguistics, Vancouver, BC, pages 385–391. http://aclweb.org/anthology/P17-2061.
Chung, Junyoung, Cho, Kyunghyun, and Bengio, Yoshua. 2016. A character-level decoder without explicit segmentation for neural machine translation. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Volume 1: Long Papers. Association for Computational Linguistics, Berlin, pages 1693–1703. www.aclweb.org/anthology/P16-1160.
Church, Kenneth W. and Hovy, Eduard H. 1993. Good applications for crummy machine translation. Machine Translation 8(4):239–258. www.isi.edu/natural-language/people/hovy/papers/93churchhovy.pdf.
Cohn, Trevor, Hoang, Cong Duy Vu, Vymolova, Ekaterina, Yao, Kaisheng, Dyer, Chris, and Haffari, Gholamreza. 2016. Incorporating structural alignment biases into an attentional neural translation model. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, San Diego, CA, pages 876–885. www.aclweb.org/anthology/N16-1102.
Cohn-Gordon, Reuben and Goodman, Noah. 2019. Lost in machine translation: A method to reduce meaning loss. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long and Short Papers. Association for Computational Linguistics, Minneapolis, MN, pages 437–441. www.aclweb.org/anthology/N19-1042.
Conneau, Alexis, Lample, Guillaume, Ranzato, Marc’Aurelio, Denoyer, Ludovic, and Jégou, Hervé. 2018. Word translation without parallel data. In International Conference on Learning Representations. Vancouver, BC. https://openreview.net/pdf?id=H196sainb.
Costa-jussà, Marta R., España-Bonet, Cristina, Madhyastha, Pranava, Escolano, Carlos, and Fonollosa, José A. R. 2016. The TALP–UPC Spanish–English WMT biomedical task: Bilingual embeddings and char-based neural language model rescoring in a phrase-based system. In Proceedings of the First Conference on Machine Translation. Association for Computational Linguistics, Berlin, pages 463–468. www.aclweb.org/anthology/W/W16/W16-2336.
Costa-jussà, Marta R. and Fonollosa, José A. R. 2016. Character-based neural machine translation. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Volume 2: Short Papers. Association for Computational Linguistics, Berlin, pages 357–361. http://anthology.aclweb.org/P16-2058.
Coulmance, Jocelyn, Marty, Jean-Marc, Wenzek, Guillaume, and Benhalloum, Amine. 2015. Trans-gram, fast cross-lingual word-embeddings. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Lisbon, pages 1109–1113. http://aclweb.org/anthology/D15-1131.
Crego, Josep Maria, Kim, Jungi, Klein, Guillaume, Rebollo, Anabel, Yang, Kathy, Senellart, Jean, Akhanov, Egor, Brunelle, Patrice, Coquard, Aurelien, Deng, Yongchao, Enoue, Satoshi, Geiss, Chiyo, Johanson, Joshua, Khalsa, Ardas, Khiari, Raoum, Ko, Byeongil, Kobus, Catherine, Lorieux, Jean, Martins, Leidiana, Nguyen, Dang-Chuan, Priori, Alexandra, Riccardi, Thomas, Segal, Natalia, Servan, Christophe, Tiquet, Cyril, Wang, Bo, Yang, Jin, Zhang, Dakun, Zhou, Jing, and Zoldan, Peter. 2016. Systran’s pure neural machine translation systems. Ithaca, NY: Cornell University, abs/1610.05540. http://arxiv.org/abs/1610.05540.
Cui, Lei, Zhang, Dongdong, Liu, Shujie, Li, Mu, and Zhou, Ming. 2013. Bilingual data cleaning for SMT using graph-based random walk. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics. Volume 2: Short Papers. Association for Computational Linguistics, Sofia, Bulgaria, pages 340–345. www.aclweb.org/anthology/P13-2061.
Currey, Anna, Barone, Antonio Valerio Miceli, and Heafield, Kenneth. 2017. Copied monolingual data improves low-resource neural machine translation. In Proceedings of the Second Conference on Machine Translation. Volume 1: Research Papers. Association for Computational Linguistics, Copenhagen, pages 148–156. http://www.aclweb.org/anthology/W17-4715.
Dabre, Raj, Cromieres, Fabien, and Kurohashi, Sadao. 2017. Enabling multi-source neural machine translation by concatenating source sentences in multiple languages. In Machine Translation Summit XVI. Nagoya, Japan. https://arxiv.org/pdf/1702.06135.pdf.
Dakwale, Praveen and Monz, Christof. 2017. Fine-tuning for neural machine translation with limited degradation across in- and out-of-domain data. In Machine Translation Summit XVI.
Dalvi, Fahim, Durrani, Nadir, Sajjad, Hassan, and Vogel, Stephan. 2018. Incremental decoding and training methods for simultaneous translation in neural machine translation. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 2: Short Papers. Association for Computational Linguistics, New Orleans, LA, pages 493–499. http://aclweb.org/anthology/N18-2079.
de Gispert, Adrià, Iglesias, Gonzalo, and Byrne, Bill. 2015. Fast and accurate preordering for SMT using neural networks. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, Denver, CO, pages 1012–1017. www.aclweb.org/anthology/N15-1105.
Dehghani, Mostafa, Gouws, Stephan, Vinyals, Oriol, Uszkoreit, Jakob, and Kaiser, Lukasz. 2019. Universal transformers. In International Conference on Learning Representations (ICLR). New Orleans, LA. https://openreview.net/pdf?id=HyzdRiR9Y7.
Dessloch, Florian, Ha, Thanh-Le, Müller, Markus, Niehues, Jan, Nguyen, Thai Son, Pham, Ngoc-Quan, Salesky, Elizabeth, Sperber, Matthias, Stüker, Sebastian, Zenkel, Thomas, and Waibel, Alex. 2018. KIT lecture translator: Multilingual speech translation with one-shot learning. In Proceedings of the 27th International Conference on Computational Linguistics: System Demonstrations. Association for Computational Linguistics, Santa Fe, NM, pages 89–93. www.aclweb.org/anthology/C18-2020.
Devlin, Jacob. 2017. Sharp models on dull hardware: Fast and accurate neural machine translation decoding on the CPU. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Copenhagen, pages 2810–2815. http://aclweb.org/anthology/D17-1300.
Devlin, Jacob, Chang, Ming-Wei, Lee, Kenton, and Toutanova, Kristina. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long and Short Papers. Association for Computational Linguistics, Minneapolis, MN, pages 4171–4186. www.aclweb.org/anthology/N19-1423.
Devlin, Jacob, Zbib, Rabih, Huang, Zhongqiang, Lamar, Thomas, Schwartz, Richard, and Makhoul, John. 2014. Fast and robust neural network joint models for statistical machine translation. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics. Volume 1: Long Papers. Association for Computational Linguistics, Baltimore, MD, pages 1370–1380. www.aclweb.org/anthology/P14-1129.
Dhar, Prajit and Bisazza, Arianna. 2018. Does syntactic knowledge in multilingual language models transfer across languages? In Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP. Association for Computational Linguistics, Brussels, pages 374–377. www.aclweb.org/anthology/W18-5453.
Ding, Shuoyang, Xu, Hainan, and Koehn, Philipp. 2019a. Salience-driven word alignment interpretation for neural machine translation. In Proceedings of the Conference on Machine Translation (WMT). Florence.
Ding, Shuoyang, Xu, Hainan, and Koehn, Philipp. 2019b. Saliency-driven word alignment interpretation for neural machine translation. In Proceedings of the Fourth Conference on Machine Translation. Association for Computational Linguistics, Florence, pages 1–12. www.aclweb.org/anthology/W19-5201.
Ding, Yanzhuo, Liu, Yang, Luan, Huanbo, and Sun, Maosong. 2017. Visualizing and understanding neural machine translation. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. Volume 1: Long Papers. Association for Computational Linguistics, Vancouver, BC, pages 1150–1159. http://aclweb.org/anthology/P17-1106.
Dinu, Georgiana, Mathur, Prashant, Federico, Marcello, and Al-Onaizan, Yaser. 2019. Training neural machine translation to apply terminology constraints. In Proceedings of the 57th Conference of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, pages 3063–3068. www.aclweb.org/anthology/P19-1294.
Dong, Daxiang, Wu, Hua, He, Wei, Yu, Dianhai, and Wang, Haifeng. 2015. Multi-task learning for multiple language translation. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing. Volume 1: Long Papers. Association for Computational Linguistics, Beijing, pages 1723–1732. https://doi.org/10.3115/v1/P15-1166.
Duchi, John, Hazan, Elad, and Singer, Yoram. 2011. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research 12:2121–2159.
Dyer, Chris, Chahuneau, Victor, and Smith, Noah A. 2013. A simple, fast, and effective reparameterization of IBM model 2. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, Atlanta, GA, pages 644–648. www.aclweb.org/anthology/N13-1073.
Eck, Matthias, Vogel, Stephan, and Waibel, Alex. 2005. Low cost portability for statistical machine translation based on n-gram frequency and TF-IDF. In Proceedings of the International Workshop on Spoken Language Translation. Pittsburgh, PA. http://20.210-193-52.unknown.qala.com.sg/archive/iwslt_05/papers/slt5061.pdf.
Edunov, Sergey, Ott, Myle, Auli, Michael, and Grangier, David. 2018a. Understanding back-translation at scale. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, pages 489–500. www.aclweb.org/anthology/D18-1045.
Edunov, Sergey, Ott, Myle, Auli, Michael, Grangier, David, and Ranzato, Marc’Aurelio. 2018b. Classical structured prediction losses for sequence to sequence learning. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long Papers. Association for Computational Linguistics, New Orleans, LA, pages 355–364. http://aclweb.org/anthology/N18-1033.
Eetemadi, Sauleh, Lewis, William, Toutanova, Kristina, and Radha, Hayder. 2015. Survey of data-selection methods in statistical machine translation. Machine Translation 29(3–4):189–223.
Efron, Bradley and Tibshirani, Robert J. 1993. An Introduction to the Bootstrap. Boca Raton, FL: Chapman & Hall.
España-Bonet, C., Varga, Á. C., Barrón-Cedeño, A., and van Genabith, J. 2017. An empirical analysis of NMT-derived interlingual embeddings and their use in parallel sentence identification. IEEE Journal of Selected Topics in Signal Processing 11(8):1340–1350. https://doi.org/10.1109/JSTSP.2017.2764273.
Etchegoyhen, Thierry, Torné, Anna Fernández, Azpeitia, Andoni, Garcia, Eva Martínez, and Matamala, Anna. 2018. Evaluating domain adaptation for machine translation across scenarios. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). European Language Resources Association (ELRA), Miyazaki, Japan, pages 6–18.
Fadaee, Marzieh, Bisazza, Arianna, and Monz, Christof. 2017. Data augmentation for low-resource neural machine translation. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. Volume 2: Short Papers. Association for Computational Linguistics, Vancouver, BC, pages 567–573. http://aclweb.org/anthology/P17-2090.
Fadaee, Marzieh and Monz, Christof. 2018. Back-translation sampling by targeting difficult words in neural machine translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, pages 436–446. www.aclweb.org/anthology/D18-1040.
Farajian, M. Amin, Turchi, Marco, Negri, Matteo, Bertoldi, Nicola, and Federico, Marcello. 2017a. Neural vs. phrase-based machine translation in a multi-domain scenario. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics. Volume 2: Short Papers. Association for Computational Linguistics, Valencia, Spain, pages 280–284. www.aclweb.org/anthology/E17-2045.
Farajian, M. Amin, Turchi, Marco, Negri, Matteo, and Federico, Marcello. 2017b. Multi-domain neural machine translation through unsupervised adaptation. In Proceedings of the Second Conference on Machine Translation. Volume 1: Research Paper. Association for Computational Linguistics, Copenhagen, pages 127–137. www.aclweb.org/anthology/W17-4713.
Faruqui, Manaal and Dyer, Chris. 2014. Improving vector space word representations using multilingual correlation. In Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, Gothenburg, Sweden, pages 462–471. www.aclweb.org/anthology/E14-1049.
Feng, Shi, Liu, Shujie, Yang, Nan, Li, Mu, Zhou, Ming, and Zhu, Kenny Q. 2016. Improving attention modeling with implicit distortion and fertility for machine translation. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers. The COLING 2016 Organizing Committee, Osaka, pages 3082–3092. http://aclweb.org/anthology/C16-1290.
Finch, Andrew, Dixon, Paul, and Sumita, Eiichiro. 2012. Rescoring a phrase-based machine transliteration system with recurrent neural network language models. In Proceedings of the 4th Named Entity Workshop (NEWS) 2012. Association for Computational Linguistics, Jeju, Korea, pages 47–51. www.aclweb.org/anthology/W12-4406.
Firat, Orhan, Cho, Kyunghyun, and Bengio, Yoshua. 2016a. Multi-way, multilingual neural machine translation with a shared attention mechanism. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, San Diego, CA, pages 866–875. http://www.aclweb.org/anthology/N16-1101.
Firat, Orhan, Sankaran, Baskaran, Al-Onaizan, Yaser, Yarman Vural, Fatos T., and Cho, Kyunghyun. 2016b. Zero-resource translation with multi-lingual neural machine translation. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Austin, TX, pages 268–277. https://aclweb.org/anthology/D16-1026.
Forcada, Mikel L. and Ñeco, Ramón P. 1997. Recursive hetero-associative memories for translation. In Biological and Artificial Computation: From Neuroscience to Technology, Lanzarote, Canary Islands, pages 453–462.
Freitag, Markus and Al-Onaizan, Yaser. 2016. Fast domain adaptation for neural machine translation. Ithaca, NY: Cornell University, abs/1612.06897. http://arxiv.org/abs/1612.06897.
Freitag, Markus and Al-Onaizan, Yaser. 2017. Beam search strategies for neural machine translation. In Proceedings of the First Workshop on Neural Machine Translation. Association for Computational Linguistics, Vancouver, BC, pages 56–60. www.aclweb.org/anthology/W17-3207.
Fügen, Christian, Waibel, Alex, and Kolss, Muntsin. 2007. Simultaneous translation of lectures and speeches. Machine Translation 21(4):209–252.
Gangi, Mattia Antonino Di and Federico, Marcello. 2017. Monolingual embeddings for low resourced neural machine translation. In Proceedings of the International Workshop on Spoken Language Translation (IWSLT). Stockholm. http://workshop2017.iwslt.org/downloads/P05-Paper.pdf.
Garmash, Ekaterina and Monz, Christof. 2016. Ensemble learning for multi-source neural machine translation. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers. The COLING 2016 Organizing Committee, Osaka, Japan, pages 1409–1418. http://aclweb.org/anthology/C16-1133.
Gehring, Jonas, Auli, Michael, Grangier, David, Yarats, Denis, and Dauphin, Yann N. 2017. Convolutional sequence to sequence learning. Ithaca, NY: Cornell University, abs/1705.03122. http://arxiv.org/abs/1705.03122.
Gemici, Mevlana, Hung, Chia-Chun, Santoro, Adam, Wayne, Greg, Mohamed, Shakir, Rezende, Danilo Jimenez, Amos, David, and Lillicrap, Timothy P. 2017. Generative temporal models with memory. arXiv:1702.04649. Cornell University, Ithaca, NY. http://arxiv.org/abs/1702.04649.
Geng, Xinwei, Feng, Xiaocheng, Qin, Bing, and Liu, Ting. 2018. Adaptive multi-pass decoder for neural machine translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, pages 523–532. www.aclweb.org/anthology/D18-1048.
Ghader, Hamidreza and Monz, Christof. 2017. What does attention in neural machine translation pay attention to? In Proceedings of the Eighth International Joint Conference on Natural Language Processing. Volume 1: Long Papers. Asian Federation of Natural Language Processing, Taipei, pages 30–39. www.aclweb.org/anthology/I17-1004.
Giulianelli, Mario, Harding, Jack, Mohnert, Florian, Hupkes, Dieuwke, and Zuidema, Willem. 2018. Under the hood: Using diagnostic classifiers to investigate and improve how language models track agreement information. In Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP. Association for Computational Linguistics, Brussels, pages 240–248. www.aclweb.org/anthology/W18-5426.
Glorot, Xavier and Bengio, Yoshua. 2010. Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS). Sardinia.
Goldberg, Yoav. 2017. Neural Network Methods for Natural Language Processing. Volume 37: Synthesis Lectures on Human Language Technologies. Morgan & Claypool, San Rafael, CA. https://doi.org/10.2200/S00762ED1V01Y201703HLT037.
Goodfellow, Ian, Bengio, Yoshua, and Courville, Aaron. 2016. Deep Learning. MIT Press, Boston. www.deeplearningbook.org.
Gouws, Stephan, Bengio, Yoshua, and Corrado, Greg. 2015. BilBOWA: Fast bilingual distributed representations without word alignments. In Proceedings of the 32nd International Conference on Machine Learning, Volume 37. JMLR.org, ICML’15, Lille, France, pages 748–756. http://arxiv.org/pdf/1410.2455.pdf.
Gu, Jiatao, Cho, Kyunghyun, and Li, Victor O.K. 2017a. Trainable greedy decoding for neural machine translation. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Copenhagen, pages 1958–1968. http://aclweb.org/anthology/D17-1210.
Gu, Jiatao, Hassan, Hany, Devlin, Jacob, and Li, Victor O.K. 2018a. Universal neural machine translation for extremely low resource languages. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long Papers. Association for Computational Linguistics, New Orleans, LA, pages 344–354. http://aclweb.org/anthology/N18-1032.
Gu, Jiatao, Lu, Zhengdong, Li, Hang, and Li, Victor O.K. 2016. Incorporating copying mechanism in sequence-to-sequence learning. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Volume 1: Long Papers. Association for Computational Linguistics, Berlin, pages 1631–1640. www.aclweb.org/anthology/P16-1154.
Gu, Jiatao, Neubig, Graham, Cho, Kyunghyun, and Li, Victor O.K. 2017b. Learning to translate in real-time with neural machine translation. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics. Volume 1: Long Papers. Association for Computational Linguistics, Valencia, Spain, pages 1053–1062. www.aclweb.org/anthology/E17-1099.
Gu, Jiatao, Wang, Yong, Chen, Yun, Li, Victor O. K., and Cho, Kyunghyun. 2018b. Meta-learning for low-resource neural machine translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, pages 3622–3631. www.aclweb.org/anthology/D18-1398.
Gu, Jiatao, Wang, Yong, Cho, Kyunghyun, and Li, Victor O.K. 2018c. Search engine guided non-parametric neural machine translation. In Proceedings of the American Association for Artificial Intelligence. Monterey, CA. https://arxiv.org/pdf/1705.07267.
Guillou, Liane and Hardmeier, Christian. 2018. Automatic reference-based evaluation of pronoun translation misses the point. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, pages 4797–4802. www.aclweb.org/anthology/D18-1513.
Gulcehre, Caglar, Ahn, Sungjin, Nallapati, Ramesh, Zhou, Bowen, and Bengio, Yoshua. 2016. Pointing the unknown words. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Volume 1: Long Papers. Association for Computational Linguistics, Berlin, pages 140–149. www.aclweb.org/anthology/P16-1014.
Gülçehre, Çaglar, Firat, Orhan, Xu, Kelvin, Cho, Kyunghyun, Barrault, Loïc, Lin, Huei-Chi, Bougares, Fethi, Schwenk, Holger, and Bengio, Yoshua. 2015. On using monolingual corpora in neural machine translation. Ithaca, NY: Cornell University, abs/1503.03535. http://arxiv.org/abs/1503.03535.
Gulordava, Kristina, Bojanowski, Piotr, Grave, Edouard, Linzen, Tal, and Baroni, Marco. 2018. Colorless green recurrent networks dream hierarchically. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long Papers. Association for Computational Linguistics, New Orleans, LA, pages 1195–1205. https://doi.org/10.18653/v1/N18-1108.
Guo, Mandy, Yang, Yinfei, Stevens, Keith, Cer, Daniel, Ge, Heming, Sung, Yun-hsuan, Strope, Brian, and Kurzweil, Ray. 2019. Hierarchical document encoder for parallel corpus mining. In Proceedings of the Fourth Conference on Machine Translation. Association for Computational Linguistics, Florence, pages 64–72. www.aclweb.org/anthology/W19-5207.
Ha, Thanh-Le, Niehues, Jan, and Waibel, Alex. 2016. Toward multilingual neural machine translation with universal encoder and decoder. In Proceedings of the International Workshop on Spoken Language Translation (IWSLT). Seattle, WA. http://workshop2016.iwslt.org/downloads/IWSLT_2016_paper_5.pdf.
Ha, Thanh-Le, Niehues, Jan, and Waibel, Alex. 2017. Effective strategies in zero-shot neural machine translation. In Proceedings of the International Workshop on Spoken Language Translation (IWSLT). Tokyo. http://workshop2017.iwslt.org/downloads/P06-Paper.pdf.
Harris, Kim, Specia, Lucia, and Burchardt, Aljoscha. 2017. Feature-rich NMT and SMT post-edited corpora for productivity and evaluation tasks with a subset of MQM-annotated data. In Machine Translation Summit XVI. Nagoya, Japan.
Hashimoto, Kazuma and Tsuruoka, Yoshimasa. 2019. Accelerated reinforcement learning for sentence generation by vocabulary prediction. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long and Short Papers. Association for Computational Linguistics, Minneapolis, MN, pages 3115–3125. www.aclweb.org/anthology/N19-1315.
Hasler, Eva, Blunsom, Phil, Koehn, Philipp, and Haddow, Barry. 2014. Dynamic topic adaptation for phrase-based MT. In Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, Gothenburg, pages 328–337. www.aclweb.org/anthology/E14-1035.
Hasler, Eva, de Gispert, Adrià, Iglesias, Gonzalo, and Byrne, Bill. 2018. Neural machine translation decoding with terminology constraints. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 2: Short Papers. Association for Computational Linguistics, New Orleans, LA, pages 506–512. http://aclweb.org/anthology/N18-2081.
Hassan, Hany, Aue, Anthony, Chen, Chang, Chowdhary, Vishal, Clark, Jonathan, Federmann, Christian, Huang, Xuedong, Junczys-Dowmunt, Marcin, Lewis, William, Li, Mu, Liu, Shujie, Liu, Tie-Yan, Luo, Renqian, Menezes, Arul, Qin, Tao, Seide, Frank, Xu, Tan, Tian, Fei, Wu, Lijun, Wu, Shuangzhi, Xia, Yingce, Zhang, Dongdong, Zhang, Zhirui, and Zhou, Ming. 2018. Achieving human parity on automatic Chinese to English news translation. Ithaca, NY: Cornell University, abs/1803.05567. http://arxiv.org/abs/1803.05567.
He, Di, Xia, Yingce, Qin, Tao, Wang, Liwei, Yu, Nenghai, Liu, Tieyan, and Ma, Wei-Ying. 2016a. Dual learning for machine translation. In Lee, D. D., Sugiyama, M., Luxburg, U. V., Guyon, I., and Garnett, R., editors, Advances in Neural Information Processing Systems 29, Barcelona, pages 820–828. http://papers.nips.cc/paper/6469-dual-learning-for-machine-translation.pdf.
He, Wei, He, Zhongjun, Wu, Hua, and Wang, Haifeng. 2016b. Improved neural machine translation with SMT features. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence. Phoenix, AZ, pages 151–157.
Heyman, Geert, Verreet, Bregt, Vulić, Ivan, and Moens, Marie-Francine. 2019. Learning unsupervised multilingual word embeddings with incremental multilingual hubs. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long and Short Papers. Association for Computational Linguistics, Minneapolis, MN, pages 1890–1902. www.aclweb.org/anthology/N19-1188.
Hill, Felix, Cho, Kyunghyun, Jean, Sébastien, and Bengio, Yoshua. 2017. The representational geometry of word meanings acquired by neural machine translation models. Machine Translation 31(1–2):3–18. https://doi.org/10.1007/s10590-017-9194-2.
Hill, Felix, Cho, Kyunghyun, Jean, Sébastien, Devin, Coline, and Bengio, Yoshua. 2014. Embedding word similarity with neural machine translation. Ithaca, NY: Cornell University, abs/1412.6448. http://arxiv.org/abs/1412.6448.
Hirasawa, Tosho, Yamagishi, Hayahide, Matsumura, Yukio, and Komachi, Mamoru. 2019. Multimodal machine translation with embedding prediction. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop. Association for Computational Linguistics, Minneapolis, MN, pages 86–91. www.aclweb.org/anthology/N19-3012.
Hirschmann, Fabian, Nam, Jinseok, and Fürnkranz, Johannes. 2016. What makes word-level neural machine translation hard: A case study on English-German translation. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers. The COLING 2016 Organizing Committee, Osaka, pages 3199–3208. http://aclweb.org/anthology/C16-1301.
Hoang, Cong Duy Vu, Haffari, Gholamreza, and Cohn, Trevor. 2017. Towards decoding as continuous optimisation in neural machine translation. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Copenhagen, pages 146–156. http://aclweb.org/anthology/D17-1014.
Hoang, Hieu, Dwojak, Tomasz, Krislauks, Rihards, Torregrosa, Daniel, and Heafield, Kenneth. 2018a. Fast neural machine translation implementation. In Proceedings of the 2nd Workshop on Neural Machine Translation and Generation. Association for Computational Linguistics, Melbourne, pages 116–121. http://aclweb.org/anthology/W18-2714.
Hoang, Vu Cong Duy, Koehn, Philipp, Haffari, Gholamreza, and Cohn, Trevor. 2018b. Iterative back-translation for neural machine translation. In Proceedings of the 2nd Workshop on Neural Machine Translation and Generation. Association for Computational Linguistics, Melbourne, pages 18–24. http://aclweb.org/anthology/W18-2703.
Hokamp, Chris and Liu, Qun. 2017. Lexically constrained decoding for sequence generation using grid beam search. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. Volume 1: Long Papers. Association for Computational Linguistics, Vancouver, BC, pages 1535–1546. http://aclweb.org/anthology/P17-1141.
Hornik, Kurt, Stinchcombe, Maxwell, and White, Halbert. 1989. Multilayer feedforward networks are universal approximators. Neural Networks 2:359–366.
Hoshen, Yedid and Wolf, Lior. 2018. Non-adversarial unsupervised word translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, pages 469–478. www.aclweb.org/anthology/D18-1043.
Hu, Baotian, Tu, Zhaopeng, Lu, Zhengdong, Li, Hang, and Chen, Qingcai. 2015a. Context-dependent translation selection using convolutional neural network. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing. Volume 2: Short Papers. Association for Computational Linguistics, Beijing, pages 536–541. www.aclweb.org/anthology/P15-2088.
Hu, J. Edward, Khayrallah, Huda, Culkin, Ryan, Xia, Patrick, Chen, Tongfei, Post, Matt, and Van Durme, Benjamin. 2019. Improved lexically constrained decoding for translation and monolingual rewriting. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long and Short Papers. Association for Computational Linguistics, Minneapolis, MN, pages 839–850. www.aclweb.org/anthology/N19-1090.
Hu, Xiaoguang, Li, Wei, Lan, Xiang, Wu, Hua, and Wang, Haifeng. 2015b. Improved beam search with constrained softmax for NMT. In Machine Translation Summit XV. Miami, FL, pages 297–309. www.mt-archive.info/15/MTS-2015-Hu.pdf.
Huang, Jiaji, Qiu, Qiang, and Church, Kenneth. 2019. Hubless nearest neighbor search for bilingual lexicon induction. In Proceedings of the 57th Conference of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, pages 4072–4080. www.aclweb.org/anthology/P19-1399.
Huang, Liang, Zhao, Kai, and Ma, Mingbo. 2017. When to finish? Optimal beam search for neural text generation (modulo beam size). In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Copenhagen, pages 2134–2139. https://doi.org/10.18653/v1/D17-1227.
Huck, Matthias, Riess, Simon, and Fraser, Alexander. 2017. Target-side word segmentation strategies for neural machine translation. In Proceedings of the Second Conference on Machine Translation. Volume 1: Research Paper. Association for Computational Linguistics, Copenhagen, pages 56–67. www.aclweb.org/anthology/W17-4706.
Iglesias, Gonzalo, Tambellini, William, de Gispert, Adrià, Hasler, Eva, and Byrne, Bill. 2018. Accelerating NMT batched beam decoding with LMBR posteriors for deployment. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 3: Industry Papers. Association for Computational Linguistics, New Orleans, LA, pages 106–113. http://aclweb.org/anthology/N18-3013.
Imamura, Kenji, Fujita, Atsushi, and Sumita, Eiichiro. 2018. Enhancement of encoder and attention using target monolingual corpora in neural machine translation. In Proceedings of the 2nd Workshop on Neural Machine Translation and Generation. Association for Computational Linguistics, Melbourne, pages 55–63. http://aclweb.org/anthology/W18-2707.
Imamura, Kenji and Sumita, Eiichiro. 2018. NICT self-training approach to neural machine translation at NMT-2018. In Proceedings of the 2nd Workshop on Neural Machine Translation and Generation. Association for Computational Linguistics, Melbourne, pages 110–115. http://aclweb.org/anthology/W18-2713.
Irvine, Ann, Morgan, John, Carpuat, Marine, Daume III, Hal, and Munteanu, Dragos. 2013. Measuring machine translation errors in new domains. In Transactions of the Association for Computational Linguistics (TACL). 1, pages 429–440. www.transacl.org/wp-content/uploads/2013/10/paperno35.pdf.
Isabelle, Pierre, Cherry, Colin, and Foster, George. 2017. A challenge set approach to evaluating machine translation. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Copenhagen, pages 2476–2486. http://aclweb.org/anthology/D17-1262.
Jean, Sébastien, Cho, Kyunghyun, Memisevic, Roland, and Bengio, Yoshua. 2015. On using very large target vocabulary for neural machine translation. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing. Volume 1: Long Papers. Association for Computational Linguistics, Beijing, pages 1–10. www.aclweb.org/anthology/P15-1001.
Johnson, Melvin, Schuster, Mike, Le, Quoc, Krikun, Maxim, Wu, Yonghui, Chen, Zhifeng, Thorat, Nikhil, Viegas, Fernanda, Wattenberg, Martin, Corrado, Greg, Hughes, Macduff, and Dean, Jeffrey. 2017. Google’s multilingual neural machine translation system: Enabling zero-shot translation. Transactions of the Association for Computational Linguistics 5:339–351. https://transacl.org/ojs/index.php/tacl/article/view/1081.
Joulin, Armand, Bojanowski, Piotr, Mikolov, Tomas, Jégou, Hervé, and Grave, Edouard. 2018. Loss in translation: Learning bilingual word mapping with a retrieval criterion. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, pages 2979–2984. www.aclweb.org/anthology/D18-1330.
Junczys-Dowmunt, Marcin. 2019. Microsoft Translator at WMT 2019: Towards large-scale document-level neural machine translation. In Proceedings of the Fourth Conference on Machine Translation. Shared Task Papers. Association for Computational Linguistics, Florence.
Junczys-Dowmunt, Marcin, Dwojak, Tomasz, and Hoang, Hieu. 2016. Is neural machine translation ready for deployment? A case study on 30 translation directions. In Proceedings of the International Workshop on Spoken Language Translation (IWSLT). Seattle, WA. http://workshop2016.iwslt.org/downloads/IWSLT_2016_paper_4.pdf.
Kalchbrenner, Nal and Blunsom, Phil. 2013. Recurrent continuous translation models. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Seattle, pages 1700–1709. www.aclweb.org/anthology/D13-1176.
Kanouchi, Shin, Sudoh, Katsuhito, and Komachi, Mamoru. 2016. Neural reordering model considering phrase translation and word alignment for phrase-based translation. In Proceedings of the 3rd Workshop on Asian Translation (WAT2016). The COLING 2016 Organizing Committee, Osaka, pages 94–103. aclweb.org/anthology/W16-4607.
Karpathy, Andrej, Johnson, Justin, and Li, Fei-Fei. 2016. Visualizing and understanding recurrent networks. In International Conference on Learning Representations (ICLR). San Juan, Puerto Rico. https://arxiv.org/pdf/1506.02078.Google Scholar
Khayrallah, Huda, Kumar, Gaurav, Duh, Kevin, Post, Matt, and Koehn, Philipp. 2017. Neural lattice search for domain adaptation in machine translation. In Proceedings of the Eighth International Joint Conference on Natural Language Processing. Volume 2: Short Papers. Asian Federation of Natural Language Processing, Taipei, pages 2025. www.aclweb.org/anthology/I17-2004.Google Scholar
Khayrallah, Huda, Thompson, Brian, Duh, Kevin, and Koehn, Philipp. 2018a. Regularized training objective for continued training for domain adaptation in neural machine translation. In Proceedings of the 2nd Workshop on Neural Machine Translation and Generation. Association for Computational Linguistics, Melbourne, pages 3644. http://aclweb.org/anthology/W18–2705.Google Scholar
Khayrallah, Huda, Thompson, Brian, Duh, Kevin, and Koehn, Philipp. 2018b. Regularized training objective for continued training for domain adaption in neural machine translation. In Proceedings of the Second Workshop on Neural Machine Translation and Generation. Association for Computational Linguistics. Melbourne.Google Scholar
Kikuchi, Yuta, Neubig, Graham, Sasano, Ryohei, Takamura, Hiroya, and Okumura, Manabu. 2016. Controlling output length in neural encoder-decoders. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Austin, TX, pages 1328–1338. https://aclweb.org/anthology/D16-1140.
Kim, Yoon, Jernite, Yacine, Sontag, David, and Rush, Alexander M. 2016. Character-aware neural language models. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence. AAAI Press, AAAI’16, Phoenix, AZ, pages 2741–2749. http://dl.acm.org/citation.cfm?id=3016100.3016285.
Kingma, Diederik P. and Ba, Jimmy. 2015. Adam: A method for stochastic optimization. Paper presented at the 3rd International Conference on Learning Representations, San Diego, CA. https://arxiv.org/pdf/1412.6980.pdf.
Klubička, Filip, Toral, Antonio, and Sánchez-Cartagena, Víctor M. 2017. Fine-grained human evaluation of neural versus phrase-based machine translation. The Prague Bulletin of Mathematical Linguistics 108:121–132. https://doi.org/10.1515/pralin-2017-0014.
Knowles, Rebecca and Koehn, Philipp. 2016. Neural interactive translation prediction. In Proceedings of the Conference of the Association for Machine Translation in the Americas (AMTA). Austin, TX.
Knowles, Rebecca and Koehn, Philipp. 2018. Context and copying in neural machine translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, pages 3034–3041. www.aclweb.org/anthology/D18-1339.
Knowles, Rebecca, Sanchez-Torron, Marina, and Koehn, Philipp. 2019. A user study of neural interactive translation prediction. Machine Translation 33(1):135–154. https://doi.org/10.1007/s10590-019-09235-8.
Kobus, Catherine, Crego, Josep, and Senellart, Jean. 2017. Domain control for neural machine translation. In Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017. INCOMA Ltd., Varna, Bulgaria, pages 372–378. https://doi.org/10.26615/978-954-452-049-6_049.
Kocmi, Tom and Bojar, Ondřej. 2017. Curriculum learning and minibatch bucketing in neural machine translation. In Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017. INCOMA Ltd., Varna, Bulgaria, pages 379–386. https://doi.org/10.26615/978-954-452-049-6_050.
Koehn, Philipp. 2010. Statistical Machine Translation. Cambridge: Cambridge University Press.
Koehn, Philipp, Khayrallah, Huda, Heafield, Kenneth, and Forcada, Mikel L. 2018. Findings of the WMT 2018 shared task on parallel corpus filtering. In Proceedings of the Third Conference on Machine Translation. Association for Computational Linguistics, Belgium, pages 739–752. www.aclweb.org/anthology/W18-64081.
Koehn, Philipp and Knowles, Rebecca. 2017. Six challenges for neural machine translation. In Proceedings of the First Workshop on Neural Machine Translation. Association for Computational Linguistics, Vancouver, BC, pages 28–39. www.aclweb.org/anthology/W17-3204.
Koehn, Philipp and Monz, Christof. 2005. Shared task: Statistical machine translation between European languages. In Proceedings of the ACL Workshop on Building and Using Parallel Texts. Association for Computational Linguistics, Ann Arbor, MI, pages 119–124. www.aclweb.org/anthology/W/W05/W05-0820.
Kothur, Sachith Sri Ram, Knowles, Rebecca, and Koehn, Philipp. 2018a. Document-level adaptation for neural machine translation. In Proceedings of the Second Workshop on Neural Machine Translation and Generation. Association for Computational Linguistics, Melbourne.
Kothur, Sachith Sri Ram, Knowles, Rebecca, and Koehn, Philipp. 2018b. Document-level adaptation for neural machine translation. In Proceedings of the 2nd Workshop on Neural Machine Translation and Generation. Association for Computational Linguistics, Melbourne, pages 64–73. http://aclweb.org/anthology/W18-2708.
Kudo, Taku. 2018. Subword regularization: Improving neural network translation models with multiple subword candidates. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Volume 1: Long Papers. Association for Computational Linguistics, Melbourne, pages 66–75. http://aclweb.org/anthology/P18-1007.
Kudo, Taku and Richardson, John. 2018. SentencePiece: A simple and language independent subword tokenizer and detokenizer for neural text processing. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations. Association for Computational Linguistics, Brussels, pages 66–71. www.aclweb.org/anthology/D18-2012.
Kumar, Gaurav, Foster, George, Cherry, Colin, and Krikun, Maxim. 2019. Reinforcement learning based curriculum optimization for neural machine translation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long and Short Papers. Association for Computational Linguistics, Minneapolis, MN, pages 2054–2061. www.aclweb.org/anthology/N19-1208.
Lakew, Surafel Melaku, Cettolo, Mauro, and Federico, Marcello. 2018a. A comparison of transformer and recurrent neural networks on multilingual neural machine translation. In Proceedings of the 27th International Conference on Computational Linguistics. Association for Computational Linguistics, Santa Fe, NM, pages 641–652. www.aclweb.org/anthology/C18-1054.
Lakew, Surafel Melaku, Erofeeva, Aliia, and Federico, Marcello. 2018b. Neural machine translation into language varieties. In Proceedings of the Third Conference on Machine Translation: Research Papers. Association for Computational Linguistics, Belgium, pages 156–164. www.aclweb.org/anthology/W18-6316.
Lakew, Surafel Melaku, Erofeeva, Aliia, Negri, Matteo, Federico, Marcello, and Turchi, Marco. 2018c. Transfer learning in multilingual neural machine translation with dynamic vocabulary. In Proceedings of the International Workshop on Spoken Language Translation (IWSLT). Bruges, Belgium. https://arxiv.org/pdf/1811.01137.pdf.
Lample, Guillaume, Conneau, Alexis, Denoyer, Ludovic, and Ranzato, Marc’Aurelio. 2018a. Unsupervised machine translation using monolingual corpora only. In International Conference on Learning Representations. Vancouver, BC. https://openreview.net/forum?id=rkYTTf-AZ.
Lample, Guillaume, Ott, Myle, Conneau, Alexis, Denoyer, Ludovic, and Ranzato, Marc’Aurelio. 2018b. Phrase-based & neural unsupervised machine translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, pages 5039–5049. www.aclweb.org/anthology/D18-1549.
Läubli, Samuel, Sennrich, Rico, and Volk, Martin. 2018. Has machine translation achieved human parity? A case for document-level evaluation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, pages 4791–4796. www.aclweb.org/anthology/D18-1512.
LeCun, Yann, Boser, B., Denker, J., Henderson, D., Howard, R., Hubbard, W., and Jackel, L. 1989. Backpropagation applied to handwritten zip code recognition. Neural Computation 1(4):541–551.
Lee, Jaesong, Shin, Joong-Hwi, and Kim, Jun-Seok. 2017. Interactive visualization and manipulation of attention-based neural machine translation. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing: System Demonstrations. Association for Computational Linguistics, Copenhagen, pages 121–126. http://aclweb.org/anthology/D17-2021.
Lei Ba, J., Kiros, J. R., and Hinton, Geoffrey. 2016. Layer normalization. Ithaca, NY: Cornell University, ArXiv e-prints.
Lewis, William and Eetemadi, Sauleh. 2013. Dramatically reducing training data size through vocabulary saturation. In Proceedings of the Eighth Workshop on Statistical Machine Translation. Association for Computational Linguistics, Sofia, Bulgaria, pages 281–291. www.aclweb.org/anthology/W13-2235.
Li, Guanlin, Liu, Lemao, Li, Xintong, Zhu, Conghui, Zhao, Tiejun, and Shi, Shuming. 2019. Understanding and improving hidden representations for neural machine translation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long and Short Papers. Association for Computational Linguistics, Minneapolis, MN, pages 466–477. www.aclweb.org/anthology/N19-1046.
Li, Jiwei and Jurafsky, Dan. 2016. Mutual information and diverse decoding improve neural machine translation. Ithaca, NY: Cornell University, abs/1601.00372. http://arxiv.org/abs/1601.00372.
Li, Jiwei, Monroe, Will, and Jurafsky, Dan. 2016. A simple, fast diverse decoding algorithm for neural generation. Ithaca, NY: Cornell University, abs/1611.08562. http://arxiv.org/abs/1611.08562.
Li, Peng, Liu, Yang, Sun, Maosong, Izuha, Tatsuya, and Zhang, Dakun. 2014. A neural reordering model for phrase-based translation. In Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers. Dublin City University and Association for Computational Linguistics, Dublin, pages 1897–1907. www.aclweb.org/anthology/C14-1179.
Li, Xiaoqing, Zhang, Jiajun, and Zong, Chengqing. 2018. One sentence one model for neural machine translation. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). European Language Resources Association (ELRA), Miyazaki, Japan.
Linzen, Tal, Dupoux, Emmanuel, and Goldberg, Yoav. 2016. Assessing the ability of LSTMs to learn syntax-sensitive dependencies. Transactions of the Association for Computational Linguistics 4:521–535. https://doi.org/10.1162/tacl_a_00115.
Lison, P. and Tiedemann, Jörg. 2016. OpenSubtitles2016: Extracting large parallel corpora from movie and TV subtitles. In Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016). European Language Resources Association (ELRA), Portorož, Slovenia.
Liu, Lemao, Utiyama, Masao, Finch, Andrew, and Sumita, Eiichiro. 2016a. Agreement on target-bidirectional neural machine translation. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, San Diego, CA, pages 411–416. www.aclweb.org/anthology/N16-1046.
Liu, Lemao, Utiyama, Masao, Finch, Andrew, and Sumita, Eiichiro. 2016b. Neural machine translation with supervised attention. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers. The COLING 2016 Organizing Committee, Osaka, pages 3093–3102. http://aclweb.org/anthology/C16-1291.
Lu, Shixiang, Chen, Zhenbiao, and Xu, Bo. 2014. Learning new semi-supervised deep auto-encoder features for statistical machine translation. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics. Volume 1: Long Papers. Association for Computational Linguistics, Baltimore, MD, pages 122–132. www.aclweb.org/anthology/P14-1012.
Lu, Yichao, Keung, Phillip, Ladhak, Faisal, Bhardwaj, Vikas, Zhang, Shaonan, and Sun, Jason. 2018. A neural interlingua for multilingual machine translation. In Proceedings of the Third Conference on Machine Translation: Research Papers. Association for Computational Linguistics, Belgium, pages 84–92. www.aclweb.org/anthology/W18-6309.
Luong, Minh-Thang and Manning, Christopher. 2015. Stanford neural machine translation systems for spoken language domains. In Proceedings of the International Workshop on Spoken Language Translation (IWSLT). Da Nang, Vietnam, pages 76–79. www.mt-archive.info/15/IWSLT-2015-luong.pdf.
Luong, Thang, Kayser, Michael, and Manning, Christopher D. 2015a. Deep neural language models for machine translation. In Proceedings of the Nineteenth Conference on Computational Natural Language Learning. Association for Computational Linguistics, Beijing, pages 305–309. www.aclweb.org/anthology/K15-1031.
Luong, Thang, Pham, Hieu, and Manning, Christopher D. 2015b. Effective approaches to attention-based neural machine translation. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Lisbon, pages 1412–1421. http://aclweb.org/anthology/D15-1166.
Luong, Thang, Sutskever, Ilya, Le, Quoc, Vinyals, Oriol, and Zaremba, Wojciech. 2015c. Addressing the rare word problem in neural machine translation. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing. Volume 1: Long Papers. Association for Computational Linguistics, Beijing, pages 11–19. www.aclweb.org/anthology/P15-1002.
Luong, Thang, Sutskever, Ilya, Le, Quoc V., Vinyals, Oriol, and Zaremba, Wojciech. 2015d. Addressing the rare word problem in neural machine translation. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing. Volume 1: Long Papers. Association for Computational Linguistics, Beijing, pages 11–19. www.aclweb.org/anthology/P15-1002.
Ma, Chunpeng, Tamura, Akihiro, Utiyama, Masao, Sumita, Eiichiro, and Zhao, Tiejun. 2019a. Improving neural machine translation with neural syntactic distance. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long and Short Papers. Association for Computational Linguistics, Minneapolis, MN, pages 2032–2037. www.aclweb.org/anthology/N19-1205.
Ma, Mingbo, Huang, Liang, Xiong, Hao, Zheng, Renjie, Liu, Kaibo, Zheng, Baigong, Zhang, Chuanqiang, He, Zhongjun, Liu, Hairong, Li, Xing, Wu, Hua, and Wang, Haifeng. 2019b. STACL: Simultaneous translation with implicit anticipation and controllable latency using prefix-to-prefix framework. In Proceedings of the 57th Conference of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, pages 3025–3036. www.aclweb.org/anthology/P19-1289.
Ma, Mingbo, Zheng, Renjie, and Huang, Liang. 2019c. Learning to stop in structured prediction for neural machine translation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long and Short Papers. Association for Computational Linguistics, Minneapolis, MN, pages 1884–1889. www.aclweb.org/anthology/N19-1187.
Ma, Xutai, Li, Ke, and Koehn, Philipp. 2018. An analysis of source context dependency in neural machine translation. In Proceedings of the 21st Annual Conference of the European Association for Machine Translation. Alicante.
Malaviya, Chaitanya, Neubig, Graham, and Littell, Patrick. 2017. Learning language representations for typology prediction. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Copenhagen, pages 2529–2535. http://aclweb.org/anthology/D17-1268.
Manning, Christopher D. 2015. Computational linguistics and deep learning. Computational Linguistics 41(4):701–707.
Marie, Benjamin and Fujita, Atsushi. 2019. Unsupervised joint training of bilingual word embeddings. In Proceedings of the 57th Conference of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, pages 3224–3230. www.aclweb.org/anthology/P19-1312.
Martindale, Marianna J. and Carpuat, Marine. 2018. Fluency over adequacy: A pilot study in measuring user trust in imperfect MT. In Annual Meeting of the Association for Machine Translation in the Americas (AMTA). Boston. https://arxiv.org/pdf/1802.06041.pdf.
Maruf, Sameen, Martins, André F. T., and Haffari, Gholamreza. 2018. Contextual neural model for translating bilingual multi-speaker conversations. In Proceedings of the Third Conference on Machine Translation: Research Papers. Association for Computational Linguistics, Belgium, pages 101–112. www.aclweb.org/anthology/W18-6311.
Maruf, Sameen, Martins, André F. T., and Haffari, Gholamreza. 2019. Selective attention for context-aware neural machine translation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long and Short Papers. Association for Computational Linguistics, Minneapolis, MN, pages 3092–3102. www.aclweb.org/anthology/N19-1313.
Marvin, Rebecca and Koehn, Philipp. 2018. Exploring word sense disambiguation abilities of neural machine translation systems. In Annual Meeting of the Association for Machine Translation in the Americas (AMTA). Boston, MA.
Mattoni, Giulia, Nagle, Pat, Collantes, Carlos, and Shterionov, Dimitar. 2017. Zero-shot translation for low-resource Indian languages. In Machine Translation Summit XVI. Nagoya, Japan.
McCulloch, S. and Pitts, W. 1943. A logical calculus of the ideas immanent in nervous activity. The Bulletin of Mathematical Biophysics 5(4):115–133.
Meng, Fandong, Lu, Zhengdong, Li, Hang, and Liu, Qun. 2016. Interactive attention for neural machine translation. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers. The COLING 2016 Organizing Committee, Osaka, pages 2174–2185. http://aclweb.org/anthology/C16-1205.
Mi, Haitao, Wang, Zhiguo, and Ittycheriah, Abe. 2016. Vocabulary manipulation for neural machine translation. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Volume 2: Short Papers. Association for Computational Linguistics, Berlin, pages 124–129. http://anthology.aclweb.org/P16-2021.
Miceli Barone, Antonio Valerio. 2016. Towards cross-lingual distributed representations without parallel text trained with adversarial autoencoders. In Proceedings of the 1st Workshop on Representation Learning for NLP. Association for Computational Linguistics, Berlin, pages 121–126. https://doi.org/10.18653/v1/W16-1614.
Miceli Barone, Antonio Valerio, Haddow, Barry, Germann, Ulrich, and Sennrich, Rico. 2017a. Regularization techniques for fine-tuning in neural machine translation. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Copenhagen, pages 1490–1495. http://aclweb.org/anthology/D17-1156.
Miceli Barone, Antonio Valerio, Helcl, Jindřich, Sennrich, Rico, Haddow, Barry, and Birch, Alexandra. 2017b. Deep architectures for neural machine translation. In Proceedings of the Second Conference on Machine Translation. Volume 1: Research Papers. Association for Computational Linguistics, Copenhagen, pages 99–107. www.aclweb.org/anthology/W17-4710.
Michel, Paul and Neubig, Graham. 2018. Extreme adaptation for personalized neural machine translation. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Volume 2: Short Papers. Association for Computational Linguistics, Melbourne.
Miculicich, Lesly, Ram, Dhananjay, Pappas, Nikolaos, and Henderson, James. 2018. Document-level neural machine translation with hierarchical attention networks. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, pages 2947–2954. www.aclweb.org/anthology/D18-1325.
Mikolov, Tomas. 2012. Statistical language models based on neural networks. PhD thesis, Brno University of Technology. www.fit.vutbr.cz/~imikolov/rnnlm/thesis.pdf.
Mikolov, Tomas, Chen, Kai, Corrado, Greg, and Dean, Jeffrey. 2013a. Efficient estimation of word representations in vector space. Ithaca, NY: Cornell University, abs/1301.3781. http://arxiv.org/abs/1301.3781.
Mikolov, Tomas, Le, Quoc V., and Sutskever, Ilya. 2013b. Exploiting similarities among languages for machine translation. Ithaca, NY: Cornell University, abs/1309.4168. http://arxiv.org/abs/1309.4168.
Mikolov, Tomas, Sutskever, Ilya, Chen, Kai, Corrado, Greg, and Dean, Jeffrey. 2013c. Distributed representations of words and phrases and their compositionality. Ithaca, NY: Cornell University, abs/1310.4546. http://arxiv.org/abs/1310.4546.
Mikolov, Tomas, Yih, Wen-tau, and Zweig, Geoffrey. 2013d. Linguistic regularities in continuous space word representations. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, Atlanta, GA, pages 746–751. www.aclweb.org/anthology/N13-1090.
Minsky, Marvin and Papert, Seymour. 1969. Perceptrons. An Introduction to Computational Geometry. MIT Press, Cambridge, MA.
Mohiuddin, Tasnim and Joty, Shafiq. 2019. Revisiting adversarial autoencoder for unsupervised word translation with cycle consistency and improved training. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long and Short Papers. Association for Computational Linguistics, Minneapolis, MN, pages 3857–3867. www.aclweb.org/anthology/N19-1386.
Morishita, Makoto, Suzuki, Jun, and Nagata, Masaaki. 2018. Improving neural machine translation by incorporating hierarchical subword features. In Proceedings of the 27th International Conference on Computational Linguistics. Association for Computational Linguistics, Santa Fe, NM, pages 618–629. www.aclweb.org/anthology/C18-1052.
Mukherjee, Tanmoy, Yamada, Makoto, and Hospedales, Timothy. 2018. Learning unsupervised word translations without adversaries. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, pages 627–632. www.aclweb.org/anthology/D18-1063.
Müller, Mathias, Rios, Annette, Voita, Elena, and Sennrich, Rico. 2018. A large-scale test set for the evaluation of context-aware pronoun translation in neural machine translation. In Proceedings of the Third Conference on Machine Translation: Research Papers. Association for Computational Linguistics, Belgium, pages 61–72. www.aclweb.org/anthology/W18-6307.
Murray, Kenton and Chiang, David. 2018. Correcting length bias in neural machine translation. In Proceedings of the Third Conference on Machine Translation: Research Papers. Association for Computational Linguistics, Belgium, pages 212–223. www.aclweb.org/anthology/W18-6322.
Murthy, Rudra, Kunchukuttan, Anoop, and Bhattacharyya, Pushpak. 2019. Addressing word-order divergence in multilingual neural machine translation for extremely low resource languages. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long and Short Papers. Association for Computational Linguistics, Minneapolis, MN, pages 3868–3873. www.aclweb.org/anthology/N19-1387.
Nadejde, Maria, Reddy, Siva, Sennrich, Rico, Dwojak, Tomasz, Junczys-Dowmunt, Marcin, Koehn, Philipp, and Birch, Alexandra. 2017. Predicting target language CCG supertags improves neural machine translation. In Proceedings of the Second Conference on Machine Translation. Volume 1: Research Papers. Association for Computational Linguistics, Copenhagen, pages 68–79. www.aclweb.org/anthology/W17-4707.
Nakashole, Ndapa. 2018. Norma: Neighborhood sensitive maps for multilingual word embeddings. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, pages 512–522. www.aclweb.org/anthology/D18-1047.
Nakashole, Ndapa and Flauger, Raphael. 2018. Characterizing departures from linearity in word translation. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Volume 2: Short Papers. Association for Computational Linguistics, Melbourne, pages 221–227. http://aclweb.org/anthology/P18-2036.
Neubig, Graham. 2016. Lexicons and minimum risk training for neural machine translation: NAIST-CMU at WAT2016. In Proceedings of the 3rd Workshop on Asian Translation (WAT2016). The COLING 2016 Organizing Committee, Osaka, pages 119–125. http://aclweb.org/anthology/W16-4610.
Neubig, Graham, Dou, Zi-Yi, Hu, Junjie, Michel, Paul, Pruthi, Danish, and Wang, Xinyi. 2019. compare-mt: A tool for holistic comparison of language generation systems. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations). Association for Computational Linguistics, Minneapolis, MN, pages 35–41. www.aclweb.org/anthology/N19-4007.
Neubig, Graham and Hu, Junjie. 2018. Rapid adaptation of neural machine translation to new languages. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, pages 875–880. www.aclweb.org/anthology/D18-1103.
New navy device learns by doing; psychologist shows embryo of computer designed to read and grow wiser. 1958. New York Times. www.nytimes.com/1958/07/08/archives/new-navy-device-learns-by-doing-psychologist-shows-embryo-of.html.
Nguyen, Toan Q. and Chiang, David. 2017. Transfer learning across low-resource, related languages for neural machine translation. In Proceedings of the Eighth International Joint Conference on Natural Language Processing. Volume 2: Short Papers. Asian Federation of Natural Language Processing, Taipei, pages 296–301. www.aclweb.org/anthology/I17-2050.
Niehues, Jan and Cho, Eunah. 2017. Exploiting linguistic resources for neural machine translation using multi-task learning. In Proceedings of the Second Conference on Machine Translation. Volume 1: Research Papers. Association for Computational Linguistics, Copenhagen, pages 80–89. http://www.aclweb.org/anthology/W17-4708.
Niehues, Jan, Cho, Eunah, Ha, Thanh-Le, and Waibel, Alex. 2016. Pre-translation for neural machine translation. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers. The COLING 2016 Organizing Committee, Osaka, Japan, pages 1828–1836. http://aclweb.org/anthology/C16-1172.
Niehues, Jan, Cho, Eunah, Ha, Thanh-Le, and Waibel, Alex. 2017. Analyzing neural MT search and model performance. In Proceedings of the First Workshop on Neural Machine Translation. Association for Computational Linguistics, Vancouver, BC, pages 11–17. www.aclweb.org/anthology/W17-3202.
Nikolov, Nikola, Hu, Yuhuang, Tan, Mi Xue, and Hahnloser, Richard H.R. 2018. Character-level Chinese-English translation through ASCII encoding. In Proceedings of the Third Conference on Machine Translation: Research Papers. Association for Computational Linguistics, Belgium, pages 10–16. www.aclweb.org/anthology/W18-6302.
Nishimura, Yuta, Sudoh, Katsuhito, Neubig, Graham, and Nakamura, Satoshi. 2018a. Multi-source neural machine translation with data augmentation. In Proceedings of the International Workshop on Spoken Language Translation (IWSLT). Bruges, Belgium. https://arxiv.org/pdf/1810.06826.pdf.
Nishimura, Yuta, Sudoh, Katsuhito, Neubig, Graham, and Nakamura, Satoshi. 2018b. Multi-source neural machine translation with missing data. In Proceedings of the 2nd Workshop on Neural Machine Translation and Generation. Association for Computational Linguistics, Melbourne, pages 92–99. http://aclweb.org/anthology/W18-2711.
Niu, Xing, Denkowski, Michael, and Carpuat, Marine. 2018. Bi-directional neural machine translation with synthetic parallel data. In Proceedings of the 2nd Workshop on Neural Machine Translation and Generation. Association for Computational Linguistics, Melbourne, pages 84–91. http://aclweb.org/anthology/W18-2710.
Niu, Xing, Xu, Weijia, and Carpuat, Marine. 2019. Bi-directional differentiable input reconstruction for low-resource neural machine translation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long and Short Papers. Association for Computational Linguistics, Minneapolis, MN, pages 442–448. www.aclweb.org/anthology/N19-1043.
Ott, Myle, Auli, Michael, Grangier, David, and Ranzato, Marc’Aurelio. 2018a. Analyzing uncertainty in neural machine translation. In Jennifer Dy and Andreas Krause, editors, Proceedings of the 35th International Conference on Machine Learning. Volume 80: Proceedings of Machine Learning Research. PMLR, Stockholmsmässan, Stockholm, pages 3956–3965. http://proceedings.mlr.press/v80/ott18a/ott18a.pdf.
Ott, Myle, Auli, Michael, Grangier, David, and Ranzato, Marc’Aurelio. 2018b. Analyzing uncertainty in neural machine translation. Ithaca, NY: Cornell University, abs/1803.00047. http://arxiv.org/abs/1803.00047.
Ott, Myle, Edunov, Sergey, Grangier, David, and Auli, Michael. 2018c. Scaling neural machine translation. In Proceedings of the Third Conference on Machine Translation: Research Papers. Association for Computational Linguistics, Belgium, pages 1–9. www.aclweb.org/anthology/W18-6301.
Parida, Shantipriya and Bojar, Ondřej. 2018. Translating short segments with NMT: A case study in English-to-Hindi. In Proceedings of the 21st Annual Conference of the European Association for Machine Translation. Alicante. https://rua.ua.es/dspace/bitstream/10045/76083/1/EAMT2018-Proceedings_25.pdf.
Pascanu, Razvan, Mikolov, Tomas, and Bengio, Yoshua. 2013. On the difficulty of training recurrent neural networks. In Proceedings of the 30th International Conference on Machine Learning, ICML. Atlanta, GA, pages 1310–1318. http://proceedings.mlr.press/v28/pascanu13.pdf.
Patra, Barun, Moniz, Joel Ruben Antony, Garg, Sarthak, Gormley, Matthew R., and Neubig, Graham. 2019. Bilingual lexicon induction with semi-supervision in non-isometric embedding spaces. In Proceedings of the 57th Conference of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, pages 184–193. www.aclweb.org/anthology/P19-1018.
Pennington, Jeffrey, Socher, Richard, and Manning, Christopher. 2014. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, Doha, Qatar, pages 1532–1543. www.aclweb.org/anthology/D14-1162.
Peris, Álvaro and Casacuberta, Francisco. 2019. A neural, interactive-predictive system for multimodal sequence to sequence tasks. In Proceedings of the 57th Conference of the Association for Computational Linguistics: System Demonstrations. Association for Computational Linguistics, Florence, pages 81–86. www.aclweb.org/anthology/P19-3014.
Peris, Álvaro, Cebrián, Luis, and Casacuberta, Francisco. 2017a. Online learning for neural machine translation post-editing. Ithaca, NY: Cornell University, abs/1706.03196. http://arxiv.org/abs/1706.03196.
Peris, Álvaro, Domingo, Miguel, and Casacuberta, Francisco. 2017b. Interactive neural machine translation. Computer Speech & Language 45(C):201–220. https://doi.org/10.1016/j.csl.2016.12.003.
Peters, Matthew, Neumann, Mark, Iyyer, Mohit, Gardner, Matt, Clark, Christopher, Lee, Kenton, and Zettlemoyer, Luke. 2018. Deep contextualized word representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long Papers. Association for Computational Linguistics, New Orleans, LA, pages 2227–2237. http://aclweb.org/anthology/N18-1202.
Platanios, Emmanouil Antonios, Sachan, Mrinmaya, Neubig, Graham, and Mitchell, Tom. 2018. Contextual parameter generation for universal neural machine translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, pages 425–435. www.aclweb.org/anthology/D18-1039.
Platanios, Emmanouil Antonios, Stretcu, Otilia, Neubig, Graham, Poczos, Barnabas, and Mitchell, Tom. 2019. Competence-based curriculum learning for neural machine translation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long and Short Papers. Association for Computational Linguistics, Minneapolis, MN, pages 1162–1172. www.aclweb.org/anthology/N19-1119.
Plitt, Mirko and Masselot, Francois. 2010. A productivity test of statistical machine translation post-editing in a typical localisation context. The Prague Bulletin of Mathematical Linguistics 94:7–16. http://ufal.mff.cuni.cz/pbml/93/art-plitt-masselot.pdf.
Poliak, Adam, Belinkov, Yonatan, Glass, James, and Van Durme, Benjamin. 2018. On the evaluation of semantic phenomena in neural machine translation using natural language inference. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 2: Short Papers. Association for Computational Linguistics, New Orleans, LA, pages 513–523. http://aclweb.org/anthology/N18-2082.
Popović, Maja. 2017. Comparing Language Related Issues for NMT and PBMT between German and English. The Prague Bulletin of Mathematical Linguistics 108:209–220. https://doi.org/10.1515/pralin-2017-0021.
Post, Matt and Vilar, David. 2018. Fast lexically constrained decoding with dynamic beam allocation for neural machine translation. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long Papers. Association for Computational Linguistics, New Orleans, LA, pages 1314–1324. http://aclweb.org/anthology/N18-1119.
Pu, Xiao, Pappas, Nikolaos, and Popescu-Belis, Andrei. 2017. Sense-aware statistical machine translation using adaptive context-dependent clustering. In Proceedings of the Second Conference on Machine Translation. Volume 1: Research Papers. Association for Computational Linguistics, Copenhagen, pages 1–10. www.aclweb.org/anthology/W17-4701.
Qi, Ye, Sachan, Devendra, Felix, Matthieu, Padmanabhan, Sarguna, and Neubig, Graham. 2018. When and why are pre-trained word embeddings useful for neural machine translation? In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 2: Short Papers. Association for Computational Linguistics, New Orleans, LA, pages 529–535. http://aclweb.org/anthology/N18-2084.
Raganato, Alessandro and Tiedemann, Jörg. 2018. An analysis of encoder representations in transformer-based machine translation. In Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP. Association for Computational Linguistics, Brussels, pages 287–297. www.aclweb.org/anthology/W18-5431.
Rarrick, Spencer, Quirk, Chris, and Lewis, Will. 2011. MT detection in web-scraped parallel corpora. In Proceedings of the 13th Machine Translation Summit (MT Summit XIII). International Association for Machine Translation, Xiamen, China, pages 422–430. www.mt-archive.info/MTS-2011-Rarrick.pdf.
Ren, Shuo, Chen, Wenhu, Liu, Shujie, Li, Mu, Zhou, Ming, and Ma, Shuai. 2018. Triangular architecture for rare language translation. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Volume 1: Long Papers. Association for Computational Linguistics, Melbourne, pages 56–65. http://aclweb.org/anthology/P18-1006.
Ren, Shuo, Zhang, Zhirui, Liu, Shujie, Zhou, Ming, and Ma, Shuai. 2019. Unsupervised neural machine translation with SMT as posterior regularization. In Proceedings of the AAAI Conference on Artificial Intelligence. Honolulu, HI, pages 241–248. https://doi.org/10.1609/aaai.v33i01.3301241.
Rios, Annette, Mascarell, Laura, and Sennrich, Rico. 2017. Improving word sense disambiguation in neural machine translation with sense embeddings. In Proceedings of the Second Conference on Machine Translation. Volume 1: Research Papers. Association for Computational Linguistics, Copenhagen, pages 11–19. www.aclweb.org/anthology/W17-4702.
Rosenblatt, Frank. 1957. The perceptron, a perceiving and recognizing automaton. Technical report, Buffalo, NY: Cornell Aeronautical Laboratory.
Ruder, Sebastian, Vulić, Ivan, and Søgaard, Anders. 2017. A survey of cross-lingual embedding models. Ithaca, NY: Cornell University, abs/1706.04902. http://arxiv.org/abs/1706.04902.
Ruiter, Dana, España-Bonet, Cristina, and van Genabith, Josef. 2019. Self-supervised neural machine translation. In Proceedings of the 57th Conference of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, pages 1828–1834. www.aclweb.org/anthology/P19-1178.
Rumelhart, David E., Hinton, Geoffrey E., and Williams, Ronald J. 1986. Learning internal representations by error propagation. Parallel Distributed Processing: Explorations in the Microstructure of Cognition 1:318–362.
Sachan, Devendra and Neubig, Graham. 2018. Parameter sharing methods for multilingual self-attentional translation models. In Proceedings of the Third Conference on Machine Translation: Research Papers. Association for Computational Linguistics, Belgium, pages 261–271. www.aclweb.org/anthology/W18-6327.
Sanchez-Torron, Marina and Koehn, Philipp. 2016. Machine translation quality and post-editor productivity. In Proceedings of the Conference of the Association for Machine Translation in the Americas (AMTA). Austin, TX.
Satija, Harsh and Pineau, Joelle. 2016. Simultaneous machine translation using deep reinforcement learning. In Abstraction in Reinforcement Learning (ICML Workshop). New York. http://docs.wixstatic.com/ugd/3195dc538b63de8e2644b782db920c55f74650.pdf.
Schuster, M. and Nakajima, K. 2012. Japanese and Korean voice search. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Toronto, pages 5149–5152. https://doi.org/10.1109/ICASSP.2012.6289079.
Schwarzenberg, Robert, Harbecke, David, Macketanz, Vivien, Avramidis, Eleftherios, and Möller, Sebastian. 2019. Train, sort, explain: Learning to diagnose translation models. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations). Association for Computational Linguistics, Minneapolis, MN, pages 29–34. https://www.aclweb.org/anthology/N19-4006.
Schwenk, Holger. 2007. Continuous space language models. Computer Speech and Language 3(21):492–518. https://wiki.inf.ed.ac.uk/twiki/pub/CSTR/ListenSemester2_2009_10/sdarticle.pdf.
Schwenk, Holger. 2010. Continuous-space language models for statistical machine translation. The Prague Bulletin of Mathematical Linguistics 93:137–146. http://ufal.mff.cuni.cz/pbml/93/art-schwenk.pdf.
Schwenk, Holger. 2012. Continuous space translation models for phrase-based statistical machine translation. In Proceedings of COLING 2012: Posters. The COLING 2012 Organizing Committee, Mumbai, pages 1071–1080. www.aclweb.org/anthology/C12-2104.
Schwenk, Holger. 2018. Filtering and mining parallel data in a joint multilingual space. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Volume 2: Short Papers. Association for Computational Linguistics, Melbourne, pages 228–234. http://aclweb.org/anthology/P18-2037.
Schwenk, Holger, Chaudhary, Vishrav, Sun, Shuo, Gong, Hongyu, and Guzmán, Francisco. 2019. WikiMatrix: Mining 135M parallel sentences in 1620 language pairs from Wikipedia. Ithaca, NY: Cornell University, abs/1907.05791. http://arxiv.org/abs/1907.05791.
Schwenk, Holger, Dechelotte, Daniel, and Gauvain, Jean-Luc. 2006. Continuous space language models for statistical machine translation. In Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions. Association for Computational Linguistics, Sydney, Australia, pages 723–730. www.aclweb.org/anthology/P/P06/P06-2093.
Schwenk, Holger and Douze, Matthijs. 2017. Learning joint multilingual sentence representations with neural machine translation. In Proceedings of the 2nd Workshop on Representation Learning for NLP. Association for Computational Linguistics, Vancouver, BC, pages 157–167. https://doi.org/10.18653/v1/W17-2619.
Schwenk, Holger, Rousseau, Anthony, and Attik, Mohammed. 2012. Large, pruned or continuous space language models on a GPU for statistical machine translation. In Proceedings of the NAACL-HLT 2012 Workshop: Will We Ever Really Replace the N-gram Model? On the Future of Language Modeling for HLT. Association for Computational Linguistics, Montréal, pages 11–19. www.aclweb.org/anthology/W12-2702.
Senellart, Jean, Zhang, Dakun, Wang, Bo, Klein, Guillaume, Ramatchandirin, Jean-Pierre, Crego, Josep, and Rush, Alexander. 2018. OpenNMT system description for WNMT 2018: 800 words/sec on a single-core CPU. In Proceedings of the 2nd Workshop on Neural Machine Translation and Generation. Association for Computational Linguistics, Melbourne, pages 122–128. http://aclweb.org/anthology/W18-2715.
Sennrich, Rico. 2017. How grammatical is character-level neural machine translation? Assessing MT quality with contrastive translation pairs. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics. Volume 2: Short Papers. Association for Computational Linguistics, Valencia, Spain, pages 376–382. www.aclweb.org/anthology/E17-2060.
Sennrich, Rico and Haddow, Barry. 2016. Linguistic input features improve neural machine translation. In Proceedings of the First Conference on Machine Translation. Association for Computational Linguistics, Berlin, pages 83–91. www.aclweb.org/anthology/W/W16/W16-2209.
Sennrich, Rico, Haddow, Barry, and Birch, Alexandra. 2016a. Controlling politeness in neural machine translation via side constraints. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, San Diego, CA, pages 35–40. www.aclweb.org/anthology/N16-1005.
Sennrich, Rico, Haddow, Barry, and Birch, Alexandra. 2016b. Edinburgh neural machine translation systems for WMT 16. In Proceedings of the First Conference on Machine Translation. Association for Computational Linguistics, Berlin, pages 371–376. www.aclweb.org/anthology/W/W16/W16-2323.
Sennrich, Rico, Haddow, Barry, and Birch, Alexandra. 2016c. Improving neural machine translation models with monolingual data. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Volume 1: Long Papers. Association for Computational Linguistics, Berlin, pages 86–96. www.aclweb.org/anthology/P16-1009.
Sennrich, Rico, Haddow, Barry, and Birch, Alexandra. 2016d. Neural machine translation of rare words with subword units. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Volume 1: Long Papers. Association for Computational Linguistics, Berlin, pages 1715–1725. www.aclweb.org/anthology/P16-1162.
Sennrich, Rico, Haddow, Barry, and Birch, Alexandra. 2016e. Neural machine translation of rare words with subword units. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Volume 1: Long Papers. Association for Computational Linguistics, Berlin, pages 1715–1725. www.aclweb.org/anthology/P16-1162.
Servan, Christophe, Crego, Josep Maria, and Senellart, Jean. 2016. Domain specialization: A post-training domain adaptation for neural machine translation. Ithaca, NY: Cornell University, abs/1612.06141. http://arxiv.org/abs/1612.06141.
Shao, Yutong, Sennrich, Rico, Webber, Bonnie, and Fancellu, Federico. 2018. Evaluating machine translation performance on Chinese idioms with a blacklist method. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). European Language Resources Association (ELRA), Miyazaki, Japan. https://arxiv.org/pdf/1711.07646.pdf.
Shareghi, Ehsan, Petri, Matthias, Haffari, Gholamreza, and Cohn, Trevor. 2016. Fast, small and exact: Infinite-order language modelling with compressed suffix trees. Transactions of the Association for Computational Linguistics 4:477–490. https://transacl.org/ojs/index.php/tacl/article/view/865.
Shen, Shiqi, Cheng, Yong, He, Zhongjun, He, Wei, Wu, Hua, Sun, Maosong, and Liu, Yang. 2016. Minimum risk training for neural machine translation. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Volume 1: Long Papers. Association for Computational Linguistics, Berlin, pages 1683–1692. www.aclweb.org/anthology/P16-1159.
Shi, Weijia, Chen, Muhao, Tian, Yingtao, and Chang, Kai-Wei. 2019. Learning bilingual word embeddings using lexical definitions. In Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019). Association for Computational Linguistics, Florence, pages 142–147. www.aclweb.org/anthology/W19-4316.
Shi, Xing and Knight, Kevin. 2017. Speeding up neural machine translation decoding by shrinking run-time vocabulary. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. Volume 2: Short Papers. Association for Computational Linguistics, Vancouver, BC, pages 574–579. http://aclweb.org/anthology/P17-2091.
Shi, Xing, Knight, Kevin, and Yuret, Deniz. 2016a. Why neural translations are the right length. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Austin, TX, pages 2278–2282. https://aclweb.org/anthology/D16-1248.
Shi, Xing, Padhi, Inkit, and Knight, Kevin. 2016b. Does string-based neural MT learn source syntax? In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Austin, TX, pages 1526–1534. https://aclweb.org/anthology/D16-1159.
Shu, Raphael and Nakayama, Hideki. 2018. Improving beam search by removing monotonic constraint for neural machine translation. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Volume 2: Short Papers. Association for Computational Linguistics, Melbourne, pages 339–344. http://aclweb.org/anthology/P18-2054.
Simianer, Patrick, Wuebker, Joern, and DeNero, John. 2019. Measuring immediate adaptation performance for neural machine translation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long and Short Papers. Association for Computational Linguistics, Minneapolis, MN, pages 2038–2046. www.aclweb.org/anthology/N19-1206.
Smith, Samuel L., Turban, David H. P., Hamblin, Steven, and Hammerla, Nils Y. 2017. Offline bilingual word vectors, orthogonal transformations and the inverted softmax. In Proceedings of the International Conference on Learning Representations (ICLR). Toulon, France.
Snover, Matthew, Dorr, Bonnie J., Schwartz, Richard, Micciulla, Linnea, and Makhoul, John. 2006. A study of translation edit rate with targeted human annotation. In 5th Conference of the Association for Machine Translation in the Americas (AMTA). Boston. http://mt-archive.info/AMTA-2006-Snover.pdf.
Søgaard, Anders, Ruder, Sebastian, and Vulić, Ivan. 2018. On the limitations of unsupervised bilingual dictionary induction. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Volume 1: Long Papers. Association for Computational Linguistics, Melbourne, pages 778–788. https://doi.org/10.18653/v1/P18-1072.
Song, Kai, Zhang, Yue, Yu, Heng, Luo, Weihua, Wang, Kun, and Zhang, Min. 2019. Code-switching for enhancing NMT with pre-specified translation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long and Short Papers. Association for Computational Linguistics, Minneapolis, MN, pages 449–459. www.aclweb.org/anthology/N19-1044.
Srivastava, Nitish, Hinton, Geoffrey, Krizhevsky, Alex, Sutskever, Ilya, and Salakhutdinov, Ruslan. 2014. Dropout: A simple way to prevent neural networks from overfitting. Journal of Machine Learning Research 15:1929–1958. http://jmlr.org/papers/v15/srivastava14a.html.
Stahlberg, Felix, de Gispert, Adrià, Hasler, Eva, and Byrne, Bill. 2017. Neural machine translation by minimising the Bayes-risk with respect to syntactic translation lattices. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics. Volume 2: Short Papers. Association for Computational Linguistics, Valencia, Spain, pages 362–368. www.aclweb.org/anthology/E17-2058.
Stahlberg, Felix, Saunders, Danielle, and Byrne, Bill. 2018. An operation sequence model for explainable neural machine translation. In Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP. Association for Computational Linguistics, Brussels, pages 175–186. www.aclweb.org/anthology/W18-5420.
Strobelt, Hendrik, Gehrmann, Sebastian, Behrisch, Michael, Perer, Adam, Pfister, Hanspeter, and Rush, Alexander M. 2019. Seq2seq-vis: A visual debugging tool for sequence-to-sequence models. IEEE Transactions on Visualization and Computer Graphics 25(1):353–363.
Sun, Haipeng, Wang, Rui, Chen, Kehai, Utiyama, Masao, Sumita, Eiichiro, and Zhao, Tiejun. 2019. Unsupervised bilingual word embedding agreement for unsupervised neural machine translation. In Proceedings of the 57th Conference of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, pages 1235–1245. www.aclweb.org/anthology/P19-1119.
Sundermeyer, Martin, Oparin, Ilya, Gauvain, Jean-Luc, Freiberg, Ben, Schlüter, Ralf, and Ney, Hermann. 2013. Comparison of feedforward and recurrent neural network language models. In IEEE International Conference on Acoustics, Speech, and Signal Processing. Vancouver, BC, pages 8430–8434. www.eu-bridge.eu/downloads/_Comparison_of_Feedforward_and_Recurrent_Neural_Network_Language_Models.pdf.
Sutskever, Ilya, Vinyals, Oriol, and Le, Quoc V. 2014. Sequence to sequence learning with neural networks. In Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N. D., and Weinberger, K. Q., editors, Advances in Neural Information Processing Systems 27. Barcelona, pages 3104–3112. http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf.
Taghipour, Kaveh, Khadivi, Shahram, and Xu, Jia. 2011. Parallel corpus refinement as an outlier detection algorithm. In Proceedings of the 13th Machine Translation Summit (MT Summit XIII). International Association for Machine Translation, Xiamen, China, pages 414–421. www.mt-archive.info/MTS-2011-Taghipour.pdf.
Tamchyna, Aleš, Weller-Di Marco, Marion, and Fraser, Alexander. 2017. Modeling target-side inflection in neural machine translation. In Proceedings of the Second Conference on Machine Translation. Volume 1: Research Papers. Association for Computational Linguistics, Copenhagen, pages 32–42. www.aclweb.org/anthology/W17-4704.
Tan, Xu, Ren, Yi, He, Di, Qin, Tao, Zhao, Zhou, and Liu, Tie-Yan. 2019. Multilingual neural machine translation with knowledge distillation. In International Conference on Learning Representations (ICLR). New Orleans, LA. https://openreview.net/pdf?id=S1gUsoR9YX.
Tang, Gongbo, Sennrich, Rico, and Nivre, Joakim. 2018. An analysis of attention mechanisms: The case of word sense disambiguation in neural machine translation. In Proceedings of the Third Conference on Machine Translation: Research Papers. Association for Computational Linguistics, Belgium, pages 26–35. www.aclweb.org/anthology/W18-6304.
Tars, Sander and Fishel, Mark. 2018. Multi-domain neural machine translation. In Proceedings of the 21st Annual Conference of the European Association for Machine Translation. Melbourne.
Ter-Sarkisov, Alex, Schwenk, Holger, Bougares, Fethi, and Barrault, Loïc. 2015. Incremental adaptation strategies for neural network language models. In Proceedings of the 3rd Workshop on Continuous Vector Space Models and their Compositionality. Association for Computational Linguistics, Beijing, pages 48–56. www.aclweb.org/anthology/W15-4006.
Thompson, Brian, Gwinnup, Jeremy, Khayrallah, Huda, Duh, Kevin, and Koehn, Philipp. 2019. Overcoming catastrophic forgetting during domain adaptation of neural machine translation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long and Short Papers. Association for Computational Linguistics, Minneapolis, MN, pages 2062–2068. www.aclweb.org/anthology/N19-1209.
Thompson, Brian, Khayrallah, Huda, Anastasopoulos, Antonios, McCarthy, Arya D., Duh, Kevin, Marvin, Rebecca, McNamee, Paul, Gwinnup, Jeremy, Anderson, Tim, and Koehn, Philipp. 2018. Freezing subnetworks to analyze domain adaptation in neural machine translation. In Proceedings of the Third Conference on Machine Translation: Research Papers. Association for Computational Linguistics, Belgium, pages 124–132. www.aclweb.org/anthology/W18-6313.
Tiedemann, Jörg. 2012. Parallel data, tools and interfaces in OPUS. In Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Jan Odijk, and Stelios Piperidis, editors, Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC-2012). European Language Resources Association (ELRA), Istanbul, pages 2214–2218. ACL Anthology Identifier: L12-1246. www.lrec-conf.org/proceedings/lrec2012/pdf/463_Paper.pdf.
Toral, Antonio, Castilho, Sheila, Hu, Ke, and Way, Andy. 2018. Attaining the unattainable? Reassessing claims of human parity in neural machine translation. In Proceedings of the Third Conference on Machine Translation: Research Papers. Association for Computational Linguistics, Belgium, pages 113–123. www.aclweb.org/anthology/W18-6312.
Toral, Antonio and Sánchez-Cartagena, Víctor M. 2017. A multifaceted evaluation of neural versus phrase-based machine translation for 9 language directions. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics. Volume 1: Long Papers. Association for Computational Linguistics, Valencia, Spain, pages 1063–1073. www.aclweb.org/anthology/E17-1100.
Tran, Ke, Bisazza, Arianna, and Monz, Christof. 2016. Recurrent memory networks for language modeling. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, San Diego, CA, pages 321–331. www.aclweb.org/anthology/N16-1036.
Tran, Ke, Bisazza, Arianna, and Monz, Christof. 2018. The importance of being recurrent for modeling hierarchical structure. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, pages 4731–4736. www.aclweb.org/anthology/D18-1503.
Tu, Zhaopeng, Liu, Yang, Lu, Zhengdong, Liu, Xiaohua, and Li, Hang. 2016a. Context gates for neural machine translation. Ithaca, NY: Cornell University, abs/1608.06043. http://arxiv.org/abs/1608.06043.
Tu, Zhaopeng, Liu, Yang, Shang, Lifeng, Liu, Xiaohua, and Li, Hang. 2017. Neural machine translation with reconstruction. In Proceedings of the 31st AAAI Conference on Artificial Intelligence. San Francisco. http://arxiv.org/abs/1611.01874.
Tu, Zhaopeng, Lu, Zhengdong, Liu, Yang, Liu, Xiaohua, and Li, Hang. 2016b. Modeling coverage for neural machine translation. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Volume 1: Long Papers. Association for Computational Linguistics, Berlin, pages 76–85. www.aclweb.org/anthology/P16-1008.
Vaibhav, Vaibhav, Singh, Sumeet, Stewart, Craig, and Neubig, Graham. 2019. Improving robustness of machine translation with synthetic noise. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long and Short Papers. Association for Computational Linguistics, Minneapolis, MN, pages 1916–1920. www.aclweb.org/anthology/N19-1190.
van der Maaten, L. J. P. and Hinton, Geoffrey. 2008. Visualizing data using t-SNE. Journal of Machine Learning Research 9:2579–2605. http://jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf.
van der Wees, Marlies, Bisazza, Arianna, and Monz, Christof. 2017. Dynamic data selection for neural machine translation. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Copenhagen, pages 1411–1421. http://aclweb.org/anthology/D17-1147.
Vaswani, Ashish, Shazeer, Noam, Parmar, Niki, Uszkoreit, Jakob, Jones, Llion, Gomez, Aidan N., Kaiser, Łukasz, and Polosukhin, Illia. 2017. Attention is all you need. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems 30. Barcelona, pages 5998–6008. http://papers.nips.cc/paper/7181-attention-is-all-you-need.pdf.
Vaswani, Ashish, Zhao, Yinggong, Fossum, Victoria, and Chiang, David. 2013. Decoding with large-scale neural language models improves translation. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Seattle, WA, pages 1387–1392. www.aclweb.org/anthology/D13-1140.
Vauquois, Bernard. 1968. Structures profondes et traduction automatique. Le système du CETA. Revue Roumaine de Linguistique 13(2):105–130.
Venugopal, Ashish, Uszkoreit, Jakob, Talbot, David, Och, Franz, and Ganitkevitch, Juri. 2011. Watermarking the outputs of structured prediction with an application in statistical machine translation. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Edinburgh, pages 1363–1372. www.aclweb.org/anthology/D11-1126.
Vilar, David. 2018. Learning hidden unit contribution for adapting neural machine translation models. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 2: Short Papers. Association for Computational Linguistics, New Orleans, LA, pages 500–505. http://aclweb.org/anthology/N18-2080.
Virpioja, Sami, Smit, Peter, Grönroos, Stig-Arne, and Kurimo, Mikko. 2013. Morfessor 2.0: Python implementation and extensions for Morfessor baseline. Technical Report 25, Espoo, Finland: Aalto University.
Vulić, Ivan and Korhonen, Anna. 2016. On the role of seed lexicons in learning bilingual word embeddings. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Volume 1: Long Papers. Association for Computational Linguistics, Berlin, pages 247–257. www.aclweb.org/anthology/P16-1024.
Vulić, Ivan and Moens, Marie-Francine. 2015. Bilingual word embeddings from non-parallel document-aligned data applied to bilingual lexicon induction. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing. Volume 2: Short Papers. Association for Computational Linguistics, Beijing, pages 719–725. www.aclweb.org/anthology/P15-2118.
Wada, Takashi, Iwata, Tomoharu, and Matsumoto, Yuji. 2019. Unsupervised multilingual word embedding with limited resources using neural language models. In Proceedings of the 57th Conference of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, pages 3113–3124. www.aclweb.org/anthology/P19-1300.
Waibel, Alex, Jain, A. N., McNair, A. E., Saito, H., Hauptmann, A. G., and Tebelskis, J. 1991. Janus: A speech-to-speech translation system using connectionist and symbolic processing strategies. In Proceedings of the 1991 International Conference on Acoustics, Speech and Signal Processing (ICASSP). Toronto, pages 793–796.
Wallace, Eric, Feng, Shi, and Boyd-Graber, Jordan. 2018. Interpreting neural networks with nearest neighbors. In Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP. Association for Computational Linguistics, Brussels, pages 136–144. www.aclweb.org/anthology/W18-5416.
Wang, Qiang, Li, Bei, Xiao, Tong, Zhu, Jingbo, Li, Changliang, Wong, Derek F., and Chao, Lidia S. 2019a. Learning deep transformer models for machine translation. In Proceedings of the 57th Conference of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, pages 1810–1822. www.aclweb.org/anthology/P19-1176.
Wang, Rui, Finch, Andrew, Utiyama, Masao, and Sumita, Eiichiro. 2017a. Sentence embedding for neural machine translation domain adaptation. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. Volume 2: Short Papers. Association for Computational Linguistics, Vancouver, BC, pages 560–566. http://aclweb.org/anthology/P17-2089.
Wang, Rui, Utiyama, Masao, Goto, Isao, Sumita, Eiichiro, Zhao, Hai, and Lu, Bao-Liang. 2013. Converting continuous-space language models into n-gram language models for statistical machine translation. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Seattle, WA, pages 845–850. www.aclweb.org/anthology/D13-1082.
Wang, Rui, Utiyama, Masao, Liu, Lemao, Chen, Kehai, and Sumita, Eiichiro. 2017b. Instance weighting for neural machine translation domain adaptation. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Copenhagen, pages 1483–1489. http://aclweb.org/anthology/D17-1155.
Wang, Rui, Utiyama, Masao, and Sumita, Eiichiro. 2018a. Dynamic sentence sampling for efficient training of neural machine translation. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Volume 2: Short Papers. Association for Computational Linguistics, Melbourne, pages 298–304. http://aclweb.org/anthology/P18-2048.
Wang, Rui, Zhao, Hai, Lu, Bao-Liang, Utiyama, Masao, and Sumita, Eiichiro. 2014. Neural network based bilingual language model growing for statistical machine translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, Doha, Qatar, pages 189–195. www.aclweb.org/anthology/D14-1023.
Wang, Xinyi, Pham, Hieu, Arthur, Philip, and Neubig, Graham. 2019b. Multilingual neural machine translation with soft decoupled encoding. In International Conference on Learning Representations (ICLR). New Orleans, LA. https://openreview.net/pdf?id=Skeke3C5Fm.
Wang, Yining, Zhang, Jiajun, Zhai, Feifei, Xu, Jingfang, and Zong, Chengqing. 2018b. Three strategies to improve one-to-many multilingual translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, pages 2955–2960. www.aclweb.org/anthology/D18-1326.
Weaver, Warren. 1947. Letter to Norbert Wiener. Translated in 1949 and reprinted in Locke and Booth (1955).
Wei, Hao-Ran, Huang, Shujian, Wang, Ran, Dai, Xin-yu, and Chen, Jiajun. 2019. Online distilling from checkpoints for neural machine translation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long and Short Papers. Association for Computational Linguistics, Minneapolis, MN, pages 1932–1941. www.aclweb.org/anthology/N19-1192.
Wiseman, Sam and Rush, Alexander M. 2016. Sequence-to-sequence learning as beam-search optimization. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Austin, TX, pages 1296–1306. https://aclweb.org/anthology/D16-1137.
Wu, Lijun, Tan, Xu, He, Di, Tian, Fei, Qin, Tao, Lai, Jianhuang, and Liu, Tie-Yan. 2018. Beyond error propagation in neural machine translation: Characteristics of language also matter. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, pages 3602–3611. www.aclweb.org/anthology/D18-1396.
Wu, Lijun, Wang, Yiren, Xia, Yingce, Tian, Fei, Gao, Fei, Qin, Tao, Lai, Jianhuang, and Liu, Tie-Yan. 2019. Depth growing for neural machine translation. In Proceedings of the 57th Conference of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, pages 5558–5563. www.aclweb.org/anthology/P19-1558.
Wu, Lijun, Xia, Yingce, Zhao, Li, Tian, Fei, Qin, Tao, Lai, Jianhuang, and Liu, Tie-Yan. 2017. Adversarial neural machine translation. Ithaca, NY: Cornell University, abs/1704.06933. https://arxiv.org/pdf/1704.06933.pdf.
Wu, Yonghui, Schuster, Mike, Chen, Zhifeng, Le, Quoc V., Norouzi, Mohammad, Macherey, Wolfgang, Krikun, Maxim, Cao, Yuan, Gao, Qin, Macherey, Klaus, Klingner, Jeff, Shah, Apurva, Johnson, Melvin, Liu, Xiaobing, Kaiser, Lukasz, Gouws, Stephan, Kato, Yoshikiyo, Kudo, Taku, Kazawa, Hideto, Stevens, Keith, Kurian, George, Patil, Nishant, Wang, Wei, Young, Cliff, Smith, Jason, Riesa, Jason, Rudnick, Alex, Vinyals, Oriol, Corrado, Greg, Hughes, Macduff, and Dean, Jeffrey. 2016. Google’s neural machine translation system: Bridging the gap between human and machine translation. Ithaca, NY: Cornell University, abs/1609.08144. http://arxiv.org/abs/1609.08144.pdf.
Wu, Youzheng, Yamamoto, Hitoshi, Lu, Xugang, Matsuda, Shigeki, Hori, Chiori, and Kashioka, Hideki. 2012. Factored recurrent neural network language model in TED lecture transcription. In Proceedings of the Seventh International Workshop on Spoken Language Translation (IWSLT). Hong Kong, pages 222–228. www.mt-archive.info/IWSLT-2012-Wu.pdf.
Wuebker, Joern, Green, Spence, DeNero, John, Hasan, Sasa, and Luong, Minh-Thang. 2016. Models and inference for prefix-constrained machine translation. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Volume 1: Long Papers. Association for Computational Linguistics, Berlin, pages 66–75. www.aclweb.org/anthology/P16-1007.
Wuebker, Joern, Simianer, Patrick, and DeNero, John. 2018. Compact personalized models for neural machine translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, pages 881–886. www.aclweb.org/anthology/D18-1104.
Xing, Chao, Wang, Dong, Liu, Chao, and Lin, Yiye. 2015. Normalized word embedding and orthogonal transform for bilingual word translation. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, Denver, CO, pages 1006–1011. www.aclweb.org/anthology/N15-1104.
Xu, Hainan and Koehn, Philipp. 2017. Zipporah: A fast and scalable data cleaning system for noisy web-crawled parallel corpora. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Copenhagen, pages 2935–2940. http://aclweb.org/anthology/D17-1318.
Xu, Ruochen, Yang, Yiming, Otani, Naoki, and Wu, Yuexin. 2018. Unsupervised cross-lingual transfer of word embedding spaces. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, pages 2465–2474. https://doi.org/10.18653/v1/D18-1268.
Xu, Weijia, Niu, Xing, and Carpuat, Marine. 2019. Differentiable sampling with flexible reference word order for neural machine translation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long and Short Papers. Association for Computational Linguistics, Minneapolis, MN, pages 2047–2053. www.aclweb.org/anthology/N19-1207.
Yang, Yilin, Huang, Liang, and Ma, Mingbo. 2018a. Breaking the beam search curse: A study of (re-)scoring methods and stopping criteria for neural machine translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, pages 3054–3059. www.aclweb.org/anthology/D18-1342.
Yang, Zhen, Chen, Wei, Wang, Feng, and Xu, Bo. 2018b. Improving neural machine translation with conditional sequence generative adversarial nets. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long Papers. Association for Computational Linguistics, New Orleans, LA, pages 1346–1355. http://aclweb.org/anthology/N18-1122.
Yang, Zhen, Chen, Wei, Wang, Feng, and Xu, Bo. 2018c. Unsupervised neural machine translation with weight sharing. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Volume 1: Long Papers. Association for Computational Linguistics, Melbourne, pages 46–55. http://aclweb.org/anthology/P18-1005.
Yang, Zhilin, Dai, Zihang, Yang, Yiming, Carbonell, Jaime G., Salakhutdinov, Ruslan, and Le, Quoc V. 2019. XLNet: Generalized autoregressive pretraining for language understanding. Ithaca, NY: Cornell University, abs/1906.08237. http://arxiv.org/abs/1906.08237.
Yehezkel Lubin, Noa, Goldberger, Jacob, and Goldberg, Yoav. 2019. Aligning vector-spaces with noisy supervised lexicon. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long and Short Papers. Association for Computational Linguistics, Minneapolis, MN, pages 460–465. www.aclweb.org/anthology/N19-1045.
Zaremoodi, Poorya and Haffari, Gholamreza. 2018. Neural machine translation for bilingually scarce scenarios: A deep multi-task learning approach. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long Papers. Association for Computational Linguistics, New Orleans, LA, pages 1356–1365. http://aclweb.org/anthology/N18-1123.
Zeiler, Matthew D. 2012. ADADELTA: An adaptive learning rate method. Ithaca, NY: Cornell University, abs/1212.5701. http://arxiv.org/abs/1212.5701.
Zenkel, Thomas, Wuebker, Joern, and DeNero, John. 2019. Adding interpretable attention to neural translation models improves word alignment. In arXiv. https://arxiv.org/pdf/1901.11359.
Zhang, Jiajun, Liu, Shujie, Li, Mu, Zhou, Ming, and Zong, Chengqing. 2014. Bilingually-constrained phrase embeddings for machine translation. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics. Volume 1: Long Papers. Association for Computational Linguistics, Baltimore, MD, pages 111–121. www.aclweb.org/anthology/P14-1011.
Zhang, Jian, Li, Liangyou, Way, Andy, and Liu, Qun. 2016. Topic-informed neural machine translation. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers. The COLING 2016 Organizing Committee, Osaka, pages 1807–1817. http://aclweb.org/anthology/C16-1170.
Zhang, Jingyi, Utiyama, Masao, Sumita, Eiichiro, Neubig, Graham, and Nakamura, Satoshi. 2017a. Improving neural machine translation through phrase-based forced decoding. In Proceedings of the Eighth International Joint Conference on Natural Language Processing. Volume 1: Long Papers. Asian Federation of Natural Language Processing, Taipei, pages 152–162. www.aclweb.org/anthology/I17-1016.
Zhang, Jingyi, Utiyama, Masao, Sumita, Eiichiro, Neubig, Graham, and Nakamura, Satoshi. 2018a. Guiding neural machine translation with retrieved translation pieces. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long Papers. Association for Computational Linguistics, New Orleans, LA, pages 1325–1335. http://aclweb.org/anthology/N18-1120.
Zhang, Kelly and Bowman, Samuel. 2018. Language modeling teaches you more than translation does: Lessons learned through auxiliary syntactic task analysis. In Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP. Association for Computational Linguistics, Brussels, pages 359–361. www.aclweb.org/anthology/W18-5448.
Zhang, Longtu and Komachi, Mamoru. 2018. Neural machine translation of logographic language using sub-character level information. In Proceedings of the Third Conference on Machine Translation: Research Papers. Association for Computational Linguistics, Brussels, pages 17–25. www.aclweb.org/anthology/W18-6303.
Zhang, Meng, Liu, Yang, Luan, Huanbo, and Sun, Maosong. 2017b. Adversarial training for unsupervised bilingual lexicon induction. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. Volume 1: Long Papers. Association for Computational Linguistics, Vancouver, BC, pages 1959–1970. https://doi.org/10.18653/v1/P17-1179.
Zhang, Meng, Liu, Yang, Luan, Huanbo, and Sun, Maosong. 2017c. Earth mover’s distance minimization for unsupervised bilingual lexicon induction. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Copenhagen, pages 1924–1935. www.aclweb.org/anthology/D17-1207.
Zhang, Pei, Ge, Niyu, Chen, Boxing, and Fan, Kai. 2019a. Lattice transformer for speech translation. In Proceedings of the 57th Conference of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, pages 6475–6484. www.aclweb.org/anthology/P19-1649.
Zhang, Wen, Feng, Yang, Meng, Fandong, You, Di, and Liu, Qun. 2019b. Bridging the gap between training and inference for neural machine translation. In Proceedings of the 57th Conference of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, pages 4334–4343. www.aclweb.org/anthology/P19-1426.
Zhang, Wen, Huang, Liang, Feng, Yang, Shen, Lei, and Liu, Qun. 2018b. Speeding up neural machine translation decoding by cube pruning. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, pages 4284–4294. www.aclweb.org/anthology/D18-1460.
Zhang, Xuan, Shapiro, Pamela, Kumar, Gaurav, McNamee, Paul, Carpuat, Marine, and Duh, Kevin. 2019c. Curriculum learning for domain adaptation in neural machine translation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long and Short Papers. Association for Computational Linguistics, Minneapolis, MN, pages 1903–1915. www.aclweb.org/anthology/N19-1189.
Zhang, Zhisong, Wang, Rui, Utiyama, Masao, Sumita, Eiichiro, and Zhao, Hai. 2018c. Exploring recombination for efficient decoding of neural machine translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, pages 4785–4790. www.aclweb.org/anthology/D18-1511.
Zheng, Baigong, Zheng, Renjie, Ma, Mingbo, and Huang, Liang. 2019. Simultaneous translation with flexible policy via restricted imitation learning. In Proceedings of the 57th Conference of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, pages 5816–5822. www.aclweb.org/anthology/P19-1582.
Zhou, Chunting, Ma, Xuezhe, Wang, Di, and Neubig, Graham. 2019. Density matching for bilingual word embedding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1: Long and Short Papers. Association for Computational Linguistics, Minneapolis, MN, pages 1588–1598. www.aclweb.org/anthology/N19-1161.
Zhou, Long, Hu, Wenpeng, Zhang, Jiajun, and Zong, Chengqing. 2017. Neural system combination for machine translation. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. Volume 2: Short Papers. Association for Computational Linguistics, Vancouver, BC, pages 378–384. http://aclweb.org/anthology/P17-2060.
Zoph, Barret and Knight, Kevin. 2016. Multi-source neural translation. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, San Diego, CA, pages 30–34. www.aclweb.org/anthology/N16-1004.
Zoph, Barret, Yuret, Deniz, May, Jonathan, and Knight, Kevin. 2016. Transfer learning for low-resource neural machine translation. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Austin, TX, pages 1568–1575. https://aclweb.org/anthology/D16-1163.

  • Bibliography
  • Philipp Koehn, The Johns Hopkins University
  • Book: Neural Machine Translation
  • Online publication: 30 May 2020
  • Chapter DOI: https://doi.org/10.1017/9781108608480.019