Participant Tracking in Text Unfolding: Insights from Portuguese-Chinese Translation and Post-Editing Task Logs

AuTema-PostEd Group
Ana Luísa V. Leal¹, Márcia Schmaltz¹, Derek Wong¹, Lidia Chao¹, James TL Wang¹, Adriana Pagano², Fabio Alves², Igor da Silva³, Paulo Quaresma⁴
¹ University of Macau (UM)
² Federal University of Minas Gerais (UFMG)
³ Federal University of Uberlândia (UFU)
⁴ University of Évora (UE)

Outline
• Research Context and Aims
• Research Assumptions and Questions
• Experimental Design
• Methodology of Analysis
• Preliminary Results
• Discussion
• Next Steps

Research Context and Aims
Joint project between the Department of Portuguese, University of Macau, and LETRA (Laboratory for Experimentation in Translation), Federal University of Minas Gerais, to investigate the process and final output of human-translated text as compared to human post-editing of texts machine-translated by the Portuguese-Chinese Translation System (PCT).

Research Assumptions and Questions
• Identity cohesive chains responsible for co-reference and participant tracking are crucial to construing a coherent representation of a text (Halliday & Hasan 1976, Halliday 1989, Halliday & Matthiessen 2004).
  Do translation process data show evidence of the role of identity chains as exerting cognitive demands upon task executors?
• Translating demands more cognitive effort than post-editing (Carl et al 2011).
  Do data in our study confirm the impact of task type on effort?

Translation Process Research
• Eye-Mind Assumption (Just & Carpenter 1980)
• User Activity Data (Carl & Jakobsen 2009, Carl 2012b)
• (Un)challenged translation (Carl & Dragsted 2012)
  • More / less time consuming
  • Looking fewer / more words ahead into the ST
  • More / less keyboard activity (insertions, deletions)
• Long pauses, regressive saccades and refixation on words already read (Carl & Jakobsen 2009)
Integrating quantitative and qualitative analysis to have a full picture of the translation process (Alves & Gonçalves 2013, O’Brien 2006)

Experimental Design
• Materials
  • Translog-II (Carl 2012a)
  • Eye tracker Tobii T120
  • Tobii Studio 3.2.1
• Participants
  • 12 translators, L1 Chinese, 23-32 years old, BA in Portuguese Studies, < 1 year of experience, glasses or contact lenses
• Setting and conditions
  • Eye-tracking lab at University of Macau
  • No time pressure

Experimental Design
• Input
  • 4 texts: ca. 80 words/word-equivalents, news reports
  • 2 Chinese source texts
  • 2 Portuguese source texts [Focus on Text 2]
  • MT output provided by PCT (Portuguese-Chinese Translator) (Wong et al 2012)
  • Randomized for both subjects and tasks
• Tasks
  • 1 L1 translation
  • 1 L2 translation
  • 1 post-editing into L1
  • 1 post-editing into L2
• Recall Protocols

Input ST Identity Cohesive Chains
Os brasileiros estão em lua-de-mel com o mundo.
  The Brazilians are on honeymoon with the world.
A Petrobras e sua presidente estão entre as maiores do mundo.
  Petrobras and their/its president are among the greatest of the world.
Conforme uma revista americana, a presidente é uma das 100 maiores lideranças mundiais e a petrolífera estatal é uma das 10 maiores companhias do planeta.
  According to an American magazine, the president is one of the 100 greatest world leaders, and the state-owned oil company is one of the 10 greatest companies on the planet.
A revista classifica as empresas com base em diversos indicadores além do lucro.
  The magazine classifies the companies building on several indicators besides profit.
É por isso que a brasileira surge em posição de liderança na classificação.
  That’s why the Brazilian [company/president] emerges in leadership position in the classification.
[Ellipsis] Está inclusive à frente de grandes como a Apple.
  [The company/Brazil/The president] is even ahead of big [ones/companies/countries/presidents], such as Apple.
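
The chains in the source text above can be made concrete with a small data structure. What follows is a minimal Python sketch under stated assumptions, not the annotation scheme used in the study: the `Mention` class, the referent labels and the two helper functions are hypothetical, and the candidate referents simply mirror the bracketed alternatives in the English glosses. Only the company and president mentions are covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    sentence: int      # 1-based sentence index in the source text
    form: str          # surface form; "[ellipsis]" marks an omitted subject
    candidates: tuple  # referents the gloss allows (hypothetical labels)


# Assumption: referents are read off the English glosses on the slide,
# not taken from the authors' own annotation.
MENTIONS = [
    Mention(2, "A Petrobras", ("company",)),
    Mention(2, "sua presidente", ("president",)),
    Mention(3, "a presidente", ("president",)),
    Mention(3, "a petrolífera estatal", ("company",)),
    Mention(5, "a brasileira", ("company", "president")),
    Mention(6, "[ellipsis]", ("company", "Brazil", "president")),
]


def chain(mentions, referent):
    """Return the forms that can only refer to `referent`."""
    return [m.form for m in mentions if m.candidates == (referent,)]


def ambiguous(mentions):
    """Return the mentions whose referent has to be inferred by the reader."""
    return [m for m in mentions if len(m.candidates) > 1]


if __name__ == "__main__":
    print("company chain:", chain(MENTIONS, "company"))
    print("president chain:", chain(MENTIONS, "president"))
    for m in ambiguous(MENTIONS):
        print(f"sentence {m.sentence}: {m.form} -> {' / '.join(m.candidates)}")
```

In this reading, the last two sentences carry the mentions whose referent is left open, which is where a translator has to decide whether to make the participant explicit in the target text.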

Methodology of Analysis
• User Activity Data
  • FU (fixation units) ≤ 400 ms, PU (production units) ≤ 1000 ms
  • Source Tokens (ST)
  • Target Tokens (TT)
  Extracted: gaze time, fixation number, insertions, deletions, edits, first pass fixation time (FD), STid, TTid.
• Statistical studies
  • Comprehension of ST, TT and Production
  • Subset studies with the 6 identity references (Principal, Secondary, Others)

Methodology of Analysis
• Quantitative
  • Linear Mixed-Effects Regression Model (LMER), fitted with lmerTest in R 3.0.2 (Baayen 2008, Balling 2008, 2013, Sjorup 2012)
  • p < 0.05
• Tests (Student’s t-test; Mann-Whitney U)

Methodology of Analysis
Variables: Comprehension
• Dependent: Total Fixation Time, Total Fixation Number, First Pass Fixation Time
• Independent / Random: AOIs, Participant
• Independent / Fixed: Length, Position, Frequency (Banco de Português corpus), Trigram Probability, Task (Translation or Post-editing), Frequency (CCL/UPK corpus), Type (Reference or Comparative)
Variables: Production
• Dependent: Production time
• Independent / Random: AOIs, Participant
• Independent / Fixed: Character count, Position, Rendition (Yes or No), Trigram Probability, Task (Translation or Post-editing), Type (Reference or Comparative)

Preliminary Results

Source Text Comprehension: Gazing and Fixation Variables
Values per row, in the order Total Fixation Time / Total Fixation Number / First Pass Fixation Time:
• (Intercept): <2e-16 / 3.02e-10 / <2e-16
• Length: <2e-16 / <2e-16 / 0.000478
• Frequency: 0.0247, 0.018131 (reported for two of the three measures)

Target Text Comprehension: Gazing and Fixation Variables
Values per row, in the order Total Fixation Time / Total Fixation Number / First Pass Fixation Time:
• (Intercept): <2e-16 / 4.81e-08 / <2e-16
• Length: 6.90e-06 / 6.90e-06 / 8.6e-10
• Position: 5.62e-09, 5.62e-09 (reported for two of the three measures)
• TypeRef: 0.0171 (reported for one of the three measures)

Target Text Production: Keylogging Variables
Production Time:
• (Intercept): <2e-16
• Character count: 2.67e-13
• Renditions: 0.000586

ST Identity Chains: Gazing and Fixation Variables
Values per row, in the order Total Fixation Time / Total Fixation Number / First Pass Fixation Time:
• (Intercept): <2e-16 / 1.99e-09 / <2e-16
• Length: 0.000319 / 4.58e-05 / 0.000823
• Frequency: 0.001553, 0.00483 (reported for two of the three measures)

TT Identity Chains: Gazing and Fixation Variables
Values per row, in the order Total Fixation Time / Total Fixation Number / First Pass Fixation Time:
• (Intercept): <2e-16 / 2.99e-11 / 0.000245
• Length: 3.16e-05 / 1.23e-08 / 0.002858
• Position: 8.20e-10, 4.27e-10 (reported for two of the three measures)
• Frequency: 7.35e-05 (reported for one of the three measures)

Production Identity Chains: Keylogging Variables
Production Time:
• (Intercept): <2e-16
• Character count: 1.72e-08
• Renditions: 0.01672
• Position: 0.00498

Non-Parametric Tests
(TNF = Total Fixation Number, TFT = Total Fixation Time, FIRST = First Pass Fixation Time)
• ST reading:
  • Significant differences between reference types for TNF (0.27) and TFT (0.46), and between subjects for TNF, TFT and FIRST
  • No significant differences between tasks
• TT reading:
  • Significant differences between subjects for TNF, TFT and FIRST
  • Significant differences between tasks for TNF (0.13) and TFT (0.025)
  • No significant differences between reference types

ProgGraph P23
• Some translators take the wrong path when they accept the MT output or because of how they comprehend the ST.
[Figure: translation progression graph for participant P23]
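
The between-task comparisons reported under Non-Parametric Tests rely on the Mann-Whitney U test. As a rough illustration of what that test computes, here is a minimal, standard-library Python sketch. It is not the study's analysis script: the function name `mann_whitney_u` and the sample values are hypothetical, and the p-value uses the normal approximation with a tie correction and no continuity correction, so it can differ slightly from what R or other packages report for small samples.

```python
from math import erfc, sqrt


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U, p), where U is the smaller of the two U statistics.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of ranks i+1 .. j.
        avg_rank = (i + 1 + j) / 2
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j

    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    u = min(u1, u2)

    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:
        return u, 1.0  # every observation is tied
    z = (u - mean_u) / sqrt(var_u)
    return u, erfc(abs(z) / sqrt(2))


if __name__ == "__main__":
    # Hypothetical total fixation times (ms) on one token, by task.
    translation = [820, 640, 910, 1005, 770, 880]
    post_editing = [610, 590, 720, 660, 540, 700]
    u, p = mann_whitney_u(translation, post_editing)
    print(f"U = {u}, p = {p:.4f}")
```

Because the test works on ranks rather than raw durations, it does not assume normally distributed fixation times, which is one common reason to report it alongside the regression models.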

ProgGraph P23
• Rereading and ... “click”!
[Figure: translation progression graph for participant P23]

Discussion
Research question: Do translation process data show evidence of the role of identity chains as exerting cognitive demands upon task executors?
• They do for the source text, but only in the non-parametric tests. The multiple regression tests show no significant results for the set of variables used in the experiment.
• Cohesive chains seem to have an impact on comprehension when reading the ST: reading for translation involves anticipating how chains will have to be dealt with in the TT (whether explicitation will be needed in the TT).

Discussion
Research question: Do data in our study confirm the impact of task type on effort?
• No. However, four subjects built identity chains different from those in the ST. Their inferential path was different from that of the other subjects.
• The results of the non-parametric tests shed light on relevant aspects of inferential processing in post-editing and translation:
  • Translation is TT-driven, but a result of a comprehension process.
  • Tasks differ concerning target text production.
  • Post-editing demands less cognitive effort for lexical rendition (as confirmed by RVP, i.e. fewer character insertions), but requires more effort to reorganize structures (as shown in RVP).

Next Steps
• Fine-grained, qualitative analysis of user activity data (UAD) and translation progression graphs
• Analysis of the impact of the first task on the second task (Ferreira 2010)
• Further studies collecting data from native speakers of Portuguese to contrast reading effort in cohesive chains
• Analysis of all four texts used in the experiment, which will permit more robust results

References
ALVES, F., GONÇALVES, J. 2013. “Investigating the Conceptual-procedural Distinction in the Translation Process”. Target 25:1. p. 107-124.
BALLING, L., CARL, M. (to appear). “Production Time Across Language and Tasks: A Large-scale Analysis Using the CRITT Translation Process Database”. In: Schwieter, J., Ferreira, A. (eds.) The Development of Translation Competence: Theories and Methodologies from Psycholinguistics and Cognitive Science. Cambridge: Cambridge Scholars Publishing.
CARL, M. 2012. “Translog-II: a Program for Recording User Activity Data for Empirical Reading and Writing Research”. In: Proceedings of the Eighth International Conference on Language Resources and Evaluation. Istanbul, 21-27 May 2012. Istanbul: European Language Resources Association. p. 4108-4112.
CARL, M., DRAGSTED, B., ELMING, J., HARDT, D. & JAKOBSEN, A. L. 2011. The Process of Post-Editing: a Pilot Study. In: B. Sharp, M. Zock, M. Carl, A. L. Jakobsen (orgs.). Proceedings of the 8th Natural Language Processing and Cognitive Science Workshop. Copenhagen Studies in Language Series 41. p. 131-142.
HALLIDAY, M. A. K., HASAN, R. 1976. Cohesion in English. London: Longman.
HALLIDAY, M. A. K. 1989. Spoken and Written Language. Oxford: Oxford University Press.
HALLIDAY, M. A. K., MATTHIESSEN, C. 2004. An Introduction to Functional Grammar. London: Arnold.
O’BRIEN, S. 2002. Teaching post-editing: A proposal for course content. In: Teaching Machine Translation - the 6th International Workshop of the European Association of Machine Translation, Centre for Computational Linguistics, UMIST: Manchester. p. 99-106.
O’BRIEN, S. 2004. Machine translatability and post-editing effort: how do they relate. In: Proceedings of Translating and the Computer 26. London: Aslib.
O’BRIEN, S. 2006. Controlled Language and Post-Editing. The Guide From Multilingual. p. 17-19.
O’BRIEN, S. & ALMEIDA, G. 2010. Analysing post-editing performance: correlations with years of translation experience. In: Proceedings of EAMT 2010: the European Association for Machine Translation, St Raphael, France.
PAVLOVIĆ, N. & JENSEN, K. T. H. 2009. Eye tracking translation directionality. In: A. Pym and A. Perekrestenko (eds). Translation Research Projects 2. Tarragona: Universitat Rovira i Virgili. p. 101-119.
SJORUP, A. 2013. Cognitive Effort in Metaphor Translation. PhD Thesis. Copenhagen: Copenhagen Business School.
WONG, D., OLIVEIRA, F., LI, YP. 2012. Hybrid Machine Aided Translation System based on Constraint Synchronous Grammar and Translation Corresponding Tree. Journal of Computers, 7(2). p. 309-316.

Thank you! Obrigada! 谢谢!