Working with COMPARA
an online parallel corpus of English and Portuguese fiction
Ana Frankenberg-Garcia
An online parallel corpus of English and Portuguese fiction ???
An online corpus
Allows you to study Portuguese and English fiction and their translations into English and Portuguese in an automatic way…
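A minimal sketch of what searching a parallel corpus "in an automatic way" can look like. The three sentence pairs are invented for illustration and are not taken from COMPARA, and the function name `parallel_concordance` is hypothetical; COMPARA's own query interface is not shown here.

```python
import re

# Invented toy alignment units (source sentence, translation); NOT COMPARA data.
ALIGNED_PAIRS = [
    ("The house was quiet.", "A casa estava silenciosa."),
    ("She closed the door of the house.", "Ela fechou a porta da casa."),
    ("He arrived late.", "Ele chegou tarde."),
]


def parallel_concordance(pairs, query):
    """Return every aligned pair whose source side contains the query
    as a whole word or phrase (case-insensitive)."""
    pattern = re.compile(r"\b" + re.escape(query) + r"\b", re.IGNORECASE)
    return [(src, tgt) for src, tgt in pairs if pattern.search(src)]


# Each hit is shown together with its aligned translation.
for src, tgt in parallel_concordance(ALIGNED_PAIRS, "house"):
    print(src, "|", tgt)
```

A query for an expression that does not occur in the source texts simply returns an empty list.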
COMPARA
Machine Translation
Human Translation
The study of human translation
Traditionally not a hard science
Difficult to be systematic
But with the technology of corpus linguistics, things can change …
What is a corpus?
Advantages of using corpora to study human translation
An enormous number of translated texts
Systematic analyses
Quantifiable results
Baker (1993), Frankenberg-Garcia (2004), Olohan & Baker (2000), Øverås (1998), Sardinha (2002)
A parallel corpus can also be used in language learning
Barlow (2000), Frankenberg-Garcia (2000, 2004, forthcoming), Pearson (2003), Roussel (1991)
Advantages of using corpora in language learning
• Authentic examples of language use
• Access to information often absent from conventional grammars and dictionaries
• Learner autonomy (don’t have to rely on native speakers)
• Risk-taking
COMPARA
COMPARA team
Ana Frankenberg-Garcia, Diana Santos
Rosário Silva, Susana Inácio, Rosa Pires
Initial support (1999-2000)
FCT (Portugal)
ISLA Lisboa
Oxford University Language Centre
Present funding (2001-2004)
Linguateca: FCT/ POSI (POSI/PLP/43931/2001)
COMPARA structure
PT source texts and their EN translations
EN source texts and their PT translations
COMPARA
[Diagram: original Portuguese source texts with their translated English texts; original English source texts with their translated Portuguese texts]
COMPARA users and uses
Language learners - bilingual dictionary with examples
Language teachers - exercises and tests
Translators - language equivalents
Translation lecturers - exercises & problems
Translation theorists - test translation hypotheses
Bilingual lexicographers - bilingual dictionaries
Computational linguists - machine translation
Since 2001: more than 70 000 queries
Before using it…
Remember that the results you get are “only as good as the corpus”
J. Sinclair, Corpus, Concordance, Collocation (1991: 13)
Why can’t I find the Portuguese translation of greenhouse gas in COMPARA?
COMPARA 5.6 varieties
PORTUGUESE: Portugal, Brazil, Angola, Mozambique
ENGLISH: UK, US, South Africa
COMPARA 5.6 publication dates
[Timeline: 2000, 1997, 1988, 1914, 1880, 1837]
COMPARA 5.6 genre
Published fiction
EXTENSIBLE to other genres
COMPARA 5.6 authors
Portuguese writers
Camilo Castelo Branco
Eça de Queirós
José Cardoso Pires
Jorge de Sena
Mário de Carvalho
Mário de Sá-Carneiro
Brazilian writers
Aluísio Azevedo
Autran Dourado
Chico Buarque
José de Alencar
Machado de Assis
Manuel Antônio de Almeida
Marcos Rey
Patrícia Melo
Paulo Coelho
Rubem Fonseca
Angolan writers
José Eduardo Agualusa
Mozambican writers
Mia Couto
British writers
David Lodge
Julian Barnes
Joseph Conrad
Joanna Trollope
Lewis Carroll
Oscar Wilde
American writers
Henry James
Edgar Allan Poe
Richard Zimler
South African writers
Nadine Gordimer
+ copyright permission to use more
Can any text be included in the corpus?
Only published source texts and translations
Only English translated directly from Portuguese, and Portuguese translated directly from English
Only human translations!
COMPARA 5.6 texts
49 translations
46 source texts (extracts)
COMPARA 5.6 size
973 317 words in English
893 150 words in Portuguese
Largest edited parallel corpus in the world
Now I know why I can’t find greenhouse gas in COMPARA!
COMPARA 5.6
Good for: syntax, general language, fiction
Not for: technical terms, other genres
One more thing…
When using corpora, remember:
Language is “constructed out of a finite set of elements”, but it is something that is used creatively!
N. Chomsky, Syntactic Structures (1957: 13)
“As a rule of thumb you need a litre of paint to every 12 square metres of wall”
“rule”
“as a rule”
“rule of thumb”
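One way to read the three search strings above is that a single sentence can match several overlapping expressions, so the hits still need a human reader. A minimal sketch, assuming simple whole-phrase, case-insensitive matching; the helper name `count_phrase` is hypothetical and this is not how COMPARA itself processes queries.

```python
import re

# The example sentence from the slide above.
SENTENCE = ("As a rule of thumb you need a litre of paint "
            "to every 12 square metres of wall")


def count_phrase(text, phrase):
    """Count whole-phrase, case-insensitive occurrences of phrase in text."""
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
    return len(pattern.findall(text))


# The same sentence is retrieved by all three queries.
for phrase in ["rule", "as a rule", "rule of thumb"]:
    print(phrase, count_phrase(SENTENCE, phrase))
```

A search for "as a rule" therefore also returns this sentence, even though the expression actually used here is "rule of thumb".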
COMPARA availability
Free, online
For research and education
COMPARA access
www.linguateca.pt/COMPARA/
FrankenbergGarcia2004-ISLA