???????

 

? ?

??????????? lee5110@263.net

 

???????

•????????????????????Translational English Corpus????????????parallel corpus????????multilingual corpus????????comparable corpus??????????????????????????????????????????????????co-occurrence??????????????????????????????????????????????????????????????????

• 

•1)        ????????????12???

•2)        ?????????????24???

•3)        ???????????????24???

•4)        ???????????????????????48???

• 

•????H.J.Vermeer????????????????????????????????????Baker, 1995:238?????????????????????????????????????????????simplification??????explicitation??????conventionalization????????????“????”?data-driven??????????????????????????????????????????????????????????????????????????

???????????? . ????????? . ?????2001 (5).?

 

•??????????????????????????????????????????????????????????????????????????Translational English Corpus????????????????????????????????????????????2001????????????2000??????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????

           ----???. ???????????? ——????????.???????????2002?3?

 

•???????????CTIS?????????????????????---“???????”?TEC?????????????????????????????????????????

??????????????????????????????2002(6)?

What is a corpus?

•“A corpus is a collection of pieces of language that are selected and ordered according to explicit linguistic criteria in order to be used as a sample of the language.”

•“A computer corpus is a corpus which is encoded in a standardised and homogenous way for open-ended retrieval tasks. Its constituent pieces of language are documented as to their origins and provenance.” (EAGLES Preliminary Recommendations on Corpus Typology)

What is TEC?

•TEC is a computerised collection of contemporary translational English text. It is freely available to the research community, with a set of software tools to allow scholars to investigate the language of translated English. The corpus is continually being enlarged and the software tools refined and made more versatile and user-friendly.

 

•TEC is a corpus of contemporary translational English: it consists of written texts translated into English from a variety of source languages, European and non-European. It was set up and is currently managed by Professor Mona Baker at the Centre for Translation & Intercultural Studies. The custom-made software for processing the corpus, which is downloadable from the web, is designed by Dr. Saturnino Luz, Trinity College Dublin, who is also in charge of maintaining the corpus.

What does TEC consist of?

•TEC consists of four subcorpora: fiction, biography, news and inflight magazines. The overall size of the corpus is currently (2003) around 10 million words. It can be accessed freely via the web, using a custom-built concordancer designed by Dr. Saturnino Luz.

 

•TEC is meticulously documented in terms of extralinguistic features such as gender, nationality and occupation of the translator, direction of translation, source language, publisher of the translated text, etc. This information is held in a separate header file for each text. The concordancing software is designed to make the information in the header file available to the researcher at a glance.
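A concordancer lists every occurrence of a search word together with its immediate context (a keyword-in-context, or KWIC, display). The following is a minimal Python sketch of that idea only; it is not the TEC software, and the function name `kwic` and the `width` parameter are illustrative assumptions.

```python
import re


def kwic(text, keyword, width=30):
    """Return one keyword-in-context line per whole-word match of keyword."""
    pattern = re.compile(r"\b%s\b" % re.escape(keyword), re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        # Take up to `width` characters on each side of the hit.
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines


# Toy sentence, invented for illustration (not taken from TEC).
toy_text = "She said that he knew that it was late."
for line in kwic(toy_text, "that", width=15):
    print(line)
```

A real concordancer would also show, for each hit, the header information of the text it came from (translator, source language and so on), as described above.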

What type of research does TEC support?

•TEC has supported a broad range of studies in two main areas: the way in which the patterning of translated text might be different from that of non-translated text in the same language, and stylistic variation across individual translators. Examples of both types of study can be found in the Selected Bibliography attached to this document.

TEC files

1. Subcorpus: Inflight magazines

        Lufthansa Bordbuch and Blue Wings

2. Subcorpus: Newspapers

        The Guardian and The European

3. Subcorpus: Biography

4. Subcorpus: Fiction

Sample Header File

•TITLE : 

•Filename: fn000009.txt

•Subcorpus: Fiction

•Collection: Memoirs of Leticia Valle

•TRANSLATOR

•Name: Carol Maier

•Gender: female

•Nationality: American

•Employment: Lecturer

•TRANSLATION

•Mode: written

•Extent: 55179

•Publisher: University of Nebraska Press

•Place: USA

•Date: 1994

•Copyright: University of Nebraska Press

•Comments: Title in European Women Writers Series

 

 

•AUTHOR

•Name: Rosa Chacel

•Gender: female

•Nationality: Spanish

•SOURCE TEXT

•Language: Spanish

•Mode: written

•Status: original

•Place: Spain

•Date: 1945
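The header shown above groups "Field: value" pairs under upper-case section names (TITLE, TRANSLATOR, TRANSLATION, AUTHOR, SOURCE TEXT). The Python sketch below reads such a listing into a nested dictionary so that texts can be filtered by, for example, translator gender or source language. It assumes the plain layout shown on the slide; the real TEC header files may be stored in a different format, and `parse_header` is a hypothetical helper, not part of the TEC tools.

```python
def parse_header(text):
    """Parse a 'SECTION' / 'Field: value' listing into {section: {field: value}}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip().lstrip("•").strip()
        if not line:
            continue
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key.isupper() and not value:
            # An upper-case line with no value starts a new section,
            # e.g. "TRANSLATOR" or "TITLE :".
            current = sections.setdefault(key, {})
        elif sep and current is not None:
            current[key] = value
    return sections


# Subset of the sample header above, used as test input.
SAMPLE_HEADER = """\
TITLE :
Filename: fn000009.txt
Subcorpus: Fiction
TRANSLATOR
Name: Carol Maier
Gender: female
AUTHOR
Name: Rosa Chacel
SOURCE TEXT
Language: Spanish
Date: 1945
"""

header = parse_header(SAMPLE_HEADER)
print(header["TRANSLATOR"]["Name"], "/", header["SOURCE TEXT"]["Language"])
```

Keeping the sections separate matters because the same field name (Name, Gender, Date) occurs under more than one section.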

Basic Methodologies

•Comparable: two corpora in the same language, one consisting of translated and the other of non-translated texts;

•Parallel: corpora of source texts and their translations;

•Multilingual: corpora of non-translated texts in two or more languages, from the same domain, time period, etc.

Examples of corpus studies

•Comparable corpora (corpora of translated and non-translated texts in the same language, and in similar domains), e.g. Olohan & Baker (based on TEC and BNC)

•Parallel corpora (corpora of source texts and their single or multiple translations), e.g. Bosseaux (The Waves + two French translations)

•Parallel corpora, with a monolingual reference corpus in the language of the translated subcorpus, e.g. Wallace (ECPC; IT & popular science English texts and two sets of Chinese translations, plus SINICA Chinese reference corpus)

•Parallel corpora, with a monolingual reference corpus in each language (translated and non-translated), e.g. Kenny (GEPCOLT; experimental German literary texts and single English translations, plus BNC & Mannheim Reference Corpora)

Features Investigated (translated vs. non-translated texts)

•Broad features (‘universals’?): explicitation, simplification, normalisation, levelling out

•Specific features (syntactic, lexical, literary): zero/that variation; contractions; split infinitives; use of idioms; recurrent lexical patterns; reformulation markers; marked collocations; point of view; deixis, etc.    (Mona Baker)

 

•Laviosa?1998b??????????????????????????????????????????????????????????????????????????????????????????????????????????????????Baker 1993, 1998?????????????Øverås, 1998?????????????????Kenny 1998??????????????????????????????????????????????Frawley?1984????“????”??????????????????????????  (Zhonghua Xiao)
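Features of this kind are usually measured by computing the same statistic on a translated and a non-translated corpus and comparing the results. The Python sketch below illustrates two such statistics: the type/token ratio (often discussed in connection with simplification) and the share of a reporting verb followed by optional "that" (the zero/that variation listed above). The function names and the two toy texts are invented for illustration; they are not taken from TEC, the BNC or any published study, and the "that" count is deliberately crude (it also counts demonstrative "that", which real studies disambiguate).

```python
import re


def tokens(text):
    """Lower-case word tokens; contractions such as "it's" stay one token."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def type_token_ratio(text):
    """Number of distinct word forms divided by the number of running words."""
    toks = tokens(text)
    return len(set(toks)) / len(toks) if toks else 0.0


def that_share(text, verb="said"):
    """Share of occurrences of `verb` immediately followed by 'that'."""
    toks = tokens(text)
    total = with_that = 0
    for i, tok in enumerate(toks):
        if tok == verb:
            total += 1
            if i + 1 < len(toks) and toks[i + 1] == "that":
                with_that += 1
    return with_that / total if total else 0.0


# Toy stand-ins for a translated and a non-translated corpus.
translated = "He said that he was tired. She said that it was late."
non_translated = "He said he was tired. She said it's late."

for name, text in [("translated", translated), ("non-translated", non_translated)]:
    print(name, round(type_token_ratio(text), 2), that_share(text))
```

On real corpora the type/token ratio should be computed on samples of equal length, since it falls as texts get longer.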

????????????????????

1. ??????????????????????????????????????????

2.     ??????????????????????????????????????

3.     ????????????????????

4. ??????????????????????? (Aijmer & Altenberg 1996: 12, cited in Zhonghua Xiao)

??-???????

•http://icl.pku.edu.cn/project/parallel/default.htm ???????????

•http://www.ling.lancs.ac.uk/corplang/babel/babel.htm (The Babel English-Chinese Parallel Corpus)

????????????????????????????????

•???????????“?????????”??????????????????“?????”???????????????????????????????????????150??/??170??/??100??/??130??/??????????40%??????60%????????????55%?45%?????? 2003??

???????

•??BNC????????????BNC???http://info.ox.ac.uk/bnc/index.html ???????ftp?ftp://sable.ox.ac.uk/pub/ota/BNC/SARA/?????????????SARA??????????20??

•COBUILD?????????????????COBUILD?????????http://titania.cobuild.collins.co.uk/index.html???????????????????????????50??

•TeCCon??????????????????????Manchester?????http://www.art.man.ac.uk/SML/ctis/research/tec.htm ???????????????????????????????????????????????????????????????????????????????????????TEC Browser?http://ronaldo.cs.tcd.ie/tec/jnlp/ ????????????????????

References

•??. ?????????????????????????2002(6).

•????????????????????, 2000 (5).

•??? . ????????? . ????, 2001 (5).

•???. ????????????????2000?2003?

•???. ????????????????????,??????,2003 (1).

•???. ???????????? ——????????. ?????????, 2002 (3).

•Baker, Mona. Corpus-based Translation Studies (Lecture Handout), 2004.

•http://www.monabaker.com/tsresources/TranslationalEnglishCorpus.htm

•http://www.art.man.ac.uk/SML/ctis/research/tec.htm