Zhu Xiaomin
Step One: Download GATE 5.1 from https://sourceforge.net/projects/gate/files/gate/5.1/gate-5.1-build3431-installer-win.exe/download
Step Two:
Install the program
Step Three: Locate the file C:/Program files/GATE-5.1/plugins/Lang_Chinese/resources/models/model-paum-pku-utf8.zip and unzip it to a location that you can remember
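If you would rather script the unzipping than use an archive manager, the following is a minimal Python sketch using the standard zipfile module. The function name unzip_model and the destination folder in the example call are illustrative assumptions and are not part of GATE; only the zip path comes from the step above.

```python
import zipfile
from pathlib import Path


def unzip_model(zip_path, dest_dir):
    """Extract every file in zip_path into dest_dir and return the archived names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the target folder if it is missing
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()


# Example call (C:/gate-models is a hypothetical destination, pick your own):
# unzip_model(
#     "C:/Program files/GATE-5.1/plugins/Lang_Chinese/resources/models/model-paum-pku-utf8.zip",
#     "C:/gate-models",
# )
```

Whatever destination you choose is the folder you will later point modelURL at.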
Step Four:
      1. Start GATE
      2. File, Manage CREOLE Plugins
      3. Find Lang_Chinese and click in the box under Load Now, OK
      4. Processing Resources (Right Click), type in Chinese Segmenter, OK
      5. Language Resources (Right Click), New, GATE Corpus, Name the corpus (for example, Chinesecorpus), OK
      6. Right Click on Chinesecorpus, Populate, Browse to the folder that contains the corpus and add that path to the Directory URL, Click on the pencil symbol, Type txt, Add, OK, Encoding Type utf-8, OK.
      7. Right Click Applications, select Corpus Pipeline, OK
      8. Double Click Corpus Pipeline, Double Click Chinese Segmenter in Loaded Processing Resources, then Chinese Segmenter moves into Selected Processing Resources. Make sure the following is correct:
       learningAlg = PAUM
       learningMode = SEGMENTING
       modelURL = model-paum-pku-utf8 (the folder where you unzipped the file in Step Three)
       textCode = utf8
       textFilesURL = (browse to the corpus folder)
      9. Click on Run this Application. This can take some time depending on the size of the corpus (approximately 5 minutes for 40 texts).
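The settings above assume that every file in the corpus folder is a txt file saved as UTF-8. As a quick check before running the segmenter, here is a small Python sketch that lists the files that cannot be decoded as UTF-8. The function name find_non_utf8_files is my own and is not part of GATE.

```python
from pathlib import Path


def find_non_utf8_files(corpus_dir, extension=".txt"):
    """Return the names of files with the given extension that are not valid UTF-8."""
    bad = []
    for path in sorted(Path(corpus_dir).glob("*" + extension)):
        try:
            path.read_bytes().decode("utf-8")  # raises if the bytes are not UTF-8
        except UnicodeDecodeError:
            bad.append(path.name)
    return bad
```

Any file name this returns probably needs to be re-saved as UTF-8 (for example, from a GB-encoded original) before it is added to the corpus.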
(Provided by Zhu Xiaomin on June 11, 2010)
      GATE was developed by Cunningham, Hamish et al. [The University of Sheffield (http://gate.ac.uk/)] (2001-2010).
Step One:
Install Java (JRE) on your computer
      You can download Java from http://sdlc-esd.sun.com/ESD6/JSCDL/jre/6u18-b79/jxpiinstall.exe?AuthParam=1269156422_b6361febd3fd5bf0c616837bde692629&GroupName=JSC&FilePath=/ESD6/JSCDL/jre/6u18-b79/jxpiinstall.exe&File=jxpiinstall.exe&BHost=javadl.sun.com
      Check your Java version:
      1. Click Start
      2. Type cmd and press enter
      3. This will open the command prompt window
      4. Type java -version and press enter
      5. You will get a message that starts with java version, followed by the version number installed on your computer
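The same check can be done from a script. The sketch below, in Python, runs java -version and pulls out the quoted version string. The function names are my own, and the pattern assumes the version appears in double quotes after the word version, which is how the java launcher normally reports it.

```python
import re
import subprocess


def parse_java_version(output):
    """Return the quoted version string from `java -version` output, or None."""
    match = re.search(r'version "([^"]+)"', output)
    return match.group(1) if match else None


def installed_java_version():
    """Run `java -version`; return the version string, or None if Java is not found."""
    try:
        result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    except FileNotFoundError:
        return None  # no java executable on the PATH
    # java -version writes its message to stderr, so look at both streams
    return parse_java_version(result.stderr + result.stdout)
```

If installed_java_version() returns None, Java is either not installed or not on your PATH.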
Step Two
      Download the Stanford POS Tagger from http://nlp.stanford.edu/software/stanford-postagger-full-2010-05-26.tgz
Step Three
      Unzip the file to a location you are comfortable with, using archive manager software such as WinRAR, 7-Zip, or WinZip.
      You might want to change the name of the unzipped folder to stanTagger. I do this because the original name is too long: stanford-postagger-full-
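If you do not have an archive manager at hand, a tgz file can also be unpacked with Python's standard tarfile module. The sketch below extracts the archive and renames the unpacked folder to stanTagger in one go. The function name extract_tagger is my own, and the code assumes the archive holds exactly one top-level folder; it raises an error otherwise.

```python
import tarfile
from pathlib import Path


def extract_tagger(tgz_path, dest_dir, new_name="stanTagger"):
    """Extract a .tgz archive into dest_dir and rename its top-level folder to new_name."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tgz_path, "r:gz") as archive:
        # Collect the first path component of every entry (assumed: a single folder)
        top_levels = {member.name.split("/")[0] for member in archive.getmembers()}
        archive.extractall(dest)
    if len(top_levels) != 1:
        raise ValueError("expected exactly one top-level folder in the archive")
    target = dest / new_name
    (dest / top_levels.pop()).rename(target)
    return target
```

The returned path is the stanTagger folder used in the following steps.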
Step Four
      In the stanTagger folder, create two folders to hold your files, e.g., myCorpus and myTaggedCorpus. Now put some text files (or your corpus) in myCorpus. Make sure there are no spaces in your file names: for example, use writtenArgument.txt instead of written Argument.txt
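For a large corpus, removing the spaces by hand is tedious. The following Python sketch renames every file in a folder so that its name has no spaces, in the same way as the writtenArgument.txt example. The function name remove_spaces_from_names is my own; the code stops with an error rather than overwrite an existing file.

```python
from pathlib import Path


def remove_spaces_from_names(folder):
    """Rename files in folder so their names contain no spaces; return (old, new) pairs."""
    renamed = []
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and " " in path.name:
            new_path = path.with_name(path.name.replace(" ", ""))
            if new_path.exists():
                raise FileExistsError(f"{new_path.name} already exists")
            path.rename(new_path)
            renamed.append((path.name, new_path.name))
    return renamed


# Example call, run from inside the stanTagger folder:
# remove_spaces_from_names("myCorpus")
```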
Step Five
      1. Start your command window as described in Step One
      2. Go to the folder that contains the Stanford Tagger:
      This is how you do it:
      cd (place where the Stanford Postagger is unzipped)\stanTagger
      3. Run the program using your command prompt window:
      For tagging one segmented Chinese text:
      java -mx
      For tagging more than one segmented Chinese text:
      FOR %a IN (Place where Stanford Postagger is unzipped\stanTagger\myCorpus\*.txt) DO java -mx
      4. After typing the command above, press enter
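The FOR loop above simply visits every txt file in myCorpus one at a time. The Python sketch below shows the same idea: it pairs each input file with an output file of the same name in myTaggedCorpus. The function name plan_tagging_jobs is my own, and the tagger command itself is deliberately left out; this only illustrates the looping part.

```python
from pathlib import Path


def plan_tagging_jobs(corpus_dir, tagged_dir):
    """Pair every .txt file in corpus_dir with an output path of the same name in tagged_dir."""
    out_dir = Path(tagged_dir)
    return [
        (path, out_dir / path.name)
        for path in sorted(Path(corpus_dir).glob("*.txt"))
    ]


# Example: list what would be tagged, and where the result would go
# for source, target in plan_tagging_jobs("myCorpus", "myTaggedCorpus"):
#     print(source, "->", target)
```

Each pair could then be handed to the java command for one text, for instance through Python's subprocess module.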
(June 11, 2010)