From ce599e4f9f94b4eb00c1b5edb85bce5431ab3df2 Mon Sep 17 00:00:00 2001
From: toma
+
+Copyright (C) 2003 The Electronic Dictionary Research and Development Group,
+Monash University.
+
+Contents:
+
+INTRODUCTION
+
+The EDICT file results from a long-running project to produce a freely
+available Japanese/English Dictionary in machine-readable form.
+
+The EDICT file is copyright, and is distributed in accordance with the
+Licence Statement, which can be found at the WWW site of the
+ Electronic Dictionary Research and Development Group
+who are the owners of the copyright.
+
+The version date and sequence number are included in the dictionary itself
+under the entry "EDICT". (Actually it is under the JIS-ASCII code "????".
+This keeps it as the first entry when it is sorted.)
+
+The master copy of EDICT is in the pub/nihongo directory of
+ ftp.cc.monash.edu.au.
+There are other copies around, but they may not be
+as up-to-date. The easy way to check if the version you have is the latest is
+from the size/date.
+
+As of V96-001, the EDICT file no longer contains proper names. These have
+been moved to a separate file called "ENAMDICT".
+From V99-002, the EDICT file has been generated from an extended dictionary
+database which includes additional fields and information. See the later
+section on the new JMdict project for details of this.
+
+FORMAT
+
+EDICT's format is that of the original "EDICT" format used by the early
+PC Japanese word-processor MOKE (Mark's Own Kanji Editor).
+It uses EUC-JP coding for kana and kanji; however, this can be converted to
+JIS (ISO-2022-JP) or Shift-JIS
+by any of the several conversion programs around. It is a text file with one
+entry per line. The format of entries is:
+
+KANJI [KANA] /English_1/English_2/.../
+
+or
+
+KANA /English_1/.../
+
+(NB: Only the KANJI and KANA are in EUC; all the other characters, including
+spaces, must be ASCII.)
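+
+As an illustration only, the following Python sketch reads entries laid out
+as "KANJI [KANA] /English_1/English_2/.../" or "KANA /English_1/.../" from
+EUC-JP bytes, and re-encodes the file in another coding. The names Entry,
+parse_entry, read_edict and convert are hypothetical and not part of EDICT
+or of any program mentioned here; the sketch also assumes that a head-word
+contains no spaces.

```python
import re
from typing import List, NamedTuple, Optional

# Layout taken from this file:
#   KANJI [KANA] /English_1/English_2/.../
#   KANA /English_1/.../
# Assumption: the head-word itself contains no spaces.
ENTRY_RE = re.compile(
    r"^(?P<head>\S+)(?: \[(?P<reading>[^\]]+)\])? /(?P<glosses>.*)/$"
)


class Entry(NamedTuple):
    headword: str
    reading: Optional[str]  # None for kana-only entries
    glosses: List[str]


def parse_entry(line: str) -> Entry:
    """Split one decoded EDICT line into head-word, reading and glosses."""
    match = ENTRY_RE.match(line.rstrip("\r\n"))
    if match is None:
        raise ValueError(f"not an EDICT entry: {line!r}")
    glosses = [g for g in match.group("glosses").split("/") if g]
    return Entry(match.group("head"), match.group("reading"), glosses)


def read_edict(raw: bytes) -> List[Entry]:
    """Decode EUC-JP bytes and parse every non-blank line."""
    lines = raw.decode("euc_jp").splitlines()
    return [parse_entry(line) for line in lines if line.strip()]


def convert(raw: bytes, target: str) -> bytes:
    """Re-encode EUC-JP bytes, e.g. as "shift_jis" or "iso2022_jp"."""
    return raw.decode("euc_jp").encode(target)
```

+Calling read_edict() on the bytes of the file gives one Entry per line, and
+convert(raw, "shift_jis") gives the same text in Shift-JIS. Decoding first
+and matching afterwards keeps the pattern independent of the coding used.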
+
+The English translations are deliberately brief, as the application of the
+dictionary is expected to be primarily on-line look-ups, etc.
+
+The EDICT file is not intended to have its entries in any particular order.
+In fact it is almost always in order, as a by-product of the update method I
+use, but there is no guarantee of this. (The order is almost always JIS
++ alphabetical, starting with the head-word.)
+
+EDICT has developed as follows:
+
+
+At this stage EDICT has many more entries than many good commercial dictionaries,
+which typically have 20,000+ non-name entries with examples, etc. It is
+certainly bigger than some of the smaller printed dictionaries, and when used
+in conjunction with a search-and-display program like JDIC or XJDIC it
+provides a highly effective on-line dictionary service.
+
+Dictionary copyright is a difficult point, because clearly the first
+lexicographer who published "inu means dog" could not claim a copyright
+violation over all subsequent Japanese dictionaries. While it is usual to
+consult other dictionaries for "accurate lexicographic information", as
+Nelson put it, wholesale copying is, of course, not permissible. What makes
+each dictionary unique (and copyrightable) is the particular selection of
+words, the phrasing of the meanings, the presentation of the contents (a very
+important point in the case of EDICT), and the means of publication. Of
+course, the fact that for the most part the kanji and kana of each entry are
+coming from public sources, and the structure and layout of the entries
+themselves are quite unlike those in any published dictionary, adds a degree
+of protection to EDICT.
+
+The advice I have received from people who know about these things is that
+EDICT is just as much a new dictionary as any others on the market. Readers
+may see an entry which looks familiar, and say "Aha! That comes from the XYZ
+Jiten!". They may be right, and they may be wrong. After all there aren't
+too many translations of neko. Let me make one thing quite clear: despite
+considerable temptation (Electronic Books can be easily decoded), NONE of
+this dictionary came from commercial machine-readable dictionaries. I have a
+case of RSI in my right elbow to prove it.
+
+Please do not contribute entries to EDICT which have come directly from
+copyrightable sources. It is hard to check these, and you may be
+jeopardizing EDICT's status.
+
+Introduction
+
+EDICT is actually a Japanese->English dictionary, although the words within
+it can be selected in either language using appropriate software. (JDIC uses
+it to provide both E->J and J->E functionality.)
+
+The early stages of EDICT had size limitations due to its usage (MOKE scans
+it sequentially and JDXGEN, which is JDIC's index generator, held it in RAM.)
+This meant that examples of usage could not be included, and inclusion of
+phrases was very limited. JDIC/JDXGEN can now handle a much larger
+dictionary, but the compact format has continued.
+
+No inflections of verbs or adjectives have been included, except in idiomatic
+expressions. Similarly particles are handled as separate entries. Adverbs
+formed from adjectives (-ku or ni) are generally not included. Verbs are, of
+course, in the plain or "dictionary" form.
+
+Priority Entries
+
+Starting with the 2001 editions, approximately 20,000 entries comprising the most commonly-used words in Japanese are marked
+with a "(P)" at the end of the entry. This list has been identified by
+examining several small
+dictionaries, and lists of common gairaigo from Japanese newspapers.
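+
+A program can select the priority entries by testing for the marker at the
+end of the line. The sketch below is an assumption-laden illustration: the
+function name is_priority is hypothetical, and it accepts the "(P)" either
+as its own "/"-delimited field or at the end of the last gloss, since this
+file only states that the marker comes at the end of the entry.

```python
def is_priority(line: str) -> bool:
    """Return True if an EDICT line ends with the "(P)" priority marker."""
    # Drop the line ending and the closing "/" of the entry, then look at
    # what is left at the very end of the line.
    body = line.rstrip().rstrip("/")
    return body.endswith("(P)")


def priority_entries(lines):
    """Keep only the lines that carry the priority marker."""
    return [line for line in lines if is_priority(line)]
```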
+
+Parts of Speech
+
+In working on EDICT, bearing in mind I want to use it in MOKE and with JDIC,
+I had to come up with a solution to the problem of adjectival nouns
+[keiyoudoushi] (e.g. kirei and kantan), nouns which can be used adjectivally
+with the particle "no" and verbs formed by adding suru (e.g. benkyousuru).
+If I put entries in EDICT with the "na" and "suru" included, MOKE would not
+find a match when they are omitted or, in the case of suru, inflected. What I
+decided to do is to put the basic noun into the dictionary and add
+"(vs)" where it can be used to form a verb with suru, "(a-no)" for common
+"no" usage, and "(an)" if it is an adjectival noun. Entries appeared as:
+
+KANJI [benkyou] /study (vs)/
+KANJI [kantan] /simple (an)/
+
+In early 2001, as part of the JMdict project (see below), I completely revised
+this system, instead introducing a comprehensive system of Part of Speech
+(POS) tags. In the EDICT version of the file these tags usually appear in
+parentheses
+at the start of the entry, separated into general tags and POS tags. Where
+a tag applies to a single gloss or meaning, it will be included there instead.
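+
+One possible way to separate leading tags from the text of a gloss is
+sketched below. It is not how JDIC or any other program named in this file
+works. It assumes that several tags may share one pair of parentheses,
+separated by commas, which this file does not specify. KNOWN_TAGS is a
+small hypothetical subset of the markers listed in this file, and
+split_tags is a hypothetical name.

```python
import re

# A few markers copied from the list in this file; a real program would
# load the complete list.
KNOWN_TAGS = {"n", "vs", "adj-na", "adj-no", "v1", "v5r", "vi", "vt", "uk", "abbr"}

LEADING_PAREN = re.compile(r"^\(([^()]*)\)\s*")


def split_tags(gloss, known=KNOWN_TAGS):
    """Return (tags, remaining text) for one gloss.

    Leading parenthesised groups are consumed only while every item in the
    group is a known tag, so ordinary bracketed English is left in place.
    """
    tags = []
    rest = gloss
    while True:
        match = LEADING_PAREN.match(rest)
        if match is None:
            break
        candidates = [item.strip() for item in match.group(1).split(",")]
        if not all(item in known for item in candidates):
            break
        tags.extend(candidates)
        rest = rest[match.end():]
    return tags, rest
```

+Checking candidates against a list of known tags, rather than treating any
+leading parenthesis as a tag, avoids swallowing sense numbers such as "(1)"
+and explanatory notes in the English text.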
+
+The (hopefully) full list of such markers is:
+
+
+Multiple Senses
+
+From the 2001 editions of EDICT, the differing senses associated with
+the Japanese head-words are being progressively marked. The marking takes the
+form of a "(1)", "(2)", etc. in front of the senses.
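+
+The sense markers can be used to group the glosses of an entry. The sketch
+below is illustrative only; group_senses is a hypothetical name. It assumes
+that a gloss without a marker belongs to the most recently numbered sense,
+and that an entry with no markers at all has a single sense.

```python
import re

SENSE_MARK = re.compile(r"\((\d+)\)\s*")


def group_senses(glosses):
    """Map sense number -> list of glosses, following "(1)", "(2)", ..."""
    senses = {}
    current = 1  # entries without markers are treated as one sense
    for gloss in glosses:
        match = SENSE_MARK.search(gloss)
        if match is not None:
            current = int(match.group(1))
            # Remove the marker but keep any text (such as tags) before it.
            gloss = gloss[:match.start()] + gloss[match.end():]
        senses.setdefault(current, []).append(gloss)
    return senses
```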
+
+Spellings
+
+I have endeavoured to cater for many possible variants of English translation
+and spelling. Where appropriate different translations are included for
+national variants (e.g. autumn/fall). I use Oxford (British) standard
+spelling (-our, -ize) for the entries I make, but I leave other entries in
+the national spelling of the submitter.
+
+At some stage in the future I intend to regularize the English spellings in such
+a way that allows searches on either British or American spellings
+to be successful.
+
+Gairaigo and Regional Words
+
+For gairaigo which have not been derived from English words, I have attempted
+to indicate the source language and the word in that language. Languages have
+been coded in the two-letter codes from the ISO 639:1988 "Code for the
+representation of names of languages" standard, e.g. "(fr: avec)". See
+Appendix C for more on this. (Thanks to Holger Gruber for suggesting this
+language coding.)
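+
+A source-language note of the form "(fr: avec)" can be picked out of a
+gloss with a short pattern. The sketch below is illustrative: source_words
+is a hypothetical name, LANGUAGE_NAMES holds only a few of the codes from
+Appendix C, and the sample glosses in the usage note are made up rather
+than quoted from EDICT.

```python
import re

# A few codes from Appendix C of this file; extend as needed.
LANGUAGE_NAMES = {"fr": "French", "de": "German", "nl": "Dutch", "ru": "Russian"}

# Two lower-case letters, a colon, then the word in the source language.
SOURCE_RE = re.compile(r"\(([a-z]{2}):\s*([^)]+)\)")


def source_words(gloss):
    """Return (code, language name, source word) for each note in a gloss."""
    return [
        (code, LANGUAGE_NAMES.get(code, "unknown"), word.strip())
        for code, word in SOURCE_RE.findall(gloss)
    ]
```

+For example, source_words("with (fr: avec)") gives one tuple holding "fr",
+"French" and "avec". A gloss with no such note gives an empty list.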
+
+In addition to the language codes described in Appendix C, a number of tags
+are used to indicate that a word or phrase is associated with a particular
+regional language variant within Japan. The tags are:
+
+
+In the case of gairaigo which have a meaning which is not apparent from the
+original (English) words, the literal transcription is included, with
+the tag (lit).
+
+Early in 1999 work began on the JMdict project, which aims to extend the
+structure and content of the EDICT file to enable it to contain
+additional information and provide an improved service to users.
+
+The project has several broad goals:
+
+By May 1999 the EDICT file had been converted into the new format. A major
+part of this consisted of identifying and combining entries which were
+effectively variants of each other.
+
+Since V99-002, the EDICT file has been generated from the new format.
+This has meant:
+
+EDICT can be freely used provided satisfactory acknowledgement is made,
+and a number of other conditions are met.
+Consult the Licence Statement information at Appendix A.
+
+It is, of course, the main dictionary used by PD and GPL Copyright software
+such as JDIC, JREADER, XJDIC, MacJDic, etc. It can be used as the
+dictionary within MOKE (it may need to be renamed JTOE.DCT if used with
+version 2.1 of MOKE), and it is also used by the NJSTAR and JWP Word
+Processor packages.
+
+I will be delighted if people send me corrections, suggestions, and ESPECIALLY
+additions. Before ripping in with a lot of suggestions, make sure you have the
+latest version, as others may have already made the same comments.
+
+The preferred format for submissions is a JIS, EUC or Shift-JIS file (uuencoded
+for safety) containing replacement/new entries. This can be emailed to me at
+the address at the end of this file.
+
+Feel free to use the following format:
+
+
+Please provide an annotated reason for any deletions or amendments you send.
+
+I prefer not to get a "diff" or "patch" file as the master EDICT is under
+continuous revision, and may have had quite a few changes since you got your
+copy.
+
+Users intending to make submissions to EDICT should observe the following
+simple rules:
+
+The following people, in roughly chronological order, have played a part in
+the development of EDICT. (I stopped adding to this list some years ago, so
+it is of historical interest now.)
+
+Mark Edwards, Spencer Green, Alina Skoutarides, Takako Machida, Theresa
+Martin, Satoshi Tadokoro, Stephen Chung, Hidekazu Tozaki, Clifford Olling,
+David Cooper, Ken Lunde, Joel Schulman, Hiroto Kagotani, Truett Smith, Mike
+Rosenlof, Harold Rowe, Al Harkom, Per Hammarlund, Atsushi Fukumoto, John
+Crossley, Bob Kerns, Frank O'Carroll, Rik Smoody, Scott Trent, Curtis
+Eubanks, Jamie Packer, Hitoshi Doi, Thalawyn Silverwood, Makato Shimojima,
+Bart Mathias, Koichi Mori, Steven Sprouse, Jeffrey Friedl, Yazuru Hiraga, Kurt
+Stueber, Rafael Santos, Bruce Casner, Masato Toho, Carolyn Norton, Simon
+Clippingdale, Shiino Masayoshi, Susumu Miki, Yushi Kaneda, Masahiko
+Tachibana, Naoki Shibata, Yuzuru Hiraga, Yasuaki Nakano, Atsu Yagasaki,
+Hitoshi Oi, Chizuko Kanazawa, Lars Huttar, Jonathan Hanna, Yoshimasa Tsuji,
+Masatsugu Mamimura, Keiichi Nakata, Masako Nomura, Hiroshi Kamabe, Shi-Wen
+Peng, Norihiro Okada, Jun-ichi Nakamura, Yoshiyuki Mizuno, Minoru Terada,
+Itaru Ichikawa, Toru Matsuda, Katsumi Inoue, John Finlayson, David Luke, Iain
+Sinclair, Warwick Hockley, Jamii Corley, Howard Landman, Tom Bryce, Jim
+Thomas, Paul Burchard, Kenji Saito, Ken Eto, Niibe Yutaka, Hideyuki Ozaki,
+Kouichi Suzuki, Sakaguchi Takeyuki, Haruo Furuhashi, Takashi Hattori,
+Yoshiyuki Kondo, Kusakabe Youichi, Nobuo Sakiyama, Kouhei Matsuda, Toru Sato,
+Takayuki Ito, Masayuki Tokoshima, Kiyo Inaba, Dan Cohn, Yo Tomita, Ed Hall,
+Takashi Imamura, Bernard Greenberg, Michael Raine, Akiko Nagase, Ben Bullock,
+Scott Draves, Matthew Haines, Andy Howells, Takayuki Ito, Anders Brabaek,
+Michael Chachich, Masaki Muranaka, Paul Randolph, Vesa Karhu, Bruce Bailey,
+Gal Shalif, Riichiro Saito, Keith Rogers, Steve Petersen, Bill Smith, Barry
+Byrne, Satoshi Kuramoto, Jason Molenda, Travis Stewart, Yuichiro Kushiro,
+Keiko Okushi, Wayne Lammers, Koichi Fujino, Joerg Fischer, Satoru Miyazaki,
+Gaspard Gendreau, David Olson, Peter Evans, Steven Zaveloff, Larry Tyrrell,
+Heinz Clemencon, Justin Mayer, David Jones, Holger Gruber, David Wilson,
+John De Hoog, Stephen Davis, Dan Crevier, Ron Granich, Bruce Raup, Scott
+Childress, Richard Warmington, Jean-Jacques Labarthe, Matt Bloedel, Szabolcs
+Varga, Alan Bram, Hidetaka Koie, David Villareale, Hirokazu Ohata, Toshiki
+Sasabe, William Maton, Tom Salmon, Kian Yap, Paul Denisowski, Glen Pankow,
+Richard Northcott, Roger Meunier, Petteri Kettunen, Jeff Korpa, Kanji
+Haitani, Liam O'Brien, Serdar Yegulalp, Jonathan Way, Gururaj Rao, Yoichiro
+Niitsu, Ralph Seewald, Andreas Jordell, Chua Hian Koon, Hartmut Pilch,
+Shouichi Takeuchi, Ayumu Yasutomi, Mike Wright, James Rose, Nich Hill.
+
+Jim Breen
+ E D I C T
+ JAPANESE/ENGLISH DICTIONARY FILE
+
+
+
+KANJI [KANA] /English_1/English_2/.../
+
+
+KANA /English_1/.../
+
+
+
+A reasonably full list of contributors is at the back of this file,
+although I am sure to have missed a few.
+
+KANJI [benkyou] /study (vs)/
+KANJI [kantan] /simple (an)/
+
+
+abbr abbreviation
+adj adjective (keiyoushi)
+adv adverb (fukushi)
+adv-n adverbial noun
+adj-na adjectival nouns or quasi-adjectives (keiyoudoushi)
+adj-no nouns which may take the genitive case particle "no"
+adj-pn pre-noun adjectival (rentaishi)
+adj-s special adjective (e.g. ookii)
+adj-t "taru" adjective
+arch archaism
+ateji ateji reading of the kanji
+aux auxiliary word or phrase
+aux-v auxiliary verb
+conj conjunction
+col colloquialism
+exp Expressions (phrases, clauses, etc.)
+ek exclusively kanji, rarely just in kana
+fam familiar language
+fem female term or language
+gikun gikun (meaning) reading
+gram grammatical term
+hon honorific or respectful (sonkeigo) language
+hum humble (kenjougo) language
+id idiomatic expression
+int interjection (kandoushi)
+iK word containing irregular kanji usage
+ik word containing irregular kana usage
+io irregular okurigana usage
+MA martial arts term
+male male term or language
+m-sl manga slang
+n noun (common) (futsuumeishi)
+n-adv adverbial noun (fukushitekimeishi)
+n-t noun (temporal) (jisoumeishi)
+n-suf noun, used as a suffix
+n-pref noun, used as a prefix
+neg negative (in a negative sentence, or with negative verb)
+neg-v negative verb (when used with)
+num number, numeric
+obs obsolete term
+obsc obscure term
+oK word containing out-dated kanji
+ok out-dated or obsolete kana usage
+pol polite (teineigo) language
+pref prefix
+prt particle
+qv quod vide (see another entry)
+sl slang
+suf suffix
+uK word usually written using kanji alone
+uk word usually written using kana alone
+v1 Ichidan verb
+v5 Godan verb (not completely classified)
+v5u Godan verb with `u' ending
+v5u-s Godan verb with `u' ending - special class
+v5k Godan verb with `ku' ending
+v5g Godan verb with `gu' ending
+v5s Godan verb with `su' ending
+v5t Godan verb with `tsu' ending
+v5n Godan verb with `nu' ending
+v5b Godan verb with `bu' ending
+v5m Godan verb with `mu' ending
+v5r Godan verb with `ru' ending
+v5k-s Godan verb - Iku/Yuku special class
+v5z Godan verb - -zuru special class (alternative form of -jiru verbs)
+v5aru Godan verb - -aru special class
+v5uru Godan verb - Uru old class verb (old form of Eru)
+vi intransitive verb
+vs noun or participle which takes the aux. verb suru
+vs-i suru verb - irregular
+vs-s suru verb - special class
+vk Kuru verb - special class
+vt transitive verb
+vulg vulgar expression or word
+X rude or X-rated term (not displayed in educational software)
+
+
+kyb Kyoto-ben
+osb Osaka-ben
+ksb Kansai-ben
+ktb Kantou-ben
+tsb Tosa-ben
+
+
+
+For more information on the JMdict project, please see the documentation
+files.
+
+
+
+
+USAGE
+
+NEW: KANJI1 [kana1] /new entry #1/
+
+NEW: KANJI2 [kana2] /new entry #2/
+
+old: KANJI3 [kana3] /old entry to be replaced/
+new: KANJI3 [kana3] /replacement entry/
+
+DEL: KANJI4 [kana4] /entry to be deleted/
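+
+A submission laid out with the "NEW:", "old:"/"new:" and "DEL:" prefixes
+shown above can be checked mechanically before it is sent. The sketch below
+is one hypothetical way to do so; parse_submission is not part of any
+EDICT tool. It assumes that each "old:" line is followed by its "new:"
+replacement and that the prefixes are case-sensitive.

```python
def parse_submission(text):
    """Sort submission lines into additions, replacements and deletions."""
    additions, replacements, deletions = [], [], []
    pending_old = None  # the "old:" entry waiting for its "new:" line
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        prefix, _, entry = line.partition(": ")
        if prefix == "NEW":
            additions.append(entry)
        elif prefix == "old":
            pending_old = entry
        elif prefix == "new":
            if pending_old is None:
                raise ValueError(f'"new:" without a preceding "old:": {line!r}')
            replacements.append((pending_old, entry))
            pending_old = None
        elif prefix == "DEL":
            deletions.append(entry)
        else:
            raise ValueError(f"unrecognised line: {line!r}")
    return additions, replacements, deletions
```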
+
+
+
+
+
+jwb@csse.monash.edu.au
+
+School of Computer Science & Software Engineering
+
+Monash University
+
+Clayton 3168
+
+AUSTRALIA
+
+APPENDIX A: EDICT LICENCE STATEMENT
+
+In March 2000, James William Breen assigned ownership of the copyright
+of the dictionary files assembled, coordinated and edited by him to
+The Electronic Dictionary Research and Development Group at Monash
+University.
+
+EDICT can be freely used provided satisfactory acknowledgement is made,
+and a number of other conditions are met.
+Information about the licence and copyright for EDICT can be found on
+the Group's WWW page at: http://www.csse.monash.edu.au/groups/edrdg/
+
+In summary, EDICT can be freely used with acknowledgement.
+
+The following language codes have been used with non-English derived
+gairaigo. They have been derived from the ISO 639:1988 "Code for the
+representation of names of languages" standard.
+
+ar Arabic
+zh Chinese (Zhongwen)
+de German (Deutsch)
+en English
+fr French
+el Greek (Ellinika)
+iw Hebrew (Iwrith)
+ja Japanese
+ko Korean
+nl Dutch (Nederlands)
+no Norwegian
+pl Polish
+ru Russian
+sv Swedish
+bo Tibetan (Bodskad)
+eo Esperanto
+es Spanish
+in Indonesian
+it Italian
+lt Latin
+pt Portuguese
+hi Hindi
+ur Urdu
+mn Mongolian
+kl Inuit (formerly Eskimo)
+
+And I have added the following, which are not in the Standard:
+
+ai Ainu
+