author | tpearson <tpearson@283d02a7-25f6-0310-bc7c-ecb5cbfe19da> | 2010-01-20 01:29:50 +0000 |
---|---|---|
committer | tpearson <tpearson@283d02a7-25f6-0310-bc7c-ecb5cbfe19da> | 2010-01-20 01:29:50 +0000 |
commit | 8362bf63dea22bbf6736609b0f49c152f975eb63 (patch) | |
tree | 0eea3928e39e50fae91d4e68b21b1e6cbae25604 /kexi/kexiutils/transliteration_table.readme | |
download | koffice-8362bf63dea22bbf6736609b0f49c152f975eb63.tar.gz koffice-8362bf63dea22bbf6736609b0f49c152f975eb63.zip |
Added old abandoned KDE3 version of koffice
git-svn-id: svn://anonsvn.kde.org/home/kde/branches/trinity/applications/koffice@1077364 283d02a7-25f6-0310-bc7c-ecb5cbfe19da
Diffstat (limited to 'kexi/kexiutils/transliteration_table.readme')
-rw-r--r-- | kexi/kexiutils/transliteration_table.readme | 54 |
1 files changed, 54 insertions, 0 deletions
diff --git a/kexi/kexiutils/transliteration_table.readme b/kexi/kexiutils/transliteration_table.readme
new file mode 100644
index 00000000..ff00d8ab
--- /dev/null
+++ b/kexi/kexiutils/transliteration_table.readme
@@ -0,0 +1,54 @@
+Transliteration Table README
+----------------------------
+
+1. Rationale: Identifiers within the database or programming languages
+only accept latin-1 characters, numbers and the '_' character.
+
+Application developers can enter captions (titles) to give
+objects or variables a meaningful name using the full Unicode set.
+
+Transliteration is used to convert Unicode captions to identifiers
+without losing the meaning of the names.
+
+More info:
+ http://en.wikipedia.org/wiki/Transliteration
+ http://en.wikipedia.org/wiki/Romanization
+
+2. We use a special kind of romanization as we only allow the characters
+described in 1.
+
+3. Implementation: a transliteration table, generated by the
+generate_transliteration_table.sh shell script, is used
+to transliterate any Unicode character (having code < 65535)
+to an identifier, which gives constant time for converting
+a single character.
+
+The resulting generated code is kept in transliteration_table.{h|cpp} files,
+included by identifier.cpp for use in public utility functions.
+
+For each item, the table (basically a table of c-strings) contains:
+- a NULL string if the resulting conversion has to be the "_" string;
+- a c-string of size 1 or more containing a valid transliteration
+  as described in 1;
+- an empty string "" if the transliteration should return an empty string
+  (can be useful e.g. for soft signs in Cyrillic)
+
+4. Fixes: Because the iconv/recode tools are not fully implemented in regard
+to transliteration to latin-1 (e.g. no good support
+for Greek and Cyrillic/Serbian characters),
+the transliteration_table.cpp file is patched with
+transliteration_table.cpp.patch, which provides fixes written by hand.
+
+If you find invalid or missing transliterations:
+ a) edit transliteration_table.cpp (using a UTF-8-compliant text editor!)
+    - if the transliteration_table.cpp file does not exist,
+      extract it from the transliteration_table.bz2 archive
+ b) run the update_transliteration_table_patch.sh shell script,
+    which will update the transliteration_table.cpp.patch file
+ c) send the transliteration_table.cpp.patch file to the Kexi team
+
+5. Credits
+ Jaroslaw Staniek <js at iidea.pl>
+ Michael Drueing <michael at drueing.de>
+ Chusslove Illich <caslav.ilic at gmx.net>
+ Michal Svec <rebel at atrey.karlin.mff.cuni.cz>
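
The three kinds of table entry described in section 3 of the README (NULL, a non-empty c-string, and the empty string "") can be illustrated with a minimal sketch. This is not the generated transliteration_table.cpp: the names kToyTable, kFirst, charToIdentifier and captionToIdentifier are hypothetical, the table covers only four Cyrillic code points, and the values in it are made up for illustration. Mapping characters outside the toy range to "_" is also an assumption of this sketch.

```cpp
#include <cstddef>
#include <string>

// Hypothetical miniature table for U+0429..U+042C only. The real table
// is generated and covers every code point below 65535; the entries
// here are illustrative, not copied from it.
static const char16_t kFirst = 0x0429;
static const char* const kToyTable[] = {
    "Shch",   // U+0429: a c-string of size 1 or more
    nullptr,  // U+042A: NULL entry, converted to "_"
    "Y",      // U+042B: a c-string of size 1
    "",       // U+042C: soft sign, converted to an empty string
};
static const std::size_t kCount = sizeof(kToyTable) / sizeof(kToyTable[0]);

// Constant-time lookup of a single character: one index computation
// and one array access, regardless of the table size.
std::string charToIdentifier(char16_t ch) {
    if (ch < kFirst || static_cast<std::size_t>(ch - kFirst) >= kCount)
        return "_";  // assumption: outside the toy range behaves like NULL
    const char* s = kToyTable[ch - kFirst];
    return s ? std::string(s) : std::string("_");
}

// Converts a whole caption by concatenating the per-character results.
std::string captionToIdentifier(const std::u16string& caption) {
    std::string id;
    for (char16_t ch : caption)
        id += charToIdentifier(ch);
    return id;
}
```

Because each character maps to one array slot, converting a caption takes time linear in its length, with constant work per character as the README states.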