You must have the GF Zemen Unicode font installed for the software to work. And in many applications you must set the font to GF Zemen Unicode to see the Geez characters. This nearly final version of the software adds several of the features recommended in earlier test runs including immediate visual output of keystrokes, control key functionality in Word, and font selection capabilities for those applications where the characters are input using rich text format (RTF is used in Word for example). The software works with Netscape Navigator and Composer in all of the Windows operating systems since the Netscape applications reliably comply with the multi-lingual encoding standards.
Making Eritreans more comfortable with computers and computer-based information will be a crucial element of bringing the benefits of the Internet and computers to the average Eritrean.
Limited access to technology (and the enhanced productivity it brings) is one of the main barriers to raising the standard of living and the value of national economic production in Eritrea. The ease of computer access, and the relevance of computer information will be a major element of rapidly transferring and applying computer technologies to Eritrea's economic and productive activities.
The for-profit private sector development model for multi-lingual computer infrastructure has failed Eritrea. First, there are very few multi-lingual software providers and developers in Eritrea (approximately two), and the software that has been developed is expensive, uses non-standard character encodings, and is incompatible with other software in Eritrea and Ethiopia. Prices for existing software range from $20 to $90 per copy, and there were no free versions of the software until UniGeez began free distribution. This has led to the contradiction that English-speaking people can use computers in their own language for free, while Tigrigna speakers (who have a mean per-capita income of $250/year) may have to pay $90 to use a computer in their own language.
The large public benefits of multi-lingual computer access mean that basic utilities that provide easy multi-lingual computer access should be public, rather than private, property. A person trained and proficient in computers has an earning potential of perhaps 2-10 times that of a person without computer training. This means that if the existence of free public multi-lingual computer utilities facilitates computer access for just 1000 more people, the national economic benefit will be at least 1000 people*$1000/yr = $1 million/year. This justifies significant public sector investment in the development of basic multi-lingual computer infrastructure.
To this end the Eritrea Technical Exchange Project of the International Collaborative for Science Education and the Environment (ETEP/ICSEE) has a project to develop and enhance the basic multi-lingual Ge`ez (g'Iz) and Arabic computer facilities and infrastructure in Eritrea. All multi-lingual utilities developed by ETEP will be public property software distributed under the GNU public license.
There are several technical issues that will be important for establishing an efficient multi-lingual computer communications infrastructure in Eritrea.
The most important technical issue is how to consistently encode or represent Ge`ez and Arabic text strings and formatted documents. Fortunately this problem has been largely solved already through the international standard-setting process. There is a set of standards commonly referred to as the Unicode standards, or more technically known as ISO/IEC 10646-1. These standards describe with technical specificity how to encode characters of most of the world's languages, including the Ge`ez syllabary. These standards include "Ethiopic" in Amendment 10, which, even though it is misnamed, includes all letters of the Ge`ez syllabary. Details of the Unicode standards are available at the Unicode Home Page.
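As a small illustration of what "encoding a character" means in practice, the sketch below inspects Ge`ez characters by their code points. It relies only on Python's standard unicodedata module. The helper names is_ethiopic and describe are hypothetical, introduced here for illustration, and the range check covers only the basic Ethiopic block (U+1200 to U+137F) charted in U1200.pdf.

```python
import unicodedata

# Basic Ethiopic block as charted by the Unicode Consortium: U+1200..U+137F.
ETHIOPIC_BLOCK = range(0x1200, 0x1380)


def is_ethiopic(ch):
    """Return True if the single character lies in the basic Ethiopic block."""
    return ord(ch) in ETHIOPIC_BLOCK


def describe(ch):
    """Return the code point and standard character name, e.g. for logging."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"


if __name__ == "__main__":
    for ch in "\u1200\u1208a":
        print(describe(ch), "| Ethiopic:", is_ethiopic(ch))
```

Because every standard-compliant application agrees on these code points, a check like this works the same way regardless of which keyboard software or font produced the text.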
But the setting of an international standard for the encoding of Ge`ez is just a first step. Once the encoding standard is set, fonts that comply with the standard need to be designed, and computer software needs to be developed that allows users to create computerized information content that complies with the standard. To date, there is only one Unicode-compliant TrueType Ge`ez font (GF Zemen Unicode), though there are several Unicode-compliant Unix/Linux fonts. And until recently there has not been any Windows keyboard software that complies with the international encoding standards. As a result, there are about 70 mutually incompatible Ge`ez encodings in use in Eritrea and Ethiopia.
Another rather large task in enabling an efficient multi-lingual infrastructure is providing configuration and software modification support so that common applications can use and display standard-compliant Ge`ez documents and data. There remains a lot of work to be done in this area with regard to graphic design and database software.
The efficiency and productivity of computer communications depend directly on the speed and cost of transferring information from one person or application to another. Currently, in the U.S., most of the time spent getting information goes not to the network transfer but to locating and reading the information. It takes perhaps 15 seconds to go to a search engine, a minute to find the page in the search engine results, and another minute to read the information. If documents are not prepared in a consistent encoding and format, then another step may be needed to read or use information that is in Ge`ez, because the character encoding has to be converted between formats. That step, even if it takes less than a minute, can increase the time of presenting or retrieving information by up to 30%. In addition, developers and content providers would have to spend extra resources to provide translation and conversion facilities for different types of Ge`ez. Even worse, after standards become dominant, the cumulative archives of non-standard Ge`ez documents will have to be converted to be useful. A reasonable estimate of the costs of conversion and conversion support in a computer communications environment without standards is about 10% of computer communications activity.
Current computer communications markets in Eritrea are running at more than $200,000 per year and growing at about 100% per year (that is, doubling annually). The computer services sector is probably five to ten times this amount. This means that the cost of non-compliance with standards can be tens to hundreds of thousands of dollars per year in the near future. Most of this cost is reflected in the lost opportunities of people using English letters and text when they would be much happier and more effective using Ge`ez or Arabic if it were convenient and readily available.
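The "tens to hundreds of thousands of dollars" range can be reproduced with a short calculation. The sketch below uses only the figures quoted above ($200,000 per year, annual doubling, a services sector up to ten times larger, and a 10% conversion overhead). Applying the 10% figure to the services sector as well as to the communications market is an assumption made here for illustration, and the function name overhead_range is hypothetical.

```python
COMM_MARKET_USD = 200_000      # communications market, dollars per year (from the text)
GROWTH_FACTOR = 2              # "about 100% per year" means doubling annually
SERVICES_MULTIPLE_HIGH = 10    # services sector is "five to ten times" larger
OVERHEAD_PERCENT = 10          # estimated cost of conversion and conversion support


def overhead_range(years_ahead):
    """Return (low, high) yearly cost of non-compliance in dollars.

    low  applies the overhead to the communications market alone;
    high applies it to a services sector ten times that size (an assumption).
    """
    comm = COMM_MARKET_USD * GROWTH_FACTOR ** years_ahead
    low = comm * OVERHEAD_PERCENT // 100
    high = comm * SERVICES_MULTIPLE_HIGH * OVERHEAD_PERCENT // 100
    return low, high


if __name__ == "__main__":
    for year in range(3):
        low, high = overhead_range(year)
        print(f"year +{year}: ${low:,} to ${high:,} per year")
```

At today's market size this gives $20,000 to $200,000 per year, which matches the range stated above.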
In this section we describe the technical
specifications of the public property standard compliant
Ge`ez software for Windows. The software is referred to
as "the package." The software is scheduled for its
version 1.0 release in October 2000. The software was
written and developed by Marcus Wright and Will Briggs of the
Lynchburg College Computer Science Department.
1. The package runs in Microsoft Windows 95 or later
including Windows 98, Windows NT and Windows 2000.
2. The package runs in the background, so that it can intercept
keyboard input and convert it to Ge`ez script before it reaches
the active program.
3. There should be two modes: Ge`ez and Roman. When in Roman, the
input is transferred directly to the output, unchanged. When in Ge`ez,
the output of the package is the Unicode representation of Ge`ez characters
in the UTF-8 encoding scheme, as specified by ISO/IEC 10646-1 Amendment 10 and
ISO/IEC 10646-1 Amendment 2 respectively. The key mapping for the Ge`ez
characters will be modifiable either through a configuration file or through a
configuration table in the software source code. There is the possibility
of adding a third mode to similarly accommodate UTF-8 encoded Arabic as
specified by the ISO/IEC 10646-1 standard.
The ISO/IEC 10646-1 compliant UTF-8 encoded font that will be used
for testing of the Ge`ez mode is GF Zemen Unicode, available at:
ftp://ftp.ethiopic.org/pub/fonts/TrueType/gfzemenu.ttf
The character charts for ISO/IEC 10646-1 Amendment 10 can be found at:
http://www.unicode.org/charts/PDF/U1200.pdf
A description of the UTF-8 encoding scheme (ISO/IEC 10646-1 Amendment 2)
is available at:
http://anubis.dkuug.dk/JTC1/SC2/WG2/docs/n1335
A technical discussion for character encoding schemes can be found at:
http://www.unicode.org/unicode/reports/tr17/
The character charts for the Arabic presentation forms can be similarly
found at:
http://www.unicode.org/charts/PDF/UFB50.pdf
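To make the relationship between a code point and its UTF-8 byte sequence in item 3 concrete, here is a minimal sketch of the encoding rule, checked against Python's built-in encoder. The function name encode_utf8 is hypothetical and is not part of the package; surrogate code points are not rejected, which a production encoder would do.

```python
def encode_utf8(code_point):
    """Encode one Unicode code point as UTF-8 (1 to 4 bytes)."""
    if code_point < 0x80:
        # 7 bits: plain ASCII, so Roman mode output is unchanged.
        return bytes([code_point])
    if code_point < 0x800:
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:
        # Ge`ez (U+1200..) and Arabic presentation forms (U+FB50..) land here.
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


if __name__ == "__main__":
    for cp in (0x68, 0x1200, 0xFB50):
        print(f"U+{cp:04X} ->", encode_utf8(cp).hex(" "))
```

Every character in the Ethiopic chart therefore occupies three bytes in UTF-8, while Roman-mode ASCII input stays one byte per character.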
4. A syllable should be made available to the active program as soon
as it is ready; it is ready when the next character not in the syllable
is entered. For example, in the string "hama," once the "m" is typed,
the "ha" is ready to go, as "ham" is not part of any syllable.
To accommodate languages like Arabic, the syllable made available
to the program may consist of strings of more than one UTF-8 character.
For example, for initial character forms in Arabic one types a
space/character combination to release a space/translated_character
combination to the program, because the character takes a
different glyph depending on whether it is an initial, medial, or final form.
Similarly, for final forms a character/space releases a
translated_character/space to the active program. The mapping of
syllables to UTF-8 characters may be many-to-one as specified in
the configuration file.
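The release rule in item 4 can be sketched as a small buffer that holds keystrokes while they could still grow into a longer syllable. This is an illustration only, not the package's actual implementation. The class name SyllableBuffer and the tiny key map are hypothetical; the map is an assumed excerpt in the spirit of SERA, and a real table would come from the configuration file.

```python
# Hypothetical excerpt of a key map (assumed, not taken from the package).
KEYMAP = {
    "h": "\u1205", "he": "\u1200", "ha": "\u1203",
    "m": "\u121D", "me": "\u1218", "ma": "\u121B",
}


class SyllableBuffer:
    """Hold keystrokes until the pending syllable can no longer be extended."""

    def __init__(self, keymap):
        self.keymap = keymap
        self.pending = ""

    def _is_prefix(self, text):
        # True if some key sequence in the map starts with `text`.
        return any(key.startswith(text) for key in self.keymap)

    def feed(self, ch):
        """Accept one keystroke; return the text released to the active program."""
        candidate = self.pending + ch
        if self._is_prefix(candidate):
            self.pending = candidate      # still a possible syllable, keep waiting
            return ""
        released = self.flush()           # the pending syllable is now ready
        if self._is_prefix(ch):
            self.pending = ch             # the new key starts the next syllable
            return released
        return released + ch              # unmapped key passes through unchanged

    def flush(self):
        """Release whatever is pending, e.g. on a mode switch or end of input."""
        out = self.keymap.get(self.pending, self.pending)
        self.pending = ""
        return out


if __name__ == "__main__":
    buf = SyllableBuffer(KEYMAP)
    text = "".join(buf.feed(ch) for ch in "hama") + buf.flush()
    print(text)
```

With the input "hama", nothing is released while "h" and "a" are typed; typing "m" releases the syllable mapped to "ha", and the final flush releases the one mapped to "ma". Because map values are arbitrary strings, the same buffer could release multi-character sequences such as the space/character combinations described for Arabic.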
5. Source code is part of the package, and it will be
released using the GNU licensing agreement. Part of this source
code is a configuration file for the coding standard used by
Ge`ezFree (which is Unicode with UTF-8 encoding scheme and
SERA transliteration standard); changing this configuration file
would allow the use of other fonts, or even other languages that have
similar needs for conversion (like Arabic).
In addition to the character table provided at unicode.org, an additional
character table is provided at:
http://enh.ethiopiaonline.net/info/Fidel.ixbm.html
The initial transliteration standard will be the "System for Ethiopic
Representation in ASCII" or (SERA) as specified at:
http://www.abyssiniacybergateway.net/fidel/sera-faq_0.html
or as specified by the corresponding unicode values at:
http://www.punchdown.org/rvb/email/sera.html
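To show how a configuration file like the one in item 5 could drive the key mapping, here is a minimal parsing sketch. The file format is an assumption made for illustration (one "keys = hex code points" entry per line, with "#" starting a comment); the package's real configuration format is not described here, and the function name parse_keymap is hypothetical.

```python
def parse_keymap(text):
    """Parse lines of the assumed form 'keys = HEX [HEX ...]' into a dict.

    Each value may list several code points, so one key sequence can
    release a string of more than one character.
    """
    keymap = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()   # drop comments and blanks
        if not line:
            continue
        keys, code_points = line.split("=", 1)
        keymap[keys.strip()] = "".join(
            chr(int(cp, 16)) for cp in code_points.split()
        )
    return keymap


SAMPLE_CONFIG = """
# Assumed sample entries; a real table would follow the SERA specification.
ha = 1203
ma = 121B
"""

if __name__ == "__main__":
    for keys, output in parse_keymap(SAMPLE_CONFIG).items():
        print(keys, "->", output.encode("utf-8").hex(" "))
```

Swapping in a different table, for example one listing Arabic code points, changes the output script without touching the program logic, which is the point of keeping the mapping in a configuration file.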
6. For testing: The package will be tested on Microsoft
Windows 95, 98, and 2000, with MS Office 97 and 2000. It will also be
checked with various network and Internet programs such as Netscape
Communicator, Internet Explorer, MS Frontpage and MS Outlook.
The tests will ensure that the program does not crash and that its
output looks like Ethiopic script. The Eritrea Technical Exchange will
take responsibility for final testing and quality assurance of the
software's performance and its ability to produce valid, legible Ge`ez
text.
In addition to the multi-lingual utilities recently developed by ETEP/ICSEE, there are also several resources for standard-compliant Ge`ez developed by the LibETH project and the 'AbyssiniaCyberGateway' sites. Also of general interest are sites on Unicode standards, font software, and other language encodings and transliteration methods. These include: