[go: up one dir, main page]

0% found this document useful (0 votes)
27 views6 pages

Toward Optimal Arabic Keyboard Layout Using Geneti

Uploaded by

Fyodor Trezvin
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
27 views6 pages

Toward Optimal Arabic Keyboard Layout Using Geneti

Uploaded by

Fyodor Trezvin
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 6

See discussions, stats, and author profiles for this publication at: https://www.researchgate.

net/publication/228921828

Toward optimal Arabic keyboard layout using genetic algorithm

Article in International Journal of Design Engineering

CITATIONS READS

19 12,348

3 authors:

Mazharul Tareq Sinan Taifour


University of Information Technology and Sciences 8 PUBLICATIONS 69 CITATIONS
5 PUBLICATIONS 22 CITATIONS
SEE PROFILE
SEE PROFILE

Gheith A. Abandah
University of Jordan
62 PUBLICATIONS 938 CITATIONS

SEE PROFILE

All content following this page was uploaded by Sinan Taifour on 31 May 2014.

The user has requested enhancement of the downloaded file.


In Proc. 9th Int'l Middle Eastern Multiconference on Simulation and Modeling (MESM 2008), Aug 26-28, Amman, Jordan.

TOWARD OPTIMAL ARABIC KEYBOARD LAYOUT USING GENETIC


ALGORITHM
Tareq M. Malas, Sinan S. Taifour, and Gheith A. Abandah
Faculty of Engineering and Technology
The University of Jordan, Amman 11942, Jordan
E-mail: tarik.malas@gmail.com, staiii@gmail.com, abandah@ju.edu.jo

KEYWORDS keyboard standard supports the ASMO standard for the 7-bit
Genetic Algorithm, Optimization, Arabic Keyboard Layout. Arabic characters code (ASMO 1985). However, this
keyboard standard was not used by the computer industry
ABSTRACT and the market adopted instead the currently used Arabic
keyboard layout shown in Figure 2. This keyboard layout
The common Arabic keyboard layout is derived from the gained wide acceptance in PCs and servers over other
Arabic typewriter keyboard’s layout. This layout is not layouts when Microsoft adopted it for its Arabized products.
optimized for typing speed and has several problems. We Another Arabic keyboard layout currently in use is the
use a genetic algorithm to optimize the design of the Arabic layout shown in Figure 3. This layout is used in Apple MAC
keyboard layout for typing speed. In the genetic algorithm, computers.
we use a fitness function that includes into consideration key
distance, finger used, and hand alternation. We also use this
fitness function to evaluate the optimized layout against
other layouts including the common layout. The optimized
layout is relatively 35% faster than the common layout and
has additional advantages.

INTRODUCTION

As the keyboard is the main device for text entry in Figures 1: ASMO 663 Arabic Keyboard Layout
computers, it is very important to have a comfortable
keyboard layout which also promotes fast text entry. In a
look at the history of the English keyboard layout, we see
that the standard layout (QWERTY) is not optimized for
efficient typing, but is actually quite arbitrary (Noyes 1998).
A more optimized keyboard layout known as the Dvorak
Simplified Keyboard (DSK) has been developed(Campbell-
Kelly 2005). But the QWERTY layout remains most widely
used because many people were already trained on using the
QWERTY, and were reluctant to change.
Figures 2: Common Arabic Keyboard Layout
Arabic Keyboard History

The layout of the Arabic keyboard is derived from the


Arabic typewriter keyboard’s layout. In the 1970s and
1980s, many computer companies developed more than 20
variants for this layout (JISM 1994). These various variants
had consistent allocation for some letters (namely ‫ا ب ت ث ج‬
‫)ح خ س ش ص ض ع غ ف ق ك ل م ن ﻩ ي‬. However, they have
differences in allocating the other letters particularly the
letters in the lower row (such as ‫)أ د ﺁ ئ ر ط ء ؤ ة ز ظ و ى إ ذ‬.
Some of the famous keyboard layouts in the late 1980s are Figures 3: Apple MAC Arabic Keyboard Layout
the layouts of Microsoft Arabic Word, Apple MAC, Sakher,
AMEER, ALIS, and Nafitha. The common Arabic keyboard layout is not optimized for
performance and also has many obvious drawbacks. For
To solve this chaos created by the existence of many Arabic example, Letter Thal "‫ "ذ‬is placed in an awkward place in
keyboard layouts, the Arab Standardization and Metrology the top-left corner of the keyboard despite its frequent use.
Organization (ASMO) developed a standard for the Arabic This letter is more frequently used than other well-positioned
keyboard layout shown in Figure 1 (ASMO 1987). This letters such as Letter Ghain “‫”غ‬. Moreover, the two-letter
In Proc. 9th Int'l Middle Eastern Multiconference on Simulation and Modeling (MESM 2008), Aug 26-28, Amman, Jordan.

combination Lam-Alef "‫ "ﻻ‬has one key dedicated for it from almost all fields. So the related character frequencies
despite the fact that it is not as frequent as other letter are not biased to any particular field. This site contains more
combinations. than 10,000 articles, mostly well-written. These articles can
be conveniently and freely downloaded in a single XML file
Related Work (http://en.wikipedia.org/wiki/Wikipedia:Database_download). The
Arabic Wikipedia was downloaded, parsed, and analyzed to
Many researchers used various algorithms to design better find frequency information.
keyboard layouts for other languages. A simulated annealing
algorithm was used by (Light and Anderson 1993) to The single XML file, around 200 MB in size, contains all the
optimize the English keyboard. They used the travel time articles structured with some meta information. The first step
and the frequency of letter pairs to optimize for typing was to extract the articles and save them in separate files for
speed. An evolutionary algorithm was used by (Walker analysis. The meta information was kept in an accessible
2003) for the English keyboard to evolve it for a layout format for latter usage. Due to the large size of this XML
faster than the QWERTY layout by 30%. Also a genetic file, it is not possible to use a parsing tool that loads the
algorithm was used by (Goettl et al. 2005) to rearrange the entire file into the memory, such as the Document Object
English letters and some punctuation marks. Model (DOM). Instead, a serial event-driven method must
be used. We use the Simple API for XML (SAX), which is
Ant Colony algorithm was used by (Wagner et al. 2003) to available in the Python programming language.
optimize keyboard layouts for English, French, and German
text. They used ergonomic criterion to develop the desired The next step was to remove all wikitext. Wikipedia articles
keyboard layouts. are formatted using a special syntax called wikitext
(http://en.wikipedia.org/wiki/Wikitext). For removing the
A new design for Hindi keyboard layout was done by wikitext, we used the Regular Expressions (RegExp) tool,
(Deshwal and Deb 2003) using a genetic algorithm. They which is also available in Python. The regular expressions
used an ergonomic criterion to optimize the layout for typing describe patterns of text that can be matched against longer
convenience. We are unaware of similar optimization work strings of text or used for substitution. Most of the wikitext
done for the Arabic keyboard layout. can be fully described using rather simple RegExp patterns.

Approach Used The last step was to count occurrences of characters and
character pairs and generate result files. In this step, the user
A keyboard layout can be optimized for several is prompted every time a new unknown character is
characteristics such as typing speed, comfort, low error rate, encountered. The user is given three options: taking the
and trainability. In this paper, we use a genetic algorithm to character into consideration (for Arabic characters),
design a layout optimized for typing speed. We also evaluate considering it as a special character (for special characters
the typing speed of important Arabic keyboard layouts. and white space), and ignoring it (for non-Arabic
Although our focus is on optimizing the typing speed, the characters). The special characters are given special status to
designed layout is also more comfortable than the common enable finding word boundaries. This allows finding the
keyboard layout. probabilities of words starting or ending with particular
Arabic characters.
The rest of this paper is organized as follows. The next
section describes the procedure used in this research. Genetic Algorithm for Design Optimization
Section 3 presents the results obtained and analyzes these
results. Finally, Section 4 presents the paper conclusions and A genetic Algorithm is a guided random search technique
future work. inspired by evolutionary Biology (Goldberg 1989). It
involves a population of candidate solutions to a problem,
PROCEDURE called individuals, which live, reproduce, and die based on
their fitness relative to the rest of the population. The genetic
Our procedure consists of two stages. First we collect data algorithm consists of five phases: initialization, evaluation,
and find the letters’ frequencies, and then we use these selection, reproduction, and competition. A typical genetic
frequencies in a genetic algorithm to optimize the design of algorithm requires two definitions:
the keyboard layout for speed. The keyboard layout is 1. Genetic representation of the solution domain
divided into two sets: the main set and the shift set. The main 2. Fitness function to evaluate the solution domain
set contains all keys that can be accessed without using the
shift key, while the shift set contains the same keys when the The optimization performed by a genetic algorithm has the
shift key is pressed. ability to search a large space for a solution which is close to
the optimum with relatively short time. The space of
Finding Frequencies of the Arabic Characters possible keyboard layouts is the factorial of 33 keys
(8.68×1036). Evaluating all these layouts is infeasible as it
The used optimization requires information about the Arabic takes very long time.
characters and character pairs’ frequencies. The data we
used to find these frequencies was taken from the Arabic The genetic algorithm is applied on the main set of the
Wikipedia (http://ar.wikipedia.org), which contains articles keyboard, where we allocated the most frequent Arabic
In Proc. 9th Int'l Middle Eastern Multiconference on Simulation and Modeling (MESM 2008), Aug 26-28, Amman, Jordan.

letters and symbols. The representation of the individuals is Selection


done using a 33-element vector, which corresponds to the During each successive generation, a proportion of the
characters allocated on the 33 keys of the main set. current population is selected to breed a new generation. The
selected individuals "parents" comprise the best individuals
The used fitness function is the typing time using the and randomly selected individuals. The randomly selected
keyboard layout. This fitness function takes into individuals ensure maintaining high diversity of the
consideration attributes that give short typing time such as population. Thus preventing the algorithm from getting
hand alternation and using the home row. It also take into stuck in a local minimum.
consideration attributes that give long typing time such as
typing two consecutive letters with the same finger. Smaller Reproduction
fitness values are better. The used fitness is the aggregate of The reproduction is usually made by crossover and/or
the products of all letter pair frequencies f i and the mutation from the parents. The produced individuals
corresponding letter pair typing times ti , formally "children" share many of the characteristics of their
"parents". We only use mutation, because there is no direct
method to apply crossover. The mutation is applied by
Fitness = ∑ f i * ti . (1) swapping the locations of a number of the parent's keys. We
i breed more children than we need so that we can select the
best children to compete with the original population. The
The pair typing times are found using a linear regression children who have better fitness replace some of the original
model (Hiraga et al. 1980). This model was built after population. This process gives the next generation
measuring the time intervals between key presses. We used population, which is different from the initial generation.
the following model to find the key pair typing time. Generally the average fitness increases by this process.

ti = 185.8 − 40.0h + 18.3r − 11.0 f + 0.514R + 1.07 F (2) Termination


The termination is performed manually. The program saves
Table 1 describes the attributes used in this model and the logs of the population every 200 generations. We monitor
possible values of these attributes. Letters mapped in the these logs seeking satisfactory results and convergence. The
shift set are given the maximum pair typing time of the main program was stopped after 160,000 generations. This took
set pair typing times. 32 hours of execution time on a 2.2 GHz Core 2 Duo
notebook.
Table 1: Attributes Affective Key Pair Typing Times
RESULTS AND ANALYSIS
Attribute Symbol Values
Hand Figure 4 shows the frequencies for Arabic letters. The most
h 0: same hand, 1: alternate hand frequent three letters (Alef “‫”ا‬, Lam “‫”ل‬, and Yeh “‫ )”ي‬have
Transition
Row Number of rows moved across in aggregate frequency of 34%. The most frequent 32 letters in
r addition to the period were selected for allocation on the
Transition the same hand transition
Finger The distance of finger columns in main set. The three least frequent letters and the special
f characters are allocated in the shift set.
Transition the same hand
Linear sum of weights for each
row position, where the 16%
Row
R weights given are 1, 2, and 3 for 14%
Weight
home, upper, and bottom 12%
rows, respectively.
Linear sum of weights on each of 10%

the finger positions of the 8%


Finger
F key pair, where the weights given
Weight 6%
are 4.5, 4.5, 1, 2, and 3
from the outer column inwards. 4%

2%

The following paragraphs describe our implementation of 0%


the used genetic algorithm. ‫ؤ ظ ء ئ غ ض ذ ث ز إ خ ى ط شص ج أ ح ق ك ف س ﻩ د ة ع ب ت ر ن و م ي ل ا‬

Initialization
Figures 4: Frequency of the Arabic Letters
Initially 2000 individuals (layouts) were randomly generated
to form an initial population. The population was generated
Table 2 shows the frequencies of letter pairs. Columns are
randomly so that the algorithm is not biased. This helps to
for the first letter and rows are for the second letter. This
avoid convergence to a local minimum. The randomness
table shows that the frequency of the frequent pair Alef
gives disperse layouts that cover wide regions of the space.
followed by Lam “‫ ”ال‬is 8.3%, Lam-Meem “‫ ”ﻟﻢ‬is 1.6%,
Yeh-Teh Marbuta “‫ ”ﻳﺔ‬is 1.3%, and Lam-Alef “‫ ”ﻻ‬is 1.3%.
In Proc. 9th Int'l Middle Eastern Multiconference on Simulation and Modeling (MESM 2008), Aug 26-28, Amman, Jordan.

Table 2: Frequency of Arabic Letter Pairs (First letter on the column, second letter on the row)

‫ا‬ ‫ل‬ ‫ي‬ ‫م‬ ‫و‬ ‫ن‬ ‫ر‬ ‫ت‬ ‫ب‬ ‫ع‬ ‫ة‬ ‫د‬ ‫ﻩ‬ ‫ف س‬ ‫ك‬ ‫ق‬ ‫ح‬ ‫أ‬ ‫ج‬ ‫ش ص‬ . ‫ط‬ ‫ى‬ ‫خ‬ ‫إ‬ ‫ز‬ ‫ث‬ ‫ذ‬ ‫ض‬ ‫غ‬ ‫ئ‬
‫ا‬ 0.0 1.2 0.7 1.0 1.1 0.6 0.8 0.3 0.7 0.6 0.0 0.4 0.7 0.4 0.2 0.5 0.4 0.3 0.0 0.3 0.2 0.2 0.0 0.2 0.0 0.1 0.0 0.1 0.1 0.2 0.1 0.1 0.0
‫ ل‬8.3 0.5 0.4 0.3 0.7 0.0 0.0 0.2 0.2 0.7 0.0 0.1 0.1 0.2 0.1 0.2 0.1 0.1 0.1 0.1 0.1 0.0 0.0 0.1 0.0 0.2 0.3 0.0 0.1 0.1 0.0 0.0 0.1
‫ ي‬0.3 0.9 0.1 0.5 0.4 0.6 0.8 0.3 0.6 0.2 0.0 0.6 0.2 0.4 1.3 0.2 0.2 0.3 0.1 0.2 0.1 0.1 0.0 0.1 0.0 0.1 0.0 0.2 0.1 0.1 0.1 0.1 0.2
‫ م‬0.8 1.6 0.3 0.1 0.4 0.1 0.1 0.3 0.1 0.2 0.0 0.2 0.3 0.2 0.0 0.2 0.0 0.2 0.2 0.2 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.0 0.0
‫و‬ 0.2 0.4 0.3 0.4 0.0 0.2 0.3 0.3 0.2 0.1 0.0 0.3 0.2 0.2 0.1 0.2 0.2 0.1 0.3 0.1 0.1 0.0 0.0 0.1 0.0 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0
‫ ن‬1.2 0.3 0.9 1.1 0.5 0.0 0.1 0.1 0.3 0.3 0.0 0.1 0.1 0.1 0.0 0.2 0.0 0.0 0.3 0.1 0.1 0.0 0.0 0.0 0.0 0.0 0.2 0.0 0.0 0.0 0.0 0.0 0.0
‫ر‬ 0.8 0.2 0.6 0.3 0.5 0.0 0.0 0.3 0.4 0.3 0.0 0.2 0.1 0.1 0.2 0.2 0.2 0.2 0.1 0.1 0.1 0.2 0.0 0.1 0.0 0.1 0.0 0.0 0.1 0.0 0.0 0.1 0.0
‫ ت‬0.9 0.7 0.3 0.2 0.2 0.3 0.1 0.1 0.1 0.1 0.0 0.0 0.0 0.4 0.1 0.2 0.1 0.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0
‫ ب‬0.4 0.4 0.2 0.1 0.2 0.1 0.3 0.2 0.0 0.2 0.0 0.0 0.0 0.2 0.0 0.1 0.1 0.0 0.1 0.1 0.1 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
‫ ع‬0.3 0.6 0.2 0.4 0.2 0.0 0.1 0.2 0.3 0.0 0.0 0.0 0.0 0.1 0.1 0.0 0.1 0.0 0.1 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
‫ة‬ 0.1 0.3 1.3 0.2 0.0 0.2 0.4 0.0 0.1 0.2 0.0 0.3 0.0 0.1 0.1 0.1 0.1 0.1 0.0 0.1 0.0 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
‫د‬ 0.5 0.3 0.4 0.3 0.2 0.2 0.1 0.1 0.2 0.3 0.0 0.1 0.1 0.0 0.0 0.0 0.3 0.3 0.0 0.1 0.1 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0
‫ﻩ‬ 0.2 0.4 0.3 0.1 0.1 0.3 0.1 0.3 0.2 0.1 0.0 0.1 0.0 0.1 0.1 0.0 0.0 0.0 0.1 0.1 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.0
‫ س‬0.4 0.4 0.3 0.3 0.2 0.2 0.2 0.1 0.1 0.0 0.0 0.0 0.0 0.0 0.1 0.1 0.0 0.1 0.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0
‫ ف‬0.2 0.3 0.1 0.0 0.2 0.1 0.1 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
‫ ك‬0.2 0.5 0.2 0.1 0.1 0.0 0.2 0.1 0.1 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.0 0.1 0.1 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
‫ ق‬0.2 0.4 0.2 0.1 0.2 0.1 0.1 0.2 0.1 0.1 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
‫ ح‬0.2 0.4 0.1 0.2 0.1 0.0 0.1 0.2 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
‫أ‬ 0.0 0.6 0.0 0.0 0.1 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
‫ ج‬0.2 0.3 0.1 0.2 0.2 0.1 0.2 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
‫ ص‬0.1 0.2 0.0 0.1 0.1 0.1 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0
‫ ش‬0.1 0.3 0.1 0.1 0.0 0.1 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
. 0.1 0.0 0.1 0.1 0.0 0.1 0.0 0.0 0.0 0.0 0.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
‫ ط‬0.1 0.1 0.1 0.0 0.1 0.1 0.0 0.1 0.1 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0
‫ ى‬0.0 0.7 0.0 0.0 0.0 0.0 0.1 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
‫ خ‬0.1 0.2 0.1 0.1 0.0 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
‫إ‬ 0.0 0.3 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
‫ز‬ 0.1 0.1 0.1 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
‫ ث‬0.1 0.1 0.1 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
‫ذ‬ 0.0 0.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
‫ ض‬0.1 0.0 0.1 0.0 0.1 0.0 0.1 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
‫ غ‬0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
‫ ئ‬0.3 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

Figure 5 shows the convergence curve of the genetic Figure 6 shows the optimized keyboard layout as found by
algorithm. This curve shows the best fitness value in a the genetic algorithm. This layout has the common letters
generation. Note the log scale of the generation number. located in the comfortable home row and gives good hand
This curve shows that the algorithm stabilizes fast after alternation. The following can be noticed from the main set
around 1000 generations (10 minutes of execution). layout:
1. The most frequent key-pair Alef-Lam “‫ ”ال‬are
1,800,000 under different hands, on the middle row, and under
the strongest fingers (the index and the middle
fingers).
1,750,000
2. Most of the frequent letters are placed in the middle
Fitness Value

row and/or using the strongest fingers, as in the


1,700,000 case of Yeh “‫”ي‬, Waw “‫”و‬, and Teh “‫”ت‬.
3. The least frequent letters are placed in the bottom
and top rows and most of them are under the pinky
1,650,000
finger (the weakest finger), as in the case of Thal
“‫”ذ‬, Dad “‫”ض‬, and Ghain “‫”غ‬.
1,600,000
1 10 100 1000 10000 100000 1000000
No algorithm was used to optimize the shift set because the
Generation
letters in this set are infrequently used in Arabic typing. The
shown arrangement was done manually to keep the layout
Figures 5: Convergence of the Genetic Algorithm
In Proc. 9th Int'l Middle Eastern Multiconference on Simulation and Modeling (MESM 2008), Aug 26-28, Amman, Jordan.

similar to the common layout. However, some modifications letter pair times to adjust the model used in Equation (2).
were necessary to accommodate for the additional letters not Moreover, we intend to try to take more factors in the
included in the main set. The letters which were moved to optimization such as typist comfort and error rate.
the shift set are put on keys of similar letters in the main set;
where users expect to find them. For example the pair Tah REFERENCES
"‫ "ط‬and Zah "‫"ظ‬, and the pair Alef with Hamza Above "‫"أ‬
Arab Standardization and Metrology Organization (ASMO). 1985.
and Hamza "‫ "ء‬are put on same keys. Standard 449: 7-Bit Arabic Code. Arab League, Cairo, Egypt.
Arab Standardization and Metrology Organization (ASMO). 1987.
Standard 663: Arabic Keyboard Layout. Arab League, Cairo,
Egypt.
Campbell-Kelly M. 2005. The User-friendly Typewriter. The
Rutherford Journal. Retrieved Jul 20, 2008 from
http://www.rutherfordjournal.org/article010105.html
Deshwal P.; and Deb K. 2003. “Design of an Optimal Hindi
Keyboard for Convenient and Efficient Use.” Technical report
KanGAL 2003005, Indian Institute of Technology, Kanpur.
Goettl J.; Brugh A.; and Julstrom B. 2005. “Call Me E-Mail:
Arranging the Keyboard with a Permutation-Coded Genetic
Figures 6: Arabic Keyboard Layout Optimized for Speed Algorithm.” In Proceedings of the ACM Symposium on Applied
Computing (Santa Fe, New Mexico, USA, Mar), 947-951.
A comparison is made between the optimized keyboard Goldberg E. 1989. Genetic Algorithms in Search, Optimization and
Machine Learning. Addison-Wesley Longman, Boston, MA.
layout and some other layouts by applying the fitness
Hiraga Y.; Ono Y.; and Yamada-Hisao. 1980. “An Analysis of the
function. Table 3 presents the results of this comparison. The Standard English Keyboard.” In Proc. 8th Conf. on
worst layout was found by running the genetic algorithm Computational Linguistics (Tokyo, Japan). 242-248.
with the fitness function complemented. The random layout Jordan Institution for Standards and Metrology (JISM). 1994.
was built by generating a random layout. We note that the Minutes of Meeting of the Arabic Letter Committee. Mar 10,
common layout is better than the ASMO standard layout. 1994, Amman, Jordan.
The main reason for this difference is mapping the Light L.; and Anderson P. 1993. “Typewriter Keyboards via
moderately used letter Teh Marbuta “‫ ”ة‬to the shift set in the Simulated Annealing.” AI Expert, (Sep).
ASMO layout. Taking the interval between the worst and Noyes, J. 1998. “QWERTY-The Immortal Keyboard.” Computing
optimized layouts as a reference base, the common layout is & Control Engineering Journal 9, No.3 (Jun), 117-122.
Wagner M.; Yannou B.; Kehl S.; Feillet D.; and Eggers J. 2003.
6% better than the ASMO standard and the optimized layout
“Ergonomic Modeling and Optimization of the Keyboard
is 35% better than the common layout. Arrangement with an Ant Colony Algorithm.” European
Journal of Operations Research 148, No.3 (Aug), 672–686
Table 3: Comparison for Arabic Keyboard Layouts Walker C. 2003. “Evolving a More Optimal Keyboard.” Technical
Report (Dec).
Fitness Relative
Layout AUTHOR BIOGRAPHY
Value Performance
Worst 1,942,000 0%
Random 1,773,000 53% Tareq Malas is a bachelor student in the Computer
ASMO 663 1,756,100 59% Engineering Department at the University of Jordan. He is
Apple MAC 1,740,600 64% expected to graduate in Sep 2008. He has won a full
Common 1,735,000 65% scholarship for pursuing postgraduate studies in King
Optimized 1,637,000 100% Abdullah University of Science and Technology (KAUST).

CONCLUSIONS Sinan Taifour is a bachelor student in the Electrical


Engineering Department at the University of Jordan. He is
In this paper we have used a genetic algorithm to optimize expected to graduate in Feb 2009.
the layout of the Arabic keyboard. The optimized layout
gives 35% relative increase in the typing speed. Moreover, Gheith Abandah holds BS degree in Electrical Engineering
the optimized layout solves some of the problems found in from the University of Jordan (1985) and MS and PhD
the common Arabic keyboard layout. This layout gives degrees in Computer Engineering from the University of
better allocation for the frequently used letters such as Thal Michigan (1995 and 1998). He is an Assistant Professor of
“‫”ذ‬, and does not dedicate precious key locations for two- Computer Engineering in the University of Jordan since
letter key combinations such as Lam-Alef “‫”ﻻ‬. We have 1998, and department chair since 2006. His areas of research
also found that the common layout by is 6% better than the are Computer Arabization, Arabic Optical Character
ASMO 663 layout. Recognition, Computer Architecture, Parallel Processing,
and Software Development.
For future work, we intend to train volunteers on using the
optimized layout to obtain experimental data about the
optimized layout’s performance, comfort, and trainability.
We can collect timing data from these volunteers and use the

View publication stats

You might also like