You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Do not further compose LVT hangul precomposed syllable blocks
The algorithm for composition of Hangul Jamo is:
- L (choseong jamo) + V (jungseong jamo) = LV (syllable block)
- LV (syllable block) + T (jongseong jamo) = LVT (syllable block)
However, the LV and LVT syllable blocks are intermingled in the unicode
block. In particular, for each pair LV, you will first see the syllable block
LV, followed by syllable blocks for LVT for each T. The LV+T
composition was a simple addition of offsets.
Our algorithm did not ignore the LVT syllable blocks, which meant that
LVT+T would just offset further and produce an unrelated syllable block.
By ensuring that the `S_index` is a multiple of `T_count`, we filter
for only LV syllable blocks (which occur every `T_count` codepoints in
the S block)
0 commit comments