Fix normalization of jamo #11

Manishearth · 2016-12-19T22:38:53Z

No description provided.

The algorithm for composition of Hangul Jamo is:

- L (choseong jamo) + V (jungseong jamo) = LV (syllable block)
- LV (syllable block) + T (jongseong jamo) = LVT (syllable block)

However, the LV and LVT syllable blocks are intermingled in the Unicode block. In particular, for each pair LV, you will first see the syllable block LV, followed by the LVT syllable block for each T.

The LV+T composition was a simple addition of offsets. Our algorithm did not exclude LVT syllable blocks on the left-hand side, which meant that LVT+T would just offset further and produce an unrelated syllable block.

By ensuring that the `S_index` is a multiple of `T_count`, we filter for only LV syllable blocks (which occur every `T_count` codepoints in the S block).
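For illustration, the check described above can be sketched roughly as follows. This is a minimal standalone sketch, not the code from this PR: the function name `compose_hangul` is hypothetical, and the constants are the usual values from the Unicode Standard's Hangul composition algorithm, spelled here in Rust constant style.

```rust
// Constants from the Unicode Standard's Hangul syllable composition algorithm.
const S_BASE: u32 = 0xAC00; // first precomposed syllable block
const L_BASE: u32 = 0x1100; // first choseong (leading consonant) jamo
const V_BASE: u32 = 0x1161; // first jungseong (vowel) jamo
const T_BASE: u32 = 0x11A7; // one before the first jongseong (trailing consonant) jamo
const L_COUNT: u32 = 19;
const V_COUNT: u32 = 21;
const T_COUNT: u32 = 28; // 27 trailing consonants plus the "no trailing consonant" slot
const N_COUNT: u32 = V_COUNT * T_COUNT;
const S_COUNT: u32 = L_COUNT * N_COUNT;

/// Hypothetical helper: composes `a` and `b` if they form a Hangul pair.
fn compose_hangul(a: char, b: char) -> Option<char> {
    let (a, b) = (a as u32, b as u32);

    // L + V = LV
    if (L_BASE..L_BASE + L_COUNT).contains(&a) && (V_BASE..V_BASE + V_COUNT).contains(&b) {
        let l_index = a - L_BASE;
        let v_index = b - V_BASE;
        return char::from_u32(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT);
    }

    // LV + T = LVT
    if (S_BASE..S_BASE + S_COUNT).contains(&a) && (T_BASE + 1..T_BASE + T_COUNT).contains(&b) {
        let s_index = a - S_BASE;
        // Only LV syllables sit at multiples of T_COUNT; without this check an
        // LVT syllable followed by T would be offset into an unrelated syllable.
        if s_index % T_COUNT == 0 {
            return char::from_u32(a + (b - T_BASE));
        }
    }

    None
}
```

With the `s_index % T_COUNT == 0` guard in place, U+AC00 (LV) followed by U+11A8 composes to U+AC01, while U+AC01 (already LVT) followed by U+11A8 is left uncomposed.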

SimonSapin and others added 2 commits December 19, 2016 21:25

Manishearth mentioned this pull request Dec 19, 2016

Update to Unicode 9.0.0 #10

Merged

SimonSapin force-pushed the 9 branch from f0e9362 to 7299191 Compare December 19, 2016 23:04

SimonSapin closed this in e4fd0e1 Dec 19, 2016
