Welcome and Introduction
- Who's who?
- What do we want from this workshop?
Coded character sets: Past, Present and Future
- Coded character sets
- Controls vs. Graphics
- Glyphs
- A short history of character sets
- Morse code
- Baudot code
- ISO 646, ISO 2022 and ISO 8859
- Windows code pages
- Shift-JIS
- GBK
- Unicode
- Unicode 1.0
- High-level structure
- Unicode 4.0
Unicode character set and standards
- Overview
- The Unicode character set
- Notation
- The 10 Unicode design principles
- Special characters
- Special non-characters
- The Unicode standard
- Unicode vs. ISO 10646
- Main differences
- Unicode conformance
Representing Unicode: choosing the proper form
- Unicode encodings
- UTF-32
- UTF-16
- UTF-8
- CESU-8
- Compression schemes
- Normalization Forms
Unicode implementation
- Reference i18n model
- Transcoding
- Text processing
- Case handling
- Case mapping, folding
- Text boundaries
- Grapheme clusters
- Words
- Lines
- Sentences
- Collation
- Sorting and searching
Database issues
- Overview
- Unicode support
- Multilingual schema design
- Stringtables per column
- Stringtables per table
- Database global stringtable
- Database migration to Unicode
- Migration concerns
- How to migrate
Asian character sets
- GB 18030-2000
- Background
- Properties
- Encoding
- Conformance
- HKSCS-2001
- Background
- Encoding
- HKSCS & Unicode
- HKSCS & Big-5
- JIS X 0213:2000
- Background
- Properties
- Encoding
- Conformance
- Korean character sets
- The Korean writing system
- Jamo
- Hangul syllables
- Hangul - Implementation
- KS X 1001 character set
- KS X 1001 encoding
- Microsoft code page 949
- Unicode