newsletterlibrary.com

Character Encoding

Categories
Arabic 
Chinese 
CJKV 
Cyrillic 
Greek 
Hangul 
Hebrew 
Indic 
Japanese 
Korean
Latin 
Native American 
Unicode 
Vietnamese

Websites
Internationalization issues beyond HTML 3.2 and ISO-8859-1. Includes information on Baltic encodings.
http://ppewww.ph.gla.ac.uk/~flavell/charset/

How to validate HTML documents in various character encodings.
http://www.htmlhelp.com/tools/validator/charset.html

Chapter covering document character sets and encodings in HTML from the World Wide Web Consortium's HTML 4.0 Specification.
http://www.w3.org/TR/REC-html40/charset.html

Covers code tables, Unicode, HTML and XML, and discusses internationalization and localization issues relating to character sets. Includes links to other resources.
http://www.w3.org/International/O-charset.html

Query information on character sets, encodings, codepages and Unicode through an easy-to-use web form. Hosted at the Institute of the Estonian Language.
http://www.eki.ee/letter/

Hints and tips about character sets and fonts in web development. Includes links to related resources.
http://www.dantobias.com/webtips/char.html

A library for Windows developers that allows applications to encode binary data and files into text and vice versa.
http://www.xceedsoft.com/products/binEncod/

A tutorial that explains HTML character sets, character encodings and character references from Webreference.com.
http://webreference.com/html/tutorial17/

A tutorial on character code issues in digital processing and transfer of text data, on the Internet or otherwise. Includes tables and a detailed listing of control codes. In English and Finnish.
http://www.cs.tut.fi/~jkorpela/chars/

Covers the beginnings of the ASCII standards from ASCII-1963 onwards, with information on Cyrillic, Japanese, Korean, Thai and Vietnamese encoding systems, including various localized versions of EBCDIC. With tables and links to other resources.
http://www.cwi.nl/~dik/english/codes/stand.html

The ISO 639 language codes for use in SGML and XML, including a complete list of language names and codes.
http://xml.coverpages.org/iso639a.html

A review of the HTML authoring problems caused by some special characters that belong to the MS Windows character set but not to ISO Latin 1. Includes technical details and substitution tables. In English and Finnish.
http://www.cs.tut.fi/~jkorpela/www/windows-chars.html

A concise history of the development of character encoding in Western and East Asian languages, including ASCII, EBCDIC, Unicode and TRON.
http://tronweb.super-nova.co.jp/characcodehist.html

Code tables for ISO 8859-6, ASMO 449 plus, ASMO 708 (Arabic) and ISO 8859-8 (Hebrew), and further information about the company's work in multilingual UNIX.
http://www.langbox.com/

The official names for character sets that may be used on the Internet and referred to in Internet documentation, held at the Internet Assigned Numbers Authority.
http://www.iana.org/assignments/character-sets

A wide range of articles on Unicode, East Asian localization and internationalization issues.
http://www.basistech.com/news/presentations/index.html

A comparison of the two basic encoding systems, ASCII and EBCDIC, with tables.
http://www.dynamoo.com/technical/ascii-ebcdic.htm

A side-by-side comparison of ASCII and EBCDIC encoding.
http://www.egrannie.com/cheatsheets/asciiebcdic.html

A character set conversion component for Unicode, Japanese, Chinese, Korean, Cyrillic, Arabic, Hebrew, Thai, Vietnamese and all Western languages.
Site excerpt:
Chilkat Charset Convert ActiveX Component for Character Encoding Conversion. Charset converts text data from one character encoding to another. It works identically on all computers, regardless of locale or internationalization settings. Supports Unicode, iso-8859 windows utf-7, utf-8, utf-16, utf-32, Shift_JIS, gb2312, ks_c_5601-1987, big5, iso-2022 euc-jp, euc-kr, x-mac asmo-708, ibm...
http://www.chilkatsoft.com/ChilkatCharset.asp

ECMA-35 specifies the structure of the 7-bit and 8-bit codes that provide for the coding of character sets. Includes a detailed PDF document.
http://www.ecma-international.or...lications/standards/ECMA-035.HTM

Front end to several search engines and portals that allows you to enter queries in various character sets.
http://code.cside.com/3rdpage/

Information on Latin and non-Latin encoding systems, codepages and character sets by Roman Czyborra.
http://aspell.net/charsets/

Pennsylvania State University's guide to reading and publishing different languages on the web. Includes details of various encoding systems and links.
http://tlt.its.psu.edu/suggestions/international/

Mirror of Roman Czyborra's work on character sets and encoding systems. In English and German.
http://www.unicodecharacter.com/

Quick reference and searchable ASCII code and conversion tables.
http://www.whatasciicode.com