This web page is part of the Universe set of experiments.

To specify codepoints, use plain hexadecimal such as "0041". Separate distinct codepoints with ",". Commas with nothing in between (e.g. ",,") specify empty cells. Contiguous ranges can increase (e.g. "0041-005A") or decrease (e.g. "005A-0041"). Use "|" to break rows.
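The notation described above can be read mechanically. The following is a minimal sketch of a parser for it, not the page's own implementation; the names `parse_spec` and `render` are hypothetical, and the treatment of a completely empty row as a single empty cell is an assumption.

```python
from typing import List, Optional


def parse_spec(spec: str) -> List[List[Optional[int]]]:
    """Parse a codepoint specification into rows of cells.

    Each cell is an integer codepoint, or None for an empty cell.
    """
    rows = []
    for row_text in spec.split("|"):          # "|" breaks rows
        row: List[Optional[int]] = []
        for item in row_text.split(","):      # "," separates cells
            item = item.strip()
            if not item:
                row.append(None)              # nothing between commas: empty cell
            elif "-" in item:
                first, last = (int(part, 16) for part in item.split("-"))
                step = 1 if last >= first else -1   # ranges may run either way
                row.extend(range(first, last + step, step))
            else:
                row.append(int(item, 16))
        rows.append(row)
    return rows


def render(spec: str) -> str:
    """Render a specification as text, using a space for each empty cell."""
    return "\n".join(
        "".join(" " if cp is None else chr(cp) for cp in row)
        for row in parse_spec(spec)
    )
```

For example, `parse_spec("0041-0043,,005A-0058|0030")` yields two rows: A, B, C, an empty cell, then Z, Y, X; and a second row containing only the digit 0.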

Table of Contents

  1. Introduction
  2. Character Sets
  3. Alphabets, Abjads and Abugidas
  4. Other Blocks
  5. Miscellanea
  6. Index

Introduction

Unicode is a standardised mapping between numbers and characters. The numbers, or "codepoints", are typically expressed in hexadecimal, like "U+0041". This codepoint (65 in decimal) corresponds to the capital letter "A" in Unicode:

0041
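As a small illustration of the notation, the conversion between the hexadecimal string, the decimal number and the character can be sketched with Python built-ins (the variable names are only for illustration):

```python
codepoint = int("0041", 16)          # hexadecimal "0041" parsed as a number
character = chr(codepoint)           # the character assigned to that codepoint
label = f"U+{ord(character):04X}"    # back to the conventional "U+" notation
```

Here `codepoint` is 65, `character` is "A" and `label` is "U+0041".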

Version 14.0.0 of the Unicode standard contains over 144,000 characters. Visual representations of characters are called glyphs. Some characters have more than one glyph associated with them (e.g. "variations"). There is no single outline font that covers all the Unicode characters. Notoverse is an attempt at a bitmap font that does. It is limited to one glyph per character, but is perfectly adequate for illustrating codepoints.

0041

Hovering the mouse over the glyph above zooms in to the bitmap. Clicking it jumps to a simple user interface for rendering ranges.

Character Sets

Before Unicode, there were many ways of encoding character sets.

ASCII

The first 128 codepoints of Unicode map one-to-one onto the characters of 7-bit ASCII, first published in 1963.

00-0F|10-1F|20-2F|30-3F|40-4F|50-5F|60-6F|70-7F
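One way to check the one-to-one correspondence is to decode every 7-bit value as ASCII and compare the resulting codepoints with the original numbers. This is a sketch using Python's built-in "ascii" codec:

```python
# Every 7-bit value, 0 to 127, as a byte string.
ascii_bytes = bytes(range(128))

# Decoding as ASCII yields one Unicode character per byte.
decoded = ascii_bytes.decode("ascii")

# Each character's codepoint should equal the byte it was decoded from.
matches = all(ord(ch) == byte for ch, byte in zip(decoded, ascii_bytes))
```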

Coded Character Sets, History and Development by Charles E. Mackenzie (1980) has an exhaustive history of ASCII development.

EBCDIC

Contemporaneous with ASCII was IBM's EBCDIC. Below is the character set of EBCDIC code page 037. A few control codes have no Unicode equivalent.

0000-0003,,0009,,007F,,,,000B-000F|0010-0013,,0085,0008,,0018,0019,,,001C-001F|,0098,,,,000A,0017,001B,,,,,,0005-0007|,,0016,,,,,0004,,,,,0014-0015,,001A|0020,00A0,00E2,00E4,00E0,00E1,00E3,00E5,00E7,00F1,00A2,002E,003C,0028,002B,007C|0026,00E9,00EA,00EB,00E8,00ED-00EF,00EC,00DF,0021,0024,002A,0029,003B,00AC|002D,002F,00C2,00C4,00C0,00C1,00C3,00C5,00C7,00D1,00A6,002C,0025,005F,003E,003F|00F8,00C9,00CA,00CB,00C8,00CD-00CF,00CC,0060,003A,0023,0040,0027,003D,0022|00D8,0061-0069,00AB,00BB,00F0,00FD,00FE,00B1|00B0,006A-0072,00AA,00BA,00E6,00B8,00C6,00A4|00B5,007E,0073-007A,00A1,00BF,00D0,00DD,00DE,00AE|005E,00A3,00A5,00B7,00A9,00A7,00B6,00BC-00BE,005B,005D,00AF,00A8,00B4,00D7|007B,0041-0049,00AD,00F4,00F6,00F2,00F3,00F5|007D,004A-0052,00B9,00FB,00FC,00F9,00FA,00FF|005C,00F7,0053-005A,00B2,00D4,00D6,00D2,00D3,00D5|0030-0039,00B3,00DB,00DC,00D9,00DA

Surprisingly, EBCDIC is still in use today, notably on IBM mainframes.
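Python's standard library ships a "cp037" codec, so the mapping between EBCDIC code page 037 bytes and Unicode can be explored directly. A minimal sketch (the helper name `ebcdic_to_text` is hypothetical):

```python
def ebcdic_to_text(data: bytes) -> str:
    """Decode bytes encoded in EBCDIC code page 037 into a Unicode string."""
    return data.decode("cp037")


# "Hello" in code page 037: H=0xC8, e=0x85, l=0x93, l=0x93, o=0x96.
sample = bytes([0xC8, 0x85, 0x93, 0x93, 0x96])
text = ebcdic_to_text(sample)

# The reverse direction uses the same codec name.
capital_a = "A".encode("cp037")
```

Unlike ASCII, the letters are not contiguous: "I" is 0xC9 but "J" is 0xD1, which is why the grid above has gaps between the alphabetic runs.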

Home Computers

Beyond the realm of business, the explosion of home computers in the 1980s led to a parallel explosion of character sets.

PETSCII (1977)

The PETSCII character set was used on Commodore's PET, VIC-20 and C64 machines. Below are the unshifted (left) and shifted (right) variations.

0020-002F,,0020-002F|0030-003F,,0030-003F|0040-004F,,0040,0061-006F|0050-005B,00A3,005D,2191-2190,,0070-007A,005B,00A3,005D,2191-2190|1FB79,2660,1FB72,1FB78-1FB76,1FB7A,1FB71,1FB74,256E,2570-256F,1FB7C,2572-2571,1FB7D,,1FB79,0041-004F|1FB7E,25CF,1FB7B,2665,1FB70,256D,2573,25CB,2663,1FB75,2666,253C,1FB8C,2502,03C0,25E5,,0050-005A,253C,1FB8C,2502,1FB96,1FB98|00A0,258C,2584,2594,2581,258F,2592,2595,1FB8F,25E4,1FB87,251C,2597,2514,2510,2582,,00A0,258C,2584,2594,2581,258F,2592,2595,1FB8F,1FB99,1FB87,251C,2597,2514,2510,2582|250C,2534,252C,2524,258E-258D,1FB88,1FB82-1FB83,2583,1FB7F,2596,259D,2518,2598,259A,,250C,2534,252C,2524,258E-258D,1FB88,1FB82-1FB83,2583,2713,2596,259D,2518,2598,259A

Sinclair ZX80 (1980)

The ZX80 only had 64 printable characters (along with their inverse forms), but their organisation meant hexadecimal encoding/decoding was trivial.

0020,0022,258C,2584,2598,259D,2596,2597,259E,2592,1FB8F,1FB8E,00A3,0024,003A,003F|0028-0029,002D,002B-002A,002F,003D-003E,003C-003B,002C,002E,0030-0033|0034-0039,0041-004A|004B-005A
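The text does not spell out why hexadecimal handling was trivial. One reading of the grid above is that the digits 0-9 are followed immediately by the letters A-F, so a 4-bit value maps to a character code by a single addition. The sketch below assumes the grid is indexed from zero, which puts "0" at code 0x1C; the function names are hypothetical.

```python
ZX80_ZERO = 0x1C  # assumed code of "0": cell 28 of the grid, counting from 0


def nibble_to_zx80(value: int) -> int:
    """Return the ZX80 character code for a 4-bit value (0-15)."""
    if not 0 <= value <= 15:
        raise ValueError("value must be in the range 0-15")
    return ZX80_ZERO + value  # digits and A-F are contiguous: no special case


def zx80_to_nibble(code: int) -> int:
    """Return the 4-bit value for a ZX80 hexadecimal digit code."""
    value = code - ZX80_ZERO
    if not 0 <= value <= 15:
        raise ValueError("code is not a hexadecimal digit")
    return value


def nibble_to_ascii(value: int) -> int:
    """ASCII, for contrast, needs a branch to skip the gap between 9 and A."""
    return ord("0") + value if value < 10 else ord("A") + value - 10
```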

Sinclair ZX81 (1981)

The ZX81 had the same characters as the ZX80, but in a slightly different order.

0020,2598,259D,2580,2596,258C,259E,259B,2592,1FB8F,1FB8E,0022,00A3,0024,003A,003F|0028-0029,003E,003C-003D,002B,002D,002A,002F,003B,002C,002E,0030-0033|0034-0039,0041-004A|004B-005A

Acorn BBC Micro (1981)

The character set for modes 0 to 6 was essentially ASCII but with backquote replaced with a pound sign:

20-2F|30-3F|40-4F|50-5F|A3,61-6F|70-7E

For mode 7, the Latin G0 (English) Teletext character set is adhered to when the most significant bit (MSB) is set. When the MSB is clear (the upper grid below), the hash, pound and underscore characters are shuffled to try to more closely align with ASCII.

0020-002F|0030-003F|0040-004F|0050-005A,2190,00BD,2192-2191,2500|00A3,0061-006F|0070-007A,00BC,2016,00BE,00F7,25A0||0020-0022,00A3,0024-002F|0030-003F|0040-004F|0050-005A,2190,00BD,2192-2191,0023|2500,0061-006F|0070-007A,00BC,2016,00BE,00F7,25A0

When Teletext graphics mode is toggled on, this shuffling wreaks havoc with bit-to-pixel mapping. For this reason, it is advisable to set the MSB when rendering block graphics.

0020,1FB00-1FB01,0023,1FB03-1FB0E|1FB0F-1FB13,258C,1FB14-1FB1D|0040-004F|0050-005A,2190,00BD,2192-2191,1FB1E|1FB02,1FB1F-1FB27,2590,1FB28-1FB2C|1FB2D-1FB3B,2588||0020,1FB00-1FB0E|1FB0F-1FB13,258C,1FB14-1FB1D|0040-004F|0050-005A,2190,00BD,2192-2191,0023|1FB1E-1FB27,2590,1FB28-1FB2C|1FB2D-1FB3B,2588

Sinclair ZX Spectrum (1982)

The printable ZX Spectrum character set was very close to ASCII. Only the "↑", "£" and "©" characters were swapped in.

0020-002F|0030-003F|0040-004F|0050-005D,2191,005F|00A3,0061-006F|0070-007E,00A9
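Because only three positions differ from ASCII, a conversion to Unicode can be sketched as a small override table read off the grid above. The function name and the restriction to codes 0x20-0x7F are assumptions made for this illustration.

```python
# Positions in the grid that differ from ASCII, taken from the grid above.
SPECTRUM_OVERRIDES = {
    0x5E: 0x2191,  # up arrow in place of "^"
    0x60: 0x00A3,  # pound sign in place of the backquote
    0x7F: 0x00A9,  # copyright sign in place of DEL
}


def spectrum_to_unicode(code: int) -> str:
    """Map a printable ZX Spectrum character code to a Unicode character."""
    if not 0x20 <= code <= 0x7F:
        raise ValueError("only codes 0x20-0x7F are covered by this sketch")
    return chr(SPECTRUM_OVERRIDES.get(code, code))
```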

ISO/IEC 8859-1

ISO/IEC 8859-1 (1987), sometimes erroneously named "8-bit ASCII", is the basis for the first 256 codepoints of Unicode:

00-0F|10-1F|20-2F|30-3F|40-4F|50-5F|60-6F|70-7F|80-8F|90-9F|A0-AF|B0-BF|C0-CF|D0-DF|E0-EF|F0-FF
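The relationship can be checked with Python's built-in "latin-1" codec, which maps each byte value to the Unicode codepoint of the same number. The second half of this sketch shows why "8-bit ASCII" is a misnomer: values above 0x7F are not ASCII at all.

```python
# Decode all 256 byte values with the Latin-1 codec.
latin1_bytes = bytes(range(256))
codepoints = [ord(ch) for ch in latin1_bytes.decode("latin-1")]

# A byte above 0x7F (here 0xE9, "é" in Latin-1) is rejected by the ASCII codec.
try:
    bytes([0xE9]).decode("ascii")
    is_ascii = True
except UnicodeDecodeError:
    is_ascii = False
```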

Alphabets, Abjads and Abugidas

Here is a chronological list of writing systems according to ISO-15924 (which is also maintained by the Unicode Consortium).

ISO-15924 Name Code Classification Region Since Until Unicode Block(s) Codepoints

Latin Alphabets

The 26 letters of the basic Latin alphabet are used for English and French:

0041-005A|0061-007A

The German alphabet adds four letters:

0041-005A,00C4,00D6,00DC,1E9E|0061-007A,00E4,00F6,00FC,00DF

The story of the capital Eszett "ẞ" is quite interesting.

Until 2010, the Spanish alphabet officially treated "CH" and "LL" as separate letters. Now, only "Ñ" is treated as an additional, twenty-seventh letter:

0041-004E,00D1,004F-005A|0061-006E,00F1,006F-007A

The Dutch alphabet also has an additional letter, "IJ":

0041-0059,0132,005A|0061-0079,0133,007A

The modern Italian alphabet has only 21 letters:

0041-0049,004C-0056,005A|0061-0069,006C-0076,007A

The Polish alphabet consists of 32 letters:

0041,0104,0042-0043,0106,0044-0045,0118,0046-004C,0141,004D-004E,0143,004F,00D3,0050,0052-0053,015A,0054-0055,0057,0059-005A,0179,017B|0061,0105,0062-0063,0107,0064-0065,0119,0066-006C,0142,006D-006E,0144,006F,00F3,0070,0072-0073,015B,0074-0075,0077,0079-007A,017A,017C

Icelandic also has 32 letters, albeit very different ones:

0041,00C1,0042,0044,00D0,0045,00C9,0046-0049,00CD,004A-004F,00D3,0050,0052-0055,00DA,0056,0058-0059,00DD-00DE,00C6,00D6|0061,00E1,0062,0064,00F0,0065,00E9,0066-0069,00ED,006A-006F,00F3,0070,0072-0075,00FA,0076,0078-0079,00FD-00FE,00E6,00F6

Esperanto has 28 letters:

0041-0043,0108,0044-0047,011C,0048,0124,0049-004A,0134,004B-0050,0052-0053,015C,0054-0055,016C,0056,005A|0061-0063,0109,0064-0067,011D,0068,0125,0069-006A,0135,006B-0070,0072-0073,015D,0074-0075,016D,0076,007A

The Latin alphabet has also been the basis for supranational alphabets such as the International African Alphabet (1928) with 36 letters:

0041-0042,0181,0043-0044,0189,0045,0190,018E,0046,0191,0047,0194,0048,0058,0049-004E,014A,004F,0186,0050,0052-0053,01A9,0054-0056,01B2,0057,0059-005A,01B7|0061-0062,0253,0063-0064,0256,0065,025B,01DD,0066,0192,0067,0263,0068,0078,0069-006E,014B,006F,0254,0070,0072-0073,0283,0074-0076,028B,0077,0079-007A,0292

This was developed into the World Orthography (1948) alphabet with 31 letters:

0041-0044,00D0,0045,018F,0046-004E,014A,004F-0050,0052-0053,01A9,0054,0398,0055-005A,01B7|0061-0064,00F0,0065,0259,0066-006E,014B,006F-0070,0072-0073,0283,0074,03B8,0075-007A,0292

Cyrillic Alphabets

The 33 letters of the basic Cyrillic alphabet are used for Russian:

0410-0415,0401,0416-042F|0430-0435,0451,0436-044F

The Belarusian Cyrillic alphabet has 32 letters:

0410-0415,0401,0416-0417,0406,0419-0423,040E,0424-0428,042B-042F|0430-0435,0451,0436-0437,0456,0439-0443,045E,0444-0448,044B-044F

The Ukrainian alphabet has 33 letters:

0410-0413,0490,0414-0415,0404,0416-0418,0406-0407,0419-0429,042C,042E-042F|0430-0433,0491,0434-0435,0454,0436-0438,0456-0457,0439-0449,044C,044E-044F

The Bulgarian alphabet has 30 letters:

0410-042A,042C,042E-042F|0430-044A,044C,044E-044F

The Serbian Cyrillic alphabet has 30 letters:

0410-0414,0402,0415-0418,0408,041A-041B,0409,041C-041D,040A,041E-0422,040B,0423-0427,040F,0428|0430-0434,0452,0435-0438,0458,043A-043B,0459,043C-043D,045A,043E-0442,045B,0443-0447,045F,0448

Armenian Alphabet

The Armenian alphabet consists of 38 letters:

0531-0556|0561-0586

Caucasian Albanian Alphabet

The Caucasian Albanian alphabet has 52 letters but no case distinction:

10530-10549|1054A-10563

Elbasan Alphabet

The Elbasan alphabet had 40 letters and was used in Albanian religious texts:

10500-10527

Aramaic Alphabets

Imperial Aramaic

10840-10855

Syriac

0710,0712-0713,0715,0717-071B,071D,071F-0723,0725-0726,0728-072C

Palmyrene

10860-1086C,1086E-10876

Mandaic

0840-0856

Chorasmian (Khwarezmian)

10FB0-10FC4

Elymaic

10FE0-10FF5

Hatran

108E0-108F2,108F4-108F5

Avestan Alphabet

The extended Avestan alphabet has 38 consonants and 16 vowels:

10B10-10B35|10B00-10B0F

Coptic Alphabet

The Coptic alphabet has uppercase and lowercase letters:

2C80,2C82,2C84,2C86,2C88,2C8A,2C8C,2C8E,2C90,2C92,2C94,2C96,2C98,2C9A,2C9C,2C9E,2CA0,2CA2,2CA4,2CA6,2CA8,2CAA,2CAC,2CAE,2CB0,03E2,03E4,03E6,03E8,03EA,03EC,03EE,2CC0|2C81,2C83,2C85,2C87,2C89,2C8B,2C8D,2C8F,2C91,2C93,2C95,2C97,2C99,2C9B,2C9D,2C9F,2CA1,2CA3,2CA5,2CA7,2CA9,2CAB,2CAD,2CAF,2CB1,03E3,03E5,03E7,03E9,03EB,03ED,03EF,2CC1

Glagolitic Alphabet

This Slavic alphabet also has uppercase and lowercase:

2C00-2C09,2C0B-2C19,2C1B-2C21,2C26,2C23-2C24,2C27-2C2B|2C30-2C39,2C3B-2C49,2C4B-2C51,2C56,2C53-2C54,2C57-2C5B

Bamum Syllabary

The Phase G (1918) Bamum script has 80 characters:

A6A0-A6C7|A6C8-A6EF

Bassa Vah Alphabet

The abandoned Bassa Vah alphabet of Liberia had 23 consonants and 7 vowels:

16AD0-16AE6|16AE7-16AED

Greek Alphabet

The Greek alphabet consists of 24 letters with uppercase and lowercase forms. Sigma also has a word-final form:

0391-03A1,03A3-03A9|03B1-03C1,03C3-03C9|,,,,,,,,,,,,,,,,,03C2
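The word-final form matters when converting case. As a sketch, CPython's `str.lower()` applies the Unicode final-sigma rule, choosing U+03C2 at the end of a word and U+03C3 elsewhere, while both lowercase forms map back to the same capital:

```python
word = "ΣΟΦΟΣ"                 # capital sigma at the start and at the end

lowered = word.lower()          # first sigma becomes U+03C3, last becomes U+03C2
first, last = lowered[0], lowered[-1]

isolated = "Σ".lower()          # a lone capital sigma is not word-final
round_trip = lowered.upper()    # both lowercase forms uppercase to U+03A3
```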

Lycian Alphabet

10280,10282,10284-10288,1029B,10289-1028B,1028D-1028F,10292-10297,10281,10299-1029A,10290-10291,10298,1028C,10283,1029C

Lydian

10920-10939

Gothic

Based on the Greek alphabet with additions for the Gothic language:

10330-10349

Carian

The Carian alphabet from Kaunos is thought to be the most complete version:

102A0,102CC,102A2-102A5,102CF,102A8-102AB,10313,102AF-102B5,102B7,0398,102B9-102BA,102BC,102BE,102C1-102C2,102C4,102C6,102B6

Georgian Alphabet

There are four forms of the modern 33-letter Georgian alphabet:

  1. Asomtavruli is the oldest form, dating from the fifth century A.D.
  2. Nuskhuri dates from the ninth century A.D.
  3. Mkhedruli is the current Georgian script
  4. Mtavruli is the uppercase version of Mkhedruli
All four forms are represented in Unicode (in order):

10A0-10C0|2D00-2D20|10D0-10F0|1C90-1CB0

Korean Alphabet

Modern Hangul is written using 24 basic letters (14 consonants and 10 vowels):

3131,3134,3137,3139,3141-3142,3145,3147-3148,314A-314E|314F,3151,3153,3155,3157,315B-315C,3160-3161,3163

These are organised into jamo (19 compound consonants and 21 compound vowels):

3131-3132,3134,3137-3139,3141-3143,3145-314E|314F-3163

Additionally, precomposed Hangul syllables are encoded as individual Unicode codepoints.
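The precomposed syllables are laid out arithmetically, starting at U+AC00. A sketch of the composition formula from the Unicode standard follows; it uses the 19 consonants and 21 vowels listed above as the leading and vowel jamo, and refers to the optional trailing consonant only by its index (0 meaning none, 28 possibilities in total). The names `LEADS`, `VOWELS` and `compose` are hypothetical.

```python
SYLLABLE_BASE = 0xAC00
VOWEL_COUNT = 21
TAIL_COUNT = 28  # 27 trailing consonants plus "no trailing consonant"

# Leading consonants and vowels, in the same order as the grid above.
LEADS = [chr(cp) for cp in (0x3131, 0x3132, 0x3134, 0x3137, 0x3138, 0x3139,
                            0x3141, 0x3142, 0x3143, *range(0x3145, 0x314F))]
VOWELS = [chr(cp) for cp in range(0x314F, 0x3164)]


def compose(lead: str, vowel: str, tail_index: int = 0) -> str:
    """Compose a precomposed Hangul syllable from its parts."""
    if not 0 <= tail_index < TAIL_COUNT:
        raise ValueError("tail_index must be in the range 0-27")
    lead_index = LEADS.index(lead)
    vowel_index = VOWELS.index(vowel)
    offset = (lead_index * VOWEL_COUNT + vowel_index) * TAIL_COUNT + tail_index
    return chr(SYLLABLE_BASE + offset)
```

For example, `compose("ㄱ", "ㅏ")` gives "가" (U+AC00), and `compose("ㅎ", "ㅏ", 4)` gives "한" (U+D55C). The formula covers 19 × 21 × 28 = 11,172 syllables.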

Hebrew Abjad

The Hebrew abjad is a right-to-left script. So when rendered as text below, the first letter appears on the right-hand side:

05D0-05D9,05DB-05DC,05DE,05E0-05E2,05E4,05E6-05EA|,,,,,,,,,,05DA,,05DD,05DF,,,05E3,05E5

Unicode includes three special format characters that can be used to override the usual text direction behaviour:

202D,05D0-05D9,05DB-05DC,05DE,05E0-05E2,05E4,05E6-05EA,202C|202D,,,,,,,,,,05DA,,05DD,05DF,,,05E3,05E5,,,,,,202C
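The grid above wraps each row in U+202D and U+202C. A minimal sketch of the same wrapping, using the standard `unicodedata` module to show the bidirectional class of each character (the helper name `force_left_to_right` is hypothetical):

```python
import unicodedata

LRO = "\u202D"  # LEFT-TO-RIGHT OVERRIDE: lay out following text left to right
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING: end the override


def force_left_to_right(text: str) -> str:
    """Wrap text so that a bidi-aware renderer lays it out left to right."""
    return f"{LRO}{text}{PDF}"


hebrew = "".join(chr(cp) for cp in range(0x05D0, 0x05D4))  # alef to dalet
wrapped = force_left_to_right(hebrew)

# Bidirectional class of every character in the wrapped string.
classes = [unicodedata.bidirectional(ch) for ch in wrapped]
```

The Hebrew letters keep their right-to-left class ("R"); only the surrounding format characters change how a renderer orders them.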

Like many non-Latin scripts, Hebrew has no distinction between uppercase and lowercase, but it does have variant forms for letters that appear at the end of words.

Arabic Abjad

Arabic script is also written right-to-left but in a cursive (joined) form:

064A,0648-0641,063A-062A,0628-0627

Adlam Alphabet

The Adlam alphabet for the Fulani language was invented in 1990 by two young brothers:

1E900-1E91B|1E922-1E93D

Brahmic Abugidas

Brahmic abugidas are not true alphabets. They typically use diacritics to represent some vowels. Consequently, the following usually only list the consonants.

Ahom

The 24 consonants of the extinct Ahom language:

11700-11704,11706-11715,11717-11719

Assamese

The 33 pure consonants of the Assamese language:

0995-09A8,09AA-09AF,09F0,09B2,09F1,09B6-09B9

Balinese

The 33 consonants of the Balinese language:

1B13-1B33

Batak

The 19 basic characters (surat) of the Karo variant of the Batak script:

1BC0,1BC2,1BC6-1BC7,1BC9,1BCB,1BCE,1BD0-1BD2,1BD4,1BD7-1BD8,1BDB,1BDD-1BDE,1BE1,1BE4-1BE5

Baybayin/Tagalog

The 15 consonants of Tagalog from the Philippines:

1703-1711

Bengali

The 31 consonants of the Bengali (Bangla) language:

0995-09A8,09AA-09AF,09B2,09B6-09B9

Bhaiksuki

Bhaiksuki has 33 consonants and was used around the turn of the first millennium for writing Sanskrit:

11C0E-11C2E

Brahmi

The original Brahmi script had 34 consonants:

11013-11034

Buhid

The 15 consonants of the Buhid language:

1743-1751

Burmese

The 34 consonants of the Burmese language:

1000-1021

Chakma

The 32 consonants of the Chakma language:

11107-11126

Cham

The 35 consonants of the Cham language:

AA06-AA28

Devanagari

The 33 consonants of Devanagari:

0915-0928,092A-0930,0932,0935-0939

Dhives Akuru

Dhives Akuru was used to write the Maldivian language up until the 20th century:

1190C-11913,11915-11916,11918-1192F

Dogra Akkhar

Dogra Akkhar was used to write Dogri:

1180A-1182B

Grantha

Grantha is used in traditional Vedic schools to write Sanskrit:

11315-11328,1132A-11330,11332-11333,11335-11339

Gujarati

The 34 consonants of the Gujarati language:

0A95-0AA8,0AAA-0AB0,0AB2-0AB3,0AB5-0AB9

Gunjala Gondi

Used for writing the Adilabad dialect of the Gondi language:

11D6C-11D89

Gurmukhi

Used for writing the Punjabi language:

0A15-0A28,0A2A-0A30,0A32-0A33,0A35-0A36,0A38-0A39

Hanunó'o

Used for writing the Hanunó'o language:

1723-1731

Javanese

The 20 consonants of the Javanese script in hanacaraka order:

A9B2,A9A4,A995,A9AB,A98F,A9A2,A9A0,A9B1,A9AE-A9AD,A9A5,A99D,A997,A9AA,A99A,A9A9,A992,A9A7,A99B,A994

Kaithi

Historically used for writing legal, administrative, and private records:

1108D-110AF

Kannada

Used for writing Kannada, Konkani, Tulu, Badaga, Kodava, Beary and others:

0C95-0CA8,0CAA-0CB3,0CDE,0CB5-0CB9

Kayah Li

Used for writing Kayah languages:

A90A-A921

Kharoshthi

Used by the Khasa, Saka, and Yuezhi peoples:

10A10-10A13,10A15-10A17,10A19-10A35

Khmer

There are 35 consonants in the Khmer language, though two ("ឝ" and "ឞ") are obsolete:

1780-17A2

Khojki

The Khojki script was used by the Khoja community for Muslim religious literature:

11208-11211,11213-1122B

Khudawadi

Khudawadi, also known as Khudabadi, is used for writing the Sindhi language:

112BA-112DE

Lao

There are 27 consonants in the modern Lao language:

0E81-0E82,0E84,0E87-0E88,0EAA,0E8A,0E8D,0E94-0E97,0E99-0E9F,0EA1-0EA3,0EA5,0EA7,0EAB,0EAD-0EAE

The name pairs "FO TAM"/"FO SUNG" and "LO LING"/"LO LOOT" were accidentally and irrevocably switched when Lao was added to Unicode.

Lepcha

Used for writing the Lepcha (Róng) language:

1C23,1C00,1C02-1C03,1C05-1C0E,1C10-1C11,1C13,1C15,1C17-1C1D,1C1F,1C21-1C20,1C22,1C01,1C04,1C0F,1C12,1C14,1C16,1C1E,1C4D-1C4F

Limbu

Used for writing the Limbu language:

1901-191E

Lontara

Used for writing the Buginese language:

1A00-1A16

Mahajani

Historically used in northern India for writing accounts and financial records:

11155-11172

Makasar

Used in South Sulawesi, Indonesia for writing the Makassarese language:

11EE0-11EF1

Malayalam

Used for writing the Malayalam language:

0D15-0D28,0D2A-0D39

Marchen

Used in the Tibetan Bön tradition to write the extinct Zhang-Zhung language:

11C72-11C8F

Masaram Gondi

Used for writing Gondi but based on Brahmi characters:

11D0C-11D2D

Meetei Mayek

Used for the Meitei language:

ABC0-ABDA

Modi

Used to write the Marathi language:

1160E-1162F

Multani

Used to write the Multani language:

11284-11286,11288,1128A-1128D,1128F-1129D,1129F-112A8

Nandinagari

Historically used to write Sanskrit in southern India:

119AE-119D0

New Tai Lue

The 44 consonants of the Tai Lü language come in pairs to denote two tonal registers (high and low):

1980,1982-1984,1988-198A,198E-1990,1994-1996,199A-199C,19A1-19A2,19A0,19A6-19A7,19AA|1981,1985-1987,198B-198D,1991-1993,1997-1999,199D-199F,19A4-19A5,19A3,19A8-19A9,19AB

Odia

The Odia (or Oriya) script is used for writing the Odia language:

0B15-0B28,0B2A-0B2E,0B5F,0B2F-0B30,0B32-0B33,0B71,0B36-0B39

ʼPhags-pa

There are 41 basic letters in the ʼPhags-pa script, historically used during the Mongol Yuan dynasty:

A840-A868

Prachalit (Newa)

Used to write Sanskrit, Nepali, Hindi, Bengali, and Maithili languages:

1140E-11434

Rejang

The Rejang script is mostly obsolete:

A930-A946

Saurashtra

The Saurashtra script is also mostly obsolete:

A892-A8B3

Sharada

Used for writing Sanskrit and Kashmiri:

11191-111B2

Siddham

Used for writing Sanskrit:

1158E-115AE

Sinhala

In addition to 18 consonants, the modern Sinhala language has 12 independent vowels:

0D9A,0D9C,0D9F-0DA0,0DA2,0DA7,0DA9,0DAB-0DAD,0DAF,0DB3-0DB4,0DB6,0DB8-0DBB|0D85-0D8C,0D91-0D92,0D94-0D95

Soyombo

Developed by the monk and scholar Zanabazar in 1686 to write Mongolian:

11A5C-11A83

The national symbol for Mongolia derives from this script:

11A9E-11AA0

Sundanese

Modern Sundanese has 18 main consonants and 7 independent vowels:

1B8A,1B8C-1B8F,1B91-1B95,1B98-1B9E,1BA0|1B83,1B86,1B84,1B87,1B85,1B88-1B89

Sylheti Nagri

There are 27 main consonants in the Sylheti language:

A807-A80A,A80C-A822

Tagbanwa

There are 13 consonants in the Tagbanwa languages:

1763-176C,176E-1770

Tai Le

Used for writing the Tai Nüa language:

1950-1962

Tai Tham

There are 47 consonants in the full Tai Tham (Lanna) script:

1A20-1A3F|1A40-1A4C,1A53-1A54

Tai Viet

Used for writing the Tai Dam language with 48 consonants split into high and low forms:

AA81,AA83,AA85,AA87,AA89,AA8B,AA8D,AA8F,AA91,AA93,AA95,AA97,AA99,AA9B,AA9D,AA9F,AAA1,AAA3,AAA5,AAA7,AAA9,AAAB,AAAD,AAAF|AA80,AA82,AA84,AA86,AA88,AA8A,AA8C,AA8E,AA90,AA92,AA94,AA96,AA98,AA9A,AA9C,AA9E,AAA0,AAA2,AAA4,AAA6,AAA8,AAAA,AAAC,AAAE

Takri

Used for writing Chambeali and other languages:

1168A-116AA

Tamil

There are 18 basic consonants in the Tamil language:

0B95,0B99-0B9A,0B9E-0B9F,0BA3-0BA4,0BA8,0BAA,0BAE-0BB0,0BB2,0BB5-0BB3,0BB1,0BA9

Telugu

The 35 main consonants of the Telugu language:

0C15-0C28,0C2A-0C30,0C32,0C35,0C33,0C36-0C39,0C31

Thai

The Thai script has 44 consonants:

0E01-0E23,0E25,0E27-0E2E

Tibetan

Tibetan has 30 basic consonants:

0F40-0F42,0F44-0F47,0F49,0F4F-0F51,0F53-0F56,0F58-0F5B,0F5D-0F64,0F66-0F68

Tirhuta

Historically used for the Maithili language with 33 consonants:

1148F-114AF

Zanabazar Square

Used to write Mongolian, Tibetan and Sanskrit:

11A0B-11A32

Ethiopic Scripts

Geʽez Abjad

The Geʽez abjad was used in Ethiopia until the advent of Christianity and had 26 consonants. Vowels were not indicated.

1200,1208,1210,1218,1220,1228,1230,1240,1260,1270,1280,1290,12A0,12A8,12C8,12D0,12D8,12E8,12F0,1308,1320,1330,1338,1340,1348,1350

Geʽez Abugida

Since about 350 A.D. Geʽez has been written as an abugida (alphasyllabary). However, instead of using diacritics to denote vowels, Geʽez uses different letter forms:

1200,1208,1210,1218,1220,1228,1230,1240,1260,1270,1280,1290,12A0,12A8,12C8,12D0,12D8,12E8,12F0,1308,1320,1330,1338,1340,1348,1350|1201,1209,1211,1219,1221,1229,1231,1241,1261,1271,1281,1291,12A1,12A9,12C9,12D1,12D9,12E9,12F1,1309,1321,1331,1339,1341,1349,1351|1202,120A,1212,121A,1222,122A,1232,1242,1262,1272,1282,1292,12A2,12AA,12CA,12D2,12DA,12EA,12F2,130A,1322,1332,133A,1342,134A,1352|1203,120B,1213,121B,1223,122B,1233,1243,1263,1273,1283,1293,12A3,12AB,12CB,12D3,12DB,12EB,12F3,130B,1323,1333,133B,1343,134B,1353|1204,120C,1214,121C,1224,122C,1234,1244,1264,1274,1284,1294,12A4,12AC,12CC,12D4,12DC,12EC,12F4,130C,1324,1334,133C,1344,134C,1354|1205,120D,1215,121D,1225,122D,1235,1245,1265,1275,1285,1295,12A5,12AD,12CD,12D5,12DD,12ED,12F5,130D,1325,1335,133D,1345,134D,1355|1206,120E,1216,121E,1226,122E,1236,1246,1266,1276,1286,1296,12A6,12AE,12CE,12D6,12DE,12EE,12F6,130E,1326,1336,133E,1346,134E,1356|,120F,1217,121F,1227,122F,1237,124B,1267,1277,128B,1297,12A7,12B3,,,12DF,,12F7,1313,1327,1337,133F,,134F,1357|,,,1359,,1358,,,,,,,,,,,,,,,,,,,135A

Bopomofo Phonetic Script

Bopomofo (Zhuyin fuhao) is an official transliteration system in Taiwan.

3105-3129

Unified Canadian Aboriginal Syllabics

Canadian syllabic scripts are abugidas where vowels are denoted by rotation of the consonants instead of by diacritics:

1401,142F,144C,146B,1489,14A3,14C0,14ED,1526,005A|1403,1431,144E,146D,148B,14A5,14C2,14EF,1528,005A|1405,1433,1450,146F,148D,14A7,14C4,14F1,152A,004E|140A,1438,1455,1472,1490,14AA,14C7,14F4,152D,0418

Cherokee Syllabary

The Cherokee syllabary has 86 letters (including the archaic "Ᏽ"):

13A0,,,13A1,,13A2,,13A3-13A5|13A6-13A7,,13A8,,13A9,,13AA-13AC|13AD,,,13AE,,13AF,,13B0-13B2|13B3,,,13B4,,13B5,,13B6-13B8|13B9,,,13BA,,13BB,,13BC-13BD,13F5|13BE-13C1,,13C2,,13C3-13C5|13C6,,,13C7,,13C8,,13C9-13CB|13CC-13CD,,13CE,,13CF,,13D0-13D2|13D3-13D4,,13D5-13DB|13DC-13DD,,13DE,,13DF,,13E0-13E2|13E3,,,13E4,,13E5,,13E6-13E8|13E9,,,13EA,,13EB,,13EC-13EE|13EF,,,13F0,,13F1,,13F2-13F4

Cypro-Minoan Syllabary

12F90-12FAF|12FB0-12FCF|12FD0-12FEF

Cypriot Syllabary

10800-10804|10832-10835|1083C,,,1083F|10805,,,10808|1080A-1080E|1080F-10813|10814-10818|10819-1081D|10837-10838|1081E-10822|10823-10827|10828-1082C|1082D-10831

Deseret Alphabet

The Deseret alphabet was an attempt at spelling reform by the Mormons in the mid-nineteenth century:

10400-10427|10428-1044F

Duployan Shorthand

1BC00-1BC18|1BC19-1BC31|1BC32-1BC45|1BC46-1BC60

Lisu Alphabet

Developed in the early 20th century by missionary James Fraser, with 30 consonants and 10 vowels:

A4D0-A4ED|A4EE-A4F7

Braille

The English Braille alphabet:

2801,2803,2809,2819,2811,280B,281B,2813,280A,281A,2805,2807,280D,281D,2815,280F,281F,2817,280E,281E,2825,2827,283A,282D,283D,2835
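The codepoints in the Braille Patterns block are arranged so that each raised dot corresponds to one bit above U+2800: dot 1 is bit 0, dot 2 is bit 1, and so on up to dot 8. A sketch of that arithmetic (the function name `braille_cell` is hypothetical):

```python
BRAILLE_BASE = 0x2800  # the blank cell, with no dots raised


def braille_cell(*dots: int) -> str:
    """Return the Braille pattern character with the given dots raised."""
    mask = 0
    for dot in dots:
        if not 1 <= dot <= 8:
            raise ValueError("dots are numbered 1 to 8")
        mask |= 1 << (dot - 1)  # dot n sets bit n-1
    return chr(BRAILLE_BASE + mask)
```

For example, "a" is dot 1 (U+2801), "b" is dots 1 and 2 (U+2803), and "z" is dots 1, 3, 5 and 6 (U+2835), matching the first, second and last entries in the list above.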

Sutton SignWriting

Developed in 1974 by Valerie Sutton for writing sign languages. It was based on her experience two years earlier developing a system for writing down dance movements.

1D800,1D80E,1D81E,1D844,1D84C,1D886,1D8A4,1D8BA,1D8CD,1D8F5,1D905,1D916,1D92A,1D9F5,1D9FF,1DA6D,1DA7F

Japanese Syllabaries

Hiragana

3042,3044,3046,3048,304A|304B,304D,304F,3051,3053|3055,3057,3059,305B,305D|305F,3061,3064,3066,3068|306A-306E|306F,3072,3075,3078,307B|307E-3082|3084,,3086,,3088|3089-308D|308F-3090,,3091-3092|,,3093

Katakana

30A2,30A4,30A6,30A8,30AA|30AB,30AD,30AF,30B1,30B3|30AC,30AE,30B0,30B2,30B4|30B5,30B7,30B9,30BB,30BD|30B6,30B8,30BA,30BC,30BE|30BF,30C1,30C4,30C6,30C8|30C0,30C2,30C5,30C7,30C9|30CA-30CE|30CF,30D2,30D5,30D8,30DB|30D0,30D3,30D6,30D9,30DC|30D1,30D4,30D7,30DA,30DD|30DE-30E2|30E4,,30E6,,30E8|30E9-30ED|30EF-30F0,,30F1-30F2|,,30F3

Pahawh Hmong Script

16B1C-16B2F

Nyiakeng Puachue Hmong Script

1E100-1E123

Old Hungarian Alphabet

10C80-10C82,10C84,10C86-10C87,10C89,10C8B-10C94,10C96-10C9D,10C9F-10CA0,10CA2,10CA4-10CA6,10CA8,10CAA-10CB0|10CC0-10CC2,10CC4,10CC6-10CC7,10CC9,10CCB-10CD4,10CD6-10CDD,10CDF-10CE0,10CE2,10CE4-10CE6,10CE8,10CEA-10CF0

Khitan Small Script

18B00-18B1F

Manichaean Abjad

10AC0-10AC7,10AC9-10AE4

Medefaidrin Alphabet

16E40-16E5F|16E60-16E7F

Mende Kikakui Syllabary

1E800-1E807|1E808-1E810|1E811-1E813|1E814-1E81A|1E81B-1E821|1E822-1E82D,1E857,1E887-1E888|1E889-1E88B|1E88C-1E892|1E893-1E89E|1E89F-1E8A5|1E8A6-1E8AC|1E8AD|1E8AE-1E8B4|1E8B5-1E8B7,1E857-1E859|1E85A-1E862|1E863-1E867|1E868-1E876|1E877-1E882|1E883-1E886,1E8B8,1E8B8|1E8B9-1E8BF|1E8C0-1E8C4

Meroitic Cursive Alphabet

109A4-109AF,109B1-109B7|109A0-109A3

Mongolian Alphabet

Traditionally written vertically, from top to bottom, Mongolian script is typically rendered horizontally on devices:

1820-1842

Mro Alphabet

16A40-16A5E

Old North Arabian Alphabet

10A80-10A9C

Old South Arabian Alphabet

10A60-10A7C

Nabataean Abjad

10881,10883-10885,10887-1088B,1088D,1088F,10891,10893,10895-1089B,1089D-1089E|10880,10882,,,10886,,,,,1088C,1088E,10890,10892,10894,,,,,,,1089C

N’Ko Alphabet

07CA-07E7

Ogham Alphabet

The story behind the Ogham space mark is itself interesting.

1681-1694

Ol Chiki Alphabet

1C5A-1C77

Old Turkic Alphabets

Orkhon

10C00,10C03,10C06-10C07,10C09,10C0B,10C0D,10C0F,10C11,10C13-10C14,10C16,10C18,10C1A,10C1C,10C1E,10C20-10C24,10C26,10C28,10C2A,10C2D,10C2F,10C31-10C32,10C34,10C36,10C38,10C3A,10C3C-10C3E,10C41,10C43,10C45,10C47-10C48

Yenisei Variant

10C01-10C02,10C04-10C06,10C08,10C0A,10C0C,10C0E,10C10,10C12-10C14,10C17,10C19,10C1B,10C1D,10C1F-10C23,10C25-10C26,10C28,10C2A,10C2D,10C2F,10C31-10C32,10C35-10C36,10C39,10C3B-10C3E,10C41,10C44,10C46-10C48

Old Uyghur Alphabet

10F70-10F81

Osage Alphabet

104B0-104D3|104D8-104FB

Osmanya Alphabet

10480-1049D

Pau Cin Hau Alphabet

11AC0-11ADB

Old Permic Alphabet

10350-10375

Phoenician Alphabet

10900-10915

Pollard Miao Abugida

16F00-16F05,16F07-16F12,16F14-16F24|16F26-16F31,16F33-16F3E,16F40-16F4A

Hanifi Rohingya Alphabet

10D00-10D1B|10D1D-10D21

Samaritan Abjad

0800-0815

Shavian Alphabet

Named after George Bernard Shaw, who posthumously funded a competition for English language writing reform. The competition was ultimately "won" by Kingsley Read:

10450,1045A,10451,1045B,10452,1045C,10453,1045D,10454,1045E,10455,1045F,10456,10460,10457,10461,10458,10462,10459,10463|10464,1046E,10465,1046F,10466,10470,10467,10471,10468,10472,10469,10473,1046A,10474,1046B,10475,1046C,10476,1046D,10477

Old Sogdian Abjad

10F00,10F02,10F04-10F05,10F07-10F0E,10F11-10F12,10F14-10F15,10F18-10F1A

Sogdian Abjad

10F30-10F3C,10F3E-10F44

Sora Sompeng Alphabet

110D0-110E8

Tangsa Alphabet

16AA0-16ABE

Berber Alphabets

Tifinagh

2D30-2D31,2D33,2D37,2D39,2D3B-2D3D,2D40,2D43-2D45,2D47,2D49-2D4A,2D4D-2D4F,2D53-2D56,2D59-2D5C,2D5F,2D61-2D63,2D65

Northern-Berber Latin

0041-0043,010C,0044,1E0C,0045,0190,0046-0047,01E6,0194,0048,1E24,0049-004E,0051-0052,0158,1E5A,0053,1E62,0054,1E6C,0055,0057-005A,1E92|0061-0063,010D,0064,1E0D,0065,025B,0066-0067,01E7,0263,0068,1E25,0069-006E,0071-0072,0159,1E5B,0073,1E63,0074,1E6D,0075,0077-007A,1E93

Southern-Berber Latin

0041,0102,0042,1E04,0044,1E0C,0045,018E,0046-0047,0194,0048,1E24,0049-004C,1E36,004D-004E,014A,004F,0051-0053,1E62,0160,0054,1E6C,0055,0057-005A,017D,1E92,0393|0061,0103,0062,1E05,0064,1E0D,0065,01DD,0066-0067,0263,0068,1E25,0069-006C,1E37,006D-006E,014B,006F,0071-0073,1E63,0161,0074,1E6D,0075,0077-007A,017E,1E93,0295

Tuareg-Berber Latin

There is no single uppercase codepoint for 'ǰ' (U+01F0), but a combining caron can be used instead: 'J̌' (U+004A U+030C).

0041,0102,018E,0042-0044,1E0C,0045-0047,01E6,0048-004A,,0194,004B-004C,1E36,004D-004E,014A,004F-0053,1E62,0160,0054,1E6C,0055,0057-005A,1E92|0061,0103,01DD,0062-0064,1E0D,0065-0067,01E7,0068-006A,01F0,0263,006B-006C,1E37,006D-006E,014B,006F-0073,1E63,0161,0074,1E6D,0075,0077-007A,1E93
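The missing uppercase codepoint shows up in case conversion. As a sketch, CPython's `str.upper()` applies Unicode's full case mappings, so uppercasing U+01F0 produces a two-codepoint sequence rather than a single character:

```python
import unicodedata

small = "\u01F0"        # LATIN SMALL LETTER J WITH CARON
upper = small.upper()   # full case mapping: capital J plus combining caron

codepoints = [f"U+{ord(ch):04X}" for ch in upper]

# Normalising to NFC cannot shorten it: there is no precomposed capital.
composed = unicodedata.normalize("NFC", upper)
```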

Thaana Abjad

0780-0797

Toto Alphabet

1E290-1E2A0|1E2A1-1E2AD

Ugaritic Abjad

Although pressed into clay with a pointed stick, its thirty symbols were unrelated to Sumero-Akkadian cuneiform.

10380-1039D

Old Persian Semisyllabary

A semi-alphabetic cuneiform script loosely inspired by Sumero-Akkadian cuneiform.

103A3-103C3

Vai Syllabary

A500-A51E|A51F-A53D|A53E-A55C|A55D-A57B|A57C-A59A|A59B-A5B9|A5BA-A5D8|A5D9-A5F7|A5F8-A60B

Vithkuqi Alphabet

10570-1057A,1057C-1058A,1058C-10592,10594-10595|10597-105A1,105A3-105B1,105B3-105B9,105BB-105BC

Warang Citi Alphabet

118A0-118BF|118C0-118DF

Wancho Alphabet

1E2C0-1E2CE|1E2CF-1E2DD|1E2DE-1E2EB

Yezidi Alphabet

Modern Yezidi is a Kurdish alphabet without ligatures.

10E80-10E9A,10E9C-10EA9

Modern Yi Syllabary

A000-A01F

Pahlavi Abjads

Inscriptional Parthian

10B40-10B55

Inscriptional Pahlavi

10B60-10B72

Psalter Pahlavi

10B80-10B91

Old Italic Alphabets

Several Old Italic alphabets share the same Unicode codepoints. It is assumed that font character variations are used to display the glyphs slightly differently, where appropriate.

Marsiliana

From the Marsiliana d'Albegna tablet of the 7th century BCE:

10300-10319

Archaic Etruscan

Used from 7th century BCE to 5th century BCE:

10300,10302,10304-1030D,10310-1031A

Neo-Etruscan

Used from 4th century BCE to 1st century BCE:

10300,10302,10304-10309,1030B-1030D,10310-10311,10313-10316,10318-1031A

Oscan

Used from 5th century BCE to 1st century CE:

10300-10306,10309-1030D,10310,10313-10316,1031A,1031D-1031E

Umbrian

Used from 7th century BCE to 1st century BCE:

10300-10301,10304-1030D,10310-10311,10313-10316,1031A-1031C

Faliscan

Used from 7th century BCE to 2nd century BCE:

10300,10302,10304,10306-1030D,1030F-10310,10312-10317,1032E

North Picene

From only four inscriptions from about 650 BCE:

10300-10305,10308-1030D,1030F-10311,10313,10315,1031E

South Picene

Used from 6th century BCE to 4th century BCE:

10300-10301,10304,10307-1030D,1031F,10310,10312-10316,1031D-1031E

Venetic

Used from 6th century BCE to 1st century BCE:

10300,10304-1030D,1030F-10311,10313-10316,10318-10319,1032D

Raetic

Used from 5th century BCE to 1st century CE:

10300,10304-1030D,10310-10311,10313-10316,10318-10319,1032E-1032F

Cisalpine Celtic

Also known as the Lepontic alphabet. Used from 550 BCE to 100 CE:

10300,10304-10306,10308-1030D,1030F-10311,10313-10316,10319

Runic Alphabets

Elder Futhark

Used from 2nd to 8th centuries CE:

16A0,16A2,16A6,16A8,16B1-16B2,16B7,16B9-16BA,16BE,16C1,16C3,16C7-16CA,16CF,16D2,16D6-16D7,16DA,16DC,16DE-16DF

Anglo-Saxon

Used from 5th to 11th centuries CE:

16A0,16A2-16A3,16A6,16A8-16AB,16B1,16B3,16B7-16B9,16BB,16BE,16C1,16C4,16C7-16C9,16CB,16CF,16D2,16D6-16D7,16DA,16DD-16E5

Younger Futhark (long-branch)

Used in Denmark from 8th to 12th centuries CE:

16A0,16A2,16A6,16AC,16B1,16B4,16BC,16BE,16C1,16C5,16CB,16CF,16D2,16D8,16DA,16E6

Younger Futhark (short-twig)

Used in Sweden and Norway from 8th to 12th centuries CE:

16A0,16A2,16A6,16AD,16B1,16B4,16BD,16BF,16C1,16C6,16CC,16D0,16D3,16D9-16DA,16E7

Medieval

Used from 12th to 15th centuries CE:

16A0-16A2,16A4-16A7,16AE-16B1,16B4-16B6,16BF-16C2,16C5-16C6,16CB-16CE,16D0-16D2,16D4-16D5,16D8-16DB,16E6,16E8-16EA

Cirth

A runic alphabet created by J.R.R. Tolkien for "The Hobbit" to transliterate English.

16AA-16AB,16D2,16B3,16DE,16D6,16A0,16B7,16BB,16C1,16F1,16DA,16D7,16BE,16A9,16C8,16B1,16CB,16CF,16A2,16B9,16C9,16A3,16E3,16A6,16E0,16E5,16DF,16DD,16C7,16F2-16F3

Logographic Scripts

These are scripts that use characters (often pictorial) to represent words or morphemes. This includes hieroglyphs and Chinese characters.

Egyptian Hieroglyphs

13000,13050,1305A,13076,130D2,130FE,1313F,1317F,13188,1319B,131A3,131AD,131EF,13250,1329B,132A8,132AF,132D1,13307,13333,13362,133AF,133CF,133DB

Meroitic Hieroglyphs

10980-1099F

Anatolian Hieroglyphs

14400,1446A,144B0,144D1,14502,14530,14577,14596,145AD,145E6,14629

Sumero-Akkadian Cuneiform

Originally used for the Sumerian language, cuneiform was also used for Akkadian (Assyrian/Babylonian), Eblaite, Amorite, Elamite, Hattic, Hurrian, Urartian, Hittite and other languages.

12000-1201F

See also Ugaritic abjad.

Linear A

Still undeciphered but assumed to be syllabic.

10600-1061F

Linear B

Deciphered by Michael Ventris in 1952.

10000-1000B,1000D-10020|10080-1009F

Han (Hanzi, Kanji, Hanja)

73B0,4EE3,6C49,8BED,5E38,7528,5B57,8868

Nüshu

A syllabic script created and used exclusively by women in Hunan Province, China. Women were forbidden formal education there and developed the script in order to communicate with one another.

1B170-1B18F

Tangut

Used for writing the extinct Tangut language of the Western Xia dynasty, China.

17000-1701F

Other Blocks

Aegean Numbers

The "Aegean Numbers" Aegean_Numbers block includes symbols for units (1-9, first row), tens (10-90, second row), hundreds, thousands and ten-thousands used in Linear A, Linear B and the Cypriot syllabary.

10107-1010F|10110-10118|10119-10121|10122-1012A|1012B-10133
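Because each row of the table holds nine consecutive codepoints for one decimal place, a number can be mapped to Aegean numerals with simple arithmetic. The following is only a sketch: the function name is hypothetical, and it assumes purely additive notation with the largest values written first, which is a simplification rather than a claim about how scribes actually ordered the signs.

```python
AEGEAN_ONE = 0x10107  # first codepoint of the units row shown above


def aegean_codepoints(n):
    """Return the Aegean number codepoints for n (1 to 99999), largest place first."""
    if not 1 <= n <= 99999:
        raise ValueError("Aegean numbers in this block cover 1 to 99999")
    codepoints = []
    # place 4 = ten-thousands, ..., place 0 = units; each place owns 9 codepoints
    for place in range(4, -1, -1):
        digit = (n // 10 ** place) % 10
        if digit:
            codepoints.append(AEGEAN_ONE + 9 * place + (digit - 1))
    return codepoints


# Format the result in the same comma-separated hexadecimal style used on this page
print(",".join(f"{cp:04X}" for cp in aegean_codepoints(1234)))
```

A zero digit simply contributes no sign, since the system has no placeholder for zero in this block.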

Alchemical Symbols

The "Alchemical Symbols" Alchemical block:

1F700-1F713|1F714-1F727|1F728-1F73B|1F73C-1F74F|1F750-1F763|1F764-1F773

Alphabetic Presentation Forms

The "Alphabetic Presentation Forms" Alphabetic_PF block includes Latin, Armenian and Hebrew ligatures:

FB00-FB06

Ancient Greek Musical Notation

The "Ancient Greek Musical Notation" Ancient_Greek_Music block:

1D200-1D21C|1D21D-1D241|1D242-1D245

Ancient Greek Numbers

The "Ancient Greek Numbers" Ancient_Greek_Numbers block:

10140-10157

Ancient Symbols

The "Ancient Symbols" Ancient_Symbols block:

10190-1019C,101A0

Arabic Mathematical Alphabetic Symbols

The "Arabic Mathematical Alphabetic Symbols" Arabic_Math block:

1EE00-1EE03,1EE05-1EE1F

Arrows

The "Arrows" Arrows|Sup_Arrows_A|Sup_Arrows_B|Sup_Arrows_C blocks:

2190-21A8|21A9-21C3|21C4-21DB|21DC-21F6|21F7-21FF
27F0-27FF
2900-2918|2919-2932|2933-2949|294A-2961|2962-297F
1F800-1F80B,1F810-1F81F|1F820-1F83B|1F83C-1F847,1F850-1F859|1F860-1F877|1F878-1F887,1F890-1F89B|1F89C-1F8AD,1F8B0-1F8B1

Block Elements

The "Block Elements" Block_Elements block:

2580-259F

Box Drawing

The "Box Drawing" Box_Drawing block:

2500-250B|250C-251B|251C-252B|252C-253B|253C-254F|2550-256C|256D-257F

Byzantine Musical Symbols

The "Byzantine Musical Symbols" Byzantine_Music block:

1D000-1D01F

Chess Symbols

The "Chess Symbols" Chess_Symbols block (when combined with standard chess pieces U+2654-265F from the "Miscellaneous Symbols" block):

2654-265F,1FA00-1FA05|1FA09-1FA1A|1FA1E-1FA2F|1FA33-1FA44

CJK Compatibility

The "CJK Compatibility" CJK_Compat block includes symbols for hours of the day, days of the month and various Latin abbreviations for units:

3358-3370|33E0-33FE|3371-337A,3380-3394|3395-33B3|33B4-33C1|33C2-33DF,33FF

CJK Compatibility Forms

The "CJK Compatibility Forms" CJK_Compat_Forms block:

FE30-FE4F

CJK Strokes

The "CJK Strokes" CJK_Strokes block:

31C0-31E3

CJK Symbols and Punctuation

The "CJK Symbols and Punctuation" CJK_Symbols block:

3000-3020|3021-303F

Control Pictures

The "Control Pictures" Control_Pictures block:

2400-2426

Coptic Epact Numbers

The "Coptic Epact Numbers" Coptic_Epact_Numbers block:

102E0-102FB

Counting Rod Numerals

The "Counting Rod Numerals" Counting_Rod block:

1D360-1D378

Currency Symbols

The "Currency Symbols" Currency_Symbols block:

20A0-20C0

Combining Diacritical Marks

The "Combining Diacritical Marks" Diacriticals|Diacriticals_Ext|Diacriticals_Sup|Diacriticals_For_Symbols|Half_Marks blocks:

0300-031F|0320-033F|0340-035F|0360-036F
1AB0-1ACE
20D0-20F0
1DC0-1DDF|1DE0-1DFF
FE20-FE2F

For example, to compose the missing uppercase 'J' with caron, we use a combining caron (U+004A U+030C):

004A,030C
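The difference between a precomposed character and a combining sequence can be checked with Python's standard unicodedata module. This is a small sketch (the helper name compose is made up for illustration): NFC normalisation merges lowercase "j" plus the combining caron into the single codepoint U+01F0, but leaves the uppercase pair as two codepoints because no precomposed form exists.

```python
import unicodedata

COMBINING_CARON = "\u030C"


def compose(base, mark=COMBINING_CARON):
    """Apply NFC normalisation to a base letter followed by a combining mark."""
    return unicodedata.normalize("NFC", base + mark)


lower = compose("j")  # a precomposed U+01F0 exists, so this becomes one codepoint
upper = compose("J")  # no precomposed form, so the sequence stays as two codepoints

for text in (lower, upper):
    print(",".join(f"{ord(ch):04X}" for ch in text))
```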

Dingbats

The "Dingbats" Dingbats block:

2713-2727,2729-2730|2731-274B,274D,274F-2752|2756,2758-2775|2776-2793|2794,2798-27AF,27B1-27B2|27B3-27BE

Domino Tiles

The "Domino Tiles" Domino block:

1F031-1F037,,1F063-1F069|1F038-1F03E,,1F06A-1F070|1F03F-1F045,,1F071-1F077|1F046-1F04C,,1F078-1F07E|1F04D-1F053,,1F07F-1F085|1F054-1F05A,,1F086-1F08C|1F05B-1F061,,1F08D-1F093

Emoticons

The "Emoticons" Emoticons block:

1F600-1F60F|1F610-1F61F|1F620-1F62F|1F630-1F63F|1F640-1F64F

Enclosed Alphanumerics

The "Enclosed Alphanumerics" Enclosed_Alphanum block:

2460-2473|2474-2487|2488-249B|249C-24B5|24B6-24CF|24D0-24E9|24F5-24FE
1F102-1F10A|1F110-1F129|1F130-1F149|1F150-1F169|1F170-1F189|1F1E6-1F1FF

The "Enclosed Alphanumerics Supplement" Enclosed_Alphanum_Sup block includes twenty-six "Regional indicator symbols" which can be paired together to produce regional flags with the right font support. In this case, I'm using the BabelStone Flags webfont:

1F1E6,1F1E9,,1F1E6,1F1EA,,1F1E6,1F1EB,,1F1E6,1F1EC,,1F1E6,1F1F1,,1F1E6,1F1F2,,1F1E6,1F1F4,,1F1E6,1F1F6,,1F1E6,1F1F7,,1F1E6,1F1F9,,1F1E6,1F1FA,,1F1E6,1F1FC,,1F1E6,1F1FD,,1F1E6,1F1FF|1F1E7-1F1E6,,1F1E7,1F1E7,,1F1E7,1F1E9,,1F1E7,1F1EA,,1F1E7,1F1EB,,1F1E7,1F1EC,,1F1E7,1F1ED,,1F1E7,1F1EE,,1F1E7,1F1EF,,1F1E7,1F1F3,,1F1E7,1F1F1,,1F1E7,1F1F4,,1F1E7,1F1F6,,1F1E7,1F1F7|1F1E7,1F1F8,,1F1E7,1F1F9,,1F1E7,1F1FB,,1F1E7,1F1FC,,1F1E7,1F1FE|1F1E8,1F1E6,,1F1E8,1F1E8,,1F1E8-1F1E9,,1F1E8,1F1EB,,1F1E8,1F1EC,,1F1E8,1F1ED,,1F1E8,1F1EE,,1F1E8,1F1F0,,1F1E8,1F1F1,,1F1E8,1F1F2,,1F1E8,1F1F3,,1F1E8,1F1F4,,1F1E8,1F1F5,,1F1E8,1F1F7|1F1E8,1F1FA,,1F1E8,1F1FB,,1F1E8,1F1FC,,1F1E8,1F1FD,,1F1E8,1F1FE,,1F1E8,1F1FF|1F1E9-1F1EA,,1F1E9,1F1F0,,1F1E9,1F1FF|1F1EA,1F1EA,,1F1EA,1F1F8,,1F1EA,1F1FA|1F1EB,1F1EE,,1F1EB,1F1F4,,1F1EB,1F1F7|1F1EC,1F1E7,,1F1EC,1F1EA,,1F1EC-1F1EB,,1F1EC,1F1EC,,1F1EC,1F1EE,,1F1EC,1F1F1,,1F1EC,1F1F5,,1F1EC,1F1F7,,1F1EC,1F1FE|1F1ED,1F1F0,,1F1ED,1F1F7,,1F1ED,1F1FA|1F1EE,1F1E9,,1F1EE,1F1EA,,1F1EE,1F1F1,,1F1EE,1F1F2,,1F1EE,1F1F3,,1F1EE,1F1F6,,1F1EE,1F1F7,,1F1EE,1F1F8,,1F1EE,1F1F9|1F1EF,1F1EA,,1F1EF,1F1F5|1F1F0,1F1EC,,1F1F0,1F1ED,,1F1F0,1F1F5,,1F1F0,1F1F7,,1F1F0,1F1FF|1F1F1,1F1E6,,1F1F1,1F1EE,,1F1F1,1F1F9,,1F1F1,1F1FA,,1F1F1,1F1FB|1F1F2,1F1E8,,1F1F2,1F1E9,,1F1F2,1F1EA,,1F1F2,1F1EB,,1F1F2,1F1F0,,1F1F2-1F1F1,,1F1F2,1F1F2,,1F1F2-1F1F3,,1F1F2,1F1F4,,1F1F2,1F1F6,,1F1F2,1F1F7,,1F1F2,1F1F9,,1F1F2,1F1FE|1F1F3,1F1EC,,1F1F3,1F1F1,,1F1F3-1F1F4,,1F1F3,1F1F5,,1F1F3,1F1FA,,1F1F3,1F1FF|1F1F5,1F1EA,,1F1F5,1F1EB,,1F1F5,1F1EC,,1F1F5,1F1ED,,1F1F5,1F1F1,,1F1F5,1F1F2,,1F1F5,1F1F7,,1F1F5,1F1F8,,1F1F5,1F1F9|1F1F6,1F1E6|1F1F7,1F1EA,,1F1F7,1F1F4,,1F1F7-1F1F8,,1F1F7,1F1FA|1F1F8,1F1E6,,1F1F8,1F1EA,,1F1F8,1F1EC,,1F1F8,1F1EE,,1F1F8,1F1EF,,1F1F8,1F1F0,,1F1F8,1F1F2,,1F1F8-1F1F7,,1F1F8,1F1FA,,1F1F8,1F1FD|1F1F9,1F1EB,,1F1F9,1F1ED,,1F1F9,1F1EF,,1F1F9,1F1F2,,1F1F9,1F1F7,,1F1F9,1F1FC|1F1FA,1F1E6,,1F1FA,1F1F2,,1F1FA,1F1F3,,1F1FA,1F1F8,,1F1FA,1F1FF|1F1FB,1F1E6,,1F1FB,1F1F3|1F1FC,1F1EB|1F1FD,1F1F0|1F1FE,1F1F9|1F1FF,1F1E6,,1F1FF,1F1FC
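Each pair in the table above is just a two-letter region code with every letter shifted into the regional indicator range, where "A" corresponds to U+1F1E6. A minimal sketch of that mapping follows; the function names are hypothetical, and whether a given pair actually renders as a flag still depends on the font.

```python
REGIONAL_INDICATOR_A = 0x1F1E6  # regional indicator symbol letter A


def flag_codepoints(region_code):
    """Map a two-letter region code such as 'AD' to its regional indicator codepoints."""
    code = region_code.upper()
    if len(code) != 2 or not all("A" <= ch <= "Z" for ch in code):
        raise ValueError("expected two letters A-Z, got %r" % region_code)
    return [REGIONAL_INDICATOR_A + ord(ch) - ord("A") for ch in code]


def flag_spec(region_code):
    """Format the pair in the comma-separated hexadecimal style used on this page."""
    return ",".join(f"{cp:04X}" for cp in flag_codepoints(region_code))


print(flag_spec("AD"))  # the first pair in the table above
```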

Enclosed CJK Letters and Months

The "Enclosed CJK Letters and Months" Enclosed_CJK block:

3220-3230|3280-3290|32C0-32CB|32FF

Enclosed Ideographic Supplement

The "Enclosed Ideographic Supplement" Enclosed_Ideographic_Sup block includes six symbols from Chinese folk religion: "luck", "prosperity", "longevity", "happiness", "double happiness" and "wealth":

1F260-1F265

Geometric Shapes

The "Geometric Shapes" Geometric_Shapes block:

25A0-25A9|25AA-25C7|25C8-25E5|25E6-25FF

Geometric Shapes Extended

The "Geometric Shapes Extended" Geometric_Shapes_Ext block:

1F780-1F783|1F784-1F7A0|1F7A1-1F7BF|1F7C0-1F7D8|1F7E0-1F7EB

Halfwidth and Fullwidth Forms

The "Halfwidth and Fullwidth Forms" Half_And_Full_Forms block includes fullwidth versions of the ASCII characters for use alongside ideographic glyphs.

,FF01-FF0F|FF10-FF1F|FF20-FF2F|FF30-FF3F|FF40-FF4F|FF50-FF5E
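The fullwidth forms U+FF01 to U+FF5E sit at a fixed offset from the printable ASCII characters U+0021 to U+007E, so converting between the two is a single addition. The sketch below (with a hypothetical function name) leaves the space untouched, matching the empty first cell in the table above, and uses NFKC normalisation from the standard library to go back to ASCII.

```python
import unicodedata

FULLWIDTH_OFFSET = 0xFF01 - 0x21  # distance from "!" to fullwidth "!"


def to_fullwidth(text):
    """Replace printable ASCII (except space) with its fullwidth counterpart."""
    out = []
    for ch in text:
        cp = ord(ch)
        if 0x21 <= cp <= 0x7E:
            out.append(chr(cp + FULLWIDTH_OFFSET))
        else:
            out.append(ch)  # space and non-ASCII characters pass through unchanged
    return "".join(out)


wide = to_fullwidth("Unicode 14")
# NFKC applies compatibility mappings, which fold fullwidth forms back to ASCII
narrow = unicodedata.normalize("NFKC", wide)
print(wide, narrow)
```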

Ideographic Description Characters

The "Ideographic Description Characters" IDC block:

2FF0-2FFB

Indic Number Forms

The "Common Indic Number Forms" Indic_Number_Forms block:

A830-A839

Indic Siyaq Numbers

The "Indic Siyaq Numbers" Indic_Siyaq_Numbers block:

1EC71-1EC8B|1EC8C-1ECA6|1ECA7-1ECB4

International Phonetic Alphabet

The "IPA Extensions", "Phonetic Extensions" and "Phonetic Extensions Supplement" IPA_Ext|Phonetic_Ext|Phonetic_Ext_Sup blocks:

1D00-1D1F|1D20-1D3F|1D40-1D5F|1D60-1D7F
1D80-1D9F|1DA0-1DBF
0250-026F|0270-028F|0290-02AF

Kanbun

The "Kanbun" Kanbun block:

3190-319F

The block name in Unicode 1.0 was "CJK Miscellaneous" and its codepoint range was defined differently, including the then-unallocated space now occupied by "Bopomofo Extended", "CJK Strokes" and "Katakana Phonetic Extensions".

Letterlike Symbols

The "Letterlike Symbols" Letterlike_Symbols block:

2100-211F|2120-213F|2140-214F

Mahjong Tiles

The "Mahjong Tiles" Mahjong block:

1F000-1F018|1F019-1F02B

Mathematical Alphanumeric Symbols

The "Mathematical Alphanumeric Symbols" Math_Alphanum block:

1D400-1D419|1D41A-1D433|1D434-1D44D|1D44E-1D467|1D468-1D481|1D482-1D49B|1D49C-1D4B5|1D4B6-1D4CF|1D4D0-1D4E9|1D4EA-1D503|1D504-1D51D|1D51E-1D537|1D538-1D551|1D552-1D56B|1D56C-1D585|1D586-1D59F|1D5A0-1D5B9|1D5BA-1D5D3|1D5D4-1D5ED|1D5EE-1D607|1D608-1D621|1D622-1D63B|1D63C-1D655|1D656-1D66F|1D670-1D689|1D68A-1D6A5|1D6A8-1D6C1|1D6C2-1D6E1|1D6E2-1D6FB|1D6FC-1D71B|1D71C-1D735|1D736-1D755|1D756-1D76F|1D770-1D78F|1D790-1D7A9|1D7AA-1D7C9|1D7CA-1D7CB|1D7CE-1D7D7|1D7D8-1D7E1|1D7E2-1D7EB|1D7EC-1D7F5|1D7F6-1D7FF

Mathematical Operators

The "Mathematical Operators" Math_Operators|Sup_Math_Operators blocks:

2200-221F|2220-223F|2240-225F|2260-227F|2280-229F|22A0-22BF|22C0-22DF|22E0-22FF
2A00-2A1F|2A20-2A3F|2A40-2A5F|2A60-2A7F|2A80-2A9F|2AA0-2ABF|2AC0-2ADF|2AE0-2AFF

Mayan Numerals

The "Mayan Numerals" Mayan_Numerals block:

1D2E0-1D2F3

Miscellaneous Mathematical Symbols

The "Miscellaneous Mathematical Symbols" Misc_Math_Symbols_A|Misc_Math_Symbols_B blocks:

27C0-27DF|27E0-27EF
2980-299F|29A0-29BF|29C0-29DF|29E0-29FF

Miscellaneous Symbols

The "Miscellaneous Symbols" Misc_Symbols block:

2600-261F|2620-263F|2640-265F|2660-267F|2680-269F|26A0-26BF|26C0-26DF|26E0-26FF

Miscellaneous Symbols and Arrows

The "Miscellaneous Symbols and Arrows" Misc_Arrows block:

2B00-2B19,2B1D-2B24|2B25-2B46|2B47-2B54,2B56-2B69|2B6A-2B73,2B76-2B8D|2B8E-2B95,2B97-2BB0|2BB1-2BD2|2BD3-2BF4|2BF5-2BFF

Miscellaneous Symbols and Pictographs

The "Symbols and Pictographs" Misc_Pictographs|Sup_Symbols_And_Pictographs|Symbols_And_Pictographs_Ext_A blocks:

1F300-1F31F|1F320-1F33F|1F340-1F35F|1F360-1F37F|1F380-1F39F|1F3A0-1F3BF|1F3C0-1F3DF|1F3E0-1F3FF|1F400-1F41F|1F420-1F43F|1F440-1F45F|1F460-1F47F|1F480-1F49F|1F4A0-1F4BF|1F4C0-1F4DF|1F4E0-1F4FF|1F500-1F51F|1F520-1F53F|1F540-1F55F|1F560-1F57F|1F580-1F59F|1F5A0-1F5BF|1F5C0-1F5DF|1F5E0-1F5FF
1F900-1F91F|1F920-1F93F|1F940-1F95F|1F960-1F97F|1F980-1F99F|1F9A0-1F9BF|1F9C0-1F9DF|1F9E0-1F9FF
1FA70-1FA74,1FA78-1FA7C,1FA80-1FA86,1FA90-1FA9E|1FA9F-1FAAC,1FAB0-1FABA,1FAC0-1FAC5,1FAD0|1FAD1-1FAD9,1FAE0-1FAE7,1FAF0-1FAF6

Miscellaneous Technical

The "Miscellaneous Technical" Misc_Technical block:

2300-231F|2320-233F|2340-235F|2360-237F|2380-239F|23A0-23BF|23C0-23DF|23E0-23FF

Modifier Letters

The "Spacing Modifier Letters" Modifier_Letters block:

02B0-02C1|02D8-02DD

Modifier Tone Letters

The "Modifier Tone Letters" Modifier_Tone_Letters block:

A700-A71F

Musical Symbols

The "Musical Symbols" Music block:

1D100-1D11F|1D120-1D126,1D129-1D141|1D142-1D158,1D15A-1D162|1D163-1D164,1D16A-1D16C,1D18C-1D1A6|1D1A7-1D1C6|1D1C7-1D1E6|1D1E7-1D1EA

Number Forms

The "Number Forms" Number_Forms block:

2150-215F|2160-216F|2170-217F|2180-218B

Optical Character Recognition

The "Optical Character Recognition" OCR block:

2440-244A

Ornamental Dingbats

The "Ornamental Dingbats" Ornamental_Dingbats block:

1F650-1F66F|1F670-1F67F

Ottoman Siyaq Numbers

The "Ottoman Siyaq Numbers" Ottoman_Siyaq_Numbers block:

1ED01-1ED09|1ED0A-1ED12|1ED13-1ED1B|1ED1C-1ED24|1ED25-1ED2D|1ED2E-1ED3D

Phaistos Disc

The "Phaistos Disc" Phaistos block:

101D0-101E6|101E7-101FD

Playing Cards

The "Playing Cards" Playing_Cards block:

1F0A1-1F0AE|1F0B1-1F0BE|1F0C1-1F0CE|1F0D1-1F0DE

Private Use Areas

The blocks PUA|Sup_PUA_A|Sup_PUA_B are reserved for private use. Their codepoints can be assigned to characters and scripts not covered by the Unicode Standard, e.g. Klingon.

The ConScript Unicode Registry maintains a list of Private Use codepoints allocated for constructed/artificial scripts. The text below is rendered with the "Klingon pIqaD HaSta" font.

F8D0-F8E9|F8F0-F8F9|F8FD-F8FF

Punctuation

The "General Punctuation" Punctuation block:

2000-200F|2028-202F|205F-2064|2066-206F
2010-2027|2030-2047|2048-205E

The "Supplemental Punctuation" Sup_Punctuation block:

2E00-2E1F|2E20-2E3F|2E40-2E5D

Rumi Numeral Symbols

The "Rumi Numeral Symbols" Rumi block:

10E60-10E7E

Shorthand Format Controls

The "Shorthand Format Controls" Shorthand_Format_Controls block:

1BCA0-1BCA3

Sinhala Archaic Numbers

The "Sinhala Archaic Numbers" Sinhala_Archaic_Numbers block:

111E1-111F4

Small Form Variants

The "Small Form Variants" Small_Forms block contains small punctuation characters for compatibility with the Chinese National Standard 11643.

FE50-FE52,FE54-FE66,FE68-FE6B

Specials

The "Specials" Specials block:

FFF9-FFFF

Superscripts and Subscripts

The "Superscripts and Subscripts" Super_And_Sub block:

2070-2071,2074-207F|2080-208E|2090-209C

Surrogates

The blocks High_Surrogates|High_PU_Surrogates|Low_Surrogates are reserved for surrogate codepoints, which UTF-16 uses in pairs to encode codepoints above U+FFFF.
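As an illustration of how a codepoint above U+FFFF is split across a high and a low surrogate in UTF-16, here is a short sketch. The function name is made up for this example; the result is cross-checked against Python's built-in UTF-16 encoder.

```python
def to_surrogates(codepoint):
    """Split a supplementary-plane codepoint into its UTF-16 (high, low) surrogates."""
    if not 0x10000 <= codepoint <= 0x10FFFF:
        raise ValueError("only codepoints above U+FFFF need surrogates")
    offset = codepoint - 0x10000       # leaves a 20-bit value
    high = 0xD800 + (offset >> 10)     # top 10 bits select the high surrogate
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits select the low surrogate
    return high, low


high, low = to_surrogates(0x1F600)
print(f"{high:04X},{low:04X}")

# Cross-check against the built-in big-endian UTF-16 encoder
encoded = chr(0x1F600).encode("utf-16-be")
```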

Symbols for Legacy Computing

The "Symbols for Legacy Computing" Symbols_For_Legacy_Computing block:

1FB00-1FB1F|1FB20-1FB3F|1FB40-1FB5F|1FB60-1FB7F|1FB80-1FB9F|1FBA0-1FBBF|1FBC0-1FBCA|1FBF0-1FBF9

Tags

The "Tags" Tags block:

E0001|E0020-E002F|E0030-E003F|E0040-E004F|E0050-E005F|E0060-E006F|E0070-E007F
1F3F4,E0063,E006E,E0068,E006B,E007F
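The tag characters mirror printable ASCII at an offset of 0xE0000, and the second row above is a base character (U+1F3F4) followed by tag letters and the cancel tag U+E007F. A sketch of building such a sequence is below; the function name is hypothetical, and the code makes no claim about which sequences fonts actually support.

```python
TAG_OFFSET = 0xE0000   # tag characters mirror ASCII at this offset
CANCEL_TAG = 0xE007F   # terminates a tag sequence


def tag_sequence(base, tag_text):
    """Return codepoints for base + tag characters spelling tag_text + cancel tag."""
    codepoints = [base]
    for ch in tag_text:
        if not 0x20 <= ord(ch) <= 0x7E:
            raise ValueError("tag text must be printable ASCII")
        codepoints.append(TAG_OFFSET + ord(ch))
    codepoints.append(CANCEL_TAG)
    return codepoints


# Reproduces the second row of the table above
print(",".join(f"{cp:04X}" for cp in tag_sequence(0x1F3F4, "cnhk")))
```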

Tai Xuan Jing Symbols

The "Tai Xuan Jing Symbols" Tai_Xuan_Jing block:

1D300-1D305|1D306-1D320|1D321-1D33B|1D33C-1D356

Transport and Map Symbols

The "Transport and Map Symbols" Transport_And_Map block:

1F680-1F69F|1F6A0-1F6BF|1F6C0-1F6D7,1F6DD-1F6E4|1F6E5-1F6EC,1F6F0-1F6FC

Variation Selectors

The "Variation Selectors" VS|VS_Sup blocks:

FE00-FE0F
E0100-E010F|E0110-E011F|E0120-E012F|E0130-E013F|E0140-E014F|E0150-E015F|E0160-E016F|E0170-E017F|E0180-E018F|E0190-E019F|E01A0-E01AF|E01B0-E01BF|E01C0-E01CF|E01D0-E01DF|E01E0-E01EF

Yijing Hexagram Symbols

The "Yijing Hexagram Symbols" Yijing block:

4DC0-4DDF|4DE0-4DFF

Znamenny Musical Notation

The "Znamenny Musical Notation" Znamenny_Music block:

1CF50-1CF6F|1CF70-1CF8F|1CF90-1CFAF|1CFB0-1CFC3

Miscellanea

Many of these topics are covered by my "Unicode Trivia" blog posts.

Development of the English Alphabet

The following should be taken with a pinch of salt. In particular, changes in the positions of letters and the re-allocation of sounds are completely ignored. But according to the sidebars for individual letters on Wikipedia, the majuscules of the English alphabet developed as follows:

  1. Phoenician alphabet (c.1050 BCE)
  2. Ancient Greek alphabet (c.750 BCE)
  3. Etruscan alphabet (c.700 BCE)
  4. Archaic Latin alphabet (c.600 BCE)
  5. Old Latin alphabet (c.250 BCE)
  6. Classical Latin alphabet (c.50 CE)
  7. Old English alphabet (c.750 CE)
  8. Modern English alphabet (c.1550 CE)

10900,,10901-10903,,10904-10905,,10907,,,1090A-1090D,1090F-10910,10912-10915,,,,,,,10906|0391,,0392-0394,,0395,03DC,,0397,0399,,039A-039D,039F-03A0,03D8,03A1,03A3-03A4,,,,,03A7,03A5,0396|10300,,10301-10303,,10304-10305,,10307,10309,,1030A-1030D,1030F-10310,10312-10315,,,10316,,10317,,10306|0041,,0042-0044,,0045-0046,,0048-0049,,004B-0054,,,0056,,0058,,005A|0041,,0042-0044,,0045-0049,,004B-0054,,,0056,,0058|0041,,0042-0044,,0045-0049,,004B-0054,,,0056,,0058-005A|0041,00C6,0042-0044,00D0,0045-0046,A77D,0048-0049,,,004C-0050,,0052-0054,00DE,0055,,01F7,0058-0059|0041,,0042-0044,,0045-0054,,0055-005A

Index

This is a list of all blocks in Unicode 14.0: