Note: This part of my post contains technical terms, so be prepared to encounter a bunch of them!
And this is going to be a long, dry and boring post.
To know more about a term, open google.com and type "define:your_keyword",
where your_keyword is the word you want Google to define for you.
Now the question:
What is Unicode? The Unicode Consortium answers it like this:
Unicode provides a unique number for every character,
no matter what the platform,
no matter what the program,
no matter what the language.
Fundamentally, computers just deal with numbers. They store letters and other characters by assigning a number for each one. Before Unicode was invented, there were hundreds of different encoding systems for assigning these numbers. No single encoding could contain enough characters: for example, the European Union alone requires several different encodings to cover all its languages. Even for a single language like English no single encoding was adequate for all the letters, punctuation, and technical symbols in common use.
These encoding systems also conflict with one another. That is, two encodings can use the same number for two different characters, or use different numbers for the same character. Any given computer (especially servers) needs to support many different encodings; yet whenever data is passed between different encodings or platforms, that data always runs the risk of corruption.
For more, visit:
http://www.unicode.org/standard/WhatIsUnicode.html
Now, the ASCII table contains only 128 characters (0-127), which include the upper and lower case letters of English, the digits 0 to 9, punctuation marks and control codes for the system. Even the 8-bit extensions of ASCII have room for just 256 (0-255). But there are hundreds of languages in the world, many of which have their own scripts. To solve this problem,
the Unicode Consortium was established.
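To make the numbers concrete, here is a small Python sketch (Python is my choice for illustration, nothing in this post depends on it). It counts the 7-bit ASCII codes and then shows the conflict described in the quote above: the same byte means different characters in different old encodings. The variable names are made up for this example, and the three encodings are just ones I picked from those that ship with Python.

```python
# ASCII proper is a 7-bit code, so it has 128 code points (0-127).
ascii_chars = [chr(n) for n in range(128)]
print(len(ascii_chars))                # 128
print(ord("A"), ord("a"), ord("0"))    # 65 97 48

# One byte value, three legacy 8-bit encodings, three different characters.
raw = bytes([0xE4])
as_latin1 = raw.decode("latin-1")      # Western European: 'ä'
as_cp1251 = raw.decode("cp1251")       # Cyrillic (Windows): 'д'
as_cp1256 = raw.decode("cp1256")       # Arabic (Windows)
print(as_latin1, as_cp1251, as_cp1256)

# The reverse problem: one character, different numbers in different encodings.
de_cp1251 = "д".encode("cp1251")
de_koi8r = "д".encode("koi8-r")
print(de_cp1251.hex(), de_koi8r.hex())

# In Unicode the character has one number, whatever the platform.
print(f"U+{ord('д'):04X}")             # U+0434
```

If you run it, you can see why passing data between such encodings risks corruption: the byte alone does not tell you which character was meant.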
You have to read the Unicode website thoroughly to get an idea of what Unicode really is, and compare it with ASCII (American Standard Code for Information Interchange) to see the difference.
Unicode supports many scripts. One script may serve several languages. For example, Arabic, Baluchi, Farsi, Urdu etc. are all written in the Arabic script, which means these languages share a common writing system.
Every script in Unicode has a code (the Arabic script, for example, is Arab in ISO 15924), and each language written in that script has a language code of its own.
The Unicode website maintains and updates the list of "alive" scripts and languages.
The language code for Baluchi is bal (ISO 639-2).
Besides, Unicode is also a huge repository of the letters of (almost) every language of the world. As said in the beginning, each letter is given a unique code.
Arabic is given the range 0600 to 06FF. This means 0600 (the Unicode code for the Arabic number sign) is the first character in the range and 06FF (the Unicode code for
ۿ) is the last. Within this range are the letters used for Arabic, Farsi, Urdu, Sindhi etc. (Syriac has a separate range of its own, 0700 to 074F.)
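If you have Python at hand, you can peek into this range yourself. The sketch below uses the standard unicodedata module to look up the official name of a few codes between 0600 and 06FF. The helper name describe and the constants are invented for this example, and the names you get depend on the Unicode version bundled with your Python.

```python
import unicodedata

# First and last code of the Arabic range mentioned above.
ARABIC_START, ARABIC_END = 0x0600, 0x06FF

def describe(code_point):
    """Return a string like 'U+062C ARABIC LETTER JEEM' for a code point."""
    ch = chr(code_point)
    # The second argument is a fallback for codes that have no name.
    name = unicodedata.name(ch, "<unnamed>")
    return f"U+{code_point:04X} {name}"

# Number of codes in the range, both ends included.
block_size = ARABIC_END - ARABIC_START + 1
print(block_size)  # 256

for cp in (ARABIC_START, 0x0627, 0x062C, ARABIC_END):
    print(describe(cp))
```

Change the tuple in the loop to range(ARABIC_START, ARABIC_END + 1) if you want to list the whole range.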
If we try, we can (actually, we should) submit missing characters such as
to Unicode.
...and for fun or knowledge:
Open MS Word and type a 4-digit code, which can be any combination of the letters a to f and the numbers 0-9.
Try this:
Type 062C, keep the cursor just to the right of the C, hold down the Alt key, press X and release both keys. Tell me what you see!
Then you can try the next consecutive code, which is 062D.
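No MS Word around? Roughly the same conversion can be sketched in Python. This is only my approximation of what Alt+X does with a hex code, not how Word implements it, and the function names from_hex_code and to_hex_code are made up for this post.

```python
def from_hex_code(hex_code):
    """Turn a hex code such as '062C' into the character it stands for."""
    return chr(int(hex_code, 16))

def to_hex_code(char):
    """Go the other way: a single character back to its 4-digit hex code."""
    return f"{ord(char):04X}"

jeem = from_hex_code("062C")
hah = from_hex_code("062d")      # upper or lower case hex digits both work
print(jeem, hah)                 # ج ح
print(to_hex_code(jeem))         # 062C
```

In Word, pressing Alt+X a second time turns the letter back into its code, which is what to_hex_code imitates here.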
...to be continued