Thursday, June 12, 2008

Punycode

I came across this only very recently.
Punycode is a simple and efficient transfer encoding syntax designed for use with Internationalized Domain Names in Applications (IDNA). It uniquely and reversibly transforms a Unicode string into an ASCII string. ASCII characters in the Unicode string are represented literally, and non-ASCII characters are represented by ASCII characters that are allowed in host name labels (letters, digits, and hyphens).
Punycode is an instance of a more general algorithm called Bootstring, which allows strings composed from a small set of "basic" code points to uniquely represent any string of code points drawn from a larger set. Punycode is Bootstring with particular parameter values appropriate for IDNA.
-http://www.faqs.org/rfcs/rfc3492.html

Basically, the idea is to allow domain names in local languages. Current DNS naming conventions allow only ASCII letters, digits, and hyphens in host names, so a name written in Unicode has to be mapped to ASCII first. Punycode does that job.
For example, to use www.தமிழ்.com, the Punycode form of the name has to be entered in DNS. It looks like this: www.xn--rlcus7b3d.com. Notice that the encoded label starts with "xn--", the prefix that IDNA puts in front of the Punycode output to mark the label as encoded.
There are converters we can use to get the Punycode form of our domain names (http://www.nameisp.com/puny.asp).
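
The conversion can also be tried locally. Here is a minimal sketch in Python, using the "punycode" and "idna" codecs that ship with the standard library. The function names are my own, made up for illustration, and the "idna" codec follows the older IDNA 2003 rules, so treat this as a quick experiment rather than a full IDNA implementation.

```python
# Sketch: converting between Unicode and Punycode host names with
# Python's built-in codecs. The helper names below are illustrative only.

def punycode_label(label: str) -> str:
    """Raw Punycode of a single label, without the "xn--" prefix."""
    return label.encode("punycode").decode("ascii")

def to_ascii_hostname(hostname: str) -> str:
    """Encode each non-ASCII label and add the "xn--" prefix (IDNA 2003 rules)."""
    return hostname.encode("idna").decode("ascii")

def to_unicode_hostname(hostname: str) -> str:
    """Reverse the conversion: turn "xn--" labels back into Unicode."""
    return hostname.encode("ascii").decode("idna")

if __name__ == "__main__":
    print(punycode_label("தமிழ்"))                       # rlcus7b3d
    print(to_ascii_hostname("www.தமிழ்.com"))            # www.xn--rlcus7b3d.com
    print(to_unicode_hostname("www.xn--rlcus7b3d.com"))  # www.தமிழ்.com
```

The first call shows that Punycode itself only produces "rlcus7b3d"; the "xn--" in front comes from the IDNA step, and the ASCII labels "www" and "com" pass through unchanged.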

I think there are conventions for transforming top-level domains too, but I haven't studied them yet.

