A Clear Description for a Valid Domain Name

Posted on September 18, 2015 by .BIBLE Registry Categories: Domains, Online Presence

It’s easy to take domain names for granted, like the air we breathe, as together we experience how the Internet has changed the first world and the current generation of humanity. But a clear and concise description of what counts as a valid domain name is not so easy to find.

[Image: parts of a domain name]

A sufficiently good definition would need to be technically accurate and understandable for domain names consisting of Latin characters only. A robust definition would need to account for non-Latin characters and IDNs (Internationalized Domain Names) too, for total compatibility, known as Universal Acceptance. Here’s what I’ve found so far:

via ICANN Beginner’s Guide to Domain Names

... domain names in gTLDs can be registered using the 26 letters of the basic Latin script (A to Z), and can include the numbers 0-9. They can also include a hyphen “-”, although not as the first or last character of the domain name.

via Wikipedia entry for Domain Name

Domain names may be formed from the set of alphanumeric ASCII characters (a-z, A-Z, 0-9), but characters are case-insensitive. In addition the hyphen is permitted if it is surrounded by characters, digits or hyphens, although it is not to start or end a label.

via Section 3.5 of RFC 1034 (also Section 2.3.1 of RFC 1035)

The labels must follow the rules for ARPANET host names. They must start with a letter, end with a letter or digit, and have as interior characters only letters, digits, and hyphen. There are also some restrictions on the length. Labels must be 63 characters or less.

via domain.me/policies

Domain name length: .Me domain names must be at least 3 characters (second level) or 2 characters (third level) and a maximum 63 characters in length, excluding the extension (.me, org.me, etc.). Allowable characters: Only the Latin alphabet letters a-z, digits, and hyphens are currently accepted in a domain name. Domain names cannot begin or end with hyphens.

via gandi.net/domain/name/info

Syntax: from 2 to 63 alphanumeric characters or a hyphen (excluding in the first and last place)

To summarize, synthesizing from these various descriptions and a few others, here’s my attempt at wording a clearer and more concise explanation of a valid domain name, particularly the string to the “left of the dot”:

A second-level domain may be 1 to 63 characters in length, consisting of alphanumeric characters (A-Z, 0-9). A second-level domain may also use the hyphen (“-”) character, except in the first or last position.
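As a rough illustration, the wording above can be turned into a small check. This is only a sketch of my own description, not an official rule of any registry; the names LABEL_RE and is_valid_label are made up for this example, and letters are treated as case-insensitive.

```python
import re

# Hypothetical pattern that encodes only the description above:
# 1 to 63 characters, letters A-Z (either case) and digits 0-9,
# with hyphens allowed anywhere except the first and last position.
LABEL_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_valid_label(label: str) -> bool:
    """Return True if `label` fits the second-level domain description."""
    return LABEL_RE.fullmatch(label) is not None


if __name__ == "__main__":
    for candidate in ["bible", "a", "my-site", "-start", "end-", "a" * 64]:
        # Print only the first 12 characters to keep long candidates readable.
        print(candidate[:12], is_valid_label(candidate))
```

Note that this sketch is looser than some of the sources quoted earlier: it accepts a label that starts with a digit (the RFC 1034 wording asks for a letter) and a one-character label (some registries require two or three).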

The above may be a sufficient description for domain names that only use Latin characters. All this effort is to make it easier to explain which domain names are permitted here at the .BIBLE top-level domain.

However, the Internet today also has domain names that use non-Latin characters. A robust and complete definition would need to account for non-Latin characters and IDNs (Internationalized Domain Names) too, for total compatibility, known as Universal Acceptance.

On today’s Internet, especially with the new gTLD program, domain names (and websites) like these are valid: ביטוחרכב.co.il and 為替レート.jp

Getting to a more comprehensive definition of a valid domain name on the Internet is a bit more challenging. Here are a couple of references:

via gnso.icann.org/en/issues/new-gtlds/pdp-dec05-fr-parta-08aug07.htm

In the absence of standardization activity and appropriate IANA registration, all labels with hyphens in both the third and fourth character positions (e.g., “bq--1k2n4h4b” or “xn--ndk061n”) must be reserved at the top-level. [cf. Internationalized Domain Names]
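The position rule in that passage is easy to misread, so here is a minimal sketch of it. The function name has_reserved_hyphens is hypothetical, and the check covers only the "third and fourth character" condition, nothing else about the label.

```python
def has_reserved_hyphens(label: str) -> bool:
    """Return True if both the third and fourth characters are hyphens."""
    # Slicing is safe on short labels: it simply yields a shorter string.
    return label[2:4] == "--"


if __name__ == "__main__":
    print(has_reserved_hyphens("xn--ndk061n"))  # True
    print(has_reserved_hyphens("my-site"))      # False
```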

via website.com/beginnerguide/domainnames/8/5/idn-domain-names.ws

IDN are domain names that are written in foreign languages, like Chinese, Japanese or Russian. IDN stands for Internationalized Domain Names. IDN domain names allow people from all over the world to communicate websites, domain names and URLs in their native languages.

Most domain names registered to date are written using the 26-character Latin/English alphabets and numbers, an encoding called ASCII. IDN allows for the use of non ASCII characters in domain names. When an IDN is registered, the foreign characters are encoded in Punycode using a number of algorithms. Punycode is simply an ASCII version for the IDN, allowing it to resolve with the current internet system.

Punycode domains can be identified by the “xn--” beginning.
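To see the ASCII form of an IDN for yourself, Python's built-in "idna" codec can do the conversion, and each encoded label comes out with the xn-- prefix. Treat this as a sketch: the helper names to_ascii and to_unicode are my own, and the built-in codec follows the older IDNA 2003 rules, so a registry working from the newer IDNA 2008 rules may treat some characters differently.

```python
def to_ascii(domain: str) -> str:
    """Encode each label of `domain` into its ASCII (Punycode) form."""
    return domain.encode("idna").decode("ascii")


def to_unicode(domain: str) -> str:
    """Decode an ASCII (Punycode) domain back into its Unicode form."""
    return domain.encode("ascii").decode("idna")


if __name__ == "__main__":
    encoded = to_ascii("為替レート.jp")
    print(encoded)                     # ASCII form; first label starts with xn--
    print(to_unicode(encoded))         # back to the original Unicode name
    print(to_ascii("example.bible"))   # plain ASCII names pass through unchanged
```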

Do you know of a paragraph that succinctly describes what a valid domain name on today’s Internet looks like? Please add a comment to this collaborative process. A clear explanation of the valid syntax of domain names can go a long way toward getting all browsers, apps, software, and email to function properly with all domain names everywhere.