So if my IP is XXX.XXX.XXX.42 I can expect misconfigured clients to be trying to...

zck · on Feb 4, 2011

Well, only if someone tries to go to http://XXX.XXX.XXX.42, which seems rare.

I can't think of a way to tell whether XXX.XXX.XXX.42 is an IP or a website from that string alone.

You could require that no subdomain has four "parts" (i.e., A.B.C.42), but that's harsh.

You could require that a four-part URL with a numeric TLD contain at least one non-digit character somewhere.

You could also require that at least one of the parts of A.B.C.D fall outside the range 0-255.

The last two suggestions are more light-handed, but harder to implement. They'd all rule out URLs like 1.2.3.4, though.

Quick, register your favorite sequence! (http://oeis.org/)

etherealG · on Feb 4, 2011

Reading the rules from an RFC mentioned on the wiki: if XXX.XXX.XXX.42 is a valid IP address, the client should try to use it as one first, and only fall back to DNS resolution if it is not.

qjz · on Feb 4, 2011

Reference: http://tools.ietf.org/html/rfc1123#page-13

But note that the RFC assumes there will never be a numeric TLD. Furthermore, it suggests checking the string syntactically to determine if it's in dotted decimal form, remaining somewhat ambiguous about what to do if the hostname looks like IPv4, but the connection fails (it seems unwise to do a DNS lookup for every IPv4 address that's offline). Since IPv4 is relatively easy to validate, it would have made a lot more sense for a numeric TLD to select a number outside the range of 0-255, such as 4200. Then it's obvious that 1.2.3.4200 is a hostname (or harmless typo), and not an IP address.
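
A minimal sketch of that syntactic check, assuming a strict reading where an IPv4 literal is exactly four all-digit parts, each in 0-255. The function name `classify_host` is made up for illustration, and real resolvers are often more lenient than this:

```python
def classify_host(host: str) -> str:
    """Return 'ipv4' for a strict dotted quad, otherwise 'hostname'.

    Assumption: exactly four ASCII-digit parts, each in 0-255.
    Leading zeros are accepted and read as decimal here.
    """
    parts = host.split(".")
    is_quad = len(parts) == 4 and all(
        p.isascii() and p.isdigit() and int(p) <= 255 for p in parts
    )
    return "ipv4" if is_quad else "hostname"


if __name__ == "__main__":
    for candidate in ("1.2.3.4", "1.2.3.42", "1.2.3.4200", "douglasadams.42"):
        print(candidate, "->", classify_host(candidate))
```

Under this check 1.2.3.42 is indistinguishable from an address, while 1.2.3.4200 falls through to the hostname path because its last part is out of range.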

metageek · on Feb 4, 2011

There's also RFC 1738 (URL syntax), which says:

    host
        The fully qualified domain name of a network host, or its IP
        address as a set of four decimal digit groups separated by
        ".". Fully qualified domain names take the form as described
        in Section 3.5 of RFC 1034 [13] and Section 2.1 of RFC 1123
        [5]: a sequence of domain labels separated by ".", each domain
        label starting and ending with an alphanumerical character and
        possibly also containing "-" characters. The rightmost domain
        label will never start with a digit, though, which
        syntactically distinguishes all domain names from the IP
        addresses.

http://www.ietf.org/rfc/rfc1738.txt
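
As a rough illustration of the rule quoted above, here is a sketch that looks only at the first character of the rightmost label. The helper name `rightmost_label_is_domainlike` is hypothetical, and this is not a full hostname validator:

```python
def rightmost_label_is_domainlike(host: str) -> bool:
    """Apply only the quoted RFC 1738 rule: the rightmost label of a
    domain name never starts with a digit.

    Assumption: no trailing-dot handling and no other label checks.
    """
    last_label = host.split(".")[-1]
    return bool(last_label) and not last_label[0].isdigit()


if __name__ == "__main__":
    for candidate in ("example.com", "1.2.3.4", "douglasadams.42"):
        print(candidate, "->", rightmost_label_is_domainlike(candidate))
```

A name under a numeric TLD such as .42 fails this test, which is exactly the assumption a numeric TLD breaks.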

andfarm · on Feb 4, 2011

A lot of parsers do support numbers outside 0-255, though, and treat them as "packed" addresses. For instance, 127.0.1, 32512.1 and 2130706433 are all sometimes treated as alternate representations of 127.0.0.1.
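
For anyone curious how the packed forms work, here is a sketch of the classic inet_aton-style rules as I understand them: one to four parts, every part but the last is a single byte, and the last part fills all the remaining bytes. This version is decimal-only (the classic routine also accepts hex and octal), and the function name `parse_packed_ipv4` is made up. Other parsers may follow different rules:

```python
from typing import Optional


def parse_packed_ipv4(text: str) -> Optional[str]:
    """Normalize a 'packed' IPv4 string to a dotted quad, or return None.

    Assumed rules: 1-4 decimal parts; each leading part is one byte
    (0-255); the final part must fit in whatever bytes remain.
    """
    parts = text.split(".")
    if not 1 <= len(parts) <= 4:
        return None
    if not all(p.isascii() and p.isdigit() for p in parts):
        return None

    *leading, last = [int(p) for p in parts]
    remaining_bits = 32 - 8 * len(leading)
    if any(n > 255 for n in leading) or last >= (1 << remaining_bits):
        return None

    # Leading parts occupy the high bytes; the last part fills the rest.
    value = last
    for i, n in enumerate(leading):
        value |= n << (24 - 8 * i)
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


if __name__ == "__main__":
    for candidate in ("127.0.1", "127.1", "2130706433", "32512.1", "1.2.3.4200"):
        print(candidate, "->", parse_packed_ipv4(candidate))
```

Under these particular rules 127.0.1, 127.1 and 2130706433 all normalize to 127.0.0.1, while 32512.1 is rejected because its first part is larger than one byte; a parser that splits the two-part form differently could accept it. 1.2.3.4200 is also rejected here, since the last part of a four-part address only gets one byte.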

zck · on Feb 4, 2011

That certainly works, but it seems like a large part of this market would want numeric-only URLs, so you couldn't have, e.g., 1.2.3.4 or 1.1.2.3.

nickknw · on Feb 4, 2011

> but it seems like a large part of this market would want numeric-only URLs

Except for maybe douglasadams.42 ;)