Most URLs are syntactically valid JavaScript code (mand.is)
164 points by adius on July 23, 2021 | 74 comments



The URL https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe... is also close to valid code as an expression.

Consider something like:

  foo ? https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/label
I was hoping that was where the post was going with this :)

Sadly, JavaScript doesn't have the // operator (floor division) so it doesn't quite work. And double-sadly, I can't think of a language that has all the necessary ingredients; Python uses `and` and `or` for trinary syntax, not ? and :, and besides, you'd need to be able to overload the // operator as unary rather than binary, which ... well, I'm not sure we'd want a language to be able to do that. But I do!

Still, if you want to confuse your coworkers, consider sneaking in something like:

  developer = {mozilla: 42}
  en = US = docs = web = Web = JavaScript = Reference = Statements = label = 420
  console.log(developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/label)
and be sure to hand in your resignation after pushing the commit.


For you young people, there is one language which is called "executable line noise".

This is valid Perl code syntactically, except Perl complains of invalid or incompatible regexp modifiers.

So, if you choose the characters carefully, a URL like that could be executable in Perl (but not this one, or URLs in general).


https://www.mcmillen.dev/sigbovik/: “93% of Paint Splatters are Valid Perl Programs”


I don't think you can make that or any URL into valid syntactic Perl code.

But maybe you're thinking of this - https://metacpan.org/pod/Acme::URL

  use Acme::URL;
  
  print https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/label;


Shouldn't that be :

  developer = {mozilla: {org: 42}} ?


`{mozilla: 42}` is enough because the goal is not to prevent NaN (arising from converting `(42).org` to a number).
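
A minimal sketch of why `{mozilla: 42}` runs without error but still yields NaN. It uses `const` declarations rather than the implicit globals from the snippet above, and the `evaluate` helper is only there for illustration:

```javascript
// Every path segment of the URL becomes a variable; "/" is division and "-" is subtraction.
const en = 420, US = 420, docs = 420, Web = 420, JavaScript = 420,
      Reference = 420, Statements = 420, label = 420;

// Parsed as: (developer.mozilla.org / en) - (US / docs / Web / ... / label)
const evaluate = (developer) =>
  developer.mozilla.org / en - US / docs / Web / JavaScript / Reference / Statements / label;

// (42).org is undefined, and undefined / 420 is NaN, so the whole expression is NaN.
const shallow = evaluate({ mozilla: 42 });

// With a numeric `org` property the expression yields an ordinary number instead.
const nested = evaluate({ mozilla: { org: 42 } });

console.log(Number.isNaN(shallow), Number.isNaN(nested)); // true false
```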


Should work in Haskell. You can add your own operators there.


Could probably make it actually valid Haskell too, not just syntactically


Yes, definitely. You would just have to define all those variables and operators. You could very probably even make it evaluate to the right URL in the end.


> Python uses `and` and `or` for trinary syntax

A nitpick: Python uses `if` and `else` for trinary syntax:

    a = foo if bar else baz


Python also has "shorthand ternary" which uses and/or:

    a = bar and foo # "or None" is implied
    b = bar and foo or baz
It does almost the same but not quite. In the latter example, if foo is falsey, it evaluates to "b = baz":

    b = bar ? (foo ? foo : baz) : baz


Woah! I was all ready to disagree with you, saying 'only for Boolean arguments'. In years of writing python I really thought `__and__`/`__or__` returned Boolean. (Of course they can, but for builtins I mean.)

https://docs.python.org/3/reference/datamodel.html#emulating...

It's not made clear here that I can see. Grr, needs type annotations!


That's the wrong part of the docs. The section you linked covers the binary (bitwise) operators & and |, etc.

This is the correct part: https://docs.python.org/3/reference/expressions.html?highlig...

Edit: you can't override "and" and "or" with dunder methods.


Ahh of course, thanks!


See the language around short-circuiting here

https://docs.python.org/3/library/stdtypes.html#boolean-oper...

The expression is returned based on its truthiness


Is this official, encouraged syntax? It looks like it's just exploiting short-circuit evaluation. `bar && foo || baz` works in Javascript as well for example.


No.

In fact, nearly 20 years ago PEP 308 ("Conditional Expressions") was made so people wouldn't need to resort to this sort of syntax - https://www.python.org/dev/peps/pep-0308/ .

Quoting from the Python FAQ from 2.6 at https://web.archive.org/web/20151030070641if_/https://docs.p... :

> In many cases you can mimic a ? b : c with a and b or c, but there’s a flaw: if b is zero (or empty, or None – anything that tests false) then c will be selected instead. In many cases you can prove by looking at the code that this can’t happen (e.g. because b is a constant or has a type that can never be false), but in general this can be a problem.

> Tim Peters (who wishes it was Steve Majewski) suggested the following solution: (a and [b] or [c])[0].


I have seen it in a lot of code over the years, sometimes referred to as "shorthand ternary" or the "and-or-trick". In Python and/or do not return a boolean, but one of their operands. So I am going to say it is official syntax, but I don't know to what extent it's encouraged.


Not only in python, but in most scripting languages. It’s hard to find one that doesn’t. But it’s neither official, nor ternary syntax anywhere.

  true and false or true
  true ?   false :  true
These are not equivalent. If the second argument is evaluated as false, the “ternary” breaks.

Lua suggests using this as a ternary, but it has only two false values (nil, false), which reduces the number of problematic cases a little.
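
The same trap exists with `&&` and `||` in JavaScript. A small sketch (the helper names `andOr`, `wrapped` and `ternary` are made up for illustration; `wrapped` mirrors the list-wrapping idea quoted from the Python FAQ above and relies on arrays always being truthy in JavaScript):

```javascript
// Emulates `cond ? thenValue : elseValue` with short-circuit operators.
const andOr = (cond, thenValue, elseValue) => cond && thenValue || elseValue;

// Array-wrapping variant: a one-element array is always truthy in JavaScript.
const wrapped = (cond, thenValue, elseValue) => (cond && [thenValue] || [elseValue])[0];

// The real conditional operator, for comparison.
const ternary = (cond, thenValue, elseValue) => (cond ? thenValue : elseValue);

console.log(andOr(true, "yes", "no"));  // yes (fine while the middle value is truthy)
console.log(andOr(true, 0, "no"));      // no  (the falsy middle value falls through)
console.log(ternary(true, 0, "no"));    // 0
console.log(wrapped(true, 0, "no"));    // 0
```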


It saves a character in code golf, but as others have pointed out, it's unsafe.

    b if a else c #+1 byte
    a and b or c  #buggy
    (b,c)[not a]  #ok in general
    a and 1or c   #-1 byte if b is a constant
    (c,b)[a]      #-4 byte if a is boolean


or None is not implied

    >>> repr(False and False)
    'False'
    >>> repr(False and False or None)
    'None'
False is not None

    >>> False == None
    False


As it's error prone, and only saves 1 character, it's not a great shorthand.


EDIT2: What I was replying to is no longer there.

It does short circuit, your logic is just purposely fragile. If you reverse the variables it will print Empty, because the and statement evaluates left to right and stops if one of the arguments is falsey.

  print(len(foo) > 0 and foo[0]  or "Empty")
And since len can't give negative numbers the more Pythonic way to do it would be like this. Even though I generally prefer normal if blocks.

  print(len(foo) and foo[0]  or "Empty")

edit: I do agree that you should probably avoid doing this at all, because it is easy to introduce subtle bugs.


My logic wasn't purposely fragile, I missed that the test goes on the left-hand side of the 'and', which makes my case that it's error prone ;)


Scala can do it, except for // starting a comment, so it would have to be

    foo ? https:/developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/label
which of course isn't as cool. Still pretty close though.


everyone loves MDN, surely you should only resign after committing w3schools?


> The lowdown: you can copy and paste just about any website directly into your JavaScript code without modification and it won't break anything.

Well, unless your code relies on an `http` or `https` label :)


And it also won't work between member declarations of a class. Basically it only works where statements are legal.
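
A rough way to check this, assuming an environment where `new Function` is allowed (e.g. Node). The `parses` helper and the example.org URL are made up for illustration; `new Function` only compiles the source here and never runs it:

```javascript
// Returns true if `source` parses as a (sloppy-mode) function body.
function parses(source) {
  try {
    new Function(source); // compile only; the resulting function is never called
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

const url = "https://example.org/some/path";

// In statement position: `https:` is a label and `//...` is a line comment.
const asStatement = parses(`${url}\nconsole.log("still runs");`);

// Between class members there is no statement position, so `https:` cannot be a label.
const inClassBody = parses(`class A {\n  ${url}\n  method() {}\n}`);

console.log(asStatement, inClassBody); // true false
```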


One could also look at it the other way around: a URL always matches any string, and that includes JavaScript code. Here is a nice example from the real world:

Whenever I use the regex engine of XML Schema (or later XPath) with the regular expression shown in RFC 3986 - Uniform Resource Identifier (URI): Generic Syntax, I get the error: "Regular expression matches any string." (This counts as an error because the XSD regex engine had only one purpose: to constrain a string, so we can check for valid input or define a datatype. That's why it behaves that way.) Here is the part with the regex from Appendix B[1] of the aforementioned RFC:

    The following line is the regular expression for breaking-down a
    well-formed URI reference into its components.

        ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
         12            3  4          5       6  7        8 9

    The numbers in the second line above are only to assist readability;
    they indicate the reference points for each subexpression (i.e., each
    paired parenthesis).  We refer to the value matched for subexpression
    <n> as $<n>.  For example, matching the above expression to

      http://www.ics.uci.edu/pub/ietf/uri/#Related

    results in the following subexpression matches:

      $1 = http:
      $2 = http
      $3 = //www.ics.uci.edu
      $4 = www.ics.uci.edu
      $5 = /pub/ietf/uri/
      $6 = <undefined>
      $7 = <undefined>
      $8 = #Related
      $9 = Related

    where <undefined> indicates that the component is not present, as is
    the case for the query component in the above example.  Therefore, we
    can determine the value of the five components as

      scheme    = $2
      authority = $4
      path      = $5
      query     = $7
      fragment  = $9
[1]: https://datatracker.ietf.org/doc/html/rfc3986#appendix-B
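
For anyone who wants to try it, here is the same expression as a JavaScript regex literal (only the forward slashes are escaped). The `splitUri` helper is a made-up name, and this is just a sketch of the component mapping quoted above, not a validator:

```javascript
// Regular expression from RFC 3986, Appendix B, written as a JS regex literal.
const URI_REFERENCE = /^(([^:\/?#]+):)?(\/\/([^\/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?/;

// Splits a URI reference into the five components named in the RFC.
// Every group is optional, so exec() always succeeds; groups that did not
// take part in the match come back as undefined.
function splitUri(text) {
  const m = URI_REFERENCE.exec(text);
  return { scheme: m[2], authority: m[4], path: m[5], query: m[7], fragment: m[9] };
}

console.log(splitUri("http://www.ics.uci.edu/pub/ietf/uri/#Related"));

// Because nothing is mandatory, the pattern also matches text that is not a URI at all.
console.log(URI_REFERENCE.test("let x = 1; // not a URI")); // true
```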


Not exactly. The original URI syntax didn't allow several characters including "{|`><}\^ plus whitespaces anywhere; the quoted section of RFC 3986 assumes that the string is already well-formed (thus the regex can't be used for the validation). Even the modern URL Standard has an outright failure [1].

[1] https://url.spec.whatwg.org/#example-url-parsing


I remember the pains of working on a tool that supported a path (/some/foo or C:\Some\Window), URI-style (ssh://user@host:port//somewhere) and SCP-style (user@host:/somewhere) in the same spot. That was not nice code and it of course had many failure modes.


Semi-related, there's a useful trick for using JavaScript labels as URLs.

Way back in the day, I used to write labels like `<a href="javascript:open_save_dialog:;">` instead of `<a href="javascript:void(0)">`. The idea is that both yield undefined to prevent the default URL following semantics of the link, to instead do whatever JS logic is attached to the onclick event, but when the user hovers over the link, the label makes the corner tooltip of the browser show descriptive information to the user about what clicking will do, whereas `void(0)` tells you nothing.


Remove the protocol part, and with the JS Proxy[0], you can actually do something like:

  await www.example.org
as demonstrated in a tweet[1]. This is pretty interesting IMO :-)

[0]: http://developer.mozilla.org/en-US/docs/Web/JavaScript/Refer...

[1]: https://twitter.com/rreverser/status/1138788910975397888
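
A minimal sketch of how such a Proxy could be put together. `domainProxy` and `resolveName` are hypothetical names, and this version only builds the dotted name and hands it to a callback; what the tweet's version does on `await` is not reproduced here:

```javascript
// Builds a Proxy whose property accesses accumulate dot-separated name parts.
// `resolveName` is a hypothetical callback deciding what `await` should produce.
function domainProxy(parts, resolveName) {
  return new Proxy({}, {
    get(_target, prop) {
      if (prop === "then") {
        // `await` looks for a `then` method; providing one makes the proxy a thenable.
        return (resolve) => resolve(resolveName(parts.join(".")));
      }
      if (typeof prop === "symbol") return undefined;
      // Any other property access extends the name: www -> www.example -> www.example.org
      return domainProxy([...parts, prop], resolveName);
    },
  });
}

const www = domainProxy(["www"], (name) => `resolved ${name}`);

(async () => {
  console.log(await www.example.org); // resolved www.example.org
})();
```

One consequence of this design is that a name part literally called `then` can never be reached, since that property is reserved for the thenable protocol.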


And in Rebol & Red, URLs are syntactically valid datatypes (literals): http://www.rebol.com/r3/docs/datatypes/url.html

E.g., in the Red console:

    >> read https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/label
    
    == {<!DOCTYPE html><html lang="en-US" prefix="og: https://ogp.me/ns#"><head><me...


I want this. Esp in DSLs. Bonus points for first class support for fragments and so forth.

(aka Intrinsics?)


You can have fragments (and so forth) with Rebol/Red URLs.

E.g.:

  >> type? https://www.example.com/foo.html#bar
  == url!
  
  >> type? https://www.example.com/foo.cgi?bar=1&baz=2
  == url!


Whenever I come back to writing JavaScript after a long break, I always forget how to correctly return objects from lambda expressions, and end up with this:

    () => {
       foo: "bar";
       baz: "qux";
    }
Thankfully TypeScript warns me it's a ()=>void function instead of whatever was expected in the context. I scratch my head for a while and then remember:

    () => ({
        foo: "bar",
        baz: "qux"
    });
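
A quick sketch of the difference at runtime (the names `asBlock` and `asObject` are just for illustration):

```javascript
// Braces right after `=>` start a function body, so `foo:` and `baz:` are labels
// and the strings are expression statements; nothing is returned.
const asBlock = () => {
  foo: "bar";
  baz: "qux";
};

// Wrapping the braces in parentheses makes them an object literal expression.
const asObject = () => ({
  foo: "bar",
  baz: "qux",
});

console.log(asBlock());  // undefined
console.log(asObject()); // { foo: 'bar', baz: 'qux' }
```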


I'm more baffled that

    foo: "bar";
is valid code. I know, it's the labeled statement syntax normally used for loops, but I didn't know this worked for any statement.


In JavaScript statements don’t have to do anything.

  "hi"; 1; true; new Date(); () => {};

This is valid JavaScript


No different than Java or any lang that supports labels and c style comments


Technically it's C++-style comments, the original C only had /* */ comment blocks.


At some point in the distant past I was working on a tool that would identify PowerShell script using the built in parsing API. It was astonishing the wide variety of files that parsed without error, including a lot of XML files.


You really should not use this ever in any situation, but for historical reasons[1], if they are the only thing on the line you can use HTML comments in JavaScript like so:

  function foo() {
    <!--- returns true ---->
    return true;
  }
This works at least in Chrome and V8, not sure about Firefox.

[1] https://stackoverflow.com/a/1508005/1888964



So even a data URI can be valid JS syntax

  data:document/document,alert('hi')
If run as JS it will alert "hi". If put into a link it will force the browser to download a document containing the alert statement.


That data URI happens to use `document` which happens to be a browser global, so most data URIs won't work.

e.g. running the above in Node gives:

  Uncaught ReferenceError: document is not defined

For anyone unfamiliar with JS, the above string is parsed as:

  (label):
    (variable cast to number) / (variable cast to number), // evals to NaN
    (function call) // evals
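
To try this outside a browser, one option is to compile the string as a function body and pass in stand-ins for the two globals. This is only a sketch: `fakeAlert` and `runDataUri` are made-up names, and passing a number for `document` means the division yields 1 rather than NaN (the result is discarded either way):

```javascript
const alerts = [];
// Stand-in for the browser's alert(): records the message instead of showing a dialog.
const fakeAlert = (message) => { alerts.push(message); };

// Compile the data URI text as a function body with `document` and `alert` as parameters,
// so the names it refers to exist outside a browser as well.
const runDataUri = new Function("document", "alert", "data:document/document,alert('hi')");

// `data:` is a label; `document/document` is a division whose result is discarded
// by the comma operator; `alert('hi')` is then called.
runDataUri(8, fakeAlert);

console.log(alerts); // [ 'hi' ]
```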


I was actually not aware that JavaScript supports labels. Quite funny since, usually, these articles are about quirks in JS that basically no other language has.


This is a neat trick, but I'm more interested in the little remark about how Svelte uses the $: label

https://svelte.dev/docs#3_$_marks_a_statement_as_reactive

Is Svelte doing something clever with these labels at runtime, or is it just using the label to do some kind of transformation when the javascript is compiled, anyone know?


I believe it's parsing the JavaScript and doing a transformation. That's kind of Svelte's USP: better ergonomics because it goes beyond what's possible in standard JS.


Transformation is the name of the game with Svelte. None of the Svelte syntax is run at runtime because it's all compiled down to vanilla JS (and therefore ships with 0 runtime libraries).


> The lowdown: you can copy and paste just about any website directly into your JavaScript code without modification and it won't break anything.

The following doesn't seem to be valid though. :( Am I missing something?

    https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/label

    let a = 1;


Sort of the same reason why `if (true) let a = 1` is a syntax error (and the error message says why: "Lexical declaration cannot appear in a single-statement context")

The rationale here is that the variable goes out of scope right away due to the implicit block, so they made it a syntax error so that you don't waste time pulling your hair out debugging issues downstream from this line without realizing this scope-related obscurity. (This also has potential security implications e.g. if you think you're shadowing a variable from a parent scope but you're not)
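
A small check of the three cases, assuming `new Function` is available (e.g. in Node). The `compiles` helper is a made-up name; it only compiles the source as a sloppy-mode function body and never runs it:

```javascript
// Returns true if `source` parses as a function body.
function compiles(source) {
  try {
    new Function(source); // compile only; the resulting function is never called
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

const url = "https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/label";

// The comment ends at the newline, so the label `https:` attaches to the next statement.
const beforeLet = compiles(`${url}\nlet a = 1;`);   // label followed by a lexical declaration
const beforeVar = compiles(`${url}\nvar a = 1;`);   // label followed by a var statement
const ifLet = compiles("if (true) let a = 1;");     // the comparison case mentioned above

console.log(beforeLet, beforeVar, ifLet); // false true false
```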


  >> https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...
  10
  <- 10
E.g., this is equivalent to:

  >> https: 10


Yes, but that's not related to the code example I gave, which produces a syntax error.


My bad, I misunderstood.


I can remember putting the URL of the Matrix movie website in some plain Java code that dealt with, well, matrices.

What I did not know is that my code was more or less secretly reviewed by a senior engineer, who was a bit surprised but, he told me, learnt something.

This is how the secret review became public. My red pill, indeed!

edit: typo


C# too


C too, C++ too


Neat, but as TFA says, not useful.

Objective-S makes URIs useful: http://objective.st/URIs/


    Unused label. ts(7028)
    'https:' is defined but never used. eslint(no-unused-labels)
    Delete `https:` eslint(prettier/prettier)


Awesome! It's basically a label, either https: or http:, that gets nothing assigned to it since the // turns everything afterwards into a comment.


one of my earliest programming quirk encounters was scratching my head over why `() => { key: value }` returns nothing


Nix has first-class support for URLs. Like, there are strings ("https://example.org") and URLs (https://example.org). And if the URL is not valid, it is a parse error.

However, this "feature" turned out to be a misfeature and may be removed in the future.

The RFC: https://github.com/NixOS/rfcs/blob/master/rfcs/0045-deprecat...


javascript delenda est


Javascriptus delenda est.

Or: Ceterum autem censeo Javascriptum esse delendam.


„JavaScriptus delendus est.“

But I think the parent refers to the famous quote by Cato Censorius who demanded Carthago to be destroyed. He used the „accusativo cum infinitivo“ construct to express his personal desire not something that actually happened to this point. So it should read:

[Ceterum censeo] Javascriptum esse delendum.


I don’t actually know Latin, just enough to know that the declension for scriptus should be applied. Thanks for fixing!


Darmok and Jalad at Tanagra.


Maybe one day I will look back at that episode and conclude it was fine, even good


I wanted to read this, but the styling of the website cuts diagonally across the content on my tablet in portrait mode.

It makes the article unreadable...


Punchline, http: is a label; // and everything following is a comment.


More like an anecdote then


Yeah, in retrospect, I'm slightly embarrassed for even clicking on it.


Yes, but do you know how long it took us to get CSS to this stage so that it could draw arbitrary shapes???



