Hacker News new | past | comments | ask | show | jobs | submit login
Perl Cannot Be Parsed: A Formal Proof (2008) (perlmonks.org)
57 points by __Joker on June 14, 2015 | hide | past | favorite | 12 comments



The gist of this is that perl's parser depends on previously symbols that are already defined to figure out how to parse things.

The same goes for languages like C, but the difference is that in perl you can define symbols at compilation time as a result of running other code, e.g.:

    $ perl -wE 'BEGIN { eval "sub spew { say \@_ }" if time % 2 == 0 } spew "hello"'
    hello
    $ perl -wE 'BEGIN { eval "sub spew { say \@_ }" if time % 2 == 0 } spew "hello"'
    Unquoted string "spew" may clash with future reserved word at -e line 1.
Here the program can be syntactically correct or incorrect depending on the compilation time.


for what it's worth the same 'problem' exists in python and probably most other similar languages.

edit: here's the same code in python

  from time import time
  if round(time())%2==0:
      exec("def spew(x): print x")
  spew("hello")


It's not the same thing. In Python this is a run-time error. The code you posted is parsed and compiled to byte-code just fine. Then, depending on the time, a new function is created dynamically at run-time or a NameError is raised at run-time.

In Python and other dynamic languages the compilation part is very simple, the code is parsed (the grammar is very simple) and converted to byte-code. Then the byte-code is executed. Things like function and class definitions happen at run-time, not at compile-time.

In Perl 5, the line between compilation-time and run-time is very fuzzy. It doesn't have a simple grammar that you can use to convert Perl to op-code. You can run code at compilation-time and change the way the compiler interprets things.


As the sibling points out this isn't quite like what you have in Perl. In this case the python code will be parsed the same way. The difference with Perl is that depending on previously defined symbols something might parse as a function call, an indirect method call etc.

This comment in the perl parser describes some of the beginnings of that rabbit hole: http://perl5.git.perl.org/perl.git/blob/70f63a4c7dba89e8e48b...


Except that doesn't change the way 'spew("hello")' is parsed, as far as I know. The 'proof' given for perl actually means you can't fully parse the file without also running it.


I came to the same conclusion around 2001 while trying to adapt Cigital's ITS4 static security scanner to Perl, but stopped short of producing a "formal proof" :)


Cute. You probably can shorten the argument a bit by accepting Rice's theorem ( https://en.wikipedia.org/wiki/Rice's_theorem ) instead of explicitly building the Rice widget (which is pretty much how Rice's theorem is proven).


> That's some mighty fine left brain thinking there( especially for a Monday morning ), but does it in anyway affect any practical aspect of Perl?

The first comment.


Building advanced tools like a static analyzer is much harder / partly impossible because you can't parse it without running it.


Abstract interpretation/symbolic execution should still work, but it would require basically reimplementing the semantics of perl, yes.


That still may be of little help in, say, a perl IDE, as the meaning of a program _can_ depend on the input it reads.

"But that will not happen in practice"

If you think that to be the case, use a language that _can_ be parsed. That makes life easier for implementers of IDE's, refactoring tools, etc. and costs you nothing (given the "that will not happen in practice" claim)


Take a look at PPI - Parse, Analyze and Manipulate Perl (without perl) module: https://metacpan.org/pod/PPI

Infact it says: "only Perl can parse Perl".

On the other hand JavaScript has esprima that is written in JavaScript, and also other languages like Python can parse themselfs.

As a Perl programmer, I don't know if it is a pro or a cons. For example Perl::Critic cannot parse some costructs, cause they are too complicated. What I usually do, as in many other contexts, I try to understand what is the subset of features I need to use.




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: