Facebook rewrites PHP runtime, will open source on Tuesday

matthew-wegner · on Jan 31, 2010

This was mentioned in that anonymous Facebook employee interview: http://news.ycombinator.com/item?id=1089800

Specifically:

Rumpus: So tell me about the engineers.

Employee: They’re weird, and smart as balls. For example, this guy right now is single-handedly rewriting, essentially, the entire site. Our site is coded, I’d say, 90% in PHP. All the front end — everything you see — is generated via a language called PHP. He is creating HPHP, Hyper-PHP, which means he’s literally rewriting the entire language. There’s this distinction in coding between a scripted language and a compiled language. PHP is an example of a scripted language. The computer or browser reads the program like a script, from top to bottom, and executes it in that order: anything you declare at the bottom cannot be referenced at the top. But with a compiled language, the program you write is compiled into an executable file. It doesn’t have to read the program from beginning to end in order to execute commands. It’s much faster that way. So this engineer is converting the site from one that runs on a scripted language to one that runs on a compiled language. However, if you went to go talk to him about basketball, you would probably have the most awkward conversation you’d have with a human being in your entire life. You just can’t talk to these people on a normal level. If you wanted to talk about basketball, talk about graph theory. Then he’d get it. And there’s a lot of people like that. But by golly, they can do their jobs.

jrockway · on Jan 31, 2010

> The computer or browser reads the program like a script, from top to bottom, and executes it in that order: anything you declare at the bottom cannot be referenced at the top. But with a compiled language, the program you write is compiled into an executable file. It doesn’t have to read the program from beginning to end in order to execute commands. It’s much faster that way.

I lost count of the number of untrue statements there after about ten.

sounddust · on Jan 31, 2010

The person being interviewed was obviously a non-technical person who was trying to provide information as he/she understood. It's easy to just take every statement literally and proclaim the person an idiot, but it's almost as easy to translate these statements into their (obvious) accurate equivalents and get some value from this interview. I don't understand the point of the former approach.

brown9-2 · on Feb 1, 2010

To expand, the interviewer in the original interview:

- thought that Facebook's servers would be hosted in their office

- was unfamiliar with the idea of usability studies or tracking eyeballs during usability studies

- was baffled by the idea that facebook would track your clicks throughout the entire site to help determine who your closest contacts were

- was also surprised by the fact that in the cloud, nothing is ever truly deleted

- was surprised to see that facebook employees would have the ability to login as any user in the system

olalonde · on Jan 31, 2010

Wanna talk basketball?

VonGuard · on Jan 31, 2010

Terrified stare

ubernostrum · on Jan 31, 2010

Sure. Just got back from watching KU/K-State. Great game; K-State played some strong defense down to the end of regulation, and KU had problems with poor shot choice the whole game. Thought for a minute KU was going to blow it again in overtime, especially after Collins missed the free throw, but that last foul on Brady (and the resulting five-point lead) was just the nail in the coffin.

I do wish they hadn't had so many timekeeping problems, though (and I'm not sure what was up with the aggressive travelling calls in the first half either).

profquail · on Jan 31, 2010

Not without some graph theory:

http://www.colleyrankings.com/method.html

pbiggar · on Jan 31, 2010

Obviously, this isn't the traditional way of looking at compilation, but it does make sense.

I guess the problem being described, somewhat obtusely, is that massive amounts of dependencies (in Facebook's case millions of lines of code) are loaded in order to execute a relatively small page.

I'll admit that to a CS person who understands compilers, their description makes it look like they don't know what they're talking about. But if you explained Facebook's interpretation/compilation problems to a non-techy, it would probably come out sounding similar.

(Note comment reuse: http://www.reddit.com/r/programming/comments/aodzg/one_faceb...)

jrockway · on Jan 31, 2010

It's the wrong distinction to make. PHP is slow because the runtime is slow, not because it's a "scripting language".

SBCL and V8 and TraceMonkey are all very fast "scripting languages" because they have a good codegen in the runtime. PHP does not do any codegen.

If I were explaining this to a non-technical person, I would say: A computer works by executing instructions, one at a time. When PHP is executed, it is compiled to a list of instructions, but not ones the processor can directly understand. It uses a "virtual machine" to read each instruction and then execute the corresponding instructions on the real computer. This is slow, because what could be one instruction actually becomes many, sometimes hundreds, of actual instructions. The goal of this project is to skip the virtual machine and run our code directly on the real machine, meaning that we won't have to do as much unnecessary work. This will make our application run faster!

Now, this still might make CS folks mad, because there is bookkeeping overhead other than deciding which instructions to run (you can't store "scalars" in registers, after all), and of course modern CPUs don't execute one instruction at a time. But now we're wandering off into esoterica -- at least the basic idea is right, and I don't use meaningless expressions like "scripting language".
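
A minimal sketch of the "virtual machine" idea described above, written in Python rather than PHP. The opcode names (`PUSH`, `ADD`, `MUL`) and the `run` function are invented for illustration and are not taken from PHP's Zend engine. It only shows that each virtual instruction costs a dispatch step on top of the real work.

```python
# Toy stack-based virtual machine. Opcode names are hypothetical and are
# not taken from PHP's Zend engine.

def run(program):
    """Execute (opcode, operand) pairs; return (result, dispatch count)."""
    stack = []
    dispatches = 0
    for opcode, operand in program:
        dispatches += 1  # every virtual instruction pays for a dispatch step
        if opcode == "PUSH":
            stack.append(operand)
        elif opcode == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop(), dispatches


# (2 + 3) * 4 expressed as virtual instructions
PROGRAM = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]

result, dispatches = run(PROGRAM)
```

Native code for the same expression would need only a couple of machine instructions; here each step also goes through the loop, the tuple unpacking, and the opcode comparisons.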

pbiggar · on Jan 31, 2010

Yes, the runtime is slow. But many language design decisions, notably references, make it difficult to make a fast implementation (I should know, I did my PhD on it).

They're trying to explain the problems that Facebook are having, which is that there is just such a large volume of code. They aren't trying to give a short course in interpretation and compilation.

Scripting language is not a meaningless expression. Many papers use it. I submitted papers with "scripting language" in the title to PLDI and POPL, and there was not one complaint or problem with the term from reviewers (who found plenty of tiny "errors" otherwise).

jrockway · on Jan 31, 2010

What does "scripting language" generally mean? UNIX "scripts" can be written in C. Is C a scripting language?

pbiggar · on Jan 31, 2010

Generally speaking, a scripting language comes from the set {Python, Perl, JavaScript, PHP, Lua, Ruby}.

It's a tricky term to define, much like "compiler". Actually "compiler" is a nice analogy: everyone knows what one is, but the actual test for "is this a compiler" is shaky, and returns true for lots of things that aren't _really_ compilers.

Anyway, Ousterhout coined it in his paper on TCL, but annoyingly chose not to formally define it. The set at that time was (I think) {TCL, Perl, sh}.

But because he didn't define it properly, the word has been used for years without a proper definition. The best I can do is to say read my PhD thesis, starting on page 7, where I spent a meaty six pages explaining in as much detail as I can what it means to be a scripting language.

lolcraft · on Jan 31, 2010

From my understanding of the issue, a scripting language was once a small language with one purpose which quickly expanded to fill all possible needs (e.g. Perl).

By this definition C isn't a scripting language, as it was small and stayed that way.

jrockway · on Jan 31, 2010

How is that meaningful from a computer science standpoint?

SapphireSun · on Jan 31, 2010

I'm definitely stealing from someone else's comment a long time ago but they said it very well: The difference is that scripting languages have no main function (as far as I know).

lanstein · on Feb 1, 2010

This obviously is not in the spirit of what you're saying, but...

Python main() functions by Guido van Rossum

http://www.artima.com/weblogs/viewpost.jsp?thread=4829

SapphireSun · on Feb 2, 2010

I think the idea is that you optionally can have a main, but you aren't required to do so ;-)

Hexstream · on Jan 31, 2010

Well, isn't that called a "general-purpose programming language"?

SapphireSun · on Jan 31, 2010

Maybe "scripting language" is the way today's kids say it?

Hexstream · on Feb 1, 2010

C is a general-purpose programming language but I don't think anyone ever called it a "scripting" language... (besides "scripting" DSLs with C syntax in some games)

a0002 · on Jan 31, 2010

Quick question from a noob: why do references make it harder to make a fast implementation? I'm genuinely interested in the topic, could you post a link to your PhD thesis or some other relevant information? Cheers

romland · on Feb 1, 2010

I am guessing it is related to dereferencing. It seems Google agrees, it's also mentioned on Wikipedia at http://en.wikipedia.org/wiki/Reference_%28C%2B%2B%29#Relatio...

[snip]...consequence of this is that in many implementations, operating on a variable with automatic or static lifetime through a reference, although syntactically similar to accessing it directly, can involve hidden dereference operations that are costly.[/snip]

dangrossman · on Jan 31, 2010

It's the description of PHP that isn't quite right. It's not read perfectly top-down like a script, nor is it interpreted at the same time as it's read. You can declare a function at the end of a file and use it in the first line.

pbiggar · on Jan 31, 2010

Well, that's not quite right either. I've purged my mind of the edge cases here, but it generally involves includes.

I admit it sounds a bit off, like the quotee doesn't know exactly what she's talking about, but she's not egregiously wrong.

pbiggar · on Jan 31, 2010

Hmmmm. There's no way you can name ten. I'd wager you can't name five, and I'd be surprised if you could name two.

jmillikin · on Jan 31, 2010

I can name at least nine; assuming jrockway is more familiar with PHP than I am (very likely), 10 is reasonable or even understated.

1) The term "scripting language"; meaningless, and usually just used as an insult (as in this case) rather than with any reasonable definition.

2) Browsers don't execute PHP

3) Not all scripting languages are executed based on line order.

4) PHP doesn't require values to be declared before they're used.

5) Not all compiled languages are compiled to a separate file.

6) Most modern compiled languages compile to byte-code, which is not more "executable" than the original source.

7) Compilers do have to read the program from beginning to end. This might just be my ignorance, but I've never heard of a random-access parser.

8) Being compiled doesn't necessarily make execution any faster.

9) The compiled file still has to be read entirely, before it can be executed. Depending on code size, sections of it may be read repeatedly.

pbiggar · on Jan 31, 2010

I thought we were talking about the bit jrockway actually quoted. Oh well, you're still miles off.

1) You can read from the interview that the quotee was speaking to a non-technical person. Even so, they're using the term in the commonly used sense. But you haven't shown that they're wrong, only that you disagree with the term.

2) You are right. This is the only falsehood I can see here. I'll stipulate that it's OK in a 'you know what I meant' kinda way.

3) Your answer to 2) depends on the quotee speaking about PHP, but now they aren't? And in 1) you said that scripting languages don't exist. Basically, you're reaching. Anyway, please name a scripting language which is not executed on line order. PHP certainly is.

4) Values aren't mentioned anywhere. Anyway, PHP requires functions and classes to be declared before use. (Strictly speaking, values don't exist before they are used, so what you said doesn't make sense. I presume you meant variables, but variables aren't declared. So I presume you meant defined, but strictly speaking there is no such thing as a variable in PHP anyway, just a mapping of strings to values (see: all the literature on the topic, including mine)).

5) They're obviously talking in the context of a native compiler, so I see nothing wrong with this.

6) They do? First I heard of it. I'm guessing you've got a funny definition of 'modern compiled languages' to back this up.

7) Executables don't though.

8) Not necessarily, sure. The first iteration of my PHP compiler was 10 times slower. But to say that the quotee is _wrong_ just because some implementation can be slower than some other implementation, is faulty logic.

9) What's your point? Why are they wrong?

So I'll give you one. Two if we count number 8, but that only counts for the most extreme form of pedantry, and it doesn't really contradict anything. So, yeah, one. A long way from ten.

jmillikin · on Jan 31, 2010

All of those complaints are from the quote.

1) All the more reason not to introduce incorrect terminology; non-technical people won't be able to understand that it's wrong.

3) There are legitimate definitions of "scripting languages"; the one I use is that a scripting language is not useful without 3rd-party code. The Bourne shell and JavaScript are classic examples of scripting languages.

QuakeC is a scripting language which isn't executed in line order.

4) Values are procedures (PHP doesn't have functions; the keyword is mis-named), classes, or variables. They do not have to be declared before use in PHP -- if they did, writing mutually-recursive procedures would be impossible in PHP.

5) I see nothing "obvious" about your statement. They state that compiling results in a separate file, which is wrong because some compilers don't.

6) JavaScript (in all popular implementations), Python, JVM languages (Java, Scala), .NET languages (C#, F#, VB.NET), Perl 6. I think even Ruby has a bytecode compiler, now.

7) There is no indication that this new PHP implementation compiles to native executables.

8) Their claim is that compilation makes execution faster. There exist cases where compilation does not make execution faster. Therefore, their claim is incorrect.

9) Their quote states that compiled binaries don't have to be read from beginning to end to execute. This is incorrect. When a binary is executed, or a library linked, it is loaded entirely into memory.

pbiggar · on Feb 1, 2010

1) Whatever your opinion, the quotee is not wrong.

3) Please define it. Note that I spend six pages in my PhD on defining it, and there are no formal definitions. Ousterhout introduced the term, and didn't define it.

That's a crap definition of scripting language. I don't even know what it means, and it's unusable. C++ is pretty much useless without the C++ standard library. Is it a scripting language now?

3) Nice. The fact that you could dredge up a minor language which peaked in 1996 does not make the quotee wrong.

4) This is all wrong.

> They do not have to be declared before use in PHP -- if they did, writing mutually-recursive procedures would be impossible in PHP.

Whether or not mutually recursive procedures have to be declared is a matter of parsing style. For example, it's there in C since it was created using a one-pass compiler. It's not there in Java. In PHP, declaring a function (say x) puts the function into the function-symbol table under the entry "x". When calling x(), the function-table is looked up. It is not necessary to have defined x to parse code that uses it. For example:

  if (false) { x(); } // legit, x() is never called

In a single PHP file, classes and functions declared in the top scope are considered declared at the top of the file. This can lend the appearance that they are not required to be declared, but it's a hack. If you include a file later, don't expect to call its functions now.

> Values are procedures, classes, or variables.

I presume you mean that variables, classes and procedures are all kinds of values. That is simply incorrect. Classes are not first-class values, and first-class classes and functions are approximated in PHP by allowing classes to be instantiated by name (using a string) at run-time. This is changing slightly in 5.3, but the semantics are complicated, and still use strings.

Variables are just syntax in PHP. As I said, look at any literature on PHP (or javascript if you prefer). Variables are syntax for keys in a local symbol-table, and aren't real entities. They are certainly not values.

> PHP doesn't have functions; the keyword is mis-named.

Evidence? I can't think how this is correct.

5) So what if some compilers don't. Nearly all compilers do. The level of pedantry here is astounding. Just because you can list an edge case in which they are wrong, does not make them wrong in the general case.

6) Ah, your definition of "compiled" is "bytecode compiled". What a funny circular argument.

7) Except that the quotee says it! "But with a compiled language, the program you write is compiled into an executable file."

8) "Is a rocketship faster than a bicycle?" Yes. "Aha, but if my rocket isn't moving, the bicycle is faster. Therefore I assert that rocketships are _not_ faster than bicycles". Bollox.

9) Loading from memory is not "reading" in the sense the quotee used, which was clearly parsing. I think you're being deliberately obstinate: try using the context of the article to determine what the words mean. I could argue that binaries do not have to be fully read into memory (they are loaded lazily, page-by-4k-page, by most modern OSes), but I don't want to start a discussion on it. Heaven forbid an obscure 90s OS used 8k pages.

Anyway, you're deliberately twisting the quotee's words and being pedantic, so I'm done here.

dkersten · on March 12, 2010

6) A lot of people share his definition - for example, everyone who says that Java, C# etc are compiled languages. I think that's enough people to make this definition true.

8) Being natively compiled does not make that program automatically faster. It depends on the use case (I/O vs compute bound) and it also depends on the quality and complexity of the code generator and optimizer. Furthermore, bytecode compiled languages (as opposed to native compiled) may be able to better optimize at runtime using JIT compilation due to being able to make assumptions you couldn't make at "compile time" in AOT compilation. Saying that natively compiled programs are faster than other kinds of programs is simply not always true.
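
To put rough numbers on the I/O-bound versus compute-bound point above, here is a small Python sketch using Amdahl's law. The fractions and the 10x figure are made-up inputs for illustration, not measurements of PHP or Facebook, and `overall_speedup` is a hypothetical helper name.

```python
def overall_speedup(compute_fraction, compute_speedup):
    """Amdahl's law: only the compute share of a request gets faster."""
    if not 0.0 <= compute_fraction <= 1.0:
        raise ValueError("compute_fraction must be between 0 and 1")
    if compute_speedup <= 0:
        raise ValueError("compute_speedup must be positive")
    remaining = (1.0 - compute_fraction) + compute_fraction / compute_speedup
    return 1.0 / remaining


# Assumed inputs: a request that is 10% compute vs. one that is 90% compute,
# both given a hypothetical 10x faster code generator.
io_bound = overall_speedup(0.10, 10.0)
cpu_bound = overall_speedup(0.90, 10.0)
```

With these assumed inputs the request that mostly waits on I/O ends up about 1.1x faster overall, while the compute-heavy one ends up about 5.3x faster.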

codexon · on Jan 31, 2010

3) 4)

<?php

echo foo('baz');

function foo($bar) { return $bar; }

?>

This looks to me like you can use PHP functions before declaring them above.

pbiggar · on Feb 1, 2010

Functions and classes declared anywhere in a file are pushed to the top of that file by the parser. This obscures all the edge cases. Simplest example I can think of:

  echo foo('baz'); // error

  {
    function foo($bar) { return $bar; } // not pulled to top of file
  }
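
The "function table is looked up at call time" behaviour discussed in this thread can be modelled with an explicit table. This is a sketch in Python, not PHP, and `function_table`, `declare`, and `call` are hypothetical names; it models the idea only and says nothing about how the PHP parser hoists top-level declarations.

```python
# Model of a function symbol table: declaring stores an entry under a name,
# calling looks the name up at call time. All names here are hypothetical.

function_table = {}


def declare(name, body):
    """Register a callable under a string key, like a declaration executing."""
    function_table[name] = body


def call(name, *args):
    """Look the name up at the moment of the call."""
    if name not in function_table:
        raise NameError(f"call to undefined function {name}()")
    return function_table[name](*args)


def script():
    results = []
    # Calling before the declaration has executed fails at run time.
    try:
        call("foo", "baz")
    except NameError as err:
        results.append(str(err))
    # A call site that never executes is harmless, even though x is undefined.
    if False:
        call("x")
    # Once the declaration has run, the same call succeeds.
    declare("foo", lambda bar: bar)
    results.append(call("foo", "baz"))
    return results


results = script()
```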

olalonde · on Jan 31, 2010

You'd have to work pretty hard to make a compiled algorithm slower than an interpreted one.

gaius · on Jan 31, 2010

Not true. The runtime has access to information that the compiler doesn't, such as which branches are actually taken.

pbiggar · on Jan 31, 2010

This makes no sense. Interpreters add indirection at every single statement in the program. The fact that they may know the direction of the occasional branch cannot possibly make up for this.

gaius · on Jan 31, 2010

JIT is a valid strategy for a scripting language interpreter.

pbiggar · on Jan 31, 2010

No. A JIT is a valid strategy for a scripting language implementation. An interpreter is a different thing. Many JIT compilers use an interpreter for running uncompiled code and profiling ("mixed-mode interpreter"), but "interpreter" and "JIT" are in no way synonymous.

gaius · on Jan 31, 2010

-1 for that? Wow.

jules · on Jan 31, 2010

Can you give an example of this? In my experience compiled code is often 100x faster than interpreted.

InclinedPlane · on Jan 31, 2010

Currently this is generally true, though not always. The compiler only has so much information available when it's compiling, the run-time potentially has more information. The most aggressive optimizations of compiled code rely on profiling data gathered from actually running the code. You generate representative usage scenarios, run your program using those usage scenarios (usually automated), gather data about how frequently different code blocks are hit through the use of instrumented binaries, then use that data to produce a highly efficient optimized compiler output.

However, not all software engineering groups have the capabilities to produce such highly optimized binaries, and there is always the risk that a user's particular usage patterns will differ enough from the expected patterns that they will lose the benefit of this extensive optimization. In an interpreted or byte-code language, though, a lot of the same information needed for optimization is available to the run-time. A run-time designed for optimization may be able to take advantage of that, creating super efficient code paths based on actual usage. This model is more difficult to implement but potentially more robust than statically optimized compilation. It also has the potential to take greater advantage of differences in hardware: a statically compiled native binary doesn't have the ability to morph its optimization based on whether it's running on a single-core Atom or a 6-way Core i7 CPU, or some 100-core monster of the future, but a run-time potentially can.

In the average case most of this is just theory, but the potential is very real.

Some worthwhile background reading:

Trace Trees: http://www.ics.uci.edu/~franz/Site/pubs-pdf/ICS-TR-06-16.pdf

A blog post / talk from Steve Yegge on dynamic language performance and other topics: http://steve-yegge.blogspot.com/2008/05/dynamic-languages-st...
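
A very small sketch of the "runtime knows which branches are actually taken" idea, in Python. It is not a JIT and not how any real runtime works; `ProfilingDispatcher` and its threshold are hypothetical. It counts which case actually matches and periodically moves the hot case to the front. Reordering is only safe here because the predicates are assumed to be mutually exclusive.

```python
# Profile-driven reordering: count which branch is actually taken at run
# time, then test the hot branch first. Names and threshold are hypothetical.
from collections import Counter


class ProfilingDispatcher:
    def __init__(self, handlers, threshold=100):
        # Each handler is a (name, predicate, action) tuple, tried in order.
        # Predicates are assumed to be mutually exclusive.
        self.handlers = list(handlers)
        self.hits = Counter()
        self.calls = 0
        self.threshold = threshold

    def __call__(self, value):
        self.calls += 1
        for name, predicate, action in self.handlers:
            if predicate(value):
                self.hits[name] += 1
                result = action(value)
                break
        else:
            raise ValueError("no handler matched")
        if self.calls % self.threshold == 0:
            # Re-sort so the most frequently taken branch is tested first.
            self.handlers.sort(key=lambda h: self.hits[h[0]], reverse=True)
        return result


dispatch = ProfilingDispatcher([
    ("int", lambda v: isinstance(v, int), lambda v: v + 1),
    ("str", lambda v: isinstance(v, str), lambda v: v.upper()),
])

# The observed workload is all strings, so "str" becomes the first test.
for _ in range(100):
    dispatch("a")
order = [name for name, _, _ in dispatch.handlers]
```

A static compiler would have to guess the order up front; the dispatcher above adapts to whatever the workload turns out to be.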

jules · on Jan 31, 2010

Very interesting. Can this kind of technology yield further speedups for languages that are already fast (compared to getting dynamic language speed closer to fast)?

BTW these technologies are not interpreters, but compilers (but runtime compilers).

pbiggar · on Jan 31, 2010

Interpreters are not JITs.

dkersten · on March 12, 2010

But there is no reason why an interpreter could not contain a JIT to compile on demand. In fact, the definition of interpreter that seems to be in common use is that it takes raw source code and executes it - nowhere have I ever seen anybody state that the interpreter cannot on-demand compile the source code as it's interpreted (perhaps to speed up future calls to that code). This is still distinct from VM-based implementations, which compile the source code to byte code and then execute the byte code, or natively compiled languages, where the code is compiled directly to the host processor's instruction set.
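
A sketch of "take raw source, compile it on demand, and reuse the result for future calls", using Python's built-in `compile()` and `eval()`. Note the assumption: this compiles to Python bytecode, not native code, so it only models the caching behaviour and is not a JIT. `CachingEvaluator` is a hypothetical name.

```python
# On-demand compilation with a cache: the first evaluation of a source
# string compiles it, later evaluations reuse the compiled code object.

class CachingEvaluator:
    def __init__(self):
        self.cache = {}
        self.compilations = 0

    def evaluate(self, source, variables):
        code = self.cache.get(source)
        if code is None:
            # Compile once, the first time this source text is seen.
            code = compile(source, "<snippet>", "eval")
            self.cache[source] = code
            self.compilations += 1
        # Run the cached code object with the caller's variables only.
        return eval(code, {"__builtins__": {}}, dict(variables))


evaluator = CachingEvaluator()
first = evaluator.evaluate("a * b + 1", {"a": 2, "b": 3})
second = evaluator.evaluate("a * b + 1", {"a": 4, "b": 5})
```

The second call skips the compile step entirely, which is the "speed up future calls" part of the argument.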

protomyth · on Jan 31, 2010

It's probably a tangent, but the theory is used in the OpenGL implementation of OS X [ http://lists.cs.uiuc.edu/pipermail/llvmdev/2006-August/00649... ].

eggnet · on Jan 31, 2010

Modern CPUs have branch prediction.

w1ntermute · on Jan 31, 2010

Your link is for this story, not for the quote. Here's the correct source: http://therumpus.net/2010/01/conversations-about-the-interne...

quickpost · on Jan 31, 2010

Thanks! I kept clicking the above link and ending up right back in the same place!

Time for a break.

carson · on Jan 31, 2010

If this is true and it is just a compiled version of PHP how much better is that going to be than APC or eAccelerator? There must be more to it than just compiled PHP.

wvenable · on Jan 31, 2010

The idea is probably to compile PHP into machine code rather than byte code.

bad_user · on Jan 31, 2010

I know you're thinking machine code must be better than byte-code, but that doesn't do any good, quite the opposite.

With byte-code you can compile sections of code to machine-code, based on runtime profiling / type-inference (as the JVM does). With ahead-of-time compiling to machine code, you're probably going to end up with a serialization of the PHP code in the final executable, plus an interpreter :)

tlrobinson · on Jan 31, 2010

[cringe]

This person knows just enough to be dangerous.

n8agrin · on Jan 31, 2010

The article is completely unsubstantiated.

> Well, I was able to put all the pieces together on this one, finally, and I now understand exactly what is up: Facebook has rewritten the PHP runtime from scratch.

Look, if it's true, this is cool, it will no doubt be a great contribution to the OS world, but let's wait until Tuesday or until we have more concrete info beyond this author's guess that FB has completely rewritten the PHP runtime. I realize the author has little to gain from making this up, except for maybe 15 min of fame on HN, but still, don't believe everything you read.

kaens · on Jan 31, 2010

I think that someone reimplementing PHP and cleaning up a lot of its . . . quirks . . . could be a very good thing for the web. Someone who didn't care about maintaining backwards compatibility with code that relies on those . . . quirks . . . and who did care about language design, and clean implementation.

Unfortunately, I have this sinking feeling that this is not going to be that.

scorxn · on Jan 31, 2010

Is there an implication that all this optimization could be merged into PHP core? PHP's appeal is ubiquitous support. Even if it is open-sourced, I can't imagine a bunch of hosting companies suddenly serving Facebook-flavored PHP.

pbiggar · on Jan 31, 2010

The PHP internals developers heaped scorn on compilation when I talked to them about phc (http://phpcompiler.org). I think it might be different with a Facebook seal of approval though.

jerf · on Jan 31, 2010

Should the current internal developers object to the idea even when implemented, well, "a new set of PHP internals developers" would not be the worst thing that ever happened to PHP....

jganetsk · on Jan 31, 2010

Wouldn't it have been easier to have moved away from PHP? From what I understand, Facebook only uses PHP for the most front of ends. Business logic is all in other languages. Isn't it easy to port the front-end into something else?

indigoviolet · on Jan 31, 2010

No.

varaon · on Jan 31, 2010

Note: This user has worked for Facebook for two years.

indigoviolet · on Jan 31, 2010

I guess I should probably clarify what I meant by the "No." above:

Facebook has a large, well-tested codebase, with a huge amount of infrastructure built on and to support PHP. It would be ridiculous to expect Facebook to migrate all of this to a different language.

Re: the article itself, you'll have to wait until Tuesday.

olalonde · on Jan 31, 2010

What other languages? (genuine question)

adrianwaj · on Jan 31, 2010

Reddit was rewritten from Lisp to Python: http://www.aaronsw.com/weblog/rewritingreddit

Twitter from Ruby into Scala and JVM: http://www.artima.com/scalazine/articles/twitter_on_scala.ht...

I think there are two aspects of a language (correct me if I am wrong) -- its elegance/simplicity/breadth of code produced by programmers that write with it --- and then the efficiency/effectiveness/HW-optimized executable code produced by its compiler once it's run: so how do PHP and Facebook fit in with all this?

munctional · on Jan 31, 2010

Twitter's frontend is still definitely Ruby (on Rails). Based on what I've heard from people who have consulted there, it's a gigantic ball of crap. They've so heavily patched Rails 2.0 that they can't realistically migrate to a more modern version of Rails.

BerislavLopac · on Jan 31, 2010

Which is sorta ridiculous considering that it's definitely not among the most complex of the Web apps out there. Even late competitors like www.shoutem.com are much more complex as they allow for a bunch of Twitters to be created on the same platform.

The only complex thing about Twitter is its size, and I bet their developers are working round the clock just to keep it from falling apart.

rufugee · on Jan 31, 2010

I do a fair amount of Rails, so I'm really curious here. How could it be that they've so heavily patched 2.0 that they can't move on? Anyone from Twitter care to comment?

I've worked on many Rails apps, and have upgraded the apps from version to version. It's a pain when key elements of the API shift, but it's not that bad...even when the project has monkey-patched Rails a lot. And twitter certainly has the resources to afford to dedicate a few programmers to this task, so I'm just not sure I buy it.

munctional · on Jan 31, 2010

One of the contractors I spoke with said that they had a branch running Rails 2.1 successfully. When they deployed it in production, the entire application fell on its face.

Supposedly, the problem was caused by Cache Money, but nobody at Twitter wanted to risk moving to a different version again. They're still on 2.0 today. :-)

Another fun fact: Twitter has over 1,500 remote git branches. They also have bright green deer in the reception area of their office. :-)

leej · on Jan 31, 2010

FB is on a much different scale so it'll be out of the question in the short term, but I wonder if they have looked at Quercus.

_tggb · on Jan 31, 2010

Erlang, C++, Java, Python and possibly others that I can't quite remember. See http://www.infoq.com/presentations/Facebook-Software-Stack for actual details.

dylanz · on Jan 31, 2010

Yeah, I had the same question. The PHP syntax definitely puts me off. That said, it's just the front-end. Might as well keep on keepin' on with what you're using, and since it's open source, just rewrite the thing.

Deep down inside I wish they were using ERB (or HAML) however, and he was writing HERB (or HHAML). I'm still enamored with Ruby syntax, and wish it would get some more big business love.

senko · on Jan 31, 2010

> That team were forced to sign NDA's, and taken to a very quiet, secluded meeting room where some cool new Facebook-backed open source project was described.

This caught my eye - an interesting use of the term "open source" that I haven't previously been aware of. This is only a single datapoint (and the article states the project is going to be opened up anyways), but I do have a feeling the term has been diluted and joined the buzzword ranks.

jrockway · on Jan 31, 2010

It's "open source" as in, "we'll release it as open source when we are good and ready". So far, that's been never.

VonGuard · on Jan 31, 2010

Yeah, open source is kinda a verb now. Anyway, Facebook has other open source projects: see Hive http://www.facebook.com/pages/Hive/43928506208

Kinda silly to have a facebook page for an open source project, though.

jmatt · on Jan 31, 2010

Here's the main page for open source facebook projects:

http://developers.facebook.com/opensource.php

I agree it's a bit silly. I think they hope that it'll catch on sometime in the future and they'll have yet another type of group socializing on facebook.

jackowayed · on Jan 31, 2010

You mean people actually socialize on Facebook pages?

It seems that the only effective purpose for pages is for celebrities/companies to push updates to their fans (and almost always ignore the reverse direction), and groups are even worse. Most pages and groups I see are things that you join because you agree with/identify with the name and then totally ignore. I wonder if facebook a) cares, and b) could do something to fix it.

jmatt · on Jan 31, 2010

Ya I used to agree with you completely. I still do for the majority of pages out there.

I recently learned of a counter example. There is a locally owned sports bar and restaurant that has a very active group. Most of the members are fans of sports teams that are across country and the games play regularly at this bar. The owner is active in the group too and definitely encourages and responds to conversation. The other group I know that is relatively active has more or less the same characteristics. It's people who identify with the group but are otherwise disjoint - and this is the best way for them to casually communicate. I think it's safe to say this was the original intent of groups and pages. Celebrities and companies are just taking advantage of it. Of course, I have a handful of pages on my facebook profile, so I guess I'm as guilty as anyone else. (I'm With COCO!)

EDIT: abusing -> taking advantage

timdorr · on Jan 31, 2010

This is particularly interesting because PHP's runtime was rewritten for 5.1 back in late 2005 (on top of 5.0's Zend Engine 2.0 improvements in mid-2004).

And is PHP really considered "pokey"? Sounds like this guy is making stuff up because I get execution times of >0.01 seconds on my micro-framework.

pbiggar · on Jan 31, 2010

PHP is dirt slow. When you look at the implementations of Lua or Python (which have approximately the same design as PHP), it's about 4-5 times slower. Note this is only in the interpreter -- lots of the library code is written in C, which makes the difference somewhat less relevant.

The implication is that writing code that uses PHP's built-in libraries is pretty fast, but the more you write in PHP itself, the slower it gets. For example, my impression is that Yahoo isn't really written in PHP - it's written in C patched together using PHP.

The Zend engine was "rewritten" for PHP 4, PHP 5, and to a certain extent for PHP 5.1. I guess it wasn't a complete rewrite because the legacy code from about 10 years ago is still in there. Anyway, it's still dirty, badly written, slow, and very very badly commented. Hacks abound (and not the good kind).

zmimon · on Jan 31, 2010

This is a point I always have trouble impressing on people. They run a simple benchmark on a tiny code base that pulls some data from the database and spits some data out, they compare, and PHP seems to be lightning fast.

The problem is that, because it's interpreted, at some point PHP slows down in proportion to the size of your code base. A small app runs really fast; a big app with 100,000 lines of code will kill your server unless you modularize it really really well - which is harder than it seems, because the more modularized you make it, the more separate "includes" you end up with in different files, and then you come to realize that including a lot of files is itself a problem. And the nature of PHP's very loose coupling tends to lead to code that is nearly impossible to do large scale refactoring on once you have gone too far down the path.

I work on an application that has a very thin PHP layer that performs some simple web services that are the back end for a pure Java web app. Amazingly, when we load test it, the PHP part is the bottleneck, burning CPU like crazy just parsing all our files ... over ... and over ... and over. The Java code meanwhile, while theoretically doing far more "work", is completely bored. We will probably look at using an accelerator of some kind or maybe just rewriting all the PHP in another language.

pbiggar · on Jan 31, 2010

If your problem is parsing time, just use an accelerator. APC is the standard and best integrated.

On the other hand, if you have an opportunity to switch out PHP for something better (read: nearly anything), you should. Otherwise it might grow to a point that you can't remove it.

leej · on Jan 31, 2010

Evaluate Quercus.

rgrove · on Jan 31, 2010

"For example, my impression is that Yahoo isn't really written in PHP - it's written in C patched together using PHP."

Yahoo! is a company, not a single application. Not everything at Yahoo! is written in PHP, but the vast majority of Yahoo! properties do use PHP heavily on the frontend, and not just as a way of patching together C extensions.

timdorr · on Jan 31, 2010

Well, if you use APC then that's all irrelevant. It's the equivalent of creating .pyc files. I never got the impression it was slower in actual execution, though.

pvg · on Jan 31, 2010

it's about 4-5 times slower.

Really? In what actual benchmarks, real world situations?

pyre · on Jan 31, 2010

Note this is only in the interpreter

You did read this part right?

pvg · on Jan 31, 2010

http://shootout.alioth.debian.org/u32q/benchmark.php?test=al...

Many of these are largely interpreter.

igouy · on Feb 1, 2010

For the comparison you seem to be interested in choose the measurements where the programs are forced onto a single core

http://shootout.alioth.debian.org/u32/benchmark.php?test=all...

pvg · on Feb 2, 2010

The argument being made was that the PHP runtime is 4-5 times slower than Python and that PHP only looks as fast as it does because of the C libraries. This is simply untrue and the OP wasn't able to back it up. Cores don't come into it.

igouy · on Feb 2, 2010

The Python mandelbrot program uses 4 cores; the PHP mandelbrot program uses 1 - that's why Python seems so much faster on mandelbrot.

The Python spectral-norm program uses 4 cores; the PHP spectral-norm program uses 1 - that's why Python seems so much faster on spectral-norm.

The Python binary-trees program uses 4 cores; the PHP binary-trees program uses 1 - that's why Python seems so much faster on binary-trees.

pvg · on Feb 2, 2010

Right. Python is not 4-5 times faster than PHP. That was my point.

pbiggar · on Feb 2, 2010

Actually I was basing it off the Language ShootOut. Last time I looked through it properly, Python was 16x slower than C, and PHP was 70x slower.

I wonder why it's changed. PHP certainly hasn't gotten faster in the meantime.

jws · on Jan 31, 2010

A quick stumble through the Language Shootout suggests using PHP instead of C might make you need, oh say 10 to 100 times as many servers. So there is room for improvement.

Of course if your workload is not dominated by script running CPU time then it doesn't really matter. Even then, until the cost for extra servers and their management exceeds the engineering cost to recode, add servers.

I get execution times of >0.01 seconds – hope you aren't planning to handle more than 50 transactions per second.
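The back-of-envelope math behind a remark like that, as a small Python sketch. It assumes each worker handles one request at a time and that the request is pure CPU time, which is a simplification; `max_requests_per_second` and the `workers` parameter are names invented for this example.

```python
def max_requests_per_second(seconds_per_request, workers=1):
    """Upper bound on throughput when each worker serves requests serially.

    Assumption for this sketch: no I/O overlap, no queueing effects,
    every request costs the same CPU time.
    """
    if seconds_per_request <= 0:
        raise ValueError("seconds_per_request must be positive")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return workers / seconds_per_request


# 0.01 s per request caps one worker at roughly 100 requests/s;
# 0.02 s per request caps it at roughly 50 requests/s.
ceiling_at_10ms = max_requests_per_second(0.01)
ceiling_at_20ms = max_requests_per_second(0.02)
```

Anything slower than 0.01 s per request only lowers the ceiling, and adding workers raises it linearly until you run out of cores.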

It all depends what kind of world you live in. I know people that use their hard disks for booting and loading caches and that's it. If one query in a hundred requires a seek they will not keep up. Most of the rest of us could happily fork a CGI PHP for each request and not notice.

igouy · on Feb 1, 2010

I guess you stumbled past the places where the benchmarks game website links to "Overall Performance: PHP is rarely the bottleneck (HTML slides)" http://talks.php.net/show/drupal08/7

nir · on Jan 31, 2010

Well, with quotes like "it is simply not a language designed for the sorts of workloads that Java and .NET are" I'm not sure of the author's technical level.

Also, his comparison of Zend's "folks" (which ones? Founders? Execs? Sales people?) to "gestapo officer looking for a spy: "What? Who said that? Who said it was slow? Tell us their name!"" isn't too apt - they'd probably like to know who complained about PHP performance, so they can offer them their products/services, rather than silence them somehow.

lanstein · on Jan 31, 2010

Probably the same people who decided the backslash was the correct character to use as a namespace separator.

nir · on Jan 31, 2010

I suppose once you do that there's no telling what you're capable of.

tapostrophemo · on Jan 31, 2010

I'm curious; is this (http://timdorr.com/archives/2005/12/pgf-the-ease-of.php) the micro-framework of which you speak?

timdorr · on Jan 31, 2010

Nope, that's an older version of it: http://github.com/timdorr/asoworx