Hacker News
Deep-copying in JavaScript (dassur.ma)
166 points by ingve on Jan 25, 2018 | 132 comments



If you find yourself deep copying then you're about to level up your JavaScript career.

The actual answer is not to find out how to deep copy reliably.

The actual answer is: don't deep copy. Instead, you should be using immutable data structures; check out Immutable.js. This will solve the problems that you are trying to address by deep copying.

Sticking with deep copying will only increase your pain and decrease your software reliability.


Yeah, generally what's needed for data updates is _nested shallow copies_, not a deep copy. I talked about this in the "Immutable Update Patterns" section of the Redux docs [0], and the articles in the "Immutable Data" section of my React/Redux links list [1] give more examples of how to handle data immutably in JS.

There are dozens of immutable update utility libraries available [2]. The one that I'm really excited about right now is a new library called Immer [3], from MobX author Michel Weststrate. It uses ES6 proxies to let you write normal mutative code in a callback, then applies the updates immutably. It looks like it should simplify a lot of common immutable update use cases, like in Redux reducers.

[0] https://redux.js.org/docs/recipes/reducers/ImmutableUpdatePa...

[1] https://github.com/markerikson/react-redux-links/blob/master...

[2] https://github.com/markerikson/redux-ecosystem-links/blob/ma...

[3] https://github.com/mweststrate/immer/


What's a nested shallow copy, and how is it different from a deep copy?


A shallow copy means creating a new object (or array), where all of the fields or array values point to the same references as the original:

    const original = {a : {a1 : 42}, b : {b1 : 99} };
    const shallowCopy = {...original};
    console.log(shallowCopy.a === original.a); // true
A deep copy means creating new objects at _every_ level of nesting:

    const deepCopy = _.cloneDeep(original);
    console.log(deepCopy.a === original.a); // false
To properly do an immutable update for `original.a.a1`, you'd need to copy each level of nesting, and overwrite the fields in that chain:

    const updated = {
        ...original,
        a : {
            ...original.a,
            a1 : 123
        }
    };

    console.log(original.a === updated.a); // false
    console.log(original.b === updated.b); // true


>If you find yourself deep copying then you're about to level up your JavaScript career.

I can't tell whether 'level up your JavaScript career' means you're about to start doing important stuff and get paid for it too, or that you're about to increase your technical proficiency by giving up deep copying and finding a better way because it is too painful to keep on. The second interpretation seems to be borne out by the rest of the post, though I wouldn't normally think of that as leveling up.

on edit: fixed silly typo


Doesn't immutablejs do the deep copying itself?

Even better would be a language (could be plain JS) that enforces immutability. How about a new keyword in Typescript?


There's Object.freeze[1] for that. Of course that's checked only at runtime, because js is a dynamic language …

[1] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...


TypeScript has a `readonly` modifier you can declare on interface properties, and it gives a type error on assignment.


The real actual answer is transpile to asm.js from a language that isn't fundamentally a steaming pile of garbage.


"Deep copy" and "structured clone" are not synonymous. A structured clone is one specific type of deep copy, and it has a variety of quirks. It's not clear to me why anyone wants structured clone as opposed to some other kind of deep copy, except if you were polyfilling some API that demands it like I did here https://github.com/dumbmatter/fakeIndexedDB

There are plenty of more reasonable deep clone functions available, for instance the Lodash one someone posted in another comment https://lodash.com/docs/4.17.4#cloneDeep


I'd like to see the benchmark in the article include the lodash cloneDeep function.


The solutions presented fail if one of the values in obj is a function. E.g., JSON.parse(JSON.stringify({foo:x=>x})) returns {}.
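
A small sketch of what the JSON round trip does to values it cannot represent. The object and its field names are made up for illustration; only `JSON.parse` and `JSON.stringify` come from the comment above.

```javascript
// Values that JSON cannot represent are dropped or altered by the round trip.
const source = {
  foo: x => x,          // function: the key is omitted
  missing: undefined,   // undefined: the key is omitted
  when: new Date(0),    // Date: becomes an ISO string
  nested: { n: 1 },     // plain data: copied as expected
};

const copy = JSON.parse(JSON.stringify(source));

console.log(Object.keys(copy));             // [ 'when', 'nested' ]
console.log(typeof copy.when);              // 'string'
console.log(copy.nested !== source.nested); // true
```

So the round trip is only a faithful deep copy for data that is already JSON-shaped.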

It's not intuitively obvious what the "correct" solution ought to be. If cloner is a deep-cloning function, and f is a function, which of the following should be true?

1. cloner(f) === f?

2. For all ...args, cloner(f)(...args) === f(...args)?

3. For all ...args, cloner(f)(...args) === cloner(f(...args))?

4. For all ...args, cloner(f)(...args) === f(...args.map(cloner))?

5. For all ...args, cloner(f)(...args) === cloner(f(...args.map(cloner)))?

6. Something else entirely?


Ohai. Author here. That’s indeed one of the trip-wires. Another is that any kind of prototype is lost.

The point of the structured clone algorithm is to make values transferable to other realms, so functions are actively ignored. Most of the time, I think that is “good enough”.


You might have missed the main thread pointing out that your claim that "JavaScript passes everything by reference" or is call by reference is wrong in every way. Javascript is call by value, those values can be strings, numbers and references to objects and arrays and such. "Call by reference" is very specific terminology that means that when you run somefunction(n), the caller's variable, n, might get entirely changed, even if it's just a number.


Can you fix your first statement: "JavaScript passes everything by reference" which is incorrect?

see: https://jsfiddle.net/t73ykuj0/


Can you please explain why this behavior is happening in this jsfiddle, or point me to a link that can help me understand this better?


The term for what Javascript does is pass-reference-by-value or pass-by-sharing:

https://en.wikipedia.org/wiki/Evaluation_strategy#Call_by_sh...


“Deep copy” is a code smell and attempts to implement it will lead to pernicious bugs. Either you’re copying data, in which case it should be serializable to and deserializable from JSON, or you’re copying objects, in which case you should think about what you’re really trying to achieve. There is usually a better solution.


I use lodash.cloneDeep for this. https://lodash.com/docs/4.17.4#cloneDeep. It has variants to munge the object while copying.


Missing from the benchmarks is simple deep copying:

    function deepClone (obj) {
        var clone = {}
        for (var key in obj) {
            if (obj.hasOwnProperty(key)) {
                var value = obj[key]
                switch (Object.prototype.toString.call(value)) {
                    case '[object Object]': clone[key] = deepClone(obj[key]); break
                    case '[object Array]' : clone[key] = value.slice(0); break
                    default               : clone[key] = value
                }
            }
        }
        return clone
    }
This is 5-10x faster than the JSON trick in WebKit: http://jsben.ch/W0ciO

(naive implementation, but covers 90% of uses, see lodash for the full mess: https://github.com/lodash/lodash/blob/master/.internal/baseC...)


I believe that it's faster, but only because it's not actually deep-cloning.

This:

    value.slice(0);
is a shallow clone; it makes a new array, but doesn't clone its contents.


Hence it being a 'naive' implementation. You can replace that with something like `deepClone(value)` and implement copying for different types; ultimately you'll end up with code like lodash's.


It's not just naive though, it's wrong; it won't even give you new objects, although it seems like it's meant to:

    a = [1, {foo: "bar"}];
    b = a.slice(0);

    b[1].foo = "baz";

    console.log(a)
    > [1, {foo: "baz"}]

    console.log(b)
    > [1, {foo: "baz"}]
A naive implementation would do this:

    case '[object Array]' : clone[key] = value.map((item) => deepClone(item)); break
And would thus be slower.


We are in agreement here. My suggestion was to make the deepClone method also take arrays, and walk through them copying values in the for loop (significantly faster than map).
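
A minimal sketch of that suggestion, assuming only plain objects and arrays need to be cloned. It reuses the `deepClone` name from the earlier snippet, but it is a rewrite rather than the code that was benchmarked, and it still ignores Dates, Maps, Sets, cycles and prototypes.

```javascript
// Sketch: deepClone that accepts arrays as well as plain objects.
function deepClone(value) {
  if (Array.isArray(value)) {
    // Walk the array with a plain for loop, cloning each element.
    const copy = new Array(value.length);
    for (let i = 0; i < value.length; i++) {
      copy[i] = deepClone(value[i]);
    }
    return copy;
  }
  if (Object.prototype.toString.call(value) === '[object Object]') {
    const copy = {};
    for (const key in value) {
      if (Object.prototype.hasOwnProperty.call(value, key)) {
        copy[key] = deepClone(value[key]);
      }
    }
    return copy;
  }
  // Primitives, functions and anything unrecognised are returned as-is.
  return value;
}

const a = [1, { foo: 'bar' }];
const b = deepClone(a);
b[1].foo = 'baz';

console.log(a[1].foo); // 'bar'
console.log(b[1].foo); // 'baz'
```

Returning primitives unchanged matters here: passing a number to the original object-only version would yield an empty object.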

Again, I linked to a full implementation at the end; was more commenting on the lack of this particular approach, wrote this on the spot as an example.


Because JS doesn't clone natively and library behavior varies somewhat, our team has started to consider deep-copying and cloning to be a smell. It forces the maintainer to figure out how exactly the copying/cloning is going to work, and it also creates nasty bugs when modifying a class without knowing it'll be copied or cloned somewhere.

We instead only use immutable, plain ("POCO" or "POJO" in other languages) objects for data. If you want to copy an object, you have to call the same functions to change it that you called on the original. If the functions are deterministic (as they should be), then you get identical objects at the end.

Our code base is effectively functional, even though we still use objects as our syntactic sugar (addOwner(car, user) becomes car.addOwner(user), etc.)
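
One way this style might look, as a rough sketch: `addOwner(car, user)` is the name used above, while `createCar`, the field names and the use of `Object.freeze` are assumptions made for illustration.

```javascript
// Hypothetical domain functions; only addOwner(car, user) is named in the comment.
function createCar(model) {
  return Object.freeze({ model, owners: Object.freeze([]) });
}

function addOwner(car, user) {
  // Return a new frozen object instead of mutating the input.
  return Object.freeze({ ...car, owners: Object.freeze([...car.owners, user]) });
}

// "Copying" means replaying the same deterministic calls.
const build = () => addOwner(createCar('wagon'), 'alice');
const first = build();
const second = build();

console.log(first !== second);                                 // true
console.log(JSON.stringify(first) === JSON.stringify(second)); // true
```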


I think it is almost unbelievable that the ESxx train continues to expand the language with ever more features, while at the same time there is still no native method to deep-copy an object.

Almost everything in JS is an object, but there is no reliable way to copy one. Feels like an epic failure to me. Would it not be an idea for the ESxx team to finally fix this core issue for the first upcoming release?


Why is deep copy so essential? Deep copy is the wrong solution to your problem 99% of the time, IMO.


I think spamming mutable objects around in your codebase is asking for trouble. I'd be really happy to be able to do something like this:

  foo( Object.copy(barObj) );

I mean, why would you send your state's mutable object as argument if you're 100% sure you don't want it to be changed?


Because copying is expensive. If mutability is the issue, use Object.freeze(), or define properties with Object.defineProperty() setting writable to false.
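
A short sketch of both options, with made-up object names. Note that `Object.freeze()` is shallow and that rejected writes only throw in strict mode (in sloppy mode they are silently ignored).

```javascript
function demo() {
  'use strict'; // in strict mode, writes to read-only properties throw

  const state = Object.freeze({ name: 'a', nested: { count: 1 } });

  let rejected = false;
  try {
    state.name = 'b';
  } catch (err) {
    rejected = err instanceof TypeError;
  }

  // Object.freeze is shallow: the nested object is still mutable.
  state.nested.count = 2;

  // Object.defineProperty with writable: false protects a single property.
  const point = {};
  Object.defineProperty(point, 'x', { value: 1, writable: false, enumerable: true });
  let rejectedX = false;
  try {
    point.x = 5;
  } catch (err) {
    rejectedX = err instanceof TypeError;
  }

  return { rejected, name: state.name, count: state.nested.count, rejectedX, x: point.x };
}

const result = demo();
console.log(result);
// { rejected: true, name: 'a', count: 2, rejectedX: true, x: 1 }
```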


I kinda agree, although I think "epic failure" is a bit strong. But that's exactly why I am volunteering to work on the spec to move that functionality into the HTML spec, and then hopefully also to ECMAScript.


The most common cases, like shallow clones and cloning simple objects, are already well covered and solved, often with very few lines of code or even just a single command (`Object.assign()`). There is also a Stage 4 proposal (i.e. it will be in the next version of the spec): `{ ...obj }` to clone an object [0].

Cloning Arrays, Maps, Sets is easy using their default constructors in a single line, such as `new Map(originalMap)` or `new Set(originalSet)` [1]. You can clone an array using spread syntax or `Array.from` [2].
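
For illustration, the one-liners mentioned above look like this. The variable names are invented; the point worth noticing is that each copy is shallow, so nested values are shared with the original.

```javascript
const originalMap = new Map([['user', { name: 'ann' }]]);
const originalSet = new Set([1, 2, 3]);
const originalArray = [1, [2, 3]];

const mapCopy = new Map(originalMap);
const setCopy = new Set(originalSet);
const spreadCopy = [...originalArray];
const fromCopy = Array.from(originalArray);

// New containers...
console.log(mapCopy !== originalMap, setCopy !== originalSet); // true true
// ...but the contents are shared, so these are shallow copies.
console.log(mapCopy.get('user') === originalMap.get('user'));  // true
console.log(spreadCopy[1] === originalArray[1]);               // true
console.log(fromCopy[1] === originalArray[1]);                 // true
```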

Another issue is that there are many variants of "deep cloning". Some people only need actual "regular" object properties cloned - copying objects without any of the features you get through `Object.defineProperty()`, like object literals.

Some people want symbols cloned, others don't.

Some people don't care about functions. Some people want some things, like functions, not cloned but referenced to avoid duplication.

Some people want all the options copied that you get from `Object.defineProperty()`.

Some people care about the prototype chain, others don't (if you only clone regular objects, e.g. from object literal object creation and similar ones).

So there is huge variety about what people need when they talk about "deep-copying" in Javascript, because objects have so many optional features, but most of the time they are not needed or not relevant to the copying.
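
To make two of those variants concrete, here is a sketch comparing a spread copy with a copy that keeps property descriptors and the prototype. The objects are invented for the example, and both copies are still shallow.

```javascript
const proto = { greet() { return 'hi ' + this.name; } };
const source = Object.create(proto);
Object.defineProperty(source, 'name', { value: 'ann', writable: false, enumerable: false });
source.tag = 'x';

// Spread copies only own enumerable properties onto a plain object.
const spreadCopy = { ...source };

// Copying descriptors and the prototype keeps both (still a shallow copy).
const fullCopy = Object.create(
  Object.getPrototypeOf(source),
  Object.getOwnPropertyDescriptors(source)
);

console.log('name' in spreadCopy);  // false
console.log(typeof spreadCopy.greet); // 'undefined'
console.log(fullCopy.name);         // 'ann'
console.log(fullCopy.greet());      // 'hi ann'
console.log(Object.getOwnPropertyDescriptor(fullCopy, 'name').writable); // false
```

Which of the two counts as "the clone" depends entirely on which of these features the caller cares about.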

As others have said, it also is not actually all that essential. I use a functional style without OOP (no `this`, `bind`, or `class`, whether ES6 or ES5 "pseudo-class" construction) and try not to mutate, and I rarely need deep-cloning, if at all. I think right now I only use my carefully crafted and speed-tested deep-clone module in tests. Each time I started using it in my code I eventually came up with a much better solution that didn't need it. Note that I did not actively try to avoid it, it just happened - the use case went away by itself each time I came up with an improved version of the code.

[0] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...

[1] https://stackoverflow.com/a/30626071/544779

[2] https://www.briangonzalez.org/post/copying-array-javascript


> Feels like an epic failure to me

Whole JS feels this way to me :)


I would never deep copy using any of those methods. Can't you just write your own recursive version? I remember plenty of examples on jsPerf (site not working at the moment) including my own that significantly outperformed JSON parse/stringify.


Yeah, I'm confused, as most of those methods, like postMessage, stringify things anyway. I simply iterate over the object's properties to make a copy of a full object.

I've never run performance tests but surely that's faster than parsing an object into its string representation then parsing it back into the language, no?


As I explain in the article, postMessage does not stringify. It runs the structured clone algorithm, as per spec. Otherwise things like ArrayBuffers, Sets et al. would not make it to the other side.
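
A sketch of the difference, assuming a runtime that exposes the global `structuredClone()` function. That global postdates this thread; at the time, the algorithm was only reachable indirectly through APIs such as postMessage.

```javascript
// Assumes a runtime with the global structuredClone() (newer browsers and Node).
const input = { tags: new Set(['a', 'b']), bytes: new Uint8Array([1, 2, 3]) };
input.self = input; // cyclic reference

const output = structuredClone(input);

console.log(output !== input);                  // true
console.log(output.tags instanceof Set);        // true
console.log(output.bytes instanceof Uint8Array); // true
console.log(output.self === output);            // true

// The JSON round trip turns the Set into a plain empty object instead.
const viaJson = JSON.parse(JSON.stringify({ tags: input.tags }));
console.log(viaJson.tags instanceof Set);       // false
```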


The problem is figuring out if you're in a looping/cycling data structure or not. JSON.stringify and friends often already have code to deal with that, and it's quite complex in pure JS because there is no "object id" function.
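
One common workaround for the missing "object id", sketched here under the assumption that only plain objects and arrays are involved: keep a `WeakMap` from each visited object to its copy. The helper name `cloneWithCycles` is hypothetical.

```javascript
// Hypothetical helper: deep-clones plain objects and arrays, tolerating cycles.
function cloneWithCycles(value, seen = new WeakMap()) {
  if (value === null || typeof value !== 'object') return value; // primitives, functions
  if (seen.has(value)) return seen.get(value); // already cloned: reuse the copy

  const copy = Array.isArray(value) ? [] : {};
  seen.set(value, copy); // register before recursing so cycles resolve to the copy
  for (const key of Object.keys(value)) {
    copy[key] = cloneWithCycles(value[key], seen);
  }
  return copy;
}

const node = { name: 'root', children: [] };
node.children.push({ name: 'leaf', parent: node }); // cycle: leaf -> root

const cloned = cloneWithCycles(node);
console.log(cloned !== node);                      // true
console.log(cloned.children[0].parent === cloned); // true

// JSON.stringify detects the cycle and throws rather than looping forever.
let jsonThrew = false;
try {
  JSON.stringify(node);
} catch (err) {
  jsonThrew = err instanceof TypeError;
}
console.log(jsonThrew); // true
```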


Tackling the toughest challenges of our generation.


For fun and games, try deep copying event objects. Spoiler alert: it doesn't work in ways that are not immediately obvious. Hours of my life I will never get back...


Is the `await` used in the last two examples a typo? They're not asynchronous are they?


So, it’s me being lazy and copy-pasting, but you can use await on synchronous functions, too. It’s a no-op :D


Not a no-op, it wraps the awaited value in `Promise.resolve()`:

    async function foo() {
      const a = await {then(resolve) { resolve(123); }};
      console.log(a);
    }
    foo();


Indeed, you don't use `await` on functions either, it's used on promises. In fact, when used on a non-promise, it still waits at least one microtask tick before resuming the async function.
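
The deferral can be observed directly. In this sketch (names invented), the code after `await 42` runs only after the synchronous code that follows the call:

```javascript
const order = [];

async function run() {
  order.push('before await');
  await 42; // not a promise, yet the function still yields here
  order.push('after await');
}

// Log the final order once the async function has finished.
run().then(() => console.log(order));
order.push('sync code after the call');
// [ 'before await', 'sync code after the call', 'after await' ]
```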


JavaScript is absolutely not pass by reference.

    let a = 1;
    
    function foo( a ) {
      a = 2;
    }
    
    foo( a );
    
    console.log( a ); // 1


"By reference" means different things to different people.

The most accurate statement I've found is that JavaScript, much like Java, passes reference-by-value. Everything is passed by value, but if your value is an object, then the new scope gets a copy of the reference. If you modify the object by dereferencing (that is, `a.foo = "bar";`), your modifications to the object persist outside of your scope.

This is an important understanding to have when working with JavaScript.


Everything is passed by value.

A numeric value is put into the call stack.

Which in the case of references happens to be the address of pointee.

It is only pass by reference, if what is put into the call stack is the address of the actual variable on the caller stack holding the data.


Isn't that what I said? That's what I was trying to say, at least, but keeping the language more js-oriented.


I'm not the parent commenter but yes, you both made the same distinction as far as I can tell.


It doesn't help us understand the language if we try to shoehorn JavaScript's behavior into one of two buckets: "pass by value" or "pass by reference."

As you mentioned elsewhere, these terms are well defined in computer science. But how JavaScript works doesn't match either of them.

It only creates confusion to force JavaScript assignment and parameter passing into one of those two categories. This is what has led to the widespread misconception that "primitives are passed by value, and objects are passed by reference."

We're not implementing JavaScript engines here, we're programming in JavaScript. Implementation details like addresses and pointers are unknown to our code. They may differ from one JavaScript engine to another, and even during a single run of your program as the engine decides on the fly how to optimize or de-optimize your code.

What matters is the observable behavior from the point of view of the running JavaScript code.

There are a few ways you could describe that behavior, such as my notion of a new "name" that refers to an existing "thing", or shkkmo's and andrewstuart2's "pass a reference by value". But it's not "pass by value" as that term is understood in other programming languages such as C/C++.


No, JavaScript only does pass by value, nothing else.

There is nothing to discuss here other than JavaScript developers trying to imagine stuff instead of learning the underlying literature concepts.

JavaScript is not a special snowflake language.

Computer Science describes how things are supposed to be, they don't change because of language X or Y.


It's essentially exactly analogous to pass-by-value in C, where references are replaced by pointers. The primary difference is that JavaScript (and many other languages) uses references as the only way to refer to an object, whereas C also has the concept of directly referring to an object by value, where assignment or passing as a parameter will make a copy of the object.


It isn't "if the value is an object then scope gets a copy of the reference". No new objects are created when you pass an object.

An object's value is a memory location. When you pass it as an argument you're passing that value into the method. The analogy is more akin to casting than copying.


Nowadays, everything is pass-by-value. The only question is whether a reference or a pointer is one of the things that can be passed by value, or if everything passed is always a copy.

An exception to pass-by-value would be Forth; in the case of stack-based languages there really isn't a "pass" going on at all. But everything copies something for every non-inlined function call nowadays. Another instance where it's less obvious what is going on is a truly immutable language like Erlang or Haskell, where the "value" is quite likely still being passed via a pointer, but that's only relevant from the point of view of considering garbage collection behavior.

This makes the old "pass-by-value vs pass-by-reference" terminology vestigial to the modern software engineer, which is why the only place you'll encounter it is in school courses. A better way of understanding the modern issues is to ask what powers a passed-in value has and just ignore the old question.


Off the top of my head, here are languages in use that have a feature for pass-by-reference semantics: C#, Ada, C++ (though in this language you can get ahold of the reference as a value if you want).

Of course, to implement pass by reference, you tend to have to pass pointers around, but the point is that the language gives the illusion that the identifiers within a function's scope are locations specified by the caller.

I guess a takeaway is that in designing a language, one has to consider the relationship between an identifier, the mutable cell associated with the identifier, and the value stored in that cell. It seems that in common languages the identifier refers to the cell, but if the identifier is used in a value context, it evaluates to the value in the cell. Languages with pass-by-reference semantics let you make your own & function, so to speak.

(A weird example in C of this relationship is that an identifier for an array refers to a cell that contains the whole array, but, if you use the identifier in a value context, it becomes a pointer to the first element. One aspect of it not actually being a pointer is that you cannot replace its value with the pointer to another array.)

Anyway, an everything-must-be-a-first-class-object hard-liner would insist that pass-by-reference is the way of the past. But, sometimes it is not economical to get everything to be a first-class object. For instance, you might decide it is not worth trying to create a pointer type for referring to members of a packed struct, but it is not too hard to make a pass-by-reference feature for this (via copying).


Pascal “var” parameters. Pascal is rarely used any more, but I suppose it’s no more obscure than Ada.


The key feature of pass-by-reference is whether the callee can replace the value bound to the caller's argument (not mutate it).

In other words, pseudocoded:

    x=1
    changeMe(x)
    write(x) // 2
With that in mind, it's clearer that JS is pass-by-value with object reference values (or pass-by-sharing, though I never hear that in practice) as the caller remains bound to the passed object, but a C++ reference argument is an actual pass-by-reference of the original binding. So's an argument passed to a Pascal/Delphi out or variable parameter, and I'm sure other examples exist as well.

So I guess it depends on what you mean by "nowadays" but I'd still consider C++ a modern language within "everything".


You can never actually replace what the reference is bound to in C++, you can only call the assignment operator. The real distinction here is that in languages like Java and python, assignment is fundamentally different from mutation. In C++ there is no distinction, assignment is just a form of mutation.


Did you know? Perl is pass-by-reference.

    sub mutate {
      $_[0] = "blue";
    }
    
    my $color = "red";
    mutate($color);
    print($color); # "blue"


> Nowadays, everything is pass-by-value.

Even viewing sharing as a subtype of value, that's not true; call by need exists in modern languages (e.g., Haskell), and, in fact, classic call-by-reference is available in lots of languages, though not the exclusive model in any (and not the primary model in any current popular language I can think of.)


> Nowadays, everything is pass-by-value.

Except for obscure, seldom-used languages like C++ and C#.

If you can write

    int a;
    ParseInt("123", a);
    Print(a); // print 123
your language has pass by reference.


I would add other unknown languages like Ada, D, VB.NET, Delphi, FreePascal, Rust.


Rust's references are ordinary values. Moreover, the GP's example wouldn't work in Rust; you'd need to pass a value of type &mut i32 to ParseInt to mutate the value.


What are you talking about?

PHP has pass-by-reference: http://php.net/manual/en/language.references.pass.php


And the references are passed by value. There are two places you can find in RAM that will contain that reference, which once you penetrate through the abstractions and wrappers, will be a pointer; that second location is the reference that was passed by value. A reference is basically a pointer that the language prevents you from doing pointer arithmetic on, and in many languages such as PHP, is simply automatically dereferenced for you.

The reason why I brought up languages like Forth is that there really did used to be languages that did not actually "pass" things by value. Forth had a true pass-by-reference, in which there is no second copy of anything, neither the value, nor a pointer to the value. Now it's a dead distinction, in the sense that pass-by-value and pass-by-reference used to be distinguishing, because everything is copying something. Which is why the distinction is sort of confusing to try to apply to modern languages, and produces lots of heat but no light; it's dead, inapplicable to the modern language landscape.

The proper question to be asking is what permissions/privileges are passed by a function's parameter, whether you're allowed to modify it and have the modifications visible to the caller, whether you're not allowed to modify it at all (immutable languages, for instance), or whether there's an entire ownership system around what the passed-in value represents (Rust), not whether or not something was copied when the function call was made. Trying to answer this question in terms of "pass-by-X" doesn't help, if for no other reason than pass-by-X implies there's only two possibilities, and I outlined three classes of possibilities above, each of which can have their own further nuances. Is Rust pass-by-reference or pass-by-value? Well, the question is invalid, in either the original or the modern mutation of the meaning.


> There are two places you can find in RAM that will contain that reference

That is an implementation detail that depends entirely on the VM. There is one very simple exception to your example: call inlining. Pass-by-reference is a behavior specification, not an implementation specification.

> Rust

If Rust emitted purely pass-by-value machine code, zero-cost-abstractions would be impossible in the language.


It's mostly about explicit vs. implicit. If a reference is implicit - it's pass-by-reference.


Not according to the comment that I am replying to:

> A reference is basically a pointer that the language prevents you from doing pointer arithmetic on, and in many languages such as PHP, is simply automatically dereferenced for you.

Also, the implicit is usually pass-by-value.

    function foo(a) { a++; }
    function bar(ref a) { a++; }
    b = 1;
    foo(b); // Pass by value, implicit.
    assert(b == 1);
    bar(ref b); // Pass by reference.
    assert(b == 2);
Heap vs. stack values are not the same thing as pass-by-reference and pass-by-value.


> Is Rust pass-by-reference or pass-by-value? Well, the question is invalid, in either the original or the modern mutation of the meaning.

What do you mean? Maybe you mean "invalid" as in "no one should ask this anymore," but in case you mean "invalid" as in "is an apple an orange?" then

- It's strictly pass-by-value in the classic meaning. Sure, because of immutability there's the obvious optimization the compiler can do which is, under the hood, pass a pointer to a caller's value, but you could also implement everything by copying without the programmer noticing a difference (except for how long the program takes to run).

- It's strictly not pass-by-reference in the classic meaning. The "references" are a type of value. Parameters never are equivalent to the passed-in variable.

Whether or not it's a useful distinction to make is another story. References-as-values seems to be the emerging dominant paradigm for language design, but I sort of wish there were more languages that let you say "this variable is the fifth element of this array."

> if for no other reason than pass-by-X implies there's only two possibilities

I guess because people have only ever heard of by value and by reference, but there are a whole bunch more: by name, by copy-restore, by sharing.


pass-by-value, pass-by-reference and pass-reference-by-value are all distinct.

PHP does all three, pass-by-value (simple values), pass-reference-by-value (objects and arrays), pass-by-reference (when using & prefix)

The terms still have important distinctions in modern languages as they are functionally different.


> Trying to answer this question in terms of "pass-by-X" doesn't help, if for no other reason than pass-by-X implies there's only two possibilities

No, it doesn't. Heck, when I learned pass- (or call-)-by-X there were three main values of X mentioned (and implicitly a near-universal number of possible alternatives): value, reference, and name.

There are actually several more recognized values; the Wikipedia article on Evaluation Strategy has a reasonably good list:

https://en.m.wikipedia.org/wiki/Evaluation_strategy


You are correct, but it would also help to explain to the reader what the correct evaluation strategy employed by JavaScript is: call-by-sharing. https://en.wikipedia.org/wiki/Evaluation_strategy#Call_by_sh...


Not quite accurate. The primitive types (string, number, null, boolean, symbol, undefined) are passed only by value. Also, the page you link to says "the terminology is inconsistent across different sources", so it doesn't seem an ideal choice. Javascript is pass by value, with some of those values being references.


Would you know that some Javascript engines store primitives on the heap, so potentially it can all be described as call-by-sharing? Then again, they are immutable, so it is indistinguishable from call-by-value.

I think making these distinctions is somewhat asinine and that it doesn't really do more than make people aware that the function call interface actually has some interesting design considerations. It's sort of like "ok, we have decided whether a hamburger is a sandwich. Now we can finally do the thing that desperately depended on this decision, which was, uhh..."


Don't know why you are being down voted. The author claimed EVERYTHING is pass by reference, which is false.


It's not even close, nothing is passed by reference!


objects? I don't care about low level engine implementation details. I'm saying from the dev's perspective, it's effectively pass by reference.


It is only pass by reference if what ends up in the call stack is the address of the actual variable on the caller side, not what it points to.


Wow, you got downvoted for being 100% right. Nothing in the first sentence nor the first code example is right. I haven't bothered to look at the rest ^^

JS doesn't pass everything by reference, as you've shown in your code example. And JS doesn't actually pass anything by reference, as you and Amarshal have shown in your comments (which I knew, but didn't at the same time... Thanks for enlightening me, guys!).

But what's mind-blowing is that the code example given doesn't actually have anything to do with pass by ref or value. It's checking that the attributes of objects are reference-based. It's not actually checking the function argument, just an attribute of the argument.

I've upvoted your comment and your subcomment/reply. Not that you or anyone cares. They're just points. But it's a shame your comment is being greyed out.


EDIT: I am wrong. Below quote is irrelevant. I misunderstood the term "pass by reference", and confused it with mutable entities.

It is for objects, not primitives.

> JavaScript has 6 primitive data types: string, number, boolean, null, undefined, symbol (new in ES6). With the exception of null and undefined, all primitives values have object equivalents which wrap around the primitive values, e.g. a String object wraps around a string primitive. All primitives are immutable.

https://stackoverflow.com/a/13266769/1319878


The object is still passed by value, however the object itself contains references, which may be mutated. If objects were pass by reference then you’d end up with {a: 2} at the end:

    let a = {a: 1}
    
    function foo(a) {
      a = {a: 2}
    }
    
    foo(a)
    a //=> {a: 1}
Don’t confuse passing a mutable entity by value with pass by reference.


> The object is still passed by value, however the object itself contains references, which may be mutated.

This would imply that:

    var v = 1;

    (function (o) {
      o.a = 2;
    })({ a: v });

    console.log(v);  // => 2 (if that were true; it actually logs 1)
The whole "objects are pass-reference-by-value" is obviously annoying and confusing, but we're kind of stuck with it.


Apparently there is another term (which I hadn't heard) which is: call-by-sharing

https://en.m.wikipedia.org/wiki/Evaluation_strategy#Call_by_...


That's irrelevant. Look, same example but with an object.

    let a = { bar: 'baz' };
    
    function foo( a ) {
      a = { beep: 'boop' };
    }
    
    foo( a );
    
    console.log( a ); // { bar: 'baz' }


Another way to think about it is that you just get a new reference to the same object; inside foo you are changing what 'a' refers to, not what 'b' refers to. Using 'let' to declare 'b' is not necessary; 'const' works just as well.

    $ node
    > const b = { bar: 'baz' }
    > function foo(a){ a = {beep: 'boop'}; }
    > foo(b)
    > console.log(b)
    { bar: 'baz' }


I thought this was just a matter of scope. a declared outside of the function is not the same as the one declared as a parameter.


That answer isn't concerned with pass-by semantics, only with primitive/object semantics.

This is the gist of the issue, taken from http://javadude.com/articles/passbyvalue.htm. It is about Java, but it illustrates pass-by-reference semantics perfectly:

"If you can write such a method/function in your language such that calling

Type var1 = ...; Type var2 = ...; swap(var1,var2); actually switches the values of the variables var1 and var2, the language supports pass-by-reference semantics."

Like Java, JavaScript is pass-reference-by-value.


Primitive types and properties that are primitive types are pass by value, objects and functions are pass by reference.


It's still pass by value. The value of the object happens to be a reference to a memory location. A primitive points to a memory location too.


There's no difference

    let a = function bar() { };
    
    function foo( a ) {
      a = 2;
    }
    
    foo( a );
    
    console.log( a ); // bar


I thought this was because Javascript (by design) doesn't warn about shadowing identifiers.


It is for objects, just not for primitives.


No. Objects are passed by value. The exact same example applies:

    let a = {key: 'a'};
    
    function foo( a ) {
      a = {otherKey: 'b'};
    }
    
    foo( a );
    
    console.log( a ); // {key: 'a'}


But it is also true that:

    let a = {key: 'a'};
    
    function foo( a ) {
      a['otherKey'] = 'b';
    }
    
    foo( a );
    
    console.log( a ); // {key: 'a',otherKey: 'b'}


Yes, because "a" is not an object, it is a reference to an object. This reference is passed by value. This kind of thing is why I'd recommend that everybody learn C: once you know pointers, there's no magic anywhere anymore.


Could you give an example in C of how the above works?


Something like this, without any kind of error checking being done.

    #include <stdlib.h>

    // assume some hash table library, exercise for the reader
    typedef struct hashtable *hashtable_t;  // opaque handle; an empty struct body is not standard C
    extern hashtable_t hashtable_init(void);
    extern void hashtable_put(hashtable_t hashtable, const char* key, const char* value);
    extern void hashtable_dump(hashtable_t hashtable);

    typedef struct {
      hashtable_t table;
    } data_t;

    void foo(data_t* a)
    {
        hashtable_put(a->table, "otherKey", "b");
    }

    int main(void)
    {
        // let a = {key: 'a'};
        data_t* a = (data_t*) malloc(sizeof(data_t));
        a->table = hashtable_init();
        hashtable_put(a->table, "key", "a");

        foo(a);

        // console.log( a );
        hashtable_dump(a->table);
        return 0;
    }

The function foo() gets the numeric value of the pointer a, so the parameter a inside foo() points to the same memory location.

If C had pass-by-reference, it would be possible to give (implicitly) the memory location of the local variable a in main() instead. For example, like in Pascal (var) or C++ (& in function declaration).


Yes, which together with the other example shows that javascript uses pass-reference-by-value, as explained elsewhere on the thread.


From a user perspective most languages make a mess out of these things and confuse people. In Javascript certain things are declared in a similar way, but some behave like values and some behave like references. Which makes it both pass-by-value and pass-by-reference.


Wrong from a CS point of view: references are also passed by value in JavaScript.

If JavaScript allowed pass-by-reference, we would be able to do the following:

    function swap(x, y) {
        let c = x
        x = y
        y = c
    }

    let a = {a: 6, b: 7}
    let b = {c: 8, d: 8}

    swap(a, b)

    console.log(b)

    // Object {a: 6, b: 7}
Meaning, pass-by-reference means being able to actually change the "pointer" used by the local variable.
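
For comparison, here is a minimal sketch of what that swap actually does in JavaScript today, plus one way to get the intended effect without pass-by-reference. The destructuring assignment at the end is my own addition for illustration; it works only because the rebinding happens in the scope that owns a and b.

```javascript
// swap() only rebinds its own parameters x and y; the caller's
// variables are untouched.
function swap(x, y) {
  const tmp = x;
  x = y;
  y = tmp;
}

let a = { a: 6, b: 7 };
let b = { c: 8, d: 8 };

swap(a, b);
console.log(b); // { c: 8, d: 8 }, unchanged

// Rebinding in the caller's own scope does swap the two names.
[a, b] = [b, a];
console.log(b); // { a: 6, b: 7 }
```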


I understand what you mean, but consider this to be an incorrect view on these things. They are not about CS, but about user experience.


The incorrect view is to teach wrong concepts just because some people would rather learn based on gut feeling than take the effort to learn it properly.


I believe that javascript always uses pass-reference-by-value, but simple types are immutable, which means that pass-reference-by-value is functionally identical to pass-by-value since there is no way to make changes to the immutable referent.


How does immutable-js fit into this discussion? My limited understanding is that it defers the copy decision and can then limit the copy to the area needed. Wouldn't that lead to better performance in most cases?


If you have immutable objects (doesn't even have to be ImmutableJS, you can just decide to never mutate object properties), then there's really not even a need for cloning at that point. The reason you would clone an object is to avoid mutating the original object, which you don't need to worry about when everything is immutable.


Except that you need to copy _something_ at some point to modify values (or more precisely, having another object that is the same but with X change). In theory everything is re-created with a change, but in practice only the changed objects are copied.


Immutable objects aren't very useful if you can't create mutated copies...


I don't know the details of immutable-js but generally an immutable 'deep copy' shouldn't be different than a 'copy' that you'd normally get by e.g. passing the thing by value. Two symbols point to the same root that defines the data structure, the symbols are value-equal, but not id-equal (at the symbol comparison level). The usual concerns about deep copying don't really apply, since if you try to mutate either of them, neither of them will be affected, but a new piece of data will be created that has an id-different path to the root of the structure but everything unmodified out of that path is still id-equal.
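
A rough sketch of that sharing idea using plain objects and spread syntax rather than immutable-js itself (the state shape and field names are made up for illustration). Only the objects on the path to the changed field are copied; everything off that path stays identity-equal.

```javascript
const state = {
  user: { name: "Ann", address: { city: "Oslo" } },
  settings: { theme: "dark" },
};

// Copy only the objects on the path to the changed field.
const next = {
  ...state,
  user: {
    ...state.user,
    address: { ...state.user.address, city: "Bergen" },
  },
};

console.log(next === state);                   // false
console.log(next.user === state.user);         // false
console.log(next.settings === state.settings); // true: untouched branch is shared
console.log(state.user.address.city);          // "Oslo": original is unchanged
```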


Somewhat OT:

If you click through, you can find the original Twitter conversation that prompted this posting [1]. In that conversation, an interesting point about "optimizing away" string representations was brought up: [2, 3]

> Interesting! How is this better or worse than the usual trick of `JSON.parse(JSON.stringify(obj))` ?

(...)

My idea was that the browser could detect that the string is immediately parsed anyway and just interpret the two function calls as "deep copy this object for me".

I had thought about something like this in the context of innerHTML a while ago. Generally, there are a lot of places in the DOM where you can modify an object not by setting properties, but by getting a string representation of the object, modifying that string, then feeding the new string back to the browser for reparsing. Examples are document.cookie, CSS properties and of course innerHTML.

In those cases, an optimisation like the above sounds like it could actually make sense.

E.g., if someone abused innerHTML like this:

  for (let i = 0; i < data.length; i++) {
    myList.innerHTML = myList.innerHTML + "<li>" + data[i] + "</li>";
    doSomethingWith(myList.lastChild);
  }
a naive implementation would have to serialize, parse and recreate the whole object tree for each iteration of the list. However you could probably optimize it by having innerHTML return some special object that only "looks" like a string but somehow keeps track of its mutations internally. When the object is fed back to innerHTML later, the browser could use the internal properties to mutate the DOM instead of recreating it.

I'm curious, does anyone know if optimisations like this are actually used?

[1] https://twitter.com/DasSurma/status/955484341358022657

[2] https://twitter.com/sergiomdgomes/status/955484966313488384

[3] https://twitter.com/moritz_kn/status/956444334970343424


> If you don’t expect cyclic objects and don’t need to preserve built-in types, you get the fastest clone across all browsers by using JSON.parse(JSON.stringify()), which I found quite surprising.

I wonder if that’s because browser devs know by now that this is a widely used technique and hence optimise for it.


I suspect it is because JSON.parse and JSON.stringify are more commonly used and therefore see more optimization than less-used methods.
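
For anyone who hasn't hit the caveats in that quote, here is a small sketch of what "no cyclic objects, no built-in types" means in practice. The helper name jsonClone is just something I made up for the example.

```javascript
// Hypothetical helper wrapping the usual round-trip trick.
function jsonClone(value) {
  return JSON.parse(JSON.stringify(value));
}

const original = { when: new Date(0), nested: { n: 1 }, missing: undefined };
const copy = jsonClone(original);

console.log(copy.nested === original.nested); // false: nested object is a new one
console.log(typeof copy.when);                // "string": the Date did not survive
console.log("missing" in copy);               // false: undefined values are dropped

// Cyclic structures cannot be serialized at all.
const cyclic = {};
cyclic.self = cyclic;
let cyclicError = null;
try {
  jsonClone(cyclic);
} catch (e) {
  cyclicError = e;
}
console.log(cyclicError instanceof TypeError); // true
```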


I was surprised how difficult it was to make a deep copy in JavaScript (I did try JSON.parse), and finally decided to rewrite the logic without deep copying. It's a great article that sheds some light on an interesting programming question.


In some cases you can assign the object you want to copy as the [[prototype]] of another object, this way you delegate what you don't want to change and change what you want in the "copy" via assignments
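
A quick sketch of that prototype trick using Object.create, with made-up property names. Note the catch: nested objects are still shared with the original, so this only protects top-level assignments.

```javascript
const original = { a: 1, nested: { x: 1 } };

// The "copy" is an empty object that delegates reads to original.
const overlay = Object.create(original);

overlay.a = 2; // creates an own property on overlay, original is untouched
console.log(overlay.a);  // 2
console.log(original.a); // 1

// Nested objects are reached through the prototype, so they are shared.
console.log(overlay.nested === original.nested); // true
overlay.nested.x = 99;
console.log(original.nested.x); // 99

// Only assigned properties are own properties of the overlay.
console.log(Object.keys(overlay)); // [ 'a' ]
```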


On the "pass by value" vs. "pass by reference" controversy, I see two issues here:

First, these terms have specific meaning in other languages such as C++, but JavaScript's behavior does not match either of their definitions.

Second, there is a widespread myth that JavaScript passes primitive values and objects in two different ways. It doesn't. They work exactly the same, and making an artificial distinction between them is not useful.

I find it helpful to make up some new terminology, so my thinking isn't led astray by implied comparisons with other languages. I like to talk about names and things.

A name is a variable name or a parameter name.

A thing is any object or primitive value: anything that you can pass as a function argument or use on the right side of an assignment operator.

A name always refers to one thing. A thing can have multiple names.

When you pass an argument into a function, or when you use the assignment operator, you are giving a new name to an existing thing. If the name already referred to some other thing, that connection is lost. The name now refers to the new thing, and only to the new thing.

Everything works this way, primitives and objects alike.

The reason it's often claimed that objects are passed into a function by reference, while primitives are passed by value, is that we instinctively write different code for the two cases. That different code is what leads to the notion that objects and primitives are treated differently.

If you pass an object into a function and you want to add or change a property of that object, you'll naturally write:

  function foo( obj ) {
      obj.myprop = 42;
  }
But if you know the function is going to receive a primitive value, you won't write this:

  function foo( val ) {
      val.myprop = 42;
  }
If you do, you'll find out soon enough that it won't work. It will throw an exception in strict mode, or fail silently in non-strict mode.

Instead you may write:

  function foo( val ) {
      // Do stuff here with val
      val = 42;
      // Do stuff here with the *new* val, which isn't the old val
  }
The reason the functions behave differently is not because JavaScript passes primitives and objects differently. It doesn't; they are all treated the same.

But primitives are immutable. They have no properties that we can change. We know that, or we find out if we try, so we simply don't write code that tries to change a property of a primitive value.

It isn't the JavaScript language that treats objects vs. primitives differently when passed into a function or when used in an assignment statement. We programmers do, by writing different code for the two.
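
A small sketch of the strict-mode case described above, with a function name I picked for the example. The same function body works on an object and throws on a primitive; the argument passing is identical in both calls.

```javascript
function setProp(val) {
  "use strict";
  val.myprop = 42; // fine for objects, TypeError for primitives in strict mode
}

const obj = {};
setProp(obj);
console.log(obj.myprop); // 42

let error = null;
try {
  setProp(7);
} catch (e) {
  error = e;
}
console.log(error instanceof TypeError); // true
```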

(Edited for clarity)


> The reason the functions behave differently is not because JavaScript passes primitives and objects differently. It doesn't; they are all treated the same.

>The reason is that primitives are immutable

This is incorrect. See: https://jsfiddle.net/t73ykuj0/

The reason is that javascript uses "pass-reference-by-value" not "pass-by-reference"

edit: The immutability of primitives is why the "pass-reference-by-value" behavior of javascript works like "pass-by-value" for primitives (and for immutable objects).
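
To illustrate the immutable-object half of that edit, here is a sketch using Object.freeze (function and variable names are mine). The frozen object is passed exactly like the plain one, but since the callee cannot change it, the caller cannot tell sharing from copying. Keep in mind Object.freeze is shallow, so nested objects would still be mutable.

```javascript
function tryToMutate(obj) {
  "use strict";
  try {
    obj.key = "changed";
    return "mutated";
  } catch (e) {
    return e instanceof TypeError ? "TypeError" : "other error";
  }
}

const plain = { key: "a" };
console.log(tryToMutate(plain)); // "mutated"
console.log(plain.key);          // "changed"

const frozen = Object.freeze({ key: "a" });
console.log(tryToMutate(frozen)); // "TypeError"
console.log(frozen.key);          // "a"
```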


I think we are saying the same thing. Your term "pass reference by value" is the same as what I called "give a new name to an existing thing".

I simply like to avoid the terms "pass by reference" and "pass by value" to avoid comparison with other languages. But if you want to use "reference" and "value", then your term "pass a reference by value" is a good description of how it works - both for objects and primitives.

Coincidentally, just before I saw your comment I changed what I said about "The reason is that primitives are immutable" to make it more clear.

The real reason is that we instinctively write different code for the two. If a function parameter named 'val' is a primitive, we're not going to try to change a property on it. If we do, it won't work, because the primitive is immutable. (That's the point I was trying to get at with my original comment.) The same would be true, of course, with an immutable object.

And when we pass an object into a function, we may reassign the name completely inside the function, as in the mutatePassByReference() function in your fiddle, but more likely we will write code that manipulates its properties, as in your mutatePassReferenceByValue().

So the immutability of primitives isn't the direct cause of what is sometimes thought of as different behavior for objects and primitives. It's an indirect cause, because it leads us to naturally write different code for the two.

That's what I was really trying to get at (in my usual long-winded way): JavaScript doesn't treat objects and primitives differently, we programmers do.


> I think we are saying the same thing. Your term "pass reference by value" is the same as what I called "give a new name to an existing thing".

>I simply like to avoid the terms "pass by reference" and "pass by value" to avoid comparison with other languages.

The terms mean the same thing in every language. Some people (as shown in the article and this thread) don't understand what "pass-by-reference" means and incorrectly use it to apply to javascript.

The pass-reference-by-value behavior is shared by a lot of languages, I'm not sure why you want to avoid comparison with them.


> The pass-reference-by-value behavior is shared by a lot of languages, I'm not sure why you want to avoid comparison with them.

That's a fair point. I just think a lot of JavaScript programmers are confused about this, and end up thinking that primitives and objects are passed in two different ways. Even MDN makes this mistake:

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guid...

My use of "things" and "names" is just an attempt to get away from this confusion.


There is no confusion between "pass by reference" and "pass by value".

They are clearly defined terms in Computer Science.


That's true, "pass by reference" and "pass by value" are clearly defined terms in computer science. The problem is that legions of JavaScript programmers think that one or the other of those terms must apply to JavaScript, when neither one correctly describes the language's behavior.

This leads to the widespread misconception that objects and primitives are handled in two different ways, one passed by reference and the other by value.


I doubt there would be any serious compiler design lecture about implementing a JavaScript engine that would use anything other than "pass by value" to describe how arguments are placed onto the stack upon calling a function.

The problem is trying to apply terms without understanding what they actually mean.


This is exactly the problem I've been talking about.

A compiler design lecture about implementing a JavaScript engine should not use the term "pass by value" to describe JavaScript's behavior.

Most JavaScript engines to date have been written in C/C++. Function parameters are passed by value in these languages, unless you use a reference parameter in C++.

This means that if the argument is a struct, the function receives a copy of the struct, not a reference to the original struct that was passed in.

Consider this C code:

  #include <stdio.h>
  
  typedef struct {
      int bar;
  } Foo;
  
  void fun( Foo boo ) {
      boo.bar = 43;
  }
  
  int main() {
      Foo foo;
      foo.bar = 42;
  
      printf( "Before call: %d\n", foo.bar );
      fun( foo );
      printf( "After call: %d\n", foo.bar );
  
      return 0;
  }
This will print:

  Before call: 42
  After call: 42
fun() is passed a copy of the struct, so modifying the struct inside the function only affects its local copy. That's what "pass by value" means in C.

Compare with the similar JavaScript:

  function fun( boo ) {
      boo.bar = 43;
  }
  
  function main() {
      let foo = { bar: 42 }
  
      console.log( "Before call:", foo.bar );
      fun( foo );
      console.log( "After call:", foo.bar );
  }
  
  main();
This will log:

  Before call: 42
  After call: 43
The function doesn't receive a copy of the object, it receives the object itself, under a new name.

Using the term "pass by value" for JavaScript's behavior would only invite confusion.
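
If you want the C struct behavior in JavaScript, you have to make the copy yourself. A minimal sketch, reusing the names from the example above and assuming a flat object (spread only copies one level):

```javascript
function fun(boo) {
  boo.bar = 43;
}

const foo = { bar: 42 };

// Pass a shallow copy: the callee mutates the copy, not foo.
fun({ ...foo });
console.log(foo.bar); // 42

// Pass the object itself: the callee's mutation is visible to the caller.
fun(foo);
console.log(foo.bar); // 43
```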


You are doing pass by value in the JavaScript example.

For it to be pass by reference you would need to change boo itself inside fun(), but you are changing the field boo.bar through the boo address, which was passed by value.

When JavaScript starts having pass by reference, this code will hold true. Until then it won't, regardless how people re-invent CS concepts on HN.

  function fun( boo ) {
      boo = { bar: 42 }
  }
  
  function main() {
      let foo = null;
  
      console.log( "Before call:",  foo);
      fun( foo );
      console.log( "After call:", foo );
  }
  
  main();
With pass by reference, this would log:

  Before call: null
  After call: { bar: 42 }
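
For reference, a sketch of what that code does in JavaScript as it stands, followed by the usual workaround of passing a holder object. The holder and its value field are hypothetical names for illustration, not a language feature.

```javascript
// Actual behavior today: reassigning the parameter is invisible to the caller.
function fun(boo) {
  boo = { bar: 42 };
}

let foo = null;
fun(foo);
console.log(foo); // null

// Workaround: pass a holder object and assign to one of its fields.
function funWithHolder(holder) {
  holder.value = { bar: 42 };
}

const fooHolder = { value: null };
funWithHolder(fooHolder);
console.log(fooHolder.value); // { bar: 42 }
```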


Placing arguments into the stack is an implementation detail and it can be done either way.

For example, we can hold a reference to a struct representing a primitive value and place it onto the stack. If we let our callee modify this struct we get a behavior you understand as pass-by-reference. Otherwise we can copy this struct into a new struct on write and replace the reference on the stack. This way we are passing references, but implementing pass-by-value behavior. We can do the same for non-primitive data structures, etc.

So these things are about behavior, not implementation. And Javascript does have a behavior where it passes certain things as references implicitly stored in variables. It's in no way pass by value.


When JavaScript starts having pass by reference, this code will hold true.

  function fun( boo ) {
      boo = { bar: 42 }
  }
  
  function main() {
      let foo = null;
  
      console.log( "Before call:",  foo);
      fun( foo );
      console.log( "After call:", foo );
  }
  
  main();
With pass by reference, this would log:

  Before call: null
  After call: { bar: 42 }


I don't know why it's so hard to understand. It's not the essence of passing by reference, you are mixing implementation details with semantics.

If you can modify something that you didn't __explicitly__ use as a reference, you have an implicit pass by reference behavior. It doesn't matter if you can't modify the root reference itself of a nested structure, but only the children. You still can modify something that you didn't explicitly specify, so it is still pass by reference.

And by the way, CS has no precise terminology about this, CS prefers to neglect the usability side of programming languages altogether.


> It's not the essence of passing by reference, you are mixing implementation details with semantics.

> And by the way, CS has no precise terminology about this,

That is not correct. You are the one mixing up terms. Pass-by-reference and Pass-by-value have specific meanings.

There are two acceptable terms for how Javascript passes variables, those are "pass-by-sharing" or "pass-reference-by-value" https://en.wikipedia.org/wiki/Evaluation_strategy#Call_by_sh...


But they are not actual terms and are not even conventions, just some things somebody used somewhere that you choose to stand behind. And I choose not to and prefer to use more common understanding of them.

See, it's all about usability and how users understand things. If one way of passing things leaves them with no side effect and another way lets called functions modify shared state without them realizing - these are the big distinctions. It really adds nothing if we call two slightly different variations of passing references by different names instead of more common and intuitive "pass-by-reference".


> But they are not actual terms and are not even conventions, just some things somebody used somewhere that you choose to stand behind. And I choose not to and prefer to use more common understanding of them.

Yes, they are actual terms. You are choosing to be wrong and spread misinformation.

> It really adds nothing if we call two slightly different variations of passing references by different names instead of more common and intuitive "pass-by-reference".

Yes, it really does. The "slightly different variation" has a huge impact on how you write code. PHP, for example, does pass-reference-by-value by default (same behavior as Javascript) but it can also do pass-by-reference if you specify that in the function definition. see: https://www.tehplayground.com/nzi7SxTxXrrlqTZf

You used to be able to specify pass-by-reference when calling a function, but that was deprecated in 5.3 and removed in 5.4. It was removed precisely because it is important that the person writing a function knows if the variables he is using are pass-by-reference or pass-reference-by-value.

Since it is all about usability and helping users understand things, please stop trying to remove important meaning from the established term "pass-by-reference".


> The "slightly different variation" has a huge impact on how you write code.

From my years of experience with a language that has both variations (Perl) - it doesn't have much impact on how you write code at all, except making the language a bit more flexible and expressive in rare situations.

I consider usability of programming languages to be rather important and would like wrong ideas not to be claimed as terminology there nor conventions. Wrong people shouldn't be pushing wrong terms into the field they have no understanding of.


> The reason is that javascript uses "pass-reference-by-value" not "pass-by-reference"

The classical name for this approach seems to be “pass by sharing”. The approach is very common now, especially in dynamic OO languages, but it was uncommon enough for a while after it was named that the name isn't used much.


This is why JavaScript is the biggest joke of web development. And error handling.

Can we please stop using it


JavaScript is pass-by-reference for all Objects and Arrays. Value types are pass-by-value.

    let obj = {a: 1}
    
    function foo(arg) {
      return obj === arg;
    }
    
    foo(obj) // true
The === operator operates on objects not by comparing their value, but by comparing their memory reference [1]. A function argument variable can be reassigned using = in the function body, but that changes which location in memory the reference points to and isn't somehow "proof" of pass-by-value.

[1] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...


Javascript is pass-reference-by-value for Objects and Arrays, see: https://jsfiddle.net/t73ykuj0/

If javascript was pass-by-reference for Objects, your code could read: arg = {a: 2}; return obj === arg;

and still behave the same.
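
Spelled out as a runnable sketch (function names are mine): the identity check passes when the same object is handed in, and fails as soon as the parameter is reassigned, because the reassignment only rebinds the local name.

```javascript
const obj = { a: 1 };

function sameObject(arg) {
  return obj === arg;
}

function reassignThenCompare(arg) {
  arg = { a: 2 }; // rebinds the local parameter only
  return obj === arg;
}

console.log(sameObject(obj));          // true: same object identity
console.log(reassignThenCompare(obj)); // false: arg now names a different object
console.log(obj.a);                    // 1: the caller's object is untouched
console.log({ a: 1 } === { a: 1 });    // false: === compares identity, not contents
```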



