I don't think anyone, mathematician or functional programmer, has a problem with mutation [0]. They have a problem with names suddenly changing value without that fact being made explicit.
In particular, mathematics certainly doesn't prohibit you from using whatever kind of notation you like to explain your concept, but it has suggested a certain primacy of the notion of a (pure) function. The reasons are both ones of mathematical "elegance" (whatever that means) and pedagogy. They're a fantastically important fundamental building block.
Mathematicians would never claim that pure functions are the only thing around, but by and large they'd assume that's what you were talking about given no other indication. Furthermore if you introduced something new (say something with a notion of mutation) then you'd merely be obliged to explain exactly what that new thing is with sufficient formality so that the reader can be assured that its behavior is completely reasonable.
All of this holds identically for functional programming. The only additional challenge is that FPers are accidental intuitionists, so you can't merely name the behavior of a new kind of notation but must actually exhibit an interpreter of that new notation into more common notation. This is significantly more heavy-handed than what most mathematicians would demand.
[0] Or any other side effects. The whole argument is parametric in your choice of side effect.
From a mathematician's perspective, functional programming languages do a fairly bad job of expressing algorithms the way mathematicians have been writing them down since before computers existed. There is little difference between writing $x_{k+1} = f(x_{k})$ and $x = f(x)$ in an imperative language. In some cases the index k can even be identified as a multiple of some discrete timestep, in which case what the hardware is doing is in very close correspondence to the abstract iterative process.
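As a small illustration of how close the two readings are, here is a Haskell sketch (the update f and the starting value are placeholders I chose, a Newton step for sqrt 2): the same recurrence once as the pure stream of iterates x_0, x_1, ... and once in the overwrite-x-each-step style.

    import Data.List (foldl')

    -- A Newton step for sqrt 2, standing in for an arbitrary update f.
    f :: Double -> Double
    f x = (x + 2 / x) / 2

    -- Pure view: the whole orbit x_0, x_1, x_2, ... as a lazy list.
    orbit :: [Double]
    orbit = iterate f 1

    -- Imperative view: keep only the current x, overwriting it each step.
    run :: Int -> Double
    run steps = foldl' (\x _ -> f x) 1 [1 .. steps]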
All the operations in abstract algebra are actually destructive by default: if you do a multiplication, then that is a map
m : A \otimes A \to A
If you want a copy of the values you put in you have to pre-compose with a diagonal map \Delta: A \to A \otimes A, which might not even exist (it doesn't if \otimes is the tensor product of vector spaces, for example). Even application of a morphism on its own is treated as irreversible: once you've applied a morphism to an object you can't get it back unless you are in a special category like Set, or the morphism happens to be invertible. The fact that functional programming languages hide copying behind your back does not make sense physically either; on a hardware level duplication is fairly expensive, because it decreases entropy.
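To make the "destructive by default, copying only via an explicit diagonal" picture concrete, here is a sketch using GHC's LinearTypes extension (available in recent GHC versions). A, m, and the helper names are ad-hoc stand-ins I made up, not anyone's proposed design:

    {-# LANGUAGE LinearTypes #-}

    -- A is just a stand-in type and m a stand-in "multiplication" that
    -- consumes each argument exactly once (here: list concatenation).
    type A = [Int]

    m :: A %1 -> A %1 -> A
    m []       ys = ys
    m (x : xs) ys = x : m xs ys

    -- squareBad a = m a a     -- rejected: the linear `a` would be used twice

    -- Either take the argument unrestrictedly (an ambient diagonal) ...
    squareUr :: A -> A
    squareUr a = m a a

    -- ... or demand an explicit diagonal Delta : A -> A (x) A as an input.
    squareWith :: (A %1 -> (A, A)) -> A %1 -> A
    squareWith dup a = go (dup a)
      where
        go :: (A, A) %1 -> A
        go (a1, a2) = m a1 a2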
There are countless examples in computational physics and numerics where pure functional programming languages fall flat on their face. I'm thinking about finite-element methods with adaptive refinements, all the code for computer algebra and computational group theory, etc., not to mention all the Fortran/C/C++ code that exists in physics (at CERN they have written 60 million lines of it). It might be that the experts in functional programming languages simply don't have the expertise necessary to come up with viable solutions, because things like repa and accelerate are not realistic alternatives when it comes to real-world applications.
I'm sadly not actually able to follow what you're getting at here.
I think there are a few threads going on that I'd be interested in following up on: whether imperative/mutable semantics are common in mathematics, whether it makes sense to call something in abstract algebra "destructive", what copying means, whether analyzing computational physics/numerics is a similar task, whether functional programming experts are capable of building real-world applied math tools...
Ultimately, though, I'm not sure which ones move the conversation along productively.
I think perhaps the most interesting one is the notion of copying. It is certainly the case that modern FP assumes copying is okay. This comes naturally from the "structural properties" of implication in type theory, and there's plenty of work studying type theories which are "substructural" and thus don't assume an ambient copy/destroy comonoid. I don't honestly consider this out of scope of FP, although no languages implement it today. There is a need to pick a "default semantics sweet spot", and linear logic may not yet have proven it pays its way on the weight/power tradeoff spectrum. But maybe it will be shown to in time?
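Concretely, the "ambient copy/destroy comonoid" is just the fact that ordinary, structural Haskell lets you write these two maps uniformly at every type, which is exactly the contraction and weakening rules:

    copy :: a -> (a, a)      -- Delta : A -> A (x) A   (contraction)
    copy x = (x, x)

    discard :: a -> ()       -- e : A -> I              (weakening)
    discard _ = ()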
In the meantime you can do plenty of work in abstract algebra without talking about mutation or destruction at all. You mention category theory but ignore the prime dictum of CT: it's all about the arrows. Categories certainly exist which embody destructive internal algebras---CT would be pretty useless if they didn't---but externally we consider arrows and compositions of arrows alone. This is where the real power of CT comes from anyway.
Perhaps a few points of clarification are in order; my original comment had too many different threads mixed together. I will first try to expand on my remarks on copying and then try to respond to some of the things you said.
When you look at the categories in use by mathematicians, a surprisingly large number of them are not cartesian. This is true in particular of the categories relevant to physics: both the category of symplectic manifolds (relevant to classical mechanics) and the category of Hilbert spaces (relevant to quantum mechanics) are monoidal but not cartesian. In both cases you therefore cannot copy states.
Moreover in physics things are local in the sense that the dynamics of a physical system is usually determined by differential equations. That means there is typically no situation where you have to retain much of the state of the system for a long time in order to simulate its behavior.
Even for complicated things like PDE simulations, what you really do is approximate some infinite-dimensional function space locally with finite-dimensional vector spaces. The state space in each step of the simulation is some tensor product. So while a computer certainly allows you to do it, it isn't really natural to constantly copy state in such simulations. Abstractly a Fortran 77 program could probably be described by a collection of morphisms in some traced monoidal category, so it doesn't seem too far off from the mathematics it was invented to implement, although ultimately it clearly is deficient.
From a certain angle a lot of abstract algebra is actually more or less about creative destruction of information, certainly if you use it as a tool to study number theory or topology. A whole class of functors studied there are monoidal functors from cartesian to non-cartesian categories, and you carefully adjust how much information the functor loses in order to make computations feasible (homology vs. homotopy, for example). The reason mutation doesn't seem to be prevalent is that a lot of the operations studied are invertible. But as soon as you study, say, a group action on a space that is not faithful, you won't be able to recover the original space after you've applied the group action once, unless you've copied it before.
Of course the origin of copying in type theory is that, for example, Gentzen's calculus can properly be thought of as living in some cartesian closed category, as you probably know. Personally I find linear logic interesting, especially because it is able to faithfully embed intuitionistic logic and lets you talk precisely about copying. The problem is of course that type inference is undecidable for linear type theory, if I remember correctly.
In terms of expressive power Haskell is probably already fairly close to the ideal; most relevant to this discussion is, I believe, its inability to express the laws its type classes are supposed to satisfy.
It is fairly good at expressing "free" or "cofree" things, but is not so good at expressing laws statically, not to mention providing a mechanism for proving the laws for instances. The best you can do is write embedded interpreters or compilers for those free/cofree things that sort things out at runtime, or use Template Haskell.
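To illustrate the point about laws and the "free things plus interpreters" workaround, here is a minimal Haskell sketch; MyMonoid, FreeMon, and interp are ad-hoc names I made up, not from any library:

    -- A class whose laws live only in a comment; GHC will accept a
    -- law-breaking instance without complaint.
    class MyMonoid a where
      unit  :: a
      (<+>) :: a -> a -> a
      -- Laws (not checkable by the compiler):
      --   unit <+> x      == x
      --   x <+> unit      == x
      --   (x <+> y) <+> z == x <+> (y <+> z)

    -- The "free" counterpart: lists, where the laws hold by construction.
    newtype FreeMon a = FreeMon [a]

    instance MyMonoid (FreeMon a) where
      unit = FreeMon []
      FreeMon xs <+> FreeMon ys = FreeMon (xs ++ ys)

    -- An interpreter out of the free structure; it respects the laws only
    -- if the target instance happens to be lawful, and nothing checks that.
    interp :: MyMonoid b => (a -> b) -> FreeMon a -> b
    interp f (FreeMon xs) = foldr (\x acc -> f x <+> acc) unit xs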
Broadly, as others have highlighted, you appear to be mixing modes between internal algebra and external algebra. That's fine and in agreement with my argument: "nobody cares about using [categories] with [side effects], they merely want you to be explicit".
The only intensifying bit is that you need some kind of ambient category to build your notion of categories internally to, and this one, by long tradition, is typically cartesian.
Differential equations are a great example, but I'm going to use them to again intensify my point instead of yours ;)
Differential equations clearly involve a vital component of state. However, it is benign in its usage. In particular, an enormous amount of the work done in diffeq (from designing the models, to analyzing them, to solving them) is done statically, either via algebraic manipulation of state-free equations or via analyses such as Laplace transforms.
The point is that differential equations, by being explicit about state (i.e. external to it), enable many useful interpretations. If diffeq were truly about state alone then you would have no other choices---if I handed you a differential equation then the only thing you could do would be to observe it as it is now or press play and watch.
Clearly that's an important POV, but equally clearly it's not the only one.
So, if your argument boils down to the fact that there are many topics of study which have destructive internal algebras then yes, I agree wholeheartedly!
I'm just emphasizing that you need a place to study them externally because it's damn powerful. This external place is, by mathematical convention and the seductive power of cartesian closure, usually side-effect free.
In particular, if you want to dive down into the internal algebra of a side-effective category then you have monads to do that with.
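A standard illustration of that last point, assuming the mtl package's Control.Monad.State: the mutation happens "internally", while the surrounding language stays pure and the effect is visible in the type.

    import Control.Monad.State (State, get, put, execState)

    -- "x = f(x)", but the presence of state is recorded in the type.
    bump :: State Int ()
    bump = do
      n <- get
      put (n + 1)

    -- execState (bump >> bump) 0  ==  2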
---
Complete inference in any interesting type system stronger than HM is probably undecidable. Personally I don't actually see that as such a big problem. Replacing complete inference is a compiler which will, in real time, tell you everything it knows about the relevant names in scope, which is pretty powerful. Not being able to ignore types entirely is actually a good thing.
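GHC's typed holes give exactly this kind of feedback today; here is a tiny made-up example (the -fdefer-typed-holes flag just turns the hole into a warning so the module still compiles):

    {-# OPTIONS_GHC -fdefer-typed-holes #-}

    -- Leaving the hole `_` makes GHC report its expected type together with
    -- the relevant bindings in scope (f :: a -> b, xs :: [a]), rather than
    -- inferring the whole program for you.
    mapSketch :: (a -> b) -> [a] -> [b]
    mapSketch f xs = _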
---
And w.r.t. laws and the like: I disagree that TH is much of an answer, would rather suggest that "interpreters and compilers" are an ideal solution and essentially the best we can do while retaining constructivity, and suggest you look closer at the dependently typed languages if you want to learn more about law expression in that kind of tool.
In fact there is: take A to be the space of differential forms on some manifold, and M the wedge product. Given two differential forms eta, mu, there is no way to recover eta and mu from M(eta, mu). Even simpler: take two arbitrary numbers and multiply them; unless they were prime you have no way of telling from the product what the two original numbers were. In other words, multiplication destroys information.
In the case of numbers this is fine because there is in fact a way of copying them beforehand, by an operation Delta : A -> A x A which sends a number a to (a, a). So if, given (a, b), you want both their product and the numbers themselves, you need a map A x A -> A x A x A given by (for example)
(id x M x id) . (id x id x Delta) . (Delta x id)
In the case of differential forms, there is no such (natural) map Delta.
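For concreteness, here is that composite written out in Haskell for ordinary numbers, where the diagonal does exist; delta, m, and keepBoth are names I made up:

    delta :: a -> (a, a)
    delta a = (a, a)

    m :: Num a => (a, a) -> a
    m (a, b) = a * b

    -- (id x M x id) . (id x id x Delta) . (Delta x id), flattened to tuples:
    keepBoth :: Num a => (a, a) -> (a, a, a)
    keepBoth (a, b) =
      let (a1, a2) = delta a          -- Delta x id      : (a, b) |-> (a, a, b)
          (b1, b2) = delta b          -- id x id x Delta :        |-> (a, a, b, b)
      in (a1, m (a2, b1), b2)         -- id x M x id     :        |-> (a, a*b, b)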
No, applying a map does not mutate or consume the parameters, and therefore no information is destroyed. Function application in mathematics is referentially transparent.
To make this point perfectly clear: Whenever you encounter an expression such as "f(x)", you may freely re-use the expression "x" in a different place. This is a matter independent from the category you choose to work with -- for example, the expression "psi \otimes psi" makes sense in the monoidal category of Hilbert spaces.
This largely depends on your viewpoint. The map M : A * A \to A was meant to be a morphism in an arbitrary monoidal category. An element like x would be interpreted as a morphism x : I -> A (in general the objects A of a monoidal category, or of any category for that matter, have no notion of elements). The expression f(x) is interpreted as the composition f . x, so no, you are not free to use x a second time once you have formed that composition. This is swept under the rug in most cases because monoidal categories like Set are cartesian and allow you to freely copy morphisms like x : I -> A via the diagonal map Delta : A -> A * A. Explicitly you have Delta(x) = Delta . x = x * x. In general it might not even be possible to determine whether two morphisms x : I -> A and y : I -> A are the same.
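As a toy rendering of that reading in Haskell (nothing here is from a library; I is the unit type standing in for the monoidal unit): application is literally composition, and the Delta available in a cartesian setting is what licenses re-using x.

    type I = ()
    type Elem a = I -> a

    appE :: (a -> b) -> Elem a -> Elem b
    appE f x = f . x              -- "f(x)" as the composite f . x

    delta :: a -> (a, a)          -- available because the setting is cartesian
    delta a = (a, a)

    pairE :: Elem a -> Elem (a, a)
    pairE x = delta . x           -- Delta . x, i.e. "x (x) x"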
In some cases you might be able to assume from the start that you have a certain number of equal morphisms x : I -> A "in reserve"; for example, in linear logic you have the operator ! ("of course"), which gives you an arbitrary number of identical morphisms to play with.
In any case this distinction is quite subtle and I understand why you might assume that I'm simply misunderstanding things. In particular I should emphasize that almost no programming languages work this way, although with some effort you would be able to recast typical cryptographic / numerical code in this language.
It is also really easy to see, in the case of addition for example, that information is indeed destroyed: clearly the map
(a,b) -> (a+b,a-b)
has an inverse if the characteristic isn't 2, but the subsequent projection to the first factor destroys information. Categories in which that projection is possible have maps d_A : A -> I.
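A quick Haskell rendering of that observation, using Rational so that dividing by 2 is available (the names are ad hoc):

    -- Over a field of characteristic /= 2 the sum/difference map loses
    -- nothing, because it can be inverted; information is only destroyed
    -- by the projection that follows.
    sumDiff :: (Rational, Rational) -> (Rational, Rational)
    sumDiff (a, b) = (a + b, a - b)

    sumDiffInv :: (Rational, Rational) -> (Rational, Rational)
    sumDiffInv (s, d) = ((s + d) / 2, (s - d) / 2)   -- needs 1/2 to exist

    projectFirst :: (Rational, Rational) -> Rational  -- a "d_A" on one factor
    projectFirst (s, _) = s   -- a and b can no longer be recovered from a + b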
What tel is pointing out above is that Delta and d together form a comonoid.
While psi \otimes psi certainly makes sense, the map psi \to psi \otimes psi is not linear and therefore not a morphism (physicists call this the no-cloning theorem, for some reason).
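For reference, the one-line check of that non-linearity (the usual no-cloning computation), writing the would-be cloning map as \Delta(\psi) = \psi \otimes \psi:

    \Delta(\psi + \phi) = (\psi + \phi) \otimes (\psi + \phi)
                        = \psi \otimes \psi + \psi \otimes \phi + \phi \otimes \psi + \phi \otimes \phi
                        \neq \psi \otimes \psi + \phi \otimes \phi = \Delta(\psi) + \Delta(\phi)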
I understand where you are coming from, but you are conflating different meta-levels (external vs. internal language/logic): you are absolutely free to use any formal expression "x" a second time, and, crucially, it will have the same mathematical meaning if you choose to do so, as can be seen by pondering the tautology "P(x) <-> P(x)". This is not a property of the category that you chose to work with, but rather of the formal language that we use when engaging in mathematics, as you demonstrated yourself when you granted that "psi \otimes psi" made sense.
The non-existence of a morphism "psi -> psi \otimes psi" and the notion of "destroyed information" that you are discussing in the rest of your post is independent from all of this. If you wish I can elaborate on the "no cloning theorem".
Well ok, then this was a simple misunderstanding, or miscommunication. My knowledge is largely based on reading of http://ncatlab.org/nlab/show/internal+logic and several other pages on that wiki, plus graduate training in math and physics. Based on your last reply I'm confident you have a similar background. Logic and category theory are not my specialty and I probably haven't expressed myself as clearly as I would in professional communication.
To check if I understand the argument: I'm guessing you're talking about the internal logic or language of some category, in contrast to the external one used to talk about that category.
Right, I think that was the point of misunderstanding between me and asuidyasiud. If you think of x as an assumption in linear logic, for example, then there is a difference between assuming x or x^2, and so on. Since in programming languages such x : I -> A are values of type A, this ties in with the duplication and destruction I was going on about above. I tried to express the sentiment that in many ways it is actually natural to think of those kinds of logic as more fundamental, because they make the duplication of values (unnatural for things like resources, spaces of state, etc.) explicit or even impossible.
Yes, essentially not injective, which is not the same as not having an inverse. The notion can be extended though; for example, functors that aren't faithful are certainly destructive.