
Along a similar line, the only book I've found that goes through a careful derivation of the divergence theorem using Lebesgue integration is "A Concise Introduction to the Theory of Integration," Second Edition, by Daniel W. Stroock. I prefer it to his later book, "Essentials of Integration Theory for Analysis."

Does anyone else have a good reference or book on the topic?


Also, when you talk about the divergence theorem using Lebesgue integration, you're heading into the territory of Geometric Measure Theory, which is very deep, and I'm not sure why anyone would do this unless they were a working mathematician with a practical need for it, say because they were studying geometric analysis (which only a small group of people in the world do).


This is an amazing book and I'm surprised to see someone else knows about it. However, who cares about the Lebesgue integral? The only thing it's good for is integrating pathological functions like the indicator of the rationals, the indicator of the Cantor set, and fractals. Riemann integration is just fine, and I'm not sure what all the fuss is about the Lebesgue integral. Sure, the expectation of a random variable is a Lebesgue integral, but most of the time you have a density anyway and can use the Riemann integral.


As mentioned in a sibling comment, Lebesgue integration can be helpful with probability theory because we can wrap some information into the measure rather than the function. Though, to be sure, this can often be done in a similar manner using the Riemann-Stieltjes integral.

To me, part of the value of Lebesgue integration is in understanding the limitations of Riemann integrals and when they break. Some of this is covered in Stroock's book in section 5.1. Alternatively, when working in function spaces, we may need to integrate in an even more general way than Lebesgue integration, using things like Bochner integrals, which require similar theory. This can arise in areas like PDE-constrained optimization, which most of the time is targeted toward physics-related models.

All that said, bluntly, I prefer to work with Riemann integrals when at all possible. However, the same question then applies. Do you or someone else have a reference for a rigorous derivation of the divergence theorem or integration by parts in multiple dimensions using Riemann integration? It's not particularly hard in one dimension, but higher dimensions are tricky, and it's hard to get the details of integrating over the surface correct. Stroock's book is the only reference I know of, and he does it with Lebesgue integration.


Well, mathematicians care about those pathological cases. Probably the most important of them is Brownian motion / Wiener processes / SDEs. Brownian motion is differentiable with probability zero (its sample paths are nowhere differentiable almost surely), yet it has many modeling applications. It is also fractal-like (the self-similarity property).

The more practical advantage measure theory provides for probability is that you can handle continuous and discrete distributions simultaneously. Most of the time what works for one works for the other, but you can make some weird mistakes (Shannon's differential entropy has a few issues as a measure of information not found in the discrete case because he got lazy and just replaced the sum with an integral).
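
For instance, a distribution that mixes an atom and a density is handled by one formula. Here is a sketch in LaTeX of the standard setup (the 1/2-1/2 mixture weights are made up for illustration):

  % X puts mass 1/2 at 0 and spreads the other 1/2 with density f on (0, \infty)
  \mu = \tfrac{1}{2}\delta_0 + \tfrac{1}{2}\nu, \qquad d\nu = f(x)\,dx,
  \qquad
  \mathbb{E}[g(X)] = \int g\,d\mu
                   = \tfrac{1}{2}\,g(0) + \tfrac{1}{2}\int_0^\infty g(x)\,f(x)\,dx .

The same integral against μ covers the purely discrete and purely continuous cases as well, which is exactly the "one framework for both" point.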

A good chunk of the time I come across measure theoretic probability papers I feel like they’re making the paper a lot more complicated and messier than it needs to be, but it does serve a purpose.


Interchange of limits and integrals is more straightforward with Lebesgue integrals.
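
A standard textbook example of why some hypothesis is needed before swapping them (a sketch, not tied to any particular application):

  f_n = n\,\mathbf{1}_{(0,\,1/n)} \ \text{on } [0,1], \qquad
  \int_0^1 f_n\,dx = 1 \ \text{for every } n, \qquad
  f_n \to 0 \ \text{pointwise}, \ \text{so} \ \int_0^1 \lim_n f_n\,dx = 0 .

Each f_n here is Riemann integrable, so the failure isn't about which integral you use; the point is that Lebesgue theory supplies easy-to-check sufficient conditions for the interchange (monotone and dominated convergence), whereas the Riemann toolbox mostly leans on uniform convergence.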


Have you ever encountered a scenario in practice where interchanging a limit with an integral was not allowed?


For the mathematically inclined, the best I've seen is An Introduction to Continuum Mechanics by Morton Gurtin from 1981. At one point, it could be purchased from Google Books directly as a PDF.


No. It's used differently.

Here's another way to understand Kalman filters that doesn't require statistics, but does require some knowledge of feedback controllers. Consider a model of a system of the form

  x' = Ax + Bu
  y = Cx
Here, we have a linear system with a state variable `x`, system dynamics `A`, control `u`, control dynamics `B`, and observation `y`. This states that we have linear dynamics, but we can only observe some of the state variables. For example, perhaps we can observe position, but not velocity or acceleration. At the same time, we want those other variables because we need them for feedback control or some other kind of observation. In order to get them, we use machinery similar to a feedback controller. Define an observer system:

  xx' = Axx + Bu + L(y-Cxx)
  xx(0) = xx0
Here, we have a new observer variable `xx` that we want to converge to the true state variable `x`. To do this, we have introduced a new matrix of gains called `L`, the observer gain for a Luenberger observer. The reason this system is useful is that if we consider the error `e = x - xx` and substitute it into the above equations, we get:

  e' = x' - xx'
     = ...
     = (A-LC) e
From ODE theory, we know that `e` will converge to 0 if the real parts of the eigenvalues of `A-LC` are negative. Hence, to facilitate this, we focus on finding an appropriate `L` that forces this condition. In order to do this, we note that the eigenvalues of `A-LC` are the same as those of its transpose, `At - Ct Lt`. This leads to an optimization formulation:

  min ∫ [0.5 <Q ee, ee> + 0.5 <R uu, uu>] dt   st   ee' = At ee + Ct uu,   ee(0) = ee0
The notation `<x,y>` means inner product and is simply `xt y`. Here, we're essentially looking at the adjoint equation with a new kind of control `uu` and a new adjoint state variable `ee`. This is a linear quadratic control problem on the adjoint of the error generated by the Luenberger observer. There is a mostly open choice for `Q` and `R` that we discuss below. Through a long sequence of derivations, nearly identical to that for linear quadratic control, we eventually solve for a matrix `P` in the system:

  PAt + AP - PCt inv(R) C P = -Q
And then set the observer gain to `L = P Ct inv(R)` (matching `Lt` with the linear quadratic gain `inv(R) C P`; this is the choice that makes `A - LC` stable).
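
To make the linear-algebra step concrete, here is a minimal sketch in Python (my own example, assuming NumPy and SciPy are available; the double integrator, `Q`, and `R` are made up for illustration). scipy.linalg.solve_continuous_are solves a Riccati equation of this form when fed the transposed system:

  import numpy as np
  from scipy.linalg import solve_continuous_are

  # Hypothetical example: a double integrator where only position is observed.
  A = np.array([[0.0, 1.0],
                [0.0, 0.0]])
  C = np.array([[1.0, 0.0]])

  # Mostly open choice of weights; the identity is a reasonable default.
  Q = np.eye(2)
  R = np.eye(1)

  # solve_continuous_are(a, b, q, r) solves a'X + X a - X b inv(r) b'X + q = 0,
  # so passing a = At and b = Ct gives P At + A P - P Ct inv(R) C P = -Q.
  P = solve_continuous_are(A.T, C.T, Q, R)

  # Observer (Kalman-Bucy) gain; with this choice A - LC is stable.
  L = P @ C.T @ np.linalg.inv(R)

  # Sanity check: every eigenvalue of A - LC has negative real part.
  print(np.linalg.eigvals(A - L @ C))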

Given this observer gain, we go back to our system above with `xx`, plug it in, and then solve. That system is constantly updated with observations of the original system in `y` and it produces an `xx` that converges very rapidly to the original state variable `x`. This gives us a way to view all of the state variables in the original equation even though we can't observe them all directly.
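
Continuing the sketch above (again just an illustration: the noise-free plant, initial conditions, and input are invented for the demo), integrating the plant and the observer side by side shows `xx` converging to `x`:

  from scipy.integrate import solve_ivp

  B = np.array([[0.0], [1.0]])
  u = lambda t: np.array([np.sin(t)])          # arbitrary known control input

  def plant_and_observer(t, z):
      x, xx = z[:2], z[2:]                     # true state and observer state
      y = C @ x                                # we can only measure y = Cx
      dx  = A @ x  + B @ u(t)
      dxx = A @ xx + B @ u(t) + L @ (y - C @ xx)
      return np.concatenate([dx, dxx])

  # True state starts at (1, -1); the observer starts at (0, 0) out of ignorance.
  sol = solve_ivp(plant_and_observer, (0.0, 10.0), [1.0, -1.0, 0.0, 0.0])

  # Final column: entries 0-1 are x(10), entries 2-3 are xx(10); they nearly match.
  print(sol.y[:, -1])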

Now, to get to a full Kalman filter, there's a question of how to choose the matrices Q and R above. They could be almost anything. If you set them to your estimates of the covariances of the errors in the state and observations of the original equations, you get a Kalman filter. Really, the derivation above is in continuous time, so it would be a Kalman-Bucy filter. Honestly, you don't need any statistics, though. Normally, just normalize the weights so that all of the state variables are around the same size, and then maybe adjust them based on what you think is more important. It generally works just fine.

In short, a Kalman filter is a little machine, built with the same machinery as a feedback controller, whose estimate rapidly converges to the state of a system of interest. This is useful because it's unlikely that all of those states can be observed directly. Due to how feedback controllers work, it is robust to errors and can handle variation or noise in the observations.


I am a mathematician and would love to mechanize my proofs. The issue is that systems like Coq construct proofs in a radically different way than how we were trained. I've struggled with Coq even with somewhat more than a passing knowledge of the system; in graduate school I used Coq's underlying language, Gallina, to write interpreters that were then extracted to OCaml. Even so, I can barely turn out a basic proof in Coq.

What would help dramatically for me and others who work in applied math are proofs that cover the material in something like Principles of Mathematical Analysis by Walter Rudin. These are proofs that cover basic real analysis and topics like continuity, differentiation, and integration. It would also help to see basic linear algebra proofs, such as the fact that every real symmetric matrix has a real eigendecomposition. Now, these may exist; it's been a couple of years since I've looked. If anyone has a link to them, I'd be extraordinarily grateful.


I am nothing more than an enthusiastic dabbler, but these have helped me.

There are two very different directions. 1. The sort of full-on proof automation with tactics: https://direct.mit.edu/books/oa-monograph/4021/Certified-Pro... Dr. Chlipala builds up an understanding of the "crush" tactic, which more or less lets you specify an inductive hypothesis and then fiddles around with identifying the base case and expanding enough to generate the n+1 step. Along the way, you get exposure to using Coq's tactic language (Ltac) to write your own tactics.

It's neat, from a programmer's point of view, because you can automate away the boring parts. There is a pretty good discussion of debugging tactics, which might provide a path to transforming machine proofs into something a bit more comprehensible. At the very least, you can recover the application of axioms and such that did the transformation.

2. The other direction, which has been much more enlightening for me (the amateur hobbyist), is Agda and Jesper Cockx's tutorial: https://github.com/jespercockx/agda-lecture-notes/blob/maste... He builds up all of the notation from, well, nothing. For example, you get refl, and then build up

  -- symmetry of equality
  sym : {A : Set} {x y : A} → x ≡ y → y ≡ x
  sym refl = refl

  -- transitivity of equality
  trans : {A : Set} {x y z : A} → x ≡ y → y ≡ z → x ≡ z
  trans refl refl = refl

  -- congruence of equality
  cong : {A B : Set} {x y : A} → (f : A → B) → x ≡ y → f x ≡ f y
  cong f refl = refl
along with the syntax that comes with the Agda standard library. Agda is different, but you can still extract to Haskell (and allegedly JavaScript for Node, but that was a little flaky for me and might go away in a future release). Less useful for your goals and much more programming oriented, but https://plfa.github.io is pretty good and worth mentioning as a reference.


I'm slow on replies, but thanks for the references. I'll have a look.


For the math that you mention, I would suggest looking at mathlib (https://github.com/leanprover-community/mathlib). I agree that the foundations of Coq are somewhat distant from the foundations most mathematicians are trained in. Lean/mathlib might be a bit more familiar, not sure. That said, I don't see any obstacles to developing classical real analysis or linear algebra in Coq, once you've gotten used to writing proofs in it.
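
To give a tiny, non-authoritative taste of the style, here is a sketch assuming a current Lean 4 / mathlib toolchain (abs_add and le_refl are existing mathlib lemmas; everything else is just example syntax):

  import Mathlib

  -- The triangle inequality on ℝ, discharged by an existing mathlib lemma.
  example (x y : ℝ) : |x + y| ≤ |x| + |y| := abs_add x y

  -- A trivial warm-up: ≤ is reflexive.
  example (x : ℝ) : x ≤ x := le_refl x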

I'm curious, which field of math do you work in?

Edit: for example, that symmetric matrices have real eigenvalues is shown here in mathlib: https://leanprover-community.github.io/mathlib_docs/analysis...


Thanks for sending along the mathlib examples. They're not particularly intelligible to me, but it's something to work towards.

Generally speaking, I work in the intersection between optimization, differential equations, and control. To that end, there's a variety of foundational results in things like optimization theory that I'd like to see formalized such as optimality conditions, convergence guarantees, and convergence rates.

Two questions if you know. One, which theorem prover would you recommend for working toward these kinds of results? Two, is there appetite or a place to publish older, known theorems that have been reworked and validated? And, to be clear, two is not particularly important for me and my career, but it is for many of my colleagues and I'm not sure what to tell them about it.


You're welcome! And likewise, thanks for the interesting reply.

I'm afraid I have no knowledge of your field, and no idea whether there are good tools and libraries for formalising the things you want. Maybe ask or have a look around the Proof Assistants StackExchange[1]?

There are many CS conferences through which you can publish formalised mathematics. One that comes to mind is ITP[2], but there are lots which are announced on mailing lists like TYPES-announce, coq-club, agda... You could look through previous versions of ITP and check out a few of the papers on formalising mathematics to get a feel for what these publications look like.

[1] https://proofassistants.stackexchange.com/

[2] https://mizar.uwb.edu.pl/ITP2023/


Is the problem that human mathematicians have decided that classical logic + ZF(C) is the best framework for them, but it isn't the best framework for computers?

Which raises an interesting philosophical question - do contemporary mathematicians mostly do things one way rather than another (i.e. classical logic instead of intuitionistic/paraconsistent/etc logic, ZF(C) set theory instead of alternative set theories or type theory or whatever, classical analysis instead of non-standard analysis, and so on) - because they’ve settled upon the inherently superior (easier, more productive) way of doing things - or is that just an accident of history? Could it be that in some alternative timeline or parallel universe out there, mathematics took some different forks in the road, and the mathematical mainstream ended up looking very different from how it does here?


I can't speak for kxyvr, but let me chime in as a mathematician who does formalise my theorems. There's no issue with representing various foundations (e.g. ZFC) on a computer -- for example, that's essentially what Lean/mathlib does, and it's working out great for them. A "problem" with ZFC, however, is that it's very low-level, meaning there's a large distance between the "informal" proofs which mathematicians carry out and communicate, and their corresponding formal proofs in ZFC. Accordingly, for a ZFC-based mathematician to start using proof assistants, they not only have to learn how to use a proof assistant, but also to translate (or expand/elaborate) their math into ZFC.

In other foundations, such as homotopy type theory (which is what the article is about), the distance between "informal" mathematical arguments and their formal counterparts is much shorter. This is why formalisation is widespread in the HoTT-community. Indeed, I believe Voevodsky worked on HoTT in order to have a foundation which was much closer to the mathematical practice in homotopy theory.


I find these "fork in the road" questions endlessly fascinating, and that's exactly what I was getting at with my comment about the motivation for formalizing mathematics. And it doesn't just apply to math, but the whole "tech tree" of human history. How many seemingly arbitrary choices led us down a path that unknowingly diverged from some alternative future?

Last time I commented about this [0], I made some guy irrationally angry as he told me I sounded like I was 19 years old. And in an earlier comment [1], another guy took extreme offense to my questioning of the statement "with enough marbles, the discrete distribution behaves like a continuous distribution."

It seems there is some high priesthood of mathematics and physics that doesn't like when you ask these sorts of questions.

[0] https://news.ycombinator.com/item?id=35018407

[1] https://news.ycombinator.com/item?id=35201603


> Last time I commented about this [0], I made some guy irrationally angry as he told me I sounded like I was 19 years old

If people make comments like that, flag them and email dang (hn@ycombinator.com) and he’ll do something about them

> It seems there is some high priesthood of mathematics and physics that doesn't like when you ask these sorts of questions

In my personal experience, most professional mathematicians and physicists are open to questioning the foundations of their discipline, and deal charitably with doubters (even those who are coming from a place of misunderstanding).

I suspect most of the “high priests” you speak of aren’t actually professionals at all, just Internet randoms who (at best) know only slightly more than you do, and who know very little about the norms of civil dialogue


I work professionally in computational optimization, so here are some of my observations coming from the world of mechanics.

In my experience, it's because the client is trying to reuse an existing simulation code for something that requires optimization. This could be something like imaging (parameter estimation), calibration, or control. For example, say someone has something like a PDE solver that can simulate sound waves going through a medium. If you can match the output of this simulation to some collected data, the parameters of the simulation will tell you what material is where.

Anyway, the issue is that these existing codes are rarely instrumented or derived in a way that makes it easy to get derivatives. Management doesn't want to pay for a rewrite, so they demand that existing equity (the simulator) be used. Then, a derivative-free optimizer (DFO) gets slapped on as a first attempt. To be clear, this doesn't work particularly well, but it tends to be the place where I most often see these algorithms used.

As to why it doesn't work very well, derivative-based optimizers can scale to the order of hundreds of millions of variables without too many issues (modulo the kind of problem, but they work very well for a lot of problems of value). DFOs tend to have issues past a few dozen variables.

Now, for anyone not wanting to end up in this conundrum: when you write your original simulation code, parameterize it on the floating point type. Basically, don't write the code against double; use some kind of generic, template parameter, or whatever on something we'll call MyFloat. The purpose is to make the code easily compatible with automatic differentiation (AD) tools, which will give a real, machine-precision-exact derivative. Don't do finite differences. Don't think reinstrumenting later is easy. It's not. It's a pain. It's expensive.

And, to be clear, AD can be done poorly and it can be slow. However, I've never seen a code instrumented with AD and a good gradient-based optimizer do more poorly than a derivative-free optimization code. They've always done dramatically better. Beyond just giving the derivative, I'll also mention that even if the client wants to recode the simulator to give an analytic derivative for speed, it is vastly easier to do so if the results can be checked against a simulator instrumented with AD. There are also other tricks of value in a properly instrumented code, such as interval arithmetic for some kinds of sensitivity analysis.
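
As an illustration of the payoff (my own sketch, not tied to any particular code base: the toy damped-oscillator "simulator" and the misfit function are invented, and it assumes JAX is installed), the Python analogue of templating on MyFloat is writing the time stepper against jax.numpy, after which jax.grad returns the exact gradient of the misfit with respect to the parameters:

  import jax
  import jax.numpy as jnp

  def simulate(params, x0=1.0, v0=0.0, dt=0.01, steps=500):
      # Toy "simulator": explicit Euler on a damped oscillator x'' + c x' + k x = 0.
      k, c = params[0], params[1]
      x, v = x0, v0
      for _ in range(steps):
          a = -k * x - c * v
          x = x + dt * v
          v = v + dt * a
      return x

  def misfit(params, observed=0.25):
      # Squared mismatch between the simulated endpoint and a single observation.
      return (simulate(params) - observed) ** 2

  params = jnp.array([2.0, 0.3])
  grad = jax.grad(misfit)(params)    # exact gradient, no finite differencing
  print(misfit(params), grad)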

Anyway, there are other reasons to use derivative-free optimization. Some functions just flat out don't have derivatives. However, in the world of mechanics, most do, and DFOs get abused due to poor prior planning. In either case, please, for the love of math, just parameterize on your floating point type. It would make my job much easier.


Agreed. If (approximate) 1st-order information can be obtained in any way (even by *carefully* deployed finite difference), then gradient-based methods should be used. It is wrong to use DFO in such cases.

> DFOs tend to have issues past a few dozen variables.

It seems that the PRIMA git repo https://github.com/libprima/prima shows results on problems of hundreds of variables. Not sure how to interpret the results.


> DFOs tend to have issues past a few dozen variables.

It is interesting to check the results in section 5.1 of the following paper:

https://arxiv.org/pdf/2302.13246.pdf

where these algorithms are compared with finite-difference CG and BFGS included in SciPy.


At least in the past, there were some issues with the licensing of dependencies. For example, both Julia and MATLAB depended on a variety of routines from SuiteSparse, such as the Cholesky factorization and the QR factorization. These routines are dual-licensed commercial/GPL. The difference is that your MATLAB license gave you the ability to use things like the MATLAB Compiler to distribute somewhat closed code to a client because they, ostensibly, have a license for SuiteSparse. Julia did not, so any project compiled into a standalone executable would be subject to the GPL unless an additional license for the dependencies was purchased. Now, if you're only distributing the source code, which most people do, this doesn't matter as much, but we should all be aware of our license responsibilities. MATLAB has more functionality built in, and I trust they've done the legal legwork for their included routines, so I don't have to.

To be clear, Julia updates constantly and I'm sure many of its original dependencies are being recoded to help address this issue. I still think it's worth a check and an audit if someone wants to put together a broader project, so as not to get burned by this issue later.


Yeah, there have been active attempts to move SuiteSparse out of the base sysimage for this reason. It was planned for the 1.9 release, but it looks like there are some performance regressions from the current attempts, so it's not yet there.

Julia code is mostly distributed as source code, and compiling into standalone executables and then distributing them to outsiders is not very common, so it doesn't often come up as an issue. But it's an important detail to keep in mind in case the use case does arise.


I appreciate the work on this. That's not particularly easy code to write, so thanks to whoever is working through this.


This kind of comment makes me genuinely angry. It's the exact opposite. Repeating this lie led to painful deaths.

My wife is an ICU physician. I traveled with her around the country during the COVID pandemic to provide moral support for what was a very difficult job. One of the most common, heartbreaking stories was the patients who were wrongly told by their friends or the internet that they were going to die if they were intubated. Now, it is your right to refuse intubation just as it is your right to refuse resuscitation. That said, my wife would recommend intubation when it was warranted, and these patients would flatly refuse because of information like this. Then, she would watch them slowly and painfully decompensate day by day. Weaker and weaker. Then, when they were at death's door, they would universally reverse their refusal of intubation and ask her to do everything. At this point, my wife would hold their cell phone while they video chatted with their husbands, and wives, and parents, and children. Tell them how much they loved them. Then, they would crash and my wife would frantically try to intubate them and keep them alive. At this point, they had no reserves. Most of them died.

No physician in their right mind recommends intubation without cause. It's a procedure. It carries risk. However, it is vastly easier to do in a controlled setting before someone has no reserves and is crashing. It means that they can give the patient's lungs a break so they can heal.

I listened to these stories day after day after day and tried to support my wife as best I could. Physicians are not perfect. They do make mistakes. They also have vastly more experience dealing with the intricacies of care. That said, comments like this one absolutely killed people. People who didn't need to die. That was and still is wrong.


Sadly, if you look back through this person's comment history, you'll find comments as bad or worse. I personally am of the opinion that HN is no place for this kind of comment, but day by day they're out there fighting the good fight around here. It makes me sad to see that no amount of comments like yours appears to prevent this person from continuing their crusade of closed-minded, misinformation-laden horse-shit in these comment threads.


What you're describing is clearly extremely strongly felt.

Nevertheless, there is a growing body of evidence that the use of ventilators is linked to excess mortality:

https://www.jci.org/articles/view/170682


It wasn't an expensive disaster for IBM, but it was for MD Anderson Cancer Center, which spent $62 million on a project using Watson that didn't work:

https://www.medscape.com/viewarticle/876070

Certainly, not all projects work and institutions need to take risks. However, there is a big financial and opportunity cost to this. Rather than spending $62 million on Watson, and assuming a $100k salary with a cost multiplier of 2 ($200k for 1 FTE), that amount of money could have paid for 77.5 FTEs over the 4-year duration of the Watson contract (62e6/200e3/4). I suspect that the overall scientific contribution of those researchers would have been higher.


As someone who has been through this, I'll offer some random, unsolicited advice:

1. Much is dictated by the level of care. Meaning, there is a spectrum from independent living to assisted living to a nursing home. Independent living is the cheapest with the least amount of assistance. A nursing home is the most expensive and has the highest level of care.

2. Money does not necessarily dictate better care. We obtained better care from a $5k/mo facility than from a $9k/mo one.

3. Staff turnover is high. Reviews are important, but if the reviews are more than a year old, it doesn't mean a lot.

4. Most of the time, better care can be had if the facility knows the family is involved and they are being watched. To this end, each state I've been involved with allowed the family to demand a camera be installed in the room. The rooms with cameras are the first to be handled. And, to be clear, I don't like this, but it is what it is. Outside of this, families that visit either every day or multiple times a week who get to know the staff generally receive better care.

5. Medicaid can cover the cost of a nursing home, but all assets need to be essentially depleted to a few thousand dollars. This includes any value in the home. It may be possible to transfer the assets to a trust, but there tends to be a look-back period, so this only works if the assets have been in the trust for multiple years. An estate lawyer familiar with this area of law would know best.

6. If assets are depleted as in 5, some states subsidize home-based care through Medicaid. This tends to be paltry, but it can help. I think we were originally authorized 20.5 hours a week, and the amount essentially came out to $9/hr.

And, to be clear, all of the above addresses none of the emotional difficulty of being put in such a position.

As far as the article goes, I don't doubt it. We saw some really horrific things in the nursing homes. When you're going through it, it feels like a mix of sadness, helplessness, and rage. It doesn't have to be bad, but often is. At the same time, there are situations where you must use one because it's not safe to have your family member be alone. Things like dementia, fall risks, etc. Please never judge someone for using a facility; it may be required. Just protect yourself and your family as best you can.


Thanks for this input! Scary as this is, more info is helpful.


1. Principles of Mathematical Analysis by Walter Rudin (baby Rudin) - I'd studied real analysis in the past, but this book is direct and rigorous and provided a good framework to move forward into things like functional analysis in a way that I was not prepared for with other books.

2. Differential Equations and Dynamical Systems by Lawrence Perko - Solidified for me how dynamical systems behave and are solved. Very much helped my understanding of control theory as well.

3. A Concise Introduction to the Theory of Integration by Daniel Stroock - Helped solidify concepts related to Lebesgue integration and a rigorous formulation of the divergence theorem in high dimensions.

4. Convex Functional Analysis by Kurdila and Zabarankin - Filled in a lot of random holes in my functional analysis knowledge. Provides a rigorous treatment of when an optimization problem has a finite infimum and whether that infimum is attained. Prior to this point, I often conflated the two.

