Talk:LogSumExp

This article was nominated for deletion on 7 August 2015. The result of the discussion was keep.

"trick"?

What is the "trick" in the section "log-sum-exp trick for log-domain calculations"? I had to read the sentence "Like multiplication operation in linear-scale becoming simple addition in log-scale; an addition operation in linear-scale becomes the LSE in the log-domain." three times before it sort of made sense. I'll try to fix it, assuming log-scale and log-domain are the same thing. --WiseWoman (talk) 20:56, 14 March 2020 (UTC)

The trick is to replace

\mathrm {LSE} (x_{1},\ldots ,x_{n})

by

\mathrm {LSE} (x_{1}-x_{\mathrm {max} },\ldots ,x_{n}-x_{\mathrm {max} })+x_{\mathrm {max} }

which is numerically more stable (e.g. when used in a computer program). I think the text is clear (perhaps it has changed since you commented). --80.129.163.20 (talk) 14:39, 20 January 2022 (UTC)
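
For illustration, here is a minimal Python sketch of the shift described in this reply. The function names are made up for this example and are not taken from the article or from any library. It only shows why subtracting the maximum helps: the largest exponent becomes exp(0) = 1, so the sum cannot overflow.

```python
import math


def logsumexp_naive(xs):
    # Direct definition log(sum(exp(x_i))); math.exp overflows for large x_i.
    return math.log(sum(math.exp(x) for x in xs))


def logsumexp_shifted(xs):
    # Subtract the maximum first, then add it back outside the logarithm.
    x_max = max(xs)
    if math.isinf(x_max):
        # All inputs are -inf, or some input is +inf: the shift would give nan.
        return x_max
    return x_max + math.log(sum(math.exp(x - x_max) for x in xs))
```

With inputs such as [1000.0, 1000.0] the naive version raises OverflowError in CPython, while the shifted version returns a finite value close to 1000 + ln 2.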

I stumbled on that sentence and math too. Apparently, the point is that applying LogSumExp to a vector of variables transformed to, or taken to be in, log space (which I agree is not obviously defined, as log can in principle output any number), is equivalent to taking the log of the sum of the vector of untransformed variables. Equivalence is symmetric, so LSE can also be thought of as a way to notate/represent/compute the logarithm of a sum. Whether and how it (the trick and the whole function) is useful is another question, perhaps not sufficiently answered by this article.

NB: The other reply explains another section of the article. Elias (talk) 09:59, 10 March 2023 (UTC)
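
A small numerical illustration of the equivalence discussed above may help. It is a sketch with arbitrary example values, and the helper name `logsumexp` is local to this snippet. If the stored quantities are logarithms of positive numbers, their product corresponds to a plain sum of the logs, and their sum corresponds to LogSumExp of the logs.

```python
import math


def logsumexp(xs):
    # Max-shifted evaluation of log(sum(exp(x_i))).
    x_max = max(xs)
    return x_max + math.log(sum(math.exp(x - x_max) for x in xs))


a = [0.5, 2.0, 7.5]                 # values in linear scale (arbitrary example)
log_a = [math.log(v) for v in a]    # the same values in log scale

log_product = sum(log_a)            # equals log(0.5 * 2.0 * 7.5)
log_sum = logsumexp(log_a)          # equals log(0.5 + 2.0 + 7.5)
```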

LSE?

I think the LSE acronym is misleading, as it can be read as Least Square Error. I would use LogSumExp consistently throughout the text. User:misssperovaz. Preceding unsigned comment added by Missperovaz (talk • contribs) 04:57, 14 January 2021 (UTC)

Some approximations for the two-variable case

In the case of two real-valued variables, it is possible to approximate the function as:

\ln(e^{x}+e^{y})\approx {\begin{cases}x+\ln(2),&x=y\\{\dfrac {x\,e^{\frac {x}{\ln(2)}}-y\,e^{\frac {y}{\ln(2)}}}{e^{\frac {x}{\ln(2)}}-e^{\frac {y}{\ln(2)}}}},&{\text{otherwise}}\end{cases}}

showing how heavily nonlinear the function really is.

I hope someone can fact-check this and later add it to the main text. 45.181.122.234 (talk) 15:29, 13 September 2024 (UTC)
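
As a starting point for the fact check, here is a hedged Python sketch that compares the proposed formula with the exact value on a grid. The grid range, the helper names and the rescaling by the larger argument (used only to avoid overflow) are choices made for this example and are not part of the proposal.

```python
import math

LN2 = math.log(2.0)


def lse2(x, y):
    # Exact ln(e^x + e^y), evaluated stably.
    return max(x, y) + math.log1p(math.exp(-abs(x - y)))


def approx_ratio(x, y):
    # Proposed approximation; the x == y branch is its limiting value.
    if x == y:
        return x + LN2
    # Dividing numerator and denominator by exp(max(x, y) / ln 2)
    # leaves the ratio unchanged and keeps the exponentials <= 1.
    m = max(x, y)
    ex = math.exp((x - m) / LN2)
    ey = math.exp((y - m) / LN2)
    return (x * ex - y * ey) / (ex - ey)


def max_abs_error(approx, points):
    # Largest absolute deviation from the exact value over the given points.
    return max(abs(approx(x, y) - lse2(x, y)) for x, y in points)


grid = [(0.25 * i, 0.25 * j) for i in range(-20, 21) for j in range(-20, 21)]
worst = max_abs_error(approx_ratio, grid)
```

Inspecting `worst` gives a rough idea of the accuracy on [-5, 5] x [-5, 5]. It is a numerical spot check under these assumptions, not a proof of an error bound.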

Later I found a less accurate but more intuitive approximation:

\ln(e^{x}+e^{y})\approx {\frac {x+y}{2}}+{\frac {1}{2}}{\sqrt {(x-y)^{2}+(2\ln(2))^{2}}}

Both this approximation and the exact function satisfy

f_{x}+f_{y}=1

45.181.122.234 (talk) 17:21, 12 July 2025 (UTC)
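
The property f_x + f_y = 1 can be spot-checked numerically. The sketch below uses central finite differences; the step size and helper names are assumptions made for this example.

```python
import math


def lse2(x, y):
    # Exact ln(e^x + e^y), evaluated stably.
    return max(x, y) + math.log1p(math.exp(-abs(x - y)))


def approx_sqrt(x, y):
    # Square-root approximation from the comment above.
    return 0.5 * (x + y) + 0.5 * math.sqrt((x - y) ** 2 + (2.0 * math.log(2.0)) ** 2)


def grad_sum(f, x, y, h=1e-6):
    # Central-difference estimate of f_x + f_y at (x, y).
    fx = (f(x + h, y) - f(x - h, y)) / (2.0 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2.0 * h)
    return fx + fy
```

For example, `grad_sum(lse2, 0.3, -1.2)` and `grad_sum(approx_sqrt, 0.3, -1.2)` should both be close to 1, up to finite-difference error.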

Here is another one, less accurate but useful because it works better than a second-order Taylor expansion while keeping the same components.

The 2nd order Taylor expansion is given by:

Z=\ln(e^{X}+e^{Y}){\overset {\text{2nd order Taylor's}}{\approx }}\ln(2)+{\frac {Y+X}{2}}+{\frac {(Y-X)^{2}}{8}}

Computing the expected value then reduces to finding the terms:

E[Z]\approx \ln(2)+{\frac {E[Y]+E[X]}{2}}+{\frac {E[(Y-X)^{2}]}{8}}

A way to improve it considerably

An improvement to this approximation that uses the same terms can be found with the classic *small-angle approximation* for the cosine function

\cos(x)\approx 1-{\frac {x^{2}}{2}}

but applied in reverse rather than as a simplification, followed by Isserlis's theorem:

{\begin{array}{rcl}Z=\ln(e^{X}+e^{Y})&\approx &\ln(2)+{\frac {Y+X}{2}}+{\frac {(Y-X)^{2}}{8}}+1-1\\&=&1+\ln(2)+{\frac {Y+X}{2}}-\underbrace {\left[1-{\frac {1}{2}}\left({\frac {|Y-X|}{2}}\right)^{2}\right]} _{\text{small-angle approximation in reverse}}\\&\approx &1+\ln(2)+{\frac {Y+X}{2}}-\underbrace {\cos \left({\frac {|Y-X|}{2}}\right)} _{\text{cosine is an even function}}\\&=&1+\ln(2)+{\frac {Y+X}{2}}-\underbrace {\cos \left({\frac {Y-X}{2}}\right)} _{\text{complex-valued expansion}}\\\Rightarrow Z=\ln(e^{X}+e^{Y})&\approx &{\begin{cases}1+\ln(2)+{\frac {Y+X}{2}}-{\frac {1}{2}}\left[e^{-i{\frac {(Y-X)}{2}}}+e^{-i{\frac {(X-Y)}{2}}}\right],&{\frac {(Y-X)^{2}}{8}}\leq 3\\\max\{X,Y\},&{\text{otherwise}}\end{cases}}\end{array}}

and since, when the variables differ too much, I simply have:

E[Z]\approx {\begin{cases}E[X],\quad (X>Y)\wedge \left({\frac {(Y-X)^{2}}{8}}>3\right)\\E[Y],\quad (Y>X)\wedge \left({\frac {(Y-X)^{2}}{8}}>3\right)\end{cases}}

What I really care about is computing accurately when

{\frac {(Y-X)^{2}}{8}}\leq 3

Here, applying the expected value together with Isserlis's theorem leads to:

{\begin{array}{rcl}E[Z]=E\left[\ln \left(e^{X}+e^{Y}\right)\right]{\Biggr |}_{{\frac {(Y-X)^{2}}{8}}\leq 3}&\approx &1+\ln(2)+{\frac {E[X]+E[Y]}{2}}-{\frac {1}{2}}\left[e^{-{\frac {1}{2}}E\left[\left({\frac {Y-X}{2}}\right)^{2}\right]}+e^{-{\frac {1}{2}}E\left[\left({\frac {X-Y}{2}}\right)^{2}\right]}\right]\\&=&1+\ln(2)+{\frac {E[X]+E[Y]}{2}}-{\frac {1}{2}}\left[\underbrace {e^{-{\frac {1}{8}}E\left[\left(Y-X\right)^{2}\right]}+e^{-{\frac {1}{8}}E\left[\left(Y-X\right)^{2}\right]}} _{\text{identical terms}}\right]\end{array}}

\Rightarrow E[Z]\approx 1+\ln(2)+{\frac {E[X]+E[Y]}{2}}-e^{-{\frac {1}{8}}E\left[\left(Y-X\right)^{2}\right]},\quad {\frac {(Y-X)^{2}}{8}}\leq 3

Note that the bound

{\frac {(X-Y)^{2}}{8}}<3

is a reasonably good fit, since it comes from the fact that

\ln(e^{X}+e^{Y})=X+\ln(1+e^{Y-X})

and that:

1+\ln(2)+{\frac {Y+X}{2}}-\cos \left({\frac {Y-X}{2}}\right)=1+\ln(2)+X+{\frac {Y-X}{2}}-\cos \left({\frac {Y-X}{2}}\right)

By equating the two expressions and setting

Y-X=D

I get:

{\cancel {X}}+\ln(1+e^{D})=1+\ln(2)+{\cancel {X}}+{\frac {D}{2}}-\cos \left({\frac {D}{2}}\right)\Rightarrow D\approx \pm 3.225

Summarizing, the approximation is given by:

\ln(e^{X}+e^{Y})\approx {\begin{cases}1+\ln(2)+{\frac {Y+X}{2}}-\cos \left({\frac {Y-X}{2}}\right),\quad {\frac {(Y-X)^{2}}{8}}\leq 3\\\max\{X,Y\},\quad {\text{otherwise}}\end{cases}}

45.181.122.234 (talk) 18:43, 14 July 2025 (UTC)
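
To make the comparison concrete, the following Python sketch puts the second-order Taylor expansion, the cosine-based approximation and the exact value side by side, and locates the nonzero crossing of the matching equation by bisection. The function names, the bracket [4, 6] and the iteration count are assumptions made for this example; the bracket was picked by checking that the difference changes sign there.

```python
import math

LN2 = math.log(2.0)


def lse2(x, y):
    # Exact ln(e^x + e^y), evaluated stably.
    return max(x, y) + math.log1p(math.exp(-abs(x - y)))


def taylor2(x, y):
    # Second-order Taylor expansion around x == y.
    return LN2 + 0.5 * (x + y) + (y - x) ** 2 / 8.0


def cosine_approx(x, y):
    # Piecewise approximation summarized above.
    if (y - x) ** 2 / 8.0 <= 3.0:
        return 1.0 + LN2 + 0.5 * (x + y) - math.cos(0.5 * (y - x))
    return max(x, y)


def matching_gap(d):
    # ln(1 + e^D) minus the cosine expression, with X cancelled on both sides.
    return math.log1p(math.exp(d)) - (1.0 + LN2 + 0.5 * d - math.cos(0.5 * d))


def crossing(lo=4.0, hi=6.0, iters=60):
    # Plain bisection; assumes matching_gap changes sign on [lo, hi].
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if matching_gap(lo) * matching_gap(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

By a hand calculation, the gap is negative at D = 4.9 and positive at D = 5.0, so the positive crossing found this way lies between those values, which corresponds to D^2/8 slightly above 3. That can be compared with the value of D quoted above.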

I realized that the expected value formula is wrong if

Y-X

does not have zero mean, but it can be fixed as:

E[Z]\approx \ln(2)+{\frac {E[X]+E[Y]}{2}}+1-\cos \left({\frac {E[Y]-E[X]}{2}}\right)\cdot e^{-{\frac {1}{8}}V[Y-X]},\quad {\frac {(Y-X)^{2}}{8}}\leq 3

45.181.122.234 (talk) 23:05, 22 July 2025 (UTC)
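
A rough Monte Carlo check of the corrected formula could look like the sketch below. It assumes that X and Y are independent Gaussians, so that V[Y-X] is the sum of the two variances, and it ignores the case split on (Y-X)^2/8. The parameter values, sample size and seed are arbitrary choices for this example.

```python
import math
import random


def lse2(x, y):
    # Exact ln(e^x + e^y), evaluated stably.
    return max(x, y) + math.log1p(math.exp(-abs(x - y)))


def expected_lse_cosine(mean_x, mean_y, var_diff):
    # Corrected formula from the comment above; var_diff stands for V[Y - X].
    damping = math.exp(-var_diff / 8.0)
    return (math.log(2.0) + 0.5 * (mean_x + mean_y) + 1.0
            - math.cos(0.5 * (mean_y - mean_x)) * damping)


def expected_lse_monte_carlo(mean_x, sd_x, mean_y, sd_y, n=50_000, seed=0):
    # Sample mean of ln(e^X + e^Y) for independent Gaussian X and Y.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        total += lse2(rng.gauss(mean_x, sd_x), rng.gauss(mean_y, sd_y))
    return total / n


formula = expected_lse_cosine(0.5, 0.0, 0.3 ** 2 + 0.4 ** 2)
estimate = expected_lse_monte_carlo(0.5, 0.3, 0.0, 0.4)
```

Comparing `formula` with `estimate` for a few parameter choices gives a feel for how well the approximation holds when Y-X stays well inside the stated range.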

t>0

In the properties section, it needs to be specified that t is positive; otherwise the statement is misleading. 94.180.181.63 (talk) 10:13, 8 December 2024 (UTC)
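
Assuming the property in question is the scaled bound max(x) <= (1/t) LSE(t x) <= max(x) + ln(n)/t (the comment does not say which property it means, so this is a guess), a short Python sketch shows why the sign of t matters. The helper name and the example values are illustrative only.

```python
import math


def scaled_lse(xs, t):
    # (1/t) * LSE(t * x_1, ..., t * x_n), with the usual max shift.
    scaled = [t * x for x in xs]
    m = max(scaled)
    return (m + math.log(sum(math.exp(v - m) for v in scaled))) / t


xs = [1.0, 2.0, 3.0]
positive_t = scaled_lse(xs, 10.0)    # lies just above max(xs)
negative_t = scaled_lse(xs, -10.0)   # lies just below min(xs) instead
```

For negative t the scaled expression tracks the minimum rather than the maximum, so the lower bound by max(x) no longer holds.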