{{short description|Probability distribution of the sum of random variables}}
The '''convolution/sum of probability distributions''' arises in [[probability theory]] and [[statistics]] as the operation in terms of [[probability distribution]]s that corresponds to the addition of [[statistically independent|independent]] [[random variable]]s and, by extension, to forming linear combinations of random variables. The operation here is a special case of [[convolution]] in the context of probability distributions.

==Introduction==

The [[probability distribution]] of the sum of two or more [[independent (probability)|independent]] [[random variable]]s is the convolution of their individual distributions. The term is motivated by the fact that the [[probability mass function]] or [[probability density function]] of a sum of independent random variables is the [[convolution]] of their corresponding probability mass functions or probability density functions respectively. Many well known distributions have simple convolutions: see [[List of convolutions of probability distributions]].

The general formula for the distribution of the sum <math>Z=X+Y</math> of two independent integer-valued (and hence discrete) random variables is<ref>[[Susan P. Holmes|Susan Holmes]] (1998). Sums of Random Variables: Statistics 116. Stanford. http://statweb.stanford.edu/~susan/courses/s116/node114.html</ref>
:<math>P(Z=z) = \sum_{k=-\infty}^\infty P(X=k)P(Y=z-k)</math>
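
For illustration, this sum can be computed directly as a discrete convolution of the two probability mass functions. A minimal NumPy sketch for the classic case of two fair six-sided dice (the variable names are illustrative):
<syntaxhighlight lang="python">
import numpy as np

# PMF of a fair six-sided die on the support {1, ..., 6}
die = np.full(6, 1 / 6)

# P(Z = z) = sum_k P(X = k) P(Y = z - k) is exactly a discrete convolution
pmf_sum = np.convolve(die, die)

# Supports add, so Z = X + Y takes values in {2, ..., 12}
for z, prob in zip(range(2, 13), pmf_sum):
    print(f"P(Z = {z}) = {prob:.4f}")
</syntaxhighlight>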

For independent, continuous random variables with [[probability density function]]s (PDF) <math>f,g</math> and [[cumulative distribution function]]s (CDF) <math>F,G</math> respectively, the CDF of the sum is:
:<math>H(z)=\int_{-\infty}^\infty F(z-t)g(t) dt = \int_{-\infty}^\infty G(t)f(z-t) dt</math>
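
As a check of this formula, the integral can be evaluated numerically for two independent exponential variables, whose sum has a known closed-form CDF; a sketch using SciPy's standard exponential distribution:
<syntaxhighlight lang="python">
import numpy as np
from scipy.integrate import quad
from scipy.stats import expon

# H(z) = ∫ F(z - t) g(t) dt for independent X, Y ~ Exponential(1)
def H(z):
    # F(z - t) vanishes for t > z and g(t) for t < 0, so integrate over [0, z]
    value, _ = quad(lambda t: expon.cdf(z - t) * expon.pdf(t), 0, z)
    return value

z = 2.0
print(H(z))                      # numerical CDF of the sum at z
print(1 - np.exp(-z) * (1 + z))  # closed form: X + Y ~ Gamma(2, 1)
</syntaxhighlight>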

If we start with random variables <math>X</math> and <math>Y</math>, related by <math>Z = X + Y</math>, and with no information about their possible independence, then:

:<math>f_Z(z) = \int \limits_{-\infty}^{\infty} f_{XY}(x, z-x)~dx</math>
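
This formula requires no independence assumption. As a numerical sketch, for jointly normal <math>X</math> and <math>Y</math> with unit variances and correlation <math>\rho</math>, the sum is known to be <math>N(0, 2+2\rho)</math>, which the integral reproduces (the parameter values are arbitrary):
<syntaxhighlight lang="python">
import numpy as np
from scipy.integrate import quad
from scipy.stats import multivariate_normal, norm

# Dependent case: X, Y jointly normal, unit variances, correlation rho
rho = 0.5
joint = multivariate_normal(mean=[0, 0], cov=[[1, rho], [rho, 1]])

# f_Z(z) = ∫ f_XY(x, z - x) dx, valid without independence
def f_Z(z):
    value, _ = quad(lambda x: joint.pdf([x, z - x]), -np.inf, np.inf)
    return value

z = 1.0
print(f_Z(z))                                   # marginalized density of the sum
print(norm.pdf(z, scale=np.sqrt(2 + 2 * rho)))  # known result: N(0, 2 + 2*rho)
</syntaxhighlight>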

However, if <math>X</math> and <math>Y</math> are independent, then:

:<math>f_{XY}(x,y) = f_X(x) f_Y(y)</math>

and this formula becomes the convolution of probability distributions:

:<math>f_Z(z) = \int \limits_{-\infty}^{\infty} f_{X}(x)~f_Y(z-x)~dx</math>
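
The convolution integral can be approximated on a grid; a short sketch for two standard normal densities, compared against the exact <math>N(0,2)</math> density of the sum:
<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import norm

# Discretize the line and approximate f_Z(z) = ∫ f_X(x) f_Y(z - x) dx
dx = 0.01
x = np.arange(-10, 10, dx)
f_X = norm.pdf(x)  # X ~ N(0, 1)
f_Y = norm.pdf(x)  # Y ~ N(0, 1)

# Riemann-sum approximation of the convolution integral
f_Z = np.convolve(f_X, f_Y) * dx
z = 2 * x[0] + dx * np.arange(f_Z.size)  # output grid starts at x[0] + x[0]

# Exact density of the sum: X + Y ~ N(0, 2)
print(np.abs(f_Z - norm.pdf(z, scale=np.sqrt(2))).max())  # small discretization error
</syntaxhighlight>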


== Example derivation ==

There are several ways of deriving formulae for the convolution of probability distributions. Often the manipulation of integrals can be avoided by use of some type of [[generating function]]. Such methods can also be useful in deriving properties of the resulting distribution, such as moments, even if an explicit formula for the distribution itself cannot be derived.

One of the straightforward techniques is to use [[Characteristic function (probability theory)|characteristic functions]], which always exist and are unique to a given distribution.{{citation needed|date=April 2013}}

=== Convolution of Bernoulli distributions ===

The convolution of two independent identically distributed [[Bernoulli distribution|Bernoulli random variables]] is a binomial random variable. That is, in a shorthand notation,
:<math> \sum_{i=1}^2 \mathrm{Bernoulli}(p) \sim \mathrm{Binomial}(2,p)</math>

To show this, let
:<math>X_i \sim \mathrm{Bernoulli}(p), \quad 0<p<1, \quad 1 \le i \le 2</math>
and define
:<math>Y=\sum_{i=1}^2 X_i</math>
Also, let ''Z'' denote a generic binomial random variable:
:<math>Z \sim \mathrm{Binomial}(2,p)</math>

====Using probability mass functions====
As <math>X_1</math> and <math>X_2</math> are independent,
:<math>\begin{align}
\mathbb{P}[Y=n]&=\mathbb{P}\left[X_1+X_2=n\right]\\
&=\sum_{m\in\mathbb{Z}}\mathbb{P}[X_1=m]\,\mathbb{P}[X_2=n-m]\\
&=\sum_{m\in\mathbb{Z}}\left[\binom{1}{m}p^m\left(1-p\right)^{1-m}\right]\left[\binom{1}{n-m}p^{n-m}\left(1-p\right)^{1-n+m}\right]\\
&=p^n\left(1-p\right)^{2-n}\sum_{m\in\mathbb{Z}}\binom{1}{m}\binom{1}{n-m} \\
&=p^n\left(1-p\right)^{2-n}\left[\binom{1}{0}\binom{1}{n}+\binom{1}{1}\binom{1}{n-1}\right]\\
&=\binom{2}{n}p^n\left(1-p\right)^{2-n}=\mathbb{P}[Z=n]
\end{align}</math>

Here, we used the fact that <math>\tbinom{n}{k}=0</math> for ''k'' > ''n'' or ''k'' < 0 in the third-to-last equality, and [[Pascal's rule]] in the second-to-last equality.
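
The identity can also be checked numerically by convolving the Bernoulli mass function with itself; a small sketch with an arbitrary choice of <math>p</math>:
<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import binom

p = 0.3
bernoulli = np.array([1 - p, p])  # PMF of Bernoulli(p) on {0, 1}

# Convolving the PMF with itself gives the distribution of X_1 + X_2
pmf_Y = np.convolve(bernoulli, bernoulli)  # support {0, 1, 2}

print(pmf_Y)                           # [0.49, 0.42, 0.09]
print(binom.pmf([0, 1, 2], n=2, p=p))  # matches Binomial(2, p)
</syntaxhighlight>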


==== Using characteristic functions ====
The [[Characteristic function (probability theory)|characteristic function]] of each <math>X_k</math> and of <math>Z</math> is
:<math>\varphi_{X_k}(t)=1-p+pe^{it}, \qquad \varphi_Z(t)=\left(1-p+pe^{it}\right)^2,</math>
where ''t'' is within some neighborhood of zero. Furthermore,
:<math>\varphi_Y(t)=\operatorname{E}\left[e^{it\sum_{k=1}^2 X_k}\right]=\operatorname{E}\left[\prod_{k=1}^2 e^{itX_k}\right]=\prod_{k=1}^2 \operatorname{E}\left[e^{itX_k}\right]=\left(1-p+pe^{it}\right)^2=\varphi_Z(t).</math>
The [[Expected value|expectation]] of the product is the product of the expectations since the <math>X_k</math> are independent. Since <math>Y</math> and <math>Z</math> have the same characteristic function, they must have the same distribution.
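
The argument can be verified numerically by evaluating both characteristic functions on a grid of points near zero; a sketch, again with an arbitrary <math>p</math>:
<syntaxhighlight lang="python">
import numpy as np

p = 0.3
t = np.linspace(-1, 1, 201)  # points in a neighborhood of zero

# Characteristic function of Bernoulli(p): E[exp(itX)] = 1 - p + p*exp(it)
phi_X = 1 - p + p * np.exp(1j * t)

# Independence: the characteristic function of the sum is the product
phi_Y = phi_X ** 2

# Characteristic function of Binomial(2, p), computed from its PMF
pmf_Z = np.array([(1 - p) ** 2, 2 * p * (1 - p), p ** 2])
phi_Z = (pmf_Z * np.exp(1j * np.outer(t, np.arange(3)))).sum(axis=1)

print(np.abs(phi_Y - phi_Z).max())  # ≈ 0: the two functions coincide
</syntaxhighlight>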

==See also==
* [[List of convolutions of probability distributions]]


== References ==
{{Reflist}}
* {{cite book | last1=Hogg | first1=Robert V. |authorlink1=Robert V. Hogg | last2=McKean | first2=Joseph W. | last3=Craig | first3=Allen T. | title=Introduction to mathematical statistics | edition=6th | publisher=Prentice Hall | url=http://www.pearsonhighered.com/educator/product/Introduction-to-Mathematical-Statistics/9780130085078.page | location=Upper Saddle River, New Jersey | year=2004 | pages=692 | ISBN=978-0-13-008507-8 | MR=467974}}

[[Category:Theory of probability distributions]]
