Marginal distribution
In probability theory, given two jointly distributed random variables X and Y, the marginal distribution of X is the probability distribution of X alone, disregarding any information about Y. It is typically calculated by summing or integrating the joint probability distribution over all values of Y.
For discrete random variables, the marginal probability mass function can be written as Pr(X = x). It is given by
- <math>\Pr(X=x) = \sum_{y} \Pr(X=x,Y=y) = \sum_{y} \Pr(X=x|Y=y) \Pr(Y=y),</math>
where Pr(X = x,Y = y) is the joint distribution of X and Y, while Pr(X = x|Y = y) is the conditional distribution of X given Y.
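As an illustration, the sum over y can be sketched in a few lines of Python. The joint table `joint`, the helper `marginal` and the probabilities used here are made up for the example; the code only assumes that the joint distribution is stored as a dictionary mapping (x, y) pairs to probabilities.

```python
from collections import defaultdict

# Hypothetical joint pmf Pr(X = x, Y = y), stored as {(x, y): probability}.
joint = {
    (0, 0): 0.10, (0, 1): 0.20,
    (1, 0): 0.30, (1, 1): 0.40,
}

def marginal(joint_pmf, axis):
    """Sum the joint pmf over the other variable.

    axis=0 gives the marginal of X, axis=1 gives the marginal of Y.
    """
    result = defaultdict(float)
    for outcome, p in joint_pmf.items():
        result[outcome[axis]] += p
    return dict(result)

p_x = marginal(joint, axis=0)  # Pr(X = x) = sum over y of Pr(X = x, Y = y)
p_y = marginal(joint, axis=1)  # Pr(Y = y) = sum over x of Pr(X = x, Y = y)

# Second form of the identity: sum over y of Pr(X = x | Y = y) * Pr(Y = y),
# where Pr(X = x | Y = y) = Pr(X = x, Y = y) / Pr(Y = y).
p_x_via_conditional = {
    x: sum((joint[(x, y)] / p_y[y]) * p_y[y] for y in p_y)
    for x in p_x
}
```

With these numbers, `p_x` is approximately {0: 0.3, 1: 0.7}, and `p_x_via_conditional` agrees with it up to floating-point rounding.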
Similarly, for continuous random variables, the marginal probability density function can be written as <math>p_X(x)</math>. It is given by
- <math>p_{X}(x) = \int_y p_{X,Y}(x,y) \, dy = \int_y p_{X|Y}(x|y) \, p_Y(y) \, dy,</math>
where <math>p_{X,Y}(x,y)</math> is the joint density of X and Y, while <math>p_{X|Y}(x|y)</math> is the conditional density of X given Y.
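The integral over y can be approximated numerically. The following sketch assumes a made-up joint density, p(x, y) = x + y on the unit square, for which the exact marginal of X is x + 1/2. The function names and the choice of the midpoint rule are illustrative assumptions, not a prescribed method.

```python
def joint_density(x, y):
    """Hypothetical joint density p(x, y) = x + y on the unit square."""
    if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
        return x + y
    return 0.0

def marginal_density_x(x, y_low=0.0, y_high=1.0, steps=1000):
    """Approximate p_X(x) by integrating the joint density over y.

    Uses the midpoint rule on `steps` equal subintervals of [y_low, y_high].
    """
    width = (y_high - y_low) / steps
    total = 0.0
    for i in range(steps):
        y_mid = y_low + (i + 0.5) * width
        total += joint_density(x, y_mid)
    return total * width

approx = marginal_density_x(0.25)  # exact value is 0.25 + 0.5 = 0.75
```

Because this density is linear in y, the midpoint rule reproduces the exact marginal up to rounding error. For a general density the result is only an approximation, and its accuracy depends on `steps`.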
Why the name 'marginal'? One explanation is to imagine p(x,y) laid out in a two-dimensional table such as a spreadsheet. The marginals are obtained by summing along the rows or the columns, and the sums would then be written in the margins of the table, i.e. in an extra column at the side or an extra row at the bottom.
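The table picture can be made concrete with a short sketch. It assumes a hypothetical layout in which rows correspond to values of X and columns to values of Y; the helper `with_margins` and the numbers are invented for the example.

```python
def with_margins(table):
    """Return a copy of the table with row sums and column sums appended.

    Row sums go in an extra column at the side, column sums in an extra
    row at the bottom; the bottom-right cell holds the grand total.
    """
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    bordered = [row + [total] for row, total in zip(table, row_sums)]
    bordered.append(col_sums + [sum(row_sums)])
    return bordered

# Hypothetical joint table: rows are values of X, columns are values of Y.
joint_table = [
    [0.10, 0.20],
    [0.30, 0.40],
]

bordered = with_margins(joint_table)
for row in bordered:
    print("  ".join(f"{value:.2f}" for value in row))
```

In the printed table, the last column holds the marginal distribution of X (0.30, 0.70), the last row holds the marginal distribution of Y (0.40, 0.60), and the corner entry is the total probability, 1.00.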