# Marginal distribution

In probability theory, given two jointly distributed random variables *X* and *Y*, the **marginal distribution** of *X* is simply the probability distribution of *X* ignoring information about *Y*, typically calculated by summing or integrating the joint probability distribution over *Y*.

For discrete random variables, the marginal probability mass function can be written as Pr(*X* = *x*). It is given by

- <math>\Pr(X=x) = \sum_{y} \Pr(X=x,Y=y) = \sum_{y} \Pr(X=x|Y=y) \Pr(Y=y),</math>

where Pr(*X* = *x*,*Y* = *y*) is the joint distribution of *X* and *Y*, while Pr(*X* = *x*|*Y* = *y*) is the conditional distribution of *X* given *Y*.
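
As a small illustration of the discrete formula, the following Python sketch computes the marginal of *X* both ways: by summing the joint distribution over *y*, and by summing the conditional distribution weighted by Pr(*Y* = *y*). The joint probabilities and the names `joint`, `marginal_x`, `marginal_y` and `marginal_x_via_conditional` are invented for this example. The second function assumes Pr(*Y* = *y*) > 0 for every *y* that appears.

```python
from fractions import Fraction as F

# Hypothetical joint distribution Pr(X = x, Y = y); the numbers are illustrative only.
joint = {
    ("a", 0): F(1, 10), ("a", 1): F(3, 10),
    ("b", 0): F(2, 10), ("b", 1): F(4, 10),
}

def marginal_x(joint):
    """Pr(X = x), obtained by summing the joint probabilities over y."""
    result = {}
    for (x, _y), p in joint.items():
        result[x] = result.get(x, 0) + p
    return result

def marginal_y(joint):
    """Pr(Y = y), obtained by summing the joint probabilities over x."""
    result = {}
    for (_x, y), p in joint.items():
        result[y] = result.get(y, 0) + p
    return result

def marginal_x_via_conditional(joint):
    """Pr(X = x) as the sum over y of Pr(X = x | Y = y) * Pr(Y = y)."""
    p_y = marginal_y(joint)
    result = {}
    for (x, y), p in joint.items():
        conditional = p / p_y[y]  # Pr(X = x | Y = y); assumes Pr(Y = y) > 0
        result[x] = result.get(x, 0) + conditional * p_y[y]
    return result
```

Exact fractions are used so that the two routes can be compared with `==` rather than with a floating-point tolerance.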

Similarly, for continuous random variables, the marginal probability density function can be written as *p*_{X}(*x*). It is given by

- <math>p_{X}(x) = \int_y p_{X,Y}(x,y) \, dy = \int_y p_{X|Y}(x|y) \, p_Y(y) \, dy,</math>

where *p*_{X,Y}(*x*,*y*) gives the joint distribution of *X* and *Y*, while *p*_{X|Y}(*x*|*y*) gives the conditional distribution for *X* given *Y*.
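
The continuous case can be sketched numerically in the same spirit. The example below assumes a made-up joint density *p*(*x*,*y*) = *x* + *y* on the unit square, for which the exact marginal is *p*_{X}(*x*) = *x* + 1/2 on [0, 1]. The function names `joint_density` and `marginal_density_x`, and the choice of the midpoint rule, are assumptions of this illustration, not part of any standard method.

```python
def joint_density(x, y):
    """Hypothetical joint density: p(x, y) = x + y on the unit square, 0 elsewhere."""
    if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
        return x + y
    return 0.0

def marginal_density_x(x, n=1000):
    """Approximate p_X(x) by integrating p(x, y) over y in [0, 1].

    Uses the midpoint rule with n equal subintervals.
    """
    h = 1.0 / n
    return h * sum(joint_density(x, (k + 0.5) * h) for k in range(n))
```

For instance, `marginal_density_x(0.3)` should agree with the exact value 0.3 + 0.5 up to floating-point rounding, since the midpoint rule is exact for integrands that are linear in *y*.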

Why the name 'marginal'? One explanation is to imagine the values *p*(*x*,*y*) laid out in a two-dimensional table such as a spreadsheet. The marginals are obtained by summing along the rows (or columns); each row sum would then be written in the margin of the table, i.e. in an extra column at the side of the table (and each column sum in an extra row at the bottom).
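
The table picture can be made concrete with a short Python sketch. The 2×2 table of probabilities is hypothetical, and so is the convention that rows index the values of *X* and columns the values of *Y*.

```python
# Hypothetical joint table: rows are values of X, columns are values of Y.
table = [
    [0.10, 0.30],
    [0.20, 0.40],
]

# Row sums go in the right-hand margin: the marginal distribution of X.
row_sums = [sum(row) for row in table]

# Column sums go in the bottom margin: the marginal distribution of Y.
col_sums = [sum(col) for col in zip(*table)]

# Print the table with its margins; the bottom-right corner is the grand total.
for row, total in zip(table, row_sums):
    print("  ".join(f"{p:.2f}" for p in row), "|", f"{total:.2f}")
print("-" * 17)
print("  ".join(f"{p:.2f}" for p in col_sums), "|", f"{sum(row_sums):.2f}")
```

The grand total in the corner should be 1 (up to rounding), since both margins are themselves probability distributions.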