Marginal distribution

In probability theory, given two jointly distributed random variables X and Y, the marginal distribution of X is simply the probability distribution of X ignoring information about Y, typically calculated by summing or integrating the joint probability distribution over Y.

For discrete random variables, the marginal probability mass function can be written as Pr(X = x). This is

$\Pr(X=x) = \sum_{y} \Pr(X=x,Y=y) = \sum_{y} \Pr(X=x|Y=y) \Pr(Y=y),$

where Pr(X = x,Y = y) is the joint distribution of X and Y, while Pr(X = x|Y = y) is the conditional distribution of X given Y.
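As an illustration, both forms of the sum above can be sketched in a few lines of Python. The joint table `joint` and the helper names `marginal_x`, `marginal_y` and `marginal_x_via_conditional` are hypothetical, chosen only for this example. Exact fractions are used so that the two forms can be compared without rounding error.

```python
from fractions import Fraction

# Hypothetical joint PMF Pr(X = x, Y = y) for X in {0, 1} and Y in {0, 1, 2},
# stored as a dictionary keyed by (x, y). The six entries sum to 1.
joint = {
    (0, 0): Fraction(1, 10), (0, 1): Fraction(2, 10), (0, 2): Fraction(1, 10),
    (1, 0): Fraction(3, 10), (1, 1): Fraction(1, 10), (1, 2): Fraction(2, 10),
}


def marginal_x(joint):
    """Pr(X = x): sum the joint PMF over all values of y."""
    result = {}
    for (x, _y), p in joint.items():
        result[x] = result.get(x, Fraction(0)) + p
    return result


def marginal_y(joint):
    """Pr(Y = y): sum the joint PMF over all values of x."""
    result = {}
    for (_x, y), p in joint.items():
        result[y] = result.get(y, Fraction(0)) + p
    return result


def marginal_x_via_conditional(joint):
    """Pr(X = x) as the sum over y of Pr(X = x | Y = y) * Pr(Y = y)."""
    p_y = marginal_y(joint)
    result = {}
    for (x, y), p in joint.items():
        conditional = p / p_y[y]  # Pr(X = x | Y = y); assumes Pr(Y = y) > 0
        result[x] = result.get(x, Fraction(0)) + conditional * p_y[y]
    return result


p_x = marginal_x(joint)
print(p_x[0], p_x[1])  # 2/5 3/5
```

With this table, the direct sum and the conditional form give the same marginal, and the marginal probabilities add up to 1.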

Similarly, for continuous random variables, the marginal probability density function can be written as $p_X(x)$. This is

$p_{X}(x) = \int_y p_{X,Y}(x,y) \, dy = \int_y p_{X|Y}(x|y) \, p_Y(y) \, dy,$

where $p_{X,Y}(x,y)$ gives the joint distribution of X and Y, while $p_{X|Y}(x|y)$ gives the conditional distribution of X given Y.
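The continuous case can be sketched numerically in the same way. The joint density below, equal to x + y on the unit square and zero elsewhere, is a hypothetical example, and the function names `joint_density` and `marginal_density_x` are chosen only for this illustration. Integrating it over y by hand gives a marginal density of x + 1/2 for x between 0 and 1, which the simple midpoint rule used here should reproduce closely.

```python
def joint_density(x, y):
    """Hypothetical joint density: x + y on the unit square, 0 elsewhere."""
    if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
        return x + y
    return 0.0


def marginal_density_x(x, y_lo=0.0, y_hi=1.0, steps=1000):
    """Approximate the integral of the joint density over y (midpoint rule)."""
    width = (y_hi - y_lo) / steps
    total = 0.0
    for i in range(steps):
        y_mid = y_lo + (i + 0.5) * width  # midpoint of the i-th subinterval
        total += joint_density(x, y_mid)
    return total * width


# Compare the numerical marginal with the hand-computed value x + 1/2.
for x in (0.0, 0.3, 1.0):
    print(x, marginal_density_x(x), x + 0.5)
```

The integration limits `y_lo` and `y_hi` are assumed to cover the whole range where the density is non-zero; for a density with unbounded support, a different quadrature method would be needed.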

Why the name 'marginal'? One explanation is to imagine the values p(x,y) laid out in a two-dimensional table such as a spreadsheet. The marginals are obtained by summing the rows (or columns), and the sums would then be written in the margin of the table, i.e. in an extra column at the side (or an extra row at the bottom).