R programming language

From Christoph's Personal Wiki

The R programming language (or just "R"), sometimes described as "GNU S", is a mathematical language and environment used for statistical analysis and display.

R is highly extensible through packages, which are user-submitted libraries for specific functions or areas of study. A core set of packages is included with the installation of R, with many more available from the Comprehensive R Archive Network (CRAN). The bioinformatics community has seeded a successful effort to use R for the analysis of data from molecular biology laboratories. The Bioconductor project, started in the fall of 2001, provides R packages for the analysis of genomic data, e.g. object-oriented data handling and analysis tools for Affymetrix and cDNA microarrays.

see scripts


Installing R on SuSE 10.1 using the default settings for the rpm or source distribution seems to be a problem. Below are the methods I have used to resolve these problems.

First make sure you have the following installed (check http://www.rpmfind.net for the packages):


It also sometimes helps to create a soft link to gfortran like so (changing the directory to suit your needs):

ln -s /usr/bin/g77 /usr/bin/gfortran

Then, and this is important, add the following to your config.site (found in your R source directory):


Now you are ready to install R on SuSE:



./configure --x-includes=/usr/include/X11  # sometimes necessary
make         # build R
make check
make pdf     # optional
make info    # optional
make install # as superuser ('root')

That's it. You are now ready to use R.

Comparison with other programs

Although R is mostly used by statisticians and other people in need of statistics, it can also be used as a general matrix calculation toolbox, much like GNU Octave or its proprietary counterpart, MATLAB.

It should not be confused with the R package [1], a collection of programs for multidimensional and spatial analysis available on Macintosh and VAX/VMS systems.


How to get help:

help.start()      #Opens browser
help()            #For more on using help
help(..)          #For help on ..
help.search("..") #To search for ..
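
As a small illustration of finding things without a browser, the sketch below uses apropos(), which lists object names matching a pattern. The pattern "^mean" is only an example chosen here, not something from the original notes.

```r
# List all visible objects whose names start with "mean".
hits <- apropos("^mean")
print(hits)

# The base function mean() should be among the matches.
mean_found <- "mean" %in% hits

# Interactive alternatives (not run here because they open a help page):
#   ?mean          shorthand for help(mean)
#   example(mean)  runs the examples from the help page
```
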

How to leave again:

q() #Image can be saved to .RData

Basic R commands

Most arithmetic operators work like you would expect in R:

4+2 #Prints '6'
3*4 #Prints '12'

Operators have precedence as known from basic algebra:

1+2*4   #Prints '9', while
(1+2)*4 #Prints '12'


A function call in R looks like this:

  • function_name(arguments)
  • Examples:
cos(pi/3) #Prints '0.5'
exp(1)    #Prints '2.718282'

A function call is identified in R by the parentheses.

  • That's why it's help(), and not help
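
A few more calls, sketched here to show that arguments can be passed by name and that a name without parentheses refers to the function object rather than calling it. The functions used (round, log, seq) are standard base R functions picked for illustration.

```r
# Arguments can be given by position or by name.
rounded <- round(3.14159, digits = 2)       # 3.14
log2_of_8 <- log(8, base = 2)               # 3
steps <- seq(from = 1, to = 10, by = 3)     # 1 4 7 10

# Without parentheses, 'exp' is the function object itself, not a call.
exp_is_function <- is.function(exp)         # TRUE
```
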

Variables (objects) in R

To assign a value to a variable (object):

x<-4   #Assigns 4 to x
x=4    #Assigns 4 to x (new)
x      #Prints '4'
y<-x+2 #Assigns 6 to y

Functions for managing variables:

ls() or objects() #Lists all existing objects
str(x)            #Tells the structure (type) of object 'x'
rm(x)             #Removes (deletes) the object 'x'
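
The following is a minimal sketch of these housekeeping functions in use. The object names demo_value and demo_labels are made up for the example.

```r
demo_value <- 4
demo_labels <- c("x", "y")

# ls() returns the names of existing objects as a character vector.
listed_before <- "demo_value" %in% ls()   # TRUE

# str() prints a compact description of an object.
str(demo_labels)                          # chr [1:2] "x" "y"

# rm() deletes the object, so it no longer shows up in ls().
rm(demo_value)
listed_after <- "demo_value" %in% ls()    # FALSE
```
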


A vector in R is like a sequence of elements of the same mode.

x<-1:10           #Creates a vector
y<-c("a","b","c") #So does this

Handy functions for vectors:

c()     #Concatenates arguments into a vector
min(x)  #Returns the smallest value in vector
max(x)  #Returns the largest value in vector
mean(x) #Returns the mean of the vector

Elements in a vector can be accessed individually:

x[1]      #Prints first element
x[1:10]   #Prints first 10 elements
x[c(1,3)] #Prints elements 1 and 3
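
Index vectors do not have to be positive numbers. As a short sketch (the vector values below are chosen only for illustration), negative indices drop elements and logical vectors select the elements where the condition holds.

```r
values <- c(10, 20, 30, 40, 50)

# A negative index removes that position.
without_first <- values[-1]        # 20 30 40 50

# A logical vector keeps the elements where it is TRUE.
large <- values[values > 25]       # 30 40 50
```
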

Most functions expect one vector as argument, rather than individual numbers

mean(1,2,3)    #Replies '1'
mean(c(1,2,3)) #Replies '2'
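
The surprising result of mean(1,2,3) comes from argument matching: only the first argument is treated as data. The sketch below contrasts this with sum(), which does accept individual numbers. It assumes the default method mean.default(x, trim = 0, na.rm = FALSE, ...), where the extra values are matched to trim and na.rm.

```r
values <- c(1, 2, 3)

# The whole vector is the data: the mean is 2.
vector_mean <- mean(values)

# Only '1' is the data here; 2 and 3 are matched to 'trim' and 'na.rm'.
single_mean <- mean(1, 2, 3)   # 1

# sum() takes any number of arguments and adds them all.
total <- sum(1, 2, 3)          # 6
```
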

The Recycling Rule

The recycling rule is a key concept for vector algebra in R.

When a vector is too short for a given operation, the elements are recycled and used again.

Examples of vectors that are too short:

x<-c(1,2,3,4)
y<-c(1,2) #y is too short
x+y       #Returns '2,4,4,6'
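
A slightly longer sketch of recycling, with made-up vectors. It also shows what happens when the longer length is not a multiple of the shorter one: R still recycles, but issues a warning, which is captured here with tryCatch() so the script keeps running.

```r
long <- c(1, 2, 3, 4)
short <- c(10, 20)

# 'short' is reused: (1+10, 2+20, 3+10, 4+20)
recycled <- long + short           # 11 22 13 24

# A single number is recycled across the whole vector.
doubled <- long * 2                # 2 4 6 8

# Lengths 3 and 4 do not divide evenly, so R warns.
uneven <- tryCatch(c(1, 2, 3) + long,
                   warning = function(w) conditionMessage(w))
print(uneven)
```
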



All simple numerical objects in R behave like a long string of numbers. In fact, even the simple x<-1 can be thought of as a vector with one element.

The functions dim(x) and str(x) return information on the dimensionality of x.
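
To make this concrete, the sketch below checks that a single number really is a vector of length one, and compares dim() and str() on a plain vector and on a small matrix. The object names are chosen for the example.

```r
single <- 1
single_is_vector <- is.vector(single)   # TRUE
single_length <- length(single)         # 1
single_dim <- dim(single)               # NULL: plain vectors have no dim

grid <- matrix(1:6, nrow = 2)
grid_dim <- dim(grid)                   # 2 3
str(grid)                               # int [1:2, 1:3] 1 2 3 4 5 6
```
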

Important Objects

vector:     A series of numbers
matrix:     Tables of numbers
data.frame: More 'powerful' matrix (list of vectors)
list:       Collections of other objects
class:      Intelligent(?) lists
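
As a rough sketch of how some of these objects are built, the example below creates a vector, a matrix, a data frame and a list. All names and values are invented for illustration.

```r
scores <- c(1.5, 2.5, 3.5)                        # a series of numbers
table_of_numbers <- matrix(1:6, nrow = 2)         # a table of numbers
results <- data.frame(id = 1:3, score = scores)   # columns are vectors
bundle <- list(numbers = scores,                  # a collection of other objects
               table = table_of_numbers,
               label = "demo")

# A data frame is itself a list of equally long vectors.
frame_is_list <- is.list(results)                 # TRUE

# List elements are accessed by name with $.
bundle$label                                      # "demo"
```
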

Data Matrices

Matrices are created with the matrix() function.

m<-matrix(1:12,nrow=3)
#This produces something like this:
#     [,1] [,2] [,3] [,4]
#[1,]    1    4    7   10
#[2,]    2    5    8   11
#[3,]    3    6    9   12
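
The printed matrix shows that matrix() fills column by column. As a short sketch, the byrow argument of matrix() switches to filling row by row instead.

```r
by_col <- matrix(1:12, nrow = 3)                 # default: fill columns first
by_row <- matrix(1:12, nrow = 3, byrow = TRUE)   # fill rows first

first_row_by_col <- by_col[1, ]   # 1 4 7 10
first_row_by_row <- by_row[1, ]   # 1 2 3 4
```
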

The recycling rule still applies:

matrix(c(2,5),nrow=3,ncol=3)
#Gives the following matrix (R also warns that 2 values do not fit evenly into 3 rows):
#     [,1] [,2] [,3]
#[1,]    2    5    2
#[2,]    5    2    5
#[3,]    2    5    2

Indexing Matrices

For vectors we could specify one index vector like this:

x[c(1,3)] #Returns elements 1 and 3

For matrices we have to specify two vectors:

m[c(1,3),c(1,3)] #Returns a 2*2 matrix
m[1,] #First row as vector
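
A few more indexing forms, sketched on a 3*4 matrix built from 1:12 (an assumption made here so the example is self-contained). Note that selecting a single row or column returns a plain vector unless drop = FALSE is given.

```r
m <- matrix(1:12, nrow = 3)

one_element <- m[2, 3]               # row 2, column 3: 8
second_col <- m[, 2]                 # second column as a vector: 4 5 6

# drop = FALSE keeps the result as a 1*4 matrix instead of a vector.
row_matrix <- m[1, , drop = FALSE]
row_matrix_dim <- dim(row_matrix)    # 1 4
```
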

Beyond two dimensions

You can actually assign to dim():

x<-1:12          #A vector of 12 numbers
dim(x)           #Returns 'NULL'
dim(x)<-c(3,4)   #3*4 Matrix
dim(x)           #Returns '3 4'
dim(x)<-c(2,3,2) #x is now in 3d
dim(x)           #Returns '2 3 2'

But functions like mean() still work:

mean(x) #Returns '6.5'
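
Once an object has three dimensions it is indexed with three index vectors. The sketch below assumes a 2*3*2 array built from 1:12 and uses apply() to compute the mean of each 2*3 slice.

```r
a <- 1:12
dim(a) <- c(2, 3, 2)

# Row 1, column 2 of the second slice.
one_element <- a[1, 2, 2]            # 9

# The complete first slice is an ordinary 2*3 matrix.
first_slice <- a[, , 1]

# Apply mean() over the third dimension, one value per slice.
slice_means <- apply(a, 3, mean)     # 3.5 9.5
```
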

Graphics and visualisation

Visualization is one of R's strong points.

R has many functions for drawing graphs, including:

hist(x)   #Draws a histogram of values in x
plot(x,y) #Draws a basic xy plot of x against y

Adding stuff to plots:

points(x,y) #Add point (x,y) to existing graph.
lines(x,y)  #Connect points with line.

Graphical devices

A graphical device is what 'displays' the graph. It can be a window, it can be the printer.

Functions for plotting "Devices":

X11() #Opens a new plotting window; its arguments (e.g. width and height) let you set the size of the window.
par(mfrow=c(x,y)) #Splits a plotting device into x rows and y columns.
dev.print(postscript, file="???.ps") #Use this device to save the plot to a file.
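
Putting the plotting and device functions together, here is a minimal sketch that draws two plots side by side into a PDF file instead of a window. It uses the pdf() device and a temporary file name, both chosen here so the example runs without a display; the data are made up.

```r
out_file <- tempfile(fileext = ".pdf")

pdf(out_file, width = 7, height = 4)   # open a file device
par(mfrow = c(1, 2))                   # 1 row, 2 columns of plots

x <- seq(0, 2 * pi, length.out = 50)
hist(sin(x), main = "Histogram")       # left panel
plot(x, sin(x), main = "Sine")         # right panel
lines(x, cos(x), lty = 2)              # add a dashed cosine curve

dev.off()                              # close the device, writing the file
file_written <- file.exists(out_file)
```

Closing the device with dev.off() is what actually finishes the file, so it should not be skipped.
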

Packages (add-ons)

To install packages from the CLI, execute the following:

R CMD INSTALL /path/to/pkg_version.tar.gz
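
Packages can also be handled from inside an R session. The sketch below is only an illustration: "pkgname" is a hypothetical placeholder, and the install line is commented out because it needs network access. The stats package is used for the runnable part since it ships with R.

```r
# Download and install from CRAN (hypothetical package name, not run):
# install.packages("pkgname")

# Check whether a package is installed without attaching it.
has_stats <- requireNamespace("stats", quietly = TRUE)

# Attach the package so its functions can be called directly.
library(stats)
stats_attached <- "package:stats" %in% search()
```
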
