linkem.avg(ProGenExpress) | R Documentation |
Uses merge to link together two data frames: the first containing data
on the location of genes in the prokaryotic genome, the second containing
the numerical values (e.g. gene expression data) associated with genes. If duplicate values
for a gene exist in the numerical data, then the mean value is taken.
linkem.avg(genedata = NULL, M = NULL, col1 = NULL, col2 = NULL, cnames = "M")
genedata |
Data frame of gene locations as produced by read.ptt
or read.GenBank. This data frame MUST have numerical
columns named "Start" and "Stop" indicating the locations of genes |
M |
Data frame of numerical values associated with particular genes |
col1 |
The column in genedata to use to link to M |
col2 |
The column in M to use to link to genedata |
cnames |
The column names in M which contain the numerical data of interest |
This is a very short wrapper function around merge, so please read the documentation
for that function. The two data frames should contain at least one common column that
contains gene names. If there is more than one row per gene in M, then the mean value per
gene is taken, so that there is only one row per gene in the resulting data frame.
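The behaviour described above can be approximated in base R. The following is only a minimal sketch, not the package's actual implementation: the data frames, gene names and values are made up for illustration, and it assumes that averaging is done with aggregate before a plain merge.

```r
# Hypothetical gene locations (a stand-in for read.ptt output)
genedata <- data.frame(
  Synonym = c("g1", "g2", "g3"),
  Start   = c(500, 100, 900),
  Stop    = c(800, 400, 1200)
)

# Hypothetical expression values; gene g1 has two rows
M <- data.frame(
  Synonym = c("g1", "g1", "g2", "g3"),
  M4h     = c(1.0, 3.0, 0.5, -1.0)
)

# Collapse duplicates so that each gene has one mean value
M.avg <- aggregate(M["M4h"], by = list(Synonym = M$Synonym), FUN = mean)

# Link on the shared gene-name column, then order by "Start"
linked <- merge(genedata, M.avg, by.x = "Synonym", by.y = "Synonym")
linked <- linked[order(linked$Start), ]
linked
```

In this sketch the two rows for g1 are reduced to a single row holding their mean, and the result is sorted by gene start position, mirroring the roles of col1, col2 and cnames.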
The merged data frame, ordered by column "Start"
Michael Watson
# load the IFR microarray data
data(IFR)

# read in an NCBI .ptt file for S typhimurium
data(STLT2) # produced by 'STLT2 <- read.ptt("NC_003197.ptt")'

# link the genome and the microarray data on Synonym columns
# we want one value per gene and so we average over the values
linked <- linkem.avg(STLT2, IFR, "Synonym", "Synonym", cnames = c("M4h", "M8h", "M12h"))
linked[1:5, ]