Link together two data frames, one representing the genome, the other representing numerical data

linkem.avg(ProGenExpress)

R Documentation

Link together two data frames, one representing the genome, the other representing numerical data

Description

Uses merge to link together two data frames: the first contains data on the location of genes in the prokaryotic genome, and the second contains the numerical values (e.g. gene expression data) associated with those genes. If duplicate values for a gene exist in the numerical data, the mean value is taken.

Usage

linkem.avg(genedata = NULL, M = NULL, col1 = NULL, col2 = NULL, cnames = "M")

Arguments

`genedata`	Data frame of gene locations as produced by `read.ptt` or `read.GenBank`. This data frame MUST have numerical columns named "Start" and "Stop" indicating the locations of genes
`M`	Data frame of numerical values associated with particular genes
`col1`	The column in genedata to use to link to M
`col2`	The column in M to use to link to genedata
`cnames`	The column names in M which contain the numerical data of interest
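
The requirements on the arguments listed above can be checked before calling the function. The following is a minimal sketch in base R; `check_linkem_inputs` and the toy data frames `genes` and `expr` are hypothetical names introduced here for illustration and are not part of ProGenExpress.

```r
# Hypothetical helper (not part of ProGenExpress): checks the documented
# requirements on the arguments and stops with an error if one is not met.
check_linkem_inputs <- function(genedata, M, col1, col2, cnames) {
  stopifnot(is.data.frame(genedata), is.data.frame(M))
  # genedata must have numerical "Start" and "Stop" columns
  stopifnot(all(c("Start", "Stop") %in% names(genedata)))
  stopifnot(is.numeric(genedata$Start), is.numeric(genedata$Stop))
  # the linking columns must exist in their respective data frames
  stopifnot(col1 %in% names(genedata), col2 %in% names(M))
  # the numerical columns of interest must exist in M
  stopifnot(all(cnames %in% names(M)))
  invisible(TRUE)
}

# Toy data, made up for this sketch
genes <- data.frame(Synonym = c("g1", "g2"),
                    Start   = c(100, 400),
                    Stop    = c(300, 900),
                    stringsAsFactors = FALSE)
expr  <- data.frame(Synonym = c("g1", "g2"),
                    M4h     = c(0.5, 1.5),
                    stringsAsFactors = FALSE)

ok <- check_linkem_inputs(genes, expr, "Synonym", "Synonym", "M4h")
```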

Details

This is a very short wrapper function around merge, so please read the documentation for that function. The two data frames should share at least one column that contains gene names. If M has more than one row per gene, the mean value per gene is taken, because the resulting data frame must contain only one row per gene.
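
The behaviour described above (average duplicates, merge, order by "Start") can be approximated in base R as follows. This is only a sketch under stated assumptions, not the package's actual source: it uses `aggregate` for the averaging step, and the data frames `genedata_toy` and `M_toy` are made-up examples.

```r
# Toy gene locations, deliberately not sorted by Start (made-up values)
genedata_toy <- data.frame(Synonym = c("g1", "g2", "g3"),
                           Start   = c(500, 100, 300),
                           Stop    = c(900, 250, 450),
                           stringsAsFactors = FALSE)

# Toy numerical data with duplicate rows for g1 and g3
M_toy <- data.frame(Synonym = c("g1", "g1", "g2", "g3", "g3"),
                    M4h     = c(1.0, 3.0, 0.5, 2.0, 4.0),
                    stringsAsFactors = FALSE)

# Step 1: collapse duplicate rows to one mean value per gene
M_avg <- aggregate(M_toy["M4h"],
                   by  = list(Synonym = M_toy$Synonym),
                   FUN = mean)

# Step 2: link the two data frames on the shared gene-name column
linked_sketch <- merge(genedata_toy, M_avg,
                       by.x = "Synonym", by.y = "Synonym")

# Step 3: order the result by the "Start" column
linked_sketch <- linked_sketch[order(linked_sketch$Start), ]
```

In this sketch each gene appears exactly once in `linked_sketch`, and the rows follow genome order rather than the alphabetical order that merge produces.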

Value

The merged data frame, ordered by column "Start"

Author(s)

Michael Watson

Examples


# load the IFR microarray data
data(IFR)

# read in an NCBI .ptt file for S. typhimurium
data(STLT2) # produced by 'STLT2 <- read.ptt("NC_003197.ptt")'

# link the genome and the microarray data on Synonym columns
# we want one value per gene and so we average over the values
linked <- linkem.avg(STLT2,IFR,"Synonym","Synonym",cnames=c("M4h","M8h","M12h"))
linked[1:5,]

[Package ProGenExpress version 1.0 Index]