find.region (ProGenExpress)    R Documentation

Find regions of interest in prokaryotic genomes

Description

This function looks for regions of interest by searching for clusters of genes that are close together on the genome and which display similar patterns of behaviour in terms of the provided numerical (e.g. microarray) data.

Usage

find.region(linked = NULL, maxdist = 150, cnames = "M", minlength = 4)

Arguments

linked A data frame consisting of gene names, locations on the genome and numerical values to be plotted. Must have numerical columns "Start" and "Stop" indicating the location on the genome of genes or features
maxdist Maximum distance between the stop of one gene and the start of the next for the two genes to be considered "close"
cnames Vector of column names in linked which contain the numerical values to be considered
minlength The minimum number of genes in a region for it to be reported
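
As an illustration of how maxdist might be applied, the sketch below computes the gap between the stop of each gene and the start of the next on a toy data frame. The gene names and coordinates are invented, the genes are assumed to be sorted by Start, and the use of "<=" rather than "<" is an assumption; this is not the package's own code.

```r
# Toy data frame; gene names and coordinates are made up for illustration.
genes <- data.frame(
  Synonym = c("g1", "g2", "g3", "g4"),
  Start   = c(100, 1100, 2300, 5000),
  Stop    = c(1000, 2200, 3000, 6000),
  stringsAsFactors = FALSE
)

# Gap between the stop of each gene and the start of the next one
# (assumes the rows are already ordered by position on the genome).
gap <- genes$Start[-1] - genes$Stop[-nrow(genes)]

# Assumption: "close" means gap <= maxdist; the package may use a strict "<".
maxdist <- 150
close_to_next <- gap <= maxdist
```

Here close_to_next has one entry per pair of adjacent genes, so it is one element shorter than the data frame.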

Details

This function looks for regions of interest by searching for clusters of genes that are close together on the genome and which display similar patterns of behaviour in terms of the provided numerical (e.g. microarray) data.

At present, the numerical data for each gene are coded into a pattern according to whether each value is above or below zero. If this pattern is the same for every gene in a cluster of neighbouring genes, those genes are considered to be behaving similarly.
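
The idea described above could be sketched in base R roughly as follows. This is a minimal illustration on invented data, not the implementation used by find.region: the treatment of values that are exactly zero, the "<=" comparison against maxdist, and the output column names first and last are all assumptions made for the example.

```r
# Toy data; column names follow the Examples section, values are invented.
linked <- data.frame(
  Synonym = paste0("g", 1:6),
  Start   = c(100, 1100, 2200, 3300, 4400, 9000),
  Stop    = c(1000, 2100, 3200, 4300, 5400, 9900),
  M4h     = c(1.2, 0.8, 2.1, 0.5, -0.3, 1.0),
  M8h     = c(-0.4, -1.1, -0.2, -0.9, -0.7, -0.5),
  stringsAsFactors = FALSE
)
cnames    <- c("M4h", "M8h")
maxdist   <- 150
minlength <- 4

# Code each gene as a string of "+" / "-" signs, one per data column.
# Assumption: values of exactly zero are treated as "-" here.
signs   <- ifelse(as.matrix(linked[, cnames]) > 0, "+", "-")
pattern <- apply(signs, 1, paste, collapse = "")

# A gene continues the current run if it is close to the previous gene
# and shares its sign pattern; otherwise it starts a new run.
n      <- nrow(linked)
gap    <- linked$Start[-1] - linked$Stop[-n]
same   <- pattern[-1] == pattern[-n]
run_id <- cumsum(c(TRUE, !(gap <= maxdist & same)))

# Keep runs with at least minlength genes; report first and last Synonym.
runs <- split(linked$Synonym, run_id)
runs <- runs[lengths(runs) >= minlength]
regions <- data.frame(
  first = vapply(runs, function(x) x[1], character(1)),
  last  = vapply(runs, function(x) x[length(x)], character(1)),
  stringsAsFactors = FALSE
)
```

In this toy data set the fifth gene breaks the run because its pattern differs, and the sixth is too far away, so only the first four genes form a region.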

Value

A data frame is returned with the following columns:

start The Synonym of the first gene in the region
comp2 The Synonym of the last gene in the region

Author(s)

Michael Watson

See Also

show.region, linkem.avg, read.ptt

Examples


        # load the IFR microarray data
        data(IFR)

        # load the genome annotation for S. typhimurium (from an NCBI .ptt file)
        data(STLT2) # produced by 'STLT2 <- read.ptt("NC_003197.ptt")'

        # link the genome and the microarray data on Synonym columns
        # we want one value per gene and so we average over the values
        linked <- linkem.avg(STLT2, IFR, "Synonym", "Synonym", cnames = c("M4h", "M8h", "M12h"))

        # find interesting regions
        reg <- find.region(linked, cnames = c("M4h", "M8h", "M12h"))

        # examine the first region
        show.region(linked, start = reg[1, 1], stop = reg[1, 2], cnames = c("M4h", "M8h", "M12h"))

[Package ProGenExpress version 1.0 Index]