Plot numerical values on a genome, setting the range by base number

plotrange(ProGenExpress)

R Documentation

Description

Draws a plot of numerical values along a prokaryotic genome, using base positions to select the region to plot.

Usage

plotrange(linked = NULL, cnames = "M", start = 1, stop = 10000, gname = "Gene", display.gname = NULL,
          highlight = NULL, col.cogs = NULL, cogs.color.column = "Color", cogs.legend = TRUE,
          greyscale = FALSE, distscale = NULL, genomecol = "grey", genenames = TRUE)

Arguments

`linked`	A data frame consisting of gene names, locations on the genome and numerical values to be plotted. Must have numerical columns "Start" and "Stop" indicating the location on the genome of genes or features
`cnames`	Vector of column names in linked which contain the numerical values to be plotted
`start`	The base number identifying the start of the region to be plotted
`stop`	The base number identifying the end of the region to be plotted
`gname`	The name of the column in linked to which the rows in highlight correspond (e.g. "Gene" or "Synonym")
`display.gname`	The name of the column in linked containing the gene labels for the plot (e.g. "Gene" or "Synonym"). Defaults to the same as gname if NULL
`highlight`	A data frame of regions to highlight, each row corresponding to one region. Must have at least columns named "start" and "stop", containing the gene names in linked (from the column identified by the gname argument above) that border the region to be highlighted. May also contain columns "name" and "color", which give a label and a color for each highlighted region respectively
`col.cogs`	Controls whether the plot is colored numerically or by COGs category. If NULL, the data is colored numerically ("red" for positive numbers, "green" for negative numbers). Otherwise this should be a data frame with at least two columns, one named "Description" containing descriptions of the COGs categories and the other named "Color" containing the color to be assigned to the corresponding COGs category
`cogs.color.column`	The name of the column containing colors in col.cogs. Defaults to "Color"
`cogs.legend`	A logical value indicating whether or not to draw the COGs legend
`greyscale`	Ignored if col.cogs is a data frame. Otherwise this controls whether "red" and "green" are replaced by "gray25" and "gray75" when the data is colored numerically
`distscale`	Controls whether the distance between groups of bars represents intergenic distance. If NULL, intergenic distance is ignored and groups of bars are plotted equally spaced along the genome. If set to any other value, the distance between genes A and B is set to ((number of bp between A and B) / distscale) + 1, where the unit is the width of one bar. This ensures that there is always a gap of at least 1 between groups of bars representing distinct genes. A low value of distscale strongly emphasizes intergenic distance, whereas a high value does not. A value between 10 and 100 is recommended for most prokaryotic genomes, although this is a matter of preference. Defaults to NULL.
`genomecol`	The color of the background line representing the genome.
`genenames`	Controls whether gene names are plotted on the genome or not. Defaults to TRUE
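
The shapes of the linked and highlight data frames described above can be sketched as follows. This is only an illustration: the gene names, synonyms, coordinates and values are invented, and only the column names follow the argument descriptions. The checks at the end are a suggestion and are not something plotrange is known to perform itself.

```r
# Hypothetical toy input; every name, coordinate and value here is made up.
linked <- data.frame(
  Gene    = c("geneA", "geneB", "geneC"),
  Synonym = c("X0001", "X0002", "X0003"),
  Start   = c(100, 1500, 4200),    # numeric start position of each gene
  Stop    = c(1200, 3900, 5100),   # numeric stop position of each gene
  M       = c(1.8, -0.6, 2.3),     # values to plot (cnames = "M")
  stringsAsFactors = FALSE
)

# One region to highlight, bordered by values from the gname column ("Gene").
# The optional "name" and "color" columns label and color the region.
highlight <- data.frame(
  start = "geneA", stop = "geneB",
  name = "example region", color = "yellow",
  stringsAsFactors = FALSE
)

# Sanity checks mirroring the documented requirements.
stopifnot(is.numeric(linked$Start), is.numeric(linked$Stop))
stopifnot(all(c("start", "stop") %in% names(highlight)))
stopifnot(all(highlight$start %in% linked$Gene),
          all(highlight$stop %in% linked$Gene))
```

With input of this shape, a call such as plotrange(linked, start = 1, stop = 6000, cnames = "M", highlight = highlight) would be the natural next step, assuming the toy data is otherwise acceptable to the function.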

Details

Internally, the function uses PGE.barplot.
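
To get a feel for the distscale argument, the spacing formula given under Arguments can be evaluated for a few values. The helper gap_width below is hypothetical and is not part of ProGenExpress; it only restates the documented formula, and the intergenic distances are invented.

```r
# Gap between two adjacent genes, in units of one bar width, following
# the documented formula: (bp between A and B) / distscale + 1.
# gap_width is an illustrative helper, not a ProGenExpress function.
gap_width <- function(bp_between, distscale) {
  bp_between / distscale + 1
}

bp <- c(0, 50, 300, 1000)   # made-up intergenic distances in bp

# One column per distscale value, one row per intergenic distance.
gaps <- sapply(c(1, 10, 100), function(d) gap_width(bp, d))
colnames(gaps) <- c("distscale=1", "distscale=10", "distscale=100")
rownames(gaps) <- paste0(bp, "bp")
print(gaps)
```

For a 1000 bp gap, distscale = 1 gives a spacing of 1001 bar widths, while distscale = 100 gives 11, which is why low values can stretch the plot considerably.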

Value

A graph is plotted on the current device
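
Because the plot goes to the current device, it can be written to a file by opening a file device first. The sketch below uses the standard pdf() and dev.off() functions from grDevices; barplot() with invented values stands in for the plotrange call so that the example runs without ProGenExpress installed.

```r
# Open a PDF device, draw, then close the device to flush the file.
out <- file.path(tempdir(), "plotrange_example.pdf")
pdf(out, width = 10, height = 4)

# Stand-in for a plotrange(...) call; the values and labels are made up.
barplot(c(1.8, -0.6, 2.3), names.arg = c("geneA", "geneB", "geneC"))

dev.off()
file.exists(out)
```

A wide, short page is used here as a guess at a suitable shape for a genome track; the dimensions are a matter of preference.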

Author(s)

Michael Watson

References

Tatusov RL, Fedorova ND, Jackson JD, Jacobs AR, Kiryutin B, Koonin EV, Krylov DM, Mazumder R, Mekhedov SL, Nikolskaya AN, Rao BS, Smirnov S, Sverdlov AV, Vasudevan S, Wolf YI, Yin JJ, Natale DA. The COG database: an updated version includes eukaryotes. BMC Bioinformatics. 2003 Sep 11;4(1):41.

Examples


        # load the IFR microarray data and the S typhimurium genome
        data(IFR)
        data(STLT2) # produced by 'STLT2 <- read.ptt("NC_003197.ptt")'

        # link the genome and the microarray data
        # we want one value per gene and so we average over the values
        linked <- linkem.avg(STLT2,IFR,"Synonym","Synonym",cnames=c("M4h","M8h","M12h"))

        # plot the 4h results on a section of the genome from 1-50000
        plotrange(linked,start=1,stop=50000,cnames="M4h")

        # plot the same but take into account intergenic distance
        plotrange(linked,start=1,stop=50000,cnames="M4h", distscale=1)

        # plot the same but more sensibly
        plotrange(linked,start=1,stop=50000,cnames="M4h", distscale=100)

        # plot the same again but highlight genes from bcfA to bcfF
        highlight <- data.frame(start="bcfA",stop="bcfF")
        plotrange(linked,start=1,stop=50000,cnames="M4h",highlight=highlight)

        # plot the same again but display Synonym instead of Gene name
        plotrange(linked,start=1,stop=50000,cnames="M4h", display.gname="Synonym")

        # plot the same again but select the highlighted region on Synonym rather than
        # gene name
        highlight <- data.frame(start="STM0016",stop="STM0029")
        plotrange(linked,start=1,stop=50000,cnames="M4h", display.gname="Gene",gname="Synonym", 
                        highlight=highlight)
        plotrange(linked,start=1,stop=50000,cnames="M4h", display.gname="Synonym",gname="Synonym", 
                        highlight=highlight)

        # plot 4h, 8h and 12h data for all genes in the same region
        plotrange(linked,start=1,stop=50000,cnames=c("M4h","M8h","M12h"))

        data(cogstab) # produced by :
                          # cogstab <- read.table("cogs_category.txt", sep="\t", 
                          #                             header=TRUE, quote=NULL, colClasses="character")
                          # cogstab <- data.frame(cogstab,Color=I(rainbow(nrow(cogstab))))

        # this time colour the genes by COGs category
        plotrange(linked,start=1,stop=50000,cnames=c("M4h","M8h","M12h"),col.cogs=cogstab)

[Package ProGenExpress version 1.0 Index]