1 Plotting chromosomes I
This guide shows the data needed to plot idiograms of measured karyotypes and, optionally, marks.
1.1 Load package
Visit GitLab for installation instructions: https://gitlab.com/ferroao/idiogramFISH
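Once the package is installed, it can be attached in the usual way. The following is a minimal sketch that attaches it only when it is present, so a script does not stop on a machine where installation is still pending; the variable name pkgAvailable is just illustrative.

```r
# Check whether idiogramFISH is installed before attaching it
pkgAvailable <- requireNamespace("idiogramFISH", quietly = TRUE)

if (pkgAvailable) {
  library(idiogramFISH)
} else {
  # Installation itself is described on the GitLab page linked above
  message("idiogramFISH is not installed yet")
}
```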
1.2 Get your chromosome size data
First, put your chromosome data in a data.frame.
From scratch:
# Example data.frame to write in R, use: (column OTU is optional if only 1 OTU)
mydfChrSize<-read.table(text=
" OTU chrName shortArmSize longArmSize
\"Species one\" 1 1.5 2.0
\"Species one\" 2 2.0 2.5
\"Species one\" 3 1.0 1.5
\"Species one\" B 2.0 3.5" , header=TRUE, stringsAsFactors=FALSE,fill=TRUE)
OTU | chrName | shortArmSize | longArmSize |
---|---|---|---|
Species one | 1 | 1.5 | 2.0 |
Species one | 2 | 2.0 | 2.5 |
Species one | 3 | 1.0 | 1.5 |
Species one | B | 2.0 | 3.5 |
Loading saved data:
If you use RStudio, choose your folder with “Set working directory” in the “Session” menu, or call setwd() from the console.
Then import your chromosome data into a data.frame from a .csv file (read.csv) or an .xls file (readxl package).
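As a rough illustration of loading saved data, the sketch below writes a small .csv file to a temporary folder and reads it back with read.csv. The file name and the object names exampleChrSize and mydfChrSizeFromCsv are hypothetical; with your own data only the read.csv call is needed.

```r
# Write a small example file to a temporary folder (hypothetical file name)
csvFile <- file.path(tempdir(), "mydfChrSize.csv")
exampleChrSize <- data.frame(
  OTU          = "Species one",
  chrName      = c("1", "2", "3", "B"),
  shortArmSize = c(1.5, 2.0, 1.0, 2.0),
  longArmSize  = c(2.0, 2.5, 1.5, 3.5),
  stringsAsFactors = FALSE
)
write.csv(exampleChrSize, csvFile, row.names = FALSE)

# Read it back; stringsAsFactors = FALSE keeps text columns as character
mydfChrSizeFromCsv <- read.csv(csvFile, stringsAsFactors = FALSE)
str(mydfChrSizeFromCsv)
```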
Editing a data.frame:
For fixing column names use:
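A minimal sketch of renaming all columns at once, assuming the imported data.frame has its four columns in the expected order but under different names (myRawChrSize and its original column names are made up for the example):

```r
# A data.frame whose columns were named differently in the source file
myRawChrSize <- data.frame(
  species    = "Species one",
  chromosome = c("1", "2"),
  short      = c(1.5, 2.0),
  long       = c(2.0, 2.5),
  stringsAsFactors = FALSE
)

# Assign the column names used throughout this guide, in positional order
colnames(myRawChrSize) <- c("OTU", "chrName", "shortArmSize", "longArmSize")
colnames(myRawChrSize)
```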
1.3 Get marks general data
This data.frame is optional.
Put your mark data in a data.frame. This data.frame lists the marks present in all karyotypes, without position info. If the style column is not present, the parameter defaultStyleMark = "square" will be used during plotting.
# From scratch:
mydfMarkColor<-read.table(text=
" markName markColor style
5S red dots
45S green square
DAPI blue cM # <- new style
CMA yellow square
\"B mark\" black square" , header=TRUE, stringsAsFactors=FALSE,fill=TRUE)
markName | markColor | style |
---|---|---|
5S | red | dots |
45S | green | square |
DAPI | blue | cM |
CMA | yellow | square |
B mark | black | square |
For fixing column names use:
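One possible way to do this, when only some columns are misnamed, is to rename them individually by their current name. The data.frame myRawMarkColor and its original column names are hypothetical.

```r
# Mark data imported with non-standard column names (hypothetical)
myRawMarkColor <- data.frame(
  mark   = c("5S", "45S"),
  colour = c("red", "green"),
  stringsAsFactors = FALSE
)

# Rename each column by matching its current name
names(myRawMarkColor)[names(myRawMarkColor) == "mark"]   <- "markName"
names(myRawMarkColor)[names(myRawMarkColor) == "colour"] <- "markColor"
names(myRawMarkColor)
```

No style column is added here; as stated above, defaultStyleMark is then used during plotting.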
1.4 Get marks positions data
Open or write your mark positions in a data.frame. This data.frame holds the marks present in all karyotypes, with position info, including the centromeric marks.
# We will use column OTU in this data.frame because the chromosome size data.frame has it
mydfOfMarks<-read.table(text=
" OTU chrName markName chrRegion markSize markDistCen
\"Species one\" 1 45S p NA NA # no measure means whole arm
\"Species one\" 1 5S q 0.5 0.5
\"Species one\" B \"B mark\" w NA NA # w for whole chromosome
\"Species one\" 2 45S p 1 1.0
\"Species one\" 3 DAPI q 1 1.0
\"Species one\" 1 DAPI cen
\"Species one\" 3 CMA cen", header=TRUE, stringsAsFactors=FALSE,fill=TRUE)
OTU | chrName | markName | chrRegion | markSize | markDistCen |
---|---|---|---|---|---|
Species one | 1 | 45S | p | NA | NA |
Species one | 1 | 5S | q | 0.5 | 0.5 |
Species one | B | B mark | w | NA | NA |
Species one | 2 | 45S | p | 1.0 | 1.0 |
Species one | 3 | DAPI | q | 1.0 | 1.0 |
Species one | 1 | DAPI | cen | NA | NA |
Species one | 3 | CMA | cen | NA | NA |
For fixing column names use something like:
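A sketch using a lookup vector, which does not depend on the column order of the imported file. The object rawMarks, its original column names, and the lookup vector are assumptions made for illustration.

```r
# Mark positions imported with other column names, in arbitrary order (hypothetical)
rawMarks <- data.frame(
  chr     = c("1", "2"),
  species = "Species one",
  mark    = c("5S", "45S"),
  region  = c("q", "p"),
  size    = c(0.5, 1),
  dist    = c(0.5, 1),
  stringsAsFactors = FALSE
)

# Map old names to the names expected in this guide
lookup <- c(species = "OTU", chr = "chrName", mark = "markName",
            region = "chrRegion", size = "markSize", dist = "markDistCen")
colnames(rawMarks) <- unname(lookup[colnames(rawMarks)])
colnames(rawMarks)
```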
1.5 Add some text to the right
For ver. > 1.7
# We will use column note to add a note to the right of the karyotype of the OTU in column OTU
notesdf<-read.table(text=
" OTU note
\"Species one\" \"Author notes\" ", header=TRUE, stringsAsFactors=FALSE,fill=TRUE)
To add just the OTU name (from column OTU of the chr. size data.frame) to the right, use OTUasNote=TRUE.
1.6 Plotting
You can plot without marks (using only the first data.frame), but we will use all 4 data.frames created. By default the function calculates indices (Romero-Zarco, 1986; Watanabe et al., 1999) and the morphological categories of Guerra (1986) and Levan et al. (1964). Use the parameters chrIndex and morpho of the function plotIdiograms to modify that. See ?plotIdiograms.
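To give an idea of what per-chromosome indices look like, the sketch below computes the arm ratio (long arm / short arm) and the centromeric index (short arm as a percentage of total length) in base R, using the usual textbook definitions. It is only an illustration with the example sizes from section 1.2; it is not the package's own code, whose rounding and details may differ, and the object and column names are made up.

```r
# Chromosome sizes as in the example of section 1.2
chrSizeExample <- data.frame(
  chrName      = c("1", "2", "3", "B"),
  shortArmSize = c(1.5, 2.0, 1.0, 2.0),
  longArmSize  = c(2.0, 2.5, 1.5, 3.5),
  stringsAsFactors = FALSE
)

# Arm ratio: long arm divided by short arm
chrSizeExample$armRatio <- chrSizeExample$longArmSize / chrSizeExample$shortArmSize

# Centromeric index: short arm as a percentage of the whole chromosome
chrSizeExample$centromericIndex <- 100 * chrSizeExample$shortArmSize /
  (chrSizeExample$shortArmSize + chrSizeExample$longArmSize)

print(round(chrSizeExample[, c("armRatio", "centromericIndex")], 2))
```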
The cM style of mark always adds the name as if legend="inline", even when legend="aside" (default).
# svg("mydfChrSize.svg",width=12,height=6 )
par(mar = c(0, 0, 0, 0))
plotIdiograms(dfChrSize= mydfChrSize, # chr. size data.frame
dfMarkPos= mydfOfMarks, # mark position data.frame (inc. cen.)
dfMarkColor=mydfMarkColor, # mark style d.f.
distTextChr = .7, # separation among text and chr names and ind.
orderBySize = FALSE, # do not order chr. by size
karHeiSpace=1.6, # vertical size of karyotype including spacer
fixCenBorder = TRUE # use chrColor as border color of cen. or cen. marks
,legendWidth = .8 # legend item width
,legendHeight = .5 # legend item height
,markLabelSpacer = 2 # legend spacer
,rulerPos=0, # ruler position
ruler.tck=-0.01, # ticks of ruler size and orientation
notes=notesdf # data.frame with notes NEW
,notesTextSize = 1.3 # font size of notes
,notesPos = .2 # space from chr. (right) to note
,ylimBotMod = 1 # modify ylim bottom argument
,ylimTopMod = 0 # modify ylim top argument
,xlimLeftMod = 2 # modify left xlim
,xlimRightMod = 3 # modify right xlim
,lwd.cM = 2 # thickness of cM marks
)
Vertices are rounded when centromereSize=0:
png("mydfChrSize2.png", width=550, height=550)
par(mar = c(0, 0, 0, 0))
plotIdiograms(dfChrSize = bigdfOfChrSize[1:8,], # chr. size data.frame
dfMarkColor = mydfMarkColor,# mark style df
dfMarkPos = bigdfOfMarks, # mark position df
centromereSize = 0, # <- HERE
roundness=3, # vertices roundness
chrSpacing = .7, # space among chr.
karHeight = 2, # karyotype rel. height
karHeiSpace=4, # vertical size of karyotype including spacer
amoSepar= 2.5, # separation among karyotype
indexIdTextSize=.8, # font size of chr. name and indices
karIndexPos = .1, # position of kar. index
markLabelSize=.7, # font size of mark legends
fixCenBorder = FALSE, # do not use chrColor as border color of cen. or cen. marks
distTextChr = .8, # separation among chr. and ind.
rulerPos= 0, # ruler position
ruler.tck=-0.01, # ticks of ruler size and orientation
xlimLeftMod = 2, # modify xlim left argument
ylimBotMod = 0.4, # modify ylim bottom argument
ylimTopMod = 0 # modify ylim top argument
,lwd.cM = 2 # thickness of cM marks
)
dev.off()
There is no need to pass dfMarkColor: you can instead use the parameter mycolors (also optional) to set the marks’ colors. Colors are assigned according to the order of the marks.
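The ordering can be pictured with a small base-R sketch. It assumes that colors are paired with mark names in order of first appearance in the mark positions data.frame, which is one reading of the sentence above rather than the package's documented internals. The vector marksInOrder copies the markName column of mydfOfMarks; the colors are arbitrary.

```r
# markName column of the mark positions data.frame, in row order
marksInOrder <- c("45S", "5S", "B mark", "45S", "DAPI", "DAPI", "CMA")

# Colors as they would be passed in mycolors (arbitrary choice)
myColorsExample <- c("red", "green", "blue", "yellow", "black")

# Unique marks in order of first appearance, paired with colors by position
uniqueMarks <- unique(marksInOrder)
colorByMark <- setNames(myColorsExample[seq_along(uniqueMarks)], uniqueMarks)
colorByMark
```

Under this assumption, reordering the rows of the mark data.frame would change which color each mark receives.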
Let’s use the cM style of marks, which draws a protruding line. The cM style does not apply to centromere marks. To get something similar, use centromereSize=0, legend="inline" and fixCenBorder = FALSE.
# Note: "dfsd" is not a valid R color name
charVectorCol <- c("tomato3","darkolivegreen4","dfsd","blue","green")
png("dfOfChrSizeVector.png", width=1000, height=450)
par(mar=rep(0,4))
# Modify size of kar. to use rulerInterval and ceilingFactor (>= 1.13)
quo<-9
dfOfChrSize2<-dfOfChrSize
dfOfChrSize2$shortArmSize<-dfOfChrSize$shortArmSize/quo
dfOfChrSize2$longArmSize<-dfOfChrSize$longArmSize/quo
dfOfMarks2b<-dfOfMarks2
dfOfMarks2b$markSize<-dfOfMarks2$markSize/quo
dfOfMarks2b$markDistCen<-dfOfMarks2$markDistCen/quo
plotIdiograms(dfChrSize = dfOfChrSize2, # d.f. of chr. sizes
dfMarkPos = dfOfMarks2b, # d.f. of marks' positions
defaultStyleMark = "cM", # forces "cM" style in d.f dfMarkColor (exc. 5S)
mycolors = charVectorCol, # colors to use
distTextChr = .5, # separ. text and chr.
markLabelSize=.7, # font size for labels (legend)
lwd.cM=2, # width of cM marks
legendWidth=0.9, # legend item width
rulerPos= 0, # ruler position
ruler.tck=-0.01, # ruler tick orientation and length
rulerNumberSize=.5 # ruler font size
,xlimRightMod = 1 # modify xlim right arg.
)
dev.off()
1.7 Circular Plots
Example with monocentric and holocentric karyotypes, using circularPlot=TRUE
{
require(plyr)
dfOfChrSize$OTU <- "Species mono"
dfChrSizeHolo$OTU <- "Species holo"
monoholoCS <- plyr::rbind.fill(dfOfChrSize,dfChrSizeHolo)
dfOfMarks2$OTU <-"Species mono"
dfMarkPosHolo$OTU <-"Species holo"
monoholoMarks <- plyr::rbind.fill(dfOfMarks2,dfMarkPosHolo)
monoholoMarks[which(monoholoMarks$markName=="5S"),]$markSize<-.5
}
library(idiogramFISH)
plotIdiograms(dfChrSize = monoholoCS, # data.frame of chr. size
dfMarkColor= dfMarkColor, # df of mark style
dfMarkPos = monoholoMarks,# df of mark positions, includes cen. marks
roundness =5, # vertices roundness
addOTUName = TRUE, # add OTU names
distTextChr = .5, # separ. among chr. and text and among chr. name and indices
karHeiSpace = 3, # karyotype height inc. spacing
karIndexPos = .2, # move karyotype index
chrId="original", # use original name of chr.
OTUTextSize = .7, # size of OTU name
legendHeight= 1, # height of legend labels
legendWidth = 1, # width of legend labels
# ,legend="inline"
fixCenBorder = TRUE, # use chrColor as border color of cen. or cen. marks
rulerPos= 0, # position of ruler
ruler.tck=-0.02, # size and orientation of ruler ticks
rulerNumberPos=.9, # position of numbers of rulers
xlimLeftMod=1, # modify xlim left argument of plot
xlimRightMod=2, # modify xlim right argument of plot
ylimBotMod= .2 # modify ylim bottom argument of plot
# GRAPHICAL PARAMETERS FOR CIRCULAR PLOT
,circularPlot = T # circularPlot
,shrinkFactor = .9 # percentage 1 = 100% of circle with chr.
,circleCenter = 3 # X coordinate of circleCenter (affects legend pos.)
,chrLabelSpacing = .9 # chr. names spacing
,OTUsrt = 0 # angle for OTU name (or number)
,OTUplacing = "number" # Use number and legend instead of name. See OTUcentered
,OTUjustif = 0 # OTU names justif. left.
,OTULabelSpacerx = -1.5 # modify position of OTU label, when OTUplacing="number" or "simple"
,OTUlegendHeight = 1.5 # space among OTU names when in legend - OTUplacing
)
Recreating the circular karyotype of Golczyk et al. (2005)
# First swap short and long arms to show the same rotation as in the article
listradfs<-swapChrRegionDfSizeAndMarks(traspadf,traspaMarks,c("3","6","7","9","12") )
# Create marks' characteristics
dfMarkColor5S25S<-read.table(text=" markName markColor style
5S black dots
25S white dots" , header=TRUE, stringsAsFactors=FALSE,fill=TRUE)
plotIdiograms(dfChrSize = listradfs$dfChrSize, # d.f. of chr. sizes
dfMarkPos = listradfs$dfMarkPos, # d.f. of marks' positions
dfMarkColor = dfMarkColor5S25S, # d.f. of mark characteristics
cenColor = "black", # cen. color for GISH
roundness = 5, # corner roundness
chrWidth = 1, # chr width
orderBySize = F # do not order chr. by size
,addOTUName = F # do not add OTU name
,legendHeight = 2.5 # labels separ y axis
# circular plot parameters
,circularPlot=TRUE
,radius=5 # basic radius
,useOneDot=F # use two dots
,chrLabelSpacing = 1 # chr. name spacing
,rotation = .1 # anti-clockwise rotation
,shrinkFactor = .95 # % of circle use
)
Plasmid data from GenBank
# Load data .gb downloaded from: https://www.ncbi.nlm.nih.gov/nuccore/NZ_CP009939.1
filename <- system.file("extdata", "sequence.gb", package = "idiogramFISH")
mylist<-genBankReadIF(filename)
names(mylist)
# [1] "gbdfMain" "gbdfAssemblyMeta" "gbdfAnnoMeta" "source"
# [5] "gene" "CDS"
# Authors of plasmid sequence
paste(mylist$gbdfMain[which(mylist$gbdfMain$field=="AUTHORS"),][1,2] )
# [1] "Johnson,S.L., Minogue,T.D., Teshima,H., Davenport,K.W., Shea,A.A.,; Miner,H.L., Wolcott,M.J. and Chain,P.S."
# mylist$gbdfSourceMeta
# View(mylist$gbdfMain)
# View(mylist$gbdfAssemblyMeta)
# mylist$gbdfAnnoMeta
# View(mylist$gbdfCDS)
# View(mylist$gene)
# create plasmid size data data.frame
myPlasmiddf <- data.frame(chrName=1, chrSize=mylist$source$end)
myPlasmiddf$OTU<-mylist$gbdfMain[which(mylist$gbdfMain$field=="DEFINITION"),]$value
myPlasmiddf$OTU<-gsub(", complete sequence.","",myPlasmiddf$OTU)
# Creating mark info data.frame
mylist$gene$markPos <-pmin(as.numeric(mylist$gene$begin),as.numeric(mylist$gene$end) )
mylist$gene$markSize<-abs(as.numeric(mylist$gene$end)-as.numeric(mylist$gene$begin) )
mylist$gene$markName<-mylist$gene$locus_tag
# Replace codes with names
mylist$gene[which(!is.na(mylist$gene$gene) ),]$markName<-
mylist$gene[which(!is.na(mylist$gene$gene) ),]$gene
marksDf<-mylist$gene[,c("markName","markPos","markSize")]
# manually move away some names
# add spaces before name
distantNames<-unlist(lapply(marksDf$markName,
function(x) gsub("(.*)",paste0(paste0(rep(" ",20),collapse=""),"\\1"),x) ) )
# add spaces after name
distantNames2<-unlist(lapply(marksDf$markName,
function(x) gsub("(.*)",paste0("\\1",paste0(rep(" ",20), collapse="")),x) ) )
# Replace names
for (i in seq(1,nrow(marksDf ), by=2) ) { marksDf$markName[i]<-distantNames[i]}
for (i in seq(2,nrow(marksDf ), by=2) ) { marksDf$markName[i]<-distantNames2[i]}
# add mandatory column
marksDf$chrName<-1
# add marker for start pos.
marksDf<-rbind(marksDf,data.frame(markName=paste0("START",paste0(rep(" ",20), collapse="")),
markPos=1,markSize=NA,chrName=1) ) # a data.frame row keeps numeric columns numeric
# add column - name of plasmid
marksDf$OTU <- myPlasmiddf$OTU
# create mark general data data.frame
markStyle2 <-markStyle <- idiogramFISH:::makedfMarkColorMycolors(unique(marksDf$markName),
c("black","forestgreen","cornflowerblue") )
# 1st plot with cM style of marks
markStyle$style<-"cM"
# prefix to remove from marks
mypattern<-sub("([[:alnum:]]+_).*","\\1",trimws(marksDf$markName[1]) )
library(idiogramFISH)
par(mar=rep(0,4))
plotIdiograms(dfChrSize = myPlasmiddf, # plasmid size d.f.
dfMarkPos = marksDf, # mark pos d.f.
dfMarkColor = markStyle, # mark style d.f.
roundness = 21, # corners not rounded
chrWidth = .1, # chr. width
chrId="", # no chr. name
markLabelSize=.5, # font size of labels
pattern=mypattern, # remove pattern from mark names
protruding=.5, # modify cM marks size
ylimBotMod = 0, # modify plot size
ylimTopMod = 0,
xlimLeftMod = 2,
# circular params.
circularPlot = TRUE, # circular
shrinkFactor = 1, # use 100% of circle
labelSpacing = 1.5, # label spacing from chr.
rotation=0, # begin plasmid in top
labelOutwards = TRUE, # label projected based on mark angle
OTUjustif = 0.5, # OTU name justif. centered.
OTUplacing = "simple" # plasmid name place. See OTUcentered
)
# overlap square style of marks with second plot
# plot over previous plot
plotIdiograms(dfChrSize = myPlasmiddf, dfMarkPos = marksDf, dfMarkColor = markStyle2, circularPlot = T,
shrinkFactor = 1, roundness = 21, chrWidth = .1, labelSpacing = 2, chrId="",
ylimBotMod = 0, ylimTopMod = 0, rotation=0, legend="",
addOTUName = FALSE, # do not add OTU name, see above
callPlot = FALSE # do not create a new plot
)
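The label handling in the plasmid example relies on two string tricks: padding every other mark name with spaces so neighbouring labels are pushed apart, and extracting a common prefix to strip from the labels. A compact base-R sketch of both, using strrep() (available in R >= 3.3.0); the gene names and the locus tag are hypothetical.

```r
# Hypothetical mark names
geneNames <- c("geneA", "geneB", "geneC", "geneD", "geneE")

# 20 spaces of padding, as in the example above
pad <- strrep(" ", 20)

# Odd positions get spaces before the name, even positions get spaces after
isOdd <- seq_along(geneNames) %% 2 == 1
paddedNames <- ifelse(isOdd, paste0(pad, geneNames), paste0(geneNames, pad))
nchar(paddedNames)

# Prefix extraction: keep everything up to and including the first underscore
exampleTag <- "ABC12_RS00005"   # hypothetical locus tag
tagPrefix <- sub("([[:alnum:]]+_).*", "\\1", trimws(exampleTag))
tagPrefix
```

Here tagPrefix is "ABC12_", the kind of value passed to the pattern parameter above.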
Prokaryote chromosome from GenBank
library(idiogramFISH)
# Download prokaryote genome from:
# https://www.ncbi.nlm.nih.gov/nuccore/NC_014248.1
# Choose Customize View -> Basic Features -> genes, CDS
# Send To -> File -> Create File
# Use your file name:
# filename2<- "nostoc.gb" # 5 Mbytes
mylist<-genBankReadIF(filename2) # Wait 6 seconds to load ...
names(mylist)
# "gbdfMain" "gbdfAnnoMeta" "source" "gene" "CDS" "tRNA"
# "regulatory" "ncRNA" "rRNA" "misc_feature" "tmRNA"
# Authors of sequence
paste(mylist$gbdfMain[which(mylist$gbdfMain$field=="AUTHORS"),][1,2] )
# [1] "Ran,L., Larsson,J., Vigil-Stenman,T., Nylander,J.A., Ininbergs,K.,;
# Zheng,W.W., Lapidus,A., Lowry,S., Haselkorn,R. and Bergman,B."
# create chr. size data data.frame
myProkaryotedf <- data.frame(chrName=1, chrSize=mylist$source$end)
myProkaryotedf$OTU<-mylist$gbdfMain[which(mylist$gbdfMain$field=="DEFINITION"),]$value
myProkaryotedf$OTU<-gsub(", complete genome.","",myProkaryotedf$OTU)
# Creating mark info data.frame
mylistSel<-mylist[which(names(mylist) %in%
setdiff( names(mylist) , c("gbdfMain","gbdfAnnoMeta","source","CDS") ) )]
mylistSelDF<-dplyr::bind_rows(mylistSel, .id="feature")
mylistSelDF$markPos <-pmin(as.numeric(mylistSelDF$begin),as.numeric(mylistSelDF$end) )
mylistSelDF$markSize<-abs(as.numeric(mylistSelDF$end)-as.numeric(mylistSelDF$begin) )
mylistSelDF$markName<-mylistSelDF$locus_tag
# Replace codes with names
mylistSelDF[which(!is.na(mylistSelDF$gene) ),]$markName<-
mylistSelDF[which(!is.na(mylistSelDF$gene) ),]$gene
marksDf<-mylistSelDF[,c("markName","markPos","markSize","feature")]
# manually move away some names
distantNames1<-unlist(lapply(marksDf$markName, function(x)
gsub("(.*)",paste0("\\1",paste0(rep(" ",25*3), collapse = "")),x) ) )
distantNamesCenter2<-unlist(lapply(marksDf$markName, function(x)
gsub("(.*)",paste0(paste0(rep(" ",25), collapse = ""),"\\1",paste0(rep(" ",25*2), collapse = "")),x) ) )
distantNamesCenter3<-unlist(lapply(marksDf$markName, function(x)
gsub("(.*)",paste0(paste0(rep(" ",25*2), collapse = ""),"\\1",paste0(rep(" ",25), collapse = "")),x) ) )
distantNames4<-unlist(lapply(marksDf$markName, function(x)
gsub("(.*)",paste0(paste0(rep(" ",25*3), collapse = ""),"\\1"),x ) ) )
for (i in seq(1,nrow(marksDf), by=4) ) { marksDf$markName[i]<-distantNames1[i]}
for (i in seq(2,nrow(marksDf), by=4) ) { marksDf$markName[i]<-distantNamesCenter2[i]}
for (i in seq(3,nrow(marksDf), by=4) ) { marksDf$markName[i]<-distantNamesCenter3[i]}
for (i in seq(4,nrow(marksDf), by=4) ) { marksDf$markName[i]<-distantNames4[i]}
# add marker for start
marksDf<-rbind(marksDf,data.frame(markName="START",markPos=1,markSize=NA,feature="start") ) # keeps numeric columns numeric
# add mandatory column
marksDf$chrName<-1
# add column OTU, when in main data.frame
marksDf$OTU <- myProkaryotedf$OTU
unique(marksDf$feature)
# create mark general data data.frame
markStyle <- idiogramFISH:::makedfMarkColorMycolors(unique(marksDf$markName),
c("black","forestgreen","cornflowerblue") )
markStyle[which(markStyle$markName %in%
marksDf[which(marksDf$feature %in% c("tRNA","tmRNA") ),]$markName
),]$markColor<-"magenta"
markStyle[which(markStyle$markName %in%
marksDf[which(marksDf$feature %in% c("regulatory","ncRNA") ),]$markName
),]$markColor<-"tomato3"
markStyle[which(markStyle$markName %in%
marksDf[which(marksDf$feature %in% "rRNA" ),]$markName
),]$markColor<-"red2"
markStyle[which(markStyle$markName %in%
marksDf[which(marksDf$feature %in% "misc_feature" ),]$markName
),]$markColor<-"lightsalmon"
# duplicate mark style data.frame for square style
markStyle2 <- markStyle
# cM style d.f.
markStyle$style<-"cM"
# prefix to remove from mark names
mypattern<-sub("([[:alnum:]]+_).*","\\1",trimws(marksDf$markName[1]) )
# png("NOSTOC2.png", width=9500, height=9500) # 14 Mbytes
pdf("NOSTOC2.pdf", width=130, height=130) # 6 Mb
# svg("NOSTOC2.svg", width=130, height=130) # 100 Mb
par(mar=rep(0,4))
plotIdiograms(dfChrSize = myProkaryotedf, # chr. data d.f.
dfMarkPos = marksDf, # mark pos d.f.
dfMarkColor = markStyle, # mark style d.f.
roundness = 21, # corners not rounded
n=100, # number of vertices in rounded items.
chrWidth = .02, # chr. width
chrId="", # no chr. name
markLabelSize=1, # font size of labels
protruding=.5, # modify cM marks size
pattern= mypattern, # remove pattern from mark names
ylimBotMod = -.5, # modify plot size
ylimTopMod = -.5,
xlimLeftMod = .3,
xlimRightMod = .3,
# circular plot params.
circularPlot = TRUE, # circular
shrinkFactor = 1, # use 100% of circle
labelSpacing = 1.2, # label spacing from chr.
rotation=0, # begin chr. in top
labelOutwards = TRUE # label projected based on mark angle
,OTUjustif = 0.5 # OTU name centered
,OTUplacing = "simple" # location of OTU name, see OTUcentered
,radius = .1 # radius of circle
,OTUTextSize = 10 # font size of OTU name
)
# plot over previous plot square style
plotIdiograms(dfChrSize = myProkaryotedf, dfMarkPos = marksDf, dfMarkColor = markStyle2, circularPlot = T,
shrinkFactor = 1, roundness = 21, chrWidth = .02, chrId="",
ylimBotMod = -.5, ylimTopMod = -.5, xlimLeftMod = .3,
xlimRightMod = .3, radius=.1, n=100, rotation=0,
legend="", # do not add legend for marks
addOTUName = FALSE, # do not add. OTU name
callPlot = FALSE # plot over previous plot
)
dev.off()
1.8 Example with several species (OTUs)
To illustrate this, we will load some data.frames from the package:
- Chromosome sizes
OTU | chrName | shortArmSize | longArmSize |
---|---|---|---|
Species 1 | 1 | 1.5 | 2.0 |
Species 1 | 2 | 2.0 | 2.5 |
Species 1 | 3 | 1.0 | 2.0 |
Species 2 | 1 | 3.0 | 4.0 |
Species 2 | 2 | 4.0 | 5.0 |
Species 2 | 3 | 2.0 | 3.0 |
Species 2 | X | 1.0 | 2.0 |
Species 2 | 4 | 3.0 | 4.0 |
Species 3 | 1 | 3.2 | 4.0 |
Species 3 | 2 | 4.5 | 5.0 |
Species 3 | 3 | 2.0 | 3.0 |
Species 3 | 4 | 1.5 | 2.0 |
Species 3 | 5 | 4.8 | 6.0 |
Species 3 | 6 | 6.1 | 7.0 |
Species 4 | 1 | 1.5 | 2.0 |
- Mark characteristics, does not require OTU
- optional for ver. > 1.0.0
markName | markColor | style |
---|---|---|
5S | red | dots |
45S | green | square |
DAPI | blue | square |
CMA | yellow | square |
- Mark position
OTU | chrName | markName | chrRegion | markDistCen | markSize |
---|---|---|---|---|---|
Species 1 | 1 | 5S | p | 0.5 | 1 |
Species 1 | 1 | 45S | q | 0.5 | 1 |
Species 1 | 2 | 45S | p | 1.0 | 1 |
Species 1 | 3 | DAPI | q | 1.0 | 1 |
Species 3 | 3 | 5S | p | 1.0 | 1 |
Species 3 | 3 | DAPI | q | 1.0 | 1 |
Species 3 | 4 | 45S | p | NA | NA |
Species 3 | 4 | DAPI | q | 1.0 | 1 |
Species 3 | 5 | CMA | q | 2.0 | 1 |
Species 3 | 6 | 5S | q | 0.5 | 1 |
Species 2 | 1 | DAPI | cen | NA | NA |
Species 2 | 4 | CMA | cen | NA | NA |
Plotting
par(mar = c(0, 0, 0, 0))
plotIdiograms(dfChrSize =bigdfOfChrSize,# chr. sizes
dfMarkColor=dfMarkColor, # mark characteristics, optional in dev version. see above.
dfMarkPos =bigdfOfMarks, # mark positions (inc. cen. marks)
karHeight=2.5, # karyotype rel. height
karHeiSpace=6, # karyotype vertical size with spacing
chrWidth = .35, # chr. width
amoSepar = 2, # Vertical separation of kar. when karSepar = TRUE
roundness = 10, # roundness of chr. vertices
distTextChr=.8, # distance of chr. to text
chrIndex = "AR", # add arm ratio only. For v. >=1.12
morpho="Guerra", # add chr. morphology by Guerra, see above. For v. >=1.12
indexIdTextSize=.6, # font size of indices and chr. name
OTUTextSize=.9, # font size of OTU names
markLabelSize=.7, # font size of legend
fixCenBorder = TRUE, # use chrColor as border color of cen. or cen. marks
legendHeight = 2, # height of labels
rulerPos=-1, # position of ruler
# rulerPosMod=3, # modify position of ruler
ruler.tck=-0.004, # size and orient. of ticks in ruler
rulerNumberPos=.4, # position of numbers of ruler
rulerNumberSize=.4, # font size of ruler
xlimRightMod = 3, # modify xlim right argument
xlimLeftMod = 2, # modify xlim left argument
ylimBotMod = 0, # modify ylim bottom argument
ylimTopMod = -.3 # modify ylim top argument
#,asp=1 # y x aspect ratio
)
1.9 GISH of monocentric chromosomes
You need the data.frame of chr. sizes and a data.frame of marks.
Chr. sizes:
parentalAndHybChrSize
OTU | chrName | shortArmSize | longArmSize |
---|---|---|---|
Parental 1 | 1 | 3.2 | 4 |
Parental 1 | 4 | 1.5 | 2 |
Parental 1 | 5 | 4.8 | 6 |
Parental 1 | 6 | 6.1 | 7 |
Parental 2 | 1 | 3.2 | 4 |
Parental 2 | 2 | 4.5 | 5 |
Parental 2 | 3 | 2.0 | 3 |
Allopolyploid | 1 | 3.2 | 4 |
Allopolyploid | 2 | 4.5 | 5 |
Allopolyploid | 3 | 2.0 | 3 |
Allopolyploid | 4 | 1.5 | 2 |
Allopolyploid | 5 | 4.8 | 6 |
Allopolyploid | 6 | 6.1 | 7 |
Marks’ positions data
dfAlloParentMarks
OTU | chrName | markName | chrRegion |
---|---|---|---|
Allopolyploid | 1 | Parental 1 | p |
Allopolyploid | 1 | Parental 2 | q |
Allopolyploid | 1 | Parental 2 | cen |
Allopolyploid | 2 | Parental 2 | w |
Allopolyploid | 3 | Parental 2 | w |
Allopolyploid | 4 | Parental 1 | w |
Allopolyploid | 5 | Parental 1 | w |
Allopolyploid | 6 | Parental 1 | w |
Parental 1 | 6 | Parental 1 | w |
Parental 1 | 5 | Parental 1 | w |
Parental 1 | 1 | Parental 1 | w |
Parental 1 | 4 | Parental 1 | w |
Parental 2 | 2 | Parental 2 | w |
Parental 2 | 1 | Parental 2 | w |
Parental 2 | 3 | Parental 2 | w |
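The plotting call in this section passes notes=notesdf2, an object that is not defined in this guide. A minimal sketch of such a data.frame, assuming the same two-column layout (OTU, note) as notesdf in section 1.5; the note texts are placeholders chosen for illustration:

```r
# Notes shown to the right of each karyotype; one row per OTU
# (note texts are hypothetical)
notesdf2 <- read.table(text =
"OTU note
\"Parental 1\" \"Parental One\"
\"Parental 2\" \"Parental Two\"
\"Allopolyploid\" \"Allopolyploid\"",
header = TRUE, stringsAsFactors = FALSE, fill = TRUE)

notesdf2
```

The OTU values must match the OTU column of parentalAndHybChrSize for the notes to be placed next to the right karyotype.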
Plotting
# svg("gish.svg",width=7,height=9 )
#png("parentalAndHybChrSize.png", width=700, height=900)
par(mar=rep(0,4) )
plotIdiograms(dfChrSize = parentalAndHybChrSize, # d.f. of chr. sizes
dfMarkPos = dfAlloParentMarks, # d.f. of marks' positions
cenColor = NULL, # cen. color for GISH
karHeiSpace=5, # karyotype height including spacing
karSepar = FALSE, # equally sized (height) karyotypes
rulerPos=-.7, # ruler position
ruler.tck= -0.006, # ruler tick orientation and length
rulerNumberSize=.4 # ruler font size
,legend="" # no legend
,notes=notesdf2 # data.frame with notes NEW
#,OTUasNote=TRUE # TRY THIS (OTU name to the right)
,notesTextSize = 1.3 # font size of notes
,notesPos = 1.5 # space from chr. (right) to note
,ylimBotMod = 1 # ylim bottom argument mod.
)
1.10 Plot data in micrometers and bases
Information in number of bases can be combined in the same plot with information in micrometers.
This code is for the devel version >=1.13. For 1.12 only millions (not hundreds of millions) are supported.
To make the rulers fit better, with less excess length beyond the chromosomes, use ceilingFactor.
#fig.width=10, fig.height=10
# modify data from millions to hundreds of millions of bases
bigdfOfChrSize3_100Mb<-bigdfOfChrSize3Mb[1:8,]
bigdfOfChrSize3_100Mb$chrSize<-bigdfOfChrSize3_100Mb$chrSize*100
bigdfOfMarks3_100Mb<-bigdfOfMarks3Mb
bigdfOfMarks3_100Mb$markPos<-bigdfOfMarks3_100Mb$markPos*100
bigdfOfMarks3_100Mb$markSize<-bigdfOfMarks3_100Mb$markSize*100
# merge data.frames in micrometers and number of bases
mixedThreeSpChrSize <- plyr::rbind.fill(bigdfOfChrSize[1:8,], bigdfOfChrSize3_100Mb)
# sort by OTU name
mixedThreeSpChrSize <- mixedThreeSpChrSize[order(mixedThreeSpChrSize$OTU),]
# merge marks in micrometers and bases
mixedThreeSpMarks <- plyr::rbind.fill(bigdfOfMarks , bigdfOfMarks3_100Mb)
par(mar=rep(0,4))
plotIdiograms(dfChrSize = mixedThreeSpChrSize, # chr. size data.frame
dfMarkPos = mixedThreeSpMarks, # mark position df
chrWidth=.6, # width of chr.
chrSpacing = .6, # space among chr.
karHeight = 3, # kar. height without interspace
karHeiSpace = 5, # vertical size of karyotype including spacer
amoSepar =2, # separ. among kar.
indexIdTextSize=.6, # font size of chr. name and indices
markLabelSize=.7, # font size of mark legends
distTextChr = .65, # separation among chr. names and indices
legendWidth = 1.5 # legend items width
,fixCenBorder = TRUE # use chrColor as border color of cen. or cen. marks
,ylabline = -8 # position of Mb (title) in ruler
,rulerPos= 0, # ruler position
ruler.tck=-0.005, # ticks of ruler size and orientation
rulerNumberPos =.7, # position of numbers in ruler
rulerNumberSize=.7, # font size of ruler numbers
rulerInterval = 1.5, # ruler interval for micrometers
rulerIntervalMb = 150000000,# ruler interval for Mb
ceilingFactor = 1, # affects rounding for ruler max. value
ylimBotMod = 0.4, # modify ylim bottom argument
ylimTopMod = 0 # modify ylim top argument
#,asp=1 # aspect of plot
)
Let’s explore those data.frames
row | OTU | chrName | shortArmSize | longArmSize | chrSize |
---|---|---|---|---|---|
1 | Species 1 | 1 | 1.5 | 2.0 | NA |
2 | Species 1 | 2 | 2.0 | 2.5 | NA |
3 | Species 1 | 3 | 1.0 | 2.0 | NA |
9 | Species 1 genome | 1 | NA | NA | 3.5e+08 |
10 | Species 1 genome | 2 | NA | NA | 4.5e+08 |
11 | Species 1 genome | 3 | NA | NA | 2.5e+08 |
row | OTU | chrName | markName | chrRegion | markDistCen | markSize | markPos |
---|---|---|---|---|---|---|---|
1 | Species 1 | 1 | 5S | p | 0.5 | 1 | NA |
2 | Species 1 | 1 | 45S | q | 0.5 | 1 | NA |
3 | Species 1 | 2 | 45S | p | 1.0 | 1 | NA |
4 | Species 1 | 3 | DAPI | q | 1.0 | 1 | NA |
13 | Species 1 genome | 1 | 5S | NA | NA | 100000000 | 250000000 |
14 | Species 1 genome | 1 | 45S | NA | NA | 100000000 | 50000000 |
15 | Species 1 genome | 2 | 45S | NA | NA | 100000000 | 350000000 |
16 | Species 1 genome | 3 | DAPI | NA | NA | 100000000 | 0 |
1.11 Using threshold to fix scale
The default value of 35 for threshold may shrink one of the OTUs of this example more than expected. In this case, threshold must be larger.
# fig.width=7, fig.height=7
bigdfOfChrSize3_100Mb<-bigdfOfChrSize3Mb
bigdfOfChrSize3_100Mb$chrSize<-bigdfOfChrSize3Mb$chrSize*33
bigdfOfMarks3_100Mb<-bigdfOfMarks3Mb
bigdfOfMarks3_100Mb$markPos<-bigdfOfMarks3_100Mb$markPos*33
bigdfOfMarks3_100Mb$markSize<-bigdfOfMarks3_100Mb$markSize*33
par(mar=rep(0,4))
plotIdiograms(dfChrSize = bigdfOfChrSize3_100Mb, # chr. size data.frame
dfMarkPos = bigdfOfMarks3_100Mb, # mark position df
chrWidth=.6, # width of chr.
chrSpacing = .6, # space among chr.
karHeight = 3, # kar. height without interspace
karHeiSpace = 5, # vertical size of karyotype including spacer
amoSepar =2, # separ. among kar.
indexIdTextSize=.6, # font size of chr. name and indices
markLabelSize=.7, # font size of mark legends
distTextChr = .65, # separation among chr. names and indices
fixCenBorder = TRUE # use chrColor as border color of cen. or cen. marks
,legendWidth = 1.5 # legend items width
,ylabline = -2 # position of Mb (title) in ruler
,rulerPos= 0, # ruler position
ruler.tck=-0.005, # ticks of ruler size and orientation
rulerNumberPos =.7, # position of numbers in ruler
rulerNumberSize=.7, # font size of ruler numbers
rulerInterval = 1.5, # ruler interval for micrometers
rulerIntervalMb = 50000000, # ruler interval for Mb
ylimBotMod = 0.4, # modify ylim bottom argument
ylimTopMod = 0 # modify ylim top argument
#### NEW #####
,threshold = 90 # avoids shrinking data greater than 350 Mb
)
1.12 Use cM instead of Mb (devel. version > 1.12)
Information in cM can be combined in the same plot with information in micrometers.
This code is for the devel version >1.12. It does not work with the CRAN version.
To make the rulers fit better, with less excess length beyond the chromosomes, use ceilingFactor.
#fig.width=10, fig.height=10
# merge data.frames in micrometers and cM
bigdfOfChrSize3cM<-bigdfOfChrSize3Mb[1:8,]
bigdfOfChrSize3cM$chrSize<-bigdfOfChrSize3cM$chrSize/100000
mixedThreeSpChrSize <- plyr::rbind.fill(bigdfOfChrSize[1:8,], bigdfOfChrSize3cM)
# sort by OTU name
mixedThreeSpChrSize <- mixedThreeSpChrSize[order(mixedThreeSpChrSize$OTU),]
# create data with cM. markSize col. is not necessary because style is cM
bigdfOfMarks3cM<-bigdfOfMarks3Mb
bigdfOfMarks3cM$markPos<-bigdfOfMarks3Mb$markPos/100000
bigdfOfMarks3cM$markSize<-NA
# As we want only the cM idiograms to be plotted as cM (lines), change mark names
bigdfOfMarks3cM$markName<-paste0("cM",bigdfOfMarks3cM$markName)
# d.f of all marks
mixedThreeSpMarks <- plyr::rbind.fill(bigdfOfMarks , bigdfOfMarks3cM)
# create a data.frame with mark characteristics
mixedDfMarkStyle <- idiogramFISH:::makedfMarkColorMycolors(unique(mixedThreeSpMarks$markName),
c("red","green","blue","yellow")
)
# mark names of cM marks with "cM" style (lines): not dots, not squares
mixedDfMarkStyle[which(mixedDfMarkStyle$markName %in%
grep("cM", mixedDfMarkStyle$markName, value=TRUE) ) ,]$style<-"cM"
par(mar=rep(0,4))
plotIdiograms(dfChrSize = mixedThreeSpChrSize, # chr. size data.frame
dfMarkPos = mixedThreeSpMarks, # mark position data.frame
dfMarkColor = mixedDfMarkStyle, # mark style data.frame
chrWidth=.6, # width of chr.
chrSpacing = .7, # space among chr.
specialOTUNames = bigdfOfMarks3cM$OTU, # OTUs in this object will have different ruler units
specialyTitle = "cM", # ruler title for specialOTUNames
specialChrWidth = .2, # modify chr width of OTUs in specialOTUNames
specialChrSpacing = 1.1, # modify chr spacing of OTUs in specialOTUNames
karHeight = 3, # kar. height without interspace
karHeiSpace = 5, # vertical size of karyotype including spacer
amoSepar =2, # separ. among kar.
indexIdTextSize=.6, # font size of chr. name and indices
distTextChr = .65, # separation among chr. names and indices
protruding = 1, # extension of cM mark type
pattern = "cM", # regex pattern to remove from mark names
markLabelSize=.7 # font size of mark legends
,legendWidth = 2 # legend items width
,fixCenBorder = TRUE # use chrColor as border color of cen. or cen. marks
,lwd.cM = 2 # thickness of cM marks
,ylabline = -8 # position of Mb or cM (title) in ruler
,rulerPos= 0, # ruler position
ruler.tck=-0.005, # ticks of ruler size and orientation
rulerNumberPos =.7, # position of numbers in ruler
rulerNumberSize=0.7, # font size of ruler numbers
rulerIntervalcM = 12, # ruler interval for OTU in specialOTUNames and MbThreshold not met
ceilingFactor = 1, # affects max. value in ruler. See also rulerInterval
ylimBotMod = 0.4, # modify ylim bottom argument
ylimTopMod = 0 # modify ylim top argument
)
1.13 Using groups
Adding the column group
Open your chromosome size data as a data.frame and add the column group.
# Example data.frame written in R, use
dfwithgroups<-read.table(text="
chrName shortArmSize longArmSize group
1 1 3 5 1
2 1 3.2 5.5 1
3 1 3.5 4.8 1
4 4 1 3 NA
5 5 3 5 NA
6 X 4 6 NA", header=TRUE, stringsAsFactors=F)
chrName | shortArmSize | longArmSize | group |
---|---|---|---|
1 | 3.0 | 5.0 | 1 |
1 | 3.2 | 5.5 | 1 |
1 | 3.5 | 4.8 | 1 |
4 | 1.0 | 3.0 | NA |
5 | 3.0 | 5.0 | NA |
X | 4.0 | 6.0 | NA |
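When the chromosome sizes are already in a data.frame, the group column can be added by assignment instead of retyping the data. A minimal sketch, assuming that chromosomes sharing the name "1" belong to group 1 and the rest have no group; existingChrSize is a hypothetical object with the same values as dfwithgroups.

```r
# Chromosome sizes without a group column yet
existingChrSize <- data.frame(
  chrName      = c("1", "1", "1", "4", "5", "X"),
  shortArmSize = c(3, 3.2, 3.5, 1, 3, 4),
  longArmSize  = c(5, 5.5, 4.8, 3, 5, 6),
  stringsAsFactors = FALSE
)

# Chromosomes named "1" form group 1; the others are left as NA
existingChrSize$group <- ifelse(existingChrSize$chrName == "1", 1, NA)
existingChrSize
```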
Heteromorphic pairs
The group column can also be used to plot heteromorphic pairs; see pair 1:
dfwithHetero<-read.table(text="
chrName shortArmSize longArmSize group
1 1A 3 5 1
2 1B 3 5 1
4 2 1 3 NA
5 3 3 5 NA
6 4 4 6 NA", header=TRUE, stringsAsFactors=FALSE)
row | chrName | shortArmSize | longArmSize | group |
---|---|---|---|---|
1 | 1A | 3 | 5 | 1 |
2 | 1B | 3 | 5 | 1 |
4 | 2 | 1 | 3 | NA |
5 | 3 | 3 | 5 | NA |
6 | 4 | 4 | 6 | NA |
Open or write your mark positions as a data.frame. This data.frame has the marks present in all karyotypes with position info.
dfOfMarksHetero<-read.table(text=
" chrName markName chrRegion markSize markDistCen
1 1A 5S p 1 0.9
2 1B 45S p 1 0.9
3 2 CMA q 1 1.0
4 3 DAPI q 1 1.0", header=TRUE, stringsAsFactors=FALSE)
chrName | markName | chrRegion | markSize | markDistCen |
---|---|---|---|---|
1A | 5S | p | 1 | 0.9 |
1B | 45S | p | 1 | 0.9 |
2 | CMA | q | 1 | 1.0 |
3 | DAPI | q | 1 | 1.0 |
# svg("dfwithHetero.svg",width=13.5,height=8 )
par(mar=rep(0,4))
dfwithHetero$OTU<-"hetero"
dfwithgroups$OTU<-"first"
both<-plyr::rbind.fill(dfwithHetero,dfwithgroups)
dfOfMarksHetero$OTU<-"hetero"
plotIdiograms(dfChrSize=both, # chr. sizes
dfMarkPos=dfOfMarksHetero, # position of marks
karHeiSpace = 4,
chrId="original", # chr. name in df.
chrIndex = "", # do not add chr. indices
morpho="", # do not add chr. morphologies
karIndex = FALSE, # do not add karyotype indices
distTextChr = .8, # distance from text to chr.
markDistType="cen", # mark position measured to center of mark
orderBySize = FALSE, # do not order chr. by size
ruler=FALSE # do not plot ruler
,ylimBotMod = 1 # modify ylim bottom argument
,legendWidth = 1 # width of legend
)
References
Golczyk H, Hasterok R, Joachimiak AJ. 2005. FISH-aimed karyotyping and characterization of Renner complexes in permanent heterozygote Rhoeo spathacea. Genome, 48(1): 145–153. https://doi.org/10.1139/g04-093
Guerra M. 1986. Reviewing the chromosome nomenclature of Levan et al. Brazilian Journal of Genetics, 9(4): 741–743.
Levan A, Fredga K, Sandberg AA. 1964. Nomenclature for centromeric position on chromosomes. Hereditas, 52(2): 201–220. https://doi.org/10.1111/j.1601-5223.1964.tb01953.x
Romero-Zarco C. 1986. A new method for estimating karyotype asymmetry. Taxon, 35(3): 526–530. https://onlinelibrary.wiley.com/doi/abs/10.2307/1221906
Watanabe K, Yahara T, Denda T, Kosuge K. 1999. Chromosomal evolution in the genus Brachyscome (Asteraceae, Astereae): statistical tests regarding correlation between changes in karyotype and habit using phylogenetic information. Journal of Plant Research, 112: 145–161. http://link.springer.com/article/10.1007/PL00013869