01 Jun

Hello world!

Today, I’d like to share with you a code for plotting Bland-Altman diagrams.
The Bland-Altman diagram (Bland & Altman, 1986 and 1999), or difference plot, is a graphical method to compare two measurements techniques. In this graphical method the differences (or alternatively the ratios) between the two techniques are plotted against the averages of the two techniques. Alternatively (Krouwer, 2008) the differences can be plotted against one of the two methods, if this method is a reference or “gold standard” method.
Horizontal lines are drawn at the mean difference, and at the limits of agreement, which are defined as the mean difference plus and minus 1.96 times the standard deviation of the differences.

```Bland.Altman.diagram <- function(x,y,id=FALSE, alpha = .05,rep.meas = FALSE,subject,idname = FALSE, xname = 'x',yname = 'y',...) {   library(ggplot2) library(RColorBrewer)   #*** 1. Set a few constants z <- qnorm(1 - alpha / 2) ## value of z corresponding to alpha d <- x - y ## pair-wise differences m <- (x + y) / 2 ## pair-wise means   #*** 2. Calculate mean difference d.mn <- mean(d,na.rm = TRUE)   #*** 3. Calculate difference standard deviation if (rep.meas == FALSE) { d.sd = sqrt(var(d,na.rm = TRUE)) } else { #*** 3a. Ensure subject is a factor variable if (!is.factor(subject)) subject <- as.factor(subject)   #*** 3b. Extract model information n <- length(levels(subject)) # Number of subjects model <- aov(d ~ subject) # One way analysis of variance MSB <- anova(model)[[3]][1] # Degrees of Freedom MSW <- anova(model)[[3]][2] # Sums of Squares   #*** 3c. Calculate number of complete pairs for each subject pairs <- NULL for (i in 1:length(levels(as.factor(subject)))) { pairs[i] <- sum(is.na(d[subject == levels(subject)[i]]) == FALSE) } Sig.dl <- (MSB - MSW) / ((sum(pairs) ^ 2 - sum(pairs ^ 2)) / ((n - 1) * sum(pairs))) d.sd <- sqrt(Sig.dl + MSW) }   #*** 4. Calculate lower and upper confidence limits ucl <- d.mn + z * d.sd lcl <- d.mn - z * d.sd print(d.mn) print(ucl) print(lcl)   #*** 5. Make Plot xlabstr = paste(c('Mean(',xname,', ',yname,')'), sep = "",collapse = "") ylabstr = paste(c(xname, ' - ', yname), sep = "",collapse = "") id = ifelse(id==FALSE, as.factor(rep(1,length(m))), as.factor(id)) xy <- data.frame(m,d,id) xy\$id <- as.factor(xy\$id) maxm <- max(m)     #scatterplot of x and y variables scatter <- ggplot(xy,aes(m, d))     if(idname == FALSE){ scatter <- scatter + geom_point(color='darkgrey') + scale_color_manual( guide=FALSE) } else { scatter <- scatter + geom_point(aes(color = id))+ scale_color_brewer(idname, palette="Set1") # Set1 }   scatter <- scatter + geom_abline(intercept = d.mn,slope = 0, colour = "black",size = 1,linetype = "dashed" ) + geom_abline( intercept = ucl,slope = 0,colour = "black",size = 1,linetype = "dotted" ) + geom_abline( intercept = lcl,slope = 0,colour = "black",size = 1,linetype = "dotted" ) + ylim(min(lcl, -max(xy\$d), min(xy\$d)), max(ucl, max(xy\$d), -min(xy\$d)))+ xlab(xlabstr) + ylab(ylabstr) + #geom_text(x=maxm-2.4,y=d.mn,label='Mean',colour='black',size=4)+ #geom_text(x=maxm-2.4,y=ucl,label='+1.96 SD',colour='black',size=4)+ #geom_text(x=maxm-2.4,y=lcl,label='-1.96 SD',colour='black',size=4)+ theme( plot.background = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), panel.background = element_blank(), axis.line = element_line(color="black", size = 1) )   plot(scatter)   values <- round(cbind(lcl,d.mn,ucl),4) colnames(values) <- c("LCL","Mean","UCL") if (rep.meas == FALSE){ Output <- list(limits = values,Var = d.sd ^ 2) } else { Output <- list(limits = values,Var = Sig.dl,MSB = MSB,MSW = MSW) } return(Output) }   A <- c(-0.358, 0.788, 1.23, -0.338, -0.789, -0.255, 0.645, 0.506, 0.774, -0.511, -0.517, -0.391, 0.681, -2.037, 2.019, -0.447, 0.122, -0.412, 1.273, -2.165) B <- c(0.121, 1.322, 1.929, -0.339, -0.515, -0.029, 1.322, 0.951, 0.799, -0.306, -0.158, 0.144, 1.132, -0.675, 2.534, -0.398, 0.537, 0.173, 1.508, -1.955)   Bland.Altman.diagram(A, B,alpha = .05,rep.meas = F,xname = 'A',yname = 'B')```

The result is the following:

The ‘id’ field allows you to define different categories that will be colored differently on the graph:

13 Jan

Good Morning Followers!

Yesterday, I found a nice post here by Joseph Rickert. I produces nice plots with points and a google map background. Try it and have fun!

```library(ggmap) library(plyr) library(gridExtra) temp <- tempfile() download.file("http://www.plantsciences.ucdavis.edu/plant/data.zip", temp) connection <- unz(temp, "Data/Set3/Set3data.csv") rice <- read.csv(connection) names(rice) <- tolower(names(rice)) # Create a custom soil attribute plot # @param df Data frame containing data for a field # @param attribute Soil attribute # @return Custom soil attribute plot create_plot <- function(df, attribute) { map <- get_map(location = c(median(df\$longitude), median(df\$latitude)), maptype = "satellite", source = "google", crop = FALSE, zoom = 15) plot <- ggmap(map) + geom_point(aes_string(x = "longitude", y = "latitude", color = attribute), size = 5, data = df) plot <- plot + ggtitle(paste("Farmer", df\$farmer, "/ Field", df\$field)) plot <- plot + scale_color_gradient(low = "darkorange", high = "darkorchid4") return(plot) } ph_plot <- dlply(rice, "field", create_plot, attribute = "ph") ph_plots <- do.call(arrangeGrob, ph_plot)```

07 Jan

In love with the ggplot2 look? Sick of using plotRGB to display your RGB raster?
Here comes the rggplot function!

Have a look:

```  rggbplot <- function(inRGBRst,npix=NA,scale = 'lin'){   rgblinstretch <- function(rgbDf){ maxList <- apply(rgbDf,2,max) minList <- apply(rgbDf,2,min) temp<-rgbDf for(i in c(1:3)){ temp[,i] <- (temp[,i]-minList[i])/(maxList[i]-minList[i]) } return(temp) }   rgbeqstretch<-function(rgbDf){   temp<-rgbDf for(i in c(1:3)){ unique <- na.omit(temp[,i]) if (length(unique>0)){ ecdf<-ecdf(unique) temp[,i] <- apply(temp[,i,drop=FALSE],2,FUN=function(x) ecdf(x)) } } return(temp) }   if(is.na(npix)){ if(ncell(inRGBRst)>5000){ npix <- 5000 } else{ npix <- ncell(inRGBRst) } } x <- sampleRegular(inRGBRst, size=npix, asRaster = TRUE) dat <- as.data.frame(x, xy=TRUE) colnames(dat)[3:5]<-c('r','g','b')   if(scale=='lin'){ dat[,3:5]<- rgblinstretch(dat[,3:5]) } else if(scale=='stretch'){ dat[,3:5]<- rgbeqstretch(dat[,3:5]) }   p <- ggplot()+ geom_tile(data=dat, aes(x=x, y=y, fill=rgb(r,g,b))) + scale_fill_identity()   }   b <- (brick(system.file("external/rlogo.grd", package="raster")))*6-600 print(rggbplot(b))```

25 Dec

Merry GISmas dearest followers!
A couple of weeks ago, I made a post on heatmaps. Today I came across a nice answer on stackoverflow, it goes like this:

```# LOAD THE DATA stock <- "MSFT" start.date <- "2006-01-12" end.date <- Sys.Date() quote <- paste("http://ichart.finance.yahoo.com/table.csv?s=", stock, "&a=", substr(start.date,6,7), "&b=", substr(start.date, 9, 10), "&c=", substr(start.date, 1,4), "&d=", substr(end.date,6,7), "&e=", substr(end.date, 9, 10), "&f=", substr(end.date, 1,4), "&g=d&ignore=.csv", sep="") stock.data <- read.csv(quote, as.is=TRUE) stock.data <- transform(stock.data, week = as.POSIXlt(Date)\$yday %/% 7 + 1, wday = as.POSIXlt(Date)\$wday, year = as.POSIXlt(Date)\$year + 1900)```

The data has the following structure:

```> head(stock.data) Date Open High Low Close Volume Adj.Close week wday year 1 2014-12-24 48.64 48.64 48.08 48.14 11437800 48.14 52 3 2014 2 2014-12-23 48.37 48.80 48.13 48.45 23648100 48.45 51 2 2014 3 2014-12-22 47.78 48.12 47.71 47.98 26566000 47.98 51 1 2014 4 2014-12-19 47.63 48.10 47.17 47.66 64551200 47.66 51 5 2014 5 2014-12-18 46.58 47.52 46.34 47.52 40105600 47.52 51 4 2014 6 2014-12-17 45.05 45.95 44.90 45.74 34970900 45.74 51 3 2014```

And now this is how we chart it:

```library(ggplot2) ggplot(stock.data, aes(week, wday, fill = Adj.Close)) + geom_tile(colour = "white") + scale_fill_gradientn('MSFT \n Adjusted Close',colours = c("#D61818","#FFAE63","#FFFFBD","#B5E384")) + facet_wrap(~ year, ncol = 1)```

Here is how it looks like

Calendar heatmap

05 Dec

Hi folks!

Wanna present your table in a nice way? Have you ever thought of heat maps?
Here is an example on how to use them:

```library(ggplot2) library(plyr) # might be not needed here anyway it is a must-have package I think in R library(reshape2) # to "melt" your dataset library (scales) # it has a "rescale" function which is needed in heatmaps library(RColorBrewer) # for convenience of heatmap colors, it reflects your mood sometimes nba <- read.csv("http://datasets.flowingdata.com/ppg2008.csv") head(nba) nba\$Name <- with(nba, reorder(Name, PTS)) nba.m <- melt(nba) nba.m <- ddply(nba.m, .(variable), transform,rescale = rescale(value)) p <- ggplot(nba.m, aes(variable, Name)) + geom_tile(aes(fill = rescale),colour = "white") + scale_fill_gradient2(low="#006400", mid="#f2f6c3",high="#cd0000",midpoint=0.5)```

The nba data.frame has the folowing structure:

```> head(nba) Name G MIN PTS FGM FGA FGP FTM FTA FTP X3PM X3PA X3PP ORB DRB TRB AST STL BLK TO PF 1 Dwyane Wade 79 38.6 30.2 10.8 22.0 0.491 7.5 9.8 0.765 1.1 3.5 0.317 1.1 3.9 5.0 7.5 2.2 1.3 3.4 2.3 2 LeBron James 81 37.7 28.4 9.7 19.9 0.489 7.3 9.4 0.780 1.6 4.7 0.344 1.3 6.3 7.6 7.2 1.7 1.1 3.0 1.7 3 Kobe Bryant 82 36.2 26.8 9.8 20.9 0.467 5.9 6.9 0.856 1.4 4.1 0.351 1.1 4.1 5.2 4.9 1.5 0.5 2.6 2.3 4 Dirk Nowitzki 81 37.7 25.9 9.6 20.0 0.479 6.0 6.7 0.890 0.8 2.1 0.359 1.1 7.3 8.4 2.4 0.8 0.8 1.9 2.2 5 Danny Granger 67 36.2 25.8 8.5 19.1 0.447 6.0 6.9 0.878 2.7 6.7 0.404 0.7 4.4 5.1 2.7 1.0 1.4 2.5 3.1 6 Kevin Durant 74 39.0 25.3 8.9 18.8 0.476 6.1 7.1 0.863 1.3 3.1 0.422 1.0 5.5 6.5 2.8 1.3 0.7 3.0 1.8```

And here is the output. Nice isn’t it?