01 Jun

Bland-Altman diagram with ggplot2

Hello world!

Today, I’d like to share with you a code for plotting Bland-Altman diagrams.
The Bland-Altman diagram (Bland & Altman, 1986 and 1999), or difference plot, is a graphical method to compare two measurements techniques. In this graphical method the differences (or alternatively the ratios) between the two techniques are plotted against the averages of the two techniques. Alternatively (Krouwer, 2008) the differences can be plotted against one of the two methods, if this method is a reference or “gold standard” method.
Horizontal lines are drawn at the mean difference, and at the limits of agreement, which are defined as the mean difference plus and minus 1.96 times the standard deviation of the differences.

Bland.Altman.diagram <-
  function(x,y,id=FALSE, alpha = .05,rep.meas = FALSE,subject,idname = FALSE, xname = 'x',yname = 'y',...) {
 
    library(ggplot2)
	  library(RColorBrewer)
 
    #*** 1. Set a few constants
    z <- qnorm(1 - alpha / 2)  ## value of z corresponding to alpha
    d <- x - y               ## pair-wise differences
    m <- (x + y) / 2           ## pair-wise means
 
    #*** 2. Calculate mean difference
    d.mn <- mean(d,na.rm = TRUE)
 
    #*** 3. Calculate difference standard deviation
    if (rep.meas == FALSE) {
      d.sd = sqrt(var(d,na.rm = TRUE))
    } else {
      #*** 3a. Ensure subject is a factor variable
      if (!is.factor(subject))
        subject <- as.factor(subject)
 
      #*** 3b. Extract model information
      n <- length(levels(subject))      # Number of subjects
      model <- aov(d ~ subject)           # One way analysis of variance
      MSB <- anova(model)[[3]][1]       # Degrees of Freedom
      MSW <- anova(model)[[3]][2]       # Sums of Squares
 
      #*** 3c. Calculate number of complete pairs for each subject
      pairs <- NULL
      for (i in 1:length(levels(as.factor(subject)))) {
        pairs[i] <- sum(is.na(d[subject == levels(subject)[i]]) == FALSE)
      }
      Sig.dl <-
        (MSB - MSW) / ((sum(pairs) ^ 2 - sum(pairs ^ 2)) / ((n - 1) * sum(pairs)))
      d.sd <- sqrt(Sig.dl + MSW)
    }
 
    #*** 4. Calculate lower and upper confidence limits
    ucl <- d.mn + z * d.sd
    lcl <- d.mn - z * d.sd
    print(d.mn)
    print(ucl)
    print(lcl)
 
    #*** 5. Make Plot
    xlabstr = paste(c('Mean(',xname,', ',yname,')'), sep = "",collapse = "")
    ylabstr = paste(c(xname, ' - ', yname), sep = "",collapse = "")
    id = ifelse(id==FALSE, as.factor(rep(1,length(m))), as.factor(id))
    xy <- data.frame(m,d,id)
    xy$id <- as.factor(xy$id)
    maxm <- max(m)
 
 
    #scatterplot of x and y variables
    scatter <- ggplot(xy,aes(m, d)) 
 
 
    if(idname == FALSE){
      scatter <- scatter + geom_point(color='darkgrey') + scale_color_manual( guide=FALSE)
    } else {
      scatter <- scatter + geom_point(aes(color = id))+ scale_color_brewer(idname, palette="Set1") # Set1
    }
 
    scatter <- scatter + geom_abline(intercept = d.mn,slope = 0, colour = "black",size = 1,linetype = "dashed" ) +
      geom_abline(
        intercept = ucl,slope = 0,colour = "black",size = 1,linetype = "dotted"
      ) +
      geom_abline(
        intercept = lcl,slope = 0,colour = "black",size = 1,linetype = "dotted"
      ) +
      ylim(min(lcl, -max(xy$d), min(xy$d)), max(ucl, max(xy$d), -min(xy$d)))+
      xlab(xlabstr) +
      ylab(ylabstr) +
      #geom_text(x=maxm-2.4,y=d.mn,label='Mean',colour='black',size=4)+
      #geom_text(x=maxm-2.4,y=ucl,label='+1.96 SD',colour='black',size=4)+
      #geom_text(x=maxm-2.4,y=lcl,label='-1.96 SD',colour='black',size=4)+
      theme(
        plot.background = element_blank(),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        axis.line = element_line(color="black", size = 1)
    )
 
    plot(scatter)
 
    values <- round(cbind(lcl,d.mn,ucl),4)
    colnames(values) <- c("LCL","Mean","UCL")
    if (rep.meas == FALSE){
      Output <- list(limits = values,Var = d.sd ^ 2)
    } else {
      Output <- list(limits = values,Var = Sig.dl,MSB = MSB,MSW = MSW)
    }
    return(Output)
  }
 
A <- c(-0.358, 0.788, 1.23, -0.338, -0.789, -0.255, 0.645, 0.506, 
       0.774, -0.511, -0.517, -0.391, 0.681, -2.037, 2.019, -0.447, 
       0.122, -0.412, 1.273, -2.165)
B <- c(0.121, 1.322, 1.929, -0.339, -0.515, -0.029, 1.322, 0.951, 
       0.799, -0.306, -0.158, 0.144, 1.132, -0.675, 2.534, -0.398, 0.537, 
       0.173, 1.508, -1.955)
 
Bland.Altman.diagram(A, B,alpha = .05,rep.meas = F,xname = 'A',yname = 'B')

The result is the following:
Rplot01

The ‘id’ field allows you to define different categories that will be colored differently on the graph:Rplot02

18 Apr

Online LaTeX diff tool

Hi PubliGISer,

You want to publish the revision of your awesome paper but you need a track change to meet reviewers’ requirements and you do not know how to get it? You are desperate and think that you will not be able to realize your dream of a successful research career?

Online LaTeX diff tool is the TOOL you are looking for.

Just copy paste the old and the new latex files and this tool will automatically compute the difference between the 2 documents.

Problem solved.

22 Mar

Download files from Dropbox via shareable link

Hello there!

I have a nice piece of code for today on how to download a file from a dropbox shareable link (I reckon it adapted slightly a code found here). Here is how it works. Argument x is the document name, d the document key, and outfile is the desired filename and location.

dl_from_dropbox <- function(x, key, outfile) {
  require(RCurl)
  bin <- getBinaryURL(paste0("https://dl.dropboxusercontent.com/s/", key, "/", x),
                      ssl.verifypeer = FALSE)
  con <- file(outfile, open = "wb")
  writeBin(bin, con)
  close(con)        
}
 
# Example:
dl_from_dropbox("GViewer_Embeds.txt", "06fqlz6gswj80nj", '/home/GViewer_Embeds.txt')
22 Nov

3D plot with a cut-off line

Did you ever notice that things IRL are in 3D? Well today, let’s do some 3D plots!

First, let’s define our three variables

library(rgl)
 
x <- 1:5/10
y <- 1:5
z <- x %o% y
z <- z + .2*z*runif(25) - .1*z

Now let’s suppose that we want to plot an additional line at a specific cut-off value of 0.6

cutoff <- 0.6

Let’s plot all that with the RGL package:

persp(x, y, z, theta=-35, phi=10,col="lightgrey",xlab="X factor", ylab="Why", zlab="Z-bra")
cLines <- contourLines(x,y,z,levels=c(cutoff))
lines(trans3d(x=cLines[[1]]$x, y=cLines[[1]]$y, z=cutoff,pmat=p ),col = 'red',lw=2,lt=2)

Here is the figure:
screenshot-from-2016-11-22-175501

10 Nov

Count points in polygons

Hi guys,

Here is a quick r-snippet to count points inside polygons:

library("raster")
 
x <- getData('GADM', country='ITA', level=1)
class(x)
# [1] "SpatialPolygonsDataFrame"
# attr(,"package")
# [1] "sp"
 
set.seed(1)
# sample random points
p <- spsample(x, n=300, type="random")
p <- SpatialPointsDataFrame(p, data.frame(id=1:300))
 
proj4string(x)
# [1] " +proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs +towgs84=0,0,0"
proj4string(p)
# [1] " +proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs +towgs84=0,0,0"
 
plot(x)
plot(p, col="red" , add=TRUE)

Here is the figure:
screenshot-from-2016-11-10-165353

res <- over(p, x)
table(res$NAME_1) # count points
#               Abruzzo                Apulia            Basilicata
#                    11                    20                     9
#              Calabria              Campania        Emilia-Romagna
#                    16                     8                    25
# Friuli-Venezia Giulia                 Lazio               Liguria
#                     7                    14                     7
#             Lombardia                Marche                Molise
#                    22                     4                     3
#              Piemonte              Sardegna                Sicily
#                    35                    18                    21
#               Toscana   Trentino-Alto Adige                Umbria
#                    33                    15                     6
#         Valle d'Aosta                Veneto
#                     4                    22
15 May

Load googlesheet data in R

Dear R-addict,

At the era of cloud-based computing, functions such as read.csv, read.xls or even fread are really old-fashioned.
Here is a chunk of code that will allow you to load data from a googlesheet:

 
install.packages("gsheet")
library(gsheet)
# Download a sheet
 
# Download a sheet
url <- 'https://docs.google.com/spreadsheets/d/1XBs7p44-djCPmN4TnPEgboUVAdB2mChbAlCjqnVOyQ0'
a <- gsheet2tbl(url)
12 May

Download daymet data with R

Hey folks ! It’s been a while !

Interested in meteorological data over US ? Here is a way to easily download Daymet data inspired from DaymetR package.

Daymet is a collection of algorithms and computer software designed to interpolate and extrapolate from daily meteorological observations to produce gridded estimates of daily weather parameters. Weather parameters generated include daily surfaces of minimum and maximum temperature, precipitation, humidity, and radiation produced on a 1 km x 1 km gridded surface over the conterminous United States, Mexico, and Southern Canada.

Download the “tiles” shapefile here.

setwd("/bite/guru-blard/DAYMET")
 
param = c("vp", "tmin", "tmax", "swe", "srad", "prcp", "dayl") %see here https://daymet.ornl.gov/overview.html for the available variable
tiles=c(11922:11925, 11742:11745) %id of the tiles of interest (cfr shapefile)
year_range=1985:2015
 
for (i in year_range) {
  for (j in tiles) {
    for (k in param) {
      download_string = sprintf("https://daymet.ornl.gov/thredds/fileServer/ornldaac/1219/tiles/%s/%s_%s/%s.nc", 
                                i, j, i, k)
      daymet_file = paste(k, "_", i, "_", j, ".nc", 
                          sep = "")
      cat(paste("Downloading DAYMET data for tile: ", 
                j, "; year: ", i, "; product: ", k, "\n", 
                sep = ""))
      try(download(download_string, daymet_file, quiet = TRUE, 
                   mode = "wb"), silent = FALSE)
    }
  }
}
10 Mar

Efficient manuscript revision with latexdiff

Hi there,

Here is a nice tip to highlight the revisions in your manuscripts before resubmitting to your favorite journals: use latexdiff!

latexdiff --type=CTRADITIONAL old_shitty_manuscript.tex new_shiny_manuscript.tex > diff.tex

You can then easily customize how differences appear in your pdf by changing the commands in the preamble.
Here are my settings:

\providecommand{\DIFadd}[1]{{\protect\color{blue} #1}} %DIF PREAMBLE
\providecommand{\DIFdel}[1]{} %DIF PREAMBLE

New text appears in blue and old text is simply removed.

19 Feb

Color a raster with gdal using gdaldem

Hi Gigis,
Have you never look for a simple way to color a raster without manualy modifying the symbology or use a python/r script to build a RAT?
Here is the solution for lazy gis people like me 😉
My example is a raster Geotif with 0 for land, 1 for water and 255 for no data. Most of the time, when you open it in a GIS, you have black overview because the contrast is adjusted on the 255 value. Here is how to proceed:

1/ Write a simple text file colortable.txt such as:
0 white
1 blue
255 grey
2/gdaldem color-relief -of VRT sourcefile.tif colortable.txt out.vrt destfile.tif

3/ Just open the VRT, and the magic is there!

Khan-Guru, the Raster Magician