20 Nov

Hey,

Today, a bit of trSIGonometry with this function to compute the angle/orientation/direction between two points of a GPX file.

```
############################
# Function
############################

angle <- function(pt1, pt2) {
  conv <- 180 / pi  # conversion radians -> degrees
  diff <- pt2 - pt1
  diff.abs <- abs(diff)
  if (diff[1] >= 0 & diff[2] >= 0) {angle <- pi/2 - atan(diff.abs[2] / diff.abs[1])}
  if (diff[1] >= 0 & diff[2] <= 0) {angle <- pi/2 + atan(diff.abs[2] / diff.abs[1])}
  if (diff[1] <= 0 & diff[2] >= 0) {angle <- 3*pi/2 + atan(diff.abs[2] / diff.abs[1])}
  if (diff[1] <= 0 & diff[2] <= 0) {angle <- pi + (pi/2 - atan(diff.abs[2] / diff.abs[1]))}
  return(angle * conv)
}

############################
# Script
############################

library(rgdal)
library(raster)
library(data.table)

# Set folder
setwd("/home/GuruBlard/gpx/")

# Load file names
filenames <- list.files(pattern = ".gpx$")

# Iterate through all gpx files
for (filename in filenames) {
  # Load the GPX
  # ogrListLayers("export.gpx")
  gpx.trackpoints <- readOGR(dsn = filename, layer = "track_points", stringsAsFactors = FALSE)

  # Extract the coordinates as a matrix
  gpx.tp <- coordinates(gpx.trackpoints)

  # Loop through and add the angle between the current point and the following one
  for (tp in 1:(nrow(gpx.tp) - 1)) {
    gpx.trackpoints$angle[tp] <- angle(gpx.tp[tp, ], gpx.tp[tp + 1, ])
  }

  gpx.trackpoints@data$Date <- substr(as.character(gpx.trackpoints@data$time), 1, 10)
  gpx.trackpoints@data$Hour <- substr(as.character(gpx.trackpoints@data$time), 12, 19)
  gpx.trackpoints@data$filename <- filename
  writeOGR(gpx.trackpoints,
           paste0(tools::file_path_sans_ext(filename), ".shp"),
           layer = tools::file_path_sans_ext(filename),
           driver = "ESRI Shapefile")
}
```
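As an aside, the four-quadrant `if` chain above is exactly what a single `atan2` call gives you. A minimal Python sketch of the same compass-bearing computation (the `bearing` helper is hypothetical, not part of the original script):

```python
import math

def bearing(pt1, pt2):
    """Compass bearing in degrees (0 = north, 90 = east) from pt1 to pt2.

    Assumes planar x/y coordinates; for lon/lat over long distances a
    geodesic formula would be more appropriate.
    """
    dx = pt2[0] - pt1[0]  # easting difference
    dy = pt2[1] - pt1[1]  # northing difference
    # atan2 handles all four quadrants, replacing the if chain
    return (90 - math.degrees(math.atan2(dy, dx))) % 360

bearing((0, 0), (1, 1))  # north-east, roughly 45 degrees
```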

17 Jul

Holà GISpo,

Today, a simple function to (highly) speed up the standard Euclidean distance function in R when you work with large matrices (n > 100,000,000 entries).

```
euc_dist <- function(m) {
  mtm <- Matrix::tcrossprod(m)
  sq <- rowSums(m * m)
  sqrt(outer(sq, sq, "+") - 2 * mtm)
}

# Example
M <- matrix(rexp(1000000, rate = .1), ncol = 1000)

ptm <- proc.time()
d <- dist(M, method = "euclidean")
proc.time() - ptm
#  user  system elapsed
#  1.08    0.00    1.08

ptm <- proc.time()
d2 <- euc_dist(M)
d2 <- as.dist(d2)
proc.time() - ptm
#  user  system elapsed
# 0.424   0.008   0.429

isTRUE(all.equal(as.matrix(d), as.matrix(d2)))
# TRUE
```

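The trick rests on the identity ‖a − b‖² = ‖a‖² + ‖b‖² − 2·a·b, which lets all pairwise distances be computed from one matrix product instead of looping over pairs. A small pure-Python check of the identity (illustrative only; helper names are made up):

```python
import math

def euc_direct(a, b):
    """Textbook Euclidean distance."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def euc_expanded(a, b):
    """Same distance via ||a||^2 + ||b||^2 - 2*a.b, the form euc_dist vectorizes."""
    sq_a = sum(ai * ai for ai in a)
    sq_b = sum(bi * bi for bi in b)
    dot  = sum(ai * bi for ai, bi in zip(a, b))
    return math.sqrt(sq_a + sq_b - 2 * dot)

a, b = [1.0, 2.0, 3.0], [4.0, 6.0, 8.0]
# both forms give sqrt(9 + 16 + 25) = sqrt(50)
```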
15 May

In the era of cloud-based computing, functions such as read.csv, read.xls or even fread feel rather old-fashioned.
Here is a chunk of code that lets you load data straight from a Google Sheet:

```
install.packages("gsheet")
library(gsheet)

# Download a sheet
url <- 'https://docs.google.com/spreadsheets/d/1XBs7p44-djCPmN4TnPEgboUVAdB2mChbAlCjqnVOyQ0'
a <- gsheet2tbl(url)
```

28 Jan

Hi there, today an introduction to VRT, a magic tool, by Matthew T. Perry. Post from here.

Lazy Raster Processing with GDAL VRTs.
No, not lazy as in REST … lazy as in “lazy evaluation”:

In computer programming, lazy evaluation is the technique of delaying a computation until the result is required.
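The same idea in ordinary code: a Python generator describes a computation but runs nothing until a result is actually consumed (a toy illustration of lazy evaluation, unrelated to GDAL itself):

```python
def expensive_square(x, log):
    """Stand-in for real work; records each invocation in `log`."""
    log.append(x)
    return x * x

log = []
# Building the pipeline is instant: nothing has been computed yet
pipeline = (expensive_square(x, log) for x in range(1_000_000))
assert log == []

# Work happens only when a value is finally requested
first = next(pipeline)
assert log == [0]
```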

Take an example raster processing workflow to go from a bunch of tiled, latlong, GeoTiff digital elevation models to a single shaded relief GeoTiff in projected space:

1) Merge the tiles together
2) Reproject the merged DEM (using bilinear or cubic interpolation)
3) Generate the hillshade from the merged DEM

Simple enough to do with GDAL tools on the command line. Here’s the typical, process-as-you-go implementation:

```
gdal_merge.py -of GTiff -o srtm_merged.tif srtm_12_*.tif
gdalwarp -t_srs epsg:3310 -r bilinear -of GTiff srtm_merged.tif srtm_merged_3310.tif
gdaldem hillshade srtm_merged_3310.tif srtm_merged_3310_shade.tif -of GTiff
```

Alternately, we can simulate lazy evaluation by using GDAL Virtual Rasters (VRT) to perform the intermediate steps, only outputting the GeoTiff as the final step.

```
gdalbuildvrt srtm_merged.vrt srtm_12_0*.tif
gdalwarp -t_srs epsg:3310 -r bilinear -of VRT srtm_merged.vrt srtm_merged_3310.vrt
gdaldem hillshade srtm_merged_3310.vrt srtm_merged_3310_shade2.tif -of GTiff
```

So what’s the advantage of doing it the VRT way? Both methods produce exactly the same output raster. Let’s compare:

The lazy VRT method delays all the computationally intensive processing until it is actually required. The intermediate files, instead of containing the raw raster output of the actual computation, are XML files which contain the instructions to get the desired output. This allows GDAL to do all the processing in one go (the final step #3). The total processing time is not significantly different between the two methods, but in terms of the productivity of the GIS analyst, the VRT method is superior. Imagine working with datasets 1000x this size with many more steps: having to type a command, wait two hours, type the next, and so on would be a waste of human resources, versus assembling the instructions into VRTs and then kicking off the final processing step as you leave the office for a long weekend.

Additionally, the VRT method produces only small intermediate XML files, instead of a potentially huge data-management nightmare of shuffling around GB (or TB) of intermediate outputs! Plus, those XML files serve as an excellent piece of metadata describing the exact processing steps, which you can refer to later or adapt to different datasets.

So next time you have a multi-step raster workflow, use the GDAL VRTs to your full advantage – you’ll save yourself time and disk space by being lazy.

13 Jan

Hey GISette! What’s up?

Here is a simple way to sample a raster as a Raster object or a SpatialPointsDataFrame using the raster library in R.

```
library(raster)
r <- raster(system.file("external/test.grd", package = "raster"))
plot(r)

x <- sampleRandom(r, size = 1000, asRaster = TRUE)
plot(x)

y <- sampleRandom(r, size = 100, sp = TRUE)
plot(r)
plot(y, add = TRUE, pch = 20, col = "black")
```

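Under the hood, random raster sampling boils down to drawing cell indices without replacement and turning them back into row/column positions. The core idea in plain Python (a hypothetical helper, not the raster package internals):

```python
import random

def sample_cells(nrow, ncol, size, seed=None):
    """Draw `size` distinct (row, col) cells from an nrow x ncol grid."""
    rng = random.Random(seed)
    idx = rng.sample(range(nrow * ncol), size)   # without replacement
    return [(i // ncol, i % ncol) for i in idx]

cells = sample_cells(115, 80, 100, seed=42)
```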
09 Dec

Hello GI(uy)s!

What’s up? Here is a really efficient function (developed by mrdwab) to perform stratified random sampling on data.table objects in R. Enjoy!

```
#' @title Stratified random sampling
#'
#' @description Performs stratified random sampling on a data.frame or data.table object
#'
#' @param group The grouping column(s). Can be a character vector or the numeric positions of the columns.
#' @param size The desired sample size. Can be a decimal (proportionate by group) or an integer (same number of samples per group).
#' @param select A named list with optional subsetting statements.
#' @param replace Logical. Should sampling be done with or without replacement.
#' @param bothSets Logical. Should a list be returned. Useful when setting up a "testing" and "training" sampling setup.
#' @return a data.table object
#' @author mrdwab, source: https://gist.github.com/933ffeaa7a1d718bd10a.git
#' @import data.table

stratifiedDT <- function(indt, group, size, select = NULL,
                         replace = FALSE, keep.rownames = FALSE,
                         bothSets = FALSE) {
  if (is.numeric(group)) group <- names(indt)[group]
  if (!is.data.table(indt)) indt <- as.data.table(indt, keep.rownames = keep.rownames)
  if (is.null(select)) {
    indt <- indt
  } else {
    if (is.null(names(select))) stop("'select' must be a named list")
    if (!all(names(select) %in% names(indt)))
      stop("Please verify your 'select' argument")
    temp <- vapply(names(select), function(x)
      indt[[x]] %in% select[[x]], logical(nrow(indt)))
    indt <- indt[rowSums(temp) == length(select), ]
  }
  df.table <- indt[, .N, by = group]
  df.table
  if (length(size) > 1) {
    if (length(size) != nrow(df.table))
      stop("Number of groups is ", nrow(df.table),
           " but number of sizes supplied is ", length(size))
    if (is.null(names(size))) {
      stop("size should be entered as a named vector")
    } else {
      ifelse(all(names(size) %in% do.call(paste, df.table[, group, with = FALSE])),
             n <- merge(df.table,
                        setnames(data.table(names(size), ss = size), c(group, "ss")),
                        by = group),
             stop("Named vector supplied with names ",
                  paste(names(size), collapse = ", "),
                  "\n but the names for the group levels are ",
                  do.call(paste, c(unique(df.table[, group, with = FALSE]), collapse = ", "))))
    }
  } else if (size < 1) {
    n <- df.table[, ss := round(N * size, digits = 0)]
  } else if (size >= 1) {
    if (all(df.table$N >= size) || isTRUE(replace)) {
      n <- cbind(df.table, ss = size)
    } else {
      message("Some groups\n---",
              do.call(paste, c(df.table[df.table$N < size][, group, with = FALSE],
                               sep = ".", collapse = ", ")),
              "---\ncontain fewer observations",
              " than desired number of samples.\n",
              "All observations have been returned from those groups.")
      n <- cbind(df.table, ss = pmin(df.table$N, size))
    }
  }
  setkeyv(indt, group)
  setkeyv(n, group)
  indt[, .RNID := sequence(nrow(indt))]
  out1 <- indt[indt[n, list(.RNID = sample(.RNID, ss, replace)), by = .EACHI]$`.RNID`]

  if (isTRUE(bothSets)) {
    out2 <- indt[!.RNID %in% out1$`.RNID`]
    indt[, .RNID := NULL]
    out1[, .RNID := NULL]
    out2[, .RNID := NULL]
    list(SAMP1 = out1, SAMP2 = out2)
  } else {
    indt[, .RNID := NULL]
    out1[, .RNID := NULL][]
  }
}
```

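The core of the function: count rows per group, decide a per-group sample size `ss` (a fraction of the group if `size < 1`, a fixed count otherwise), then sample that many rows within each group. The same logic in a few lines of plain Python (a conceptual sketch, far less featureful than stratifiedDT):

```python
import random
from collections import defaultdict

def stratified_sample(rows, key, size, seed=None):
    """Sample `size` rows per group (or that fraction if size < 1), without replacement."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:                     # count/collect rows per group
        groups[row[key]].append(row)
    out = []
    for members in groups.values():
        # per-group sample size, mirroring the ss column in stratifiedDT
        ss = max(1, round(len(members) * size)) if size < 1 else min(size, len(members))
        out.extend(rng.sample(members, ss))
    return out

rows = [{"grp": g, "id": i} for g in "ab" for i in range(10)]
samp = stratified_sample(rows, "grp", 3, seed=1)
```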
04 Dec

Good Evening, fellow GIS analysts!

Here is a quick function to randomly draw a set of points over your area of interest, say for an accuracy assessment.

First the function:

```
randSamp <- function(n, ext, filename = '') {
  if (class(ext) == 'Extent') {
    x <- runif(n, ext@xmin, ext@xmax)
    y <- runif(n, ext@ymin, ext@ymax)
    s <- SpatialPoints(cbind(x, y))
  } else if (class(ext) %in% c('RasterLayer', 'RasterStack', 'RasterBrick')) {
    x <- runif(n, extent(ext)@xmin, extent(ext)@xmax)
    y <- runif(n, extent(ext)@ymin, extent(ext)@ymax)
    s <- SpatialPoints(cbind(x, y))
    proj4string(s) <- crs(ext)
  } else if (class(ext) %in% c('SpatialPolygonsDataFrame', 'SpatialPolygons')) {
    x <- c()
    y <- c()
    # Rejection sampling: draw in the bounding box, keep the points that fall in a polygon
    while (length(x) < n) {
      xl <- runif(2 * n, extent(ext)@xmin, extent(ext)@xmax)
      yl <- runif(2 * n, extent(ext)@ymin, extent(ext)@ymax)
      sl <- SpatialPoints(cbind(xl, yl))
      proj4string(sl) <- proj4string(ext)
      point_in_pol <- over(sl, ext, fn = NULL, returnList = FALSE)
      # over() returns a vector for SpatialPolygons, a data.frame for SpatialPolygonsDataFrame
      if (is.null(dim(point_in_pol))) {
        idl <- which(!is.na(point_in_pol))
      } else {
        idl <- which(!apply(point_in_pol, 1, function(z) all(is.na(z))))
      }
      x <- c(x, xl[idl])
      y <- c(y, yl[idl])
    }
    id <- sample(1:length(x), n)
    s <- SpatialPoints(cbind(x[id], y[id]))
    proj4string(s) <- proj4string(ext)
  } else {
    stop('ext is not of class "Extent", "SpatialPolygonsDataFrame", "SpatialPolygons", ',
         '"RasterLayer", "RasterStack" or "RasterBrick". I cannot work with this.')
  }
  if (filename != '') {
    s_points_df <- SpatialPointsDataFrame(s, data = data.frame(ID = 1:n))
    shapefile(s_points_df, filename, overwrite = TRUE)
  }
  return(s)
}
```

And an example of how to use it:

```
library(sp)
library(rgdal)
library(raster)

set.seed(1)
dat <- matrix(stats::rnorm(2000), ncol = 2)
ch <- chull(dat)
coords <- dat[c(ch, ch[1]), ]  # closed polygon
ext <- SpatialPolygons(list(Polygons(list(Polygon(coords)), ID = 1)))
proj4string(ext) <- CRS("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0")
spsamp <- randSamp(10, ext)

# Now let's visualize this piece of art
plot(ext)
plot(spsamp, add = TRUE, col = 'red')
```
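The polygon branch of randSamp is plain rejection sampling: draw points in the bounding box and keep the ones that fall inside. A self-contained Python sketch with a ray-casting point-in-polygon test (illustrative only; this is not the sp/over machinery):

```python
import random

def point_in_polygon(px, py, poly):
    """Ray casting: count crossings of a horizontal ray from (px, py)."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > py) != (y2 > py):               # edge straddles the ray
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside

def rand_samp(n, poly, seed=None):
    """Rejection-sample n points uniformly inside the polygon."""
    rng = random.Random(seed)
    xs = [p[0] for p in poly]
    ys = [p[1] for p in poly]
    pts = []
    while len(pts) < n:
        x = rng.uniform(min(xs), max(xs))        # draw in the bounding box
        y = rng.uniform(min(ys), max(ys))
        if point_in_polygon(x, y, poly):         # keep only interior hits
            pts.append((x, y))
    return pts

triangle = [(0, 0), (4, 0), (0, 4)]
pts = rand_samp(10, triangle, seed=1)
```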

03 Dec

Here are two functions to stretch the histogram of your raster for better visualization.

```
#' @title Linear stretch of a multi-band raster.
#'
#' @description Applies a linear stretch to any multiband raster \code{\link{stack}} or \code{\link{brick}}
#'
#' @param img Raster* object
#' @param minmax user-defined minimum and maximum for the stretch (optional)
#' @return a Raster* object
#' @author White-Guru
#' @seealso \code{\link{calc}}
#' @import raster
#' @export
#'
linstretch <- function(img, minmax = NA) {
  if (any(is.na(minmax)))
    minmax <- c(min(getValues(img), na.rm = TRUE), max(getValues(img), na.rm = TRUE))
  temp <- calc(img, fun = function(x) (255 * (x - minmax[1])) / (minmax[2] - minmax[1]))
  # set all values below 0 or above 255 to 0 or 255
  temp[temp < 0] <- 0
  temp[temp > 255] <- 255
  return(temp)
}

#' @title Histogram equalization stretch of a multi-band raster.
#'
#' @description Applies a histogram equalization stretch to any multiband raster \code{\link{stack}} or \code{\link{brick}}
#'
#' @param img Raster* object
#' @return a Raster* object
#' @author White-Guru
#' @seealso \code{\link{calc}}
#' @import raster
#' @export
#'
eqstretch <- function(img) {
  vals <- na.omit(getValues(img))
  if (length(vals) > 0) {
    ecdf.fun <- ecdf(vals)
    out <- calc(img, fun = function(x) ecdf.fun(x) * 255)
  }
  return(out)
}
```
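In plain terms: the linear stretch maps [min, max] onto [0, 255] and clips, while the equalization stretch pushes each value through the empirical CDF. A stdlib-Python sketch of both on a flat list of values (the `_vals` helper names are made up for illustration):

```python
import bisect

def linstretch_vals(vals, minmax=None):
    """Map [min, max] linearly onto [0, 255] and clip."""
    lo, hi = minmax if minmax else (min(vals), max(vals))
    out = [255 * (v - lo) / (hi - lo) for v in vals]
    return [min(255.0, max(0.0, v)) for v in out]

def eqstretch_vals(vals):
    """Push each value through the empirical CDF, scaled to [0, 255]."""
    s = sorted(vals)
    n = len(s)
    # ecdf(v) = fraction of values <= v, as in R's ecdf()
    return [255 * bisect.bisect_right(s, v) / n for v in vals]

vals = [10, 20, 20, 30, 100]
lin = linstretch_vals(vals)
eq = eqstretch_vals(vals)
```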

And here is a dummy example of their effect:

```
library(raster)
library(RColorBrewer)

# import a demo raster
b <- (brick(system.file("external/rlogo.grd", package = "raster"))) * 6 - 600
plotRGB(linstretch(b))

b4 <- subset(b, 1)

nf <- layout(matrix(seq(1, 8), byrow = TRUE, nrow = 2))
par(mar = c(3, 3, 1, 1), oma = rep(0, 4))
myramp <- colorRampPalette(brewer.pal(5, "PiYG"))  # custom ramp from RColorBrewer

# no stretch
# BAD hack in order to get plot(*Raster) to look like it didn't apply a Min/Max stretch
temp <- b4
temp[1] <- 0
temp[2] <- 255
plot(temp, col = myramp(255))  # plot with a different color map
hist(getValues(b4), freq = FALSE, main = "Original raster")
rm(temp)

# min/max stretch and histogram
b4.minmax <- linstretch(b4)
plot(b4.minmax, col = myramp(255))
hist(getValues(b4.minmax), breaks = seq(0, 255), xlim = c(0, 255), freq = FALSE, main = "Linear Min/Max")
rm(b4.minmax)

# histogram equalization
plot(eqstretch(b4), col = myramp(255))
b4.ecdf <- ecdf(getValues(b4))
hist(getValues(calc(b4, fun = function(x) b4.ecdf(x) * 255)),
     breaks = seq(0, 255), xlim = c(0, 255), freq = FALSE, main = "Histogram Equalization")

# 2% linear stretch
b4quant <- quantile(getValues(b4), c(0.02, 0.98))  # 2% cutoffs
b4.linstretch <- linstretch(b4, minmax = c(b4quant[1], b4quant[2]))
plot(b4.linstretch, col = myramp(255))
hist(getValues(b4.linstretch), breaks = seq(0, 255), xlim = c(0, 255), freq = FALSE, main = "Linear 2%")
```

And here is the result:

03 Dec

Hi GISguys!

Here is a simple example of how to use the very fast gdal_calc.py to compute an NDWI, complete with stats, histogram and overviews.

```
file    <- 'my_input_file.tif'
outfile <- 'my_output_file.tif'

system(paste('gdal_calc.py',
             '-A', file, '--A_band=2',
             '-B', file, '--B_band=4',
             paste0('--outfile=', outfile),
             '--calc="(A.astype(float)-B)/(A.astype(float)+B)"',
             "--type='Float32'",
             '--co="COMPRESS=LZW"'))

system(paste('gdalinfo -hist -stats', outfile))

system(paste('gdaladdo', outfile, '2 4 8 16'))
```

Here is another version with a Byte file as output (values ranging from 0 to 255; faster and smaller files):

```
file    <- 'my_input_file.tif'
outfile <- 'my_output_file.tif'

system(paste('gdal_calc.py',
             '-A', file, '--A_band=2',
             '-B', file, '--B_band=4',
             paste0('--outfile=', outfile),
             '--calc="((((A.astype(float)-B)/(A.astype(float)+B))+1)*128)"',
             "--type='Byte'",
             '--co="COMPRESS=LZW"'))

system(paste('gdalinfo -hist -stats', outfile))

system(paste('gdaladdo', outfile, '2 4 8 16'))
```

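The Byte variant relies on the affine rescaling (NDWI + 1) × 128, which maps the theoretical NDWI range [−1, 1] onto [0, 256]; note that a pixel with NDWI exactly +1 would land on 256, just past the Byte range. A quick Python check of the mapping (the reflectance values are made up for illustration):

```python
def ndwi(green, nir):
    """Normalized difference water index from green and NIR reflectance."""
    return (green - nir) / (green + nir)

def to_byte(x):
    """Affine rescale from [-1, 1] toward byte range, clipped at 255."""
    return min(255, int((x + 1) * 128))

# water pixel: green above NIR -> positive NDWI, high byte value
w = ndwi(0.30, 0.05)
# vegetation pixel: NIR well above green -> negative NDWI, low byte value
v = ndwi(0.10, 0.40)
```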
03 Dec

Tired of specifying all the parameters for a graph you make very often?
Here is a dummy example showing how to define a plot function specifically for a class:

```
y18 <- c(1:3, 5, 4, 7:3, 2*(2:5), rep(10, 4))
x18 <- c(1:18)
(ys18 <- smooth.spline(x18, y18))
outl <- which(abs(y18 - ys18$y) > 3 * sd(abs(y18 - ys18$y)))
x <- list(y = y18, ys = ys18, outl = outl)
class(x) <- 'splint'

plot.splint <- function(x, ...) {
  plot(x$y, ...)
  lines(x$ys, col = 'blue')
  points(x$outl, x$y[x$outl], col = 'red', pch = 20)
}

plot(x, main = 'Temporal smoothing and outlier detection',
     xlab = 'Time', ylab = 'Vegetation Index')
```

And here is the result:
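The outlier rule used in the snippet above: flag points whose absolute residual from the smoothed series exceeds three standard deviations of the residuals. Here is that rule in stdlib Python, with a simple centered moving average standing in for smooth.spline (a toy substitute, not an equivalent smoother):

```python
from statistics import mean, stdev

def moving_average(y, k=1):
    """Centered moving average with window 2k+1 (edges use what is available)."""
    return [mean(y[max(0, i - k):i + k + 1]) for i in range(len(y))]

def outliers(y, smooth):
    """Indices where |residual| > 3 * sd(|residual|), as in the R snippet."""
    res = [abs(a - b) for a, b in zip(y, smooth)]
    cut = 3 * stdev(res)
    return [i for i, r in enumerate(res) if r > cut]

y = [1, 2, 3, 5, 4, 7, 6, 5, 4, 3, 50, 4, 6, 8, 10, 10, 10, 10]  # spike at index 10
out = outliers(y, moving_average(y))
```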