Thursday, 9 August 2007

R package installation and administration

A short list of basic but useful commands for managing
packages in R:

# install a package
install.packages("ROCR")
# show the installed version of a package
packageVersion("pamr")
# update a package
update.packages(oldPkgs = "Cairo")
# remove a package
remove.packages("RGtk2")
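
Before installing, it can be handy to check whether a package is already present. A minimal sketch, assuming a hypothetical helper named is_installed (not part of R itself) built on system.file(), which returns an empty string when the package cannot be found:

```r
# Hypothetical helper: TRUE if the package is found in the current library paths
is_installed <- function(pkg) {
  nzchar(system.file(package = pkg))
}

is_installed("stats")               # base package shipped with R
is_installed("notARealPackage123")  # made-up name, assumed not to exist

# install only when missing (needs a configured CRAN mirror)
# if (!is_installed("ROCR")) install.packages("ROCR")
```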

Friday, 3 August 2007

Sorting/ordering a data.frame by specific columns


x = rnorm(20)
y = sample(rep(1:2, each = 10))
z = sample(rep(1:4, 5))

data.df <- data.frame(values = x, labels.1 = y, labels.2 = z)
print(data.df)

# data ordered according to "labels.1" column
# and then "labels.2" column
nams <- c("labels.1", "labels.2")
data.df.sorted = data.df[do.call(order, data.df[nams]), ]
print(data.df.sorted)
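
The same order() call can mix ascending and descending keys. A small sketch on a fixed, made-up data frame (toy.df is only illustrative), sorting by labels.1 ascending and labels.2 descending by negating the numeric column:

```r
# made-up data, small enough to check by eye
toy.df <- data.frame(values   = c(0.5, -1.2, 0.3, 2.1, -0.7, 1.4),
                     labels.1 = c(2, 1, 2, 1, 1, 2),
                     labels.2 = c(1, 3, 4, 2, 4, 2))

# ascending on labels.1, descending on labels.2
toy.sorted <- toy.df[order(toy.df$labels.1, -toy.df$labels.2), ]
print(toy.sorted)
```

Negating a column only works for numeric keys; for character or factor columns, order(..., decreasing = TRUE) or rev() would be needed instead.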

Thursday, 2 August 2007

Receiver Operating Characteristic (ROC) curve with the ROCR and verification packages

The following VERY basic code shows how to plot a simple ROC curve with both the ROCR package and the verification package.

# place two plots side by side in the same figure
par(mfrow = c(1,2))
# plot a ROC curve for a single prediction run
# and color the curve according to cutoff.
library(ROCR)
data(ROCR.simple)
pred <- prediction(ROCR.simple$predictions, ROCR.simple$labels)
perf <- performance(pred,"tpr", "fpr")
plot(perf,colorize = TRUE)
# plot a ROC curve for a single prediction run
# with CI by bootstrapping and fitted curve
library(verification)
roc.plot(ROCR.simple$labels, ROCR.simple$predictions,
         xlab = "False positive rate", ylab = "True positive rate",
         main = NULL, CI = TRUE, n.boot = 100, plot = "both", binormal = TRUE)
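
To see what the curve represents, the ROC points can also be computed by hand in base R: each distinct score is used as a cutoff, and the true and false positive rates are recorded. The sketch below uses a tiny made-up data set and a hypothetical helper roc_points; it is only an illustration, not how ROCR or verification work internally.

```r
# Hypothetical helper: one (fpr, tpr) point per cutoff, labels coded 0/1
roc_points <- function(scores, labels) {
  # cutoffs from highest to lowest; Inf gives the (0, 0) corner
  cutoffs <- c(Inf, sort(unique(scores), decreasing = TRUE))
  tpr <- sapply(cutoffs, function(k) sum(scores >= k & labels == 1) / sum(labels == 1))
  fpr <- sapply(cutoffs, function(k) sum(scores >= k & labels == 0) / sum(labels == 0))
  data.frame(cutoff = cutoffs, fpr = fpr, tpr = tpr)
}

# made-up scores and 0/1 labels
scores <- c(0.9, 0.8, 0.7, 0.6, 0.4, 0.3)
labels <- c(1,   1,   0,   1,   0,   0)

roc <- roc_points(scores, labels)
# area under the curve by the trapezoidal rule
auc <- sum(diff(roc$fpr) * (head(roc$tpr, -1) + tail(roc$tpr, -1)) / 2)
print(roc)
print(auc)  # 8/9 for this toy data
```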



Monday, 30 July 2007

screen - another VERY useful Unix tool

From R News Vol. 7/1, April 2007 (http://cran.r-project.org/doc/Rnews/Rnews_2007-1.pdf):
If you need to run R code that executes for long periods of time on remote machines, this amazing Unix tool will become your best friend!
screen is a so-called terminal multiplexor, which allows us to create, shuffle, share, and suspend command line sessions within one window. It provides protection against disconnections and the flexibility to retrieve command line sessions remotely.

Getting started with this utility is as easy as ABC:

  1. Log in to the remote server
  2. Run screen
  3. Run R and the long calculation
  4. Detach screen (Ctrl-a, Ctrl-d)
  5. Log out

The R session continues working in the background, contained within the screen session. If we want to revisit the session to check its progress, then we:

  1. Log in remotely via secure shell
  2. Start screen -r, which reattaches the detached session
  3. Examine how your calculation/script is performing
  4. Detach the screen session (Ctrl-a, Ctrl-d)
  5. Log out

This procedure works, of course, for any Unix program or command you need to run: simply replace the R command with the command line program of your choice (for example python).

As usual in shell-space, invoking man (man screen in this case) will provide all sorts of information you need to know about the tool.

Monday, 16 July 2007

Upgrading R on Windows© revisited

From the list:
When I update R the following has worked for me (Windows XP)
1. Install the new version to a new directory (say C:\Program Files\R\R-2.5.1).
2. Rename the new library subdirectory to library2.
3. Copy the entire contents of the old library subdirectory (say
C:\Program Files\R\R-2.4.0\library\) to the new R root to create
C:\Program Files\R\R-2.5.1\library\ .
4. Copy the contents of library2 to library to update your basic library.
5. Now start your new version of R and update packages from the GUI or
from the R console. (You may need to first check Rprofile.site to
ensure that no packages have been loaded.)
6. On occasion I have got warning messages when I tried to load
packages after this procedure. This has been cleared by running
update.packages(checkBuilt = TRUE)
This checks that your packages have been built with the latest
version. When I do this I agree to install all available updates.
7. You may wish to copy various autoloads etc from your old
Rprofile.site to your new Rprofile.site. I understand that there are
some compatibility problems with 2.5.1 and SciViews so be careful.
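
An alternative to copying library directories is to record the names of the installed packages under the old version and reinstall whatever is missing under the new one. This is a different technique from the list posting above, sketched with two hypothetical helpers (save_package_list, missing_packages) and a temporary file standing in for a path of your choice:

```r
# run under the OLD version: write the installed package names to a file
save_package_list <- function(path) {
  pkgs <- rownames(installed.packages())
  saveRDS(pkgs, path)
  invisible(pkgs)
}

# run under the NEW version: which recorded packages are not installed yet?
missing_packages <- function(path) {
  setdiff(readRDS(path), rownames(installed.packages()))
}

pkg.file <- tempfile(fileext = ".rds")  # assumption: use a permanent path in practice
save_package_list(pkg.file)
to.install <- missing_packages(pkg.file)
print(to.install)

# reinstall them (needs a configured CRAN mirror)
# if (length(to.install) > 0) install.packages(to.install)
```

Here both steps run in the same R installation, so nothing is reported as missing; across two versions the second step would list the packages still to be installed.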

Wednesday, 20 June 2007

String manipulation: inserting delimiters

From the list, as usual:

I want to be able to insert delimiters, say commas, into a string
of characters at uneven intervals such that:

foo <- c("haveaniceday")  # my string of characters
bar <- c(4, 1, 4, 3)      # my vector of uneven intervals
my.fun(foo, bar)          # some function that places delimiters appropriately
have,a,nice,day           # what the function would ideally return


1)

paste(read.fwf(textConnection(foo), bar, as.is = TRUE), collapse = ",")
[1] "have,a,nice,day"


2)

my.function <- function(foo, bar) {
  # construct a matrix with start/end character positions
  start <- head(cumsum(c(1, bar)), -1)  # drop the last one
  sel <- cbind(start = start, end = start + bar - 1)
  # extract each piece and join the pieces with commas
  strings <- apply(sel, 1, function(x) substr(foo, x[1], x[2]))
  paste(strings, collapse = ',')
}

my.function(foo, bar)
[1] "have,a,nice,day"
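
Since substring() is vectorized over its start and end positions, the same result can be obtained without building a matrix or calling apply(). A possible variant, with the hypothetical name insert.delim:

```r
# Hypothetical variant: split x into pieces of the given widths, join with delim
insert.delim <- function(x, widths, delim = ",") {
  end   <- cumsum(widths)      # last character of each piece
  start <- end - widths + 1    # first character of each piece
  paste(substring(x, start, end), collapse = delim)
}

foo <- "haveaniceday"
bar <- c(4, 1, 4, 3)
insert.delim(foo, bar)
```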

Friday, 8 June 2007

Back to back histogram

library(Hmisc)
age <- rnorm(1000,50,10)
sex <- sample(c('female','male'),1000,TRUE)
out <- histbackback(split(age, sex), probability=TRUE, xlim=c(-.06,.06), main = 'Back to Back Histogram')
# redraw the two halves on top of the plot, just to add color
barplot(-out$left, col="red" , horiz=TRUE, space=0, add=TRUE, axes=FALSE)
barplot(out$right, col="blue", horiz=TRUE, space=0, add=TRUE, axes=FALSE)