We might not always want to pass R code to R as strings from Clojure. Often, we have files containing R expressions, and we want to evaluate a whole file at once.
We can do this quite easily too. Let's see how.
We must first complete the recipe, Setting up R to talk to Clojure, and have Rserve running. We must also have the Clojure-specific parts of that recipe done and the connection to Rserve made.
Moreover, we'll need access to the java.io.File class:
(import '[java.io File])
We'll first define a function to make evaluating a file in R easier, and then we'll find a file and execute it:
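The function to evaluate a file of R code takes a filename and, optionally, a connection to the R server. It passes the file's absolute path to R's source function, and it returns whatever R does:
(defn r-source
  ([filename] (r-source filename *r-cxn*))
  ([filename r-cxn]
   ;; Build the R expression source("<absolute path>") and evaluate it.
   (.eval r-cxn
          (str "source(\""
               (.getAbsolutePath (File. filename))
               "\")"))))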
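The string concatenation in r-source assumes that the absolute path contains no backslashes or double quotes. R string literals treat the backslash as an escape character, so a Windows-style path would need escaping first. A minimal sketch of building the command separately, assuming a hypothetical helper named r-source-command that is not part of the original recipe, might look like this:

```clojure
(require '[clojure.string :as string])
(import '[java.io File])

(defn r-source-command
  "Hypothetical helper (not from the recipe): returns the R expression
  that sources the given file, escaping backslashes and double quotes
  so that the path survives inside an R string literal."
  [filename]
  (let [path    (.getAbsolutePath (File. filename))
        ;; Map each special character to its escaped form.
        escaped (string/escape path {\\ "\\\\", \" "\\\""})]
    (str "source(\"" escaped "\")")))
```

With this in place, the body of r-source could pass (r-source-command filename) to .eval instead of building the string inline.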
For example, suppose we have a file named chisqr-example.R that creates a random data table and performs a Χ² test on it:
dat <- data.frame(q1=sample(c("A","B","C"), size=1000, replace=TRUE),
                  sex=sample(c("M","F"), size=1000, replace=TRUE))
dtab <- with(dat, table(q1, sex))
(Xsq <- chisq.test(dtab))
Evaluating the file from the REPL and reading the results looks like this:
user=> (def x-sqr (.asList (r-source "chisqr-example.R")))
#'user/x-sqr
;; X-square
user=> (.. x-sqr (at 0) asList (at "statistic") asDouble)
0.2166086470268894
;; degrees of freedom
user=> (.. x-sqr (at 0) asList (at "parameter") asInteger)
2
;; p-value
user=> (.. x-sqr (at 0) asList (at "p.value") asDouble)
0.897354468808211
The most difficult part of this is dealing with the return value. After calling r-source, we convert the output to an R list. The first element of that list holds the test result, which we also convert to a list. From it, we pull the statistic item and convert it to a double; that's the Χ² value. The parameter item is the degrees of freedom, and the p.value item is the p-value for the test.
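The three lookups described above can be gathered into one function. The following is only a sketch: the name chisq-summary and the map keys are hypothetical, and it uses nothing beyond the at, asList, asDouble, and asInteger calls already shown in the REPL session.

```clojure
(defn chisq-summary
  "Hypothetical helper (not from the recipe): given the R list returned
  by (.asList (r-source ...)) for a script that ends in chisq.test,
  returns the main results as a Clojure map."
  [r-result]
  ;; The first element of the result is the named list of test fields.
  (let [fields (.. r-result (at 0) asList)]
    {:statistic (.. fields (at "statistic") asDouble)
     :df        (.. fields (at "parameter") asInteger)
     :p-value   (.. fields (at "p.value") asDouble)}))
```

Calling (chisq-summary x-sqr) would then return all three values at once. The function relies on reflection rather than type hints, which keeps it short at the cost of some speed.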
Generally, when I'm picking out the results from their Java data structures, the REPL and documentation are the biggest help. For example, the value x-sqr, when printed on the REPL, displays this:
user=> x-sqr
[#<REXPGenericVector org.rosuda.REngine.REXPGenericVector@4e2f1185+[9]named> #<REXPLogical org.rosuda.REngine.REXPLogical@43be5d17[1]>]
This tells me that the list's first item is a generic R vector and the second item is an R logical structure. Diving further into the first item shows the names of the members it contains:
user=> (.. x-sqr (at 0) asList names)
["statistic" "parameter" "p.value" "method" "data.name" "observed" "expected" "residuals" "stdres"]
This helps me pick out the values I'm looking for, and by using some test data and referring to the documentation for the data types, I can easily write the code that is required to dig down to the results.
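When exploring an unfamiliar result, it can help to see every named member at once. The sketch below, under the assumption that the list exposes its member names through the names field used in the REPL session and that every member is named, builds a Clojure map from each name to its still-unconverted R value. The function name named-list->map is hypothetical.

```clojure
(defn named-list->map
  "Hypothetical helper (not from the recipe): builds a map from member
  name to raw R value for a named R list. Assumes the list has a public
  names field and an at method that accepts a name, as used at the REPL."
  [r-list]
  (into {}
        (for [n (.-names r-list)]
          ;; Look up each member by name; conversion is left to the caller.
          [n (.at r-list n)])))
```

For example, (keys (named-list->map (.. x-sqr (at 0) asList))) should list the same member names that the REPL showed, and each value can then be converted with the method suited to its type, such as asDouble or asInteger.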
The documentation for R's Java data types is available at http://rforge.net/org/docs/index.html?org/rosuda/REngine/package-tree.html.