
#+OPTIONS:    H:3 num:nil toc:t \n:nil @:t ::t |:t ^:t -:t f:t *:t TeX:t LaTeX:t skip:nil d:(HIDE) tags:not-in-toc
#+TITLE: rorg --- Code evaluation in org-mode, with an emphasis on R
#+SEQ_TODO:  TODO PROPOSED | DONE DROPPED MAYBE
#+STARTUP: oddeven

* Overview (Dan [2009-02-08 Sun])
** Project objectives
This project is basically about putting source code into org
files. This isn't just code to look pretty as a source code example,
but code to be evaluated. Org files have 3 main export targets: org,
html and latex. Thus the aim of this project is to produce files in
those formats that have benefitted in some way from the evaluation of
source code that is present in the source org file. We have a current
focus on R code, but we are regarding that more as a working example
than as a defining feature of the project.

Code evaluation can have four relevant consequences. Our aim is to
deal with these consequences as follows:

*** It produces text/numeric output
    We (optionally) incorporate the text output as text in the target
    document
*** It produces graphical output
    We either link to the graphics or (html/latex) include them inline.
*** It creates some non-graphics files
    ? We link to other file output
*** It alters the environment by side effect in some other way
    We bear this in mind

** Implementation questions
   These objectives raise three questions:

 1. How is the code placed in the org file?
 2. When is the code evaluated?
 3. What is the result of code evaluation?

*** How is the code placed in the org file?
    Using some version of the code block ideas that Eric and Austin
    have worked on. (In addition, an aim of org-R was to allow Org
    users who are not R users to specify R code implicitly, using
    native org syntax. I'd like to maintain that, but it's not central
    to this project.)

*** When is the code evaluated?
    Let's use an asterisk to indicate content which includes the
    *result* of code evaluation, rather than the code itself. Clearly
    we have a requirement for the following transformation:

    org \to org*

    Let's say this transformation is effected by a function
    `org-eval-buffer'. This transformation is necessary when the
    target format is org (say you want to update the values in an org
    table, or generate a plot and create an org link to it), and it
    can also be used as the first step by which to reach html and
    latex:

    org \to org* \to html

    org \to org* \to latex

    Thus in principle we can reach our 3 target formats with
    `org-eval-buffer', `org-export-as-latex' and `org-export-as-html'.

    An extra transformation that we might want is

    org \to latex

    I.e. export to latex without evaluation of code, in such a way that R
    code can subsequently be evaluated using
    =Sweave(driver=RweaveLatex)=, which is what the R community is
    used to. This would provide a `bail out' avenue where users can
    escape org mode and enter a workflow in which the latex/noweb file
    is treated as source.
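
    As a rough illustration of the org \to org* step, here is a minimal
    sketch in R. The function name =org_eval_lines= and the choice to
    write results as fixed-width (=: =) lines after each block are
    assumptions made for this example; they do not describe the
    behaviour of any of the existing tools.

#+begin_src R
## Hypothetical helper (not part of org-R, RweaveOrg or org-exp-blocks):
## evaluate each R source block found in a character vector of org lines
## and insert the printed output after the block.
org_eval_lines <- function(lines, env = new.env()) {
  is_begin <- grepl("^[[:space:]]*#\\+begin_src[[:space:]]+r([[:space:]]|$)",
                    lines, ignore.case = TRUE)
  is_end <- grepl("^[[:space:]]*#\\+end_src", lines, ignore.case = TRUE)

  ## Evaluate the code as the R console would, printing visible values.
  run <- function(code) {
    exprs <- parse(text = code)
    for (i in seq_along(exprs)) {
      res <- withVisible(eval(exprs[[i]], env))
      if (res$visible) print(res$value)
    }
    invisible(NULL)
  }

  out <- character(0)
  pos <- 1
  for (b in which(is_begin)) {
    if (b < pos) next  # begin marker inside a block already handled
    e <- which(is_end & seq_along(lines) > b)[1]
    if (is.na(e)) stop("unterminated source block at line ", b)
    code <- if (e > b + 1) lines[(b + 1):(e - 1)] else character(0)
    result <- capture.output(run(code))
    ## Copy everything up to the end of the block, then the results.
    out <- c(out, lines[pos:e], if (length(result)) paste(":", result))
    pos <- e + 1
  }
  if (pos <= length(lines)) out <- c(out, lines[pos:length(lines)])
  out
}

org <- c("Some text.",
         "#+begin_src R",
         "a <- 3",
         "a",
         "#+end_src",
         "More text.")
writeLines(org_eval_lines(org))
#+end_src

    All blocks share one environment here, so later blocks can see
    objects created by earlier ones. Whether that is desirable is a
    design decision this sketch does not settle.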

**** How do we implement `org-eval-buffer'?

     AIUI, the following can all be viewed as implementations of
     org-eval-buffer for R code:

***** org-eval-light
      This is the beginning of a general evaluation mechanism that
      could evaluate python, ruby, shell and perl in addition to R.
      The header says it's based on org-eval; what is org-eval?

***** org-R
      This accomplishes org \to org* in elisp by visiting code blocks
      and evaluating code using ESS.

***** RweaveOrg
      This accomplishes org \to org* using R via

: Sweave("file-with-unevaluated-code.org", driver=RweaveOrg, syntax=SweaveSyntaxOrg)

***** org-exp-blocks.el
      Like org-R, this achieves org \to org* in elisp by visiting code
      blocks and using ESS to evaluate R code.


*** What is the result of code evaluation?
    Here we have to consider text/numeric output and graphical
    output, as well as the stage at which evaluation occurs.
***** org \to org*
****** Text / numerical output
       In the case of org \to org*, I would argue that, where
       appropriate, this output should be stored in org tables. Thus an
       advantage our project would have over Sweave is that tabular
       output is automatically converted to native tables on export to
       HTML and latex.
****** Graphical output
       We place an org link to the file. This is done already by
       org-R-apply, and by RweaveOrg.
***** latex \to latex*
      This is done by Sweave(driver=RweaveLatex) and so is out of our hands.

* Commentary

** Eric <2009-02-06 Fri 15:41>
I think we're getting close to a comprehensive set of objectives
(although since you two are the real R users, I leave that decision up
to you).  Once we've agreed on a set of objectives and agreed on at
least the broad strokes of implementation, I think we should start
listing out and assigning tasks.


* Objectives
** Send data to R from org
   Org-mode includes orgtbl-mode, an extremely convenient way of using
   tabular data in a plain text file.  Currently, spreadsheet
   functionality is available in org tables using the emacs package
   calc.  It would be a boon both to org users and R users to allow
   org tables to be manipulated with the R programming language.  Org
   tables give R users an easy way to enter and display data; R gives
   org users a powerful way to perform vector operations, statistical
   tests, and visualization on their tables.

*** Implementations
**** naive
     A naive implementation would be to use =(org-table-export "tmp.csv")=
     and =(ess-execute "read.csv('tmp.csv')")=.
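
     The R half of this round trip might look like the following
     sketch. The data are made up, and a temporary file stands in for
     the CSV file that the table export would write.

#+begin_src R
## Stand-in for the CSV file written by the org table export
## (made-up data, for illustration only).
csv_file <- tempfile(fileext = ".csv")
writeLines(c("x,y", "1,2.5", "2,3.5", "3,4.5"), csv_file)

## Read the exported table; the first row supplies the column names.
tbl <- read.csv(csv_file)
str(tbl)

## The table is now an ordinary data frame.
colMeans(tbl)
#+end_src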
**** org-R
     org-R passes data to R from two sources: org tables, or csv
     files. Org tables are first exported to a temporary csv file
     using [[file:existing_tools/org-R.el::defun%20org%20R%20export%20to%20csv%20csv%20file%20options][org-R-export-to-csv]].
**** org-exp-blocks
     org-exp-blocks uses [[org-interblock-R-command-to-string]] to send
     commands to an R process running in a comint buffer through ESS.
     org-exp-blocks has no support for dumping table data to the R
     process, or vice versa.

**** RweaveOrg
     NA


** Evaluate R code from org and deal with output appropriately
*** vector output
    When R code evaluation generates vectors and 2-dimensional arrays,
    these should be formatted appropriately in org buffers
    (orgtbl-mode) as well as in export targets (html, latex).

    Agreed. If we can convert the vector data to lists then we can use
    the many orgtbl-to-* functions to convert the list to whatever
    output format we desire. See `orgtbl-to-orgtbl', `orgtbl-to-latex',
    `orgtbl-to-html', `orgtbl-to-csv', etc...
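
    The conversion could also start on the R side. The following is a
    minimal sketch, assuming we are happy to emit unaligned org table
    lines and let org re-align them; =as_org_table= is a hypothetical
    helper, not an existing function in R or in any of the tools
    discussed here.

#+begin_src R
## Hypothetical helper: render a vector, matrix or data frame as the
## lines of an org table (unaligned; org re-aligns the table on TAB).
as_org_table <- function(x) {
  df <- as.data.frame(x, stringsAsFactors = FALSE)
  ## Convert every column to character so that all cells can be pasted.
  cells <- do.call(cbind, lapply(df, as.character))
  row_line <- function(r) paste0("| ", paste(r, collapse = " | "), " |")
  hline <- paste0("|", paste(rep("---", ncol(df)), collapse = "+"), "|")
  c(row_line(names(df)), hline, apply(cells, 1, row_line))
}

m <- matrix(1:6, nrow = 2, dimnames = list(NULL, c("a", "b", "c")))
writeLines(as_org_table(m))
#+end_src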

**** Implementations
***** org-R
     org-R converts R output (vectors, or matrices / 2d-arrays) to an
     org table and stores it in the org buffer, or in a separate org
     file (csv output would also be perfectly possible).
***** org-exp-blocks
***** RweaveOrg
*** graphical output
    R can generate graphical output on a screen graphics device
    (e.g. X11, quartz), and in various standard image file formats
    (png, jpg, ps, pdf, etc). When graphical output is generated by
    evaluation of R code in Org, at least the following two things are
    desirable:
    1. output to screen for immediate viewing is possible
    2. graphical output to file is linked to appropriately from the
       org file. This should have the automatic consequence that it is
       included appropriately in subsequent export targets (html,
       latex).
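
    Point (2) can be sketched in R as follows. The helper name
    =plot_to_org_link= and its arguments are assumptions for this
    example. The pdf device is used as the default only because it is
    always available; =png= or another file device could be passed
    instead.

#+begin_src R
## Hypothetical helper: run plotting code against a file graphics device
## and return an org link to the file that was written.
plot_to_org_link <- function(plot_fun, file, device = grDevices::pdf) {
  device(file)
  ## Close the current device (the one just opened), even on error.
  on.exit(grDevices::dev.off())
  plot_fun()
  sprintf("[[file:%s]]", file)
}

link <- plot_to_org_link(function() plot(1:10, main = "Example"),
                         file.path(tempdir(), "example.pdf"))
link
#+end_src

    Calling the plotting function without opening a file device first
    would send the output to the default screen device instead, which
    corresponds to point (1).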

**** Implementations
***** org-R
      org-R does (1) if no output file is specified and (2) otherwise.
***** org-exp-blocks
      org-exp-blocks tries to do (2), but I don't think that part was
      ever really working.

***** RweaveOrg


** Evaluate R code on export
At first I was leaning towards leaving the exporting to Sweave, but it
seems that once we have evaluation of R working, it will not be
difficult to implement full evaluation of R blocks, one-liners, and
creation of R graphics on export directly in elisp.

I think that this would be worth the effort since it would eliminate
the need for using Sweave, and would allow for exportation to any
target format supported by org-mode.


* Notes
** Special editing and evaluation of source code in R blocks
   Unfortunately org-mode has two different block types, both useful.
   In developing RweaveOrg, a third was introduced.

   Eric is leaning towards using the =#+begin_src= blocks, as that is
   really what these blocks contain: source code.  Austin believes
   that specifying export options at the beginning of a block is
   useful functionality, to be preserved if possible.

   Note that upper and lower case are not relevant in block headings.

*** PROPOSED R-block proposal
I (Eric) propose that we use the syntax of source code blocks as they
currently exist in org-mode with the addition of *evaluation*,
*header-arguments*, *exportation*, *single-line-blocks*, and
*references-to-table-data*.

1) *evaluation*: These blocks can be evaluated through =\C-c\C-c= with
   a slight addition to the code already present and working in
   [[file:existing_tools/org-eval-light.el][org-eval-light.el]].  All we should need to add for R support would
   be an appropriate entry in [[org-eval-light-interpreters]] with a
   corresponding evaluation function.  For an example using
   org-eval-light see [[* src block evaluation w/org-eval-light]].

2) *header-arguments*: These can be implemented along the lines of
   Austin's header arguments in [[file:existing_tools/RweaveOrg/org-sweave.el][org-sweave.el]].

3) *exportation*: Should be as similar as possible to that done by
   Sweave, and hopefully can re-use some of the code currently present
   in [[file:existing_tools/exp-blocks/org-exp-blocks.el][org-exp-blocks.el]].

4) *single-line-blocks*: It seems that it is useful to be able to
   place a single line of R code on a line by itself.  Should we add
   syntax for this similar to Dan's =#+R:= lines?  I would lean
   towards something here that can be re-used for any type of source
   code in the same manner as the =#+begin_src R= blocks, maybe =#+src_R=?

5) *references-to-table-data*: I get the impression that this is
   vital to the efficient use of R code in an org file, so we should
   come up with a way to reference table data from a single-line-block
   or from an R source-code block.  It looks like Dan has already done
   this in [[file:existing_tools/org-R.el][org-R.el]].

What do you think?  Does this accomplish everything we want to be able
to do with embedded R source code blocks?

**** src block evaluation w/org-eval-light
Here's an example using org-eval-light.el.

First load the org-eval-light.el file:

[[elisp:(load (expand-file-name "org-eval-light.el" (expand-file-name "existing_tools" (file-name-directory buffer-file-name))))]]

Then press =\C-c\C-c= inside of the following src code snippet.  The
results should appear in a comment immediately following the source
code block.  It shouldn't be too hard to add R support to this
function through the `org-eval-light-interpreters' variable.

(Dan: The following causes error on export to HTML hence spaces inserted at bol)

 #+begin_src shell
date
 #+end_src

*** Source code blocks
    Org has an extremely useful method of editing source code and
    examples in their native modes.  In the case of R code, we want to
    be able to use the full functionality of ESS mode, including
    interactive evaluation of code.

    Source code blocks look like the following and allow for the
    special editing of code inside of the block through
    `org-edit-special'.

#+BEGIN_SRC r

,## hit C-c ' within this block to enter a temporary buffer in r-mode.

,## while in the temporary buffer, hit C-c C-c on this comment to
,## evaluate this block
a <- 3
a

,## hit C-c ' to exit the temporary buffer
#+END_SRC

*** dblocks
    dblocks are useful because org-mode will automatically call
    `org-dblock-write:dblock-type' where dblock-type is the string
    following the =#+BEGIN:= portion of the line.

    dblocks look like the following and allow for evaluation of the
    code inside of the block by calling =\C-c\C-c= on the header of
    the block.

#+BEGIN: dblock-type
#+END:

*** R blocks
    In developing RweaveOrg, Austin created [[file:existing_tools/RweaveOrg/org-sweave.el][org-sweave.el]].  This
    allows for the kind of blocks shown in [[file:existing_tools/RweaveOrg/testing.Rorg][testing.Rorg]].  These blocks
    have the advantage of accepting options to the Sweave preprocessor
    following the #+BEGIN_R declaration.

** Interaction with the R process

We should take care to implement this in such a way that it serves all
of the different components which have to interact with R, including:
- evaluation of source code blocks
- automatic evaluation on export
- evaluation of \R{} snippets
- evaluation of single source code lines
- sending/receiving vector data

I think we currently have two implementations of interaction with R
processes: [[file:existing_tools/org-R.el][org-R.el]] and [[file:existing_tools/exp-blocks/org-exp-blocks.el][org-exp-blocks.el]].  We should be sure to take
the best of each of these approaches.


* Tasks


* buffer dictionary
 LocalWords:  DBlocks dblocks