org-mode/org-babel.org

org-babel — facilitating communication between programming languages and people

Introduction

Org-Babel enables communication between programming languages and between people.

Org-Babel provides:

communication between programs
Data passes seamlessly between different programming languages, text, and tables.
communication between people
Data and calculations are embedded in the same document as notes, explanations, and reports.

communication between programs

Org-Mode supports embedded blocks of source code (in any language) inside of Org documents. Org-Babel allows these blocks of code to be executed from within Org-Mode with natural handling of their inputs and outputs.

simple execution

with scalar, file, and table output

reading information from tables

reading information from other source blocks (disk usage in your home directory)

This will work for Linux and Mac users; we are not sure whether the shell commands work for Windows users.

To run it, place the cursor on the #+begin_src line of the source block labeled directory-pie and press \C-c\C-c.

cd ~ && du -sc * |grep -v total
64 "Desktop"
11882808 "Documents"
8210024 "Downloads"
879800 "Library"
57344 "Movies"
7590248 "Music"
5307664 "Pictures"
0 "Public"
152 "Sites"
8 "System"
56 "bin"
3274848 "mail"
5282032 "src"
1264 "tools"
pie(dirs[,1], labels = dirs[,2])

operations in/on tables

| student | grade | letter |
|---------+-------+--------|
|       1 |    99 | A      |
|       2 |    59 | F      |
|       3 |    75 | C      |
|       4 |    15 | F      |
|       5 |     7 | F      |
|       6 |    13 | F      |
case score
   when 0..59 then "F"
   when 60..69 then "D"
   when 70..79 then "C"
   when 80..89 then "B"
   when 90..100 then "A"
   else "Invalid Score"
end
rand(100)
hist(grades[,2])

communication between people

Quick overview of Org-Mode's exportation abilities, with links to the online Org-Mode documentation, a focus on source-code blocks, and the exportation options provided by Org-Babel.

Interactive tutorial

This would demonstrate applicability to Reproducible Research, and Literate Programming.

Tests embedded in documentation

org-babel's own functional tests are contained in a large org-mode table, allowing the test suite to be run by evaluation of the table and the results to be collected in the same table.

Tasks [22/38]

TODO Create objects in top level (global) environment [0/5]

sessions

initial requirement statement [DED]

At the moment, objects created by computations performed in the code block are evaluated in the scope of the code-block-function-body and therefore disappear when the code block is evaluated {unless you employ some extra trickery like assign('name', object, env=globalenv()) }. I think it will be desirable to also allow for a style wherein objects that are created in one code block persist in the R global environment and can be re-used in a separate block.

This is what Sweave does, and while I'm not saying we have to be the same as Sweave, it wouldn't be hard for us to provide the same behaviour in this case; if we don't, we risk undeservedly being written off as an oddity by some.

IOW one aspect of org-babel is that of a sort of functional meta-programming language. This is crazy, in a very good way. Nevertheless, wrt R I think there's going to be a lot of value in providing for a working style in which the objects are stored in the R session, rather than elisp/org buffer. This will be a very familiar working style to lots of people.

There are no doubt a number of different ways of accomplishing this, the simplest being a hack like adding

for(objname in ls())
    assign(objname, get(objname), envir=globalenv())

to the source code block function body. (Maybe wrap it in an on.exit() call).

However this may deserve to be thought about more carefully, perhaps with a view to having a uniform approach across languages. E.g. shell code blocks have the same semantics at the moment (no persistence of variables across code blocks), because the body is evaluated in a new bash shell process rather than a running shell. And I guess the same is true for python. However, in both these cases, you could imagine implementing the alternative in which the body is evaluated in a persistent interactive session. It's just that it's particularly natural for R, seeing as both ESS and org-babel evaluate commands in a single persistent R session.

sessions [Eric]

Thanks for bringing this up. I think you are absolutely correct that we should provide support for a persistent environment (maybe called a session) in which to evaluate code blocks. I think the current setup demonstrates my personal bias for a functional style of programming which is certainly not ideal in all contexts.

While the R function you mention does look like an elegant solution, I think we should choose an implementation that would be the same across all source code types. Specifically, I think we should allow the user to specify an optional session as a header variable (when not present we assume a default session for each language). The session name could be used to name a comint buffer (like the R buffer) in which all evaluation would take place. Within that buffer variables would retain their values, at least once I remove some of the functional method wrappings currently in place.

This would allow multiple environments to be used in the same buffer, and once this setup was implemented we should be able to fairly easily implement commands for jumping between source code blocks and the related session buffers, as well as for dumping the last N commands from a session into a new or existing source code block.
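As a rough illustration of the session idea described above, here is a minimal Python sketch. It is not how org-babel is implemented: the class `SessionRegistry`, its `get` method, and the `*lang*` default naming are hypothetical, and a plain dictionary stands in for what would really be a comint buffer attached to an interpreter process.

```python
# Hypothetical sketch: one persistent namespace per (language, session) pair.
# In org-babel itself each session would be a comint buffer, not a dict.

class SessionRegistry:
    def __init__(self):
        self._sessions = {}

    def get(self, lang, session=None):
        """Return the namespace for `session`, creating it on first use.

        When no session is given, fall back to a default session per language.
        """
        name = session if session is not None else "*" + lang + "*"
        return self._sessions.setdefault((lang, name), {})


registry = SessionRegistry()

# Two blocks that name the same session share state ...
exec("a = 9", registry.get("python", "analysis"))
exec("b = a + 21", registry.get("python", "analysis"))

# ... while a block without a session header uses the language default.
exec("a = 1", registry.get("python"))

print(registry.get("python", "analysis")["b"])  # 30
print(registry.get("python")["a"])              # 1
```

Keeping the session name as part of the key is what allows several independent environments to be used from the same buffer.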

Please let me know if you foresee any problems with this proposed setup, or if you think any parts might be confusing for people coming from Sweave. I'll hopefully find some time to work on this later in the week.

can functional and interpreted/interactive models coexist?

Even though both of these use the same *R* buffer, the value of a is not preserved because it is assigned inside of a functional wrapper.

a <- 9
b <- 21
a + b
a

This functional wrapper was implemented in order to efficiently return the results of the execution of the entire source code block. However it inhibits the evaluation of source code blocks in the top level, which would allow for persistence of variable assignment across evaluations. How can we allow both evaluation in the top level, and efficient capture of the return value of an entire source code block in a language independent manner?
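The tension between the two styles can be reproduced in a few lines of Python. This is only a sketch of the general idea, not the wrapper org-babel uses for R: `run_wrapped` and `run_top_level` are hypothetical names, and the wrapped variant assumes the last line of the block is a single expression.

```python
# Sketch only: `run_wrapped` and `run_top_level` are hypothetical names.

def run_wrapped(body, session):
    """Evaluate `body` inside a function, as a functional wrapper does.

    Assignments become locals of the wrapper and vanish when it returns.
    """
    lines = body.strip().splitlines()
    # Return the value of the last line; assumes it is a single expression.
    lines[-1] = "return " + lines[-1]
    source = "def __wrapper__():\n" + "\n".join("    " + l for l in lines)
    exec(source, session)
    return session["__wrapper__"]()


def run_top_level(body, session):
    """Evaluate `body` directly in the session namespace; no result captured."""
    exec(body, session)


session = {}
result = run_wrapped("a = 9\nb = 21\na + b", session)
print(result)            # 30
print("a" in session)    # False: the assignment did not persist

run_top_level("a = 9", session)
print(session["a"])      # 9: persists, but no return value was collected
```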

Possible solutions…

  1. we can't, so we will have to implement two types of evaluation depending on which is appropriate (functional or imperative)
  2. we remove the functional wrapper and parse the source code block into its top level statements (most often but not always on line breaks) so that we can isolate the final segment which is our return value.
  3. we add some sort of "#+return" line to the code block
  4. we take advantage of each language's support for meta-programming through eval type functions, and use these to evaluate the entire blocks in such a way that their environment can be combined with the global environment, and their results are still captured.
  5. I believe that most modern languages which support interactive sessions have support for a last_result type function, which returns the result of the last input without re-calculation. If widely enough present this would be the ideal solution to a combination of functional and imperative styles.

None of these solutions seem very desirable, but for now I don't see what else would be possible.

Of these options I was leaning towards (1) and (4) but now believe that if it is possible option (5) will be ideal.
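For a language with good introspection, options (2) and (4) can be combined. The Python sketch below parses a block into top-level statements, runs everything but the final expression in a persistent namespace, and then evaluates that final expression to obtain the result. The function name `eval_block` is hypothetical, and the approach relies on Python's `ast` module, so it says nothing about how feasible the same trick is in other languages.

```python
import ast


def eval_block(body, session):
    """Run `body` at top level in `session`; return the last expression's value.

    Returns None when the final statement is not an expression.
    """
    tree = ast.parse(body)
    if tree.body and isinstance(tree.body[-1], ast.Expr):
        # Detach the final expression so it can be evaluated separately.
        last = ast.Expression(tree.body.pop().value)
        exec(compile(tree, "<block>", "exec"), session)
        return eval(compile(last, "<block>", "eval"), session)
    exec(compile(tree, "<block>", "exec"), session)
    return None


session = {}
print(eval_block("a = 9\nb = 21\na + b", session))  # 30
print(eval_block("a", session))                     # 9: `a` persisted
```

Both goals are met here: assignments land in the shared namespace, and the value of the last statement is still captured without wrapping the block in a function.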

(1) both functional and imperative evaluation

Pros

  • can take advantage of built in functions for sending regions to the inferior process
  • retains the proven tested and working functional wrappers

Cons

  • introduces the complication of keeping track of which type of evaluation is best suited to a particular context
  • the current functional wrappers may require some changes in order to include the existing global context
(4) exploit language meta-programming constructs to explicitly evaluate code

Pros

  • only one type of evaluation

Cons

  • some languages may not have sufficient meta-programming constructs
(5) exploit some last_value functionality if present

Need to ensure that most languages have such a function; those without will simply have to implement their own similar solution…

| language   | last_value function         |
|------------+-----------------------------|
| R          | .Last.value                 |
| ruby       | _                           |
| python     | _                           |
| shell      | see last command for shells |
| emacs-lisp | see special-case            |
82 + 18
last command for shells

Do this using the tee shell command, and continually pipe the output to a file.

Got this idea from the following email-thread.

suggested from mailing list

while read line 
do 
  bash -c "$line" | tee /tmp/last.out1 
  mv /tmp/last.out1 /tmp/last.out 
done

another proposed solution from the above thread

#!/bin/bash 
# so - Save Output. Saves output of command in OUT shell variable. 
OUT=`$*` 
echo $OUT

and another

.inputrc: "^[k": accept-line "^M": " | tee /tmp/h_lastcmd.out ^[k"

.bash_profile: export __=/tmp/h_lastcmd.out

If you try it, Alt-k will stand for the old Enter; use "command $__" to access the last output.

Best,

Herculano de Lima Einloft Neto

emacs-lisp will be a special case

While it is possible for emacs-lisp to be run in a console type environment (see the elim function) it is not possible to run emacs-lisp in a different session, meaning that any variable set at the top level of the console environment will be set everywhere inside emacs. For this reason I think that it doesn't make any sense to worry about session support for emacs-lisp.

Further thoughts on 'scripting' vs. functional approaches

These are just thoughts, I don't know how sure I am about this. And again, perhaps I'm not saying anything very radical, just that it would be nice to have some options supporting things like receiving text output in the org buffer.

I can see that you've already gone some way down the road towards the 'last value' approach, so sorry if my comments come rather late. I am concerned that we are not giving sufficient attention to stdout / the text that is returned by the interpreters. In contrast, many of our potential users will be accustomed to a 'scripting' approach, where they are outputting text at various points in the code block, not just at the end. I am leaning towards thinking that we should have 2 modes of evaluation: 'script' mode, and 'functional' mode.

In script mode, evaluation of a code block would result in all text output from that code block appearing as output in the org buffer, presumably as an #+begin_example…#+end_example. There could be an :echo option controlling whether the input commands also appear in the output. [This is like Sweave].

In functional mode, the result of the code block is available as an elisp object, and may appear in the org buffer as an org table/string, via the mechanisms you have developed already.

One thing I'm wondering about is whether, in script mode, there simply should not be a return value. Perhaps this is not so different from what exists: script mode would be new, and what exists currently would be functional mode.

I think it's likely that, while code evaluation will be exciting to people, a large majority of our users in a large majority of their usage will not attempt to actually use the return value from a source code block in any meaningful way. In that case, it seems rather restrictive to only allow them to see output from the end of the code block.

Instead I think the most accessible way to introduce org-babel to people, at least while they are learning it, is as an immensely powerful environment in which to embed their 'scripts', which now also allows them to 'run' their 'scripts'. Especially as such people are likely to be the least capable of the user-base, a possible design-rule would be to make the scripting style of usage easy (default?), perhaps requiring a special option to enable a functional style. Those who will use the functional style won't have a problem understanding what's going on, whereas the 'skript kiddies' might not even know the syntax for defining a function in their language of choice. And of course we can allow the user to set a variable in their .emacs controlling the preference, so that functional users are not inconvenienced by having to provide header args the whole time.

Please don't get the impression that I am down-valuing the functional style of org-babel. I am constantly horrified at the messy 'scripts' that my colleagues produce in perl or R or whatever! Nevertheless that seems to be how a lot of people work.

I think you were leaning towards the last-value approach because it offered the possibility of unified code supporting both the single evaluation environment and the functional style. If you agree with any of the above then perhaps it will impact upon this and mean that the code in the two branches has to differ a bit. In that case, functional mode could perhaps after all evaluate each code block in its own environment, thus (re)approaching 'true' functional programming (side-effects are hard to achieve).

ls > files
echo "There are `wc -l < files` files in this directory"

even more thoughts on evaluation, results, models and options

Thanks Dan, These comments are invaluable.

What do you think about this as a new list of priorities/requirements for the execution of source-code blocks.

  • Sessions

    1. we want the evaluation of the source code block to take place in a session which can persist state (variables, current directory, etc…).
    2. source code blocks can specify their session with a header argument
    3. each session should correspond to an Emacs comint buffer so that the user can drop into the session and experiment with live code evaluation.
  • Results

    1. each source-code block generates some form of results which (as we have already implemented) is transferred into emacs-lisp after which it can be inserted into the org-mode buffer, or used by other source-code blocks
    2. when the results are translated into emacs-lisp, they can be forced to be interpreted as a scalar (dumping their raw values into the org-mode buffer), as a vector (which is often desirable with R code blocks), or interpreted on the fly (the default option). Note that this is very nearly currently implemented through the results-type-header.
    3. there should be two means of collecting results from the execution of a source code block. Either the value of the last statement of the source code block, or the collection of all that has been passed to STDOUT during the evaluation.
header argument or return line (header argument)

Rather than using a header argument to specify how the return value should be passed back, I'm leaning towards the use of a #+RETURN line inside the block. If such a line is not present then we default to using STDOUT to collect results, but if such a line is present then we use its value as the results of the block. I think this will allow for the most elegant specification between functional and script execution. This also cleans up some issues of implementation and finding which statement is the last statement.

Having given this more thought, I think a header argument is preferable. The #+return: line adds new complicating syntax for something that does little more than we would accomplish through the addition of a header argument. The only benefit being that we know where the final statement starts, which is not an issue in those languages which contain 'last value' operators.

new header :results arguments

script
explicitly states that we want to use STDOUT to initialize our results
return_last
STDOUT is ignored; instead the value of the final statement in the block is returned
echo
means echo the contents of the source-code block along with the results (this implies the script :results argument as well)
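To make the three proposed :results arguments concrete, here is a small Python sketch of an evaluator that switches between them. It is only an illustration under assumptions: the function `evaluate` and its `results` parameter are hypothetical stand-ins for the header argument, and the exact layout of the "echo" output (source first, then STDOUT) is a guess rather than something specified above.

```python
import ast
import contextlib
import io


def evaluate(body, results="script"):
    """Evaluate `body` and collect results according to `results`.

    "script"      -> everything written to STDOUT
    "return_last" -> value of the final expression, STDOUT discarded
    "echo"        -> the source followed by its STDOUT (implies "script")
    """
    tree = ast.parse(body)
    last = None
    if (results == "return_last" and tree.body
            and isinstance(tree.body[-1], ast.Expr)):
        last = ast.Expression(tree.body.pop().value)
    env = {}
    out = io.StringIO()
    with contextlib.redirect_stdout(out):
        exec(compile(tree, "<block>", "exec"), env)
        value = eval(compile(last, "<block>", "eval"), env) if last else None
    if results == "return_last":
        return value
    if results == "echo":
        return body + "\n" + out.getvalue()
    return out.getvalue()


block = 'print("step 1")\nx = 6 * 7\nprint("step 2")\nx'
print(repr(evaluate(block, "script")))   # 'step 1\nstep 2\n'
print(evaluate(block, "return_last"))    # 42
```

The same block yields intermediate text in script mode and a single value in return_last mode, which is the distinction the header argument is meant to expose.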

TODO comint notes

Implementing comint integration in org-babel-comint.el.

Need to have…

  • handling of outputs

    • split raw output from process by prompts
    • a ring of the outputs, buffer-local, `org-babel-comint-output-ring'
    • a switch for dumping all outputs to a buffer
  • inputting commands

Let's drop all this language specific stuff, and just use org-babel-comint to split up our outputs, and return either the last value of an execution or the combination of values from the executions.
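The output handling listed above (split raw output by prompts, keep a ring of outputs, return either the last value or the combination) can be sketched in Python as follows. This is not the code in org-babel-comint.el: the prompt regexp, the ring size, and the function name `split_by_prompt` are assumptions, as is the premise that each prompt in the transcript is followed by the echoed command.

```python
import re
from collections import deque

PROMPT = re.compile(r"^> ", re.MULTILINE)   # assumed R-style prompt


def split_by_prompt(raw):
    """Split a raw session transcript into one output string per command."""
    outputs = []
    for chunk in PROMPT.split(raw):
        if not chunk.strip():
            continue
        # First line of each chunk is the echoed command, the rest is output.
        _command, _, output = chunk.partition("\n")
        outputs.append(output.strip())
    return outputs


output_ring = deque(maxlen=3)  # keeps only the most recent outputs

raw = "> a <- 8\n> b <- 9\n> a + b\n[1] 17\n> a + b + 10\n[1] 27\n> "
output_ring.extend(split_by_prompt(raw))

last_value = output_ring[-1]
combined = "\n".join(o for o in output_ring if o)
print(last_value)   # [1] 27
print(combined)     # [1] 17, then [1] 27 on the next line
```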

comint filter functions
;;  comint-input-filter-functions	hook	process-in-a-buffer
;;  comint-output-filter-functions	hook	function modes.
;;  comint-preoutput-filter-functions   hook
;;  comint-input-filter			function ...
1
2
3
4
5

TODO rework evaluation lang-by-lang [0/4]

This should include…

  • functional results working with the comint buffer
  • results headers

    script

    return the output of STDOUT

    • write a macro which runs the first redirection, executes the body, then runs the second redirection
    last

    return the value of the last statement

  • sessions in comint buffers
TODO R [3/3]
  • functional results working with comint
  • script results
  • ensure callable by other source block
  • rename buffer after session

To redirect output to a file, you can use the sink() command.

a <- 8
b <- 9
c <- 10
a + b
21
a + b + c
[1] 17
[1] 21
[1] 27
83
twoentyseven
[1] 83
[1] 27
(setq debug-on-error t)
TODO Ruby [1/4]
  • functional results working with comint
  • script results
  • ensure callable by other source block
  • rename buffer after session
a = 2
b = 4
c = a + b
c * (a + b)
TODO Python
TODO Shells

TODO implement a session header argument

use this header argument to override the default session buffer

TODO function to bring up inferior-process buffer

This should be callable from inside of a source-code block in an org-mode buffer. It should evaluate the header arguments, then bring up the inf-proc buffer using pop-to-buffer.

TODO function to dump last N lines from inf-proc buffer into the current source block

Callable with a prefix argument to specify how many lines should be dumped into the source-code buffer.

TODO support for working with *Org Edit Src Example* buffers [1/4]

TODO set buffer-local-process variables appropriately [DED]

I think something like this would be great. You've probably already thought of this, but just to note it down: it would be really nice if org-babel's notion of a buffer's 'session/process' played nicely with ESS's notion of the buffer's session/process. ESS keeps the current process name for a buffer in a buffer-local variable ess-local-process-name. So one thing we will probably want to do is make sure that the Org Edit Src Example buffer sets that variable appropriately. [DED]

I had not thought of that, but I agree wholeheartedly. [Eric]

Once this is done every variable should be able to dump regions into their inferior-process buffer using major-mode functions.

TODO some possible requests/proposed changes for Carsten [2/3]

While I remember, some possible requests/proposed changes for Carsten come to mind in that regard:

DONE Remap C-x C-s to save the source to the org buffer?

I've done this personally and I find it essential. I'm using

(defun org-edit-src-save ()
  "Update the parent org buffer with the edited source code, save
the parent org-buffer, and return to the source code edit
buffer."
  (interactive)
  (let ((p (point)))
    (org-edit-src-exit)
    (save-buffer)
    (org-edit-src-code)
    (goto-char p)))

(define-key org-exit-edit-mode-map "\C-x\C-s" 'org-edit-src-save)

which seems to work.

I think this is great, but I think it should be implemented in the org-mode core

TODO Rename buffer and minor mode?

Something shorter than Org Edit Src Example for the buffer name. org-babel is bringing org's source code interaction to a level of maturity where the 'example' is no longer appropriate. And if further keybindings are going to be added to the minor mode then maybe org-edit-src-mode is a better name than org-exit-edit-mode.

Maybe we should name the buffer with a combination of the source code and the session. I think that makes sense.

[ES] Are you also suggesting a new org-edit-src minor mode? [DED] org-exit-edit-mode is a minor mode that already exists:

Minor mode installing a single key binding, "C-c '" to exit special edit.

org-edit-src-save now has a binding in that mode, so I guess all I'm saying at this stage is that it's a bit of a misnomer. But perhaps we will also have more functionality to add to that minor mode, making it even more of a misnomer. Perhaps something like org-src-mode would be better.

DEFERRED a hook called when the src edit buffer is created

This should be implemented in the org-mode core

DEFERRED send code to inferior process

Another thought on this topic: I think we will want users to send chunks of code to the interpreter from within the Org Edit Src buffer, and I think that's what you have in mind already. In ESS that is done using the ess-eval-* functions. [DED]

I think we can leave this up to the major-mode in the source code buffer, as almost every source-code major mode will have functions for doing things like sending regions to the inferior process. If anything we might need to set the value of the buffer local inferior process variable. [Eric]

TODO optionally evaluate header references when we switch to *Org Edit Src* buffer

That seems to imply that the header references need to be evaluated and transformed into the target language object when we hit C-c ' to enter the Org Edit Src buffer [DED]

Good point, I heartily agree that this should be supported [Eric]

(or at least before the first time we attempt to evaluate code in that buffer. I suppose there might be an argument for lazy evaluation, in case someone hits C-c ' but is "just looking" and not actually evaluating anything.) Of course if evaluating the reference is computationally intensive then the user might have to wait before they get the Org Edit Src buffer. [DED]

I fear that it may be hard to anticipate when the references will be needed, some major-modes do on-the-fly evaluation while the buffer is being edited. I think that we should either do this before the buffer is opened or not at all, specifically I think we should resolve references if the user calls C-c ' with a prefix argument. Does that sound reasonable? [Eric]

Yes [Dan]

TODO fully purge org-babel-R of direct comint interaction

try to remove all code under the ;; functions for evaluation of R code line

TODO improve the source-block snippet

file:~/src/emacs-starter-kit/src/snippets/text-mode/rst-mode/chap::name Chapter title

,#name : Chapter title
,# --
${1:Chapter}
${1:$(make-string (string-width text) ?\=)}

$0

sb snippet

waiting for guidance from those more familiar with yasnippets

TODO resolve references to other buffers

This would allow source blocks to call upon tables, source-blocks, and results in other buffers.

See…

TODO figure out how to handle errors during evaluation

R has a try function, with error handling, along the lines of python. I bet ruby does too. Maybe more of an issue for functional style; in my proposed scripting style the error just gets dumped to the org buffer and the user is thus alerted.
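A minimal Python sketch of the idea follows: the evaluator catches any error raised by the block and hands back a short message that could be inserted into the org buffer instead of a result. The function name `evaluate_reporting_errors` and the `("ok" | "error", message)` return shape are hypothetical choices for this illustration.

```python
def evaluate_reporting_errors(body, session):
    """Run `body`; return ("ok", None) or ("error", message) instead of raising."""
    try:
        exec(body, session)
    except Exception as err:
        # A one-line summary that could be dumped into the buffer as the result.
        return ("error", "{}: {}".format(type(err).__name__, err))
    return ("ok", None)


print(evaluate_reporting_errors("x = 1 + 1", {}))
# ('ok', None)
print(evaluate_reporting_errors("x = 1 / 0", {}))
# ('error', 'ZeroDivisionError: division by zero')
```

In a scripting style the message would simply appear with the rest of the output; in a functional style the caller would need to check the status before using the value.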

TODO figure out how to handle graphic output

This is listed under graphical output in our objectives.

This should take advantage of the :results file option, and languages which almost always produce graphical output should set :results file to true by default. That would handle placing these results in the buffer. Then if there is a combination of silent and file :results headers we could drop the results to a temp buffer and pop open that buffer…

TODO share org-babel

how should we share org-babel?

  • post to org-mode and ess mailing lists
  • create an org-babel page on worg
  • create a short screencast demonstrating org-babel in action

examples

we need to think up some good examples

interactive tutorials

This could be a place to use org-babel assertions.

For example, the first step of a tutorial could assert that the version of the software-package (or whatever) is equal to some value; the source-code blocks in the rest of the tutorial could then be used with confidence (and executed directly).

answering a text-book question w/code example

org-babel is an ideal environment enabling both the development and demonstration of the code snippets required as answers to many text-book questions.

something using tables

maybe something along the lines of calculations from collected grades

file sizes

Maybe something like the following which outputs sizes of directories under the home directory, and then instead of the trivial emacs-lisp block we could use an R block to create a nice pie chart of the results.

du -sc ~/*
(mapcar #'car sizes)

TODO command line execution

Allow source code blocks to be called from the command line. This will be easy using the sbe function in org-babel-table.el.

This will rely upon resolve references to other buffers.

TODO inline source code blocks [3/5]

Like the \R{ code } blocks

not sure what the format should be, maybe just something simple like src_lang[]{} where lang is the name of the source code language to be evaluated, [] is optional and contains any header arguments and {} contains the code.

(see the-sandbox)
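The src_lang[]{} format floated above is simple enough to recognize with a regular expression. The Python sketch below is one way to do it under assumptions: the pattern `INLINE_SRC` and the helper `find_inline_blocks` are hypothetical, the set of characters allowed in a language name is a guess, and code containing a closing brace is not handled.

```python
import re

INLINE_SRC = re.compile(
    r"src_(?P<lang>[A-Za-z0-9-]+)"     # language name
    r"(?:\[(?P<headers>[^\]]*)\])?"    # optional [header arguments]
    r"\{(?P<code>[^}]*)\}"             # {code}; nested braces not handled
)


def find_inline_blocks(text):
    """Return (lang, headers, code) for every inline block found in `text`."""
    return [(m.group("lang"), m.group("headers") or "", m.group("code"))
            for m in INLINE_SRC.finditer(text)]


line = ("The mean is src_R[:results silent]{mean(c(1, 2, 3))} "
        "and 2+2 is src_python{2 + 2}.")
print(find_inline_blocks(line))
# [('R', ':results silent', 'mean(c(1, 2, 3))'), ('python', '', '2 + 2')]
```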

DONE evaluation with \C-c\C-c

Putting aside the header argument issue for now we can just run these with the following default header arguments

:results
silent
:exports
results

DONE inline exportation

Need to add an interblock hook (or some such) through org-exp-blocks

DONE header arguments

We should make it possible to use header arguments.

TODO fontification

we should color these blocks differently

TODO refine html exportation

should use a span class, and should show original source in tool-tip

TODO formulate general rules for handling vectors and tables / matrices with names

This is non-trivial, but may be worth doing, in particular to develop a nice framework for sending data to/from R.

Notes

In R, indexing vector elements, and rows and columns, using strings rather than integers is an important part of the language.

  • elements of a vector may have names
  • matrices and data.frames may have "column names" and "row names" which can be used for indexing
  • In a data frame, row names must be unique

Examples

> # a named vector
> vec <- c(a=1, b=2)
> vec["b"]
b 
2 
> mat <- matrix(1:4, nrow=2, ncol=2, dimnames=list(c("r1","r2"), c("c1","c2")))
> mat
   c1 c2
r1  1  3
r2  2  4
> # The names are separate from the data: they do not interfere with operations on the data
> mat * 3
   c1 c2
r1  3  9
r2  6 12
> mat["r1","c2"]
[1] 3
> df <- data.frame(var1=1:26, var2=26:1, row.names=letters)
> df$var2
 [1] 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1
> df["g",]
  var1 var2
g    7   20

So it's tempting to try to provide support for this in org-babel. For example

  • allow R to refer to columns of a :var reference by their names
  • When appropriate, results from R appear in the org buffer with "named columns (and rows)"

However none (?) of the other languages we are currently supporting really have a native matrix type, let alone "column names" or "row names". Names are used in e.g. python and perl to refer to entries in dicts / hashes. It currently seems to me that support for this in org-babel would require setting rules about when org tables are considered to have named columns/fields, and ensuring that (a) languages with a notion of named columns/fields use them appropriately and (b) languages with no such notion do not treat them as data.

  • Org allows something that looks like column names to be separated by a hline
  • Org also allows a row to function as column names when special markers are placed in the first column. An hline is unnecessary (indeed hlines are purely cosmetic in org [correct?])
  • Org does not have a notion of "row names" [correct?]

The full org table functionality exemplified here has features that we would not support in e.g. R (like names for the row below).

Initial statement: allow tables with hline to be passed as args into R

This doesn't seem to work at the moment (example below). It would also be nice to have a natural way for the column names of the org table to become the column names of the R data frame, and to have the option to specify that the first column is to be used as row names in R (these must be unique). But this might require a bit of thinking about.

col1 col2 col3
1 2 3
4 schulte 6
1 2 3
4 schulte 6
tabel
"col1" "col2" "col3"
1 2 3
4 "schulte" 6

Another example is in the grades example.
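One possible rule for tables with an hline is sketched below in Python: if the second row is an hline, the first row is taken as column names and kept out of the data; optionally the first column can then serve as row names, which must be unique as in R. Everything here is an assumption made for illustration, including the `"hline"` marker, the list-of-rows representation, and the helper names `split_header` and `with_row_names`.

```python
HLINE = "hline"  # assumed marker for a horizontal rule in the table


def split_header(table):
    """Return (column_names, data_rows) for a table given as a list of rows.

    If the second element is an hline, the first row is treated as column
    names; otherwise the table has no header. Remaining hlines are dropped.
    """
    if len(table) > 1 and table[1] == HLINE:
        names, rows = table[0], table[2:]
    else:
        names, rows = None, table
    return names, [row for row in rows if row != HLINE]


def with_row_names(rows):
    """Use the first column as row names; R requires them to be unique."""
    keys = [row[0] for row in rows]
    if len(set(keys)) != len(keys):
        raise ValueError("row names must be unique")
    return {row[0]: row[1:] for row in rows}


table = [["col1", "col2", "col3"], HLINE, [1, 2, 3], [4, "schulte", 6]]
names, rows = split_header(table)
print(names)                 # ['col1', 'col2', 'col3']
print(rows)                  # [[1, 2, 3], [4, 'schulte', 6]]
print(with_row_names(rows))  # {1: [2, 3], 4: ['schulte', 6]}
```

A language without a notion of named columns would receive only `rows`, so the header is never mistaken for data.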

PROPOSED Are we happy with current behaviour regarding vector/scalar output?

This simple example of multilingual chaining produces vector output if there are spaces in the message and scalar otherwise.

paste(msg, "und_R", sep="_")
msg + "_y_python"
(concat msg "_elisp")

PROPOSED conversion between org-babel and noweb (e.g. .Rnw) format

I haven't thought about this properly. Just noting it down. What Sweave uses is called "R noweb" (.Rnw).

I found a good description of noweb in the following article (see the pdf).

I think there are two parts to noweb, the construction of documentation and the extraction of source-code (with notangle).

documentation: org-mode handles all of our documentation needs in a manner that I believe is superior to noweb.

source extraction At this point I don't see anyone writing large applications with 100% of the source code contained in org-babel files, rather I see org-babel files containing things like

  • notes with active code chunks
  • interactive tutorials
  • requirements documents with code running test suites
  • and of course experimental reports with the code to run the experiment, and perform analysis

Basically I think the scope of the programs written in org-babel (at least initially) will be small enough that it won't require the addition of a tangle type program to extract all of the source code into a running application.

On the other hand, since we already have named blocks of source code which reference other blocks on which they rely, this shouldn't be too hard to implement either on our own, or possibly relying on something like noweb/notangle.
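To give a sense of how little is needed, here is a Python sketch of tangling by recursive expansion of references between named blocks. The `<<name>>` reference syntax is borrowed from noweb; the function `tangle`, the dictionary of blocks, and the block names are hypothetical, and indentation of expanded references is not preserved.

```python
import re

REF = re.compile(r"<<([^<>]+)>>")   # noweb-style reference to a named block


def tangle(name, blocks, _stack=()):
    """Return the source of block `name` with all references expanded."""
    if name in _stack:
        raise ValueError("circular reference: " + " -> ".join(_stack + (name,)))
    return REF.sub(
        lambda m: tangle(m.group(1).strip(), blocks, _stack + (name,)),
        blocks[name],
    )


blocks = {
    "load":    "grades = [99, 59, 75]",
    "analyse": "mean = sum(grades) / len(grades)",
    "main":    "<<load>>\n<<analyse>>\nprint(mean)",
}
print(tangle("main", blocks))
```

The `_stack` argument only exists to detect blocks that reference each other in a cycle, which would otherwise recurse forever.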

PROPOSED support for passing paths to files between source blocks

Maybe this should be its own result type (in addition to scalars and vectors). The reason is that some source-code blocks (for example ditaa or anything that results in the creation of a file) may want to pass a file path back to org-mode which could then be inserted into the org-mode buffer as a link to the file…

This would allow for display of images upon export providing functionality similar to org-exp-blocks only in a more general manner.

PROPOSED re-implement helper functions from org-R

Much of the power of org-R seems to be in its helper functions for the quick graphing of tables. Should we try to re-implement these functions on top of org-babel?

I'm thinking this may be useful both to add features to org-babel-R and also to potentially suggest extensions of the framework. For example one that comes to mind is the ability to treat a source-code block like a function which accepts arguments and returns results. Actually this can be its own TODO (see source blocks as functions).

DEFERRED use textConnection to pass tsv to R?

When passing args from the org buffer to R, the following route is used: arg in buffer -> elisp -> tsv on file -> data frame in R. I think it would be possible to avoid having to write to file by constructing an R expression in org-babel-R-assign-elisp, something like this

(org-babel-R-input-command
 (format  "%s <- read.table(textConnection(\"%s\"), sep=\"\\t\", as.is=TRUE)"
	  name (orgtbl-to-tsv value '(:sep "\t" :fmt org-babel-R-quote-tsv-field))))

I haven't tried to implement this yet as it's basically just fiddling with something that works. The only reason for it I can think of would be efficiency and I haven't tested that.

This didn't work after an initial test. I still think this is a good idea (I also think we should try to do something similar when writing out results from R to elisp); however, as it wouldn't result in any functional changes I'm bumping it down to deferred for now. [Eric]
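
For reference, here is a minimal Python sketch of just the string construction behind the textConnection idea (the real code would be Emacs Lisp). The helper names r_string_literal and r_assign_table are hypothetical, and the sketch assumes that cells contain no quote characters that would confuse read.table.

```python
def r_string_literal(text):
    """Quote text as a double-quoted R string literal."""
    escaped = (
        text.replace("\\", "\\\\")   # backslashes first, so later escapes survive
            .replace('"', '\\"')
            .replace("\t", "\\t")
            .replace("\n", "\\n")
    )
    return '"' + escaped + '"'


def r_assign_table(name, rows):
    """Build an R command that reads rows (a list of lists) into a data frame.

    The table travels inside the command itself as TSV text, so no
    temporary file is needed.
    """
    tsv = "\n".join("\t".join(str(cell) for cell in row) for row in rows)
    return '%s <- read.table(textConnection(%s), sep="\\t", as.is=TRUE)' % (
        name,
        r_string_literal(tsv),
    )


command = r_assign_table("tab", [[1, 2], [3, "x"]])
```

Whether this is actually faster than going through a file is untested here, just as in the note above.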

for quick tests

1 2 3
mean(mean(vec))
2

DEFERRED re-implement R evaluation using ess-command or ess-execute

I don't have any complaints with the current R evaluation code or behaviour, but I think it would be good to use the ESS functions from a political point of view. Plus of course it has the normal benefits of an API (insulates us from any underlying changes etc). [DED]

I'll look into this. I believe that I looked at and rejected these functions initially but now I can't remember why. I agree with your overall point about using APIs where available. I will take a look back at these and either switch to using the ess commands, or at least articulate under this TODO the reasons for using our custom R-interaction commands. [Eric]

ess-execute

Let's just replace org-babel-R-input-command with ess-execute.

I tried this, and although it works in some situations, I find that ess-command will often just hang indefinitely without returning results. Also ess-execute will occasionally hang, and pops up the buffer containing the results of the command's execution, which is undesirable. For now these functions cannot be used. Maybe someone more familiar with the ESS code can recommend proper usage of ess-command or some other lower-level function which could be used in place of org-babel-R-input-command.

ess functions

(ess-command COM &optional BUF SLEEP NO-PROMPT-CHECK)

Send the ESS process command COM and delete the output from the ESS process buffer. If an optional second argument BUF exists save the output in that buffer. BUF is erased before use. COM should have a terminating newline. Guarantees that the value of .Last.value will be preserved. When optional third arg SLEEP is non-nil, `(sleep-for (* a SLEEP))' will be used in a few places where `a' is proportional to `ess-cmd-delay'.

(ess-execute COMMAND &optional INVERT BUFF MESSAGE)

Send a command to the ESS process. A newline is automatically added to COMMAND. Prefix arg (or second arg INVERT) means invert the meaning of `ess-execute-in-process-buffer'. If INVERT is 'buffer, output is forced to go to the process buffer. If the output is going to a buffer, name it BUFF. This buffer is erased before use. Optional fourth arg MESSAGE is text to print at the top of the buffer (defaults to the command if BUFF is not given.)

our current setup

  1. The body of the R source code block is wrapped in a function
  2. The function is called inside of a write.table function call writing the results to a table
  3. The table is read using org-table-import
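
The first two steps above can be sketched as follows. This is only an illustration in Python of how the wrapped R code might be generated; the names R_TEMPLATE, wrap_r_body and main are hypothetical, and the output path is assumed to need no escaping.

```python
# Template for the R code sent to the interpreter: the block body becomes the
# body of a function, and the function's value is written out as TSV.
R_TEMPLATE = """main <- function() {{
{body}
}}
write.table(main(), file="{path}", sep="\\t", row.names=FALSE, col.names=FALSE, quote=FALSE)
"""


def wrap_r_body(body, out_file):
    """Return R code that evaluates body and writes its value to out_file."""
    indented = "\n".join("  " + line for line in body.splitlines())
    return R_TEMPLATE.format(body=indented, path=out_file)


code = wrap_r_body("a <- 9\na + 4", "/tmp/out.tsv")
```

The third step (reading the file back with org-table-import) happens on the Emacs side and is not shown.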

DEFERRED Rework Interaction with Running Processes [0/3]

TODO ability to select which of multiple sessions is being used

Increasingly it is looking like we're going to want to run all source code blocks in comint buffers (sessions), which will have the benefits of

  1. allowing background execution
  2. maintaining state between source-blocks

    • allowing inline blocks w/o header arguments
R sessions

(like ess-switch-process in .R buffers)

Maybe this could be packaged into a header argument, something like :R_session which could accept either the name of the session to use, or the string prompt, in which case we could use the ess-switch-process command to select a new process.

TODO evaluation of shell code as background process?

After C-c C-c on an R code block, the process may appear to block, but C-g can be used to reclaim control of the .org buffer, without interrupting the R evaluation. However I believe this is not true of bash/sh evaluation. [Haven't tried other languages] Perhaps a solution is just to background the individual shell commands.

The other languages (aside from emacs lisp) are run through the shell, so if we find a shell solution it should work for them as well.

Adding an ampersand seems to be a supported way to run commands in the background (see external-commands). Although a more extensible solution may involve the use of the call-process-region function.

Going to try this out in a new file org-babel-proc.el. This should contain functions for asynchronously running generic shell commands in the background, and then returning their output.
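
As a rough illustration of the start-now, collect-later shape such functions could take, here is a sketch in Python using the standard subprocess module. It is not the org-babel-proc.el design; start_background and collect are hypothetical names, and an sh executable is assumed to be on the path.

```python
import subprocess


def start_background(command):
    """Start a shell command and return immediately with a process handle."""
    return subprocess.Popen(
        ["sh", "-c", command],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )


def collect(process):
    """Wait for the process to exit and return (exit_code, stdout, stderr)."""
    out, err = process.communicate()
    return process.returncode, out, err


proc = start_background("echo hello")
# ... the caller is free to do other work here ...
exit_code, output, errors = collect(proc)
```

In Emacs the equivalent would presumably hand the collection step to a process sentinel or filter rather than blocking.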

partial update of org-mode buffer

The sleekest solution to this may be using a comint buffer, and then defining a filter function which would incrementally interpret the results as they are returned, including insertion into the org-mode buffer. This may actually cause more problems than it is worth, what with the complexities of identifying the types of incrementally returned results, and the need for maintenance of a process marker in the org buffer.

'working' spinner

It may be nice and not too difficult to place a spinner on/near the evaluating source code block

TODO conversion of output from interactive shell, R (and python) sessions to org-babel buffers

[DED] This would be a nice feature I think. Although an org-babel purist would say that it's working the wrong way round… After some interactive work in an R buffer, you save the buffer, maybe edit out some lines, and then convert it to org-babel format for posterity. Same for a shell session either in a shell buffer, or pasted from another terminal emulator. And python of course.

DONE Remove protective commas from # comments before evaluating

org inserts protective commas in front of ## comments in language modes that use them. We need to remove them prior to sending code to the interpreter.

,# this one might break it??
:comma_protection
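
A minimal sketch of the stripping step, written in Python for illustration only. It assumes the protective comma is always the first non-blank character of the line and is immediately followed by #; strip_protective_commas is a hypothetical name.

```python
import re

# A comma at the start of a line (after optional indentation) that is
# immediately followed by a '#'. Commas elsewhere in the line are left alone.
_PROTECTED = re.compile(r"^([ \t]*),(?=#)", re.MULTILINE)


def strip_protective_commas(body):
    """Remove the protective comma org puts in front of # comment lines."""
    return _PROTECTED.sub(r"\1", body)


cleaned = strip_protective_commas(",# this one might break it??\nx <- c(1,\n  2)")
```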

DONE pass multiple reference arguments into R

Can we do this? I wasn't sure how to supply multiple 'var' header args. Just delete this if I'm being dense.

This should be working, see the following example…

n + m
10

DONE ensure that table ranges work

when a table range is passed to org-babel as an argument, it should be interpreted as a vector.

1 2 simple
2 3 Fixnum:1
3 4 Array:123456
4 5
5 6
6 7
"simple"
"#{n.class}:#{n}"
n
Array:123
ar.size
3

DONE global variable indicating default to vector output

how about an alist… org-babel-default-header-args this may already exist… just execute the following and all source blocks will default to vector output

(setq org-babel-default-header-args '((:results . "vector")))

DONE name named results if source block is named

currently this isn't happening although it should be

:namer
:namer

DONE (simple caching) check for named results before source blocks

DONE set :results silent when eval with prefix argument

'silentp

DONE results-type header (vector/file) [3/3]

In response to a point in Dan's email. We should allow the user to force scalar or vector results. This could be done with a header argument, and the default behavior could be controlled through a configuration variable.

:scalar
":scalar"

since it doesn't make sense to turn a vector into a scalar, let's just add two values…

vector
forces the results to be a vector (potentially 1 dimensional)
file
this throws an error if the result isn't a string, and tries to treat it as a path to a file.

I'm just going to cram all of these into the :results header argument. Then if we allow multiple header arguments it should work out, for example one possible header argument string could be :results replace vector file, which would replace any existing results forcing the results into an org-mode table, and interpreting any strings as file paths.
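
To illustrate how a header string with several values under :results might be taken apart, here is a small Python sketch. The function names are hypothetical, and it assumes that values never start with a colon themselves.

```python
def parse_header_args(header):
    """Split ':key v1 v2 :other v' into {'key': ['v1', 'v2'], 'other': ['v']}."""
    args = {}
    key = None
    for token in header.split():
        if token.startswith(":"):
            key = token[1:]            # a new header argument begins
            args.setdefault(key, [])
        elif key is not None:
            args[key].append(token)    # a value of the current argument
    return args


def results_options(header):
    """Return the set of options given to the :results header argument."""
    return set(parse_header_args(header).get("results", []))


options = results_options(":results replace vector file")
```

Treating the options as a set means that their order does not matter, so "replace vector file" and "file vector replace" behave the same.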

DONE multiple :results headers

:schulte

DONE file result types

When inserting into an org-mode buffer create a link with the path being the value, and optionally the display being the file-name-nondirectory if it exists.

"something"

something

This will be useful because blocks like ditaa and dot can return the string path of their files, and can add file to their results header.
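
A sketch of the link construction in Python, for illustration. file_result_link is a hypothetical name; the [[file:path][description]] form is the standard org link syntax.

```python
import os.path


def file_result_link(path):
    """Format a file result as an org link, using the file name as description."""
    if not isinstance(path, str):
        # mirrors the rule that a file result must be a string
        raise TypeError("file results must be a string path")
    name = os.path.basename(path)
    if name:
        return "[[file:%s][%s]]" % (path, name)
    return "[[file:%s]]" % path   # no file name part, so no description


link = file_result_link("images/graph.png")
```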

DONE vector result types

8
8

DONE results name

In order to do this we will need to start naming our results. Since the source blocks are named with #+srcname: lines we can name results with #+resname: lines (if the source block has no name then no name is given to the #+resname: line on creation, otherwise the name of the source block is used).

This will have the additional benefit of allowing results and source blocks to be located in different places in a buffer (and eventually in different buffers entirely).
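
The lookup this implies can be sketched as follows (Python, for illustration only). The helper names are hypothetical, and the sketch assumes that a result ends at the first blank line after its #+resname: line, which is an assumption rather than something decided in these notes.

```python
def resname_line(block_name=None):
    """Return the #+resname: line for a block, unnamed if the block has no name."""
    return "#+resname: %s" % block_name if block_name else "#+resname:"


def find_results(lines, block_name):
    """Return the result lines stored under '#+resname: block_name', or None."""
    target = resname_line(block_name).lower()
    for i, line in enumerate(lines):
        if line.strip().lower() == target:
            body = []
            for follow in lines[i + 1:]:
                if not follow.strip():   # assumed: a blank line ends the result
                    break
                body.append(follow)
            return body
    return None


buffer_lines = [
    "#+srcname: top",
    "#+begin_src emacs-lisp",
    "(+ 4 2)",
    "#+end_src",
    "",
    "Some prose between the block and its results.",
    "",
    "#+resname: top",
    ": 6",
    "",
    "more prose",
]
found = find_results(buffer_lines, "top")
```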

'schulte

Once source blocks are able to find their own #+resname: lines we then need to…

(sbe "developing-resnames")
schulte

TODO change the results insertion functions to use these lines

TODO teach references to resolve #+resname lines.

DONE org-babel tests org-babel [1/1]

since we are accumulating this nice collection of source-code blocks in the sandbox section we should make use of them as unit tests. What's more, we should be able to actually use org-babel to run these tests.

We would just need to cycle over every source code block under the sandbox, run it, and assert that the return value is equal to what we expect.

I have the feeling that this should be possible using only org-babel functions with minimal or no additional elisp. It would be very cool for org-babel to be able to test itself.
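
The cycle described above amounts to something like the following Python sketch, where plain functions stand in for source blocks. run_block_tests is a hypothetical name, and the stand-in blocks are made up for the example.

```python
def run_block_tests(tests):
    """Run each (name, block, expected) triple and report pass or fail.

    Returns rows of (name, expected, result, status), the same shape as a
    results table.
    """
    rows = []
    for name, block, expected in tests:
        try:
            result = block()
        except Exception as error:   # a failing block should not stop the run
            result = "error: %s" % error
        status = "pass" if result == expected else "fail"
        rows.append((name, expected, result, status))
    return rows


rows = run_block_tests([
    ("addition", lambda: 1 + 4, 5),
    ("joining", lambda: "-".join(["1", "2", "3"]), "1-2-3"),
    ("broken", lambda: 1 // 0, 0),
])
```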

This is now done, see Tests.

DEFERRED org-babel assertions (may not be necessary)

These could be used to make assertions about the results of a source-code block. If the assertion fails then the point could be moved to the block, and error messages and highlighting etc… could ensue

DONE make C-c C-c work anywhere within source code block?

This seems like it would be nice to me, but perhaps it would be inefficient or ugly in implementation? I suppose you could search forward, and if you find #+end_src before you find #+begin_src, then you're inside one. [DED]

Agreed, I think inside of the #+srcname: line would be useful as well.
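
A sketch of the check in Python, for illustration. It searches backward rather than forward, assumes that blocks are well formed and not nested, and does not handle the #+srcname: line; inside_source_block is a hypothetical name.

```python
def inside_source_block(lines, index):
    """True if lines[index] is part of a source block, delimiters included."""
    for i in range(index, -1, -1):
        marker = lines[i].strip().lower()
        if marker.startswith("#+begin_src"):
            return True      # reached an opening line before any closing line
        if marker.startswith("#+end_src") and i != index:
            return False     # an earlier block closed before this line
    return False


lines = ["text", "#+begin_src ruby", "ar.first", "#+end_src", "after"]
flags = [inside_source_block(lines, i) for i in range(len(lines))]
```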

'schulte

DONE integration with org tables

We should make it easy to call org-babel source blocks from org-mode table formulas. This is practical now that it is possible to pass arguments to org-babel source blocks.

See the related sandbox header for tests/examples.

digging in org-table.el

In the past org-table.el has proven difficult to work with.

Should be a hook in org-table-eval-formula.

Looks like I need to change this if statement (line 2239) into a cond expression.

DONE source blocks as functions

Allow source code blocks to be called like functions, with arguments specified. We are already able to call a source-code block and assign its return result to a variable. This would just add the ability to specify the values of the arguments to the source code block assuming any exist. For an example see

When a variable appears in a header argument, how do we differentiate between its value being a reference or a literal value? I guess this could work just like a programming language. If it's escaped or in quotes, then we count it as a literal, otherwise we try to look it up and evaluate it.
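
That rule could look like this Python sketch. resolve_argument and lookup are hypothetical names, and treating bare numbers as literals is an extra assumption beyond the quoting rule described above.

```python
def resolve_argument(text, lookup):
    """Return a literal for quoted or numeric text, else resolve a reference.

    lookup is called with the reference name and should return the value of
    the named table or source block.
    """
    text = text.strip()
    if len(text) >= 2 and text[0] == text[-1] == '"':
        return text[1:-1]            # quoted: a literal string
    for convert in (int, float):     # assumed: bare numbers are literals too
        try:
            return convert(text)
        except ValueError:
            pass
    return lookup(text)              # anything else is a reference


known = {"top": 6}
values = [resolve_argument(arg, known.__getitem__) for arg in ['"eric"', "8", "2.5", "top"]]
```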

DONE folding of code blocks? [2/2]

[DED] In a similar way to using outline-minor-mode for folding function bodies, can we fold code blocks? #+begin whatever statements are pretty ugly, and in any case when you're thinking about the overall game plan you don't necessarily want to see the code for each Step.

DONE folding of source code block

Sounds good, and wasn't too hard to implement. Code blocks should now be fold-able in the same manner as headlines (by pressing TAB on the first line).

REJECTED folding of results

So, let's do a three-stage tab cycle… First fold the src block, then fold the results, then unfold.

There's no way to tell if the results are a table or not w/o actually executing the block which would be too expensive of an operation.

DONE selective export of text, code, figures

[DED] The org-babel buffer contains everything (code, headings and notes/prose describing what you're up to, textual/numeric/graphical code output, etc). However on export to html / LaTeX one might want to include only a subset of that content. For example you might want to create a presentation of what you've done which omits the code.

[EMS] So I think this should be implemented as a property which can be set globally or on the outline header level (I need to review the mechanics of org-mode properties). And then as a source block header argument which will apply only to a specific source code block. A header argument of :export with values of

code
just show the code in the source code block
none
don't show the code or the results of the evaluation
results
just show the results of the code evaluation (don't show the actual code)
both
show both the source code, and the results

this will be done in (sandbox) selective export.
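
The four values map onto exported content roughly as in this Python sketch. exported_parts is a hypothetical name, and no default value is assumed since none is settled above.

```python
def exported_parts(code, results, exports):
    """Return the pieces of a source block to include on export, in order."""
    choices = {
        "code": [code],
        "none": [],
        "results": [results],
        "both": [code, results],
    }
    if exports not in choices:
        raise ValueError("unknown :export value: %r" % (exports,))
    return choices[exports]


parts = exported_parts("a + b", "25", "both")
```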

DONE a header argument specifying silent evaluation (no output)

This would be useful across all types of source block. Currently there is a :replace t option to control output, this could be generalized to an :output option which could take the following options (maybe more)

t
this would be the default, and would simply insert the results after the source block
replace
to replace any results which may already be there
silent
this would inhibit any insertion of the results

This is now implemented; see the example in the sandbox.

DONE assign variables from tables in R

This is now working (see (sandbox-table)-R). Although it's not that impressive until we are able to print table results from R.

DONE insert 2-D R results as tables

everything is working but R and shell

DONE shells

DONE R

This has already been tackled by Dan in org-R:check-dimensions. The functions there should be useful in combination with R-export-to-csv as a means of converting multidimensional R objects to emacs lisp.

It may be as simple as first checking if the data is multidimensional, and then, if so using write to write the data out to a temporary file from which emacs can read the data in using org-table-import.

Looking into this further, it seems that there is no such thing as a scalar in R (see R-scalar-vs-vector). In that light I am not sure how to deal with trivial vectors (scalars) in R. I'm tempted to just treat them as vectors, but then that would lead to a proliferation of trivial 1-cell tables…
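
One way around the 1-cell tables, sketched in Python for illustration: read the written-out data as a table, then collapse it to a scalar only when it holds exactly one cell. This is the alternative to treating everything as a vector, not a decision recorded here; read_cell and import_result are hypothetical names.

```python
def read_cell(text):
    """Convert a cell to a number when possible, otherwise keep the string."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass
    return text


def import_result(tsv_text):
    """Parse tab-separated output, returning a scalar for a single cell."""
    rows = [
        [read_cell(cell) for cell in line.split("\t")]
        for line in tsv_text.splitlines()
        if line
    ]
    if len(rows) == 1 and len(rows[0]) == 1:
        return rows[0][0]    # trivial table: hand back the bare value
    return rows


scalar = import_result("8\n")
table = import_result("1\t2\t3\n4\tschulte\t6\n")
```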

DONE allow variable initialization from source blocks

Currently it is possible to initialize a variable from an org-mode table with a block argument like table=sandbox (note that the variable doesn't have to be named table) as in the following example

1 2 3
4 schulte 6
(message (format "table = %S" table))
"table = ((1 2 3) (4 \"schulte\" 6))"

It would be good to allow initialization of variables from the results of other source blocks in the same manner. This would probably require the addition of #+SRCNAME: example lines for the naming of source blocks, also the table=sandbox syntax may have to be expanded to specify whether the target is a source code block or a table (alternately we could just match the first one with the given name whether it's a table or a source code block).

At least initially I'll try to implement this so that there is no need to specify whether the reference is to a table or a source-code block. That seems to be simpler both in terms of use and implementation.

This is now working for emacs-lisp, ruby and python (and mixtures of the three) source blocks. See the examples in the sandbox.

This is currently working only with emacs lisp as in the following example in the emacs lisp source reference.

TODO Add languages [0/5]

I'm sure there are many more that aren't listed here. Please add them, and bubble any that you particularly care about up to the top.

Any new language should be implemented in an org-babel-lang.el file. Follow the pattern set by org-babel-script.el, org-babel-shell.el and org-babel-R.el.

TODO perl

This could probably be added to org-babel-script.el

TODO java

TODO ditaa

TODO dot

TODO asymptote

Bugs [11/14]

TODO non-orgtbl formatted lists

for example

'((:results . "replace"))

TODO collapsing consecutive newlines in string output

"the first line ends here


     and this is the second one

even a third"
the first line ends here
	     and this is the second one
	return even a third

TODO cursor movement when evaluating source blocks

E.g. the pie chart example. Despite the save-window-excursion in org-babel-execute:R. (I never learned how to do this properly: org-R jumps all over the place…)

DONE R-code broke on "org-babel" rename

8 * 2

DONE error on trivial R results

So I know it's generally not a good idea to squash errors without handling them, but in this case the error almost always means that there was no file contents to be read by org-table-import, so I think it's ok.

pie(c(1, 2, 3), labels = c(1, 2, 3))
8
8
c(1,2,3)
1
2
3

DONE ruby new variable creation (multi-line ruby blocks)

Actually it looks like we were dropping all but the last line.

total = 0
table.each{|n| total += n}
total/table.size
2

DONE R code execution seems to choke on certain inputs

Currently the R code seems to work on vertical (but not landscape) tables

"schulte"
num
schulte
11
11
11
schulte
9
9
11
(setq debug-on-error t)
'(1 2 3)
mean(mean(table))
2
1
2
3
mean(table)

DEFERRED org bug/request: prevent certain org behaviour within code blocks

E.g. gets recognised as a link (when there's text inside the brackets). This is bad for R code at least, and more generally could be argued to be inappropriate. Is it difficult to get org to ignore text in code blocks? [DED]

I believe Carsten addressed this recently on the mailing list with the comment that it was indeed a difficult issue. I believe this may be one area where we could wait for an upstream (org-mode) fix.

DONE with :results replace, non-table output doesn't replace table output

And vice versa. E.g. Try this first with table and then with len(table) [DED]

table
1 2 3
4 "schulte" 6
2

Yes, this is certainly a problem. I fear that if we begin replacing anything immediately following a source block (regardless of whether it matches the type of our current results) we may accidentally delete hand written portions of the user's org-mode buffer.

I think that the best solution here would be to actually start labeling results with a line that looks something like…

This would have a couple of benefits…

  1. we wouldn't have to worry about possibly deleting non-results (which is currently an issue)
  2. we could reliably replace results even if there are different types
  3. we could reference the results of a source-code block in variable definitions, which would be useful if for example we don't wish to re-run a source-block every time because it is long-running.

Thoughts? If no-one objects, I believe I will implement the labeling of results.

DONE extra quotes for nested string

Well R appears to be reading the tables without issue…

these should be quoted

ls
"COPYING"
"README.markdown"
"block"
"examples.org"
"existing_tools"
"intro.org"
"org-babel"
"rorg.org"
"test-export.html"
"test-export.org"
tab[1][0]
README.markdown
as.matrix(tab[2,])
README.markdown

DONE simple ruby arrays not working

As an example eval the following. Adding a line to test

[3, 4, 5]
ar.first

DONE space trailing language name

fix regexp so it works when there's a space trailing the language name

:schulte

DONE Args out of range error

The following block resulted in the error below [DED]. It ran without error directly in the shell.

cd ~/work/genopca
for platf in ill aff ; do
    for pop in CEU YRI ASI ; do
	rm -f $platf/hapmap-genos-$pop-all $platf/hapmap-rs-all
	cat $platf/hapmap-genos-$pop-* > $platf/hapmap-genos-$pop-all
	cat $platf/hapmap-rs-* > $platf/hapmap-rs-all
    done
done

executing source block with sh… finished executing source block string-equal: Args out of range: "", -1, 0

the error string-equal: Args out of range: "", -1, 0 looks like what used to be output when the block returned an empty results string. This should be fixed in the current version, you should now see the following message no result returned by source block.

DONE ruby arrays not recognized as such

Something is wrong in lisp/org-babel-script.el related to the recognition of ruby arrays as such.

[1, 2, 3, 4]
1 2 3 4
[1, 2, 3, 4]
1 2 3 4

Tests

Evaluate all the cells in this table for a comprehensive test of the org-babel functionality.

Note: if you have customized org-babel-default-header-args then some of these tests may fail.

functionality block arg expected results pass
basic evaluation pass
emacs lisp basic-elisp 5 5 pass
shell basic-shell 6 6 pass
ruby basic-ruby org-babel org-babel pass
python basic-python hello world hello world pass
R basic-R 13 13 pass
tables pass
emacs lisp table-elisp 3 3 pass
ruby table-ruby 1-2-3 1-2-3 pass
python table-python 5 5 pass
R table-R 3.5 3.5 pass
source block references pass
all languages chained-ref-last Array Array pass
source block functions pass
emacs lisp defun-fibb fibbd fibbd pass
run over Fibonacci 0 1 1 pass
a Fibonacci 1 1 1 pass
variety Fibonacci 2 2 2 pass
of Fibonacci 3 3 3 pass
different Fibonacci 4 5 5 pass
arguments Fibonacci 5 8 8 pass
bugs and tasks pass
simple ruby arrays ruby-array-test 3 3 pass
R number evaluation bug-R-number-evaluation 2 2 pass
multi-line ruby blocks multi-line-ruby-test 2 2 pass
forcing vector results test-forced-vector-results Array Array pass

basic tests

(+ 1 4)
expr 1 + 5
"org-babel"
'hello world'
b <- 9
b + 4

read tables

1 2 3
4 5 6
(length (car table))
table.first.join("-")
table[1][1]
mean(mean(table))

references

Let's pass a reference through all of our languages…

Let's start by reversing the table from the previous examples

table.reverse()
table
4 5 6
1 2 3

Take the first part of the list

table[1]
4
1

Turn the numbers into strings

(mapcar (lambda (el) (format "%S" el)) table)
"(4)" "(1)"

and check that it is still a list

table.class.name

source blocks as functions

(defun fibbd (n) (if (< n 2) 1 (+ (fibbd (- n 1)) (fibbd (- n 2)))))
(fibbd n)

sbe tests

Testing the insertion of results into org-mode tables.

"the first line ends here


     and this is the second one

even a third"
the first line ends here
	     and this is the second one
	return even a third
raise "oh nooooooooooo"
-:5: warning: parenthesize argument(s) for future version
-:5:in `main': oh nooooooooooo (RuntimeError)
	from -:8
the first line ends here… -:5: warning: parenthesize argument(s) for future version…

forcing results types tests

8
triv.class.name

Sandbox

To run these examples evaluate org-babel-init.el

org-babel.el beginning functionality

date
Thu May 14 18:52:25 EDT 2009
Time.now
Thu May 14 18:59:09 -0400 2009
"Hello World"
Hello World

org-babel-R

a <- 9
b <- 16
a + b
25
hist(rgamma(20,3,3))

org-babel plays with tables

Alright, this should demonstrate both the ability of org-babel to read tables into a lisp source code block, and to then convert the results of the source code block into an org table. It's using the classic "lisp is elegant" demonstration transpose function. To try this out…

  1. evaluate lisp/org-babel-init.el to load org-babel and friends
  2. evaluate the transpose definition \C-c\C-c on the beginning of the source block
  3. evaluate the next source code block, this should read in the table because of the :var table=previous, then transpose the table, and finally it should insert the transposed table into the buffer immediately following the block

Emacs lisp

(defun transpose (table)
  (apply #'mapcar* #'list table))
1 2 3
4 schulte 6
(transpose table)
'(1 2 3 4 5)
1 2 3 4 5

Ruby and Python

table.first.join(" - ")
"1 - 2 - 3"
table[0]
1 2 3
table
1 2 3
4 "schulte" 6
len(table)
2
"add" "class" "contains" "delattr" "delitem" "delslice" "doc" "eq" "format" "ge" "getattribute" "getitem" "getslice" "gt" "hash" "iadd" "imul" "init" "iter" "le" "len" "lt" "mul" "ne" "new" "reduce" "reduce_ex" "repr" "reversed" "rmul" "setattr" "setitem" "setslice" "sizeof" "str" "subclasshook" "append" "count" "extend" "index" "insert" "pop" "remove" "reverse" "sort"

(sandbox table) R

1 2 3
4 schulte 6
x <- c(rnorm(10, mean=-3, sd=1), rnorm(10, mean=3, sd=1))
x
-3.35473133869346
-2.45714878661
-3.32819924928633
-2.97310212756194
-2.09640758369576
-5.06054014378736
-2.20713700711221
-1.37618039712037
-1.95839385821742
-3.90407396475502
2.51168071590226
3.96753011570494
3.31793212627865
1.99829753972341
4.00403686419829
4.63723764452927
3.94636744261313
3.58355906547775
3.01563442274226
1.7634976849927
tabel
1 2 3
4 "schulte" 6

shell

Now shell commands are converted to tables using org-table-import and if these tables are non-trivial (i.e. have multiple elements) then they are imported as org-mode tables…

ls -l
"total" 208 "" "" "" "" "" ""
"-rw-rr" 1 "dan" "dan" 57 2009 15 "block"
"-rw-rr" 1 "dan" "dan" 35147 2009 15 "COPYING"
"-rw-rr" 1 "dan" "dan" 722 2009 18 "examples.org"
"drwxr-xr-x" 4 "dan" "dan" 4096 2009 19 "existing_tools"
"-rw-rr" 1 "dan" "dan" 2207 2009 14 "intro.org"
"drwxr-xr-x" 2 "dan" "dan" 4096 2009 18 "org-babel"
"-rw-rr" 1 "dan" "dan" 277 2009 20 "README.markdown"
"-rw-rr" 1 "dan" "dan" 11837 2009 18 "rorg.html"
"-rw-rr" 1 "dan" "dan" 61829 2009 19 "#rorg.org#"
"-rw-rr" 1 "dan" "dan" 60190 2009 19 "rorg.org"
"-rw-rr" 1 "dan" "dan" 972 2009 11 "test-export.org"

silent evaluation

:im_the_results
:im_the_results
:im_the_results
:im_the_results_
:im_the_results_

(sandbox) referencing other source blocks

Doing this in emacs-lisp first because it's trivial to convert emacs-lisp results to and from emacs-lisp.

emacs lisp source reference

This first example performs a calculation in the first source block named top, the results of this calculation are then saved into the variable first by the header argument :var first=top, and it is used in the calculations of the second source block.

(+ 4 2)
(* first 3)
18

This example is the same as the previous only the variable being passed through is a table rather than a number.

(defun transpose (table)
  (apply #'mapcar* #'list table))
1 2 3
4 schulte 6
(transpose table)
(transpose table)
1 2 3
4 "schulte" 6

ruby python

Now working for ruby

89
2 * other

and for python

98
another*3

mixed languages

Since all variables are converted into Emacs Lisp it is no problem to reference variables specified in another language.

2
(* ruby-variable 8)
lisp_var + 4
20

R

a <- 9
a
9
other + 2
11

(sandbox) selective export

For exportation tests and examples (including exportation of inline source code blocks) see test-export.org

(sandbox) source blocks as functions

5
(* 3 n)
15
result
294

The following just demonstrates the ability to assign variables to literal values, which was not implemented until recently.

num+" schulte "
"eric schulte "

(sandbox) inline source blocks

This is an inline source code block

1 + 6
. And another source block with text output src_emacs-lisp{"eric"}.

This is an inline source code block with header arguments.

n

(sandbox) integration w/org tables

(defun fibbd (n) (if (< n 2) 1 (+ (fibbd (- n 1)) (fibbd (- n 2)))))
(fibbd n)
(mapcar #'fibbd '(0 1 2 3 4 5 6 7 8))

Something is not working here. The function `sbe' works fine when called from outside of the table (see the source block below), but produces an error when called from inside the table. I think there must be some narrowing going on during intra-table emacs-lisp evaluation.

original fibbd
0 1
1 1
2 2
3 3
4 5
5 8
6 13
7 21
8 34
9 55

silent-result

(sbe 'fibbd (n "8"))

Buffer Dictionary

LocalWords: DBlocks dblocks org-babel el eric fontification