Introduction

This message contains my thoughts on a feature request that I think would be useful: the :pre header argument. It would be used to execute a code block before the code block at point is executed, mirroring the behavior of the :post header argument (which executes a source block after the current one has been executed).

Having explained the purpose of the :pre header argument, it is worth mentioning that this behavior can currently be mimicked with the :var header argument: referencing a code block by name in :var executes it before the current source block is executed. Further information can be found here.

I provide reasons why I think :pre would be useful here. This specific reason could be considered a big disadvantage of continuing to use :var for this purpose.

For those who would like to know how executing a source code block beforehand might be useful in some scenarios, you can find some use cases here.

Motivation for adding the :pre header argument

It would improve readability of code blocks by explicitly expressing dependency

A header argument whose sole purpose is to express dependencies between code blocks would make header arguments easier to read. Recall that it is currently possible to express such a dependency by calling a code block through a :var <<name>> header argument, but I think the :var header argument should only be used for defining variables (whether from literals or from the results of other code blocks).

The two code blocks shown below illustrate the difference between using :var and :pre for the same scenario. first-code-block uses the :var header argument, while second-code-block uses the :pre header argument.

For our experimentation, let's start with an empty directory and let's
execute the first step.

#+NAME: first-step
#+HEADER: :results silent
#+HEADER: :var e=clean-path-experiments
#+begin_src dash
touch first-step.txt
#+end_src

We now execute the second step.

#+NAME: second-step
#+HEADER: :results silent
#+HEADER: :var e=first-step
#+begin_src dash
touch second-step.txt
#+end_src

Finally, we execute the third step.

#+NAME: third-step
#+HEADER: :results silent
#+HEADER: :var e=second-step
#+begin_src dash
touch third-step.txt
#+end_src

For our experimentation, let's start with an empty directory and let's
execute the first step.

#+NAME: first-step
#+HEADER: :results silent
#+HEADER: :pre clean-path-experiments()
#+begin_src dash
touch first-step.txt
#+end_src

We now execute the second step.

#+NAME: second-step
#+HEADER: :results silent
#+HEADER: :pre first-step()
#+begin_src dash
touch second-step.txt
#+end_src

Finally, we execute the third step.

#+NAME: third-step
#+HEADER: :results silent
#+HEADER: :pre second-step()
#+begin_src dash
touch third-step.txt
#+end_src

In my opinion, the second code block looks more readable than the first one.
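
To make the dependency chain in both versions concrete, here is a toy Python model of the order in which the blocks would run. It is only a sketch under stated assumptions: the `DEPENDS_ON` table and the `execution_order` helper are hypothetical names introduced for illustration, and this is not how Org Babel resolves references internally.

```python
# Toy model (not Org's implementation): each block names the block that
# must run before it, whether that is spelled with :var or with :pre.
DEPENDS_ON = {
    "clean-path-experiments": None,
    "first-step": "clean-path-experiments",
    "second-step": "first-step",
    "third-step": "second-step",
}


def execution_order(block):
    """Return the blocks that run, in order, when `block` is executed."""
    order = []
    while block is not None:
        order.append(block)
        block = DEPENDS_ON[block]
    # Dependencies were collected last-to-first, so reverse them.
    return list(reversed(order))


print(execution_order("third-step"))
```

Both spellings describe the same chain; the difference argued for here is only that :pre states the dependency without introducing an unused variable such as e.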

It wouldn't clutter up the expansion of code blocks

Consider the following simple example:

For our experimentation, we will need to execute two important steps.

#+NAME: first-step
#+begin_src dash
touch first-step.txt
#+end_src

#+NAME: second-step
#+begin_src dash
touch second-step.txt
#+end_src

The following code block will execute XYZ, but it first needs the steps
shown above to be executed. It doesn't define variables outside the
=main= function. You can find the content of this code block in
=big-computation.cpp=. Visit that file and see how symbols are created
using a debugging tool.

#+NAME: big-computation
#+HEADER: :var e2=second-step()
#+HEADER: :var e1=first-step()
#+HEADER: :tangle big-computation.cpp
#+HEADER: :main no
#+begin_src cpp
#include <iostream>
#include <fstream>

int main() {
  // ...
  // A lot of steps
  // ...

  return 0;
}
#+end_src

Even though our document asserts that the code block doesn't define variables outside the main function, the expansion of the C++ code block (and therefore the file produced by tangling it) is:

const char* e1 = "";
const char* e2 = "";

#include <iostream>
#include <fstream>

int main() {
  // ...
  // A lot of steps
  // ...

  return 0;
}

This can be avoided by using the :pre header argument as follows:

#+NAME: big-computation
#+HEADER: :pre '(first-step second-step)
#+HEADER: :tangle big-computation.cpp
#+HEADER: :main no
#+begin_src cpp
#include <iostream>
#include <fstream>

int main() {
  // ...
  // A lot of steps
  // ...

  return 0;
}
#+end_src
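
The difference between the two versions of big-computation can be modeled with a short Python sketch. It imitates the expansion shown above, where every :var binding becomes a declaration placed before the body. The `expand_cpp_block` helper is a hypothetical name, and the declaration format is copied from the expansion in this message rather than taken from the ob-C source.

```python
def expand_cpp_block(body, variables):
    """Prepend one string declaration per :var binding to the block body.

    Toy model of the expansion shown in this message, not ob-C's code.
    """
    if not variables:
        # No :var bindings (as with the proposed :pre): body is unchanged.
        return body
    declarations = [
        f'const char* {name} = "{value}";' for name, value in variables.items()
    ]
    return "\n".join(declarations + ["", body])


body = "#include <iostream>\n\nint main() {\n  return 0;\n}"

# Dependencies expressed through :var leak into the tangled file.
with_var = expand_cpp_block(body, {"e1": "", "e2": ""})

# Dependencies expressed without :var leave the body untouched.
without_var = expand_cpp_block(body, {})

print(with_var)
```

In this model, only the :var version changes the text that ends up in the tangled file.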

It would add importance to the :post header argument

The :post header argument can be used in Org Mode 9.3 to execute a given code block after the code block at point is executed. A header argument that does the opposite, executing a code block before the one at point, would complement :post and make it more relevant.

Appendix

Mimicking the behavior of :pre using the :var header argument

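We can make the Python code block execute create-sample-file before it is executed by using the :var header argument. Here, create-sample-file is the shell block that writes main.txt, and the Python block counts the lines of that file.

cat << EOF > main.txt
foo
bar
fizz
buzz
EOF

with open('main.txt', 'r') as f:
    print(len(f.readlines()))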

In which scenarios is executing a source code block beforehand useful?

This section is not meant as an argument for implementing the :pre header argument, since what it would do can currently be accomplished with the :var header argument. More information on this can be found here.

Executing a code block beforehand can be useful for creating minimal reproducible examples.

The following code block cleans the directory which is used for
experimentation purposes.

#+NAME: experiments/clean-dir
#+begin_src dash
if [ ! -z "$my__experiments" ] && [ -d "$my__experiments" ]; then
  find "$my__experiments" -mindepth 1 -maxdepth 1 -exec rm -rf {} +
fi
#+end_src

The following code block executes =tree= so that the directory organization is displayed.

#+NAME: experiments/execute-tree
#+begin_src dash
tree -a --noreport "$my__experiments"
#+end_src

The following code block creates the directory structure for our
minimal reproducible example.

#+NAME: minimal-reproducible-example/create-dir-structure
#+HEADER: :var e=experiments/clean-dir()
#+HEADER: :post experiments/execute-tree()
#+begin_src python
import os

for directory in ["a/foo", "a/bar", "b"]:
    os.makedirs(directory)

for path in ["a/1.txt", "a/2.txt", "a/foo/b.txt", "a/bar/b.txt", "b/b.txt"]:
    os.mknod(path)
#+end_src

#+RESULTS: minimal-reproducible-example/create-dir-structure
#+begin_example
/home/beep1560/e
├── a
│   ├── 1.txt
│   ├── 2.txt
│   ├── bar
│   │   └── b.txt
│   └── foo
│       └── b.txt
└── b
    └── b.txt
#+end_example

It can also be useful in a tutorial, since it makes it possible to reproduce a specific state of the tutorial. Consider the simple example below. If we execute tutorial-third-step, only the steps before it are executed first. If we then execute tutorial-second-step, the results of tutorial-third-step are deleted, because experiments/clean-dir runs again at the start of the chain.

#+NAME: experiments/clean-dir
#+begin_src dash
if [ ! -z "$my__experiments" ] && [ -d "$my__experiments" ]; then
  find "$my__experiments" -mindepth 1 -maxdepth 1 -exec rm -rf {} +
fi
#+end_src

#+NAME: tutorial-first-step
#+HEADER: :results silent
#+HEADER: :var e=experiments/clean-dir
#+begin_src python
import os
os.mknod("first-step.txt")
#+end_src

#+NAME: tutorial-second-step
#+HEADER: :results silent
#+HEADER: :var e=tutorial-first-step()
#+begin_src elisp
(let ((default-directory (getenv "my__experiments")))
  (write-region "" nil "second-step.txt"))
#+end_src

#+NAME: tutorial-third-step
#+HEADER: :results silent
#+HEADER: :var e=tutorial-second-step()
#+HEADER: :includes '("<iostream>" "<fstream>")
#+begin_src cpp
std::ofstream myfile;
myfile.open ("third-step.txt");
myfile.close();
#+end_src

#+NAME: tutorial-fourth-step
#+HEADER: :results silent
#+HEADER: :var e=tutorial-third-step()
#+begin_src bash
touch fourth-step.txt
#+end_src
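
The state changes described for this tutorial can be checked with a small Python simulation. It is a sketch only: a temporary directory stands in for the directory named by my__experiments, and the `STEPS` table and the `execute`, `clean_dir`, and `touch` helpers are hypothetical names that imitate the chain of :var references rather than Org's own evaluation.

```python
import os
import tempfile

# Hypothetical stand-in for the directory named by $my__experiments.
experiments = tempfile.mkdtemp()


def clean_dir():
    """Remove every file in the experiments directory."""
    for name in os.listdir(experiments):
        os.remove(os.path.join(experiments, name))


def touch(filename):
    """Create an empty file in the experiments directory."""
    open(os.path.join(experiments, filename), "w").close()


# Each step lists the step it depends on and the file it creates.
STEPS = {
    "tutorial-first-step": (None, "first-step.txt"),
    "tutorial-second-step": ("tutorial-first-step", "second-step.txt"),
    "tutorial-third-step": ("tutorial-second-step", "third-step.txt"),
    "tutorial-fourth-step": ("tutorial-third-step", "fourth-step.txt"),
}


def execute(step):
    """Run the dependency of `step` first, then `step` itself."""
    dependency, filename = STEPS[step]
    if dependency is None:
        clean_dir()  # the first step depends on experiments/clean-dir
    else:
        execute(dependency)
    touch(filename)


execute("tutorial-third-step")
after_third = sorted(os.listdir(experiments))

execute("tutorial-second-step")
after_second = sorted(os.listdir(experiments))

print(after_third)   # first, second and third step files
print(after_second)  # third-step.txt is gone after the chain restarts
```

Because every chain starts with the cleaning step, executing any step leaves the directory in exactly the state that step is meant to show.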