This is a proof of concept for a Harris matrix created with Python and Graphviz, which I wrote in 2007-2008. A short Python script reads the stratigraphic data from a SQLite database and feeds it to Graphviz, which draws the matrix.
To see an example, install Graphviz, then:
pip install pygraphviz
python harris.py
This will create the `matrix.png` image, with the example Harris matrix described in the `example/matrix.db` database. Read below for more details!
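The database-to-Graphviz step can be sketched in a few lines. This is only an illustration, not the code in `harris.py`: the layout of `example/matrix.db` is not described here, so the `relations(later, earlier)` table and the `relations_to_dot` helper below are hypothetical, and an in-memory database stands in for the real file.

```python
import sqlite3


def relations_to_dot(conn, graph_name="matrix"):
    """Return DOT source with one edge per (later, earlier) row."""
    # Hypothetical schema: one row per stratigraphic relation.
    rows = conn.execute(
        "SELECT later, earlier FROM relations ORDER BY later, earlier"
    ).fetchall()
    lines = [f"digraph {graph_name} {{"]
    lines += [f"    {later} -> {earlier}" for later, earlier in rows]
    lines.append("}")
    return "\n".join(lines)


# In-memory stand-in for a real database file (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE relations (later INTEGER, earlier INTEGER)")
conn.executemany(
    "INSERT INTO relations VALUES (?, ?)",
    [(723, 722), (722, 731), (731, 730)],
)
dot_source = relations_to_dot(conn)
print(dot_source)
```

The resulting text can be written to a `.dot` file or handed to Graphviz directly.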
On most archaeological excavations adopting the methodology of single context recording, the large number of stratigraphic units makes it necessary to use some sort of representation of the relative chronological sequence to keep track of what has already been excavated (not to mention building archaeology). This conceptual tool is the Harris matrix, used on paper for decades since its inception in the late 1960s.
From the theory of E. C. Harris, we know that all stratigraphic relations are bound to what I called an A-B-C model:
(Contemporaneity without physical equality is rather problematic, though).
Software applications aimed at creating a digital Harris matrix include WinBASP and ArchEd. (Win)BASP is, to my knowledge, the earliest example of a data management environment that (correctly) did away with representing the Harris matrix as a drawing, focusing instead on the underlying stratigraphic data. The Harris matrix can be formally defined as a directed graph running from the most recent deposits down to the older ones, where the nodes represent layers, which are connected through stratigraphic relations (edges).
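Treating the matrix as a directed graph has a practical consequence: a consistent sequence must contain no cycles, and any acyclic graph can be sorted from most recent to oldest. A minimal sketch using the standard library `graphlib` module (Python 3.9 or later); the `older_than` mapping and the `chronological_order` helper are names made up for this example:

```python
from graphlib import CycleError, TopologicalSorter  # Python 3.9+

# Each key is a unit; its set holds the units directly below it (older).
older_than = {
    723: {722},
    722: {731},
    731: {730},
    730: {729, 726},
    726: {729},
}


def chronological_order(graph):
    """Return units from most recent to oldest, or raise on a cycle."""
    # TopologicalSorter treats the mapped sets as predecessors, so the
    # oldest units come out first; reverse to read the matrix top-down.
    order = list(TopologicalSorter(graph).static_order())
    return order[::-1]


order = chronological_order(older_than)
print(order)

# Two contradictory relations (1 later than 2, 2 later than 1) are
# reported as a cycle instead of being drawn.
try:
    chronological_order({1: {2}, 2: {1}})
    has_cycle = False
except CycleError:
    has_cycle = True
print(has_cycle)
```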
In 2007 and 2008 I spent some time experimenting with Graphviz for automating the creation of the Harris Matrix for the excavation of Gortyna in Crete. This repository contains the small Python application I had been writing to demonstrate how to automate the use of Graphviz to generate Harris matrix diagrams for that excavation. The application is far from complete and has no GUI, but it shows the model I had been developing from the first examples, where all steps were to be performed “by hand”. I published two blog posts (2007, 2008) detailing the experiment.
In the following years, there have been two interesting software tools based on the same data first, Graphviz later principle:
In the meantime, I'm afraid the vast majority of Harris matrices are drawn using Illustrator or Excel.
Using Graphviz directly is an instructive exercise and is easier to understand if you're just starting. We will be writing a plain text file that is used for both:
Graphviz has its own native, plain text format, which is documented on the website. Graphviz `.dot` files can be read and written with any text editor like Emacs, Vim, or Notepad++. Keeping a file of this kind is the obvious choice for experimenting, even though the single-file approach is not very efficient for real-world data.
This is a sample from the final `.dot` file I had compiled during the excavation weeks in Gortyna:
digraph matrix {
723->722
505->732
729->732
731->730->729
726->729
730->726
726->810->725
729->810->725
729->733->792->793
722->731
732->737->736->733
733->810->725
729->505
736->506
505->506
179->759
759->725
759->737
759->769->768->778
768->303
737->739->736->778
736->769
778->303
506->303
769->506
769->780
778->779
736->773->774->779->780
779->303
780->303
506->780
505->724
}
You can save this file as `harris-matrix.dot` and follow along with the code examples below.
Apart from the initial preamble, it's a ridiculously easy syntax. The Harris Matrix is to be read top-down, so, for example, `A -> B` means “A is later than B”. You can also concatenate multiple relations on the same row. Indenting is not mandatory, but it helps keep your file clean. You can write C++-style comments after `//` on any line (Graphviz also discards whole lines that begin with `#`, but a `#` in the middle of a line is a syntax error), like
// this is a comment
A -> B -> C
A -> D -> E // this one too!
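To make the chained notation concrete, here is a small sketch that expands each row into its individual relations. It is not a DOT parser: the hypothetical `expand_chains` helper only handles the subset used in this file (bare node names, `->`, and comments).

```python
def expand_chains(dot_body):
    """Split rows such as 'A -> B -> C' into (later, earlier) pairs."""
    edges = []
    for raw_line in dot_body.splitlines():
        line = raw_line.split("//")[0].strip()  # drop trailing comments
        if line.startswith("#") or "->" not in line:
            continue  # skip comment lines, the preamble, braces, blanks
        nodes = [name.strip() for name in line.split("->")]
        # Pair each node with the one that follows it in the chain.
        edges.extend(zip(nodes, nodes[1:]))
    return edges


sample = """
726->810->725
729->733->792->793
722->731  // a single relation
"""
edges = expand_chains(sample)
print(edges)
```

A row with three nodes yields two relations, a row with four nodes yields three, and so on.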
It's not that difficult to keep this file updated by hand, really. One thing you might worry about is redundant relations, which can easily make your graph ugly and unreadable. But this is about automation, so redundant data isn't going to be a problem: we'll be recording every relation.
I mentioned above that the Harris Matrix is a directed graph. Graphviz comes with a lot of tools, but only one does what we need, and it's named `dot`. From the command line we can just run
dot harris-matrix.dot -Tpng -o harris-matrix.png
and get our data compiled as a graph almost instantly. The `-Tpng` command line option specifies which of the many available output formats we want. The `-o` flag (that is, option) precedes the output filename.
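If you would rather drive `dot` from a script, the same call can be wrapped with the standard `subprocess` module. This is a sketch: `dot_command` and `render` are hypothetical helper names, and `render` simply gives up when Graphviz is not on the `PATH`.

```python
import shutil
import subprocess


def dot_command(dot_file, output_file, fmt="png"):
    """Build the argument list for the dot call shown above."""
    return ["dot", dot_file, f"-T{fmt}", "-o", output_file]


def render(dot_file, output_file, fmt="png"):
    """Run dot if it is installed; return True when it was executed."""
    if shutil.which("dot") is None:
        return False  # Graphviz is not on the PATH
    subprocess.run(dot_command(dot_file, output_file, fmt), check=True)
    return True


command = dot_command("harris-matrix.dot", "harris-matrix.png")
print(" ".join(command))
```

Calling `render("harris-matrix.dot", "harris-matrix.svg", fmt="svg")` would request SVG output instead, since `-T` accepts other formats as well.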
So far, the result is quite good. But redundant relations are still there, and I promised they wouldn't be a problem at all.
Here's where the power of UNIX comes in handy. `tred` is another of the many tools provided by Graphviz, and it acts as a “transitive reduction filter for directed graphs”. So, it has to run before `dot` reads the input file. A pipe (represented by the `|` character) is the easiest way to pass data from one program to another in UNIX style. Here's how I did it:
tred harris-matrix.dot | dot -Tpng -o harris-matrix-tred.png
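To see what a transitive reduction does to the data, here is a naive sketch in Python. It is not how `tred` is implemented, and it assumes the relations contain no cycles: a relation is dropped whenever the earlier unit can still be reached from the later one through some other path.

```python
def transitive_reduction(edges):
    """Drop every edge u -> v for which a longer path u ... v exists."""
    successors = {}
    for later, earlier in edges:
        successors.setdefault(later, set()).add(earlier)
        successors.setdefault(earlier, set())

    def reachable(start):
        """Return every node below start (start itself excluded)."""
        seen, stack = set(), [start]
        while stack:
            for nxt in successors[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    kept = []
    for later, earlier in sorted(set(edges)):
        # Redundant if another direct successor already leads to `earlier`.
        redundant = any(
            earlier in reachable(mid)
            for mid in successors[later]
            if mid != earlier
        )
        if not redundant:
            kept.append((later, earlier))
    return kept


# Four relations from the sample file; 730 -> 729 is implied by
# 730 -> 726 -> 729 and should disappear.
relations = [(731, 730), (730, 729), (726, 729), (730, 726)]
reduced = transitive_reduction(relations)
print(reduced)
```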
Note that `dot` by default accepts input from stdin, while `tred` by default uses stdout as output. Many simple programs that each do one single operation, well done: this is the core of the UNIX philosophy, and Graphviz follows it. Once you understand this concept, things will be much easier. The output of this second command is slightly different from the first one.
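The stdin/stdout chaining is not tied to the shell. As a sketch, assuming both Graphviz tools are installed, the same two-step pipeline can be reproduced from Python by passing the output of `tred` as the input of `dot`; `dot_cmd` and `render_reduced` are hypothetical helper names.

```python
import shutil
import subprocess

TRED_CMD = ["tred"]  # no file argument: reads DOT from stdin


def dot_cmd(output_file):
    """Arguments for dot reading from stdin and writing a PNG file."""
    return ["dot", "-Tpng", "-o", output_file]


def render_reduced(dot_source, output_file):
    """Feed DOT text through tred, then dot, like the shell pipe."""
    if shutil.which("tred") is None or shutil.which("dot") is None:
        return False  # Graphviz tools are not on the PATH
    # Step 1: tred writes the reduced graph to its stdout.
    reduced = subprocess.run(
        TRED_CMD, input=dot_source, capture_output=True, text=True, check=True
    ).stdout
    # Step 2: dot reads that text from its stdin and renders it.
    subprocess.run(dot_cmd(output_file), input=reduced, text=True, check=True)
    return True
```

For example, `render_reduced(open("harris-matrix.dot").read(), "harris-matrix-tred.png")` would mirror the shell command.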
You can play around with some general options to change the graphic layout of your graph. These are two options I often use to get better looking Harris Matrices:
digraph matrix { // these two options go at the beginning of the graph file
concentrate=true;
node[shape=rect];