Monday, 6 July 2009

I have been known to abandon blogs a month in.

In this case I'm busy, but I've been working on the code to automatically identify pericopes and correspondences in the NT. It's a pretty tough problem to tackle in odd hours here and there.

Still, the data I'm getting out is almost exactly the same as the published pericope correspondences for the synoptics, which is great. Some niggly tweaks still to go.

One in particular is tricky: where to break. I think this might be a human post-processing step. Sometimes we need to put a pericope boundary between words that don't appear in either pericope. The software can't work out whether the boundary should be there, or whether either or both words are just words omitted from one of the bounded pericopes. I can't see any way it could possibly do that, so I think I may have to tidy the data later.
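To make the ambiguity concrete, here is a minimal sketch in Python. It assumes the unmatched words between two pericopes have already been collected into a list; the function name `candidate_breaks` and the placeholder words are made up for illustration and are not taken from the actual code. With n unmatched words there are n + 1 equally plausible places to put the boundary, and nothing in the matching data prefers one over another.

```python
def candidate_breaks(gap_words):
    """List every way to split the unmatched gap words between two pericopes.

    Each result is (words_left_with_previous, words_left_with_next).
    """
    return [(gap_words[:i], gap_words[i:]) for i in range(len(gap_words) + 1)]


# Two unmatched words (placeholders) sitting between two matched pericopes.
for before, after in candidate_breaks(["w1", "w2"]):
    print("previous pericope keeps", before, "| next pericope keeps", after)
```

A human editor can pick between these by reading the text; the word-matching data alone cannot.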

Still, the Codex Sinaiticus website might eat the time I have to work on this for the next few days.

I'm sure this will be all over the biblioblogs in a few hours, but the Codex Sinaiticus project has launched their new website with the complete facsimile leaves of the codex to browse through.

I'm having trouble with speed at the moment; I guess everyone is rushing there. But it is beautiful.

You can find it at

Have fun.

If you're unaware of what it is, see the Wikipedia article for an overview.

Wednesday, 10 June 2009

This is the colormap with John as well as the three synoptics.

Because of the way color-blending works, John's colors are calculated differently.

1. John only is brown.

2. Any other color in John shows shared content with the same color in the other gospels, so Yellow appearing in John shows shared content in Matt + Mark. Note that the opposite isn't true: Yellow in Matt doesn't mean it is shared with John.

3. There is no indication at all in the first three gospels of what is shared with John. The colors in the first three are purely indicating sharing between the 3 synoptics.
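The three rules above can be sketched in Python as follows. This is an illustration only, assuming each word in John has already been tagged with the set of synoptic gospels it is shared with. The names `SYNOPTIC_COLORS`, `BROWN` and `john_word_color` are hypothetical, and the RGB value chosen for brown is an assumption.

```python
# Colors used for the three synoptics (from the earlier color map post).
SYNOPTIC_COLORS = {
    frozenset({"Matt"}): (255, 0, 0),            # red
    frozenset({"Mark"}): (0, 255, 0),            # green
    frozenset({"Luke"}): (0, 0, 255),            # blue
    frozenset({"Matt", "Mark"}): (255, 255, 0),  # yellow
    frozenset({"Mark", "Luke"}): (0, 255, 255),  # cyan
    frozenset({"Matt", "Luke"}): (255, 0, 255),  # magenta
    frozenset({"Matt", "Mark", "Luke"}): (0, 0, 0),  # black
}

BROWN = (150, 75, 0)  # assumed RGB value for "brown"


def john_word_color(shared_with):
    """Color for a word in John, given the synoptic gospels it is shared with."""
    shared = frozenset(shared_with)
    if not shared:
        return BROWN  # rule 1: material found only in John
    return SYNOPTIC_COLORS[shared]  # rule 2: reuse the synoptic color


print(john_word_color([]))                # John only
print(john_word_color(["Matt", "Mark"]))  # shared with Matt + Mark
```

Rule 3 is the absence of any reverse lookup: the colors of the synoptic words are computed without consulting John at all.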

Clearly John hasn't got a lot of connections with the others, except for the crucifixion narrative.

Tuesday, 9 June 2009

The image I included a few posts ago from Honore's paper on statistics in the NT...

[Image: The relationships between the three synoptic g... (via Wikipedia)]

...doesn't agree with my coloring map. In particular I detect a lot less triple tradition material than he does.
I'm sure this is because I am tracking words, and he is tracking pericopes. I have the pericope data (I used it to generate my map), but it will need another opportunity to analyse it.

Monday, 8 June 2009

Phew... It's been a marathon evening trying to get the code for this working well. And there's probably lots I could say about it, but for now a pretty picture.

(Image is released under a Creative Commons Attribution License 3.0, as usual).

The image is a colored map of the three synoptic gospels. Each pixel in the map represents one Greek word, and it is colored depending on whether it is unique to that gospel or is found in other gospels. Exactly as we colored the synoptics when first learning about them.

The color scheme is: Red, Green and Blue for Matt, Mark and Luke only. Yellow (Red+Green light) is Matt+Mark, Cyan (Green+Blue) is Mark+Luke, and Magenta (Red+Blue) is Matt+Luke. Black is triple tradition.
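A minimal sketch of that color rule in Python, assuming each word has already been tagged with the set of gospels it appears in. The names `COLORS` and `word_color` are illustrative and not taken from the actual code.

```python
# One RGB color per combination of gospels, following the scheme above.
COLORS = {
    frozenset({"Matt"}): (255, 0, 0),            # red: Matt only
    frozenset({"Mark"}): (0, 255, 0),            # green: Mark only
    frozenset({"Luke"}): (0, 0, 255),            # blue: Luke only
    frozenset({"Matt", "Mark"}): (255, 255, 0),  # yellow: red + green
    frozenset({"Mark", "Luke"}): (0, 255, 255),  # cyan: green + blue
    frozenset({"Matt", "Luke"}): (255, 0, 255),  # magenta: red + blue
    # Triple tradition is black in this scheme, not the additive blend (white).
    frozenset({"Matt", "Mark", "Luke"}): (0, 0, 0),
}


def word_color(gospels):
    """Return the pixel color for a word found in the given gospels."""
    return COLORS[frozenset(gospels)]


print(word_color(["Matt", "Mark"]))  # a word shared by Matt and Mark
```

Using a `frozenset` as the key means the order in which the gospels are listed does not matter.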

The diagram isn't sorted in any pericope-based order. Each gospel is in its correct order. I figured that would be prettier since at this scale you don't want to know what is corresponding to what.

Three things that strike me immediately (all of which I knew, but the diagram states them very clearly):

1. Mark is short and there's not much in there that's unique.

2. Luke has the most original material (John isn't here of course; if it were, it would take that prize).

3. There's a lot more yellow than cyan (Matt sticks to Mark more than Luke does).

And one thing I didn't realise.

4. There's twice as much magenta as cyan (Luke is closer to Matt than to Mark).

I don't know if number 4 is just one of those things that I never twigged and is perfectly obvious to everyone else, but the whole Markan priority thing made me assume that Luke would be more similar to Mark than Matt. Anyone else surprised at that? Or is it just me?

Anyway, lots of unpacking of this data to do. I'll post briefly about the techniques used to generate it, later in the week.

Thinking about the Synoptic problem from a systems approach at the weekend, I came up with a relatively simple model:

For each pair of extant documents there is a separate synoptic problem. The overall synoptic problem is the combination of these individual problems.

Each pair of documents (call them A and B) sits on a continuum.

At one extreme B copied directly from A, possibly omitting some material and adding additional content.

At the other extreme A copied directly from B, again with possible omissions and additions.

In the middle of the continuum, A and B are independent and their agreements are due to some additional source.

These three cases are black-and-white and easy to draw pictures of. Filling in the continuum are other scenarios. Maybe A and B had some common sources (they'd both heard similar sermons from visiting preachers, they'd both got access to some of the same letters), and B copied from A. Now there's a blend in the relationship between both extremes of the continuum. I've tried to show this in shades of grey in the diagram.

With 3 pairs of synoptic gospel texts, the statistical analysis of the synoptic problem should provide a point on the continuum for each of the three pairs. And (more importantly) an estimate of the error for each. It may be that the relationship between Matthew and Luke, for example, is placed somewhere between a common source (Q hypothesis) and Luke copying Matthew (Farrer hypothesis), with error margins that are significant enough that both sides can claim victory.
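One way to picture such a result is the Python sketch below. Everything in it is an assumption made for illustration: the continuum is mapped onto a scale from -1.0 (B copied from A) through 0.0 (independent, common source) to +1.0 (A copied from B), the class name `PairEstimate` is hypothetical, and the numbers are invented, not the output of any analysis.

```python
from dataclasses import dataclass


@dataclass
class PairEstimate:
    """A point on the continuum for one pair of documents, with its error."""
    a: str
    b: str
    position: float  # -1.0: B copied A, 0.0: common source, +1.0: A copied B
    error: float     # half-width of the error margin

    def interval(self):
        """Range of positions the estimate cannot rule out, clipped to the scale."""
        low = max(-1.0, self.position - self.error)
        high = min(1.0, self.position + self.error)
        return low, high

    def consistent_with(self, point):
        """True if the given point on the continuum lies inside the margin."""
        low, high = self.interval()
        return low <= point <= high


# Invented numbers: Matthew/Luke placed between the two hypotheses.
estimate = PairEstimate("Matthew", "Luke", position=-0.5, error=0.5)
print("Common source (Q):", estimate.consistent_with(0.0))
print("Luke copied Matthew (Farrer):", estimate.consistent_with(-1.0))
print("Matthew copied Luke:", estimate.consistent_with(1.0))
```

With margins this wide, both the Q and Farrer positions fall inside the interval, which is exactly the situation where both sides can claim victory.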

Would they? Or would a dyed-in-the-wool Q fan think of anything less than smack in the middle as being a contradiction? From what I've read, most researchers seem to pay only lip-service to the possibility of the solution being a shade of grey.

Sunday, 7 June 2009

In many things in science it is more important to know what you can know, than to come up with a conclusion prematurely.

One common heuristic in operations research (and science generally) is the signal to noise ratio. Data is inherently noisy. And everyone knows that the noise can obscure the signal.

What isn't widely known is that this isn't the whole picture. In order to detect signals in the midst of lots of noise, you have to be so sensitive to patterns that you'll start to detect patterns that aren't there. In other words: to avoid false negatives (not finding something that is there), you'll be forced to use techniques that can give false positives (finding things that aren't there).
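The trade-off can be seen in a toy simulation. The Python sketch below is an assumed setup, not a model of any real data: a "detector" reports a signal whenever a reading exceeds a threshold, readings are Gaussian noise with or without a signal of strength 1.0 added, and the function name `detection_errors` is made up for the example.

```python
import random


def detection_errors(threshold, n=1000, signal=1.0, seed=0):
    """Count false positives and false negatives for a simple threshold detector."""
    rng = random.Random(seed)  # fixed seed: same readings for every threshold
    false_pos = 0
    false_neg = 0
    for _ in range(n):
        noise_only = rng.gauss(0.0, 1.0)            # nothing there, just noise
        with_signal = signal + rng.gauss(0.0, 1.0)  # a real signal, plus noise
        if noise_only > threshold:
            false_pos += 1  # "found" something that isn't there
        if with_signal <= threshold:
            false_neg += 1  # missed something that is there
    return false_pos, false_neg


# Lowering the threshold makes the detector more sensitive.
for threshold in (2.0, 1.0, 0.0):
    fp, fn = detection_errors(threshold)
    print(f"threshold {threshold}: {fp} false positives, {fn} false negatives")
```

As the threshold drops, the misses go down and the false alarms go up; there is no setting that removes both.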

Strikes me that something of the same occurs in several scientific subdisciplines of theology. In the search for the historical Jesus, for example, we have to be so tuned to detecting theological bias in the scant records we have, that it is easy to spot features that aren't there. In the quest to solve the synoptic problem, similarly, we find features that are probably not there, which can lead to people declaring 'solutions' without understanding how likely they are to be true.

I'm not a scholar, but I'm trying to read as much on the synoptic problem as possible. So far I've not read anything that tries to seriously understand how much we can possibly know from the source text. There are various references to "of course, we'll never really know" - but nobody trying to say how much we could know.

Seems to me that is a crucial question.