[pd-discuss] Using an intermediate language for pdcalc proposal

Maarten Zeinstra mz at kl.nl
Thu Mar 10 13:55:33 UTC 2011


On Mar 10, 2011, at 14:29 , Rufus Pollock wrote:

> On 10 March 2011 11:13, Maarten Zeinstra <mz at kl.nl> wrote:
> [...]
> 
>> @rufus: As a software developer I like to put as little data in an algorithm
>> as possible. Data is usually not as static as algorithm, separating the two
>> layers greatly increases maintainability of these tools.
> 
> Not sure I understand what you mean by data. Any algorithm whether
> written in python or otherwise is going to take inputs (such as author
> info, work info) but perhaps you mean by data config variables like
> how long copyright lasts for a certain type of work (e.g. life + 70
> for texts in the UK)?
> 
> [...]
> 
> Rufus


A decision tree, as we envisioned together almost 2 years ago, is not by itself an algorithm. The decisions that need to be made at every node in such a tree are what I call the algorithm.

The questions at the nodes of the available trees fall into three categories:
1. Is X of type Y?
2. Is X (larger/smaller) than Y?
3. Is X-Y (larger/smaller) than Z?

I call the contents of X, Y, and Z data, and likewise the sequence of these questions and the references between them. How the comparisons are executed and how the references between questions are followed is the algorithm. The algorithm does not care what kind of data it compares; it only needs to know the structure, which is itself data.

If, on the other hand, you mix data and algorithm, you get as many question types as you have questions. With the separation above, a programmer only needs to maintain the three question types rather than the tree's intricate structure, and a law scholar can maintain the structure of the questions.
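
To make the separation concrete, here is a rough sketch in Python. It is only an illustration, not a proposal for the actual pdcalc format: the names (EVALUATORS, walk, TREE, the dictionary keys) are made up for this example, and the "life + 70" node is a simplified stand-in for the UK text rule mentioned earlier in the thread. The three functions are the part a programmer maintains; TREE is the part a law scholar could edit.

```python
# --- Algorithm: one evaluator per question type (hypothetical names) ---

def is_type(facts, q):
    # 1. Is X of type Y?
    return facts[q["x"]] == q["y"]

def compare(facts, q):
    # 2. Is X (larger/smaller) than Y?
    x, y = facts[q["x"]], q["y"]
    return x > y if q["op"] == "larger" else x < y

def compare_difference(facts, q):
    # 3. Is X-Y (larger/smaller) than Z?
    diff = facts[q["x"]] - facts[q["y"]]
    return diff > q["z"] if q["op"] == "larger" else diff < q["z"]

EVALUATORS = {
    "is_type": is_type,
    "compare": compare,
    "compare_difference": compare_difference,
}

def walk(tree, facts, node_id="start"):
    # Follow yes/no references until a node carrying a result is reached.
    node = tree[node_id]
    while "result" not in node:
        answer = EVALUATORS[node["kind"]](facts, node)
        node = tree[node["yes"] if answer else node["no"]]
    return node["result"]

# --- Data: the tree itself (illustrative, simplified rule) ---

TREE = {
    "start": {"kind": "is_type", "x": "work_type", "y": "text",
              "yes": "term", "no": "unknown"},
    "term": {"kind": "compare_difference", "x": "current_year",
             "y": "author_death_year", "op": "larger", "z": 70,
             "yes": "pd", "no": "protected"},
    "pd": {"result": "public domain"},
    "protected": {"result": "in copyright"},
    "unknown": {"result": "undetermined"},
}

facts = {"work_type": "text", "current_year": 2011, "author_death_year": 1930}
print(walk(TREE, facts))
```

Changing the term for another jurisdiction, or adding a node, only touches TREE; the three evaluators and walk stay as they are.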

Does this clarify what I mean by 'data'?




