[openbiblio-dev] First cut of AsyncUpload branch

Tom Morris tfmorris at gmail.com
Wed Feb 15 18:48:11 UTC 2012


On Mon, Feb 13, 2012 at 5:18 AM, Mark MacGillivray <mark at odaesa.com> wrote:
> On Mon, Feb 13, 2012 at 10:11 AM, Etienne Posthumus
> <etienne.posthumus at okfn.org> wrote:
>> Jim, do I understand it correctly that you suggest some sort of
>> 'string-sniffing' support in ALL the parsers?
>> IOW, when called in some manner as a convention, eg.
>> someparser -s "arXiv:1201.6450"
>
> This sounds to me like something that should come before parsing -
> e.g. send a string to a URL, get back details of which parser would
> parse it, then submit to that parser. Does not actually need to be
> written into the parsers though.

It's actually pretty useful to have format handlers (e.g. parsers)
either register what they can handle or provide a method by which the
calling framework can query whether or not they can handle something.

In Google Refine, import format handlers are required to provide
methods by which the app can 1) query whether they can handle a given
HTTP Content-Type and 2) query whether they can handle a given URI
(which can trivially be used as a file extension filter or, more
extensively, to do full-blown content analysis).  Given a set of
candidate format handlers, Refine can query them all to see which ones
can handle the given content.
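
A rough sketch of that query style, in Python purely for illustration.
This is not Refine's actual API (Refine is Java); the class names,
method names, the BibTeX content type and the arXiv pattern are all
assumptions made up for the example:

```python
import re


class FormatHandler:
    """Hypothetical base class: each handler answers for itself."""
    name = "base"

    def can_handle_content_type(self, content_type):
        return False

    def can_handle_uri(self, uri):
        return False


class BibtexHandler(FormatHandler):
    name = "bibtex"

    def can_handle_content_type(self, content_type):
        # Ignore parameters such as "; charset=utf-8".
        base = content_type.split(";")[0].strip().lower()
        return base == "application/x-bibtex"  # assumed media type

    def can_handle_uri(self, uri):
        # Simple file extension filter.
        return uri.lower().endswith(".bib")


class ArxivHandler(FormatHandler):
    name = "arxiv"
    # Assumed identifier shape, e.g. "arXiv:1201.6450".
    _pattern = re.compile(r"^arXiv:\d{4}\.\d{4,5}(v\d+)?$", re.IGNORECASE)

    def can_handle_uri(self, uri):
        return bool(self._pattern.match(uri))


def candidates(handlers, content_type=None, uri=None):
    """Ask every handler; return the names of those that say yes."""
    found = []
    for h in handlers:
        by_type = content_type is not None and h.can_handle_content_type(content_type)
        by_uri = uri is not None and h.can_handle_uri(uri)
        if by_type or by_uri:
            found.append(h.name)
    return found


HANDLERS = [BibtexHandler(), ArxivHandler()]
```

With this, candidates(HANDLERS, uri="arXiv:1201.6450") picks out the
arXiv handler without the framework knowing anything about arXiv
identifiers itself.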

Another way to do something similar is to have static registration of
a set of content types, URI RegEx patterns, etc.  Either way, it's
useful to have the parser writer declare this stuff explicitly rather
than having it decoupled in a way that allows it to get out of sync
with the actual capabilities of the parser.
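
The static registration variant could look something like the
following, again only a sketch in Python.  The register decorator, the
REGISTRY list, the parser names and the patterns are hypothetical; the
point is just that the declaration sits right next to the parser code:

```python
import re

# Each entry: (parser function, set of content types, compiled URI patterns)
REGISTRY = []


def register(content_types=(), uri_patterns=()):
    """Hypothetical decorator: declare what a parser handles, once."""
    compiled = [re.compile(p) for p in uri_patterns]

    def decorator(func):
        REGISTRY.append((func, frozenset(content_types), compiled))
        return func

    return decorator


@register(content_types=["application/x-bibtex"], uri_patterns=[r"\.bib$"])
def parse_bibtex(data):
    raise NotImplementedError  # real parsing would go here


@register(uri_patterns=[r"^arXiv:\d{4}\.\d{4,5}$"])
def parse_arxiv(data):
    raise NotImplementedError  # real parsing would go here


def lookup(content_type=None, uri=None):
    """Return names of registered parsers matching the declared metadata."""
    matches = []
    for func, types, patterns in REGISTRY:
        by_type = content_type in types
        by_uri = uri is not None and any(p.search(uri) for p in patterns)
        if by_type or by_uri:
            matches.append(func.__name__)
    return matches
```

Here the framework never calls into the parser to find out what it
handles; it only reads what the parser writer declared.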

Tom



