[open-bibliography] Fwd: [LODLAM] Get Yourself a Linked Data Piece of WorldCat to Play With
Tom Johnson
thomas.johnson at oregonstate.edu
Wed Aug 22 02:28:51 UTC 2012
> It is great to see that you have done this – just the kind of thing we
> were hoping folks would do when we released this download.
Great. This is just one method of access; I hope people will find
compelling new uses for this data dump.
> You say performance looks fine, I would be interested what hardware you
> are running it on.
I'm running it, alongside a few datasets of under 500,000 triples, on a VM
with 4 GB of RAM. Memory usage seems low, except when loading data (when an
extra GB or two makes all the difference). CPU hasn't been an issue yet. I
played around with clustering, but in the end, I don't need it yet. Maybe
these things become an issue when traffic is high? My biggest hurdle has
been loading the data, which I solved by breaking the file into chunks of
10,000 triples and adding RAM.
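A minimal sketch of that kind of chunking, assuming the dump is line-oriented N-Triples (one triple per line). The function names and the chunk file naming are illustrative only; this is not the script actually used for the load described above.

```python
from itertools import islice
from pathlib import Path
from typing import Iterable, Iterator, List

CHUNK_SIZE = 10_000  # triples per chunk, the figure mentioned above


def iter_chunks(lines: Iterable[str], size: int = CHUNK_SIZE) -> Iterator[List[str]]:
    """Yield lists of up to `size` triple lines, skipping blanks and comments."""
    # Assumption: N-Triples input, so every non-blank, non-comment line is one triple.
    triples = (ln for ln in lines if ln.strip() and not ln.lstrip().startswith("#"))
    while True:
        chunk = list(islice(triples, size))
        if not chunk:
            return
        yield chunk


def split_file(src: str, out_dir: str, size: int = CHUNK_SIZE) -> List[Path]:
    """Write each chunk of `src` to its own .nt file in `out_dir`; return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    with open(src, encoding="utf-8") as fh:
        for i, chunk in enumerate(iter_chunks(fh, size)):
            path = out / f"chunk-{i:06d}.nt"  # hypothetical naming scheme
            path.write_text("".join(chunk), encoding="utf-8")
            written.append(path)
    return written
```

Each resulting file can then be loaded into the store one at a time, so a failed import only costs one chunk rather than the whole dump.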
Thanks (to Richard and Karen) for the link re: attribution. I've loaded the
data triple for triple into my store, so I'm assuming the VoID meets
requirements there. I'll see about adding sensible metadata and the
recommended attribution text to the djubby displays. Even though I wish
this data were released CC-0, I'd like to meet OCLC's expectations in good
faith.
Best,
Tom
On Tue, Aug 21, 2012 at 12:42 AM, Richard Wallis <richard.wallis at oclc.org> wrote:
> Hi Tom,
>
> It is great to see that you have done this – just the kind of thing we
> were hoping folks would do when we released this download.
>
> You say performance looks fine, I would be interested what hardware you
> are running it on. I have dropped 4Store on my MacBook Air and I am
> pleasantly surprised with the performance.
>
> Regarding your attribution question, you may find our attribution
> guidelines useful: http://www.oclc.org/data/attribution.html
>
> ~Richard
>
>
>
>
> On 21/08/2012 00:33, "Tom Johnson" <thomas.johnson at oregonstate.edu> wrote:
>
> A SPARQL endpoint is up at http://worldcat.library.oregonstate.edu/sparql
>
> Anyone with a SPARQL client is welcome to use that endpoint for as long as
> it exists. Performance looks just fine to me for the moment, but please let
> me know if you run into any problems.
>
> 4store provides a test query page, letting you write queries in the
> browser. It is at http://worldcat.library.oregonstate.edu/test
>
> Try something like:
>
> DESCRIBE <http://www.worldcat.org/isbn/0879692243>
>
> or:
>
> DESCRIBE <http://id.worldcat.org/fast/898705>
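For anyone scripting against the endpoint rather than using the test page, the DESCRIBE queries above can be turned into request URLs roughly as follows. This is a sketch that assumes the endpoint accepts the standard SPARQL Protocol `query` parameter over HTTP GET; the helper names are made up for illustration, and no request is actually sent.

```python
from urllib.parse import urlencode

# Endpoint address as given in the message above.
ENDPOINT = "http://worldcat.library.oregonstate.edu/sparql"


def describe_query(uri: str) -> str:
    """Build a DESCRIBE query for a single resource URI."""
    return f"DESCRIBE <{uri}>"


def build_query_url(endpoint: str, query: str) -> str:
    """Return a GET URL carrying the query in the `query` parameter."""
    # Assumption: the endpoint follows the SPARQL Protocol for GET requests.
    return f"{endpoint}?{urlencode({'query': query})}"


if __name__ == "__main__":
    # Prints the URL only; fetching it is left to the caller's HTTP client.
    print(build_query_url(ENDPOINT, describe_query("http://www.worldcat.org/isbn/0879692243")))
```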
>
> The djubby front-end I put up should let you hit items by oclcnum or isbn,
> like so:
>
> http://data.library.oregonstate.edu/worldcat/oclc/14588496
> http://data.library.oregonstate.edu/worldcat/isbn/9780879692247
>
> It doesn't display blank nodes very intelligently, though, so I'm not sure
> how useful it will be in practice. It would be better to have something
> which will at least seek out labels for object URIs.
>
> I'm also thinking about loading in the VIAF and FAST graphs, since that
> should help with the kind of visualization Karen is talking about. In the
> meantime, there is an endpoint. I'll make an effort to keep it live and
> up to date until I say otherwise, so please use it as you see fit.
>
> As an afterthought: anyone have any advice about meeting the attribution
> terms of the ODC license?
>
> - Tom
>
>
> On Fri, Aug 17, 2012 at 9:16 PM, Karen Coyle <kcoyle at kcoyle.net> wrote:
>
> Tom,
>
> I'll poke around and see if anyone is using any "easy" visualization
> software. The Nat'l Lib. of Spain did some neat stuff with Graphviz, but I
> have no idea what that took. [1]
>
> For basic functionality, I'd love to see a minimal web form that would
> launch a search via SPARQL. (Pubby may do this -- I don't see a screen shot
> that answers this for me.) It would probably have to be limited to
> searching on only certain values, but that's ok for a start. At minimum,
> pulling up everything with the subject URI of an OCLC number or an object
> with the VIAF URI. Since the URI patterns for those are set, it should be
> possible to have a form for just the number and to fill in the full URI for
> the search. An even easier alternative would be to supply the SPARQL
> patterns for those searches, to be launched from within a web page. I could
> find examples of what I mean if this isn't clear. In any case, being able
> to do some minimal searching seems to be a best first step.
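The "form for just the number" idea could be sketched like this: take the bare identifier, fill in the full URI, and emit a SPARQL pattern for it. The two URI templates below are assumptions about the OCLC-number and VIAF patterns, and the function names are hypothetical; treat it as an illustration of the approach rather than a tested front end.

```python
import re

# Assumed URI patterns; adjust if the data uses different ones.
OCLC_URI = "http://www.worldcat.org/oclc/{}"
VIAF_URI = "http://viaf.org/viaf/{}"


def _digits(value: str) -> str:
    """Accept only a plain ASCII number, so form input cannot alter the query."""
    value = value.strip()
    if not re.fullmatch(r"[0-9]+", value):
        raise ValueError(f"expected a number, got {value!r}")
    return value


def by_oclc_number(number: str) -> str:
    """Everything with the OCLC-number URI as subject."""
    uri = OCLC_URI.format(_digits(number))
    return f"SELECT ?p ?o WHERE {{ <{uri}> ?p ?o }}"


def by_viaf_id(number: str) -> str:
    """Everything with the VIAF URI as object."""
    uri = VIAF_URI.format(_digits(number))
    return f"SELECT ?s ?p WHERE {{ ?s ?p <{uri}> }}"
```

Restricting input to digits keeps the form limited to "only certain values", which also avoids having to escape arbitrary text inside the query.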
>
> Thanks,
>
> kc
> [1] http://bne.linkeddata.es/graphvis/
>
>
>
> On 8/17/12 4:14 PM, Tom Johnson wrote:
>
> I'm not having any trouble loading it (except that it is slow). I'm
> fussing with the best way to configure 4store to handle ~80 million
> triples.
>
> The data looks good to me.
>
> I'm also putting up a pubby-like front end.
>
> I'm not sure what the real cost of running a SPARQL endpoint for a
> dataset like this is going to look like, or whether I can support it in
> the long run. Still, I'm interested in hearing what people would want to
> see and how they would use it if I (or Oregon State) were to run
> services on it.
>
> - Tom
>
> On Fri, Aug 17, 2012 at 3:50 PM, Tom Morris <tfmorris at gmail.com> wrote:
>
> Karen,
>
> On Fri, Aug 17, 2012 at 2:10 PM, Karen Coyle <kcoyle at kcoyle.net> wrote:
> > Luc, I think this reflects an answer to your question. As with much that
> > happens in computer technology, some of us have to depend on others. I find
> > making our wishes clear helps guide those kind souls who have the necessary
> > skills. Maybe we can work further with Tom and others to spell out what we
> > need for this to be usable for us.
>
> Does having this data loaded into a triple store help you? What types
> of things would it enable?
>
> It seems like it might be marginally better than a raw RDF file, but
> it seems like it would still take a fair amount of work to do anything
> useful with it.
>
> Tom
>
> p.s. I'm curious to see if the other Tom is able to load it using his
> tools because it looks to me like it contains invalid URIs (embedded
> spaces) which may cause RDF tools to choke depending on how picky they
> are about parsing.
>
> >
> > What we still need in the RDF world is the application that would do for the
> > Semantic Web what Mosaic did for the Web: make it viewable and usable by the
> > non-programmer. But first we have to have an actual Semantic Web, and I
> > think that's still in progress in a strict sense.
> >
> > kc
> >
> >
> > On 8/17/12 10:31 AM, Tom Johnson wrote:
> >>
> >> I'm in the process of putting up a triplestore w/ endpoint already. I
> >> have no problem sending out a link.
> >>
> >> I'm in an all day meeting today, so it might not happen until the weekend.
> >>
> >> On Fri, Aug 17, 2012 at 9:53 AM, Luc Gauvreau <lgovro at gmail.com> wrote:
> >>
> >> Bonjour,
> >>
> >> A very good question!
> >>
> >> Multiple projects about linked data and RDF, but who has the
> >> expertise to use it?
> >>
> >> Only experts and "geeks"?
> >>
> >> Is it possible for an "amateur" to use these kinds of formats, files
> >> and codes?
> >>
> >> A kind of "Linked data and RDF for dummies" would be very useful.
> >> Merci,
> >>
> >> Luc Gauvreau
> >> (Montréal, Québec)
> >>
> >>
> >>
> >> 2012/8/17 Karen Coyle <kcoyle at kcoyle.net>
> >>
> >>
> >> I would love it if someone could put this in a triple store for
> >> others to play with. How difficult is that?
> >>
> >> kc
> >>
> >>
> >> On 8/17/12 8:58 AM, Jonathan Gray wrote:
> >>
> >> ---------- Forwarded message ----------
> >> From: Richard Wallis <richard.wallis at dataliberate.com>
> >> Date: Fri, Aug 17, 2012 at 5:42 PM
> >> Subject: [LODLAM] Get Yourself a Linked Data Piece of
> >> WorldCat to Play With
> >> To: lod-lam at googlegroups.com
> >>
> >>
> >> In case you missed the press release earlier this week.
> >>
> >> You can now download a significant number of RDF triples describing
> >> the most highly held 1.2 million resources in WorldCat. Licensed
> >> under ODC-BY.
> >>
> >> I've posted more details on my blog:
> >>
> >> http://dataliberate.com/2012/08/get-yourself-a-linked-data-piece-of-worldcat-to-play-with/
> >>
> >> ~Richard
> >>