[ECODP-dev] Dataset usage statistics

John Glover john.glover at okfn.org
Mon Oct 14 15:48:52 UTC 2013


Hi Bert,


> - Does this command have any impact on the ckan? Slowdown, halt, …
> as there is always a data ingestion going as they take now 3hrs per
> ingestion.
> It pulls information from the PostgreSQL DB so there will be a limited
> slowdown possible.
>

Yes, it queries the tracking tables. I would not expect any significant
slowdown here.


>
> -          What is the estimated execution time for this?
> I ran it on our test system for the full year 2013 and it took me around 2
> minutes. From the visual output on the screen it seems as if there is a
> select query done per day.
>

There are a couple of other aggregate select queries, but yes I wouldn't
expect this to take longer than a few minutes.


>
> -          Will these statistics even be correct when generated during a
> data ingestion?
> They correspond to the user clicks. So the ingestion is not part of a user
> visited click.
>

This is correct.


Regards,
John



>
> @OKF can you verify my response?
>
> kind regards,
>
> Bert
>
>
>
>
> 2013/10/10 ZAJAC Agnieszka (OP) <Agnieszka.ZAJAC at publications.europa.eu>
>
>> Dear Bert,
>>
>> Andre has a few quite pertinent questions on the generation of this
>> statistical data, especially given the fact that this script hasn't been
>> used since last year. Could you provide him with answers? Maybe testing it
>> first on your environment would also be useful?
>>
>> Thank you in advance.
>>
>> Regards,
>>
>> Agnieszka Zając
>> Open Data Portal Section
>> Tel: +352 2929.42834
>>
>> From: MEYER André (OP-EXT)
>> Sent: Thursday, October 10, 2013 12:17 PM
>> To: ZAJAC Agnieszka (OP)
>> Cc: SABETE Vafa (OP); HOHN Norbert (OP); PASTOR CAMARASA José Juan
>> (OP); BOUSSERT Philippe (OP); ZADRA Julien (OP-EXT)
>> Subject: RE: Dataset usage statistics
>>
>> Hello Agnieszka,
>>
>> I have a few questions before launching this on prod:
>>
>> - Does this command have any impact on the ckan? Slowdown,
>>   halt, … as there is always a data ingestion going as they take now 3hrs per
>>   ingestion.
>> - What is the estimated execution time for this?
>> - Will these statistics even be correct when generated
>>   during a data ingestion?
>>
>> Kind regards,
>>
>> André Meyer
>> Application Team - Integration engineer
>>
>> Publications Office of the European Union
>> Unit A4 - Infrastructure and IT Security Systems
>> Halian S.à.r.l. (under contract with the Publications Office)
>> Tel: (+352) 2929-42442
>> Email: andre.meyer at ext.publications.europa.eu
>>
>> From: ZAJAC Agnieszka (OP)
>> Sent: 10 October, 2013 12:10 PM
>> To: MEYER André (OP-EXT)
>> Cc: SABETE Vafa (OP); HOHN Norbert (OP); PASTOR CAMARASA José Juan (OP)
>> Subject: FW: Dataset usage statistics
>>
>> Dear Andre,
>>
>> Could you please generate statistics on dataset usage following the
>> instructions from Bert below? On 00.09 please.
>>
>> Thank you in advance.
>>
>> Best regards,
>> Agnieszka
>>
>> From: Bert Van Nuffelen [mailto:bert.van.nuffelen at tenforce.com]
>> Sent: Thursday, October 10, 2013 10:09 AM
>> To: ZAJAC Agnieszka (OP)
>> Cc: jurgen vannerom (jurgen.vannerom at tenforce.com); HOHN Norbert (OP);
>> SABETE Vafa (OP); PASTOR CAMARASA José Juan (OP)
>> Subject: Re: Dataset usage statistics
>>
>> Hi Agnieszka,
>>
>> indeed. This has not changed.
>>
>> Bert
>>
>> 2013/10/10 ZAJAC Agnieszka (OP) <Agnieszka.ZAJAC at publications.europa.eu>
>>
>> Hi,
>>
>> Thanks a lot for the quick reply. Can it be applied for test on 00.09?
>>
>> Agnieszka
>>
>> From: Bert Van Nuffelen [mailto:bert.van.nuffelen at tenforce.com]
>> Sent: Thursday, October 10, 2013 10:02 AM
>> To: ZAJAC Agnieszka (OP)
>> Cc: jurgen vannerom (jurgen.vannerom at tenforce.com); HOHN Norbert (OP);
>> SABETE Vafa (OP); PASTOR CAMARASA José Juan (OP)
>> Subject: Re: Dataset usage statistics
>>
>> Hi Agnieszka,
>>
>> here is the extract from our internal updated version.
>>
>> 1.1   DataSet Usage Statistics
>>
>> To export the tracking stats, run the following command from the ckan
>> management scripts
>>
>> $ ./ckan_user_stats.sh <absolute-path-file> <from_date>
>>
>> The arguments are mandatory. The script creates a file, which must be
>> specified as an absolute path to the file, in which the statistics are
>> dumped as a CSV. The date has the form YYYY-MM-DD.
>>
>> For instance:
>>
>> ./ckan_user_stats.sh /applications/ecodp/users/ecodp/ckan/stat.csv 2013-01-01
>>
>> will create a file stat.csv at the given location containing the user
>> views stats from the first of January 2013. The content of the file is now
>> of the form:
>>
>> dataset id,dataset name,publisher name,total views,recent views (last 2 weeks)
>> ac5ddfbf-4ae4-4829-a025-669c92dd12a2,V1OEYc8mJFRn3cOlnJYXA,publ,5,5
>> 84d8fbfe-5a57-4272-bca7-3f0a650c8121,xYIpDCIE81YFxghHr0z8Dg,cnect,4,4
>> dff2b3b0-2bfd-4260-ac65-610128779b52,IIJlaEf0VU835UgBkMuTrg,sg,4,3
>> 15e07204-c6c2-4219-84c4-c4f8d46a0efa,helloworkd,acp_amb,4,4
>> 33826c74-c364-448d-a6f5-af85bd7d55dc,1st,,3,0
>> f0592305-5c89-47f6-a6cd-7f566a43a782,VfGQxcxVB8MAfEYpM6ihBA,cnect,3,3
>> 6a3a8b74-b5bd-49ca-82d9-cb1d927fd344,Qj83cpCYT0MrAZIJILOQQ,sg,2,2
>> 5adb9513-8022-42bb-8f0d-af486520cd89,c8dxAO9R4zLEiZz84AWpQ,,1,0
>> 90acb89a-cd89-4de9-8f52-53f942af0d3f,test-try-hijack,,1,0
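
The CSV layout shown above can be read back with a few lines of Python. This is only a minimal sketch, not part of the ECODP scripts: the column keys in FIELDS and the "(none)" label for an empty publisher are hypothetical names chosen for illustration, and the sample assumes the data rows carry no header line (skip the first line if the real file has one).

```python
import csv
import io

# Three rows copied from the sample output above. A real run would open the
# file written by ckan_user_stats.sh instead of this in-memory string.
SAMPLE = """\
ac5ddfbf-4ae4-4829-a025-669c92dd12a2,V1OEYc8mJFRn3cOlnJYXA,publ,5,5
dff2b3b0-2bfd-4260-ac65-610128779b52,IIJlaEf0VU835UgBkMuTrg,sg,4,3
33826c74-c364-448d-a6f5-af85bd7d55dc,1st,,3,0
"""

# Hypothetical keys, derived from the column description in the message.
FIELDS = ["dataset_id", "dataset_name", "publisher", "total_views", "recent_views"]


def read_stats(stream):
    """Return one dict per dataset, with both view counts converted to int."""
    rows = []
    for record in csv.DictReader(stream, fieldnames=FIELDS):
        record["total_views"] = int(record["total_views"])
        record["recent_views"] = int(record["recent_views"])
        rows.append(record)
    return rows


def views_per_publisher(rows):
    """Sum total views per publisher; an empty publisher is grouped as '(none)'."""
    totals = {}
    for row in rows:
        key = row["publisher"] or "(none)"
        totals[key] = totals.get(key, 0) + row["total_views"]
    return totals


stats = read_stats(io.StringIO(SAMPLE))
print(views_per_publisher(stats))
```

To use it on an actual export, pass an open file object (for example `open(path, newline="")`) to `read_stats` instead of the in-memory sample.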
>>
>> Note:
>>
>> Internally the script executes a CKAN paster command. That command can
>> also be executed without a date, in which case the date defaults to
>> 3 days prior to the current date.
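
The default described in the note is plain date arithmetic. The sketch below only illustrates that calculation in the YYYY-MM-DD form the script expects; it is not the CKAN paster command itself, and `default_from_date` is a hypothetical helper name.

```python
from datetime import date, timedelta


def default_from_date(today=None):
    """Return the date three days before `today` as a YYYY-MM-DD string."""
    if today is None:
        today = date.today()
    return (today - timedelta(days=3)).isoformat()


# For a run on 14 October 2013 the default start date would be 2013-10-11.
print(default_from_date(date(2013, 10, 14)))
```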
>>
>> kind regards,
>>
>> Bert
>>
>> 2013/10/10 ZAJAC Agnieszka (OP) <Agnieszka.ZAJAC at publications.europa.eu>
>>
>> Dear Bert,
>>
>> I would like to see the current report that can be generated from CKAN on
>> dataset usage. Before I ask Andre to do it, could you please have a look at
>> the instructions in the operational manual pasted below? They seem somehow
>> incomplete. Please let us know.
>>
>> Thanks a lot in advance,
>>
>> Agnieszka
>>
>> 1.  DataSet Usage Statistics
>>
>> To export the tracking stats, run the following command from the ckan
>> management scripts
>>
>> $ ./ckan_user_stats.sh <directory> <from_date>
>>
>> The directory argument is mandatory. In this directory the statistics
>> are dumped as CSV files.
>>
>> If a date is specified, then the data from the given date is aggregated
>> into the export file. Otherwise, the default date is 3 days prior to the
>> current date. The date has the form YYYY-MM-DD.
>>
>> The content of the files is now of the form:
>>
>>
>>
>>
>> --
>> Bert Van Nuffelen
>>
>> Semantic Technologies Software Architect at TenForce
>> www.tenforce.be
>>
>> Bert.Van.Nuffelen at tenforce.com
>> Office: +32 (0)16 31 48 60
>> Mobile:+32 479 06 24 26
>> skype: bert.van.nuffelen
>>
>
>
>