[ECODP-dev] Dataset usage statistics

Bert Van Nuffelen bert.van.nuffelen at tenforce.com
Thu Oct 10 10:57:15 UTC 2013

Hi Agnieszka,

Here are some answers; the others I have to defer to OKF.

- Does this command have any impact on the ckan? Slowdown, halt, …
as there is always a data ingestion going as they take now 3hrs per
ingestion.
It pulls information from the PostgreSQL DB, so a limited slowdown is
possible.

- What is the estimated execution time for this?
I ran it on our test system for the full year 2013 and it took around 2
minutes. Judging from the output on the screen, it seems to run one
SELECT query per day.

- Will these statistics even be correct when generated during a
data ingestion?
They correspond to user clicks, so the ingestion itself is not counted
as a user visit.

@OKF, can you verify my response?

kind regards,


2013/10/10 ZAJAC Agnieszka (OP) <Agnieszka.ZAJAC at publications.europa.eu>

> Dear Bert,
>
> Andre has a few quite pertinent questions on the generation of this statistical
> data, especially given the fact that this script hasn't been used since last
> year. Could you provide him with answers? Maybe testing it first on your
> environment would also be useful?
>
> Thank you in advance.
> Regards,
>
> *Agnieszka Zając*
> Open Data Portal Section
> Tel: +352 2929.42834
>
> *From:* MEYER André (OP-EXT)
> *Sent:* Thursday, October 10, 2013 12:17 PM
> *To:* ZAJAC Agnieszka (OP)
> *Cc:* SABETE Vafa (OP); HOHN Norbert (OP); PASTOR CAMARASA José Juan
> (OP); BOUSSERT Philippe (OP); ZADRA Julien (OP-EXT)
> *Subject:* RE: Dataset usage statistics
>
> Hello Agnieszka,
>
> I have a few questions before launching this on prod:
>
> - Does this command have any impact on the ckan? Slowdown,
> halt, … as there is always a data ingestion going as they take now 3hrs per
> ingestion.
> - What is the estimated execution time for this?
> - Will these statistics even be correct when generated
> during a data ingestion?
>
> Kind regards,
>
> *André Meyer*
> Application Team - Integration engineer
>
> *Publications Office of the European Union*
> *Unit A4 - Infrastructure and IT Security Systems*
>
> *Halian S.à.r.l.  (under contract with the Publications Office)*
> Tel: (+352) 2929-42442
> E-mail: andre.meyer at ext.publications.europa.eu
>
> *From:* ZAJAC Agnieszka (OP)
> *Sent:* 10 October, 2013 12:10 PM
> *To:* MEYER André (OP-EXT)
> *Cc:* SABETE Vafa (OP); HOHN Norbert (OP); PASTOR CAMARASA José Juan (OP)
> *Subject:* FW: Dataset usage statistics
>
> Dear Andre,
>
> Could you please generate statistics on dataset usage following the
> instructions from Bert below? On 00.09 please.
>
> Thank you in advance.
>
> Best regards,
> Agnieszka
>
> *From:* Bert Van Nuffelen [mailto:bert.van.nuffelen at tenforce.com]
> *Sent:* Thursday, October 10, 2013 10:09 AM
> *To:* ZAJAC Agnieszka (OP)
> *Cc:* jurgen vannerom (jurgen.vannerom at tenforce.com); HOHN Norbert (OP);
> *Subject:* Re: Dataset usage statistics
>
> Hi Agnieszka,
> indeed. This has not changed.
> Bert
>
> 2013/10/10 ZAJAC Agnieszka (OP) <Agnieszka.ZAJAC at publications.europa.eu>
>
> Hi,
>
> Thanks a lot for the quick reply. Can it be applied for test on 00.09?
>
> Agnieszka
>
> *From:* Bert Van Nuffelen [mailto:bert.van.nuffelen at tenforce.com]
> *Sent:* Thursday, October 10, 2013 10:02 AM
> *To:* ZAJAC Agnieszka (OP)
> *Cc:* jurgen vannerom (jurgen.vannerom at tenforce.com); HOHN Norbert (OP);
> *Subject:* Re: Dataset usage statistics
>
> Hi Agnieszka,
> here is the extract from our internal updated version.
> 1.1   DataSet Usage Statistics
> To export the tracking stats, run the following command from the ckan
> management scripts:
> $ ./ckan_user_stats.sh <absolute-path-file> <from_date>
> Both arguments are mandatory. The script creates a file, which must be
> specified as an absolute path, in which the statistics are
> dumped as CSV. The date has the form YYYY-MM-DD.
> For instance:
> ./ckan_user_stats.sh /applications/ecodp/users/ecodp/ckan/stat.csv 2013-01-01
> will create a file stat.csv at the given location containing the user
> view stats from 1 January 2013 onwards. The content of the file is now
> of the form:
> dataset id,dataset name,publisher name,total views,recent views (last 2 weeks)
> ac5ddfbf-4ae4-4829-a025-669c92dd12a2,V1OEYc8mJFRn3cOlnJYXA,publ,5,5
> 84d8fbfe-5a57-4272-bca7-3f0a650c8121,xYIpDCIE81YFxghHr0z8Dg,cnect,4,4
> dff2b3b0-2bfd-4260-ac65-610128779b52,IIJlaEf0VU835UgBkMuTrg,sg,4,3
> 15e07204-c6c2-4219-84c4-c4f8d46a0efa,helloworkd,acp_amb,4,4
> 33826c74-c364-448d-a6f5-af85bd7d55dc,1st,,3,0
> f0592305-5c89-47f6-a6cd-7f566a43a782,VfGQxcxVB8MAfEYpM6ihBA,cnect,3,3
> 6a3a8b74-b5bd-49ca-82d9-cb1d927fd344,Qj83cpCYT0MrAZIJILOQQ,sg,2,2
> 5adb9513-8022-42bb-8f0d-af486520cd89,c8dxAO9R4zLEiZz84AWpQ,,1,0
> 90acb89a-cd89-4de9-8f52-53f942af0d3f,test-try-hijack,,1,0
> *Note:*
> Internally the script executes a CKAN paster command. That command can
> be executed without a date, in which case the date defaults to 3
> days prior to the current date.
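
As an illustration of the CSV layout quoted above, here is a minimal Python sketch that sums the view counts per publisher. It assumes the header is written as a single line in the real file (it is wrapped in the mail), that the columns keep exactly the names shown, and that an empty publisher field means "no publisher". The helper name views_per_publisher and the "(none)" label are hypothetical, not part of the ECODP scripts.

```python
import csv
import io

# Sample rows copied from the output quoted in this thread. A real run would
# open the file written by ckan_user_stats.sh instead of this string.
SAMPLE = """\
dataset id,dataset name,publisher name,total views,recent views (last 2 weeks)
ac5ddfbf-4ae4-4829-a025-669c92dd12a2,V1OEYc8mJFRn3cOlnJYXA,publ,5,5
dff2b3b0-2bfd-4260-ac65-610128779b52,IIJlaEf0VU835UgBkMuTrg,sg,4,3
33826c74-c364-448d-a6f5-af85bd7d55dc,1st,,3,0
"""


def views_per_publisher(handle):
    """Return {publisher: (total views, recent views)} for a stats CSV.

    Hypothetical helper; rows with an empty publisher are grouped
    under the label "(none)".
    """
    totals = {}
    for row in csv.DictReader(handle):
        publisher = row["publisher name"] or "(none)"
        total, recent = totals.get(publisher, (0, 0))
        totals[publisher] = (
            total + int(row["total views"]),
            recent + int(row["recent views (last 2 weeks)"]),
        )
    return totals


summary = views_per_publisher(io.StringIO(SAMPLE))
for publisher, (total, recent) in sorted(summary.items()):
    print(publisher, total, recent)
```

To use it on a real export, pass an open file handle, for example open(path, newline=""), instead of the in-memory sample.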
> kind regards,
> Bert
>
> 2013/10/10 ZAJAC Agnieszka (OP) <Agnieszka.ZAJAC at publications.europa.eu>
>
> Dear Bert,
> I would like to see the current report that can be generated from CKAN on
> dataset usage. Before I ask Andre to do it, could you please have a look at
> the instructions in the operational manual pasted below? They seem somewhat
> incomplete. Please let us know.
> Thanks a lot in advance,
> Agnieszka
>
> 1.  *DataSet Usage Statistics*
> *To export the tracking stats, run the following command from the ckan
> management scripts*
> *$ ./ckan_user_stats.sh <directory> <from_date>*
> *The directory argument is mandatory. In this directory the statistics
> are dumped as CSV files.*
> *If a date is specified, then the data from the given date is aggregated
> into the export file. Otherwise, the default date is 3 days prior to the
> current date. The date has the form YYYY-MM-DD.*
> *The content of the files is now of the form:*
>
> --
> Bert Van Nuffelen
> Semantic Technologies Software Architect at TenForce
> www.tenforce.be
> Bert.Van.Nuffelen at tenforce.com
> Office: +32 (0)16 31 48 60
> Mobile:+32 479 06 24 26
> skype: bert.van.nuffelen

Bert Van Nuffelen

Semantic Technologies Software Architect at TenForce

Bert.Van.Nuffelen at tenforce.com
Office: +32 (0)16 31 48 60
Mobile:+32 479 06 24 26
skype: bert.van.nuffelen