[ECODP-dev] Dataset usage statistics

Bert Van Nuffelen bert.van.nuffelen at tenforce.com
Thu Oct 10 10:57:15 UTC 2013


Hi Agnieszka,

Here are some answers; the others I have to defer to OKF.

-          Does this command have any impact on the ckan? Slowdown, halt, …
as there is always a data ingestion running, and each ingestion now takes
3hrs.
It pulls its information from the PostgreSQL DB, so a limited slowdown is
possible.

-          What is the estimated execution time for this?
I ran it on our test system for the full year 2013 and it took around 2
minutes. Judging from the output on the screen, it appears to run one
select query per day.

-          Will these statistics even be correct when generated during a
data ingestion?
The statistics correspond to user clicks. The ingestion is not a user
visit, so it is not counted.

@OKF: can you verify my response?

kind regards,

Bert




2013/10/10 ZAJAC Agnieszka (OP) <Agnieszka.ZAJAC at publications.europa.eu>

> Dear Bert,
>
> Andre has a few quite pertinent questions on the generation of this
> statistical data, especially given that this script hasn't been used since
> last year. Could you provide him with answers? Maybe testing it first on
> your environment would also be useful?
>
> Thank you in advance.
>
> Regards,
>
> *Agnieszka Zając*
> Open Data Portal Section
> Tel: +352 2929.42834
>
> *From:* MEYER André (OP-EXT)
> *Sent:* Thursday, October 10, 2013 12:17 PM
> *To:* ZAJAC Agnieszka (OP)
> *Cc:* SABETE Vafa (OP); HOHN Norbert (OP); PASTOR CAMARASA José Juan
> (OP); BOUSSERT Philippe (OP); ZADRA Julien (OP-EXT)
> *Subject:* RE: Dataset usage statistics
>
> Hello Agnieszka,
>
> I have a few questions before launching this on prod:
>
> -  Does this command have any impact on the ckan? Slowdown, halt, … as
>    there is always a data ingestion running, and each ingestion now takes
>    3hrs.
> -  What is the estimated execution time for this?
> -  Will these statistics even be correct when generated during a data
>    ingestion?
>
> Kind regards,
>
> *André Meyer*
> Application Team - Integration engineer
>
> *Publications Office of the European Union*
> *Unit A4 - Infrastructure and IT Security Systems*
> *Halian S.à.r.l. (under contract with the Publications Office)*
> (+352) 2929-42442
> andre.meyer at ext.publications.europa.eu
>
> *From:* ZAJAC Agnieszka (OP)
> *Sent:* 10 October, 2013 12:10 PM
> *To:* MEYER André (OP-EXT)
> *Cc:* SABETE Vafa (OP); HOHN Norbert (OP); PASTOR CAMARASA José Juan (OP)
> *Subject:* FW: Dataset usage statistics
>
> Dear Andre,
>
> Could you please generate statistics on dataset usage following the
> instructions from Bert below? On 00.09 please.
>
> Thank you in advance.
>
> Best regards,
> Agnieszka
>
> *From:* Bert Van Nuffelen [mailto:bert.van.nuffelen at tenforce.com]
> *Sent:* Thursday, October 10, 2013 10:09 AM
> *To:* ZAJAC Agnieszka (OP)
> *Cc:* jurgen vannerom (jurgen.vannerom at tenforce.com); HOHN Norbert (OP);
> SABETE Vafa (OP); PASTOR CAMARASA José Juan (OP)
> *Subject:* Re: Dataset usage statistics
>
> Hi Agnieszka,
>
> indeed. This has not changed.
>
> Bert
>
> 2013/10/10 ZAJAC Agnieszka (OP) <Agnieszka.ZAJAC at publications.europa.eu>
>
> Hi,
>
> Thanks a lot for the quick reply. Can it be applied for a test on 00.09?
>
> Agnieszka
>
> *From:* Bert Van Nuffelen [mailto:bert.van.nuffelen at tenforce.com]
> *Sent:* Thursday, October 10, 2013 10:02 AM
> *To:* ZAJAC Agnieszka (OP)
> *Cc:* jurgen vannerom (jurgen.vannerom at tenforce.com); HOHN Norbert (OP);
> SABETE Vafa (OP); PASTOR CAMARASA José Juan (OP)
> *Subject:* Re: Dataset usage statistics
>
> Hi Agnieszka,
>
> here is the extract from our internal updated version.
>
> 1.1   DataSet Usage Statistics
>
> To export the tracking stats, run the following command from the ckan
> management scripts:
>
> $ ./ckan_user_stats.sh <absolute-path-file> <from_date>
>
> Both arguments are mandatory. The script creates a file, which must be
> specified as an absolute path, in which the statistics are dumped as
> CSV. The date has the form YYYY-MM-DD.
>
> For instance:
>
> ./ckan_user_stats.sh /applications/ecodp/users/ecodp/ckan/stat.csv 2013-01-01
>
> will create a file stat.csv at the given location containing the user
> view stats from the first of January 2013. The content of the file is now
> of the form:
>
> dataset id,dataset name,publisher name,total views,recent views (last 2 weeks)
> ac5ddfbf-4ae4-4829-a025-669c92dd12a2,V1OEYc8mJFRn3cOlnJYXA,publ,5,5
> 84d8fbfe-5a57-4272-bca7-3f0a650c8121,xYIpDCIE81YFxghHr0z8Dg,cnect,4,4
> dff2b3b0-2bfd-4260-ac65-610128779b52,IIJlaEf0VU835UgBkMuTrg,sg,4,3
> 15e07204-c6c2-4219-84c4-c4f8d46a0efa,helloworkd,acp_amb,4,4
> 33826c74-c364-448d-a6f5-af85bd7d55dc,1st,,3,0
> f0592305-5c89-47f6-a6cd-7f566a43a782,VfGQxcxVB8MAfEYpM6ihBA,cnect,3,3
> 6a3a8b74-b5bd-49ca-82d9-cb1d927fd344,Qj83cpCYT0MrAZIJILOQQ,sg,2,2
> 5adb9513-8022-42bb-8f0d-af486520cd89,c8dxAO9R4zLEiZz84AWpQ,,1,0
> 90acb89a-cd89-4de9-8f52-53f942af0d3f,test-try-hijack,,1,0
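
A file in this form can be post-processed with any CSV library. As a rough
sketch in Python, assuming the first line of the file is the header row shown
above: the function name views_per_publisher and the "(none)" label for an
empty publisher are illustrative choices, not part of the script.

```python
import csv
import io
from collections import Counter

# Sample content copied from the rows shown above.
SAMPLE = """\
dataset id,dataset name,publisher name,total views,recent views (last 2 weeks)
ac5ddfbf-4ae4-4829-a025-669c92dd12a2,V1OEYc8mJFRn3cOlnJYXA,publ,5,5
84d8fbfe-5a57-4272-bca7-3f0a650c8121,xYIpDCIE81YFxghHr0z8Dg,cnect,4,4
dff2b3b0-2bfd-4260-ac65-610128779b52,IIJlaEf0VU835UgBkMuTrg,sg,4,3
15e07204-c6c2-4219-84c4-c4f8d46a0efa,helloworkd,acp_amb,4,4
33826c74-c364-448d-a6f5-af85bd7d55dc,1st,,3,0
f0592305-5c89-47f6-a6cd-7f566a43a782,VfGQxcxVB8MAfEYpM6ihBA,cnect,3,3
6a3a8b74-b5bd-49ca-82d9-cb1d927fd344,Qj83cpCYT0MrAZIJILOQQ,sg,2,2
5adb9513-8022-42bb-8f0d-af486520cd89,c8dxAO9R4zLEiZz84AWpQ,,1,0
90acb89a-cd89-4de9-8f52-53f942af0d3f,test-try-hijack,,1,0
"""


def views_per_publisher(lines):
    """Sum the 'total views' column per publisher.

    `lines` is any iterable of text lines, e.g. an open file.
    Datasets without a publisher are grouped under "(none)".
    """
    totals = Counter()
    for row in csv.DictReader(lines):
        publisher = row["publisher name"] or "(none)"
        totals[publisher] += int(row["total views"])
    return totals


totals = views_per_publisher(io.StringIO(SAMPLE))
for publisher, views in totals.most_common():
    print(publisher, views)
```

For a real export, the same function can be given the open stat.csv file
instead of the sample string.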
>
> *Note:*
>
> Internally the script executes a CKAN paster command. That command can
> be executed without a date, in which case the date defaults to 3 days
> prior to the current date.
>
> kind regards,
>
> Bert
>
> 2013/10/10 ZAJAC Agnieszka (OP) <Agnieszka.ZAJAC at publications.europa.eu>
>
> Dear Bert,
>
> I would like to see the current report that can be generated from CKAN on
> dataset usage. Before I ask Andre to do it, could you please have a look at
> the instructions in the operational manual pasted below? They seem somehow
> incomplete. Please let us know.
>
> Thanks a lot in advance,
>
> Agnieszka
>
> 1.  *DataSet Usage Statistics*
>
> *To export the tracking stats, run the following command from the ckan
> management scripts*
>
> *$ ./ckan_user_stats.sh <directory> <from_date>*
>
> *The directory argument is mandatory. In this directory the statistics
> are dumped as CSV files.*
>
> *If a date is specified, then the data from the given date is aggregated
> into the export file. Otherwise, the default date is 3 days prior to the
> current date. The date has the form YYYY-MM-DD.*
>
> *The content of the files is now of the form:*
>
> --
> Bert Van Nuffelen
>
> Semantic Technologies Software Architect at TenForce
> www.tenforce.be
>
> Bert.Van.Nuffelen at tenforce.com
> Office: +32 (0)16 31 48 60
> Mobile:+32 479 06 24 26
> skype: bert.van.nuffelen
>



-- 
Bert Van Nuffelen

Semantic Technologies Software Architect at TenForce
www.tenforce.be

Bert.Van.Nuffelen at tenforce.com
Office: +32 (0)16 31 48 60
Mobile:+32 479 06 24 26
skype: bert.van.nuffelen