[ckan-dev] CKAN showing up in Google searches - Language used

Alex (Maxious) Sadleir maxious at gmail.com
Mon Feb 2 02:16:24 UTC 2015


You can patch robots.txt in the CKAN source to keep the translated
URLs out of search results. Users can still switch language on the
site unless you also disable the languages in the config. I also
exclude /_tracking:

diff --git a/ckan/public/robots.txt b/ckan/public/robots.txt
index 279a33a..e410bdc 100644
--- a/ckan/public/robots.txt
+++ b/ckan/public/robots.txt
@@ -3,6 +3,50 @@ Disallow: /dataset/rate/
 Disallow: /revision/
 Disallow: /dataset/*/history
 Disallow: /api/
+Disallow: /_tracking
+Disallow: /_tracking
+
+Disallow: /ar/
+Disallow: /bg/
+Disallow: /ca/
+Disallow: /cs_CZ/
+Disallow: /da_DK/
+Disallow: /de/
+Disallow: /dv/
+Disallow: /el/
+Disallow: /en_AU/
+Disallow: /en_GB/
+Disallow: /es/
+Disallow: /es_AR/
+Disallow: /fa_IR/
+Disallow: /fi/
+Disallow: /fr/
+Disallow: /hu/
+Disallow: /id/
+Disallow: /is/
+Disallow: /it/
+Disallow: /ja/
+Disallow: /km/
+Disallow: /ko_KR/
+Disallow: /lt/
+Disallow: /lv/
+Disallow: /my_MM/
+Disallow: /nl/
+Disallow: /no/
+Disallow: /pl/
+Disallow: /pt_BR/
+Disallow: /ro/
+Disallow: /ru/
+Disallow: /sk/
+Disallow: /sl/
+Disallow: /sq/
+Disallow: /sr/
+Disallow: /sr_Latn/
+Disallow: /sv/
+Disallow: /tr/
+Disallow: /uk_UA/
+Disallow: /zh_CN/
+Disallow: /zh_TW/

 User-Agent: *
 Crawl-Delay: 10
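
If you would rather not maintain that list by hand, here is a rough
Python sketch that generates the Disallow lines and checks them with
the standard library's urllib.robotparser. The LOCALES subset, the
helper names and the example paths are made up for illustration; they
are not part of CKAN. Note that urllib.robotparser does plain prefix
matching and does not understand wildcards such as /dataset/*/history,
so this only exercises the language prefixes.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical subset of locale codes; use the ones your site offers.
LOCALES = ["ar", "bg", "ca", "de", "zh_CN"]


def disallow_rules(locales):
    """Return one Disallow line per locale prefix.

    The trailing slash matters: "/ca" without it would also match
    unrelated paths such as "/catalog".
    """
    return [f"Disallow: /{code}/" for code in locales]


def build_robots(locales):
    """Build a minimal robots.txt body for all user agents."""
    lines = ["User-agent: *", "Disallow: /_tracking"]
    lines += disallow_rules(locales)
    return "\n".join(lines) + "\n"


def is_allowed(robots_text, path, agent="ExampleBot"):
    """Check a path against the rules using the stdlib parser."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(agent, path)


robots = build_robots(LOCALES)
print(robots)
# Example paths below are invented for the check.
print(is_allowed(robots, "/ar/dataset/house-mouse"))
print(is_allowed(robots, "/dataset/house-mouse"))
print(is_allowed(robots, "/catalog"))
```

The first check should report the Arabic URL as blocked, while the
default-language URL and /catalog stay crawlable. Real crawlers may
interpret robots.txt slightly differently from this parser, so treat
it as a sanity check only.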

On Mon, Feb 2, 2015 at 12:59 PM, Aaron McGlinchy
<McGlinchyA at landcareresearch.co.nz> wrote:
> Hi, our instance of CKAN now has datasets showing up in Google searches, which is great.  However, I have noticed that the link which comes up in the Google search often takes the user to a 'non-default' language version of the dataset or resource.  For example, our language is English, but a search for:  house mouse data  returns one of our resources as the number 1 result, with the language set to Arabic.  This is perfectly fine if the user doing the search wants the Arabic language interface, but not so user friendly if the user wants the English interface.
>
> Is there anything that can be done to influence this behaviour (without removing language options that some other users might wish to use)?
>
> Thanks
> Aaron


