[ckan-dev] CKAN showing up in Google searches - Language used

Stefan Oderbolz stefan.oderbolz at liip.ch
Tue Feb 3 07:08:35 UTC 2015


Btw, there is a very good Google Webmaster article describing how this
can be done: http://googlewebmastercentral.blogspot.ch/2010/09/unifying-content-under-multilingual.html
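
To make the canonical-link idea from the quoted message below a bit more concrete, here is a minimal sketch in Python of how the link tags for one page could be generated. It is not existing CKAN code: the helper names `strip_locale` and `language_link_tags`, the `example.org` site URL and the short locale list are all assumptions for illustration, and it assumes that the default language is served without a locale prefix while the others live under `/<locale>/`.

```python
from html import escape


def strip_locale(path, locales):
    """Return `path` without a leading /<locale> segment, if it has one."""
    parts = path.split("/", 2)  # "/de/dataset/x" -> ["", "de", "dataset/x"]
    if len(parts) > 1 and parts[1] in locales:
        rest = parts[2] if len(parts) > 2 else ""
        return "/" + rest
    return path


def language_link_tags(site_url, path, locales, default_locale="en"):
    """Build <link> tags for the <head> of one page (hypothetical helper).

    Assumption: the default locale has no URL prefix, every other locale
    is reachable under /<locale>/...
    """
    site = site_url.rstrip("/")
    base = strip_locale(path, locales)

    # The canonical link points at the default-language URL of the page.
    tags = ['<link rel="canonical" href="%s" />' % escape(site + base, quote=True)]

    # One rel="alternate" link per language version of the same page.
    for locale in locales:
        if locale == default_locale:
            href = site + base
        else:
            href = site + "/" + locale + base
        # hreflang uses hyphens (pt-BR), the URL prefixes use underscores (pt_BR).
        hreflang = locale.replace("_", "-")
        tags.append(
            '<link rel="alternate" hreflang="%s" href="%s" />'
            % (escape(hreflang, quote=True), escape(href, quote=True))
        )
    return tags


if __name__ == "__main__":
    for tag in language_link_tags(
        "https://example.org", "/ar/dataset/house-mouse", ["en", "ar", "pt_BR"]
    ):
        print(tag)
```

The output would have to end up in the `<head>` of every language version of the page, for example via a template in an extension. Whether pointing the canonical link at the default language is the right choice for a given site is a judgment call; the article above discusses the trade-offs.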

On Mon, Feb 2, 2015 at 11:40 PM, Stefan Oderbolz
<stefan.oderbolz at liip.ch> wrote:
> Hi there,
>
> while you could use the robots.txt "trick" to exclude URLs, it's not a very
> nice practice, because then the content is basically lost for Google et al.
>
> The preferred way to handle this is by using canonical links (see
> https://support.google.com/webmasters/answer/139066?hl=en). This is a way to
> tell a search engine your preferred URL for content and even specify that
> each page exists in several languages.
>
> Afaik this has not (yet) been implemented in CKAN. It would definitely be a
> very welcome patch or extension.
>
> Regards Stefan
>
> On Feb 2, 2015 3:17 AM, "Alex (Maxious) Sadleir" <maxious at gmail.com> wrote:
>>
>> You can patch robots.txt in the CKAN source to exclude languages from
>> search results (while users can still switch on the site unless you
>> also disable languages in the config). I also exclude /_tracking
>>
>> diff --git a/ckan/public/robots.txt b/ckan/public/robots.txt
>> index 279a33a..e410bdc 100644
>> --- a/ckan/public/robots.txt
>> +++ b/ckan/public/robots.txt
>> @@ -3,6 +3,50 @@ Disallow: /dataset/rate/
>>  Disallow: /revision/
>>  Disallow: /dataset/*/history
>>  Disallow: /api/
>> +Disallow: /_tracking
>> +Disallow: /_tracking
>> +
>> +Disallow: /ar/
>> +Disallow: /bg/
>> +Disallow: /ca/
>> +Disallow: /cs_CZ/
>> +Disallow: /da_DK/
>> +Disallow: /de/
>> +Disallow: /dv/
>> +Disallow: /el/
>> +Disallow: /en_AU/
>> +Disallow: /en_GB/
>> +Disallow: /es/
>> +Disallow: /es_AR/
>> +Disallow: /fa_IR/
>> +Disallow: /fi/
>> +Disallow: /fr/
>> +Disallow: /hu/
>> +Disallow: /id/
>> +Disallow: /is/
>> +Disallow: /it/
>> +Disallow: /ja/
>> +Disallow: /km/
>> +Disallow: /ko_KR/
>> +Disallow: /lt/
>> +Disallow: /lv/
>> +Disallow: /my_MM/
>> +Disallow: /nl/
>> +Disallow: /no/
>> +Disallow: /pl/
>> +Disallow: /pt_BR/
>> +Disallow: /ro/
>> +Disallow: /ru/
>> +Disallow: /sk/
>> +Disallow: /sl/
>> +Disallow: /sq/
>> +Disallow: /sr/
>> +Disallow: /sr_Latn/
>> +Disallow: /sv/
>> +Disallow: /tr/
>> +Disallow: /uk_UA/
>> +Disallow: /zh_CN/
>> +Disallow: /zh_TW/
>>
>>  User-Agent: *
>>  Crawl-Delay: 10
>>
>> On Mon, Feb 2, 2015 at 12:59 PM, Aaron McGlinchy
>> <McGlinchyA at landcareresearch.co.nz> wrote:
>> > Hi, our instance of CKAN now has datasets showing up in Google searches,
>> > which is great.  However, I have noticed that often the link which comes up
>> > in the Google search takes the user to a 'non-default' language version of
>> > the dataset or resource.  I.e. our language is English, but a search for
>> > example for:  house mouse data  returns as the number 1 result one of our
>> > resources, but with the language as Arabic.  This is perfectly fine if the
>> > user doing the search wants the Arabic language interface, but not
>> > quite so user-friendly if the user wants the English interface.
>> >
>> > Is there anything that can be done to influence this behaviour (without
>> > removing language options that some other users might wish to use)?
>> >
>> > Thanks
>> > Aaron
>> >
>> > ________________________________
>> >
>> > Please consider the environment before printing this email
>> > Warning: This electronic message together with any attachments is
>> > confidential. If you receive it in error: (i) you must not read, use,
>> > disclose, copy or retain it; (ii) please contact the sender immediately by
>> > reply email and then delete the emails.
>> > The views expressed in this email may not be those of Landcare Research
>> > New Zealand Limited. http://www.landcareresearch.co.nz
>> > _______________________________________________
>> > ckan-dev mailing list
>> > ckan-dev at lists.okfn.org
>> > https://lists.okfn.org/mailman/listinfo/ckan-dev
>> > Unsubscribe: https://lists.okfn.org/mailman/options/ckan-dev



-- 
Liip AG  // Limmatstrasse 183 //  CH-8005 Zürich
Tel +41 43 500 39 80 // GnuPG 0x7B588C67 // www.liip.ch


