[ckan-dev] problems configuring SOLR 4.5.0 with CKAN 2.0
Elena Camossi
elena.camossi at ext.jrc.ec.europa.eu
Wed Nov 6 10:36:31 UTC 2013
Hi everyone,
we solved the problem.
We found out that it was due to a wrong solr_url setting in the CKAN
configuration file (.ini), together with a wrong setting of the default
Solr core to use, so the missing resource was the Solr schema.
The correct .ini file has this line:
solr_url = http://127.0.0.1:8080/solr/ckan-schema-2.0
instead of
solr_url = http://127.0.0.1:8080/solr
(8080 is because we're using Tomcat).
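As a quick sanity check for this kind of mistake, the shell sketch below reads solr_url from an .ini file and reports which core it names. The helper names (solr_url_of, solr_core_of) are made up for illustration, and the check assumes Solr is deployed under the /solr context path, as in the setup described here.

```shell
#!/bin/sh
# Hypothetical helpers, not part of CKAN or Solr.

# Print the first solr_url value found in the given .ini file.
solr_url_of() {
    sed -n 's/^[[:space:]]*solr_url[[:space:]]*=[[:space:]]*//p' "$1" | head -n 1
}

# Print the core name from a URL of the form .../solr/<core>.
# Fails (non-zero status) if the URL stops at /solr, i.e. names no core.
solr_core_of() {
    case $1 in
        */solr/?*) printf '%s\n' "${1##*/solr/}" ;;
        *) return 1 ;;
    esac
}

# Example use (path as in this post):
#   url=$(solr_url_of /etc/ckan/production.ini)
#   solr_core_of "$url" || echo "solr_url does not name a core: $url"
```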
For anyone interested, below are the steps for configuring Solr 4.5
and Tomcat 7 for CKAN 2.0 on Red Hat, with a single Solr core for CKAN.
We adapted the steps in the official documentation and the procedure
described here:
https://github.com/okfn/ckan/wiki/How-to-install-CKAN-2.0-on-CentOS-6.3-%28new%29
Regards,
-Elena
----------------------------------------------------------------------------
Create a CKAN Configuration
1) Create the CKAN configuration directory and configuration file
mkdir -p /etc/ckan/
chown -R ckan /etc/ckan/
su -s /bin/bash - ckan
. pyenv/bin/activate
cd pyenv/src/ckan
paster make-config ckan /etc/ckan/production.ini
2) Edit the production.ini file in a text editor, changing the following
options (substitute your own postgres user, password and database for
CKAN):
sqlalchemy.url = postgresql://[ckan_postgres_user]:[ckan_postgres_pwd]@localhost/[ckan_postgres_database]
ckan.site_id =
solr_url = http://127.0.0.1:8080/solr/ckan-schema-2.0
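If you prefer to script the edit rather than use a text editor, something along these lines may work. It is only a sketch: set_ini_option is a hypothetical helper, it replaces the first existing line for the key (including a commented-out default such as "#solr_url = ...") and does nothing if the key is absent, so nothing ends up in the wrong section of the file.

```shell
#!/bin/sh
# Hypothetical helper: set_ini_option FILE KEY VALUE
# Rewrites the first "KEY = ..." (or "#KEY = ...") line as "KEY = VALUE".
# Returns non-zero and leaves FILE untouched if KEY is not present.
# Note: awk -v interprets backslash escapes, so VALUE should not contain
# backslashes (plain URLs are fine).
set_ini_option() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    if awk -v k="$key" -v v="$value" '
        {
            line = $0
            sub(/^#[ \t]*/, "", line)      # tolerate a commented-out default
            split(line, parts, "=")
            name = parts[1]
            gsub(/[ \t]/, "", name)
            if (!done && name == k) { print k " = " v; done = 1 }
            else print
        }
        END { exit done ? 0 : 1 }
    ' "$file" > "$tmp"; then
        cat "$tmp" > "$file"               # keep the original owner and mode
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Example use (values as in this post):
#   set_ini_option /etc/ckan/production.ini solr_url \
#       http://127.0.0.1:8080/solr/ckan-schema-2.0
```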
Install Apache SOLR 4.5.0 and configure it to use the CKAN solr schema
1) Download and extract Apache SOLR (if you get an error, check if the
name of the .tgz file is solr-xxx.tgz or apache-solr-xxx.tgz)
curl http://archive.apache.org/dist/lucene/solr/4.5.0/solr-4.5.0.tgz | tar xzf -
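Since the tarball name differs between releases, one way to avoid guessing is to try both names in turn. The sketch below does that. The function names are made up for illustration, and the download part is untested against the archive site.

```shell
#!/bin/sh
# Hypothetical helpers for fetching a Solr release from archive.apache.org.

# Print the two candidate tarball URLs for a given version, one per line.
solr_tarball_urls() {
    base=http://archive.apache.org/dist/lucene/solr/$1
    printf '%s\n' "$base/solr-$1.tgz" "$base/apache-solr-$1.tgz"
}

# Try each candidate until one downloads, then unpack it in the current dir.
fetch_solr() {
    for url in $(solr_tarball_urls "$1"); do
        # -f makes curl fail on HTTP errors instead of saving an error page
        if curl -fsSL -o "solr-$1.tgz" "$url"; then
            tar xzf "solr-$1.tgz" && return 0
        fi
    done
    return 1
}

# Example use:
#   fetch_solr 4.5.0
```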
2) Create directories to hold SOLR cores (just one, in our case, called
ckan):
mkdir -p /usr/share/solr/ckan /var/lib/solr/data/ckan /etc/solr/ckan
3) Cd to the directory where solr-4.5.0.tgz was downloaded and unpacked,
then copy the Apache SOLR war to the desired location (you can optionally
rename the file to solr.war) and copy the example core configuration
cp solr-4.5.0/dist/solr-4.5.0.war /usr/share/solr
cp -r solr-4.5.0/example/solr/collection1/conf /etc/solr/ckan
4) Create a directory in the Solr home for libraries and copy all the .jar files there
mkdir /usr/share/solr/lib
cp solr-4.5.0/dist/* /usr/share/solr/lib
cp solr-4.5.0/contrib/analysis-extras/lib/* /usr/share/solr/lib
cp solr-4.5.0/contrib/clustering/lib/* /usr/share/solr/lib
cp solr-4.5.0/contrib/dataimporthandler/lib/* /usr/share/solr/lib
cp solr-4.5.0/contrib/extraction/lib/* /usr/share/solr/lib
cp solr-4.5.0/contrib/langid/lib/* /usr/share/solr/lib
cp solr-4.5.0/contrib/uima/lib/* /usr/share/solr/lib
cp solr-4.5.0/contrib/velocity/lib/* /usr/share/solr/lib
To avoid copying every jar, consider editing
/etc/solr/ckan/conf/solrconfig.xml to remove the unnecessary sections (which
ones are not required by CKAN?)
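The copy commands in step 4 can also be written as a loop. This is just a sketch with a made-up function name (copy_solr_jars). Unlike the commands above it copies only *.jar files (so the war in dist is skipped) and silently ignores a contrib directory that has no lib folder.

```shell
#!/bin/sh
# Hypothetical helper: copy_solr_jars SOLR_UNPACK_DIR DEST_LIB_DIR
copy_solr_jars() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # Jars shipped in dist/
    for jar in "$src"/dist/*.jar; do
        if [ -f "$jar" ]; then cp "$jar" "$dest" || return 1; fi
    done
    # Jars from the same contrib modules listed in step 4
    for name in analysis-extras clustering dataimporthandler extraction \
                langid uima velocity; do
        for jar in "$src/contrib/$name"/lib/*.jar; do
            if [ -f "$jar" ]; then cp "$jar" "$dest" || return 1; fi
        done
    done
    return 0
}

# Example use (paths as in this post):
#   copy_solr_jars solr-4.5.0 /usr/share/solr/lib
```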
5) Remove or comment out the <dataDir> section in
/etc/solr/ckan/conf/solrconfig.xml, which is no longer necessary (and raises
an error)
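For step 5, a sed one-liner can do the commenting out, assuming the <dataDir> element sits on a single line (which I believe is the case in the example solrconfig.xml, but check your copy). The function name is hypothetical and the result is written to standard output, so the original file is not modified.

```shell
#!/bin/sh
# Hypothetical helper: print FILE with a one-line <dataDir>...</dataDir>
# element wrapped in an XML comment. Multi-line elements are left untouched.
comment_out_datadir() {
    sed 's|^\([[:space:]]*\)\(<dataDir>.*</dataDir>\)[[:space:]]*$|\1<!-- \2 -->|' "$1"
}

# Example use (path as in this post):
#   comment_out_datadir /etc/solr/ckan/conf/solrconfig.xml > /tmp/solrconfig.xml
#   # inspect /tmp/solrconfig.xml, then copy it back over the original
```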
6) Create a symbolic link between the configurations in /etc and /usr.
ln -s /etc/solr/ckan/conf /usr/share/solr/ckan/conf
7) Remove the provided schema from the configured core and link the
schema files in the CKAN source.
rm -f /etc/solr/ckan/conf/schema.xml
ln -s /usr/local/ckan/pyenv/src/ckan/ckan/config/solr/schema-2.0.xml /etc/solr/ckan/conf/schema.xml
8) For a SOLR multicore installation, create in
/opt/tomcat7/conf/Catalina/localhost/ a new xml file for each core,
specifying the context (the solr home), with the following content
<Context docBase="/usr/share/solr/solr-4.5.0.war" crossContext="true">
<Environment name="solr/home" type="java.lang.String" value="[complete path to solr core home]" override="true" />
</Context>
For example, create file /opt/tomcat7/conf/Catalina/localhost/solr.xml:
<Context docBase="/usr/share/solr/solr-4.5.0.war" crossContext="true">
<Environment name="solr/home" type="java.lang.String"
value="/usr/share/solr" override="true" />
</Context>
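If you need several of these context files, a small generator may save some typing. This is only a sketch. tomcat_context_xml is a made-up name and the paths in the usage comment are the ones from this post.

```shell
#!/bin/sh
# Hypothetical helper: tomcat_context_xml WAR_PATH SOLR_HOME
# Prints a Tomcat context fragment like the ones shown above.
tomcat_context_xml() {
    war=$1
    solr_home=$2
    cat <<EOF
<Context docBase="$war" crossContext="true">
  <Environment name="solr/home" type="java.lang.String" value="$solr_home" override="true" />
</Context>
EOF
}

# Example use:
#   tomcat_context_xml /usr/share/solr/solr-4.5.0.war /usr/share/solr \
#       > /opt/tomcat7/conf/Catalina/localhost/solr.xml
```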
9) Create a new file, called /usr/share/solr/solr.xml, specifying the
configuration for each core. For the ckan core it is:
<solr persistent="true" sharedLib="lib">
<cores adminPath="/admin/cores">
<core name="ckan-schema-2.0" instanceDir="ckan">
<property name="dataDir" value="/var/lib/solr/data/ckan" />
</core>
</cores>
</solr>
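For more than one core, the same file can be generated from a list of "name:instanceDir" pairs. This is a sketch under the assumption that every core keeps its data in /var/lib/solr/data/<instanceDir>, as the ckan core does here. The function name and the pair syntax are invented for the example.

```shell
#!/bin/sh
# Hypothetical helper: solr_xml NAME:INSTANCEDIR [NAME:INSTANCEDIR ...]
# Prints a solr.xml in the format shown above, one <core> per argument.
solr_xml() {
    printf '%s\n' '<solr persistent="true" sharedLib="lib">' \
                  '  <cores adminPath="/admin/cores">'
    for spec in "$@"; do
        name=${spec%%:*}   # core name, the last part of solr_url
        dir=${spec#*:}     # instanceDir, relative to the solr home
        printf '    <core name="%s" instanceDir="%s">\n' "$name" "$dir"
        printf '      <property name="dataDir" value="/var/lib/solr/data/%s" />\n' "$dir"
        printf '    </core>\n'
    done
    printf '%s\n' '  </cores>' '</solr>'
}

# Example use:
#   solr_xml ckan-schema-2.0:ckan > /usr/share/solr/solr.xml
```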
Enabling Tomcat
1) Copy the Solr jar libraries from /usr/share/solr/conf/lib/ into your
container's main lib directory (Tomcat). These jars set up SLF4J and
log4j, which Solr now uses for logging.
cd <solr download directory>
cp -f solr-4.5.0/dist/* /opt/tomcat7/lib/
2) Set permissions so that the tomcat user can access the Solr configuration
chown -R tomcat:tomcat /usr/share/solr /var/lib/solr
3) Set up tomcat to start at boot, or restart the service if it's
already running
chkconfig tomcat on
service tomcat start
or
service tomcat restart
4) Check if Tomcat and Solr are running correctly
Open a browser and go to http://localhost:8080/ to check that tomcat is
running.
Then go to http://localhost:8080/manager/html and connect as admin. Among
the applications you should see solr, and at
http://localhost:8080/solr you should reach the Solr admin page. The path
depends on the name of the file in /opt/tomcat7/conf/Catalina/localhost. It
is [.]/solr if the name of the file is
/opt/tomcat7/conf/Catalina/localhost/solr.xml.
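The same check can be done from a shell on the server. The sketch below derives the URL from the context file name and probes it with curl. Both function names are made up, and only the simple case is handled (a file called NAME.xml served at /NAME on localhost:8080).

```shell
#!/bin/sh
# Hypothetical helpers for a command-line smoke test.

# Map a Tomcat context file to the URL it is served under,
# e.g. .../localhost/solr.xml -> http://localhost:8080/solr
context_url() {
    name=$(basename "$1" .xml)
    printf 'http://localhost:8080/%s\n' "$name"
}

# Succeed if the URL answers without an HTTP error (redirects are followed).
check_url() {
    curl -fsSL -o /dev/null "$1"
}

# Example use:
#   check_url http://localhost:8080/ && echo "tomcat is up"
#   url=$(context_url /opt/tomcat7/conf/Catalina/localhost/solr.xml)
#   check_url "$url" && echo "solr answers at $url"
```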