[ckan-changes] [ckan/ckan] 766dbe: [#1838] Small refactoring on plugin's datastore_se...

GitHub noreply at github.com
Wed Jul 23 14:21:38 UTC 2014


  Branch: refs/heads/master
  Home:   https://github.com/ckan/ckan
  Commit: 766dbe441e07bd60c4c04fb4093584493bbe4995
      https://github.com/ckan/ckan/commit/766dbe441e07bd60c4c04fb4093584493bbe4995
  Author: Vitor Baptista <vitor at vitorbaptista.com>
  Date:   2014-07-10 (Thu, 10 Jul 2014)

  Changed paths:
    M ckanext/datastore/plugin.py

  Log Message:
  -----------
  [#1838] Small refactoring on plugin's datastore_search()

I removed some unused variables, and stopped validating on _where(), as we're
already validating on datastore_validate().


  Commit: 30c6de01e3c050c32c1e84e454c86d6a2c8aa569
      https://github.com/ckan/ckan/commit/30c6de01e3c050c32c1e84e454c86d6a2c8aa569
  Author: Vitor Baptista <vitor at vitorbaptista.com>
  Date:   2014-07-10 (Thu, 10 Jul 2014)

  Changed paths:
    M ckanext/datastore/logic/schema.py
    M ckanext/datastore/plugin.py
    M ckanext/datastore/tests/test_search.py

  Log Message:
  -----------
  [#1838] Add full-text searching on specific fields on datastore_search

You can still send queries to `datastore_search` with the `q` parameter as a
string, and they'll work as they always have (i.e. we're backwards compatible).
But now you can also send `q` as a dict with a string value, as in:

```json
"q": {
    "title": "CKAN"
}
```

That would do a full-text search only on the "title" field, and return the
results. I haven't created the indexes yet, so these searches can be pretty
slow.
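
As a rough illustration of the two accepted shapes of `q` (not code from this
commit), the sketch below builds a request body for each. The helper name
`build_search_payload` and the resource id are hypothetical placeholders.

```python
import json


def build_search_payload(resource_id, q):
    """Build a JSON body for datastore_search (hypothetical helper).

    `q` may be a plain string (search the whole row) or a dict mapping
    field names to search strings (search only those fields).
    """
    if not isinstance(q, (str, dict)):
        raise TypeError("q must be a string or a dict")
    return json.dumps({"resource_id": resource_id, "q": q}, sort_keys=True)


# "my-resource-id" is a placeholder, not a real resource.
whole_row = build_search_payload("my-resource-id", "CKAN")
title_only = build_search_payload("my-resource-id", {"title": "CKAN"})
```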


  Commit: 029ee11bf72acf488cdafdd1926c19cefc1549b3
      https://github.com/ckan/ckan/commit/029ee11bf72acf488cdafdd1926c19cefc1549b3
  Author: Vitor Baptista <vitor at vitorbaptista.com>
  Date:   2014-07-10 (Thu, 10 Jul 2014)

  Changed paths:
    M ckanext/datastore/plugin.py

  Log Message:
  -----------
  Revert "[#1838] Small refactoring on plugin's datastore_search()"

This reverts commit 766dbe441e07bd60c4c04fb4093584493bbe4995.


  Commit: 50c6fdefa202d1ed03b63f77f4337f6ea9d0fe8a
      https://github.com/ckan/ckan/commit/50c6fdefa202d1ed03b63f77f4337f6ea9d0fe8a
  Author: Vitor Baptista <vitor at vitorbaptista.com>
  Date:   2014-07-10 (Thu, 10 Jul 2014)

  Changed paths:
    M ckanext/datastore/plugin.py

  Log Message:
  -----------
  [#1838] Don't add "q" filters on fields that don't exist

We simply ignore them. We can't raise errors, because there might be another
extension that does understand them, so we can't say they're actually invalid.
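
A minimal sketch of the "ignore, don't reject" behaviour described above. The
function name and sample columns are assumptions made for illustration; the
actual plugin code may be structured differently.

```python
def split_q_by_known_fields(q, field_names):
    """Split a dict-style `q` into the parts we handle and the parts we skip.

    Unknown keys are returned separately instead of raising an error, so
    that another extension still has the chance to interpret them.
    """
    known = set(field_names)
    handled = {k: v for k, v in q.items() if k in known}
    ignored = {k: v for k, v in q.items() if k not in known}
    return handled, ignored


# Hypothetical table columns and query.
handled, ignored = split_q_by_known_fields(
    {"title": "CKAN", "not_a_column": "x"},
    ["_id", "title", "country"],
)
```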


  Commit: c74a9b27a8a358f252f9a030c9f9684f1e206718
      https://github.com/ckan/ckan/commit/c74a9b27a8a358f252f9a030c9f9684f1e206718
  Author: Vitor Baptista <vitor at vitorbaptista.com>
  Date:   2014-07-10 (Thu, 10 Jul 2014)

  Changed paths:
    M ckanext/datastore/plugin.py

  Log Message:
  -----------
  [#1838] Fix bug where we were limiting filters to columns on the fields param

The problem was that `_where` and `_sort` needed a list of all columns, but if
the user sent a field list using the `fields` parameter, they actually received
the list of fields in that parameter.

This fixes that.
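
To make the failure mode concrete, here is a small sketch under assumed names
(`usable_filters` and the sample columns are hypothetical). It only shows why
filters must be checked against every column rather than against the columns
the caller asked to have returned.

```python
def usable_filters(filters, columns):
    """Keep only the filters whose key appears in `columns`."""
    return {k: v for k, v in filters.items() if k in columns}


all_columns = ["_id", "title", "country"]  # every column in the table
requested_fields = ["title"]               # columns the caller wants returned
filters = {"country": "Brazil"}

# Checking against the `fields` parameter loses the filter on "country".
buggy = usable_filters(filters, requested_fields)
# Checking against all columns keeps it.
fixed = usable_filters(filters, all_columns)
```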


  Commit: 9f9f4114d7bb4a663cbccad6ef1dae5fa14b26c9
      https://github.com/ckan/ckan/commit/9f9f4114d7bb4a663cbccad6ef1dae5fa14b26c9
  Author: Vitor Baptista <vitor at vitorbaptista.com>
  Date:   2014-07-16 (Wed, 16 Jul 2014)

  Changed paths:
    M ckanext/datastore/tests/test_create.py

  Log Message:
  -----------
  [#1838] Add tests for current index creation code


  Commit: 9072a9413ee3188c64e4700da823fa4d64e365f2
      https://github.com/ckan/ckan/commit/9072a9413ee3188c64e4700da823fa4d64e365f2
  Author: Vitor Baptista <vitor at vitorbaptista.com>
  Date:   2014-07-16 (Wed, 16 Jul 2014)

  Changed paths:
    M ckanext/datastore/db.py
    M ckanext/datastore/tests/test_create.py

  Log Message:
  -----------
  [#1838] Create FTS indexes on each textual field

We create those indexes at the same time we're indexing `_full_text`. This is
needed to allow full-text searches on specific columns, and not only on the
entire row.

Unfortunately, the test can't check that an FTS index was created for each
specific field. It relies on the number of indexes instead: because the index
is not on the column itself, but on the return value of `to_tsvector`, we
don't have that information available.

There are still issues with this code, as we're using English by default (and
there's no way to override that).
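
For readers unfamiliar with expression indexes, the sketch below generates one
`CREATE INDEX` statement per textual field, each indexing the result of
`to_tsvector` rather than the column itself. The index naming scheme, the GIN
index method and the helper name are assumptions for illustration; the code in
`db.py` may differ on all three.

```python
def fts_index_statements(table, text_fields, lang="english"):
    """Build one expression-index statement per textual field.

    Identifiers are assumed to be validated already; real code must
    quote or escape them safely before building SQL.
    """
    template = (
        'CREATE INDEX "{table}_{field}_fts" ON "{table}" '
        "USING gin(to_tsvector('{lang}', \"{field}\"))"
    )
    return [
        template.format(table=table, field=field, lang=lang)
        for field in text_fields
    ]


# Hypothetical table and columns.
statements = fts_index_statements("my_table", ["title", "country"])
```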


  Commit: 48c3b55ab713a9bb9267f1490569d64691075fb8
      https://github.com/ckan/ckan/commit/48c3b55ab713a9bb9267f1490569d64691075fb8
  Author: Vitor Baptista <vitor at vitorbaptista.com>
  Date:   2014-07-17 (Thu, 17 Jul 2014)

  Changed paths:
    M ckanext/datastore/db.py

  Log Message:
  -----------
  [#1838] Use Postgres' default language to create the FTS index


  Commit: 0383138e9abdfed1513cdcf585bd407e892efb0a
      https://github.com/ckan/ckan/commit/0383138e9abdfed1513cdcf585bd407e892efb0a
  Author: Vitor Baptista <vitor at vitorbaptista.com>
  Date:   2014-07-17 (Thu, 17 Jul 2014)

  Changed paths:
    M ckanext/datastore/db.py

  Log Message:
  -----------
  [#1838] Escape % from all strings that may have them

The problem we faced when doing format() twice was that if someone passed a
query containing "{}", it would confuse the second format(). For example, the
query:

```json
{
    "filters": {
        "country": "{test}"
    }
}
```

So, instead of doing two formats to escape %, we escape % before running it.
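
The sketch below reproduces the problem in isolation. It is not the code from
`db.py`: the helper `escape_percent` is hypothetical, and it assumes the reason
for escaping is that the final string is later passed through %-style
interpolation. Values are inlined into the string only to keep the
demonstration short; real queries should use bound parameters.

```python
def escape_percent(value):
    """Double every % so later %-style interpolation treats it literally."""
    return value.replace("%", "%%")


user_value = "{test} 100%"

# Two format() passes: the second one trips over the braces that came
# from the user's value.
first_pass = "country = '{value}'".format(value=user_value)
try:
    first_pass.format()
    second_format_failed = False
except (KeyError, IndexError):
    second_format_failed = True

# Escaping % up front means a single format() call is enough.
single_pass = "country = '{value}'".format(value=escape_percent(user_value))

# A later %-interpolation step turns %% back into a literal %.
final = single_pass % ()
```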


  Commit: 2ab9ccce189e7496ad57eb478685685b17b41376
      https://github.com/ckan/ckan/commit/2ab9ccce189e7496ad57eb478685685b17b41376
  Author: Vitor Baptista <vitor at vitorbaptista.com>
  Date:   2014-07-18 (Fri, 18 Jul 2014)

  Changed paths:
    M ckanext/datastore/logic/schema.py
    M ckanext/datastore/tests/test_search.py

  Log Message:
  -----------
  [#1838] Fix bug where we weren't converting JSON to dicts


  Commit: 1210ebe3b1d520e1af357dbc130736891935a71b
      https://github.com/ckan/ckan/commit/1210ebe3b1d520e1af357dbc130736891935a71b
  Author: Vitor Baptista <vitor at vitorbaptista.com>
  Date:   2014-07-20 (Sun, 20 Jul 2014)

  Changed paths:
    M ckanext/datastore/db.py
    M ckanext/datastore/tests/test_create.py

  Log Message:
  -----------
  [#1838] Always create full-text search indexes

Before this, we only created FTS indexes when the user sent an `indexes` or
`primary_key` parameter to the `datastore_create` call, which only happens when
you're directly calling it (i.e. not uploading a file through CKAN's
interface).

Skipping the indexes makes it faster to upload something to the datastore, but
creates problems because queries then run on unindexed fields. So this commit
changes it to always create at least the full-text search indexes.


  Commit: cbb08202f18f3079c12eb01ee84def52b2621723
      https://github.com/ckan/ckan/commit/cbb08202f18f3079c12eb01ee84def52b2621723
  Author: Vitor Baptista <vitor at vitorbaptista.com>
  Date:   2014-07-20 (Sun, 20 Jul 2014)

  Changed paths:
    M ckan/config/solr/schema.xml
    M ckan/lib/search/index.py
    M ckan/new_tests/lib/search/test_index.py
    M ckan/tests/functional/api/test_package_search.py
    M ckanext/datastore/db.py
    M ckanext/datastore/interfaces.py
    M ckanext/datastore/logic/action.py
    M ckanext/datastore/logic/schema.py
    M ckanext/datastore/plugin.py
    M ckanext/datastore/tests/test_search.py
    M doc/api/index.rst
    M doc/contributing/reviewing.rst

  Log Message:
  -----------
  Merge branch 'master' into 1838-fts-on-specific-columns

Conflicts:
	ckanext/datastore/plugin.py


  Commit: 27f54220cc8ada5df39e20edf848a497a6fbb84a
      https://github.com/ckan/ckan/commit/27f54220cc8ada5df39e20edf848a497a6fbb84a
  Author: Vitor Baptista <vitor at vitorbaptista.com>
  Date:   2014-07-20 (Sun, 20 Jul 2014)

  Changed paths:
    M ckanext/datastore/tests/test_create.py

  Log Message:
  -----------
  [#1838] Use assert_equal to facilitate test debugging


  Commit: ed9ece0fadb4891b8012fbd2e9ff4682b9c94f1c
      https://github.com/ckan/ckan/commit/ed9ece0fadb4891b8012fbd2e9ff4682b9c94f1c
  Author: Vitor Baptista <vitor at vitorbaptista.com>
  Date:   2014-07-21 (Mon, 21 Jul 2014)

  Changed paths:
    M ckanext/datastore/plugin.py
    M ckanext/datastore/tests/test_search.py

  Log Message:
  -----------
  [#1838] Calculate FTS rankings only on specific columns

The bug was that when we ran a full-text query on a specific column, the
query itself was OK, but the rank was calculated using all the text from the
row. That messed things up when trying to get DISTINCT values.


  Commit: d949a018e03c65998620ab510a255aa65904fd83
      https://github.com/ckan/ckan/commit/d949a018e03c65998620ab510a255aa65904fd83
  Author: Vitor Baptista <vitor at vitorbaptista.com>
  Date:   2014-07-21 (Mon, 21 Jul 2014)

  Changed paths:
    M ckanext/datastore/tests/test_create.py

  Log Message:
  -----------
  [#1838] Fix tests where we were asserting an assert_equals


  Commit: b50452da76170d77f41baef675fa71a16733f15f
      https://github.com/ckan/ckan/commit/b50452da76170d77f41baef675fa71a16733f15f
  Author: Vitor Baptista <vitor at vitorbaptista.com>
  Date:   2014-07-21 (Mon, 21 Jul 2014)

  Changed paths:
    M ckanext/datastore/db.py
    M ckanext/datastore/plugin.py
    A ckanext/datastore/tests/test_db.py
    M ckanext/datastore/tests/test_plugin.py

  Log Message:
  -----------
  [#1838] Set default FTS language using ckan.datastore.default_fts_lang

If you set `ckan.datastore.default_fts_lang` in the INI file, it'll be used as
the default language when creating full-text search indexes and running
queries. It can be overridden on a per-request basis with the `lang`
parameter. If neither is set (neither `lang` nor
`ckan.datastore.default_fts_lang`), "english" is used as the default.

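The precedence described above (request parameter, then INI setting, then
"english") can be sketched as follows. The function name is hypothetical and
`config` is assumed to be a plain dict of INI settings.

```python
def resolve_fts_lang(request_lang=None, config=None):
    """Pick the FTS language: request `lang`, then config, then "english"."""
    config = config or {}
    return (
        request_lang
        or config.get("ckan.datastore.default_fts_lang")
        or "english"
    )


ini = {"ckan.datastore.default_fts_lang": "portuguese"}

no_settings = resolve_fts_lang()
from_config = resolve_fts_lang(config=ini)
from_request = resolve_fts_lang(request_lang="german", config=ini)
```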

  Commit: 8e365f5568f8f05c3c3c798565555e0537195f5f
      https://github.com/ckan/ckan/commit/8e365f5568f8f05c3c3c798565555e0537195f5f
  Author: Vitor Baptista <vitor at vitorbaptista.com>
  Date:   2014-07-22 (Tue, 22 Jul 2014)

  Changed paths:
    M ckan/new_tests/helpers.py
    M ckanext/datastore/tests/test_create.py

  Log Message:
  -----------
  [#1838] Fix test on FTS indexes creation

It's really bad that I'm now testing the number of indexes created, which is
quite flaky. The problem is that we change the index name depending on whether
we're working on Postgres < or > 9. This works for now, but can be improved
later.


  Commit: 3d8ba39e56c12b3a3f70fc8e6f609141c07fd99e
      https://github.com/ckan/ckan/commit/3d8ba39e56c12b3a3f70fc8e6f609141c07fd99e
  Author: Vitor Baptista <vitor at vitorbaptista.com>
  Date:   2014-07-22 (Tue, 22 Jul 2014)

  Changed paths:
    M ckanext/datastore/db.py

  Log Message:
  -----------
  [#1838] Refactor FTS index creation


  Commit: a9d2186a05ab0ad66bcfc412a0be739e118b4521
      https://github.com/ckan/ckan/commit/a9d2186a05ab0ad66bcfc412a0be739e118b4521
  Author: Vitor Baptista <vitor at vitorbaptista.com>
  Date:   2014-07-22 (Tue, 22 Jul 2014)

  Changed paths:
    M ckanext/datastore/db.py
    M ckanext/datastore/tests/test_create.py
    M ckanext/datastore/tests/test_db.py

  Log Message:
  -----------
  [#1838] Create index only if it doesn't exist

We do so by creating a unique index name that's specific to each pair
resource_id + field_name. Before trying to create an index, we check if an
index with this same name exists, and only create it if it doesn't.
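
One way to get a stable name per resource_id + field_name pair is to hash the
pair, which also keeps the name within Postgres' 63-character identifier
limit. This is a sketch under that assumption: the hashing scheme, the
`idx_fts_` prefix and both helper names are hypothetical, not the scheme used
in `db.py`.

```python
import hashlib


def fts_index_name(resource_id, field_name):
    """Derive a deterministic index name for one resource/field pair."""
    key = "{0}:{1}".format(resource_id, field_name).encode("utf-8")
    return "idx_fts_{0}".format(hashlib.sha1(key).hexdigest()[:16])


def indexes_to_create(resource_id, field_names, existing_index_names):
    """Return (field, index_name) pairs whose index does not exist yet."""
    existing = set(existing_index_names)
    wanted = [(f, fts_index_name(resource_id, f)) for f in field_names]
    return [(f, name) for f, name in wanted if name not in existing]


# Hypothetical resource with an index already present for "title".
already_there = [fts_index_name("my-resource-id", "title")]
missing = indexes_to_create(
    "my-resource-id", ["title", "country"], already_there)
```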


  Commit: f8e8976d9fb418dc35d7ef3f1c422832a68a1cb8
      https://github.com/ckan/ckan/commit/f8e8976d9fb418dc35d7ef3f1c422832a68a1cb8
  Author: Vitor Baptista <vitor at vitorbaptista.com>
  Date:   2014-07-22 (Tue, 22 Jul 2014)

  Changed paths:
    M ckanext/datastore/tests/test_create.py

  Log Message:
  -----------
  [#1838] Refactor tests to be less flaky

Instead of asserting the number of indexes created, we now assert on the actual
index names.


  Commit: d2a6c4b5534cb17ab890e0d6c683a662a10ffbf0
      https://github.com/ckan/ckan/commit/d2a6c4b5534cb17ab890e0d6c683a662a10ffbf0
  Author: Vitor Baptista <vitor at vitorbaptista.com>
  Date:   2014-07-22 (Tue, 22 Jul 2014)

  Changed paths:
    M ckan/config/deployment.ini_tmpl
    M ckanext/datastore/logic/action.py
    M doc/maintaining/configuration.rst

  Log Message:
  -----------
  [#1838] Adds docs


  Commit: 922af510fc2c19bc3954ffb9cae341c1eac68e8b
      https://github.com/ckan/ckan/commit/922af510fc2c19bc3954ffb9cae341c1eac68e8b
  Author: amercader <amercadero at gmail.com>
  Date:   2014-07-23 (Wed, 23 Jul 2014)

  Changed paths:
    M ckan/config/deployment.ini_tmpl
    M ckan/new_tests/helpers.py
    M ckanext/datastore/db.py
    M ckanext/datastore/logic/action.py
    M ckanext/datastore/logic/schema.py
    M ckanext/datastore/plugin.py
    M ckanext/datastore/tests/test_create.py
    A ckanext/datastore/tests/test_db.py
    M ckanext/datastore/tests/test_plugin.py
    M ckanext/datastore/tests/test_search.py
    M doc/maintaining/configuration.rst

  Log Message:
  -----------
  Merge branch '1838-fts-on-specific-columns'


Compare: https://github.com/ckan/ckan/compare/33652504e93b...922af510fc2c

