[ckan-dev] Special HTML Characters in Ckan and Harvest/Spatial Harvest Plugins

Nathan Hook nhook at ucar.edu
Thu Oct 26 15:51:43 UTC 2017


A friendly 'bump' on this issue.  Should I create a bug report in the ckan
github repository?

Thank you for your time,

Nathan

On Wed, Oct 11, 2017 at 3:00 PM, Nathan Hook <nhook at ucar.edu> wrote:

> Good Day,
>
> I know that html encoded characters are close to everyone's least favorite
> topic (besides the handling of unicode characters), but we have run into a
> problem and need some friendly knowledge or advice to figure out if we have
> discovered a bug (or bugs) in ckan.
>
> We are using both the Harvest and Spatial Harvest plugins to import ISO
> xml files into our ckan instance.
>
> Some of our iso xml files use the html encoded values for the less than
> and greater than symbols:  &lt; and &gt;
>
> And we are seeing some strange behavior with these symbols.  Below are all
> the use cases that we could come up with to show the behavior of these
> characters based on the view and input (harvest or created via the ckan UI).
>
>
> When importing &lt; and &gt; without a CDATA via xml and the harvester...
>
> Text in xml:
> Here is the lessthan &lt; this text should appear &gt; the greaterthan
> should be before this text.
>
> Dataset view:
> Here is the lessthan the greaterthan should be before this text.
>
> NOTE:
> The 'this text should appear' is missing.
>
> API Json Output:
> "Here is the lessthan < this text should appear > the greaterthan should be
> before this text."
>
> Search Page Output:
> Here is the lessthan < this text should appear > the greaterthan should be
> before this text.
>
>
> When importing &lt; and &gt; with a CDATA via xml and the harvester...
>
> Text in xml:
> <![CDATA[Here is the lessthan &lt; this text should appear &gt; the
> greaterthan should be before this text.]]>
>
> Dataset View:
> Here is the lessthan < this text should appear > the greaterthan should be
> before this text.
>
> API Json Output:
> "Here is the lessthan &lt; this text should appear &gt; the greaterthan
> should be before this text."
>
> Search Page Output:
> Here is the lessthan < this text should appear > the greaterthan should be
> before this text.
>
>
> When importing < and > with a CDATA via xml and the harvester...
>
> Text in xml:
> <![CDATA[Here is the lessthan < this text should appear > the greaterthan
> should be before this text.]]>
>
> Dataset view:
> Here is the lessthan the greaterthan should be before this text.
>
> NOTE:
> The 'this text should appear' is missing.
>
> API Json Output:
> "Here is the lessthan < this text should appear > the greaterthan should
> be before this text."
>
> Search Page Output:
> Here is the lessthan < this text should appear > the greaterthan should be
> before this text.
>
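
The three harvester cases above differ in what an XML parser hands to the application. As a minimal sketch using Python's standard `xml.etree.ElementTree` (the `abstract` element name is made up for illustration, and the harvest plugins may well use a different parser), entity references outside a CDATA section are decoded to literal characters, while the same references inside a CDATA section are passed through untouched:

```python
import xml.etree.ElementTree as ET

SENTENCE = "lessthan {lt} this text should appear {gt} the greaterthan"

# Case 1: entity references, no CDATA. The parser decodes them to < and >.
encoded_no_cdata = (
    "<abstract>" + SENTENCE.format(lt="&lt;", gt="&gt;") + "</abstract>"
)
# Case 2: entity references inside CDATA. The parser leaves them as written.
encoded_in_cdata = (
    "<abstract><![CDATA["
    + SENTENCE.format(lt="&lt;", gt="&gt;")
    + "]]></abstract>"
)
# Case 3: literal characters inside CDATA. They arrive as literal < and >.
literal_in_cdata = (
    "<abstract><![CDATA[" + SENTENCE.format(lt="<", gt=">") + "]]></abstract>"
)


def parsed_text(xml_string):
    """Return the text content the application sees after parsing."""
    return ET.fromstring(xml_string).text


for label, doc in [
    ("encoded, no CDATA", encoded_no_cdata),
    ("encoded, in CDATA", encoded_in_cdata),
    ("literal, in CDATA", literal_in_cdata),
]:
    print(label, "->", parsed_text(doc))
```

Under this sketch, cases 1 and 3 deliver the same literal `<` and `>` to the application, and only case 2 delivers the encoded form.
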
>
> When hand creating a record with &lt; and &gt; via the UI...
>
> Text entered:
> Here is the html encoded lessthan &lt; this text should appear &gt; the
> html encoded greaterthan should be before this text.
>
> Dataset view:
> Here is the html encoded lessthan < this text should appear > the html
> encoded greaterthan should be before this text.
>
> API Json Output:
> "Here is the html encoded lessthan &lt; this text should appear &gt; the
> html encoded greaterthan should be before this text."
>
> Search Page Output:
> Here is the html encoded lessthan < this text should appear > the html
> encoded greaterthan should be before this text.
>
>
> When hand creating a record with < and > via the UI...
>
> Text entered:
> Here is the lessthan < this text should appear > the greaterthan sould be
> before this text.
>
> Dataset view:
> Here is the lessthan the greaterthan sould be before this text.
>
> NOTE:
> The 'this text should appear' is missing.
>
> API Json Output:
> "Here is the lessthan < this text should appear > the greaterthan sould be
> before this text."
>
> Search Page Output:
> Here is the lessthan < this text should appear > the greaterthan sould be
> before this text.
>
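
One possible explanation for the missing text in the dataset view, offered as an assumption and not confirmed against the ckan source, is a tag-stripping step that removes everything between a `<` and the next `>` before the description is rendered. The `strip_tags` helper below is hypothetical and only illustrates that behavior:

```python
import re

# Hypothetical tag stripper (an assumption, not ckan's code): removes any run
# that starts with '<' and ends at the next '>'.
TAG_LIKE = re.compile(r"<[^<>]*>")


def strip_tags(text):
    """Remove anything that looks like an HTML tag."""
    return TAG_LIKE.sub("", text)


literal = "Here is the lessthan < this text should appear > the greaterthan"
encoded = "Here is the lessthan &lt; this text should appear &gt; the greaterthan"

print(strip_tags(literal))  # the span between < and > is removed
print(strip_tags(encoded))  # unchanged, the entities do not look like a tag
```

A stripper like this would account for the pattern in the cases above: the literal form loses 'this text should appear', while the encoded form survives to be displayed.
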
>
>
> Those are all the use cases we could come up with to show the different
> ways that &lt;, &gt;, <, and > are used throughout our ckan installation.
>
>
> From this developer's/user's viewpoint, I feel that it would be best if
> ckan stored the literal < and > in the database and then used
> view/controller behavior to translate those values to html encoded
> characters when they are rendered on an html page.
>
> That is not always easy to do, but it would allow us to stop placing html
> encoded characters in our iso xml, which I think is a big no no.
>
> It would also keep html encoded characters out of the database and stop
> them from bleeding out to other views of ckan (like the api json view).
>
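
The proposal above (store the literal characters, encode only at the html boundary) can be sketched with the Python standard library. The function names and the `notes` key are illustrative assumptions, not ckan's actual code:

```python
import html
import json

# What the database would hold under the proposal: literal characters.
stored = "Here is the lessthan < this text should appear > the greaterthan"


def render_for_html(value):
    """Encode only when the value is placed into an html page."""
    return html.escape(value)


def render_for_api(value):
    """JSON needs no html encoding, so the literal characters pass through."""
    return json.dumps({"notes": value})


print(render_for_html(stored))
print(render_for_api(stored))
```

With this split, the html view receives `&lt;` and `&gt;` so the browser displays the full sentence, and the json output carries the plain `<` and `>` with no html entities leaking into it.
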
> Is there something that I am missing or does this seem like a bug/issue
> with ckan?
>
>
> Thank you for your time and knowledge.  They are both greatly appreciated.
>
> Regards,
>
> Nathan
>
>