[CITE-Forum] WFS-T Update syntactic vs. semantic value equivalence
Tsvetan.Penev at avitech.aero
Fri Jul 25 12:42:04 EDT 2014
I am sorry that my previous email was not very clear about what I was talking about. The funny thing is that the XML entity references I used in it were, of course, automatically replaced by their actual characters, which makes that email look silly. But it is funny because it rather makes my point :).
I have now tried to escape the XML entity references by putting them in a <code></code> section, hoping that they will display properly in a web browser in the end.
The question I want to ask is this: are the two expressions "Ce n<code>&apos;</code>est pas Vieux-Port de Montreal!" and "Ce n'est pas Vieux-Port de Montreal!" identical? I say yes.
Please check also the updated text from my last message below. I am looking forward to your comments.
Subject: [CITE-Forum] WFS-T Update syntactic vs. semantic value equivalence
While testing our WFS implementation against the WFS 2.0-r16 test suite, we identified quite a tricky spot in the Transaction part of the test suite. The exact location is org.opengis.cite.iso19142.transaction.Update.updateGMLName. We noticed this behavior after the fix for GitHub issue 8 was made (https://github.com/opengeospatial/ets-wfs20/issues/8).
The test that fails uses a Transaction>Update operation to change the value of the first gml:name property to this string: Ce n<code>&apos;</code>est pas Vieux-Port de Montreal!
Once the update is completed, the fes:ResourceId of the updated feature is returned, and the test suite requests the updated feature from the service and checks whether the property has been updated properly. The problem, however, is that our service reports the updated property back with the apostrophe character as it is (similar to how this message is displayed here in the forum) and not represented with an XML entity reference as <code>&apos;</code>. This causes the test to fail.
Issue 947 (http://cite.opengeospatial.org/issues) addresses another problem which is related to this one (identical to GitHub issue 8). That issue addressed the problem that XPath is used for the comparison of values, which requires the apostrophe in the value to be escaped. As a result, <code>&apos;</code> was introduced, which solved the problem with XPath but leads to the present issue.
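To illustrate the XPath side of this: an apostrophe only clashes with XPath when the string literal itself is delimited by apostrophes. The following is a minimal sketch in Python, not what the test suite does; the helper name xpath_literal and the small un-namespaced document are hypothetical and only serve the illustration.

```python
import xml.etree.ElementTree as ET


def xpath_literal(value):
    """Return an XPath 1.0 string literal for value (hypothetical helper)."""
    if "'" not in value:
        return "'" + value + "'"
    if '"' not in value:
        return '"' + value + '"'
    # Both quote kinds occur: join the apostrophe-free pieces with concat().
    parts = value.split("'")
    return "concat(" + ", \"'\", ".join("'" + p + "'" for p in parts) + ")"


# Simplified stand-in for the feature returned by the service (no namespaces).
doc = ET.fromstring(
    "<feature><name>Ce n&apos;est pas Vieux-Port de Montreal!</name></feature>"
)
value = "Ce n'est pas Vieux-Port de Montreal!"

# ElementTree supports only a subset of XPath; the double-quoted literal
# works there, while the concat() form needs a full XPath 1.0 engine.
matches = doc.findall(".//name[.=" + xpath_literal(value) + "]")
```

With the literal delimited by double quotes, the expression matches the element without any entity reference appearing in the compared value.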
Following the rules of XML, both "Ce n<code>&apos;</code>est pas Vieux-Port de Montreal!" and "Ce n'est pas Vieux-Port de Montreal!" are semantically identical but obviously syntactically different. So what is identical and what is not?
The line of thought goes like this. Should all values which have to be inserted/updated be escaped or not? The logical answer could be yes, because XML has some limitations and some characters must always be escaped so that XML parsers can parse the documents properly. However, there is no guarantee that the values which have to be inserted have not already been escaped. This could cause problems in particular with the "&" symbol, which in XML is escaped as <code>&amp;</code>. Escaping this once more would change it to <code>&amp;amp;</code>, which strictly speaking is identical and requires just a recursive replacement of the entity reference that in the end will return the "&" symbol.
So comparing two XML element/attribute values character by character in their native XML representation, without first replacing all entity references with their actual characters, is not reliable. Here is another example:
"This is ampersand <code>&amp;amp;</code> and apostrophe <code>&apos;</code>!" and "This is ampersand & and apostrophe '!". Are these identical? I believe so. However, comparing them character by character would say they are different.
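The double-escaping case can be tried out with a small Python sketch. It assumes the standard xml.sax.saxutils helpers; the function unescape_fully is a hypothetical name for the recursive replacement described above. Note that an ordinary XML parser resolves only one level of escaping, so the recursion is an extra step on top of normal parsing.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape

# saxutils.unescape handles &amp;, &lt; and &gt; by default; the other two
# predefined XML entities have to be passed in explicitly.
EXTRA_ENTITIES = {"&apos;": "'", "&quot;": '"'}


def unescape_fully(text):
    """Replace entity references repeatedly until nothing changes (hypothetical helper)."""
    previous = None
    while text != previous:
        previous = text
        text = unescape(text, EXTRA_ENTITIES)
    return text


once = escape("&")    # escaped once
twice = escape(once)  # escaped a second time by mistake

# A parser undoes exactly one level of escaping.
parsed_once = ET.fromstring("<v>" + twice + "</v>").text

raw = "This is ampersand &amp;amp; and apostrophe &apos;!"
plain = "This is ampersand & and apostrophe '!"
```

Here once is the single escaped form, twice is the double escaped form, and parsed_once still contains an entity reference as literal text; only unescape_fully(raw) matches plain.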
The only reliable way, in my view, to compare such values is to replace all entity references with the actual characters first and then make the comparison. This gives us a common ground which does not suffer from XML's limitations and is unique. Maybe the test suite should compare values only after all entity references have been replaced by their actual characters.
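As a rough sketch of such a comparison in Python (an assumption for illustration, not how the test suite is implemented): letting an XML parser read both documents replaces the entity references, so the values can be compared afterwards. The wrapper element f and the helper name_value are hypothetical; the GML 3.2 namespace URI is the standard one.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml/3.2"


def name_value(xml_text):
    """Parse the document and return the text of its first gml:name child."""
    root = ET.fromstring(xml_text)
    return root.findtext("{%s}name" % GML_NS)


# Value as sent in the Update request (apostrophe as an entity reference).
sent = (
    '<f xmlns:gml="%s"><gml:name>'
    "Ce n&apos;est pas Vieux-Port de Montreal!"
    "</gml:name></f>" % GML_NS
)

# Value as returned by the service (literal apostrophe).
returned = (
    '<f xmlns:gml="%s"><gml:name>'
    "Ce n'est pas Vieux-Port de Montreal!"
    "</gml:name></f>" % GML_NS
)

same_markup = sent == returned                           # character-by-character
same_value = name_value(sent) == name_value(returned)    # after parsing
```

The serialized documents differ, while the parsed property values are equal.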
What do you think? Does that make sense?