[Requests] Comment on WPS 14-065

Joan Masó joan.maso at uab.cat
Wed Oct 8 05:26:37 EDT 2014


PART A

1. Evaluator:
        Joan Masó, UAB-CREAF, joan.maso at uab.cat

2. Submission: [OpenGIS Project Document Number, Name]
        14-065 candidate OGC WPS 2.0 Interface Standard


PART B
Title: Allow for internal data

1. Requirement: General
In practice, the new WPS has not changed much from the previous version.
One of the main problems in WPS is the relation between processes and the
data to be processed. In fact, figure 1 already reflects this. In some cases,
a WPS server holds several datasets, but it is not possible to know which
data is available to a process. The only way of doing it seems to be to
associate a “married” data service (e.g. WFS or WCS) and assume that, when a
WCS or WFS request is sent to the WPS, it will be intelligent enough to
understand that this is actually its own data and get it without downloading
it. In other cases, a client has data that needs to be processed several
times, but it has no mechanism to send the data once to the WPS server and
reuse it later for several operations. Currently it seems that a WPS service
is supposed to request the data from a download service each and every time
it has to operate on it. Some mechanism to discover and use the internal
data of a process is needed. Additionally, some mechanism for a client to
pre-send data directly to the service is also needed. A data identifier
could be one way of reusing the data.

2. Implementation Specification Section number:  7.3, and more
Add a new data type, alongside ComplexData, LiteralData, etc., called
InternalData that is essentially just an id.
Also add some way of enumerating the available data, in GetCapabilities or
in an independent operation. Instead of trying to fully describe the data,
provide URLs to other services (e.g. WCS, WFS) that can describe it.
Add the capability to assign data identifiers to the results of an
asynchronous request. This will allow an internal output to be reused as an
internal input for another process in the same WPS (an embryo of a process
chain).
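
As an illustration only, the idea could be sketched as follows in Python.
Everything here is hypothetical (the InternalDataStore class, the "data-N"
id format, the example URL); none of it comes from the candidate standard.
It only shows data being registered once, enumerated, and an output id being
reused as the input of another process.

```python
import itertools


class InternalDataStore:
    """Hypothetical server-side registry mapping data ids to stored data."""

    def __init__(self):
        self._items = {}
        self._counter = itertools.count(1)

    def register(self, data, described_by=None):
        """Store data and return a new data id.

        described_by is an optional URL of a service (e.g. WCS, WFS)
        that can describe the data.
        """
        data_id = f"data-{next(self._counter)}"
        self._items[data_id] = {"data": data, "described_by": described_by}
        return data_id

    def list_available(self):
        """Enumerate ids and description URLs (what a listing could expose)."""
        return {k: v["described_by"] for k, v in self._items.items()}

    def resolve(self, data_id):
        """Return the stored data for an id, without any download."""
        return self._items[data_id]["data"]


def execute(store, process, input_id):
    """Run a process on an internal input; register the result as internal data."""
    result = process(store.resolve(input_id))
    return store.register(result)


store = InternalDataStore()
raw_id = store.register([1, 2, 3], described_by="http://example.org/wcs")
doubled_id = execute(store, lambda xs: [2 * x for x in xs], raw_id)
summed_id = execute(store, sum, doubled_id)
print(store.resolve(summed_id))  # 12
```

The second call to execute() uses the id returned by the first one, which is
the "embryo" of a process chain mentioned above.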

3. Criticality: Major

4. Comments/justifications for changes:
This will cover a common use case: a service that has a long list of
prearranged datasets that are available for processing. A common example is
a downstream remote sensing station that offers long series of data ready
for on-demand processing, producing level 2 data from raw data when users
request it.



PART B

Title: Allow for sending internal data
1. Requirement: General
If you allow the previous request, then a natural extension is to allow
people to send data to a WPS, which then becomes available for processing.

2. Implementation Specification Section number:  7
Add a new operation SendInput that allows sending data to a WPS and getting
back a data id.

3. Criticality: Major

4. Comments/justifications for changes:
This will allow for sending data to a WPS and processing it several times
while changing parameters or using it in different processes.



PART B
Title: Allow for data coming from a POST operation (not just LiteralData
URLs).

1. Requirement: General
The current approach allows requesting data to be processed from a different
place by providing a URL, or from a service using KVP in a URL. But what
about data from a POST operation?

2. Implementation Specification Section number: 7.3
Add a new input data type, PostData, that allows for a URL plus a body in
the “call”. This way it would be possible to use WFS with complicated filters
or even request data from another WPS. Please check the latest OWS Context
standard to see one way of encoding this.
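
As a rough sketch only, a PostData input could carry what is needed to build
the POST request the WPS would issue to fetch the data. The class, its field
names, the example URL and the placeholder body below are all assumptions
made for illustration, not an encoding taken from the candidate standard or
from OWS Context. The request is built but not sent.

```python
from dataclasses import dataclass
import urllib.request


@dataclass
class PostData:
    """Hypothetical input reference: a URL plus the body to POST to it."""
    url: str
    body: str
    content_type: str = "text/xml"


def to_request(ref: PostData) -> urllib.request.Request:
    """Build (but do not send) the HTTP POST a WPS would issue for this input."""
    return urllib.request.Request(
        ref.url,
        data=ref.body.encode("utf-8"),
        headers={"Content-Type": ref.content_type},
        method="POST",
    )


# Placeholder body: a real WFS GetFeature request with a filter would go here.
ref = PostData(
    url="http://example.org/wfs",
    body="<GetFeature><!-- complicated filter goes here --></GetFeature>",
)
req = to_request(ref)
print(req.get_method(), req.full_url)
```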

3. Criticality: Major

4. Comments/justifications for changes:
This way it would be possible to use WFS with complicated filters or even
request data from another WPS, introducing a “primitive” form of service
chain that can be extended in the future.



PART B
Title: Data Provenance and metadata

1. Requirement: General
Can we discuss something about data provenance in the specification?

2. Implementation Specification Section number: 7.3
Can we discuss a mechanism to return data and metadata from a process? For
example, can we create a form of data output, ComplexDataWithMetadata, that
has a container for the data and another container for the metadata that
will “travel” together back to the client?

3. Criticality: Major

4. Comments/justifications for changes:
Please take into consideration the discussions in OWS-10 about metadata and
provenance, but in a broader context. It would be good to clarify how
processes can produce data and, at least, metadata at the dataset level. I’m
sorry if I’m not proposing a clearer approach. If the group is willing to
think more about this, let’s do it.



PART B

Title: Complete WPS profiles

1. Requirement: General
The idea of fully describing profiles is VERY appealing but the current
specification says explicitly that no language or encoding is provided but,
in fact, a way of doing it is suggested later. A more solid direction is
needed here.

2. Implementation Specification Section number: 11
I think the specification is VERY close to being able to propose an encoding
for WPS profiles based on the current encoding for describing processes in
DescribeProcess. It could be VERY useful to have it.
IMHO we need:
* An XML language to describe inputs and outputs of an abstract process. We
are so close to this
* An identifier for the abstract process
* A way to link a WPS process to its profile (it might be already there; I
have to check)

3. Criticality: Major

4. Comments/justifications for changes:
I think we do not need to provide a catalogue in this version of the
standard (I think it is even out of scope). Just having XML documents will
help people to point to them, and then clients could look for alternative
processes in the network.
This could also stimulate academia to generate a corpus of “must have”
distributed GIS processes (and describe them) that they can formalize and
teach to their students. Students will be able to test each process in
different implementations from different vendors in the same way they can do
that today with “local” GIS, comparing e.g. ArcGIS, QGIS,...



PART B
Title: WPS KVP Execute

1. Requirement: General
A KVP syntax is not possible in general but it is very attractive. Literal
values and URLs can be easily encoded in a KVP syntax.

2. Implementation Specification Section number: 10.2.3
Add a section for a GET + KVP request, only available for operations without
complex data as input. Parameter identifiers are the keys and parameter
values are the values of the KVP syntax.
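
A minimal sketch of such an encoding, in Python, is given below. The
candidate standard does not define a KVP Execute, so the reserved key names
(service, version, request, identifier), the endpoint and the "geocode"
process are assumptions made only for this example.

```python
from urllib.parse import urlencode

# Assumed reserved keys; an input identifier must not collide with them.
RESERVED_KEYS = ("service", "version", "request", "identifier")


def kvp_execute_url(endpoint, process_id, inputs):
    """Encode an Execute call with literal/URL inputs as a GET + KVP URL."""
    params = {
        "service": "WPS",
        "version": "2.0.0",
        "request": "Execute",
        "identifier": process_id,
    }
    for key, value in inputs.items():
        if key.lower() in RESERVED_KEYS:
            raise ValueError(f"input identifier clashes with reserved key: {key}")
        params[key] = value
    return endpoint + "?" + urlencode(params)


url = kvp_execute_url(
    "http://example.org/wps",
    "geocode",
    {"address": "Carrer de la Vila 1"},
)
print(url)
```

One design point this makes visible: since input identifiers share the key
space with the operation's own parameters, a KVP binding would need a rule
(a prefix, or a list of reserved names as assumed here) to avoid clashes.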

3. Criticality: Minor

4. Comments/justifications for changes:
Even if a general solution is not possible, it opens the door to hundreds of
operations that can be easily encoded in a URL, such as: Where is the nearest
restaurant to my coordinates? Give me the coordinates of this postal
address. Give me the combination of these 2 datasets that are available at
this pair of URLs.
Please note that implementing a POST request is more complicated,
particularly in a web browser. I’m sorry if this was previously discussed.


PART B
Title: DescribeProcess response in an independent requirements class

1. Requirement: General
The DescribeProcess response document could be described in an independent
requirements class without dependencies on the general WPS model. This would
allow the DescribeProcess language to be generalized as a common language to
describe process inputs and outputs that could also be used for desktop
solutions (e.g. to generate help texts or automatic GUI forms).
This could also help metadata tools to report lineage information based on
DescribeProcess descriptions, complemented by end-user input.

2. Implementation Specification Section number: 9.8.2
IMHO we need:
* The WPS 2.0 standard (14-065) document should state clearly that the
DescribeProcess response is an independent requirements class and should
discuss some of the advantages of this for the geospatial community.

3. Criticality: Major

4. Comments/justifications for changes:
According to the candidate WPS 2.0 standard (14-065) document, the
DescribeProcess response is already an independent requirements class
because it is declared in its own description format. However, from my point
of view this is not clearly stated, so the document should emphasize this
point and underline some of the advantages of doing it this way. Having the
DescribeProcess response clearly separated from the rest of the WPS model
can increase its use for describing geospatial operations. As well as
helping to reduce some interoperability problems, this can help to extract
provenance information. DescribeProcess response documents are very helpful
for capturing detailed process descriptions: the document describes the
processes, the sources and the data outputs involved. In this context, if
the model allows us to relate the parameters of the process not only to the
data but also to the metadata, we could capture the provenance of the data
output.




