Support #3060

FR: Widespread "Invalid byte 2 of 3-byte UTF-8 sequence" issue affecting Atom-based INSPIRE Download Services

Added by Angelo Quaglia over 2 years ago. Updated over 2 years ago.

Status: Feedback
Start date: 18 Dec 2017
Priority: Normal
Due date:
Assignee: Angelo Quaglia
% Done: 0%
Category: Harvesting results
Target version: -
Submitting Organisation: FR
Knowledge-Base relevant?:
Proactive: Yes
Keyword #1:
Country: FR - France
Keyword #2:
Originating UI:
Keyword #3:

Description

Dear Marie, Thierry,

Lately, I have been seeing many occurrences of the following error:

"Invalid byte 2 of 3-byte UTF-8 sequence"

It is reported for Atom-based INSPIRE Download Services such as the following one:

http://atom.geo-ide.developpement-durable.gouv.fr/atomMetadata/GetResourceDescription?id=fr-120066022-orphan-7e4a0886-d92c-4b9a-a480-f125b81ceabf

The XML prolog declares UTF-8 but the text encoding is actually different.
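A mismatch of this kind can be checked without an XML editor. The following is a minimal sketch, assuming the raw bytes of the response are already available; the function name `check_declared_encoding` and the sample document are illustrative and not part of the service. One plausible cause, though not confirmed in this ticket, is ISO-8859-1 output: the byte 0xE9 ("é" in Latin-1) looks like the lead byte of a 3-byte UTF-8 sequence, so a parser then rejects the byte that follows it.

```python
import re

def check_declared_encoding(raw: bytes):
    """Return None if `raw` decodes with the encoding named in its XML prolog,
    otherwise (declared encoding, byte offset of the failure, reason)."""
    # Read the encoding pseudo-attribute from the prolog; default to UTF-8.
    m = re.match(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']', raw)
    declared = m.group(1).decode("ascii") if m else "utf-8"
    try:
        raw.decode(declared)
    except UnicodeDecodeError as exc:
        return declared, exc.start, exc.reason
    return None

# Hypothetical sample: the same document serialised in two encodings.
sample = '<?xml version="1.0" encoding="UTF-8"?><title>prévention</title>'
good = sample.encode("utf-8")
bad = sample.encode("latin-1")  # prolog still claims UTF-8

print(check_declared_encoding(good))
print(check_declared_encoding(bad))
```

For the Latin-1 variant, the reported offset points at the 0xE9 byte, which is usually enough to locate the offending character in the feed.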

Altova confirms the issue (see the picture below).

Could you please ask the Service Provider(s) to have a look?

Best regards,

Angelo

History

#1 Updated by Thierry Vilmus over 2 years ago

Dear Angelo,

I have informed Robert RIVIERE and Philippe BELAIS from Geo-IDE. They will investigate the matter.

Best regards,

Thierry

 

#2 Updated by Angelo Quaglia over 2 years ago

  • Status changed from Assigned to Feedback

#3 Updated by Angelo Quaglia over 2 years ago

Dear Thierry,

many thanks for the prompt action.

Best regards,

Angelo

 

#4 Updated by Robert Riviere over 2 years ago

Hello,

We have fixed this bug and scheduled the patch for deployment in the first week of January.

Greetings,

Robert Rivière

#5 Updated by Angelo Quaglia over 2 years ago

Dear Robert,

many thanks and best regards,

Angelo

#6 Updated by Robert Riviere over 2 years ago

Hello,

best wishes for 2018 to all of you.

 

I'm pleased to announce that the UTF-8 issue has been fixed on http://atom.geo-ide.developpement-durable.gouv.fr/

Greetings,

Robert Rivière

#7 Updated by Angelo Quaglia over 2 years ago

Dear Thierry, 

best wishes to you, too, many thanks.

I have just tested this URL:

http://atom.geo-ide.developpement-durable.gouv.fr/atomMetadata/GetResourceDescription?id=fr-120066022-orphan-7e4a0886-d92c-4b9a-a480-f125b81ceabf

and the UTF-8 issue is indeed solved.

 

I see, however, a new serious problem that prevents the INSPIRE Geoportal from interpreting the feed as an INSPIRE Download Service.

The link to the ISO 19139 metadata has a misspelt relation: "describedBy" instead of "describedby"

  <link href="http://catalogue.geo-ide.developpement-durable.gouv.fr/catalogue/srv/fre/xml_iso19139?uuid=fr-120066022-ldd-fee84eeb-3bdb-470b-b8a1-dc0931776baa"
         rel="describedBy"
         type="application/vnd.iso.19139+xml"/>
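A check for this kind of casing problem could be sketched as follows. This is only an illustration, assuming that the consumer compares `rel` values case-sensitively, as the behaviour described above suggests; `KNOWN_RELS` is a small hand-picked subset of the IANA registry, and `miscased_rels` and the `example.org` URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Illustrative subset of registered link relations (all lowercase in the registry).
KNOWN_RELS = {"alternate", "describedby", "enclosure", "related", "self", "up", "via"}

def miscased_rels(feed_xml: str):
    """Return (found, expected) pairs for rel values that match a known
    relation only when letter case is ignored."""
    root = ET.fromstring(feed_xml)
    issues = []
    for link in root.iter(ATOM + "link"):
        rel = link.get("rel")
        if rel and rel not in KNOWN_RELS and rel.lower() in KNOWN_RELS:
            issues.append((rel, rel.lower()))
    return issues

# Hypothetical feed fragment reproducing the problem.
feed = (
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    '<link href="http://example.org/md.xml" rel="describedBy" '
    'type="application/vnd.iso.19139+xml"/>'
    '<link href="http://example.org/feed.xml" rel="self"/>'
    '</feed>'
)
print(miscased_rels(feed))
```

Run against the fragment above, the sketch reports the `describedBy` link and leaves the correctly spelt `self` link alone.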

 

 

See https://www.iana.org/assignments/link-relations/link-relations.xhtml

https://validator.w3.org/feed/check.cgi?url=http%3A%2F%2Fatom.geo-ide.developpement-durable.gouv.fr%2FatomMetadata%2FGetResourceDescription%3Fid%3Dfr-120066022-orphan-7e4a0886-d92c-4b9a-a480-f125b81ceabf

Best regards,

Angelo

#8 Updated by Robert Riviere over 2 years ago

Okay, we'll work on a fix asap.

BTW, do we also need to consider the other messages displayed by the validator?

> Use of unknown namespace: http://inspire.ec.europa.eu/schemas/inspire_dls/1.0

> Missing textual content </entry>

About the latter, I actually can't see where the problem is.

Greetings

Robert

#9 Updated by Angelo Quaglia over 2 years ago

Great, many thanks.

The first recommendation is expected:

Use of unknown namespace: http://inspire.ec.europa.eu/schemas/inspire_dls/1.0

 

The second one is not.

Actually, I notice that you are missing a few elements in the top feed.

You might want to look at this online example, it is old but still useful:

http://inspire-geoportal.ec.europa.eu/demos/ccm/codeview.html

 

Best regards,

Angelo

#10 Updated by Angelo Quaglia over 2 years ago

Dear Robert,

In order to get rid of the second recommendation, you need to add a summary element inside each feed <entry> element.

It is not mandatory, but if you add it, you may want to follow this other recommendation from the RFC:

4.2.13. The "atom:summary" Element

The "atom:summary" element is a Text construct that conveys a short summary, abstract, or excerpt of an entry.

atomSummary = element atom:summary { atomTextConstruct }

It is not advisable for the atom:summary element to duplicate atom:title or atom:content because Atom Processors might assume there is a useful summary when there is none.

For example, in the feed published here:

http://inspire-geoportal.ec.europa.eu/demos/ccm/codeview.html

I used the summary element to provide a user-friendly HTML representation of the entry.

However, plain text will suffice to get rid of the message, for example:

<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:georss="http://www.georss.org/georss" xmlns:gml="http://www.opengis.net/gml" xmlns:inspire_dls="http://inspire.ec.europa.eu/schemas/inspire_dls/1.0">
  <title>Plan de prévention des risques naturels (PPRN) de la commune de PRADES</title>
...

  <entry>
    <title>Téléchargement de Plan de prévention des risques naturels (PPRN) de la commune de PRADES et des documents associés</title>
    <link href="http://atom.geo-ide.developpement-durable.gouv.fr/atomArchive/GetResource?id=fr-120066022-orphan-7e4a0886-d92c-4b9a-a480-f125b81ceabf&amp;dataType=datasetAggregate" rel="alternate" type="application/x-tab" hreflang="fr" title="Plan de prévention des risques naturels (PPRN) de la commune de PRADES" />
    <id>http://catalogue.geo-ide.developpement-durable.gouv.fr/fr-120066022-ldd-fee84eeb-3bdb-470b-b8a1-dc0931776baa</id>
    <updated>2015-03-26T23:00:00.000Z</updated>
    <category term="http://www.opengis.net/def/crs/EPSG/0/2154" label="RGF93 - Lambert 93" />
    <summary>Some useful summary (abstract), not the same as the title</summary>
  </entry>
</feed>
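To find the entries that still trigger the message, something along these lines may help. It is a rough sketch, not the validator's actual logic: it only looks for a non-empty `summary` element in each `entry`, and the function name `entries_missing_summary` and the sample feed are made up for illustration.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

def entries_missing_summary(feed_xml: str):
    """Return the titles of Atom entries that have no summary element,
    or whose summary contains no text at all."""
    root = ET.fromstring(feed_xml)
    missing = []
    for entry in root.iter(ATOM + "entry"):
        summary = entry.find(ATOM + "summary")
        # itertext() also covers summaries that wrap their text in child markup.
        text = "".join(summary.itertext()).strip() if summary is not None else ""
        if not text:
            missing.append(entry.findtext(ATOM + "title", default="(untitled)"))
    return missing

# Hypothetical feed: the first entry has a summary, the second does not.
sample_feed = (
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    '<entry><title>A</title><summary>Short abstract of A</summary></entry>'
    '<entry><title>B</title></entry>'
    '</feed>'
)
print(entries_missing_summary(sample_feed))
```

An empty result would mean every entry carries some summary text; whether that text usefully differs from the title, as the RFC advises, still needs a manual look.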

Best regards,

Angelo

 

 

 
