what encoding does system.xml.xmldocument.save(string path) use to save the xml document if there is


Daniel

What encoding does System.Xml.XmlDocument.Save(string path) use to save the
XML document if there is no <?xml... at the front of the XML document?
 
Daniel said:
What encoding does System.Xml.XmlDocument.Save(string path) use to save the
XML document if there is no <?xml... at the front of the XML document?

The docs aren't clear, but a quick look with Reflector shows that it
uses UTF-8 (which is what I'd expect).
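For anyone who would rather check than read disassembly, here is a minimal sketch of one way to try it. The class and method names (SaveEncodingCheck, SaveWithoutDeclaration) are made up for illustration; the only framework calls are XmlDocument.LoadXml, XmlDocument.Save(string) and the usual System.IO helpers. It saves a declaration-less document containing U+00E9 and dumps the bytes: that character is the single byte E9 in Latin-1 and the pair C3 A9 in UTF-8, so the dump should make the encoding obvious.

```csharp
using System;
using System.IO;
using System.Xml;

static class SaveEncodingCheck
{
    // Saves a document that has no <?xml ...?> declaration with
    // XmlDocument.Save(string) and returns the raw bytes written to disk.
    public static byte[] SaveWithoutDeclaration(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        string path = Path.GetTempFileName();
        try
        {
            doc.Save(path);
            return File.ReadAllBytes(path);
        }
        finally
        {
            File.Delete(path);
        }
    }

    static void Main()
    {
        // U+00E9 is E9 in Latin-1 but C3 A9 in UTF-8.
        byte[] bytes = SaveWithoutDeclaration("<greeting>caf\u00e9</greeting>");
        Console.WriteLine(BitConverter.ToString(bytes));
    }
}
```

If the hex dump contains C3-A9, the file was written as UTF-8. Whether a byte order mark (EF-BB-BF) appears at the start is worth checking on your own runtime rather than assuming.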
 
What encoding does System.Xml.XmlDocument.Save(string path) use to save the
XML document if there is no <?xml... at the front of the XML document?
<?xml... is mandatory for valid XML.
And is it really that difficult to just try?
 
Mihai N. said:
<?xml... is mandatory for valid XML.
And is it really that difficult to just try?

Well, it's mandatory for a *valid* document, but then you also
need to have a DTD to be valid, and most XML documents I've seen
don't...

You *don't* need to have a declaration in order to be well-formed,
which is all that most XML documents are, IME.
 
Well, it's mandatory for a *valid* document, but then you also
need to have a DTD to be valid, and most XML documents I've seen
don't...

You *don't* need to have a declaration in order to be well-formed,
which is all that most XML documents are, IME.

When in doubt, go to the standard:
http://www.w3.org/TR/2006/REC-xml11-20060816/#sec-well-formed

"A data object is an XML document if it is well-formed, as defined in this
specification. In addition, the XML document is valid if it meets certain
further constraints."

[Definition: A textual object is a well-formed XML document if:]

1. Taken as a whole, it matches the production labeled document.
2. It meets all the well-formedness constraints given in this
specification.
3. Each of the parsed entities which is referenced directly or
indirectly within the document is well-formed.

document ::= ( prolog element Misc* ) - ( Char* RestrictedChar Char* )
prolog ::= XMLDecl Misc* (doctypedecl Misc*)?
XMLDecl ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'

So, well-formed => "matches the production labeled document" =>
prolog => XMLDecl => '<?xml'

No '<?xml', not well-formed.
QED.
 
Mihai N. said:
Well, it's mandatory for a *valid* document, but then you also
need to have a DTD to be valid, and most XML documents I've seen
don't...

You *don't* need to have a declaration in order to be well-formed,
which is all that most XML documents are, IME.

When in doubt, go to the standard:
http://www.w3.org/TR/2006/REC-xml11-20060816/#sec-well-formed

"A data object is an XML document if it is well-formed, as defined in this
specification. In addition, the XML document is valid if it meets certain
further constraints."

[Definition: A textual object is a well-formed XML document if:]

1. Taken as a whole, it matches the production labeled document.
2. It meets all the well-formedness constraints given in this
specification.
3. Each of the parsed entities which is referenced directly or
indirectly within the document is well-formed.

document ::= ( prolog element Misc* ) - ( Char* RestrictedChar Char* )
prolog ::= XMLDecl Misc* (doctypedecl Misc*)?
XMLDecl ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'

So, well-formed => "matches the production labeled document" =>
prolog => XMLDecl => '<?xml'

No '<?xml', not well-formed.
QED.

That's for XML 1.1. Look at the equivalent bit of the XML 1.0 spec at
http://www.w3.org/TR/2006/REC-xml-20060816/#NT-prolog
and you'll find

prolog ::= XMLDecl? Misc* (doctypedecl Misc*)?

The difference is between XMLDecl? and XMLDecl.

And indeed if you look above the bit you've quoted, you'll find:

<quote>
the following is an XML 1.0 document because it does not have an XML
declaration:

<greeting>Hello, world!</greeting>
</quote>

Now, I've seen *far* more XML 1.0 documents than XML 1.1 documents -
haven't you?
 
Now, I've seen *far* more XML 1.0 documents than XML 1.1 documents -
haven't you?

If you count XMLs without <?xml as 1.0, and not as mistakes, then yes :-)
But you are right, for 1.0 <?xml is not mandatory.

But in this case the standard is clear: if there is no <?xml,
or there is an <?xml without the encoding specified, then the encoding
is UTF-8 (absent a byte order mark that says otherwise).
 
Mihai N. said:
If you count XMLs without <?xml as 1.0, and not as mistakes, then yes :-)
But you are right, for 1.0 <?xml is not mandatory.

But in this case the standard is clear: if there is no <?xml,
or there is an <?xml without the encoding specified, then the encoding
is UTF-8

In terms of reading, yes. Fortunately XmlDocument.Save obeys this as
well, and writes it out properly :)
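The reading side can be tried the same way. This is only a sketch: ReadWithoutDeclaration and ReadGreeting are hypothetical names, and the framework calls used are XmlDocument.Load(Stream), MemoryStream and UTF8Encoding. It feeds the parser raw UTF-8 bytes with no declaration and no byte order mark, then checks that the non-ASCII character comes back intact.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

static class ReadWithoutDeclaration
{
    // Parses raw bytes (no <?xml ...?> declaration) and returns the
    // text content of the root element.
    public static string ReadGreeting(byte[] rawXml)
    {
        var doc = new XmlDocument();
        using (var stream = new MemoryStream(rawXml))
        {
            doc.Load(stream);
        }
        return doc.DocumentElement.InnerText;
    }

    static void Main()
    {
        // UTF8Encoding(false) produces UTF-8 bytes without a byte order mark.
        byte[] utf8 = new UTF8Encoding(false).GetBytes("<greeting>caf\u00e9</greeting>");
        Console.WriteLine(ReadGreeting(utf8) == "caf\u00e9");
    }
}
```

Passing the bytes through a Stream rather than a string matters here: LoadXml takes an already-decoded string, so it would not exercise the parser's encoding detection at all.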
 
