/****************************************************************************
**
** Documentation on the xml module
**
** Copyright (C) 2005-2008 Trolltech ASA. All rights reserved.
**
** This file is part of the Qt GUI Toolkit.
**
** This file may be used under the terms of the GNU General
** Public License versions 2.0 or 3.0 as published by the Free
** Software Foundation and appearing in the files LICENSE.GPL2
** and LICENSE.GPL3 included in the packaging of this file.
** Alternatively you may (at your option) use any later version
** of the GNU General Public License if such license has been
** publicly approved by Trolltech ASA (or its successors, if any)
** and the KDE Free Qt Foundation.
**
** Please review the following information to ensure GNU General
** Public Licensing requirements will be met:
** http://trolltech.com/products/qt/licenses/licensing/opensource/.
** If you are unsure which license is appropriate for your use, please
** review the following information:
** http://trolltech.com/products/qt/licenses/licensing/licensingoverview
** or contact the sales department at sales@trolltech.com.
**
** This file may be used under the terms of the Q Public License as
** defined by Trolltech ASA and appearing in the file LICENSE.QPL
** included in the packaging of this file. Licensees holding valid Qt
** Commercial licenses may use this file in accordance with the Qt
** Commercial License Agreement provided with the Software.
**
** This file is provided "AS IS" with NO WARRANTY OF ANY KIND,
** INCLUDING THE WARRANTIES OF DESIGN, MERCHANTABILITY AND FITNESS FOR
** A PARTICULAR PURPOSE. Trolltech reserves all rights not granted
** herein.
**
**********************************************************************/
/*! \page xml.html
\title XML Module
\if defined(commercial)
This module is part of the \link commercialeditions.html Qt Enterprise Edition\endlink.
\endif
\tableofcontents
\target overview
\section1 Overview of the XML architecture in Qt
The XML module provides a well-formed XML parser using the SAX2
(Simple API for XML) interface plus an implementation of the DOM Level
2 (Document Object Model).
SAX is an event-based standard interface for XML parsers.
The Qt interface follows the design of the SAX2 Java implementation.
Its naming scheme was adapted to fit the Qt naming conventions.
Details on SAX2 can be found at
\link http://www.saxproject.org http://www.saxproject.org \endlink.
Support for SAX2 filters and the reader factory are under
development. The Qt implementation does not include the SAX1
compatibility classes present in the Java interface.
For an introduction to Qt's SAX2 classes see
"\link #sax2 The Qt SAX2 classes \endlink".
DOM Level 2 is a W3C Recommendation for XML interfaces that maps the
constituents of an XML document to a tree structure. Details and the
specification of DOM Level 2 can be found at
\link http://www.w3.org/DOM/ http://www.w3.org/DOM/ \endlink.
More information about the DOM classes in Qt is provided in the
\link #dom Qt DOM classes \endlink.
Qt provides the following XML related classes:
\table
\header \i Class \i Short description
\row \i \l QDomAttr
\i Represents one attribute of a QDomElement
\row \i \l QDomCDATASection
\i Represents an XML CDATA section
\row \i \l QDomCharacterData
\i Represents a generic string in the DOM
\row \i \l QDomComment
\i Represents an XML comment
\row \i \l QDomDocument
\i The representation of an XML document
\row \i \l QDomDocumentFragment
\i Tree of QDomNodes which is usually not a complete QDomDocument
\row \i \l QDomDocumentType
\i The representation of the DTD in the document tree
\row \i \l QDomElement
\i Represents one element in the DOM tree
\row \i \l QDomEntity
\i Represents an XML entity
\row \i \l QDomEntityReference
\i Represents an XML entity reference
\row \i \l QDomImplementation
\i Information about the features of the DOM implementation
\row \i \l QDomNamedNodeMap
\i Collection of nodes that can be accessed by name
\row \i \l QDomNode
\i The base class for all nodes of the DOM tree
\row \i \l QDomNodeList
\i List of QDomNode objects
\row \i \l QDomNotation
\i Represents an XML notation
\row \i \l QDomProcessingInstruction
\i Represents an XML processing instruction
\row \i \l QDomText
\i Represents textual data in the parsed XML document
\row \i \l QXmlAttributes
\i XML attributes
\row \i \l QXmlContentHandler
\i Interface to report logical content of XML data
\row \i \l QXmlDeclHandler
\i Interface to report declaration content of XML data
\row \i \l QXmlDefaultHandler
\i Default implementation of all XML handler classes
\row \i \l QXmlDTDHandler
\i Interface to report DTD content of XML data
\row \i \l QXmlEntityResolver
\i Interface to resolve external entities contained in XML data
\row \i \l QXmlErrorHandler
\i Interface to report errors in XML data
\row \i \l QXmlInputSource
\i The input data for the QXmlReader subclasses
\row \i \l QXmlLexicalHandler
\i Interface to report lexical content of XML data
\row \i \l QXmlLocator
\i Provides the XML handler classes with information about the current parsing position
\row \i \l QXmlNamespaceSupport
\i Helper class for XML readers which want to include namespace support
\row \i \l QXmlParseException
\i Used to report errors with the QXmlErrorHandler interface
\row \i \l QXmlReader
\i Interface for XML readers (i.e. for SAX2 parsers)
\row \i \l QXmlSimpleReader
\i Implementation of a simple XML reader (a SAX2 parser)
\endtable
\target sax2
\section1 The Qt SAX2 classes
\target sax2Intro
\section2 Introduction to SAX2
The SAX2 interface is an event-driven mechanism to provide the user with
document information. An "event" in this context means something
reported by the parser, for example that it has encountered a start tag
or an end tag.
To make it less abstract consider the following example:
\code
<quote>A quotation.</quote>
\endcode
Whilst reading the above document, the reader (as a SAX2 parser is
usually called) triggers three events:
\list 1
\i A start tag occurs (\c{<quote>}).
\i Character data (i.e. text) is found, "A quotation.".
\i An end tag is parsed (\c{</quote>}).
\endlist
Each time such an event occurs the parser reports it; you can set up
event handlers to respond to these events.
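The event sequence above can be reproduced with a few lines of standard
C++. The following is only a sketch under simplifying assumptions: the
names \c ToyHandler, \c EventLog and \c scan() are invented for this
illustration and are not part of Qt, and the scanner understands only
plain start tags, end tags and text.

```cpp
#include <string>
#include <vector>

// Hypothetical handler base class (not a Qt class): every callback
// does nothing by default, so a subclass reimplements only what it needs.
struct ToyHandler {
    virtual ~ToyHandler() {}
    virtual void startElement(const std::string &) {}
    virtual void characters(const std::string &) {}
    virtual void endElement(const std::string &) {}
};

// Records each reported event as one line of text.
struct EventLog : ToyHandler {
    std::vector<std::string> events;
    void startElement(const std::string &name) override { events.push_back("start:" + name); }
    void characters(const std::string &text) override { events.push_back("chars:" + text); }
    void endElement(const std::string &name) override { events.push_back("end:" + name); }
};

// Toy scanner: understands only plain start tags, end tags and text.
// Attributes, comments, entities and error reporting are left out.
void scan(const std::string &xml, ToyHandler &handler)
{
    std::string::size_type pos = 0;
    while (pos < xml.size()) {
        if (xml[pos] == '<') {
            std::string::size_type close = xml.find('>', pos);
            if (close == std::string::npos)
                return; // unterminated tag: stop silently in this sketch
            std::string tag = xml.substr(pos + 1, close - pos - 1);
            if (!tag.empty() && tag[0] == '/')
                handler.endElement(tag.substr(1));
            else
                handler.startElement(tag);
            pos = close + 1;
        } else {
            std::string::size_type next = xml.find('<', pos);
            if (next == std::string::npos)
                next = xml.size();
            handler.characters(xml.substr(pos, next - pos));
            pos = next;
        }
    }
}
```

Passing the quotation document and an \c EventLog to \c scan() records
one start event, one character-data event and one end event, in that
order. Nothing is kept except what the handler itself chooses to store.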
Whilst this is a fast and simple approach to reading XML documents,
manipulation is difficult because the data is not stored; it is simply
handled and discarded serially. The \link #dom DOM interface
\endlink reads in and stores the whole document in a tree structure;
this takes more memory, but makes it easier to manipulate the
document's structure.
The Qt XML module provides an abstract class, \l QXmlReader, that
defines the interface for potential SAX2 readers. Qt includes a reader
implementation, \l QXmlSimpleReader, that is easy to adapt through
subclassing.
The reader reports parsing events through special handler classes:
\table
\header \i Handler class \i Description
\row \i \l QXmlContentHandler
\i Reports events related to the content of a document (e.g. the start tag
or characters).
\row \i \l QXmlDTDHandler
\i Reports events related to the DTD (e.g. notation declarations).
\row \i \l QXmlErrorHandler
\i Reports errors or warnings that occurred during parsing.
\row \i \l QXmlEntityResolver
\i Reports external entities during parsing and allows users to resolve
external entities themselves instead of leaving it to the reader.
\row \i \l QXmlDeclHandler
\i Reports further DTD related events (e.g. attribute declarations).
\row \i \l QXmlLexicalHandler
\i Reports events related to the lexical structure of the
document (the beginning of the DTD, comments etc.).
\endtable
These classes are abstract classes describing the interface. The \l
QXmlDefaultHandler class provides a "do nothing" default
implementation for all of them. Therefore users only need to reimplement
the QXmlDefaultHandler functions they are interested in.
To read input XML data a special class \l QXmlInputSource is used.
Apart from those already mentioned, the following SAX2 support classes
provide additional useful functionality:
\table
\header \i Class \i Description
\row \i \l QXmlAttributes
\i Used to pass attributes in a start element event.
\row \i \l QXmlLocator
\i Used to obtain the actual parsing position of an event.
\row \i \l QXmlNamespaceSupport
\i Used to implement \link xml.html#namespaces namespace\endlink
support for a reader. Note that namespaces do not change the
parsing behavior. They are only reported through the handler.
\endtable
\target sax2Features
\section2 Features
The behaviour of an XML reader depends on its support for certain
optional features. For example, a reader may have the feature "report
attributes used for \link xml.html#namespaces namespace\endlink
declarations and prefixes along with the local name of a tag". Like
every other feature this has a unique name represented by a URI: it is
called \e http://xml.org/sax/features/namespace-prefixes.
The Qt SAX2 implementation can report whether the reader has
particular functionality using the \l QXmlReader::hasFeature()
function. Available features can be tested with QXmlReader::feature(),
and switched on or off using \l QXmlReader::setFeature().
Consider the example
\code
<document xmlns:book = 'http://trolltech.com/fnord/book/'
xmlns = 'http://trolltech.com/fnord/' >
\endcode
A reader that does not support the \e
http://xml.org/sax/features/namespace-prefixes feature would report
the element name \e document but not its attributes \e xmlns:book and
\e xmlns with their values. A reader with the feature \e
http://xml.org/sax/features/namespace-prefixes reports the namespace
attributes if the \link QXmlReader::feature() feature\endlink is
switched on.
Other features include \e http://xml.org/sax/features/namespaces
(namespace processing, implies \e
http://xml.org/sax/features/namespace-prefixes) and \e
http://xml.org/sax/features/validation (the ability to report
validation errors).
Whilst SAX2 leaves it to the user to define and implement whatever
features are required, support for \e
http://xml.org/sax/features/namespaces (and thus \e
http://xml.org/sax/features/namespace-prefixes) is mandatory.
The \l QXmlSimpleReader implementation of \l QXmlReader
supports both, and can do namespace processing.
\l QXmlSimpleReader is not validating, so it
does not support \e http://xml.org/sax/features/validation.
\target sax2Namespaces
\section2 Namespace support via features
As we have seen in the \link #sax2Features previous section\endlink
we can configure the behavior of the reader when it comes to namespace
processing. This is done by setting and unsetting the
\e http://xml.org/sax/features/namespaces and
\e http://xml.org/sax/features/namespace-prefixes features.
They influence the reporting behavior in the following way:
\list 1
\i Namespace prefixes and local parts of elements and attributes can
be reported.
\i The qualified names of elements and attributes are reported.
\i \l QXmlContentHandler::startPrefixMapping() and \l
QXmlContentHandler::endPrefixMapping() are called by the reader.
\i Attributes that declare namespaces (i.e. the attribute \e xmlns and
attributes starting with \e{xmlns:}) are reported.
\endlist
Consider the following element:
\code
<author xmlns:fnord = 'http://trolltech.com/fnord/'
title="Ms"
fnord:title="Goddess"
name="Eris Kallisti"/>
\endcode
With \e http://xml.org/sax/features/namespace-prefixes set to TRUE
the reader will report four attributes; but with the \e
namespace-prefixes feature set to FALSE only three, with the \e
xmlns:fnord attribute defining a namespace being "invisible" to the
reader.
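The filtering described above can be sketched in standard C++. The
helper names \c isNamespaceDeclaration() and \c reportedAttributes() are
assumptions made for this illustration; they are not Qt functions, and
the sketch works on attribute names only.

```cpp
#include <string>
#include <vector>

// True for attributes that declare a namespace: "xmlns" itself and
// any name starting with "xmlns:".
bool isNamespaceDeclaration(const std::string &qName)
{
    return qName == "xmlns" || qName.compare(0, 6, "xmlns:") == 0;
}

// Hypothetical helper: the attribute names that would be reported,
// depending on the namespace-prefixes setting.
std::vector<std::string> reportedAttributes(const std::vector<std::string> &all,
                                            bool namespacePrefixes)
{
    std::vector<std::string> reported;
    for (std::vector<std::string>::size_type i = 0; i < all.size(); ++i) {
        // Namespace declarations pass through only when the feature is on.
        if (namespacePrefixes || !isNamespaceDeclaration(all[i]))
            reported.push_back(all[i]);
    }
    return reported;
}
```

For the \e author element, the input is the four names \e xmlns:fnord,
\e title, \e fnord:title and \e name. With the flag set, all four are
returned; with it cleared, \e xmlns:fnord is dropped.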
The \e http://xml.org/sax/features/namespaces feature is responsible
for reporting local names, namespace prefixes and URIs. With \e
http://xml.org/sax/features/namespaces set to TRUE the parser will
report \e title as the local name of the \e fnord:title attribute, \e
fnord being the namespace prefix and \e http://trolltech.com/fnord/ as
the namespace URI. When \e http://xml.org/sax/features/namespaces is
FALSE none of them are reported.
In the current implementation the Qt XML classes follow the definition
that the prefix \e xmlns itself isn't associated with any namespace at all
(see \link http://www.w3.org/TR/1999/REC-xml-names-19990114/#ns-using
http://www.w3.org/TR/1999/REC-xml-names-19990114/#ns-using \endlink).
Therefore even with \e http://xml.org/sax/features/namespaces and
\e http://xml.org/sax/features/namespace-prefixes both set to TRUE
the reader won't return a local name, a namespace prefix or
a namespace URI for \e xmlns:fnord.
This might be changed in the future following the W3C suggestion
\link http://www.w3.org/2000/xmlns/ http://www.w3.org/2000/xmlns/ \endlink
to associate \e xmlns with the namespace \e http://www.w3.org/2000/xmlns.
As the SAX2 standard suggests, \l QXmlSimpleReader defaults to having
\e http://xml.org/sax/features/namespaces set to TRUE and
\e http://xml.org/sax/features/namespace-prefixes set to FALSE.
When changing this behavior using \l QXmlSimpleReader::setFeature()
note that the combination of both features set to
FALSE is illegal.
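The effect of each combination of the two features, as implemented by
\l QXmlSimpleReader, can be encoded in a small lookup function. This is
a sketch only: the \c Reporting struct and \c reportingFor() are
invented names, and two of the encoded answers are behavior that SAX
itself leaves unspecified.

```cpp
// What gets reported for one combination of the two features.
// The field names are made up for this sketch.
struct Reporting {
    bool legal;
    bool prefixAndLocalPart;
    bool qualifiedNames;
    bool prefixMapping;
    bool xmlnsAttributes;
};

Reporting reportingFor(bool namespaces, bool namespacePrefixes)
{
    Reporting r = { true, false, false, false, false };
    if (!namespaces && !namespacePrefixes) {
        r.legal = false;                  // both features off is illegal
        return r;
    }
    r.prefixAndLocalPart = namespaces;    // needs namespace processing
    r.qualifiedNames = true;              // reported in every legal combination
    r.prefixMapping = namespaces;         // start/endPrefixMapping calls
    r.xmlnsAttributes = namespacePrefixes;
    return r;
}
```

A caller would check \c legal first and only then look at the other
fields.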
For a practical demonstration of how the two features affect the
output of the reader run the \link tagreader-with-features-example.html
tagreader with features example. \endlink
\target sax2NamespacesSummary
\section3 Summary
\l QXmlSimpleReader implements the following behavior:
\table
\header \i (namespaces, namespace-prefixes)
\i Namespace prefix and local part
\i Qualified names
\i Prefix mapping
\i xmlns attributes
\row \i (TRUE, FALSE) \i Yes \i Yes* \i Yes \i No
\row \i (TRUE, TRUE) \i Yes \i Yes \i Yes \i Yes
\row \i (FALSE, TRUE) \i No* \i Yes \i No* \i Yes
\row \i (FALSE, FALSE) \i41 Illegal
\endtable
<sup>*</sup> The behavior of these entries is not specified by SAX.
\target sax2Properties
\section2 Properties
Properties are a more general concept. They have a unique name,
represented as a URI, but their value is \c void*. Thus nearly
anything can be used as a property value. This concept involves some
danger, though: there is no means of ensuring type-safety; the user
must take care that they pass the right type. Properties are
useful if a reader supports special handler classes.
The URIs used for features and properties often look like URLs, e.g.
\c http://xml.org/sax/features/namespaces. This does not mean that the
data required is at this address. It is simply a way of defining
unique names.
Anyone can define and use new SAX2 properties for their readers.
Property support is not mandatory.
To set or query properties the following functions are provided: \l
QXmlReader::setProperty(), \l QXmlReader::property() and \l
QXmlReader::hasProperty().
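The type-safety caveat is easy to see in a toy model. The class below is
not the Qt API; \c ToyProperties, \c readIntProperty() and the property
names used with it are assumptions made for this sketch. It shows only
that a \c void* value must be cast back by a caller who already knows
its real type.

```cpp
#include <map>
#include <string>

// Toy property store, not the Qt API: values are untyped pointers
// keyed by a URI-style name.
class ToyProperties {
public:
    bool hasProperty(const std::string &name) const
    {
        return values.count(name) != 0;
    }
    void setProperty(const std::string &name, void *value)
    {
        values[name] = value;
    }
    void *property(const std::string &name) const
    {
        std::map<std::string, void *>::const_iterator it = values.find(name);
        if (it == values.end())
            return 0;
        return it->second;
    }
private:
    std::map<std::string, void *> values;
};

// The caller must know that this property points to an int; nothing
// checks the cast, which is exactly the danger described above.
int readIntProperty(const ToyProperties &props, const std::string &name, int fallback)
{
    void *raw = props.property(name);
    if (!raw)
        return fallback;
    return *static_cast<int *>(raw);
}
```

If the stored pointer referred to anything other than an \c int, the
cast in \c readIntProperty() would still compile, and the resulting
behavior would be undefined.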
\target sax2Reading
\section2 Further reading
More information about XML (e.g. \link xml.html#namespaces namespaces \endlink)
can be found in the \link xml.html introduction to the Qt XML module. \endlink
\target dom
\section1 The Qt DOM classes
\target domIntro
\section2 Introduction to DOM
DOM provides an interface to access and change the content and
structure of an XML file. It presents a hierarchical view of the document
(a tree view). Thus -- in contrast to the SAX2 interface -- an object
model of the document is resident in memory after parsing, which makes
manipulation easy.
All DOM nodes in the document tree are subclasses of \l QDomNode. The
document itself is represented as a \l QDomDocument object.
Here are the available node classes and their potential child classes:
\list
\i \l QDomDocument: Possible children are
\list
\i \l QDomElement (at most one)
\i \l QDomProcessingInstruction
\i \l QDomComment
\i \l QDomDocumentType
\endlist
\i \l QDomDocumentFragment: Possible children are
\list
\i \l QDomElement
\i \l QDomProcessingInstruction
\i \l QDomComment
\i \l QDomText
\i \l QDomCDATASection
\i \l QDomEntityReference
\endlist
\i \l QDomDocumentType: No children
\i \l QDomEntityReference: Possible children are
\list
\i \l QDomElement
\i \l QDomProcessingInstruction
\i \l QDomComment
\i \l QDomText
\i \l QDomCDATASection
\i \l QDomEntityReference
\endlist
\i \l QDomElement: Possible children are
\list
\i \l QDomElement
\i \l QDomText
\i \l QDomComment
\i \l QDomProcessingInstruction
\i \l QDomCDATASection
\i \l QDomEntityReference
\endlist
\i \l QDomAttr: Possible children are
\list
\i \l QDomText
\i \l QDomEntityReference
\endlist
\i \l QDomProcessingInstruction: No children
\i \l QDomComment: No children
\i \l QDomText: No children
\i \l QDomCDATASection: No children
\i \l QDomEntity: Possible children are
\list
\i \l QDomElement
\i \l QDomProcessingInstruction
\i \l QDomComment
\i \l QDomText
\i \l QDomCDATASection
\i \l QDomEntityReference
\endlist
\i \l QDomNotation: No children
\endlist
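The parent and child rules listed above can be written down as a single
predicate. The \c NodeKind enum and \c mayContain() are illustrative
names, not part of the Qt DOM classes, and the sketch does not model the
limit of at most one \l QDomElement child per document.

```cpp
// Node kinds from the list above; the enum and the function are
// illustrative and not part of the Qt API.
enum NodeKind {
    Document, DocumentFragment, DocumentType, EntityReference, Element,
    Attr, ProcessingInstruction, Comment, Text, CDATASection, Entity, Notation
};

// True if a node of kind parent may have a child of kind child.
// The "at most one element per document" limit is not modelled.
bool mayContain(NodeKind parent, NodeKind child)
{
    switch (parent) {
    case Document:
        return child == Element || child == ProcessingInstruction
            || child == Comment || child == DocumentType;
    case DocumentFragment:
    case EntityReference:
    case Element:
    case Entity:
        return child == Element || child == ProcessingInstruction
            || child == Comment || child == Text
            || child == CDATASection || child == EntityReference;
    case Attr:
        return child == Text || child == EntityReference;
    default:
        return false; // all remaining kinds have no children
    }
}
```

A tree builder could call such a predicate before appending a child, so
that an invalid combination is rejected early.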
Two collection classes are provided: \l QDomNodeList is a list of nodes,
and \l QDomNamedNodeMap is used to handle unordered sets of nodes
(often used for attributes).
The \l QDomImplementation class allows the user to query features of the
DOM implementation.
\section2 Further reading
To get started please refer to the \l QDomDocument documentation.
\target namespaces
\section1 An introduction to namespaces
Parts of the Qt XML module documentation assume that you are familiar
with XML namespaces. Here we present a brief introduction; skip to
\link #namespacesConventions Qt XML documentation conventions \endlink
if you already know this material.
Namespaces are a concept introduced into XML to allow a more modular
design. With their help data processing software can easily resolve
naming conflicts in XML documents.
Consider the following example:
\code
<document>
<book>
<title>Practical XML</title>
<author title="Ms" name="Eris Kallisti"/>
<chapter>
<title>A Namespace Called fnord</title>
</chapter>
</book>
</document>
\endcode
Here we find three different uses of the name \e title. If you wish to
process this document you will encounter problems because each of the
\e titles should be displayed in a different manner -- even though
they have the same name.
The solution would be to have some means of identifying the first
occurrence of \e title as the title of a book, i.e. to use the \e
title element of a book namespace to distinguish it from, for example,
the chapter title, e.g.:
\code
<book:title>Practical XML</book:title>
\endcode
\e book in this case is a \e prefix denoting the namespace.
Before we can apply a namespace to element or attribute names we must
declare it.
Namespaces are URIs like \e http://trolltech.com/fnord/book/. This
does not mean that data must be available at this address; the URI is
simply used to provide a unique name.
We declare namespaces in the same way as attributes; strictly speaking
they \e are attributes. To make for example \e
http://trolltech.com/fnord/ the document's default XML namespace \e
xmlns we write
\code
xmlns="http://trolltech.com/fnord/"
\endcode
To distinguish the \e http://trolltech.com/fnord/book/ namespace from
the default, we must supply it with a prefix:
\code
xmlns:book="http://trolltech.com/fnord/book/"
\endcode
A namespace that is declared like this can be applied to element and
attribute names by prepending the appropriate prefix and a ":"
delimiter. We have already seen this with the \e book:title element.
Element names without a prefix belong to the default namespace. This
rule does not apply to attributes: an attribute without a prefix does
not belong to any of the declared XML namespaces at all. Attributes
always belong to the "traditional" namespace of the element in which
they appear. A "traditional" namespace is not an XML namespace, it
simply means that all attribute names belonging to one element must be
different. Later we will see how to assign an XML namespace to an
attribute.
Because attributes without prefixes are not in any XML
namespace, there is no collision between the attribute \e title (that
belongs to the \e author element) and, for example, the \e title element
within a \e chapter.
Let's clarify this with an example:
\code
<document xmlns:book = 'http://trolltech.com/fnord/book/'
xmlns = 'http://trolltech.com/fnord/' >
<book>
<book:title>Practical XML</book:title>
<book:author xmlns:fnord = 'http://trolltech.com/fnord/'
title="Ms"
fnord:title="Goddess"
name="Eris Kallisti"/>
<chapter>
<title>A Namespace Called fnord</title>
</chapter>
</book>
</document>
\endcode
Within the \e document element we have two namespaces declared. The
default namespace \e http://trolltech.com/fnord/ applies to the \e
book element, the \e chapter element, the appropriate \e title element
and of course to \e document itself.
The \e book:author and \e book:title elements belong to the namespace
with the URI \e http://trolltech.com/fnord/book/.
The two \e book:author attributes \e title and \e name have no XML
namespace assigned. They are only members of the "traditional"
namespace of the element \e book:author, meaning that for example two
\e title attributes in \e book:author are forbidden.
In the above example we circumvent the last rule by adding a \e title
attribute from the \e http://trolltech.com/fnord/ namespace to \e
book:author: the \e fnord:title comes from the namespace with the
prefix \e fnord that is declared in the \e book:author element.
Clearly the \e fnord namespace has the same namespace URI as the
default namespace. So why didn't we simply use the default namespace
we'd already declared? The answer is quite complex:
\list
\i attributes without a prefix don't belong to any XML namespace at
all, not even to the default namespace;
\i additionally omitting the prefix would lead to a \e title-title clash;
\i writing it as \e xmlns:title would declare a new namespace with the
prefix \e title instead of applying the default \e xmlns namespace.
\endlist
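The rules above can be condensed into a short lookup routine in standard
C++. This is a sketch under simplifying assumptions: \c PrefixMap and
\c namespaceUri() are invented names, the empty prefix stands for the
default namespace, and the scoping of nested declarations is ignored.

```cpp
#include <map>
#include <string>

// Maps prefixes to namespace URIs; the empty prefix stands for the
// default namespace. Scoping of nested declarations is not modelled.
typedef std::map<std::string, std::string> PrefixMap;

// Returns the namespace URI for a qualified name, or an empty string
// if the name belongs to no XML namespace.
std::string namespaceUri(const std::string &qName, bool isAttribute,
                         const PrefixMap &bindings)
{
    std::string prefix;
    std::string::size_type colon = qName.find(':');
    if (colon != std::string::npos)
        prefix = qName.substr(0, colon);
    // Unprefixed attributes are in no XML namespace, not even the default.
    if (prefix.empty() && isAttribute)
        return std::string();
    PrefixMap::const_iterator it = bindings.find(prefix);
    if (it == bindings.end())
        return std::string(); // undeclared prefix or no default namespace
    return it->second;
}
```

With the bindings from the example document, the element \e chapter
resolves to the default namespace and the attribute \e title resolves to
no namespace. The attribute \e fnord:title resolves to
\e http://trolltech.com/fnord/.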
With the Qt XML classes elements and attributes can be accessed in two
ways: either by referring to their qualified names consisting of the
namespace prefix and the "real" name (or \e local name) or by the
combination of local name and namespace URI.
More information on XML namespaces can be found at
\l http://www.w3.org/TR/REC-xml-names/.
\target namespacesConventions
\section2 Conventions used in Qt XML documentation
The following terms are used to distinguish the parts of names within
the context of namespaces:
\list
\i The \e {qualified name}
is the name as it appears in the document. (In the above example \e
book:title is a qualified name.)
\i A \e {namespace prefix} in a qualified name
is the part to the left of the ":". (\e book is the namespace prefix in
\e book:title.)
\i The \e {local part} of a name (also referred to as the \e {local
name}) appears to the right of the ":". (Thus \e title is the
local part of \e book:title.)
\i The \e {namespace URI} ("Uniform Resource Identifier") is a unique
identifier for a namespace. It looks like a URL
(e.g. \e http://trolltech.com/fnord/ ) but does not require
data to be accessible by the given protocol at the named address.
\endlist
Elements without a ":" (like \e chapter in the example) do not have a
namespace prefix. In this case the local part and the qualified name
are identical (i.e. \e chapter).
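These conventions amount to splitting a qualified name at its first ":".
A minimal sketch in standard C++ follows; \c NameParts and
\c splitQualifiedName() are illustrative names and not Qt functions.

```cpp
#include <string>

// The two parts of a qualified name.
struct NameParts {
    std::string prefix;    // empty if the name contains no ":"
    std::string localPart;
};

// Splits a qualified name at the first ":". Without a ":" the whole
// name is the local part and the prefix stays empty.
NameParts splitQualifiedName(const std::string &qName)
{
    NameParts parts;
    std::string::size_type colon = qName.find(':');
    if (colon == std::string::npos) {
        parts.localPart = qName;
    } else {
        parts.prefix = qName.substr(0, colon);
        parts.localPart = qName.substr(colon + 1);
    }
    return parts;
}
```

For \e book:title this yields the prefix \e book and the local part
\e title. For \e chapter the prefix is empty and the local part equals
the qualified name.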
*/