SQL Server 2005: Deep Dive On XML And XQuery Michael Rys DAT405 Program Manager, SQL Server XML Technologies Microsoft Corporation.

  • Published on

  • View

  • Download


  • SQL Server 2005: Deep Dive On XML And XQueryMichael RysDAT405 Program Manager, SQL Server XML TechnologiesMicrosoft Corporation

  • XML And Relational Data TodayFile SystemXMLXMLXMLXMLRelational DataRelational DataRelational DataRelational DataQuery and Combine

  • XML ScenariosData ExchangeBusiness-to-business (B2B), business-to-consumer (B2C), application-to-application (A2A)XML is ubiquitous, extensible, platform independent transport formatDocument ManagementXHTML, Office XML DocumentsMessagingSimple Object Access Protocol (SOAP), RSSMid-Tier CollaborationAd-hoc modeling of semistructured datastoring objects with sparse or multi-valued properties that do not fit well into the traditional relational schemataTransport, Store, and Query XML data

  • XML Or Relational?

  • XML And Relational!

  • SQL Server 2005 XML ArchitectureXML ParserXMLValidationXML data type (binary XML)Schema CollectionXMLRelationalXML SchemataOpenXML/nodes()FOR XML with TYPE directiveRowsetsXQueryXML-DMLNode TablePATH Index PROP Index VALUE Index PRIMARY XML INDEXXQuery

  • Why XQuery?

    SQL does not understand XMLXPath 1.0W3C RecommendationUsed in SQL Server 2000: SQLXML and OpenXMLNavigation, no reshapingLimited knowledge about typesXSLTW3C RecommendationData-driven reshaping (uses XPath)MSXML, System.XMLHard to author and optimize for large amount of dataNo XML data modification language (DML)

  • What Is XQuery?

    Queries and transforms trees Functional, declarative query languageCombines XPath with node constructionOperates on (XML Schema-)typed and unconstrained XMLDesigned to operate on large amounts of dataOptimizableCurrent Status: In final Last CallRecommendations in H2 CY2006Fulltext and DML extensions will follow later

  • XQuery Introduction

  • Key XQuery FeaturesFLWOR: FOR / LET / WHERE / ORDER BY / RETURNIncludes XPath 2.0 (/doc[@id = 123])Element constructors ({})Order-preserving operatorsInput order (FLWR)Document order (XPath, union)Statically (or dynamically) typedStrong typing with schema, weak typing without schemaSQL:SELECTFROMWHEREORDER BYWITHFORLETWHEREORDER BY& SETRETURN

  • XQuery Type System3 Classes of Item Types: Node types: element(), attribute(), comment() etc.Element content types: xs:anyType, user-defined (e.g., my:CustomerT)Atomic types: built-in and user-defined (e.g., xs:int, my:hatSize)XQuery uses XML Schema for content and atomic types Untyped data have special types (e.g., xdt:untypedAtomic)XML Schema (W3C standard)Rich mechanism for type definitions and validation constraintsCan be used to constrain XML documentsXML Schema Collections will be used for typing (meta-data)Benefits of typed data Guarantees shape of dataProvide type specific semanticsAllows storage and query optimizations

  • Static Typing In XQueryType Inference: Infers type of Expression during compilationType Check: Inferred Type is subtype of expected typeBenefits:Compile-time type error discoveryGuarantees correct type at runtimeMore efficient executionCosts:Sometimes type inference is less precise than data will be (inferring list on /a[1]/b, but there will always be only 1 b)Requires more explicit casts and pick first (/a[1]/b[1])

  • XML Data ModificationXQuery extensions: Insert, update, and deleteXML sub-tree modification:Add or delete XML sub-treesUpdate valuesGenerate consistent state

  • XML-DML:Customername: xs:stringOrderid: xs:intJanine42insertdeletereplace value ofinsert into /Customerinsert as last into /Customerinsert as first into /Customerinsert before /Customer/nameinsert after /Customer/namenotesnotesnotesdelete /Customer/Order[id = 42]Target needs to be statically one nodeNilsreplace value of (/Customer/name)[1] with Nils

  • XQuery And XML-DML In SQL Server 2005

  • XQuery And XML-DML In SQL Server 2005Subset of XQuery implementedIs aligned with July 2004 XQuery working draftAdded XML Data ModificationApplies to single XML data type instanceMethods on XML data type: query(), value(), exist(), modify(), nodes()Use SQL to iterate over collection of instances (XML-typed column)Can refer to relational data Take advantage of Schema-collection information to operate on typed XML dataWill make use of XML indices for optimization

  • XQuery Methodsquery() creates new, untyped XML data type instancevalue() extracts an XQuery value into the SQL value and type spaceExpression has to statically be a singleton String value of atomized XQuery item is cast to SQL typeSQL type has to be SQL scalar type (no XML or CLR UDT)exist() returns 1 if the XQuery expression returns at least one item, 0 otherwise

  • XQuery: nodes()Provides OpenXML-like functionality on XML data type column in SQL Server 2005Returns a row per selected nodeEach row contains a special XML data type instance thatReferences one of the selected nodesPreserves the original structure and typesCan only be used with the XQuery methods (not modify()), count(*), and IS (NOT) NULL

  • sql:column()/sql:variable()Map SQL value and type into XQuery values and types in context of XQuery or XML-DMLsql:variable(): accesses a SQL variable/parameter declare @value int set @value=42 select * from T where T.x.exist(/a/b[@id=sql:variable(@value)])=1sql:column(): accesses another column value tables: T(key int, x xml), S(key int, val int) select * from T join S on T.key=S.key where T.x.exist(/a/b[@id=sql:column(S.val)])=1Restrictions in SQL Server 2005: No XML, CLR UDT, datetime, or deprecated text/ntext/image

  • XQuery: modify()Used with SET: declare @xdoc xml set @xdoc.modify(delete /a/b[@id=42]) update T set T.xdoc.modify(insert into /a) where T.id=1Relational row-level concurrency: whole XML instance is locked

  • Combined SQL And XQuery/DML ProcessingXQuery ParserStatic TypingAlgebrizationXML Schema CollectionMetadataStatic PhaseRuntime Optimization and Execution of physical Op TreeDynamic PhaseXML and rel. IndicesStatic Optimization of combined Logical and Physical Operation TreeSELECT x.query(), y FROM T WHERE SQL ParserAlgebrizationStatic Typing

  • XML IndicesCreate XML index on XML columnCREATE PRIMARY XML INDEX idx_1 ON docs (xDoc)Create secondary indexes on tags, values, pathsSpeed up queriesResults can be served directly from indexSQLs cost based optimizer will consider indexPrimary and Secondary Indices will be efficiently maintained during updatesOnly subtree that changes will be updated

  • Example Index Contentsinsert into Person values (42, ' Bad Bugs Nobody loves bad bugs. Tree Frogs All right-thinking people love tree frogs. ')

  • Primary XML IndexCREATE PRIMARY XML INDEX PersonIdx ON Person (Pdesc)Assumes typed data; Columns and Values are simplified, see VLDB 2004 paper for details

  • Architectural Blueprint: Indexing XML Column in table T(id, x)Primary XML Index (1 per XML column) Clustered on Primary Key (of table T), XIDNon-clustered Secondary Indices (n per primary Index)Value IndexPath IndexProperty Index3121243312

  • XQuery Optimizations With XML Indices

  • Take-Away: XML Indices PRIMARY XML Index use when lots of XQueryFOR VALUE useful for queries where values are more selective than paths such as //*[.=Seattle]FOR PATH useful for Path expressions: avoids joins by mapping paths to hierarchical index (HID) numbers. Example: /person/address/zipFOR PROPERTY useful when optimizer chooses other index (e.g., on relational column, or FT Index) in addition so row is already known

  • Appendix: XML INDEXSome Requirements and RestrictionsThe user table must have a clustered index on the primary key To modify the primary key definition of a table, all XML indexes on the user table must be dropped first Primary XML index cannot be created on a computed XML column

  • Session SummarySQL Server 2005 provides XQuery and XML DML on XML datatype XQuery subset based on July 2004 WDTyping provided by XML Schema collections on XML datatypeNode-based Data Manipulation Language (DML)Integrates with relational processingOptimization:Using extended relational algebra and query optimizerIndexing of XML datatype

  • Community ResourcesAt PDCDAT Track lounge: Ill be there dailyAfter PDCMSDN dev center: http://msdn.microsoft.com/SQL/2005 XML and Databases whitepapers: http://msdn.microsoft.com/XML/BuildingXML/XMLandDatabase/ Online WebCasts: http://msdn.microsoft.com/sql/2005/2005webcasts/ Newsgroups & Forum: news:microsoft.public.sqlserver.xml http://forums.microsoft.com/msdn/ShowForum.aspx? ForumID=89 My E-mail: mrys@microsoft.com My Weblog: http://www.sqljunkies.com/weblog/mrysPlease fill out Session Evaluation

  • 2005 Microsoft Corporation. All rights reserved.This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.

  • Appendix: XML AbilitiesSelf-describing:This is an example.Complex dataTrees, recursive, graphStructured Data: highly regular, homogeneous structuresSemi-structured Data: heterogeneous, sparse dataMarkup Data: documents/content markupDocument ordering Schema/Type systemSchema-lessOptional Schema: semi-structured, structuredExtensibleAnnotations, multiple schemas (late binding)

  • Appendix: XQuery History and OutlookDec 1998: W3C Workshop on Querying XMLSep 1999: Start of W3C XQuery WGApril 2005: XQuery Data Model, Functions and Operators, Syntax and Semantics in final Last CallEnd 2005 (expected): Last Call Working Draft of XQuery Full-Text languageEnd 2005 (expected): First Working draft for Data Manipulation and Transformation Language Q2 of 2006 (expected): XQuery (without Full-Text, DML) in recommendationDuring 2006/7 (expected): Full-Text, DML in recommendation

  • Appendix: XID ORDPATHsOrdPath is abstracted from a bit array, see SIGMOD 2004 paper for details11. Nils as first into /book1.

  • Appendix: XQuery Execution PlansSELECT docFROM XMLdocWHERE 1 = doc.exist('declare namespace c = "urn:example/customer"; /c:doc/c:customer/c:order/@year[.>2000]');With Primary XML index specified. Estimated Cost: 0.007Primary XML IndexWithout Primary XML index specified. Estimated Cost: 1.008PK Index on XMLdoc

  • Appendix: XQuery Subset in SQL Server 2005Subset of Standard implemented. For example:FOR, WHERE, ORDER BY, RETURN, no LETNo user-defined XQuery functionsSubset of XQuery built-in functionsNo general expressions in path expressionsNo XQuery validation (use XML datatype validation)No expressions on a constructed nodeNo explicit schema import (only implicit)Dynamic errors are mapped to empty sequenceSome implementation restrictions:Limited support for xs:dateTime and friends,xs:QName, xsi:nil, list of unionsNo heterogeneous node and value sequencesNo XQuery joins across different XML instances (combine instances using FOR XML)Wrap XQuery expression into SQL UDF for CHECK constraints, computed columns


View more >