XQuery to SQL by XML Algebra Tree

  • Published on
    31-Dec-2015

  • View
    33

  • Download
    0

DESCRIPTION

XQuery to SQL by XML Algebra Tree. Brad Pielech, Brian Murphy Thanks: Xin. Outline. Overview of Rainbow System Process of translating XQuery -> SQL XML Operators Partial translation walkthrough with running example. Rainbow System. Complete XML SQL system - PowerPoint PPT Presentation

Transcript

  • XQuery to SQL by XML Algebra TreeBrad Pielech, Brian MurphyThanks:Xin

  • OutlineOverview of Rainbow SystemProcess of translating XQuery -> SQLXML OperatorsPartial translation walkthrough with running example

  • Rainbow SystemComplete XML SQL system Uses some ideas from XPERANTO, Niagara, and other systemsSeveral main subsystems:Document ShredderView GeneratorQuery Translation, Query RewriteResult GenerationWork in progress

  • Steps in TranslationUser inputs XQuery queryUser Query is converted into an XML Algebra Tree (XAT)Database Mapping Querys XAT generatedQueries are Decorrelated Trees are merged, unnecessary branches cut

  • Steps ContinuedComputation Pushdown (presentation concludes here) SQL GenerationQuery Execution Tagging of Results

  • What is the difference between the two queries? The user query is executed over a view of the XML document and specifies what to return and how to return itThe mapping query specifies how the view the user is querying maps to the databaseTherefore, combining the two queries into one is necessary in order to correctly process the users request

  • XAT OperatorsEach XAT is comprised of XAT Operators. Similar in concepts to Relational AlgebraOperator set is combination between Niagara and Xperanto papers

  • Set of OperatorsSQL like (9):Project, Select, Join (Theta, Outer, Semi), Groupby, Orderby, Union (Node, Outer), Cartesian Product.XML like (4):Tagger, Navigate, is(Element, Text), Aggregate.Special:SQL, Function, Source, NameColumn, FOR

  • SQL like Operators (9)

  • XML like Operators

  • Special Operators

  • Boston Red Sox Nomar Shortstop Pedro Pitcher Manny Outfield Sports XML Document Fenway Park 33,000 1912

  • Example XQuery

    {For $p in document("sports.xml")/sports/organizationLet $a = $p/team/text()Where $a = "Boston Red Sox"Return$p/starPlayer/pname/text()} List all of the star players names on the Boston Red Sox

  • XAT Tree for Example Query

  • RDBMS Tables of Sports InfoOrganizationStadiumStarPlayerPlayerInfo

    organizationIDteamNamestadiumnName1Boston Red SoxFenway Park

    stadiumIDsnameCapacity yearBuiltticketHighticketLow1Fenway Park33,00019125022

    starPlayerNamestarPlayePositionorganizationIDNomarShortStop1PedroPitcher1MannyOutfield1

    PlayerNameNumberrookieYearNomar51997Pedro451992Manny241993

  • Partial Default XML View

    1 Boston Red Sox Fenway Park

    1 Fenway Park

  • Challenge Question I

    1 Boston Red Sox Fenway Park

    Nomar shortstop 1

    Boston Red Sox Nomar Shortstop Pedro Pitcher Manny Outfield What is the XQuery that converts the document on the left (default XML view) to the document on the right (user view)?

  • Mapping Query Part ICreate view invoice as (

    FOR $organization IN view ("default") /Organization/row RETURN $organization/teamName/text()

    FOR $starPlayer IN view ("default") /StarPlayer/rowWHERE $starPlayer/organizationID = $organization/organizationID RETURN $starPlayer/starPlayerName/text() $starPlayer/starPlayerPosition/text() B1B2

  • Mapping Query Part IIFOR $stadium IN view ("default") /Stadium/rowRETURN $stadium/sname/text() $stadium/capacity/text() $stadium/yearBuilt/text()

    FOR $player IN view ("default") /PlayerInfo/rowRETURN

    ) B3B4

  • Cutting Mapping QueryThe mapping query has data that is unused by the user query, so we can get rid of it B3 and B4 are completely removed Remove stadium from B1 Remove position from B2

  • Mapping Query XAT General Form$organization := Navigate("/",Organization/row)Source(default.xml)FOR $organization More StuffSome StuffSource(default.xml)FOR $starPlayer Some Stuff will be shown in Part I More Stuff in Part IIB1B2$starPlayer := Navigate("/", StarPlayer/row)

  • Mapping Query XAT Part IB1$tname := Navigate($organization, teamName/text()) $starPlayer := Navigate("/", StarPlayer/row)Source("default.xml")FOR $starPlayer To: Part II Some StuffFOR $organization

  • Mapping Query XAT Part IIAggregateTo: Part IB2Tagger( $sname
  • Decorrelated Mapping XAT Part I

    Boston Red Sox Nomar Pedro Manny

    Tagger ($tname O:= Tagger( All )All = AggregateTagger( V0

  • Decorrelated Mapping XAT Part IISource("default.xml")Source("default.xml")$organization = Navigate("/", Organization/row)$starPlayer := (Navigate"/", StarPlayer/row)To Part IAggregateTagger( $sname
  • Progress Report User inputs XQuery queryUser Query is converted into an XML Algebra Tree (XAT)Database Mapping Querys XAT generatedQueries are Decorrelated Trees are merged, unnecessary branches cut

  • XAT merging Input: User Query XAT + Mapping Query XATOutput:Simplified composite XATApproach:The Tagger from the top of the Mapping Query is linked to the bottom of the User Query. The Source Operator at the bottom of the User Query is deleted Pushdown NavigationBy using the commutative rulesCancel out the navigation operatorsBy using the composition rules

  • Combined XATV1 := Aggregate$pname = Navigate($p, starPlayer/pname/text())Select($a = "Boston Red Sox")$a := Navigate($p, team/text())$p := Navigate(O, sports/organization)Tagger( V1 Tagger( $pname Top of Mapping Query User QueryRest of Mapping Query

  • Computation Pushdown Part IWhat is PushDown?After merging the 2 XATs, there may be redundancies in the larger tree. Ex: The user query and mapping query may navigate to the same thingThe decorrelated query tree may be unorganized and inefficient Pushdown aims to eliminate these problems

  • Computation Pushdown Part IIXPERANTO mentions pushdown as a means of pushing computation to relational engineNiagara defines equivalence rules and specifies several different heuristics for using the rules

  • XAT Pushdown Example Part IV1 := Aggregate$pname = Navigate($p, starPlayer/pname/text())Select($a = "Boston Red Sox")$a := Navigate($p, team/text())$p := Navigate(O, sports/organization)Tagger( V1 Tagger( $pname Tagger ($tname O:= Tagger( All )All = AggregateTagger( V0
  • XAT Pushdown Example Part IIV1 := Aggregate$pname = Navigate($p, starPlayer/pname/text())Select($a = "Boston Red Sox")$a := Navigate($p, team/text())$p := Navigate(O, sports/organization)Tagger( V1 Tagger( $pname Tagger ($tname O:= Tagger( All )All = AggregateTagger( V0
  • XAT Pushdown Example Part IIIV1 := Aggregate$pname = Navigate($p, starPlayer/pname/text())Select($a = "Boston Red Sox")$a := Navigate($p, team/text())$p := Navigate(O, sports/organization)Tagger( V1 Tagger( $pname Source("default.xml")Cartesian Product$organization = Navigate("/", Organization/row)$starPlayer := (Navigate"/", StarPlayer/row)$ID := Navigate($organization, organizationID)$starPlayerID := Navigate($starPlayer, organizationID) Select($starPlayerID = $ID)$sname := Navigate($starPlayer, starPlayerName) Source("default.xml")User QueryTagger( $sname
  • XAT Pushdown Example Part IVV1 := Aggregate$pname = Navigate($p, starPlayer/pname/text())Select($a = "Boston Red Sox")$a := Navigate($p, team/text())$p := Navigate(O, sports/organization)Tagger( V1 Tagger( $pname Source("default.xml")Cartesian Product$organization = Navigate("/", Organization/row)$starPlayer := (Navigate"/", StarPlayer/row)$ID := Navigate($organization, organizationID)$starPlayerID := Navigate($starPlayer, organizationID) Select($starPlayerID = $ID)$sname := Navigate($starPlayer, starPlayerName) Source("default.xml")User QueryTagger( $sname
  • Challenge Questions II & IIIWhat are some of the heuristics we could use during Pushdown? What can / should we try to accomplish?What should the tree look like afterwards?How could we go about pushing things down?What would the algorithm be?How do we know if an operator can be pushed down? When do we stop pushing an operator down?

  • Computation Pushdown Part IIIGoal: Tagger + SQL operators + XML operatorsUse Equivalence rules repository to swap operatorsStep 1: Navigation Pushdown.Cancel Mapping Query Taggers and corresponding AggregatesDelete redundant Navigates from User QueryRename columns in Mapping Query Step 2: SQL Computation Pushdown.By commutative and composition rules.

  • Equivalence RulesPair-wise rules that determine if one operator (parent) may be pushed through another (child)Navigate / Navigate rule: If the parent depends on the child, they may not be swappedNavigate / Join: Navigate is pushed to the side of the join that its entry point comes fromAnd many, many more

  • Pushdown ResultsPush Navigates to the correct side of Cartesian ProductCreate a NameColumn operator that renames $tname into $aCreate a 2nd NameColumn operator that renames $pname into $snameGet rid of all Taggers and Aggregates from Mapping Query and Navigates that were crossed out from User QueryMerge Select($starPlayerID = $ID) and Cartesian into a Join

  • XAT After Computation PushDown Part IFrom Part II

  • XAT After Computation PushDown Part IIJoin on ($ID = $starPlayerID)To Part I

  • Rest of the ProcessTake the Combined XAT from the previous slide and generate a single SQL query.Execute query on local RDBMSFormat result tuples according to TaggerReturn XML document to user

  • SummaryCreated XAT of the user queryCreated XAT for mapping queryCut information unused by user queryDecorrelated Mapping queryMerged two queries into 1 larger XATIdentified weaknesses in combined treeWalked through pushdown stepsDisplayed final, optimized tree

  • The End!!!

    Unnest is taken care by the Navigate, which is unnest = Navigate(,*)Tagger deal with flat structure of input. For any nesting has aggregation should be done by the Aggregate operator.

    Unnest = Follow(*)My CO is the DCO in the complex query decorrelation.

    CO is only needed when the return is in the inner block, which is a common cases in FLWR, because, F is the outer query and the LWR is the inner query.

    CO = Correlated Join and Projected on the right part, it is a semi-correlated join.

Recommended

View more >