A distributed mobile daon Pocket PC mobile deover Bluetooth
Hassan Artail a,, Manal Shihab
As mobile applications have become more ubiquitous, the demand for database data in mobile
ARTICLE IN PRESS
Contents lists available at ScienceDirect
Journal of Network and
Journal of Network and Computer Applications 32 (2009) 961151084-8045/$ - see front matter & 2008 Elsevier Ltd. All rights reserved.settings has increased, and this has created a need for engineering database architectures that are
Corresponding author. Tel.: +9611350000; fax: +9611744462.E-mail addresses: email@example.com (H. Artail), firstname.lastname@example.org (M. Shihab), email@example.com (H. Safa).1. Introductionconsumption. The obtained results indicated the feasibility
system and its potential for providing mobile users w
framework for aggregating disparate data that are stor
mobile databases in the wireless ad hoc network.
& 2008 Elsevier Ltd. All rights resData availability
operating system and communicate using Bluetooth, thus limiting
the architecture to eight devices, which is a restriction imposed by
piconets. Sample databases were congured on the devices that
ran the SQL Server CE database engine, and a list of 170 sample
queries of varying complexities were designed to conduct
performance evaluation. This evaluation involved measurement
of query response time, generated trafc, and device energya Department of Electrical and Computer Engin
P.O. Box 11-0236, Riad El-Solh 1107 2020, Beirb Department of Computer Science, American U
a r t i c l e i n f o
Received 15 November 2007
Received in revised form
11 April 2008
Accepted 14 April 2008
Keywords:tabase implementationvices communicating
a, Haidar Safa b
, American University of Beirut,
sity of Beirut, P.O. Box 11-0236, Riad El-Solh 1107 2020, Beirut, Lebanon
a b s t r a c t
This paper describes a distributed database system implementa-
tion built on top of stand-alone mobile databases found on mobile
devices. At the heart of the architecture are elected devices that
take on the role of data directories which collect the schema of the
databases and become the contact points for all nodes that wish
to submit queries against the distributed database. The system is
implemented on Pocket PCs that run the Microsoft WinCEjournal homepage: www.elsevier.com/locate/jnca
using the Bluetooth communication protocol, collaboration between mobile devices, and distributeddatabases in MANETs. Accordingly, this section briey discusses related work done in those areas.
ARTICLE IN PRESSA study of recent research trends and experimental guidelines in MANETs, which was presentedby Dow et al. (2005), surveyed more than 1300 MANET related papers from 1998 to 2003. It wasfound that topics like routing and power management attracted most of the attention, while issuessuch as IP addressing, fault tolerance, and collaboration were also popular among MANETresearchers. In the qualitative analysis covered by the studied papers, factors such as scalability,stability, and reliability were of primary concern in major MANET issues. The framework ofdistributed mobile databases addressed in this paper is collaborative in the sense that nodes (mobiledevices) cooperate by providing services to each other in order to answer a mobile nodes query.Therefore, the subject of this paper contributes to the thrust of MANET research, and provides anactual implementation on Pocket PCs that communicate using Bluetooth (BT) through the formationof a piconet, where one master device can connect with up to seven active slave devices. A set ofpiconets connected through sharing devices form a scatternet. Although proposed in the standards,this function is still not well-developed today. Several scatternet protocols have been proposed butthe BlueTrees (Zaruba et al., 2001) and BlueStars (Petrioli et al., 2003) are most known and have beenthe subject of further studies and comparisons (Amin et al., 2006; Basagni et al., 2004). Of concern toour work is the fact that current implementations of Bluetooth communication on mobile computingdevices, like Pocket PCs and PalmOS handhelds, do not support masterslave role switching, andtherefore scatternets. For this reason, our current implementation is limited to a wireless network ofeight or less devices that communicate and form a piconet structure. Similarly, previous works thatinvolved BT communication were limited to piconets. For example, Altundag and Gokturk (2006)developed a scatternet formation protocol and implemented a chat room application that utilizedthis protocol using J2ME. However, given the lack of scatternet and masterslave role switching onmobile phones, major features of the proposed protocol did not make it to the implementation, andsuitable for such dynamic environments (Chan and Roddick, 2005). The most interesting andpromising form of such environments is the mobile ad hoc network (MANET) in which a mobile hostcan act as a source of information, destination, or a router that transfers data toward its destination.Mobile hosts are generally small computing devices with relatively limited resources that can join orleave the network unexpectedly at any time. In addition to the increasingly more complexapplications that are targeting such devices, more sophisticated mobile database managementsystems are being engineered to offer capabilities that parallel those of enterprise level databasesystems.
The ultimate goal of our design and implementation is to provide a system through which mobiledevices share data in disconnected settings (i.e., away from an access point that connects these hoststo a xed network and through it to a central data source). That is, our work targets environments inwhich mobile hosts collaborate in answering each others queries. More specically, we develop adistributed database system on top of isolated mobile databases that exist on the mobile devices thathappen to be in close proximity with each other. The protocol we consider for communication isBluetooth which is prevalent on mobile devices due to its low cost and efcient utilization of batterypower. The implementation of the system involved several issues relating to locating the datasources for the particular query, fragmenting the query among the participating mobile devices, andjoining the individual results by some mobile device before returning them to the client application.
The rest of the paper is organized as follows. Section 2 provides a survey of related work andbriey describes the employed technologies in our implementation. Sections 3 and 4 describe thedesign and implementation of the system, respectively, while Section 5 presents the performanceresults. Finally, the paper is concluded in Section 6.
2. Literature overview
The developed application described in this paper tackles three main areas, namely, connectivity
H. Artail et al. / Journal of Network and Computer Applications 32 (2009) 96115 97as a result, the chat room application was restricted to work within a single piconet.
ARTICLE IN PRESS3. System architecture
The proposed system provides a cooperative framework that builds a distributed database out ofpossibly heterogeneous databases running on mobile devices that may be roaming in the wirelessnetwork in close proximity (within transmission range) to each other. In the following, basicconcepts are covered, followed by a description of the modules that were implemented.
3.1. Basic concepts
The system consists of moving nodes that can take one of three possible roles, as depicted inFig. 1: requesting nodes (RNs), database nodes (DBNs), and a database directory (DD). A DBNnode has its own mobile database and is willing to be part of the one large database. On theother hand, the DD stores the schemas of the DBNs in the network and serves the RNs throughfragmenting their queries so they can be processed by the individual DBNs. An RN can be any nodethat sends its request (query) to the DD for processing. The latter will look up its schema to identifyall the DBNs that will participate in answering the query and will fragment the query accordingly.Fig. 1 shows a general diagram that illustrates the elements of the system, in particular the DD andDBN nodes.
3.2. Data directory election
The task of fragmentation is independent of the location and size of the data, and thus decidingwhich DD to perform fragmentation does not inuence the performance of the system. A DD nodemust meet some minimum criteria that involve expected time of staying in the network, battery life,and available memory. This translates into a score that each node must compute and updateOur work involves query processing and result coalescing. Several approaches have appearedthat discuss processing queries in distributed mobile environments. The scheme presented inKottkamp and Zukunft (1998) is one of such approaches. It aims to optimize location aware queriesfor mobile database systems, and develop a mobility-aware cost model that makes use of fuzzyinformation while trying to accommodate the limited resources of the mobile devices. Anotherapproach is the one described in Li (2004), which presents a cost evaluation model for joinoperations. It incorporates the primary parameters of wireless networks, including the size of thetransmitted data, the transmission distance, and the query cost at each node. To perform equi-joinoperations between nodes, the system executes a distributed query optimization procedure that iscomposed of two parts: distribution optimization and local optimization. The rst is based onreducing the communication cost while the second can be done using centralized databaseoptimization techniques. In Epstein et al. (1978), a set of algorithms for processing databasecommands that involve data from multiple mobile nodes in a distributed database environmentwere proposed. These algorithms use mechanisms to process queries by breaking the qualicationinto separate pieces using simple heuristics. The considered cost criteria are response time andcommunications trafc. Finally, the work in Kossmann (2000) presents a textbook architecture forquery processing and discusses a series of specic techniques for distributed database systems. Itdescribes several methods for shipping data from one site to others, implementing joins, andselection of sites for executing the elements of the queries. The query processor receives an SQLquery, translates and optimizes this query in several phases into an executable query plan, andexecutes the plan in order to obtain the results. For query optimization, a plan enumeration wasdeveloped with dynamic programming to produce optimal plans given that the cost model issufciently accurate. Such an algorithm has an exponential time and space complexity and istherefore not viable for complex queries, nor is it suitable for a distributed framework involvingmobile devices.
H. Artail et al. / Journal of Network and Computer Applications 32 (2009) 9611598periodically. Nodes must communicate their scores to each other so that a DD is selected and
ARTICLE IN PRESS
H. Artail et al. / Journal of Network and Computer Applications 32 (2009) 96115 99becomes known to the rest of the network. The DD is the node possessing the maximum score, andpreferably not a DBN (for load balancing). Hence, a node, upon joining the network, sends itscomputed scores through a message to all nodes that it discovers, and similarly, receives the scores ofothers after being discovered. The message includes, in addition to the score, a ag indicating if thenode holds a database. Having the scores of all nodes in the network, a node with the highest score,announces its role as a DD to the other devices via a different message. This prompts all DBN nodes tosend the schema of their databases to the new DD which will in turn use them for queryfragmentation and building the query plan tree (QPT).
3.3. Coping with dynamic environments
Mobile devices in a mobile ad hoc network may be highly mobile, and could go ofineunexpectedly. With Bluetooth, departed or disconnected devices can be readily detected (discovered)by other devices. However, depending on the mobility behavior, a device that goes out oftransmission range of the other devices that it used to connect to may go back in range shortly afterdisconnection. Similarly, a device that becomes unavailable (appears to have been turned off) for ashort period of time (e.g., due to using it by its user) may go back online soon afterwards. These
DBNName DBNAddress SCN FTPSCN TableName Attributes Size DBN1 (00:12:D1:9B:E5:94) 09 03 Cinemas Cinema_Name,Mall_Name 105DBN2 (00:12:D1:9E:88:15) 09 03 Movies Movie_Name,CinemaName,Timings 588DBN3 (00:12:D1:9C:46:12) 09 03 Restaurants Restaurant_Name,Mall_Name,Specia 203DBN4 (00:12:D1:9C:BB:B9) 09 03 Restaurants Restaurant_Name,Mall_Name,Specia 290DBN5 (00:12:D1:28:06:83) 09 03 Malls Mall_Name,Address 153DBN5 (00:12:D1:28:32:11) 09 03 Stores Store_Name,Mall_Name,Products 852DBN5 (00:12:D1:9B:84:12) 09 03 Restaurants Restaurant_Name,Mall_Name,Specia 87
Fig. 1. An illustrative example of the proposed system and the schema structure.
ARTICLE IN PRESS3.4. Query processing procedure
The task of performing query processing in a distributed database environment reduces todeciding on a strategy for executing each part of the query over the network in the most cost-effective way. The process consists of the following steps: (1) query fragmentation, (2) QPTgeneration, and (3) result generation and joining into one set. If there are unions and/or joinsinvolved, and after constructing the QPT, the DD will include in each query fragment sent to aparticipating DBN the addresses and channel information of all participating DBN nodes along withags that specify the type of joint operation (i.e., union or join). This will imply to those DBNs thatthey must communicate to each other the sizes of their intermediate results in order to identify theone holding the largest dataset. That DBN will then be the receiver of intermediate results and thenode to actually perform the union or the join.
The decomposition of the user query into subqueries on distributed relations is accomplishedusing the information in the schema of the DD that describes the distribution of the relations amongthe DBNs. The DD uses an algorithm that recursively breaks up the query into smaller pieces: a queryis rst decomposed into a sequence of subqueries each having a unique relation (monorelation) andone multirelation query. Each monorelation (Ri) and the address of the DBN that will execute it willbe added to a list of subqueries. Next the multirelation query (R0) is divided into a set of subquerieseach having only one join condition. Once the decomposition process is completed, a QPT is built thatspecies the hierarchy of subquery execution. In a QPT, the monorelations occupy the leaf nodes ofthe tree, as illustrated in the example of Fig. 2. The procedure starts by processing all monorelations,followed by performing the union operations between similar tables, and nally the joins. Asillustrated in the gure, nodes that are involved in binary operations (unions or joins) will determineat runtime (i.e., during the execution of the distributed query) the senders and receivers ofintermediate results, with the overall objective of minimizing the amount of generated networkscenarios suggest that the discovering device(s) wait a period t before an action is taken, asillustrated below.
If a DD goes ofine, it gets detected shortly afterwards by all devices that had it in theirtransmission ranges. If a period t passes without this device (i.e., DD) coming back into range, areplacement DD will announce its new role to the rest of the devices in a manner similar to how thedisconnected DD was chosen. Upon learning about it, the DBNs will send it their schemas. In terms ofimpact on performance, the loss of a DD will possibly cause a delay experienced by the RN.Specically, if a DD disconnects before it completely dispatches the query fragments to theparticipating DBNs, and does not return within the allowable time t, the overall query will be lost,and the RN will have to resend the query to the new DD after it times out. On the other hand, if theDD disconnects after the generation of the QPT and sending the queries to the participating DBNnodes, the remainder of the query processing will not be affected, and as a result, the RN will notexperience any resultant delay.
In case a DBN goes ofine for longer than a period t, the DD will simply remove all entries in itsschema table that reference the departed DBN node. This implies that if there is a query in progressthat involves this DBN, it will simply be answered by the remaining participating DBN nodes, if any.For this to work effectively, the DD has to notify the rest of the participating DBNs and send them anupdated QPT. This will serve to inform a DBN that was supposed to receive intermediate results fromthe disconnected DBN not to wait any further, and to proceed to the next step (i.e., send its output toanother DBN in order to perform a join or a union, or to the RN). The DD notication will also informa DBN that was scheduled to send its output to the disconnected DBN to send it to the designatednode as specied in the updated QPT. It is noted that in all situations a completeness index ag will bewritten to the XML results le that will be returned to the RN to inform the user if the receivedresults are not 100% complete. This ag will have a value of 1 if the results are complete or 0 if theyare incomplete (i.e., because of the departure of one or more DBN nodes).
H. Artail et al. / Journal of Network and Computer Applications 32 (2009) 96115100trafc.
ARTICLE IN PRESS4. Implementation
The developed application employs Bluetooth for communication. As part of BT, the OBEXprotocol is used for transferring data objects such as XML les between devices. The application wasdeveloped using the C# language along with the .NET Compact Framework available on all newPocket PC and Smartphone device ROMs. For the database engine running on the devices, the SQLServer CE was chosen because it serves the purpose and it integrates well with the .NET CompactFramework and the Visual Studio .NET 2005 development environment.
The development environment, however, does not offer a library for BT communication support.For this purpose, Franson Bluetools (www.franson.com), implemented as a C# API and supportingthe Microsoft and Widcomm stacks, was used. Related to the choice of the programming language,C# was chosen because it is the primary language supported by the .NET Compact Framework, which
All leafnodes are
Determine at runtime:If Size(R1) < Size(R2) transfer results of DBN1 to DBN3 else transfer results of DBN3 to DBN1
DBN 1DBN 1 DBN 3
DBN 1 DBN 3
Determine at runtime:If Size(IntRes) < Size(R3) transfer IntRes to DBN6 else transfer results of DBN6 to DBN1 or DBN3
Fig. 2. Building a query plan tree. The symbol U denotes a union while J signies a join.
H. Artail et al. / Journal of Network and Computer Applications 32 (2009) 96115 101in turn represents the native development platform for Pocket PCs. Moreover, and as was justmentioned earlier, the Farnson Bluetools BT communication library is implemented using C#, whichbasically made this language the obvious choice.
4.1. Application structure
The system consists of three applications: RNApplication running on all devices since each devicecan be a RN. The DDApplication only runs on the data directory node (DD), and similarly, theDBNApplication runs on the DBNs. These applications were implemented as services (serial port type)that advertise themselves for other devices to connect to and initiate requests. Once advertised,devices can discover the services exposed and connect to themwhen needed. Only the RNApplicationhas a Graphical User Interface, while the others only run in the background. The RN service isresponsible for creating requests while the DD service is in charge of analyzing those requests andgenerating subqueries and plan trees to be sent to the DBNs. The DBN service receives subqueriesfrom the DD service, generates intermediate or nal results and sends them to another DBN or the RNservice, respectively.
4.2. Handling incoming and outgoing DBNs
The process of handling incoming DBNs in the network is described in Fig. 3. Basically, it is the DDnode that is responsible for detecting incoming and outgoing DBNs. When the DDApplication loads, it
ARTICLE IN PRESSperforms a device discovery and initiates a timer that performs the discovery process every speciednumber of minutes to handle new comers and detect departed devices. When a new device isdetected, its name and address are stored in a NetworkDevices list on the DD, which is designed fortemporarily storing newly detected devices.
After device discovery is complete, all newly detected devices that start with the name DBN areselected from the NetworkDevices list. The DD then sends a RequestDBNSchemamessage to the rstDBN entry using the stored address and waits for an acknowledgement containing the name of thele that has the DBNs schema. Once the DD receives the acknowledgement, it deletes the DBNsentry from the database list, extracts the name of the schema le, creates a DBNSchema database, ifone is not already created, and starts reading the records from the schema le on the DBN andinserting them into the DBNSchema database. This process repeats until all the DBN schemas areretrieved. After the insertion of all schema records is over, an XML le with all the available DBNschemas in the network is built. When this le is generated or modied, it gets communicated to allother nodes in the piconet on demand, as shall be described in the next subsection. This will allow auser who wants to submit a request to be presented with a GUI that reects the type of data that areavailable in the network. Finally, we note that device discovery is repeated periodically so as to detectincoming and outgoing devices.
A device may leave the network in two scenarios: (1) when the DBNApplication running on thedevice is shut down, or (2) when the device gets out of range. In the rst situation, on closing, it
Fig. 3. Process diagram for handling incoming DBNs.H. Artail et al. / Journal of Network and Computer Applications 32 (2009) 96115102connects to the DD and deletes its schema le. In the second situation, the DD detects it as lost whendevice discovery is performed. In both cases, the DD updates its DBNSchema database and informsthe remaining nodes about the existence of an updated schema le.
4.3. Data request procedure
Through the user interface, when the user taps on the Get Available Data button (Step1 in Fig.4), the RNApplication directly connects to the DD, retrieves the schema le, and generates a selectiontree with all the relation names and their respective attributes. From this tree, the user can eitherselect the relations to be retrieved fully, or can select specic attributes.
Next the user species the predicates by choosing the attributes (Step 2) and their correspondingvalues (Step 3). The user interface controls are dynamically generated based on the selections madein Step 2. Finally, the user taps the Next button on the Values tab. This prompts the DD to processthe query generated by the RNApplication from the user interface, and the DBN nodes to execute thequery fragments before one of them returns the results to the RN (RNApplication), which in turndisplays them in a grid (Step 4). The joins are implicitly determined from common attribute names(and types) among relations that have selected attributes. On the other hand, unions are also
ARTICLE IN PRESS
Fig. 4. RNApplication user interface.
H. Artail et al. / Journal of Network and Computer Applications 32 (2009) 96115 103
automatically determined when there are multiple relations with identical names residing on two ormore DBNs.
4.4. Query processing by the data directory (DD) and request handling by DBNs
The request submitted by the RNApplication takes the following form:
where Query is a header that informs the DD that the incoming message is a query request, QueryIDis a unique id that bears the devices ID plus a monotonically increasing sequence number, RNNameis the name of the RN device, RNAddress is the RN address that references an entity within aBluetooth network (e.g., 00:12:D1:28:06:83), RNSCN is the RNApplication service channel number ofthe service advertised by the RN, RNFTPSCN is the RNApplication FTP service channel number (usedby the DBNs to connect to the FTP service on the RN and send the results le of the query), Relationsare the selected relations that the user wants to query, Attributes Values represent the conditionsand the corresponding specied values (including any AND or OR operators), and nally Selections,are the attributes that should be retrieved by the query, separated by commas.
A DD that receives the request will tokenize it into its elements and then send aQUERYRECEIVED message to the RN. The DD, next, checks for the number of relations in the
ARTICLE IN PRESS
H. Artail et al. / Journal of Network and Computer Applications 32 (2009) 96115104Send to participating DBNs
involved DBNsBuild corresponding SQL queries
relations on 2 DBNs
relations on 2 DBNsquery. If this number is one, two cases are considered: single relation single DBN, and single relationmultiple DBNs. On the other hand, if more than one relation was found in the query, then three othercases are considered, as will be discussed below. Fig. 5 summarizes the processing occurring at theDD node, while Fig. 6 depicts the actions taken by a DBN node participating in executing andanswering the users query. The text that follows will detail the processing occurring at the DD andparticipating DBN nodes for each of the above cases, and will describe the communications as well asthe messages exchanged between them.
Tokenize request from
Schemas of DBNs
Number of distinct relations?
One Two or more
Build corresponding SQL query
Send to concerned DBN
Types of joins (and
Joins on a single DBN
Joins and unions between
Request from RN node
DBNFig. 5. Processing at the data directory (DD) node.
In the case of a single relation multiple DBNs, the same query will be sent to all the participating
ARTICLE IN PRESS
results into size info with other No results in XML to
Perform unions from DD Send final results in
H. Artail et al. / Journal of Network and Computer Applications 32 (2009) 96115 105DBNs. The selection operations will occur in parallel, but the union occurs after the intermediateresults have been generated and the DBN with the smaller intermediate dataset has sent its data (asXML) to the other DBN, which will in turn perform the union with its own results before sending thecombined results to the RN. The query request in this case appends to the previous format4.4.1. Single distinct relation
In the single relation single DBN case, the DD generates the SQL query statement from thesubstrings obtained after tokenizing the RNs query, gets the address of the DBN from the DBNSchemadatabase, connects to it, and sends it the query. The format of this query is:
The QueryType eld is used to determine whether the query is answerable from one DBN, from theunion of multiple DBN relations, containing a join, or containing a join and a union. In the case of asingle relationsingle DBN, once the DBN receives the request it processes SqlStatement, retrieves theresults in a DataSet, stores them in an XML le, and then sends it to the RN using the RNAddress,RNSCN and RNFTPSCN found in the request. If no records are found, the DBN sends to the RN anEMPTY message informing it the query cannot be answered.
and /or joins node XML format to RNnode
Fig. 6. Processing at the database (DBN) node, accounting for all possible scenarios.dataset is largest?
a dataset participating DBNs
To/From other participating DBNs
participating DBN with largest size
To DBN with largestresults size
Receive intermediate results from other participating DBNs
part of a larger query?
NoexnowhdaexLocalRetrieve Exchange dataset Send intermediate Query ormation about all the participating DBNs (Address, SCN, FTPSCN) so they know about each other.is is needed in order for all such DBNs to communicate to each other the sizes of their intermediatesults, and for the DBNs that hold the smaller results to send their output via FTP to the DBN withe largest intermediate results. The format of the query is:
te that the QueryType in the union case has the value of 1, and that the above query implies theistence of n participating databases (nX2). In our implementation, the participating DBNs that dot hold the largest intermediate results will send their output to the one with the largest results,ich was found to generate the least amount of trafc. The DBN holding the largest intermediatetaset will perform the unions and then send the combined results to the RN. The size updateschanged between DBN nodes have the following format:
fragments sent from the DD to the DBNs have the following format:
but get deleted after the join is performed.
ARTICLE IN PRESSH. Artail et al. / Journal of Network and Computer Applications 32 (2009) 96115106Query-QueryID-RNName-RNAddress-RNSCN-RNFTPSCN-QueryStatement-DBN1Address-DBN1SCN-DBN1FTPSCN-QueryType(i)-DBN2Address-DBN2SCN-DBN2FTPSCN-QueryType(i)-y -y-y- DBNnAddress-DBNnSCN-DBNnFTPSCN- QueryType(i)The third case to be considered is a combination of unions and a join. This query is aheterogeneous one comprising two query types, with the relation that exists on multiple DBN nodesassigned a QueryType 1 (union) while the other that will be associated with a join, has aQueryType 2.The process of executing this query simply builds on the previous cases while keeping in mind thatthe union occurs rst and then the join occurs at the DBN that carried out the union (call it DBNu), orat the DBN that holds the different relation (call it DBNd). This decision can be made withoutadditional size updates since DBNu can compare the size of the combined results that it holds to thatof the intermediate results reported by DBNd. Similarly, DBNd, which knows that it is DBNu thatexecuted the union since it held the largest individual intermediate dataset, can compare the size ofits output to the sum of the reported sizes by the other participating DBNs. Lastly, given that thisquery scenario is hybrid, the DD species the type of the joint operation at the DBN level so that aDBN knows if it is engaging in a union or a join. The format of the query fragment in this case is asfollows (the value of i is either 1 or 2).Query-QueryID-RNName-RNAddress-RNSCN-RNFTPSCN-QueryType(2)- QueryStatement-DBN1Address-DBN1SCN-DBN1FTPSCN-y-y-y- DBNnAddress-DBNnSCN-DBNnFTPSCN
The DBN that generates the results of a query knows that it should send the XML results le toanother DBN if the size of its intermediate results is not the maximum. Note that for a join to occur,the data from the received XML le are temporarily imported to the database of the receiving node,uery-QueryID-SourceDBNAddress-DBNResultsFileName
4.4.2. Multiple distinct relations
In situations in which the number of distinct relations is two or more, three cases are considered:Join between relations on a single DBN, Join between relations on separate DBNs, and a combinationof unions and a join. The rst thing that the DD performs once it gets two distinct relations is to checkfor common join attributes between them using common attribute names. If no join attributes arefound to link the relations, the DD connects to the RN and sends it an EMPTY message informing itthat its query cannot be answered.
In the case where all relations coexist on the same DBN, the DD constructs a single join query andtreats it as a QueryType 0 request with the following format:
The second case of multiple distinct relations is when the query species distinct relations on atleast two different DBNs. This involves sending intermediate results between DBNs via QueryType 2,where afterwards, intermediate results will be joined at one of the participating DBNs. The queryOnce a DBN determines that it has to forward the output of its query to another DBN, it sends thatDBN a notication message to inform it about the XML le name that it should expect. Such amessage has the following format:
An extension to the third case above is where there are two or more joins and possiblyone or more unions. This case can be thought of as a generalization of the previous cases. Theunions, if they exist, occur rst, followed by the joins that are performed sequentially.For intermediate results handling, a similar approach to the third case is taken in that the DBNholding the largest intermediate dataset among those that have the distinct relations will performthe joins.
For added illustration, Fig. 7 shows an example depicting the query processing procedure.In Fig. 7, DBN1 and DBN2 have different instances of the same relation while DBN3 has a totally
different relation. Given this and that all three DBN nodes know about each other (i.e., with respect tocollectively answering this particular query), they exchange the sizes of their intermediate results,and accordingly perform the union and join at the shown nodes.
ARTICLE IN PRESS
DBN Node 1
User wants to submit query
Request schema xml file
User makes selections and sets conditions
Fragment query and generate query plan
RN Node DD Node DBN Node 2
DBN Node 3
Generate selection results
Exchange result sizes
Query fragment Query fragment
H. Artail et al. / Journal of Network and Computer Applications 32 (2009) 96115 107Query type 1
Generate union results
Query type 2
Generate join results
Display resultsFinal resultsFig. 7. Sequence diagram of a scenario with a union and a join involving three DBN nodes.
5. Experimental results
The applications described above were implemented on seven HP iPAQ hx2790 Pocket PCs(see Fig. 8), running Windows CE 5.0. Each application has its own setup procedure whichwill be elaborated in the coming section. After describing the system setup, we present and discussthe results obtained through implementing various scenarios and running sample queries.
5.1. System setup
The RNApplicationwas installed on one PDA, and so was the DDApplication. On the other hand theDBNApplication was installed on the remaining ve PDAs along with associated SQL Server CEdatabases. The RNApplication was congured with four variables that are device-specic. They storethe name of the device, its address, its Service Channel Number (SCN), and its FTPSCN. As was seenabove, they are used by other devices that want to connect to the RN and exchange messages andles with it. A similar setup is done for the DD after installing the DDApplication. For setting up theve DBNs, the DBNApplication on each device was identied with a single variable that indicatesthe device name, which in turn is used to name les that are sent over the network, and refers to thename of the database congured on that device.
The SQL Server CE database on each DBN supplies the needed functions to build relationaldatabases and answer SQL queries. To create the databases, a small application that executes a
ARTICLE IN PRESSH. Artail et al. / Journal of Network and Computer Applications 32 (2009) 96115108series of SQL statements was written and installed on the DBNs. Fig. 9 shows an instanceof the databases in the network, where as implied, simple joins on separate DBNs can occurbetween Movies and Cinemas, and between Cinemas, Restaurants, and Stores on one hand, and Mallson the other hand. Additionally, a union can occur between separate DBNs having the Restaurantsrelation.
The same application also creates a DBN schema of the relation(s) in the form of a DBNSchema.sdf database, and then imports its records into an XML le ready to be shipped to a DDwhen requested. With the design as shown in the table at the bottom of Fig. 1, the attributesFig. 8. A view of the Pocket PCs on which the experiments were performed.
ARTICLE IN PRESSCinemas Cinema_Name Mall_Name
MoviesMovie_Name Cinema_Name Timings Pirates of the Caribbean Concorde 2pm,4pm,6pm,9pm,11pmShrek3 Concorde 2pm,4pm,6pm,9pm,11pmLord of the Rings Concorde 2pm,4pm,6pm,9pm,11pmTroy Concorde 2pm,4pm,6pm,9pm,11pmSimba Etoile 3pm,5pm,7pm,9pm,11pmPirates of the Caribbean Etoile 3pm,5pm,7pm,9pm,11pmTroy Dune 2pm,4pm,6pm,9pm,11pmShrek3 Dune 2pm,4pm,6pm,9pm,11pmPirates of the Caribbean Dune 2pm 4pm 6pm 9pm 11pm
MallsMall_Name Address Verdun Kantari, Beirut Jounieh North district hill, Jounieh Zahle Downtown city center, Zahle Zouk 1200 main highway, Dbayyeh Hamra Ras Beirut area, Beirut
RestaurantsRestauarnt_Name Mall_Name Specialty Phone Mcdonalds Verdun Burgers 09876234 Dunkin Donuts Vardun Donuts 06728393 Burger King Verdun Burgers 05888333 KFC Verdun Chiken 01838373 Starbucks Hamra Coffe 04999333 Burger King Hamra Burgers 01333333 Dunkin Donuts Hamra Donuts 01234567
RestaurantsRestauarnt_Name Mall_Name Specialty Phone Burger King Jounieh Burgers 01333333
StoresStore_Name Mall_Name Products Batta Verdun Shoes Batta Jounieh Shoes Red Shoes Hamra Shoes Takbir Hamra Womens clothes In Style Vardun Mens clothes In Style Zahle Mens clothes Al Matbakh Zouk Kitchen ware
H. Artail et al. / Journal of Network and Computer Applications 32 (2009) 96115 109of the schemas at the DD are: DBNName, DBNAddress, SCN, FTPSCN, TableName, Attributes, and Size(number of bytes).
5.2. Query response time
The experiments involved 170 queries of various complexities (combinations of selections, unions,and joins) that are described in Table 1 and concern a varying number of DBNs (up to 5). Whenmeasured across all submitted queries, the average selection ratio, which is determined by thepredicate in the query (i.e., WHERE conditions), was very close to 0.3. In other words, the size of thedataset returned by a single selection is 30% of the size of the relation, on average.
The reported response time is the time difference between when the results are received by theRN and when the query was submitted. The left graph in Fig. 10 shows the components ofthe response time (delays) for each query type in Table 1. Generally speaking, the various delaysincrease with the increase in query complexity (a larger query type corresponds to a more complexquery). The processing component includes query fragmentation, query plan generation, and resultswrapping into and unwrapping from XML. Therefore, it correlates with the size of the data and that ofthe query itself. The middle graph depicts the correlation of the total response time (the sum of allthree components) with the number of participating DBNs. The illustrated relationship is logicalsince an increase in the number of involved DBNs indicates additional joins and/or unions, which inturn implies more processing and more trafc. Finally, the right graph shows the direct relationshipbetween the query response time and the size of the total trafc generated. This relationship isbasically linear and suggests a growth in the response time by 2 s for each additional kilobyte oftrafc.
Starbucks Jounieh Coffe 01234567 KFC Jounieh Chiken 01987654 Dunkin Donuts Jounieh Donuts 07222222 Mcdonalds Jounieh Burgers 09876234 Starbucks Zouk Coffe 06728393 Mcdonalds Zouk Burgers 05888333 KFC Chik 01838373
Khoury Home Zouk AppliancesRestaurantsRestauarnt_Name Mall_Name Specialty Phone Mcdonalds Zahle Burgers 0987623Dunkin Donuts Zahle Donuts 0672839Burger King Zahle Burgers 0588833
Fig. 9. An instance of the databases on the ve DBN nodes (DBN 5 has three relations).
ARTICLE IN PRESSTable 1Types of queries used in the evaluation of system performance
Query type # Queries # DBNs No. of operations Involved relations
Selections Unions Joins Malls Stores Cinemas Movies Restaurants
1 3 1 1 0 0 p1 3 1 1 0 0 p
H. Artail et al. / Journal of Network and Computer Applications 32 (2009) 961151105.3. Generated trafc
The size of trafc accounts for the exchanged messages between the devices as well as the resultles (intermediate and nal), and was calculated from the number of incoming and outgoingmessages on each node. From this, the number of bytes sent and received by the applications wascomputed. Given that the applications use the Asynchronous Connectionless Link (ACL) of Bluetoothfor data communication, and that the data packets Bluetooth uses for ACL include, in addition to thepayload, 142 bits of encoding information (access code, header, and CRC), we computed the totaltrafc by multiplying the 142 bits by the number of packets and added the results to the
1 4 1 1 0 0 p1 10 1 1 0 0 p2 12 2 2 0 1 p p2 4 2 2 0 1 p p2 8 2 2 0 1 p p3 4 3 3 2 0 p4 10 3 3 0 2 p p p4 12 3 3 0 2 p p p5 6 3 4 2 1 p p5 7 3 4 2 1 p p6 9 3 4 0 3 p p p p7 10 3 5 2 2 p p p8 5 4 4 2 1 p p9 6 4 5 2 2 p p p9 11 4 5 2 2 p p p10 13 4 6 2 3 p p p p11 7 5 5 2 2 p p p12 8 5 6 2 3 p p p p12 12 5 6 2 3 p p p p13 6 5 7 2 4 p p p p p
1Number of participating DBNs
Total response time
Generated total traffic (KB)
3 5 7 9 11 13 2 3 4 5 1 2 3 4 5 6 7 8
Fig. 10. Query response time: individual elements (left) and total time (middle and right).
left two graphs of Fig. 11. It includes the query sent from the RN to the DD, the query fragments sentby the DD to the DBNs, the intermediate results sent between the DBNs, and the nal results
ARTICLE IN PRESStransmitted to the RN. The right graph depicts the overhead ratio, which is computed by dividing thetotal size of the query fragments and intermediate results by the size of the total trafc sent over thewireless network. As seen, this ratio is close to 50% when the number of participating databases istwo or more.
The graphs of Fig. 11 show that even for small relations (as indicated in the schema of Fig. 9), thegenerated trafc can be relatively signicant. In this regard, the type and size of data that we chose totest the systemwith may be typical for general information like the ones we used (i.e., data that mayconcern the general public). However, specialized mobile applications may entail much largerrelations and therefore larger trafc to be put on the radio link. In all, the relationship between thedelay and trafc volume, as depicted in Fig. 10 (right), still holds.
5.4. Power measurementapplications-generated bytes. The average total trafc generated for every query type is plotted in the
RN DD DBN
1Number of participating
RN DD DBN
Number of participatingDBNs
3 5 7 9 11 13 2 3 4 5 1 2 3 4 5
Fig. 11. Average trafc transmitted and received by each node type.
H. Artail et al. / Journal of Network and Computer Applications 32 (2009) 96115 111The power measurement setup is illustrated in Fig. 12 and consists of an Agilent DSO3062Aoscilloscope connected to a PC that runs a data acquisition software for capturing instantaneousbattery voltage values at a rate of 200 samples per second. To measure the voltage, the batteryhad to be connected to the PDA externally. After examining the external wires, it turned out that,of the seven pins that connect the battery to the device, only two have non-negligible currentowing and both are used to provide power to the processor, the memory interface, as well asto the Bluetooth and WLAN communication interfaces. A 0.375O resistance was placed inseries between the PDA and the battery for each of these two pins, thus allowing for measuringthe voltage drop across the two resistors. The oscilloscope computes the sum of the two voltagesand feeds the data into the PC, which in turn computes the current by dividing the measuredvoltage by the resistance, and then the power by multiplying this current by the batterys voltage(3.7V).
In this study, the idle power, the processing power, and the communication power of theBluetooth interface were isolated. Additionally, the power consumed by transmission and thatconsumed by reception were also differentiated. To obtain the average power value, theinstantaneous powers were averaged over the measurement duration and then the idle power(discussed next) was subtracted. Getting the actual energy spent in Joules was simply a matter ofmultiplying the average power by the process duration in seconds.
ARTICLE IN PRESS
H. Artail et al. / Journal of Network and Computer Applications 32 (2009) 961151125.4.1. Idle power
PDAs consume power even when no processing seems to be involved. This power is used torefresh the 32MB of RAM and the LCD screen, and process some kernel events. We measured thefollowing values with the PDA in idle mode: 463mW without the Bluetooth (and WLAN)communication interface ON, and 507mW with Bluetooth ON. Hence, the power increased by44mW due to turning the Bluetooth interface ON. It should be mentioned here that all themeasurements were taken with the screen backlight off (it automatically goes off after 30 s ofinactivity). The left graph of Fig. 13 shows the captured waveforms which yielded the average powervalue. The graph illustrates the random pattern of the idle power, where it rises for few hundredmilliseconds every once in a while, in a random way.
5.4.2. Communication power
Power consumption due to Bluetooth communication can be divided into three parts: powerconsumed during peer discovery, power consumed by the communication interface during reception,and that during transmission. After being turned on, a device tries to discover other Bluetoothdevices in the vicinity. This process lasts between 5 and 12 s, depending on the number of peersdiscovered. In this phase, the device broadcasts packets at the maximal transmission power, whichwas measured to be 1083mW (1590507), where 1590 was the measured average power.
Transmission power over any wireless interface includes the baseband processing, the actual RFtransmission, and some connectivity-related processing. During the measurement process, thepower consumed due to transferring the query from the RN to the DD was identied with the aid ofthe log le that noted the exact time of the occurrence of this event. The same was also done withregard to the power consumed by transmitting the query fragments, and transferring the XML lescontaining the intermediate results and the nal results. The peaks that showed during themeasurements (e.g., middle graph of Fig. 13) and corresponded to these events had nearly the sameamplitude for a given setting, but varied from a setting to another according to the distance and the
Fig. 12. An illustration of the power measurement setup.
ARTICLE IN PRESSnumber of obstacles between the devices. The latter condition can be attributed to associatedpathloss, forcing the sender to accordingly increase its power when needed to overcome theattenuation. The middle graph of Fig. 13 depicts the instantaneous power with a good coverage at a60 cm separation, where the average power is equal to 523mW (1030507). It should be mentionedthat an estimate of the bit rate employed by Bluetooth was obtained by dividing the total number ofdata bytes transferred by the total duration. More specically, rst, for each query submitted, thetotal number of bytes corresponding to all XML les transferred (intermediate and nal) was dividedby the sum of le transfer durations (obtained from the log les). Next, an average was taken over allqueries to yield 294,764bps. By referring to the Bluetooth specication (Bluetooth Specications,2003), we nd that a device transmitting data using DM3 packets (occupy three transmission timeslots) over an asymmetric connection-oriented link (ACL) can reach a bit rate of up to 387kbps inasymmetric mode, while the lower rate DM1 packets allow for up to 108kbps. This suggests that thePDAs were sending data using DM3 packets. The reception power was computed in the same mannerwhile receiving the same les that were transmitted. The average power was found to be slightlyhigher: 613mW.
Power measurements were done for 80 queries that were randomly and uniformly selected fromthe list of 170 possible queries that belong to the 13 types listed in Table 1. The measured values wereaveraged across all queries of the same type for each type of activity (i.e., query fragmentation,database processing, and XML serialization and deserialization). A snapshot of the power measured
50 100 150 200 250 3000 50 100 150 200 2500 50 100 150 200 250 300
Fig. 13. Sample power measurements: idle (left), transmission (middle) and processing (right).
H. Artail et al. / Journal of Network and Computer Applications 32 (2009) 96115 113during query fragmentation at the DD is seen in the right graph of Fig. 13.The most power consuming activity was database processing (selections, joins, and unions),
which includes the power used up by the SQL CE database engine itself. It was found to be 323mW.The average power spent by fragmenting the queries and building a query plan was measured at151mW, while XML serialization (wrapping the database dataset into an XML le) anddeserialization (extracting the results from the XML le) both measured at 155mW.
Fig. 14 shows the consumed energy in milli-Joules for each type of activity and for every nodetype. Although the transmission and reception powers are higher than all types of processing power,the durations of data transfer are very short. That is, a DM3 Bluetooth packet can carry 121bytes ofpayload and only occupies 1626 ms of the radio link time. This explains the relatively small amount ofenergy spent during transmission and reception of queries, query fragments, intermediate results,and nal results. As shown in the right graph, database operations are the most expensive (power-wise), which is attributed to both the higher average power level during such operations and to thelonger durations that such operations take.
The average energy consumption at each node across all 80 queries is depicted in the left graph ofFig. 15. The DBN nodes spend the largest amount of energy per query, followed by the RN, which hasto parse the XML results and display them to the user. It is important to note that the reported
ARTICLE IN PRESS
Query type3 5 7 9 11 13 1 3 5 7 9 11 13 1 3 5 7 9 11 13
Fig. 14. Aggregate node energy (in milli-Joules) consumption for each query type.
TX/RXXML ProcQuery Proc
(hr) RN DD DBNRN Energy Consumption
DD Energy Consumption
DBN Energy Consumption
CommunicationsResults to/from XMLDatabase operations
H. Artail et al. / Journal of Network and Computer Applications 32 (2009) 96115114average DBN energy is taken over all the ve DBNs in the piconet, and not just the participating onesin the different queries.
Finally, we computed the total energy for every node type and divided by it the capacity of the HPiPAQ hx2790 Pocket PC, which is specied at 1440mAh. Moreover, we varied the request ratebetween one request every 5min and seven requests per minute, and computed the associatedexpected battery life in hours. The results are illustrated in the right graph of Fig. 15, where it is seenthat the request rate impacts the DBN nodes the most. These results convey an important point: withcontinuous participation in the limited distributed database framework (i.e., in a piconet), the mobiledevice is not penalized signicantly in terms of consumed battery power as suggested by thedifference between the maximum and minimum values seen in the right graph of Fig. 15 (i.e., slightlyabove 0.5 h).
In this paper, a distributed mobile database system was proposed for implementation onBluetooth-enabled mobile devices. The architecture calls for three types of nodes that work togethertoward answering the users query. Details of the implementation on iPAQ Pocket PCs were providedand the mechanisms for processing the submitted queries and coalescing intermediate results wereelaborated. A signicant part of the paper was dedicated to describing the performance evaluation ofthe system that involved measurements of query response time, generated trafc, and consumedbattery power. The response time was in the order of seconds, with up to 18 s for a fairly complex
co DB Ops
0.2Request rate (queries/min)
DD DBN 0.5 1 2 3 4 5 6 7
Fig. 15. Average energy consumption (left) and expected battery life (h) for iPAQ hx 2790.
query that involves four joins and two unions. The consumed battery energy did not greatly reducethe battery life. It was shown that on average a continuous stream of queries at a rate of seven perminute will only reduce the battery life by a little over half an hour for a DBN. For other types ofnodes, the effect is even less.
As was described, the system is limited to eight mobile devices, which is the constraint imposedby the Bluetooth piconet. It is natural to extend the implementation to a scatternet that includesmore than just eight devices. In that case, the performance study would also examine the effect ofthe scatternet structure (i.e., number of piconets and how they are interconnected) on the delay,communication trafc, and energy consumption. Another future work involves studying the effect ofdata caching on reducing delays, trafc, and power consumption.
Altundag S, Gokturk M. A practical approach to scatternet formation and routing on Bluetooth. In: Proceedings of internationalsymposium on computer networks, 2006. p. 239.
Amin M, Bhuyan F, Rahman M. Bluetooth Scatternet Formation Protocol: a comparative performance analysis. In: Proceedingsof Asia-Pacic conference on communications, 2006. p. 15.
Basagni S, Bruno R, Mambrini G, Petrioli C. Comparative performance evaluation of scatternet formation protocols fornetworks of bluetooth devices. Wireless Networks 2004;10(2):197213.
ARTICLE IN PRESSH. Artail et al. / Journal of Network and Computer Applications 32 (2009) 96115 115Bluetooth Specications Version 1.2, vol. 2. Available online at: /http://www.bluetooth.comS; November 2003.Chan D, Roddick J. Summarisation for mobile databases. J Res Pract Inf Technol 2005;37(3):26784.Dow C, Lin P, Chen S, Lin J, Hwang S. A study of recent research trends and experimental guidelines in mobile ad-hoc networks.
In: Proceedings of 19th international conference on advanced information networking and application, vol. 1, no. 2830,2005. p. 727.
Epstein R, Stonebraker M, Wong E. Distributed query processing in a relational database system. In: Proceedings of the ACMSIGMOD international conference on management of data, 1978. p. 16980.
Kossmann D. The state of the art in distributed query processing. ACM Comput Surv 2000;32(4):42269.Kottkamp H-E, Zukunft O. Location-aware query processing in mobile database systems. In: Proceedings of 1998 ACM
symposium on applied computing, Atlanta, GA, 1998. p. 41623.Li J. Query processing in mobile ad hoc wireless networks. In: Proceedings of the fourth international conference on computer
and information technology, Wuhan, China, September 2004.Petrioli C, Basagni S, Chlamtac I. Conguring BlueStars: Multihop scatternet formation for Bluetooth networks. IEEE Trans
Comput 2003;52(6):77990.Zaruba G, Basagni S, Chlamtac I. Bluetrees-Scatternet formation to enable bluetooth-based ad hoc networks. IEEE Int Conf
A distributed mobile database implementation on Pocket PC mobile devices communicating over BluetoothIntroductionLiterature overviewSystem architectureBasic conceptsData directory electionCoping with dynamic environmentsQuery processing procedure
ImplementationApplication structureHandling incoming and outgoing DBNsData request procedureQuery processing by the data directory (DD) and request handling by DBNsSingle distinct relationMultiple distinct relations
Experimental resultsSystem setupQuery response timeGenerated trafficPower measurementIdle powerCommunication powerProcessing