# I. Introduction

In computing, a Distributed Database System (DDBS) is a database whose storage devices are not necessarily all attached to a common processing unit; instead, the database can be stored on multiple computers, located in the same physical location or scattered over networked computers [1], [8]. Distribution transparency is the fundamental principle of a DDBS: it consists in making a distributed system appear to its users like a centralized one. Distribution transparency, as well as the management of a DDBS, is ensured by a program called a Distributed Database Management System (DDBMS) [3].

The design of a DDBS requires that it reside on different sites of a computer network, but not necessarily on all of them: at least two sites must host the database, as depicted in Fig. 1. There are two distribution strategies: data fragmentation and data allocation on the one hand, and data replication on the other; a good design combines all of them [2], [3], [33]. Fragmentation consists in splitting a relation (a table of a database) into a number of sub-relations, called fragments, which can be horizontal, vertical or hybrid. Horizontal fragments are subsets of tuples (table records), vertical fragments are subsets of attributes (table columns), and hybrid fragmentation mixes the two. Allocation, in turn, is the assignment of fragments to sites in an optimal way [2]. When allocated fragments have to share data among themselves, they need the replication procedure.

This work focuses on the data replication strategy. Replication consists in duplicating and storing multiple copies, or replicas (at least two), of the same fragment, or of the entire relation in the case of a fully replicated database, on multiple different sites of a DDBS. Replication is the strategy used to ensure data exchange between fragments, or between relations in a fully replicated database [2], [3], [4], as illustrated in Fig. 2. In any case, the main problem of data replication is the synchronization of replicas. Data synchronization is nothing other than keeping replicas consistent in a Replicated Database System (RDBS) [5], i.e. ensuring the exchange of updates between replicas.

Nowadays the P2P computer network is in full emergence. In contrast with the client/server model, in a P2P system each client is itself a server, so replicating a database over a P2P network requires that all peers keep the same data copy. Likewise, advanced P2P applications require general replication capabilities with different levels of granularity and a multi-master mode [11], where each peer can transfer updates to all the others and the same replica can be updated by several peers in a replicated database environment [4], [10]. The serialization of updates and the reconciliation of data then turn out to be the particular problems of P2P replication, because these flows of updates (data) and refresh transactions conflict with one another [8], [30], [33]. For example, the account of a customer opened in a bank with multiple branches can be replicated by several branches of the same bank and must be updatable by any branch at any time, to receive a transfer, register a deposit, record a withdrawal, and so on.
Concretely, changes made by refresh transactions from different peers reach a destination site at the same time, and multiple updates of the same replicas by different peers break the reliability and consistency of the replicas [2]. This study therefore aims to introduce an effective approach to serialize refresh transactions and to reconcile replicas in case of inconsistency. To overcome one of the DDBS homogeneity constraints, namely the use of the same DBMS, the result of this design is implemented as a synchronizer-mediator for database replication with a Graphical User Interface (GUI), using the lazy decentralized-sites strategy on a P2P network. To reach this purpose, the paper is organized as follows: the first section presented the context of this research as well as the problem statement; the second section reviews related works; the third presents the methodology; the fourth describes the simulation environment used for experimentation; the fifth reports the results; and the sixth concludes the study.

# II. Related Works

This section briefly reviews research works that have already attempted to solve the two aforementioned problems.

# a) Data replication

Designing an RDBS pursues four major objectives: improving data availability, improving performance, ensuring scalability, and meeting user application requirements. These purposes can be summarized as "improving consistency and/or reliability" [2], [3]. To ensure consistency between replicas, the synchronization procedure relies on the transaction execution technique. A transaction is a collection of operations that transforms the database from one consistent state to another consistent state [6], as illustrated in Fig. 3. A transaction has a Begin Of Transaction (BOT) and an End Of Transaction (EOT); the end is handled by one of three functions: a "commit" to validate, a "rollback" to cancel, or an "abort" to interrupt the execution of the operations inside the transaction. The consistency and/or reliability of a transaction are guaranteed by four properties, Atomicity, Consistency, Isolation and Durability (ACID), which make up the "acidity" of a transaction [2], [7]. As we are dealing with data flows, our focus remains on the Structured Query Language (SQL) operators, especially the Data Manipulation Language operators found in most DDBMSs [9]: the write operators (the Insert, Update and Delete SQL commands) and the read operator (the Select SQL command). Typically, like the instructions of a procedural language, a transaction T can have the following structure:

    Begin_Of_Transaction T
        Insert operator
        Update operator
        Delete operator
        Select operator
    End_Of_Transaction T
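A minimal sketch of this BOT ... EOT structure in ADO.NET follows; the connection string and the Data_tbAccount table are placeholder assumptions, not the paper's code, but the commit/rollback pattern is exactly the EOT handling described above.

```csharp
using System.Data.SqlClient;

static class TransactionSketch
{
    public static void Run(string connectionString)
    {
        using var conn = new SqlConnection(connectionString);
        conn.Open();
        using var tx = conn.BeginTransaction();            // Begin_Of_Transaction T
        try
        {
            foreach (var sql in new[]
            {
                "INSERT INTO Data_tbAccount (Account_ID, Balance) VALUES ('P1-0001', 100)",
                "UPDATE Data_tbAccount SET Balance = 150 WHERE Account_ID = 'P1-0001'",
                "DELETE FROM Data_tbAccount WHERE Account_ID = 'P1-0001'"
            })
            {
                using var cmd = new SqlCommand(sql, conn, tx);
                cmd.ExecuteNonQuery();
            }
            tx.Commit();                                   // End_Of_Transaction: validate
        }
        catch
        {
            tx.Rollback();                                 // End_Of_Transaction: cancel/abort
            throw;
        }
    }
}
```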
To solve the aforementioned main problem of data replication, i.e. the synchronization of replicas, four replication strategies already exist, resulting from the combination of two factors: "when" and "where". The "when" factor specifies when updates are broadcast (synchronously/eagerly or asynchronously/lazily), while the "where" factor indicates where updates occur before being propagated: on a centralized site (primary copy/mono-master) or on decentralized sites (everywhere/multi-master). Combining the "where" factor with the "when" factor yields [1], [2], [3], [4], [30], [33], [34]:

A. Synchronous or Eager Replication: all replicas must be updated before the transaction commits, i.e. in real time. Here, the end user is guaranteed the most up-to-date value of an item. There are two strategies in synchronous replication:
1) Eager centralized site: this method is beneficial when reads are much more frequent than writes. It works under the principle "Read-One, Write-All (ROWA)": after transaction commitment any one of the replicas can be read, so the write process must update all replicas.
2) Eager decentralized sites: the principle is "update everywhere"; every site is allowed to propagate updates to all sites in the same transaction, at the same time, so that at the end of the transaction the updates are available on all sites.

B. Asynchronous or Lazy Replication: allows different replicas of the same object to have different values for a short period of time, i.e. in near real time; replicas are updated after a predefined interval of time. There are two strategies in asynchronous replication:
1) Lazy centralized site: one copy of the replicas is designated as the "primary copy or mono-master", so that data changes (writes) are possible only on it. These changes are periodically propagated to the secondary copies, which can only be read.
2) Lazy decentralized sites: changes can be performed "everywhere or multi-master", on each site, and are then propagated independently and sporadically to the other sites.

These replication strategies have already been implemented in most modern DDBMSs [9], and it is largely the centralized strategy that is wrapped in the replication models offered by almost all DBMSs. A problem nevertheless remains in the eager centralized-site approach: if a site is unavailable while the master site propagates updates, the transaction cannot commit. Some research therefore attempts to design an optimal algorithm that commits the update transaction on the available sites and updates the unavailable sites as soon as they become available again; hence the approach "Read-One, Write-All Available (ROWA-A)" [2], [30], [33]. In addition, one can expect a problem related to a momentary interruption of the communication link between the master site and the slave sites: it suffices for the master site to be overloaded or inaccessible for the slaves to lose access to updates [8]. Only the decentralized strategy can clear this concern. Nevertheless, eager decentralized sites experience the same problem as the eager centralized site: update transactions arising from all sites abort if they find at least one site unavailable. To overcome this, such systems should first commit transactions on the available sites only and then update unavailable sites as soon as they become available again; hence the approach "Update Everywhere Available" [17]. Some research now attempts to improve these algorithms with a distributed voting algorithm [4]: if the quorum of sites is reached, the transaction commits on them; afterwards, when writing, update that fraction of the replicas, and when reading, read enough replicas to ensure you get at least one copy of the most recent value (a minimal quorum check is sketched below). In view of the above, the lazy strategy appears appropriate for a P2P topology, especially since it allows replicas of various sites to diverge for a given moment.
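The classical quorum-intersection conditions behind "read enough replicas to get at least one most-recent copy" can be stated compactly; this is a sketch of the general idea only, since the cited distributed voting algorithm [4] is not specified at this level of detail, and the function and parameter names are assumptions.

```csharp
static class QuorumSketch
{
    // n = number of replica sites; a configuration is safe when
    // every read quorum overlaps every write quorum, and any two
    // write quorums intersect (so a most-recent copy is always seen).
    public static bool QuorumsAreSafe(int n, int readQuorum, int writeQuorum) =>
        readQuorum + writeQuorum > n
        && 2 * writeQuorum > n;

    // Example: with n = 4 peers, readQuorum = 2 and writeQuorum = 3 are safe.
}
```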
Since in a P2P network the participants (peers) are momentarily present or absent, update propagation can be applied to the present peers while the absent peers keep non-updated replicas and receive their updates when they become available again [10], [33]. Thus the lazy centralized-sites approach suits a centralized P2P topology, because updates are performed only on the central site and then forwarded to the slave sites in near real time, while the lazy decentralized-sites approach is the most appropriate for implementing replication on a decentralized P2P topology: in near real time, updates can be performed everywhere, i.e. on each peer, and then broadcast to all the others. Returning to our problem of replication over a decentralized P2P architecture, we observe that only a few DDBMSs have tried to implement the lazy decentralized strategy to formalize P2P replication, for instance SQL Server [13] and Oracle [14]. Unfortunately, the particular problems of P2P replication still exist; they are developed in the following lines:

- Transaction conflicts: several updates carried by refresh transactions from different sites reach a destination site at the same time, but they cannot be performed at the same time; reliability and consistency are then lost and there is a risk of transaction conflicts [2], [30], [33], [35]. DDBMSs must ensure that transaction execution meets a set of properties that lead to the consistency of distributed databases, conveniently summarized by ACID, since execution is always concurrent [6], [7]. Several studies have already been undertaken to solve the transaction concurrency control problem. Concurrent execution without harmonization constraints poses a number of problems, the most important of which are lost operations and incorrect reads. It is therefore necessary to enforce serializability, the property that determines a correct execution of concurrent transactions [3].
- Data conflicts: P2P replication allows changes to be performed on each peer in the topology and then forwarded to the other peers. However, as changes are performed at different peers, probable data conflicts must be pointed out when modifications are being broadcast [2], [30], [33]. In the DDBMSs that have already succeeded in implementing the lazy decentralized-sites approach for P2P replication, three types of data conflicts can be distinguished [13], [14], [20], [21]:

a) Primary key or uniqueness conflict: occurs when a record with the same primary key has been created and inserted at more than one peer in the topology; when those peers need to exchange updates, the entity-integrity criterion cannot be violated;
b) Foreign key conflict: can occur if a refresh transaction forwards updates containing a record with a foreign key column whose referenced primary key has not yet been forwarded to the destination peer; the referential-integrity criterion cannot be violated;
c) Data modification conflicts:
- Update conflict: occurs when the same record has been updated on more than one peer;
- Insertion/Update conflict: occurs when a record has been updated on one peer and the same record has been deleted and re-inserted on another peer;
- Insert/Delete conflict: occurs when a record has been deleted on one peer and the same record has been deleted and re-inserted on another peer;
- Update/Delete conflict: occurs when a record has been updated on one peer and the same record has been deleted on another peer;
- Deletion conflict: occurs when a record has been deleted on more than one peer.

It is thus necessary to establish a number of rules to guarantee conflict avoidance in the decentralized P2P replicated environment. Apart from the inconsistencies caused by transaction conflicts and data conflicts, other phenomena can make replicated data inconsistent, so that even when the transaction propagating the updates commits successfully, the data may remain inconsistent. Hence the need for an automatic data reconciler.

# b) Data reconciliation

Database reconciliation is the process of verifying data after a migration or transfer of data from a source database to a destination; its purpose is to ensure that the migration has been done accurately [22]. In this logic, globally the data is the set of tables of a given database and, at a finer level, the set of records of given tables, which can be accessed by some selection criterion. In a replicated-database environment, broadcasting updates likewise consists in migrating or transferring data changes from a primary site toward secondary sites [23]. However, during data migration errors may occur [12]; most are execution failures due to network interruptions or network overload, which end up corrupting transactions and causing data to be lost or to remain in an invalid state at the destination [8], [34]. These phenomena lead to a series of problems such as missing records, duplicate records, incorrect values, missing values, incorrectly formatted values, broken relationships between tables in case of forced redundancy, etc. [22]. Research has already sought solutions in several directions, and some algorithms are already implemented in DDBMSs and in dedicated software to reconcile data after a migration. Oracle Corporation [24] offers reconciliation tools for its DDBMSs: the Upgrade Reconciliation Toolkit compares the data on the source Oracle DB and the destination Oracle DBs after data migration and after running the parallel End Of Day (EOD) activities, mostly for different branches of a bank; it also generates a reconciliation report at the end of the process. Another tool, mysqldbcompare, specific to MySQL, compares two databases by identifying differences between database objects; changed or missing table rows are shown in standard formats such as grid or table. Going beyond data comparison, this utility also compares the object data definitions of the two databases [25]. Nevertheless, all these tools reconcile one source with one destination. The only one that can reconcile one source with multiple destinations is the Upgrade Reconciliation Toolkit for Oracle, but it is limited to Oracle DB. The tools mysqldbcompare and MySQL_Diff are likewise limited to MySQL and do not take multiple destinations into account. The LegiTest tool is more interesting because it can reconcile multi-DBMS databases, but it too handles one source and one destination; all the other tools listed in this review present the same kind of limitation. Moreover, these data reconciliation tools rely on a simple count of records to check whether the expected number of records has been migrated, presumably because field-by-field validation requires much heavier processing. For better accuracy, a data migration algorithm should nowadays provide reconciliation capabilities at the level of each datum or field, i.e. at the intersection of each row and each column (attribute by record) of each database table [12].
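A minimal sketch of such field-level reconciliation follows: tables are modeled as primaryKey -> (columnName -> value) maps and every row/column intersection is compared, instead of only counting rows. This is an illustrative assumption, not one of the tools reviewed above.

```csharp
using System;
using System.Collections.Generic;

static class FieldReconciler
{
    public static List<string> Differences(
        Dictionary<string, Dictionary<string, object?>> source,
        Dictionary<string, Dictionary<string, object?>> destination)
    {
        var issues = new List<string>();
        foreach (var (key, sourceRow) in source)
        {
            if (!destination.TryGetValue(key, out var destRow))
            {
                issues.Add($"missing record {key}");   // needs re-insertion
                continue;
            }
            foreach (var (column, value) in sourceRow)
                if (!Equals(value, destRow.GetValueOrDefault(column)))
                    issues.Add($"incorrect value at {key}.{column}"); // needs update
        }
        return issues;
    }
}
```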
To prevent data inconsistency and to maintain acidity, all instructions of the replication procedure must be wrapped in transactions [2], [7]. The instructions of a transaction are the commands or operators of the data manipulation language. When an operator of the data modification language is executed on a site, some time passes while waiting for the response. Since a transaction may have more than one operator and the factors at play are likely to vary in a P2P environment, this phenomenon should greatly influence the temporal complexity as the different factors vary. It is therefore necessary to design a prediction model of the replication and reconciliation execution time. The assumption of this study is formulated as follows: "P2P replication systems seem to experience weak performance, especially since the time to replicate and to reconcile data from a Master Peer to Slave Peers depends, if not totally then partially, on certain factors, such as the number of records in each table, the number of tables whose data has changed, the number of peers connected during the propagation of updates, and other factors (number of columns per table, column data types, etc.)". These problems deserve special attention; that is why there is reason to consider setting up "a synchronizer-mediator for lazy replicated databases over a decentralized P2P architecture". This system should be able to serialize updates performed simultaneously on different replicas of the same database and to reconcile these replicas, effectively, over a decentralized P2P network.

# III. Methodology

To ensure strong replica consistency in a distributed database, the traditional and unique gateway to avoiding discrepancies between replicas is the implementation of a synchronous (eager) refresh algorithm, typically based on the Two-Phase Commit (2PC) technique [2]. However, this solution is inapplicable in a P2P architecture because it does not guarantee delivery of the updates to all peers, as they are not all available at the same time [15]. Asynchronous (lazy) replication is more appropriate for P2P systems because it allows replicas to be updated independently and to remain divergent until a refresh transaction takes place [16]. Modifications made to the local replica by local transactions are captured, and the refresh transaction propagates them to remote replicas asynchronously, i.e. in near real time. The technique used in this work to capture modifications is the audit log.

# a) Audit-log technique

Almost all DDBMSs support this technique by running triggers attached to a specific table in order to capture data modifications. A trigger is attached to an event produced by an Insert, Update or Delete operator, so that it captures changes before or after the event has taken place in the database [5], [33]. In this work the focus is on AFTER triggers.
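The following sketch shows the kind of T-SQL such an AFTER trigger could be generated from. The data table Data_tbAccount, the audit table Audit_tbAccount and their business columns are hypothetical; only the four trailing audit columns follow the layout described in the next paragraph, and the mediator's actual generated DDL is not published at this granularity.

```csharp
static class AuditLogDdl
{
    // Illustrative DDL an audit-log mediator might emit for one data table.
    public const string CreateInsertTrigger = @"
CREATE TRIGGER trg_Account_AfterInsert
ON Data_tbAccount
AFTER INSERT
AS
BEGIN
    INSERT INTO Audit_tbAccount
        (Account_ID, Balance, Updated_Column_Name, Audit_Action, Audit_Timestamp, Sync_ID)
    SELECT i.Account_ID, i.Balance, NULL, 'Inserted', GETDATE(), 'Local-Transaction'
    FROM inserted AS i;  -- 'inserted' is SQL Server's pseudo-table of the new rows
END;";
}
```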
To achieve this, one audit table is created for each data table. The audit table is composed of the data table's primary key column, the other data table columns (apart from the primary key), the updated column name, the audit action, the timestamp and the synchronization ID. These elements are required for a record in order to compare data. Each table in the database needs three triggers, running after Insert, after Update and after Delete. The flow chart in Fig. 4 illustrates the audit-log creation. Assuming the database is homogeneous and fully replicated, as soon as the audit-log creation of each data table is completed on each peer, the DDBMS performs the following action for each SQL data modification operation:

- After each Insert operation on the data table, the "insert trigger" captures the newly added record and inserts it into the audit table, as shown in Fig. 5 (rows 1 to 5 of the Slave Peer audit table);
- After each Update operation on a column of the data table, the "update trigger" captures the concerned record, with the new data that has just been set, and inserts it into the audit table (rows 6 to 8 of the Slave Peer audit table);
- After each Delete operation, the "delete trigger" captures the deleted record and inserts it into the audit table with the audit action "Deleted".

The synchronization ID column (Sync_ID) of the audit tables does not hold the same value everywhere: in a Master Peer audit table its content is "Local-Transaction", a value automatically provided by the trigger procedure when the transaction is initiated locally by the user application, whereas for a remote transaction the synchronization procedure automatically updates this column with the sync ID provided by the Sync. Mediator-System. The synchronization procedure therefore selects only the data whose Sync_ID equals "Local-Transaction" and whose Audit_Timestamp lies in the interval from the beginning date and time to the ending date and time, and applies them to the Slave Peers according to the Audit_Action value. This technique resolves the endless-loop problem of two-way or symmetrical replication that plagued older synchronizers [5].

# b) Algorithmic method

The algorithmic method is used to design and analyse the instructions and steps of a peer-to-peer synchronizer. It takes into account the Circulating Token Ring Algorithm, the Decentralized Peer-to-Peer Replication Algorithm and the Decentralized Peer-to-Peer Data Reconciliation Algorithm.

# i. Network Topology and Algorithm

When a peer needs to broadcast its captured updates toward the other peers, it needs a token, which gives it the status of Master, i.e. the permission to forward its updates; the other peers automatically become Slaves. A fully replicated P2P database system includes p peers, and each peer has a complete copy of the database. Peers communicate with each other by exchanging messages and forwarding updates, or access peer data by performing transactions [17]. Updates are applied according to a circulating token, as depicted in Fig. 6, which determines the transaction serialization order; alternatively, the privilege can be given to updates from certain sites considered more important or privileged. Suppose a network consisting of four peers A, B, C and D, all interconnected; Fig. 6 presents the decentralized topology of a peer-to-peer token-ring network. Since we are in a P2P network where a peer p may or may not be available, a predefined order of getting or releasing the token is not needed. The optimization policy here is to give the token directly to a peer that needs it, instead of going through a list of peers whose availability at the time of the token release is uncertain. Transaction serialization is thus managed by the new circulating-token algorithms 1 and 2, for getting and releasing the token respectively (a minimal grant-policy sketch follows).
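The grant policy just described (earliest pending request wins, unless a privileged peer is waiting) can be rendered as below. The TokenRequest type and the pending collection are assumptions; the paper states the policy, not this code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

record TokenRequest(string PeerId, DateTime RequestedAt, bool Privileged);

static class TokenRingSketch
{
    public static TokenRequest? NextHolder(IEnumerable<TokenRequest> pending) =>
        pending.OrderByDescending(r => r.Privileged)  // privileged peers first
               .ThenBy(r => r.RequestedAt)            // then minimum request date/time
               .FirstOrDefault();
}
```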
When a peer p, which can be A, B, C or D, gets the token, it executes transactions according to algorithms 3, 4 and 5 for data replication and algorithm 6 for data reconciliation. Consequently, all performed transactions are accepted and none is rejected, because only the peer that holds the token can perform a transaction broadcasting its updates and reconcile the other peers' data with those updates. As soon as peer A finishes performing updates and reconciliations with peers B, C and D, it releases the token, and another peer such as B, C or D can take it, in order of the earliest token-request date and time, and do the same, unless a privileged peer requests it.

# ii. Replication Protocol and Algorithm

Assume that the database is homogeneous and fully replicated and that each peer works under a Two-Phase Locking (2PL) concurrency control technique. The model of lazy replication over a decentralized peer-to-peer architecture is presented as follows: let W(x) be a write transaction, where x is a data item replicated at peers A, B, C and D. Fig. 7 depicts how transactions update the different copies at all peers and how, after commit, the refresh transaction, wrapped in the Sync. Mediator-System, forwards the updates to all peers.

# iii. Reconciliation Protocol and Algorithm

After a large data transmission, to overcome the problem of data inconsistency due to untimely interruptions of connectivity, network overload and other technical hazards, the updates forwarded to each peer by the replication procedure must be reconciled. The model of decentralized peer-to-peer data reconciliation is presented as follows: let R(x) be a read transaction, where x is a data item replicated at peers A, B, C and D. Fig. 8 depicts how reconciliation is performed on the different copies of all peers. With the implementation of the algorithms presented above, the main goal, namely setting up a synchronizer-mediator for database replication able to serialize the propagation of updates and their reconciliation in a replicated database system over a decentralized P2P network, is achieved. Although this goal is achieved, it should be recalled that in computing the performance of an algorithm is assessed on the basis of its complexity [18]. The analysis of the theoretical complexity of this algorithm concerns time complexity more than space complexity, especially as the data only transits momentarily through a buffer on its way to the destination. Nevertheless, the practical time taken by the execution of this algorithm results from the simulation and is calculated by the statistical method.
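One observation of that practical time can be collected as sketched below; the replicate delegate stands in for the mediator's actual replication or reconciliation procedure and is an assumption, but the measurement itself is the Y_i used by the regression described next.

```csharp
using System;
using System.Diagnostics;

static class TimingSketch
{
    public static double TimeTransactionSeconds(Action replicate)
    {
        var sw = Stopwatch.StartNew();
        replicate();                     // e.g. propagate audited inserts to the slaves
        sw.Stop();
        return sw.Elapsed.TotalSeconds;  // one observation Y_i for the regression
    }
}
```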
# c) Statistical method

The performance of a system depends on a certain number of factors. We have to determine the practical time that our system takes to execute, successively, transactions of update propagation or replication (insert, update and delete) and transactions of data reconciliation. To analyse this performance, we use the linear regression test with a random sampling technique. The linear regression test is a statistical analysis method that describes the variations of an endogenous variable associated with the variations of one or more exogenous variables, i.e. the relation between an endogenous variable and one or more exogenous variables. When the study concerns an endogenous variable and a single exogenous variable, it is a simple regression; when it concerns an endogenous variable and more than one exogenous variable, it is a multiple regression [19]. This test is used not only to determine the execution time on a given sample, but also to build a linear regression model to predict the execution time, the dependent factor or endogenous variable, from other independent factors or exogenous variables, namely the number of records, the number of tables in the database and the number of Slave Peers. The following variables are selected:

- Y_i: the random variable to explain, "the time the synchronization algorithm takes to broadcast updates and to reconcile replicas for an execution i";
- X_i1: an explanatory variable, "the number of records the synchronization algorithm broadcasts from a Master Peer to Slaves and reconciles between the Master and Slaves for an execution i";
- X_i2: an explanatory variable, "the number of tables in the database whose records underwent updates that need to be broadcast and reconciled with Slaves for an execution i";
- X_i3: an explanatory variable, "the number of Slave Peers available to receive updates and to be reconciled for an execution i".

Given a sample (Y_i, X_i1, X_i2, X_i3) with i ∈ [1, n], we try to explain, as precisely as possible, the values taken by Y_i, the so-called endogenous variable, from the series of explanatory variables X_i1, X_i2, X_i3. The model, formulated in terms of random variables, takes the form:

    Y_i = b_0 + b_1 X_i1 + b_2 X_i2 + b_3 X_i3 + ε_i

where:
- i = 1, 2, ..., n;
- b_0 is the constant term;
- b_1, b_2 and b_3 are the regression coefficients to be estimated;
- ε_i is the model error, which summarizes the information missing from the linear explanation of the values of Y_i from X_i1, X_i2, X_i3 (a random variable of zero expectation in this model, covering specification problems, variables not taken into account, etc.).

The intensity of the relation between the independent variables and the dependent variable is expressed by the correlation coefficient R, the square root of R², the determination coefficient of a linear regression model. The correlation coefficient is used to determine the degree of linkage between the independent variables and the dependent variable, while the determination coefficient helps to measure the proportion of the dependent variable's variation explained by the independent variables. Two sets of hypotheses are then stated: the first tests the significance of each independent variable taken separately, the second the significance of the overall regression model. These hypotheses are verified against the results produced by a series of experiments carried out in a simulation environment, described in the following section.
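The fitted model can be used directly as a prediction function, as sketched below; the coefficients come from an estimation step that is not shown here, and the example values are the insert-replication coefficients the paper reports for Fig. 14(a) in Section V.

```csharp
static class PredictionSketch
{
    // The regression model above as a function of the three factors.
    public static double PredictSeconds(double[] b, double records, double tables, double slavePeers) =>
        b[0] + b[1] * records + b[2] * tables + b[3] * slavePeers;

    // Inverting a fitted simple-regression line y = b1*x + b0 at y = 1 s gives the
    // per-second throughput used in Section V:
    // RecordsPerSecond(-0.5595, 0.0302) = (1 + 0.5595) / 0.0302 ≈ 51.6 ≈ 52.
    public static double RecordsPerSecond(double b0, double b1) => (1 - b0) / b1;
}
```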
# IV. Simulation Environment

The implementation and experiments are run on a P2P network consisting of 4 conventional computers, depicted in Fig. 9, with the following properties: Processor: Intel Core i5, CPU 2.40 GHz; Memory (RAM): 8.00 GB; Storage: 1 TB. The network is based on a desktop switch with a transmission speed of 100 Mbps, forming a simple LAN with twisted-pair cabling and RJ45 connectors. These computers run Windows 10 Professional 64-bit, with SQL Server Management Studio 2012 Express as DDBMS, to manage the databases and establish the connectivity between them. According to Fig. 9, a node is composed of the hardware and software required above. The same figure also points out the presence of a "Mediator" on each peer. The mediator is nothing other than the synchronization system, "Sync. Mediator-System", a C# program designed for this work, in which the algorithms already described in the methodology are implemented, resulting in a Windows application with a graphical user interface, presented in the Multiple-Document Interface (MDI) window of Fig. 10. This mediator must be installed on each node to manage the replication transactions and the reconciliation of replicas. For the execution to be effective, some prerequisites must be fulfilled.

# a) Prerequisites

When designing the global schema of the database, each table must have:

- A name of the form "Data_tbTableName", with its first column as the primary key, to identify data and distinguish records. Creating primary keys with the automatic incremental mechanism provided by the DBMS is discouraged; it is preferable to generate an automatic primary key combined with the site number (e.g. a key such as "P3-000427", combining the site number and a local sequence) to avoid redundancy;
- A homogeneous structure: the data structure of the replicated database must be uniform on all peers.

Before the actual processing phase begins, "Sync. Mediator-System" provides two procedures that must be performed automatically in advance for each table, as shown in the window of Fig. 11:

- Create one audit table named "Audit_tbTableName" to store the changes captured by the 3 triggers belonging to each table. The last four columns of each audit table store, respectively, the updated column name, the audit action, the audit timestamp and the synchronization ID;
- Create three triggers, running after Insert, after Update and after Delete, to capture data changes and store them in the corresponding audit table.

# b) Processing phases

The new circulating-token algorithm has two phases.

i. Data replication

Update transaction serialization: all update transactions must be executed in serial order. Before initiating a refresh transaction, each peer must first receive a single token of a sequential series, which fixes the order in which the transaction will be executed. Once a token has been assigned to a peer p, that peer directly becomes the Master and performs update transactions toward all connected Slave peers, as shown in the window of Fig. 12.

Update transaction performing: when a Slave peer receives an executing transaction, it orders it according to its Master peer's token and number (Sync_ID, in Fig. 5), and the updates are applied to the Slave peer's database. As soon as the transaction ends on each Slave peer, the Slave sends an appropriate message to the Master peer certifying the transaction commitment. Peers that were connected when the transaction was initiated but whose transaction aborted during execution, because of any kind of issue on the site hosting the peer, must be put on a pending list (sketched below) in order to be updated later by a new procedure reusing the same Sync_ID. The main transaction initiated on the Master peer then ends once it has been executed on all peers and immediately hands over to the reconciliation procedure.
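A minimal sketch of that pending list follows; the PendingRefresh type and its members are assumptions, but the key design point is from the paper: retried entries keep their original Sync_ID, so a recovered slave serializes the late updates in the order of the originating master transaction.

```csharp
using System;
using System.Collections.Generic;

record PendingRefresh(string PeerId, string SyncId, DateTime FailedAt);

class PendingList
{
    private readonly Queue<PendingRefresh> _entries = new();

    public void Add(string peerId, string syncId) =>
        _entries.Enqueue(new PendingRefresh(peerId, syncId, DateTime.UtcNow));

    // Returns the next aborted refresh to retry, or null if none is pending.
    public PendingRefresh? TakeNext() =>
        _entries.TryDequeue(out var entry) ? entry : null;
}
```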
ii. Data reconciliation

Reconciliation transaction serialization: reconciliations benefit in turn from the serial order of their parent update transactions. This phase must begin on the Master peer once the replication is complete. The reconciliation procedure must also initiate transactions to read the updates received by the Slave peers. These readings consist of a comparison between the data sent by the Master peer and the data received by the Slave peers. The comparison is performed on the data carrying the token of the Master that initiated the replication transactions, as shown in the window of Fig. 12. All errors, such as missing records, duplicate records, incorrect values, missing values and incorrectly formatted values, are retained in order to be fixed.

Reconciliation transaction execution: this phase consists of fixing all retained errors, so that missing records are inserted, duplicate records are deleted, missing values are added to their respective fields, and incorrectly formatted values are replaced by correct values. The data reconciliation process can be restarted if the first pass did not leave the replicas in a consistent state; the procedure can be repeated until all replicas become consistent, after which the Master peer can release the token. If inconsistency persists among the data, it is probably caused by conflicts.

# c) Conflict avoidance rules

To avoid potential conflicts among data in the P2P replicated database environment, some rules must be respected:

- When using the database, it is inadvisable to update the value of a primary key; instead, it is better to delete the entire record and re-insert it;
- When designing an application that communicates with the database, create procedures that prevent a peer from updating or deleting a record whose insertion was not performed on that same peer, i.e. a data item may be modified only on the peer that created or inserted it (see the origin-check sketch below).
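The last rule can be enforced mechanically when the site-qualified primary keys recommended in the prerequisites are used; the key format in this sketch ("P3-000427") is illustrative, not a format the paper mandates.

```csharp
static class ConflictGuard
{
    // A peer may update or delete only records it inserted itself.
    public static bool MayModify(string primaryKey, string localSiteId) =>
        primaryKey.StartsWith(localSiteId + "-");

    // On peer P3: MayModify("P3-000427", "P3") is true, while
    // MayModify("P1-000112", "P3") is false, so the change is refused.
}
```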
Once the configuration has been performed as indicated in this section to simulate the replication process on a P2P network, the test and experiment sets yielded the results presented in the next section.

# V. Result

This section is dedicated to testing this new database synchronizer, to the results, and to evaluating the performance of the newly proposed algorithm. To achieve this, it is necessary to analyse the performance in order to justify the effectiveness of the algorithm.

# a) Performance analysis

Suppose this algorithm has to broadcast the updates of a database replicated over 4 peers A, B, C and D, the local servers of a bank's branches. The database being fully replicated and homogeneous, its physical schema consists of 3 tables, as presented in Fig. 13. For all cases, consider a sample of 12 executions, operated randomly and based on realistic manipulation of replicated data in the distributed environment of a banking database. In all cases insertions are greater than or equal to updates and deletes, while updates may be more or fewer than deletions. If, after the replication transaction has completed, an overload or interruption of the network has corrupted it, assume that the data received by the destination peers shows some inconsistencies with respect to that of the master peer. Of the total replicated data (inserts, updates and deletes), consider that 25% are missing records requiring re-insertion, incorrect values, missing values and incorrectly formatted values requiring updates, and duplicate records requiring deletion, since the data to be reconciled typically does not exceed a quarter of the replicated data [2], [22]. This yields the data presented in Table 1.

To analyse the effectiveness of our algorithm, the experimentation is carried out in four scenarios, namely:

1. One table stored on a master peer with two slave peers;
2. Two tables stored on a master peer with two slave peers;
3. One table stored on a master peer with three slave peers;
4. Two tables stored on a master peer with three slave peers.

To carry out the performance analysis, based on predicting the execution time from the sample presented in Table 1, the execution times obtained after experimentation are presented successively in the tables and charts below.

With all basic factors unchanged, i.e. one table stored on a master peer with two slave peers, the replication and reconciliation models are, successively: for the insert operator, Fig. 14(a) y = 0.0302x − 0.5595 + ε for data replication and Fig. 15(a) y = 0.0093x − 0.0777 + ε for data reconciliation; for the update operator, Fig. 14(b) y = 0.0318x − 2.0714 + ε for replication and Fig. 15(b) y = 0.0208x − 0.4639 + ε for reconciliation; and for the delete operator, Fig. 14(c) y = 0.0336x − 2.528 + ε for replication and Fig. 15(c) y = 0.0148x − 0.4124 + ε for reconciliation. Keeping the basic factors unchanged, in 1 second (y) we predict that this algorithm can successively replicate and reconcile the following numbers of records (x). For the insert operator, in the replication procedure (Fig. 14(a)): 1 = 0.0302x − 0.5595 ⇒ 0.0302x = 1.5595 ⇒ x = 51.63 ⇒ x ≈ 52 inserted records replicated in 1 second. As the coefficient of determination is R² = 0.9865, the insertion execution time depends on the number of records at 98.65%.

By varying the number of tables from one to two tables stored on the master peer, dividing the number of records equally between the two tables and keeping the number of slave peers unchanged at two peers, the replication and reconciliation models become, successively: for the insert operator, Fig. 16(a) y = 0.0218x − 1.3366 + ε for data replication and Fig. 17(a) y = 0.0093x − 0.0671 + ε for data reconciliation; for the update operator, Fig. 16(b) y = 0.023x − 2.0949 + ε for replication and Fig. 17(b) y = 0.0184x − 0.4798 + ε for reconciliation; and for the delete operator, Fig. 16(c) y = 0.0239x − 2.4175 + ε for replication and Fig. 17(c) y = 0.0136x − 0.1746 + ε for reconciliation. When we increase the number of tables from one to two, the prediction in 1 second is calculated in the following way. For the insert operator, in the replication procedure (Fig. 16(a)): 1 = 0.021x − 1.3366 ⇒ 0.021x = 2.3366 ⇒ x = 111.26 ⇒ x ≈ 111 inserted records replicated in 1 second. As the coefficient of determination is R² = 0.9846, the dependence of the insertion execution time on the number of records is 98.46%, and since R = √R² = √0.9846 = 0.9923, the degree of linkage between the insertion execution time and the number of records is 99.23%. The reconciliation procedure (Fig. 17(a)) is analysed in the same way.

The experimentation of this algorithm on a topology of two (2) slave peers proves that varying the number of tables containing the data to replicate and reconcile in a P2P replication system has a significant impact only on the replication transaction, as illustrated in Fig. 18. For all the data modification operators considered in the replication process, illustrated by the graphs of Fig. 18(a), 18(b) and 18(c), the execution time when the records originate from one (1) table is greater than the execution time when the same number of records comes from two (2) different tables, whereas for reconciliation the impact is not large. This variation therefore has no significant effect on the execution time of data reconciliation, because the number of records to reconcile from one table, and the average execution time calculated in Table 2, do not differ much from those for two tables, whose average execution time is calculated in Table 3; this is why the reconciliation curves depicted in Fig. 18(d) and the following panels are very close to each other. We can thus partially conclude that this algorithm is efficient for database replication because, in general, a database does not have a single table: the data to replicate are scattered across several tables.
As for reconciliation, since it takes place only when necessary and the data to be reconciled mostly do not exceed one quarter of the replicated data, little importance should be attached to the computational time of this phenomenon. This conclusion was obtained after varying the number of tables. Keeping all other factors unchanged except the number of slave peers, which varies from two (2) to three (3) peers, and using the same sample, the analysis continues. With the number of tables unchanged, one table stored on a master peer with three slave peers, the replication and reconciliation models are presented successively for the insert, update and delete operators in Fig. 19 and Fig. 20, and the numbers of records (x) that the algorithm is predicted to replicate and reconcile in 1 second (y) are derived from them as before, starting with the insert operator in the replication procedure (Fig. 19(a)). Varying the number of tables stored on the master peer with three slave peers, the replication and reconciliation models are presented successively for the three operators in Fig. 21 and Fig. 22; after increasing the number of tables from one to two, the predicted numbers of records replicated and reconciled in 1 second are established in the same way (Fig. 21(a) for insert replication).

When this algorithm runs on a topology of three (3) slave peers, the experimental results prove that varying the number of tables containing the data to replicate and reconcile in a P2P replication system has a significant impact on the execution time of both replication and reconciliation transactions, as shown in Fig. 23 (effectiveness of replication and reconciliation based on one table stored on a master peer with three slave peers vs. two tables stored on a master peer with three slave peers). However, this impact appears only in the comparison of the averages, in Tables 4 and 5, of the execution times with one table. In terms of predictive models, we found that when the records come from one table the execution time is greater than when the same number of records is split between two different tables. This phenomenon is clarified by successively solving the prediction equations of the replication and reconciliation models, which showed that the number of records replicated and reconciled in 1 second is greater with two source tables than with only one. Thus we can partially conclude that this algorithm is effective for database replication and that its performance increases with the number of tables for a given number of records; since the data to replicate are usually scattered across multiple tables, we can count on its effectiveness.

The results achieved so far come from analysing performance while varying the number of tables from which the data to be replicated and reconciled originate. Next, we analyse the performance of this algorithm when varying the number of slave peers. Fig. 24 (effectiveness of replication and reconciliation based on one table stored on a master peer with two slave peers vs. one table stored on a master peer with three slave peers) and Fig. 25 show the effectiveness when increasing the number of slave peers, with the data to replicate and reconcile coming successively from a single table and from two tables.
After increasing the number of slave peers, the execution time of the replication transaction as well as of the data reconciliation, successively from one table as illustrated in Fig. 24 and from two tables as shown in Fig. 25, increases significantly. This increase in execution time degrades performance, first in the comparison of the average execution times, and secondly when comparing the predicted values, in this case the predicted number of records replicated and reconciled in 1 second: after successively solving the equations of the prediction models for replication and data reconciliation, we found that the numbers of records replicated and reconciled decline after adding a slave peer. Based on these observations from all the cases, i.e. with the data to be replicated and reconciled coming from one or two tables, we can partially conclude that increasing the number of slave peers in a replicated database over a decentralized P2P topology causes a loss of performance of the synchronization algorithm.

# b) Result summary

In view of the results just obtained, a summary and a general conclusion are needed; Table 6 first summarizes the results. Starting from the results presented above and summarized in Table 6, our first group of hypotheses, the significance test of each independent variable, leads to the conclusion that each independent variable is a significant predictor of the dependent variable. In other words, the number of records in each table (x_i1), the number of tables whose data has changed (x_i2), the number of peers connected during the propagation of updates (x_i3) and other factors (ε), such as the number of columns per table, the column data types, etc., each taken separately, significantly predict the execution time (y) of the replication transaction as well as that of reconciliation, because almost all coefficients of determination (R²) are greater than or equal to the 95% confidence level. In all cases the execution time depends on these factors at more than 95%, and these factors correlate positively and tightly with the whole; changes made to one of these independent variables affect the dependent variable at 95% or more, and vice versa. Hence we accept the alternative hypothesis (H1) and reject the null hypothesis (H0). As for the second group of hypotheses, since for all experimental scenarios all independent variables (x_i1, x_i2, x_i3 and the other factors ε) are significant predictors of the dependent variable, the replication and reconciliation transaction execution time (y), the overall regression model is significant, at the same significance thresholds derived from the combination of factors in the experimental scenarios summarized in Table 6. The experimental results show that our algorithms perform well, since in 1 second, an elementary unit of time, they can replicate and reconcile a considerable number of records, as presented in the last column of Table 6, for the present experimental environment.
However, since the performance of a computer algorithm is judged by its execution time, we can thus affirm our main hypothesis that P2P replicated database systems experience weak performance, especially since the time to transmit updates from a Master Peer toward Slave Peers depends at more than 95% on the number of records, the number of tables whose data changed, the number of peers connected during the propagation of updates, and other factors. Nevertheless, when we take the experimental scenarios of Table 6 two by two, noted I: 1 and 2, II: 3 and 4, III: 1 and 3, and IV: 2 and 4, pair I showed a good performance gain, pair II also showed a gain though close to the average, while pairs III and IV both showed a loss of performance. Pairs III and IV correspond to varying the number of connected peers, whereas pairs I and II correspond to varying the number of tables. During the experiment it was found that varying the number of tables did not degrade performance; on the contrary, it improved it. Moreover, among the independent variables, the number of records and the number of tables are factors directly related to the database even before data replication comes into play, so it is clearly the growth of the number of connected peers that underlies the considerable loss of performance, i.e. the increase of the execution time of a synchronization algorithm for distributed databases. Thus, as future work aimed at improving the performance of the proposed algorithm, we will consider a synchronization algorithm for replicated databases over a decentralized P2P architecture with supernodes or super-peers [31], [32] belonging to peer clusters, in order to reduce transaction execution times and reach load balancing during data transmission [35].

# VI. Conclusion

This article proposes a prototype of a synchronizer-mediator for lazy replicated databases over a decentralized P2P architecture with a graphical user interface. The motivation arises from the common problem of database replication: maintaining consistent replicated databases over a decentralized P2P network. Two specific problems caught our attention. First, transactions broadcasting updates from different peers are performed concurrently on a destination peer's replica, which always causes transaction conflicts and data conflicts. Second, during data migration, connectivity interruptions and network overload corrupt transactions, so that destination peer databases can acquire duplicated records, unsuitable data or missing records, which make replicas inconsistent. Different methods have been used to solve these problems: the audit-log technique to capture and store data changes in audit tables; the algorithmic method to design and analyse the algorithms for transaction serialization, data replication transactions and replica reconciliation transactions; and finally the statistical method to analyse the performance of the algorithms and to produce prediction models of the execution time. A C# prototype was designed to implement the algorithms and to run the tests assessing the effectiveness of each experimental scenario. It was then shown that the algorithm performs well, as it can replicate and reconcile a considerable number of records per second.
Finally, the assumption according to which "the execution time of replication and reconciliation transactions depends entirely on the independent factors" has been affirmed.

Fig. 1: Architecture of Distributed Database System.
Fig. 2: Protocol of Database replication.
Fig. 3: Protocol of Transaction running.
Fig. 4: Audit-log (audit table and triggers) creation.
Fig. 5: Data table triggered and audit tables, Master and Slave.
Fig. 6: Topology of Decentralized P2P circulating Token Ring.
Algorithm 1: P2P getting the circulating token. Input: a set p of slave peers; Output: a peer (Master/Slave) owning the token. Only the opening lines survive extraction: begin getTokenFunction() 1: select all SlavePeers 2: for (p ← 0 to NumberOfSlavePeers − 1) do ...
Fig. 7: Protocol of Lazy Decentralized P2P Data Replication. Legend: 1. Modifications are performed on all replicas by users; 2. The commitment of a transaction makes the modifications stable; 3. The modifications (inserted, updated and deleted data) are independently transmitted to the other data copies or replicas. According to Fig. 7, arrows (1) and (2) deal with the user application, i.e. for each local write transaction (1) the user application must receive the commitment (or abort) message (2). Changes carried by (3) are another set of transactions, wrapped in the Sync. Mediator-System, routed from each Master Peer to the Slave Peers. Algorithm 3 below establishes the instructions in the transactions of the Insert operator function.
Fig. 8: Protocol of Decentralized P2P Data Reconciliation. Legend: 1. After the refresh transactions (deleted, updated and inserted data) commit, the reconciliation starts; 2. Processing: reading of the updates independently forwarded to the other replicas by the refresh transactions (dashed arrows (2)), comparison with the Master data read in (1) (undashed arrows), and reconciliation written to the Slave peers (undashed arrows (2)); 3. The commitment (or cancellation) of a reconciliation transaction makes the replicas consistent.
Fig. 9: Protocol of P2P Mediator-Synchronizer.
Fig. 10: Sync. Mediator-System MDI window.
Fig. 11: Audit-log creation window.
Fig. 12: Synchronization editor window.
Fig. 13: Physical schema of a Banking Database.
Fig. 14: Replication execution time: (a) insertion, (b) update and (c) delete results from the experimentation based on one table stored on a master peer with two slave peers.
Fig. 15: Reconciliation execution time: (a) missing records, (b) incorrect values, missing values and incorrectly formatted values, and (c) duplicate records, from the experimentation based on one table stored on a master peer with two slave peers.
Fig. 16: Replication execution time: (a) insertion, (b) update and (c) delete results from the experimentation based on two tables stored on a master peer with two slave peers.
Fig. 17: Reconciliation execution time: (a) missing records, (b) incorrect values, missing values and incorrectly formatted values, and (c) duplicate records, from the experimentation based on two tables stored on a master peer with two slave peers.
Algorithm 3: P2P Replication Algorithm for Data Insertion
Input: Master peer inserted records
Output: Transaction Commitments or Abortions
begin insertFunction()
1: begin insertMainTransaction
2: select all Available Slave Peers
3: for (p ← 0 to NumberOfAvailableSlavePeers - 1) do
4: begin insertSubTransactionPeer(p)
5: select all Audit Table Names in Master Peer Database
6: select all Data Table Names in Slave Peer(p) Database
7: for (ts ← 0 to NumberOfDataTableNamesInSlavePeer(p)Database - 1) do
8: select all Rows in Audit Table(ts) of Master Peer Database where AuditAction = 'Inserted' and AuditTimeStamp ≥ BeginningDateAndTime and AuditTimeStamp ≤ EndingDateAndTime
9: for (rtm ← 0 to RowsInAuditTable(ts)OfMasterPeerDatabase - 1) do
10: select all Column Names in Data Table(ts) of Slave Peer(p) Database
11: for (cts ← 0 to NumberOfColumnNamesInDataTable(ts)OfSlavePeer(p)Database - 1) do
12: ColumnNames ← ColumnNames & ColumnNames[cts]
13: Values ← Values & Row[rtm]Column[cts]
14: end for cts
15: insert into DataTableNames(ts)InSlavePeer(p)Database (ColumnNames) values (Values)
16: end for rtm
17: end for ts
20: end insertSubTransaction (Commit or Abort)
21: end for p
22: end insertMainTransaction (Commit or Abort)
23: return Transaction Commitments or Abortions
end insertFunction

After the records inserted on the master peer have been replicated to the slave peers, Algorithm 4 below, which holds the instructions in transactions of the update function, runs in turn.

Algorithm 4: P2P Replication Algorithm for Data Update
Input: Master peer updated records
Output: Transaction Commitments or Abortions
begin updateFunction()
1: begin updateMainTransaction
2: select all Available Slave Peers
3: for (p ← 0 to NumberOfAvailableSlavePeers - 1) do
4: begin updateSubTransactionPeer(p)
5: select all Audit Table Names in Master Peer Database
6: select all Data Table Names in Slave Peer(p) Database
7: for (ts ← 0 to NumberOfDataTableNamesInSlavePeer(p)Database - 1) do
8: select all Rows in Audit Table(ts) of Master Peer Database where AuditAction = 'Updated' and AuditTimeStamp ≥ BeginningDateAndTime and AuditTimeStamp ≤ EndingDateAndTime
9: for (rtm ← 0 to RowsInAuditTable(ts)OfMasterPeerDatabase - 1) do
10: select all Column Names in Data Table(ts) of Slave Peer(p) Database
11: for (cts ← 0 to NumberOfColumnNamesInDataTable(ts)OfSlavePeer(p)Database - 1) do
12: if (ColumnName(cts)InDataTable(ts)OfSlavePeer(p)Database = UpdatedColumnName) then
13: update DataTable(ts)OfSlavePeer(p)Database set ColumnName(cts)InDataTable(ts)OfSlavePeer(p)Database ← 'Row[rtm]Column[cts]' where ColumnName(0)InDataTable(ts)OfSlavePeer(p)Database = 'Row[rtm]Column[0]'
14: end if
15: end for cts
16: end for rtm
17: end for ts
20: end updateSubTransaction (Commit or Abort)
21: end for p
22: end updateMainTransaction (Commit or Abort)
23: return Transaction Commitments or Abortions
end updateFunction

Deletions are finally propagated by Algorithm 5, which follows the same skeleton.

Algorithm 5: P2P Replication Algorithm for Data Delete
Input: Master peer deleted records
Output: Transaction Commitments or Abortions
begin deleteFunction()
1: begin deleteMainTransaction
2: select all Available Slave Peers
3: for (p ← 0 to NumberOfAvailableSlavePeers - 1) do
4: begin deleteSubTransactionPeer(p)
5: select all Audit Table Names in Master Peer Database
6: select all Data Table Names in Slave Peer(p) Database
7: for (ts ← 0 to NumberOfDataTableNamesInSlavePeer(p)Database - 1) do
8: select all Rows in Audit Table(ts) of Master Peer Database where AuditAction = 'Deleted' and AuditTimeStamp ≥ BeginningDateAndTime and AuditTimeStamp ≤ EndingDateAndTime
9: for (rtm ← 0 to RowsInAuditTable(ts)OfMasterPeerDatabase - 1) do
10: select all Column Names in Data Table(ts) of Slave Peer(p) Database
11: delete from DataTable(ts)OfSlavePeer(p)Database where ColumnName(0)InDataTable(ts)OfSlavePeer(p)Database = 'Row[rtm]Column[0]'
12: end for rtm
13: end for ts
14: end deleteSubTransaction (Commit or Abort)
15: end for p
16: end deleteMainTransaction (Commit or Abort)
17: return Transaction Commitments or Abortions
end deleteFunction
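The control flow of Algorithm 3 can be condensed into a few lines of Python over the hypothetical customer schema sketched earlier. This is an illustrative reading of the algorithm, not the Sync. Mediator-System code; the peer connections and the fixed column list are simplifying assumptions.

```python
import sqlite3

def replicate_inserts(master: sqlite3.Connection,
                      slaves: list[sqlite3.Connection],
                      tables: list[str], begin_ts: str, end_ts: str) -> None:
    """Push rows audited as 'Inserted' on the master to every available slave."""
    for slave in slaves:                          # one sub-transaction per slave peer
        try:
            for table in tables:
                rows = master.execute(
                    f"SELECT id, balance FROM audit_{table} "
                    "WHERE AuditAction = 'Inserted' "
                    "AND AuditTimeStamp BETWEEN ? AND ?",
                    (begin_ts, end_ts)).fetchall()
                slave.executemany(
                    f"INSERT INTO {table} (id, balance) VALUES (?, ?)", rows)
            slave.commit()                        # commit the sub-transaction
        except sqlite3.Error:
            slave.rollback()                      # abort the sub-transaction
```

Algorithms 4 and 5 follow the same skeleton: the SELECT filters on 'Updated' or 'Deleted', and the inner statement becomes an UPDATE or a DELETE keyed on the first column.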
The whole reconciliation process is established by instructions in transactions of the reconciliation function; its main steps include:

…
6: select all Audit Table Names in Slave Peer(p) Database
7: select all Data Table Names in Slave Peer(p) Database
8: for (ts ← 0 to NumberOfDataTableNamesInSlavePeer(p)Database - 1) do
9: select all Rows in Audit Table(ts) of Master Peer Database where TransactionType = 'Local' and AuditTimeStamp ≥ BeginningDateAndTime and AuditTimeStamp ≤ EndingDateAndTime
10: select all Rows in Audit Table(ts) of Slave Peer(p) Database where TransactionType = …
…
// Reconcile duplicated records process start
32: startSlaveLoop ← 0
33: for (rtm ← 0 to NumberOfRowsInAuditTable(ts)OfMasterPeerDatabase - 1) do
34: for (rts ← startSlaveLoop to NumberOfRowsInAuditTable(ts)OfSlavePeer(p)Database - 1) do
35: if (Row[rtm]Column[0]InAuditTable(ts)OfMasterPeerDatabase = Row[rts]Column[0]InAuditTable(ts)OfSlavePeer(p)Database) then
36: if (rtm < NumberOfRowsInAuditTable(ts)OfMasterPeerDatabase - 1) then
…
40: startSlaveLoop ← rts + 1
41: …
42: else // Call the function to delete duplicated records (Algorithm 6):
43: deleteDuplicatedRecordFunction(arguments)
44: end if
end if

To update incorrect values, Algorithm 9 is called.

Algorithm 9:
Input: DataTableName(ts)OfSlavePeer(p)Database, UpdatedColumnName, cts, rtm
Output: Nothing
begin updateIncorrectValuesFunction(args)
1: if (ColumnName(cts)InDataTable(ts)OfSlavePeer(p)Database = UpdatedColumnName) then
2: update DataTable(ts)OfSlavePeer(p)Database set ColumnName(cts)InDataTable(ts)OfSlavePeer(p)Database ← 'Row[rtm]Column[cts]' where ColumnName(0)InDataTable(ts)OfSlavePeer(p)Database = 'Row[rtm]Column[0]'
…
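Reading the recovered steps 32-44 as a merge-style scan, a minimal Python rendering could look as follows. The two handler callables stand in for Algorithms 6 and 9; since steps 36-41 are incomplete in the source, the value comparison that triggers update_incorrect_values is our assumption, not the recovered logic.

```python
def reconcile(master_rows: list[tuple], slave_rows: list[tuple],
              update_incorrect_values, delete_duplicated_record) -> None:
    """Merge-compare master and slave audit rows on the key column [0]."""
    start_slave_loop = 0                                  # step 32
    for m_row in master_rows:                             # step 33
        for rts in range(start_slave_loop, len(slave_rows)):  # step 34
            s_row = slave_rows[rts]
            if m_row[0] == s_row[0]:                      # step 35: keys match
                if m_row != s_row:                        # values diverge (assumed)
                    update_incorrect_values(m_row)        # Algorithm 9
                start_slave_loop = rts + 1                # step 40: resume after match
                break
            delete_duplicated_record(s_row)               # steps 42-43: Algorithm 6

# Toy usage: one master row, a stale duplicate on the slave.
reconcile([(1, 100.0)], [(2, 90.0), (1, 90.0)],
          update_incorrect_values=print, delete_duplicated_record=print)
```

Advancing start_slave_loop past each matched key avoids re-scanning slave audit rows that have already been reconciled.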
The observations recorded during the experimentation are summarized in the following tables.

Table 1: Number of rows to replicate and to reconcile per observation.

| Obs. | Number of rows to replicate | Number of rows to reconcile |
|---|---|---|
| 1 | 723 | 181 |
| 2 | 900 | 225 |
| 3 | 120 | 30 |
| 4 | 2500 | 625 |
| 5 | 1253 | 313 |
| 6 | 80 | 20 |
| 7 | 3000 | 750 |
| 8 | 5000 | 1250 |
| 9 | 451 | 113 |
| 10 | 4860 | 1215 |
| 11 | 600 | 150 |
| 12 | 235 | 59 |
| Mean | 1643.42 | 410.92 |
| Total | 19721 | 4931 |

Table 2: Replication and reconciliation execution times (in seconds) per operator, scenario 1 (one table, one master peer, two slave peers).

| Obs. | Master Peer | Insert: Repl. | Insert: Recon. | Update: Repl. | Update: Recon. | Delete: Repl. | Delete: Recon. |
|---|---|---|---|---|---|---|---|
| 1 | B | 19 | 2 | 19 | 3 | 20 | 2 |
| 2 | A | 24 | 2 | 24 | 4 | 24 | 2 |
| 3 | C | 3 | 0 | 3 | 1 | 4 | 0 |
| 4 | C | 67 | 5 | 68 | 12 | 69 | 8 |
| 5 | A | 35 | 3 | 35 | 5 | 36 | 4 |
| … | | | | | | | |

Table 3: Replication and reconciliation execution times (in seconds) per operator, scenario 2 (two tables, one master peer, two slave peers).

| Obs. | Master Peer | Insert: Repl. | Insert: Recon. | Update: Repl. | Update: Recon. | Delete: Repl. | Delete: Recon. |
|---|---|---|---|---|---|---|---|
| 1 | B | 12 | 2 | 12 | 3 | 11 | 2 |
| 2 | A | 15 | 2 | 15 | 4 | 16 | 3 |
| 3 | C | 2 | 0 | 2 | 1 | 3 | 1 |
| 4 | C | 45 | 5 | 47 | 11 | 46 | 7 |
| 5 | A | 24 | 3 | 24 | 5 | 25 | 4 |
| 6 | A | 1 | 0 | 2 | 0 | 2 | 1 |
| 7 | B | 61 | 7 | 61 | 9 | 63 | 10 |
| 8 | B | 104 | 12 | 110 | 24 | 116 | 18 |
| 9 | A | 12 | 1 | 12 | 2 | 12 | 1 |
| 10 | C | 115 | 11 | 121 | 23 | 125 | 16 |
| 11 | C | 16 | 1 | 16 | 2 | 16 | 1 |
| 12 | B | 7 | 1 | 6 | 1 | 7 | 1 |
| Mean | | 34.50 | 3.75 | 35.67 | 7.08 | 36.83 | 5.42 |
| Total | | 414 | 45 | 428 | 85 | 442 | 65 |

Table 4: Replication and reconciliation execution times (in seconds) per operator, scenario 3 (one table, one master peer, three slave peers).

| Obs. | Master Peer | Insert: Repl. | Insert: Recon. | Update: Repl. | Update: Recon. | Delete: Repl. | Delete: Recon. |
|---|---|---|---|---|---|---|---|
| 1 | B | 22 | 2 | 23 | 3 | 23 | 2 |
| 2 | A | 28 | 2 | 28 | 5 | 28 | 2 |
| 3 | C | 3 | 0 | 3 | 1 | 5 | 0 |
| 4 | C | 78 | 6 | 79 | 14 | 80 | 11 |
| 5 | D | 41 | 3 | 41 | 6 | 42 | 5 |
| 6 | A | 3 | 0 | 2 | 0 | 3 | 0 |
| 7 | B | 97 | 8 | 97 | 17 | 101 | 15 |
| 8 | D | 185 | 14 | 200 | 30 | 218 | 21 |
| 9 | A | 17 | 1 | 16 | 2 | 16 | 1 |
| 10 | C | 165 | 12 | 170 | 27 | 172 | 20 |
| 11 | D | 28 | 2 | 28 | 3 | 29 | 3 |
| 12 | B | 12 | 1 | 11 | 1 | 12 | 1 |
| Mean | | 56.58 | 4.25 | 58.17 | 9.08 | 60.75 | 6.75 |
| Total | | 679 | 51 | 698 | 109 | 729 | 81 |

Table 5: Replication and reconciliation execution times (in seconds) per operator, scenario 4 (two tables, one master peer, three slave peers).

| Obs. | Master Peer | Insert: Repl. | Insert: Recon. | Update: Repl. | Update: Recon. | Delete: Repl. | Delete: Recon. |
|---|---|---|---|---|---|---|---|
| 1 | B | 22 | 3 | 19 | 5 | 18 | 3 |
| 2 | A | 26 | 3 | 28 | 6 | 28 | 5 |
| 3 | C | 6 | 0 | 7 | 1 | 6 | 2 |
| 4 | C | 90 | 8 | 93 | 18 | 92 | 15 |
| 5 | D | 58 | 5 | 51 | 8 | 76 | 6 |
| 6 | A | 6 | 0 | 6 | 0 | 6 | 0 |
| 7 | B | 188 | 12 | 181 | 13 | 180 | 23 |
| … | | | | | | | |

Table 6: Fitted models, coefficients of determination (R²), correlation coefficients (R) and predicted throughput (records in 1 second) for the four experimental scenarios.

| Experimental scenario | Transaction | Operator | Model | R² | R | Prediction (records in 1 s) |
|---|---|---|---|---|---|---|
| 1. One table stored on a master peer with two slave peers | Replication | Insert | y = 0.0302x - 0.5595 + ε | 98.65% | 99.34% | 52 |
| | Replication | Update | y = 0.0318x - 2.0714 + ε | 97.89% | 98.94% | 97 |
| | Replication | Delete | y = 0.0336x - 2.5280 + ε | 96.63% | 98.30% | 105 |
| | Reconciliation | Insert | y = 0.0093x - 0.0777 + ε | 98.76% | 99.38% | 116 |
| | Reconciliation | Update | y = 0.0208x - 0.4639 + ε | 99.56% | 99.78% | 70 |
| | Reconciliation | Delete | y = 0.0148x - 0.4124 + ε | 99.22% | 99.61% | 95 |
| 2. Two tables stored on a master peer with two slave peers | Replication | Insert | y = 0.0210x - 1.3366 + ε | 98.46% | 99.23% | 111 |
| | Replication | Update | y = 0.0230x - 2.0949 + ε | 99.25% | 99.63% | 135 |
| | Replication | Delete | y = 0.0239x - 2.4175 + ε | 98.32% | 99.16% | 143 |
| | Reconciliation | Insert | y = 0.0093x - 0.0671 + ε | 96.91% | 98.44% | 115 |
| | Reconciliation | Update | y = 0.0184x - 0.4798 + ε | 98.32% | 99.16% | 80 |
| | Reconciliation | Delete | y = 0.0136x - 0.1746 + ε | 98.59% | 99.29% | 86 |
| 3. One table stored on a master peer with three slave peers | Replication | Insert | y = 0.0348x - 0.5762 + ε | 99.14% | 99.57% | 45 |
| | Replication | Update | y = 0.0368x - 2.3047 + ε | 99.05% | 99.52% | 90 |
| | Replication | Delete | y = 0.0387x - 2.8053 + ε | 98.49% | 99.24% | 98 |
| | Reconciliation | Insert | y = 0.0106x - 0.0883 + ε | 99.64% | 99.82% | 103 |
| | Reconciliation | Update | y = 0.0235x - 0.5576 + ε | 97.35% | 98.67% | 66 |
| | Reconciliation | Delete | y = 0.0176x - 0.4611 + ε | 98.48% | 99.24% | 83 |
| 4. Two tables stored on a master peer with three slave peers | Replication | Insert | y = 0.0539x - 2.9424 + ε | 94.95% | 97.44% | 73 |
| | Replication | Update | y = 0.0527x - 4.3298 + ε | 96.22% | 98.09% | 101 |
| | Replication | Delete | y = 0.0566x - 5.5273 + ε | 97.05% | 98.51% | 115 |
| | Reconciliation | Insert | y = 0.0206x - 1.0387 + ε | 95.93% | 97.94% | 99 |
| | Reconciliation | Update | y = 0.0293x - 0.8713 + ε | 97.09% | 98.53% | 64 |
| | Reconciliation | Delete | y = 0.0266x - 0.7763 + ε | 98.12% | 99.05% | 67 |

## Acknowledgement

Firstly, we are grateful to the Grace of Almighty God. We would also like to thank the academic corps of the Institute of Building and Public Works of Butembo (D. R. Congo) for their encouragement and follow-up of our investigations. Finally, we thank the Research Technology and Development Centre (RTDC) of Sharda University for its facilities to realize this work.
## References

* K. Kituta, R. Agarwal, B. Kaushik, "Synchronous and Asynchronous Replication," International Conference on Machine Learning and Computational Intelligence, 2017.
* M. T. Özsu, P. Valduriez, Principles of Distributed Database Systems, 3rd ed., Springer Science+Business Media, New York, USA, 2011.
* N. I. Magdalena, "The Replication Technology in E-learning Systems," Procedia - Social and Behavioral Sciences, vol. 28, Elsevier, 2011.
* M. Wiesmann, "Understanding Replication in Databases and Distributed Systems," IEEE 20th International Conference on Distributed Computing Systems, 2002.
* R. Gudakesa, M. Sukarsa, G. Sasmita, "Two-ways database synchronization in homogeneous DBMS using audit log approach," Journal of Theoretical and Applied Information Technology, vol. 65, 2014.
* S. Pandey, U. Shanker, "IDRC: A Distributed Real-Time Commit Protocol," International Conference on Smart Computing and Communications (ICSCC 2017), Elsevier, 2017, p. 125.
* T. Kudo, "An implementation of concurrency control between batch update and online entries," 18th International Conference on Knowledge-Based and Intelligent Information & Engineering Systems (KES2014), Procedia Computer Science, vol. 35, 2014.
* O. Diallo, J. Rodrigues, M. Sene, J. Lloret, "Distributed Database Management Techniques for Wireless Sensor Networks," IEEE Transactions on Parallel and Distributed Systems, vol. 26, no. 2, 2015.
* A. Silberschatz, H. F. Korth, S. Sudarshan, Database System Concepts, McGraw-Hill, New York, 1997.
* Q. Vu, M. Lupu, C. Ooi, Peer-to-Peer Computing - Principles and Applications, Springer, 2010.
* I. Filip, C. Vasar, R. Robu, "Considerations about an Oracle Database Multi-Master Replication," IEEE 5th International Symposium on Applied Computational Intelligence and Informatics, 2009.
* Y. Mansouri, R. Buyya, "Dynamic replication and migration of data objects with hot-spot and cold-spot statuses across storage data centers," Journal of Parallel and Distributed Computing, vol. 126, Elsevier, 2018.
* M. Sebastian, Fundamentals of SQL Server 2012, Simple Talk Publishing, New York, USA, 2013.
* D. Kirtikumar, Oracle Streams 11g Data Replication, McGraw-Hill, New York, USA, 2011.
* A. George, C. Balakrishnan, "An optimized strategy for replication in peer-to-peer distributed databases," IEEE International Conference on Computational Intelligence and Computing Research, 2012.
* T. Zhang, "A Novel Replication Model with Enhanced Data Availability in P2P Platforms," International Journal of Grid and Distributed Computing, vol. 9, no. 4, 2016.
* Z. Ting, W. Yu, "Database Replication Technology having high Consistency Requirements," IEEE Third International Conference on Information Science and Technology, 2013.
* T. H. Cormen, Introduction to Algorithms, 4th ed., The MIT Press, London, England, 2012.
* C. Kothari, R. Garg, Research Methodology: Methods and Techniques, 2014.
* Microsoft Corporation web site, 2018.
* Oracle Corporation web site, 2018.
* Experian Ltd web site, 2018.
* K. Shahin, G. Pedram, D. Khuzaima, "Dynamic Data Allocation with Replication in Distributed Systems," 30th IEEE International Performance Computing and Communications Conference, 2011.
* Oracle Corporation web site, 2018.
* H. Jonathan, MySQL_Diff: Database Schema Difference Reconciliation, 2018.
* Pragmatic Works Inc. web site, 2018.
* ApexSQL LLC web site, 2018.
* K. Kituta, S. Kant, R. Agarwal, "Analysis of database replication protocols," International Journal of Latest Trends in Engineering and Technology, Special Issue ICRMR, 2018.
* E. Spaho, F. Xhafa, L. Barolli, A. Barolli, P. Papajorgji, "P2P Data Replication: Techniques and Applications," in Modeling and Processing for Next-Generation Big-Data Technologies, Modeling and Optimization in Science and Technologies, vol. 4, Springer, 2015.
* X. Fatos, "Data Replication in Collaborative Systems," IEEE Seventh International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, 2012.
* K. Kituta, S. Kant, R. Agarwal, "A systematic review on distributed databases systems and their techniques," Journal of Theoretical and Applied Information Technology, vol. 96, no. 1, 2019.
* A. Souri, S. Pashazadeh, A. H. Navin, "Consistency of data replication protocols in database systems: A review," International Journal on Information Theory (IJIT), vol. 3, no. 4, 2014.
* M. Santana, J. Enrique, D. Francesc, "Evaluation of database replication techniques for cloud systems," Computing and Informatics, vol. 34, 2015.