Oracle7 Server Distributed Systems Manual, Vol. 2
With a completely synchronous environment, you have the advantage of always having the most up-to-date information at all sites. You always make decisions based on the most current information, and conflicting updates never occur.
With a completely asynchronous environment, you have the advantage of continuous availability. No site is dependent upon another to allow an update to be made. If one site goes down, you can switch to another and continue working. Your business needs will determine which propagation method is most appropriate for you.
To make synchronous replication practical, you need to take steps to ensure stable networks and systems, or have flexibility in scheduling updates to allow for delays due to network and system outages.
Distributing your data can improve your performance by providing faster updates of the local subset of your data. It can also improve your availability. If one site in a distributed system becomes unavailable, you can continue to query and update the data at the remaining sites.
If you frequently perform queries that access multiple remote sites, you may experience some performance degradation because not all of the data is at a single location. Additionally, if one of these sites becomes unavailable, you will not be able to complete a transaction that involves that site until it becomes available again.
For these reasons, a distributed model is most appropriate when you frequently query and update a distinct subset of your data from a single location, and seldom query or update the remaining portions. Oracle7 Server Distributed Systems, Volume I provides more detail on distributing your data.
Replicating your data can improve your performance by providing faster queries of all of your data. It can also provide improved availability to all of your data for queries. Even if your local site goes down, you can still access a complete copy of your data at another replicated location.
Synchronously replicating your data can decrease the availability of your data for updates, however. If one site becomes unavailable, you cannot update any other replicas until the downed site either becomes available or is dropped from the replicated environment. For this reason, if you determine that you need to replicate rather than distribute your data, you may prefer to use "near real-time" asynchronous replication.
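The availability trade-off described above can be illustrated with a small simulation. This is only a conceptual sketch in Python, not Oracle code: the `Site` class, the `deferred` queue, and all function names are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Site:
    name: str
    up: bool = True
    data: dict = field(default_factory=dict)
    deferred: list = field(default_factory=list)  # changes queued for later propagation

def update_synchronously(sites, key, value):
    """Apply the change everywhere or nowhere; fails if any replica is down."""
    if not all(s.up for s in sites):
        return False  # one unavailable site blocks the update at every site
    for s in sites:
        s.data[key] = value
    return True

def update_asynchronously(local, remotes, key, value):
    """Apply the change locally at once and queue it for each remote site."""
    local.data[key] = value
    for r in remotes:
        local.deferred.append((r.name, key, value))
    return True

def propagate(local, remotes):
    """Push queued changes to the remote sites that are reachable; keep the rest."""
    by_name = {r.name: r for r in remotes}
    still_queued = []
    for target, key, value in local.deferred:
        site = by_name[target]
        if site.up:
            site.data[key] = value
        else:
            still_queued.append((target, key, value))
    local.deferred = still_queued
```

In this sketch a synchronous update is refused while any replica is down, whereas an asynchronous update succeeds locally and the queued change reaches the remote site the next time `propagate` runs after that site is back.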
If you make frequent updates, yet require very up-to-date information, you will need to set a short propagation interval, to simulate real-time replication. If you want to minimize your communication costs, you may prefer to increase this interval to once a day, or even once a week.
If you require information that is up to date as of a particular point in time, you may prefer to propagate your changes at a scheduled time. For example, you might want to retrieve new pricing information at the start of each quarter.
Finally, if you do not know when your network connection will be available, such as if you are using a laptop computer to collect data on a sales call, you may want to propagate your changes on demand.
Because snapshot sites propagate changes only to their associated master site, and because they pull down changes from their master in an efficient, batch-oriented manner, sites requiring on-demand data propagation should typically be snapshot sites.
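The three propagation choices discussed above (at an interval, at a scheduled time, or on demand) amount to different rules for deciding when to propagate. The following Python sketch shows one way to express those rules. The function and parameter names are hypothetical and are not part of any Oracle interface.

```python
from datetime import datetime, timedelta

def next_interval_run(last_run, interval):
    """Interval-based propagation: the next push is one interval after the last."""
    return last_run + interval

def is_due(now, last_run, interval=None, scheduled_at=None, on_demand=False):
    """Return True if changes should be propagated now under the given policy."""
    if on_demand:
        # The user asked for propagation, e.g. a laptop has just connected.
        return True
    if scheduled_at is not None:
        # Point-in-time propagation, e.g. the start of a quarter.
        return last_run < scheduled_at <= now
    if interval is not None:
        # Periodic propagation; a short interval approximates real time.
        return now >= next_interval_run(last_run, interval)
    return False
```

A short `interval` simulates near real-time replication at a higher communication cost, while a long one (a day or a week) trades currency of the data for fewer connections.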
In addition to determining an appropriate replication interval for each site, you must also determine what data is appropriate for each site.
Certain sites may not require full copies of a replication group, and you may prefer to replicate only a subset of the data to these sites to conserve resources. For example, suppose you have regional branches of your business, each with its own client base. Each branch may require full copies of certain tables, such as price lists, but only subsets of others, such as customer lists. Because master sites must contain full copies of a replicated group, these branch sites would need to be created as snapshot sites.
The following sections of this chapter describe the various models that Oracle's symmetric replication facility supports in more detail, including hybrid models that combine snapshot and multi-master replication.
With primary site replication, each piece of information is owned by one site, and this ownership never changes. Other sites "subscribe" to the data owned by the primary site, which means that they have access to read-only copies of the replicated data. With primary site replication, you never need to worry about any discrepancies between data that is being updated at two different locations.
Figure 2 - 1. Information Offloading
Portions of this information might be used by your sales offices around the world. Instead of replicating the entire table at each sales office, you need only replicate the portions appropriate for that region, as shown in Figure 2 - 2.
For performance reasons, however, you might want to allow local updates of the data, thus allowing multiple sites to have access to a single table. You can still successfully avoid conflicts by implementing an advanced form of primary site ownership.
Instead of designating one site as the owner of the entire table, each site is allowed to "own" a distinct portion of this table. That is, each site would be allowed to modify only a subset of the rows or columns in each table. You might think of this as allowing each site to own a distinct horizontal or vertical partition of the data in a single table.
Ownership can either be enforced by your application, or can be enforced by using a combination of triggers, views, procedures, and horizontally partitioned updatable snapshots.
The CREATE statement for a snapshot of your CUSTOMERS table might look like:
CREATE SNAPSHOT customers FOR UPDATE AS SELECT * FROM customers@hq.com WHERE region = 'North East';
Note: Ownership of vertical partitions requires the use of column groups. Column groups are described elsewhere in this manual.
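Returning to horizontal partitions, ownership that is enforced by your application can be as simple as a check made before every update. The following Python sketch illustrates the idea for a customer table partitioned by region. It is an assumption-laden illustration only: `OwnershipError`, `owns`, and `update_customer` are hypothetical names, and the table is modeled as a plain dictionary rather than a database object.

```python
class OwnershipError(Exception):
    """Raised when a site tries to modify a row it does not own."""

def owns(site_region, row):
    """A site owns exactly the rows whose region matches its own."""
    return row["region"] == site_region

def update_customer(site_region, table, customer_id, changes):
    """Apply changes only if the row belongs to the updating site's partition."""
    row = table[customer_id]
    if not owns(site_region, row):
        raise OwnershipError(f"{site_region} does not own customer {customer_id}")
    if "region" in changes and changes["region"] != row["region"]:
        # Changing the region would move the row into another site's partition.
        raise OwnershipError("moving a row to another partition is not allowed here")
    row.update(changes)
    return row
```

Because every site can modify only the rows in its own partition, two sites never update the same row, and conflicts are avoided without any resolution logic.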
Each application module acts on an order, that is, performs updates to the order data, when the state of the order indicates that the previous processing steps have been completed. For example, the application module that ships an order will do so only after the order has been entered and approved.
By employing a dynamic ownership replication technique, such a system can be distributed across multiple sites and databases. Application modules can reside on different systems. For example, order entry and approval can be performed on one system, shipping on another, billing on another, and so on. Order data is replicated to a site when its state indicates that it is ready for the processing step performed by that site. Data may also be replicated to sites that need read-only access to the data. For example, order entry sites may wish to monitor the progression of processing steps for the orders they enter.
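One way to picture dynamic ownership is as a table that maps each order state to the site allowed to update the order in that state. The Python sketch below assumes a simple four-state workflow. The state names, site names, and the `advance` function are hypothetical and serve only to illustrate the technique.

```python
# Hypothetical order states and the site that owns an order in each state.
OWNER_BY_STATE = {
    "entered": "order_entry",   # order entry and approval share one system
    "approved": "shipping",
    "shipped": "billing",
    "billed": None,             # processing finished; no site may update further
}
NEXT_STATE = {"entered": "approved", "approved": "shipped", "shipped": "billed"}

def advance(order, acting_site):
    """Let the current owner complete its step, which hands ownership onward."""
    owner = OWNER_BY_STATE[order["state"]]
    if acting_site != owner:
        raise PermissionError(
            f"{acting_site} cannot update an order in state {order['state']!r}"
        )
    order["state"] = NEXT_STATE[order["state"]]
    return OWNER_BY_STATE[order["state"]]  # the new owner, or None when done
```

At any moment exactly one site may update a given order, so ownership moves from site to site with the state of the data while other sites can still hold read-only copies.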
In some situations, however, it is desirable to allow multiple sites to update the same data, potentially at the same time. For example, it may be desirable to replicate customer data across multiple sites and systems rather than maintaining customer data centrally or maintaining it separately and redundantly within each system. Different sites, though, may need to update this data.
This occurrence is known as an update conflict. Replicated data has become inconsistent because the replicated data was updated at multiple sites. If you cannot tolerate such inconsistencies, you must either carefully partition ownership of your data or only allow for synchronous propagation of changes between sites. If all sites in your replicated environment are propagating changes to one another synchronously, update conflicts cannot occur. If, however, you have even one site sending or receiving changes asynchronously (for example, if you have an updatable snapshot site), you have the potential for conflicts. For some applications, these temporary inconsistencies can be permitted as long as they can be detected and resolved to ensure that over time the replicated data converges to a consistent state at all sites.
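To make the notion of detecting a conflict concrete, the following Python sketch assumes that each propagated change carries the value the originating site saw before its update, and that the receiving site compares this old value with its own current value. This comparison scheme and the `apply_remote_update` function are assumptions made for illustration and are not a description of a specific Oracle interface.

```python
def apply_remote_update(local_row, column, old_value, new_value):
    """Apply a propagated change, reporting a conflict if the local value has moved.

    old_value is what the originating site saw before its update. If the local
    copy no longer holds that value, some other site changed it in the meantime.
    """
    if local_row[column] != old_value:
        return "conflict"   # leave the row untouched for a resolution routine
    local_row[column] = new_value
    return "applied"
```

A result of `"conflict"` is the point at which a resolution routine would be called to decide which value the row should finally hold.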
The symmetric replication facility provides a number of standard resolution routines from which the application developer can select. Standard resolution routines include: selecting the most recent update as determined by timestamp, commutative resolution of additive updates, applying the change from the site with the highest priority value, and minimum or maximum selection of updates. Alternatively, for more specialized cases, the application developer can write his or her own routines.
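The logic behind three of these routines (additive, site priority, and min/max) can be sketched in a few lines of Python. These functions are hypothetical illustrations of the ideas only. They are not the routines supplied with the symmetric replication facility, and the record layout (a dictionary with a `site` key) is an assumption.

```python
def additive(local_value, incoming_old, incoming_new):
    """Commutative resolution: add the remote site's delta to the local value."""
    return local_value + (incoming_new - incoming_old)

def site_priority(local, incoming, priority):
    """Keep the version that comes from the site with the higher priority value."""
    if priority[incoming["site"]] > priority[local["site"]]:
        return incoming
    return local

def maximum(local_value, incoming_value):
    """Maximum selection; use min() in the same way for minimum selection."""
    return max(local_value, incoming_value)
```

For example, if a balance of 100 is reduced to 90 at one site while another site raises its copy to 120, `additive(120, 100, 90)` applies the remote change of -10 to the local value and yields 110, regardless of the order in which the two changes arrive.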
In the scenario above, a routine that uses timestamps to determine the most recent update can be employed so that the customer's address converges to the most recent update of the address at all sites. Update conflicts on the address will be automatically detected and immediately resolved at each site by selecting the most recent of the updates.
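The convergence property can be seen in a small Python sketch in which two sites hold different addresses for the same customer and each applies the same "most recent wins" rule to the other's update. The tuple layout, the integer timestamps, and the `resolve_latest` function are assumptions for illustration only.

```python
def resolve_latest(local, incoming):
    """Return the version with the later timestamp; ties keep the local copy."""
    return incoming if incoming[1] > local[1] else local

# Each site holds (address, timestamp) for the same customer.
site_a = ("12 Elm St", 100)   # updated at time 100
site_b = ("7 Oak Ave", 105)   # updated at time 105

# The sites exchange their updates and each applies the same rule.
site_a, site_b = resolve_latest(site_a, site_b), resolve_latest(site_b, site_a)
```

After the exchange both sites hold the address that was written at time 105. Note that in this sketch two updates with exactly equal timestamps would leave each site with its own copy, so a tie-breaking rule would be needed to guarantee convergence in that case.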
For example, an earlier discussion described how a distributed order entry system could be implemented using primary site replication techniques with horizontal partitioning.
In this scenario each sales office owned a distinct horizontal partition of the tables containing orders and customer information for the customers serviced by that office. Each sales office entered orders for its customers, but no others.
For some businesses, though, this is not the model. For example, a retail chain may have several stores in a metropolitan area. Customers may frequent the store closest to where they live, but they will also go into other stores, and these other stores will want to take their orders when they do. If multiple stores perform updates to the same customer and order data, as illustrated in Figure 2 - 4, update conflicts could potentially occur. Sophisticated application developers can identify these conflicts and either select standard resolution routines or devise their own to implement such systems.
The Oracle Parallel Server provides system fault tolerance in locally connected cluster or massively parallel environments. In these environments, the Parallel Server will usually be the preferred option. Symmetric replication extends Oracle's system fault tolerance capability to geographically separated systems and local non-clustered environments. Another option is a standby database configuration, which can offer performance advantages while also providing protection across geographically separated systems.
Copyright © 1996 Oracle Corporation. All Rights Reserved.