|
|
|
||||||||||
Today’s IT infrastructure runs mostly on computer clusters. Whether the cluster has two, four, 68, or 128 nodes it does not matter, what is important is that many of the, e.g., banking, insurance, or electronic commerce applications that run today, are doing so on clusters. Achieving scalability through the use of clusters is called scaling out. The alternative is to scale up, that is, to use larger and larger computers. Unfortunately, scaling up is not economically viable, certainly not to the sizes that are needed to run large applications such as electronic bookstores or complex web sites. Hence, scaling out has become the dominant architecture for large IT installations.

Web servers and application servers (using .NET or J2EE Technology) can easily be scaled out. But what can one do if the database becomes the bottleneck?
The most visible effect of scaling out architectures on application software is the need for increased distribution and parallelism. Web servers, application servers, and general middleware platforms such as .Net or J2EE are all now distributed software systems where the primitives necessary for distribution (RMI, messages, queues, web services) are a key part of the support provided to developers. There is, however, an important part of the IT infrastructure that still remains largely a centralized component: the database engine. Due to the nature of the operations they perform and the need for consistency, databases are difficult to parallelize and can therefore easily become a bottleneck. The tradeoff between consistency, throughput, and distribution/parallelism is a well known one and has kept researchers and practitioners busy for many years. Although some solutions exist, there is no generic approach to the problem of building consistent, large scale, database clusters that can support a cluster based IT infrastructure.

We have been working on this important architectural problem for the last few years. We are right now at the last stages of completing a thin middleware layer that allows consistent replication of database engines over an arbitrary size cluster. The system is called Ganymed. One of the important features of the system is that it guarantees consistency at all times. It has been known for years that databases can be scaled out at the price of reducing consistency. Thus, one of the first problems we had to solve is to come up with a way of guaranteeing consistency without compromising the ability to scale. We accomplished this first step by using the notion of snapshot isolation and implementing our own replication protocol, RSI-PC (Replicated Snapshot Isolation with Primary Copy), which allows to build a cluster database system that behaves like a single snapshot isolation based database. While our approach enforces strict consistency, we can also offer a lesser form known as session consistency.
The use of RSI-PC gives the system the ability to provide consistent views to the clients without having to maintain internal consistency at all times. Another important characteristic of the system is that it is asymmetric, it uses a primary database and a collection of satellites databases that fulfill quite different purposes. The primary database (a so called master database) provides compatibility with existing systems as it can be a commercial database or an open source database. It also provides a well defined point in the architecture where to implement high availability and back-ups. The satellite databases provide the necessary support for scalability and extensibility. As an interesting feature, the satellite databases do not need to be the same as the primary data base.

Logical view of a Ganymed DBFarm: 300 Databases are spread over a set of 9 cluster nodes. Each database may have multiple replicas (satellite databases). Master databases are typically installed on the more powerful nodes.
We have shown that we can extend and scale a commercial database engine used as a primary database by using a cluster of open source satellites. We have also shown that we can add functionality to a commercial database engine that was not there before without affecting the performance and response time of the original database. Examples of such extended functionality are skyline queries, text search, or time travel.


With these features, Ganymed provides a degree of scalability and extensibility that was previously unknown to database engines. The possibility of combining a commercial database engine as primary with a cluster of open source databases for extensibility opens up very interesting opportunities to introduce new functionality and improve performance that would otherwise require a very large economic and development effort. As an example, we have implemented a farm of databases (a so called DBFarm) that runs on a cluster. This farm can concurrently run up to 300 standard databases on a small cluster without compromising performance, providing consistent views at all times, and using open source software. Such a farm can be used as the basis for a database service provider that sells database functionality to clients who then do not need to install and maintain a commercial database at their premises. Another interesting option is to use the system to implement functionality that it is not present in the original database. We are working on extending a commercial engine with a cluster based event monitoring system that will support user defined functions and complex queries over historical data.
The Ganymed project is an example of how to combine theoretical insights as well as system and engineering skills with a substantial development effort to achieve a result that advances the state of the art, opens up important new directions for research, and can also be exploited commercially. For more information on this project please go to the following web page:
Wichtiger Hinweis:
Diese Website wird in älteren Versionen von Netscape ohne
graphische Elemente dargestellt. Die Funktionalität der
Website ist aber trotzdem gewährleistet. Wenn Sie diese
Website regelmässig benutzen, empfehlen wir Ihnen, auf
Ihrem Computer einen aktuellen Browser zu installieren. Weitere
Informationen finden Sie auf
folgender
Seite.
Important Note:
The content in this site is accessible to any browser or
Internet device, however, some graphics will display correctly
only in the newer versions of Netscape. To get the most out of
our site we suggest you upgrade to a newer browser.
More
information