The Munin Distributed Shared Memory System

Distributed shared memory (DSM) is an abstraction of shared memory on a distributed memory machine. Hardware DSM systems support this abstraction at the architecture level; software DSM systems support it within the runtime system. DSM systems aim to combine the best features of shared memory and distributed memory machines: they support the convenient shared memory programming model on distributed memory hardware, which is more scalable and less expensive to build. However, although many DSM systems have been proposed and implemented, achieving good performance for a sizable class of applications has proven to be a major challenge. Conventional DSM systems employ page-based write-invalidate consistency protocols that incur substantial communication overhead to maintain consistency or, put another way, to maintain the shared memory abstraction.
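To make that overhead concrete, the following is a toy model of a single-writer, page-based write-invalidate protocol. It is a sketch under stated assumptions, not code from Munin or any other DSM system: the page size of 4 words and the function name count_page_transfers are hypothetical. It counts whole-page transfers when two nodes alternately write to different words that happen to share a page (false sharing).

```python
PAGE_WORDS = 4  # hypothetical page size, in words


def count_page_transfers(writes, page_words=PAGE_WORDS):
    """Count page transfers under a toy single-writer write-invalidate protocol.

    `writes` is a sequence of (node, word_address) pairs. A node must own a
    page exclusively before writing to it; taking ownership from another node
    invalidates that node's copy and ships the whole page across the network.
    """
    owner = {}      # page number -> node currently holding the writable copy
    transfers = 0
    for node, addr in writes:
        page = addr // page_words
        previous = owner.get(page)
        if previous is not None and previous != node:
            transfers += 1          # page moves away from the previous owner
        owner[page] = node
    return transfers


# Nodes 0 and 1 alternately update words 0 and 1, which share page 0.
false_sharing = [(i % 2, i % 2) for i in range(8)]
# Same pattern, but node 1 writes word 4, which lives on page 1.
separate_pages = [(i % 2, (i % 2) * PAGE_WORDS) for i in range(8)]

print(count_page_transfers(false_sharing))   # 7
print(count_page_transfers(separate_pages))  # 0
```

In this model the two nodes never touch the same word, yet the page changes hands on every write after the first. A message passing version of the same program would need no communication at all for these updates.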

Ideally, the amount of communication for an application executing on a DSM system should be comparable to the amount of communication for the same application executing directly on the underlying message passing system. Conventional DSM systems have found it difficult to achieve this goal because of restrictive memory consistency models and inflexible consistency protocols. The Munin DSM system introduces four techniques for reducing the consistency-related communication in DSM systems:

- software release consistency,
- multiple consistency protocols,
- write-shared protocols, and
- an update-with-timeout mechanism.

These techniques were implemented and evaluated in the Munin DSM system. None of them requires program restructuring or any other significant modification of "ordinary" shared memory parallel programs. Munin achieved performance improvements over conventional DSM implementations ranging from a few percent to several hundred percent, depending on the application.
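One way to picture the write-shared idea is a twin-and-diff scheme: before its first write to a page, a node saves a pristine copy (a twin), and at a synchronization point it sends only the words that differ from the twin. The sketch below is illustrative only. The function names, the list-of-words page representation, and the merge into a single "home" copy are assumptions made for the example, not Munin's actual interfaces.

```python
def make_twin(page):
    """Save a pristine copy of the page before the first local write."""
    return list(page)


def compute_diff(twin, page):
    """Return {index: new_value} for every word that changed since the twin."""
    return {i: new for i, (old, new) in enumerate(zip(twin, page)) if old != new}


def apply_diff(page, diff):
    """Merge a remote node's diff into a local copy of the page, in place."""
    for i, value in diff.items():
        page[i] = value


# Both nodes start with identical copies of one shared page.
home = [0, 0, 0, 0]
node_a, node_b = list(home), list(home)
twin_a, twin_b = make_twin(node_a), make_twin(node_b)

# Concurrent writes to different words of the same page (false sharing).
node_a[0] = 11
node_b[3] = 44

# At a release, each node ships only its diff rather than the whole page.
diff_a = compute_diff(twin_a, node_a)
diff_b = compute_diff(twin_b, node_b)
apply_diff(home, diff_a)
apply_diff(home, diff_b)

print(diff_a, diff_b)  # {0: 11} {3: 44}
print(home)            # [11, 0, 0, 44]
```

Because the two diffs touch disjoint words, they merge without conflict and neither node had to invalidate the other's copy. If both nodes wrote the same word, that would be a data race in the program itself, and this sketch would simply apply the diffs in arrival order.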

Munin was a joint research effort with Willy Zwaenepoel of the Department of Computer Science, and represents the doctoral thesis work of John Carter.

Munin Publications

- J.B. Carter, J.K. Bennett, and W. Zwaenepoel. Techniques for reducing consistency-related communication in distributed shared memory systems. ACM Transactions on Computer Systems, 13(3), pp. 205-243, Aug. 1995.
- J.K. Bennett, J.B. Carter, and W. Zwaenepoel. Adaptive software cache management for distributed shared memory architectures. In The cache coherence problem in shared memory multiprocessors: software solutions, Igor Tartalja and Veljko Milutinovic, editors, IEEE Computer Society Press, 1995.
- J.K. Bennett, J.B. Carter, A.L. Cox, E.N. Elnozahy, D.B. Johnson, P. Keleher, and W. Zwaenepoel. Distributed shared memory: Experience with Munin. In Proceedings of The Fifth European ACM SIGOPS Workshop, July 1992.
- J.K. Bennett, J.B. Carter, and W. Zwaenepoel. Munin: Distributed shared memory using multi-protocol release consistency. In Operating Systems of the 90s and Beyond, A.I. Karshmer and J. Nehmer, editors, Lecture Notes in Computer Science, Springer-Verlag LNCS 563, pp. 56-60, 1991.
- J.K. Bennett, J.B. Carter, and W. Zwaenepoel. Toward large scale shared memory multiprocessing. In Scalable Shared Memory Multiprocessors, M. Dubois and Shreekant Thakkar, editors, Kluwer Academic Publishers, Nov. 1991.
- J.B. Carter, J.K. Bennett, and W. Zwaenepoel. Implementation and performance of Munin. In Proceedings of the 13th Symposium on Operating System Principles, pp. 152-164, Oct. 1991.
- J.K. Bennett, J.B. Carter, and W. Zwaenepoel. Adaptive software cache management for distributed shared memory architectures. In Proceedings of the 17th International Symposium on Computer Architecture, pp. 125-134, May 1990.
- J.K. Bennett, J.B. Carter, and W. Zwaenepoel. Munin: Distributed shared memory based on type-specific memory coherence. In Proceedings of the Second ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), pp. 168-176, Mar. 1990.