Topic_2_PT - CS3283 Distributed System Assignment 2 CS3283...

CS3283 Distributed System Assignment 2

CS3283 Distributed System
Report for Team Works
Topic 2: Replication Management and Google file systems

Group Members:
LEE HO MING (51306210)
TSANG TSZ KIT (51274232)
CHEUK HIU KING (51362570)

“I declare that the materials presented in this assignment are original except explicitly acknowledged.”

Document Change Control

Release   Issue Date   Updated By       Section, Page(s) and Text Revised
1.0.0     2010-11-20   LEE HO MING      First release of the document; add document background; add document outline
1.0.1     2010-11-23   CHEUK HIU KING   Edit the introduction; add the GFS Architecture
1.0.2     2010-11-24   LEE HO MING      Add technical issues and examples
1.0.3     2010-11-25   TSANG TSZ KIT    Add the replication management issues
2.0.0     2010-11-27   CHEUK HIU KING   Add the conclusion and finalize
Table of Contents

1. Abstract
2. Introduction/Background
3. Interface and Data
4. Architecture
5. System Control
   5.1. Write Procedure
   5.2. Communication between Master and Chunk Servers
   5.3. Record Append
   5.4. Snapshot
   5.5. Fault Tolerance
   5.6. Fast Recovery
   5.7. Chunk Replication
   5.8. Master Mechanisms
6. Master Control
   6.1. Garbage Collection
   6.2. Replication Management
   6.3. Replication Placement
   6.4. Replica Creation
   6.5. Re-replication and Re-balancing
   6.6. Stale Replica Detection
7. Conclusion
8. Reference

1. Abstract

With the Internet increasingly popular, Google services such as Google Search, Gmail, Google Docs and Google Maps have become essential to everyday computer use. To serve more users, Google must manage a vast amount of information, and the volume of data behind its search engine in particular is growing at an alarming rate. How Google manages such a large amount of data effectively is our motivation for studying the Google File System (GFS).

2. Introduction/Background

Google File System is a scalable distributed file system for large distributed data-intensive applications. It provides fault tolerance while running on inexpensive commodity hardware, and it delivers high aggregate performance to a large number of clients. The design of Google File System is driven by observations of Google's application workloads and technological environment.

It uses replication to distribute data to different locations and to remote or mobile users over local and wide area networks or wireless connections. In general, replication is a set of technologies for copying and distributing data and database objects from one database to another and then synchronizing between databases to maintain consistency.

The file system has successfully met Google's storage needs, providing higher throughput and improved scalability and availability for uses such as data warehousing and reporting. It is widely deployed within Google as the storage platform for the generation and processing of data.