Principles Of Distributed Database Systems Exercise Solutions Instant
Consider a global relation EMPLOYEE (EmpID, Name, Department, Salary, Location) .Design a primary horizontal fragmentation scheme based on the following predicates:
Replacing global relation names with local fragment names.
Consider two relations stored on different network nodes. Node 1: Employee (10,000 tuples, size = 100 bytes/tuple) Node 2: Department (100 tuples, size = 50 bytes/tuple) We need to execute the join query: Employee ⋈DeptID⋈ sub DeptID end-sub
Consistency condition: read quorum + write quorum > N (number of replicas). Here 2+4=6 >5 ✔. If R=2, W=4, then concurrent writes must intersect on at least one replica (4+4>5? No – two write quorums intersect if 4+4>5 → 8>5 ✔). Failure tolerance: To block a write, you need to knock out 5-4+1=2 replicas (since write quorum=4, failing 2 leaves 3 available – not enough for write). To block a read, you need to knock out 5-2+1=4 replicas (failing 4 leaves 1 < read quorum). So failure tolerance = min(2,4)-1? Let’s compute correctly: Write quorum=4 → max failures for availability = N-W =1. Read quorum=2 → max failures for availability = N-R=3. System can survive at most 1 failure and still perform writes. Here 2+4=6 >5 ✔
In distributed exercises, you'll often encounter the vs. Distributed 2PL debate.
Total Cost (Strategy B)=Cost1+Cost2=5,000+200,000=205,000 bytesTotal Cost (Strategy B) equals Cost sub 1 plus Cost sub 2 equals 5 comma 000 plus 200 comma 000 equals 205 comma 000 bytes
Consider a distributed database system that stores information about customers and orders. The database is fragmented and replicated across multiple sites. Describe how the system provides transparency. Failure tolerance: To block a write, you need
During a 2PC execution, the coordinator sends a Prepare message. All participants reply with Vote-Commit . The coordinator writes a Commit record to its log and crashes before sending out the Global-Commit message.
Query optimization in a DDBMS minimizes total cost, which heavily weights communication cost over local processing CPU/IO cost. Semi-Join Optimization Semi-joins (
If all vote "Yes," the coordinator sends a "Global Commit." If any vote "No" or timeout, it sends a "Global Abort." 000 + 50
The you are studying (e.g., Özsu and Valduriez)
While distributed systems focus on geographic separation, parallel systems focus on performance via multiple processors and disks. Architectures Fast but limited scalability.
Mastering the Core: Principles of Distributed Database Systems Exercise Solutions
If hash partitioning already aligned with join site: Each tuple of R and S sent exactly once to join site → cost = 10,000 + 50,000 = 60,000.
