Distributed System Labs
Some drafts and notes.
#1 MapReduce
test-mr.sh
mrcoordinator .//pg*txt
then multiple workers
mrworker .//.//mrapps/app.so
Workflow:
The master wakes up, sets up its RPC server, and prepares the input splits. The workers wake up, load the map/reduce functions, and start asking for tasks.
- The master gets M map tasks and R reduce tasks ready and picks workers to assign them to.
- Map workers read their splits, save the intermediate files, and tell the master the locations of the R intermediate files.
- When the master receives a task-done message, it picks workers for the reduce tasks and tells each one the locations of the R partition it is going to handle.
- Reduce workers use RPC to read their target R files from the different map workers. Once everything is read, they sort it by intermediate key, then iterate over the sorted data and call the reduce function. The output of the function is appended to the final output file of this R partition. When done, the worker tells the master it is finished.
- once all done, master
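The map-side partitioning and the reduce-side sort-then-group step above can be sketched roughly as follows. This is a minimal, in-memory sketch rather than the lab's actual code: KeyValue, ihash, partition, and reducePartition are names assumed for illustration, and real workers would read and write intermediate files instead of slices.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
	"strconv"
)

// KeyValue is an assumed intermediate record type.
type KeyValue struct {
	Key   string
	Value string
}

// ihash maps a key to a non-negative int; key % nReduce picks the R partition.
func ihash(key string) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() & 0x7fffffff)
}

// partition splits one map task's output into nReduce buckets,
// so that every occurrence of a key lands in the same bucket.
func partition(kvs []KeyValue, nReduce int) [][]KeyValue {
	buckets := make([][]KeyValue, nReduce)
	for _, kv := range kvs {
		r := ihash(kv.Key) % nReduce
		buckets[r] = append(buckets[r], kv)
	}
	return buckets
}

// reducePartition sorts one partition by key (in place), groups the values
// of each key, and calls reducef once per distinct key.
func reducePartition(kvs []KeyValue, reducef func(string, []string) string) []KeyValue {
	sort.Slice(kvs, func(i, j int) bool { return kvs[i].Key < kvs[j].Key })
	var out []KeyValue
	for i := 0; i < len(kvs); {
		j := i
		var values []string
		for j < len(kvs) && kvs[j].Key == kvs[i].Key {
			values = append(values, kvs[j].Value)
			j++
		}
		out = append(out, KeyValue{Key: kvs[i].Key, Value: reducef(kvs[i].Key, values)})
		i = j
	}
	return out
}

func main() {
	mapped := []KeyValue{{Key: "b", Value: "1"}, {Key: "a", Value: "1"}, {Key: "b", Value: "1"}}
	count := func(key string, values []string) string { return strconv.Itoa(len(values)) }
	for r, bucket := range partition(mapped, 2) {
		fmt.Println("partition", r, reducePartition(bucket, count))
	}
}
```

Sorting before grouping is what lets the reduce worker make a single pass over the data and call the reduce function once per key.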
Conclusion
I suppose this lab is not that hard, since it didn't take me much time to finish. To be honest, the reason I got stuck for a little while when running bash test-mr.sh was that the file path was incorrect (I changed it to an absolute path later).
#2 Key/Value Server
The goal is to implement an in-memory server and its clients. It's easy enough that there's really not much to talk about.
Techniques Used
- Still using RPC to call remote functions
- Using unique IDs to represent each client and each request
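One way the client and request IDs can be used to filter duplicates is sketched below. It is only an illustration under stated assumptions, not the lab's actual code: AppendArgs, ClientID, Seq, and the lastSeq table are hypothetical names, and it assumes each client sends one request at a time with sequence numbers starting at 1.

```go
package main

import (
	"fmt"
	"sync"
)

// AppendArgs is an assumed request type: ClientID identifies the client,
// Seq identifies the request (a retry reuses the same Seq).
type AppendArgs struct {
	Key, Value string
	ClientID   int64
	Seq        int64
}

// Server is a toy in-memory store with a per-client duplicate table.
type Server struct {
	mu      sync.Mutex
	data    map[string]string
	lastSeq map[int64]int64 // highest Seq already applied, per client
}

func NewServer() *Server {
	return &Server{data: map[string]string{}, lastSeq: map[int64]int64{}}
}

// Append applies the request at most once, even if it is retried.
func (s *Server) Append(args AppendArgs) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if args.Seq <= s.lastSeq[args.ClientID] {
		return // duplicate of a request that was already applied
	}
	s.data[args.Key] += args.Value
	s.lastSeq[args.ClientID] = args.Seq
}

// Get returns the current value for key ("" if absent).
func (s *Server) Get(key string) string {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.data[key]
}

func main() {
	s := NewServer()
	req := AppendArgs{Key: "k", Value: "x", ClientID: 1, Seq: 1}
	s.Append(req)
	s.Append(req) // retry of the same request is ignored
	fmt.Println(s.Get("k"))
}
```

Because the duplicate check is keyed on the request rather than on the stored value, writes from other clients to the same key cannot confuse it.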
Two things blocked me for a while
- A silly mistake: in for ok := ck.Call("Server.Put", &args, &reply); !ok, Call() is executed only once, and then there's a terrible infinite loop.
- Using unique IDs to represent values instead of requests: bad things happen when other clients change the record, which causes duplicate put/append.
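The retry bug from the first item can be avoided by putting the call in the loop condition, so it is re-issued on every iteration. The sketch below uses a hypothetical flakyCall type as a stand-in for ck.Call; it is an assumption for illustration and not part of the lab's RPC library.

```go
package main

import "fmt"

// flakyCall is a hypothetical stand-in for an RPC that fails a few times.
type flakyCall struct {
	failuresLeft int
}

// Call reports whether this attempt succeeded.
func (f *flakyCall) Call() bool {
	if f.failuresLeft > 0 {
		f.failuresLeft--
		return false
	}
	return true
}

// retryUntilOK re-issues the call on every iteration and returns the
// number of attempts made.
//
// The buggy shape was: for ok := call(); !ok; { }
// There the init statement runs once, ok never changes, and the loop
// spins forever if the first attempt fails.
func retryUntilOK(call func() bool) int {
	attempts := 1
	for !call() {
		attempts++
	}
	return attempts
}

func main() {
	f := &flakyCall{failuresLeft: 2}
	fmt.Println(retryUntilOK(f.Call)) // prints 3
}
```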
Conclusion
The lab is marked Easy, which is true; it's just that I was supposed to finish it really quickly.
#3 Raft
Election
I've drawn a figure listing out the summary of rules, check out:

Also, some variables and details as:

Log
The rules of log replication can be illustrated by the following figures:

An unreliable test that I debugged for a while, plus some tips:

In order to back up faster, I also implemented the following technique:

By sending XTerm, XIndex, and Len back to the leader, the leader can quickly decide what nextIndex[target] should be set to and avoid meaningless retries.
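A rough sketch of how the leader might use these three fields is shown below. It follows the commonly described three-case rule and is not necessarily the exact logic used here. The function name, the convention that XTerm is -1 when the follower's log is too short, the dummy entry at index 0, and the choice of "one past the leader's last entry for XTerm" are all assumptions for illustration.

```go
package main

import "fmt"

// nextIndexAfterReject picks the new nextIndex for a follower that rejected
// AppendEntries. leaderTerms[i] is the term of the leader's entry at index i;
// index 0 is an assumed dummy entry.
//   xTerm:  term of the follower's conflicting entry, or -1 if it has none
//   xIndex: index of the follower's first entry with term xTerm
//   xLen:   first index past the end of the follower's log
func nextIndexAfterReject(leaderTerms []int, xTerm, xIndex, xLen int) int {
	if xTerm == -1 {
		return xLen // follower's log is too short
	}
	for i := len(leaderTerms) - 1; i >= 1; i-- {
		if leaderTerms[i] == xTerm {
			return i + 1 // leader has xTerm: resume after its last entry of that term
		}
	}
	return xIndex // leader has no entry with xTerm: skip the whole term
}

func main() {
	leader := []int{0, 4, 6, 6, 6}                      // terms at indexes 1..4: 4 6 6 6
	fmt.Println(nextIndexAfterReject(leader, 5, 2, 4))  // follower 4 5 5: prints 2
	fmt.Println(nextIndexAfterReject(leader, 4, 1, 4))  // follower 4 4 4: prints 2
	fmt.Println(nextIndexAfterReject(leader, -1, 0, 2)) // follower 4:     prints 2
}
```

Each rejection lets the leader skip a whole term (or jump straight to the end of a short log) instead of decrementing nextIndex one entry per round trip.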
Persistence

Log Compaction


#4 Fault-Tolerant K/V Service
Note: All of the fields in the RPC args should be initialized; otherwise the RPC will fail (the call never returns).
