Distributed System Labs

Some drafts and notes.

#1 MapReduce

test-mr.sh

mrcoordinator ./pg*txt

Then start multiple workers:

mrworker ././mrapps/app.so

Workflow:

The master starts up, prepares the RPC server and the input splits. Workers start up, load the map/reduce functions, and begin asking the master for tasks.

The master gets M map tasks and R reduce tasks ready, then picks workers to assign them to.
Map workers read their splits, save the intermediate files, and tell the master the locations of the R intermediate files they produced.
When the master receives the task-done messages, it picks workers for the reduce tasks and tells each one the locations of the R partition it is going to handle.
Reduce workers use RPC to read their target R files from the different map workers. Once everything is read, they sort it all by intermediate key, iterate over the sorted data, and apply the reduce function. The output of the function is appended to the final output file of that R partition. When finished, the worker tells the master it is done.
Once all reduce tasks are done, the master exits.
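
The partition, sort, and reduce steps of the workflow above can be sketched in Go roughly as follows. This is a minimal in-memory sketch, not the lab's actual code: it skips RPC and files entirely, and the names `KeyValue`, `ihash`, `partition`, `reducePartition`, and `wordCount` are assumptions made up for illustration.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
	"strconv"
	"strings"
)

// KeyValue is an intermediate pair emitted by a map function (assumed type).
type KeyValue struct {
	Key   string
	Value string
}

// ihash maps a key to a non-negative int; ihash(key) % R picks the partition.
func ihash(key string) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() & 0x7fffffff)
}

// partition splits one map task's output into nReduce buckets.
func partition(kvs []KeyValue, nReduce int) [][]KeyValue {
	buckets := make([][]KeyValue, nReduce)
	for _, kv := range kvs {
		r := ihash(kv.Key) % nReduce
		buckets[r] = append(buckets[r], kv)
	}
	return buckets
}

// reducePartition sorts one partition by key, groups the values of each key,
// and applies reducef once per distinct key.
func reducePartition(kvs []KeyValue, reducef func(string, []string) string) []KeyValue {
	sort.Slice(kvs, func(i, j int) bool { return kvs[i].Key < kvs[j].Key })
	var out []KeyValue
	for i := 0; i < len(kvs); {
		j := i
		var values []string
		for j < len(kvs) && kvs[j].Key == kvs[i].Key {
			values = append(values, kvs[j].Value)
			j++
		}
		out = append(out, KeyValue{kvs[i].Key, reducef(kvs[i].Key, values)})
		i = j
	}
	return out
}

// wordCount runs the map, partition, and reduce phases sequentially in memory.
func wordCount(splits []string, nReduce int) map[string]string {
	inter := make([][]KeyValue, nReduce) // inter[r] gathers bucket r of every map task
	for _, s := range splits {
		var kvs []KeyValue
		for _, w := range strings.Fields(s) {
			kvs = append(kvs, KeyValue{w, "1"})
		}
		for r, b := range partition(kvs, nReduce) {
			inter[r] = append(inter[r], b...)
		}
	}
	count := func(_ string, vs []string) string { return strconv.Itoa(len(vs)) }
	result := map[string]string{}
	for r := 0; r < nReduce; r++ {
		for _, kv := range reducePartition(inter[r], count) {
			result[kv.Key] = kv.Value
		}
	}
	return result
}

func main() {
	fmt.Println(wordCount([]string{"a b a", "b c"}, 3))
}
```

Because every occurrence of a key hashes to the same partition, each reduce worker sees all values for its keys and no key is split across two output files.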

Conclusion

This lab was not that hard; it didn’t take me much time to finish. To be honest, the only reason I was stuck for a little while when running bash test-mr.sh was that a file path was incorrect (I changed it to an absolute path later).

#2 Key/Value Server

The goal is to implement an in-memory key/value server and its clients. It’s easy enough that there’s really not much to talk about.

Techniques Used

Still using RPC to call remote functions.
Using unique IDs to identify each client and each request.

Two things blocked me for a while

A silly mistake: in the loop below, Call() sits in the init clause, so it is executed only once. If that first call fails, ok never changes and the loop spins forever.
```
 for ok := ck.Call("Server.Put", &args, &reply); !ok; { // buggy: Call() runs only once
 }
```
Using unique IDs to identify values instead of requests: once another client changes the record, a retried request is no longer recognized as a duplicate, so the put/append gets applied twice.
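
Both problems can be illustrated with a small Go sketch. It is only a sketch under stated assumptions, not the lab's code: the RPC layer is replaced by a hypothetical `flakyNet` whose `Call` applies the request but drops the reply a few times, the field names `ClientID` and `ReqID` are made up, and each client is assumed to have one outstanding request at a time with `ReqID` starting at 1 and increasing.

```go
package main

import (
	"fmt"
	"sync"
)

// PutAppendArgs carries a (ClientID, ReqID) pair so the server can spot retries.
type PutAppendArgs struct {
	Key, Value string
	ClientID   int64
	ReqID      int64
}

// Server is an in-memory store that remembers the last request applied per client.
type Server struct {
	mu      sync.Mutex
	data    map[string]string
	lastReq map[int64]int64 // ClientID -> highest ReqID already applied
}

func NewServer() *Server {
	return &Server{data: map[string]string{}, lastReq: map[int64]int64{}}
}

// Append applies each request at most once. Duplicates are detected by request
// identity, not by looking at the stored value, so writes by other clients
// cannot confuse the check.
func (s *Server) Append(args *PutAppendArgs) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if args.ReqID <= s.lastReq[args.ClientID] {
		return // a retry of a request that was already applied
	}
	s.data[args.Key] += args.Value
	s.lastReq[args.ClientID] = args.ReqID
}

func (s *Server) Get(key string) string {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.data[key]
}

// flakyNet stands in for the RPC layer: the server executes the request,
// but the reply is lost for the first `drops` calls.
type flakyNet struct {
	s     *Server
	drops int
}

func (n *flakyNet) Call(args *PutAppendArgs) bool {
	n.s.Append(args)
	if n.drops > 0 {
		n.drops--
		return false
	}
	return true
}

// appendWithRetry re-issues Call in the loop condition, so every iteration
// really sends the request again. It returns the number of attempts.
func appendWithRetry(n *flakyNet, args *PutAppendArgs) int {
	attempts := 1
	for !n.Call(args) {
		attempts++
	}
	return attempts
}

func main() {
	s := NewServer()
	nw := &flakyNet{s: s, drops: 2}
	attempts := appendWithRetry(nw, &PutAppendArgs{Key: "k", Value: "x", ClientID: 1, ReqID: 1})
	appendWithRetry(nw, &PutAppendArgs{Key: "k", Value: "y", ClientID: 2, ReqID: 1})
	fmt.Println(attempts, s.Get("k"))
}
```

Even though the first append reaches the server three times, the stored value contains "x" only once, and the second client's append is unaffected by the first client's retries.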

Conclusion

The lab is marked Easy, which is true; it’s just that I was supposed to finish it really quickly.

#3 Raft

Election

I’ve drawn a figure summarizing the rules:

state_transfer

Also, some variables and details:

election_variables_rules

Log

The rules of log replication are illustrated in the following figure:

log_replication_basic

Here is an unreliable test that I spent a while debugging, along with some tips:

log_replication_atest

To back up faster, I also implemented the following technique:

log_replication_quick_backup

By sending XTerm, XIndex, and Len back to the leader, the follower lets the leader quickly decide what nextIndex[target] should be set to, instead of retrying one index at a time.
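
A minimal Go sketch of one common formulation of this quick backup is shown here. The details are assumptions rather than a copy of my implementation: the log keeps a sentinel entry at index 0, `XTerm` is -1 when the follower has no entry at `prevLogIndex`, `XLen` plays the role of Len, and the names `Entry`, `AppendReply`, `conflictInfo`, and `nextIndexAfterReject` are hypothetical.

```go
package main

import "fmt"

// Entry is a log entry; only the term matters for this sketch.
type Entry struct{ Term int }

// AppendReply holds the hints a follower returns with an AppendEntries reply.
type AppendReply struct {
	Success bool
	XTerm   int // term of the conflicting entry, or -1 if the log is too short
	XIndex  int // first index the follower stores for XTerm
	XLen    int // length of the follower's log
}

// conflictInfo is the follower side: check the entry at prevLogIndex and,
// on a mismatch, report the whole conflicting term in one reply.
func conflictInfo(log []Entry, prevLogIndex, prevLogTerm int) AppendReply {
	if prevLogIndex >= len(log) {
		return AppendReply{XTerm: -1, XIndex: -1, XLen: len(log)}
	}
	if log[prevLogIndex].Term == prevLogTerm {
		return AppendReply{Success: true, XLen: len(log)}
	}
	xTerm := log[prevLogIndex].Term
	xIndex := prevLogIndex
	for xIndex > 0 && log[xIndex-1].Term == xTerm {
		xIndex--
	}
	return AppendReply{XTerm: xTerm, XIndex: xIndex, XLen: len(log)}
}

// nextIndexAfterReject is the leader side: pick the new nextIndex for a
// follower that rejected AppendEntries.
func nextIndexAfterReject(leaderLog []Entry, r AppendReply) int {
	if r.XTerm == -1 {
		return r.XLen // follower's log is too short: continue from its end
	}
	for i := len(leaderLog) - 1; i > 0; i-- {
		if leaderLog[i].Term == r.XTerm {
			return i + 1 // leader has XTerm: one past its last entry of that term
		}
	}
	return r.XIndex // leader lacks XTerm: skip the follower's whole term
}

func logOf(terms ...int) []Entry {
	log := []Entry{{Term: 0}} // sentinel at index 0
	for _, t := range terms {
		log = append(log, Entry{Term: t})
	}
	return log
}

func main() {
	leader := logOf(4, 6, 6, 6)
	followers := [][]Entry{logOf(4, 5, 5), logOf(4, 4, 4), logOf(4)}
	for _, f := range followers {
		r := conflictInfo(f, 3, 6) // leader probes with prevLogIndex=3, prevLogTerm=6
		fmt.Println(nextIndexAfterReject(leader, r))
	}
}
```

In each of the three follower logs the leader lands on the right nextIndex after a single rejected RPC, rather than decrementing once per round trip.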