Tuesday, May 20, 2008

Multi-User Hadoop

I've been setting up Hadoop on a few (6) boxes and have been attempting to make it work for multiple users. It is not as easy as it sounds, because the docs are a bit spread out.

Nevertheless, if everyone is in the same group, then you need to set the HDFS supergroup (dfs.permissions.supergroup) to that group.


<property>
  <name>dfs.permissions.supergroup</name>
  <value>groupname</value>
</property>

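Before relying on that setting, it may help to confirm that every Hadoop user really is in the shared group. The following is only a rough sketch: HADOOP_GROUP and HADOOP_USERS are hypothetical variable names, and they default to the current user's own group and login so the script can be tried anywhere.

```shell
#!/bin/sh
# Hypothetical values: replace with the real shared group and the user names.
HADOOP_GROUP="${HADOOP_GROUP:-$(id -gn)}"
HADOOP_USERS="${HADOOP_USERS:-$(id -un)}"

for user in $HADOOP_USERS; do
  # id -Gn prints every group the user belongs to, separated by spaces.
  if id -Gn "$user" | tr ' ' '\n' | grep -qx "$HADOOP_GROUP"; then
    echo "$user: in $HADOOP_GROUP"
  else
    echo "$user: NOT in $HADOOP_GROUP"
  fi
done
```

For example, running it as HADOOP_GROUP=groupname HADOOP_USERS="alice bob" (made-up user names) on each box would show whether group membership is consistent across the cluster.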

You'll also need to change the group of any files you've already created.
hadoop dfs -chgrp -R groupname /

Next, to allow multiple users to run MapReduce jobs, you'll need to set the MapReduce system directory (mapred.system.dir) to a place that is accessible to ALL the boxes. I'm using an NFS mount, but you could use HDFS just as easily.


<property>
  <name>mapred.system.dir</name>
  <value>/mnt/myNFSMount/hadoop/mapred/system</value>
</property>


Make sure that directory exists and is writable by the group mentioned above (or at least that all your mapreduce users can write to it).
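For a directory on a local or NFS filesystem, that step could look roughly like the sketch below. SYSTEM_DIR and HADOOP_GROUP are hypothetical variable names; the defaults point at a throwaway temp directory and the current user's group so the commands can be tried safely before touching the real mount.

```shell
#!/bin/sh
# Hypothetical values: on the real cluster, SYSTEM_DIR would be the
# mapred.system.dir value and HADOOP_GROUP the shared group.
SYSTEM_DIR="${SYSTEM_DIR:-$(mktemp -d)/hadoop/mapred/system}"
HADOOP_GROUP="${HADOOP_GROUP:-$(id -gn)}"

# Create the directory along with any missing parents.
mkdir -p "$SYSTEM_DIR"

# Hand the directory to the shared group and give the group full access.
chgrp "$HADOOP_GROUP" "$SYSTEM_DIR"
chmod g+rwx "$SYSTEM_DIR"

# Show the resulting owner, group, and mode for a quick visual check.
ls -ld "$SYSTEM_DIR"
```

If the directory lives in HDFS instead, the same idea would presumably be expressed with the hadoop dfs equivalents rather than the local commands.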

At this point, that is all that I know needs to be changed. I have two people (including myself) using our Hadoop cluster, so I'll let you know as we run into more problems.

2 comments:

Anonymous said...

Hope this works (multi-user, I mean), because I'm busting my b***s off trying to figure that out.

Wasim Bari said...

Could you please elaborate more, or reference some documents for setting up a Hadoop cluster for multiple users?