org.apache.nutch.ndfs
Class FSNamesystem

java.lang.Object
  extended by org.apache.nutch.ndfs.FSNamesystem
All Implemented Interfaces:
FSConstants

public class FSNamesystem
extends Object
implements FSConstants

The FSNamesystem tracks several important tables:
1) valid fsname --> blocklist (kept on disk, logged)
2) Set of all valid blocks (inverted #1)
3) block --> machinelist (kept in memory, rebuilt dynamically from reports)
4) machine --> blocklist (inverted #3)
5) LRU cache of updated-heartbeat machines
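
The tables listed above can be pictured with ordinary collections. What follows is a minimal sketch, not the actual FSNamesystem fields: the class name, the field and method names, and the use of String and Long in place of UTF8 and Block are all assumptions made for illustration.

```java
import java.util.*;

// Hypothetical sketch of the five tables; names and types are illustrative only.
class NamesystemTablesSketch {
    // 1) valid fsname --> blocklist (the real class keeps this on disk and logs it)
    final Map<String, List<Long>> fileToBlocks = new TreeMap<>();
    // 2) set of all valid blocks (derived from table 1)
    final Set<Long> validBlocks = new HashSet<>();
    // 3) block --> machinelist (in memory, rebuilt from datanode reports)
    final Map<Long, Set<String>> blockToMachines = new HashMap<>();
    // 4) machine --> blocklist (the inverse of the block --> machinelist table)
    final Map<String, Set<Long>> machineToBlocks = new HashMap<>();
    // 5) machines ordered by most recent heartbeat (access-ordered map acts as an LRU)
    final LinkedHashMap<String, Long> heartbeats = new LinkedHashMap<>(16, 0.75f, true);

    void addFile(String name, List<Long> blocks) {
        fileToBlocks.put(name, new ArrayList<>(blocks));
        validBlocks.addAll(blocks);
    }

    // Keep tables 3 and 4 consistent with each other.
    void recordBlockAt(String machine, long block) {
        blockToMachines.computeIfAbsent(block, b -> new TreeSet<>()).add(machine);
        machineToBlocks.computeIfAbsent(machine, m -> new TreeSet<>()).add(block);
    }

    void heartbeat(String machine, long nowMillis) {
        heartbeats.put(machine, nowMillis); // moves the machine to the "most recent" end
    }

    // The machine heard from least recently is the first key in iteration order.
    String leastRecentlyHeard() {
        Iterator<String> it = heartbeats.keySet().iterator();
        return it.hasNext() ? it.next() : null;
    }
}
```

In this sketch, a machine that stops sending heartbeats drifts to the front of the heartbeats map, which makes stale machines cheap to find.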


Field Summary
static Logger LOG
           
 
Fields inherited from interface org.apache.nutch.ndfs.FSConstants
BLOCK_SIZE, BLOCKREPORT_INTERVAL, CHUNKED_ENCODING, COMPLETE_SUCCESS, DATANODE_STARTUP_PERIOD, EXPIRE_INTERVAL, HEARTBEAT_INTERVAL, LEASE_PERIOD, OBSOLETE_INTERVAL, OP_ACK, OP_BLOCKRECEIVED, OP_BLOCKREPORT, OP_CLIENT_ABANDONBLOCK, OP_CLIENT_ABANDONBLOCK_ACK, OP_CLIENT_ADDBLOCK, OP_CLIENT_ADDBLOCK_ACK, OP_CLIENT_COMPLETEFILE, OP_CLIENT_COMPLETEFILE_ACK, OP_CLIENT_DATANODE_HINTS, OP_CLIENT_DATANODE_HINTS_ACK, OP_CLIENT_DATANODEREPORT, OP_CLIENT_DATANODEREPORT_ACK, OP_CLIENT_DELETE, OP_CLIENT_DELETE_ACK, OP_CLIENT_EXISTS, OP_CLIENT_EXISTS_ACK, OP_CLIENT_ISDIR, OP_CLIENT_ISDIR_ACK, OP_CLIENT_LISTING, OP_CLIENT_LISTING_ACK, OP_CLIENT_MKDIRS, OP_CLIENT_MKDIRS_ACK, OP_CLIENT_OBTAINLOCK, OP_CLIENT_OBTAINLOCK_ACK, OP_CLIENT_OPEN, OP_CLIENT_OPEN_ACK, OP_CLIENT_RAWSTATS, OP_CLIENT_RAWSTATS_ACK, OP_CLIENT_RELEASELOCK, OP_CLIENT_RELEASELOCK_ACK, OP_CLIENT_RENAMETO, OP_CLIENT_RENAMETO_ACK, OP_CLIENT_RENEW_LEASE, OP_CLIENT_RENEW_LEASE_ACK, OP_CLIENT_STARTFILE, OP_CLIENT_STARTFILE_ACK, OP_CLIENT_TRYAGAIN, OP_ERROR, OP_FAILURE, OP_HEARTBEAT, OP_INVALIDATE_BLOCKS, OP_READ_BLOCK, OP_READSKIP_BLOCK, OP_TRANSFERBLOCKS, OP_TRANSFERDATA, OP_WRITE_BLOCK, OPERATION_FAILED, RUNLENGTH_ENCODING, STILL_WAITING, SYSTEM_STARTUP_PERIOD, WRITE_COMPLETE
 
Constructor Summary
FSNamesystem(File dir)
          dir is where the filesystem directory state is stored
 
Method Summary
 boolean abandonBlock(Block b, UTF8 src)
          The client would like to let go of the given block
 void blockReceived(Block block, UTF8 name)
          The given node is reporting that it received a certain block.
 Block[] checkObsoleteBlocks(UTF8 name)
          If the node has not been checked in some time, go through its blocks and find which ones are neither valid nor pending.
 void close()
           
 int completeFile(UTF8 src, UTF8 holder)
          Finalize the created file and make it world-accessible.
 DatanodeInfo[] datanodeReport()
           
 boolean delete(UTF8 src)
          Remove the indicated filename from the namespace.
 boolean exists(UTF8 src)
          Return whether the given filename exists
 Object[] getAdditionalBlock(UTF8 src)
          The client would like to obtain an additional block for the indicated filename (which is being written-to).
 UTF8[] getDatanodeHints(UTF8 src, long offset)
          Figure out a few hosts that are likely to contain the block referred to by the given (filename, offset) pair.
 NDFSFileInfo[] getListing(UTF8 src)
          Get a listing of all files at 'src'.
 void gotHeartbeat(UTF8 name, long capacity, long remaining)
          The given node has reported in.
 boolean isDir(UTF8 src)
          Whether the given name is a directory
 boolean mkdirs(UTF8 src)
          Create all the necessary directories
 int obtainLock(UTF8 src, UTF8 holder, boolean exclusive)
          Get a lock (perhaps exclusive) on the given file
 Object[] open(UTF8 src)
          The client wants to open the given filename.
 Object[] pendingTransfers(DatanodeInfo srcNode, int maxXfers)
          Return with a list of Block/DatanodeInfo sets, indicating where various Blocks should be copied as soon as possible.
 void processReport(Block[] newReport, UTF8 name)
          The given node is reporting all its blocks.
 Block[] recentlyInvalidBlocks(UTF8 name)
          Return with a list of Blocks that should be invalidated at the given node.
 int releaseLock(UTF8 src, UTF8 holder)
          Release the lock on the given file
 boolean renameTo(UTF8 src, UTF8 dst)
          Change the indicated filename.
 void renewLease(UTF8 holder)
          Renew the lease(s) held by the given client
 Object[] startFile(UTF8 src, UTF8 holder, boolean overwrite)
          The client would like to create a new block for the indicated filename.
 long totalCapacity()
          Total raw bytes
 long totalRemaining()
          Total non-used raw bytes
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LOG

public static final Logger LOG

Constructor Detail

FSNamesystem

public FSNamesystem(File dir)
             throws IOException
dir is where the filesystem directory state is stored

Method Detail

close

public void close()

open

public Object[] open(UTF8 src)
The client wants to open the given filename. Return a list of (block,machineArray) pairs. The sequence of unique blocks in the list indicates all the blocks that make up the filename. The client should choose one of the machines from the machineArray at random.
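
A client-side view of this contract might look like the sketch below. The exact layout of the returned Object[] is not spelled out here, so the BlockLocation type, its fields, and the chooseReadPlan helper are hypothetical stand-ins for one (block, machineArray) pair and for the random choice the client is asked to make.

```java
import java.util.*;

class OpenResultSketch {
    // Hypothetical stand-in for one (block, machineArray) pair.
    static final class BlockLocation {
        final long blockId;
        final String[] machines;

        BlockLocation(long blockId, String... machines) {
            this.blockId = blockId;
            this.machines = machines;
        }
    }

    // For each block, in file order, pick one holding machine uniformly at random.
    static List<String> chooseReadPlan(List<BlockLocation> locations, Random rnd) {
        List<String> plan = new ArrayList<>();
        for (BlockLocation loc : locations) {
            if (loc.machines.length == 0) {
                throw new IllegalStateException("no machine holds block " + loc.blockId);
            }
            plan.add(loc.machines[rnd.nextInt(loc.machines.length)]);
        }
        return plan;
    }
}
```

Choosing at random spreads read load across the replicas without any coordination between clients.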


startFile

public Object[] startFile(UTF8 src,
                          UTF8 holder,
                          boolean overwrite)
The client would like to create a new block for the indicated filename. Return an array that consists of the block, plus a set of machines. The first on this list should be where the client writes data. Subsequent items in the list must be provided in the connection to the first datanode.


getAdditionalBlock

public Object[] getAdditionalBlock(UTF8 src)
The client would like to obtain an additional block for the indicated filename (which is being written to). Return an array that consists of the block, plus a set of machines. The first machine on this list is where the client should write data. Subsequent items in the list must be provided in the connection to the first datanode. Before returning, make sure the previous blocks have been reported by datanodes and are replicated. Returns an empty two-element array if the client should "try again later".
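
The "try again later" convention suggests a retry loop on the caller's side. The sketch below is one way to write it, under two assumptions that this page does not confirm: that an "empty" reply means both array slots are null, and that the caller wraps the request in a Supplier. The class and method names are hypothetical.

```java
import java.util.function.Supplier;

class AdditionalBlockRetrySketch {
    // Assumption: an "empty" reply is a 2-element array whose entries are both null.
    static boolean isTryAgain(Object[] reply) {
        return reply != null && reply.length == 2 && reply[0] == null && reply[1] == null;
    }

    // Issue the (hypothetical) request up to maxAttempts times, pausing between tries.
    // Returns the first reply that is not "try again", or null if none arrives.
    static Object[] requestWithRetry(Supplier<Object[]> request, int maxAttempts, long sleepMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            Object[] reply = request.get();
            if (!isTryAgain(reply)) {
                return reply;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return null;
            }
        }
        return null;
    }
}
```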


abandonBlock

public boolean abandonBlock(Block b,
                            UTF8 src)
The client would like to let go of the given block


completeFile

public int completeFile(UTF8 src,
                        UTF8 holder)
Finalize the created file and make it world-accessible. The FSNamesystem will already know the blocks that make up the file. Before we return, we make sure that all the file's blocks have been reported by datanodes and are replicated correctly.
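
Because the method returns an int and may have to wait for block reports, a caller plausibly polls it. The sketch below assumes, without confirmation from this page, that the result is one of the inherited STILL_WAITING, COMPLETE_SUCCESS, and OPERATION_FAILED constants. The numeric values shown are placeholders, and the IntSupplier wrapper and method name are hypothetical.

```java
import java.util.function.IntSupplier;

class CompleteFilePollSketch {
    // Placeholder values: the real FSConstants values are not given on this page.
    static final int OPERATION_FAILED = 0;
    static final int STILL_WAITING = 1;
    static final int COMPLETE_SUCCESS = 2;

    // Poll until the call stops answering STILL_WAITING or the attempt budget runs out.
    static boolean waitForCompletion(IntSupplier completeFileCall, int maxAttempts) {
        for (int i = 0; i < maxAttempts; i++) {
            int status = completeFileCall.getAsInt();
            if (status == COMPLETE_SUCCESS) {
                return true;
            }
            if (status == OPERATION_FAILED) {
                return false;
            }
            // STILL_WAITING: some blocks are not yet reported or replicated; ask again.
        }
        return false;
    }
}
```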


renameTo

public boolean renameTo(UTF8 src,
                        UTF8 dst)
Change the indicated filename.


delete

public boolean delete(UTF8 src)
Remove the indicated filename from the namespace. This may invalidate some blocks that make up the file.


exists

public boolean exists(UTF8 src)
Return whether the given filename exists


isDir

public boolean isDir(UTF8 src)
Whether the given name is a directory


mkdirs

public boolean mkdirs(UTF8 src)
Create all the necessary directories


getDatanodeHints

public UTF8[] getDatanodeHints(UTF8 src,
                               long offset)
Figure out a few hosts that are likely to contain the block referred to by the given (filename, offset) pair.
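
Mapping an offset to a block can be sketched as integer division, assuming every block except possibly the last holds exactly the same number of bytes. That assumption, the blockSize parameter (standing in for BLOCK_SIZE, whose value is not given here), and the helper names are all hypothetical.

```java
class DatanodeHintsSketch {
    // Assumption: all blocks but the last hold exactly blockSize bytes.
    static long blockIndexFor(long offset, long blockSize) {
        if (offset < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("offset must be >= 0 and blockSize > 0");
        }
        return offset / blockSize;
    }

    // Machines for the block covering the offset, or an empty array past the end of the file.
    static String[] hintsFor(String[][] machinesPerBlock, long offset, long blockSize) {
        long idx = blockIndexFor(offset, blockSize);
        return idx < machinesPerBlock.length ? machinesPerBlock[(int) idx] : new String[0];
    }
}
```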


obtainLock

public int obtainLock(UTF8 src,
                      UTF8 holder,
                      boolean exclusive)
Get a lock (perhaps exclusive) on the given file


releaseLock

public int releaseLock(UTF8 src,
                       UTF8 holder)
Release the lock on the given file


renewLease

public void renewLease(UTF8 holder)
Renew the lease(s) held by the given client


getListing

public NDFSFileInfo[] getListing(UTF8 src)
Get a listing of all files at 'src'. The NDFSFileInfo[] array exists so we can return file attributes (soon to be implemented)


gotHeartbeat

public void gotHeartbeat(UTF8 name,
                         long capacity,
                         long remaining)
The given node has reported in. This method should:
1) Record the heartbeat, so the datanode isn't timed out
2) Adjust usage stats for future block allocation
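
The two duties can be sketched with a per-node record, as below. This is an illustration only: the NodeStats type, the explicit nowMillis parameter, the expired helper, and its expireMillis timeout are assumptions, and String stands in for UTF8.

```java
import java.util.*;

class HeartbeatSketch {
    // Hypothetical per-node record.
    static final class NodeStats {
        long lastHeartbeatMillis;
        long capacity;
        long remaining;
    }

    private final Map<String, NodeStats> nodes = new HashMap<>();

    // 1) Record the heartbeat time; 2) refresh the usage figures used for allocation.
    void gotHeartbeat(String name, long capacity, long remaining, long nowMillis) {
        NodeStats s = nodes.computeIfAbsent(name, n -> new NodeStats());
        s.lastHeartbeatMillis = nowMillis;
        s.capacity = capacity;
        s.remaining = remaining;
    }

    // Names of nodes whose last heartbeat is older than expireMillis, sorted.
    List<String> expired(long nowMillis, long expireMillis) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, NodeStats> e : nodes.entrySet()) {
            if (nowMillis - e.getValue().lastHeartbeatMillis > expireMillis) {
                out.add(e.getKey());
            }
        }
        Collections.sort(out);
        return out;
    }

    long totalCapacity() {
        return nodes.values().stream().mapToLong(s -> s.capacity).sum();
    }

    long totalRemaining() {
        return nodes.values().stream().mapToLong(s -> s.remaining).sum();
    }
}
```

Keeping the latest capacity and remaining figures per node also makes cluster-wide totals a simple sum over the records.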


processReport

public void processReport(Block[] newReport,
                          UTF8 name)
The given node is reporting all its blocks. Use this info to update the (machine-->blocklist) and (block-->machinelist) tables.
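
One way to keep the two tables in step is to diff the node's previous block list against the new report. The sketch below is an assumption about how such an update could work, not the actual implementation. It uses long and String in place of Block and UTF8, and the class and field names are hypothetical.

```java
import java.util.*;

class BlockReportSketch {
    final Map<Long, Set<String>> blockToMachines = new HashMap<>();
    final Map<String, Set<Long>> machineToBlocks = new HashMap<>();

    // Replace the node's block list with newReport and update the inverse table to match.
    void processReport(long[] newReport, String name) {
        Set<Long> reported = new TreeSet<>();
        for (long b : newReport) {
            reported.add(b);
        }
        Set<Long> previous = machineToBlocks.getOrDefault(name, Collections.emptySet());

        // Blocks the node no longer holds: drop this node from their machine lists.
        for (Long b : previous) {
            if (!reported.contains(b)) {
                Set<String> holders = blockToMachines.get(b);
                if (holders != null) {
                    holders.remove(name);
                    if (holders.isEmpty()) {
                        blockToMachines.remove(b);
                    }
                }
            }
        }
        // Blocks the node holds now: make sure it appears in each machine list.
        for (Long b : reported) {
            blockToMachines.computeIfAbsent(b, k -> new TreeSet<>()).add(name);
        }
        machineToBlocks.put(name, reported);
    }
}
```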


blockReceived

public void blockReceived(Block block,
                          UTF8 name)
The given node is reporting that it received a certain block.


totalCapacity

public long totalCapacity()
Total raw bytes


totalRemaining

public long totalRemaining()
Total non-used raw bytes


datanodeReport

public DatanodeInfo[] datanodeReport()

recentlyInvalidBlocks

public Block[] recentlyInvalidBlocks(UTF8 name)
Return with a list of Blocks that should be invalidated at the given node. Done in response to a file delete, which eliminates a number of blocks from the universe.


checkObsoleteBlocks

public Block[] checkObsoleteBlocks(UTF8 name)
If the node has not been checked in some time, go through its blocks and find which ones are neither valid nor pending. It often happens that a client will start writing blocks and then exit. The blocks are on-disk, but the file will be abandoned. It's not enough to invalidate blocks at lease expiry time; datanodes can go down before the client's lease on the failed file expires and miss the "expire" event. This function considers every block on a datanode, and thus should only be invoked infrequently.
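
The core of the check is a set difference: blocks on the node that are in neither the valid set nor the pending set. The sketch below shows that filter plus a simple "has enough time passed" guard. The method names, the use of Long for Block, and the intervalMillis parameter are assumptions for illustration.

```java
import java.util.*;

class ObsoleteBlocksSketch {
    // True when the node has not been checked for at least intervalMillis.
    static boolean dueForCheck(long lastCheckedMillis, long nowMillis, long intervalMillis) {
        return nowMillis - lastCheckedMillis >= intervalMillis;
    }

    // Blocks on the node that are neither valid nor pending, in ascending order.
    static long[] findObsolete(Set<Long> onNode, Set<Long> valid, Set<Long> pending) {
        return onNode.stream()
                .filter(b -> !valid.contains(b) && !pending.contains(b))
                .mapToLong(Long::longValue)
                .sorted()
                .toArray();
    }
}
```

The filter visits every block on the node, which is why the guard matters: the scan cost grows with the node's block count.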


pendingTransfers

public Object[] pendingTransfers(DatanodeInfo srcNode,
                                 int maxXfers)
Return with a list of Block/DatanodeInfo sets, indicating where various Blocks should be copied as soon as possible. The array that we return consists of two objects: the first element is an array of Blocks; the second element is a 2D array of DatanodeInfo objects, identifying the target sequence for the Block at the corresponding index.
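
The two-element, parallel-array shape of the result can be sketched as follows. Strings stand in for Block and DatanodeInfo, and the pack and describe helpers are hypothetical. Only the layout (element 0 holds the blocks, element 1 holds one target sequence per block) follows the description above.

```java
import java.util.*;

class PendingTransfersSketch {
    // Build the two-element result: [0] = blocks, [1] = per-block target sequences.
    static Object[] pack(String[] blocks, String[][] targets) {
        if (blocks.length != targets.length) {
            throw new IllegalArgumentException("one target sequence per block is required");
        }
        return new Object[] {blocks, targets};
    }

    // Walk the parallel arrays and describe each copy the source node should start.
    static List<String> describe(Object[] result) {
        String[] blocks = (String[]) result[0];
        String[][] targets = (String[][]) result[1];
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < blocks.length; i++) {
            lines.add(blocks[i] + " -> " + String.join(",", targets[i]));
        }
        return lines;
    }
}
```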



Copyright © 2006 The Apache Software Foundation