org.apache.hadoop.fs
Class FileSystem

java.lang.Object
  extended by org.apache.hadoop.conf.Configured
      extended by org.apache.hadoop.fs.FileSystem
All Implemented Interfaces:
Closeable, Configurable
Direct Known Subclasses:
DistributedFileSystem, FilterFileSystem, HftpFileSystem, KosmosFileSystem, RawLocalFileSystem, S3FileSystem

public abstract class FileSystem
extends Configured
implements Closeable

An abstract base class for a fairly generic filesystem. It may be implemented as a distributed filesystem, or as a "local" one that reflects the locally-connected disk. The local version exists for small Hadoop instances and for testing.

All user code that may potentially use the Hadoop Distributed File System should be written to use a FileSystem object. The Hadoop DFS is a multi-machine system that appears as a single disk. It's useful because of its fault tolerance and potentially very large capacity.

The local implementation is LocalFileSystem and the distributed implementation is DistributedFileSystem.


Field Summary
static org.apache.commons.logging.Log LOG
           
 
Constructor Summary
protected FileSystem()
           
 
Method Summary
protected  void checkPath(Path path)
          Check that a Path belongs to this FileSystem.
 void close()
          No more filesystem operations are needed.
static void closeAll()
          Close all cached filesystems.
 void completeLocalOutput(Path fsOutputFile, Path tmpLocalFile)
          Called when we're all done writing to the target.
 void copyFromLocalFile(boolean delSrc, boolean overwrite, Path src, Path dst)
          The src file is on the local disk.
 void copyFromLocalFile(boolean delSrc, Path src, Path dst)
          The src file is on the local disk.
 void copyFromLocalFile(Path src, Path dst)
          The src file is on the local disk.
 void copyToLocalFile(boolean delSrc, Path src, Path dst)
          The src file is under FS, and the dst is on the local disk.
 void copyToLocalFile(Path src, Path dst)
          The src file is under FS, and the dst is on the local disk.
static FSDataOutputStream create(FileSystem fs, Path file, FsPermission permission)
          Create a file with the provided permission. The permission of the file is set to be the provided permission as in setPermission, not permission&~umask. It is implemented using two RPCs.
 FSDataOutputStream create(Path f)
          Opens an FSDataOutputStream at the indicated Path.
 FSDataOutputStream create(Path f, boolean overwrite)
          Opens an FSDataOutputStream at the indicated Path.
 FSDataOutputStream create(Path f, boolean overwrite, int bufferSize)
          Opens an FSDataOutputStream at the indicated Path.
 FSDataOutputStream create(Path f, boolean overwrite, int bufferSize, Progressable progress)
          Opens an FSDataOutputStream at the indicated Path with write-progress reporting.
 FSDataOutputStream create(Path f, boolean overwrite, int bufferSize, short replication, long blockSize)
          Opens an FSDataOutputStream at the indicated Path.
 FSDataOutputStream create(Path f, boolean overwrite, int bufferSize, short replication, long blockSize, Progressable progress)
          Opens an FSDataOutputStream at the indicated Path with write-progress reporting.
abstract  FSDataOutputStream create(Path f, FsPermission permission, boolean overwrite, int bufferSize, short replication, long blockSize, Progressable progress)
          Opens an FSDataOutputStream at the indicated Path with write-progress reporting.
 FSDataOutputStream create(Path f, Progressable progress)
          Create an FSDataOutputStream at the indicated Path with write-progress reporting.
 FSDataOutputStream create(Path f, short replication)
          Opens an FSDataOutputStream at the indicated Path.
 FSDataOutputStream create(Path f, short replication, Progressable progress)
          Opens an FSDataOutputStream at the indicated Path with write-progress reporting.
 boolean createNewFile(Path f)
          Creates the given Path as a brand-new zero-length file.
abstract  boolean delete(Path f)
          Delete a file.
abstract  boolean exists(Path f)
          Check whether the path exists.
static FileSystem get(Configuration conf)
          Returns the configured filesystem implementation.
static FileSystem get(URI uri, Configuration conf)
          Returns the FileSystem for this URI's scheme and authority.
 long getBlockSize(Path f)
          Deprecated. Use getFileStatus() instead
 long getContentLength(Path f)
          Return the number of bytes of the given path. If f is a file, return the size of the file; if f is a directory, return the size of the directory tree.
 long getDefaultBlockSize()
          Return the number of bytes that large input files should optimally be split into to minimize I/O time.
 short getDefaultReplication()
          Get the default replication.
 String[][] getFileCacheHints(Path f, long start, long len)
          Return a 2D array of size 1x1 or greater, containing hostnames where portions of the given file can be found.
abstract  FileStatus getFileStatus(Path f)
          Return a file status object that represents the path.
 Path getHomeDirectory()
          Return the current user's home directory in this filesystem.
 long getLength(Path f)
          Deprecated. Use getFileStatus() instead
static LocalFileSystem getLocal(Configuration conf)
          Get the local file system.
 String getName()
          Deprecated. Call getUri() instead.
static FileSystem getNamed(String name, Configuration conf)
          Deprecated. Call get(URI, Configuration) instead.
 short getReplication(Path src)
          Deprecated. Use getFileStatus() instead
abstract  URI getUri()
          Returns a URI whose scheme and authority identify this FileSystem.
 long getUsed()
          Return the total size of all files in the filesystem.
abstract  Path getWorkingDirectory()
          Get the current working directory for the given file system.
 Path[] globPaths(Path filePattern)
          Deprecated. 
 Path[] globPaths(Path filePattern, PathFilter filter)
          Deprecated. 
 FileStatus[] globStatus(Path pathPattern)
          Return all the files that match pathPattern and are not checksum files.
 FileStatus[] globStatus(Path pathPattern, PathFilter filter)
          Return an array of FileStatus objects whose path names match pathPattern and are accepted by the user-supplied path filter.
abstract  void initialize(URI name, Configuration conf)
          Called after a new FileSystem instance is constructed.
 boolean isDirectory(Path f)
          Deprecated. Use getFileStatus() instead
 boolean isFile(Path f)
          True iff the named path is a regular file.
 Path[] listPaths(Path f)
          Deprecated. 
 Path[] listPaths(Path[] files)
          Deprecated. 
 Path[] listPaths(Path[] files, PathFilter filter)
          Deprecated. 
 Path[] listPaths(Path f, PathFilter filter)
          Deprecated. 
abstract  FileStatus[] listStatus(Path f)
          List the statuses of the files/directories in the given path if the path is a directory.
 FileStatus[] listStatus(Path f, PathFilter filter)
          Filter files/directories in the given path using the user-supplied path filter.
 Path makeQualified(Path path)
          Make sure that a path specifies a FileSystem.
static boolean mkdirs(FileSystem fs, Path dir, FsPermission permission)
          Create a directory with the provided permission. The permission of the directory is set to be the provided permission as in setPermission, not permission&~umask.
 boolean mkdirs(Path f)
          Call mkdirs(Path, FsPermission) with default permission.
abstract  boolean mkdirs(Path f, FsPermission permission)
          Make the given file and all non-existent parents into directories.
 void moveFromLocalFile(Path src, Path dst)
          The src file is on the local disk.
 void moveToLocalFile(Path src, Path dst)
          The src file is under FS, and the dst is on the local disk.
 FSDataInputStream open(Path f)
          Opens an FSDataInputStream at the indicated Path.
abstract  FSDataInputStream open(Path f, int bufferSize)
          Opens an FSDataInputStream at the indicated Path.
static FileSystem parseArgs(String[] argv, int i, Configuration conf)
          Parse the cmd-line args, starting at i.
abstract  boolean rename(Path src, Path dst)
          Renames Path src to Path dst.
 void setOwner(Path p, String username, String groupname)
          Set owner of a path (i.e. a file or a directory).
 void setPermission(Path p, FsPermission permission)
          Set permission of a path.
 boolean setReplication(Path src, short replication)
          Set replication for an existing file.
abstract  void setWorkingDirectory(Path new_dir)
          Set the current working directory for the given file system.
 Path startLocalOutput(Path fsOutputFile, Path tmpLocalFile)
          Returns a local File that the user can write output to.
 
Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LOG

public static final org.apache.commons.logging.Log LOG
Constructor Detail

FileSystem

protected FileSystem()
Method Detail

parseArgs

public static FileSystem parseArgs(String[] argv,
                                   int i,
                                   Configuration conf)
                            throws IOException
Parse the cmd-line args, starting at i. Remove consumed args from array. We expect param in the form: '-local | -dfs '

Throws:
IOException

get

public static FileSystem get(Configuration conf)
                      throws IOException
Returns the configured filesystem implementation.

Throws:
IOException

initialize

public abstract void initialize(URI name,
                                Configuration conf)
                         throws IOException
Called after a new FileSystem instance is constructed.

Parameters:
name - a uri whose authority section names the host, port, etc. for this FileSystem
conf - the configuration
Throws:
IOException

getUri

public abstract URI getUri()
Returns a URI whose scheme and authority identify this FileSystem.


getName

public String getName()
Deprecated. Call getUri() instead.


getNamed

public static FileSystem getNamed(String name,
                                  Configuration conf)
                           throws IOException
Deprecated. Call get(URI, Configuration) instead.

Throws:
IOException

getLocal

public static LocalFileSystem getLocal(Configuration conf)
                                throws IOException
Get the local file system.

Parameters:
conf - the configuration to configure the file system with
Returns:
a LocalFileSystem
Throws:
IOException

get

public static FileSystem get(URI uri,
                             Configuration conf)
                      throws IOException
Returns the FileSystem for this URI's scheme and authority. The scheme of the URI determines a configuration property name, fs.scheme.class (where scheme stands for the URI's actual scheme), whose value names the FileSystem class. The entire URI is passed to the FileSystem instance's initialize method.

Throws:
IOException
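
As an illustration of the lookup described for get(URI, Configuration), the following minimal sketch derives the property name from a URI's scheme. It uses only the JDK: a plain Map stands in for the Configuration, and the scheme "myfs" and the class name com.example.MyFileSystem are hypothetical values chosen for the example, not real Hadoop settings.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

class SchemeLookupSketch {
    /** Builds the property name for a URI's scheme, e.g. "fs.myfs.class". */
    public static String propertyNameFor(URI uri) {
        return "fs." + uri.getScheme() + ".class";
    }

    public static void main(String[] args) {
        // Stand-in for a Configuration; key and value are hypothetical.
        Map<String, String> conf = new HashMap<>();
        conf.put("fs.myfs.class", "com.example.MyFileSystem");

        URI uri = URI.create("myfs://host.example.com:9000/user/alice");
        String key = propertyNameFor(uri);
        System.out.println(key + " -> " + conf.get(key));
    }
}
```

The value found under that key names the implementation class; the whole URI, not just the scheme, is then handed to initialize.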

closeAll

public static void closeAll()
                     throws IOException
Close all cached filesystems. Be sure those filesystems are not used anymore.

Throws:
IOException

makeQualified

public Path makeQualified(Path path)
Make sure that a path specifies a FileSystem.
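
One way to picture qualification is sketched below with java.net.URI. This is an assumption about the behavior, not a statement of the implementation: the sketch supposes that a path lacking a scheme is made absolute against a working directory and then given the filesystem's scheme and authority. The class name, the qualify helper, and the example host are all hypothetical.

```java
import java.net.URI;

class QualifySketch {
    /**
     * Returns a URI that names a filesystem. Assumed behavior: a path that
     * already has a scheme is returned unchanged; otherwise it is resolved
     * against workingDir (if relative) and prefixed with fsUri's scheme
     * and authority.
     */
    public static URI qualify(URI fsUri, String workingDir, String path) {
        URI p = URI.create(path);
        if (p.getScheme() != null) {
            return p; // already specifies a filesystem
        }
        String absolute = path.startsWith("/") ? path : workingDir + "/" + path;
        return URI.create(fsUri.getScheme() + "://" + fsUri.getAuthority() + absolute);
    }

    public static void main(String[] args) {
        URI fs = URI.create("hdfs://namenode.example.com:9000");
        System.out.println(qualify(fs, "/user/alice", "data/part-0"));
        System.out.println(qualify(fs, "/user/alice", "/tmp/x"));
    }
}
```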


create

public static FSDataOutputStream create(FileSystem fs,
                                        Path file,
                                        FsPermission permission)
                                 throws IOException
Create a file with the provided permission. The permission of the file is set to be the provided permission as in setPermission, not permission&~umask. It is implemented using two RPCs. This is known to be inefficient, but the implementation is thread-safe. The other option is to change the value of umask in the configuration to 0, but that is not thread-safe.

Parameters:
fs - file system handle
file - the name of the file to be created
permission - the permission of the file
Returns:
an output stream
Throws:
IOException
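
The difference between "the provided permission" and permission&~umask can be seen with plain integer arithmetic. The sketch below is only an illustration: the requested mode 0777 and the umask 022 are assumed example values, and the helper applyUmask is hypothetical rather than part of this API.

```java
class PermissionMaskSketch {
    /** What a umask-filtered create would yield: permission & ~umask. */
    public static int applyUmask(int permission, int umask) {
        return permission & ~umask & 0777;
    }

    public static void main(String[] args) {
        int requested = 0777; // assumed example mode (octal literal)
        int umask = 0022;     // assumed example umask (octal literal)

        // Filtered by the umask: group and other lose their write bits.
        System.out.println(Integer.toOctalString(applyUmask(requested, umask))); // prints 755
        // Set exactly as provided, as this method does via a second call.
        System.out.println(Integer.toOctalString(requested));                    // prints 777
    }
}
```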

mkdirs

public static boolean mkdirs(FileSystem fs,
                             Path dir,
                             FsPermission permission)
                      throws IOException
Create a directory with the provided permission. The permission of the directory is set to be the provided permission as in setPermission, not permission&~umask.

Parameters:
fs - file system handle
dir - the name of the directory to be created
permission - the permission of the directory
Returns:
true if the directory creation succeeds; false otherwise
Throws:
IOException
See Also:
create(FileSystem, Path, FsPermission)

checkPath

protected void checkPath(Path path)
Check that a Path belongs to this FileSystem.


getFileCacheHints

public String[][] getFileCacheHints(Path f,
                                    long start,
                                    long len)
                             throws IOException
Return a 2D array of size 1x1 or greater, containing hostnames where portions of the given file can be found. For a nonexistent file or region, null will be returned. This call is most helpful with DFS, where it returns hostnames of machines that contain the given file. The base FileSystem implementation simply returns an element containing 'localhost'.

Throws:
IOException

open

public abstract FSDataInputStream open(Path f,
                                       int bufferSize)
                                throws IOException
Opens an FSDataInputStream at the indicated Path.

Parameters:
f - the file name to open
bufferSize - the size of the buffer to be used.
Throws:
IOException

open

public FSDataInputStream open(Path f)
                       throws IOException
Opens an FSDataInputStream at the indicated Path.

Parameters:
f - the file to open
Throws:
IOException

create

public FSDataOutputStream create(Path f)
                          throws IOException
Opens an FSDataOutputStream at the indicated Path. Files are overwritten by default.

Throws:
IOException

create

public FSDataOutputStream create(Path f,
                                 boolean overwrite)
                          throws IOException
Opens an FSDataOutputStream at the indicated Path.

Throws:
IOException

create

public FSDataOutputStream create(Path f,
                                 Progressable progress)
                          throws IOException
Create an FSDataOutputStream at the indicated Path with write-progress reporting. Files are overwritten by default.

Throws:
IOException

create

public FSDataOutputStream create(Path f,
                                 short replication)
                          throws IOException
Opens an FSDataOutputStream at the indicated Path. Files are overwritten by default.

Throws:
IOException

create

public FSDataOutputStream create(Path f,
                                 short replication,
                                 Progressable progress)
                          throws IOException
Opens an FSDataOutputStream at the indicated Path with write-progress reporting. Files are overwritten by default.

Throws:
IOException

create

public FSDataOutputStream create(Path f,
                                 boolean overwrite,
                                 int bufferSize)
                          throws IOException
Opens an FSDataOutputStream at the indicated Path.

Parameters:
f - the file name to open
overwrite - if a file with this name already exists, then if true, the file will be overwritten, and if false an error will be thrown.
bufferSize - the size of the buffer to be used.
Throws:
IOException

create

public FSDataOutputStream create(Path f,
                                 boolean overwrite,
                                 int bufferSize,
                                 Progressable progress)
                          throws IOException
Opens an FSDataOutputStream at the indicated Path with write-progress reporting.

Parameters:
f - the file name to open
overwrite - if a file with this name already exists, then if true, the file will be overwritten, and if false an error will be thrown.
bufferSize - the size of the buffer to be used.
Throws:
IOException

create

public FSDataOutputStream create(Path f,
                                 boolean overwrite,
                                 int bufferSize,
                                 short replication,
                                 long blockSize)
                          throws IOException
Opens an FSDataOutputStream at the indicated Path.

Parameters:
f - the file name to open
overwrite - if a file with this name already exists, then if true, the file will be overwritten, and if false an error will be thrown.
bufferSize - the size of the buffer to be used.
replication - required block replication for the file.
Throws:
IOException

create

public FSDataOutputStream create(Path f,
                                 boolean overwrite,
                                 int bufferSize,
                                 short replication,
                                 long blockSize,
                                 Progressable progress)
                          throws IOException
Opens an FSDataOutputStream at the indicated Path with write-progress reporting.

Parameters:
f - the file name to open
overwrite - if a file with this name already exists, then if true, the file will be overwritten, and if false an error will be thrown.
bufferSize - the size of the buffer to be used.
replication - required block replication for the file.
Throws:
IOException

create

public abstract FSDataOutputStream create(Path f,
                                          FsPermission permission,
                                          boolean overwrite,
                                          int bufferSize,
                                          short replication,
                                          long blockSize,
                                          Progressable progress)
                                   throws IOException
Opens an FSDataOutputStream at the indicated Path with write-progress reporting.

Parameters:
f - the file name to open
permission -
overwrite - if a file with this name already exists, then if true, the file will be overwritten, and if false an error will be thrown.
bufferSize - the size of the buffer to be used.
replication - required block replication for the file.
blockSize -
progress -
Throws:
IOException
See Also:
setPermission(Path, FsPermission)

createNewFile

public boolean createNewFile(Path f)
                      throws IOException
Creates the given Path as a brand-new zero-length file. If create fails, or if it already existed, return false.

Throws:
IOException

getReplication

@Deprecated
public short getReplication(Path src)
                     throws IOException
Deprecated. Use getFileStatus() instead

Get replication.

Parameters:
src - file name
Returns:
file replication
Throws:
IOException

setReplication

public boolean setReplication(Path src,
                              short replication)
                       throws IOException
Set replication for an existing file.

Parameters:
src - file name
replication - new replication
Returns:
true if successful; false if file does not exist or is a directory
Throws:
IOException

rename

public abstract boolean rename(Path src,
                               Path dst)
                        throws IOException
Renames Path src to Path dst. Can take place on local fs or remote DFS.

Throws:
IOException

delete

public abstract boolean delete(Path f)
                        throws IOException
Delete a file.

Throws:
IOException

exists

public abstract boolean exists(Path f)
                        throws IOException
Check whether the path exists.

Parameters:
f - source file
Throws:
IOException

isDirectory

@Deprecated
public boolean isDirectory(Path f)
                    throws IOException
Deprecated. Use getFileStatus() instead

Throws:
IOException

isFile

public boolean isFile(Path f)
               throws IOException
True iff the named path is a regular file.

Throws:
IOException

getLength

@Deprecated
public long getLength(Path f)
               throws IOException
Deprecated. Use getFileStatus() instead

Throws:
IOException

getContentLength

public long getContentLength(Path f)
                      throws IOException
Return the number of bytes of the given path. If f is a file, return the size of the file; if f is a directory, return the size of the directory tree.

Throws:
IOException
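
The file-versus-directory rule for getContentLength can be mirrored on the local disk with java.nio.file. The sketch below is a local analogy under stated assumptions, not the Hadoop implementation: Path here is java.nio.file.Path, and the class and helper names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

class ContentLengthSketch {
    private static long sizeOf(Path p) {
        try {
            return Files.size(p);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** File: its own size. Directory: sum of all regular files beneath it. */
    public static long contentLength(Path p) {
        if (Files.isRegularFile(p)) {
            return sizeOf(p);
        }
        try (Stream<Path> tree = Files.walk(p)) {
            return tree.filter(Files::isRegularFile)
                       .mapToLong(ContentLengthSketch::sizeOf)
                       .sum();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a small temporary tree (3 bytes + 5 bytes) and measures it. */
    public static long demo() {
        try {
            Path root = Files.createTempDirectory("content-length-sketch");
            Files.write(root.resolve("a.txt"), "abc".getBytes(StandardCharsets.US_ASCII));
            Path sub = Files.createDirectories(root.resolve("sub"));
            Files.write(sub.resolve("b.txt"), "hello".getBytes(StandardCharsets.US_ASCII));
            return contentLength(root);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 8
    }
}
```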

listPaths

@Deprecated
public Path[] listPaths(Path f)
                 throws IOException
Deprecated. 

List files in a directory.

Throws:
IOException

listStatus

public abstract FileStatus[] listStatus(Path f)
                                 throws IOException
List the statuses of the files/directories in the given path if the path is a directory.

Parameters:
f - given path
Returns:
the statuses of the files/directories in the given path
Throws:
IOException

listStatus

public FileStatus[] listStatus(Path f,
                               PathFilter filter)
                        throws IOException
Filter files/directories in the given path using the user-supplied path filter.

Parameters:
f - a path name
filter - the user-supplied path filter
Returns:
an array of FileStatus objects for the files under the given path after applying the filter
Throws:
IOException - if any problem is encountered while fetching the status

listPaths

@Deprecated
public Path[] listPaths(Path[] files)
                 throws IOException
Deprecated. 

Filter files in the given paths using the default checksum filter.

Parameters:
files - a list of paths
Returns:
a list of files under the source paths
Throws:
IOException

listPaths

@Deprecated
public Path[] listPaths(Path f,
                        PathFilter filter)
                 throws IOException
Deprecated. 

Filter files in a directory.

Throws:
IOException

listPaths

@Deprecated
public Path[] listPaths(Path[] files,
                        PathFilter filter)
                 throws IOException
Deprecated. 

Filter files in a list of directories using a user-supplied path filter.

Parameters:
files - a list of paths
Returns:
a list of files under the source paths
Throws:
IOException

globStatus

public FileStatus[] globStatus(Path pathPattern)
                        throws IOException

Return all the files that match pathPattern and are not checksum files. Results are sorted by their names.

A filename pattern is composed of regular characters and special pattern matching characters, which are:

?
Matches any single character.

*
Matches zero or more characters.

[abc]
Matches a single character from character set {a,b,c}.

[a-b]
Matches a single character from the character range {a...b}. Note that character a must be lexicographically less than or equal to character b.

[^a]
Matches a single character that is not from character set or range {a}. Note that the ^ character must occur immediately to the right of the opening bracket.

\c
Removes (escapes) any special meaning of character c.

{ab,cd}
Matches a string from the string set {ab, cd}.

{ab,c{de,fh}}
Matches a string from the string set {ab, cde, cfh}.

Parameters:
pathPattern - a regular expression specifying a path pattern
Returns:
an array of paths that match the path pattern
Throws:
IOException
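
To make the pattern syntax above concrete, here is a minimal sketch that translates such a pattern into a java.util.regex expression and matches it against a single name. It is not the matcher this class uses: the class GlobSketch and its methods are hypothetical, it matches one name rather than a full path with separators, and it does not validate malformed patterns such as an unclosed brace.

```java
import java.util.regex.Pattern;

class GlobSketch {
    /** Translates a glob in the syntax listed above into a regex string. */
    public static String toRegex(String glob) {
        StringBuilder re = new StringBuilder();
        int braceDepth = 0;
        for (int i = 0; i < glob.length(); i++) {
            char c = glob.charAt(i);
            switch (c) {
                case '\\': // \c removes any special meaning of c
                    if (i + 1 < glob.length()) {
                        re.append(Pattern.quote(String.valueOf(glob.charAt(++i))));
                    } else {
                        re.append("\\\\"); // trailing backslash: treat as literal
                    }
                    break;
                case '?': // any single character
                    re.append('.');
                    break;
                case '*': // zero or more characters
                    re.append(".*");
                    break;
                case '[': { // [abc], [a-b], [^a] are copied through as regex classes
                    int close = glob.indexOf(']', i + 1);
                    if (close < 0) {
                        re.append("\\[");
                    } else {
                        re.append(glob, i, close + 1);
                        i = close;
                    }
                    break;
                }
                case '{': // start of a string set, possibly nested
                    re.append("(?:");
                    braceDepth++;
                    break;
                case '}':
                    if (braceDepth > 0) {
                        re.append(')');
                        braceDepth--;
                    } else {
                        re.append("\\}");
                    }
                    break;
                case ',': // separates alternatives only inside braces
                    re.append(braceDepth > 0 ? "|" : ",");
                    break;
                default:
                    re.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return re.toString();
    }

    public static boolean matches(String glob, String name) {
        return Pattern.matches(toRegex(glob), name);
    }

    public static void main(String[] args) {
        System.out.println(matches("part-0000?", "part-00003"));
        System.out.println(matches("{ab,c{de,fh}}", "cfh"));
        System.out.println(matches("[^a]x", "ax"));
    }
}
```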

globStatus

public FileStatus[] globStatus(Path pathPattern,
                               PathFilter filter)
                        throws IOException
Return an array of FileStatus objects whose path names match pathPattern and are accepted by the user-supplied path filter. Results are sorted by their path names. Returns null if pathPattern has no glob and the path does not exist. Returns an empty array if pathPattern has a glob and no path matches it.

Parameters:
pathPattern - a regular expression specifying the path pattern
filter - a user-supplied path filter
Returns:
an array of FileStatus objects
Throws:
IOException - if any I/O error occurs when fetching file status

globPaths

@Deprecated
public Path[] globPaths(Path filePattern)
                 throws IOException
Deprecated. 

Glob all the path names that match filePattern using the default filter.

Throws:
IOException

globPaths

@Deprecated
public Path[] globPaths(Path filePattern,
                        PathFilter filter)
                 throws IOException
Deprecated. 

Glob all the path names that match filePattern and are accepted by filter.

Throws:
IOException

getHomeDirectory

public Path getHomeDirectory()
Return the current user's home directory in this filesystem. The default implementation returns "/user/$USER/".


setWorkingDirectory

public abstract void setWorkingDirectory(Path new_dir)
Set the current working directory for the given file system. All relative paths will be resolved relative to it.

Parameters:
new_dir -

getWorkingDirectory

public abstract Path getWorkingDirectory()
Get the current working directory for the given file system.

Returns:
the directory pathname

mkdirs

public boolean mkdirs(Path f)
               throws IOException
Call mkdirs(Path, FsPermission) with default permission.

Throws:
IOException

mkdirs

public abstract boolean mkdirs(Path f,
                               FsPermission permission)
                        throws IOException
Make the given file and all non-existent parents into directories. Has the semantics of Unix 'mkdir -p'. Existence of the directory hierarchy is not an error.

Throws:
IOException
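
The 'mkdir -p' semantics described for mkdirs have a close local counterpart in java.nio.file.Files.createDirectories, which also creates missing parents and does not fail when the directory already exists. The sketch below shows that analogy only; Path is java.nio.file.Path, and the class and helper names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class MkdirsSketch {
    /** Creates dir and any missing parents; an existing directory is not an error. */
    public static boolean mkdirs(Path dir) {
        try {
            Files.createDirectories(dir);
            return Files.isDirectory(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Creates a nested directory twice; both calls are expected to succeed. */
    public static boolean demo() {
        try {
            Path root = Files.createTempDirectory("mkdirs-sketch");
            Path nested = root.resolve("a").resolve("b").resolve("c");
            boolean first = mkdirs(nested);  // creates a, a/b and a/b/c
            boolean second = mkdirs(nested); // hierarchy already exists
            return first && second;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints true
    }
}
```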

copyFromLocalFile

public void copyFromLocalFile(Path src,
                              Path dst)
                       throws IOException
The src file is on the local disk. Add it to FS at the given dst name; the source is kept intact afterwards.

Throws:
IOException

moveFromLocalFile

public void moveFromLocalFile(Path src,
                              Path dst)
                       throws IOException
The src file is on the local disk. Add it to FS at the given dst name, removing the source afterwards.

Throws:
IOException

copyFromLocalFile

public void copyFromLocalFile(boolean delSrc,
                              Path src,
                              Path dst)
                       throws IOException
The src file is on the local disk. Add it to FS at the given dst name. delSrc indicates whether the source should be removed.

Throws:
IOException

copyFromLocalFile

public void copyFromLocalFile(boolean delSrc,
                              boolean overwrite,
                              Path src,
                              Path dst)
                       throws IOException
The src file is on the local disk. Add it to FS at the given dst name. delSrc indicates whether the source should be removed.

Throws:
IOException

copyToLocalFile

public void copyToLocalFile(Path src,
                            Path dst)
                     throws IOException
The src file is under FS, and the dst is on the local disk. Copy it from FS control to the local dst name.

Throws:
IOException

moveToLocalFile

public void moveToLocalFile(Path src,
                            Path dst)
                     throws IOException
The src file is under FS, and the dst is on the local disk. Copy it from FS control to the local dst name. Remove the source afterwards.

Throws:
IOException

copyToLocalFile

public void copyToLocalFile(boolean delSrc,
                            Path src,
                            Path dst)
                     throws IOException
The src file is under FS, and the dst is on the local disk. Copy it from FS control to the local dst name. delSrc indicates if the src will be removed or not.

Throws:
IOException

startLocalOutput

public Path startLocalOutput(Path fsOutputFile,
                             Path tmpLocalFile)
                      throws IOException
Returns a local File that the user can write output to. The caller provides both the eventual FS target name and the local working file. If the FS is local, we write directly into the target. If the FS is remote, we write into the tmp local area.

Throws:
IOException

completeLocalOutput

public void completeLocalOutput(Path fsOutputFile,
                                Path tmpLocalFile)
                         throws IOException
Called when we're all done writing to the target. A local FS will do nothing, because we've written to exactly the right place. A remote FS will copy the contents of tmpLocalFile to the correct target at fsOutputFile.

Throws:
IOException
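
The startLocalOutput / completeLocalOutput pair can be pictured with the following local sketch. It is an illustration under assumptions rather than the Hadoop implementation: Path is java.nio.file.Path, the boolean fsIsLocal is a hypothetical flag standing in for "the FS is local", and the "remote" target is simulated by another local file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class LocalOutputSketch {
    /** Local FS: write straight into the target. Remote FS: write into the tmp file. */
    public static Path startLocalOutput(boolean fsIsLocal, Path fsOutputFile, Path tmpLocalFile) {
        return fsIsLocal ? fsOutputFile : tmpLocalFile;
    }

    /** Local FS: nothing to do. Remote FS: copy the tmp file's contents to the target. */
    public static void completeLocalOutput(boolean fsIsLocal, Path fsOutputFile, Path tmpLocalFile) {
        if (fsIsLocal) {
            return;
        }
        try {
            Files.copy(tmpLocalFile, fsOutputFile, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Runs the simulated remote case end to end and returns what landed at the target. */
    public static String demo() {
        try {
            Path dir = Files.createTempDirectory("local-output-sketch");
            Path target = dir.resolve("target.txt"); // stands in for the FS-side file
            Path tmp = dir.resolve("tmp.txt");
            Path out = startLocalOutput(false, target, tmp);
            Files.write(out, "done".getBytes(StandardCharsets.UTF_8));
            completeLocalOutput(false, target, tmp);
            return new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints done
    }
}
```

The caller always writes to whatever path startLocalOutput returns, so the same code works whether or not a copy step is needed afterwards.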

close

public void close()
           throws IOException
No more filesystem operations are needed. Will release any held locks.

Specified by:
close in interface Closeable
Throws:
IOException

getUsed

public long getUsed()
             throws IOException
Return the total size of all files in the filesystem.

Throws:
IOException

getBlockSize

@Deprecated
public long getBlockSize(Path f)
                  throws IOException
Deprecated. Use getFileStatus() instead

Throws:
IOException

getDefaultBlockSize

public long getDefaultBlockSize()
Return the number of bytes that large input files should optimally be split into to minimize I/O time.


getDefaultReplication

public short getDefaultReplication()
Get the default replication.


getFileStatus

public abstract FileStatus getFileStatus(Path f)
                                  throws IOException
Return a file status object that represents the path.

Parameters:
f - The path we want information from
Returns:
a FileStatus object
Throws:
FileNotFoundException - when the path does not exist
IOException - see specific implementation

setPermission

public void setPermission(Path p,
                          FsPermission permission)
                   throws IOException
Set permission of a path.

Parameters:
p -
permission -
Throws:
IOException

setOwner

public void setOwner(Path p,
                     String username,
                     String groupname)
              throws IOException
Set owner of a path (i.e. a file or a directory). The parameters username and groupname cannot both be null.

Parameters:
p - The path
username - If it is null, the original username remains unchanged.
groupname - If it is null, the original groupname remains unchanged.
Throws:
IOException


Copyright © 2008 The Apache Software Foundation