LibraryToggle FramesPrintFeedback
file:directoryName[?options]

or

file://directoryName[?options]

Where directoryName represents the underlying file directory.

You can append query options to the URI in the following format, ?option=value&option=value&...

[Tip]Only directories

Fuse Mediation Router 2.0 only support endpoints configured with a starting directory. So the directoryName must be a directory. If you want to consume a single file only, you can use the fileName option, e.g. by setting fileName=thefilename. Also, the starting directory must not contain dynamic expressions with ${ } placeholders. Again use the fileName option to specify the dynamic part of the filename.

In Fuse Mediation Router 1.x you could also configure a file and this caused more harm than good as it could lead to confusing situations.

[Warning]Avoid reading files currently being written by another application

Beware the JDK File IO API is a bit limited in detecting whether another application is currently writing/copying a file. And the implementation can be different depending on OS platform as well. This could lead to that Fuse Mediation Router thinks the file is not locked by another process and start consuming it. Therefore you have to do you own investigation as to what suits your environment. To help with this, Fuse Mediation Router provides different readLock options and the doneFileOption option that you can use. See also the section Consuming files from folders where others drop files directly.

Name Default Value Description
autoCreate true Automatically create missing directories in the file's pathname. For the file consumer, that means creating the starting directory. For the file producer, it means the directory where the files should be written.
bufferSize 128kb Write buffer sized in bytes.
fileName null Use Expression such as File Language to dynamically set the filename. For consumers, it's used as a filename filter. For producers, it's used to evaluate the filename to write. If an expression is set, it take precedence over the CamelFileName header. (Note: The header itself can also be an Expression). The expression options support both String and Expression types. If the expression is a String type, it is always evaluated using the File Language. If the expression is an Expression type, the specified Expression type is used - this allows you, for instance, to use OGNL expressions. For the consumer, you can use it to filter filenames, so you can for instance consume today's file using the File Language syntax: mydata-${date:now:yyyyMMdd}.txt.
flatten false Flatten is used to flatten the file name path to strip any leading paths, so it's just the file name. This allows you to consume recursively into sub-directories, but when you eg write the files to another directory they will be written in a single directory. Setting this to true on the producer enforces that any file name recived in CamelFileName header will be stripped for any leading paths.
charset null Camel 2.5: this option is used to specify the encoding of the file, and camel will set the Exchange property with Exchange.CHARSET_NAME with the value of this option.
Name Default Value Description
initialDelay 1000 Milliseconds before polling the file/directory starts.
delay 500 Milliseconds before the next poll of the file/directory.
useFixedDelay true Set to true to use fixed delay between pools, otherwise fixed rate is used. See ScheduledExecutorService in JDK for details.
runLoggingLevel TRACE Camel 2.8: The consumer logs a start/complete log line when it polls. This option allows you to configure the logging level for that.
recursive false If a directory, will look for files in all the sub-directories as well.
delete false If true, the file will be deleted after it is processed
noop false If true, the file is not moved or deleted in any way. This option is good for readonly data, or for ETL type requirements. If noop=true, Fuse Mediation Router will set idempotent=true as well, to avoid consuming the same files over and over again.
preMove null Use Expression such as File Language to dynamically set the filename when moving it before processing. For example to move in-progress files into the order directory set this value to order.
move .camel Use Expression such as File Language to dynamically set the filename when moving it after processing. To move files into a .done subdirectory just enter .done.
moveFailed null Use Expression such as File Language to dynamically set the filename when moving failed files after processing. To move files into a error subdirectory just enter error. Note: When moving the files to another location it can/will handle the error when you move it to another location so Fuse Mediation Router cannot pick up the file again.
include null Is used to include files, if filename matches the regex pattern.
exclude null Is used to exclude files, if filename matches the regex pattern.
idempotent false Option to use the Idempotent Consumer EIP pattern to let Fuse Mediation Router skip already processed files. Will by default use a memory based LRUCache that holds 1000 entries. If noop=true then idempotent will be enabled as well to avoid consuming the same files over and over again.
idempotentRepository null Pluggable repository as a org.apache.camel.processor.idempotent.MessageIdRepository class. Will by default use MemoryMessageIdRepository if none is specified and idempotent is true.
inProgressRepository memory Pluggable in-progress repository as a org.apache.camel.processor.idempotent.MessageIdRepository class. The in-progress repository is used to account the current in progress files being consumed. By default a memory based repository is used.
filter null Pluggable filter as a org.apache.camel.component.file.GenericFileFilter class. Will skip files if filter returns false in its accept() method. Fuse Mediation Router also ships with an ANT path matcher filter in the camel-spring component. More details in section below.
sorter null Pluggable sorter as a java.util.Comparator<org.apache.camel.component.file.GenericFile> class.
sortBy null Built-in sort using the File Language. Supports nested sorts, so you can have a sort by file name and as a 2nd group sort by modified date. See sorting section below for details.
readLock markerFile

Used by consumer, to only poll the files if it has exclusive read-lock on the file (i.e. the file is not in-progress or being written). Fuse Mediation Router will wait until the file lock is granted.

The readLock option supports the following built-in strategies:

  • markerFile is the behaviour from Fuse Mediation Router 1.x, where Fuse Mediation Router will create a marker file and hold a lock on the marker file. This option is not available for the FTP component.

  • changed uses a length/modification timestamp to detect whether the file is currently being copied or not. Will wait at least 1 second to determine this, so this option cannot consume files as fast as the others, but can be more reliable as the JDK IO API cannot always determine whether a file is currently being used by another process.

  • rename attempts to rename the file, in order to test whether we can get an exclusive read-lock.

  • none is for no read locks at all.

readLockTimeout 0 (for FTP, 2000) Optional timeout in milliseconds for the read-lock, if supported by the read-lock. If the read-lock could not be granted and the timeout triggered, then Fuse Mediation Router will skip the file. At next poll Fuse Mediation Router, will try the file again, and this time maybe the read-lock could be granted. Currently fileLock, changed and rename support the timeout.
readLockCheckInterval 1000 (for FTP, 5000) Camel 2.6: Interval in millis for the read-lock, if supported by the read lock. This interval is used for sleeping between attempts to acquire the read lock. For example when using the changed read lock, you can set a higher interval period to cater for slow writes. The default of 1 sec. may be too fast if the producer is very slow writing the file.
exclusiveReadLockStrategy null Pluggable read-lock as a org.apache.camel.component.file.GenericFileExclusiveReadLockStrategy implementation.
minDepth 0 Camel 2.8: The minimum depth to start processing when recursively processing a directory. Using minDepth=1 means the base directory. Using minDepth=2 means the first sub directory. This option is not supported by FTP consumer.
maxDepth Integer.MAX_VALUE Camel 2.8: The maximum depth to traverse when recursively processing a directory. This option is not supported by FTP consumer.
doneFileName null Camel 2.6: If provided, Camel will only consume files if a done file exists. This option configures what file name to use. Either you can specify a fixed name. Or you can use dynamic placeholders. The done file is always expected in the same folder as the original file. See using done file and writing done file sections for examples.
processStrategy null A pluggable org.apache.camel.component.file.GenericFileProcessStrategy allowing you to implement your own readLock option or similar. Can also be used when special conditions must be met before a file can be consumed, such as a special ready file exists. If this option is set then the readLock option does not apply.
maxMessagesPerPoll 0 An integer that defines the maximum number of messages to gather per poll. By default, no maximum is set. Can be used to set a limit of e.g. 1000 to avoid having the server read thousands of files as it starts up. Set a value of 0 or negative to disabled it.
startingDirectoryMustExist false Whether the starting directory must exist. Mind that the autoCreate option is default enabled, which means the starting directory is normally auto-created if it doesn't exist. You can disable autoCreate and enable this to ensure the starting directory must exist. Will throw an exception, if the directory doesn't exist.
directoryMustExist false Similar to startingDirectoryMustExist but this applies during polling recursive sub-directories.
Name Default Value Description
fileExist Override What to do if a file already exists with the same name. The following values can be specified: Override, Append, Fail and Ignore. Override, which is the default, replaces the existing file. Append adds content to the existing file. Fail throws a GenericFileOperationException, indicating that there is already an existing file. Ignore silently ignores the problem and does not override the existing file, but assumes everything is okay.
tempPrefix null This option is used to write the file using a temporary name and then, after the write is complete, rename it to the real name. Can be used to identify files being written and also avoid consumers (not using exclusive read locks) reading in progress files. Is often used by FTP when uploading big files.
tempFileName null Camel 2.1: The same as tempPrefix option but offering a more fine grained control on the naming of the temporary filename as it uses the File Language.
keepLastModified false Camel 2.2: Will keep the last modified timestamp from the source file (if any). Will use the Exchange.FILE_LAST_MODIFIED header to located the timestamp. This header can contain either a java.util.Date or long with the timestamp. If the timestamp exists and the option is enabled it will set this timestamp on the written file. Note: This option only applies to the file producer. You cannot use this option with any of the ftp producers.
eagerDeleteTargetFile true Camel 2.3: Whether or not to eagerly delete any existing target file. This option only applies when you use fileExists=Override and the tempFileName option as well. You can use this to disable (set it to false) deleting the target file before the temp file is written. For example you may write big files and want the target file to exists during the temp file is being written. This ensure the target file is only deleted until the very last moment, just before the temp file is being renamed to the target filename.
doneFileName null Camel 2.6: If provided, then Camel will write a 2nd done file when the original file has been written. The done file will be empty. This option configures what file name to use. Either you can specify a fixed name. Or you can use dynamic placeholders. The done file will always be written in the same folder as the original file. See writing done file section for examples.

Any move or delete operations is executed after (post command) the routing has completed; so during processing of the Exchange the file is still located in the inbox folder.

Lets illustrate this with an example:

    from("file://inbox?move=.done").to("bean:handleOrder");

When a file is dropped in the inbox folder, the file consumer notices this and creates a new FileExchange that is routed to the handleOrder bean. The bean then processes the File object. At this point in time the file is still located in the inbox folder. After the bean completes, and thus the route is completed, the file consumer will perform the move operation and move the file to the .done sub-folder.

The move and preMove options should be a directory name, which can be either relative or absolute. If relative, the directory is created as a sub-folder from within the folder where the file was consumed.

By default, Fuse Mediation Router will move consumed files to the .camel sub-folder relative to the directory where the file was consumed.

If you want to delete the file after processing, the route should be:

    from("file://inobox?delete=true").to("bean:handleOrder");

We have introduced a pre move operation to move files before they are processed. This allows you to mark which files have been scanned as they are moved to this sub folder before being processed.

    from("file://inbox?preMove=inprogress").to("bean:handleOrder");

You can combine the pre move and the regular move:

    from("file://inbox?preMove=inprogress&move=.done").to("bean:handleOrder");

So in this situation, the file is in the inprogress folder when being processed and after it's processed, it's moved to the .done folder.

The move and preMove option is Expression-based, so we have the full power of the File Language to do advanced configuration of the directory and name pattern. Fuse Mediation Router will, in fact, internally convert the directory name you enter into a File Language expression. So when we enter move=.done Fuse Mediation Router will convert this into: ${file:parent}/.done/${file:onlyname}. This is only done if Fuse Mediation Router detects that you have not provided a ${ } in the option value yourself. So when you enter an expression containing ${ }, the expression is interpreted as a File Language expression.

So if we want to move the file into a backup folder with today's date as the pattern, we can do:

move=backup/${date:now:yyyyMMdd}/${file:name}
Header Description
CamelFileName Specifies the name of the file to write (relative to the endpoint directory). The name can be a String; a String with a File Language or Simple expression; or an Expression object. If it's null then Fuse Mediation Router will auto-generate a filename based on the message unique ID.
CamelFileNameProduced The actual absolute filepath (path + name) for the output file that was written. This header is set by Camel and its purpose is providing end-users with the name of the file that was written.
Header Description
CamelFileName Name of the consumed file as a relative file path with offset from the starting directory configured on the endpoint.
CamelFileNameOnly Only the file name (the name with no leading paths).
CamelFileAbsolute A boolean option specifying whether the consumed file denotes an absolute path or not. Should normally be false for relative paths. Absolute paths should normally not be used but we added to the move option to allow moving files to absolute paths. But can be used elsewhere as well.
CamelFileAbsolutePath The absolute path to the file. For relative files this path holds the relative path instead.
CamelFilePath The file path. For relative files this is the starting directory + the relative filename. For absolute files this is the absolute path.
CamelFileRelativePath The relative path.
CamelFileParent The parent path.
CamelFileLength A long value containing the file size.
CamelFileLastModified A Date value containing the last modified timestamp of the file.

As the file consumer is BatchConsumer it supports batching the files it polls. By batching it means that Fuse Mediation Router will add some properties to the Exchange so you know the number of files polled the current index in that order.

Property Description
CamelBatchSize The total number of files that was polled in this batch.
CamelBatchIndex The current index of the batch. Starts from 0.
CamelBatchComplete A boolean value indicating the last Exchange in the batch. Is only true for the last entry.

This allows you for instance to know how many files exists in this batch and for instance let the Aggregator2 aggregate this number of files.

Fuse Mediation Router is of course also able to write files, i.e. produce files. In the sample below we receive some reports on the SEDA queue that we process before they are written to a directory.

public void testToFile() throws Exception {
    MockEndpoint mock = getMockEndpoint("mock:result");
    mock.expectedMessageCount(1);
    mock.expectedFileExists("target/test-reports/report.txt");

    template.sendBody("direct:reports", "This is a great report");

    assertMockEndpointsSatisfied();
}

protected JndiRegistry createRegistry() throws Exception {
    // bind our processor in the registry with the given id
    JndiRegistry reg = super.createRegistry();
    reg.bind("processReport", new ProcessReport());
    return reg;
}

protected RouteBuilder createRouteBuilder() throws Exception {
    return new RouteBuilder() {
        public void configure() throws Exception {
            // the reports from the seda queue is processed by our processor
            // before they are written to files in the target/reports directory
            from("direct:reports").processRef("processReport").to("file://target/test-reports", "mock:result");
        }
    };
}

private static class ProcessReport implements Processor {

    public void process(Exchange exchange) throws Exception {
        String body = exchange.getIn().getBody(String.class);
        // do some business logic here

        // set the output to the file
        exchange.getOut().setBody(body);

        // set the output filename using java code logic, notice that this is done by setting
        // a special header property of the out exchange
        exchange.getOut().setHeader(Exchange.FILE_NAME, "report.txt");
    }

}

Fuse Mediation Router supports Idempotent Consumer directly within the component so it will skip already processed files. This feature can be enabled by setting the idempotent=true option.

from("file://inbox?idempotent=true").to("...");

By default Fuse Mediation Router uses an in-memory based store for keeping track of consumed files, it uses a least recently used cache holding up to 1000 entries. You can plugin your own implementation of this store by using the idempotentRepository option using the # sign in the value to indicate it's a referring to a bean in the Registry with the specified id.

   <!-- define our store as a plain spring bean -->
   <bean id="myStore" class="com.mycompany.MyIdempotentStore"/>

  <route>
    <from uri="file://inbox?idempotent=true&dempotentRepository=#myStore"/>
    <to uri="bean:processInbox"/>
  </route>

Fuse Mediation Router will log at DEBUG level if it skips a file because it has been consumed before:

DEBUG FileConsumer is idempotent and the file has been consumed before. Will skip this file: target\idempotent\report.txt

In this section we will use the file based idempotent repository org.apache.camel.processor.idempotent.FileIdempotentRepository instead of the in-memory based that is used as default. This repository uses a 1st level cache to avoid reading the file repository. It will only use the file repository to store the content of the 1st level cache. Thereby the repository can survive server restarts. It will load the content of the file into the 1st level cache upon startup. The file structure is very simple as it stores the key in separate lines in the file. By default, the file store has a size limit of 1mb and when the file grows larger, Fuse Mediation Router will truncate the file store and rebuild the content by flushing the 1st level cache into a fresh empty file.

We configure our repository using Spring XML creating our file idempotent repository and define our file consumer to use our repository with the idempotentRepository using \# sign to indicate Registry lookup:

<!-- this is our file based idempotent store configured to use the .filestore.dat as file -->
<bean id="fileStore" class="org.apache.camel.processor.idempotent.FileIdempotentRepository">
    <!-- the filename for the store -->
    <property name="fileStore" value="target/fileidempotent/.filestore.dat"/>
    <!-- the max filesize in bytes for the file. Fuse Mediation Router will trunk and flush the cache
         if the file gets bigger -->
    <property name="maxFileStoreSize" value="512000"/>
    <!-- the number of elements in our store -->
    <property name="cacheSize" value="250"/>
</bean>

<camelContext xmlns="http://camel.apache.org/schema/spring">
    <route>
        <from uri="file://target/fileidempotent/?idempotent=true&dempotentRepository=#fileStore&ove=done/${file:name}"/>
        <to uri="mock:result"/>
    </route>
</camelContext>

In this section we will use the JPA based idempotent repository instead of the in-memory based that is used as default.

First we need a persistence-unit in META-INF/persistence.xml where we need to use the class org.apache.camel.processor.idempotent.jpa.MessageProcessed as model.

<persistence-unit name="idempotentDb" transaction-type="RESOURCE_LOCAL">
  <class>org.apache.camel.processor.idempotent.jpa.MessageProcessed</class>

  <properties>
    <property name="openjpa.ConnectionURL" value="jdbc:derby:target/idempotentTest;create=true"/>
    <property name="openjpa.ConnectionDriverName" value="org.apache.derby.jdbc.EmbeddedDriver"/>
    <property name="openjpa.jdbc.SynchronizeMappings" value="buildSchema"/>
    <property name="openjpa.Log" value="DefaultLevel=WARN, Tool=INFO"/>
  </properties>
</persistence-unit>

Then we need to setup a Spring jpaTemplate in the spring XML file:

<!-- this is standard spring JPA configuration -->
<bean id="jpaTemplate" class="org.springframework.orm.jpa.JpaTemplate">
    <property name="entityManagerFactory" ref="entityManagerFactory"/>
</bean>

<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalEntityManagerFactoryBean">
    <!-- we use idempotentDB as the persitence unit name defined in the persistence.xml file -->
    <property name="persistenceUnitName" value="idempotentDb"/>
</bean>

And finally we can create our JPA idempotent repository in the spring XML file as well:

<!-- we define our jpa based idempotent repository we want to use in the file consumer -->
<bean id="jpaStore" class="org.apache.camel.processor.idempotent.jpa.JpaMessageIdRepository">
    <!-- Here we refer to the spring jpaTemplate -->
    <constructor-arg index="0" ref="jpaTemplate"/>
    <!-- This 2nd parameter is the name  (= a cateogry name).
         You can have different repositories with different names -->
    <constructor-arg index="1" value="FileConsumer"/>
</bean>

And then we just need to reference the jpaStore bean in the file consumer endpoint, using the idempotentRepository option and the # syntax:

  <route>
    <from uri="file://inbox?idempotent=true&dempotentRepository=#jpaStore"/>
    <to uri="bean:processInbox"/>
  </route>

The ANT path matcher is shipped out-of-the-box in the camel-spring jar. So you need to depend on camel-spring if you are using Maven. The reasons is that we leverage Spring's AntPathMatcher to do the actual matching.

The file paths is matched with the following rules:

  • ? matches one character

  • * matches zero or more characters

  • ** matches zero or more directories in a path

The sample below demonstrates how to use it:

<camelContext xmlns="http://camel.apache.org/schema/spring">
    <template id="camelTemplate"/>

    <!-- use myFilter as filter to allow setting ANT paths for which files to scan for -->
    <endpoint id="myFileEndpoint" uri="file://target/antpathmatcher?recursive=true&ilter=#myAntFilter"/>

    <route>
        <from ref="myFileEndpoint"/>
        <to uri="mock:result"/>
    </route>
</camelContext>

<!-- we use the antpath file filter to use ant paths for includes and exlucde -->
<bean id="myAntFilter" class="org.apache.camel.component.file.AntPathMatcherGenericFileFilter">
    <!-- include and file in the subfolder that has day in the name -->
    <property name="includes" value="**/subfolder/**/*day*"/>
    <!-- exclude all files with bad in name or .xml files. Use comma to seperate multiple excludes -->
    <property name="excludes" value="**/*bad*,**/*.xml"/>
</bean>

Fuse Mediation Router supports pluggable sorting strategies. This strategy it to use the File Language to configure the sorting. The sortBy option is configured as follows:

sortBy=group 1;group 2;group 3;...

Where each group is separated with semi colon. In the simple situations you just use one group, so a simple example could be:

sortBy=file:name

This will sort by file name, you can reverse the order by prefixing reverse: to the group, so the sorting is now Z..A:

sortBy=reverse:file:name

As we have the full power of File Language we can use some of the other parameters, so if we want to sort by file size we do:

sortBy=file:length

You can configure to ignore the case, using ignoreCase: for string comparison, so if you want to use file name sorting but to ignore the case then we do:

sortBy=ignoreCase:file:name

You can combine ignore case and reverse, however reverse must be specified first:

sortBy=reverse:ignoreCase:file:name

In the sample below we want to sort by last modified file, so we do:

sortBy=file:modifed

And then we want to group by name as a 2nd option so files with same modifcation is sorted by name:

sortBy=file:modifed;file:name

Now there is an issue here, can you spot it? Well the modified timestamp of the file is too fine as it will be in milliseconds, but what if we want to sort by date only and then subgroup by name? Well as we have the true power of File Language we can use the its date command that supports patterns. So this can be solved as:

sortBy=date:file:yyyyMMdd;file:name

Yeah, that is pretty powerful, oh by the way you can also use reverse per group, so we could reverse the file names:

sortBy=date:file:yyyyMMdd;reverse:file:name
Comments powered by Disqus