Large files, multiple destinations

When using SFTP you can transfer large files to a single destination without ever loading them into memory. Simply turn on the streamDownload option and you are good to go. For one of my customers I had to create a route that does the same, but to multiple destinations. With files going up to 45 GB this could not be done in memory, and just streaming won't work because you can only read a stream once.
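
For reference, a single-destination version is just an SFTP to SFTP route with streaming enabled on the consumer. The following is a minimal sketch, with placeholder hosts, paths and credentials:


from("sftp://user@source-host/inbox?streamDownload=true&password=secret")
        .routeId("LargeFileSingleDestination")
        // streamDownload hands the route an InputStream instead of reading
        // the whole remote file into memory first
        .to("sftp://user@target-host/outbox?password=secret");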

When sending a stream to multiple destinations with Camel you need to enable streamCaching. This “caches” the stream and allows you to access it multiple times. Normally you simply add the streamCaching option to your route and it works. However, when doing this with a large file I got an out of memory exception indicating my message body could not be converted to a StreamCache. My perception was (as stated in the documentation) that if a stream is too large it is offloaded to disk automatically by the StreamCachingStrategy implementation. That never happened, even when I changed the threshold to 1 KB.
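
To make that concrete, the sketch below shows the kind of configuration the documentation suggests should be enough; the endpoint URIs are placeholders and the calls are assumed to run inside a RouteBuilder:


// enable stream caching on the context and lower the spool threshold to 1 KB
getContext().setStreamCaching(true);
getContext().getStreamCachingStrategy().setSpoolThreshold(1024);

// and mark the route as stream caching
from("sftp://user@source-host/inbox?streamDownload=true&password=secret")
        .streamCaching()
        .to("sftp://user@destination-one/outbox?password=secret");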

After some debugging I noticed that this option was never triggered; some checks in an if statement were preventing it. After some tinkering I came up with the following logic and configuration that enabled the offloading to disk:


private void configureStreamCaching(){
    StreamCachingStrategy streamCachingStrategy = new DefaultStreamCachingStrategy();
    // spool to disk once the stream grows beyond the configured threshold
    streamCachingStrategy.setSpoolThreshold(spoolThreshold);
    if (!spoolDirectory.isEmpty()){
        // set the spool directory as a File, see the note below
        streamCachingStrategy.setSpoolDirectory(new File(spoolDirectory));
    }
    // the strategy must be enabled explicitly
    streamCachingStrategy.setEnabled(true);
    // without an explicit spool rule the offloading to disk was never triggered
    streamCachingStrategy.addSpoolRule(length -> spoolThreshold < length);
    // encrypt the temporary files written to the spool directory
    streamCachingStrategy.setSpoolChiper("AES/CTR/NoPadding");
    getContext().setStreamCachingStrategy(streamCachingStrategy);
}
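
Since this method uses getContext(), it lives in the RouteBuilder; the sketch below shows one way of wiring it up, assuming spoolThreshold and spoolDirectory are fields of that builder:


@Override
public void configure() throws Exception {
    // install the custom strategy before the route definitions are added
    configureStreamCaching();
    // route definitions follow here
}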

I had to add the spool rule and I had to enable the strategy explicitly. The cipher is set so that the temporary files are encrypted; you’ve got to keep your security officer satisfied! I also wanted to be able to choose the spool directory. The “strange” thing was that I had to set the spoolDirectory as a File object. You also have the option to set it as a String, but that only works when the strategy has not been started yet. Using the File object ensures that the directory is actually used once set, no matter when you do it. A final note: make sure you have enough disk space, because that is the limit on the size of the files you can move.
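
For completeness, the String variant on that same strategy looks like the sketch below (the path is just an example); as noted above, it only takes effect if the strategy has not been started yet:


// only picked up when the strategy has not been started yet
streamCachingStrategy.setSpoolDirectory("/data/camel/spool");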

Now that I could send large files to multiple destinations I had to find the “right” way to do it. I first used the regular to(), which allows you to set multiple endpoints. This however results in to().to().to(), so if one fails the entire exchange fails, which of course you can resolve in the exception handler. Using a recipientList is easier: it creates shallow copies of the exchange, which lets you handle the exception for each destination in a nice and simple manner.
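
The chained variant I started with looks roughly like the sketch below, with placeholder endpoints; a failure on the first destination means the remaining ones are never attempted:


from(from).routeId("LargeFilesChainedDestinations")
        .streamCaching()
        // if destination-one fails, the exchange fails and the
        // remaining destinations are not attempted
        .to("sftp://user@destination-one/outbox?password=secret")
        .to("sftp://user@destination-two/outbox?password=secret")
        .to("sftp://user@destination-three/outbox?password=secret");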

The entire route looks a bit like the following code:


from(from).routeId("LargeFilesMultipleDestinations")
        .streamCaching()
        // handle a failure per destination, the other recipients still get the file
        .onException(Exception.class)
            .log(LoggingLevel.WARN, "Cannot send ${header.CamelFileName} to ${exchangeProperty.CamelToEndpoint}")
            .to("handle your error here")
            .handled(true)
        .end()
        .process(exchange -> System.out.println())
        // toEndpointUrls holds the destination endpoint URIs
        .setHeader("Endpoints", constant(toEndpointUrls))
        .recipientList(header("Endpoints"));

That is it, with this simple route you can move mountains of files. Open a jvisualvm console and you will see a nice sawtooth in the memory graph while the files flow through.