Alpakka FTP recursive listing of directory tree


#1

Hi,

I am using the Alpakka FTP Ftp.ls() source to poll the contents of a directory. Everything works fine as long as the polled directory has no subfolders. As soon as there is a subfolder, my log files fill up with messages like this:

[ERROR] [01/25/2019 18:30:58.251] [layline-akka.stream.default-blocking-io-dispatcher-35] [akka://layline/system/StreamSupervisor-0/flow-3-2-FtpBrowserSource] Error during postStop in [akka.stream.alpakka.ftp.impl.FtpSourceFactory$$anon$1@747b2287]: Truncated server reply: 
org.apache.commons.net.MalformedServerReplyException: Truncated server reply: 
	at org.apache.commons.net.ftp.FTP.__getReply(FTP.java:332)
	at org.apache.commons.net.ftp.FTP.__getReply(FTP.java:300)
	at org.apache.commons.net.ftp.FTP.sendCommand(FTP.java:523)
	at org.apache.commons.net.ftp.FTP.sendCommand(FTP.java:648)
...

In my test I am accessing a Unix FTP server on a Synology NAS from an Akka application running on a Windows system. I enabled tracing on the FTP command channel by adding:

// Echo every FTP control-channel command and reply to stdout
FtpSettings.create(InetAddress.getByName(getFtpConfig().getHost()))
    .withConfigureConnectionConsumer(f ->
        f.addProtocolCommandListener(
            new PrintCommandListener(new PrintWriter(System.out), true)));

From that output I can see the following:

USER *******
331 Guest login ok, send your email address as password.
PASS *******
230 Guest login ok, access restrictions apply.
SYST
215 UNIX Type: L8
PORT 192,168,178,23,17,146
200 PORT command successful.
LIST /in
150 Opening BINARY mode data connection for 'file list'.
226 Transfer complete.
PORT 192,168,178,23,17,147
18:30:58.251 TRACE Layline.Source.SourceFtp             - polling returned object 1.txt
200 PORT command successful.
LIST /\in\Folder1
150 Opening BINARY mode data connection for 'file list'.
550 /\in\Folder1: No such file or directory.
QUIT

To me it looks as if the FTP server returns the path names with the Windows \ separator (for whatever reason). For the recursive lookup of Folder1, a Unix / is then prepended somewhere, giving “/\in\Folder1”, which leads to the 550 error.

I then had a look at the Alpakka source code at https://github.com/akka/alpakka/blob/master/ftp/src/main/scala/akka/stream/alpakka/ftp/impl/CommonFtpOperations.scala and found this:

/**
 * INTERNAL API
 */
@InternalApi
private[ftp] trait CommonFtpOperations {
  type Handler = FTPClient

  def listFiles(basePath: String, handler: Handler): immutable.Seq[FtpFile] = {
    val path = if (!basePath.isEmpty && basePath.head != '/') s"/$basePath" else basePath
    handler

Is this the place where the ‘/’ is added? Am I supposed to configure my client so that Unix paths are returned, and if so, how? Or does the underlying Apache FTPClient automatically use \ when it runs on a Windows system, and this is not foreseen in the Alpakka code?

Thanks for your help
Lay


(Martynas Mickevičius) #2

Great investigation! I see that the path separator character is not standardized and different FTP servers can use different separators. There is a command (although not mandatory to be implemented) which can query the type of operating system of the server. That could be used to determine the path separator automatically. This would be the ideal solution, but for starters we could expose it as a configuration parameter.
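
The command in question is SYST; its reply is visible in the trace from the first post ("215 UNIX Type: L8"). A minimal sketch of how such a reply could be turned into a separator guess follows. SystSeparator and fromSystReply are hypothetical names, not part of Alpakka or Apache Commons Net, and the mapping from "WINDOWS" to \ is an assumption, since many such servers also accept /.

```java
import java.util.Locale;

// Hypothetical helper, not part of Alpakka or Apache Commons Net.
final class SystSeparator {

    // Guesses the server's path separator from a raw SYST reply line,
    // e.g. "215 UNIX Type: L8". Returns the fallback when the reply is
    // missing, is not a 215 reply, or names an unknown system type.
    static char fromSystReply(String reply, char fallback) {
        if (reply == null) {
            return fallback;
        }
        String upper = reply.toUpperCase(Locale.ROOT);
        if (!upper.startsWith("215")) {
            return fallback; // SYST failed or is not implemented
        }
        if (upper.contains("UNIX")) {
            return '/';
        }
        if (upper.contains("WINDOWS")) {
            return '\\'; // assumption: such servers often accept '/' as well
        }
        return fallback;
    }

    public static void main(String[] args) {
        System.out.println(fromSystReply("215 UNIX Type: L8", '/'));
    }
}
```

Because SYST is optional, the fallback parameter plays the role of the configuration setting suggested above.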

Can you create a ticket for this in the Alpakka issue tracker? Also, if you’re up for creating a PR that fixes this, it would be very welcome.


#3

I have created a ticket https://github.com/akka/alpakka/issues/1470 for this issue.

I also had a look at the code. Even though the path separator of different FTP servers is not standardized, this issue comes from how the Alpakka FTP code handles paths internally. It builds each path with java.nio.file.Paths.get(...) and normalizes it with normalize(). On Windows the resulting path is rendered with \ as the separator, while on Mac and Unix it is rendered with /. These paths are subsequently preprocessed in CommonFtpOperations.listFiles, which prepends a / if the path does not already start with one. On Windows that produces paths like /\in\Folder1, which leads to the erroneous behaviour.
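
To make the mechanism concrete, here is a small sketch that can be run on any OS. withLeadingSlash applies the same rule as the first line of listFiles quoted in the first post; render is a hypothetical stand-in for how a path would be turned into a string on a platform with the given separator, so it only imitates the Windows behaviour.

```java
final class PrefixDemo {

    // Same rule as the first line of CommonFtpOperations.listFiles:
    // prepend '/' unless the path is empty or already starts with '/'.
    static String withLeadingSlash(String basePath) {
        return (!basePath.isEmpty() && basePath.charAt(0) != '/')
                ? "/" + basePath
                : basePath;
    }

    // Hypothetical stand-in for rendering a path with a platform separator.
    static String render(String unixPath, char separator) {
        return unixPath.replace('/', separator);
    }

    public static void main(String[] args) {
        // Windows-style rendering: prints /\in\Folder1 (the path from the 550 reply)
        System.out.println(withLeadingSlash(render("/in/Folder1", '\\')));
        // Unix-style rendering: prints /in/Folder1
        System.out.println(withLeadingSlash(render("/in/Folder1", '/')));
    }
}
```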

Since the FTP servers I tested cannot handle the Windows path separator anyway, I would suggest replacing the Windows \ with a Unix / in all cases before returning the FTP file:

def listFiles(basePath: String, handler: Handler): immutable.Seq[FtpFile] = {
  val path = if (!basePath.isEmpty && basePath.head != '/') s"/$basePath" else basePath
  handler
    .listFiles(path)
    .collect {
      case file: FTPFile if file.getName != "." && file.getName != ".." =>
        // Path.toString uses the client's separator, which is '\' on Windows
        val normalized = Paths.get(s"$path/${file.getName}").normalize.toString
        FtpFile(
          file.getName,
          // Convert to '/' only when the client separator is '\'
          if (java.io.File.separatorChar == '\\') normalized.replace('\\', '/') else normalized,
          file.isDirectory,
          file.getSize,
          file.getTimestamp.getTimeInMillis,
          getPosixFilePermissions(file)
        )
    }
    .toVector
}

That is also consistent with SftpOperations.listFiles(), which does not process the paths returned by the server at all and so, in almost all cases, most likely ends up with the Unix /.
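
The conversion step on its own can be sketched as follows. FtpPathFix and toFtpPath are hypothetical names; in real code clientSeparator would be java.io.File.separatorChar, but here it is a parameter so that both cases can be tried on any OS.

```java
final class FtpPathFix {

    // Replaces '\' with '/' only when the client platform uses '\' as its
    // separator; otherwise the rendered path is returned unchanged.
    static String toFtpPath(String rendered, char clientSeparator) {
        return clientSeparator == '\\' ? rendered.replace('\\', '/') : rendered;
    }

    public static void main(String[] args) {
        // Windows client: prints /in/Folder1
        System.out.println(toFtpPath("\\in\\Folder1", '\\'));
        // Unix client: the path is left untouched
        System.out.println(toFtpPath("/in/Folder1", '/'));
    }
}
```

Making the replacement conditional means that on a Unix client a file name that really contains a backslash is passed through as-is.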

If you think that this is a good solution, I can prepare a pull request.


(Martynas Mickevičius) #4

I see now. The solution sounds good. I wonder if this bug would have been caught by the current test-suite running on Windows.


#5

I don’t know how the FTP server is mocked. To reproduce the problem, a Windows client has to list a directory including its subfolders. So if the mocked FTP server provides a directory structure like:

/in
/in/Folder1
/in/Folder1/file.txt

the problem should show up in the test.
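
As a rough illustration of what such a test would exercise, here is a sketch with an in-memory stand-in for the server. It is not the Alpakka test suite; RecursiveListingRepro, TREE and walk are hypothetical names, and the client separator is a parameter so the Windows case can be imitated on any OS.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class RecursiveListingRepro {

    // Fake server: directory path -> entry names (a trailing '/' marks a directory).
    static final Map<String, List<String>> TREE = Map.of(
            "/in", List.of("Folder1/"),
            "/in/Folder1", List.of("file.txt"));

    // Recursively lists basePath. Child paths are rendered with the client's
    // separator before the next lookup. Paths the fake server does not know
    // are recorded in 'failed' (the equivalent of a 550 reply).
    static List<String> walk(String basePath, char clientSeparator, List<String> failed) {
        String path = (!basePath.isEmpty() && basePath.charAt(0) != '/')
                ? "/" + basePath
                : basePath;
        List<String> files = new ArrayList<>();
        List<String> entries = TREE.get(path);
        if (entries == null) {
            failed.add(path);
            return files;
        }
        for (String entry : entries) {
            boolean isDirectory = entry.endsWith("/");
            String name = isDirectory ? entry.substring(0, entry.length() - 1) : entry;
            String child = (path + "/" + name).replace('/', clientSeparator);
            if (isDirectory) {
                files.addAll(walk(child, clientSeparator, failed));
            } else {
                files.add(child);
            }
        }
        return files;
    }

    public static void main(String[] args) {
        List<String> failed = new ArrayList<>();
        System.out.println(walk("/in", '/', failed) + " failed=" + failed);
        failed.clear();
        System.out.println(walk("/in", '\\', failed) + " failed=" + failed);
    }
}
```

With / as the client separator the walk reaches /in/Folder1/file.txt; with \ the lookup of the subfolder fails on /\in\Folder1, the same path as in the trace from the first post.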

While writing this, though, I am no longer sure whether you actually get an error at all, or whether the log files just fill up with the error message from the initial post and the error is ignored. I will test this again when I find some time.