ncftpspooler - Global batch FTP job processor daemon
ncftpspooler -d [options]
ncftpspooler -l [options]
Command line flags:
The ncftpspooler program evolved from the ncftpbatch program. The ncftpbatch program was originally designed as a ``personal FTP spooler'' which would process a single background job for a particular user and then exit when finished; the ncftpspooler program is a ``global FTP spooler'' which stays running and processes background jobs as they are submitted.
The job queue directory is monitored for specially-named and formatted text files. Each file serves as a single FTP job. The name of the job file contains the type of FTP job (get or put), a timestamp indicating the earliest the job should be processed, and optionally some additional information to make it easier to create unique job files (i.e. a sequence number). The contents of the job files have information such as the remote server machine to FTP to, username, password, remote pathname, etc.
Your job queue directory must be readable and writable by the user that you plan to run ncftpspooler as, so that jobs can be removed or renamed within the queue.
More importantly, the user running the program needs adequate privileges to access the local files involved in the FTPing. For example, if your spooler will be processing jobs which upload files to remote servers, then the user needs read permission on the local files to be uploaded (and directory access permission on the parent directories). Likewise, if your spooler will be processing jobs which download files, then the user needs write permission on the local directories.
Once you have created your spool directory with appropriate permissions and ownerships, you can run ncftpspooler -d to launch the spooler daemon. You can run additional spoolers if you want to process more than one FTP job from the same job queue directory simultaneously. You can then monitor the log file (e.g., using tail -f) to track the progress of the spooler. Most of the time it won't be doing anything, unless job files have appeared in the job queue directory.
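As a sketch of the steps above: the queue directory path here is a placeholder (substitute your real spool directory), and the daemon is only started if the binary is on the PATH.

```shell
# Placeholder queue directory; substitute your real spool path.
QUEUE=/tmp/ncftp-queue
mkdir -p "$QUEUE"
chmod 700 "$QUEUE"    # job files may contain passwords; keep them private

# Start the daemon if it is installed (run the command a second time to
# process two jobs from the same queue concurrently).
if command -v ncftpspooler >/dev/null 2>&1; then
    ncftpspooler -d
fi

# Follow progress (the log file location varies by installation):
#   tail -f /var/log/ncftpspooler.log
```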
When the ncftpspooler program monitors the job queue directory, it ignores any files that do not follow the naming convention for job files. The job files must be prefixed in the format of X-YYYYMMDD-hhmmss where X denotes a job type, YYYY is the four-digit year, MM is the two-digit month number, DD is the two-digit day of the month, hh is the two-digit hour of the day (00-23), mm is the two-digit minute, and ss is the two-digit second. The date and time represent the earliest time you want the job to be run.
The job type can be g for a get (download from remote host), or p for a put (upload to remote host).
As an example, if you wanted to schedule an upload to occur at 11:45 PM on December 7, 2001, a job file could be named p-20011207-234500.
In practice, the job files include additional information such as a sequence number or process ID, which makes it easier to create unique job file names. Here is the same example, with a process ID and a sequence number appended: p-20011207-234500-008345-1.
When submitting job files to the queue directory, be sure to use a dash character after the hhmmss field if you choose to append any additional data to the job file name.
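The naming rules above can be sketched in shell; the p job type, the process ID, and the trailing sequence number here are illustrative choices, not requirements.

```shell
# Compose a job file name: type, earliest start time, then optional
# uniqueness suffixes (process ID and a sequence number) after a dash.
stamp=$(date +%Y%m%d-%H%M%S)
jobname="p-${stamp}-$$-1"
echo "$jobname"
```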
Job files are ordinary text files, so they can be created by hand. Each line of the file is a key-value pair in the format variable=value, a comment line beginning with an octothorpe character (#), or a blank line. Here is an example job file:
# This is a NcFTP spool file entry.
job-name=g-20011016-100656-008299-1
op=get
hostname=ftp.freebsd.org
xtype=I
passive=1
remote-dir=pub/FreeBSD
local-dir=/tmp
remote-file=README.TXT
local-file=readme.txt
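A job like the one above can be submitted from the shell. The queue path below is a placeholder, and writing the file under a temporary name before renaming it into place is a precaution (so the spooler never reads a half-written job), not something the spooler itself requires.

```shell
QUEUE=/tmp/ncftp-queue          # substitute your real spool directory
mkdir -p "$QUEUE"
job="g-$(date +%Y%m%d-%H%M%S)-$$-1"

# Create the file under a temporary name, then rename it into place so
# the spooler cannot pick up a partially written job.
cat > "$QUEUE/.tmp.$job" <<'EOF'
# This is a NcFTP spool file entry.
op=get
hostname=ftp.freebsd.org
xtype=I
passive=1
remote-dir=pub/FreeBSD
local-dir=/tmp
remote-file=README.TXT
local-file=readme.txt
EOF
mv "$QUEUE/.tmp.$job" "$QUEUE/$job"
```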
Job files are flexible since they follow an easy-to-use format and do not have many requirements, but there are a few mandatory parameters that must appear for the spooler to be able to process the job.
For a regular get job, these parameters are required:
For a regular put job, these parameters are required:
For a recursive get job, these parameters are required:
For a recursive put job, these parameters are required:
The rest of the parameters are optional. The spooler will attempt to use reasonable defaults for these parameters if necessary.
Generally speaking, post-shell-command is much more useful than pre-shell-command since if you need to use these options you're more likely to want to do something after the FTP transfer has completed rather than before. For example, you might want to run a shell script which pages an administrator to notify her that her 37 gigabyte file download has completed.
When your custom program is run, it receives on standard input the contents of the job file (i.e. several lines of variable=value key-pairs), as well as additional data the spooler may provide, such as a result key-pair with a textual description of the job's completion status.
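For illustration, a handler for a completed get job might read something like the following on standard input. The exact set of keys depends on the job file, and the wording of the result value shown here is an assumption.

```
# This is a NcFTP spool file entry.
job-name=g-20011016-100656-008299-1
op=get
hostname=ftp.freebsd.org
remote-file=README.TXT
local-file=readme.txt
result=Succeeded.
```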
For example, the following Perl script could be installed as a post-shell-command to update a log file named /var/log/ncftp_spooler.log:
#!/usr/bin/perl -w
use strict;

my ($line);
my (%params) = ();

# Read the variable=value pairs from standard input.
while (defined($line = <STDIN>)) {
	$params{$1} = $2 if ($line =~ /^([^=\#\s]+)=(.*)/);
}

# Log only jobs that completed successfully.
if ((defined($params{"result"})) && ($params{"result"} =~ /^Succeeded/)) {
	open(LOG, ">> /var/log/ncftp_spooler.log") or exit(1);
	print LOG "DOWNLOAD" if ($params{"op"} eq "get");
	print LOG "UPLOAD" if ($params{"op"} eq "put");
	print LOG " ", $params{"local-file"}, "\n";
	close(LOG);
}
The log file should be examined to determine if any ncftpspooler processes are actively working on jobs. The log contains copious amounts of useful information, including the entire FTP control connection conversation between the FTP client and server.
The recursive option may not be reliable since ncftpspooler depends on functionality which may or may not be present in the remote server software. Additionally, even if the functionality is available, ncftpspooler may need to use heuristics which cannot be considered 100% accurate. Therefore it is best to create individual jobs for each file in the directory tree, rather than a single recursive directory job.
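One way to follow that advice is to queue one get job per file. The host, paths, and file list below are placeholders for illustration.

```shell
QUEUE=/tmp/ncftp-queue-batch    # placeholder queue directory
rm -rf "$QUEUE"                 # start from an empty queue for this sketch
mkdir -p "$QUEUE"
n=0
for f in README.TXT RELNOTES.TXT; do    # placeholder file list
    n=$((n + 1))
    job="g-$(date +%Y%m%d-%H%M%S)-$$-$n"
    # One non-recursive get job per remote file; write, then rename into place.
    cat > "$QUEUE/.tmp.$job" <<EOF
op=get
hostname=ftp.example.com
remote-dir=pub
local-dir=/tmp
remote-file=$f
local-file=$f
EOF
    mv "$QUEUE/.tmp.$job" "$QUEUE/$job"
done
```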
For resumption of downloads to work, the remote server must support the FTP SIZE and MDTM primitives. Most modern FTP server software can do this, but there are still a number of bare-bones ftpd implementations which do not. In these cases, ncftpspooler will re-download the file in its entirety each time until the download succeeds.
The program needs to be improved to detect jobs that have no chance of ever completing successfully. There are still a number of cases where jobs can get spooled but get retried over and over again until a vigilant sysadmin manually removes the jobs.
The spool files may contain usernames and passwords stored in cleartext. These files should not be readable by any user except the user running the program!
Mike Gleason, NcFTP Software (mgleason@ncftp.com).
ncftpput(1), ncftpget(1), ncftp(1), ftp(1), rcp(1), tftp(1).
LibNcFTP (http://www.ncftp.com/libncftp/).