FTP
FTP, or File Transfer Protocol, is a commonly used protocol for exchanging files over any network that supports TCP/IP (such as the Internet or an intranet). There are two computers involved in an FTP transfer: a server and a client. The FTP server, running FTP server software, listens on the network for connection requests from other computers. The client computer, running FTP client software, initiates a connection to the server. Once connected, the client can perform a number of file manipulation operations, such as uploading files to the server, downloading files from the server, and renaming or deleting files on the server. Any software company or individual programmer is able to create FTP server or client software because the protocol is an open standard. Virtually every computer platform supports FTP. This allows any computer connected to a TCP/IP-based network to manipulate files on another computer on that network regardless of which operating systems are involved (if the computers permit FTP access). There are many existing FTP client and server programs, and many of these are free.
Overview
FTP is commonly run on two ports, 20 and 21, and runs exclusively over TCP. The FTP server listens on port 21 for incoming connections from FTP clients. A connection on this port forms the control stream, on which commands are passed to the FTP server. For the actual file transfer to take place, a different connection is required. Depending on the transfer mode, the client (active mode) or the server (passive mode) can listen for the incoming data connection. Before file transfer begins, the client and server also negotiate the port of the data connection. In case of active connections (where the server connects to the client to transfer data), the server binds on port 20 before connecting to the client. For passive connections, there is no such restriction.
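The port negotiation described above can be sketched in a few lines of Python. The sketch assumes the classic encoding from RFC 959, in which an address is written as six comma-separated decimal numbers: four for the IPv4 address and two for the high and low bytes of the port. The function names are hypothetical and the addresses are made up for illustration.

```python
import re


def parse_pasv_reply(reply):
    """Extract (host, port) from a passive-mode reply such as
    '227 Entering Passive Mode (192,168,1,10,19,137)'."""
    match = re.search(r"(\d+),(\d+),(\d+),(\d+),(\d+),(\d+)", reply)
    if match is None:
        raise ValueError(f"not a passive-mode reply: {reply!r}")
    numbers = [int(n) for n in match.groups()]
    if any(n > 255 for n in numbers):
        raise ValueError(f"field out of range in reply: {reply!r}")
    host = ".".join(str(n) for n in numbers[:4])
    # The port is split into a high byte and a low byte.
    port = numbers[4] * 256 + numbers[5]
    return host, port


def format_port_command(host, port):
    """Build the active-mode PORT command that tells the server
    where the client is listening for the data connection."""
    high, low = divmod(port, 256)
    return "PORT " + ",".join(host.split(".") + [str(high), str(low)])


if __name__ == "__main__":
    print(parse_pasv_reply("227 Entering Passive Mode (192,168,1,10,19,137)"))
    print(format_port_command("192.168.1.10", 5001))
```

In passive mode the client parses the server's reply and connects to the address it names; in active mode the client sends the PORT command and waits for the server to connect.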
While data is being transferred via the data stream, the control stream sits idle. This can cause problems with large data transfers through firewalls which time out sessions after lengthy periods of idleness. While the file may well be successfully transferred, the control session can be disconnected by the firewall, causing an error to be generated.
When FTP is used in a UNIX environment, there is an often-ignored but valuable command, "reget" (meaning "get again") that will cause an interrupted "get" command to be continued, hopefully to completion, after a communications interruption. The principle is obvious - the receiving station has a record of what it got, so it can spool through the file at the sending station and re-start at the right place for a seamless splice. The converse would be "reput" but is not available. Again, the principle is obvious: The sending station does not know how much of the file was actually received, so it would not know where to start.
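The idea behind "reget" can be illustrated with Python's standard ftplib module, whose retrbinary method accepts a rest argument giving the byte offset at which the server should restart the transfer. This is a minimal sketch, not the implementation used by any particular UNIX client; the function names are hypothetical, and it assumes the server supports restarting and that the partial local file is an exact prefix of the remote file.

```python
import os


def resume_offset(local_path):
    """Return how many bytes of the file are already stored locally."""
    return os.path.getsize(local_path) if os.path.exists(local_path) else 0


def reget(ftp, remote_name, local_path):
    """Continue an interrupted download.

    `ftp` is expected to be a connected, logged-in ftplib.FTP object
    (or anything with a compatible retrbinary method).
    Returns the offset at which the transfer was restarted.
    """
    offset = resume_offset(local_path)
    # Append to what is already on disk instead of overwriting it.
    with open(local_path, "ab") as local_file:
        # A non-None `rest` makes ftplib ask the server to restart
        # the transfer at that byte offset before retrieving the file.
        ftp.retrbinary(f"RETR {remote_name}", local_file.write,
                       rest=offset or None)
    return offset
```

The receiving side determines the offset from the size of its partial file, which matches the reasoning given above for why the receiver, and not the sender, is in a position to restart.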
Objectives of FTP
The objectives of FTP, as outlined by its RFC, are:
- To promote sharing of files (computer programs and/or data).
- To encourage indirect or implicit use of remote computers.
- To shield a user from variations in file storage systems among different hosts.
- To transfer data reliably and efficiently.
Criticisms of FTP
- Passwords and file contents are sent in clear text, which can be intercepted by eavesdroppers. There are protocol enhancements that address this.
- Multiple TCP/IP connections are used, one for the control connection, and one for each download, upload, or directory listing. Firewall software needs additional logic to account for these connections.
- It is hard to filter active mode FTP traffic on the client side by using a firewall, since the client must open an arbitrary port in order to receive the connection. This problem is largely resolved by using passive mode FTP.
- It is possible to abuse the protocol's built-in proxy features to tell a server to send data to an arbitrary port of a third computer; see FXP.
- FTP is an extremely high latency protocol due to the number of commands needed to initiate a transfer.
- There is no integrity check on the receiver side. If a transfer is interrupted, the receiver has no way to know whether the received file is complete. This must be managed externally, for example with MD5 sums or cyclic redundancy checks.
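The external integrity check mentioned in the last point can be sketched with Python's hashlib module. The sketch assumes the sender publishes the MD5 sum of the file through some separate channel; the function names are hypothetical.

```python
import hashlib


def file_md5(path, chunk_size=65536):
    """Compute the MD5 sum of a file, reading it in chunks so that
    large downloads do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_is_intact(path, expected_md5):
    """Compare the local file against the sum published by the sender."""
    return file_md5(path) == expected_md5.strip().lower()
```

A truncated or corrupted download produces a different sum, so a mismatch tells the receiver to fetch the file again. MD5 is adequate for detecting accidental damage but is not a defence against deliberate tampering.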
Security problems
The original FTP specification is an inherently insecure method of transferring files because there is no method specified for transferring data in an encrypted fashion. This means that under most network configurations, user names, passwords, FTP commands and transferred files can be "sniffed" or viewed by anyone on the same network using a packet sniffer. This is a problem common to many Internet protocol specifications written prior to the creation of SSL, such as HTTP, SMTP and Telnet. The common solution to this problem is to use either SFTP (SSH File Transfer Protocol), or FTPS (FTP over SSL), which adds SSL or TLS encryption to FTP as specified in RFC 4217.
FTP return codes
See also: List of all FTP server return codes.
FTP server return codes indicate their status by the digits within them. A brief explanation of the meanings of the first two digits is given below:
- 1yz: Positive Preliminary reply. The action requested is being initiated, but there will be another reply before the client may proceed with a new command.
- 2yz: Positive Completion reply. The action requested has been completed. The client may now issue a new command.
- 3yz: Positive Intermediate reply. The command was successful, but a further command is required before the server can act upon the request.
- 4yz: Transient Negative Completion reply. The command was not successful, but the client is free to try the command again as the failure is only temporary.
- 5yz: Permanent Negative Completion reply. The command was not successful and the client should not attempt to repeat it again.
- x0z: This response relates to command syntax, for example a syntax error or an unimplemented command.
- x1z: This response is a reply to a request for information.
- x2z: This response is a reply relating to connection information.
- x3z: This response is a reply relating to authentication and accounting.
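The digit scheme above lends itself to a small lookup. The following Python sketch classifies a three-digit reply code by its first two digits; the function name and the wording of the labels are hypothetical, and second digits not covered in the list above are reported as "other".

```python
# Meaning of the first digit: the outcome of the command.
FIRST_DIGIT = {
    "1": "positive preliminary",
    "2": "positive completion",
    "3": "positive intermediate",
    "4": "transient negative completion",
    "5": "permanent negative completion",
}

# Meaning of the second digit: the area the reply refers to.
SECOND_DIGIT = {
    "0": "syntax",
    "1": "information",
    "2": "connections",
    "3": "authentication and accounting",
}


def describe_reply(code):
    """Return (outcome, topic) for a three-digit FTP reply code."""
    text = str(code)
    if len(text) != 3 or not text.isdigit():
        raise ValueError(f"not a three-digit reply code: {code!r}")
    outcome = FIRST_DIGIT.get(text[0], "unknown")
    topic = SECOND_DIGIT.get(text[1], "other")
    return outcome, topic


if __name__ == "__main__":
    for code in (230, 331, 425, 500):
        print(code, describe_reply(code))
```

A client can use the first digit alone to decide what to do next: wait for another reply (1), continue (2), send the follow-up command (3), retry later (4), or give up (5).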
Anonymous FTP
Many sites that run FTP servers enable so-called "anonymous ftp". Under this arrangement, users do not need an account on the server. The user name for anonymous access is typically 'anonymous' or 'ftp'. This account does not need a password. Although users are commonly asked to send their email addresses as their passwords for authentication, usually there is trivial or no verification, depending on the FTP server and its configuration. Internet Gopher has been suggested as an alternative to anonymous FTP, as well as Trivial File Transfer Protocol.
Data format
While transferring data over the network, two modes can be used: ASCII mode and binary mode.
The two types differ in the way they send the data. When a file is sent using an ASCII-type transfer, the individual letters, numbers, and characters are sent using their ASCII character codes. The receiving machine saves these in a text file in the appropriate format (for example, a Unix machine saves it in a Unix format, a Macintosh saves it in a Mac format). Hence if an ASCII transfer is used it can be assumed plain text is sent, which is stored by the receiving computer in its own format. Translating between text formats entails substituting the end of line and end of file characters used on the source platform with those on the destination platform, e.g. a Windows machine receiving a file from a Unix machine will replace the line feeds with carriage return-line feed pairs. ASCII transfer is also marginally faster, as the highest-order bit is dropped from each byte in the file.[1]
Sending a file in binary mode is different. The sending machine sends each file bit for bit and as such the recipient stores the bitstream as it receives it. Any form of data that is not plain text will be corrupted if this mode is not used.
By default, most FTP clients use ASCII mode. Some clients try to determine the required transfer-mode by inspecting the file's name or contents.
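The difference between the two modes can be shown with a simplified Python sketch of the end-of-line translation. It assumes that text travels over the network with carriage return-line feed line endings and that each side converts to and from its local convention; the function names are hypothetical, and end-of-file handling is left out.

```python
def to_network_ascii(text_bytes):
    """Sender side: convert local line endings (LF or CRLF) to CRLF."""
    normalized = text_bytes.replace(b"\r\n", b"\n")
    return normalized.replace(b"\n", b"\r\n")


def from_network_ascii(wire_bytes, local_newline=b"\n"):
    """Receiver side: convert CRLF to the local line ending."""
    return wire_bytes.replace(b"\r\n", local_newline)


if __name__ == "__main__":
    # A text file survives the round trip between two Unix machines.
    text = b"first line\nsecond line\n"
    print(from_network_ascii(to_network_ascii(text)) == text)

    # The same translation damages binary data that happens to
    # contain bytes that look like line endings.
    binary = b"\x89PNG\r\n\x1a\n"
    print(from_network_ascii(to_network_ascii(binary)) == binary)
```

The second comparison is false because the translation cannot tell a genuine line ending from a data byte with the same value, which is why non-text files must be sent in binary mode.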
FTP and web browsers
Most recent web browsers and file managers can connect to FTP servers, although they may lack support for protocol extensions such as FTPS. This allows manipulation of remote files over FTP through an interface similar to that used for local files. This is done via an FTP URL, which takes the form ftp(s)://<ftpserveraddress> (e.g., [2]). A login and password can optionally be given in the URL, e.g.: ftp(s)://<login>:<password>@<ftpserveraddress>:<port>. Most web browsers require the use of passive mode FTP, which not all FTP servers are capable of handling. Some browsers allow only the downloading of files, and offer no way to upload files to the server.
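The URL form described above can be taken apart with Python's standard urllib.parse module. This is a minimal sketch: the function name, the example host and the credentials are made up, and the fallbacks (port 21, user "anonymous", empty password) are assumptions about what a client would do when the URL omits them.

```python
from urllib.parse import unquote, urlsplit


def parse_ftp_url(url):
    """Split an ftp:// or ftps:// URL into its connection parameters."""
    parts = urlsplit(url)
    if parts.scheme not in ("ftp", "ftps"):
        raise ValueError(f"not an FTP URL: {url!r}")
    return {
        "secure": parts.scheme == "ftps",
        "host": parts.hostname,
        # Assumed default: the standard control port.
        "port": parts.port or 21,
        # Assumed default: anonymous access when no login is given.
        "user": unquote(parts.username) if parts.username else "anonymous",
        "password": unquote(parts.password) if parts.password else "",
        "path": parts.path or "/",
    }


if __name__ == "__main__":
    print(parse_ftp_url("ftp://alice:s3cret@ftp.example.org:2121/pub/file.txt"))
    print(parse_ftp_url("ftp://ftp.example.org"))
```

Note that a password embedded in a URL is as exposed as any other FTP password, and may additionally end up in browser history or logs.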
FTP over SSH
FTP over SSH refers to the practice of tunneling a normal FTP session over an SSH connection.
Because FTP (unusual for a TCP/IP protocol that is still in use) uses multiple TCP connections, it is particularly difficult to tunnel over SSH. With many SSH clients, attempting to set up a tunnel for the control channel (the initial client-to-server connection on port 21) will only protect that channel; when data is transferred, the FTP software at either end will set up new TCP connections (data channels) which will bypass the SSH connection, and thus have no confidentiality, integrity protection, etc.
If the FTP client is configured to use passive mode and to connect to a SOCKS server interface that many SSH clients can present for tunnelling, it is possible to run all the FTP channels over the SSH connection. Alternatively, the GNU-licensed software FONC [3] allows both the data and control channels of active and passive FTP connections to be encrypted over SSH tunnels.
Otherwise, it is necessary for the SSH client software to have specific knowledge of the FTP protocol, and monitor and rewrite FTP control channel messages and autonomously open new forwardings for FTP data channels.
FTP over SSH is sometimes referred to as secure FTP; this should not be confused with other methods of securing FTP, such as with SSL/TLS (FTPS). Other methods of transferring files using SSH which are not related to FTP include SFTP or SCP; in both of these, the entire conversation (credentials and data) is always protected by the SSH protocol.