MSc-IT Study Material
June 2010 Edition

Computer Science Department, University of Cape Town

Application Level Protocols

Networks build their various communication protocols on top of each other. While IP allows a computer to communicate across a network, it lacks various features, such as reliable and ordered delivery, which TCP adds. TCP is itself a protocol that uses IP underneath it. The application software that creates the original source data is also important in determining the protocol that is used: the destination application must understand the data being transmitted to it, and for this a well-defined communications protocol is needed. While different classes of application each specify their own protocol (email, for example, transfers data differently from HTTP), they each build on top of lower level protocols, such as TCP and IP. These "higher level" protocols are known as application level protocols, and we will investigate a few of these now.

The figure below shows the relation between different protocols.

SMTP, the protocol used for sending email, is the workhorse protocol built on TCP/IP. However, SMTP can send only text messages. The growing need to send more than text has led to the introduction of MIME.

Simple Mail Transfer Protocol (SMTP)

SMTP messages are based on the 7-bit ASCII character set (the ISO 646 set). This means that only 128 different characters can be used in an email. SMTP also logs the route taken by any particular email message. There are three main areas to consider: the sender, the protocol and the receiver.

Sender

  • If the message is being sent to multiple users on a single host, then only a single copy of that message need be sent to the host.

  • The sender resends the message if it was initially unsuccessful in sending it. Re-queuing should not be an indefinite process and should stop after the email's "lifetime" has expired.

  • The sender informs the user if they have supplied a malformed address.
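
The first point above can be sketched in Python. This is only an illustration of the idea: the function name group_by_host and the example.org address are assumptions made for the example, and a real sender would also handle re-queuing and message lifetimes.

```python
from collections import defaultdict


def group_by_host(recipients):
    """Group recipient addresses by destination host.

    One copy of the message can then be sent per host, listing all of
    that host's recipients, instead of one copy per recipient.
    """
    by_host = defaultdict(list)
    for address in recipients:
        # Split at the last "@": everything after it is the host.
        local_part, sep, host = address.rpartition("@")
        if not sep or not local_part or not host:
            # A deliberately crude check for a malformed address.
            raise ValueError(f"malformed address: {address!r}")
        by_host[host.lower()].append(address)
    return dict(by_host)


recipients = [
    "jsmith@cs.uct.ac.za",
    "tadams@cs.uct.ac.za",
    "someone@example.org",  # hypothetical address on a second host
]
for host, users in group_by_host(recipients).items():
    print(host, "->", users)
```

Here the two cs.uct.ac.za recipients share a single copy of the message, while the third recipient's host receives its own copy.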

Protocol

  • The protocol is used to transfer a message from the sender to the receiver over a TCP connection.

  • It does not supply delivery notifications, although SMTP is generally reliable.

Receiver

  • The receiver accepts incoming mail and distributes it to the correct mailboxes.

  • Each message has a header defined in RFC 822, which usually follows the format below:

            Date: Mon, 14 Mar 2005 09:26:34 +0200
            From: "CS Network Administrator" <admin@cs.uct.ac.za>
            Subject: Use of Computer Laboratories
            To: jsmith@cs.uct.ac.za
            Cc: tadams@cs.uct.ac.za
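
As a rough illustration, a header like this one can be built with the email package from Python's standard library. The body text is a placeholder assumed for the example; the field values are taken from the header shown above.

```python
from datetime import datetime, timedelta, timezone
from email.message import EmailMessage
from email.utils import format_datetime, formataddr

msg = EmailMessage()

# 09:26:34 on 14 March 2005, two hours ahead of GMT.
gmt_plus_2 = timezone(timedelta(hours=2))
sent_at = datetime(2005, 3, 14, 9, 26, 34, tzinfo=gmt_plus_2)

msg["Date"] = format_datetime(sent_at)
msg["From"] = formataddr(("CS Network Administrator", "admin@cs.uct.ac.za"))
msg["Subject"] = "Use of Computer Laboratories"
msg["To"] = "jsmith@cs.uct.ac.za"
msg["Cc"] = "tadams@cs.uct.ac.za"

# Placeholder body, not part of the original example.
msg.set_content("Message body goes here.")

# Print the header fields followed by the body.
print(msg.as_string())
```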
    	      

Multipurpose Internet Mail Extensions (MIME)

MIME addresses the following problems with SMTP and RFC 822:

  • SMTP is restricted to the 7-bit ASCII character set.

  • SMTP places limits on the size of an email message.

  • SMTP cannot transmit either executable or binary files.

Additions to the message header provide solutions to these problems. One of these additions is the Content-Type field. The content type specifies the type of data contained in the message so that the data can be handled in an appropriate way. Further changes also allow for multiple independent parts in a message, including attachments.

On being sent, each email message is first divided into fragments based on their MIME type; after a TCP connection has been established, these fragments are packaged as a standard SMTP message and sent.
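
The sketch below, using Python's standard email package, shows one way a message with a binary attachment ends up as separate parts, each with its own content type. The subject, file name and payload are assumptions made for the example. The library encodes the binary part as base64 text, so the whole message still fits within 7-bit ASCII.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "admin@cs.uct.ac.za"
msg["To"] = "jsmith@cs.uct.ac.za"
msg["Subject"] = "Lab timetable"  # hypothetical subject

# The plain text part of the message.
msg.set_content("The timetable is attached.")

# Hypothetical binary payload standing in for a real file.
payload = bytes(range(256))
msg.add_attachment(
    payload,
    maintype="application",
    subtype="octet-stream",
    filename="timetable.bin",
)

# Walk over the container and its parts, showing each content type
# and how the part is encoded for transfer.
for part in msg.walk():
    print(part.get_content_type(), part.get("Content-Transfer-Encoding"))
```

The first line printed describes the multipart container itself; the following lines describe the text part and the attachment.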

Hypertext Transfer Protocol (HTTP)

A key concept of HTTP is the Uniform Resource Locator (URL), which we have already discussed.

A URL is defined as

a compact representation of the location and access method for a resource available via the Internet - (RFC 1738, RFC 1808)

A resource can be any kind of file, or more commonly an HTML document. The URL is simply the name and address of that resource. Further, the URL also shows the protocol used to access the resource.

A URL has the following format:

protocol : address

There are many different protocols at this level: FTP, TELNET and HTTP are well known examples. The HTTP URL scheme has the following format:

http://host:port/path/file
	

The default HTTP port is 80, while the default differs for other protocols. Default values are almost always omitted from a URL.
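
As a small illustration, Python's standard urllib.parse module can split a URL into these components. The helper name describe, the table of default ports and the example URLs are assumptions made for this sketch.

```python
from urllib.parse import urlsplit

# Well-known default ports for a few URL schemes.
DEFAULT_PORTS = {"http": 80, "ftp": 21, "telnet": 23}


def describe(url):
    """Return (scheme, host, port, path), filling in the default port."""
    parts = urlsplit(url)
    port = parts.port
    if port is None:
        # No explicit port in the URL: fall back to the scheme's default.
        port = DEFAULT_PORTS.get(parts.scheme)
    return parts.scheme, parts.hostname, port, parts.path


# Default port omitted from the URL.
print(describe("http://www.cs.uct.ac.za/index.html"))
# Explicit, non-default port (hypothetical).
print(describe("http://www.cs.uct.ac.za:8080/courses/index.html"))
```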

HTTP is the foundation protocol of the World Wide Web. HTTP is a client-server protocol capable of transferring different types of data, such as text, graphics, audio and video. It also makes it possible to assemble a single Web page from resources held at several locations. HTTP uses TCP, but each resource is transferred independently: in HTTP/1.0 a new TCP connection is made for each request the client makes to the server, whereas HTTP/1.1 allows a connection to be kept open and reused for several requests.
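
The following minimal sketch shows what a single HTTP/1.0 request looks like when it is sent over a fresh TCP connection. The function names build_request and fetch are assumptions made for the example, and fetch is not called here because it needs network access.

```python
import socket


def build_request(host, path="/"):
    """Return the bytes of a minimal HTTP/1.0 GET request."""
    lines = [
        f"GET {path} HTTP/1.0",
        f"Host: {host}",
        "",  # blank line marks the end of the request header
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(host, path="/", port=80):
    """Open a new TCP connection, send one request, read the reply."""
    with socket.create_connection((host, port), timeout=10) as conn:
        conn.sendall(build_request(host, path))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server has closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


# Show the request text without touching the network.
print(build_request("www.cs.uct.ac.za", "/index.html").decode("ascii"))
```

Calling fetch("www.cs.uct.ac.za", "/index.html") would return the raw response, status line and headers included, as bytes.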

Here are two terms associated with HTTP with which you should become familiar:

  • Proxy: A proxy is an intermediate system between the client and server: it acts as a server to the client, and as a client to the server. A proxy is often used when a firewall has been set up: it acts as a server, letting only legitimate information through. Because several versions of HTTP are in use, a proxy can also be set up between client and server so that requests in each version of the protocol are directed to the corresponding server.

  • Gateway: A gateway acts on behalf of the server and can be used for security purposes. It is an intermediary device that can convert non-HTTP protocols. For example, when a request is made to an FTP site, the gateway acts as an intermediate system and makes the request on behalf of the client to the relevant server. The gateway then converts the FTP response to HTTP format, which is sent to the client.

Both gateways and proxies are useful when securing networks and creating firewalls. Firewalls will be discussed in the security chapter.

Activity 1: Packet Route Tracing on the Internet

Most operating systems provide a way for the user to trace the route taken by IP packets to reach a particular server. Windows XP/2000 has the tracert command, which can be run from the command console (press Start, then Run, then type cmd, followed by Enter). If you use another operating system, find out how you can trace packet routes. The Windows command takes a destination server's host name or IP address and gives a printout of the route taken to reach that server. For example, type in the following command:

tracert www.cs.uct.ac.za

Can you explain what you see? Try servers in other geographical areas.
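
If you would like to run the trace from a script, the sketch below picks a command based on the operating system. It assumes that a tool named traceroute is installed on non-Windows systems, which is common but not guaranteed; the function name trace_command is made up for the example.

```python
import platform


def trace_command(host):
    """Return the route tracing command for this operating system."""
    if platform.system() == "Windows":
        return ["tracert", host]
    # Assumption: a traceroute tool is installed on other systems.
    return ["traceroute", host]


print(" ".join(trace_command("www.cs.uct.ac.za")))

# To actually run the trace (needs network access):
# import subprocess
# subprocess.run(trace_command("www.cs.uct.ac.za"), check=False)
```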