Sunday, March 1, 2009

97. Hypertext Transfer Protocol (HTTP)

HyperText Transfer Protocol, or HTTP, is the primary communications protocol of the World Wide Web (WWW); other Internet protocols include File Transfer Protocol (FTP), Gopher, and Telnet. HTTP is the protocol used to transfer HyperText Markup Language (HTML) files and stands at the very core of the Web. HTTP and HTML are closely linked: one defines connectivity, the other defines interface. HTTP is the transfer protocol offering the base method by which clients (such as the Web browser installed on your computer) and servers (the Web server hosting the Web site displayed in your browser) communicate with each other. HTML, on the other hand, is the base standard by which content is formatted and displayed in Web pages. HTTP is used to reach Web servers on the Internet or on a local network (intranet). Its primary function is to carry a request to the server and return HTML pages to the user's browser. It is also used to download files from the server, either to the browser or to any other requesting application that uses HTTP.

As the transfer protocol, HTTP defines how messages are formatted and transmitted, and what actions Web servers and browsers take in response to the requests sent over the Internet. For example, when you enter a URL, or Uniform Resource Locator, in the "Location" or "Address" field of your browser, the browser sends an HTTP request to the Web server that hosts that URL, directing it to fetch and transmit the requested Web page; the media embedded in the page (graphics, audio, or video) are then fetched with further requests. HTTP uses the concept of reference provided by the Uniform Resource Identifier (URI), either as a Location (URL) or as a Name (URN). When a hyperlink is composed in HTML, the URL uses the general form http://host:port-number/path/file.html. In HTTP/0.9 and 1.0, the connection is closed after a single request/response pair. HTTP/1.1 made persistent (keep-alive) connections the default, so a connection can be reused for more than one request. Such persistent connections reduce latency perceptibly, because the client does not need to set up a new TCP connection for each request after the first.
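To make the URL form above concrete, here is a minimal Python sketch that splits a URL into host, port and path and builds the text of the HTTP/1.1 request a client would send for it. The function name `build_get_request`, the `keep_alive` parameter and the example host are assumptions made for illustration; nothing is sent over the network.

```python
from urllib.parse import urlsplit


def build_get_request(url: str, keep_alive: bool = True) -> bytes:
    """Build the raw bytes of an HTTP/1.1 GET request for `url` (illustrative helper)."""
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or 80            # 80 is the default port for http://
    path = parts.path or "/"           # an empty path means the site root
    if parts.query:
        path += "?" + parts.query

    # The Host header carries the port only when it is not the default.
    host_header = host if port == 80 else f"{host}:{port}"
    lines = [f"GET {path} HTTP/1.1", f"Host: {host_header}"]
    if not keep_alive:
        # HTTP/1.1 connections are persistent unless the client asks to close.
        lines.append("Connection: close")

    # A blank line ends the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


print(build_get_request("http://www.example.com:8080/path/file.html").decode())
```

With `keep_alive=False` the request carries `Connection: close`, which asks the server to close the connection after this one response, much as HTTP/1.0 did for every request.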

HTTP is a "stateless" request/response system: each request is handled on its own, and the server keeps no memory of earlier requests from the same client. In the first versions of HTTP, the connection was also maintained only for the immediate request. After the HTTP client established a TCP connection with the server and sent its request, the server sent back its response and closed the connection. This caused considerable overhead: each time a graphics file on the page was requested, a new connection had to be established between the browser and the server. In HTTP Version 1.1, multiple files can be downloaded over the same connection. Version 1.1 also improved caching and made it easier to create virtual hosts (multiple Web sites on the same server).

Methods PUT and DELETE are defined to be idempotent, meaning that multiple identical requests should have the same effect as a single request. Methods GET, HEAD, OPTIONS and TRACE, being prescribed as safe, should also be idempotent.
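Idempotency can be illustrated with a toy model. The Python sketch below is not a real Web server: `ResourceStore` and `state_after` are hypothetical names, and a plain dictionary stands in for the server's state. It replays the same request several times and checks that the final state matches that of a single request.

```python
class ResourceStore:
    """Toy in-memory stand-in for the state behind a Web server (hypothetical)."""

    def __init__(self):
        self.resources = {}

    def handle(self, method, path, body=None):
        if method == "GET":
            # Safe: only reads state.
            return self.resources.get(path)
        if method == "PUT":
            # Idempotent: the resource equals `body` afterwards, however often this runs.
            self.resources[path] = body
            return body
        if method == "DELETE":
            # Idempotent: once the resource is gone, repeats change nothing.
            return self.resources.pop(path, None)
        raise ValueError(f"unsupported method: {method}")


def state_after(requests):
    """Replay (method, path, body) tuples on a fresh store and return the final state."""
    store = ResourceStore()
    for method, path, body in requests:
        store.handle(method, path, body)
    return dict(store.resources)


put = ("PUT", "/notes/1", "hello")
delete = ("DELETE", "/notes/1", None)
get = ("GET", "/notes/1", None)

print(state_after([put]) == state_after([put, put, put]))                # True
print(state_after([put, delete]) == state_after([put, delete, delete]))  # True
print(state_after([put, get, get]) == state_after([put]))                # True
```

Note that idempotency here concerns the effect on the stored state, not the response: in this sketch the second DELETE returns `None` where the first returned the deleted body, yet the state ends up the same.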

By contrast, the POST method is not necessarily idempotent, and therefore sending an identical POST request multiple times may further affect state or cause further side effects (such as financial transactions). In some cases this may be desirable, but in other cases it happens by accident, such as when a user does not realize that an action will send another request, or did not receive adequate feedback that the first request was successful. While web browsers may show alert dialog boxes to warn users in some cases where reloading a page may re-submit a POST request, it is generally up to the web application to take responsibility for handling cases where a POST request should not be submitted more than once. Note that whether a method is idempotent is not enforced by the protocol or the web server. It is perfectly possible to write a web application in which (for example) a database insert or other non-idempotent action is triggered by a GET or other request. Ignoring the recommendation that such methods be idempotent, however, may result in undesirable consequences if a user agent assumes that repeating the same request is safe when it isn't.
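One way a web application might guard against an accidental duplicate POST is to issue a one-time token with each form and remember which tokens it has already processed. The Python sketch below is only an illustration of that idea under stated assumptions: `PaymentApp`, `post_naive` and `post_with_token` are hypothetical names, and a real application would keep the tokens in durable storage rather than in memory.

```python
import uuid


class PaymentApp:
    """Toy application state (hypothetical): a list of charges that were applied."""

    def __init__(self):
        self.payments = []       # every charge actually applied
        self.seen_tokens = {}    # one-time token -> index of the charge it produced

    def post_naive(self, amount):
        # Not idempotent: every repeat of the request adds another charge.
        self.payments.append(amount)
        return len(self.payments) - 1

    def post_with_token(self, token, amount):
        # A repeated token means the same form was submitted again,
        # so return the earlier result instead of charging twice.
        if token in self.seen_tokens:
            return self.seen_tokens[token]
        self.payments.append(amount)
        self.seen_tokens[token] = len(self.payments) - 1
        return self.seen_tokens[token]


naive = PaymentApp()
naive.post_naive(10)
naive.post_naive(10)             # user reloads the page: charged twice
print(naive.payments)            # [10, 10]

guarded = PaymentApp()
token = uuid.uuid4().hex         # issued once, when the form is rendered
guarded.post_with_token(token, 10)
guarded.post_with_token(token, 10)   # the reload re-sends the same token
print(guarded.payments)          # [10]
```

The guard lives entirely in the application, which matches the point above: neither the protocol nor the web server prevents the duplicate request from arriving.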
