ch01: Overview of HTTP
0. Guide:
How web clients and servers communicate
Where resources (web content) come from
How web transactions work
The format of the messages used for HTTP communication
The underlying TCP network transport
The different variations of the HTTP protocol
Some of the many HTTP architectural components installed around the Internet
1. Web Clients and Servers
Together, HTTP clients and HTTP servers make up the basic components of the World Wide Web.
2. Resources
A resource is any kind of content.
2.1 Media Types
Because the Internet hosts many thousands of different data types, HTTP carefully tags each object being transported through the Web with a data format label called a MIME type.
A MIME type is a textual label, represented as a primary object type and a specific subtype, separated by a slash. For example:
An HTML-formatted text document would be labeled with type text/html.
A plain ASCII text document would be labeled with type text/plain.
A JPEG version of an image would be image/jpeg.
A GIF-format image would be image/gif.
An Apple QuickTime movie would be video/quicktime.
A Microsoft PowerPoint presentation would be application/vnd.ms-powerpoint.
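The primary type and subtype structure can be illustrated with a few lines of Python. This is only a sketch: the helper name split_mime_type is made up for these notes, and it ignores optional parameters such as charset.

```python
def split_mime_type(mime_type: str) -> tuple[str, str]:
    """Split a MIME type such as 'text/html' into (primary type, subtype)."""
    primary, sep, subtype = mime_type.strip().partition("/")
    if not sep or not primary or not subtype:
        raise ValueError(f"not a valid MIME type: {mime_type!r}")
    # MIME type names are case-insensitive, so normalize to lowercase.
    return primary.lower(), subtype.lower()


for label in ("text/html", "image/jpeg", "application/vnd.ms-powerpoint"):
    print(label, "->", split_mime_type(label))
```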
2.2 URIs
Each web server resource has a name, so clients can point out what resources they are interested in.
The server resource name is called a uniform resource identifier, or URI.
URIs come in two flavors, called URLs and URNs.
2.3 URLs
The uniform resource locator (URL) is the most common form of resource identifier.
URLs describe the specific location of a resource on a particular server.
Standardized format:
The first part of the URL is called the scheme, and it describes the protocol used to access the resource. This is usually the HTTP protocol (http://).
The second part gives the server's Internet address (e.g. www.joe-hardware.com).
The rest names a resource on the web server (e.g. /specials/saw-blade.gif).
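As a quick check of the three parts, Python's standard urllib.parse module can split a URL. The URL below is assembled from the example pieces in these notes.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://www.joe-hardware.com/specials/saw-blade.gif")

print(parts.scheme)    # the scheme: protocol used to access the resource
print(parts.hostname)  # the server's Internet address
print(parts.path)      # the resource on that web server
```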
Today, almost every URI is a URL.
2.4 URNs
A uniform resource name (URN) serves as a unique name for a particular piece of content, independent of where the resource currently resides.
URNs are still experimental and not yet widely adopted.
3. Transactions
An HTTP transaction consists of a request command (sent from client to server) and a response result (sent from the server back to the client).
This communication happens with formatted blocks of data called HTTP messages.
3.1 Methods
Every HTTP request message has a method. The method tells the server what action to perform:
HTTP method | Description |
GET | Send named resource from the server to client. |
PUT | Store data from client into a named server resource. |
DELETE | Delete the named resource from a server. |
POST | Send client data into a server gateway application. |
HEAD | Send just the HTTP headers from the response for the named resource. |
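In a request, the method is the first word of the start line, followed by the resource name and the protocol version. The sketch below builds such a line for the five methods in the table. The function name request_line and the default version string "HTTP/1.1" are assumptions made for illustration, not something these notes specify.

```python
# The five methods listed in the table above.
COMMON_METHODS = {"GET", "PUT", "DELETE", "POST", "HEAD"}


def request_line(method: str, target: str, version: str = "HTTP/1.1") -> str:
    """Build a request start line such as 'GET /index.html HTTP/1.1'."""
    # Method names are case-sensitive, so they are not normalized here.
    if method not in COMMON_METHODS:
        raise ValueError(f"method not in the table: {method!r}")
    return f"{method} {target} {version}"


print(request_line("GET", "/specials/saw-blade.gif"))
print(request_line("HEAD", "/specials/saw-blade.gif"))
```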
3.2 Status Codes
Every HTTP response message comes back with a status code.
HTTP also sends an explanatory textual "reason phrase" with each numeric status code.
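Python's standard library ships a table of status codes and their usual reason phrases in http.HTTPStatus, which makes the pairing easy to inspect. The three codes chosen here are just common examples.

```python
from http import HTTPStatus

for code in (200, 302, 404):
    status = HTTPStatus(code)
    # value is the numeric status code, phrase is the textual reason phrase.
    print(status.value, status.phrase)
```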
3.3 Web Pages Can Consist of Multiple Objects
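An application often issues multiple HTTP transactions to accomplish a task. For example, a web browser fetches the HTML page that describes the layout, then issues additional HTTP transactions for each embedded image or other object.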
4. Messages
HTTP messages consist of three parts:
Start line
The first line of the message is the start line, indicating what to do for a request or what happened for a response.
Header fields
Zero or more header fields follow the start line. Each header field consists of a name and a value, separated by a colon (:) for easy parsing. The headers end with a blank line. Adding a header field is as easy as adding another line.
Body
After the blank line is an optional message body containing any kind of data. Request bodies carry data to the web server; response bodies carry data back to the client. Unlike the start lines and headers, which are textual and structured, the body can contain arbitrary binary data. Of course, the body can also contain text.
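The three-part layout can be made concrete with a small parser. This is a minimal sketch, assuming lines end with CRLF and that headers are not folded across lines; parse_message and the sample response are illustrative, not a complete HTTP parser.

```python
def parse_message(raw: bytes) -> tuple[str, list[tuple[str, str]], bytes]:
    """Split a raw HTTP message into (start line, header fields, body)."""
    # The first blank line separates the textual head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]
    headers = []
    for line in lines[1:]:
        # Each header field is a name and a value separated by a colon.
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    # The body stays as bytes because it may hold arbitrary binary data.
    return start_line, headers, body


sample = (
    b"HTTP/1.0 200 OK\r\n"
    b"Content-type: text/plain\r\n"
    b"Content-length: 19\r\n"
    b"\r\n"
    b"Hi! I'm a message!\n"
)
start_line, headers, body = parse_message(sample)
print(start_line)
print(headers)
print(body)
```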
5. Connections
5.1 Connections, IP Addresses, and Port Numbers
Steps:
The browser extracts the server's hostname from the URL.
The browser converts the server's hostname into the server's IP address.
The browser extracts the port number (if any) from the URL.
The browser establishes a TCP connection with the web server.
The browser sends an HTTP request message to the server.
The server sends an HTTP response back to the browser.
The connection is closed, and the browser displays the document.
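The steps above can be sketched with Python's socket module. This is a rough illustration under several assumptions: the helper names are invented for these notes, port 80 is used when the URL names no port, the request uses HTTP/1.1 with a Connection: close header so the server ends the connection after responding, and error handling is omitted.

```python
import socket
from urllib.parse import urlsplit


def host_and_port(url: str) -> tuple[str, int]:
    """Steps 1 and 3: extract the hostname and the port number (if any)."""
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError(f"URL has no hostname: {url!r}")
    return parts.hostname, parts.port or 80  # assume port 80 when absent


def build_get_request(url: str) -> bytes:
    """Build the request message sent in step 5."""
    parts = urlsplit(url)
    host, port = host_and_port(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    host_header = host if port == 80 else f"{host}:{port}"
    return (
        f"GET {target} HTTP/1.1\r\n"
        f"Host: {host_header}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def fetch(url: str) -> bytes:
    """Run the steps in order and return the raw response message."""
    host, port = host_and_port(url)           # steps 1 and 3
    ip_address = socket.gethostbyname(host)   # step 2: hostname -> IP address
    # Step 4: establish a TCP connection with the web server.
    with socket.create_connection((ip_address, port), timeout=10) as conn:
        conn.sendall(build_get_request(url))  # step 5: send the request
        chunks = []
        while chunk := conn.recv(4096):       # step 6: read the response
            chunks.append(chunk)
    # Step 7: leaving the with-block closes the connection.
    return b"".join(chunks)
```

Calling fetch with a reachable http:// URL requires network access; the two helper functions can be tried offline.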
6. Architectural Components of the Web
There are many other web applications that you interact with on the Internet.
Proxies: HTTP intermediaries that sit between clients and servers
Caches: HTTP storehouses that keep copies of popular web pages close to clients
Gateways: Special web servers that connect to other applications
Tunnels: Special proxies that blindly forward HTTP communications
Agents: Semi-intelligent web clients that make automated HTTP requests
6.1 Proxies
Proxy servers are important building blocks for web security, application integration, and performance optimization.
6.2 Caches
A web cache, or caching proxy, is a special type of HTTP proxy server that keeps copies of popular documents that pass through the proxy. The next client requesting the same document can be served from the cache's copy.
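The core idea of keeping a copy and serving later requests from it can be sketched in a few lines. This is a toy model only: TinyCache and fake_origin are hypothetical names, the origin server is simulated by a plain function, and real caches also handle expiration and revalidation, which are ignored here.

```python
from typing import Callable, Dict


class TinyCache:
    """Keeps a copy of every document that passes through it."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._copies: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        if url in self._copies:
            # Cache hit: serve the stored copy without contacting the origin.
            self.hits += 1
            return self._copies[url]
        # Cache miss: fetch from the origin server and keep a copy.
        self.misses += 1
        document = self._fetch(url)
        self._copies[url] = document
        return document


def fake_origin(url: str) -> bytes:
    """Stand-in for a real origin server (an assumption for this sketch)."""
    return f"document for {url}".encode()


cache = TinyCache(fake_origin)
cache.get("http://www.joe-hardware.com/tools.html")  # miss, fetched from origin
cache.get("http://www.joe-hardware.com/tools.html")  # hit, served from the copy
print(cache.hits, cache.misses)
```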
6.3 Gateways
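Gateways are special servers that act as intermediaries for other servers. They are often used to convert HTTP traffic to another protocol. For example, an HTTP/FTP gateway receives requests for FTP URIs via HTTP but fetches the documents using the FTP protocol.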
6.4 Tunnels
Tunnels are HTTP applications that, after setup, blindly relay raw data between two connections.
6.5 Agents
Agents are client programs that make HTTP requests on the user's behalf. Any application that issues web requests is an HTTP agent, including web browsers and automated programs such as spiders and web robots.