Decide on the maximum number of pages your user is expected to browse in one session. Do a client fetch that gets the set of primary keys satisfying your maximum criteria and return this set to your client. This step is performed only once. Each time the user requests the next or previous page, use the cached set of keys to get the desired rows based on the page size. After the initial cache retrieval, this method always retrieves at most n rows, where n is the number of rows in your page. When the user is done, flush the key cache. This method is especially useful when you have a complex query where a simple SQL statement such as "SELECT * FROM ... WHERE Key > lastKey" won't work. The drawbacks of this approach are:
1 - This method ignores records added or removed after the user's initial browse request; however, this is usually acceptable in many types of LOB applications.
2 - This method requires fetching the keys in advance; however, if your maximum number of pages is reasonable, this should not be a problem, especially when the query is well-qualified.
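A minimal sketch of the key-cache approach described above, using Python's built-in sqlite3 module. The `items` table, its columns, the `price > 10` filter, and the `MAX_PAGES` / `PAGE_SIZE` values are all made up for illustration; your real query would go where the filter is.

```python
import sqlite3

MAX_PAGES = 5   # assumed cap on pages per browsing session
PAGE_SIZE = 3   # assumed rows per page


def make_demo_db():
    """Build a throwaway in-memory table (hypothetical schema) with 30 rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
    conn.executemany(
        "INSERT INTO items (id, name, price) VALUES (?, ?, ?)",
        [(i, f"item{i}", float(i)) for i in range(1, 31)],
    )
    return conn


def fetch_key_cache(conn, max_pages=MAX_PAGES, page_size=PAGE_SIZE):
    """Run the (possibly complex) query once and keep only the primary keys."""
    rows = conn.execute(
        "SELECT id FROM items WHERE price > ? ORDER BY price, id LIMIT ?",
        (10, max_pages * page_size),
    ).fetchall()
    return [row[0] for row in rows]


def fetch_page(conn, key_cache, page, page_size=PAGE_SIZE):
    """Fetch at most page_size rows for a zero-based page, using cached keys."""
    keys = key_cache[page * page_size:(page + 1) * page_size]
    if not keys:
        return []
    placeholders = ",".join("?" * len(keys))
    rows = conn.execute(
        f"SELECT id, name, price FROM items WHERE id IN ({placeholders})", keys
    ).fetchall()
    # IN (...) does not guarantee order, so restore the cached order.
    # Keys whose rows were deleted since the cache was built are skipped.
    by_id = {row[0]: row for row in rows}
    return [by_id[k] for k in keys if k in by_id]
```

Rows inserted after `fetch_key_cache` runs never show up, and deleted rows simply leave a page short, which is the first drawback listed above.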
I know the TCP protocol binds itself to a port till the transfer of messages is over (port 80)
A TCP connection is uniquely identified by the 4-tuple (source address, source port, dest address, dest port). The source and destination IP addresses will take care of themselves, but you're slightly confused about the ports.
A port is just a 16-bit integer, used to distinguish between multiple active sockets on the same host, but there are certain conventions governing their allocation (reference: Wikipedia):
- "well-known" ports are < 1024.
  - These ports are generally protected by the OS, so an unprivileged process cannot bind one and hence masquerade as a well-known service.
  - As you noted, 80 is the "well-known" port for HTTP, which means it's the default unless your URL specifies otherwise.
- "registered" ports, usable by unprivileged processes to provide services, are between 1024 and 49151. For example, 8080 is commonly used for an unprivileged HTTP server.
- remaining values from 49152 to 65535 are used for ephemeral ports. When you create a socket and connect to a server, without binding your socket to a particular local port, the kernel assigns a free port from the ephemeral range. This is just to create a unique 4-tuple identifying your connection, and you'll normally never care what the value is.
- NB. the actual range used for ephemeral ports may vary by OS and even be configurable - it won't dip into the well-known range below 1024 though.
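To make the ranges above concrete, here is a small sketch. `classify_port` and `kernel_assigned_port` are illustrative helper names, not part of any library; the boundaries are the conventional ones listed above, and the second helper just asks the kernel for a free port by binding to port 0.

```python
import socket


def classify_port(port):
    """Name the conventional range a port number falls into."""
    if not 0 <= port <= 65535:
        raise ValueError("a port is a 16-bit unsigned integer")
    if port < 1024:
        return "well-known"
    if port < 49152:
        return "registered"
    return "dynamic"  # the nominal ephemeral range, 49152-65535


def kernel_assigned_port():
    """Bind to port 0 so the kernel picks a free port, and report it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]
```

Calling `classify_port(kernel_assigned_port())` on your own machine shows the NB in action: some systems hand out ports that the convention would call "registered" rather than "dynamic".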
and UDP is best effort (ie no binding).
UDP stands for User Datagram Protocol. You're right that it is best-effort, but that has nothing to do with ports. Both TCP and UDP use exactly the same IP addressing scheme, with the same 4-tuple. You'll probably never use it for HTTP though (see the answer to your last question below).
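If you want to see that UDP uses ports in just the same way, here is a rough sketch over the loopback interface. `udp_round_trip` is a made-up helper name; the receiver binds to a kernel-chosen port and the sender gets its own port assigned implicitly on the first send.

```python
import socket


def udp_round_trip(payload=b"ping"):
    """Send one datagram over loopback and report the addresses involved."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))   # port 0: let the kernel pick one
        receiver.settimeout(2.0)          # best-effort: don't wait forever
        sender.sendto(payload, receiver.getsockname())
        data, source = receiver.recvfrom(1024)
        return data, source, sender.getsockname(), receiver.getsockname()
    finally:
        sender.close()
        receiver.close()
```

The `source` address reported by `recvfrom` carries the sender's port, so the datagram is addressed by the same (source address, source port, dest address, dest port) values as a TCP connection would be. There is just no connection, and no delivery guarantee, behind it.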
My question is if I try and access two websites at the same time (multiple tabs on my browser), assuming both websites are web servers, my questions are
- Does my computer communicate with one webservice (website) first and then communicate with the other (serially). Also if this is the case is the time difference so small that I feel it loads simultaneously?
Yes and no. You create two sockets and connect them both. The source IP will be the same for each (since they're on the same machine, and presumably using the same network interface on that machine). For two different websites the destination IPs will differ, and the destination port will be 80 for both; if both tabs connect to the same HTTP server, the destination IP and port will be the same as well. Either way, the source port will be different, because your OS allocates a different ephemeral port to each socket.
Because each socket is your endpoint for a different TCP connection (they have different unique 4-tuples), they can run in parallel. However, assuming as above the two connections are over the same physical network interface, they can't send or receive physical packets simultaneously. In practice this doesn't matter, since the OS will interleave their packets onto the physical network for you.
The connections will generally be asynchronous, so both sockets can have in-flight requests at once, and the replies can also be interleaved.
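You can watch the 4-tuples differ with a short sketch like this one. It opens two connections to a throwaway listener on the loopback interface (standing in for a web server; `open_two_connections` is just an illustrative name) and returns each connection's (source IP, source port, dest IP, dest port).

```python
import socket


def open_two_connections():
    """Connect twice to one local listener and return both 4-tuples."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))   # kernel-chosen port for the demo server
    server.listen(2)
    dest = server.getsockname()
    first = socket.create_connection(dest)
    second = socket.create_connection(dest)
    try:
        # getsockname() is the local end, getpeername() the remote end.
        return [s.getsockname() + s.getpeername() for s in (first, second)]
    finally:
        for s in (first, second, server):
            s.close()
```

Both tuples share the source IP, destination IP and destination port; only the kernel-assigned source port differs, and that is enough to keep the two connections apart.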
- Suppose I have my own web server (tomcat) running on port 80, how can I communicate with other websites if it happens on the same port?
Your website will be listening on the IP,port tuple (localhost,80). If you connect to it from the same machine, your connection will be something like (localhost, ephemeral1, localhost, 80). If you connect to a web server on a different machine, your connection will be something like (localhost, ephemeral2, remotehost, 80). They're still different, even if they both have an 80 in one of the 4 values.
The only thing you can't do is have two different web servers both listening on port 80 on the same machine (more precisely, on the same IP address and port).
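A quick way to convince yourself of that restriction, as a sketch: bind one listening socket, then try to bind a second one to the same address and port. `try_second_listener` is a hypothetical helper, and it uses a kernel-chosen port rather than 80 so it runs without privileges. It assumes default socket options (no SO_REUSEPORT or similar).

```python
import socket


def try_second_listener():
    """Return the errno raised by a second bind to a busy port, else None."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind(("127.0.0.1", 0))
        first.listen()
        port = first.getsockname()[1]
        try:
            second.bind(("127.0.0.1", port))   # same IP, same port
        except OSError as exc:
            return exc.errno                   # typically errno.EADDRINUSE
        return None
    finally:
        second.close()
        first.close()
```

Outgoing connections are unaffected, since they use ephemeral source ports and never try to bind the listening port.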
- Do websites decide which protocol to use TCP or UDP?
You can always check this stuff yourself in the HTTP/1.1 standard (RFC 2616). Here's the relevant section:
HTTP communication usually takes place over TCP/IP connections. The
default port is TCP 80 [19], but other ports can be used. This does
not preclude HTTP from being implemented on top of any other protocol
on the Internet, or on other networks. HTTP only presumes a reliable
transport; any protocol that provides such guarantees can be used;
the mapping of the HTTP/1.1 request and response structures onto the
transport data units of the protocol in question is outside the scope
of this specification.
So you see HTTP doesn't have to use TCP, but it does assume a reliable (and connection-oriented) protocol, so UDP is out.
Best Answer
There is no such thing as a "persistent" TCP connection. All TCP connections persist from connection start to close.
There is no concept of "close connection" for HTTP. HTTP knows only requests and responses, and an exchange is done once the request is fully sent and the response is fully received. With keep-alive you can have multiple such exchanges inside a single TCP connection.
Close of TCP connection means close of HTTP connection, but there is no explicit close of the HTTP connection.
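Here is a small sketch of two request/response exchanges sharing one TCP connection, using only the Python standard library. The handler, the paths and the function name are made up for the demo; the server speaks HTTP/1.1 and sends a Content-Length so that keep-alive can work.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class KeepAliveHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # HTTP/1.1 keeps the connection open by default

    def do_GET(self):
        body = self.path.encode()   # echo the request path back
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass                        # keep the demo quiet


def two_exchanges_one_connection():
    """Return (body, local port) for two GETs made on one HTTPConnection."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), KeepAliveHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
    results = []
    try:
        for path in ("/first", "/second"):
            conn.request("GET", path)
            body = conn.getresponse().read()   # exchange done: response fully read
            results.append((body, conn.sock.getsockname()[1]))
    finally:
        conn.close()                # closing TCP is what ends the "HTTP connection"
        server.shutdown()
        server.server_close()
    return results
```

Both exchanges report the same local port, i.e. they travelled over the same TCP connection, and nothing HTTP-level "closed" anything in between.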