Yes, size matters. Your co-worker's argument is that any amount of traffic up to the MTU will take the same amount of time to transmit, and that simply isn't true.
Forget for a minute that you have any protocols at all, just a pipe where bits go in one end and (hopefully) make it out the other. If the pipe can transfer 8,000 bits per second, 500 bytes (4,000 bits) will take half a second and 1,000 bytes (8,000 bits) will take a full second. This holds true for pretty much every kind of interface because they're all serial. Some interfaces are partially parallel in that they may let you transfer more than one bit per clock cycle, but once you've exceeded that limit, you're serially transferring a stream of whatever those units are. So there's one basic limit right there.
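If you want to play with the arithmetic, here's a minimal sketch using the 8,000 bit/s figure from the example above:

```python
# A minimal sketch of the serial-pipe arithmetic: time on the wire is
# just bits divided by bits-per-second.

LINK_RATE_BPS = 8_000  # the example pipe: 8,000 bits per second

def transmission_time(payload_bytes: int) -> float:
    """Seconds needed to push payload_bytes through the pipe."""
    return payload_bytes * 8 / LINK_RATE_BPS

print(transmission_time(500))    # 0.5 -- half a second
print(transmission_time(1_000))  # 1.0 -- a full second
```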
When you add protocols, which are necessary because pipes aren't always reliable and because lots of pipes may be joined to form a network, you add information that tells the protocol implementation what to do with the packet. That information takes up space (and therefore transmission time) that you can no longer use for data, and it's required overhead, because you can't run the protocol without it.
Your example is a UDP/IP* packet, which contains your payload, an 8-byte header added by the UDP protocol and a 20-byte header added by the IP protocol. If you send 500-byte datagrams, the 28-byte overhead imposed by the protocols will make up 5.3% of the 528-byte packet that has to be sent. Increase the payload to 1,000 bytes and you're sending more of your own data in each 1,028-byte datagram for an overhead rate of 2.7%. If you had a transmission medium that had a large MTU and could swallow a 10,028-byte packet, overhead would shrink even further to 0.3%.
TCP, being a more complex protocol than UDP, has a 20-byte header, which makes the total overhead 40 bytes when run over IP. For your 500- and 1,000-byte payload examples, overhead rates would be 7.4% and 3.8%.
Sending those TCP/IP or UDP/IP packets over Ethernet adds 14 bytes of Ethernet's own header, four bytes of its trailer and 12 bytes of idle time between frames for a total of 30 bytes. As with the protocols above, larger payloads mean fewer frames, and fewer frames mean less time spent on overhead.
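Those percentages are easy to reproduce. A quick sketch using the header sizes quoted above:

```python
# Reproducing the overhead percentages above. Header sizes as quoted:
# UDP 8, TCP 20, IP 20, Ethernet 14 + 4 + 12 = 30 bytes per frame.

def overhead_pct(payload: int, overhead: int) -> float:
    """Overhead as a percentage of total bytes on the wire."""
    return 100 * overhead / (payload + overhead)

for payload in (500, 1_000, 10_000):
    print(f"UDP/IP, {payload}-byte payload: {overhead_pct(payload, 8 + 20):.1f}%")

for payload in (500, 1_000):
    print(f"TCP/IP, {payload}-byte payload: {overhead_pct(payload, 20 + 20):.1f}%")

# Stacking Ethernet's 30 bytes on top of TCP/IP:
print(f"TCP/IP over Ethernet, 1,000-byte payload: {overhead_pct(1_000, 40 + 30):.1f}%")
```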
There are a number of reasons why you don't just send arbitrarily large packets to keep the overhead down. The biggest of those reasons is that you don't want to have to retransmit a whole megabyte of data because you lost six bytes somewhere in the middle. Unless you know your transmission medium is extremely reliable, it's much better to incur the overhead so you only have to re-send a one-kilobyte frame.
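Here's a back-of-the-envelope way to see the tradeoff; the error rate and overhead figures are hypothetical, chosen only to show the shape of the curve:

```python
# Back-of-the-envelope look at why frames can't grow without bound.
# The error rate and per-frame overhead below are assumptions, not
# measurements of any real medium.

OVERHEAD = 70           # assumed bytes of protocol overhead per frame
BYTE_ERROR_RATE = 1e-6  # assumed chance that any given byte is corrupted

def expected_wire_bytes(total_payload: int, frame_payload: int) -> float:
    """Expected bytes sent to deliver total_payload, if any corruption
    forces a retransmit of the whole frame (geometric retries)."""
    frames = total_payload / frame_payload
    frame_bytes = frame_payload + OVERHEAD
    p_loss = 1 - (1 - BYTE_ERROR_RATE) ** frame_bytes
    return frames * frame_bytes / (1 - p_loss)

for size in (100, 1_000, 10_000, 1_000_000):
    print(f"{size:>9}-byte frames: {expected_wire_bytes(10**8, size):>13,.0f} bytes")
```

Small frames waste bytes on headers; huge frames waste them on retransmissions, with a sweet spot in between.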
The real-world parallel to this would be using 53-foot trucks for shipping. It's a lot cheaper to pack each one as tightly as possible than it is to hire drivers and buy gas for a lot of them to carry just a couple of things each.
*What is commonly called UDP is actually UDP over IP, or UDP/IP. UDP can be run on top of protocols other than IP, but IP is by far the most common case.
To address dbasnett's comment: The point isn't that UDP is run on top of other protocols, it's that its place in the protocol layer cake means it can be. UDP, being a transport-layer protocol, assumes host addressing has already been taken care of by the network layer. This means you could run UDP (and just UDP, not UDP/IP) across pretty much anything if your only need is to identify sending and receiving ports. A serial link between two hosts (with no network layer, since addressing would be implicit) would work. Ethernet frames would, too, if MAC addresses were sufficient to identify hosts. I'm not saying anyone actually does this, but a properly designed network stack will allow replacement of lower layers without the higher ones having to know or care.
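To get a feel for how little UDP itself carries, here's a sketch that packs a bare UDP header by hand. The ports and payload are arbitrary, and the checksum is left at zero, which IPv4 permits to mean "not computed":

```python
# A bare UDP header is just four 16-bit fields: source port, destination
# port, length and checksum. Packing one by hand shows how little UDP
# itself assumes about what's underneath it.
import struct

def udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    length = 8 + len(payload)  # header plus payload, in bytes
    return struct.pack("!HHHH", src_port, dst_port, length, 0) + payload

print(udp_datagram(5000, 7, b"hello").hex())
# -> 13880007000d000068656c6c6f
```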
The sequence numbers in TCP wrap around. This means that after 2^32-1 (4294967295), the sequence numbers continue with 0.
You might think that this could pose problems with distinguishing between old and new data with the same sequence number, but that doesn't happen, because TCP also has the concept of a window of acceptable sequence numbers, and that window is at most 2^16-1 sequence numbers wide (without TCP window scaling). This means that long before you need to re-use sequence numbers, they have fallen outside the window of acceptable sequence numbers.
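The wraparound means "later" has to be a modular comparison. A simplified sketch of the idea (real stacks use the same trick, often described as serial number arithmetic):

```python
# Sequence numbers live on a 2**32 circle, so ordering is modular.

MOD = 2**32

def seq_add(a: int, n: int) -> int:
    """Advance sequence number a by n bytes, wrapping at 2**32."""
    return (a + n) % MOD

def seq_after(a: int, b: int) -> bool:
    """True if b is 'after' a on the circle (within half the space)."""
    return 0 < (b - a) % MOD < MOD // 2

print(seq_add(4294967295, 1))    # 0 -- wraps to zero after 2**32 - 1
print(seq_after(4294967295, 0))  # True -- 0 is newer despite being smaller
```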
The initial sequence number can be selected in multiple ways.
The original TCP specification leaves it fairly open, but does specify that the initial sequence number should not have been used in a preceding connection (on the same port) in the last few hours. The example given there is an initial sequence number taken from a 32-bit clock whose low-order bit increments roughly every 4 microseconds, giving a period of about 4.55 hours (after that time the sequence numbers start to repeat).
It is also possible to select the initial sequence numbers more randomly, as long as the packets of the new connection can't be confused with those from a previous connection.
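A sketch of that clock-driven scheme; real stacks layer randomization on top of this, so treat it purely as an illustration:

```python
# The original specification's example: a 32-bit value that ticks every
# 4 microseconds and therefore repeats roughly every 4.55 hours.
import time

def initial_sequence_number() -> int:
    ticks = int(time.time() * 1_000_000) // 4  # 4-microsecond ticks
    return ticks % 2**32                       # wraps every ~4.55 hours

print(initial_sequence_number())
```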
A TCP connection is uniquely identified by the 4-tuple (source address, source port, dest address, dest port). The source and destination IP addresses will take care of themselves, but you're slightly confused about the ports.
A port is just a 16-bit integer, used to distinguish between multiple active sockets on the same host, but there are certain conventions governing port allocation (reference: Wikipedia):
"well-known" ports are < 1024.
These ports are generally protected by the OS, so an unprivileged process cannot bind one and hence masquerade as a well-known service (see the sketch after this list).
As you noted, 80 is the "well-known" port for HTTP, which means it's the default unless your URL specifies otherwise.
"registered" ports, useable by unprivileged processes to provide services, are between 1024 and 49152. For example, 8080 is commonly used for an unprivileged HTTP server
UDP stands for User Datagram Protocol. You're right that it is best-effort, but that has nothing to do with ports. Both TCP and UDP use exactly the same IP addressing scheme, with the same 4-tuple. You'll probably never use it for HTTP though (see the answer to your last question below).
Yes and no. You create two sockets and connect them both. The source IP will be the same for each (since they're on the same machine, and presumably using the same network interface on that machine). The destination IP and port will be the same (they're connecting to the same HTTP server). The source port, however, will be different, because your OS allocated a different ephemeral port to each socket.
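You can see this for yourself; a minimal sketch, where "example.com" is just a placeholder for any reachable HTTP server:

```python
# Two connections to the same server: source IP and destination (IP, port)
# match, but each socket gets its own ephemeral source port, so the
# 4-tuples differ.
import socket

a = socket.create_connection(("example.com", 80))
b = socket.create_connection(("example.com", 80))

print("socket a:", a.getsockname(), "->", a.getpeername())
print("socket b:", b.getsockname(), "->", b.getpeername())
# The two getsockname() results differ only in the port number.

a.close()
b.close()
```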
Because each socket is your endpoint for a different TCP connection (they have different unique 4-tuples), they can run in parallel. However, assuming as above that the two connections are over the same physical network interface, they can't send or receive physical packets simultaneously. In practice this doesn't matter, since the OS will interleave their packets onto the physical network for you.
The connections will generally be asynchronous, so both sockets can have in-flight requests at once, and the replies can also be interleaved.
Your website will be listening on the (IP, port) tuple (localhost, 80). If you connect to it from the same machine, your connection will be something like (localhost, ephemeral1, localhost, 80). If you connect to a web server on a different machine, your connection will be something like (localhost, ephemeral2, remotehost, 80). They're still different, even if they both have an 80 in one of the 4 values.
The only thing you can't do is have two different web servers both listening to port 80 on the same machine.
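That restriction is easy to demonstrate too; this sketch uses a high port so it runs unprivileged:

```python
# Two listeners on the same (address, port): the second bind fails with
# "address already in use".
import socket

first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("127.0.0.1", 8080))
first.listen()

second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    second.bind(("127.0.0.1", 8080))
except OSError as e:
    print("second bind failed:", e)  # EADDRINUSE

first.close()
second.close()
```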
You can always check this stuff yourself: the standard is here. Here's the relevant section:
So you see HTTP doesn't have to use TCP, but it does assume a reliable (and connection-oriented) protocol, so UDP is out.