You are misunderstanding the function of the assembler command.
If you write a goto in C, the compiler emits a jmp instruction. It does just what it says on the tin: jump. It tells the compiler nothing about where code should be placed.
Nowhere in your code do you specify what code goes where, so the compiler decides for you and puts the code where it thinks best. That is most likely the start of non-vector, non-boot space.
I think you'd be well off going to:
http://www.avrfreaks.net/
Search there for "[TUT] Bootloader" and see if a tutorial pops up that fits your general way of thinking. It'll take you at least a good half day, but you will learn most of what you need to know about it.
Edit:
I forgot to point out that your code also doesn't stop at a predictable point. These days the compiler often helps out by assuming you mean to stop at the end of main(), but it's "good manners" to include a while(1){}; at the point where you want the code to halt. That way you are in absolute control of what happens, and you can be sure your code doesn't just run on into empty flash.
Yes. Obviously you can send multiple packets.
There is a tradeoff between efficiency and the cost of getting a packet wrong. With small packets, there is more total overhead for the amount of data actually delivered. However, if there is an error, less data has to be re-sent.
That all said, data errors are, or at least should be, rare. Generally you send maximum-size packets until there is less than a full packet of data left.
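The "full packets until the tail" rule and the overhead tradeoff can be sketched in a few lines of C. The payload size and per-packet overhead below are made-up numbers for illustration, not values taken from this bootloader's protocol.

```c
#include <stddef.h>

/* Hypothetical protocol parameters, chosen only for illustration. */
#define MAX_PAYLOAD 64u  /* largest data field one packet may carry */
#define OVERHEAD     6u  /* header + checksum bytes added to every packet */

/* Payload size of the next packet: full-size until only the tail is left. */
size_t next_payload_size(size_t remaining)
{
    return remaining < MAX_PAYLOAD ? remaining : MAX_PAYLOAD;
}

/* Number of packets needed to carry image_len bytes (rounds up). */
size_t packet_count(size_t image_len)
{
    return (image_len + MAX_PAYLOAD - 1u) / MAX_PAYLOAD;
}

/* Total bytes on the wire, including the per-packet overhead. */
size_t wire_bytes(size_t image_len)
{
    return image_len + packet_count(image_len) * OVERHEAD;
}
```

With these assumed numbers, a 1000-byte image goes out as 15 full packets plus one 40-byte packet, 1096 bytes in total. Shrinking MAX_PAYLOAD raises the packet count and so the overhead, but reduces how much has to be re-sent after an error.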
Their checksum algorithm is rather a joke, let's say "primitive" at best. Smaller packets have more checksum protection because there are effectively more checksum bits per data bit.
Of course you aren't stuck with this bootloader. You can write one that does whatever you want. See this question for more information on bootloaders in general.
More on verification
As Peter pointed out in a comment, data correctness is essential in a bootloader.
That is a good argument for ditching this bootloader and its ridiculously naïve checksum algorithm. What I usually do is not even attempt to checksum individual packets. I design the protocol for reasonable efficiency, but add a good checksum to the whole image.
Communicating the data from the host to the device is only one part of the problem. It then has to be written correctly to the non-volatile memory. That's actually much more likely to fail. A power glitch, or the user hitting the reset button at a bad time can cause corruption, for example.
My strategy is generally not to worry about the possible errors in individual steps of the process, but to verify the final result. I usually put something like a 24 or 32 bit CRC checksum at the end of the uploaded image. Before the bootloader runs the main app, it verifies the checksum. If it fails, it requests a new image, runs the previous image, signals error, or something. Never ever run a corrupted image.
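A minimal sketch of that final check, assuming a standard CRC-32 (the reflected 0xEDB88320 polynomial used by zlib and Ethernet) stored little-endian in the last four bytes of the image. The function names and the image layout are assumptions for illustration; a real bootloader would read the image from flash rather than from a RAM buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32, reflected polynomial 0xEDB88320. Slow but table-free. */
uint32_t crc32_calc(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Returns 1 if the CRC stored in the last 4 bytes (little-endian, assumed
 * layout) matches the CRC of everything before it, 0 otherwise. */
int image_is_valid(const uint8_t *image, size_t len)
{
    if (len < 4u) {
        return 0;  /* too short to even hold a checksum */
    }
    size_t body = len - 4u;
    uint32_t stored = (uint32_t)image[body]
                    | ((uint32_t)image[body + 1] << 8)
                    | ((uint32_t)image[body + 2] << 16)
                    | ((uint32_t)image[body + 3] << 24);
    return crc32_calc(image, body) == stored;
}
```

The bootloader would call image_is_valid() before jumping to the application and take one of the fallback paths (request a new image, signal an error, and so on) when it returns 0.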
From a comment:
The binary file contains a Firmware (some .c and .h and libraries generated with an embedded IDE like Keil) .. How to add this checksum to the image itself? Is it safe to modify such a file?
No, the binary file doesn't contain any .c and .h files. It's a binary file. You need to step back and learn what the compiler, librarian, and linker each do, and what kind of information is in the various file types. You can't go about developing code where you're close to the hardware without understanding these things.
Adding the checksum to the binary file (often in Intel Hex format) is one way to add a checksum to the image. You could write a separate tool that runs after the linker. This tool reads the image, computes the checksum, and embeds it back into the image at a known place.
Another way is to have the uploader program add the checksum on the fly. The uploader reads the unmodified binary file, and adds the checksum at the known locations, usually the last few bytes of the image. This modified image is then sent to the bootloader, but the binary file is never altered.
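As a rough sketch of the stamping step, here is what either approach might do to an in-memory copy of the image: compute a CRC-32 over everything except the last four bytes and write it there, little-endian. The function names, the choice of CRC-32, and the "last four bytes" layout are assumptions for illustration; the bootloader must of course use the same algorithm and location when it verifies.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32, reflected polynomial 0xEDB88320 (assumed to match the
 * algorithm the bootloader uses for its check). */
uint32_t image_crc32(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Overwrite the last 4 bytes of the outgoing copy with the CRC of all
 * preceding bytes, little-endian. Returns 0 on success, -1 if the image
 * is too short to hold a checksum. */
int stamp_checksum(uint8_t *image, size_t len)
{
    if (len < 4u) {
        return -1;
    }
    size_t body = len - 4u;
    uint32_t crc = image_crc32(image, body);
    for (int i = 0; i < 4; i++) {
        image[body + i] = (uint8_t)(crc >> (8 * i));
    }
    return 0;
}
```

An uploader would call stamp_checksum() on its RAM copy just before sending, leaving the file on disk untouched. A post-link tool would do the same and then write the buffer back out.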
Best Answer
The reset vector is the second word (32 bit) in the interrupt vector table. It must be an odd value (thumb bit is set).
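To make this concrete, here is a small sketch that pulls the reset vector out of a raw copy of the vector table and checks the Thumb bit. It assumes an ARM Cortex-M style little-endian table, where the first word is the initial stack pointer. The function names are made up for illustration.

```c
#include <stdint.h>

/* Read a little-endian 32-bit word from a byte buffer. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* The reset vector is the second 32-bit word of the table (byte offset 4). */
uint32_t reset_vector(const uint8_t *vector_table)
{
    return read_le32(vector_table + 4);
}

/* A reset vector is only plausible if bit 0 (the Thumb bit) is set. */
int reset_vector_looks_valid(const uint8_t *vector_table)
{
    return (reset_vector(vector_table) & 1u) != 0u;
}
```

For example, a table beginning with the bytes 00 20 00 20 C1 00 00 08 has a reset vector of 0x080000C1, which is odd, so the handler itself sits at 0x080000C0. Those addresses are illustrative only.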