Git can generate patches/diffs for binary files as well as for text files.
I'm trying to figure out what encoding it uses for its binary patches.
Here is an example:
diff --git a/www/images/openconnect.png b/www/images/openconnect.png
new file mode 100644
index 0000000000000000000000000000000000000000..51a5d620083cafdc8be07fc42db44ee4a273cacc
GIT binary patch
literal 55947
zcmdRWhd<R{{QouL+Lx^CE7^o(&x`1mLiWne-g{(SE4%EOaTT&R8Ie)46SA_pbc?KP
zzUO|vkMHk)_}#}t>6X0T@AEpZ*K-|lT94EzNSR0>5D3M64OJZo1TO`A$Uup}J5o}F
zs^B+5FT{OaD0l@!ZDPTnN!&GzydV&wVcZAa(v9vV@a7F~HAC+wZg$>&mY%i{KR-WV
...
zM_(nPM^0iqGn&ziW^}xgq{7*>(Z~zK&uB(7n$e8P)2d=17EN{7l9w}@(Trv^qY==m
zVj$pbc}Z>Q83UQojAk^WG1F>eAQJOc8zsxw&S*w6n$e8P)3L}vzB|q+oEgn%Ml+fb
z)3L}vX6CCI&1gn5ngGoh$c$z*qZ!R;AX+sH#8%$gAZR*cATyfLjAk?eS~Uy=GVLP*
l@a=IAWJWWZ(TrvU{C~H_V_Z$W5taY|002ovPDHLkV1k~|z(xQ7
literal 0
HcmV?d00001
This is clearly some kind of binary-to-ASCII encoding… but it is not the common Base64. It appears to use more ASCII characters… and all the encoded lines (except for the last one!?) begin with z.
Best Answer
Aha, it's RFC 1924's version of the base85 encoding, which uses 5 ASCII characters to represent 4 bytes (80% efficiency).
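The additional wrinkle is that Git prefixes every line with a single letter ([A-Za-z]) giving the number of bytes encoded on that line: A-Z for 1-26 bytes and a-z for 27-52 bytes. That is why every full line starts with z (52 bytes) and only the last line of a block starts with a different letter.
Source code: https://github.com/git/git/blob/master/base85.c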
Announcement of this feature on the Git mailing list: http://www.gelato.unsw.edu.au/archives/git/0605/19975.html
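To make the line format concrete, here is a minimal Python sketch of decoding one line of such a patch. The function names decode_length and decode_line are my own, not Git's, and the alphabet is written out as I understand it from RFC 1924 and base85.c. The final zlib step is an assumption based on my reading of Git's behaviour (the payload of a "literal" block appears to be deflated data); it is not stated in the answer above. The sample line is the last one from the question, under "literal 0".

```python
import zlib

# RFC 1924 alphabet: digits, uppercase, lowercase, then 23 punctuation marks.
ALPHABET = (
    "0123456789"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "!#$%&()*+-;<=>?@^_`{|}~"
)
VALUE = {ch: i for i, ch in enumerate(ALPHABET)}


def decode_length(prefix):
    """Map the leading letter of a line to its byte count (A-Z: 1-26, a-z: 27-52)."""
    if "A" <= prefix <= "Z":
        return ord(prefix) - ord("A") + 1
    if "a" <= prefix <= "z":
        return ord(prefix) - ord("a") + 27
    raise ValueError("not a valid length prefix: %r" % prefix)


def decode_line(line):
    """Decode one patch line: a length letter followed by groups of 5 base85 characters."""
    nbytes = decode_length(line[0])
    body = line[1:]
    if len(body) % 5 != 0:
        raise ValueError("body length must be a multiple of 5")
    out = bytearray()
    for i in range(0, len(body), 5):
        acc = 0
        for ch in body[i:i + 5]:
            acc = acc * 85 + VALUE[ch]   # most significant digit first
        out += acc.to_bytes(4, "big")    # each group carries 4 bytes
    return bytes(out[:nbytes])           # drop padding in the final group


if __name__ == "__main__":
    raw = decode_line("HcmV?d00001")     # 'H' means 8 bytes on this line
    print(raw.hex())                     # 7801030000000001
    print(zlib.decompress(raw))          # b'' (matches "literal 0")
```

The 8 decoded bytes start with 78 01, which looks like a zlib header, and inflating them gives an empty result, which is consistent with the "literal 0" size shown in the patch. For a multi-line block you would decode each line in turn, concatenate the bytes, and inflate the whole thing once.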