If a server received a Base64 string and wanted to check its length before converting, say because it wanted to permit a decoded byte array of at most 16KB: how big could a 16KB byte array possibly become when converted to a Base64 string (assuming one byte per character)?
Base64: What is the worst possible increase in space usage?
base64expansion
Related Solutions
This roughly follows your document; simply massage some names, or add XML serializer attributes, to get the XML document you want:
public class Document
{
    public string Actor { get; set; }
    public string Description { get; set; }
    public string Title { get; set; }
    public string type { get { return "A"; } }
    public int Sequence { get; set; }
    public byte[] Content { get; set; }
}

var d = new Document() { Actor = "Sean Connory", Description = "Thriller", Title = "The Rock" };
d.Content = new byte[] { 43,45,23,43,82,90,34 };
var xmls = new System.Xml.Serialization.XmlSerializer(typeof(Document));
using (var ms = new System.IO.MemoryStream())
{
    xmls.Serialize(ms, d);
    Console.Write(System.Text.Encoding.UTF8.GetString(ms.ToArray()));
}
Console.ReadLine();
The XmlSerializer will convert the byte[] property (in this case Content) automatically to and from Base64 encoding. You were looking for the 'best' way to convert large files and put them in XML documents. There may be other (better) ways of doing so, but I have used this one with a great deal of success in the past. If the object is set up correctly, this solution could save you a lot of trouble, as it will both build the XML document for you and convert the data to Base64. In the reverse direction, you can take an XML document and populate an object with all of its data, which saves you the time of navigating XML nodes to find the data you want.
Update
If this doesn't work for large files as expected, I did find this MSDN article on serializing a stream into a Base64 stream. I have never worked with it before, so I cannot offer much insight, but it sounds closer to what you are looking for.
Q: Does a Base64 string always end with =?
A: No. (The word "usb" is Base64-encoded as dXNi.)
Q: Why does an = get appended at the end?
A: In short, the trailing = sign is padding, added in the final step of encoding when the input length is not a multiple of 3 bytes.
You will not get an = sign if the length of your string is a multiple of 3 characters, because Base64 encoding takes each group of three bytes (one character = 1 byte) and represents it as four printable ASCII characters.
Example:
(a) If you want to encode ABCDEFG <=> [ABC] [DEF] [G], Base64 will deal with the first block (producing 4 characters) and the second (as they are complete). But for the third it will add a double == to the output in order to complete the 4 needed characters. Thus, the result will be QUJD REVG Rw== (without spaces).
(b) If you want to encode ABCDEFGH <=> [ABC] [DEF] [GH], similarly, it will add just a single = at the end of the output to get 4 characters. The result will be QUJD REVG R0g= (without spaces).
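The padding cases above can be checked with a few lines of Python. This is only a sketch using the standard library's base64 module; the helper names encode_text and padding_count are made up for illustration, and the input strings are assumed to be plain ASCII (one byte per character).

```python
import base64


def encode_text(text: str) -> str:
    """Base64-encode an ASCII string and return the result as text."""
    return base64.b64encode(text.encode("ascii")).decode("ascii")


def padding_count(byte_length: int) -> int:
    """Number of '=' characters appended for an input of byte_length bytes."""
    # A final group of 1 byte needs 2 pads, 2 bytes need 1 pad, 3 bytes need none.
    return (3 - byte_length % 3) % 3


if __name__ == "__main__":
    for text in ("usb", "ABCDEFG", "ABCDEFGH"):
        encoded = encode_text(text)
        print(f"{text!r} ({len(text)} bytes) -> {encoded}, "
              f"{padding_count(len(text))} padding character(s)")
```

Running it prints one line per input, showing the encoded text next to the number of = characters that the length rule predicts.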
Best Answer
Base64 encodes each set of three bytes into four bytes. In addition the output is padded to always be a multiple of four.
This means that the size of the base-64 representation of a string of size n is:
ceil(n / 3) * 4
So, for a 16kB array, the base-64 representation will be ceil(16*1024/3)*4 = 21848 bytes long, which is about 21.3kB (taking 1kB = 1024 bytes, as above).
A rough approximation would be that the size of the data is increased to 4/3 of the original.
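As a sketch of how the server-side check from the question might look, the Python below computes the padded length ceil(n / 3) * 4 with integer arithmetic and rejects over-long input before decoding. The names base64_encoded_length, MAX_DECODED_BYTES and decode_if_small_enough are hypothetical, and the code assumes standard padded Base64 with no line breaks or other whitespace in the input. Note that a maximum-length string without padding can decode to a couple of bytes more than the limit, so the decoded size is checked as well.

```python
import base64


def base64_encoded_length(n: int) -> int:
    """Length of the padded Base64 encoding of n bytes: ceil(n / 3) * 4."""
    return (n + 2) // 3 * 4


# Hypothetical limit taken from the question: at most 16KB after decoding.
MAX_DECODED_BYTES = 16 * 1024
MAX_ENCODED_CHARS = base64_encoded_length(MAX_DECODED_BYTES)


def decode_if_small_enough(encoded: str) -> bytes:
    """Decode Base64 text, refusing input that cannot fit the byte limit."""
    # Cheap check first: reject before spending any work on decoding.
    if len(encoded) > MAX_ENCODED_CHARS:
        raise ValueError("Base64 input is too long")
    # validate=True makes the decoder reject characters outside the alphabet.
    decoded = base64.b64decode(encoded, validate=True)
    # The length bound is rounded up to whole 4-character groups, so an
    # unpadded maximum-length string can exceed the limit by up to 2 bytes.
    if len(decoded) > MAX_DECODED_BYTES:
        raise ValueError("Decoded data exceeds the byte limit")
    return decoded


if __name__ == "__main__":
    print(MAX_ENCODED_CHARS)
```

The length check is only an upper bound on the encoded size, which is why the second check on the decoded bytes is kept.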