C# – PDF File is damaged and cannot be repaired when moving memory stream to filestream

c# itext .net vb.net

I am using iTextSharp with VB.Net to stamp images onto PDF documents. (Since this is not language specific I tagged for C#, too.) I have two applications using the process.

  • The first uses the bytes from the MemoryStream to display the PDF
    documents online. This part works.

  • The second uses the same function but instead saves the PDF to a
    file. This piece generates an invalid PDF.

I have seen some similar questions, but they are all creating a document initially and have a document object in the code. Their memory streams are corrupt from the outset. My code does not have a document object and my original memory stream opens fine.

Here is the place where I get the error: (I have to put the buffer from m into a new memory stream because the stamper in the fillPDF function defaults to closing the stream unless marked otherwise.)

Dim m As MemoryStream = PDFHelper.fillPDF(filename, Nothing, markers, "")
Dim m2 As New MemoryStream(m.GetBuffer, 0, m.GetBuffer.Length)
Dim f As FileStream = New FileStream("C:\temp.pdf", FileMode.Create)
m2.CopyTo(f, m.GetBuffer.Length)
m2.Close()
f.Close()

Here is one of the ways I successfully use it on the website. This one does not use images, although some other similar successful places do use images on multiple documents that are then merged together.

Dim m As System.IO.MemoryStream = PDFHelper.fillPDF(filename, New Dictionary(Of String, String), New List(Of PDFHelper.PDfImage), "SAMPLE")
Dim data As Byte() = m.GetBuffer
Response.Clear()

'Send the file to the output stream
Response.Buffer = True

'Try to ensure the browser always opens the file and doesn’t just prompt to “open/save”.
Response.AddHeader("Content-Length", data.Length.ToString())
Response.AddHeader("Content-Disposition", "inline; filename=" + "Sample")
Response.AddHeader("Expires", "0")
Response.AddHeader("Pragma", "cache")
Response.AddHeader("Cache-Control", "private")

'Set the output stream to the correct content type (PDF).
Response.ContentType = "application/pdf"
Response.AddHeader("Accept-Ranges", "bytes")

'Output the file
Response.BinaryWrite(data)

'Flush the response to send the serialized data to the client browser.
Response.Flush()

Try
    Response.End()
Catch ex As Exception
    Throw ex
End Try

Here is the function in my utility class (PDFHelper.fillPDF)

  Public Shared Function fillPDF(fileToFill As String, Optional fieldValues As Dictionary(Of String, String) = Nothing, Optional images As List(Of PDfImage) = Nothing, Optional watermarkText As String = "") As MemoryStream

        Dim m As MemoryStream = New MemoryStream() ' for storing the pdf
        Dim reader As PdfReader = New PdfReader(fileToFill) ' for reading the document
        Dim outStamper As PdfStamper = New PdfStamper(reader, m) ' for filling the document

        If fieldValues IsNot Nothing Then
            For Each kvp As KeyValuePair(Of String, String) In fieldValues
                outStamper.AcroFields.SetField(kvp.Key, kvp.Value)
            Next
        End If


        If images IsNot Nothing AndAlso images.Count > 0 Then ' add all the images

            For Each PDfImage In images
                Dim img As iTextSharp.text.Image = Nothing ' image to stamp

                ' set up the image (different for different cases)
                Select Case PDfImage.ImageType
                    ' removed for brevity
                End Select

                Dim overContent As PdfContentByte = outStamper.GetOverContent(PDfImage.PageNumber) ' specify page number for stamping
                overContent.AddImage(img)

            Next

        End If

        ' add the watermark
        If watermarkText <> "" Then
            Dim underContent As iTextSharp.text.pdf.PdfContentByte = Nothing
            Dim watermarkRect As iTextSharp.text.Rectangle = reader.GetPageSizeWithRotation(1)

          ' removed for brevity
        End If

        ' flatten and close out (closing the stamper also closes m by default)
        outStamper.FormFlattening = True
        outStamper.SetFullCompression()
        outStamper.Close()
        reader.Close()
        Return m
  End Function

Best Answer

Since your code already works for streaming the PDF, one simple fix is a small change to your fillPDF method: have it return a byte array obtained with ToArray() instead of returning the MemoryStream itself:

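// other parameters left out for simplicity's sake
// requires: using System.IO; using iTextSharp.text.pdf;
public static byte[] fillPDF(string resource) {
  PdfReader reader = new PdfReader(resource);
  using (var ms = new MemoryStream()) {
    using (PdfStamper stamper = new PdfStamper(reader, ms)) {
      // do whatever you need to do
    }
    // the stamper is closed at this point, so the PDF is complete;
    // ToArray() returns only the bytes that were actually written
    return ms.ToArray();
  }
}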

Then you can stream the byte array to the client in ASP.NET and save it to the file system:

// get the manipulated PDF    
byte[] myPdf = fillPDF(inputFile);
// stream via ASP.NET
Response.BinaryWrite(myPdf);
// save to file system
File.WriteAllBytes(outputFile, myPdf);

If you're generating the PDF from a standard ASP.NET web form, don't forget to call Response.End() after the PDF is written; otherwise the page's HTML markup is appended to the response after the PDF bytes.
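
As background on why returning a byte array helps: MemoryStream.GetBuffer() returns the stream's whole internal buffer, which may include unused bytes after the data that was written, while ToArray() returns only the written bytes. Both calls still work after the stream has been closed. The following is a minimal sketch of that difference using only System.IO; the class BufferDemo and the method WriteAndClose are hypothetical names made up for this illustration, and the four-byte payload merely stands in for a real PDF.

```csharp
using System;
using System.IO;

// Hypothetical demo class; not part of iTextSharp or the code in the question.
static class BufferDemo
{
    // Writes the payload to a MemoryStream and closes it, imitating a
    // stream that has already been closed by whoever wrote to it.
    public static MemoryStream WriteAndClose(byte[] payload)
    {
        var ms = new MemoryStream();
        ms.Write(payload, 0, payload.Length);
        ms.Close();
        return ms;
    }

    static void Main()
    {
        // "%PDF" as stand-in content (assumption: any bytes would do here).
        byte[] payload = { 0x25, 0x50, 0x44, 0x46 };
        MemoryStream ms = WriteAndClose(payload);

        byte[] exact = ms.ToArray();   // only the bytes that were written
        byte[] raw = ms.GetBuffer();   // the whole internal buffer

        Console.WriteLine("ToArray length:   " + exact.Length);
        Console.WriteLine("GetBuffer length: " + raw.Length);
    }
}
```

If the GetBuffer length comes out larger, saving that array to disk would append the extra bytes to the end of the file, which is one plausible reason a viewer would report the saved PDF as damaged.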
