create Pdf without 3rd-party?

Scanzee

Member
Joined
May 26, 2007
Messages
23
Programming Experience
1-3
How can I create a PDF file without using third-party applications?
The PDF should be created and stored on the server.
Can ASP or JavaScript handle this?
 
ASP should be able to handle it fine. I used to create simple PDFs in VB6 (I haven't done it in .NET yet), and the code could easily be adapted to ASP.
 
It's not easy :D

Here is a part of an example PDF. Within a PDF, there are a number of "objects".

Object 1 is the document information (Info) object. It comes straight after the %PDF-1.4 header line that opens the file.
VB.NET:
%PDF-1.4
1 0 obj
  <<
    /Title (whatever)
    /Subject (whatever)
    /Producer (Dynamically created using code written by Martin)
    /CreationDate (D:" & Format(Now(), "yyyymmddHhNnSs") & ")
    /Keywords (Keyword1 keyword2)
    /MyKey (Copyright \(You need a backslash to use a bracket\) )
    /Author (Anybody)
    /Creator (Anybody)
  >>
endobj
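The VB6 Format(Now(), ...) call in the template above fills in the creation date. As a rough Python sketch of the same step (the helper name pdf_date is made up for illustration, and no time zone suffix is added):

```python
from datetime import datetime

def pdf_date(moment: datetime) -> str:
    """Return a PDF date string such as D:20071213142546 (no time zone part)."""
    return "D:" + moment.strftime("%Y%m%d%H%M%S")

# Build the /CreationDate line of the Info object for a fixed moment.
stamp = pdf_date(datetime(2007, 12, 13, 14, 25, 46))
print(f"    /CreationDate ({stamp})")
```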


Object 2 is the Root object, which dictates which object to display when opening the PDF, and how to display it.
VB.NET:
2 0 obj << /Type /Catalog /Outlines 3 0 R /OpenAction [37 0 R /Fit] /Pages 36 0 R /PageLayout /OneColumn /PageMode /UseNone >> endobj

Object 3 is the outlines (bookmarks) object, which is left empty here as I never used it. You then add font objects as necessary. You don't have to include all the font objects, only the ones you are going to use.

VB.NET:
3 0 obj << /Type /Outlines /Count 0 >> endobj
4 0 obj << /Type /Font /Subtype /Type1 /Name /F1 /BaseFont /Courier /Encoding /MacRomanEncoding >> endobj
5 0 obj << /Type /Font /Subtype /Type1 /Name /F2 /BaseFont /Courier-Bold /Encoding /MacRomanEncoding >> endobj
6 0 obj << /Type /Font /Subtype /Type1 /Name /F3 /BaseFont /Courier-Oblique /Encoding /MacRomanEncoding >> endobj
7 0 obj << /Type /Font /Subtype /Type1 /Name /F4 /BaseFont /Courier-BoldOblique /Encoding /MacRomanEncoding >> endobj
8 0 obj << /Type /Font /Subtype /Type1 /Name /F5 /BaseFont /Times-Roman /Encoding /MacRomanEncoding >> endobj
9 0 obj << /Type /Font /Subtype /Type1 /Name /F6 /BaseFont /Times-Bold /Encoding /MacRomanEncoding >> endobj
10 0 obj << /Type /Font /Subtype /Type1 /Name /F7 /BaseFont /Times-Italic /Encoding /MacRomanEncoding >> endobj
11 0 obj << /Type /Font /Subtype /Type1 /Name /F8 /BaseFont /Times-BoldItalic /Encoding /MacRomanEncoding >> endobj
12 0 obj << /Type /Font /Subtype /Type1 /Name /F9 /BaseFont /Helvetica /Encoding /MacRomanEncoding >> endobj
13 0 obj << /Type /Font /Subtype /Type1 /Name /F10 /BaseFont /Helvetica-Bold /Encoding /MacRomanEncoding >> endobj
14 0 obj << /Type /Font /Subtype /Type1 /Name /F11 /BaseFont /Helvetica-Oblique /Encoding /MacRomanEncoding >> endobj
15 0 obj << /Type /Font /Subtype /Type1 /Name /F12 /BaseFont /Helvetica-BoldOblique /Encoding /MacRomanEncoding >> endobj
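Since the font objects differ only in their number, name and base font, they can be generated in a loop. A minimal Python sketch, assuming the objects are numbered consecutively from 4 as above (font_objects and BASE_FONTS are illustrative names, not part of any library):

```python
# Base font names for the twelve font objects listed above (objects 4 to 15).
BASE_FONTS = [
    "Courier", "Courier-Bold", "Courier-Oblique", "Courier-BoldOblique",
    "Times-Roman", "Times-Bold", "Times-Italic", "Times-BoldItalic",
    "Helvetica", "Helvetica-Bold", "Helvetica-Oblique", "Helvetica-BoldOblique",
]

def font_objects(first_object_number: int, base_fonts: list[str]) -> list[str]:
    """Return one font object line per base font, numbered consecutively."""
    lines = []
    for index, base_font in enumerate(base_fonts):
        lines.append(
            f"{first_object_number + index} 0 obj << /Type /Font /Subtype /Type1 "
            f"/Name /F{index + 1} /BaseFont /{base_font} "
            f"/Encoding /MacRomanEncoding >> endobj"
        )
    return lines

for line in font_objects(4, BASE_FONTS):
    print(line)
```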


For each page, there are several objects, listed below.
Object 34 is the page's resource dictionary. It shows that the page is going to use Object 4 (Courier) as F1, Object 5 (Courier Bold) as F2, etc.
Object 38 is the actual page content, with its length calculated as 2301 bytes. You'll need a basic knowledge of PostScript-style operators to work out how to write content onto pages.
Object 36 is the page "parent", which lists its "child" page objects.
Object 37 holds the child page definition: the page size, which content object to use, and which resource object to use.


VB.NET:
34 0 obj << /Procset [/PDF /Text ] /Font << /F1 4 0 R /F2 5 0 R /F3 6 0 R /F4 7 0 R /F5 8 0 R /F6 9 0 R /F7 10 0 R /F8 11 0 R /F9 12 0 R /F10 13 0 R /F11 14 0 R /F12 15 0 R /F13 16 0 R /F14 21 0 R /F15 24 0 R /F16 27 0 R  /F17 30 0 R>> >> endobj
38 0 obj << /Length 2301 >>
stream
 BT 0 g /F14 8 Tf 141.732 85.039 Td (abcdefghijklmnopqrstuvwxyz) Tj 0 Tc ET BT 0 g /F15 8 Tf 141.732 104.882 Td (abcdefghijklmnopqrstuvwxyz) (etc etc etc)
endstream
endobj
36 0 obj << /Type /Pages /Kids [37 0 R ] /Count 1 >> endobj
37 0 obj << /Type /Page /Parent 36 0 R /MediaBox [0 0 595.2630075 841.8866143] /Contents [38 0 R] /Resources 34 0 R >> endobj
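The /Length value of a content object has to match the number of bytes between "stream" and "endstream", so it is easier to compute it than to count by hand. A small Python sketch, assuming single-byte text and CR+LF line endings (content_object is a hypothetical helper name):

```python
def content_object(object_number: int, content: str, eol: str = "\r\n") -> str:
    """Wrap page content in a stream object, filling in /Length automatically."""
    data = content.encode("latin-1")  # assumption: single-byte characters only
    return (
        f"{object_number} 0 obj << /Length {len(data)} >>{eol}"
        f"stream{eol}"
        f"{content}{eol}"
        f"endstream{eol}"
        f"endobj{eol}"
    )

# One text run taken from the sample content above.
sample = " BT 0 g /F14 8 Tf 141.732 85.039 Td (abcdefghijklmnopqrstuvwxyz) Tj 0 Tc ET"
print(content_object(38, sample))
```

The line ending before "endstream" is not counted in the length, which is why only the content itself is measured.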


After the last object comes the xref table, which lists the position of every object. You'll need to write the file first, and then read it back to find the starting position of each object. For example, object 2 begins 1867 bytes into the file.
VB.NET:
xref
0 39
0000000000 65535 f
0000000010 00000 n
0000001867 00000 n
0000002002 00000 n
.
.
.
0000062643 00000 n
0000062704 00000 n
0000060284 00000 n


Then you have the trailer, which contains the Size (the number of objects + 1, because the xref table also has an entry for object 0), the root object (2) and the info object (1), followed by the position of the start of the xref table.
VB.NET:
trailer << /Size 39 /Root 2 0 R /Info 1 0 R >>
startxref
62833
%%EOF

As I said at the top, it's not easy :D
 
If I was trying to generate the following (which says Hello World)

VB.NET:
%PDF-1.4
1 0 obj
  <<
    /Title (Test Document)
    /Subject (Test PDF for evaluation)
    /Producer (Dynamically created using code written by Martin)
    /CreationDate (D:20071213142546)
    /Keywords (Put any keywords here)
    /MyKey (Copyright AnyCompany \(Whereever\) Limited)
    /Author (AnyCompany \(Whereever\) Limited)
    /Creator (AnyCompany \(Whereever\) Limited)
  >>
endobj
2 0 obj << /Type /Catalog /Outlines 3 0 R /OpenAction [11 0 R /Fit] /Pages 10 0 R /PageLayout /OneColumn /PageMode /UseNone >> endobj
3 0 obj << /Type /Outlines /Count 0 >> endobj
4 0 obj << /Type /Font /Subtype /Type1 /Name /F1 /BaseFont /Courier /Encoding /MacRomanEncoding >> endobj
5 0 obj << /Type /Font /Subtype /Type1 /Name /F2 /BaseFont /Courier-Bold /Encoding /MacRomanEncoding >> endobj
6 0 obj << /Type /Font /Subtype /Type1 /Name /F3 /BaseFont /Courier-Oblique /Encoding /MacRomanEncoding >> endobj
7 0 obj << /Type /Font /Subtype /Type1 /Name /F4 /BaseFont /Courier-BoldOblique /Encoding /MacRomanEncoding >> endobj
8 0 obj << /Procset [/PDF /Text ] /Font << /F1 4 0 R /F2 5 0 R /F3 6 0 R /F4 7 0 R>> >> endobj
9 0 obj << /Length 61 >>
stream
 BT 0 g /F1 10 Tf 141.732 700.157 Td (Hello world) Tj 0 Tc ET
endstream
endobj
10 0 obj << /Type /Pages /Kids [11 0 R ] /Count 1 >> endobj
11 0 obj << /Type /Page /Parent 10 0 R /MediaBox [0 0 595.2630075 841.8866143] /Contents [9 0 R] /Resources 8 0 R >> endobj

xref
0 12
0000000000 65535 f
0000000010 00000 n
0000000404 00000 n
0000000539 00000 n
0000000586 00000 n
0000000693 00000 n
0000000805 00000 n
0000000920 00000 n
0000001039 00000 n
0000001135 00000 n
0000001251 00000 n
0000001312 00000 n

trailer << /Size 12 /Root 2 0 R /Info 1 0 R >>
startxref
1439
%%EOF
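For comparison, here is a rough Python sketch that rebuilds the listing above and records each object's byte offset while the file is being assembled. That is a different technique from writing the file and reading it back, and all the names in it are made up for illustration. It assumes CR+LF line endings, which is what the offsets in the listing correspond to.

```python
EOL = b"\r\n"  # the offsets in the listing above match CR+LF line endings

INFO = [
    "1 0 obj",
    "  <<",
    "    /Title (Test Document)",
    "    /Subject (Test PDF for evaluation)",
    "    /Producer (Dynamically created using code written by Martin)",
    "    /CreationDate (D:20071213142546)",
    "    /Keywords (Put any keywords here)",
    "    /MyKey (Copyright AnyCompany \\(Whereever\\) Limited)",
    "    /Author (AnyCompany \\(Whereever\\) Limited)",
    "    /Creator (AnyCompany \\(Whereever\\) Limited)",
    "  >>",
    "endobj",
]
CONTENT = " BT 0 g /F1 10 Tf 141.732 700.157 Td (Hello world) Tj 0 Tc ET"
FONTS = ["Courier", "Courier-Bold", "Courier-Oblique", "Courier-BoldOblique"]

def build_objects() -> list[list[str]]:
    """Return objects 1 to 11 in order; each object is a list of text lines."""
    objects = [INFO]
    objects.append(["2 0 obj << /Type /Catalog /Outlines 3 0 R /OpenAction [11 0 R /Fit] "
                    "/Pages 10 0 R /PageLayout /OneColumn /PageMode /UseNone >> endobj"])
    objects.append(["3 0 obj << /Type /Outlines /Count 0 >> endobj"])
    for i, name in enumerate(FONTS):
        objects.append([f"{4 + i} 0 obj << /Type /Font /Subtype /Type1 /Name /F{i + 1} "
                        f"/BaseFont /{name} /Encoding /MacRomanEncoding >> endobj"])
    objects.append(["8 0 obj << /Procset [/PDF /Text ] /Font "
                    "<< /F1 4 0 R /F2 5 0 R /F3 6 0 R /F4 7 0 R>> >> endobj"])
    objects.append([f"9 0 obj << /Length {len(CONTENT)} >>", "stream", CONTENT,
                    "endstream", "endobj"])
    objects.append(["10 0 obj << /Type /Pages /Kids [11 0 R ] /Count 1 >> endobj"])
    objects.append(["11 0 obj << /Type /Page /Parent 10 0 R "
                    "/MediaBox [0 0 595.2630075 841.8866143] "
                    "/Contents [9 0 R] /Resources 8 0 R >> endobj"])
    return objects

def build_pdf() -> tuple[bytes, list[int], int]:
    """Return (file bytes, offset of each object, offset of the xref keyword)."""
    out = bytearray(b"%PDF-1.4" + EOL)
    offsets = []
    for lines in build_objects():
        offsets.append(len(out))  # byte position where this object starts
        for line in lines:
            out += line.encode("latin-1") + EOL
    out += EOL  # the blank line before xref
    xref_pos = len(out)
    out += b"xref" + EOL + f"0 {len(offsets) + 1}".encode("ascii") + EOL
    out += b"0000000000 65535 f" + EOL
    for offset in offsets:
        out += f"{offset:010d} 00000 n".encode("ascii") + EOL
    out += EOL
    out += f"trailer << /Size {len(offsets) + 1} /Root 2 0 R /Info 1 0 R >>".encode("ascii") + EOL
    out += b"startxref" + EOL + str(xref_pos).encode("ascii") + EOL + b"%%EOF" + EOL
    return bytes(out), offsets, xref_pos

pdf_bytes, offsets, xref_pos = build_pdf()
print(offsets, xref_pos)
```

If the lines are typed exactly as in the listing, the printed offsets should agree with the xref table and the startxref value shown there.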

I would do it like this. Write the following to a file :-

VB.NET:
%PDF-1.4
1 0 obj
  <<
    /Title (Test Document)
    /Subject (Test PDF for evaluation)
    /Producer (Dynamically created using code written by Martin)
    /CreationDate (D:20071213142546)
    /Keywords (Put any keywords here)
    /MyKey (Copyright AnyCompany \(Whereever\) Limited)
    /Author (AnyCompany \(Whereever\) Limited)
    /Creator (AnyCompany \(Whereever\) Limited)
  >>
endobj
2 0 obj << /Type /Catalog /Outlines 3 0 R /OpenAction [11 0 R /Fit] /Pages 10 0 R /PageLayout /OneColumn /PageMode /UseNone >> endobj
3 0 obj << /Type /Outlines /Count 0 >> endobj
4 0 obj << /Type /Font /Subtype /Type1 /Name /F1 /BaseFont /Courier /Encoding /MacRomanEncoding >> endobj
5 0 obj << /Type /Font /Subtype /Type1 /Name /F2 /BaseFont /Courier-Bold /Encoding /MacRomanEncoding >> endobj
6 0 obj << /Type /Font /Subtype /Type1 /Name /F3 /BaseFont /Courier-Oblique /Encoding /MacRomanEncoding >> endobj
7 0 obj << /Type /Font /Subtype /Type1 /Name /F4 /BaseFont /Courier-BoldOblique /Encoding /MacRomanEncoding >> endobj
8 0 obj << /Procset [/PDF /Text ] /Font << /F1 4 0 R /F2 5 0 R /F3 6 0 R /F4 7 0 R>> >> endobj
9 0 obj << /Length 61 >>
stream
 BT 0 g /F1 10 Tf 141.732 700.157 Td (Hello world) Tj 0 Tc ET
endstream
endobj
10 0 obj << /Type /Pages /Kids [11 0 R ] /Count 1 >> endobj
11 0 obj << /Type /Page /Parent 10 0 R /MediaBox [0 0 595.2630075 841.8866143] /Contents [9 0 R] /Resources 8 0 R >> endobj

The object numbers must be consecutive.

Close the file. Open it again, read the WHOLE file into a string, and close it. Search the string for each object in turn: look for CHR(10) & "1 0 obj", then 2, etc., until you can't find any more objects. Define an array of that length to hold the positions. Write the following to the file

VB.NET:
(blank line)
xref
0 12
where 12 is the "number of objects + 1"
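The search step described above could be sketched in Python roughly as follows. It assumes the file was read so that one character equals one byte (for example as Latin-1 text), and find_object_offsets is an illustrative name only:

```python
def find_object_offsets(pdf_text: str) -> list[int]:
    """Return the offset of "1 0 obj", "2 0 obj", ... until one is missing."""
    offsets = []
    number = 1
    while True:
        marker = f"\n{number} 0 obj"
        position = pdf_text.find(marker)
        if position == -1:
            break
        offsets.append(position + 1)  # skip the "\n" so the offset points at the digit
        number += 1
    return offsets

sample = "%PDF-1.4\r\n1 0 obj << >> endobj\r\n2 0 obj << >> endobj\r\n"
found = find_object_offsets(sample)
print(found, "-> xref header: 0", len(found) + 1)
```

Python counts positions from 0, so 1 is added to step over the line feed. In VB, InStr counts from 1, so the position it reports for the CHR(10) is already the offset of the object itself. A plain text search like this can be fooled if page content happens to contain a line that looks like an object header.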

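Open the file again, and write the opening line "0000000000 65535 f" (this is the entry for object 0, which is always free). Follow this with one line per object giving its starting position, padded to 10 digits. Each line must be exactly 20 bytes long including the line ending, which is the case when lines end in CR+LF.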
VB.NET:
0000000000 65535 f
0000000010 00000 n
0000000404 00000 n
0000000539 00000 n
0000000586 00000 n
0000000693 00000 n
0000000805 00000 n
0000000920 00000 n
0000001039 00000 n
0000001135 00000 n
0000001251 00000 n
0000001312 00000 n

i.e. object 1 starts at character 10, object 2 starts at character 404, etc.
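Formatting the xref lines is mostly a matter of zero padding. A minimal Python sketch, assuming generation number 0 for every object and CR+LF line endings (xref_table is a hypothetical helper):

```python
def xref_table(offsets: list[int]) -> bytes:
    """Build the xref section for objects 1..n, given their byte offsets."""
    lines = [b"xref", f"0 {len(offsets) + 1}".encode("ascii")]
    lines.append(b"0000000000 65535 f")  # entry for object 0, always free
    for offset in offsets:
        # 10-digit offset, 5-digit generation number, "n" for an in-use object
        lines.append(f"{offset:010d} {0:05d} n".encode("ascii"))
    return b"\r\n".join(lines) + b"\r\n"

table = xref_table([10, 404, 539])
print(table.decode("ascii"))
```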

Write the following to the file
VB.NET:
(blank line)
trailer << /Size 12 /Root 2 0 R /Info 1 0 R >>
startxref
where 12 is the "number of objects + 1", "2 0 R" is the number of the Catalog object, and "1 0 R" is the number of the Info object (the one holding the title definition).

Close the file, open it again and read all the text into a string. Search for CHR(10) & "xref" and find the position. Write the following to the file :-

VB.NET:
1439
%%EOF
where 1439 is the position of the start of the "xref".
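Putting the trailer and the closing lines together might look like this in Python. The function name and parameters are made up for illustration, and CR+LF line endings are assumed:

```python
def trailer(object_count: int, root: int, info: int, xref_offset: int) -> str:
    """Return the trailer, startxref and %%EOF lines, using CR+LF line endings."""
    eol = "\r\n"
    return (
        f"trailer << /Size {object_count + 1} /Root {root} 0 R /Info {info} 0 R >>{eol}"
        f"startxref{eol}"
        f"{xref_offset}{eol}"
        f"%%EOF{eol}"
    )

closing = trailer(object_count=11, root=2, info=1, xref_offset=1439)
print(closing)
```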

Oh, and (blank line) means an actual blank line, not the text :D

File is attached to this post as well.
 

Attachment: test.pdf (1.7 KB)
If you want to make your page content unreadable in a text editor, you can encode it using an ASCII85Encode function. This only obscures the text; it is not encryption, and any PDF reader will decode it.

So

VB.NET:
9 0 obj << /Length 61 >>
stream
 BT 0 g /F1 10 Tf 141.732 700.157 Td (Hello world) Tj 0 Tc ET
endstream
endobj

would become

VB.NET:
9 0 obj << /Length 82 /Filter /ASCII85Decode >>
stream
+@9$M0Hb!N01IZ=0ea_LAfrfb0ePC@1*AM00J5(;2]u(1+=KclCi"#4GAhM<A18X#C*52Q<+@%><$3;+~>
endstream
endobj
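Python's standard library happens to include an ASCII85 encoder, so the conversion can be sketched as below. base64.a85encode returns the bare encoding, and the "~>" end marker is appended by hand; ascii85_stream is an illustrative name. The exact length depends on the bytes encoded, so it need not match the 82 shown above.

```python
import base64

def ascii85_stream(object_number: int, content: bytes) -> bytes:
    """Return a stream object whose data is ASCII85-encoded."""
    # a85encode gives the bare encoding; "~>" marks the end of the data.
    encoded = base64.a85encode(content) + b"~>"
    header = f"{object_number} 0 obj << /Length {len(encoded)} /Filter /ASCII85Decode >>"
    parts = [header.encode("ascii"), b"stream", encoded, b"endstream", b"endobj"]
    return b"\r\n".join(parts) + b"\r\n"

content = b" BT 0 g /F1 10 Tf 141.732 700.157 Td (Hello world) Tj 0 Tc ET"
print(ascii85_stream(9, content).decode("ascii"))
```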
 
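When you add text content to a page, you'll need to escape the characters that are special inside a PDF string, using a backslash. Replace

\ with \\
( with \(
) with \)

Square brackets do not need escaping inside a string.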
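A small Python sketch of escaping text before it goes between the brackets of a PDF string (escape_pdf_text is a made-up helper name). The backslash is handled first so that the backslashes added for the brackets are not doubled afterwards:

```python
def escape_pdf_text(text: str) -> str:
    """Escape the characters that are special inside a PDF (...) string."""
    # Escape the backslash first, otherwise the backslashes added for the
    # parentheses would themselves be doubled.
    return (
        text.replace("\\", "\\\\")
            .replace("(", "\\(")
            .replace(")", "\\)")
    )

print(escape_pdf_text("AnyCompany (Whereever) Limited"))
```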

All positions have to be in points. To convert millimetres to points, multiply by 2.834645669 (that is, 72 / 25.4), so in the above example

VB.NET:
Tf 141.732 700.157 Td (Hello world)
translates to "Move to 50mm, 247mm and write 'Hello world'", with both distances measured from the bottom-left corner of the page.

When you add the XY position to the content, I found 3 decimal places is accurate enough for a simple text document.
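As a worked example of the conversion, here is a short Python sketch that turns a position in millimetres into the text operators used in this thread. The helper names are hypothetical, and the text is assumed to be already escaped:

```python
POINTS_PER_MM = 72 / 25.4  # 72 points per inch, 25.4 mm per inch

def mm_to_points(millimetres: float) -> float:
    """Convert millimetres to PDF points."""
    return millimetres * POINTS_PER_MM

def text_at(x_mm: float, y_mm: float, font: str, size: int, text: str) -> str:
    """Return a content line that writes text at the given position in mm."""
    x = mm_to_points(x_mm)
    y = mm_to_points(y_mm)
    # Three decimal places, as suggested above.
    return f" BT 0 g /{font} {size} Tf {x:.3f} {y:.3f} Td ({text}) Tj 0 Tc ET"

line = text_at(50, 247, "F1", 10, "Hello world")
print(line)
```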
 
You can return the PDF document directly from code if you don't want to store it.

Here is the ASP version of the above. Please note it is classic ASP and NOT ASP.NET code, so you will need to convert it before it will work in .NET.

VB.NET:
<%

Function getBinaryFile(strFilePath) 

  Dim TypeBinary, oStream 
  
  
  TypeBinary = 1   ' Indicates a binary file 
  
  ' Create the object 
  Set oStream = Server.CreateObject("ADODB.Stream") 
  
  ' Open our file 
  oStream.Open 
  
  ' Retrieve binary data from the file 
  oStream.Type = TypeBinary 
  oStream.LoadFromFile strFilePath 
  
  
  ' Return the binary data to the caller 
  getBinaryFile = oStream.Read 
  
  ' Close and release the ADO stream 
  oStream.Close 
  Set oStream = Nothing 

End Function 

dim strOutput
strOutput = ""
strOutput = strOutput & "%PDF-1.4" & vbcrlf
strOutput = strOutput & "1 0 obj" & vbcrlf
strOutput = strOutput & "  <<" & vbcrlf
strOutput = strOutput & "    /Title (Test Document)" & vbcrlf
strOutput = strOutput & "    /Subject (Test PDF for evaluation)" & vbcrlf
strOutput = strOutput & "    /Producer (Dynamically created using code written by Martin)" & vbcrlf
strOutput = strOutput & "    /CreationDate (D:20071213142546)" & vbcrlf
strOutput = strOutput & "    /Keywords (Put any keywords here)" & vbcrlf
strOutput = strOutput & "    /MyKey (Copyright AnyCompany \(Whereever\) Limited)" & vbcrlf
strOutput = strOutput & "    /Author (AnyCompany \(Whereever\) Limited)" & vbcrlf
strOutput = strOutput & "    /Creator (AnyCompany \(Whereever\) Limited)" & vbcrlf
strOutput = strOutput & "  >>" & vbcrlf
strOutput = strOutput & "endobj" & vbcrlf
strOutput = strOutput & "2 0 obj << /Type /Catalog /Outlines 3 0 R /OpenAction [11 0 R /Fit] /Pages 10 0 R /PageLayout /OneColumn /PageMode /UseNone >> endobj" & vbcrlf
strOutput = strOutput & "3 0 obj << /Type /Outlines /Count 0 >> endobj" & vbcrlf
strOutput = strOutput & "4 0 obj << /Type /Font /Subtype /Type1 /Name /F1 /BaseFont /Courier /Encoding /MacRomanEncoding >> endobj" & vbcrlf
strOutput = strOutput & "5 0 obj << /Type /Font /Subtype /Type1 /Name /F2 /BaseFont /Courier-Bold /Encoding /MacRomanEncoding >> endobj" & vbcrlf
strOutput = strOutput & "6 0 obj << /Type /Font /Subtype /Type1 /Name /F3 /BaseFont /Courier-Oblique /Encoding /MacRomanEncoding >> endobj" & vbcrlf
strOutput = strOutput & "7 0 obj << /Type /Font /Subtype /Type1 /Name /F4 /BaseFont /Courier-BoldOblique /Encoding /MacRomanEncoding >> endobj" & vbcrlf
strOutput = strOutput & "8 0 obj << /Procset [/PDF /Text ] /Font << /F1 4 0 R /F2 5 0 R /F3 6 0 R /F4 7 0 R>> >> endobj" & vbcrlf
strOutput = strOutput & "9 0 obj << /Length 61 >>" & vbcrlf
strOutput = strOutput & "stream" & vbcrlf
strOutput = strOutput & " BT 0 g /F1 10 Tf 141.732 700.157 Td (Hello world) Tj 0 Tc ET" & vbcrlf
strOutput = strOutput & "endstream" & vbcrlf
strOutput = strOutput & "endobj" & vbcrlf
strOutput = strOutput & "10 0 obj << /Type /Pages /Kids [11 0 R ] /Count 1 >> endobj" & vbcrlf
strOutput = strOutput & "11 0 obj << /Type /Page /Parent 10 0 R /MediaBox [0 0 595.2630075 841.8866143] /Contents [9 0 R] /Resources 8 0 R >> endobj" & vbcrlf
strOutput = strOutput & "" & vbcrlf
strOutput = strOutput & "xref" & vbcrlf
strOutput = strOutput & "0 12" & vbcrlf
strOutput = strOutput & "0000000000 65535 f" & vbcrlf
strOutput = strOutput & "0000000010 00000 n" & vbcrlf
strOutput = strOutput & "0000000404 00000 n" & vbcrlf
strOutput = strOutput & "0000000539 00000 n" & vbcrlf
strOutput = strOutput & "0000000586 00000 n" & vbcrlf
strOutput = strOutput & "0000000693 00000 n" & vbcrlf
strOutput = strOutput & "0000000805 00000 n" & vbcrlf
strOutput = strOutput & "0000000920 00000 n" & vbcrlf
strOutput = strOutput & "0000001039 00000 n" & vbcrlf
strOutput = strOutput & "0000001135 00000 n" & vbcrlf
strOutput = strOutput & "0000001251 00000 n" & vbcrlf
strOutput = strOutput & "0000001312 00000 n" & vbcrlf
strOutput = strOutput & "" & vbcrlf
strOutput = strOutput & "trailer << /Size 12 /Root 2 0 R /Info 1 0 R >>" & vbcrlf
strOutput = strOutput & "startxref" & vbcrlf
strOutput = strOutput & "1439" & vbcrlf
strOutput = strOutput & "%%EOF" & vbcrlf


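' Write the PDF text to a temporary file on the server
Dim fs, strTempName, fname, writefile
Set fs = Server.CreateObject("Scripting.FileSystemObject")
strTempName = fs.GetTempName
fname = Server.MapPath("/temp/" & strTempName & ".txt")
Set writefile = fs.OpenTextFile(fname, 2, True)   ' 2 = ForWriting, create if missing
writefile.Write strOutput   ' Write, not WriteLine, so nothing is added after %%EOF
writefile.Close
Set writefile = Nothing

' Send the file back to the browser as a PDF
Response.Buffer = True
Response.Clear
Response.ContentType = "application/pdf"

' Send the file contents once, as raw bytes
Response.BinaryWrite getBinaryFile(fname)
Response.Flush

' Remove the temporary file
If fs.FileExists(fname) Then fs.DeleteFile fname, True
Set fs = Nothing

Response.End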
%>
 
Thanks a lot...
It's going to be very hard but it's doable... or however you English guys say that...

That's about right :D

It may seem like hard work, but once you've written standard functions for the header, the xref positioning etc, the rest is easy :)

If you are going to try any text alignment, I would strongly suggest using a fixed-width font. If you don't like using Courier, let me know and I'll send you the code for a pleasant alternative :)
 