org.apache.pdfbox.cos
Class COSDocument

java.lang.Object
  extended by org.apache.pdfbox.cos.COSBase
      extended by org.apache.pdfbox.cos.COSDocument
All Implemented Interfaces:
COSObjectable

public class COSDocument
extends COSBase

This is the in-memory representation of the PDF document. You must call close() on this object when you are done using it.

Version:
$Revision: 1.28 $
Author:
Ben Litchfield

Constructor Summary
COSDocument()
          Constructor.
COSDocument(java.io.File scratchDir)
          Constructor that will create a scratch file in the given directory.
COSDocument(java.io.File scratchDir, boolean forceParsing)
          Constructor that will use a temporary file in the given directory for storage of the PDF streams.
COSDocument(RandomAccess file)
          Constructor that will use the given random access file for storage of the PDF streams.
COSDocument(RandomAccess scratchFile, boolean forceParsing)
          Constructor that will use the given random access file for storage of the PDF streams.
 
Method Summary
 java.lang.Object accept(ICOSVisitor visitor)
          Visitor pattern double dispatch method.
 void addXRefTable(java.util.Map<COSObjectKey,java.lang.Integer> xrefTable)
          Populate XRef HashMap with given values.
 void close()
          This will close all storage and delete the tmp files.
 void dereferenceObjectStreams()
          This method will search the list of objects for types of ObjStm.
protected  void finalize()
          Warns the user in the finalizer if the PDF document was not closed.
 COSObject getCatalog()
          This will get the document catalog.
 COSArray getDocumentID()
          This will get the document ID.
 COSDictionary getEncryptionDictionary()
          This will get the encryption dictionary if the document is encrypted or null if the document is not encrypted.
 java.lang.String getHeaderString()
           
 COSDictionary getLastSignatureDictionary()
           
 COSObject getObjectByType(COSName type)
          This will get the first dictionary object by type.
 COSObject getObjectByType(java.lang.String type)
          This will get the first dictionary object by type.
 COSObject getObjectFromPool(COSObjectKey key)
          This will get an object from the pool.
 java.util.List<COSObject> getObjects()
          This will get a list of all available objects.
 java.util.List<COSObject> getObjectsByType(COSName type)
          This will get all dictionary objects by type.
 java.util.List<COSObject> getObjectsByType(java.lang.String type)
          This will get all dictionary objects by type.
 RandomAccess getScratchFile()
          This will get the scratch file for this document.
 SignatureInterface getSignatureInterface()
           
 int getStartXref()
          Return the startXref Position of the parsed document.
 COSDictionary getTrailer()
          This will get the document trailer.
 float getVersion()
          This will get the version of this PDF document.
 java.util.Map<COSObjectKey,java.lang.Integer> getXrefTable()
          Returns the xrefTable which is a mapping of ObjectKeys to byte offsets in the file.
 boolean isEncrypted()
          This will tell if this is an encrypted document.
 void print()
          This will print contents to stdout.
 COSObject removeObject(COSObjectKey key)
          Removes an object from the object pool.
 void setDocumentID(COSArray id)
          This will set the document ID.
 void setEncryptionDictionary(COSDictionary encDictionary)
          This will set the encryption dictionary. It should only be called when encrypting the document.
 void setHeaderString(java.lang.String header)
           
 void setSignatureInterface(SignatureInterface signatureInterface)
           
 void setStartXref(int startXref)
          This method sets the startxref value of the document.
 void setTrailer(COSDictionary newTrailer)
          This will set the document trailer.
 void setVersion(float versionValue)
          This will set the version of this PDF document.
 void setWarnMissingClose(boolean warn)
          Controls whether this instance shall issue a warning if the PDF document wasn't closed properly through a call to the close() method.
 
Methods inherited from class org.apache.pdfbox.cos.COSBase
getCOSObject, getFilterManager, isDirect, isNeedToBeUpdate, setDirect, setNeedToBeUpdate
 
Methods inherited from class java.lang.Object
clone, equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

COSDocument

public COSDocument(RandomAccess scratchFile,
                   boolean forceParsing)
Constructor that will use the given random access file for storage of the PDF streams. The caller is responsible for deleting the storage that this file writes to, if necessary. The close method will still close the file.

Parameters:
scratchFile - the random access file to use for storage
forceParsing - flag to skip malformed or otherwise unparseable document content where possible

COSDocument

public COSDocument(java.io.File scratchDir,
                   boolean forceParsing)
            throws java.io.IOException
Constructor that will use a temporary file in the given directory for storage of the PDF streams. The temporary file is automatically removed when this document gets closed.

Parameters:
scratchDir - directory for the temporary file, or null to use the system default
forceParsing - flag to skip malformed or otherwise unparseable document content where possible
Throws:
java.io.IOException

COSDocument

public COSDocument()
            throws java.io.IOException
Constructor. Uses memory to store streams.

Throws:
java.io.IOException - If there is an error creating the tmp file.

COSDocument

public COSDocument(java.io.File scratchDir)
            throws java.io.IOException
Constructor that will create a scratch file in the given directory.

Parameters:
scratchDir - The directory to store a scratch file.
Throws:
java.io.IOException - If there is an error creating the tmp file.

COSDocument

public COSDocument(RandomAccess file)
Constructor that will use the given random access file for storage of the PDF streams. The caller is responsible for deleting the storage that this file writes to, if necessary. The close method will still close the file.

Parameters:
file - The random access file to use for storage.
Method Detail

getScratchFile

public RandomAccess getScratchFile()
This will get the scratch file for this document.

Returns:
The scratch file.

getObjectByType

public COSObject getObjectByType(java.lang.String type)
                          throws java.io.IOException
This will get the first dictionary object by type.

Parameters:
type - The type of the object.
Returns:
This will return an object with the specified type.
Throws:
java.io.IOException - If there is an error getting the object

getObjectByType

public COSObject getObjectByType(COSName type)
                          throws java.io.IOException
This will get the first dictionary object by type.

Parameters:
type - The type of the object.
Returns:
This will return an object with the specified type.
Throws:
java.io.IOException - If there is an error getting the object

getObjectsByType

public java.util.List<COSObject> getObjectsByType(java.lang.String type)
                                           throws java.io.IOException
This will get all dictionary objects by type.

Parameters:
type - The type of the object.
Returns:
This will return a list of all objects with the specified type.
Throws:
java.io.IOException - If there is an error getting the object

getObjectsByType

public java.util.List<COSObject> getObjectsByType(COSName type)
                                           throws java.io.IOException
This will get all dictionary objects by type.

Parameters:
type - The type of the object.
Returns:
This will return a list of all objects with the specified type.
Throws:
java.io.IOException - If there is an error getting the object
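
The difference between the single-object and list variants of the type lookup can be illustrated with a minimal sketch. It does not use PDFBox: plain maps stand in for dictionaries, the "Type" key stands in for the dictionary type entry, and the class name TypeLookupSketch and its helper dictionaryOfType are hypothetical. Returning null when nothing matches is an assumption of this sketch, not something the documentation above states.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration only; this is not PDFBox code.
class TypeLookupSketch {
    // Returns every dictionary whose "Type" entry equals the requested type.
    static List<Map<String, String>> getObjectsByType(List<Map<String, String>> objects, String type) {
        List<Map<String, String>> matches = new ArrayList<Map<String, String>>();
        for (Map<String, String> dictionary : objects) {
            if (type.equals(dictionary.get("Type"))) {
                matches.add(dictionary);
            }
        }
        return matches;
    }

    // Returns the first match, or null when nothing matches (an assumption of this sketch).
    static Map<String, String> getObjectByType(List<Map<String, String>> objects, String type) {
        List<Map<String, String>> matches = getObjectsByType(objects, type);
        return matches.isEmpty() ? null : matches.get(0);
    }

    // Builds a one-entry dictionary with the given type.
    static Map<String, String> dictionaryOfType(String type) {
        Map<String, String> dictionary = new HashMap<String, String>();
        dictionary.put("Type", type);
        return dictionary;
    }

    public static void main(String[] args) {
        List<Map<String, String>> objects = new ArrayList<Map<String, String>>();
        objects.add(dictionaryOfType("Page"));
        objects.add(dictionaryOfType("Catalog"));
        objects.add(dictionaryOfType("Page"));
        System.out.println("Page objects: " + getObjectsByType(objects, "Page").size());
        System.out.println("First Catalog: " + getObjectByType(objects, "Catalog"));
    }
}
```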

print

public void print()
This will print contents to stdout.


setVersion

public void setVersion(float versionValue)
This will set the version of this PDF document.

Parameters:
versionValue - The version of the PDF document.

getVersion

public float getVersion()
This will get the version of this PDF document.

Returns:
This document's version.

isEncrypted

public boolean isEncrypted()
This will tell if this is an encrypted document.

Returns:
true if this document is encrypted.

getEncryptionDictionary

public COSDictionary getEncryptionDictionary()
This will get the encryption dictionary if the document is encrypted or null if the document is not encrypted.

Returns:
The encryption dictionary.

getSignatureInterface

public SignatureInterface getSignatureInterface()

setEncryptionDictionary

public void setEncryptionDictionary(COSDictionary encDictionary)
This will set the encryption dictionary. It should only be called when encrypting the document.

Parameters:
encDictionary - The encryption dictionary.

getLastSignatureDictionary

public COSDictionary getLastSignatureDictionary()
                                         throws java.io.IOException
Throws:
java.io.IOException

getDocumentID

public COSArray getDocumentID()
This will get the document ID.

Returns:
The document id.

setDocumentID

public void setDocumentID(COSArray id)
This will set the document ID.

Parameters:
id - The document id.

setSignatureInterface

public void setSignatureInterface(SignatureInterface signatureInterface)

getCatalog

public COSObject getCatalog()
                     throws java.io.IOException
This will get the document catalog. Maybe this should move to an object at the PDFEdit level.

Returns:
The catalog, which is the root of all document activities.
Throws:
java.io.IOException - If no catalog can be found.

getObjects

public java.util.List<COSObject> getObjects()
This will get a list of all available objects.

Returns:
A list of all objects.

getTrailer

public COSDictionary getTrailer()
This will get the document trailer.

Returns:
The document trailer dictionary.

setTrailer

public void setTrailer(COSDictionary newTrailer)
This will set the document trailer. Note (MIT added): maybe this should not be supported, as the trailer is a persistence construct.

Parameters:
newTrailer - the document trailer dictionary

accept

public java.lang.Object accept(ICOSVisitor visitor)
                        throws COSVisitorException
Visitor pattern double dispatch method.

Specified by:
accept in class COSBase
Parameters:
visitor - The object to notify when visiting this object.
Returns:
any object, depending on the visitor implementation, or null
Throws:
COSVisitorException - If an error occurs while visiting this object.
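
Double dispatch means that the object being visited selects which visitor method runs, so the visitor needs no type checks. The following is a minimal sketch of that idea only. Every type in it (Node, NodeVisitor, TextNode, NumberNode, DescribingVisitor) is a hypothetical stand-in and not part of PDFBox, and it omits the checked exception that accept declares above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are not PDFBox types.
class DoubleDispatchSketch {
    interface NodeVisitor {
        Object visitFromText(TextNode node);
        Object visitFromNumber(NumberNode node);
    }

    interface Node {
        // Each node calls back the visitor method that matches its own type.
        Object accept(NodeVisitor visitor);
    }

    static class TextNode implements Node {
        final String value;
        TextNode(String value) { this.value = value; }
        public Object accept(NodeVisitor visitor) { return visitor.visitFromText(this); }
    }

    static class NumberNode implements Node {
        final int value;
        NumberNode(int value) { this.value = value; }
        public Object accept(NodeVisitor visitor) { return visitor.visitFromNumber(this); }
    }

    // A visitor that renders each node as a short description.
    static class DescribingVisitor implements NodeVisitor {
        public Object visitFromText(TextNode node) { return "text:" + node.value; }
        public Object visitFromNumber(NumberNode node) { return "number:" + node.value; }
    }

    // Visits every node with one visitor and collects whatever the visitor returns.
    static List<Object> describeAll(List<Node> nodes) {
        NodeVisitor visitor = new DescribingVisitor();
        List<Object> results = new ArrayList<Object>();
        for (Node node : nodes) {
            results.add(node.accept(visitor));
        }
        return results;
    }

    public static void main(String[] args) {
        List<Node> nodes = new ArrayList<Node>();
        nodes.add(new TextNode("catalog"));
        nodes.add(new NumberNode(7));
        System.out.println(describeAll(nodes));
    }
}
```

As in the method described above, the return type is Object because the result depends entirely on the visitor implementation.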

close

public void close()
           throws java.io.IOException
This will close all storage and delete the tmp files.

Throws:
java.io.IOException - If there is an error closing resources.
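
Because the document holds storage until it is closed, callers usually put the close call in a finally block so that it runs even when processing fails. The sketch below shows only that pattern. TrackedDocument is a hypothetical stand-in, not a PDFBox class, and its close method does not declare java.io.IOException as the real method does.

```java
// Hypothetical stand-ins for illustration only; these are not PDFBox types.
class CloseSketch {
    // Stand-in for a document that holds storage until it is closed.
    static class TrackedDocument {
        private boolean closed = false;

        void close() { closed = true; }

        boolean isClosed() { return closed; }
    }

    // Uses the document and closes it in a finally block, so close() runs
    // whether or not the work in the try block throws.
    static void useAndClose(TrackedDocument doc, boolean failDuringUse) {
        try {
            if (failDuringUse) {
                throw new IllegalStateException("simulated failure while using the document");
            }
        } finally {
            doc.close();
        }
    }

    public static void main(String[] args) {
        TrackedDocument ok = new TrackedDocument();
        useAndClose(ok, false);
        System.out.println("closed after normal use: " + ok.isClosed());

        TrackedDocument failing = new TrackedDocument();
        try {
            useAndClose(failing, true);
        } catch (IllegalStateException e) {
            System.out.println("closed after failure: " + failing.isClosed());
        }
    }
}
```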

finalize

protected void finalize()
                 throws java.io.IOException
Warns the user in the finalizer if the PDF document was not closed. The method also closes the document just in case, to avoid abandoned temporary files. It is still a good idea to close the PDF document as early as possible to conserve resources.

Overrides:
finalize in class java.lang.Object
Throws:
java.io.IOException - if an error occurs while closing the temporary files

setWarnMissingClose

public void setWarnMissingClose(boolean warn)
Controls whether this instance shall issue a warning if the PDF document wasn't closed properly through a call to the close() method. If the PDF document is held in a cache governed by soft references it is impossible to reliably close the document before the warning is raised. By default, the warning is enabled.

Parameters:
warn - true enables the warning, false disables it.

getHeaderString

public java.lang.String getHeaderString()
Returns:
Returns the headerString.

setHeaderString

public void setHeaderString(java.lang.String header)
Parameters:
header - The headerString to set.

dereferenceObjectStreams

public void dereferenceObjectStreams()
                              throws java.io.IOException
This method will search the list of objects for types of ObjStm. If it finds them, it will parse out all of the objects from the stream that it contains.

Throws:
java.io.IOException - If there is an error parsing the stream.

getObjectFromPool

public COSObject getObjectFromPool(COSObjectKey key)
                            throws java.io.IOException
This will get an object from the pool.

Parameters:
key - The object key.
Returns:
The object in the pool or a new one if it has not been parsed yet.
Throws:
java.io.IOException - If there is an error getting the proxy object.

removeObject

public COSObject removeObject(COSObjectKey key)
Removes an object from the object pool.

Parameters:
key - the object key
Returns:
the object that was removed or null if the object was not found
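
The pool behaviour described for getObjectFromPool and removeObject (return the existing entry or create one on first access; return null when removing an unknown key) can be sketched as follows. This is an illustration under stated assumptions, not the PDFBox implementation: ObjectPoolSketch and PooledObject are hypothetical names, and a plain long replaces the object key.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are not PDFBox types.
class ObjectPoolSketch {
    // Stand-in for a pooled object that may not have been parsed yet.
    static class PooledObject {
        final long objectNumber;
        PooledObject(long objectNumber) { this.objectNumber = objectNumber; }
    }

    private final Map<Long, PooledObject> pool = new HashMap<Long, PooledObject>();

    // Returns the pooled object for the key, creating and storing a placeholder on first access.
    PooledObject getObjectFromPool(long key) {
        PooledObject obj = pool.get(key);
        if (obj == null) {
            obj = new PooledObject(key);
            pool.put(key, obj);
        }
        return obj;
    }

    // Removes and returns the pooled object, or null if the key is not in the pool.
    PooledObject removeObject(long key) {
        return pool.remove(key);
    }

    public static void main(String[] args) {
        ObjectPoolSketch sketch = new ObjectPoolSketch();
        PooledObject first = sketch.getObjectFromPool(3L);
        System.out.println("same instance on second lookup: " + (first == sketch.getObjectFromPool(3L)));
        System.out.println("removed instance matches: " + (sketch.removeObject(3L) == first));
        System.out.println("second removal returns null: " + (sketch.removeObject(3L) == null));
    }
}
```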

addXRefTable

public void addXRefTable(java.util.Map<COSObjectKey,java.lang.Integer> xrefTable)
Populate XRef HashMap with given values. Each entry maps ObjectKeys to byte offsets in the file.

Parameters:
xrefTable - xref table entries to be added

getXrefTable

public java.util.Map<COSObjectKey,java.lang.Integer> getXrefTable()
Returns the xrefTable which is a mapping of ObjectKeys to byte offsets in the file.

Returns:
mapping of ObjectKeys to byte offsets
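
A mapping from object keys to byte offsets can be sketched with a plain HashMap. The code below is an illustration only: Key is a hypothetical stand-in for COSObjectKey that identifies an object by number and generation, and merging entries with putAll is an assumption about how added entries combine with existing ones.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are not PDFBox types.
class XrefSketch {
    // Stand-in for an object key: object number plus generation number.
    static final class Key {
        final long number;
        final long generation;

        Key(long number, long generation) {
            this.number = number;
            this.generation = generation;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key other = (Key) o;
            return number == other.number && generation == other.generation;
        }

        @Override
        public int hashCode() {
            return (int) (31 * number + generation);
        }
    }

    private final Map<Key, Integer> xrefTable = new HashMap<Key, Integer>();

    // Merges the given entries into the table (assumed putAll semantics).
    void addXRefTable(Map<Key, Integer> entries) {
        xrefTable.putAll(entries);
    }

    // Returns the mapping of keys to byte offsets in the file.
    Map<Key, Integer> getXrefTable() {
        return xrefTable;
    }

    public static void main(String[] args) {
        Map<Key, Integer> entries = new HashMap<Key, Integer>();
        entries.put(new Key(1, 0), 15);
        entries.put(new Key(2, 0), 120);

        XrefSketch sketch = new XrefSketch();
        sketch.addXRefTable(entries);
        System.out.println("offset of object 2, generation 0: " + sketch.getXrefTable().get(new Key(2, 0)));
    }
}
```

The key needs value-based equals and hashCode so that a lookup with a freshly built key finds the stored offset.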

setStartXref

public void setStartXref(int startXref)
This method sets the startxref value of the document. This will only be needed for incremental updates.

Parameters:
startXref - The startxref value of the document.

getStartXref

public int getStartXref()
Return the startXref Position of the parsed document. This will only be needed for incremental updates.

Returns:
An int with the old position of the startxref.