Pdf/FileParser.php
Zend Framework
LICENSE
This source file is subject to the new BSD license that is bundled with this package in the file LICENSE.txt.
It is also available through the world-wide-web at this URL:
http://framework.zend.com/license/new-bsd
If you did not receive a copy of the license and are unable to obtain it through the world-wide-web, please send an email to [email protected] so we can send you a copy immediately.
- Category
- Zend
- Copyright
- Copyright (c) 2005-2012 Zend Technologies USA Inc. (http://www.zend.com)
- License
- New BSD License
- Package
- Zend_Pdf
- Subpackage
- FileParser
- Version
- $Id: FileParser.php 24593 2012-01-05 20:35:02Z matthew $
\Zend_Pdf_FileParser
Abstract utility class for parsing binary files.
Provides a library of methods to quickly navigate and extract various data types (signed and unsigned integers, floating- and fixed-point numbers, strings, etc.) from the file.
File access is managed via a Zend_Pdf_FileParserDataSource object. This allows the same parser code to work with many different data sources: in-memory objects, filesystem files, etc.
- Children
- \Zend_Pdf_FileParser_Image
- \Zend_Pdf_FileParser_Font
Constants
Properties


\Zend_Pdf_FileParserDataSource $_dataSource = null
Object representing the data source to be parsed.
- Default
- null


boolean $_isParsed = false
Flag indicating that the file has been successfully parsed.
- Default
- false
- Type
- boolean
Methods


__construct(\Zend_Pdf_FileParserDataSource $dataSource) : void
Object constructor.
Verifies that the data source has been properly initialized.
Name | Type | Description |
---|---|---|
$dataSource | \Zend_Pdf_FileParserDataSource |
Exception | Description |
---|---|
\Zend_Pdf_Exception |


getDataSource() : \Zend_Pdf_FileParserDataSource
Returns the data source object representing the file being parsed.
Type | Description |
---|---|
\Zend_Pdf_FileParserDataSource |


isBitSet(integer $bit, integer $bitField) : boolean
Returns true if the specified bit is set in the integer bitfield.
Name | Type | Description |
---|---|---|
$bit | integer | Bit number to test (0-31). |
$bitField | integer | Bit field to test. |
Type | Description |
---|---|
boolean |
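The bit test can be pictured as a shift-and-mask operation. The following is a minimal standalone sketch, not the Zend_Pdf_FileParser source: the function name `isBitSetSketch` is hypothetical, and PHP 7 or later is assumed for the scalar type declarations.

```php
<?php
// Hypothetical standalone helper; it only illustrates the shift-and-mask idea.
function isBitSetSketch(int $bit, int $bitField): bool
{
    // Move a single 1 into position $bit, then mask the field with it.
    return ($bitField & (1 << $bit)) !== 0;
}

$flags = 0b00000101; // bits 0 and 2 are set

var_dump(isBitSetSketch(0, $flags)); // true
var_dump(isBitSetSketch(1, $flags)); // false
var_dump(isBitSetSketch(2, $flags)); // true
```

Comparing against zero rather than testing for a positive result keeps the check valid for bit 31 on a 32-bit build, where the mask itself is negative.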


isParsed() : boolean
Returns true if the file has been successfully parsed.
Type | Description |
---|---|
boolean |


isScreened() : boolean
Returns true if the file has passed a cursory validation check.
Type | Description |
---|---|
boolean |


moveToOffset(integer $offset) : void
Convenience wrapper for the data source object's moveToOffset() method.
Name | Type | Description |
---|---|---|
$offset | integer | Destination byte offset. |
Exception | Description |
---|---|
\Zend_Pdf_Exception |


parse() : void
Reads and parses the complete binary file.
Must set $this->_isParsed to true if successful.
Exception | Description |
---|---|
\Zend_Pdf_Exception |


readBytes(integer $byteCount) : string
Convenience wrapper for the data source object's readBytes() method.
Name | Type | Description |
---|---|---|
$byteCount | integer | Number of bytes to read. |
Type | Description |
---|---|
string |
Exception | Description |
---|---|
\Zend_Pdf_Exception |


readFixed(integer $mantissaBits, integer $fractionBits, integer $byteOrder = \Zend_Pdf_FileParser::BYTE_ORDER_BIG_ENDIAN) : float
Reads the signed fixed-point number from the binary file at the current byte offset.
Common fixed-point sizes are 2.14 and 16.16.
Advances the offset by the number of bytes read. Throws an exception if an error occurs.
Name | Type | Description |
---|---|---|
$mantissaBits | integer | Number of bits in the mantissa |
$fractionBits | integer | Number of bits in the fraction |
$byteOrder | integer | (optional) Big- or little-endian byte order. Use the BYTE_ORDER_ constants defined in Zend_Pdf_FileParser. If omitted, uses big-endian. |
Type | Description |
---|---|
float |
Exception | Description |
---|---|
\Zend_Pdf_Exception |
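To make the 2.14 and 16.16 formats concrete: a signed fixed-point value can be decoded by reading the bytes as a signed integer and dividing by 2 raised to the number of fraction bits. The sketch below assumes big-endian input and a 64-bit PHP 7+ build. The function name `decodeFixedSketch` is hypothetical, and this is not necessarily how readFixed() is implemented.

```php
<?php
// Hypothetical helper: decode a big-endian signed fixed-point number
// held in a raw byte string (2 bytes for 2.14, 4 bytes for 16.16).
function decodeFixedSketch(string $bytes, int $fractionBits): float
{
    $size = strlen($bytes);
    if ($size < 1 || $size > 4) {
        throw new InvalidArgumentException("Expected 1-4 bytes, got $size");
    }

    // Assemble the raw integer, most significant byte first.
    $raw = 0;
    for ($i = 0; $i < $size; $i++) {
        $raw = ($raw << 8) | ord($bytes[$i]);
    }

    // Two's complement: if the top bit is set, the value is negative.
    $totalBits = $size * 8;
    if ($raw & (1 << ($totalBits - 1))) {
        $raw -= 1 << $totalBits;
    }

    // Scale down by the number of fraction bits.
    return $raw / (1 << $fractionBits);
}

var_dump(decodeFixedSketch("\x40\x00", 14));         // 2.14 format: 1.0
var_dump(decodeFixedSketch("\xC0\x00", 14));         // 2.14 format: -1.0
var_dump(decodeFixedSketch("\x00\x01\x80\x00", 16)); // 16.16 format: 1.5
```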


readInt(integer $size, integer $byteOrder = \Zend_Pdf_FileParser::BYTE_ORDER_BIG_ENDIAN) : integer
Reads the signed integer value from the binary file at the current byte offset.
Advances the offset by the number of bytes read. Throws an exception if an error occurs.
Name | Type | Description |
---|---|---|
$size | integer | Size of integer in bytes: 1-4 |
$byteOrder | integer | (optional) Big- or little-endian byte order. Use the BYTE_ORDER_ constants defined in Zend_Pdf_FileParser. If omitted, uses big-endian. |
Type | Description |
---|---|
integer |
Exception | Description |
---|---|
\Zend_Pdf_Exception |
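As an illustration of what reading a 1-4 byte signed integer involves, the sketch below assembles the bytes and applies two's-complement sign extension. It operates on a plain byte string rather than a data source object. The function name and the boolean `$bigEndian` parameter are hypothetical stand-ins for the BYTE_ORDER_ constants, and a 64-bit PHP 7+ build is assumed.

```php
<?php
// Hypothetical helper: decode a signed integer of 1-4 bytes from a byte string.
function decodeSignedIntSketch(string $bytes, bool $bigEndian = true): int
{
    $size = strlen($bytes);
    if ($size < 1 || $size > 4) {
        throw new InvalidArgumentException("Size must be 1-4 bytes, got $size");
    }

    // Normalize little-endian input so the loop below always sees
    // the most significant byte first.
    if (!$bigEndian) {
        $bytes = strrev($bytes);
    }

    $value = 0;
    for ($i = 0; $i < $size; $i++) {
        $value = ($value << 8) | ord($bytes[$i]);
    }

    // Sign-extend: a set top bit means the number is negative.
    $bits = $size * 8;
    if ($value & (1 << ($bits - 1))) {
        $value -= 1 << $bits;
    }

    return $value;
}

var_dump(decodeSignedIntSketch("\xFF\xFE"));        // -2
var_dump(decodeSignedIntSketch("\xFE\xFF", false)); // -2 (little-endian)
var_dump(decodeSignedIntSketch("\x7F"));            // 127
```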


readStringMacRoman(integer $byteCount, string $characterSet = '') : string
Reads the Mac Roman-encoded string from the binary file at the current byte offset.
You must supply the desired resulting character set.
Advances the offset by the number of bytes read. Throws an exception if an error occurs.
Name | Type | Description |
---|---|---|
$byteCount | integer | Number of bytes (characters) to return. |
$characterSet | string | (optional) Desired resulting character set. You may use any character set supported by iconv(). If omitted, uses the current locale. |
Type | Description |
---|---|
string |
Exception | Description |
---|---|
\Zend_Pdf_Exception |


readStringPascal(string $characterSet = '', integer $lengthBytes = 1) : string
Reads the Pascal string from the binary file at the current byte offset.
The length of the Pascal string is determined by reading the length bytes which precede the character data. You must supply the desired resulting character set.
Advances the offset by the number of bytes read. Throws an exception if an error occurs.
Name | Type | Description |
---|---|---|
$characterSet | string | (optional) Desired resulting character set. You may use any character set supported by iconv(). If omitted, uses the current locale. |
$lengthBytes | integer | (optional) Number of bytes that make up the length. Default is 1. |
Type | Description |
---|---|
string |
Exception | Description |
---|---|
\Zend_Pdf_Exception |
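The length-prefixed layout of a Pascal string can be sketched as follows. This is a standalone illustration on an in-memory byte string, not the library code. The function name is hypothetical, multi-byte length prefixes are assumed to be big-endian, and character set conversion is left out.

```php
<?php
// Hypothetical helper: extract a Pascal string starting at $offset.
// Returns the raw string bytes and the offset just past the string.
function parsePascalStringSketch(string $data, int $offset = 0, int $lengthBytes = 1): array
{
    if ($offset < 0 || $offset + $lengthBytes > strlen($data)) {
        throw new RuntimeException('Not enough data for the length prefix');
    }

    // Read the length prefix (assumed big-endian when longer than one byte).
    $length = 0;
    for ($i = 0; $i < $lengthBytes; $i++) {
        $length = ($length << 8) | ord($data[$offset + $i]);
    }

    $start = $offset + $lengthBytes;
    if ($start + $length > strlen($data)) {
        throw new RuntimeException('Length prefix points past the end of the data');
    }

    // The character data follows the prefix immediately.
    $text = (string) substr($data, $start, $length);

    return [$text, $start + $length];
}

[$text, $next] = parsePascalStringSketch("\x05Hello world");
var_dump($text); // "Hello"
var_dump($next); // 6
```

Returning the next offset mirrors the documented behavior of advancing the read position by the number of bytes consumed.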


readStringUTF16(integer $byteCount, integer $byteOrder = \Zend_Pdf_FileParser::BYTE_ORDER_BIG_ENDIAN, string $characterSet = '') : string
Reads the Unicode UTF-16-encoded string from the binary file at the current byte offset.
The byte order of the UTF-16 string must be specified. You must also supply the desired resulting character set.
Advances the offset by the number of bytes read. Throws an exception if an error occurs.
Name | Type | Description |
---|---|---|
$byteCount | integer | Number of bytes (characters * 2) to return. |
$byteOrder | integer | (optional) Big- or little-endian byte order. Use the BYTE_ORDER_ constants defined in Zend_Pdf_FileParser. If omitted, uses big-endian. |
$characterSet | string | (optional) Desired resulting character set. You may use any character set supported by iconv(). If omitted, uses the current locale. |
Type | Description |
---|---|
string |
Exception | Description |
---|---|
\Zend_Pdf_Exception |
- Todo
- Consider changing $byteCount to a character count. They are not always equivalent (in the case of surrogates).
- Todo
- Make $byteOrder optional if there is a byte-order mark (BOM) in the string being extracted.
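The first Todo is easier to see with an example. The sketch below converts UTF-16 bytes with iconv() and shows that a character outside the Basic Multilingual Plane occupies four bytes (a surrogate pair) rather than two. The function name and the boolean `$bigEndian` parameter are hypothetical, the default target of UTF-8 is an assumption for the example, and the iconv extension must be available.

```php
<?php
// Hypothetical helper: convert raw UTF-16 bytes to another character set.
function utf16ToCharsetSketch(string $bytes, bool $bigEndian = true, string $characterSet = 'UTF-8'): string
{
    $source = $bigEndian ? 'UTF-16BE' : 'UTF-16LE';

    $converted = iconv($source, $characterSet, $bytes);
    if ($converted === false) {
        throw new RuntimeException("Unable to convert from $source to $characterSet");
    }

    return $converted;
}

// Two characters, four bytes: the usual "characters * 2" relationship.
var_dump(utf16ToCharsetSketch("\x00H\x00i")); // "Hi"

// One character (U+1D11E), also four bytes: a surrogate pair.
$clef = utf16ToCharsetSketch("\xD8\x34\xDD\x1E");
var_dump(bin2hex($clef)); // "f09d849e"
```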


readUInt(integer $size, integer $byteOrder = \Zend_Pdf_FileParser::BYTE_ORDER_BIG_ENDIAN) : integer
Reads the unsigned integer value from the binary file at the current byte offset.
Advances the offset by the number of bytes read. Throws an exception if an error occurs.
NOTE: If you ask for a 4-byte unsigned integer on a 32-bit machine, the resulting value WILL BE SIGNED because PHP uses signed integers internally for everything. To guarantee portability, be sure to use bitwise operators on large unsigned integers!
Name | Type | Description |
---|---|---|
$size | integer | Size of integer in bytes: 1-4 |
$byteOrder | integer | (optional) Big- or little-endian byte order. Use the BYTE_ORDER_ constants defined in Zend_Pdf_FileParser. If omitted, uses big-endian. |
Type | Description |
---|---|
integer |
Exception | Description |
---|---|
\Zend_Pdf_Exception |
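The portability note above can be illustrated with a short sketch. It uses PHP's built-in unpack() instead of readUInt(), and the helper name is hypothetical. The point is that shifting and masking yields the same answer whether or not the 32-bit value came back negative.

```php
<?php
// Hypothetical helper: return the most significant byte of a big-endian
// 32-bit value using only bitwise operators.
function highByteOfUInt32Sketch(string $fourBytes): int
{
    if (strlen($fourBytes) !== 4) {
        throw new InvalidArgumentException('Expected exactly 4 bytes');
    }

    // 'N' is an unsigned 32-bit big-endian integer. On a 32-bit build the
    // result may be negative because PHP integers are signed.
    $value = unpack('N', $fourBytes)[1];

    // Shift, then mask: the mask discards any sign-extension bits.
    return ($value >> 24) & 0xFF;
}

var_dump(highByteOfUInt32Sketch("\xFF\x00\x00\x01")); // 255
```

Arithmetic comparisons such as `$value > 0x7FFFFFFF` are the kind of test that behaves differently across platforms, so the sketch avoids them.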


screen() : void
Performs a cursory check to verify that the binary file is in the expected format.
Intended to quickly weed out obviously bogus files.
Must set $this->_isScreened to true if successful.
Exception | Description |
---|---|
\Zend_Pdf_Exception |
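A cursory check of this kind often amounts to comparing the first few bytes against a known signature. The sketch below does this for the PNG file signature on an in-memory string. It is a hypothetical standalone function, not a screen() implementation from any Zend_Pdf_FileParser subclass.

```php
<?php
// Hypothetical helper: quick signature test for PNG data.
function passesPngScreenSketch(string $data): bool
{
    // Every PNG file begins with this fixed 8-byte signature.
    $signature = "\x89PNG\r\n\x1a\n";

    // Compare only the leading 8 bytes; shorter input cannot match.
    return strncmp($data, $signature, 8) === 0;
}

var_dump(passesPngScreenSketch("\x89PNG\r\n\x1a\n" . 'rest of file')); // true
var_dump(passesPngScreenSketch('%PDF-1.4'));                           // false
```

A check like this rejects obviously wrong input cheaply, leaving full validation to the later parse step.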


skipBytes(integer $byteCount) : void
Convenience wrapper for the data source object's skipBytes() method.
Name | Type | Description |
---|---|---|
$byteCount | integer | Number of bytes to skip. |
Exception | Description |
---|---|
\Zend_Pdf_Exception |