DocumentPage Class

Content and layout elements extracted from a single page of the input document.

Constructor

DocumentPage(*args: Any, **kwargs: Any)

Variables

Name	Description
page_number	int 1-based page number in the input document. Required.
angle	float The general orientation of the content in clockwise direction, measured in degrees between (-180, 180].
width	float The width of the image/PDF in pixels/inches, respectively.
height	float The height of the image/PDF in pixels/inches, respectively.
unit	str or LengthUnit The unit used by the width, height, and polygon properties. For images, the unit is "pixel". For PDF, the unit is "inch". Known values are: "pixel" and "inch".
spans	list[DocumentSpan] Location of the page in the reading order concatenated content. Required.
words	list[DocumentWord] Extracted words from the page.
selection_marks	list[DocumentSelectionMark] Extracted selection marks from the page.
lines	list[DocumentLine] Extracted lines from the page, potentially containing both textual and visual elements.
barcodes	list[DocumentBarcode] Extracted barcodes from the page.
formulas	list[DocumentFormula] Extracted formulas from the page.
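
Several of the variables above are optional lists, so code that reads a page usually has to tolerate None. The following is a minimal sketch of that pattern, not part of the library: summarize_page and demo_page are hypothetical names, and a SimpleNamespace stands in for a real DocumentPage so the snippet runs without the SDK.

```python
from types import SimpleNamespace


def summarize_page(page):
    """Return element counts for a page-like object, treating None as empty."""
    optional_lists = ("words", "selection_marks", "lines", "barcodes", "formulas")
    summary = {"page_number": page.page_number}
    for name in optional_lists:
        # Optional collections may be None, so fall back to an empty list.
        summary[name] = len(getattr(page, name, None) or [])
    return summary


# Hypothetical stand-in for a DocumentPage (assumption, not SDK output).
demo_page = SimpleNamespace(
    page_number=1,
    words=["Hello", "world"],
    selection_marks=None,
    lines=["Hello world"],
    barcodes=None,
    formulas=[],
)
print(summarize_page(demo_page))
```

The same function should work on any object that exposes the attribute names listed in the table.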

Methods

as_dict	Return a dict that can be serialized to JSON using json.dump.
clear
copy
get
items
keys
pop
popitem
setdefault
update
values

as_dict

Return a dict that can be serialized to JSON using json.dump.

as_dict(*, exclude_readonly: bool = False) -> Dict[str, Any]

Keyword-Only Parameters

Name	Description
exclude_readonly	bool Whether to remove the readonly properties. Default value: False

Returns

Type	Description
dict	A JSON-compatible dict.
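
As an illustration of how the result of as_dict might be passed to the json module, here is a hedged sketch. page_to_json and FakePage are hypothetical names; FakePage only mimics the as_dict signature shown above, and the keys it returns are illustrative rather than the exact keys the library emits.

```python
import json


def page_to_json(page, *, exclude_readonly=False):
    """Serialize a page-like object by way of its as_dict() method."""
    data = page.as_dict(exclude_readonly=exclude_readonly)
    return json.dumps(data, indent=2, sort_keys=True)


class FakePage:
    """Hypothetical stand-in exposing the same as_dict signature."""

    def as_dict(self, *, exclude_readonly=False):
        # Illustrative payload only (assumption, not real service output).
        return {"pageNumber": 1, "unit": "inch", "width": 8.5, "height": 11.0}


text = page_to_json(FakePage())
print(text)
```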

clear

clear() -> None

copy

copy() -> Model

get

get(key: str, default: Any = None) -> Any

Parameters

Name	Description
key	Required
default	Default value: None

items

items() -> ItemsView[str, Any]

keys

keys() -> KeysView[str]

pop

pop(key: str, default: Any = <object object>) -> Any

Parameters

Name	Description
key	Required
default

popitem

popitem() -> Tuple[str, Any]

setdefault

setdefault(key: str, default: Any = <object object>) -> Any

Parameters

Name	Description
key	Required
default

update

update(*args: Any, **kwargs: Any) -> None

values

values() -> ValuesView[Any]
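
The methods from clear through values mirror the familiar dict interface. The sketch below shows the kind of mapping-style access they suggest, using a plain dict as a stand-in; present_fields is a hypothetical helper and the key names in demo are assumptions, since this page does not say which keys the mapping exposes.

```python
def present_fields(mapping):
    """List keys whose values are not None, in sorted order."""
    return sorted(key for key, value in mapping.items() if value is not None)


# Plain dict standing in for a page (assumption: key names are illustrative).
demo = {"pageNumber": 1, "angle": None, "unit": "inch"}

fields = present_fields(demo)          # keys that carry a value
unit = demo.get("unit", "pixel")       # existing key: default is ignored
barcodes = demo.get("barcodes", [])    # missing key: default is returned
print(fields, unit, barcodes)
```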

Attributes

angle

The general orientation of the content in clockwise direction, measured in degrees between (-180, 180].

angle: float | None
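
The interval (-180, 180] excludes -180 and includes 180. If an angle from another source has to be compared with this attribute, it can be wrapped into the same interval first. The helper below is a hypothetical sketch and not part of the library.

```python
def normalize_angle(degrees):
    """Map any angle in degrees into the half-open interval (-180, 180]."""
    wrapped = degrees % 360.0  # Python's modulo yields a value in [0, 360)
    # Values above 180 are shifted down by a full turn; exactly 180 is kept.
    return wrapped - 360.0 if wrapped > 180.0 else wrapped


print(normalize_angle(190.0))
print(normalize_angle(-180.0))
```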

barcodes

Extracted barcodes from the page.

barcodes: List[_models.DocumentBarcode] | None

formulas

Extracted formulas from the page.

formulas: List[_models.DocumentFormula] | None

height

The height of the image/PDF in pixels/inches, respectively.

height: float | None

lines

Extracted lines from the page, potentially containing both textual and visual elements.

lines: List[_models.DocumentLine] | None

page_number

1-based page number in the input document. Required.

page_number: int

selection_marks

Extracted selection marks from the page.

selection_marks: List[_models.DocumentSelectionMark] | None

spans

Location of the page in the reading order concatenated content. Required.

spans: List[_models.DocumentSpan]
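
A span locates a piece of the concatenated content, so a page's text can be recovered by slicing that content. The sketch below assumes each span has offset and length attributes and that offsets count Python string characters; neither detail is stated on this page. page_text is a hypothetical helper, and SimpleNamespace stands in for DocumentSpan.

```python
from types import SimpleNamespace


def page_text(content, spans):
    """Join the slices of content that the given spans point at."""
    return "".join(content[s.offset : s.offset + s.length] for s in spans)


# Illustrative data only (assumption): two pages concatenated in reading order.
content = "Page one text. Page two text."
second_page_spans = [SimpleNamespace(offset=15, length=14)]

print(page_text(content, second_page_spans))
```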

unit

The unit used by the width, height, and polygon properties. For images, the unit is "pixel". For PDF, the unit is "inch". Known values are: "pixel" and "inch".

unit: str | _models.LengthUnit | None

width

The width of the image/PDF in pixels/inches, respectively.

width: float | None
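
Because width and height depend on unit, code that needs a fixed unit has to branch on it. The following is a hedged sketch that converts inch-based sizes to typographic points (72 per inch) and declines to convert pixel-based sizes, since the page carries no resolution. size_in_points is a hypothetical helper, and it assumes unit arrives as a plain string.

```python
POINTS_PER_INCH = 72.0


def size_in_points(width, height, unit):
    """Return (width, height) in points, or None when conversion is not possible."""
    if width is None or height is None:
        return None
    if unit == "inch":
        return (width * POINTS_PER_INCH, height * POINTS_PER_INCH)
    # Pixel sizes would need a DPI value that is not part of the page.
    return None


print(size_in_points(8.5, 11.0, "inch"))
```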

words

Extracted words from the page.

words: List[_models.DocumentWord] | None
