DocumentPage Class
Content and layout elements extracted from a page of the input document.
Constructor
DocumentPage(*args: Any, **kwargs: Any)
Variables
| Name | Description |
|---|---|
| page_number | 1-based page number in the input document. Required. |
| angle | The general orientation of the content in clockwise direction, measured in degrees between (-180, 180]. |
| width | The width of the image/PDF in pixels/inches, respectively. |
| height | The height of the image/PDF in pixels/inches, respectively. |
| unit | str or LengthUnit. The unit used by the width, height, and polygon properties. For images, the unit is "pixel". For PDF, the unit is "inch". Known values are: "pixel" and "inch". |
| spans | Location of the page in the reading order concatenated content. Required. |
| words | Extracted words from the page. |
| selection_marks | Extracted selection marks from the page. |
| lines | Extracted lines from the page, potentially containing both textual and visual elements. |
| barcodes | Extracted barcodes from the page. |
| formulas | Extracted formulas from the page. |
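As a rough illustration of how these fields might be read together, the sketch below summarizes one page. The `summarize_page` helper is hypothetical, and a `SimpleNamespace` stands in for a real `DocumentPage` so the snippet runs without the SDK or a service call. It assumes only the attribute names listed above and that optional fields may be `None`.

```python
from types import SimpleNamespace
from typing import Any


def summarize_page(page: Any) -> str:
    """Build a one-line summary from the attributes listed in the table above."""
    # Optional collections may be None, so fall back to an empty list.
    word_count = len(page.words or [])
    line_count = len(page.lines or [])
    size = f"{page.width} x {page.height} {page.unit}"
    return (
        f"page {page.page_number}: {size}, angle {page.angle}, "
        f"{word_count} words, {line_count} lines"
    )


# Stand-in object (assumption): mimics the attribute names only, not the real class.
fake_page = SimpleNamespace(
    page_number=1, angle=0.0, width=8.5, height=11.0, unit="inch",
    words=["w1", "w2", "w3"], lines=None,
)
print(summarize_page(fake_page))
```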
Methods
| Name | Description |
|---|---|
| as_dict | Return a dict that can be serialized to JSON using json.dump. |
| clear | |
| copy | |
| get | |
| items | |
| keys | |
| pop | |
| popitem | |
| setdefault | |
| update | |
| values | |
as_dict
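A minimal sketch of the serialization path described for `as_dict`, assuming only that the method returns a JSON-serializable dict. `StandInPage` and `page_to_json` are hypothetical names introduced for illustration, and the key names in the sample data are illustrative rather than the keys the real model emits.

```python
import json
from typing import Any, Dict


class StandInPage:
    """Hypothetical stand-in exposing only as_dict(); not the SDK class."""

    def __init__(self, data: Dict[str, Any]) -> None:
        self._data = data

    def as_dict(self) -> Dict[str, Any]:
        # Return a copy so callers cannot mutate the stored data.
        return dict(self._data)


def page_to_json(page: Any) -> str:
    # as_dict() is described above as returning a dict suitable for json.dump.
    return json.dumps(page.as_dict(), sort_keys=True)


page = StandInPage({"page_number": 1, "unit": "inch"})  # illustrative keys
print(page_to_json(page))
```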
clear
clear() -> None
copy
copy() -> Model
get
get(key: str, default: Any = None) -> Any
Parameters
| Name | Description |
|---|---|
| key | Required. |
| default | Default value: None |
items
items() -> ItemsView[str, Any]
keys
keys() -> KeysView[str]
pop
pop(key: str, default: Any = <object object>) -> Any
Parameters
| Name | Description |
|---|---|
| key | Required. |
| default | |
popitem
popitem() -> Tuple[str, Any]
setdefault
setdefault(key: str, default: Any = <object object>) -> Any
Parameters
| Name | Description |
|---|---|
| key | Required. |
| default | |
update
update(*args: Any, **kwargs: Any) -> None
values
values() -> ValuesView[Any]
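The method signatures above mirror Python's mutable mapping protocol. The sketch below exercises the same calls on a plain `dict`, used here as a stand-in (an assumption, so the snippet runs without the SDK). This reference does not state which key names a real `DocumentPage` exposes through these methods, so the keys shown are illustrative only.

```python
from typing import Any, Dict

# Stand-in (assumption): a plain dict with illustrative keys, not a real DocumentPage.
page: Dict[str, Any] = {"page_number": 2, "unit": "pixel"}

angle = page.get("angle", 0.0)   # missing key, so the default is returned
page.setdefault("words", [])     # inserts the key only because it is absent
unit = page.pop("unit", None)    # removes the key and returns its value
remaining = sorted(page.keys())  # keys left after the pop

for key, value in page.items():
    print(key, value)
```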
Attributes
angle
The general orientation of the content in clockwise direction, measured in degrees between (-180, 180].
angle: float | None
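The half-open interval (-180, 180] excludes -180 and includes 180. As a worked example of that convention, the hypothetical helper below wraps an arbitrary angle in degrees into the same range, which may be useful when combining `angle` with another rotation. It is a sketch and not part of the SDK.

```python
def normalize_angle(degrees: float) -> float:
    """Wrap an angle in degrees into the half-open interval (-180, 180]."""
    wrapped = degrees % 360.0  # Python's modulo yields a value in [0, 360)
    # Values above 180 map to their negative equivalent; exactly 180 stays 180.
    return wrapped - 360.0 if wrapped > 180.0 else wrapped


print(normalize_angle(270.0))   # -90.0
print(normalize_angle(-180.0))  # 180.0
```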
barcodes
Extracted barcodes from the page.
barcodes: List[_models.DocumentBarcode] | None
formulas
Extracted formulas from the page.
formulas: List[_models.DocumentFormula] | None
height
The height of the image/PDF in pixels/inches, respectively.
height: float | None
lines
Extracted lines from the page, potentially containing both textual and visual elements.
lines: List[_models.DocumentLine] | None
page_number
1-based page number in the input document. Required.
page_number: int
selection_marks
Extracted selection marks from the page.
selection_marks: List[_models.DocumentSelectionMark] | None
spans
Location of the page in the reading order concatenated content. Required.
spans: List[_models.DocumentSpan]
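To illustrate how spans can locate a page inside the concatenated content, the sketch below slices a string by span. It assumes each span exposes `offset` and `length` attributes, which this page does not document, and that offsets are counted in the same units as Python string indices. `page_text` is a hypothetical helper, and `SimpleNamespace` objects stand in for real spans.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def page_text(content: str, spans: Iterable[Any]) -> str:
    """Concatenate the slices of `content` covered by a page's spans."""
    return "".join(content[s.offset : s.offset + s.length] for s in spans)


# Illustrative data: two "pages" of text joined in reading order.
content = "First page.\nSecond page."
page_two_spans = [SimpleNamespace(offset=12, length=12)]  # stand-in spans
print(page_text(content, page_two_spans))
```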
unit
The unit used by the width, height, and polygon properties. For images, the unit is "pixel". For PDF, the unit is "inch". Known values are: "pixel" and "inch".
unit: str | _models.LengthUnit | None
width
The width of the image/PDF in pixels/inches, respectively.
width: float | None
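Because `width` and `height` are expressed in pixels for images and in inches for PDFs, code that needs a single unit has to branch on `unit`. The sketch below converts a page size to pixels. The `page_size_in_pixels` helper and its `dpi` parameter are assumptions introduced for illustration (the service does not report a DPI here), and `unit` is assumed to be passed as a plain string.

```python
from typing import Tuple


def page_size_in_pixels(
    width: float, height: float, unit: str, dpi: float = 96.0
) -> Tuple[float, float]:
    """Return (width, height) in pixels; `dpi` is a caller-chosen assumption."""
    if unit == "pixel":
        return width, height  # already in pixels
    if unit == "inch":
        return width * dpi, height * dpi  # scale inches by the chosen DPI
    raise ValueError(f"unsupported unit: {unit!r}")


print(page_size_in_pixels(8.5, 11.0, "inch", dpi=100.0))
```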
words
Extracted words from the page.
words: List[_models.DocumentWord] | None