pdftron::PDF::Word Class Reference

#include <TextExtractor.h>

Public Member Functions

int GetNumGlyphs ()
Rect GetBBox ()
void GetBBox (double out_bbox[4])
std::vector< double > GetQuad ()
void GetQuad (double out_quad[8])
std::vector< double > GetGlyphQuad (int glyph_idx)
void GetGlyphQuad (int glyph_idx, double out_quad[8])
Style GetCharStyle (int char_idx)
Style GetStyle ()
int GetStringLen ()
const Unicode * GetString ()
Word GetNextWord ()
int GetCurrentNum ()
bool IsValid ()
bool operator== (const Word &) const
bool operator!= (const Word &) const
 Word ()

Detailed Description

A TextExtractor::Word object represents a word on a PDF page. Each word contains a sequence of characters in one or more styles (see TextExtractor::Style).

Definition at line 430 of file TextExtractor.h.

Constructor & Destructor Documentation

◆ Word()

pdftron::PDF::Word::Word ( )

Member Function Documentation

◆ GetBBox() [1/2]

Rect pdftron::PDF::Word::GetBBox ( )
Parameters
out_bbox  The bounding box for this word (in unrotated page coordinates).
Note
To account for the effect of page '/Rotate' attribute, transform all points using page.GetDefaultMatrix().
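
The note above can be illustrated with a small, SDK-free sketch. It assumes the box is laid out as {x1, y1, x2, y2} and uses a hypothetical `Affine2D` struct as a stand-in for whatever matrix type page.GetDefaultMatrix() returns; neither the struct nor `TransformBBox` is part of the library. The idea is to transform all four corners and then take the enclosing axis-aligned box, since a rotation swaps which corner is the minimum.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Hypothetical stand-in for a 2D affine matrix (not the SDK type).
// Maps (x, y) to (a*x + c*y + h, b*x + d*y + v).
struct Affine2D {
    double a, b, c, d, h, v;
    void Apply(double& x, double& y) const {
        const double nx = a * x + c * y + h;
        const double ny = b * x + d * y + v;
        x = nx;
        y = ny;
    }
};

// Transforms the four corners of box {x1, y1, x2, y2} and returns the
// axis-aligned box enclosing them, as {min_x, min_y, max_x, max_y}.
std::array<double, 4> TransformBBox(const Affine2D& m,
                                    const std::array<double, 4>& box) {
    // A degenerate (non-invertible) matrix would collapse the box.
    assert(m.a * m.d - m.b * m.c != 0.0);
    double xs[4] = {box[0], box[2], box[2], box[0]};
    double ys[4] = {box[1], box[1], box[3], box[3]};
    for (int i = 0; i < 4; ++i) {
        m.Apply(xs[i], ys[i]);
    }
    return {*std::min_element(xs, xs + 4), *std::min_element(ys, ys + 4),
            *std::max_element(xs, xs + 4), *std::max_element(ys, ys + 4)};
}
```

With the real SDK, the values from GetBBox(out_bbox) would presumably be fed through the page's default matrix in the same corner-by-corner way.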

◆ GetBBox() [2/2]

void pdftron::PDF::Word::GetBBox ( double out_bbox[4])

◆ GetCharStyle()

Style pdftron::PDF::Word::GetCharStyle ( int char_idx)
Parameters
char_idx  The index of a character in this word.
Returns
The style associated with a given character.

◆ GetCurrentNum()

int pdftron::PDF::Word::GetCurrentNum ( )
Returns
The index of this word in the current line. The first word in the line returns 0, and the last word returns (line.GetNumWords()-1).

◆ GetGlyphQuad() [1/2]

std::vector< double > pdftron::PDF::Word::GetGlyphQuad ( int glyph_idx)
Parameters
glyph_idx  The index of a glyph in this word.
out_quad  The quadrilateral representing a tight bounding box for a given glyph in the word (in unrotated page coordinates).

◆ GetGlyphQuad() [2/2]

void pdftron::PDF::Word::GetGlyphQuad ( int glyph_idx,
double out_quad[8] )

◆ GetNextWord()

Word pdftron::PDF::Word::GetNextWord ( )
Returns
The next word on the current line.
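
A minimal sketch of walking a line with GetNextWord() follows. It assumes that the word returned after the last word on the line reports IsValid() == false, which this page does not state explicitly. `CollectWordIndices` and `MockWord` are hypothetical names introduced only so the loop can run without the SDK; the template accepts any type exposing IsValid(), GetNextWord(), and GetCurrentNum().

```cpp
#include <cassert>
#include <vector>

// Walks a line starting at its first word and records each word's index.
// WordT is any type with IsValid(), GetNextWord(), and GetCurrentNum().
template <typename WordT>
std::vector<int> CollectWordIndices(WordT first) {
    std::vector<int> indices;
    for (WordT w = first; w.IsValid(); w = w.GetNextWord()) {
        // Documented behavior: indices run from 0 to GetNumWords()-1.
        assert(w.GetCurrentNum() == static_cast<int>(indices.size()));
        indices.push_back(w.GetCurrentNum());
    }
    return indices;
}

// Hypothetical mock (not an SDK type) that models a line of `count` words.
struct MockWord {
    int index;
    int count;
    bool IsValid() { return index >= 0 && index < count; }
    MockWord GetNextWord() { return MockWord{index + 1, count}; }
    int GetCurrentNum() { return index; }
};
```

Stopping on IsValid() rather than on a word count keeps the loop correct for an empty line.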

◆ GetNumGlyphs()

int pdftron::PDF::Word::GetNumGlyphs ( )
Returns
The number of glyphs in this word.

◆ GetQuad() [1/2]

std::vector< double > pdftron::PDF::Word::GetQuad ( )
Parameters
out_quad  The quadrilateral representing a tight bounding box for this word (in unrotated page coordinates).

◆ GetQuad() [2/2]

void pdftron::PDF::Word::GetQuad ( double out_quad[8])
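
Because a quad can be rotated or skewed, it is sometimes useful to reduce it to an axis-aligned rectangle. The sketch below assumes the eight values are interleaved as {x1, y1, x2, y2, x3, y3, x4, y4}; that layout is not confirmed on this page, and `QuadToBounds` is a hypothetical helper rather than a library function.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns {min_x, min_y, max_x, max_y} for a quad given as eight doubles.
// Assumes an interleaved x/y layout: {x1, y1, x2, y2, x3, y3, x4, y4}.
std::array<double, 4> QuadToBounds(const std::vector<double>& quad) {
    assert(quad.size() == 8);
    double min_x = quad[0], max_x = quad[0];
    double min_y = quad[1], max_y = quad[1];
    for (std::size_t i = 2; i < quad.size(); i += 2) {
        min_x = std::min(min_x, quad[i]);
        max_x = std::max(max_x, quad[i]);
        min_y = std::min(min_y, quad[i + 1]);
        max_y = std::max(max_y, quad[i + 1]);
    }
    return {min_x, min_y, max_x, max_y};
}
```

The vector returned by the first GetQuad() overload could be passed to such a helper directly; the same reduction applies to GetGlyphQuad().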

◆ GetString()

const Unicode * pdftron::PDF::Word::GetString ( )
Returns
The content of this word represented as a Unicode string.

◆ GetStringLen()

int pdftron::PDF::Word::GetStringLen ( )
Returns
The number of characters in this word.
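
GetString() returns a raw pointer, so it is safest to pair it with GetStringLen() rather than rely on a terminator. This page does not say whether the buffer is null-terminated or how wide a Unicode code unit is, so the sketch below assumes neither: it is templated on the code unit type and copies exactly `length` units. `CopyWordText` is a hypothetical helper, not part of the library.

```cpp
#include <cassert>
#include <vector>

// Copies `length` code units starting at `data` into an owned buffer.
// Makes no assumption about null termination or code unit width.
template <typename CodeUnit>
std::vector<CodeUnit> CopyWordText(const CodeUnit* data, int length) {
    assert(length >= 0);
    assert(data != nullptr || length == 0);
    if (length == 0) {
        return {};
    }
    return std::vector<CodeUnit>(data, data + length);
}
```

Copying into an owned buffer also avoids depending on how long the pointer returned by GetString() stays valid, which is not documented here.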

◆ GetStyle()

Style pdftron::PDF::Word::GetStyle ( )
Returns
The predominant style for this word.

◆ IsValid()

bool pdftron::PDF::Word::IsValid ( )
Returns
True if this is a valid word, false otherwise.

◆ operator!=()

bool pdftron::PDF::Word::operator!= ( const Word & ) const

◆ operator==()

bool pdftron::PDF::Word::operator== ( const Word & ) const

The documentation for this class was generated from the following file: