Goal: extract a value from a specific location inside a PDF page.

In GemBox.Pdf, I can extract text elements including their bounds and content, but:

Problem: a text element can have a complex structure, with each glyph being positioned using individual settings.

Consider this common example of a page header: Billing Info Date: 2

Let's say I want to get the order number (0123456789) from the document, knowing its precise position on the page. In practice, though, the entire line is often one single text element with the content "SO CompanyOrder Number:0123456789", and all positioning and spacing is done via offsets and indices only.

I can get the bounds and text of the entire line, but I need the bounds (and value) of each character/glyph, so that I can combine them into "words" (= character sequences separated by whitespace or large offsets).

I know this is definitely possible in other libraries. In iTextSharp I can get the bounds for each single glyph, like this:

// itextsharp 5.2.1.0

It seems to me that all the necessary implementations should already be there; just not much is exposed in the API.
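To make the grouping step concrete (combining per-glyph bounds into "words" split at whitespace or large offsets), here is a minimal, library-independent sketch in Python. It is not GemBox.Pdf or iTextSharp code: `Glyph`, `Word`, `group_glyphs_into_words`, `word_at`, `make_glyphs` and the `max_gap` threshold are all hypothetical names and values chosen for illustration. It assumes the per-glyph bounds for one line have already been obtained from some library.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Glyph:
    """Hypothetical per-glyph record; not a GemBox.Pdf or iTextSharp type."""
    char: str
    x0: float  # left edge
    y0: float  # bottom edge
    x1: float  # right edge
    y1: float  # top edge


@dataclass
class Word:
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


def group_glyphs_into_words(glyphs: List[Glyph], max_gap: float = 1.5) -> List[Word]:
    """Merge the glyphs of one line into words.

    A word ends at a whitespace glyph, or when the horizontal gap to the
    next glyph exceeds max_gap (an assumed threshold in page units).
    """
    words: List[Word] = []
    current: List[Glyph] = []

    def flush() -> None:
        # Emit the glyphs collected so far as one word with merged bounds.
        if current:
            words.append(Word(
                text="".join(g.char for g in current),
                x0=min(g.x0 for g in current),
                y0=min(g.y0 for g in current),
                x1=max(g.x1 for g in current),
                y1=max(g.y1 for g in current),
            ))
            current.clear()

    for g in sorted(glyphs, key=lambda g: g.x0):
        if g.char.isspace():
            flush()
            continue
        if current and g.x0 - current[-1].x1 > max_gap:
            flush()
        current.append(g)
    flush()
    return words


def word_at(words: List[Word], x: float, y: float) -> Optional[Word]:
    """Return the first word whose bounds contain the point (x, y), or None."""
    for w in words:
        if w.x0 <= x <= w.x1 and w.y0 <= y <= w.y1:
            return w
    return None


def make_glyphs(text: str, start_x: float, width: float = 5.0) -> List[Glyph]:
    """Build fixed-width demo glyphs on one baseline (illustration only)."""
    return [
        Glyph(c, start_x + i * width, 700.0, start_x + (i + 1) * width, 710.0)
        for i, c in enumerate(text)
    ]


# One text element with no whitespace, only a large offset before the digits.
line = make_glyphs("Number:", 300.0) + make_glyphs("0123456789", 400.0)
words = group_glyphs_into_words(line)
print([w.text for w in words])  # ['Number:', '0123456789']
```

With words built this way, the value at a known page position can be looked up with `word_at(words, x, y)`. The gap threshold would likely need tuning per document, for example relative to font size.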