ImageEn for Delphi and C++ Builder

 

ImageEn Forum
 Creating PDF+OCR

TOPIC REVIEW
Merlin Posted - Mar 07 2025 : 06:51:39
Hello,

Is it possible to convert an existing PDF into a PDF with OCR content after the fact, or do the pages of the PDF file have to be exported and then reassembled using TIEVisionSearchablePDFGenerator?

An example program would be great :)

Thanks
4 LATEST REPLIES (Newest First)
Merlin Posted - Mar 12 2025 : 08:20:53
Hello Nigel,

Thank you, I will give it a try :)
xequte Posted - Mar 10 2025 : 19:46:39
Why not do it as follows:


// Convert "in.pdf" (pages are images) to "out.pdf" (text in pages now selectable)
// Note: pdfGen (TIEVisionSearchablePDFGenerator), langPath (string) and i (integer)
// are not declared in this snippet and must be defined by the caller
ImageEnMView1.MIO.LoadFromFile( 'D:\in.pdf' );
pdfGen := IEVisionLib.createSearchablePDFGenerator('./', IEOCRLanguageList[OCR_English_language].Code);
// The output file name is built from langPath + 'out'
pdfGen.beginDocument(PAnsiChar(AnsiString(langPath + 'out')), PAnsiChar(AnsiString('title')));
for i := 0 to ImageEnMView1.ImageCount - 1 do
begin
  ImageEnMView1.SelectedImage := i; // Show the image being processed
  pdfGen.addPage(ImageEnMView1.IEBitmap.GetIEVisionImage()); // Add the current page image to the output document
end;
pdfGen.endDocument();


You will need to add iepdf32.dll to your EXE folder.

Nigel
Xequte Software
www.imageen.com
Merlin Posted - Mar 10 2025 : 04:46:37
Hello

Yes, I want to apply text recognition to a PDF file that does not contain any text. To do this, the file must be loaded, the individual pages exported as images, and then the text content determined by text recognition via pdfGen : TIEVisionSearchablePDFGenerator.

Hmm, maybe there is a small example available, ideally one where I do not have to use external libraries to export the individual PDF pages.

Thanks
xequte Posted - Mar 07 2025 : 19:07:06
Sorry, do you mean that you have a PDF that contains images of text (not text itself), and you want to convert it into a PDF where the text is available (text has been OCR'ed)?

Nigel
Xequte Software
www.imageen.com