January 28, 2009, 7:09 am
I am working on the following task in PHP.
I am using the pdftotext command-line utility from the xpdf package on Windows
and Linux. It successfully extracts English text from PDF files. Now I
need to extract Unicode Arabic text from PDF files. For this, I run:
"pdftotext -enc UTF-8 arabicFile.pdf arabicFile.txt"
If I remove the -enc switch, there is empty space in place of the
Arabic text, but the English text is still extracted. With -enc UTF-8,
some Arabic characters are extracted, but not the
complete Arabic text. I have also downloaded and
installed the xpdf-Arabic package from the internet, but I still could not get the
required result, i.e. the Arabic text from the PDF.
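
To put a number on "some characters are extracted", the output file can be checked for Arabic characters. The following is only a minimal sketch: count_arabic is a hypothetical helper name, and it assumes the output file is valid UTF-8. It counts characters in the basic Arabic block (U+0600 to U+06FF) by counting their UTF-8 lead bytes (0xD8 to 0xDB). Arabic presentation forms (U+FB50 to U+FDFF, U+FE70 to U+FEFF) are not counted.

```shell
#!/bin/sh
# Hypothetical helper: print how many characters in FILE fall in the
# basic Arabic block U+0600..U+06FF. Assumes FILE is valid UTF-8, where
# the bytes 0xD8..0xDB (octal 330..333) occur only as the lead byte of
# such a character.
count_arabic() {
    LC_ALL=C tr -cd '\330-\333' < "$1" | wc -c | tr -d ' '
}

# Usage with the command from the post (file names as in the post):
#   pdftotext -enc UTF-8 arabicFile.pdf arabicFile.txt
#   count_arabic arabicFile.txt
```

Comparing this count between different runs gives a rough indication of whether a configuration change recovers more of the Arabic text.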
Can anyone help urgently? How do I configure xpdf-Arabic or some