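Convert PDF to Word in Java: Example

This example reads the text of each page of a PDF file with iText and writes it to a Word (.docx) document with Apache POI. Only the text is carried over; fonts, images and page layout are not.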

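Required Jars:
1. itextpdf-5.4.4
2. xmlbeans-xpath-2.3.0
3. xmlbeans-2.6.0
4. poi-3.9
5. dom4j-1.6.1
6. poi-ooxml-schemas-3.7
7. poi-ooxml-3.7

Note: the list above mixes POI 3.9 with POI 3.7 jars. The Apache POI project advises keeping all POI jars (poi, poi-ooxml, poi-ooxml-schemas) on the same version, so if you hit class or method errors at runtime, align them first.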

Java Program:

package in.javadomain;

import java.io.FileOutputStream;
import java.io.IOException;

import org.apache.poi.xwpf.usermodel.BreakType;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;

import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.parser.PdfReaderContentParser;
import com.itextpdf.text.pdf.parser.SimpleTextExtractionStrategy;
import com.itextpdf.text.pdf.parser.TextExtractionStrategy;

// Extracts the plain text of each PDF page with iText and writes it to a
// .docx file with Apache POI, one paragraph per page.
public class ConvertPdf2Word {

	public static void main(String[] args) throws IOException {
		System.out.println("Document conversion started");
		String pdf = "D:\\javadomain.pdf";
		XWPFDocument doc = new XWPFDocument();
		PdfReader reader = new PdfReader(pdf);
		try {
			PdfReaderContentParser parser = new PdfReaderContentParser(reader);
			int pageCount = reader.getNumberOfPages();
			// iText page numbers start at 1.
			for (int i = 1; i <= pageCount; i++) {
				TextExtractionStrategy strategy = parser.processContent(i,
						new SimpleTextExtractionStrategy());
				String text = strategy.getResultantText();
				XWPFParagraph p = doc.createParagraph();
				XWPFRun run = p.createRun();
				run.setText(text);
				// Break between pages only; a break after the last page
				// would add an empty page at the end of the document.
				if (i < pageCount) {
					run.addBreak(BreakType.PAGE);
				}
			}
			FileOutputStream out = new FileOutputStream("D:\\javadomain.docx");
			try {
				doc.write(out);
			} finally {
				// Close the output file even if writing fails.
				out.close();
			}
		} finally {
			// Release the PDF even if extraction or writing fails.
			reader.close();
		}
		System.out.println("Document converted successfully");
	}
}
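
One thing to watch in the program above: the text returned by getResultantText() contains newline characters, and with the POI versions listed here a single run.setText(text) call may not show them as separate lines in Word. A minimal sketch of one way around that, using only the JDK, is to split the page text into lines first. The class name PageTextLines and its split method are hypothetical helpers for illustration, not part of iText or POI.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of iText or POI): splits the text of one
// extracted PDF page into lines so each line can be written separately.
final class PageTextLines {

    private PageTextLines() {
    }

    // Accepts Windows (\r\n), old Mac (\r) and Unix (\n) line endings.
    // Empty lines are kept so blank lines in the page are preserved.
    static List<String> split(String pageText) {
        List<String> lines = new ArrayList<String>();
        if (pageText == null || pageText.isEmpty()) {
            return lines;
        }
        // Limit -1 keeps trailing empty strings instead of discarding them.
        for (String line : pageText.split("\r\n|\r|\n", -1)) {
            lines.add(line);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Sample text standing in for one extracted page.
        for (String line : split("First line\r\nSecond line\nThird line")) {
            System.out.println(line);
        }
    }
}
```

Inside the page loop, each returned line could then be written with run.setText(line), followed by run.addBreak() after every line except the last, before the page break is added.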

 

Input: [PDF file]
[image: PDF input]

Output: [Word file]
[image: Word output]
