Reading the Text Content of a Word Document in Java

1. Add the dependencies

        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi-ooxml</artifactId>
            <version>3.8</version>
        </dependency>
        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi-scratchpad</artifactId>
            <version>3.8</version>
        </dependency>

2. Code to read the Word content

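        import java.io.FileInputStream;
        import org.apache.poi.POIXMLDocument;
        import org.apache.poi.POIXMLTextExtractor;
        import org.apache.poi.hwpf.extractor.WordExtractor;
        import org.apache.poi.openxml4j.opc.OPCPackage;
        import org.apache.poi.xwpf.extractor.XWPFWordExtractor;

        // Inside a method that receives `path` (the file location) and returns AjaxResult.
        // AjaxResult is not part of POI; it is presumably the project's own response wrapper.
        String buffer = "";
        try {
            if (path.endsWith(".doc")) {
                // Legacy binary format (.doc): extracted with HWPF from poi-scratchpad
                try (FileInputStream is = new FileInputStream(path)) {
                    WordExtractor ex = new WordExtractor(is);
                    buffer = ex.getText();
                }
            } else if (path.endsWith(".docx")) {
                // OOXML format (.docx): extracted with XWPF from poi-ooxml
                OPCPackage opcPackage = POIXMLDocument.openPackage(path);
                try {
                    POIXMLTextExtractor extractor = new XWPFWordExtractor(opcPackage);
                    buffer = extractor.getText();
                } finally {
                    // Close the package even if extraction throws
                    opcPackage.close();
                }
            } else {
                return AjaxResult.error("The file is not a Word document");
            }
        } catch (Exception e) {
            //e.printStackTrace();
            return AjaxResult.error("Failed to read the Word file: " + e.getMessage());
        }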
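
The extension check in the code above is case-sensitive, so a file named REPORT.DOCX would end up in the "not a Word document" branch. Below is a minimal sketch of a case-insensitive check that uses only the JDK. The names WordFormat, Kind and detect are hypothetical, chosen for illustration; they are not part of POI or of the original code.

```java
import java.util.Locale;

public class WordFormat {
    /** The kinds of file the reading code distinguishes (hypothetical helper type). */
    public enum Kind { DOC, DOCX, UNKNOWN }

    /** Classifies a path by its extension, ignoring upper/lower case. */
    public static Kind detect(String path) {
        if (path == null) {
            return Kind.UNKNOWN;
        }
        // Locale.ROOT avoids locale-specific case rules (e.g. the Turkish dotless i)
        String lower = path.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".docx")) {
            return Kind.DOCX;
        }
        if (lower.endsWith(".doc")) {
            return Kind.DOC;
        }
        return Kind.UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(detect("report.DOCX"));    // prints DOCX
        System.out.println(detect("old/letter.doc")); // prints DOC
        System.out.println(detect("notes.txt"));      // prints UNKNOWN
    }
}
```

The result of detect(path) could then drive the choice between WordExtractor and XWPFWordExtractor in place of the two endsWith calls. Note that this only looks at the file name; it does not verify that the file content really is a Word document.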

Original article: https://www.cnblogs.com/yechangzhong-826217795/p/15384158.html