Notes on Using the PuppeteerSharp Headless Browser on CentOS 7

Environment:

CentOS 7.6.1810

.NET Core 3.1

PuppeteerSharp 2.0.0

1. If the network is unstable, download the required Chromium build in advance.

Download URL: https://storage.googleapis.com/chromium-browser-snapshots/Linux_x64/706915/chrome-linux.zip

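The download URLs for each operating system are defined in the PuppeteerSharp source code (BrowserFetcher.cs); alternatively, use a mirror hosted in mainland China.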

https://github.com/hardkoded/puppeteer-sharp/blob/37ea56934281209830254df3ec3ffe37c57cfac2/lib/PuppeteerSharp/BrowserFetcher.cs

Extract the archive into the directory the program runs from, so that the resulting folder path is: .local-chromium/Linux-706915/chrome-linux/
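The download-and-extract step above can be sketched as a small shell script. This is only a sketch under assumptions: the revision and URL are the ones quoted above, the function names and the /tmp path are hypothetical, curl and unzip are assumed to be installed, and the archive is assumed to contain a top-level chrome-linux/ folder.

```shell
# Revision and URL as quoted in this article.
REVISION=706915
URL="https://storage.googleapis.com/chromium-browser-snapshots/Linux_x64/${REVISION}/chrome-linux.zip"

# Print the folder that should end up containing chrome-linux/.
# $1 = program directory, $2 = Chromium revision.
chromium_dir() {
    printf '%s/.local-chromium/Linux-%s\n' "$1" "$2"
}

# Download the archive and unpack it under the program directory.
# $1 = program directory (hypothetical example: /PuppeteerTest/PuppeteerTest).
fetch_chromium() {
    target="$(chromium_dir "$1" "$REVISION")"
    mkdir -p "$target" &&
    curl -fL -o /tmp/chrome-linux.zip "$URL" &&
    unzip -q -o /tmp/chrome-linux.zip -d "$target"
}
```

Calling fetch_chromium with the program directory should leave the chrome binary at .local-chromium/Linux-706915/chrome-linux/chrome beneath it. Note that a later stack trace in this article shows the path under bin/Debug/netcoreapp3.1/, so the directory must match wherever the program actually runs from.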

If the program fails at startup with an error loading the libX11.so.6 library, install that library first:

Unhandled exception. System.AggregateException: One or more errors occurred. (Failed to launch Chromium! /PuppeteerTest/PuppeteerTest/.local-chromium/Linux-706915/chrome-linux/chrome: error while loading shared libraries: libX11.so.6: cannot open shared object file: No such file or directory
)
 ---> PuppeteerSharp.ChromiumProcessException: Failed to launch Chromium! /PuppeteerTest/PuppeteerTest/.local-chromium/Linux-706915/chrome-linux/chrome: error while loading shared libraries: libX11.so.6: cannot open shared object file: No such file or directory

   at PuppeteerSharp.ChromiumProcess.State.StartingState.StartCoreAsync(ChromiumProcess p)
   at PuppeteerSharp.ChromiumProcess.State.StartingState.StartCoreAsync(ChromiumProcess p)
   at PuppeteerSharp.Launcher.LaunchAsync(LaunchOptions options)
   at PuppeteerSharp.Launcher.LaunchAsync(LaunchOptions options)
   --- End of inner exception stack trace ---

Look up the library on pkgs.org: https://pkgs.org/download/libX11.so.6

Open the package page:

https://centos.pkgs.org/7/centos-x86_64/libX11-1.6.7-2.el7.i686.rpm.html

The page lists the yum install command:

Install libX11 rpm package:

# yum install libX11

Run this command on the server in an Xshell session.

 

If another library is reported missing, such as libXcomposite, look up the package and its install command on pkgs.org in the same way:

Unhandled exception. System.AggregateException: One or more errors occurred. (Failed to launch Chromium! /PuppeteerTest/PuppeteerTest/bin/Debug/netcoreapp3.1/.local-chromium/Linux-706915/chrome-linux/chrome: error while loading shared libraries: libXcomposite.so.1: cannot open shared object file: No such file or directory
)
 ---> PuppeteerSharp.ChromiumProcessException: Failed to launch Chromium! /PuppeteerTest/PuppeteerTest/bin/Debug/netcoreapp3.1/.local-chromium/Linux-706915/chrome-linux/chrome: error while loading shared libraries: libXcomposite.so.1: cannot open shared object file: No such file or directory

   at PuppeteerSharp.ChromiumProcess.State.StartingState.StartCoreAsync(ChromiumProcess p)
   at PuppeteerSharp.ChromiumProcess.State.StartingState.StartCoreAsync(ChromiumProcess p)
   at PuppeteerSharp.Launcher.LaunchAsync(LaunchOptions options)
   at PuppeteerSharp.Launcher.LaunchAsync(LaunchOptions options)
   --- End of inner exception stack trace ---

https://centos.pkgs.org/

 

Any other missing-library error is resolved the same way.
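Rather than fixing one missing library per run, the full list can usually be obtained in one pass with ldd. The following is a minimal sketch: the function names are hypothetical, and it assumes ldd and awk are available on the server.

```shell
# Read ldd output on stdin and print each library marked "not found".
parse_missing() {
    awk '/not found/ { print $1 }' | sort -u
}

# Run ldd on a binary and list its unresolved shared libraries.
# $1 = path to the binary, e.g. the chrome executable.
missing_libs() {
    ldd "$1" | parse_missing
}

# Example (run manually):
#   missing_libs .local-chromium/Linux-706915/chrome-linux/chrome
# For each name printed, yum can report which package ships it:
#   yum provides '*/libXcomposite.so.1'
```

This is an alternative to browsing pkgs.org; both approaches end with the same yum install command.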

If libXss.so.1 is reported missing, run:

yum install libXss* -y

If libatk-1.0.so is reported missing, run:

yum install atk

If libatk-bridge-2.0.so is reported missing, run:

yum install at-spi2-atk-devel

If libpangocairo-1.0.so is reported missing, run:

yum install pango-devel

If libgtk-3.so is reported missing, run:

yum install gtk3-devel
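The individual commands above can be combined into a single install. The sketch below only prints the command so it can be reviewed first. The package names are the ones used in this article, plus libXcomposite for the error shown earlier; the variable and function names are hypothetical, and other systems may need a different set.

```shell
# Packages named in this article (libXss* is the glob the article uses).
PACKAGES="libX11 libXcomposite libXss* atk at-spi2-atk-devel pango-devel gtk3-devel"

# Print the combined yum command without running it.
install_cmd() {
    printf 'yum install -y %s\n' "$PACKAGES"
}

# Review the output, then run the printed command as root:
#   install_cmd
```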

   

The Puppeteer troubleshooting guide lists the dependencies required on CentOS:

 https://github.com/puppeteer/puppeteer/blob/master/docs/troubleshooting.md

With all dependencies installed, the program still fails, this time with an error mentioning --no-sandbox:

Unhandled exception. System.AggregateException: One or more errors occurred. (Failed to launch Chromium! [0416/165456.543755:ERROR:zygote_host_impl_linux.cc(89)] Running as root without --no-sandbox is not supported. See https://crbug.com/638180.
)
 ---> PuppeteerSharp.ChromiumProcessException: Failed to launch Chromium! [0416/165456.543755:ERROR:zygote_host_impl_linux.cc(89)] Running as root without --no-sandbox is not supported. See https://crbug.com/638180.

   at PuppeteerSharp.ChromiumProcess.State.StartingState.StartCoreAsync(ChromiumProcess p)
   at PuppeteerSharp.ChromiumProcess.State.StartingState.StartCoreAsync(ChromiumProcess p)
   at PuppeteerSharp.Launcher.LaunchAsync(LaunchOptions options)
   at PuppeteerSharp.Launcher.LaunchAsync(LaunchOptions options)
   --- End of inner exception stack trace ---

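As the log states, Chromium refuses to run as root without --no-sandbox. Following an article found online, pass the --no-sandbox argument at launch: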

https://segmentfault.com/a/1190000018553178

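using PuppeteerSharp;

// Chromium will not start as root unless the sandbox is disabled.
var launchOptions = new LaunchOptions
{
    Headless = true,
    Args = new[] { "--no-sandbox" }
};
// .Result blocks the calling thread; inside an async method, prefer "await".
var browser = Puppeteer.LaunchAsync(launchOptions).Result;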

After restarting the program, it can fetch web pages.

Original article: https://www.cnblogs.com/townsend/p/12714055.html