KCNA data interfaces and page structure

Home page news (partial list)

curl "http://kcna.kp/kcna.user.home.retrieveAjaxList.kcmsf" -H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:70.0) Gecko/20100101 Firefox/70.0" -H "Accept: */*" -H "Accept-Language: zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2" --compressed -H "Content-Type: application/x-www-form-urlencoded" -H "Origin: http://kcna.kp" -H "Connection: keep-alive" -H "Referer: http://kcna.kp/kcna.user.home.retrieveHomeInfoList.kcmsf;jsessionid=00495698FFD1BF89D479A894F18C45C2" -H "Cookie: JSESSIONID=00495698FFD1BF89D479A894F18C45C2" -H "Cache-Control: max-age=0" --data ""

Works of the supreme xxx

curl "http://kcna.kp/kcna.user.home.retrieveArticlePageContentInfoAjax.kcmsf" -H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:70.0) Gecko/20100101 Firefox/70.0" -H "Accept: */*" -H "Accept-Language: zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2" --compressed -H "Content-Type: application/x-www-form-urlencoded" -H "Origin: http://kcna.kp" -H "Connection: keep-alive" -H "Referer: http://kcna.kp/kcna.user.article.retrieveNewsViewInfoList.kcmsf" -H "Cookie: JSESSIONID=00495698FFD1BF89D479A894F18C45C2" --data ""
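The two curl calls above differ only in the endpoint path and the Referer, so they can be sketched as one Python helper. This is a minimal sketch using only the standard library: the names `build_request`, `HOME_LIST` and `ARTICLE_CONTENT` are made up for illustration, the session ID is whatever JSESSIONID your own browser session holds, and the request is only built, not sent.

```python
import urllib.request

BASE_URL = "http://kcna.kp"

# Endpoint paths copied from the curl commands above.
HOME_LIST = "/kcna.user.home.retrieveAjaxList.kcmsf"
ARTICLE_CONTENT = "/kcna.user.home.retrieveArticlePageContentInfoAjax.kcmsf"

USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:70.0) "
    "Gecko/20100101 Firefox/70.0"
)


def build_request(path, session_id, referer_path):
    """Build (but do not send) a POST request mirroring the curl calls.

    session_id and referer_path are supplied by the caller; nothing here
    checks that the server still accepts them.
    """
    headers = {
        "User-Agent": USER_AGENT,
        "Accept": "*/*",
        "Content-Type": "application/x-www-form-urlencoded",
        "Origin": BASE_URL,
        "Referer": BASE_URL + referer_path,
        "Cookie": "JSESSIONID=" + session_id,
    }
    # data=b"" gives a POST with an empty body, like curl's --data "".
    return urllib.request.Request(
        BASE_URL + path, data=b"", headers=headers, method="POST"
    )
```

To actually send it, pass the result to `urllib.request.urlopen(req, timeout=30)`. Whether the server answers a scripted client at all is a separate question.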

Supreme XXXxx activities (CSS selectors)

.harticle15

Latest XX

.harticle2

North-South relations

div.harticle14:nth-child(5)

International

.harticle8

News on celebrating xxx

div.hsidebar_style1:nth-child(1)

Main xx

div.hsidebar_style1:nth-child(2)

Sports

.hsidebar_style4

Science and technology

.hsidebar_style3

Tourist attractions

div.hsidebar_style1:nth-child(5)
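The selectors above can be collected into one lookup table and applied to a saved copy of the home page. The sketch below is a rough, standard-library-only approximation: the dictionary keys and the helpers `class_from_selector` and `text_for_class` are made-up names, and the matcher looks only at the class name. It ignores the `div` tag and the `:nth-child(...)` part, so the three `hsidebar_style1` sections are not told apart. A real CSS selector engine would be needed for that.

```python
from html.parser import HTMLParser

# Section name -> CSS selector, as listed above.
SECTION_SELECTORS = {
    "supreme XXXxx activities": ".harticle15",
    "latest XX": ".harticle2",
    "north-south relations": "div.harticle14:nth-child(5)",
    "international": ".harticle8",
    "news on celebrating xxx": "div.hsidebar_style1:nth-child(1)",
    "main xx": "div.hsidebar_style1:nth-child(2)",
    "sports": ".hsidebar_style4",
    "science and technology": ".hsidebar_style3",
    "tourist attractions": "div.hsidebar_style1:nth-child(5)",
}

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


def class_from_selector(selector):
    """Reduce a selector such as 'div.foo:nth-child(2)' to the class 'foo'."""
    return selector.split(":")[0].split(".")[-1]


class ClassTextExtractor(HTMLParser):
    """Collect text found inside any element carrying the given class."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.depth = 0      # greater than 0 while inside a matching element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth or self.class_name in classes:
            self.depth += 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.chunks.append(data.strip())


def text_for_class(html, class_name):
    """Return the non-empty text fragments inside elements of class_name."""
    parser = ClassTextExtractor(class_name)
    parser.feed(html)
    parser.close()
    return parser.chunks
```

For example, `text_for_class(page_html, class_from_selector(SECTION_SELECTORS["sports"]))` would return the text fragments inside `.hsidebar_style4`, assuming `page_html` holds the home page markup.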

Updated November 10, 2019

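I tried this with a crawler I wrote myself and found that North Korea's anti-crawling measures really are formidable: I could not fetch a single thing. I will think of another approach and try again later.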

Original post: https://www.cnblogs.com/dXIOT/p/11750469.html