**Crawling rich text**

Continuing the previous example, we now crawl the content of the detail page.

![](https://img.kancloud.cn/27/40/2740ca896dd86fd92193f1debaa9dbf0_1847x909.png)

Now write the code:

```
package main

import (
	"github.com/PeterYangs/article-spider/fileTypes"
	"github.com/PeterYangs/article-spider/form"
	"github.com/PeterYangs/article-spider/spider"
)

func main() {

	f := form.Form{
		Host:      "https://www.weixz.com",
		Channel:   "/zxzx/list_[PAGE].html", // list page URL; [PAGE] is the page-number placeholder
		Limit:     5,
		PageStart: 1,
		// Each list item, and the link inside it that leads to the detail page.
		ListSelector:     "body > div > div.information-main.mt-20px.wd1200.displayFlex > div.information-main-left > div.information-main-list > ul > li",
		ListHrefSelector: "div.information-main-list-title > a",
		// Fields extracted from each detail page.
		DetailFields: map[string]form.Field{
			"title": {Types: fileTypes.SingleField, Selector: "body > div > div.information-main.mt-20px.wd1200.displayFlex > div.information-main-left > div.informationContents > div.informationContentTitle > h1"},
			"image": {Types: fileTypes.SingleImage, Selector: "body > div > div.information-main.mt-20px.wd1200.displayFlex > div.information-main-left > div.informationContents > div.informationContentText img:nth-child(1)", ImagePrefix: "/image", ImageDir: "[date:Y-m-d]"},
			// HtmlWithImage keeps the HTML of the selected node and downloads the images inside it.
			"content": {Types: fileTypes.HtmlWithImage, Selector: "body > div.wrap > div.information-main.mt-20px.wd1200.displayFlex > div.information-main-left > div.informationContents > div.informationContentText"},
		},
	}

	spider.Start(f)
}
```

The result:

![](https://img.kancloud.cn/13/55/1355a056d6fedceabd18605a36cf8947_1882x397.png)

<br/><br/>

The images inside the content have also been replaced with local image paths:

![](https://img.kancloud.cn/10/88/1088d6a24dcc1c54e6c566c5e1980f64_957x293.png)

<br/><br/>

To change the path under which those images are saved, set `ImagePrefix` and `ImageDir` on the `content` field as well:

```
package main

import (
	"github.com/PeterYangs/article-spider/fileTypes"
	"github.com/PeterYangs/article-spider/form"
	"github.com/PeterYangs/article-spider/spider"
)

func main() {

	f := form.Form{
		Host:             "https://www.weixz.com",
		Channel:          "/zxzx/list_[PAGE].html",
		Limit:            5,
		PageStart:        1,
		ListSelector:     "body > div > div.information-main.mt-20px.wd1200.displayFlex > div.information-main-left > div.information-main-list > ul > li",
		ListHrefSelector: "div.information-main-list-title > a",
		DetailFields: map[string]form.Field{
			"title": {Types: fileTypes.SingleField, Selector: "body > div > div.information-main.mt-20px.wd1200.displayFlex > div.information-main-left > div.informationContents > div.informationContentTitle > h1"},
			"image": {Types: fileTypes.SingleImage, Selector: "body > div > div.information-main.mt-20px.wd1200.displayFlex > div.information-main-left > div.informationContents > div.informationContentText img:nth-child(1)", ImagePrefix: "/image", ImageDir: "[date:Y-m-d]"},
			// Same image options as on the "image" field, now applied to images inside the rich text.
			"content": {Types: fileTypes.HtmlWithImage, Selector: "body > div.wrap > div.information-main.mt-20px.wd1200.displayFlex > div.information-main-left > div.informationContents > div.informationContentText", ImagePrefix: "/image", ImageDir: "[date:Y-m-d]"},
		},
	}

	spider.Start(f)
}
```

These two options behave the same way as they do when crawling a single image.