Fixing Garbled Text When Fetching Web Pages with VB.NET
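VB.NET is a fully object-oriented language with support for inheritance. It has a broad range of applications and can noticeably improve developer productivity. While debugging VB.NET code that fetches web pages through the Microsoft.XMLHttp component, I found that Chinese characters came back garbled. Testing showed that pages whose meta tag declares charset=utf-8 are not garbled, while pages declaring charset=Gb2312 are. The likely reason is that ResponseText decodes the response as UTF-8 unless the response carries encoding information, so GB2312 bytes get misinterpreted. This article gives a complete workaround, which I hope helps anyone studying web page fetching or content scraping in VB.NET.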
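The garbling can be reproduced without any network access. The sketch below is not from the original article: the module name EncodingDemo and the helper DecodeAs are hypothetical, and it assumes .NET Framework, where the "gb2312" encoding is built in. It decodes the same four bytes (the GB2312 encoding of the two characters U+4E2D U+6587, "Chinese language") first as UTF-8 and then as GB2312.

```vb
Imports System
Imports System.Text

Module EncodingDemo
    ' Hypothetical helper: decode raw bytes with the named encoding.
    ' On .NET Core / .NET 5+ "gb2312" is only available after calling
    ' Encoding.RegisterProvider(CodePagesEncodingProvider.Instance).
    Function DecodeAs(ByVal bytes As Byte(), ByVal encodingName As String) As String
        Return Encoding.GetEncoding(encodingName).GetString(bytes)
    End Function

    Sub Main()
        ' GB2312 bytes for the characters U+4E2D and U+6587
        Dim bytes As Byte() = {&HD6, &HD0, &HCE, &HC4}

        ' Wrong decoder: these bytes are not valid UTF-8, so the result is garbled
        Console.WriteLine(DecodeAs(bytes, "utf-8"))

        ' Right decoder: the original two characters are recovered
        Console.WriteLine(DecodeAs(bytes, "gb2312"))
    End Sub
End Module
```

The bytes are identical in both calls; only the decoder differs, which is why the fix has to happen at the point where the raw response is turned into a string.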
The function below, LobDotCn, fetches a web page in VB.NET. Note: url_Link is the target page to fetch, and IsGb2312 indicates whether the page is encoded in GB2312.
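    Public Function LobDotCn(ByVal url_Link As String, ByVal IsGb2312 As Boolean) As String
        On Error Resume Next
        Dim XmlHttp As Object
        Dim WebContent As Object
        Dim Str_WebContent As String = ""
        ' Late-bound COM object (requires Option Strict Off)
        XmlHttp = CreateObject("Microsoft.XMLHttp")
        ' GET is sufficient for fetching a page; False makes the request synchronous
        XmlHttp.Open("GET", url_Link, False)
        XmlHttp.Send()
        If IsGb2312 Then
            ' Take the raw bytes and decode them explicitly as GB2312,
            ' rather than relying on the system default code page
            WebContent = XmlHttp.ResponseBody
            Str_WebContent = System.Text.Encoding.GetEncoding("gb2312").GetString(CType(WebContent, Byte()))
        Else
            ' ResponseText is already decoded; UTF-8 pages come through intact
            WebContent = XmlHttp.ResponseText
            Str_WebContent = WebContent.ToString
        End If
        XmlHttp = Nothing
        LobDotCn = Str_WebContent
    End Function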
Calling the function:
    result = LobDotCn("http://www.lob.cn", True)    ' fetch a GB2312 page
    result = LobDotCn("put the URL here", False)    ' fetch a UTF-8 page
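The calls above require the caller to know each page's encoding in advance. One possible refinement, which is not part of the original article, is to read the charset from the page's own meta tag and decode the raw bytes accordingly. The sketch below is a minimal version under stated assumptions: the module CharsetSniffer and the function DecodePage are hypothetical names, the declaration is assumed to appear within the first 2048 bytes, UTF-8 is used as the fallback, and "gb2312" is assumed to be available as on .NET Framework.

```vb
Imports System
Imports System.Text
Imports System.Text.RegularExpressions

Module CharsetSniffer
    ' Hypothetical helper: pick the decoder from a charset=... declaration
    ' near the top of the page, falling back to UTF-8.
    Function DecodePage(ByVal body As Byte()) As String
        ' Charset names are plain ASCII, so an ASCII preview of the head
        ' is enough to find the declaration whatever the real encoding is.
        Dim previewLength As Integer = Math.Min(body.Length, 2048)
        Dim head As String = Encoding.ASCII.GetString(body, 0, previewLength)

        ' Matches both <meta charset="gb2312"> and
        ' <meta http-equiv="Content-Type" content="text/html; charset=gb2312">
        Dim m As Match = Regex.Match(head, "charset\s*=\s*[""']?([\w-]+)", RegexOptions.IgnoreCase)

        Dim enc As Encoding = Encoding.UTF8
        If m.Success Then
            Try
                enc = Encoding.GetEncoding(m.Groups(1).Value)
            Catch ex As ArgumentException
                ' Unknown or unsupported charset name: keep the UTF-8 fallback
            End Try
        End If

        Return enc.GetString(body)
    End Function
End Module
```

The byte array passed in could come from XmlHttp.ResponseBody, as in LobDotCn, or from System.Net.WebClient.DownloadData. A fuller version would also consult the charset in the HTTP Content-Type header, which this sketch ignores.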