Using PHP cURL to Simulate a Login and Scrape Site Data, Applied to Score Lookup on the WeChat Official Account Platform

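With WeChat's surge in popularity, the WeChat Official Account Platform has become the next target for many developers. I was not especially drawn to this new technology myself, but a friend recently asked for help developing a score-lookup feature for an official account, so I spent some spare time looking into it.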

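  The main steps are: use PHP's cURL extension to simulate a login to the target site; as the logged-in user, fetch that user's score page; use regular expressions to extract and store the data; and re-lay out the data with HTML.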

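  The official-account feature performs the score lookup by browsing the target site on the user's behalf, so the whole implementation rests on PHP's cURL extension. Below I simply picked a page to act as the login form and used it to fetch the scores. The code is as follows.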

  


<HTML>
<HEAD><TITLE>Please log in</TITLE>
<script language="JavaScript">
function Judge()
    {
    // Block submission when the username field is empty.
    var field=document.getElementById("WebUserNO");
    if(field.value=="")
       {alert("The username must not be empty!");
       field.focus();
       return false;
       }
    return true;
}
</script>
<META http-equiv=Content-Type content="text/html; charset=gb2312">
<STYLE type=text/css>TD {
    FONT-SIZE: 12px
}
p1 {
    FONT-SIZE: 12px
}
INPUT {
    FONT-SIZE: 12px
}
p2 {
    FONT-SIZE: 12px; LINE-HEIGHT: 14pt
}
p3 {
    FONT-SIZE: 14px
}
p4 {
    FONT-SIZE: 14px; LINE-HEIGHT: 14pt
}
p5 {
    FONT-SIZE: 16px
}
p6 {
    FONT-SIZE: 14px; LINE-HEIGHT: 180%
}
p7 {
    FONT-SIZE: 12px; COLOR: #136792; LINE-HEIGHT: 160%
}
BIG {
    FONT-SIZE: 18px
}
A:link {
    COLOR: #0000ff
}
A:visited {
    COLOR: #0000ff
}
A:hover {
    COLOR: #ff0000
}
hand {
    CURSOR: hand; BACKGROUND-COLOR: rgb(208,207,192)
}
</STYLE>
<!--style end-->
<META content="MSHTML 6.00.2600.0" name=GENERATOR></HEAD>
<BODY bgColor=#ffffff topMargin=7 marginheight="0" marginwidth="25">
<form name="LoginForm" method="post" action="qing.php">
<TABLE cellSpacing=0 cellPadding=0 width="100%" border=0>
  <TBODY>
  <TR>
    <TD bgColor=#e6e6e6 height=20></TD>
      <TD align=right bgColor=#e6e6e6>&nbsp;</TD>
    </TR></TBODY></TABLE><BR>
<TABLE cellSpacing=0 cellPadding=1 width=492 align=center border=0>
  <TBODY>
  <TR>
    <TD>
      <TABLE borderColor=#c1eaff cellSpacing=0 cellPadding=20 width=474 
      align=center border=1>
        <TBODY>
        <TR>
          <TD><TABLE width=283 height="100" 
              border=0 align=center cellPadding=0 cellSpacing=0>
                                   <tr>
                      <td width="50" rowspan="4">&nbsp;</td>
                    <td align="left">
                      </td>
                      </tr>
                                   <tr>
                                     <td height="22" align="left">Username:
                                     <input name="WebUserNO" type="text" id="WebUserNO" size="12"></td>
                                   </tr>
                                   <tr>
                                     <td height="22" align="left">Password:
                                     <input name="Password" type="password" id="Password" size="12"></td>
                                   </tr>
                                   <tr>
                                     <td height="22" align="left" valign="middle"><p>Verification code:
                                     <input name="Agnomen" type="text" id="Agnomen" size="12">
                                     </p>
                                     <p><A href="User_JSP/FuJiaMa.htm" target="_blank" ><img src="http://218.61.108.163/ACTIONVALIDATERANDOMPICTURE.APPPROCESS" width="60" height="20" alt="About the verification code" border="0"></a></p></td>
                                   </tr>
                  <tr align="center"> 
                    <td colspan="2"><input type="image" border="0" name="submit" src="http://218.61.108.163/User_JSP/images/Logon.gif" width="37" height="18" onClick="javascript:return Judge();">
                    
                    </td>
                </tr>    
                  <tr> 
                    <td colspan="2"><div align="center"><input name="applicant" type="hidden" value="ACTIONQUERYSTUDENTSCORE"></div></td>
                </tr>    
                </TABLE>
            <br>
          </TD>
        </TR>
        </TABLE></TD></TR></TBODY></TABLE>
</form>        
<BR>
<BR>
</BODY></HTML>
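
Judge() runs only in the browser, so qing.php can still receive empty fields when JavaScript is disabled or the request is made by hand. A minimal server-side counterpart might look like the sketch below. The function name validate_login_fields is hypothetical, and treating all three visible fields as required is an assumption that is stricter than Judge(), which only checks the username.

```php
<?php
// Hypothetical helper (not part of the original post): checks the fields
// posted by the login form and returns a list of error messages.
function validate_login_fields(array $post): array
{
    $errors = [];
    // Field names match the name="..." attributes of the form's inputs.
    foreach (['WebUserNO', 'Password', 'Agnomen'] as $field) {
        $value = $post[$field] ?? '';
        // Assumption: every visible field is required.
        if (!is_string($value) || trim($value) === '') {
            $errors[] = $field . ' must not be empty';
        }
    }
    return $errors;
}

// Possible use at the top of qing.php:
// $errors = validate_login_fields($_POST);
// if ($errors) { exit(implode('<br>', $errors)); }
```

An empty array means the request can go on to the cURL login.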


qing.php

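<?php
    // Temporary file that carries the session cookie from the login request to the score request.
    $cookie_file = tempnam('./temp','cookie');
    $login_url = 'http://218.61.108.163/ACTIONQUERYSTUDENTSCORE.APPPROCESS';

    // Forward the values submitted by the login form; http_build_query() URL-encodes them.
    $post_fields = http_build_query(array(
        'WebUserNO' => isset($_POST['WebUserNO']) ? $_POST['WebUserNO'] : '',
        'Password'  => isset($_POST['Password']) ? $_POST['Password'] : '',
        'Agnomen'   => isset($_POST['Agnomen']) ? $_POST['Agnomen'] : '',
        'applicant' => 'ACTIONQUERYGRADUATESCHOOLREPORTBYSELF',
    ));

    // Step 1: log in and store the cookies the server sets.
    $ch = curl_init($login_url);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_POST, 1);
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $post_fields);
    curl_exec($ch);
    curl_close($ch);
    unset($ch); // release the handle so the cookie jar is flushed to disk

    // Step 2: request the score page, sending the stored cookies.
    $url='http://218.61.108.163/ACTIONQUERYGRADUATESCHOOLREPORTBYSELF.APPPROCESS';
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the page as a string instead of printing it
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file);
    $contents = curl_exec($ch);
    curl_close($ch);

    if ($contents === false) {
        exit('Failed to fetch the score page.');
    }

    // Extract the content of every table cell with a regular expression.
    // Group 1 captures the text between <td ...> and </td>.
    $match = '#<td[^>]*>(.*?)</td>#is';
    preg_match_all($match, $contents, $b);
    $cells = $b[1];
    foreach ($cells as $cell) {
        echo trim(strip_tags($cell)), "<br>\n";
    }
?>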
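
The last step named in the introduction, re-laying out the scraped data with HTML, is not shown in qing.php. The sketch below is one possible way to do it, not the author's code: it groups the cells by table row and prints them as a plain table. The function names extract_rows and render_table and the sample markup are made up for illustration. It assumes the page is already UTF-8 and has no nested tables; a GB2312 page would need converting first, for example with iconv().

```php
<?php
// Hypothetical helper: returns the text of every <td> cell, grouped by <tr> row.
function extract_rows(string $html): array
{
    $rows = [];
    preg_match_all('#<tr\b[^>]*>(.*?)</tr>#is', $html, $tr_matches);
    foreach ($tr_matches[1] as $tr) {
        preg_match_all('#<td\b[^>]*>(.*?)</td>#is', $tr, $td_matches);
        $cells = [];
        foreach ($td_matches[1] as $td) {
            // Drop nested tags and decode HTML entities.
            $cells[] = trim(html_entity_decode(strip_tags($td), ENT_QUOTES, 'UTF-8'));
        }
        if ($cells) {
            $rows[] = $cells; // skip rows without any <td> cells
        }
    }
    return $rows;
}

// Hypothetical helper: renders the rows as a simple HTML table.
function render_table(array $rows): string
{
    $html = "<table border=\"1\">\n";
    foreach ($rows as $cells) {
        $html .= '  <tr>';
        foreach ($cells as $cell) {
            // Escape the text so scraped content cannot inject markup.
            $html .= '<td>' . htmlspecialchars($cell, ENT_QUOTES, 'UTF-8') . '</td>';
        }
        $html .= "</tr>\n";
    }
    return $html . "</table>\n";
}

// Made-up sample markup standing in for the fetched score page.
$sample = '<table><tr><td>Math</td><td align="right">90</td></tr>'
        . '<tr><td><b>English</b></td><td>85</td></tr></table>';
echo render_table(extract_rows($sample));
```

In qing.php the same two calls could be applied to $contents in place of $sample.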


The page showing the retrieved scores

Original article: https://www.cnblogs.com/ZM-Rid/p/3601513.html