The role of get_headers in PHP scraping

Author: enenba | Posted: 2012-01-09 17:09 | Category: PHP scraping

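PHP's get_headers() function returns the headers that a server sends in response to an HTTP request. When scraping, it can be used to check the status of a page.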
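As a rough illustration of that status check, here is a minimal sketch. The helper names status_code_from_line() and page_status() are made up for this example and are not part of PHP, and the nullable return types assume PHP 7.1 or later. Note that get_headers() follows redirects by default, so element 0 is the status line of the first response.

```php
<?php
// Hypothetical helper: pull the numeric status code out of a status line
// such as "HTTP/1.1 200 OK". Returns null if the line does not look like one.
function status_code_from_line(string $statusLine): ?int
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $statusLine, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Hypothetical helper: fetch the headers for $url and return the status code
// of the first response, or null when the request fails.
function page_status(string $url): ?int
{
    $headers = @get_headers($url); // get_headers() returns false on failure
    if ($headers === false) {
        return null;
    }
    return status_code_from_line($headers[0]);
}

// Usage (needs network access):
// if (page_status('http://www.example.com') === 200) {
//     // the page answered normally, so it is worth fetching the body
// }
```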

 

Example from the manual:

<?php
$url = 'http://www.example.com';
print_r(get_headers($url));
print_r(get_headers($url, 1));

/*
Output:
Array
(
    [0] => HTTP/1.1 200 OK
    [1] => Date: Sat, 29 May 2004 12:28:13 GMT
    [2] => Server: Apache/1.3.27 (Unix)  (Red-Hat/Linux)
    [3] => Last-Modified: Wed, 08 Jan 2003 23:11:55 GMT
    [4] => ETag: "3f80f-1b6-3e1cb03b"
    [5] => Accept-Ranges: bytes
    [6] => Content-Length: 438
    [7] => Connection: close
    [8] => Content-Type: text/html
)

Array
(
    [0] => HTTP/1.1 200 OK
    [Date] => Sat, 29 May 2004 12:28:14 GMT
    [Server] => Apache/1.3.27 (Unix)  (Red-Hat/Linux)
    [Last-Modified] => Wed, 08 Jan 2003 23:11:55 GMT
    [ETag] => "3f80f-1b6-3e1cb03b"
    [Accept-Ranges] => bytes
    [Content-Length] => 438
    [Connection] => close
    [Content-Type] => text/html
)
*/
?> 
Only the second array is explained here, along with a few headers that may also appear in other responses:

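[0] => HTTP/1.1 200 OK   The status line. 200 means the request succeeded; other common status codes include 301 and 404.

[Date] => Sat, 29 May 2004 12:28:14 GMT   The response date. HTTP dates are given in GMT, so the time zone needs no special handling.

[Server] => Apache/1.3.27 (Unix)  (Red-Hat/Linux)   The server software and version; self-explanatory.

[Last-Modified] => Wed, 08 Jan 2003 23:11:55 GMT   When the resource was last modified.

[Accept-Ranges] => bytes   The server accepts byte-range requests.

[Content-Length] => 438   The size of the response body in bytes.

[Content-Type] => text/html; charset=UTF-8   The content is HTML and the character set is UTF-8.

[X-Powered-By] => PHP/5.2.17   The server-side language and its version.

[Location] => http://enenba.com/admin/   The redirect target. It normally appears only with a redirect status such as 301, and it is the address that is actually visited.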
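The headers described above can be read out of the associative array with a little care. The following is only a sketch: last_header_value() and charset_from_content_type() are hypothetical helper names, the $headers array is hand-written sample data shaped like the result of get_headers($url, 1) after one redirect, and the syntax assumes PHP 7.1 or later. When a header occurs in more than one response, get_headers($url, 1) returns its values as an array, so the helper takes the last one.

```php
<?php
// Hypothetical helper: return the value of a header from the array produced
// by get_headers($url, 1). If the header occurred several times (for example
// across redirects) its value is an array, so the last entry is used.
function last_header_value(array $headers, string $name): ?string
{
    foreach ($headers as $key => $value) {
        // Header names are case-insensitive, so compare in lower case.
        if (is_string($key) && strtolower($key) === strtolower($name)) {
            return is_array($value) ? (string) end($value) : (string) $value;
        }
    }
    return null;
}

// Hypothetical helper: extract the charset from a Content-Type value such as
// "text/html; charset=UTF-8". Returns null when no charset is present.
function charset_from_content_type(string $contentType): ?string
{
    if (preg_match('/charset\s*=\s*"?([^";\s]+)/i', $contentType, $m)) {
        return strtoupper($m[1]);
    }
    return null;
}

// Sample data (made up) in the shape of get_headers($url, 1) after a 301.
$headers = [
    0 => 'HTTP/1.1 301 Moved Permanently',
    'Location' => 'http://enenba.com/admin/',
    1 => 'HTTP/1.1 200 OK',
    'Content-Type' => 'text/html; charset=UTF-8',
];

$location = last_header_value($headers, 'Location');
$charset  = charset_from_content_type(last_header_value($headers, 'Content-Type') ?? '');
```

For scraping, the charset is useful for deciding whether the page body needs converting before it is parsed, and the Location value shows where the request actually ended up.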



 


 

 
