
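How to Analyze Search Engine Spider Logs
Search engine spider log files are a powerful resource that webmasters tend to underuse. Analyzing them shows how each search engine crawls a site's content and how its spiders behave over a period of time.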
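
As a rough illustration of what such an analysis can look like, the sketch below counts requests per spider in a web server access log. It assumes the log uses the common "combined" log format, and the spider names in `SPIDER_TOKENS` are only illustrative substrings to look for in the user-agent field; neither assumption comes from this article, so adjust both to your own server.

```python
import re
from collections import Counter

# Assumed layout (combined log format):
# ip - - [time] "request" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Illustrative user-agent substrings; extend with the spiders you care about.
SPIDER_TOKENS = ("Googlebot", "bingbot", "Baiduspider")


def count_spider_hits(lines):
    """Return a Counter mapping spider token -> number of requests."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        agent = match.group("agent").lower()
        for token in SPIDER_TOKENS:
            if token.lower() in agent:
                hits[token] += 1
                break
    return hits


# Invented sample lines, for demonstration only (not real log data).
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /a HTTP/1.1" 200 512 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.9 - - [10/Oct/2023:13:56:01 +0000] "GET /b HTTP/1.1" 404 128 '
    '"-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"',
    '203.0.113.7 - - [10/Oct/2023:13:57:12 +0000] "GET /c HTTP/1.1" 200 900 '
    '"-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]

if __name__ == "__main__":
    for spider, count in count_spider_hits(SAMPLE_LOG).items():
        print(spider, count)
```

Matching on the user-agent string alone is easy to spoof, so counts like these are a starting point for closer inspection rather than proof of a crawler's identity.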
IP address (58) | Server name | Country |
---|---|---|
35.216.239.19 | 19.239.216.35.bc.googleusercontent.com | CH |
35.216.133.80 | 80.133.216.35.bc.googleusercontent.com | CH |
35.216.216.224 | 224.216.216.35.bc.googleusercontent.com | CH |
35.216.135.199 | 199.135.216.35.bc.googleusercontent.com | CH |
35.216.195.189 | 189.195.216.35.bc.googleusercontent.com | CH |
35.216.250.8 | 8.250.216.35.bc.googleusercontent.com | CH |
35.216.204.22 | 22.204.216.35.bc.googleusercontent.com | CH |
35.216.223.217 | 217.223.216.35.bc.googleusercontent.com | CH |
35.216.166.21 | 21.166.216.35.bc.googleusercontent.com | CH |
35.216.181.8 | 8.181.216.35.bc.googleusercontent.com | CH |
35.216.214.50 | 50.214.216.35.bc.googleusercontent.com | CH |
35.216.190.15 | 15.190.216.35.bc.googleusercontent.com | CH |
35.216.236.162 | 162.236.216.35.bc.googleusercontent.com | CH |
35.216.178.21 | 21.178.216.35.bc.googleusercontent.com | CH |
35.216.180.106 | 106.180.216.35.bc.googleusercontent.com | CH |
35.216.233.65 | 65.233.216.35.bc.googleusercontent.com | CH |
35.216.140.243 | 243.140.216.35.bc.googleusercontent.com | CH |
35.216.159.17 | 17.159.216.35.bc.googleusercontent.com | ? |
35.216.133.23 | 23.133.216.35.bc.googleusercontent.com | CH |
35.216.158.209 | 209.158.216.35.bc.googleusercontent.com | CH |
35.216.234.14 | 14.234.216.35.bc.googleusercontent.com | CH |
35.216.188.224 | 224.188.216.35.bc.googleusercontent.com | CH |
35.216.209.210 | 210.209.216.35.bc.googleusercontent.com | CH |
35.216.244.245 | 245.244.216.35.bc.googleusercontent.com | CH |
35.216.250.151 | 151.250.216.35.bc.googleusercontent.com | CH |
35.216.207.137 | 137.207.216.35.bc.googleusercontent.com | CH |
35.216.169.98 | 98.169.216.35.bc.googleusercontent.com | CH |
35.216.136.158 | 158.136.216.35.bc.googleusercontent.com | CH |
35.216.194.252 | 252.194.216.35.bc.googleusercontent.com | CH |
35.216.167.199 | 199.167.216.35.bc.googleusercontent.com | CH |
35.216.200.187 | 187.200.216.35.bc.googleusercontent.com | CH |
35.216.183.188 | 188.183.216.35.bc.googleusercontent.com | CH |
35.216.240.203 | 203.240.216.35.bc.googleusercontent.com | CH |
35.216.159.69 | 69.159.216.35.bc.googleusercontent.com | CH |
35.216.186.246 | 246.186.216.35.bc.googleusercontent.com | CH |
35.216.194.110 | 110.194.216.35.bc.googleusercontent.com | CH |
35.216.146.75 | 75.146.216.35.bc.googleusercontent.com | CH |
35.216.225.22 | 22.225.216.35.bc.googleusercontent.com | CH |
35.216.197.46 | 46.197.216.35.bc.googleusercontent.com | CH |
35.216.179.192 | 192.179.216.35.bc.googleusercontent.com | CH |
35.216.192.164 | 164.192.216.35.bc.googleusercontent.com | CH |
35.216.152.171 | 171.152.216.35.bc.googleusercontent.com | CH |
35.216.203.226 | 226.203.216.35.bc.googleusercontent.com | CH |
35.216.167.104 | 104.167.216.35.bc.googleusercontent.com | CH |
35.216.218.118 | 118.218.216.35.bc.googleusercontent.com | CH |
35.216.218.233 | 233.218.216.35.bc.googleusercontent.com | CH |
35.216.141.220 | 220.141.216.35.bc.googleusercontent.com | CH |
35.216.186.88 | 88.186.216.35.bc.googleusercontent.com | CH |
35.216.152.230 | 230.152.216.35.bc.googleusercontent.com | CH |
35.216.244.73 | 73.244.216.35.bc.googleusercontent.com | CH |
35.216.234.27 | 27.234.216.35.bc.googleusercontent.com | CH |
35.216.148.67 | 67.148.216.35.bc.googleusercontent.com | CH |
35.216.253.131 | 131.253.216.35.bc.googleusercontent.com | CH |
35.216.185.223 | 223.185.216.35.bc.googleusercontent.com | CH |
35.216.247.45 | 45.247.216.35.bc.googleusercontent.com | CH |
35.216.216.74 | 74.216.216.35.bc.googleusercontent.com | CH |
35.216.172.135 | 135.172.216.35.bc.googleusercontent.com | CH |
35.216.251.227 | 227.251.216.35.bc.googleusercontent.com | CH |
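
Every server name in the table follows the same pattern: the four octets of the IP address in reverse order, followed by `.bc.googleusercontent.com`. The sketch below rebuilds that name from an IP so that log entries can be compared against it. The function names are hypothetical, and the pattern is inferred only from the rows listed above, not from any official specification.

```python
import ipaddress

# Suffix observed in every row of the table above.
HOST_SUFFIX = ".bc.googleusercontent.com"


def expected_hostname(ip: str) -> str:
    """Build the server name the table pattern predicts for an IPv4 address."""
    # IPv4Address raises ValueError for malformed input.
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + HOST_SUFFIX


def matches_table_pattern(ip: str, hostname: str) -> bool:
    """Check whether a resolved hostname fits the pattern for the given IP."""
    return hostname.lower().rstrip(".") == expected_hostname(ip)


if __name__ == "__main__":
    print(expected_hostname("35.216.239.19"))
```

To compare against live DNS, the hostname could be obtained with the standard library's `socket.gethostbyaddr(ip)[0]`; that call needs network access and can raise `socket.herror` when no reverse record exists.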
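An unknown spider or crawler may be good or bad for a site, depending on what it actually is. Webmasters therefore need to analyze the behavior of such unidentified crawlers before making a final decision. In our experience, however, spiders that neither declare their purpose nor identify themselves by name usually have something to hide, and their activity should be controlled, for example by blocking them.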
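You can block XMCO bot, or restrict its access, by setting user-agent access rules in your site's robots.txt. We recommend installing the Spider Analyser plugin to check whether the bot actually follows these rules.

    # robots.txt
    # The following rules will generally block this user agent
    User-agent: XMCO bot
    Disallow: /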
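
Before deploying a rule like the one above, it can help to confirm that it parses as intended. The following minimal sketch feeds the same two directives to Python's standard `urllib.robotparser` and asks whether a given user agent may fetch a URL. The helper name `is_allowed` and the `example.com` URL are placeholders, and the result reflects only how Python's parser reads the rules, not whether a real crawler obeys them.

```python
from urllib.robotparser import RobotFileParser

# The same directives as in the robots.txt snippet above.
ROBOTS_TXT = """\
User-agent: XMCO bot
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder domain.
    print(is_allowed("XMCO bot", "https://example.com/page"))
    print(is_allowed("Googlebot", "https://example.com/page"))
```

Because robots.txt is advisory, a crawler that ignores it still has to be identified in the access logs and blocked at the server or firewall level.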
You do not need to do this by hand: our WordPress plugin Spider Analyser can block unwanted spiders and crawlers for you.