
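How to Analyze Search Engine Spider Logs
Search engine spider log files are a powerful resource that site owners underuse. Analyzing them shows how each search engine crawls a site's content and how its spider behaves over a period of time.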
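
As a rough illustration of this kind of analysis, the sketch below counts requests per spider and HTTP status code. It assumes the server writes the common Apache/Nginx "combined" log format. The `SPIDER_TOKENS` list, the function name `count_spider_hits`, and the sample log lines (which use documentation-only IP addresses) are made up for the example and should be adapted to your own logs.

```python
import re
from collections import Counter

# Combined log format (assumed):
# ip - user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

# Hypothetical substrings used to recognise spiders in the User-Agent header.
SPIDER_TOKENS = ("Googlebot", "bingbot", "Hypefactors crawler")


def count_spider_hits(lines):
    """Count requests per (spider token, HTTP status) pair."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip lines that are not in the assumed format
        agent = match.group("agent").lower()
        for token in SPIDER_TOKENS:
            if token.lower() in agent:
                hits[(token, match.group("status"))] += 1
                break  # attribute each request to at most one spider
    return hits


# Invented sample lines; real data would come from the server's access log.
SAMPLE_LOG = [
    '192.0.2.10 - - [01/Jan/2024:00:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "Hypefactors crawler"',
    '192.0.2.10 - - [01/Jan/2024:00:00:05 +0000] "GET /about HTTP/1.1" 200 1024 "-" "Hypefactors crawler"',
    '198.51.100.7 - - [01/Jan/2024:00:01:00 +0000] "GET /missing HTTP/1.1" 404 0 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.5 - - [01/Jan/2024:00:02:00 +0000] "GET / HTTP/1.1" 200 512 "https://example.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
    'not a log line',
]

if __name__ == "__main__":
    for (spider, status), count in sorted(count_spider_hits(SAMPLE_LOG).items()):
        print(spider, status, count)
```

Matching on the User-Agent string alone is easy to spoof, so counts like these are a starting point rather than proof of a spider's identity.
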
IP address (37) | Server name | Country |
---|---|---|
35.208.4.226 | 226.4.208.35.bc.googleusercontent.com | US |
35.208.11.77 | 77.11.208.35.bc.googleusercontent.com | US |
35.208.171.33 | 33.171.208.35.bc.googleusercontent.com | US |
35.206.64.40 | 40.64.206.35.bc.googleusercontent.com | US |
35.208.166.142 | 142.166.208.35.bc.googleusercontent.com | US |
35.208.230.178 | 178.230.208.35.bc.googleusercontent.com | US |
35.206.83.134 | 134.83.206.35.bc.googleusercontent.com | US |
35.208.249.95 | 95.249.208.35.bc.googleusercontent.com | US |
35.208.158.97 | 97.158.208.35.bc.googleusercontent.com | US |
35.208.251.242 | 242.251.208.35.bc.googleusercontent.com | US |
35.209.11.140 | 140.11.209.35.bc.googleusercontent.com | US |
35.209.23.88 | 88.23.209.35.bc.googleusercontent.com | US |
35.209.43.252 | 252.43.209.35.bc.googleusercontent.com | US |
35.208.216.25 | 25.216.208.35.bc.googleusercontent.com | US |
35.206.111.229 | 229.111.206.35.bc.googleusercontent.com | US |
35.209.96.23 | 23.96.209.35.bc.googleusercontent.com | US |
35.209.252.144 | 144.252.209.35.bc.googleusercontent.com | US |
35.208.69.92 | 92.69.208.35.bc.googleusercontent.com | US |
35.206.96.46 | 46.96.206.35.bc.googleusercontent.com | US |
35.209.251.85 | 85.251.209.35.bc.googleusercontent.com | US |
35.209.64.98 | 98.64.209.35.bc.googleusercontent.com | US |
35.208.109.166 | 166.109.208.35.bc.googleusercontent.com | US |
35.206.94.70 | 70.94.206.35.bc.googleusercontent.com | US |
35.208.13.186 | 186.13.208.35.bc.googleusercontent.com | US |
35.208.85.40 | 40.85.208.35.bc.googleusercontent.com | US |
35.208.185.177 | 177.185.208.35.bc.googleusercontent.com | US |
35.208.82.100 | 100.82.208.35.bc.googleusercontent.com | US |
35.208.220.93 | 93.220.208.35.bc.googleusercontent.com | US |
35.208.245.95 | 95.245.208.35.bc.googleusercontent.com | US |
35.206.118.73 | 73.118.206.35.bc.googleusercontent.com | US |
35.208.18.114 | 114.18.208.35.bc.googleusercontent.com | US |
35.209.191.144 | 144.191.209.35.bc.googleusercontent.com | US |
35.209.183.206 | 206.183.209.35.bc.googleusercontent.com | US |
35.208.189.127 | 127.189.208.35.bc.googleusercontent.com | US |
35.208.149.230 | 230.149.208.35.bc.googleusercontent.com | US |
35.209.166.65 | 65.166.209.35.bc.googleusercontent.com | US |
35.208.200.151 | 151.200.208.35.bc.googleusercontent.com | US |
IP address (2) | Server name | Country |
---|---|---|
35.208.19.1 | 1.19.208.35.bc.googleusercontent.com | US |
35.209.174.162 | 162.174.209.35.bc.googleusercontent.com | US |
IP address (1) | Server name | Country |
---|---|---|
174.129.99.169 | ec2-174-129-99-169.compute-1.amazonaws.com | US |
54.159.109.55 | 54.159.109.55 | US |
18.206.227.252 | ec2-18-206-227-252.compute-1.amazonaws.com | US |
52.87.95.167 | ec2-52-87-95-167.compute-1.amazonaws.com | US |
54.237.123.90 | ec2-54-237-123-90.compute-1.amazonaws.com | US |
54.227.117.173 | ec2-54-227-117-173.compute-1.amazonaws.com | US |
35.171.182.19 | ec2-35-171-182-19.compute-1.amazonaws.com | US |
3.90.183.121 | ec2-3-90-183-121.compute-1.amazonaws.com | US |
107.23.184.59 | ec2-107-23-184-59.compute-1.amazonaws.com | US |
3.86.106.249 | ec2-3-86-106-249.compute-1.amazonaws.com | US |
52.90.12.243 | ec2-52-90-12-243.compute-1.amazonaws.com | US |
IP address (1) | Server name | Country |
---|---|---|
52.90.12.243 | ec2-52-90-12-243.compute-1.amazonaws.com | US |
IP address (1) | Server name | Country |
---|---|---|
52.51.86.52 | ec2-52-51-86-52.eu-west-1.compute.amazonaws.com | IE |
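Blocking is generally unnecessary, especially if you yourself benefit from search engine optimization services. However, if you are concerned about issues such as server resource usage and you do not use any of these tools, you can of course choose to block them.
You can block the Hypefactors crawler or restrict its access by setting user agent access rules in your site's robots.txt. We recommend installing the Spider Analyser plugin to check whether it actually follows these rules.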

    # robots.txt
    # The following rules will generally block this user agent
    User-agent: Hypefactors crawler
    Disallow: /

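
To sanity-check rules like these before deploying them, one option is Python's standard `urllib.robotparser` module. The sketch below parses the rules from a string and asks whether a given user agent may fetch a URL. The URL `https://example.com/any/page` and the use of the bare token `Hypefactors crawler` as the full User-Agent string are assumptions for illustration. This only shows how one parser interprets the rules; it says nothing about whether the crawler itself obeys them.

```python
from urllib.robotparser import RobotFileParser

# The rules from the robots.txt above, one directive per line.
ROBOTS_TXT = """\
# robots.txt
User-agent: Hypefactors crawler
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the parsed rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URL used only for the check.
TEST_URL = "https://example.com/any/page"

if __name__ == "__main__":
    print(is_allowed(ROBOTS_TXT, "Hypefactors crawler", TEST_URL))
    print(is_allowed(ROBOTS_TXT, "Googlebot", TEST_URL))
```

Because the file contains no `User-agent: *` group, user agents other than the named one remain unrestricted under these rules.
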
You do not need to do this manually: our WordPress plugin Spider Analyser can block unwanted spiders and crawlers.