
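How to Analyze Search Engine Spider Logs
Search engine spider log files are a powerful resource that webmasters rarely use to the full. Analyzing them shows how each search engine crawls a site's content and how its spiders behave over a period of time.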
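
As a rough illustration of this kind of log analysis, the sketch below counts requests per spider in an access log. It assumes the server writes the common "combined" log format. The sample lines, the `count_spider_hits` helper, and the short DnBCrawler user-agent string are all made up for illustration; real user-agent strings may look different.

```python
import re
from collections import Counter

# Combined Log Format (assumed):
# ip ident user [time] "request" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def count_spider_hits(lines, spider_tokens):
    """Count log lines whose user agent contains one of the given tokens."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # skip lines that are not in the assumed format
        agent = match.group("agent").lower()
        for token in spider_tokens:
            if token.lower() in agent:
                hits[token] += 1
                break  # attribute each request to at most one spider
    return hits


# Hypothetical sample data, not taken from a real log.
SAMPLE_LOG = [
    '34.78.206.105 - - [01/Jan/2024:00:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; DnBCrawler)"',
    '34.78.206.105 - - [01/Jan/2024:00:00:05 +0000] "GET /about HTTP/1.1" 200 734 "-" "Mozilla/5.0 (compatible; DnBCrawler)"',
    '192.0.2.10 - - [01/Jan/2024:00:01:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '198.51.100.7 - - [01/Jan/2024:00:02:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]

hits = count_spider_hits(SAMPLE_LOG, ["DnBCrawler", "Googlebot"])
for spider, count in hits.most_common():
    print(spider, count)
```

Matching on a substring of the user agent is deliberately simple. User agents can be spoofed, so counts like these are a starting point rather than proof of which crawler sent a request.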

IP address (173) | Server name | Country |
---|---|---|
34.78.206.105 | ? | BE |
34.140.28.151 | ? | BE |
35.240.110.95 | 95.110.240.35.bc.googleusercontent.com | BE |
34.76.247.7 | ? | BE |
34.76.53.191 | ? | BE |
35.241.148.92 | ? | BE |
104.155.11.97 | ? | BE |
34.140.160.115 | ? | BE |
35.241.131.105 | ? | BE |
34.76.109.174 | 174.109.76.34.bc.googleusercontent.com | BE |
35.195.102.22 | 22.102.195.35.bc.googleusercontent.com | BE |
104.199.69.247 | ? | BE |
35.240.87.191 | ? | ? |
34.76.213.37 | 37.213.76.34.bc.googleusercontent.com | BE |
35.187.28.25 | ? | BE |
34.79.134.93 | ? | BE |
34.79.55.44 | ? | BE |
34.140.99.15 | ? | BE |
35.241.238.47 | 47.238.241.35.bc.googleusercontent.com | BE |
34.78.127.176 | ? | BE |
34.78.246.103 | ? | BE |
34.77.38.133 | 133.38.77.34.bc.googleusercontent.com | BE |
35.233.127.212 | ? | BE |
34.76.69.47 | ? | BE |
34.77.143.14 | 14.143.77.34.bc.googleusercontent.com | BE |
34.76.33.117 | ? | BE |
35.195.63.42 | ? | BE |
35.233.38.16 | ? | BE |
23.251.141.159 | 159.141.251.23.bc.googleusercontent.com | BE |
34.77.45.209 | ? | BE |
34.79.137.245 | ? | BE |
34.140.121.182 | 182.121.140.34.bc.googleusercontent.com | ? |
34.140.38.201 | ? | BE |
35.205.39.160 | ? | BE |
35.233.64.22 | 22.64.233.35.bc.googleusercontent.com | BE |
35.195.69.239 | ? | BE |
35.187.115.138 | ? | BE |
35.187.65.98 | ? | BE |
35.187.13.71 | ? | BE |
34.79.192.249 | ? | BE |
34.79.199.23 | 23.199.79.34.bc.googleusercontent.com | BE |
34.140.42.165 | ? | BE |
34.79.102.4 | ? | BE |
34.140.23.223 | 223.23.140.34.bc.googleusercontent.com | BE |
23.251.134.30 | ? | BE |
35.195.213.196 | 196.213.195.35.bc.googleusercontent.com | BE |
35.195.97.46 | ? | BE |
35.187.82.86 | 86.82.187.35.bc.googleusercontent.com | BE |
35.195.72.183 | ? | BE |
34.22.199.247 | ? | US |
34.78.82.141 | 141.82.78.34.bc.googleusercontent.com | BE |
34.76.201.53 | ? | BE |
34.78.109.220 | ? | BE |
34.140.241.12 | ? | BE |
34.78.1.210 | ? | BE |
34.140.223.10 | 10.223.140.34.bc.googleusercontent.com | BE |
34.77.141.46 | ? | BE |
34.77.82.27 | ? | BE |
104.199.63.204 | ? | BE |
34.76.57.230 | ? | BE |
34.140.63.207 | ? | BE |
34.77.68.81 | 81.68.77.34.bc.googleusercontent.com | BE |
34.79.132.101 | ? | BE |
130.211.50.44 | ? | BE |
35.190.217.244 | 244.217.190.35.bc.googleusercontent.com | BE |
34.38.103.23 | ? | US |
34.38.40.156 | ? | US |
104.155.78.4 | ? | BE |
34.77.100.36 | ? | BE |
34.77.189.45 | ? | BE |
35.187.80.56 | ? | BE |
34.38.164.184 | ? | ? |
35.205.248.39 | ? | BE |
34.79.129.41 | ? | BE |
34.38.191.46 | 46.191.38.34.bc.googleusercontent.com | US |
34.140.211.16 | ? | BE |
34.78.127.164 | ? | BE |
34.22.128.217 | 217.128.22.34.bc.googleusercontent.com | BE |
34.78.234.189 | ? | ? |
34.38.172.49 | ? | ? |
34.78.0.149 | 149.0.78.34.bc.googleusercontent.com | BE |
35.195.110.254 | ? | BE |
34.38.95.145 | ? | BE |
34.38.193.198 | ? | BE |
35.241.234.76 | 76.234.241.35.bc.googleusercontent.com | ? |
35.205.234.156 | ? | BE |
34.79.197.90 | ? | BE |
35.233.118.238 | ? | ? |
35.241.238.117 | 117.238.241.35.bc.googleusercontent.com | BE |
34.76.119.137 | ? | BE |
34.79.185.44 | 44.185.79.34.bc.googleusercontent.com | BE |
35.233.86.212 | 212.86.233.35.bc.googleusercontent.com | BE |
34.76.159.203 | ? | BE |
35.189.220.111 | ? | BE |
34.79.80.58 | ? | BE |
34.76.60.184 | ? | BE |
34.76.252.50 | ? | BE |
34.78.100.131 | ? | BE |
34.38.88.99 | 99.88.38.34.bc.googleusercontent.com | BE |
34.38.182.160 | 160.182.38.34.bc.googleusercontent.com | BE |
34.77.41.248 | ? | BE |
34.140.200.104 | ? | BE |
34.140.2.245 | ? | BE |
35.240.53.235 | ? | BE |
34.78.48.176 | 176.48.78.34.bc.googleusercontent.com | BE |
35.195.222.101 | ? | BE |
34.34.142.161 | ? | BE |
35.233.34.219 | ? | BE |
34.38.146.254 | ? | BE |
34.22.167.194 | ? | BE |
35.241.187.71 | ? | BE |
34.140.180.169 | ? | BE |
34.38.158.75 | ? | BE |
34.76.131.50 | ? | BE |
34.76.139.24 | ? | BE |
34.38.232.151 | ? | BE |
35.187.11.48 | ? | BE |
104.155.102.208 | ? | BE |
34.77.196.127 | ? | BE |
35.189.197.128 | 128.197.189.35.bc.googleusercontent.com | BE |
34.79.79.52 | ? | BE |
34.76.13.98 | ? | BE |
35.233.72.51 | ? | BE |
35.205.226.172 | ? | BE |
34.22.241.248 | ? | BE |
34.78.25.94 | ? | BE |
35.190.215.95 | ? | BE |
34.38.157.160 | ? | BE |
35.241.140.155 | ? | BE |
34.140.1.47 | ? | BE |
34.79.86.45 | ? | BE |
34.78.165.220 | ? | BE |
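
Every resolved server name in the table follows one pattern: the IP's four octets in reverse order, followed by `.bc.googleusercontent.com`. The sketch below checks a hostname against that pattern. It is only a string comparison based on the rows shown here, not a network lookup, and the helper names `expected_gcp_hostname` and `matches_table_pattern` are hypothetical.

```python
import ipaddress

# Suffix observed in the "Server name" column of the table above.
GCP_SUFFIX = ".bc.googleusercontent.com"


def expected_gcp_hostname(ip):
    """Build the hostname the table's pattern predicts for an IPv4 address."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    octets = str(addr).split(".")
    return ".".join(reversed(octets)) + GCP_SUFFIX


def matches_table_pattern(ip, hostname):
    """Return True if hostname is the reversed-octet name for ip."""
    normalized = hostname.strip().lower().rstrip(".")
    return normalized == expected_gcp_hostname(ip)


print(expected_gcp_hostname("35.240.110.95"))
print(matches_table_pattern("34.76.109.174", "174.109.76.34.bc.googleusercontent.com"))
print(matches_table_pattern("34.78.206.105", "?"))
```

To check a live address, the hostname could come from a reverse DNS lookup such as `socket.gethostbyaddr`. That call needs network access and can fail for addresses with no PTR record, which may be why many rows show `?`.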
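
Consider blocking it. Crawlers usually download publicly available web content, which is freely accessible by default. However, if you do not want your content used for unauthorized purposes, you should block them.
You can block DnBCrawler or restrict its access by setting user-agent rules in your site's robots.txt. We recommend installing the Spider Analyser plugin to check whether it actually follows these rules.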

    # robots.txt
    # The following rules will generally block this user agent
    User-agent: DnBCrawler
    Disallow: /

You do not need to do this manually: our WordPress plugin Spider Analyser can block unwanted spiders and crawlers for you.
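
If you do edit robots.txt by hand, the sketch below is one way to sanity-check the rule before deploying it, using Python's standard `urllib.robotparser`. The `is_allowed` helper and the test path are illustrative assumptions. It only confirms how a compliant parser reads the rule; it cannot show whether DnBCrawler actually obeys it.

```python
from urllib.robotparser import RobotFileParser

# The rule from this page, as it would appear in robots.txt.
ROBOTS_TXT = """\
# robots.txt
User-agent: DnBCrawler
Disallow: /
"""


def is_allowed(robots_txt, user_agent, path):
    """Return True if the given robots.txt text lets user_agent fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# "/any/page.html" is an arbitrary example path.
for agent in ("DnBCrawler", "Googlebot"):
    print(agent, is_allowed(ROBOTS_TXT, agent, "/any/page.html"))
```

With this rule, the parser reports DnBCrawler as disallowed and leaves other agents such as Googlebot unaffected, since no other group is defined.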