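Webalizer is a web log analysis tool commonly used on Linux, and it works with nginx logs as well. After installing it I was curious about what the numbers in its reports actually mean, and you probably are too. Below are explanations of some Webalizer terms. If I have missed or misread anything, corrections from more experienced readers are very welcome ^_^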
Hits (click count)
Any request made to the server that is logged is considered a ‘hit’. The requests can be for anything: HTML pages, graphic images, audio files, CGI scripts, etc. Each valid line in the server log is counted as a hit. This number represents the total number of requests that were made to the server during the specified report period.
In other words, any request recorded in the site log counts as one hit, whether it is for HTML, an image, audio, a CGI script, or anything else. In practice, one line in the log file corresponds to one hit, and the hit total is the total number of requests the server handled during the report period.
Files (file count)
Some requests made to the server require that the server then send something back to the requesting client, such as an HTML page or graphic image. When this happens, it is considered a ‘file’ and the files total is incremented. The relationship between ‘hits’ and ‘files’ can be thought of as ‘incoming requests’ and ‘outgoing responses’.
In other words, when a client's request leads the server to send data back, such as an HTML page or an image, that counts as a file. Hits and files are therefore not the same thing: the former counts the requests coming in, the latter counts the responses the server sends out.
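To make the difference between hits and files concrete, here is a minimal Python sketch that tallies both from log lines in Common Log Format (nginx's default combined format starts with the same fields). The sample lines are made up, and treating a 200 status as "the server sent something back" is my own assumption for illustration; the text above does not spell out the exact rule Webalizer applies.

```python
import re

# Common Log Format: host ident user [time] "request" status bytes
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)


def count_hits_and_files(lines):
    """Return (hits, files) for an iterable of log lines."""
    hits = 0
    files = 0
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # not a valid log line, so it is not a hit
        hits += 1  # every valid logged request is a hit
        # Assumption for this sketch: status 200 means content was sent back.
        if match.group("status") == "200":
            files += 1
    return hits, files


# Hypothetical sample data, not taken from a real server.
SAMPLE = [
    '192.0.2.1 - - [10/Oct/2023:13:00:00 +0800] "GET /index.html HTTP/1.1" 200 1024',
    '192.0.2.1 - - [10/Oct/2023:13:00:01 +0800] "GET /logo.png HTTP/1.1" 304 0',
    '192.0.2.2 - - [10/Oct/2023:13:05:00 +0800] "GET /missing.html HTTP/1.1" 404 512',
    'garbage line',
]

print(count_hits_and_files(SAMPLE))  # (3, 1)
```

Three of the four lines are valid requests, so they are hits, but only the first one actually delivers content and is counted as a file.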
Pages (page count)
Pages are, well, pages! Generally, any HTML document, or anything that generates an HTML document, would be considered a page. This does not include the other stuff that goes into a document, such as graphic images, audio clips, etc. This number represents the number of ‘pages’ requested only, and does not include the other ‘stuff’ that is in the page. What actually constitutes a ‘page’ can vary from server to server. The default action is to treat anything with the extension ‘.htm’, ‘.html’ or ‘.cgi’ as a page. A lot of sites will probably define other extensions, such as ‘.phtml’, ‘.php3’ and ‘.pl’ as pages as well. Some people consider this number as the number of ‘pure’ hits… I’m not sure if I totally agree with that viewpoint. Some other programs (and people) refer to this as ‘Pageviews’.
In other words, a page is, well, a page! Generally an HTML document or a dynamic page (PHP, ASP, JSP and so on) is what the page count looks at. Only the page itself is counted, not the images, audio clips, JS, CSS and other resources inside it. By default only the extensions ‘.htm’, ‘.html’ and ‘.cgi’ are recognised as pages, while many sites will have pages with other extensions such as ‘.phtml’, ‘.php3’ or ‘.pl’. Put plainly, this is PV (page views) ^_^
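The extension rule described above can be sketched in a few lines of Python. This is only my illustration of the idea, not Webalizer's actual matching code: the function name `is_page` and the exact-extension comparison are assumptions, and URLs without an extension (such as `/`) get no special treatment here.

```python
import posixpath
from urllib.parse import urlsplit

# Default page extensions as listed in the text above.
DEFAULT_PAGE_EXTENSIONS = frozenset({".htm", ".html", ".cgi"})


def is_page(url, page_extensions=DEFAULT_PAGE_EXTENSIONS):
    """Return True if the URL's path ends in one of the page extensions."""
    path = urlsplit(url).path  # drop any query string or fragment
    extension = posixpath.splitext(path)[1].lower()
    return extension in page_extensions


print(is_page("/index.html?x=1"))  # True
print(is_page("/logo.png"))        # False
print(is_page("/app.php"))         # False with the defaults

# A site that serves PHP would add its own extension to the set.
php_site = DEFAULT_PAGE_EXTENSIONS | {".php"}
print(is_page("/app.php", php_site))  # True
```

This also shows why a PHP or JSP site can report a surprisingly low page count until its own extensions are declared as pages.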
Sites (site count)
Each request made to the server comes from a unique ‘site’, which can be referenced by a name or, ultimately, an IP address. The ‘sites’ number shows how many unique IP addresses made requests to the server during the reporting time period. This DOES NOT mean the number of unique individual users (real people) that visited, which is impossible to determine using just logs and the HTTP protocol (however, this number might be about as close as you will get).
In other words, requests are sent to the server by a ‘site’, which may be identified by a domain name or an IP address. The sites number is how many distinct IP addresses sent requests to the server during the report period. It does not represent the number of unique visitors (real users as opposed to bots).
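Counting sites amounts to counting distinct client addresses. A minimal Python sketch, assuming the client address is the first whitespace-separated field of each log line (as in Common Log Format) and using made-up sample lines:

```python
def count_sites(lines):
    """Return the number of distinct client addresses in the log lines."""
    hosts = set()
    for line in lines:
        fields = line.split()
        if fields:  # skip blank lines
            hosts.add(fields[0])  # assumption: first field is the client address
    return len(hosts)


# Hypothetical sample data: three requests from two addresses.
SAMPLE_LOG = [
    '192.0.2.1 - - [10/Oct/2023:13:00:00 +0800] "GET /index.html HTTP/1.1" 200 1024',
    '192.0.2.1 - - [10/Oct/2023:13:00:01 +0800] "GET /logo.png HTTP/1.1" 200 2048',
    '192.0.2.2 - - [10/Oct/2023:13:05:00 +0800] "GET /index.html HTTP/1.1" 200 1024',
]

print(count_sites(SAMPLE_LOG))  # 2
```

Several people behind one proxy or NAT address would still count as a single site here, which is exactly why this number is not a count of real users.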
Visits (visit count)
Whenever a request is made to the server from a given IP address (site), the amount of time since a previous request by the address is calculated (if any). If the time difference is greater than a pre-configured ‘visit timeout’ value (or the address has never made a request before), it is considered a ‘new visit’, and this total is incremented (both for the site, and the IP address). The default timeout value is 30 minutes (can be changed), so if a user visits your site at 1:00 in the afternoon, and then returns at 3:00, two visits would be registered.
Note: in the ‘Top Sites’ table, the visits total should be discounted on ‘Grouped’ records, and thought of as the “Minimum number of visits” that came from that grouping instead.
Note: Visits only occur on PageType requests, that is, for any request whose URL is one of the ‘page’ types defined with the PageType and PagePrefix option, and not excluded by the OmitPage option. Due to the limitation of the HTTP protocol, log rotations and other factors, this number should not be taken as absolutely accurate, rather, it should be considered a pretty close “guess”.
(The visit count here is an approximate figure: if the same IP comes back within the default 30 minutes it is counted as one visit, and if it comes back after a gap of more than 30 minutes it is counted as a new visit.) My own understanding is that this number is essentially how many times an IP address came to the site during the period.
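The timeout rule is easier to follow in code. Below is a minimal Python sketch of it, under some assumptions of mine: the input has already been filtered down to page requests, each request is an `(ip, timestamp)` pair, and the function name `count_visits` and the sample data are hypothetical. It is not Webalizer's actual implementation.

```python
from datetime import datetime, timedelta

# Default visit timeout mentioned in the text above.
VISIT_TIMEOUT = timedelta(minutes=30)


def count_visits(page_requests, timeout=VISIT_TIMEOUT):
    """Return {ip: visits} for an iterable of (ip, datetime) page requests."""
    last_seen = {}  # ip -> time of that address's previous request
    visits = {}     # ip -> number of visits so far
    for ip, when in sorted(page_requests, key=lambda request: request[1]):
        previous = last_seen.get(ip)
        # New visit: first request from this address, or the gap since its
        # previous request is greater than the timeout.
        if previous is None or when - previous > timeout:
            visits[ip] = visits.get(ip, 0) + 1
        last_seen[ip] = when
    return visits


# Hypothetical sample: one address at 1:00 pm, 1:10 pm and 3:00 pm,
# and a second address with a single request.
REQUESTS = [
    ("192.0.2.1", datetime(2023, 10, 10, 13, 0)),
    ("192.0.2.1", datetime(2023, 10, 10, 13, 10)),
    ("192.0.2.2", datetime(2023, 10, 10, 13, 5)),
    ("192.0.2.1", datetime(2023, 10, 10, 15, 0)),
]

visits = count_visits(REQUESTS)
print(visits)                # {'192.0.2.1': 2, '192.0.2.2': 1}
print(sum(visits.values()))  # 3
```

The first address matches the example in the text: the 1:10 request falls inside the 30-minute window and belongs to the same visit, while the 3:00 request starts a second one.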