Analysis and reports
Once the logs have been transferred to an SQL server and processed, they are ready for analysis. Separating log analysis from log processing gives the analyst more freedom to choose the most appropriate tools and makes it possible to develop custom ones. As mentioned before, the most flexible and universal tool is an OLAP cube. An OLAP cube can be explored via Excel, a web interface or third-party components. The advantages of this approach are hierarchical views, multidimensional tables, the ability to execute arbitrary queries, fast data retrieval and a convenient interface. The screenshot below demonstrates how an OLAP cube can be used to determine the geographical distribution of traffic generated by search engines over the last three months. This analysis took less than two minutes.
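The kind of multidimensional view a cube provides can be approximated on a small scale in a few lines of Python. The sketch below is only an illustration and not the OLAP tooling described above: the record layout (month, country, search engine), the sample values and the helper name roll_up are all assumptions.

```python
from collections import Counter

# Hypothetical, already-processed hits: (month, country, search_engine).
HITS = [
    ("2005-01", "US", "google"),
    ("2005-01", "US", "yahoo"),
    ("2005-01", "DE", "google"),
    ("2005-02", "US", "google"),
    ("2005-02", "GB", "msn"),
    ("2005-03", "DE", "google"),
]

# Positions of each dimension inside a record.
MONTH, COUNTRY, ENGINE = 0, 1, 2


def roll_up(hits, *dims):
    """Count hits grouped by the chosen dimensions (indexes into each record)."""
    return Counter(tuple(hit[d] for d in dims) for hit in hits)


# One dimension: traffic by country.
by_country = roll_up(HITS, COUNTRY)
# Two dimensions: drill down from country to search engine.
by_country_engine = roll_up(HITS, COUNTRY, ENGINE)

for (country,), count in sorted(by_country.items()):
    print(country, count)
```

Adding or removing a dimension is a matter of changing the arguments to roll_up, which is roughly what drilling down or rolling up in a cube browser does; a real cube additionally precomputes such aggregates so that they are retrieved quickly.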
Apart from analytical reports that require the active participation of an analyst, there are a number of standard reports that can be generated automatically, such as daily reports on the number of visitors, traffic from target search engines, the number of page loads and the number of initiated contacts. The average and total time visitors spend on a particular page are very useful metrics that show how much interest a particular part of the site attracts. Pages that attract many visitors and hold them for extended periods constitute so-called "zones of interest" and should be designed with great care. Pages with fewer visitors but longer viewing times often contribute more to the overall time spent on the site than pages with a higher number of hits.
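A minimal sketch of how the total and average time per page might be derived is shown below. It assumes that time on a page is estimated as the gap until the next request in the same session, which is a common approximation rather than a method stated in this document; the record layout and the function name time_per_page are hypothetical.

```python
from collections import defaultdict

# Hypothetical page views: (session_id, page, seconds_since_session_start).
VIEWS = [
    ("s1", "/home", 0), ("s1", "/services", 20), ("s1", "/contact", 200),
    ("s2", "/home", 0), ("s2", "/services", 10), ("s2", "/home", 310),
    ("s3", "/home", 0), ("s3", "/contact", 5),
]


def time_per_page(views):
    """Return {page: (total_seconds, average_seconds, timed_views)}.

    Time on a page is estimated as the gap until the next request in the
    same session. The last page of a session has no following request,
    so it cannot be timed and is skipped.
    """
    sessions = defaultdict(list)
    for session_id, page, ts in views:
        sessions[session_id].append((ts, page))

    totals = defaultdict(int)
    counts = defaultdict(int)
    for hits in sessions.values():
        hits.sort()  # order each session's requests by time
        for (ts, page), (next_ts, _next_page) in zip(hits, hits[1:]):
            totals[page] += next_ts - ts
            counts[page] += 1
    return {p: (totals[p], totals[p] / counts[p], counts[p]) for p in totals}


stats = time_per_page(VIEWS)
for page, (total, average, views) in sorted(stats.items()):
    print(f"{page}: total={total}s average={average:.1f}s timed_views={views}")
```

In this sample, /services has fewer timed views than /home but accounts for far more of the total time, which is the situation described above.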
An important component of any large site is a search subsystem that allows a visitor to find information while bypassing the main navigation. In effect, the search subsystem extends the site navigation and therefore must deliver highly relevant results. One of the most effective ways to implement the search system is to query the Google API service. With this approach, bear in mind that your site must be indexed regularly by GoogleBot; otherwise, the results will reflect outdated content. For a site with such a search subsystem, one of the most important metrics is the number of pages retrieved by GoogleBot. Knowing what visitors search for on the site helps to optimize the content structure and the navigation system by making the most popular pages easily accessible.
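The crawler metric mentioned above could be tracked along the following lines. This is a sketch under assumptions: the row layout is hypothetical, and the crawler is recognized only by the "Googlebot" token in the User-Agent header, which any client can imitate, so a production report may need stricter verification.

```python
from collections import Counter

GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER_UA = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"

# Hypothetical log rows: (date, user_agent, url).
ROWS = [
    ("2005-03-01", GOOGLEBOT_UA, "/home"),
    ("2005-03-01", GOOGLEBOT_UA, "/services"),
    ("2005-03-01", BROWSER_UA, "/home"),
    ("2005-03-02", GOOGLEBOT_UA, "/home"),
]


def googlebot_pages_per_day(rows):
    """Count requests per day whose User-Agent contains the Googlebot token."""
    counts = Counter()
    for date, user_agent, _url in rows:
        if "googlebot" in user_agent.lower():
            counts[date] += 1
    return counts


crawled = googlebot_pages_per_day(ROWS)
for date, pages in sorted(crawled.items()):
    print(date, pages)
```

A sudden drop in these daily counts would suggest that the index behind the site search is going stale.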
Standard reports are easily generated by running one or several SQL queries. A number of software products, such as Crystal Reports and MS SQL Reporting Services, can generate these reports in a variety of formats. DataArt uses MS SQL Reporting Services because it provides an integrated development environment, web-interface access and scheduled delivery of reports to specified e-mail addresses. Employees responsible for running the web site start their working day by reviewing the reports for the previous day. These reports usually include:
- Total number of visitors (number of sessions, distinct IP addresses, pages retrieved by users and crawlers)
- A list of the most popular pages
- Full paths of visitors who searched the site
- Full paths of visitors from particular countries that visited pages with web forms
- Referrers from search engines and phrases they searched for
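Two of the reports listed above can be sketched as plain SQL queries. The example below is an illustration only: an in-memory SQLite database stands in for the SQL server, and the hits table, its columns and the function names are assumptions rather than the schema actually used.

```python
import sqlite3

# In-memory SQLite database standing in for the real SQL server.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE hits (
        hit_date   TEXT,
        session_id TEXT,
        ip         TEXT,
        url        TEXT,
        is_crawler INTEGER
    )
""")
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO hits VALUES (?, ?, ?, ?, ?)",
    [
        ("2005-03-01", "s1", "10.0.0.1", "/home", 0),
        ("2005-03-01", "s1", "10.0.0.1", "/services", 0),
        ("2005-03-01", "s2", "10.0.0.2", "/home", 0),
        ("2005-03-01", "c1", "10.0.0.9", "/home", 1),
        ("2005-03-02", "s3", "10.0.0.1", "/contact", 0),
    ],
)


def daily_totals(conn, day):
    """Return (sessions, distinct IPs, pages by users, pages by crawlers)."""
    return conn.execute(
        """
        SELECT COUNT(DISTINCT CASE WHEN is_crawler = 0 THEN session_id END),
               COUNT(DISTINCT CASE WHEN is_crawler = 0 THEN ip END),
               SUM(is_crawler = 0),
               SUM(is_crawler = 1)
        FROM hits
        WHERE hit_date = ?
        """,
        (day,),
    ).fetchone()


def popular_pages(conn, day, limit=10):
    """Return the most viewed pages for the day, excluding crawler requests."""
    return conn.execute(
        """
        SELECT url, COUNT(*) AS views
        FROM hits
        WHERE hit_date = ? AND is_crawler = 0
        GROUP BY url
        ORDER BY views DESC, url
        LIMIT ?
        """,
        (day, limit),
    ).fetchall()


totals = daily_totals(conn, "2005-03-01")
top_pages = popular_pages(conn, "2005-03-01")
print(totals)
print(top_pages)
```

Queries of this shape can be pasted into a reporting tool and scheduled, so the Python wrapper here serves only to make the example runnable.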