Analysis of Apache Logs Using Hadoop and Hive
Journal: TEM JOURNAL - Technology, Education, Management, Informatics (Vol.7, No. 3)
Publication Date: 2018-08-27
Authors : Aleksandar Velinov; Zoran Zdravev
Page : 645-650
Keywords : Logs; Hadoop; Hive; analysis
Abstract
In this paper we analyze Apache web logs using the Cloudera Hadoop distribution, with Hive for querying the log data. We used publicly available web logs from the NASA Kennedy Space Center server. HDFS (Hadoop Distributed File System) was used as the container for the logs: the Apache web logs were copied to HDFS from the local file system. We analyzed the total number of hits, the number of unique IPs, the most common hosts that made requests to the NASA server in Florida, and the most common types of errors. We also examined the ratio between the number of rows in the logs and the execution time.
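The aggregations named in the abstract (total hits, unique hosts, most common hosts, most common error codes) can be illustrated with a minimal sketch. This is not the authors' Hive implementation: it is a plain-Python stand-in that assumes the logs follow the Apache Common Log Format. The names `LOG_PATTERN`, `summarize`, and `SAMPLE_LINES` are hypothetical, and the sample lines are made up for illustration rather than taken from the NASA data set.

```python
import re
from collections import Counter

# Assumed layout (Common Log Format): host ident user [timestamp] "request" status bytes
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)$'
)


def summarize(lines, top_n=3):
    """Count hits, unique hosts, top hosts, and top error codes (status >= 400)."""
    hosts = Counter()
    errors = Counter()
    hits = 0
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if match is None:
            continue  # skip lines that do not fit the assumed format
        hits += 1
        hosts[match.group("host")] += 1
        status = int(match.group("status"))
        if status >= 400:
            errors[status] += 1
    return {
        "total_hits": hits,
        "unique_hosts": len(hosts),
        "top_hosts": hosts.most_common(top_n),
        "top_errors": errors.most_common(top_n),
    }


# Made-up sample lines (hypothetical hosts and paths), for illustration only.
SAMPLE_LINES = [
    'host-a.example.com - - [01/Jul/1995:00:00:01 -0400] "GET /index.html HTTP/1.0" 200 1024',
    'host-b.example.com - - [01/Jul/1995:00:00:05 -0400] "GET /missing.html HTTP/1.0" 404 -',
    'host-a.example.com - - [01/Jul/1995:00:00:09 -0400] "GET /images/logo.gif HTTP/1.0" 200 2048',
    '192.0.2.7 - - [01/Jul/1995:00:00:12 -0400] "GET /missing.html HTTP/1.0" 404 -',
    'host-a.example.com - - [01/Jul/1995:00:00:15 -0400] "GET /private/ HTTP/1.0" 403 -',
    'not a log line',
]

if __name__ == "__main__":
    for key, value in summarize(SAMPLE_LINES).items():
        print(key, value)
```

In a Hive setting, the same counts would typically be written as `COUNT`, `COUNT(DISTINCT ...)`, and `GROUP BY` queries over a table mapped onto the log files in HDFS; the sketch only shows the logic of the aggregations on a small in-memory list.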
Last modified: 2018-09-01 00:49:06