Optimal Implementation of Closed High Utility Itemsets Discovery with Timespan Consideration
Journal: International Journal of Emerging Trends & Technology in Computer Science (IJETTCS) (Vol. 6, No. 6)
Publication Date: 2018-01-08
Authors : Bhojaraj H. Barhate; Abhijit Ingale
Page : 168-171
Keywords : closed high utility itemsets; utility mining; concise and lossless representation; data mining
Abstract
Implementing an algorithm exactly as planned is not always easy. Even when an algorithm or some other blueprint for solving the problem is ready, the implementation raises its own set of issues. This paper describes how CHUD is implemented with timespan consideration. The implementation is organized into the following main modules: the Connection Module, Threshold Value Computation, the CHUD Module, the DAHU [1] Module, the Dataset Population Module, and the Result Analysis Module. Each module has its own significance and contribution. The Connection Module connects the software to the database and to the dataset on which utility mining is to be performed. It collects information about columns such as ItemSold (which contains the item names or item IDs sold in a transaction), transaction ID, item quantity, and unit profit. Threshold value computation with timespan consideration defines the specific value that decides whether an itemset is a high utility itemset (HUI) or not. Because CHUD works on promising items only, the promising items must be extracted from the given dataset. While scanning the whole dataset, we also build an association matrix/table, which is a vertical representation of the dataset. The CHUD Module discovers closed high utility itemsets and records them in the PhaseIOutput table. DAHU then takes each itemset from the PhaseIOutput table and works from the top: it starts with the itemsets of maximum length k and discovers all high utility itemsets of length k-1 that are not present in the PhaseIOutput table. All high utility itemsets discovered by DAHU are recorded in the PhaseIIOutput table. Together, the PhaseIOutput and PhaseIIOutput tables form the final result set, which contains all the high utility itemsets for the given dataset. Let C = {X1, X2, ..., Xn}, where Xi ∈ PhaseIOutput, and D = {Y1, Y2, ..., Ym}, where Yi ∈ PhaseIIOutput. If R is the final result set, then R = C ∪ D. Finally, the result pattern module uses the data of both tables to present the items and itemsets with their utility values as a bar chart, pie chart, or any other representation style. These results support further analysis.
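The relationship between the vertical representation, the top-down step, and the final result set R = C ∪ D can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy transactions, the unit profits, the threshold of 12, and the function names `build_vertical`, `utility`, and `derive_lower` are all hypothetical, the PhaseIOutput contents are supplied by hand rather than mined by CHUD, and utilities of subsets are recomputed directly from the data for simplicity. Timespan handling is left out.

```python
from itertools import combinations

# Hypothetical toy dataset: transaction id -> {item: quantity sold}
transactions = {
    1: {"a": 1, "b": 2},
    2: {"a": 2, "b": 1, "c": 1},
    3: {"b": 3, "c": 2},
}
unit_profit = {"a": 5, "b": 2, "c": 1}  # hypothetical profit per unit
MIN_UTIL = 12                            # hypothetical threshold value


def build_vertical(transactions):
    """Vertical representation: item -> set of transaction ids containing it."""
    vertical = {}
    for tid, items in transactions.items():
        for item in items:
            vertical.setdefault(item, set()).add(tid)
    return vertical


def utility(itemset, transactions, unit_profit, vertical):
    """Sum quantity * unit profit over the transactions that contain every item."""
    tids = set.intersection(*(vertical[i] for i in itemset))
    return sum(transactions[t][i] * unit_profit[i] for t in tids for i in itemset)


def derive_lower(phase1, min_util, transactions, unit_profit, vertical):
    """One top-down step: for each itemset of length k in phase1, collect the
    high utility subsets of length k-1 that are not already in phase1."""
    found = set()
    for itemset in sorted(phase1, key=len, reverse=True):
        for sub in combinations(sorted(itemset), len(itemset) - 1):
            sub = frozenset(sub)
            if not sub or sub in phase1:
                continue  # skip the empty set and itemsets already recorded
            if utility(sub, transactions, unit_profit, vertical) >= min_util:
                found.add(sub)
    return found


vertical = build_vertical(transactions)

# C: closed high utility itemsets, entered by hand for this toy dataset.
phase1_output = {frozenset("b"), frozenset("ab"), frozenset("abc")}

# D: itemsets added by the top-down step.
phase2_output = derive_lower(phase1_output, MIN_UTIL, transactions, unit_profit, vertical)

# R = C ∪ D
result = phase1_output | phase2_output
for itemset in sorted(result, key=lambda s: (len(s), sorted(s))):
    print(sorted(itemset), utility(itemset, transactions, unit_profit, vertical))
```

On this toy data the top-down step adds only the itemset {a}, so the result set holds {a}, {b}, {a, b}, and {a, b, c}. The printed item/utility pairs are the kind of data a charting module could plot.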
Last modified: 2018-01-19 15:13:45