
Accepted Papers

  • Application of Distributed Data Mining Techniques for Email Forensics
    Salhi Dhai Eddine1, Tari Abdelkamal2 and Kechadi M-Tahar1, 1University of Bejaia, Algeria, 2University College Dublin, Ireland
    ABSTRACT
    Nowadays, email has become one of the most popular means of daily communication accessible via the Internet. In our inboxes we receive malicious emails, but we do not always know how to recognize them.
    From this observation, the idea of building an automatic checking system becomes a necessity.
    To this end, in this paper we present a new method for processing emails in order to extract the bad emails from a mail server or a user's inbox, using distributed data mining techniques. This study reduces the risk of email users being hacked and also helps mail server administrators detect bad emails and make their servers more secure.
  • Visualization Of A Synthetic Representation Of Association Rules To Assist Expert Validation
    Amdouni Hamida1 and Gammoudi Mohamed Mohsen2, 1FST, University of Tunis El Manar, 2ISAMM, University of Manouba, Tunisia
    ABSTRACT
    In order to help the expert validate association rules, several quality measures have been proposed in the literature. We distinguish two categories: objective and subjective measures. The first depends on a fixed threshold and on the structure of the data from which the rules are extracted. The second has two subcategories: the first consists in providing the expert with a tool for interactive rule exploration, presenting the rules in textual form; the second includes the use of visualization systems to facilitate the rule mining task. However, this last subcategory assumes that experts have the statistical knowledge required to interpret and validate association rules. Furthermore, statistical methods lack semantic representation and cannot help the experts during the validation process. To solve this problem, we propose in this paper a method that presents to the expert a synthetic representation of association rules as a formal conceptual graph (FCG). The FCG represents the expert's area of interest and, thanks to its semantic richness, allows them to carry out the rule mining task easily.
  • Planning Based On Classification By Induction Graph
    Sofia Benbelkacem, Baghdad Atmani and Mohamed Benamina, University of Oran, Algeria
    ABSTRACT
    In Artificial Intelligence, planning refers to a research area that aims to develop systems able to automatically generate a set of actions, produced as an integrated decision-making system through a formal procedure, known as a plan. Instead of resorting to scheduling algorithms to generate plans, we propose to use automatic learning by decision tree in order to optimize time. In this paper, we propose to build a classification model by induction graph from a learning sample containing plans, each associated with a set of descriptors whose values change from one plan to another. This model is then used to classify new cases by assigning them the appropriate plan.
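The classification step described in this abstract can be illustrated with an off-the-shelf decision-tree learner. The sketch below is not the authors' induction-graph system; the descriptor names and plan labels are invented for the example.

```python
# Illustrative sketch only -- not the authors' induction-graph implementation.
# Descriptor names and plan labels below are invented for the example.
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Each row is a case described by numeric descriptors; the target is the plan to apply.
X = [
    [2, 0, 5],   # hypothetical descriptors: resources, urgency, duration
    [1, 1, 3],
    [4, 0, 8],
    [3, 1, 2],
    [2, 1, 7],
    [5, 0, 1],
]
y = ["plan_A", "plan_B", "plan_A", "plan_B", "plan_A", "plan_C"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=0)

# Train a decision tree that maps case descriptors to a plan label.
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(X_train, y_train)

# Classify a new case by assigning it the plan predicted by the tree.
print(clf.predict([[3, 0, 6]]))
print("test accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```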
  • Transformation Rules For Building OWL Ontologies From Relational Databases
    Mohammed Reda Chbihi Louhdi1, Hicham Behja2 and Said Ouatik El Alaoui1, 1Dhar El Mehraz, Fez, 2Ecole Nationale Superieure d'Electricite et de Mecanique, Casablanca, Morocco
    ABSTRACT
    Relational Databases (RDB) are used as the backend database of most information systems. RDBs encapsulate the conceptual model and the metadata needed for ontology construction. Schema mapping is the technique used by all existing approaches for building ontologies from RDBs. However, most of those methods use poor transformation rules that prevent advanced database mining for building rich ontologies. In this paper, we propose transformation rules for building OWL ontologies from RDBs. They allow transforming all possible cases in RDBs into ontological constructs. The proposed rules are enriched by analyzing the stored data to detect disjointness and totalness constraints in hierarchies, and by calculating the participation level of tables in n-ary relations. In addition, our technique is generic; hence it can be applied to any RDB. The proposed rules were evaluated using a normalized and open RDB. The obtained ontology is richer in terms of non-taxonomic relationships.
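To give a feel for what such schema-to-ontology rules produce, here is a hedged sketch using the Python rdflib library on an assumed two-table schema (Customer, and Order with a foreign key to Customer). It covers only three elementary mappings and is not the paper's full rule set.

```python
# Hedged sketch: maps an assumed relational schema (Customer, Order with a FK
# to Customer) to OWL constructs; this is NOT the paper's full rule set.
from rdflib import Graph, Namespace
from rdflib.namespace import OWL, RDF, RDFS, XSD

EX = Namespace("http://example.org/onto#")
g = Graph()
g.bind("ex", EX)
g.bind("owl", OWL)

# Rule: each table becomes an OWL class.
for table in ("Customer", "Order"):
    g.add((EX[table], RDF.type, OWL.Class))

# Rule: each non-key column becomes a datatype property.
g.add((EX.customerName, RDF.type, OWL.DatatypeProperty))
g.add((EX.customerName, RDFS.domain, EX.Customer))
g.add((EX.customerName, RDFS.range, XSD.string))

# Rule: a foreign key (Order.customer_id -> Customer) becomes an object property.
g.add((EX.placedBy, RDF.type, OWL.ObjectProperty))
g.add((EX.placedBy, RDFS.domain, EX.Order))
g.add((EX.placedBy, RDFS.range, EX.Customer))

print(g.serialize(format="turtle"))
```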
  • Demand-Driven Asset Reutilization Analytics
    Abbas R. Ali, Dr. Pitipong J. Lin and Raul Zeng, IBM, United Kingdom
    ABSTRACT
    Manufacturers have long benefitted from reusing returned products and parts. This approach helps contain costs and allows the manufacturer to play a role in sustaining the environment. Reusing returned products and parts aids sustainability by reducing the use of raw materials, eliminating the energy needed to produce new parts, and minimizing waste. However, handling returns effectively and efficiently can be difficult if the processes and systems do not provide the visibility necessary to track, manage, and re-use the returns.
    This paper applies advanced analytics to procurement data in order to increase reutilization in new builds by optimizing the return of Equal-to-New (ETN) parts. This reduces the spend on newly bought parts for building new product units. The process involves forecasting returns and matching their supply to the demand for new builds. The complexity lies in this forecasting and matching while ensuring that a reutilization engineering process is available. The analysis also identifies high demand, value, and yield parts for Development Engineering to focus on.
  • Customer Relationship Management by Semi-Supervised Learning
    Siavash Emtiyaz and Shilan Rahmani Azar, Sardasht Branch, Islamic Azad University, Iran
    ABSTRACT
    With the increase of customer information and the rapid change of customer requirements, the need for automated intelligent systems is becoming more vital. An automated system reduces human intervention, improves the quality of the extracted information, and provides fast feedback for decision-making purposes. This study investigates the use of semi-supervised learning for the management and analysis of customer-related data warehouses and information. The idea of semi-supervised learning is to learn not only from the labeled training data, but also to exploit the structural information in additionally available unlabeled data. The proposed semi-supervised method builds a model by means of a feed-forward neural network (multi-layer perceptron) trained with the backpropagation algorithm in order to predict the category of an unknown customer (potential customer). In addition, this technique can be used with Rapid Miner tools for both labeled and unlabeled data.
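The idea of exploiting unlabeled customers can be sketched with a generic self-training wrapper around a multi-layer perceptron, as in the minimal example below; the data are synthetic and this is not the authors' Rapid Miner workflow.

```python
# Hedged sketch of the general idea (self-training an MLP on partially
# labelled customer data); not the authors' workflow, and the features and
# labels here are synthetic.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.semi_supervised import SelfTrainingClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))             # e.g. customer attributes
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # e.g. "potential customer" flag

# Pretend only ~20% of customers are labelled; mark the rest as -1 (unlabelled).
y_partial = y.copy()
unlabelled = rng.random(len(y)) > 0.2
y_partial[unlabelled] = -1

# Feed-forward network (multi-layer perceptron) trained by backpropagation,
# wrapped in a self-training loop that pseudo-labels confident predictions.
base = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0)
model = SelfTrainingClassifier(base, threshold=0.8)
model.fit(X, y_partial)

print("accuracy against all (true) labels:", model.score(X, y))
```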
  • Membership Calculation Based on Dimension Hierarchical Division
    Jinlei Wang1,2, Ping Zhou1, Xiankai Chen2 and Guanjun Zhang2, 1Guilin University of Electronic Technology, Guilin, China, 2Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, China
    ABSTRACT
    Since datasets usually contain noise, it is very helpful to detect and remove it in a preprocessing step. Fuzzy membership can measure a sample's weight: the weight should be smaller for noisy samples and larger for important ones. Therefore, appropriate sample memberships are vital. In this paper, we propose a novel approach, Membership Calculation based on Hierarchical Division (MCHD), to calculate the membership of training samples. MCHD uses the concept of dimension similarity and develops a bottom-up clustering technique to calculate sample memberships iteratively, taking into account the membership weights computed at each iteration. Experiments indicate that MCHD can effectively detect noise and remove it from the dataset. A fuzzy support vector machine based on MCHD outperforms most recently published approaches and shows better generalization ability in handling noise.
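For readers unfamiliar with how fuzzy memberships enter SVM training, the sketch below shows a generic distance-to-centre weighting fed into a weighted SVM. It only illustrates the role memberships play; it does not implement MCHD's hierarchical, dimension-wise calculation.

```python
# Generic illustration of fuzzy memberships used as SVM sample weights
# (distance-to-class-centre heuristic); this is NOT the MCHD algorithm,
# which builds memberships by hierarchical, dimension-wise clustering.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

# Simple membership: samples far from their class centre (likely noise)
# receive smaller weights, samples near the centre receive larger weights.
weights = np.empty(len(y))
for c in np.unique(y):
    idx = np.where(y == c)[0]
    centre = X[idx].mean(axis=0)
    dist = np.linalg.norm(X[idx] - centre, axis=1)
    weights[idx] = 1.0 - dist / (dist.max() + 1e-6)

clf = SVC(kernel="rbf")
clf.fit(X, y, sample_weight=weights)
print("training accuracy with fuzzy weights:", clf.score(X, y))
```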
  • Clustering Methodology Applied in MANETs for Energy Awareness and Power Optimisation
    Sajal Kanta Das, Women's Polytechnic, Hapania, Agartala, India
    ABSTRACT
    The study of Mobile Ad hoc Networks (MANETs) remains attractive due to the desire to achieve better performance and scalability. MANETs are distributed systems consisting of mobile hosts connected by multi-hop wireless links. Such systems are self-organized and facilitate communication in the network without any centralized administration. MANETs exhibit battery power constraints and suffer from scalability issues; cluster formation is therefore expensive, owing to the large number of messages passed during the cluster formation process.
    Clustering has evolved as an imperative research domain that enhances system performance, such as throughput and delay, in MANETs in the presence of both mobility and a large number of mobile terminals. In this paper, we present a clustering scheme that minimizes message overhead and congestion for cluster formation and maintenance. The algorithm is devised to be independent of the MANET routing algorithm; depending upon the context, it may be implemented in the routing layer or in higher layers. The dynamic formation of clusters helps reduce data packet overhead, node complexity and power consumption. The simulation shows that the number of clusters formed is proportional to the number of nodes in the MANET.
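The abstract does not spell out the clustering rule, so the snippet below only illustrates the general idea of electing cluster heads from local neighbourhood information, using the classic lowest-ID heuristic; it is not the paper's low-overhead scheme.

```python
# Generic lowest-ID cluster-head election, shown only to illustrate how
# clusters form from one-hop neighbourhood information; the paper's own
# low-overhead scheme is not specified in the abstract.
def lowest_id_clustering(neighbours):
    """neighbours: dict mapping node id -> set of one-hop neighbour ids."""
    heads, membership = set(), {}
    for node in sorted(neighbours):               # lowest IDs decide first
        if node in membership:
            continue
        heads.add(node)                           # node becomes a cluster head
        membership[node] = node
        for nb in neighbours[node]:               # uncovered neighbours join it
            membership.setdefault(nb, node)
    return heads, membership

topology = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
heads, membership = lowest_id_clustering(topology)
print("cluster heads:", heads)       # e.g. {1, 4} for this topology
print("membership:", membership)
```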
  • Exploiting Context in Kernel-Mapping Recommender System Algorithms
    Mustansar Ali Ghazanfar1 and Adam Prugel-Bennett2, 1University of Engineering and Technology, Pakistan, 2University of Southampton, United Kingdom
    ABSTRACT
    Recommender systems apply machine learning techniques to filter unseen information and can predict whether a user would like a given item. Kernel Mapping Recommender (KMR) algorithms have been proposed that give state-of-the-art performance. In this paper, we show how context information can be added to KMR. We consider the trusted friends of a user as their social context and show how this information can be used to provide more personalised, refined, and trustworthy recommendations. The limited set of friends, however, restricts the amount of data available for creating useful recommendations. This paper sheds light on this issue, and specifically on the number of friends necessary to obtain satisfactory recommendations. Furthermore, we describe how the proposed system might be used to generate recommendations in a distributed way rather than the traditional centralised one.
  • Extraction of Features for Predicting Patterns of Heart Disease
    Iqra Basharat, Mamuna Fatima, Ali Raza Anjum and Shoab Ahmed Khan, National University of Sciences & Technology, Pakistan
    ABSTRACT
    There is a huge amount of 'knowledge-enriched data' in hospitals, which needs to be processed in order to extract useful information from it. That knowledge-enriched data is very useful for making valuable medical decisions. However, there is a lack of effective analysis tools to discover the hidden relationships in the data. The objective of this research is to analyze heart patients' data and extract useful information that helps doctors make wise decisions. We have a huge quantity of historical unstructured patient data in the form of medical reports along with unstructured doctors' remarks. In this research, the K-means clustering technique is used to extract features for predicting patterns of heart disease. Using patients' medical profiles such as age, sex, ECG, LVEF, EVS, blood pressure and previous history, significant features are extracted (for example, male patients above 60 years with high blood pressure and hypertension having TVCAD). Based on these extracted patterns, medical practitioners can make informed decisions. The results of this study could be very constructive for medical researchers and can help medical teams and doctors suggest the best diagnosis for a disease.
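A minimal sketch of the clustering step, assuming invented feature values: K-means applied to standardized patient profiles, with cluster means inspected afterwards to look for patterns of the kind quoted in the abstract. It is illustrative only, not the authors' pipeline.

```python
# Minimal K-means sketch on synthetic patient profiles; feature values and
# cluster interpretation are invented, and this is not the authors' pipeline.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Columns: age, sex (0/1), systolic blood pressure, LVEF (%)
patients = np.array([
    [65, 1, 160, 35],
    [70, 1, 155, 30],
    [45, 0, 120, 60],
    [50, 0, 118, 58],
    [68, 1, 165, 32],
    [40, 0, 115, 62],
])

# Standardise so that age and blood pressure do not dominate the distance.
X = StandardScaler().fit_transform(patients)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("cluster labels:", kmeans.labels_)

# Inspect each cluster's mean profile to look for patterns such as
# "older patients with high blood pressure and low LVEF".
for c in range(2):
    print(c, patients[kmeans.labels_ == c].mean(axis=0))
```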
  • Improving Rule-Based Method for Arabic POS Tagging using HMM Technique
    Meryeme Hadni1, Said Alaoui Ouatik1 and Abdelmonaime Lachkar2, 1FSDM, University Sidi Mohamed Ben Abdellah (USMBA), Morocco, 2E.N.S.A, University Sidi Mohamed Ben Abdellah (USMBA), Morocco
    ABSTRACT
    A part-of-speech (POS) tagger plays an important role in natural language applications such as speech recognition, natural language parsing, information retrieval and multi-word term extraction. This study proposes the building of an efficient and accurate POS tagging technique for the Arabic language using a statistical approach. The Arabic rule-based method suffers from misclassified and unanalyzed words due to ambiguity. To overcome these two problems, we propose a Hidden Markov Model (HMM) integrated with the Arabic rule-based method. Our POS tagger generates a set of 4 POS tags: Noun, Verb, Particle, and Quranic Initial (INL). The proposed technique uses the contextual information of the words together with a variety of features that are helpful in predicting the various POS classes. To evaluate its accuracy, the proposed method has been trained and tested on the Quran Corpus containing 77,430 terms of undiacritized Classical Arabic. The experimental results demonstrate the efficiency of our method for Arabic POS tagging: the obtained accuracies are 97.6% for our method and 94.4% for the rule-based tagger.
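As background for the HMM component, a toy Viterbi decoder over the paper's four-tag set is sketched below; the words, transition and emission probabilities are made up for illustration and are not estimated from the Quran corpus.

```python
# Toy Viterbi decoder over the four-tag set mentioned above (Noun, Verb,
# Particle, INL); all probabilities are invented for illustration.
import math

tags = ["NOUN", "VERB", "PART", "INL"]
start = {"NOUN": 0.4, "VERB": 0.3, "PART": 0.2, "INL": 0.1}
trans = {t: {u: 0.25 for u in tags} for t in tags}          # uniform transitions
emit = {
    "NOUN": {"kitab": 0.6, "qala": 0.1, "fi": 0.1, "alm": 0.2},
    "VERB": {"kitab": 0.1, "qala": 0.7, "fi": 0.1, "alm": 0.1},
    "PART": {"kitab": 0.1, "qala": 0.1, "fi": 0.7, "alm": 0.1},
    "INL":  {"kitab": 0.1, "qala": 0.1, "fi": 0.1, "alm": 0.7},
}

def viterbi(words):
    # V[i][t] = best log-probability of a tag sequence ending in tag t at word i
    V = [{t: math.log(start[t]) + math.log(emit[t][words[0]]) for t in tags}]
    back = []
    for w in words[1:]:
        col, ptr = {}, {}
        for t in tags:
            best_prev = max(tags, key=lambda p: V[-1][p] + math.log(trans[p][t]))
            col[t] = V[-1][best_prev] + math.log(trans[best_prev][t]) + math.log(emit[t][w])
            ptr[t] = best_prev
        V.append(col)
        back.append(ptr)
    # Follow back-pointers from the best final tag.
    seq = [max(tags, key=lambda t: V[-1][t])]
    for ptr in reversed(back):
        seq.append(ptr[seq[-1]])
    return list(reversed(seq))

print(viterbi(["alm", "kitab", "fi", "qala"]))
```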
  • An Efficient Approach To Improve Arabic Documents Clustering Based On A New Keyphrases Extraction Algorithm
    Hanane Froud, Issam Sahmoudi and Abdelmonaime Lachkar, L.S.I.S, E.N.S.A, University Sidi Mohamed Ben Abdellah (USMBA), Fez, Morocco
    ABSTRACT
    Document clustering algorithms group a set of documents into subsets or clusters. The algorithms' goal is to create clusters that are coherent internally but clearly different from each other. In other words, documents within a cluster should be as similar as possible, and documents in one cluster should be as dissimilar as possible from documents in other clusters. This task can be strongly affected by the documents' contents: the useful words in the documents are often accompanied by a large number of noise words. Therefore, it is necessary to eliminate the noise words and keep just the useful information in order to improve the performance of document clustering algorithms.
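A rough illustration of the preprocessing idea (keeping high-weight terms and dropping noise words) using TF-IDF scores on English stand-in documents; the paper's actual keyphrase extraction algorithm for Arabic is not reproduced here.

```python
# Hedged sketch of the preprocessing idea (keep only high-weight terms,
# drop noise words) using TF-IDF scores; not the paper's keyphrase
# extraction algorithm, and the documents are English stand-ins.
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "the market prices of oil and gas rose sharply this quarter",
    "the football team won the championship match yesterday",
    "oil exports and gas production drive the regional economy",
]

vec = TfidfVectorizer(stop_words="english")
X = vec.fit_transform(docs)
terms = vec.get_feature_names_out()

# Keep the 3 highest-weighted terms of each document as its "key" terms.
for i, doc in enumerate(docs):
    row = X[i].toarray().ravel()
    top = row.argsort()[::-1][:3]
    print(i, [terms[j] for j in top])
```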
  • Cardiac Data Mining (CDM): Organization and Predictive Analytics on Biomedical (Cardiac) Data
    M. Musa Bilal, Masood Hussain, Iqra Basharat and Mamuna Fatima, College of E&ME, NUST, Pakistan
    ABSTRACT
    Data mining and data analytics have been of immense importance to many different fields as we witness the evolution of data science over recent years. Biostatistics and medical informatics have proved to be the foundation of many modern biological theories and analysis techniques. These are the fields that apply data mining practices along with statistical models to discover hidden trends in data from biological experiments or procedures on different entities. The objective of this research study is to develop a system for the efficient extraction, transformation and loading of such data from cardiologic procedure reports provided by the Armed Forces Institute of Cardiology. It also aims to devise a model for the predictive analysis and classification of this data into important classes as required by cardiologists around the world. This includes predicting patient impressions and other important features.
  • Decision Tree Clustering: A Column-Stores Tuple Reconstruction
    Tejaswini Apte1, Dr. Maya Ingle2 and Dr. A.K. Goyal2, 1Symbiosis Institute of Computer Studies and Research, India, 2Devi Ahilya VishwaVidyalaya, India
    ABSTRACT
    Column-stores have gained popularity as a promising physical design alternative for aggregate queries. However, for multi-attribute queries column-stores pay a performance penalty due to on-the-fly tuple reconstruction. This paper presents an adaptive approach for reducing tuple reconstruction time. Our approach exploits a decision tree algorithm to cluster attributes for each projection and also eliminates frequent database scanning. Experiments with TPC-H data show the effectiveness of the proposed technique.
  • The Application Of Improved Dynamic Decision Tree Based On Particle Swarm Optimization During Transportation Process
    LI Xin-hai and LI Li, Shijiazhuang University, China
    ABSTRACT
    Data mining applied during transport to the various environmental parameters and event variables yields a corresponding decision tree, with which we are able to predict the occurrence of events under certain conditions. Inspired by the No Free Lunch (NFL) theorem, the validity of the decision tree is improved using Particle Swarm Optimization. Comparison with the actual outcomes verifies the higher efficiency of the new algorithm.
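The combination of PSO with decision trees can take many forms; as one hedged illustration, the sketch below uses a small particle swarm to tune two decision-tree hyper-parameters by cross-validation on a public dataset. It is not the paper's dynamic decision-tree construction for transport data.

```python
# Minimal particle-swarm search over two decision-tree hyper-parameters,
# shown only to illustrate combining PSO with decision trees; the paper's
# dynamic decision tree for transport data is not reproduced here.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)

def fitness(pos):
    # A particle's position encodes (max_depth, min_samples_split).
    depth = int(round(np.clip(pos[0], 1, 15)))
    min_split = int(round(np.clip(pos[1], 2, 20)))
    clf = DecisionTreeClassifier(max_depth=depth, min_samples_split=min_split, random_state=0)
    return cross_val_score(clf, X, y, cv=5).mean()

n_particles, n_iter = 10, 20
pos = rng.uniform([1, 2], [15, 20], size=(n_particles, 2))
vel = np.zeros_like(pos)
pbest, pbest_val = pos.copy(), np.array([fitness(p) for p in pos])
gbest = pbest[pbest_val.argmax()].copy()

for _ in range(n_iter):
    r1, r2 = rng.random((n_particles, 1)), rng.random((n_particles, 1))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = pos + vel
    vals = np.array([fitness(p) for p in pos])
    improved = vals > pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
    gbest = pbest[pbest_val.argmax()].copy()

print("best (max_depth, min_samples_split):", np.round(gbest), "cv accuracy:", pbest_val.max())
```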
  • Mining Triadic Association Rules
    Sid Ali Selmane1, Rokia Missaoui2, Omar Boussaid1 and Fadila Bentayeb1, 1Pierre Mendes France, France, and 2rue Saint-Jean-Bosco, Gatineau (Quebec), Canada
    ABSTRACT
    The objective of this research is to extract triadic association rules from a triadic formal context K := (K1, K2, K3, Y), where K1, K2 and K3 respectively represent the sets of objects, properties (or attributes) and conditions, while Y is a ternary relation between these sets. Our approach consists in defining a procedure that maps a set of dyadic association rules into a set of triadic ones. The advantage of triadic rules over dyadic ones is that they are less numerous and more compact, and they convey a richer semantics of the data. Our approach is illustrated through an example of a ternary relation representing a set of Customers who purchase Products from Suppliers. The proposed algorithms and approach have been validated through experiments on large real datasets.
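As a hedged illustration of the setting, the snippet below flattens a tiny ternary relation (Customers, Products, Suppliers) into a dyadic context and mines simple dyadic rules by brute force; the paper's procedure for mapping those dyadic rules into compact triadic rules is not reproduced.

```python
# Illustrative only: flatten a small triadic relation (Customers, Products,
# Suppliers) into a dyadic context by pairing product and supplier, then mine
# single-item dyadic rules by brute force. The mapping from dyadic rules to
# compact triadic rules described in the paper is not implemented here.
from itertools import permutations
import pandas as pd

# Ternary relation Y as (customer, product, supplier) triples.
Y = [
    ("c1", "bread", "s1"), ("c1", "milk", "s1"),
    ("c2", "bread", "s1"), ("c2", "milk", "s1"),
    ("c3", "bread", "s2"), ("c3", "milk", "s1"),
]

# Dyadic context: rows are customers, columns are (product, supplier) pairs.
rows = sorted({c for c, _, _ in Y})
cols = sorted({f"{p}@{s}" for _, p, s in Y})
data = pd.DataFrame(False, index=rows, columns=cols)
for c, p, s in Y:
    data.loc[c, f"{p}@{s}"] = True

# Brute-force single-item dyadic rules a -> b with support and confidence.
min_support, min_confidence = 0.5, 0.7
n = len(data)
for a, b in permutations(data.columns, 2):
    supp = (data[a] & data[b]).sum() / n
    conf = supp / (data[a].sum() / n)
    if supp >= min_support and conf >= min_confidence:
        print(f"{a} -> {b}   support={supp:.2f} confidence={conf:.2f}")
```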