Selecting pivot features that connect a source domain to a target domain is an important first step in unsupervised domain adaptation (UDA). Although different strategies, such as the frequency of a feature in a domain and its mutual (or pointwise mutual) information, have been proposed in prior domain adaptation (DA) work for selecting pivots, (a) how the pivots selected using existing strategies differ and (b) how the pivot selection strategy affects the performance of a target DA task remain unclear. In this paper, we perform a comparative study covering different strategies that use both labelled data (available for the source domain only) and unlabelled data (available for both the source and target domains) for selecting pivots for UDA. Our experiments show that, in most cases, pivot selection strategies that use labelled data outperform their unlabelled counterparts, emphasising the importance of source domain labelled data for UDA. Moreover, pointwise mutual information and frequency-based pivot selection strategies obtain the best performance in two state-of-the-art UDA methods.
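As a concrete illustration of the kind of strategy compared in this abstract, below is a minimal Python sketch of a labelled PMI-based pivot scorer. The function name, the toy data, and the exact scoring rule (peak PMI between a feature and a source label, restricted to features that also occur in the target domain) are our illustrative assumptions, not the paper's implementation.

```python
import math
from collections import Counter

def pmi_pivot_scores(labelled_docs, target_docs, min_count=5):
    """Rank candidate pivot features for UDA by labelled PMI.

    labelled_docs: list of (tokens, label) pairs from the source domain.
    target_docs:   list of token lists from the unlabelled target domain.

    Each feature w is scored by its peak pointwise mutual information
    with a source label c, PMI(w, c) = log(p(w, c) / (p(w) * p(c))),
    and kept only if it also occurs in the target domain, since a
    pivot must bridge both domains. Illustrative sketch only.
    """
    n = len(labelled_docs)
    word_count = Counter()    # number of documents containing w
    label_count = Counter()   # number of documents with label c
    joint_count = Counter()   # documents containing w that have label c
    for tokens, label in labelled_docs:
        label_count[label] += 1
        for w in set(tokens):
            word_count[w] += 1
            joint_count[(w, label)] += 1

    target_vocab = {w for tokens in target_docs for w in tokens}

    scores = {}
    for w, n_w in word_count.items():
        if n_w < min_count or w not in target_vocab:
            continue          # pivots must be common to both domains
        scores[w] = max(
            math.log((joint_count[(w, c)] / n) / ((n_w / n) * (n_c / n)))
            for c, n_c in label_count.items()
            if joint_count[(w, c)] > 0
        )
    return sorted(scores.items(), key=lambda kv: -kv[1])

# Toy usage: domain-bridging sentiment words rank highest.
src = [("great excellent battery".split(), "pos"),
       ("poor terrible battery".split(), "neg"),
       ("great excellent acting".split(), "pos"),
       ("terrible plot".split(), "neg")]
tgt = ["great excellent film".split(), "terrible boring film".split()]
print(pmi_pivot_scores(src, tgt, min_count=1))
```

An unlabelled frequency-based strategy of the kind the paper also compares would simply replace the PMI score with something like min(source count, target count), ignoring the source labels entirely.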
Graph mining is an important research area within the domain of data mining, concentrating on the identification of frequent subgraphs within graph data sets. The research goals are directed at (i) effective mechanisms for generating candidate subgraphs without generating duplicates and (ii) how best to process the generated candidate subgraphs so as to identify the desired frequent subgraphs in a way that is computationally efficient and procedurally effective. This paper presents a survey of current research in the field of frequent subgraph mining and proposes solutions to address the main research issues.
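To make the support-counting goal concrete, here is a minimal Python sketch restricted to single-edge subgraphs. Real frequent-subgraph miners grow larger candidates and require subgraph isomorphism tests, so this toy (with an encoding and function name of our own choosing) only illustrates the duplicate-free counting idea, not any surveyed algorithm.

```python
from collections import Counter

def frequent_edges(graphs, min_support):
    """Minimal frequent-subgraph counting, limited to one-edge subgraphs.

    graphs: list of graphs, each a set of (label_u, edge_label, label_v)
    triples for undirected edges. Support is the number of graphs that
    contain the pattern at least once. Sorting the endpoint labels gives
    a canonical form, the single-edge analogue of the duplicate-free
    candidate generation discussed in the abstract above.
    """
    support = Counter()
    for g in graphs:
        seen = set()
        for (u, e, v) in g:
            canon = (u, e, v) if u <= v else (v, e, u)  # canonical form
            seen.add(canon)
        support.update(seen)      # count each pattern once per graph
    return {edge: s for edge, s in support.items() if s >= min_support}

# Toy usage: two molecule-like graphs sharing a C-C bond.
g1 = {("C", "single", "C"), ("C", "single", "O")}
g2 = {("C", "single", "C"), ("C", "double", "O")}
print(frequent_edges([g1, g2], min_support=2))  # {('C', 'single', 'C'): 2}
```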
Data mining has become a well-established discipline within the domains of artificial intelligence (AI) and knowledge engineering (KE). It has its roots in machine learning and statistics, but encompasses other areas of computer science. It has received much interest over the last decade as advances in computer hardware have provided the processing power needed for large-scale data mining. Unlike other innovations in AI and KE, data mining can be argued to be an application rather than a technology, and can thus be expected to remain topical for the foreseeable future. This paper presents a brief review of the history of data mining, up to the present day, together with some insights into future directions.
In this paper a number of alternative strategies for distributed/parallel association rule mining are investigated. The methods examined make use of the T-tree, a data structure previously introduced by the authors for organizing the sets of attributes whose support is being counted. We consider six different approaches, representing different ways of parallelizing the basic Apriori-T algorithm that we use; the methods focus on different mechanisms for partitioning the data between processes and for reducing the message-passing overhead. Both ‘horizontal’ (data distribution) and ‘vertical’ (candidate distribution) partitioning strategies are considered, including a vertical partitioning algorithm (DATA-VP) that we have developed to exploit the structure of the T-tree. We present experimental results examining the performance of the methods in implementations using JavaSpaces. We conclude that, in a JavaSpaces environment, candidate distribution strategies offer better performance than those that distribute the original dataset because of their lower messaging overhead; the results for the DATA-VP algorithm are especially encouraging.
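The candidate-distribution idea can be illustrated with a short Python sketch (our own simplification, not the authors' JavaSpaces/T-tree implementation): candidate itemsets are partitioned ‘vertically’ across worker processes, each worker counts support for its share against the dataset, and only the small candidate sets and their counts cross process boundaries.

```python
from itertools import combinations
from multiprocessing import Pool

# Workers re-import this module, so each holds the full (small) dataset;
# only candidates and counts are exchanged between processes, mirroring
# the low messaging overhead of candidate distribution. A sketch only:
# the paper's DATA-VP additionally exploits the T-tree's structure.
DATASET = [frozenset(t) for t in
           [{"a", "b", "c"}, {"a", "c"}, {"a", "b"},
            {"b", "c"}, {"a", "b", "c"}]]

def count_support(candidates):
    """Count how many transactions contain each candidate itemset."""
    return {c: sum(1 for t in DATASET if c <= t) for c in candidates}

def parallel_support(candidates, n_workers=2):
    """Partition the candidate set across worker processes."""
    chunks = [candidates[i::n_workers] for i in range(n_workers)]
    counts = {}
    with Pool(n_workers) as pool:
        for partial in pool.map(count_support, chunks):
            counts.update(partial)
    return counts

if __name__ == "__main__":
    pairs = [frozenset(c) for c in combinations("abc", 2)]
    print(parallel_support(pairs))  # each pair has support 3 here
```

A ‘horizontal’ (data distribution) strategy would instead split DATASET across workers and give each worker all the candidates, at the cost of merging partial counts for every candidate.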
PKDD 2001, the 5th European Conference on Principles of Data Mining and Knowledge Discovery, was held in Freiburg, Baden-Württemberg, Germany (Monday 3 to Thursday 7 September 2001), co-located with the 12th European Conference on Machine Learning (ECML 2001). The proceedings comprised two volumes, one for PKDD (De Raedt & Siebes, 2001) and one for ECML (De Raedt & Flach, 2001), and form part of the Springer Lecture Notes in Artificial Intelligence (LNAI) series. The conference was held in the university buildings in the centre of the old town. Freiburg and the surrounding area were for many years ruled by Habsburg Austria, and the university was thus described to us as being one of the oldest Austrian universities.
Knowledge-Based (KB) technology is being applied to complex problem solving and to safety- and business-critical tasks in many application domains. Concerns have naturally arisen as to the dependability of Knowledge-Based Systems (KBS). As with any software, attention must be paid to quality and safety throughout the development of a KBS, and rigorous Verification and Validation (V&V) techniques must be employed. Research in V&V of KBSs has emerged as a distinct field only in the last decade; it is intended to address the quality and safety aspects of KBSs and to provide such applications with the same degree of dependability as conventional applications. In recent years, V&V of KBSs has been the topic of annual workshops associated with the main AI conferences, such as AAAI, IJCAI and ECAI.
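One family of V&V techniques studied in this field is static anomaly detection over a rule base. The following Python sketch, with a hypothetical rule encoding and function name of our own choosing, flags two textbook anomalies: redundant and conflicting rule pairs. It is an illustration of the general idea, not any particular tool from the literature.

```python
def find_anomalies(rules):
    """Flag two classic rule-base anomalies checked during KBS
    verification: redundant rules (same conditions, same conclusion)
    and conflicting rules (same conditions, contradictory conclusions).

    rules: list of (conditions, conclusion), where conditions is a set
    of literals and conclusion is a literal; 'not x' negates 'x'.
    Sketch only; real V&V tools also check circularity, subsumption
    and unreachable conclusions.
    """
    def negate(lit):
        return lit[4:] if lit.startswith("not ") else "not " + lit

    anomalies = []
    for i, (cond_i, concl_i) in enumerate(rules):
        for j in range(i + 1, len(rules)):
            cond_j, concl_j = rules[j]
            if cond_i == cond_j:
                if concl_i == concl_j:
                    anomalies.append(("redundant", i, j))
                elif concl_j == negate(concl_i):
                    anomalies.append(("conflicting", i, j))
    return anomalies

# Toy usage on a three-rule knowledge base.
rule_base = [({"fever", "rash"}, "measles"),
             ({"fever", "rash"}, "not measles"),
             ({"fever", "rash"}, "measles")]
print(find_anomalies(rule_base))
# [('conflicting', 0, 1), ('redundant', 0, 2), ('conflicting', 1, 2)]
```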