Loading…

Efficient attribute-oriented generalization for knowledge discovery from large databases

We present GDBR (Generalize DataBase Relation) and FIGR (Fast, Incremental Generalization and Regeneralization), two enhancements of Attribute Oriented Generalization, a well known knowledge discovery from databases technique. GDBR and FIGR are both O(n) and, as such, are optimal. GDBR is an online...

Full description

Saved in:

Bibliographic Details
Published in:	IEEE transactions on knowledge and data engineering 1998-03, Vol.10 (2), p.193-208
Main Authors:	Carter, C.L., Hamilton, H.J.
Format:	Article
Language:	English
Subjects:	Algorithmics. Computability. Computer arithmetics Applied sciences Artificial intelligence Computer science Computer science control theory systems Computer Society Data mining Exact sciences and technology Government Information systems. Data bases Intelligent robots Learning and adaptive systems Memory organisation. Data processing Partitioning algorithms Radio access networks Runtime Software Testing Theoretical computing
Citations:	Items that this one cites Items that cite this one
Online Access:	Get full text
Tags:	Add Tag No Tags, Be the first to tag this record!

cited_by	cdi_FETCH-LOGICAL-c368t-a190411cf8daaa967611d3773af2c8ab378dfac5de10a9cb7f83da975fb477813
cites	cdi_FETCH-LOGICAL-c368t-a190411cf8daaa967611d3773af2c8ab378dfac5de10a9cb7f83da975fb477813
container_end_page	208
container_issue	2
container_start_page	193
container_title	IEEE transactions on knowledge and data engineering
container_volume	10
creator	Carter, C.L. Hamilton, H.J.
description	We present GDBR (Generalize DataBase Relation) and FIGR (Fast, Incremental Generalization and Regeneralization), two enhancements of Attribute Oriented Generalization, a well known knowledge discovery from databases technique. GDBR and FIGR are both O(n) and, as such, are optimal. GDBR is an online algorithm and requires only a small, constant amount of space. FIGR also requires a constant amount of space that is generally reasonable, although under certain circumstances, may grow large. FIGR is incremental, allowing changes to the database to be reflected in the generalization results without rereading input data. FIGR also allows fast regeneralization to both higher and lower levels of generality without rereading input. We compare GDBR and FIGR to two previous algorithms, LCHR and AOI, which are O(n log n) and O(np), respectively, where n is the number of input tuples and p the number of tuples in the generalized relation. Both require O(n) space that, for large input, causes memory problems. We implemented all four algorithms and ran empirical tests, and we found that GDBR and FIGR are faster. In addition, their runtimes increase only linearly as input size increases, while the runtimes of LCHR and AOI increase greatly when input size exceeds memory limitations.
doi_str_mv	10.1109/69.683752
format	article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_27515195</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>683752</ieee_id><sourcerecordid>28276485</sourcerecordid><originalsourceid>FETCH-LOGICAL-c368t-a190411cf8daaa967611d3773af2c8ab378dfac5de10a9cb7f83da975fb477813</originalsourceid><addsrcrecordid>eNqFkU1LAzEQhhdRsFYPXj3tQQQPWzfJZpMcpdQPKHhR8LbMJpMS3W5qslXqr3fXll4LAzPMPPMyw5sklySfEJKru1JNSskEp0fJiHAuM0oUOe7rvCBZwQpxmpzF-JHnuRSSjJL3mbVOO2y7FLouuHrdYebD0ECTLrDFAI37hc75NrU-pJ-t_2nQLDA1Lmr_jWGT2uCXaQNhaEIHNUSM58mJhSbixS6Pk7eH2ev0KZu_PD5P7-eZZqXsMiCqv4xoKw0AqFKUhBgmBANLtYSaCWksaG6Q5KB0LaxkBpTgti5E_wEbJzdb3VXwX2uMXbXs78KmgRb9OlZUUaaYLA6DkoqykPwwKDjhRA3g7RbUwccY0Far4JYQNhXJq8GNquzj342evd6JQtTQ2ACtdnG_QGlJe4t67GqLOUTcT3caf69NksA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>27515195</pqid></control><display><type>article</type><title>Efficient attribute-oriented generalization for knowledge discovery from large databases</title><source>IEEE Electronic Library (IEL) Journals</source><creator>Carter, C.L. ; Hamilton, H.J.</creator><creatorcontrib>Carter, C.L. ; Hamilton, H.J.</creatorcontrib><description>We present GDBR (Generalize DataBase Relation) and FIGR (Fast, Incremental Generalization and Regeneralization), two enhancements of Attribute Oriented Generalization, a well known knowledge discovery from databases technique. GDBR and FIGR are both O(n) and, as such, are optimal. GDBR is an online algorithm and requires only a small, constant amount of space. FIGR also requires a constant amount of space that is generally reasonable, although under certain circumstances, may grow large. FIGR is incremental, allowing changes to the database to be reflected in the generalization results without rereading input data. FIGR also allows fast regeneralization to both higher and lower levels of generality without rereading input. We compare GDBR and FIGR to two previous algorithms, LCHR and AOI, which are O(n log n) and O(np), respectively, where n is the number of input tuples and p the number of tuples in the generalized relation. Both require O(n) space that, for large input, causes memory problems. We implemented all four algorithms and ran empirical tests, and we found that GDBR and FIGR are faster. In addition, their runtimes increase only linearly as input size increases, while the runtimes of LCHR and AOI increase greatly when input size exceeds memory limitations.</description><identifier>ISSN: 1041-4347</identifier><identifier>EISSN: 1558-2191</identifier><identifier>DOI: 10.1109/69.683752</identifier><identifier>CODEN: ITKEEH</identifier><language>eng</language><publisher>New York, NY: IEEE</publisher><subject>Algorithmics. Computability. Computer arithmetics ; Applied sciences ; Artificial intelligence ; Computer science ; Computer science; control theory; systems ; Computer Society ; Data mining ; Exact sciences and technology ; Government ; Information systems. Data bases ; Intelligent robots ; Learning and adaptive systems ; Memory organisation. Data processing ; Partitioning algorithms ; Radio access networks ; Runtime ; Software ; Testing ; Theoretical computing</subject><ispartof>IEEE transactions on knowledge and data engineering, 1998-03, Vol.10 (2), p.193-208</ispartof><rights>1998 INIST-CNRS</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c368t-a190411cf8daaa967611d3773af2c8ab378dfac5de10a9cb7f83da975fb477813</citedby><cites>FETCH-LOGICAL-c368t-a190411cf8daaa967611d3773af2c8ab378dfac5de10a9cb7f83da975fb477813</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/683752$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,27900,27901,54770</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=2262104$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>Carter, C.L.</creatorcontrib><creatorcontrib>Hamilton, H.J.</creatorcontrib><title>Efficient attribute-oriented generalization for knowledge discovery from large databases</title><title>IEEE transactions on knowledge and data engineering</title><addtitle>TKDE</addtitle><description>We present GDBR (Generalize DataBase Relation) and FIGR (Fast, Incremental Generalization and Regeneralization), two enhancements of Attribute Oriented Generalization, a well known knowledge discovery from databases technique. GDBR and FIGR are both O(n) and, as such, are optimal. GDBR is an online algorithm and requires only a small, constant amount of space. FIGR also requires a constant amount of space that is generally reasonable, although under certain circumstances, may grow large. FIGR is incremental, allowing changes to the database to be reflected in the generalization results without rereading input data. FIGR also allows fast regeneralization to both higher and lower levels of generality without rereading input. We compare GDBR and FIGR to two previous algorithms, LCHR and AOI, which are O(n log n) and O(np), respectively, where n is the number of input tuples and p the number of tuples in the generalized relation. Both require O(n) space that, for large input, causes memory problems. We implemented all four algorithms and ran empirical tests, and we found that GDBR and FIGR are faster. In addition, their runtimes increase only linearly as input size increases, while the runtimes of LCHR and AOI increase greatly when input size exceeds memory limitations.</description><subject>Algorithmics. Computability. Computer arithmetics</subject><subject>Applied sciences</subject><subject>Artificial intelligence</subject><subject>Computer science</subject><subject>Computer science; control theory; systems</subject><subject>Computer Society</subject><subject>Data mining</subject><subject>Exact sciences and technology</subject><subject>Government</subject><subject>Information systems. Data bases</subject><subject>Intelligent robots</subject><subject>Learning and adaptive systems</subject><subject>Memory organisation. Data processing</subject><subject>Partitioning algorithms</subject><subject>Radio access networks</subject><subject>Runtime</subject><subject>Software</subject><subject>Testing</subject><subject>Theoretical computing</subject><issn>1041-4347</issn><issn>1558-2191</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>1998</creationdate><recordtype>article</recordtype><recordid>eNqFkU1LAzEQhhdRsFYPXj3tQQQPWzfJZpMcpdQPKHhR8LbMJpMS3W5qslXqr3fXll4LAzPMPPMyw5sklySfEJKru1JNSskEp0fJiHAuM0oUOe7rvCBZwQpxmpzF-JHnuRSSjJL3mbVOO2y7FLouuHrdYebD0ECTLrDFAI37hc75NrU-pJ-t_2nQLDA1Lmr_jWGT2uCXaQNhaEIHNUSM58mJhSbixS6Pk7eH2ev0KZu_PD5P7-eZZqXsMiCqv4xoKw0AqFKUhBgmBANLtYSaCWksaG6Q5KB0LaxkBpTgti5E_wEbJzdb3VXwX2uMXbXs78KmgRb9OlZUUaaYLA6DkoqykPwwKDjhRA3g7RbUwccY0Far4JYQNhXJq8GNquzj342evd6JQtTQ2ACtdnG_QGlJe4t67GqLOUTcT3caf69NksA</recordid><startdate>19980301</startdate><enddate>19980301</enddate><creator>Carter, C.L.</creator><creator>Hamilton, H.J.</creator><general>IEEE</general><general>IEEE Computer Society</general><scope>RIA</scope><scope>RIE</scope><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>7TB</scope><scope>FR3</scope></search><sort><creationdate>19980301</creationdate><title>Efficient attribute-oriented generalization for knowledge discovery from large databases</title><author>Carter, C.L. ; Hamilton, H.J.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c368t-a190411cf8daaa967611d3773af2c8ab378dfac5de10a9cb7f83da975fb477813</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>1998</creationdate><topic>Algorithmics. Computability. Computer arithmetics</topic><topic>Applied sciences</topic><topic>Artificial intelligence</topic><topic>Computer science</topic><topic>Computer science; control theory; systems</topic><topic>Computer Society</topic><topic>Data mining</topic><topic>Exact sciences and technology</topic><topic>Government</topic><topic>Information systems. Data bases</topic><topic>Intelligent robots</topic><topic>Learning and adaptive systems</topic><topic>Memory organisation. Data processing</topic><topic>Partitioning algorithms</topic><topic>Radio access networks</topic><topic>Runtime</topic><topic>Software</topic><topic>Testing</topic><topic>Theoretical computing</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Carter, C.L.</creatorcontrib><creatorcontrib>Hamilton, H.J.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 1998–Present</collection><collection>IEEE Xplore</collection><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Mechanical & Transportation Engineering Abstracts</collection><collection>Engineering Research Database</collection><jtitle>IEEE transactions on knowledge and data engineering</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Carter, C.L.</au><au>Hamilton, H.J.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Efficient attribute-oriented generalization for knowledge discovery from large databases</atitle><jtitle>IEEE transactions on knowledge and data engineering</jtitle><stitle>TKDE</stitle><date>1998-03-01</date><risdate>1998</risdate><volume>10</volume><issue>2</issue><spage>193</spage><epage>208</epage><pages>193-208</pages><issn>1041-4347</issn><eissn>1558-2191</eissn><coden>ITKEEH</coden><abstract>We present GDBR (Generalize DataBase Relation) and FIGR (Fast, Incremental Generalization and Regeneralization), two enhancements of Attribute Oriented Generalization, a well known knowledge discovery from databases technique. GDBR and FIGR are both O(n) and, as such, are optimal. GDBR is an online algorithm and requires only a small, constant amount of space. FIGR also requires a constant amount of space that is generally reasonable, although under certain circumstances, may grow large. FIGR is incremental, allowing changes to the database to be reflected in the generalization results without rereading input data. FIGR also allows fast regeneralization to both higher and lower levels of generality without rereading input. We compare GDBR and FIGR to two previous algorithms, LCHR and AOI, which are O(n log n) and O(np), respectively, where n is the number of input tuples and p the number of tuples in the generalized relation. Both require O(n) space that, for large input, causes memory problems. We implemented all four algorithms and ran empirical tests, and we found that GDBR and FIGR are faster. In addition, their runtimes increase only linearly as input size increases, while the runtimes of LCHR and AOI increase greatly when input size exceeds memory limitations.</abstract><cop>New York, NY</cop><pub>IEEE</pub><doi>10.1109/69.683752</doi><tpages>16</tpages></addata></record>
fulltext	fulltext
identifier	ISSN: 1041-4347
ispartof	IEEE transactions on knowledge and data engineering, 1998-03, Vol.10 (2), p.193-208
issn	1041-4347 1558-2191
language	eng
recordid	cdi_proquest_miscellaneous_27515195
source	IEEE Electronic Library (IEL) Journals
subjects	Algorithmics. Computability. Computer arithmetics Applied sciences Artificial intelligence Computer science Computer science control theory systems Computer Society Data mining Exact sciences and technology Government Information systems. Data bases Intelligent robots Learning and adaptive systems Memory organisation. Data processing Partitioning algorithms Radio access networks Runtime Software Testing Theoretical computing
title	Efficient attribute-oriented generalization for knowledge discovery from large databases
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-24T16%3A56%3A45IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Efficient%20attribute-oriented%20generalization%20for%20knowledge%20discovery%20from%20large%20databases&rft.jtitle=IEEE%20transactions%20on%20knowledge%20and%20data%20engineering&rft.au=Carter,%20C.L.&rft.date=1998-03-01&rft.volume=10&rft.issue=2&rft.spage=193&rft.epage=208&rft.pages=193-208&rft.issn=1041-4347&rft.eissn=1558-2191&rft.coden=ITKEEH&rft_id=info:doi/10.1109/69.683752&rft_dat=%3Cproquest_cross%3E28276485%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c368t-a190411cf8daaa967611d3773af2c8ab378dfac5de10a9cb7f83da975fb477813%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=27515195&rft_id=info:pmid/&rft_ieee_id=683752&rfr_iscdi=true