A4 Peer-reviewed article in conference proceedings

Data Refining for Text Mining Process in Aviation Safety Data




Authors: Sjöblom Olli

Editors: Claude Godart, Norbert Gronau, Sushil Sharma, Gérôme Canals

Established conference name: Conference on e-Business, e-Services, and e-Society

Publication year: 2009

Journal: IFIP Advances in Information and Communication Technology

Book title: Software Services for e-Business and e-Society

Series name: IFIP Advances in Information and Communication Technology

Number in series: 305

Volume: September 2009

Start page: 415

End page: 426

ISBN: 978-3-642-04279-9

eISBN: 978-3-642-04280-5

ISSN: 1868-4238

DOI: https://doi.org/10.1007/978-3-642-04280-5_32


Abstract

Successful data mining is an iterative process during which the data is refined and adjusted to achieve more accurate mining results. The most important tools in the text mining context are a list of stop words and a list of synonyms. The size and richness of these lists depend on the structure of the language of the text to be mined. English, for example, is an “easy” language for search technologies because, with a couple of exceptions, the stem of a word is not inflected and terms are formed from several words instead of compounds. Morphologically rich languages like Finnish, by contrast, require special attention to these definitions. This chapter introduces the need for, and the realisation of, refining the source data for a successful data mining process, based on the results achieved in the first mining round.
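
The role of the two lists named in the abstract can be illustrated with a minimal Python sketch. It is not the method described in the chapter: the function name `refine`, the word lists, and the sample sentence are all hypothetical and chosen only to show how a stop-word list filters tokens and a synonym list maps variants onto one preferred term.

```python
import re

# Hypothetical lists for illustration only; they are not taken from the chapter.
STOP_WORDS = {"the", "a", "an", "of", "to", "was", "and", "on", "during"}
SYNONYMS = {"airplane": "aircraft", "plane": "aircraft", "rwy": "runway"}


def refine(text):
    """Lowercase the text, drop stop words, and map synonyms to one term."""
    # Tokenise on runs of letters, including the Finnish letters ä, ö and å.
    tokens = re.findall(r"[a-zäöå]+", text.lower())
    return [SYNONYMS.get(token, token) for token in tokens if token not in STOP_WORDS]


print(refine("The plane returned to the RWY during taxi"))
# → ['aircraft', 'returned', 'runway', 'taxi']
```

In an iterative process of the kind the abstract describes, both lists would presumably be extended after each mining round. A lookup this simple matches exact word forms only, so inflected or compound forms, which are common in Finnish, would need additional handling.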




Last updated on 2024-26-11 at 22:04