Methodologies for time series prediction and missing value imputation

Antti Sorjamaa

    Research output: Thesis / Doctoral Thesis / Collection of Articles

    Abstract

    The amount of data collected worldwide is growing continuously. More sophisticated measuring instruments and increases in computer processing power produce more and more data, which demands more capacity for collection, transmission and storage. Even though computers are faster, large databases also need good and accurate methodologies to be useful in practice. Some techniques cannot feasibly be applied to very large databases, or cannot provide the necessary accuracy. As the title indicates, this thesis focuses on two problems encountered with databases: time series prediction and missing value imputation. The first is a function approximation and regression problem, but it can in some cases also be formulated as a classification task. Accurate prediction of future values depends heavily not only on a good model that is well trained and validated, but also on preprocessing, input variable selection or projection, and the choice of output approximation strategy. The importance of all these choices increases as the prediction horizon is extended further into the future. The second focus area deals with missing values in a database. Missing values can be a nuisance, but they can also prohibit the use of certain methodologies and degrade the performance of others. Hence, missing value imputation is a necessary part of preprocessing a database. The imputation has to be done carefully in order to retain the integrity of the database and to avoid inserting unwanted artifacts that would complicate the final data analysis. Furthermore, even though accuracy is always the main requirement for a good methodology, computational time has to be considered alongside it. This thesis presents a wide variety of strategies for output approximation and variable processing for time series prediction, together with a detailed presentation of new methodologies and tools for solving the problem of missing values. The strategies and methodologies are compared against state-of-the-art ones and shown to be accurate and useful in practice.
    Translated title of the contribution: Methodologies for time series prediction and missing value imputation
    Original language: English
    Qualification: Doctor's degree
    Awarding Institution
    • Aalto University
    Supervisors/Advisors
    • Simula, Olli, Supervising Professor
    • Lendasse, Amaury, Thesis Advisor
    Publisher
    Print ISBNs: 978-952-60-3452-2
    Electronic ISBNs: 978-952-60-3453-9
    Publication status: Published - 2010
    MoE publication type: G5 Doctoral dissertation (article)

    Keywords

    • Time Series Prediction
    • Missing Values
    • Large Databases
    • Prediction Strategy
    • Variable Selection
    • Nonlinear Imputation
    • EOF Pruning
    • Ensemble of SOMs
