Name: Bruno Légora Souza da Silva
Type: PhD thesis
Publication date: 18/03/2022
Advisor:

Name Role
Patrick Marques Ciarelli Advisor *

Examining board:

Name Role
Carmelo José Albanez Bastos Filho External Examiner *
Daniel Cruz Cavaliéri External Examiner *
Luiz Alberto Pinto External Examiner *
Patrick Marques Ciarelli Advisor *
Thomas Walter Rauber External Examiner *

Summary: Artificial Neural Networks have been applied to classification and regression problems,
and their popularity has grown, mainly since the backpropagation algorithm was proposed for
training them from datasets. In recent years, the volume of generated data and the
increased processing power of computers and Graphics Processing Units (GPUs) have enabled
the training of large (deep) architectures, which can extract information from, and make
predictions for, complex problems but are usually computationally expensive. In contrast, fast
algorithms have been proposed to train small (shallow) architectures, such as the
single-hidden-layer feedforward network (SLFN), which are nonetheless capable of approximating
any continuous function. One of them is the Extreme Learning Machine (ELM), which has a fast,
closed-form solution and has been applied in a wide range of applications, outperforming other
methods such as backpropagation-trained neural networks and Support Vector Machines
(SVMs). Variants of ELM were proposed to address problems such as underfitting, overfitting
and outliers, but they still struggle with large datasets and/or when more neurons
are required to extract more information. Stacked ELM (S-ELM) was therefore proposed:
it stacks ELM-trained modules, passing information from each module to the next,
which improves results on large datasets. However, it has limitations regarding memory
consumption and is not suitable for problems with a single output, such as typical
regression tasks. Another stacked method, the Deep Stacked Network (DSN), has problems
with training time and memory usage but does not share the application limitation
of Stacked ELM. Therefore, this work proposes combining the
DSN architecture with the ELM and Kernel ELM algorithms to obtain a model
composed of small modules, with fast training and reduced memory usage, yet
capable of achieving performance similar to that of models with more hidden neurons. We
also propose a variation of this model for data that arrives gradually, a setting known as
incremental learning (or online learning in the ELM context). Extensive experiments were conducted
to evaluate the proposed methods on regression and classification tasks. For the
online approach, only regression tasks were considered. The results show that the methods
can train stacked architectures whose performance is statistically equivalent to that of
SLFNs with a large number of neurons (or of other online methods) when comparing an
error or accuracy metric. Regarding training time, the proposed methods took less
time in many cases. Regarding memory usage, some of the proposed methods
were statistically better, which favors their use in resource-constrained environments.
Keywords: Deep Stacked Network. Extreme Learning Machine. Classification. Regression. Stacked Models. Online Sequential Learning.
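
Illustrative sketch (not taken from the thesis): the summary mentions ELM's closed-form solution and the stacking of small ELM-trained modules in a DSN-like arrangement. The Python/NumPy code below is one minimal way such a scheme could look, under several assumptions of ours: a tanh activation, a ridge-regularized least-squares solution for the output weights, and each module receiving the original inputs concatenated with the previous module's output. The function names (elm_fit, elm_predict, stacked_fit, stacked_predict) and all hyperparameters are hypothetical and do not reproduce the author's implementation.

```python
import numpy as np

def elm_fit(X, T, n_hidden, reg=1e-3, rng=None):
    """Train one ELM module (assumed form: random tanh layer + ridge output weights)."""
    rng = np.random.default_rng(rng)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights, never trained
    b = rng.normal(size=n_hidden)                # random hidden biases, never trained
    H = np.tanh(X @ W + b)                       # hidden-layer output matrix
    # Closed-form output weights: beta = (H^T H + reg * I)^-1 H^T T
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ T)
    return W, b, beta

def elm_predict(X, model):
    """Forward pass of a single trained module."""
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta

def stacked_fit(X, T, n_modules, n_hidden, reg=1e-3, seed=0):
    """Train small modules one after another (assumed DSN-style wiring)."""
    modules = []
    Z = X  # the first module sees only the original inputs
    for i in range(n_modules):
        model = elm_fit(Z, T, n_hidden, reg, rng=seed + i)
        modules.append(model)
        # The next module sees the original inputs plus this module's output.
        Z = np.hstack([X, elm_predict(Z, model)])
    return modules

def stacked_predict(X, modules):
    """Run the stack; the last module's output is the prediction."""
    Z = X
    for model in modules:
        Y = elm_predict(Z, model)
        Z = np.hstack([X, Y])
    return Y

if __name__ == "__main__":
    # Toy single-output regression problem (synthetic data, for illustration only).
    data_rng = np.random.default_rng(0)
    X = data_rng.uniform(-3, 3, size=(200, 1))
    T = np.sin(X)
    modules = stacked_fit(X, T, n_modules=3, n_hidden=20)
    mse = float(np.mean((stacked_predict(X, modules) - T) ** 2))
    print(f"training MSE: {mse:.6f}")
```

In this sketch each module only solves an n_hidden x n_hidden linear system, so the cost per module depends on the module size rather than on the total number of neurons in the stack; whether this matches the memory and time behavior reported in the thesis is not something the sketch can confirm.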



© 2013 Universidade Federal do Espírito Santo. All rights reserved.
Av. Fernando Ferrari, 514 - Goiabeiras, Vitória - ES | CEP 29075-910