SaLLy - Machine Learning

Machine learning is a field of artificial intelligence that builds models capable of learning from data, and it overlaps substantially with statistical predictive modeling. Learning is called supervised when a response variable is available to guide the model and unsupervised when it is not. Supervised learning covers classification and regression models, while unsupervised learning covers cluster analysis and dimensionality reduction. This research line focuses on applying these classes of models. The supervised techniques of greatest interest are support vector machines, tree-based models (decision trees, regression trees, and random forests), and k-nearest neighbors (KNN). On the unsupervised side, the focus is on hierarchical and non-hierarchical clustering methods (the latter including k-means and fuzzy c-means) and on dimensionality reduction through principal component analysis.
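
The difference between supervised and unsupervised learning can be illustrated with a minimal sketch in plain Python. It is not the group's implementation: the toy data, the labels "low" and "high", and the function names `knn_predict` and `kmeans` are assumptions made for illustration. KNN uses the known labels to classify a new point, while k-means groups the same points without seeing any labels.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Supervised: classify `query` by majority vote among its k nearest labeled points."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def kmeans(points, k=2, iterations=10):
    """Unsupervised: assign each unlabeled point to one of k clusters (Lloyd's algorithm)."""
    centroids = [points[i] for i in range(k)]  # naive initialisation: the first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its closest centroid
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(coord) / len(members) for coord in zip(*members))
    return labels


# Hypothetical toy data: two well-separated groups in two dimensions
X = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (5.2, 4.9), (0.9, 1.1), (4.8, 5.1)]
y = ["low", "high", "low", "high", "low", "high"]  # response variable (supervised case only)

print(knn_predict(X, y, (1.1, 1.0)))  # uses the labels in y
print(kmeans(X, k=2))                 # never sees y
```

In practice, a library such as scikit-learn would replace both functions; the sketch only shows where the response variable enters the supervised case and is absent from the unsupervised one.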


In addition to modeling itself, this research line devotes attention to data engineering, the process in which data are cleaned and transformed and variables are added or removed. This process is commonly estimated to take about 80% of the time spent on a data analysis, so it demands attention and care, given how much the modeling phase depends on it. Data visualization techniques are also needed to better understand and communicate the information contained in the data. The research line is therefore expected both to develop projects of its own and to support projects in other research lines.
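
The four data engineering steps named above (cleaning, transformation, addition, and removal of variables) can be sketched in a few lines of plain Python. The records and the variable names (`id`, `income`, `weight_kg`, `height_m`, `bmi`) are hypothetical, chosen only to give each step a concrete example.

```python
import math


def prepare(rows):
    """Clean, transform, add, and remove variables on a list of dict records."""
    prepared = []
    for row in rows:
        # Cleaning: skip records that contain a missing value
        if any(value is None for value in row.values()):
            continue
        record = dict(row)  # copy so the raw data stay untouched
        # Transformation: log-scale a skewed variable
        record["income"] = math.log(record["income"])
        # Addition: derive a new variable from existing ones
        record["bmi"] = record["weight_kg"] / record["height_m"] ** 2
        # Removal: drop an identifier that carries no predictive information
        del record["id"]
        prepared.append(record)
    return prepared


# Hypothetical raw records; the second one has a missing height
raw = [
    {"id": 1, "income": 1000.0, "weight_kg": 70.0, "height_m": 1.75},
    {"id": 2, "income": 2500.0, "weight_kg": 82.0, "height_m": None},
]
clean = prepare(raw)
print(clean)
```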

Meet Our Machine Learning Team

Jonatha Pimentel

Research Coordinator & Team Leader

Olawale Awe

Research Coordinator & Team Leader

Key Publications

- Neto, E. D. A. L., & Rodrigues, P. C. (2023). Kernel robust singular value decomposition. Expert Systems with Applications, 211, 118555.


- de Oliveira, G. P., Fonseca, A., & Rodrigues, P. C. (2022). Diabetes diagnosis based on hard and soft voting classifiers combining statistical learning models. Brazilian Journal of Biometrics, 40(4), 415-427.


- Pimentel, J. S., & Rodrigues, P. C. (2022). Clustering the world currency exchange rates using hierarchical methods based on dynamic time warping. Fluctuation and Noise Letters, 21(01), 2250001.


- Pimentel, J. S., Ospina, R., & Ara, A. (2021). Learning time acceleration in support vector regression: a case study in educational data mining. Stats, 4(3), 682-700.


- Maia, M., Pimentel, J. S., Pereira, I. S., Gondim, J., Barreto, M. E., & Ara, A. (2020). Convolutional support vector models: Prediction of coronavirus disease using chest X-rays. Information, 11(12), 548.


The complete list of publications by SaLLy members is available on our Google Scholar profile.

Ongoing Projects and Activities

- Analysis and modeling of the relationship between economic factors and the effects and outcomes of earthquakes;


- Analysis of the effect of human and climatic variables on hotspots in Brazilian biomes over the past 12 years;


- Analysis of the trend of mortality due to severe acute respiratory syndrome in the neonatal population in Brazil before and during the COVID-19 pandemic period.