Maestría en Matemática Aplicada

Permanent URI for this collection: http://repositorio.uta.edu.ec/handle/123456789/32203

  • Item
    Design of a mathematical model to estimate student dropout using multivariate analysis techniques at a technological higher education institution
    (Universidad Técnica de Ambato. Facultad de Ingeniería en Sistemas, Electrónica e Industrial. Maestría en Matemática Aplicada, 2021) Vinueza López, Cristina Nataly; Loza Aguirre, Edison Fernando
    EXECUTIVE SUMMARY. In this research, a logistic regression model was used to estimate student dropout at the IST Luis A. Martínez Agronómico. Data on 849 students enrolled at the institute between 2018 and 2020 were used to build the model. The independent variables considered were gender, marital status, age, career, repetition, occupation, and economic status. The KDD (Knowledge Discovery in Databases) methodology, which generates information from a database of the records under study, was used to estimate the mathematical model. In the evaluated period, 82.45 percent of the students did not drop out and 17.55 percent did. Four logistic regression models were fitted. The first included all the independent variables, but only ‘career’ was significant. The ‘age’ and ‘gender’ variables, which had higher p-values, were removed to produce a second model, in which ‘repetition’ and ‘career’ were significant. Next, the variables with the highest remaining p-values, ‘marital status’ and ‘economic status’, were removed to obtain a third model, in which ‘repetition’ and ‘career’ were again the only significant variables. Finally, logistic regression model 4 was chosen; it includes only ‘career’ and ‘repetition’, both of which are significant. The null hypothesis was rejected because the coefficients Beta 1 and Beta 2 of ‘career’ and ‘repetition’ are not zero. Model 4 correctly classified 83 percent of the training data and 79 percent of the test data. Additionally, a prediction model based on decision trees was built, which identified ‘career’ as the sole explanatory variable. The F1 score of logistic regression model 4 was higher than that of the decision tree model.
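
    The general form of a two-predictor logistic regression such as model 4 can be sketched as follows. The notation is an assumption made for illustration: x_1 and x_2 stand for the encoded ‘career’ and ‘repetition’ variables, \beta_0 is an intercept, and p is the estimated probability of dropout. The abstract reports neither the coefficient values nor the variable encoding, so none are given here, and the null hypothesis is written in the form the summary appears to describe.

    ```latex
    % Logistic model with two predictors (symbols are illustrative assumptions)
    \[
      p = P(\text{dropout} = 1 \mid x_1, x_2)
        = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \beta_2 x_2)}}
    \]
    % Equivalent log-odds (logit) form
    \[
      \ln\frac{p}{1 - p} = \beta_0 + \beta_1 x_1 + \beta_2 x_2
    \]
    % Hypotheses on the coefficients of 'career' and 'repetition'
    \[
      H_0:\ \beta_1 = \beta_2 = 0
      \qquad \text{versus} \qquad
      H_1:\ \beta_j \neq 0 \ \text{for at least one } j \in \{1, 2\}
    \]
    ```

    Under this reading, rejecting H_0 means that at least one of the two variables changes the log-odds of dropout. If ‘career’ were encoded with several indicator variables, \beta_1 x_1 would become a sum of such terms.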