Michal Gregus

Work place: Faculty of Management, Comenius University Bratislava, Bratislava, 82005, 25, Slovakia

E-mail: michal.gregus@fm.uniba.sk

Website: https://orcid.org/0000-0002-8156-8962

Research Interests:

Biography

Michal Greguš is a professor at the Faculty of Management, Comenius University Bratislava. He completed his university studies summa cum laude and obtained his PhD in mathematical analysis at the Faculty of Mathematics and Physics, Comenius University in Bratislava. He previously worked in functional analysis and its applications. His current research interests are applied mathematics, modelling of economic processes, and data and business analytics.

Author Articles

Information Technology for Gender Voice Recognition Based on Machine Learning Methods

By Victoria Vysotska Denys Shavaiev Michal Gregus Yuriy Ushenko Zhengbing Hu Dmytro Uhryn

DOI: https://doi.org/10.5815/ijmecs.2024.05.05, Pub. Date: 8 Oct. 2024

The growing use of social networks and the steady popularity of online communication make the task of detecting gender from posts necessary for a variety of applications, including modern education, political research, public opinion analysis, personalized advertising, cyber security and biometric systems, marketing research, etc. This study aims to develop information technology for gender voice recognition by sound based on supervised learning using machine learning algorithms. A model, methods and means of recognition and gender classification of voice speech samples are proposed based on their acoustic properties and machine learning. In our voice gender recognition project, we used a model built based on the neural network using the TensorFlow library and Keras. The speaker’s voice was analysed for various acoustic features, such as frequency, spectral characteristics, amplitude, modulation, etc. The basic model we created is a typical neural network for text classification. It consists of the input layer, hidden layers, and the output layer. For text processing, we use a pre-trained word vector space such as Word2Vec or GloVe. We also used such techniques as dropout to prevent model overtraining, such activation functions as ReLU (Rectified Linear Unit) for non-linearity, and a softmax function in the last layer to obtain class probabilities. To train a model, we used the Adam optimizer, which is a popular gradient descent optimization method, and the “sparse categorical cross-entropy” loss function, since we are dealing with multi-class classification. After training the model, we saved it to a file for further use and evaluation of new data. The application of neural networks in our project allowed us to build a powerful model that can recognize a speaker’s gender by voice with high accuracy. 
The intelligent system was trained using machine learning methods, and each method was evaluated for accuracy: K-Nearest Neighbours (98.10%), Decision Tree (96.69%), Logistic Regression (98.11%), Random Forest (96.65%), Support Vector Machine (98.26%), neural networks (98.11%). Additional techniques such as regularization and optimization can be used to improve model performance and prevent overtraining.
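The abstract mentions a softmax output layer and the sparse categorical cross-entropy loss. The sketch below illustrates what these two pieces compute for a single sample, using only the Python standard library. It is not the authors' TensorFlow/Keras code; the function names, the logit values, and the two-class labelling are assumptions made for illustration.

```python
import math


def softmax(logits):
    """Convert raw output scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_categorical_cross_entropy(probs, label):
    """Loss for one sample whose true class is an integer index."""
    return -math.log(probs[label])


# Hypothetical output scores for one voice sample (illustrative values only).
# Class indices 0 and 1 stand for the two gender labels.
logits = [2.0, 0.5]
probs = softmax(logits)

# The loss is small when the true class has high predicted probability
# and large when it does not.
loss_if_class0 = sparse_categorical_cross_entropy(probs, 0)
loss_if_class1 = sparse_categorical_cross_entropy(probs, 1)
```

"Sparse" here means the true label is passed as an integer class index rather than a one-hot vector; the loss value is the same either way.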
