Denys Shavaiev

Work place: Department of Information Systems and Networks, Lviv Polytechnic National University, Lviv, 79013, Ukraine

E-mail: denys.shavaiev.sa.2019@lpnu.ua

Website: https://orcid.org/0009-0004-0487-0780

Research Interests:

Biography

Denys Shavaiev is a dedicated student enrolled in the Department of Information Systems and Networks at Lviv Polytechnic National University. His academic journey is fuelled by a profound interest in cutting-edge technologies such as Machine Learning, Artificial Intelligence, and Data Science.

Author Articles
Information Technology for Gender Voice Recognition Based on Machine Learning Methods

By Victoria Vysotska, Denys Shavaiev, Michal Gregus, Yuriy Ushenko, Zhengbing Hu, Dmytro Uhryn

DOI: https://doi.org/10.5815/ijmecs.2024.05.05, Pub. Date: 8 Oct. 2024

The growing use of social networks and the steady popularity of online communication make the task of detecting gender from posts necessary for a variety of applications, including modern education, political research, public opinion analysis, personalized advertising, cyber security and biometric systems, and marketing research. This study aims to develop information technology for gender voice recognition based on supervised learning using machine learning algorithms. A model, methods and means for recognizing and classifying the gender of voice samples are proposed, based on the samples' acoustic properties and machine learning. In our voice gender recognition project, we used a neural network model built with the TensorFlow library and Keras. The speaker’s voice was analysed for various acoustic features, such as frequency, spectral characteristics, amplitude and modulation. The basic model we created is a typical neural network for text classification. It consists of an input layer, hidden layers, and an output layer. For text processing, we use a pre-trained word vector space such as Word2Vec or GloVe. We also used dropout to prevent overfitting, the ReLU (Rectified Linear Unit) activation function for non-linearity, and a softmax function in the last layer to obtain class probabilities. To train the model, we used the Adam optimizer, a popular gradient descent optimization method, and the “sparse categorical cross-entropy” loss function, since we are dealing with multi-class classification. After training the model, we saved it to a file for further use and for evaluation on new data. The application of neural networks in our project allowed us to build a powerful model that can recognize a speaker’s gender by voice with high accuracy.
The intelligent system was trained using several machine learning methods, each of which was analysed for accuracy: K-Nearest Neighbours (98.10%), Decision Tree (96.69%), Logistic Regression (98.11%), Random Forest (96.65%), Support Vector Machine (98.26%), and neural networks (98.11%). Additional techniques such as regularization and optimization can be used to improve model performance and prevent overfitting.
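
The abstract names ReLU, a softmax output layer, and the sparse categorical cross-entropy loss. As a rough illustration of what those three functions compute, here is a minimal plain-Python sketch. It is not the authors' TensorFlow/Keras code: the logits, the two-class setup, and the class index used as the label are hypothetical values chosen only for the example.

```python
import math


def relu(values):
    """Rectified Linear Unit: negative inputs become zero."""
    return [max(0.0, v) for v in values]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_categorical_cross_entropy(probs, label):
    """Loss for one sample whose true class is given as an integer index."""
    return -math.log(probs[label])


# Hypothetical hidden-layer pre-activations and output scores (assumed values).
hidden = relu([-1.5, 0.5, 2.0])
logits = [2.0, 0.0]   # one score per class
label = 0             # assumed integer index of the true class

probs = softmax(logits)
loss = sparse_categorical_cross_entropy(probs, label)
print(hidden, probs, loss)
```

The "sparse" variant takes the true class as an integer index rather than a one-hot vector, which is why the loss simply picks out `probs[label]`.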

Other Articles