Oleksandr Mediakov

Work place: Lviv Polytechnic National University, Lviv, 79013, Ukraine

E-mail: oleksandr.mediakov.sa.2019@lpnu.ua


Research Interests: Deep Learning, Machine Learning, Artificial Intelligence


Oleksandr Mediakov is a senior undergraduate student in the Department of Information Systems and Networks at Lviv Polytechnic National University. He is a budding researcher passionate about Artificial Intelligence, particularly Machine Learning and Deep Learning.

Author Articles
Information Technology for Generating Lyrics for Song Extensions Based on Transformers

By Oleksandr Mediakov, Victoria Vysotska, Dmytro Uhryn, Yuriy Ushenko, Cennuo Hu

DOI: https://doi.org/10.5815/ijmecs.2024.01.03, Pub. Date: 8 Feb. 2024

The article develops a technology for generating song lyric extensions using large language models, in particular the T5 model, to speed up, supplement, and increase the flexibility of the lyric-writing process, with or without taking into account the style of a particular author. To build the dataset, 10 different artists were selected and their lyrics were collected, giving a total of 626 unique songs. After splitting each song into several input-output pairs, 1874 training instances and 465 test instances were obtained. Two language models, NSA and SA, were fine-tuned for the task of generating song lyrics. For both models, t5-base was chosen as the base model. This version of T5 contains 223 million parameters. The analysis of the original data showed that the NSA model has less degraded results, and that for the SA model it is necessary to balance the amount of text for each author. Several text metrics, namely BLEU, RougeL, and RougeN, were calculated to quantitatively compare the results of the models and generation strategies. BLEU values are the most diverse and vary significantly depending on the strategy, while the Rouge metrics have less variability and a smaller range of values. In total, the comparison covered 8 different decoding methods for text generation supported by the transformers library: Greedy search, Beam search, Diverse beam search, Multinomial sampling, Beam-search multinomial sampling, Top-k sampling, Top-p sampling, and Contrastive search. All the results of the lyrics comparison show that the best method for generating lyrics is beam search and its variations, including beam sampling. Contrastive search usually outperformed the plain greedy approach. The top-p and top-k methods have no clear advantage over each other and produced different results in different situations.
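
As a rough illustration of how the eight decoding methods named in the abstract can be selected, the sketch below maps each one to keyword arguments of the `generate()` method in the transformers library. The argument names (`do_sample`, `num_beams`, `num_beam_groups`, `diversity_penalty`, `top_k`, `top_p`, `penalty_alpha`) are real `generate()` parameters, but every numeric value, the dictionary `DECODING_STRATEGIES`, and the helper `generate_extension` are assumptions made for illustration and are not taken from the article. Newer releases of the library may handle some of these strategies differently.

```python
# Hypothetical mapping from the eight decoding methods listed in the abstract
# to keyword arguments of transformers' generate(). All numeric values are
# illustrative assumptions, not the settings used by the authors.
DECODING_STRATEGIES = {
    "greedy_search": {"do_sample": False, "num_beams": 1},
    "beam_search": {"do_sample": False, "num_beams": 4},
    "diverse_beam_search": {
        "do_sample": False,
        "num_beams": 4,
        "num_beam_groups": 2,      # beams are split into groups
        "diversity_penalty": 1.0,  # penalizes repeated tokens across groups
    },
    # top_k=0 disables top-k filtering, leaving plain multinomial sampling
    "multinomial_sampling": {"do_sample": True, "num_beams": 1, "top_k": 0},
    "beam_search_multinomial_sampling": {"do_sample": True, "num_beams": 4},
    "top_k_sampling": {"do_sample": True, "top_k": 50},
    "top_p_sampling": {"do_sample": True, "top_k": 0, "top_p": 0.92},
    "contrastive_search": {"do_sample": False, "penalty_alpha": 0.6, "top_k": 4},
}


def generate_extension(model, tokenizer, prompt, strategy, max_new_tokens=64):
    """Generate a lyric extension for `prompt` with the named decoding strategy.

    `model` and `tokenizer` are expected to be a seq2seq model and its
    tokenizer loaded by the caller (hypothetical helper, not from the article).
    """
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        **DECODING_STRATEGIES[strategy],
    )
    # generate() returns a batch of token id sequences; decode the first one
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

To try it, one could load the public checkpoint with `AutoTokenizer.from_pretrained("t5-base")` and `AutoModelForSeq2SeqLM.from_pretrained("t5-base")`; that is the untuned base model, not the authors' NSA or SA models, so its output would not reproduce the reported results.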
