Victoria Vysotska

Workplace: Information Systems and Networks Department, Lviv Polytechnic National University, Lviv, Ukraine

E-mail: Victoria.A.Vysotska@lpnu.ua

Website: https://orcid.org/0000-0001-6417-3689

Research Interests: Models of Computation, Analysis of Algorithms, Data Structures and Algorithms, Database Management System, Systems Architecture, Computer systems and computational processes

Biography

Victoria Vysotska, PhD, is Deputy Head and Associate Professor of the Information Systems and Networks Department, Institute of Computer Science and Information Technology, Lviv Polytechnic National University, Lviv, Ukraine. In 2014 she defended her candidate thesis (PhD), “Methods and tools of information resources processing in the electronic content commerce systems”. Her research interests include content, information systems and networks, e-commerce, business processes, information resources, commercial content, content analysis, content monitoring, content search, electronic content commerce systems, content management systems, content lifecycle, Internet newspapers, software systems, models, algorithms, analysis, and methods and strategies of systems design. She has over 18 years of teaching experience at Lviv Polytechnic National University and has published more than 270 scientific papers in national and international journals and conferences, as well as 4 monographs and 5 textbooks. She lectures on mathematical linguistics, discrete mathematics, numerical methods in informatics, and information resources processing. Citation information is available at http://orcid.org/0000-0001-6417-3689, http://victana.lviv.ua/index.php/naukovi-statti or https://scholar.google.com.ua/citations?hl=uk&user=-MCARowAAAAJ&view_op=list_works. Detailed information about Victoria Vysotska can be found at: https://ua.linkedin.com/pub/victoria-vysotska/29/1b7/261.

Author Articles
Information Technology for Generating Lyrics for Song Extensions Based on Transformers

By Oleksandr Mediakov, Victoria Vysotska, Dmytro Uhryn, Yuriy Ushenko, Cennuo Hu

DOI: https://doi.org/10.5815/ijmecs.2024.01.03, Pub. Date: 8 Feb. 2024

The article develops a technology for generating song lyric extensions using large language models, in particular the T5 model, to speed up, supplement, and add flexibility to the process of writing song lyrics, with or without taking into account the style of a particular author. To build the dataset, 10 artists were selected and their lyrics collected, yielding 626 unique songs. After splitting each song into several input-output text pairs, 1874 training instances and 465 test instances were obtained. Two language models, NSA and SA, were fine-tuned for the task of generating song lyrics. For both models, t5-base was chosen as the base model. This version of T5 contains 223 million parameters. Analysis of the original data showed that the results of the NSA model degrade less, while for the SA model it is necessary to balance the amount of text for each author. Several text metrics, namely BLEU, ROUGE-L, and ROUGE-N, were calculated to compare the results of the models and generation strategies quantitatively. BLEU values vary the most and depend significantly on the strategy, while the ROUGE metrics show less variability and a smaller range of values. In total, 8 decoding methods for text generation supported by the transformers library were compared: Greedy search, Beam search, Diverse beam search, Multinomial sampling, Beam-search multinomial sampling, Top-k sampling, Top-p sampling, and Contrastive search. All the comparison results show that the best method for generating lyrics is beam search and its variations, including beam-search multinomial sampling. Contrastive search usually outperformed the plain greedy approach. The top-p and top-k methods have no clear advantage over each other and produced different results in different situations.
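To make the difference between three of the decoding strategies named above concrete, here is a minimal sketch of greedy selection, top-k filtering, and top-p (nucleus) filtering applied to a single next-token distribution. It does not use the transformers library or the article's models; the function names and the toy vocabulary are hypothetical and chosen only for illustration.

```python
# Illustrative only: a toy next-token distribution, not output from T5.
toy_probs = {"love": 0.5, "night": 0.25, "road": 0.125, "rain": 0.125}


def greedy(probs):
    """Greedy search: always pick the single most probable token."""
    return max(probs, key=probs.get)


def top_k_filter(probs, k):
    """Top-k: keep the k most probable tokens and renormalise their mass."""
    kept = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in kept)
    return {tok: p / total for tok, p in kept}


def top_p_filter(probs, p):
    """Top-p: keep the smallest set of most probable tokens whose
    cumulative probability reaches p, then renormalise."""
    kept, mass = [], 0.0
    for tok, prob in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept.append((tok, prob))
        mass += prob
        if mass >= p:
            break
    return {tok: prob / mass for tok, prob in kept}
```

A sampler would then draw the next token from the filtered distribution, for example with `random.choices`. Top-k fixes the number of candidates, whereas top-p lets that number change with the shape of the distribution, which is one plausible reason the two methods trade places across situations.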

Intelligent Analysis of Ukrainian-language Tweets for Public Opinion Research based on NLP Methods and Machine Learning Technology

By Oleh Prokipchuk, Victoria Vysotska, Petro Pukach, Vasyl Lytvyn, Dmytro Uhryn, Yuriy Ushenko, Zhengbing Hu

DOI: https://doi.org/10.5815/ijmecs.2023.03.06, Pub. Date: 8 Jun. 2023

The article develops a technology for finding tweet trends based on clustering, which forms a data stream of short cluster representations and their popularity for further public opinion research. The accuracy of the result is affected by the natural-language features of the tweet information flow. An effective approach to tweet collection, filtering, cleaning and pre-processing is described, based on a comparative analysis of the Bag of Words, TF-IDF and BERT algorithms. The impact of stemming and lemmatization on the quality of the obtained clusters was determined. Stemming and lemmatization significantly reduce the input vocabulary of Ukrainian words, by 40.21% and 32.52% respectively. Optimal combinations of clustering methods (K-Means, Agglomerative Hierarchical Clustering and HDBSCAN) and tweet vectorization were found based on an analysis of 27 clusterings of one data sample. A method of presenting clusters of tweets in a short format was selected. Algorithms using the Levenshtein distance, i.e. fuzz sort, fuzz set and Levenshtein, showed the best results. These algorithms perform checks quickly and show a greater difference in similarities, so the similarity threshold can be determined more accurately. According to the clustering results, the best choice for accuracy is the HDBSCAN clustering algorithm with BERT vectorization, and the best choice for speed with an acceptable result is K-Means with TF-IDF. Stemming can be used to reduce execution time. In this study, the best options for comparing cluster fingerprints were found experimentally among the following similarity search methods: Fuzz Sort, Fuzz Set, Levenshtein, Jaro-Winkler, Jaccard, Sorensen, Cosine, Sift4. For some algorithms, the average fingerprint similarity exceeds 70%.
Three effective tools for comparing fingerprint similarity were found, as they show a sufficient difference (> 20%) between comparisons of similar and different clusters.
Experimental testing was based on the analysis of 90,000 tweets collected over 7 days for 5 different weekly topics: President Volodymyr Zelenskyi, Leopard tanks, Boris Johnson, Europe, and the bright memory of the deceased. The research used the following combinations for clustering and vectorization: K-Means with TF-IDF, Agglomerative Hierarchical Clustering with TF-IDF, and HDBSCAN with BERT. Additionally, fuzz sort was implemented for comparing cluster fingerprints with a similarity threshold of 55%. For comparing fingerprints, the best methods were fuzz sort, fuzz set, and Levenshtein. Among these three, Levenshtein was the fastest; the other two were three times slower but still nearly 13 times faster than Sift4. The fastest method overall is Jaro-Winkler, but it has only a 19.51% difference in similarities. The method with the best difference in similarities is fuzz set (60.29%), with fuzz sort (32.28%) and Levenshtein (28.43%) in second and third place respectively. All three rely on the Levenshtein distance, indicating that this approach works well for comparing sets of keywords. The other algorithms fail to show significant differences between different fingerprints, suggesting that they are not suited to this type of task.
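The fingerprint comparison described above can be sketched in a few lines. This is a minimal illustration, assuming that a cluster fingerprint is a space-separated string of keywords and that "fuzz sort" behaves like a token-sort comparison (tokens sorted before a Levenshtein-based similarity is computed). The function names are hypothetical and the pure-Python distance stands in for whatever library the authors used; only the 55% threshold comes from the abstract.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Normalised similarity in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def token_sort_similarity(fp_a, fp_b):
    """Assumed analogue of 'fuzz sort': sort keywords so order is ignored."""
    def normalise(fp):
        return " ".join(sorted(fp.lower().split()))
    return similarity(normalise(fp_a), normalise(fp_b))


THRESHOLD = 0.55  # similarity threshold reported in the abstract


def same_trend(fp_a, fp_b, threshold=THRESHOLD):
    """Treat two cluster fingerprints as the same trend above the threshold."""
    return token_sort_similarity(fp_a, fp_b) >= threshold
```

For example, `same_trend("leopard tanks ukraine", "ukraine tanks leopard")` holds because sorting removes the effect of keyword order, while a fingerprint on an unrelated topic falls well below the threshold.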

Time Dependence of the Output Signal Morphology for Nonlinear Oscillator Neuron Based on Van der Pol Model

By Vasyl Lytvyn, Victoria Vysotska, Ivan Peleshchak, Ihor Rishnyak, Roman Peleshchak

DOI: https://doi.org/10.5815/ijisa.2018.04.02, Pub. Date: 8 Apr. 2018

The time-frequency and time dependence of the output signal morphology of a nonlinear oscillator neuron based on the Van der Pol model were investigated using analytical and numerical methods. The threshold effect of the neuron when it is exposed to external non-stationary signals that vary in shape, frequency and amplitude was considered.
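As a rough illustration of the kind of numerical experiment mentioned above, the sketch below integrates the standard Van der Pol equation, x'' - mu*(1 - x^2)*x' + x = A*cos(omega*t), with a fourth-order Runge-Kutta scheme. The harmonic forcing term, the parameter values, and all function names are assumptions made for this example; the abstract does not specify the article's exact neuron model or input signals.

```python
import math


def rhs(t, x, v, mu, amp, omega):
    """Right-hand side of the forced Van der Pol system (x' = v, v' = ...)."""
    return v, mu * (1.0 - x * x) * v - x + amp * math.cos(omega * t)


def simulate(mu=1.0, amp=0.0, omega=1.0, x0=0.1, v0=0.0, dt=0.01, steps=5000):
    """Integrate with classical RK4 and return the sampled output x(t)."""
    t, x, v = 0.0, x0, v0
    xs = [x]
    h = dt / 2.0
    for _ in range(steps):
        k1x, k1v = rhs(t, x, v, mu, amp, omega)
        k2x, k2v = rhs(t + h, x + h * k1x, v + h * k1v, mu, amp, omega)
        k3x, k3v = rhs(t + h, x + h * k2x, v + h * k2v, mu, amp, omega)
        k4x, k4v = rhs(t + dt, x + dt * k3x, v + dt * k3v, mu, amp, omega)
        x += dt / 6.0 * (k1x + 2 * k2x + 2 * k3x + k4x)
        v += dt / 6.0 * (k1v + 2 * k2v + 2 * k3v + k4v)
        t += dt
        xs.append(x)
    return xs


def peak_amplitude(xs, tail=1000):
    """Largest |x| over the last `tail` samples, after transients decay."""
    return max(abs(x) for x in xs[-tail:])
```

With the forcing switched off (`amp=0.0`) and `mu=1.0`, the trajectory settles onto the well-known limit cycle with an amplitude close to 2 regardless of the small initial displacement. Varying `amp` and `omega` is one simple way to explore how the output shape responds to an external signal.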

Other Articles