Semantic search of texts in the Serbian language
Semantic search involves analyzing text to understand its meaning. Unlike the commonly used lexical search, which relies on keywords, semantic search takes it a step further, allowing the discovery of relevant information within text, even when the specific keywords are not explicitly mentioned. The HCI group is working on creating datasets and building systems for semantic text search in the Serbian language, tailored for various application domains. The developed models can be used in different applications, where search systems, content recommendation, and chatbots are just a few of the common use cases.
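One common way to realize this idea, sketched here under assumptions and not as a description of the group's actual system, is to map queries and documents to embedding vectors and rank the documents by cosine similarity. The three-dimensional vectors below are hand-made placeholders standing in for the output of a sentence-embedding model, and `cosine_similarity` and `rank_documents` are hypothetical helper names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return document ids sorted from most to least similar to the query."""
    scores = {doc_id: cosine_similarity(query_vec, vec)
              for doc_id, vec in doc_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Placeholder embeddings (assumed values, not produced by a real model).
doc_vecs = {
    "doc_weather": [0.9, 0.1, 0.0],
    "doc_banking": [0.0, 0.2, 0.9],
    "doc_climate": [0.8, 0.3, 0.1],
}
query_vec = [1.0, 0.2, 0.0]  # stands in for a weather-related query

ranking = rank_documents(query_vec, doc_vecs)
print(ranking)
```

Because the ranking depends only on vector geometry, a document can rank highly without sharing any keyword with the query, which is the property that distinguishes semantic from lexical search.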
Automatic speech recognition for the Serbian language
Automatic speech recognition is a technology that analyzes spoken words and phrases and translates them into written text, facilitating human-computer interaction through speech. The HCI group is dedicated to the development of automatic speech recognition systems for the Serbian language, including the creation of specific datasets and training of models adapted to the Serbian language. Such models can be applied in various areas, including the development of virtual assistants, improving customer service, automating business processes, and facilitating access to information for people with special needs.
Named Entity Recognition
Named Entity Recognition (NER) is a pivotal task in Natural Language Processing (NLP). Its primary objective is to identify and categorize named entities (personal names, organization names, locations, gene names, etc.) in unstructured text. Current methodologies successfully address this task when the named entities have been previously encountered in the training dataset or are included in named entity dictionaries across various domains.
Members of the HCI (Human-Computer Interaction) group are engaged in developing methods for the recognition of biomedical Named Entities using a zero-shot learning approach, which does not require labeled examples for training, and a few-shot learning approach that necessitates a minimal set of about ten to a hundred labeled samples. Currently, these methods are being utilized internally at Bayer. Plans are underway to enhance the model’s performance through more efficient hyperparameter selection and validation set optimization.
Topic modeling for texts in the Serbian language
Topic modeling is a natural language processing technique used to identify hidden topics or concepts present in a set of documents. This technique enables grouping of similar documents into thematic categories without the need to predefine the topics.
Members of the HCI group have been actively involved in topic modeling as part of a project aimed at determining the reasons why Serbian citizens hesitate to get vaccinated against the COVID-19 virus. In this project, the team focused on identifying topics among tweets with a negative sentiment. Additionally, as part of a pilot project with Telekom Srbija, a topic modeling system was successfully implemented to automatically detect topics in customer service data.
Sentiment analysis of texts in the Serbian language
Sentiment analysis is the process of automatically determining the emotional tone or stance in reviews, social media comments, or any other type of text. Its goal is usually to classify the sentiment as positive, negative, or neutral.
So far, the HCI group has been focused on determining the sentiment of comments on social media, particularly on Twitter. This effort aimed to understand the reasons why Serbian citizens are hesitant to get vaccinated against the COVID-19 virus.
Anonymization of texts in the Serbian language
Text anonymization is a process by which identifying or sensitive information in text is replaced or deleted to protect the privacy of individuals or to preserve data in accordance with data protection laws. This process is often used in situations where it is necessary to share, analyze, or store textual information while ensuring that individuals’ personal data is not identifiable or accessible.
The HCI group has developed a general-domain anonymization system that can be employed to mask various types of confidential data (names, surnames, locations, organizations, personal identity numbers, etc.).
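To illustrate what masking looks like in practice, the following is a minimal rule-based sketch. It is not the group's system, which is not described here in detail. It assumes that personal identity numbers are 13-digit strings and that names come from a supplied list. A production system would more likely rely on a trained named entity recognizer. The function name `anonymize` and the placeholders `[ID]` and `[NAME]` are hypothetical.

```python
import re

# Assumption: personal identity numbers are standalone 13-digit strings.
ID_PATTERN = re.compile(r"\b\d{13}\b")


def anonymize(text, known_names):
    """Mask 13-digit identifiers and any names listed in known_names."""
    masked = ID_PATTERN.sub("[ID]", text)
    # Replace longer names first so a full name wins over a partial match.
    for name in sorted(known_names, key=len, reverse=True):
        masked = re.sub(r"\b" + re.escape(name) + r"\b", "[NAME]", masked)
    return masked


# Made-up sample text; the number is an obviously artificial placeholder.
sample = "Petar Petrovic, ID 1234567890123, lives in Novi Sad."
result = anonymize(sample, ["Petar Petrovic"])
print(result)
```

Note that the location in the sample stays unmasked because this sketch has no rule for it. Covering open-ended categories such as locations and organizations is where model-based recognition becomes necessary.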
Predictive maintenance in industrial plants
The HCI group is engaged in the development of solutions for predictive maintenance of key elements in industrial plants. The failure of these elements can cause downtime in production and significant financial losses. To avoid this, early detection of potential problems is essential. Our approach involves training a machine learning model on various process variables, after which the delivered model detects anomalies in real time.
Our approach is specifically applied in the context of thermal power plants, where most operators rely on their own experience to manually monitor the operation of these key elements. This process is time-consuming and subject to human error. The proposed approach significantly improves the reliability and efficiency of thermal power plants by identifying and solving potential problems before they become serious and costly.
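The general pattern of learning a baseline from normal operation and then flagging deviations can be sketched as follows. This is a deliberately simple statistical stand-in (a z-score test on one variable), not the machine learning model used by the group. The temperature values, the threshold of 3 standard deviations, and the function names are all assumptions made for illustration.

```python
import statistics


def fit_baseline(values):
    """Estimate mean and standard deviation from normal-operation readings."""
    return statistics.mean(values), statistics.stdev(values)


def is_anomaly(value, mean, std, threshold=3.0):
    """Flag a reading more than `threshold` standard deviations from the mean."""
    if std == 0:
        return value != mean
    return abs(value - mean) / std > threshold


# Hypothetical bearing temperatures (degrees Celsius) during normal operation.
normal_readings = [70.1, 69.8, 70.4, 70.0, 69.9, 70.2, 70.3, 69.7]
mean, std = fit_baseline(normal_readings)

# Hypothetical live stream; the last reading drifts far from the baseline.
stream = [70.0, 70.5, 75.0]
flags = [is_anomaly(v, mean, std) for v in stream]
print(flags)
```

A real deployment would monitor many correlated process variables at once, which is where a learned model has an advantage over per-variable thresholds.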
Olfactory perception
Olfactory perception refers to the ability to detect and recognize odors. In the context of biology, it is the process by which animals and humans recognize and interpret smells using olfactory (scent) receptors located in the nose.
An e-nose (electronic nose) is a device designed to detect and recognize odors in a similar way to the human nose, but using sensors and algorithms instead of biological receptors. These devices can ‘smell’ and analyze complex odors in the environment and convert them into electrical signals that are then interpreted using software. AI plays a crucial role in processing and interpreting the signals coming from gas sensors to enable olfactory perception.
Distributed Inference over Linear Models using Alternating Gaussian Belief Propagation
In this paper we consider the problem of maximum likelihood estimation in linear models represented by factor graphs and solved via the Gaussian belief propagation algorithm. Motivated by massive internet of things (IoT) networks and edge computing, we set the above problem in a clustered scenario, where the factor graph is divided into clusters and assigned for processing in a distributed fashion across a number of edge computing nodes. For these scenarios, we show that an alternating Gaussian belief propagation (AGBP) algorithm that alternates between inter- and intra-cluster iterations demonstrates superior performance in terms of convergence properties compared to the existing solutions in the literature. We present a comprehensive framework and introduce appropriate metrics to analyse the AGBP algorithm across a wide range of linear models characterised by symmetric and non-symmetric, square, and rectangular matrices. We extend the analysis to the case of dynamic linear models by introducing dynamic arrival of new data over time. Using a combination of analytical and extensive numerical results, we show the efficiency and scalability of the AGBP algorithm, making it a suitable solution for large-scale inference in massive IoT networks.
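For readers unfamiliar with the setting, the estimation problem can be written out under the standard assumption of additive Gaussian noise. The notation below is introduced here for illustration and is not taken from the paper: $\mathbf{z}$ is the measurement vector, $\mathbf{H}$ the model matrix, $\mathbf{x}$ the unknown state, and $\boldsymbol{\Sigma}$ the noise covariance, with $\mathbf{H}^{\top}\boldsymbol{\Sigma}^{-1}\mathbf{H}$ assumed invertible.

```latex
\mathbf{z} = \mathbf{H}\mathbf{x} + \mathbf{e},
\qquad \mathbf{e} \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}),
\qquad
\hat{\mathbf{x}}
  = \arg\max_{\mathbf{x}} \, p(\mathbf{z} \mid \mathbf{x})
  = \left( \mathbf{H}^{\top} \boldsymbol{\Sigma}^{-1} \mathbf{H} \right)^{-1}
    \mathbf{H}^{\top} \boldsymbol{\Sigma}^{-1} \mathbf{z}.
```

Gaussian belief propagation, when it converges, reaches this same estimate by exchanging local messages on the factor graph instead of forming the matrix inverse centrally, which is what makes a clustered, distributed implementation possible.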
A Smart Alcoholmeter Sensor Based on Deep Learning Visual Perception
Process automation, in general, enables the enhancement of productivity, product quality, and consistency alongside other production metrics. Liquor production on an industrial scale also follows the automation trend. However, small and medium producers lag behind in equipment modernization due to the high costs of industrial equipment. One of the important sensors in automation equipment for distilleries is the alcohol concentration sensor used for fraction separation, process automation, and supervision. This paper proposes a novel low-cost approach to alcohol concentration sensing by employing deep learning on the visual perception of a traditional alcoholmeter. To train the model, a dataset acquisition apparatus was developed and a large dataset of labeled images of alcoholmeter readings was acquired. The problem of reading alcohol concentration from an alcoholmeter image is treated both as a regression problem and as a classification problem. The performance of both the regression and the classification models was investigated with ResNet18 as the architecture of choice. Both models achieved satisfactory performance metrics, demonstrating the feasibility of the proposed approaches. The proposed system, implemented on a Raspberry Pi with a camera, can be integrated into new distillation equipment. Additionally, it can be used for retrofitting existing equipment because the reading is non-invasive. The scope of use can be further expanded to the reading of other types of analog instruments simply by retraining the model.
Comparing the Clinical Viability of Automated Fundus Image Segmentation Methods
Recent methods for automatic blood vessel segmentation from fundus images have been commonly implemented as convolutional neural networks. While these networks report high values for objective metrics, the clinical viability of recovered segmentation masks remains unexplored. In this paper, we perform a pilot study to assess the clinical viability of automatically generated segmentation masks in the diagnosis of diseases affecting retinal vascularization. Five ophthalmologists with clinical experience were asked to participate in the study. The results demonstrate low classification accuracy, implying that generated segmentation masks cannot be used as a standalone resource in general clinical practice. The results also hint at possible clinical infeasibility in experimental design. In the follow-up experiment, we evaluate the clinical quality of masks by having ophthalmologists rank generation methods. The ranking is established with high intra-observer consistency, indicating better subjective performance for a subset of tested networks. The study also demonstrates that objective metrics are not correlated with subjective metrics in retinal segmentation tasks for the methods involved, suggesting that objective metrics commonly used in scientific papers to measure a method’s performance are not reliable criteria for choosing clinically robust solutions.
Non-adversarial Robustness of Deep Learning Methods for Computer Vision
Non-adversarial robustness, also known as natural robustness, is a property of deep learning models that enables them to maintain performance even when faced with distribution shifts caused by natural variations in data. However, achieving this property is challenging because it is difficult to predict in advance the types of distribution shifts that may occur. To address this challenge, researchers have proposed various approaches, some of which anticipate potential distribution shifts, while others utilize knowledge about the shifts that have already occurred to enhance model generalizability. In this paper, we present a brief overview of the most recent techniques for improving the robustness of computer vision methods, as well as a summary of commonly used robustness benchmark datasets for evaluating the model’s performance under data distribution shifts. Finally, we examine the strengths and limitations of the approaches reviewed and identify general trends in deep learning robustness improvement for computer vision.
Graph neural networks on factor graphs for robust, fast, and scalable linear state estimation with PMUs
As phasor measurement units (PMUs) become more widely used in transmission power systems, a fast state estimation (SE) algorithm that can take advantage of their high sample rates is needed. To accomplish this, we present a method that uses graph neural networks (GNNs) to learn complex bus voltage estimates from PMU voltage and current measurements. We propose an original implementation of GNNs over the power system’s factor graph to simplify the integration of various types and quantities of measurements on power system buses and branches. Furthermore, we augment the factor graph to improve the robustness of GNN predictions. This model is highly efficient and scalable, as its computational complexity is linear with respect to the number of nodes in the power system. Training and test examples were generated by randomly sampling sets of power system measurements and annotated with the exact solutions of linear SE with PMUs. The numerical results demonstrate that the GNN model provides an accurate approximation of the SE solutions. Furthermore, errors caused by PMU malfunctions or communication failures that would normally make the SE problem unobservable have a local effect and do not deteriorate the results in the rest of the power system.
GP CC-OPF: Gaussian Process based optimization tool for Chance-Constrained Optimal Power Flow
The Gaussian Process (GP) based Chance-Constrained Optimal Power Flow (CC-OPF) is an open-source Python tool developed for solving the economic dispatch (ED) problem in modern power grids. In recent years, the integration of a significant amount of renewables into power grids has caused high fluctuations and thus brought considerable uncertainty to power grid operations. This makes the conventional model-based CC-OPF problem non-convex and computationally complex to solve. The developed tool presents a novel data-driven approach based on the GP regression model for solving the CC-OPF problem with a trade-off between complexity and accuracy. The proposed approach and developed software can help system operators to effectively perform ED optimization in the presence of large uncertainties in the power grid.
Near Real-Time Distributed State Estimation via AI/ML-Empowered 5G Networks
Fifth-Generation (5G) networks have the potential to accelerate the power system transition to a flexible, softwarized, data-driven, and intelligent grid. With their evolving support for Machine Learning (ML)/Artificial Intelligence (AI) functions, 5G networks are expected to enable novel data-centric Smart Grid (SG) services. In this paper, we explore how data-driven SG services could be integrated with ML/AI-enabled 5G networks in a symbiotic relationship. We focus on the State Estimation (SE) function as a key element of the energy management system and address two main questions. Firstly, in a tutorial fashion, we present an overview of how distributed SE can be integrated with the elements of the 5G core network and radio access network architecture. Secondly, we present and compare two powerful distributed SE methods based on: i) graphical models and belief propagation, and ii) graph neural networks. We discuss their performance and capability to support near real-time distributed SE via 5G networks, taking into account communication delays.
Move Away From Me! User Repulsion Under Proximity-Induced Interference in OWC Systems
As communication systems shift towards ever higher frequency bands, the propagation of the signal between a user device and the infrastructure becomes more susceptible to nearby obstacles, including other users. As an extreme case, we consider such proximity-induced channel impairments in indoor optical wireless communication (OWC) systems. We set up a model where the achievable OWC data rate depends not only on the relative position between a user device and an infrastructure access point, but also on the location of other users modeled as proximal interferers. We use a reinforcement learning (RL) approach to enable users to find suitable positions, both relative to the access point and to each other, that maximise the sum-rate capacity of the system. Our initial results demonstrate the feasibility of an RL-based approach that enables indoor OWC users to find a suitable balance between establishing a high-rate direct link and remaining distant from proximal interferers.
Factorizing Multiclass into Binary Token Classification Task for Low-shot Transfer Learning: A Case Study on NER in Biomedical Domain
With the emergence of pre-trained transformer models, it is possible to apply transfer learning to many NLP tasks. We present transfer learning using five BERT-based pre-trained models for the NER task. We experimented with each model in two versions: classic multiclass classification and binary token classification. The results indicate that the task of multiclass token classification can be successfully factorized into binary classification, which can be further used for low-shot transfer learning for classification on new, unseen classes.
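The data-side effect of this factorization can be sketched as follows: one sentence with multiclass token labels becomes one binary-labeled copy per entity class. This is an illustrative sketch only. The function name `factorize_to_binary`, the example sentence, and the class names are hypothetical, and how the target class is presented to the model (for example, passed alongside the sentence as input) is an assumption not detailed in the abstract.

```python
def factorize_to_binary(tokens, labels, classes):
    """Turn one multiclass-labeled sentence into one binary-labeled copy per class.

    For each class c, a token gets label 1 if its multiclass label equals c,
    and 0 otherwise.
    """
    examples = []
    for cls in classes:
        binary_labels = [1 if label == cls else 0 for label in labels]
        examples.append({"class": cls, "tokens": tokens, "labels": binary_labels})
    return examples


# Hypothetical sentence with multiclass token labels ("O" marks non-entities).
tokens = ["Aspirin", "inhibits", "COX-1", "."]
labels = ["Drug", "O", "Gene", "O"]

examples = factorize_to_binary(tokens, labels, ["Drug", "Gene"])
for ex in examples:
    print(ex["class"], ex["labels"])
```

Since every example shares the same binary output space, a model trained this way is not tied to a fixed label set, which is what allows it to be queried with a class it has rarely or never seen.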
Zero- and Few-Shot Machine Learning for Named Entity Recognition in Biomedical Texts
This paper introduces a novel method for zero- and few-shot biomedical named entity recognition (NER) in English. By transforming multi-class token classification into a binary one and fine-tuning transformer models with an extensive biomedical dataset and numerous biomedical entities, our model effectively learns semantic relationships between known and new entity labels. Experimental results show promising average F1 scores: 35.44% for zero-shot NER, 50.10% for one-shot, 69.94% for 10-shot, and 79.51% for 100-shot NER using a fine-tuned PubMedBERT-based model. Our approach outperforms previous transformer models and performs similarly to GPT-3-based models with significantly fewer parameters. As a contribution to the community, the trained models and code are made publicly available as open-source software under the MIT license.
Multilingual transformer and BERTopic for short text topic modeling: The case of Serbian
This paper presents the results of the first application of BERTopic, a state-of-the-art topic modeling technique, to tweets written in Serbian, a morphologically rich language. We applied BERTopic with three multilingual embedding models on two levels of text preprocessing (partial and full) to evaluate its performance on partially preprocessed short text in Serbian. We also compared it to LDA and NMF on fully preprocessed text. Our results show that with adequate parameter setting, BERTopic can yield informative topics even when applied to partially preprocessed short text. When the same parameters are applied in both preprocessing scenarios, the performance drop on partially preprocessed text is minimal. Compared to LDA and NMF, judging by the keywords, BERTopic offers more informative topics and gives novel insights when the number of topics is not limited. The findings of this paper can be significant for researchers working with other morphologically rich low-resource languages and short text.
Uncovering the Reasons behind COVID-19 Vaccine Hesitancy in Serbia: Sentiment-Based Topic Modeling
This paper presents a combination of NLP methods applied to find the reasons for vaccine hesitancy in Serbia. To investigate the attitudes and beliefs surrounding vaccine hesitancy, a batch of tweets mentioning aspects of COVID-19 vaccination was collected and annotated with sentiment. These annotated tweets were used to train BERT-based classifiers, which were then employed to annotate a second batch of tweets. As a result, tweets with negative sentiment were identified from the combined datasets. To uncover the reasons for vaccine hesitancy, topic modeling methods, LDA and NMF, were applied to the preprocessed tweets. Given these reasons, it is now possible to better understand the concerns of people regarding the vaccination process.
Topic Modeling Technique on Covid19 Tweets in Serbian
The COVID-19 pandemic has brought health problems that concern individuals, the state, and the whole world. The information available on social networks, which were used more frequently and intensively during the pandemic than before, may contain hidden knowledge that can help to better address some problems and apply protective measures more adequately. Since the messages on Twitter are specific in their length, informal style, figurative speech, and frequent use of slang, this analysis requires the application of slightly different techniques than those classically applied to long, formal documents. To identify the topics that appear in tweets related to vaccination, we apply several state-of-the-art topic modeling techniques and determine which one is the most appropriate in this case.
Deep learning analysis of tweets regarding Covid19 Vaccination in the Serbian language
We present an efficient classifier that performs automatic filtering and detection of tweets with clear negative sentiment towards the COVID-19 vaccination process. We used a transformer-based architecture to build the classifier. A pre-trained transformer encoder that is trained in ELECTRA fashion, BERTic, was selected and fine-tuned on a dataset we collected and manually annotated. Such an automatic filtering and detection algorithm is of utmost importance for exploring the reasons behind the negative sentiment of Twitter users towards a particular topic and for developing a communication strategy that educates them and provides accurate information addressing the specific beliefs that have been identified.
A Benchmark of PDF Information Extraction Tools Using a Multi-task and Multi-domain Evaluation
Extracting information from academic PDF documents is crucial for numerous indexing, retrieval, and analysis use cases. We provide a large and diverse evaluation framework that supports more extraction tasks than most related datasets. Our framework builds upon DocBank, a multi-domain dataset of 1.5 M annotated content elements extracted from 500 K pages of research papers on arXiv. Using the new framework, we benchmark ten freely available tools in extracting document metadata, bibliographic references, tables, and other content elements from academic PDF documents.
Query Expansion, Argument Mining and Document Scoring for an Efficient Question Answering System
In the current world, individuals are faced with decision making problems and opinion formation processes on a daily basis. Nevertheless, answering a comparative question by retrieving documents based only on traditional measures (such as TF-IDF and BM25) does not always satisfy the need. In this paper, we propose a multi-layer architecture to answer comparative questions based on arguments. Our approach consists of a pipeline of query expansion, argument mining model, and sorting of the documents by a combination of different ranking criteria. Given the crucial role of the argument mining step, we examined two models: DistilBERT and an ensemble learning approach using stacking of SVM and DistilBERT. We compare the results of both models using two argumentation corpora on the level of argument identification task, and further using the dataset of CLEF 2021 Touché Lab shared task 2 on the level of answering comparative questions.
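For reference, the kind of lexical baseline mentioned above can be sketched in a few lines. This is a generic Okapi BM25 scorer with commonly used default parameters (`k1=1.5`, `b=0.75`) and one common smoothed IDF variant. It is not the paper's pipeline, and the toy documents and query are made up for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query with Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy corpus (made up); whitespace tokenization is a simplifying assumption.
docs = [
    "python is easier to learn than java".split(),
    "java runs on the virtual machine".split(),
    "the weather is nice today".split(),
]
scores = bm25_scores("python java".split(), docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(best, scores)
```

The scorer rewards term overlap only. It cannot tell whether a document actually argues for one option over the other, which is the gap the argument mining and combined ranking steps are meant to close.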
Performance analysis of large language models in the domain of legal argument mining
Generative pre-trained transformers have recently demonstrated excellent performance in various natural language tasks. However, the applicability and strength of these models in classifying legal texts in the context of argument mining are yet to be realized and have not been tested thoroughly. In this study, we investigate the effectiveness of GPT-like models, specifically GPT-3.5 and GPT-4, for argument mining via prompting. We closely study the models’ performance considering diverse prompt formulations and example selection in the prompt via semantic search using state-of-the-art embedding models from OpenAI and sentence transformers. We primarily concentrate on the argument component classification task on the legal corpus from the European Court of Human Rights. Our experiments demonstrate, quite surprisingly, that relatively small domain-specific models outperform GPT-3.5 and GPT-4 in the F1-score for premise and conclusion classes. We hypothesize that the performance drop indirectly reflects the complexity of the structure in the dataset, which we verify through prompt and data analysis. Nevertheless, our results demonstrate a noteworthy variation in the performance of GPT models based on prompt formulation. We observe comparable performance between the two embedding models, with a slight improvement in the local model’s ability for prompt selection. This suggests that local models are as semantically rich as the embeddings from the OpenAI model.