Mesopotamian Journal of Big Data
https://journals.mesopotamian.press/index.php/bigdata
<p style="text-align: justify;">Attention scholars and researchers in the Big Data realm! The Mesopotamian Journal of Big Data, already with three published issues, invites your cutting-edge contributions to shape the future of this field. Our platform aims to disseminate groundbreaking discoveries and transformative applications in Big Data, emphasizing data analytics, machine learning, and related areas. We encourage interdisciplinary collaboration to drive advancements in this rapidly evolving domain. Your expertise is crucial. Join us in this impactful journey: submit your research to the Mesopotamian Journal of Big Data and be part of the vanguard shaping knowledge in this transformative field.</p>Mesopotamian Academic Press en-US Mesopotamian Journal of Big Data 2958-6453 Assessing the Transformative Influence of ChatGPT on Research Practices among Scholars in Pakistan
https://journals.mesopotamian.press/index.php/bigdata/article/view/234
<p>This article investigates the transformative impact of ChatGPT on research practices within the scholarly community in Pakistan. ChatGPT, a powerful AI language model, has attracted significant attention for its potential to improve academic research. Survey data were gathered via a structured questionnaire distributed to researchers in Pakistan: 278 questionnaires were distributed to a randomly chosen sample, of which 223 were returned. SPSS was used to calculate descriptive statistics. The results indicated that 90% of scholars are familiar with the use of ChatGPT in research activities; 86% of scholars used ChatGPT 3.5 (the basic version) for their research, and only 14% used ChatGPT 4 (the Plus version). Overall, 46% of respondents were satisfied with the use of ChatGPT in research activities. The article discusses how ChatGPT's natural language processing capabilities have advanced literature reviews, data analysis, and content generation, thereby saving time and fostering greater productivity. Moreover, it examines how the tool's accessibility and affordability have democratized research, making it more inclusive and open to a broader range of scholars. By shedding light on these critical aspects, this article provides valuable insights into the evolving landscape of research practices in Pakistan and highlights the potential for ChatGPT to revolutionize academic scholarship in the digital age.</p>Nayab Arshad, Mehran Ullah Baber, Adnan Ullah
Copyright (c) 2024 Adnan Ullah, Mehran Ullah Baber, Nayab Arshad
https://creativecommons.org/licenses/by/4.0
2024-01-10 2024-01-10 2024 1-10 10.58496/MJBD/2024/001 Leveraging AI and Big Data in Low-Resource Healthcare Settings
https://journals.mesopotamian.press/index.php/bigdata/article/view/337
<p>Big data and artificial intelligence are game-changing technologies for the underdeveloped healthcare industry because they help optimize the entire supply chain and deliver more precise patient outcome information. Among machine learning approaches, deep learning models have grown in popularity and have revolutionized the healthcare system in recent years as data have become more complex. Machine learning is an essential data analysis procedure, offering efficient and effective methods to extract hidden information from volumes of data that conventional analytics would take too long to process. Recent years have also seen the expansion of advanced intelligent systems able to learn about clinical treatments and glean untapped medical knowledge from vast quantities of data in drug discovery and chemistry. The aim of this chapter is therefore to assess which big data and artificial intelligence approaches are prevalent in healthcare systems by investigating the most advanced big data structures, applications, and industry trends available today. First and foremost, the purpose is to provide a comprehensive overview of how artificial intelligence and big data models deployed in healthcare solutions can fill the gap between machine learning approaches' limited human coverage and the complexity of healthcare data. Moreover, current artificial intelligence technologies, including generative models, Bayesian deep learning, reinforcement learning, and self-driving laboratories, are increasingly being used for drug discovery and chemistry. Finally, the work presents the existing open challenges and future directions in the drug formulation development field.
To this end, the review covers published algorithms and automation tools for artificial intelligence applied to large-scale data in healthcare.</p>Ahmed Hussein Ali, Saad Ahmed Dheyab, Abdullah Hussein Alamoodi, Aws Abed Al Raheem Magableh, Yuantong Gu
Copyright (c) 2024 Ahmed Hussein Ali, Saad Ahmed Dheyab, Luís Martínez, Iman Mohamad Sharaf, Abdullah Hussein Alamoodi, Aws Abed Al Raheem Magableh, Witold Pedrycz, Yuantong Gu
https://creativecommons.org/licenses/by/4.0
2024-02-14 2024-02-14 2024 11-22 10.58496/MJBD/2024/002 Enhancing XML-based Compiler Construction with Large Language Models: A Novel Approach
https://journals.mesopotamian.press/index.php/bigdata/article/view/343
<p>Considering the prevailing role of Large Language Model (LLM) applications and the benefits of XML in a compiler context, this manuscript explores the synergistic integration of Large Language Models with XML-based compiler tools and advanced computing technologies, marking a significant stride toward redefining compiler construction and data representation paradigms. As computing power and internet proliferation advance, XML emerges as a pivotal technology for representing, exchanging, and transforming documents and data. This study builds on the foundational work of Chomsky's Context-Free Grammars (CFGs), recognized for their critical role in compiler construction, to address and mitigate the speed penalties associated with traditional compiler systems and parser generators through the development of an efficient XML parser generator employing compiler techniques. Our research takes a methodical approach to harnessing the sophisticated capabilities of LLMs alongside XML technologies. The key is to automate grammar optimization, facilitate natural language processing capabilities, and pioneer advanced parsing algorithms. To demonstrate their effectiveness, we run thorough experiments and compare the results with other techniques, calling attention to the efficiency, adaptability, and user-friendliness that these integrations bring to XML-based compiler tools. Our targets are the elimination of left-recursive grammars and the development of a global schema for LL(1) grammars, the latter taking advantage of XML technology to support LL(1) grammar construction. The findings of this research not only underscore the significance of these innovations in the field of compiler construction but also indicate a paradigm shift toward the use of AI technologies and XML to resolve traditional programming issues.
The outlined methodology serves as a roadmap for future research and development in compiler technology, paving the way for open-source software to sweep across all fields and gradually ushering in a new era of compiler technology featuring better efficiency, adaptability, and all CFGs processed through existing XML utilities on a global basis.</p>Idrees A. Zahid, Shahad Sabbar Joudar
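One concrete step this abstract names, eliminating left-recursive grammars so they become LL(1)-parsable, follows the standard textbook transform (A → Aα | β becomes A → βA', A' → αA' | ε). Below is a minimal Python sketch of that transform, offered as an illustration of the general technique rather than the authors' XML-based implementation:

```python
def eliminate_immediate_left_recursion(nonterminal, productions):
    """Transform A -> A a1 | ... | A am | b1 | ... | bn into
       A  -> b1 A' | ... | bn A'
       A' -> a1 A' | ... | am A' | epsilon
    Each production is a list of symbol strings; [] denotes epsilon."""
    recursive = [p[1:] for p in productions if p and p[0] == nonterminal]
    non_recursive = [p for p in productions if not p or p[0] != nonterminal]
    if not recursive:
        return {nonterminal: productions}       # nothing to eliminate
    fresh = nonterminal + "'"                   # new helper nonterminal A'
    return {
        nonterminal: [beta + [fresh] for beta in non_recursive],
        fresh: [alpha + [fresh] for alpha in recursive] + [[]],
    }

# Classic example: E -> E + T | T becomes E -> T E', E' -> + T E' | epsilon
grammar = eliminate_immediate_left_recursion("E", [["E", "+", "T"], ["T"]])
```

The resulting productions are right-recursive and therefore suitable for LL(1) table construction or serialization into an XML grammar schema.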
Copyright (c) 2024 Idrees A. Zahid, Shahad Sabbar Joudar
https://creativecommons.org/licenses/by/4.0
2024-03-20 2024-03-20 2024 23-39 10.58496/MJBD/2024/003 Agent-Interacted Big Data-Driven Dynamic Cartoon Video Generator
https://journals.mesopotamian.press/index.php/bigdata/article/view/366
<p>This study presents a novel method for animating videos using three Kaggle cartoon-face datasets. Dynamic interactions between cartoon agents and random backgrounds, together with Gaussian blur, rotation, and noise addition, enhance the cartoon visuals. The approach also evaluates video quality and animation design by calculating the average and standard deviation of the backdrop colour, ensuring visually appealing material. This technology uses massive datasets to generate attractive animated videos for entertainment, teaching, and marketing.</p>Yasmin Makki Mohialden, Abbas Akram khorsheed, Nadia Mahmood Hussien
Copyright (c) 2024 Yasmin Makki Mohialden, Abbas Akram khorsheed, Nadia Mahmood Hussien
https://creativecommons.org/licenses/by/4.0
2024-04-17 2024-04-17 2024 40-47 10.58496/MJBD/2024/004 MLP and RBF Algorithms in Finance: Predicting and Classifying Stock Prices amidst Economic Policy Uncertainty
https://journals.mesopotamian.press/index.php/bigdata/article/view/386
<p>In the realm of stock market prediction and classification, the use of machine learning algorithms has gained significant attention. In this study, we explore the application of Multilayer Perceptron (MLP) and Radial Basis Function (RBF) algorithms in predicting and classifying stock prices, specifically amidst economic policy uncertainty. Stock market fluctuations are greatly influenced by economic policies implemented by governments and central banks. These policies can create uncertainty and volatility, which in turn make accurate predictions and classifications of stock prices more challenging. By leveraging MLP and RBF algorithms, we aim to develop models that can effectively navigate these uncertainties and provide valuable insights to investors and financial analysts. The MLP algorithm, based on artificial neural networks, is able to learn complex patterns and relationships within financial data. The RBF algorithm, on the other hand, utilizes radial basis functions to capture non-linear relationships and identify hidden patterns within the data. By combining these algorithms, we aim to enhance the accuracy of stock price prediction and classification models. The results showed that both MLP and RBF predicted stock prices well for a group of countries using an index reflecting the impact of news on economic policy and expectations, with the MLP algorithm proving its ability to predict sequential data. Countries were also classified according to stock price data and uncertainty in economic policy, allowing us to determine the best country to invest in according to the data. The uncertainty surrounding economic policy is what makes stock price forecasting so crucial: investors must consider the degree of economic policy uncertainty and how it affects asset prices when deciding how to allocate their assets.</p>Bushra Ali, Khder Alakkari, Mostafa Abotaleb, Maad M Mijwil, Klodian Dhoska
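The two-model comparison described here can be sketched in a few lines of scikit-learn. Since scikit-learn ships no dedicated RBF network, an RBF-kernel support vector regressor stands in for the RBF model; the data are synthetic (a toy price series driven by a made-up policy-uncertainty index), so this illustrates the shape of the setup, not the paper's experiment:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(42)
n = 400
t = np.arange(n)
epu = rng.normal(size=n)                       # toy economic-policy-uncertainty index
price = 100 + 5 * np.sin(t / 15) - 2 * epu + rng.normal(scale=0.3, size=n)

# Lagged features: the last three prices plus the current uncertainty reading
X = np.column_stack([price[0:n - 3], price[1:n - 2], price[2:n - 1], epu[3:]])
y = price[3:]
split = int(0.8 * len(y))
mu = y[:split].mean()                          # centre the target to ease MLP fitting

models = {
    "MLP": make_pipeline(StandardScaler(),
                         MLPRegressor(hidden_layer_sizes=(32,), max_iter=3000,
                                      random_state=0)),
    "RBF": make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0)),
}
results = {}
for name, model in models.items():
    model.fit(X[:split], y[:split] - mu)
    pred = model.predict(X[split:]) + mu
    results[name] = float(np.sqrt(np.mean((pred - y[split:]) ** 2)))  # test RMSE
```

Both fitted models can then be compared on held-out RMSE, mirroring the paper's per-country evaluation against an economic-policy-uncertainty index.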
Copyright (c) 2024 Bushra Ali, Khder Alakkari, Mostafa Abotaleb, Maad M Mijwil, Klodian Dhoska
https://creativecommons.org/licenses/by/4.0
2024-05-11 2024-05-11 2024 48-67 10.58496/MJBD/2024/005 Generalized Time Domain Prediction Model for Motor Imagery-based Wheelchair Movement Control
https://journals.mesopotamian.press/index.php/bigdata/article/view/429
<p>Motor imagery-based brain-computer interface (BCI-MI) wheelchair control is, in principle, an appropriate method for completely paralyzed people with a healthy brain. In a BCI-based wheelchair control system, pattern recognition, in terms of preprocessing, feature extraction, and classification, plays a significant role in avoiding recognition errors, which can lead to the initiation of the wrong command and put the user in an unsafe condition. The goal of this research is therefore to create a time-domain generic pattern recognition model (GPRM) of two-class EEG-MI signals for use in a wheelchair control system.<br />This GPRM has the advantage of being applicable to unknown subjects, not just one. It was developed, evaluated, and validated using two datasets, namely the BCI Competition IV and the Emotiv EPOC datasets. Initially, fifteen time windows were investigated with seven machine learning methods to determine the optimal time window as well as the best classification method with strong generalizability. The experimental results revealed that the duration of the EEG-MI signal in the range of 4-6 seconds has a high impact on the classification accuracy when extracting the signal features using five statistical methods. Additionally, the results demonstrate a one-second latency after each command cue when using the eight-second EEG-MI signal of the Graz protocol employed in this study. This one-second latency is inevitable because it is practically impossible for subjects to imagine their MI hand movement instantly; at least one second is required for subjects to prepare to initiate their motor imagery hand movement. Practically, the five statistical methods are efficient and viable for decoding the EEG-MI signal in the time domain.
The GPRM model based on the LR classifier achieved an impressive classification accuracy of 90%, validated on the Emotiv EPOC dataset. The GPRM developed in this study is highly adaptable and recommended for deployment in real-time EEG-MI-based wheelchair control systems.</p>Z.T. Al-Qaysi, M. S Suzani, Nazre bin Abdul Rashid, Reem D. Ismail, M.A. Ahmed, Rasha A. Aljanabi, Veronica Gil-Costa
Copyright (c) 2024 Z.T. Al-Qaysi, M. S Suzani, Nazre bin Abdul Rashid, Reem D. Ismail, M.A. Ahmed, Rasha A. Aljanabi, Veronica Gil-Costa
https://creativecommons.org/licenses/by/4.0
2024-06-15 2024-06-15 2024 68-81 10.58496/MJBD/2024/006 Face Morphing Attacks Detection Approaches: A Review
https://journals.mesopotamian.press/index.php/bigdata/article/view/475
<p>Face recognition systems (FRSs) applied in real-time applications such as border control are vulnerable to attacks such as face morphing, which blends two or more facial images into a single morphed image. The vulnerability of FRSs to many types of attacks, both direct and indirect, including face-morphing attacks, has garnered significant attention from the biometric field. A morphing attack aims to undermine the security of an FRS at an automated border control (ABC) gate by using an electronic machine-readable travel document (eMRTD), or e-passport, that was acquired with a morphed face image. Most countries require applicants for an e-passport to present a passport photograph during the application process. A person with malicious intent and a collaborator can therefore create a morphed facial image to illegally obtain an e-passport, and the fraudulent individual, together with their accomplice, can exploit that e-passport to pass through a border: because both individuals match the morphed facial image, either can authenticate against it. A malicious individual could thus cross the border undetected, concealing their criminal history, while the access control system's log records information about their accomplice, posing a significant risk. This paper aims to provide a comprehensive overview of face morphing attacks and the developments happening in this field. We go over the difficulties encountered, the methods for generating morphed images, and the pros and cons of these approaches, along with the most important performance metrics that measure the efficiency of the algorithms used. The paper also covers the deep learning and machine learning techniques used to detect morphing attacks and provides an overview of the most significant results from studies in this area of research.</p>Essa Mokna Namis, Khalid Shakir Jasim, Sufyan Al-Janabi
Copyright (c) 2024 Essa Mokna Nomis, Khalid Shakir Jasim, Sufyan Al-Janabi
https://creativecommons.org/licenses/by/4.0
2024-07-20 2024-07-20 2024 82-101 10.58496/MJBD/2024/007 Advanced Machine Learning Approaches for Enhanced GDP Nowcasting in Syria Through Comprehensive Analysis of Regularization Techniques
https://journals.mesopotamian.press/index.php/bigdata/article/view/476
<p>This study addresses the challenge of nowcasting Gross Domestic Product (GDP) in data-scarce environments, with a focus on Syria, a country facing significant economic and political instability. Utilizing a dataset from 2010 to 2022, three machine learning algorithms, Elastic Net, Ridge, and Lasso, were applied to model GDP dynamics based on macroeconomic indicators, commodity prices, and high-frequency internet search data from Google Trends. Among these, the Lasso regression model, noted for its variable selection and sparsity promotion, proved most effective in capturing Syria's complex economic realities, achieving the lowest Root Mean Squared Error (RMSE) and Mean Absolute Percentage Error (MAPE). This accuracy highlights the Lasso model's capability to identify robust economic relationships despite limited data, thereby reducing overfitting and improving forecast generalizability. The study underscores the significant impact of non-traditional indicators, such as Google Trends Agriculture (GTA) and Google Trends Consumption (GTC), on GDP growth, offering valuable insights for policymakers and analysts in data-scarce environments. The findings support the use of machine learning techniques, particularly Lasso regression, as powerful tools for economic forecasting, enhancing informed decision-making in challenging settings.</p>Khder Alakkari, Bushra Ali, Mostafa Abotaleb, Rana Ali Abttan, Pushan Kumar Dutta
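The regularized-regression comparison at the heart of this study can be sketched with scikit-learn's cross-validated estimators. The data below are synthetic (a made-up predictor matrix in which only two columns actually drive the target), purely to illustrate Lasso's variable-selection behaviour under scarce data, not to reproduce the paper's Syrian dataset:

```python
import numpy as np
from sklearn.linear_model import ElasticNetCV, LassoCV, RidgeCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(7)

# Toy quarterly panel: 8 candidate predictors (macro indicators, commodity
# prices, Google-Trends-style series); only the first two drive "GDP growth".
n, p = 52, 8
X = rng.normal(size=(n, p))
gdp_growth = 1.5 * X[:, 0] - 0.8 * X[:, 1] + rng.normal(scale=0.3, size=n)

models = {
    "lasso": make_pipeline(StandardScaler(), LassoCV(cv=5, random_state=0)),
    "ridge": make_pipeline(StandardScaler(), RidgeCV()),
    "elastic_net": make_pipeline(StandardScaler(), ElasticNetCV(cv=5, random_state=0)),
}
for model in models.values():
    model.fit(X[:-4], gdp_growth[:-4])        # hold out the last four "quarters"

# Lasso's sparsity: coefficients of irrelevant predictors are driven to zero,
# which is what makes it attractive when observations are scarce.
lasso_coef = models["lasso"].named_steps["lassocv"].coef_
```

Comparing held-out RMSE/MAPE across the three fitted models mirrors the paper's evaluation; the sparse coefficient vector shows why Lasso generalizes well with few observations.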
Copyright (c) 2024 Khder Alakkari, Bushra Ali, Mostafa Abotaleb, Rana Ali Abttan, Pushan Kumar Dutta
https://creativecommons.org/licenses/by/4.0
2024-08-02 2024-08-02 2024 102-117 10.58496/MJBD/2024/008 Using Data Anonymization in Big Data Analytics Security and Privacy
https://journals.mesopotamian.press/index.php/bigdata/article/view/479
<p>Big Data and Analytics refer to an enormous and complex collection of very diverse information, processed with various technologies and methods to produce and deliver useful and valuable insights. Analytics is the science of using data to extract useful and actionable insights, facts, and knowledge from a collection of data. It could be said that Big Data Analytics is the best thing since every commercial data system ever built, although anyone with a more cautious view of technology would note that there is a fine line where "everything data" crosses the boundary into something else, especially with regard to privacy and security. Privacy and security are two distinct but closely related phenomena: whereas privacy refers to control over access to the individual, security refers to the stability or strength of the controls designed to protect the individual's privacy. There are many obvious considerations and obstacles when attempting to share data securely. During big data analytics, invasive techniques such as data fusion, cross-correlation, and algorithm training are often conducted over shared data, which can lead to severe privacy leaks. This means that every enterprise, organization, and individual maintaining large data repositories is in danger of being breached. Our study shows that security, privacy, and ethical concerns in big data analytics do not exist in parallel to the business cycle but must be wisely and ethically managed in coherence throughout all emerging processes of big data and information systems.</p>Abdulatif Ali Hussain, Ismael Khaleel, Tahsien Al-Quraishi
Copyright (c) 2024 Abdulatif Ali Hussain, Ismael Khaleel, Tahsien Al-Quraishi
https://creativecommons.org/licenses/by/4.0
2024-08-10 2024-08-10 2024 118-127 10.58496/MJBD/2024/009 Harnessing the Potential of Artificial Intelligence in Managing Viral Hepatitis
https://journals.mesopotamian.press/index.php/bigdata/article/view/484
<p>Viral hepatitis continues to be a serious global health concern, impacting millions of people, putting a strain on healthcare systems across the world, and causing significant morbidity and mortality. Traditional diagnostic, prognostic, and therapeutic procedures to address viral hepatitis are successful but have limits in accuracy, speed, and accessibility. Artificial intelligence (AI) advancement provides substantial opportunities to overcome these challenges. This study investigates the role of AI in revolutionizing viral hepatitis care, from early detection to therapy optimization and epidemiological surveillance. A comprehensive literature review was conducted using predefined keywords in the Nature, PLOS ONE, PubMed, Frontiers, Wiley Online Library, BMC, Taylor & Francis, Springer, ScienceDirect, MDPI, IEEE Xplore Digital Library, and Google Scholar databases. Peer-reviewed publications written in English between January 2019 and August 2024 were examined. The data of the selected research papers were synthesized and analyzed using thematic and narrative analysis techniques. The use of AI-driven algorithms in viral hepatitis control involves many significant aspects. AI improves diagnostic accuracy by integrating machine learning (ML) models with serological, genomic, and imaging data. It enables tailored treatment plans by assessing patient-specific characteristics and predicting therapy responses. AI-powered technologies aid in epidemiological modeling, and AI-powered systems effectively track treatment adherence, identify medication resistance, and control complications associated with chronic hepatitis infections. It is vital in identifying new antiviral medicines and vaccines, speeding the development pipeline through high-throughput screening and predictive modeling. 
Despite its transformational promise, using AI in viral hepatitis care presents various challenges, including data privacy concerns, the necessity for extensive and varied datasets, and the possibility of algorithmic biases. Ethical considerations, legal frameworks, and multidisciplinary collaboration are required to resolve these issues and ensure AI technology's safe and successful use in clinical practice. Exploiting AI's full potential for viral hepatitis management provides unparalleled prospects to improve patient outcomes, optimize public health policies, and, eventually, alleviate the disease's negative impact worldwide. This study seeks to provide academics, clinicians, and policymakers with the fundamental knowledge they need to harness AI's potential in the fight against viral hepatitis.</p>Guma Ali, Maad M. Mijwil, Ioannis Adamopoulos, Bosco Apparatus Buruga, Murat Gök, Malik Sallam
Copyright (c) 2024 Guma Ali, Maad M. Mijwil, Ioannis Adamopoulos, Bosco Apparatus Buruga, Murat Gök, Malik Sallam
https://creativecommons.org/licenses/by/4.0
2024-08-15 2024-08-15 2024 128-163 10.58496/MJBD/2024/010 Hybrid Model for Forecasting Temperature in Khartoum Based on CRU Data
https://journals.mesopotamian.press/index.php/bigdata/article/view/490
<p>This study leverages historical climatic data from the Climatic Research Unit (CRU), spanning 1901 to 2022, to develop advanced temperature forecasting models for Khartoum, Sudan. By applying state-of-the-art machine learning techniques, including a hybrid model, we aim to improve the accuracy of temperature forecasts in a semi-arid climate. The integration of long-term CRU data allows for the identification of climate trends and patterns, enhancing the reliability of short- and long-term forecasts. Improved temperature forecasting can significantly benefit critical sectors, enabling better adaptation to climatic changes and extreme weather events. Our approach demonstrates the potential of combining historical climate data with machine learning to provide actionable insights for climate resilience.</p>Hussein Alkattan, Alhumaima Ali Subhi, Laith Farhan, Ghazwan Al-mashhadani
Copyright (c) 2024 Hussein Alkattan, Alhumaima Ali Subhi, Laith Farhan, Ghazwan Al-mashhadani
https://creativecommons.org/licenses/by/4.0
2024-08-20 2024-08-20 2024 164-174 10.58496/MJBD/2024/011 A Framework for Automated Big Data Analytics in Cybersecurity Threat Detection
https://journals.mesopotamian.press/index.php/bigdata/article/view/544
<p>This research presents a novel framework designed to enhance cybersecurity through the integration of Big Data analytics, addressing the critical need for scalable and real-time threat detection in large-scale environments. Utilizing technologies such as Apache Kafka for efficient data ingestion, Apache Flink for stream processing, and advanced machine learning models like LSTM and Autoencoders, the framework offers robust anomaly detection capabilities. It also includes automated response mechanisms using SOAR and XDR systems, significantly improving response times and accuracy in threat mitigation. The proposed solution not only addresses current challenges in handling vast and complex data but also paves the way for future advancements, such as the integration of more sophisticated AI techniques and application across various domains, including IoT and cloud security. This research contributes to the field by providing a comprehensive, adaptive, and scalable framework that meets the demands of modern cybersecurity landscapes.</p>Mohamed Ariff Ameedeen, Rula A. Hamid, Theyazn H H Aldhyani, Laith Abdul Khaliq Mohammed Al-Nassr, Sunday Olusanya Olatunji, Priyavahani Subramanian
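The anomaly-detection core of such a framework, an autoencoder scored by reconstruction error, can be sketched without the Kafka/Flink plumbing. Here scikit-learn's MLPRegressor is trained to reproduce its own standardized input through a narrow bottleneck; the "flow features", their distributions, and the 99th-percentile threshold are invented for illustration and merely stand in for the framework's LSTM/Autoencoder models:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)

# Toy "network-flow" features: log bytes, log packets, duration, port entropy
normal_flows = rng.normal(loc=[8.0, 4.0, 1.0, 2.0], scale=0.5, size=(2000, 4))
attack_flows = rng.normal(loc=[12.0, 9.0, 0.2, 0.1], scale=0.5, size=(50, 4))

scaler = StandardScaler().fit(normal_flows)
ae = MLPRegressor(hidden_layer_sizes=(2,),     # narrow bottleneck forces compression
                  activation="tanh", max_iter=2000, random_state=0)
Z = scaler.transform(normal_flows)
ae.fit(Z, Z)                                   # train the net to reproduce its input

def reconstruction_error(flows):
    z = scaler.transform(flows)
    return np.mean((ae.predict(z) - z) ** 2, axis=1)

# Alert on anything above the 99th percentile of error seen on normal traffic
threshold = np.quantile(reconstruction_error(normal_flows), 0.99)
alerts = reconstruction_error(attack_flows) > threshold
```

In a deployed system the scoring function would sit inside the stream processor, with alerts forwarded to the SOAR/XDR layer for automated response.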
Copyright (c) 2024 Mohamed Ariff Ameedeen, Rula A. Hamid, Theyazn H H Aldhyani, Laith Abdul Khaliq Mohammed Al-Nassr
https://creativecommons.org/licenses/by/4.0
2024-09-25 2024-09-25 2024 175-184 10.58496/MJBD/2024/012 Deep Learning Approaches for Gender Classification from Facial Images
https://journals.mesopotamian.press/index.php/bigdata/article/view/553
<p>Gender recognition at the facial level is considered one of the most important technologies, finding use in fields such as personalized marketing, secure authentication systems, and effective human-computer interfaces. However, it faces challenges including variation in lighting, facial movement, and differences across ethnicity and age. AI and deep learning (DL) have been improving the effectiveness, flexibility, and speed of gender classification systems: AI enables complex and automatic feature learning from data, while DL is tailored to handle variation in vision-based data. In this paper, we evaluated several architectures, including EfficientNet_B2, ResNet50, ResNet18, and Lightning, and determined their performance on gender classification tasks. The assessment criteria included accuracy, precision, recall, and the F1-score. We found that ResNet18 had the highest scores on all metrics, with a validation accuracy above 98%, closely followed by ResNet50, which also performed well but needed more epochs for convergence. The implications of this study for future work in gender classification technology include the identification of ethical, dependable, and effective techniques. By considering the state of the art and case studies, stakeholders can optimise the efficacy and accountability of such systems and thus support societal gains from improvements in the technology.</p>Mustafa Abdulfattah Habeeb, Yahya Layth Khaleel, Reem D. Ismail, Z.T. Al-Qaysi, Fatimah N. Ameen
Copyright (c) 2024 Mustafa Abdulfattah Habeeb, Yahya Layth Khaleel, Reem D. Ismail, Z.T. Al-Qaysi, Fatimah N. Ameen
https://creativecommons.org/licenses/by/4.0
2024-10-11 2024-10-11 2024 185-198 10.58496/MJBD/2024/013 Hybrid Spotted Hyena based Load Balancing algorithm (HSHLB)
https://journals.mesopotamian.press/index.php/bigdata/article/view/332
<p>In this paper, we delve into the core of our proposed dynamic load balancing mechanism, the Hybrid Spotted Hyena based Load Balancing algorithm (HSHLB). We begin by presenting a comprehensive overview of the Spotted Hyena Optimization Algorithm (SHOA) and the fundamental behaviors it encompasses, namely migration and attacking. We then introduce the Load Balancing Algorithm (LB) and outline its principles, emphasizing its distinct characteristics. Recognizing the strengths and weaknesses of both SHOA and LB, we hybridize these two optimization techniques, resulting in the potent HSHLB. We elucidate the intricate process of combining these algorithms, showing how HSHLB harnesses their respective strengths while mitigating their limitations. This hybridization is pivotal to the overarching goal of achieving dynamic load balancing within cloud computing environments. We provide invaluable insights into the inner workings of HSHLB, offering readers a comprehensive understanding of its algorithmic steps, parameters, and intricacies. For clarity and enhanced comprehension, we incorporate pseudocode and flowcharts to illustrate the practical implementation of HSHLB. In sum, this work lays the foundation for the practical application of HSHLB in achieving dynamic load balancing, setting the stage for subsequent sections where we delve into implementation details, results, and analysis. HSHLB emerges as a promising solution to the multifaceted challenges of load balancing in cloud computing, leveraging the unique strengths of SHOA and LB to optimize resource allocation and enhance Quality of Service (QoS).</p>Raed A. Hasan, Mustafa M. Akawee, Enas F. Aziz, Omar A. Hammood, Aws Saad Showket, Husniyah Jasim, Mohammed A. Alkhafaji, Omar K. Ahmed, Khalil Yasin
Copyright (c) 2024 Raed A. Hasan, Mustafa M. Akawee, Enas F. Aziz, Omar A. Hammood, Aws Saad Showket, Husniyah Jasim, Mohammed A. Alkhafaji, Omar K. Ahmed, Khalil Yasin
https://creativecommons.org/licenses/by/4.0
2024-11-02 2024-11-02 2024 199-210 10.58496/MJBD/2024/014 Automated Water Quality Assessment Using Big Data Analytics
https://journals.mesopotamian.press/index.php/bigdata/article/view/582
<p>Water is one of the world's most precious resources, essential to life. Industrial waste, agricultural runoff, and urban discharge degrade water, rendering it unfit for consumption, so water quality monitoring and evaluation are more important than ever. Big Data analytics is used here to examine water quality using large datasets of pH, hardness, solids concentration, chloramines, sulfate, conductivity, organic carbon, trihalomethanes, and turbidity. This work classifies water potability, which is vital for human consumption, using robust machine learning on massive datasets. The classifiers, Random Forest, Gradient Boosting, and Support Vector Machine, were evaluated on data from 3,276 water bodies. After significant data preparation and training, the Random Forest classifier obtained the highest accuracy at 66.77%, followed by Gradient Boosting at 66.01% and SVM at 62.80%. This shows that Big Data analytics and machine learning algorithms can interpret complex water quality data for public health and natural resource management.</p> <p>The prediction models in this study estimate water potability from water cleanliness data and may aid public safety and water resource monitoring.</p>Yasmin Makki Mohialden, Nadia Mahmood Hussien, Saba Abdulbaqi Salman
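The three-classifier comparison can be reproduced in outline with scikit-learn. The table below is synthetic (a toy potability rule with 10% label noise over made-up feature distributions), so the accuracies will not match the paper's 62-67% figures; the point is the shape of the pipeline:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(3)

# Synthetic stand-in for the water-potability table (pH, hardness, solids, ...)
n = 3276
df = pd.DataFrame({
    "ph": rng.normal(7.0, 1.5, n),
    "hardness": rng.normal(196, 33, n),
    "solids": rng.normal(22000, 8800, n),
    "chloramines": rng.normal(7.1, 1.6, n),
    "turbidity": rng.normal(4.0, 0.8, n),
})
# Toy rule: potable when not too acidic and chloramines not too high,
# with 10% of labels flipped to mimic measurement noise.
rule = (df["ph"] > 6.2) & (df["chloramines"] < 7.9)
y = np.where(rng.random(n) < 0.9, rule, ~rule).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(df, y, test_size=0.25, random_state=0)

models = {
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
    "svm": make_pipeline(StandardScaler(), SVC()),
}
accuracy = {name: accuracy_score(y_te, m.fit(X_tr, y_tr).predict(X_te))
            for name, m in models.items()}
```

On the real Kaggle-style potability data the signal is far weaker, which is why the paper's accuracies cluster in the mid-60s rather than near the 90% ceiling this toy rule allows.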
Copyright (c) 2024 Yasmin Makki Mohialden, Nadia Mahmood Hussien, Saba Abdulbaqi Salman
https://creativecommons.org/licenses/by/4.0
2024-11-07 2024-11-07 2024 211-222 10.58496/MJBD/2024/015