https://journals.mesopotamian.press/index.php/ADSA/issue/feed Applied Data Science and Analysis 2024-10-20T17:39:30+00:00 Open Journal Systems <p style="text-align: justify;">Applied Data Science and Analysis is a respected journal dedicated to real-world applications of data science. It provides a platform for experts to share innovative ideas and methodologies. Focused on bridging theory and practice, it showcases cutting-edge research and case studies in data analysis, machine learning, and related areas. Welcoming diverse contributions from fields such as business, healthcare, and the social sciences, the journal fosters collaboration among data professionals and aims to advance the impact of data science in practical settings.</p> https://journals.mesopotamian.press/index.php/ADSA/article/view/395 Big Data Predictive Analytics for Personalized Medicine: Perspectives and Challenges 2024-07-07T08:04:57+00:00 Tahsien Al-Quraishi tahsien.a@vit.edu.au Naseer Al-Quraishi naseerali@alayen.edu.iq Hussein AlNabulsi Hussein.a@vit.edu.au Hussein AL-Qarishey halqarish@ltu.edu.us Ahmed Hussein Ali ahmed.ali@aliraqia.edu.iq <p>The integration of predictive analytics into personalized medicine has become a promising approach for improving patient outcomes and treatment efficacy. This paper reviews the field, examining the tools, methodologies, and challenges associated with predictive analytics in personalized medicine. Predictive analytics leverages machine learning algorithms to analyze vast datasets, including Electronic Health Records (EHRs), genomic data, medical imaging, and real-time data from wearable devices. The review explores key tools such as the Hadoop Distributed File System (HDFS), Apache Spark, and Apache Hive, which facilitate scalable storage, efficient data processing, and comprehensive data analysis. Key challenges identified include managing the immense volume of healthcare data, ensuring data quality and integration, and addressing privacy and security concerns.
The paper also highlights the difficulties of achieving real-time data processing and integrating predictive insights into clinical practice. Effective data governance and ethical considerations are critical to maintaining trust and transparency. The strategic use of big data tools, combined with investment in skill development and interdisciplinary collaboration, is essential for harnessing the full potential of predictive analytics in personalized medicine. By overcoming these challenges, healthcare providers can enhance patient care, optimize resource management, and drive medical discoveries, ultimately revolutionizing healthcare delivery on a global scale.</p> 2024-04-11T00:00:00+00:00 Copyright (c) 2024 Tahsien Al-Quraishi, Naseer Al-Quraishi, Hussein AlNabulsi, Hussein AL-Qarishey, Ahmed Hussein Ali https://journals.mesopotamian.press/index.php/ADSA/article/view/511 Emerging Trends in Applying Artificial Intelligence to Monkeypox Disease: A Bibliometric Analysis 2024-09-10T10:15:12+00:00 Yahya Layth Khaleel yahya@tu.edu.iq Mustafa Abdulfattah Habeeb yahya@tu.edu.iq Rabab Benotsmane yahya@tu.edu.iq <p>Monkeypox is a relatively rare viral infectious disease that initially received little attention but has recently become a public health concern. Artificial intelligence (AI) techniques are considered beneficial for diagnosing and identifying Monkeypox from medical big data, including medical imaging and other details from patient information systems. This work therefore combines AI and bibliometrics in a bibliometric analysis that discusses trends and future research opportunities in Monkeypox. A search across various databases was performed, and the titles and abstracts of the articles were reviewed, yielding a total of 251 articles. After duplicates and irrelevant papers were eliminated, 108 articles were found suitable for the study.
In reviewing these studies, attention was given to who contributed to the topics or fields, what new topics appeared over time, and which papers were most notable. The main added value of this work is to show the reader how to conduct a correct and comprehensive bibliometric analysis by examining a real case study related to Monkeypox disease. The study shows that AI has great potential to improve diagnostics, treatment, and public health recommendations connected with Monkeypox. Applying AI to Monkeypox research may enhance public health responses and outcomes, since it can hasten the identification of effective interventions.</p> 2024-09-08T00:00:00+00:00 Copyright (c) 2024 Yahya Layth Khaleel, Mustafa Abdulfattah Habeeb, Rabab Benotsmane https://journals.mesopotamian.press/index.php/ADSA/article/view/391 Deep Transfer Learning Model for EEG Biometric Decoding 2024-07-07T08:05:01+00:00 Rasha A. Aljanabi ziadoontareq@tu.edu.iq Z.T. Al-Qaysi ziadoontareq@tu.edu.iq M. S Suzani ziadoontareq@tu.edu.iq <p>In automated systems, biometric systems can be used for efficient and unique identification and authentication of individuals without requiring users to carry or remember any physical tokens or passwords. In contrast with conventional methods such as passwords and IDs, biometric systems are a rapidly developing and promising technology domain. Biometrics refers to biological measures or physical traits that can be employed to identify and authenticate individuals. The motivation to employ brain activity as a biometric identifier in automatic identification systems has increased substantially in recent years, with a specific focus on data obtained through electroencephalography (EEG). Numerous investigations have revealed the existence of discriminative characteristics in brain signals captured during different types of cognitive tasks.
However, because of their high-dimensional and nonstationary properties, EEG signals are inherently complex, so both feature extraction and classification methods must take this into account. In this study, a hybridization method was employed that combines a classical classifier with a pre-trained convolutional neural network (CNN) and the short-time Fourier transform (STFT) spectrum. For tasks such as subject identification and lock and unlock classification, we employed a hybrid model in mobile biometric authentication to decode two-class motor imagery (MI) signals. Nine distinct hybrid models were built using nine candidate classification algorithms, and the best one was selected. The experimental portion of this study comprised six experiments. The first experiment aimed to create a hybrid model for biometric authentication tasks by comparing the nine hybrid models. The RF-VGG19 model performed better than the other models and was therefore chosen as the method for mobile biometric authentication. The second experiment validated the performance of the RF-VGG19 model, and the third experiment verified it. The fourth experiment performed the lock and unlock classification with an average accuracy of 91.0% using the RF-VGG19 model. The fifth experiment verified the accuracy and effectiveness of the RF-VGG19 model in performing the lock and unlock task, achieving a mean accuracy of 94.40%. The sixth experiment validated the RF-VGG19 model for the lock and unlock task on a different dataset (unseen data) and achieved an accuracy of 92.8%. These results indicate that the hybrid model is able to decode left-hand and right-hand MI signals.
Consequently, the RF-VGG19 model can aid the BCI-MI community by simplifying the implementation of mobile biometric authentication, specifically subject identification and lock and unlock classification.</p> 2024-02-28T00:00:00+00:00 Copyright (c) 2024 Rasha A. Aljanabi, Z.T. Al-Qaysi, M. S Suzani https://journals.mesopotamian.press/index.php/ADSA/article/view/449 Transforming Amazon's Operations: Leveraging Oracle Cloud-Based ERP with Advanced Analytics for Data-Driven Success 2024-08-12T07:44:38+00:00 Tahsien Al-Quraishi bba67725@gmail.com Osama A. Mahdi bba67725@gmail.com Ali Abusalem bba67725@gmail.com Chee Keong NG bba67725@gmail.com Amoakoh Gyasi bba67725@gmail.com Omar Al-Boridi bba67725@gmail.com Naseer Al-Quraishi bba67725@gmail.com <p><strong>Background:</strong> This paper explores Amazon's adoption of Oracle ERP Cloud, focusing on the strategic benefits, the challenges, and the wider implications of implementing cloud-based ERP solutions within one of the world's largest and most complex enterprises. It details how a strict selection process led Amazon to choose Oracle ERP Cloud from among several leading ERP systems on the market, and it presents the criteria and evaluations that guided this decision.</p> <p><strong>Method:</strong> The analysis focuses on the phased rollout strategy, showing how Amazon introduced the ERP system incrementally across departments, beginning with finance and procurement. It underlines the important role of cross-functional teamwork, describing the collaboration between finance, supply chain, HR, and IT teams to smooth implementation.</p> <p><strong>Results:</strong> The study shows how deep technologies such as AI, machine learning, the Internet of Things, and blockchain are integrated into the ERP system. 
These technologies strengthen decision-making and operational security at Amazon by providing real-time analytics, predictive insights, and improved transparency.</p> <p><strong>Conclusion:</strong> The implementation of Oracle ERP Cloud at Amazon illustrates how scalable and cost-efficient cloud-based ERP solutions can be. The availability of real-time data access and advanced analytics has spurred data-driven decision-making, but issues such as data migration and security require careful consideration in the planning process. This work provides valuable insights for enterprises seeking to implement similar ERP systems.</p> 2024-07-10T00:00:00+00:00 Copyright (c) 2024 Tahsien Al-Quraishi, Osama A. Mahdi, Ali Abusalem, Chee Keong NG, Amoakoh Gyasi, Omar Al-Boridi, Naseer Al-Quraishi https://journals.mesopotamian.press/index.php/ADSA/article/view/136 Optimal Time Window Selection in the Wavelet Signal Domain for Brain–Computer Interfaces in Wheelchair Steering Control 2024-07-07T08:04:50+00:00 Z.T. Al-Qaysi ziadoontareq@tu.edu.iq M. S Suzani ziadoontareq@tu.edu.iq Nazre bin Abdul Rashid ziadoontareq@tu.edu.iq Rasha A. Aljanabi ziadoontareq@tu.edu.iq Reem D. Ismail ziadoontareq@tu.edu.iq M.A. Ahmed ziadoontareq@tu.edu.iq Wan Aliaa Wan Sulaiman ziadoontareq@tu.edu.iq Harish Kumar ziadoontareq@tu.edu.iq <p>Background and objective: Pattern recognition, and segmentation in particular, plays a significant role in a BCI-based wheelchair control system, because recognition errors can trigger the wrong command and put the user in unsafe situations. Each subject may have different motor-imagery signal power at different times in the trial, because he or she may start (or end) performing the motor-imagery task at slightly different times due to differences in the complexity of his or her brain.
Therefore, the primary goal of this research is to develop an EEG-MI brain-computer interface for wheelchair steering control based on a generic pattern recognition model (GPRM). A simplified and well-generalized pattern recognition model is essential for EEG-MI-based BCI applications. Methods: First, bandpass filtering and segmentation with multiple time windows were used to denoise the EEG-MI signal and find the duration that best contains the MI feature components. Then, five statistical features, namely the minimum, maximum, mean, median, and standard deviation, were used to extract the MI feature components from the wavelet coefficients. Finally, seven machine learning methods were evaluated to find the best classifier. Results: The best durations in the time-frequency domain were in the range of 4-7 s. The GPRM based on the LR classifier achieved a classification accuracy of 85.7%.</p> 2024-06-15T00:00:00+00:00 Copyright (c) 2024 Z.T. Al-Qaysi, M. S Suzani, Nazre bin Abdul Rashid, Rasha A. Aljanabi, Reem D. Ismail, M.A. Ahmed, Wan Aliaa Wan Sulaiman, Harish Kumar https://journals.mesopotamian.press/index.php/ADSA/article/view/437 A Frequency-Domain Pattern Recognition Model for Motor Imagery-Based Brain-Computer Interface 2024-07-07T08:04:47+00:00 Z.T. Al-Qaysi ziadoontareq@tu.edu.iq M. S Suzani ziadoontareq@tu.edu.iq Nazre bin Abdul Rashid ziadoontareq@tu.edu.iq Reem D. Ismail ziadoontareq@tu.edu.iq M.A. Ahmed ziadoontareq@tu.edu.iq Wan Aliaa Wan Sulaiman ziadoontareq@tu.edu.iq Rasha A. Aljanabi ziadoontareq@tu.edu.iq <p>A brain-computer interface (BCI) is an appropriate technique for totally paralyzed people with a healthy brain. Motor imagery (MI)-based BCI is a common approach, widely used in neuroscience, rehabilitation engineering, and wheelchair control.
In a BCI-based wheelchair control system, pattern recognition, in terms of preprocessing, feature extraction, and classification, plays a significant role in system performance, because recognition errors can lead to the wrong command and put the user in unsafe conditions. The main objective of this study is to develop a generic pattern recognition model for an EEG-MI-based brain-computer interface for wheelchair steering control. In terms of preprocessing, signal filtering and segmentation with multiple time windows were used for de-noising and finding the MI feedback. In terms of feature extraction, five statistical features (mean, median, min, max, and standard deviation) were used to extract signal features in the frequency domain. In terms of classification, seven machine learning methods were used to find the best single and hybrid classifiers for the generic model. EEG data from the BCI Competition dataset (Graz University) were used to validate the developed model. The results of this study are as follows: (1) from the preprocessing perspective, the two-second time window is optimal for extracting MI signal feedback; (2) statistical features are efficient for extracting EEG-MI features in the frequency domain; (3) classification using MLP-LR is well suited to a frequency-domain generic pattern recognition model. Finally, it can be concluded that the generic pattern recognition model with a hybrid classifier is efficient and can be deployed in a real-time EEG-MI-based wheelchair control system.</p> 2024-06-20T00:00:00+00:00 Copyright (c) 2024 Z.T. Al-Qaysi, M. S Suzani, Nazre bin Abdul Rashid, Reem D. Ismail, M.A. Ahmed, Rasha A.
Aljanabi, Mohd Arfian Ismail https://journals.mesopotamian.press/index.php/ADSA/article/view/397 An Innovative Method of Malicious Code Injection Attacks on Websites 2024-07-07T08:04:56+00:00 Hussein Alnabulsi hussein.a@vit.edu.au Rafiqul Islam hussein.a@vit.edu.au Izzat Alsmadi hussein.a@vit.edu.au Savitri Bevinakoppa hussein.a@vit.edu.au <p>This paper provides a model to identify website vulnerability to Code Injection Attacks (CIAs). The proposed model checks whether various websites are vulnerable to CIAs. The lack of existing models for checking against code injection motivated this paper to present a new and enhanced model against web code injection attacks that use SQL injection and Cross-Site Scripting (XSS) injection. This paper presents a self-checking protection model that enables web administrators to know whether their current protection program is adequate or whether a website needs stronger protection against CIAs. The automated injection model checks for vulnerability to code injection. The checking methodology consists of many intrusion methods that an attacker may use to launch code injection attacks. The methodology gives higher precision in CIA vulnerability checking for a website than other approaches (the minimum accuracy difference between the proposed approach and other approaches is 3.15%). CIAs can be a serious problem for vulnerable websites, leading to the theft, deletion, or alteration of important data. Extensive experiments are conducted and compared with existing research [e.g. 1, 5, and 9] to study the effectiveness of the proposed model in checking whether a website is vulnerable to CIAs. The performance of the suggested approach has been tested on SQL injections and XSS injections.
The experiments showed that the detection rate of our model is 95.27% and the false positive rate is 5.55%.</p> 2024-05-20T00:00:00+00:00 Copyright (c) 2024 Hussein Alnabulsi, Rafiqul Islam, Izzat Alsmadi, Savitri Bevinakoppa https://journals.mesopotamian.press/index.php/ADSA/article/view/567 Lexicon annotation in sentiment analysis for dialectal Arabic: Consensus Expert Standardized Criteria 2024-10-20T17:39:30+00:00 Sameh M. Sherif alamoodi.abdullah91@gmail.com A.H. Alamoodi alamoodi.abdullah91@gmail.com <p>Sentiment Analysis (SA) in Natural Language Processing (NLP) involves analyzing perceptions, attitudes, and emotions in text. It is crucial for decision-making and consumer insights. Recent studies focus on developing lexicons for SA research. Understanding the construction and evaluation of existing lexicons is key to advancing development efforts. Evaluation and benchmarking of lexicons are vital for identifying the most suitable ones and establishing best practices. Factors such as effectiveness and importance must be considered when building or selecting lexicons. This research outlines three key phases: Determining Lexicons, Identifying Evaluation Criteria, and Engaging Experts. The study aims to enhance understanding of lexicon development processes and improve future guidelines. Efforts in lexicon development can benefit from a structured approach that considers various criteria for evaluation. The research emphasizes the importance of expert input in refining lexicons for optimal performance. Evaluating lexical criteria helps identify gaps and areas for improvement in sentiment analysis tools. Benchmarking different lexicons aids in selecting the most appropriate ones for specific applications or domains. Establishing best practices in lexicon development involves thorough evaluation against predefined criteria to ensure quality and reliability.
Expert opinions play a crucial role in validating the significance of developed lexicons for sentiment analysis tasks. The research methodology involves the systematic identification of lexicons, relevant criteria, and experts to inform best practices in the field of sentiment analysis. By focusing on these three key phases, this study aims to contribute valuable insights into enhancing sentiment analysis through improved lexicon development processes.</p> 2024-09-24T00:00:00+00:00 Copyright (c) 2024 Sameh M. Sherif, A.H. Alamoodi https://journals.mesopotamian.press/index.php/ADSA/article/view/394 Semantic Image Retrieval Analysis Based on Deep Learning and Singular Value Decomposition 2024-08-12T07:44:44+00:00 M.H. Hadid marwa.h.hadid@tu.edu.iq Z.T. Al-Qaysi ziadoontareq@tu.edu.iq Qasim Mohammed Hussein marwa.h.hadid@tu.edu.iq Rasha A. Aljanabi marwa.h.hadid@tu.edu.iq Israa Rafaa Abdulqader marwa.h.hadid@tu.edu.iq M. S Suzani marwa.h.hadid@tu.edu.iq WL Shir marwa.h.hadid@tu.edu.iq <p>The exponential growth in the quantity of digital images has necessitated the development of systems capable of retrieving these images. Content-based image retrieval is a technique used to retrieve images from a database. The user provides a query image, and the system retrieves the images in the database that are most similar to it. The image retrieval problem is the task of locating digital images within extensive datasets. Image retrieval researchers are moving from keywords to low-level characteristics and semantic features. The push for semantic features arises because keywords are subjective and time-consuming to assign, and because low-level characteristics are limited in capturing the high-level concepts that users have in mind. The main goal of this study is to examine how convolutional neural networks can be used to acquire advanced visual features.
These high-level feature descriptors have the potential to represent images more effectively than handcrafted feature descriptors, which would result in improved image retrieval performance. The proposed CBIR-VGGSVD model is a content-based image retrieval solution based on the VGG-16 network and the Singular Value Decomposition (SVD) technique. The model uses VGG-16 to extract features from both the query images and the images stored in the database. The dimensionality of the features extracted by VGG-16 is then reduced using SVD. Next, the query images are compared with the dataset images using the cosine metric to measure their similarity. Finally, the images most similar to the query are retrieved from the dataset. The retrieval performance of the CBIR-VGGSVD model is validated on the Corel-1K dataset. When the standard VGG-16 model is used alone, the implementation produces an average precision of 0.864, whereas the CBIR-VGGSVD model achieves an average precision of 0.948. The retrieval results confirm that the CBIR-VGGSVD model improves performance on the test images used, surpassing the performance of the most recent approaches.</p> 2024-03-25T00:00:00+00:00 Copyright (c) 2024 M.H. Hadid, Z.T. Al-Qaysi, Qasim Mohammed Hussein, Rasha A. Aljanabi, Israa Rafaa Abdulqader, M.
S Suzani, WL Shir https://journals.mesopotamian.press/index.php/ADSA/article/view/489 Adversarial Attacks in Machine Learning: Key Insights and Defense Approaches 2024-08-27T04:10:48+00:00 Yahya Layth Khaleel yahya@tu.edu.iq Mustafa Abdulfattah Habeeb yahya@tu.edu.iq Hussein Alnabulsi yahya@tu.edu.iq <p>Adversarial attacks pose a considerable threat to fields such as machine learning; they involve purposely feeding a system data that alters its decision region. These attacks present manipulated data to machine learning models so that the models misclassify or mispredict. The field of study is still relatively young and has yet to develop a strong body of scientific research that closes the gaps in current knowledge. This paper provides a literature review of adversarial attacks and defenses based on highly cited articles and conference papers indexed in the Scopus database. Through the classification and assessment of 128 articles (80 original papers and 48 review papers) published up to May 15, 2024, this study categorizes and reviews the literature from different domains, such as Graph Neural Networks, Deep Learning Models for IoT Systems, and others. The review reports findings on identified metrics, citation analysis, and contributions from these studies, and suggests areas for further research and development in adversarial robustness and protection mechanisms. The objective of this work is to present the basic background of adversarial attacks and defenses and the need to maintain the adaptability of machine learning platforms.
In this context, the objective is to contribute to building efficient and sustainable protection mechanisms for AI applications in various industries.</p> 2024-08-07T00:00:00+00:00 Copyright (c) 2024 Yahya Layth Khaleel, Mustafa Abdulfattah Habeeb, Hussein Alnabulsi https://journals.mesopotamian.press/index.php/ADSA/article/view/341 Harnessing the Tide of Innovation: The Dual Faces of Generative AI in Applied Sciences; Letter to Editor 2024-07-07T08:05:04+00:00 A.S. Albahri ahmed.albahri@ijsu.edu.iq Idrees A. Zahid ahmed.albahri@ijsu.edu.iq Mohanad G. Yaseen ahmed.ali@aliraqia.edu.iq Mohammad Aljanabi ahmed.albahri@ijsu.edu.iq Ahmed Hussein Ali ahmed.albahri@ijsu.edu.iq Akhmed Kaleel ahmed.albahri@ijsu.edu.iq <p>Advancements in Artificial Intelligence (AI) and its emerging generative capabilities have introduced paradoxical aspects. On the one hand, AI has a positive impact and brings great power to its users. On the other hand, concerns about the misuse of this powerful tool have steadily increased [1]. AI advancements affect all domains and sectors as they evolve in their applicability to the applied sciences. The more powerful AI becomes, the more influence it has on the model workflow within a specific domain and its applied field [2]. This dual nature of generative AI has ignited a wide discussion on implementation and a debate over the latest tools and technologies employed by scientists and researchers.</p> 2024-01-10T00:00:00+00:00 Copyright (c) 2024 A.S. Albahri, Idrees A. Zahid, Mohanad G. Yaseen, Mohammad Aljanabi, Ahmed Hussein Ali, Akhmed Kaleel https://journals.mesopotamian.press/index.php/ADSA/article/view/448 Is LiFi Technology Ready for Manufacturing and Adoption? An End-user questionnaire-based study 2024-08-12T07:44:42+00:00 Sallar Salam Murad sallarmurad@gmail.com Rozin Badeel rozinbabdal1987@gmail.com Rehem A.
Ahmed hp220093@student.uthm.edu.my <p>Because of the exponential development of emerging technologies and the growing number of devices that use the internet, the wireless fidelity (WiFi) spectrum has become saturated. Light fidelity (LiFi) has therefore been under development for wireless communication, including internet access. LiFi network systems can provide high data rates with high security. However, LiFi is still under development and research and is not yet widely used by end-users in homes, companies, and other industries. Therefore, for the first time, this study investigates the probability that end-users will adopt LiFi technology, in order to anticipate the success rate when manufacturers launch ready-to-use LiFi devices. A well-designed questionnaire is used for data collection. A total of 100 participants from around the world were chosen to fill in the questionnaire, which comprises three sections: basic information, preferences and usage, and LiFi and pricing. The findings show a high and positive probability of adoption for LiFi technology. However, pricing has a critical impact on end-users' acceptance of LiFi systems.</p> 2024-07-05T00:00:00+00:00 Copyright (c) 2024 Sallar Salam Murad, Rozin Badeel, Rehem A. Ahmed https://journals.mesopotamian.press/index.php/ADSA/article/view/405 Advanced Ensemble Classifier Techniques for Predicting Tumor Viability in Osteosarcoma Histological Slide Images 2024-07-07T08:04:53+00:00 Tahsien Al-Quraishi tahsien.a@vit.edu.au Chee Keong NG tahsien.a@vit.edu.au Osama A. Mahdi tahsien.a@vit.edu.au Amoakoh Gyasi tahsien.a@vit.edu.au Naseer Al-Quraishi tahsien.a@vit.edu.au <p><strong>Background:</strong> Osteosarcoma is considered the primary malignant tumor of the bone, arising from primitive mesenchymal cells that form osteoid or immature bone.
Accurate diagnosis and classification play a key role in management planning to achieve improved patient outcomes. Machine learning techniques may be used to augment, and potentially surpass, conventional methods for the analysis of medical data.</p> <p><strong>Methods:</strong> In the present study, a combination of feature selection techniques and classification methods was used to develop predictive models of osteosarcoma cases. The feature selection techniques were L1 Regularization (Lasso), Recursive Feature Elimination (RFE), SelectKBest, and Tree-based Feature Importance, and the classification methods were Voting Classifier, Decision Tree, Naive Bayes, Multi-Layer Perceptron, Random Forest, Logistic Regression, AdaBoost, and Gradient Boosting. Models were assessed using a combination of metrics: accuracy, precision, recall, F1 score, AUC, and V score.</p> <p><strong>Results:</strong> The combination of Tree-Based Feature Importance for feature selection and the Voting Classifier with the Decision Tree Classifier performed better than all other combinations, correctly classifying positive instances and markedly reducing false positives. Other combinations, for example L1 Regularization with the Voting Classifier and RFE with the Voting Classifier, also performed well but were slightly less effective.</p> <p><strong>Conclusion:</strong> This work presents strong evidence that advanced machine learning with ensemble classifiers and robust feature selection can improve the diagnostic accuracy and robustness of osteosarcoma classification. Future research will prioritize class imbalance and computational efficiency.</p> 2024-05-29T00:00:00+00:00 Copyright (c) 2024 Tahsien Al-Quraishi, Chee Keong NG, Osama A. Mahdi, Amoakoh Gyasi, Naseer Al-Quraishi
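<p>The voting-classifier idea and the precision, recall, and F1 metrics named in the osteosarcoma abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes hard (majority) voting over binary labels, and the function names <code>hard_vote</code> and <code>precision_recall_f1</code>, as well as the toy predictions, are hypothetical and introduced here only for illustration.</p>

```python
from collections import Counter


def hard_vote(predictions):
    """Combine base-classifier outputs by majority vote.

    `predictions` is a list with one label list per base classifier;
    all label lists must have the same length (one label per sample).
    This is an illustrative hard-voting rule, assumed for the sketch.
    """
    voted = []
    for sample_votes in zip(*predictions):
        # Pick the label that the most base classifiers agree on.
        voted.append(Counter(sample_votes).most_common(1)[0][0])
    return voted


def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the chosen positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Toy labels from three hypothetical base classifiers on four samples.
    base_predictions = [
        [1, 0, 1, 1],
        [1, 1, 0, 1],
        [0, 0, 1, 1],
    ]
    ensemble = hard_vote(base_predictions)
    print(ensemble, precision_recall_f1([1, 0, 0, 1], ensemble))
```

<p>An odd number of base classifiers avoids ties on binary labels. A false positive lowers precision but leaves recall untouched, which is why the abstract reports the reduction in false positives separately from the correct classification of positive instances.</p>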