If online learning works for you, what about deaf students? Emerging challenges of online learning for deaf and hearing-impaired students during COVID-19: a literature review

Wajdi Aljedaani, Rrezarta Krasniqi, Sanaa Aljedaani, Mohamed Wiem Mkaouer, Stephanie Ludi, Khaled Al-Raddah

Universal access in the information society


With the coronavirus (COVID-19) outbreak, educational systems worldwide were abruptly affected and hampered, causing nearly total suspension of all in-person activities in schools, colleges, and universities. Government officials prohibited physical gatherings in educational institutions to reduce the spread of the virus. Therefore, educational institutions have aggressively shifted to alternative learning methods and strategies such as online-based platforms, seemingly to avoid the disruption of education. However, the switch from the face-to-face setting to an entirely online setting introduced a series of challenges, especially for deaf or hard-of-hearing students. Various recent studies have revealed that the underlying infrastructure used by academic institutions may not be suitable for students with hearing impairments. The goal of this study is to perform a literature review of these studies and extract the pressing challenges that deaf and hard-of-hearing students have been facing since their transition to the online setting. We conducted a systematic literature review of 34 articles that were carefully collected, retrieved, and rigorously categorized from various scholarly databases. The articles included in this study focused primarily on highlighting pressing issues that deaf students experienced in higher education during the pandemic. This study contributes to the research literature by providing a detailed analysis of technological challenges hindering the learning experience of deaf students. Furthermore, the study extracts takeaways and proposed solutions from the literature for researchers, education specialists, and higher education authorities to adopt. This work calls for investigating broader and more effective teaching and learning strategies for deaf and hard-of-hearing students so that they can benefit from a better online learning experience.

Detection of Fake Job Postings by Utilizing Machine Learning and Natural Language Processing Approaches

Aashir Amaar, Wajdi Aljedaani, Furqan Rustam, Saleem Ullah, Vaibhav Rupapara, Stephanie Ludi

Neural Processing Letters


In the modern era, much of human life can be handled virtually, including online banking, education, security, and job hunting. This increase in technology use also makes it easy for scammers to defraud people and make money quickly. A popular scam nowadays is the fake job advertisement. People apply for these fake job vacancies, pay application fees, send their data to the scammers, and end up losing their money. To address this problem, we propose a methodology that uses natural language processing and supervised machine learning techniques to detect fraudulent job ads from online recruitment portals. We used two feature extraction techniques to extract features from the data: Term Frequency-Inverse Document Frequency (TF-IDF) and Bag-of-Words (BoW). In the study, we used six machine learning models to analyze whether these job ads are fraudulent or legitimate. Then, we compared all models with both BoW and TF-IDF features to analyze the classifiers' overall performance. One of the challenges in this study is the dataset used. The ratio of real to fake job post samples is unequal, which caused the models to over-fit on majority class data. To overcome this limitation, we used the adaptive synthetic sampling approach (ADASYN), which helps to balance the ratio between target classes by artificially generating samples for the minority class. We performed two experiments, one with the balanced dataset and the other with the imbalanced data. In the experimental analysis, the Extra Tree Classifier (ETC) achieved 99.9% accuracy using ADASYN for over-sampling and TF-IDF for features.
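The difference between the two feature extraction techniques named above can be illustrated with a minimal sketch. It assumes whitespace tokenization, a toy three-document corpus, and the textbook weighting tf * ln(N / df); none of these details come from the paper, and library implementations often use smoothed variants of the formula.

```python
import math
from collections import Counter

def bag_of_words(docs):
    """Return (vocabulary, count vectors) for whitespace-tokenized documents."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors

def tf_idf(docs):
    """Return (vocabulary, TF-IDF vectors) using tf * ln(N / df)."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    vectors = []
    for row in counts:
        total = sum(row) or 1  # guard against empty documents
        vectors.append(
            [(row[j] / total) * math.log(n_docs / df[j]) for j in range(len(vocab))]
        )
    return vocab, vectors

# Toy corpus (hypothetical, for illustration only).
docs = ["earn money fast", "software engineer job", "earn money from home job"]
vocab, bow = bag_of_words(docs)
_, weights = tf_idf(docs)
```

BoW keeps raw counts, so every term in a document counts equally; TF-IDF down-weights terms that appear in many documents, so in the first toy document the rarer term "fast" receives a larger weight than "earn".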

Spam SMS filtering based on text features and supervised machine learning techniques

Muhammad Adeel Abid, Saleem Ullah, Muhammad Abubakar Siddique, Muhammad Faheem Mushtaq, Wajdi Aljedaani, Furqan Rustam

Multimedia Tools and Applications


Advances in technology have left a significant mark over time, affecting every field of life, including medicine, music, office work, travel, and communication. Telephone lines were the communication medium of earlier times. Today, wireless technology has overtaken wired telephony, offering much broader features. Advertising agencies and spammers mostly use SMS as a medium to deliver their promotional material to ordinary users. For this reason, more than 60% of spam SMS are received daily. These spam messages anger users and sometimes defraud innocent users, but they create large profits for spammers and advertising companies. This study proposes an approach for the classification of spam and ham SMS using supervised machine learning techniques. Feature extraction techniques such as Term Frequency-Inverse Document Frequency (TF-IDF) and bag-of-words are used to extract features from the data. The SMS dataset used was imbalanced; to solve this problem, we used over-sampling and under-sampling techniques. The support vector classifier, gradient boosting machine, random forest, Gaussian Naive Bayes, and logistic regression are applied to the spam and ham SMS dataset, and their performance is evaluated using accuracy, precision, recall, and F1 score. The experimental results show that the random forest classifies spam and ham SMS most accurately, with 99% accuracy. The proposed model is trained well to identify the SMS category in terms of ham or spam with TF-IDF features and the over-sampling technique. The performance of the proposed approach was also evaluated on a spam email dataset, where it achieved 99% accuracy.
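The abstract does not say which over-sampling method was used. As one simple possibility, random over-sampling duplicates minority-class samples until the classes are the same size. The sketch below assumes that technique; the function name, seed, and toy messages are hypothetical.

```python
import random
from collections import Counter

def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        # Draw (with replacement) the extra copies this class needs.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels

# Toy imbalanced data (hypothetical): four ham messages, one spam message.
messages = ["hi", "see you at 5", "call me", "lunch?", "WIN a prize now"]
labels = ["ham", "ham", "ham", "ham", "spam"]
balanced_x, balanced_y = random_oversample(messages, labels)
class_counts = Counter(balanced_y)
```

Resampling is normally applied to the training split only, so that duplicated samples do not leak into the evaluation data.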

On the identification of accessibility bug reports in open source systems

Wajdi Aljedaani, Mohamed Wiem Mkaouer, Stephanie Ludi, Ali Ouni, Ilyes Jenhani

The 19th International Web for All Conference


Today, mobile devices provide support to disabled people to make their life easier due to their high accessibility and capability, e.g., finding accessible locations, picture and voice-based communication, customized user interfaces and vocabulary levels. These accessibility frameworks are directly integrated, as libraries, in various apps, providing them with accessibility functions. Just like any other software, these frameworks regularly encounter errors. These errors are reported by app developers in the form of bug reports. Bug reports related to accessibility faults need to be fixed urgently, since their existence significantly hinders the usability of apps. In this context, the manual inspection of a large number of bug reports to identify accessibility-related ones is time-consuming and error-prone. Prior research has investigated the classification of mobile app user reviews for various purposes, including bug report identification, feature request identification, and app performance optimization. Yet, none of the prior research has investigated the identification of accessibility-related bug reports, making their prioritization and timely correction difficult for software developers. To support developers with this manual process, the goal of this paper is to automatically detect, for a given bug report, whether it is about accessibility or not. Thus, we tackle the identification of accessibility bug reports as a binary classification problem. To build our model, we rely on an existing dataset of manually curated accessibility bug reports, extracted from popular open-source projects, namely Mozilla Firefox and Google Chromium. We design our solution to learn from these reports the appropriate discriminative features, i.e., keywords that properly represent accessibility issues. Our trained model is evaluated using stratified cross-validation, and the findings show that our classifier achieves a high F1-score of 93%.
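Stratified cross-validation, mentioned above, keeps the class proportions roughly equal across folds, which matters when accessibility reports are a small minority. The following is a minimal sketch of the splitting step only, assuming a round-robin assignment without shuffling; the function name and toy labels are hypothetical and not taken from the paper.

```python
from collections import defaultdict

def stratified_folds(labels, k):
    """Assign each sample index to one of k folds so that every fold
    receives a similar share of each class."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal this class's indices round-robin across the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds

# Toy labels (hypothetical): 1 = accessibility report, 0 = other report.
labels = [1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
folds = stratified_folds(labels, k=3)
```

Each fold then serves once as the test set while the remaining folds form the training set; with the toy labels, every fold contains one positive and three negative reports.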

Automatic Classification of Accessibility User Reviews in Android Apps

Wajdi Aljedaani, Mohamed Wiem Mkaouer, Stephanie Ludi, Yasir Javed

The 7th International Conference on Data Science and Machine Learning Applications (CDMA)


In recent years, mobile applications have gained popularity for providing information, digital services, and content to users, including users with disabilities. However, recent studies have shown that even popular mobile apps face accessibility issues, which hinder their usability for people with disabilities. To discover these issues in new app releases, developers consider user reviews published on the official app stores. However, it is a challenging and time-consuming task to identify the type of accessibility-related reviews manually. Therefore, in this study, we have used supervised learning techniques, namely, Extra Tree Classifier (ETC), Random Forest, Support Vector Classification, Decision Tree, K-Nearest Neighbors (KNN), and Logistic Regression for automated classification of 2,663 Android app reviews based on four types of accessibility guidelines, i.e., Principles, Audio/Images, Design and Focus. Results have shown that the ETC classifier produces the best results in the automated classification of accessibility app reviews with 93% accuracy.

COVID-19 Vaccination-Related Sentiments Analysis: A Case Study Using Worldwide Twitter Dataset

Aijaz Ahmad Reshi, Furqan Rustam, Wajdi Aljedaani, Shabana Shafi, Abdulaziz Alhossan, Ziyad Alrabiah, Ajaz Ahmad, Hessa Alsuwailem, Thamer A Almangour, Musaad A Alshammari, Ernesto Lee, Imran Ashraf



The COVID-19 pandemic has caused a global health crisis, resulting in continuous efforts to reduce infections and fatalities and to develop therapies that mitigate its after-effects. Currently, large and fast-paced vaccination campaigns are in progress to reduce COVID-19 infection and fatality risks. Despite recommendations from governments and medical experts, people hold their own conceptions and perceptions regarding vaccination risks and share their views on social media platforms. Such opinions can be analyzed to determine social trends and devise policies to increase vaccination acceptance. In this regard, this study proposes a methodology for analyzing the global perceptions and perspectives towards COVID-19 vaccination using a worldwide Twitter dataset. The study relies on two techniques to analyze the sentiments: natural language processing and machine learning. To evaluate the performance of the different lexicon-based methods, different machine and deep learning models are studied. In addition, for sentiment classification, the proposed ensemble model named long short-term memory-gated recurrent neural network (LSTM-GRNN) is a combination of LSTM, gated recurrent unit, and recurrent neural networks. Results suggest that TextBlob shows better results than VADER and AFINN. The proposed LSTM-GRNN shows superior performance with 95% accuracy and outperforms both machine and deep learning models. Performance analysis against state-of-the-art models demonstrates the significance of the LSTM-GRNN for sentiment analysis.

Blood cancer prediction using leukemia microarray gene data and hybrid logistic vector trees model

Vaibhav Rupapara, Furqan Rustam, Wajdi Aljedaani, Hina Fatima Shahzad, Ernesto Lee, Imran Ashraf

Scientific Reports


Blood cancer has been a growing concern during the last decade and requires early diagnosis to start proper treatment. The diagnosis process is costly and time-consuming, involving medical experts and several tests. Thus, an automatic diagnosis system for its accurate prediction is of significant importance. Diagnosis of blood cancer using leukemia microarray gene data and machine learning has become an important area of medical research. Despite research efforts, the desired accuracy and efficiency necessitate further enhancements. This study proposes an approach for blood cancer prediction using supervised machine learning. For the current study, a leukemia microarray gene dataset containing 22,283 genes is used. ADASYN resampling and Chi-squared (Chi2) feature selection techniques are used to resolve the imbalanced and high-dimensional dataset problems. ADASYN generates artificial data to make the dataset balanced for each target class, and Chi2 selects the best features out of 22,283 to train learning models. For classification, a hybrid logistic vector trees classifier (LVTrees) is proposed, which utilizes logistic regression, support vector classifier, and extra tree classifier. Besides extensive experiments on the datasets, a performance comparison with state-of-the-art methods has been made to determine the significance of the proposed approach. LVTrees outperforms all other models with ADASYN and Chi2 techniques, reaching 100% accuracy. Further, a statistical significance T-test is also performed to show the efficacy of the proposed approach. Results using k-fold cross-validation confirm the superiority of the proposed model.
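Chi-squared feature selection scores each feature by how strongly it departs from independence with the target class and keeps the highest-scoring ones. The sketch below shows the idea for the simplest case, a binary feature against a binary target in a 2x2 contingency table. This is an assumption made for illustration: real microarray values are continuous, and the paper's exact procedure is not described in the abstract. The gene names and data are hypothetical.

```python
def chi2_score(feature, target):
    """Chi-squared statistic for a binary feature against a binary target,
    computed from the 2x2 contingency table."""
    a = sum(1 for f, t in zip(feature, target) if f == 1 and t == 1)
    b = sum(1 for f, t in zip(feature, target) if f == 1 and t == 0)
    c = sum(1 for f, t in zip(feature, target) if f == 0 and t == 1)
    d = sum(1 for f, t in zip(feature, target) if f == 0 and t == 0)
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a constant feature or target carries no information
    return n * (a * d - b * c) ** 2 / denom

def select_top_k(features, target, k):
    """Return the names of the k features with the highest chi-squared score."""
    ranked = sorted(
        features, key=lambda name: chi2_score(features[name], target), reverse=True
    )
    return ranked[:k]

# Toy data (hypothetical): 1 = gene "highly expressed", target 1 = positive class.
target = [1, 1, 1, 0, 0, 0]
features = {
    "gene_a": [1, 1, 1, 0, 0, 0],  # perfectly aligned with the target
    "gene_b": [1, 0, 1, 0, 1, 0],  # weakly related
    "gene_c": [1, 1, 1, 1, 1, 1],  # constant, uninformative
}
selected = select_top_k(features, target, k=2)
```

With the toy data, gene_a scores 6.0, gene_b about 0.67, and gene_c 0.0, so the two informative genes are kept.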

Racism Detection by Analyzing Differential Opinions Through Sentiment Analysis of Tweets Using Stacked Ensemble GCR-NN Model

Ernesto Lee, Furqan Rustam, Patrick Bernard Washington, Fatima El Barakaz, Wajdi Aljedaani, Imran Ashraf

IEEE Access


With social media’s dominating role in the socio-political landscape, several existing and new forms of racism have taken hold on social media. Racism has emerged on social media in different forms, both hidden and open: hidden through the use of memes, and open as racist remarks posted under fake identities to incite hatred, violence, and social instability. Although often associated with ethnicity, racism now thrives based on color, origin, language, culture, and, most importantly, religion. Social media opinions and remarks provoking racial differences have been regarded as a serious threat to social, political, and cultural stability and have threatened the peace of different countries. Consequently, social media, being the leading source for the dissemination of racist opinions, should be monitored, and racist remarks should be detected and blocked in a timely manner. This study aims at detecting Tweets that contain racist text by performing sentiment analysis of Tweets. Owing to the superior performance of deep learning, a stacked ensemble deep learning model is assembled by combining gated recurrent unit (GRU), convolutional neural network (CNN), and recurrent neural network (RNN) models, called Gated Convolutional Recurrent Neural Networks (GCR-NN). GRU sits at the top of the GCR-NN model to extract suitable and prominent features from raw text, while CNN extracts important features for RNN to make accurate predictions. Several experiments are conducted to investigate and analyze the performance of the proposed GCR-NN against machine learning and deep learning models, indicating the superior performance of GCR-NN with an increased accuracy of 0.98. The proposed GCR-NN model can detect 97% of the tweets that contain racist comments.

Vector mosquito image classification using novel RIFS feature selection and machine learning models for disease epidemiology

Furqan Rustam, Aijaz Ahmad Reshi, Wajdi Aljedaani, Abdulaziz Alhossan, Abid Ishaq, Shabana Shafi, Ernesto Lee, Ziyad Alrabiah, Hessa Alsuwailem, Ajaz Ahmad, Vaibhav Rupapara

Saudi Journal of Biological Sciences


Every year about one million people die due to diseases transmitted by mosquitoes. The infection is transmitted to a person when an infected mosquito bites, injecting its saliva into the human body. To date, the best way to prevent a mosquito-borne infection is to protect humans from exposure to mosquito bites. This study proposes a Machine Learning (ML) and Deep Learning based system to detect the presence of two critical disease-spreading classes of mosquitoes, Aedes and Culex. The proposed system will effectively aid epidemiology in designing evidence-based policies and decisions by analyzing risks and transmission. The study proposes an effective methodology for the classification of mosquitoes using ML and CNN models. The novel RIFS is introduced, which integrates two types of feature selection techniques: ROI-based image filtering and the wrapper-based FFS technique. A comparative analysis of various ML and deep learning models has been performed to determine the most appropriate model based on performance metrics as well as computational needs. Results show that ETC outperformed all other applied ML models with an accuracy of 0.992, while VGG16 outperformed the other CNN models with an accuracy of 0.986.

Predicting Pulsars from Imbalanced Dataset with Hybrid Resampling Approach

Ernesto Lee, Furqan Rustam, Wajdi Aljedaani, Abid Ishaq, Vaibhav Rupapara, Imran Ashraf

Advances in Astronomy


Pulsar stars, usually neutron stars, are spherical and compact objects containing a large quantity of mass. Each pulsar star possesses a magnetic field and emits a slightly different pattern of electromagnetic radiation which is used to identify the potential candidates for a real pulsar star. Pulsar stars are considered an important cosmic phenomenon, and scientists use them to study nuclear physics, gravitational waves, and collisions between black holes. Defining the process of automatic detection of pulsar stars can accelerate the study of pulsar stars by scientists. This study contrives an accurate and efficient approach for true pulsar detection using supervised machine learning. For experiments, the high time-resolution (HTRU2) dataset is used in this study. To resolve the data imbalance problem and overcome model overfitting, a hybrid resampling approach is presented in this study. Experiments are performed with imbalanced and balanced datasets using well-known machine learning algorithms. Results demonstrate that the proposed hybrid resampling approach proves highly influential to avoid model overfitting and increase the prediction accuracy. With the proposed hybrid resampling approach, the extra tree classifier achieves a 0.993 accuracy score for true pulsar star prediction.

Learning sentiment analysis for accessibility user reviews

Wajdi Aljedaani, Furqan Rustam, Stephanie Ludi, Ali Ouni, Mohamed Wiem Mkaouer

36th IEEE/ACM International Conference on Automated Software Engineering Workshops (ASEW)


Nowadays, people use different ways to express emotions and sentiments such as facial expressions, gestures, speech, and text. With the exponentially growing popularity of mobile applications (apps), accessibility has gained importance in recent years, as it allows users with specific needs to use an app without many limitations. User reviews provide insightful information that helps app evolution. Previous work has analyzed accessibility in mobile applications using machine learning approaches. However, to the best of our knowledge, no work has used sentiment analysis approaches to better understand how users feel about accessibility in mobile apps. To address this gap, we propose a new approach on an accessibility reviews dataset, where we use two sentiment analyzers, i.e., TextBlob and VADER, along with Term Frequency-Inverse Document Frequency (TF-IDF) and Bag-of-Words (BoW) features for detecting the sentiment polarity of accessibility app reviews. We also applied six classifiers, including Logistic Regression, Support Vector, Extra Tree, Gaussian Naive Bayes, Gradient Boosting, and Ada Boost, on both sentiment analyzers. Four statistical measures, namely accuracy, precision, recall, and F1-score, were used for evaluation. Our experimental evaluation shows that the TextBlob approach using BoW features achieves better results, with an accuracy of 0.86, than the VADER approach, with an accuracy of 0.82.
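The four evaluation measures named above follow directly from the counts of true and false positives and negatives. The sketch below assumes a binary setting with a single positive class for simplicity; the labels are toy data, and the paper's own evaluation may use a multi-class or averaged variant.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Toy labels (hypothetical): 1 = positive sentiment, 0 = negative sentiment.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```

Accuracy alone can mislead on imbalanced data, which is why precision, recall, and F1 are usually reported alongside it.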

I cannot see you—the perspectives of deaf students to online learning during COVID-19 pandemic: Saudi Arabia case study

Wajdi Aljedaani, Mona Aljedaani, Eman Abdullah AlOmar, Mohamed Wiem Mkaouer, Stephanie Ludi, Yousef Bani Khalaf

Education Sciences


The COVID-19 pandemic brought about many challenges to course delivery methods, which have forced institutions to rapidly change and adopt innovative approaches to provide remote instruction as effectively as possible. Creating and preparing content that ensures the success of all students, including those who are deaf and hard-of-hearing has certainly been an all-around challenge. This study aims to investigate the e-learning experiences of deaf students, focusing on the college of the Technical and Vocational Training Corporation (TVTC) in the Kingdom of Saudi Arabia (KSA). Particularly, we study the challenges and concerns faced by deaf students during the sudden shift to online learning. We used a mixed-methods approach by conducting a survey as well as interviews to obtain the information we needed. Our study delivers several important findings. Our results report problems with internet access, inadequate support, inaccessibility of content from learning systems, among other issues. Considering our findings, we argue that institutions should consider a procedure to create more accessible technology that is adaptable during the pandemic to serve individuals with diverse needs.

Factors Affecting Intention to Adopt Open Source ERP Systems by SMEs in Yemen

Abdullatif Ghallab, Stephanie Ludi, Ali Almuzaiqer, Abdullah Al-Hashedi, Abdulqader Mohsen, Kamal Bechkoum, Wajdi Aljedaani

International Conference on Intelligent Technology, System and Service for Internet of Everything (ITSS-IoE)


Small and medium-sized enterprises (SMEs) are significant contributors to countries' economic activities. SMEs need to use enterprise resource planning (ERP) systems to increase revenue and productivity. Due to the high licensing costs of these systems, open source ERP (OSERP) could be an alternative solution to this problem. This study investigates the factors affecting the intention to adopt the OSERP system by SMEs in Yemen using the Technology-Organization-Environment (TOE) Framework and The Diffusion of Innovation (DOI) Theory. Using a questionnaire, data were collected from a sample of 600 subjects. The model was validated empirically using Structural Equation Modeling (SEM). The results show that relative advantage, compatibility, trialability, observability, ICT infrastructure, IT skills, top management support, cost-saving, competitive pressure, vendor support, and regulatory support positively influence the intention to adopt OSERP. In contrast, complexity has a negative impact on the intention to adopt. However, security and organizational culture have no significant influence on SMEs' intention to adopt OSERP in Yemen.

Test smell detection tools: A systematic mapping study

Wajdi Aljedaani, Anthony Peruma, Ahmed Aljohani, Mazen Alotaibi, Mohamed Wiem Mkaouer, Ali Ouni, Christian D Newman, Abdullatif Ghallab, Stephanie Ludi

Evaluation and Assessment in Software Engineering


Test smells are defined as sub-optimal design choices developers make when implementing test cases. Hence, similar to code smells, the research community has produced numerous test smell detection tools to investigate the impact of test smells on the quality and maintenance of test suites. However, little is known about the characteristics, type of smells, target language, and availability of these published tools. In this paper, we provide a detailed catalog of all known, peer-reviewed, test smell detection tools. We start with performing a comprehensive search of peer-reviewed scientific publications to construct a catalog of 22 tools. Then, we perform a comparative analysis to identify the smell types detected by each tool and other salient features that include programming language, testing framework support, detection strategy, and adoption, among others. From our findings, we discover tools that detect test smells in Java, Scala, Smalltalk, and C++ test suites, with Java support favored by most tools. These tools are available as command-line and IDE plugins, among others. Our analysis also shows that most tools overlap in detecting specific smell types, such as General Fixture. Further, we encounter four types of techniques these tools utilize to detect smells. We envision our study as a one-stop source for researchers and practitioners in determining the tool appropriate for their needs. Our findings also empower the community with information to guide future tool development.

On the classification of bug reports to improve bug localization

Fan Fang, John Wu, Yanyan Li, Xin Ye, Wajdi Aljedaani, Mohamed Wiem Mkaouer

Soft Computing


Bug localization is the automated process of finding the possible faulty files in a software project. Bug localization allows developers to concentrate on vital files. Information retrieval (IR)-based approaches have been proposed to help automatically identify software defects by using bug report information. However, some bug reports that are not semantically related to the relevant code are not helpful to IR-based systems. Running an IR-based system on such reports can lead to false-positive results. In this paper, we propose a classification model for classifying a bug report as either uninformative or informative. Our approach helps to lower false positives and increase ranking performance by filtering uninformative reports before running an IR-based bug localization system. The model is based on implicit features learned from bug reports using neural networks and explicit features defined manually. We test our proposed model on three open-source software projects that contain over 9000 bug reports. The results of the evaluation show that our model enhances the efficiency of a developed IR-based system in the trade-off between precision and recall. For implicit features, our comparative tests show that the LSTM network performs better than the CNN and multilayer perceptron with respect to the F-measure. Combining both implicit and explicit features outperforms using only implicit features. Our classification model helps improve precision in bug localization tasks when precision is considered more important than recall.

Finding the needle in a haystack: On the automatic identification of accessibility user reviews

Abdullah AlOmar, Wajdi Aljedaani, Murtaza Tamjeed, Mohamed Wiem Mkaouer, Yasmine N El-Glaly

CHI conference on human factors in computing systems


In recent years, mobile accessibility has become an important trend with the goal of allowing all users the possibility of using any app without many limitations. User reviews include insights that are useful for app evolution. However, with the increase in the amount of received reviews, manually analyzing them is tedious and time-consuming, especially when searching for accessibility reviews. The goal of this paper is to support the automated identification of accessibility in user reviews, to help technology professionals in prioritizing their handling, and thus, creating more inclusive apps. Particularly, we design a model that takes as input accessibility user reviews, learns their keyword-based features, in order to make a binary decision, for a given review, on whether it is about accessibility or not. The model is evaluated using a total of 5,326 mobile app reviews. The findings show that (1) our model can accurately identify accessibility reviews, outperforming two baselines, namely keyword-based detector and a random classifier; (2) our model achieves an accuracy of 85% with relatively small training dataset; however, the accuracy improves as we increase the size of the training dataset.

Recommending pull request reviewers based on code changes

Xin Ye, Yongjie Zheng, Wajdi Aljedaani, Mohamed Wiem Mkaouer

Soft Computing


Pull-based development supports collaborative distributed development. It enables developers to collaborate on projects hosted on GitHub. A developer who wants to collaborate on a project forks the repository, makes modifications on the forked repository, and sends a pull request to the development team to ask for a merge of the code changes into the official repository. When the development team receives a pull request, the team members review the changes and decide whether to accept them. However, efficiently finding suitable pull request reviewers is a challenge. In this paper, we propose a multi-instance-based deep neural network model to recommend reviewers for pull requests. Given a pull request, our model extracts three features: the pull request title, the commit message, and the code change. The proposed model extracts the three features automatically from the code changes of every commit in the pull request. The features of different commits are then merged to predict the likelihood that a reviewer candidate is the appropriate reviewer. We use CNN and LSTM networks to learn features, since the pull request title and commit message features have different structures than the code change, which is written in a programming language. To test the effectiveness of our model, we performed a set of experiments using 43,986 pull requests extracted from 12 open-source projects. We compare our model with two baseline approaches, CoreDevRec and Majority Classes. Experiments demonstrate that our model outperforms the two state-of-the-art baselines. For instance, for the TensorFlow project, our model’s accuracy in determining the appropriate reviewers is 50.80%, 74.70%, and 84.04% for Top-1, Top-3, and Top-5 recommendation, respectively.
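The Top-1, Top-3, and Top-5 figures reported above are instances of top-k accuracy. As a minimal sketch, assuming each pull request has exactly one actual reviewer and the model returns a ranked candidate list, it can be computed as follows. The reviewer names and rankings are hypothetical, and the paper's exact definition may differ.

```python
def top_k_accuracy(ranked_candidates, actual_reviewers, k):
    """Fraction of pull requests whose actual reviewer appears among the
    first k recommended candidates."""
    hits = sum(
        1
        for ranked, actual in zip(ranked_candidates, actual_reviewers)
        if actual in ranked[:k]
    )
    return hits / len(actual_reviewers)

# Toy recommendations (hypothetical): one ranked list per pull request.
recommendations = [
    ["alice", "bob", "carol", "dave"],
    ["bob", "carol", "alice", "dave"],
    ["dave", "alice", "bob", "carol"],
    ["carol", "dave", "bob", "alice"],
]
actual = ["alice", "alice", "carol", "dave"]

top1 = top_k_accuracy(recommendations, actual, k=1)
top3 = top_k_accuracy(recommendations, actual, k=3)
```

Top-k accuracy can only stay the same or grow as k increases, which matches the rising figures reported for Top-1, Top-3, and Top-5.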

LDA categorization of security bug reports in Chromium projects

Wajdi Aljedaani, Yasir Javed, Mamdouh Alenezi

The 2020 European symposium on software engineering


Security bug reports (SBRs) describe potential security vulnerabilities in software systems. Bug tracking systems (BTS) usually contain huge numbers of bug reports, including security-related ones. Malicious attackers could exploit these SBRs. Hence, it is critical to pinpoint SBRs swiftly and correctly. In this work, we studied the security bug reports of the Chromium project. We looked into three main aspects of these bug reports, namely: how frequently they are reported, how quickly they get fixed, and whether LDA is effective in grouping these reports into known vulnerability types. We report our findings on these aspects.

Learning to rank developers for bug report assignment

Bader Alkhazia, Andrew DiStasi, Wajdi Aljedaani, Hussein Alrubaye, Xin Ye, Mohamed Wiem Mkaouer

Applied Soft Computing


Bug assignment is a burden for projects receiving many bug reports. To automate the process of assigning bug reports to the appropriate developers, several studies have relied on combining natural language processing and information retrieval techniques to extract two categories of features. One of these categories targets developers who have fixed similar bugs before, and the other determines developers working on source files similar to the description of the bug. Commit messages represent another rich source for profiling developer expertise as the language used in commit messages is closer to that used in bug reports. In this work, we propose a more enhanced profiling of developers through their commits, which are captured in a new set of features that we combine with features used in previous studies. More precisely, we propose an adaptive ranking approach that takes as input a given bug report and ranks the top developers who are most suitable to fix it. This approach learns from the history of previously fixed bugs to profile developers in terms of their expertise. With respect to a given bug report, the ranking score of each developer is computed as a weighted combination of an array of features encoding domain knowledge, where the weights are trained automatically on previously solved bug reports using a learning-to-rank technique. Our model was evaluated using around 22,000 bug reports, exported from four large scale open-source Java projects. Results show that our model significantly outperformed two recent state-of-the-art methods in recommending the suitable developer to handle a certain bug report. Specifically, the percentage of recommending a developer within the top 5 ranked developers correctly was over 80% for both the Eclipse UI Platform and Birt projects.
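The ranking score described above is a weighted combination of per-developer features. The sketch below illustrates only that scoring step. The three features, their values, and the weights are hypothetical and fixed by hand; in the paper the weights are learned from previously solved bug reports with a learning-to-rank technique.

```python
def rank_developers(feature_table, weights, top_n=5):
    """Score each developer as a weighted sum of feature values and
    return the top_n developers, highest score first."""
    scores = {
        developer: sum(w * f for w, f in zip(weights, features))
        for developer, features in feature_table.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Hypothetical features per developer for one bug report, each in [0, 1]:
# [similarity to previously fixed bugs,
#  similarity of owned source files to the report,
#  similarity of past commit messages to the report]
feature_table = {
    "dev_a": [0.9, 0.1, 0.2],
    "dev_b": [0.2, 0.8, 0.7],
    "dev_c": [0.1, 0.1, 0.1],
}
weights = [0.5, 0.2, 0.3]  # assumed values; learned automatically in the paper

ranking = rank_developers(feature_table, weights, top_n=2)
```

A linear score keeps the contribution of each feature easy to inspect, since each learned weight shows how much that source of expertise matters for the recommendation.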


Open Source Systems Bug Reports: Meta-Analysis

Wajdi Aljedaani, Yasir Javed, Mamdouh Alenezi

The 3rd International Conference on Big Data and Education


Bug tracking systems (BTS) are a rich source of software development information. They contain many insights about the health of a software project. Making sense of this information is a big challenge for software development communities. In this work, we investigate fixing time, components, and platforms in five open source systems hosted on Bugzilla. The motivation is to identify the most error-prone components and to allocate the right developer to fix each bug. The results indicate how data in BTS should be used in decision-making processes. They reveal a strong relationship between bugs and committers: a committer usually fixes bugs in a single domain, which also reflects their expertise. This study can also help automate the allocation of bugs to the right committers, replacing manual allocation and thereby reducing fixing time and overlap between different committers.


Empirical study of software test suite evolution

Wajdi Aljedaani, Yasir Javed

The 6th Conference on Data Science and Machine Learning Applications (CDMA)


Open-source software plays a major role in a variety of market environments these days. Open-source systems have expanded from purely academic projects into the research area. There are thousands of successful and effective open-source projects to examine, and their level of performance needs to be measured. The reliability of software systems can be measured in several respects. Essentially, the ability of test cases to detect and locate flaws is measured. This research aims to identify a good technique for evaluating how effectively test cases identify defects in open-source systems. The study focuses on six publicly available open-source software (OSS) projects. It seeks a relationship between software code and test code suites in terms of software evolution. The results show that test suites are being enriched to achieve better code coverage, which relates directly to awareness of writing better test cases. The complexity of the software, as depicted in the results, is still in its infancy, as only a marginal change of less than 2 percent has occurred.


A Comparison of Bugs Across the iOS and Android Platforms of Two Open Source Cross Platform Browser Apps

Wajdi Aljedaani, Meiyappan Nagappan, Bram Adams, Michael Godfrey

IEEE/ACM 6th International Conference on Mobile Software Engineering and Systems (MOBILESoft)


Mobile app developers want to maximize their revenue and hence want to reach as large an audience as possible. To do this, they need to build apps for multiple platforms, such as Google's Android and Apple's iOS, and maintain them in parallel. Past research has examined properties of the issues addressed in either Android or iOS, but has not compared the two. Our main motivation has been to determine whether there are differences in how issues manifest themselves in iOS and Android when we control for the projects by considering the same apps across multiple platforms. In this paper, we compare issues across two mobile platforms (iOS and Android) for two open source browsers (Mozilla Firefox and Google Chromium). We consider three dimensions of study: frequency of issue report submission, fixing time of issues, and type of issues (using topic modeling on the issue description to generate the categories). We found that there were indeed differences; in particular, there were more issues in the Android version of the apps, and the gap with the iOS version is increasing. We observe that in both apps the fix time and type of issues are different for each platform. We also noted certain kinds of issues that may be more prevalent for different browser/platform combinations. This can advise project leads in identifying and allocating development resources to address key problem areas. Hence, issue reports seem more dependent on the platform than on the mobile app, making development and maintenance effort hard to estimate. Index Terms: issue repository; issue reports; Mozilla Firefox; issue fixing; Google Chromium; empirical studies; topic model.


Learning to rank faulty source files for dependent bug reports

Nasir Safdari, Hussein Alrubaye, Wajdi Aljedaani, Bladimir Baez Baez, Andrew DiStasi, Mohamed Wiem Mkaouer

Big data: learning, analytics, and applications


With the rise of autonomous systems, the automation of fault detection and localization becomes critical to their reliability. An automated strategy that provides a list of faulty modules or files, ranked by how likely they are to contain the root cause of the problem, would help in automating bug localization. Learning from the history of previously located bugs in general, and extracting the dependencies between these bugs in particular, helps in building models that accurately localize newly detected bugs. In this study, we propose a novel fault localization solution based on a learning-to-rank strategy, using the history of previously localized bugs and their dependencies as features, to rank files in terms of their likelihood of being a root cause of a bug. The evaluation of our approach has shown its efficiency in localizing dependent bugs.


Bug reports evolution in open source systems

Wajdi Aljedaani, Yasir Javed

The 5th International Symposium on Data Mining Applications


Open source software communities usually use an open bug reporting system to enable users to report and fix bugs. In addition, most open source systems have long lifetimes. In this work, we comprehensively examine the evolution of bug reports in four open source systems written in various languages. The selected projects are analyzed from 2004 onward to find how many bugs are reported compared to how many are resolved. We report our results and some recommendations to the open source community.