ABSTRACT
Fake news has continued to grow both locally and globally with the rise of online social media platforms such as Facebook, Twitter and blogs. Locally, this growth has been propelled even further by smartphone and mobile data penetration. This study provides a fake news digital forensics tool through the design, development and implementation of a software application. The objective of the study is to identify and analyze the different software techniques for fake news monitoring and to provide the best suited, customized application. The application will be developed using Linux, Apache, MySQL, PHP and Python. It will use the Scrapy Python framework with a page ranking algorithm to perform web crawling, and the crawled data will be stored in a MySQL database for data mining. The application will be built following the Agile software development methodology, with twenty websites as the subject of interest. These websites form the sample used to demonstrate how the application works together with the Python libraries that provide the web crawling framework. MySQL data mining and database query models will be used to search the crawled data against a lexicon of fake news keywords. Inferences from the data mined from the crawled web pages will be drawn using Microsoft Excel 2016, which will also be used for data analysis, with the results presented in tables and figures.
CHAPTER ONE
INTRODUCTION
1.1 Background to the Study
Fake news is the deliberate spread of misinformation via traditional news media or social media. Social media offers easy access, little to no cost, and the rapid spread of information. On the other hand, it also provides an ideal place for the creation and spread of fake news (Buntain and Golbeck, 2018).
Using social media as a medium for news updates is therefore a double-edged sword. People can download articles from sites, share the information and re-share it from others, and by the end of the day the false information has travelled so far from its original site that it becomes indistinguishable from real news (Kaur, Kumar and Kumaraguru, 2019).
Fake news and hoaxes have existed since before the advent of the Internet. The widely accepted definition of Internet fake news is “fictitious articles deliberately fabricated to deceive readers”. Some news outlets publish fake news to increase readership or as part of psychological warfare.
The purpose of this paper is to develop a solution that users can apply to detect and filter out sites containing false and misleading information. To that end, we collect three publicly available datasets and classify fake and real news using machine learning algorithms. Before classification, the data is preprocessed using natural language processing techniques and feature extraction methods in order to obtain the highest possible accuracy. After classification, the performance of three algorithms is measured: the Passive Aggressive Classifier (PAC), the Support Vector Machine (SVM) and the Naïve Bayes classifier (NB). Their evaluation results are reported as precision, recall, F-measure and accuracy scores.
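The evaluation scores named above can be illustrated with a minimal sketch. It assumes a binary labelling in which 1 means fake and 0 means real, and it reports the F1 score, taken here as the intended F-measure. The function name `classification_scores` and the sample labels are hypothetical and are not results from this study.

```python
from typing import Dict, Sequence


def classification_scores(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Dict[str, float]:
    """Compute precision, recall, F1 and accuracy for a binary labelling."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    # Count the confusion-matrix cells with respect to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Guard every ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Illustrative labels only (1 = fake, 0 = real).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
scores = classification_scores(y_true, y_pred)
```

In this toy example there are three true positives, one false positive, one false negative and three true negatives, so all four scores come out at 0.75. On real data the scores usually differ from one another, which is why they are reported together.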
1.2 Statement of the Problem
Fake news on social media and blogs needs to be tracked and tackled, and means must be provided to apprehend cyber criminals and fake news mongers. The process of collecting and documenting digital evidence of online fake news should also be made more efficient.
Security agencies and communication authority personnel are overstretched, with limited manpower, tools and cyber expertise, so an automated and easy-to-use system for detecting fake news is crucial.
Using social media platforms as the data source, we will provide an application that mines social media opinions and presents the results on possible fake news offenders. The system will invoke a web crawler that collects the web forum details and inserts them into a database. Once the database is populated, a script will check the data against a list of potential fake news keywords. The script will then return the list of matching fake news data together with the associated person of interest.
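The keyword-checking step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the application itself: it uses Python's built-in `sqlite3` module as a stand-in for the MySQL database, and the table layout, the `LEXICON` entries, the sample rows and the function names are all hypothetical.

```python
import re
import sqlite3
from typing import Dict, Iterable, List, Tuple

# Hypothetical lexicon; a real one would be curated by the investigators.
LEXICON = ["hoax", "miracle cure", "rigged election"]

# Hypothetical crawled posts: (author, url, body).
DEMO_ROWS = [
    ("user_a", "https://example.com/1", "This miracle cure works overnight"),
    ("user_b", "https://example.com/2", "Council meeting moved to Friday"),
    ("user_c", "https://example.com/3", "The vote was a HOAX and a rigged election"),
]


def build_demo_db(rows: Iterable[Tuple[str, str, str]]) -> sqlite3.Connection:
    """Create an in-memory table standing in for the crawler's output."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE posts ("
        "id INTEGER PRIMARY KEY, author TEXT, url TEXT, body TEXT)"
    )
    conn.executemany(
        "INSERT INTO posts (author, url, body) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return conn


def flag_posts(conn: sqlite3.Connection, lexicon: List[str]) -> List[Dict]:
    """Return posts whose body contains at least one lexicon keyword."""
    # Whole-word, case-insensitive patterns, one per keyword.
    patterns = {
        kw: re.compile(r"\b" + re.escape(kw) + r"\b", re.IGNORECASE)
        for kw in lexicon
    }
    flagged = []
    query = "SELECT id, author, url, body FROM posts ORDER BY id"
    for post_id, author, url, body in conn.execute(query):
        hits = [kw for kw, pattern in patterns.items() if pattern.search(body)]
        if hits:
            flagged.append(
                {"id": post_id, "author": author, "url": url, "keywords": hits}
            )
    return flagged


conn = build_demo_db(DEMO_ROWS)
flagged = flag_posts(conn, LEXICON)
for item in flagged:
    print(item["id"], item["author"], item["keywords"])
```

A keyword match in this sketch only marks a post for review. It does not by itself establish that the post is fake news, so flagged items would still need to be examined before being treated as evidence.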
Communication authorities, security agencies and the country at large stand to benefit from placing checks on social media fake news. The movement of fake news mongers into the digital cyberspace needs to be addressed before it escalates further, as is regularly experienced when political debates are held or government corruption cases are discussed.
1.3 Objectives of the Study
The purpose of the study was to design, develop and implement a software application for fake news monitoring and reporting.
Specific Objectives
- To identify and analyze techniques used in fake news monitoring and select the best suitable technique for creating a customized fake news
- To develop an application that will combine fake news keywords for data mining
- To demonstrate and test the application while providing analysis on the fake news websites being
1.4 Research Questions
The research answers the following questions:
- How can social media web forums be monitored and fake news collected?
- What is the best application for combining fake news keywords for data mining?
- What methods are used to collect fake news opinions and to develop an algorithm?
1.5 Significance of the Study
Fake news on social media has been shown to directly influence and promote acts of physical violence; hence the need for this monitoring tool to capture the digital evidence. The proposed research is a much needed approach to closing the gap that fake news mongers are exploiting in the Nigerian cyberspace, where such acts of fake news are affecting and influencing different people, races and tribes, and the country as a whole. The tool will help law enforcement agencies to capture data and digital evidence easily and readily.
The algorithm developed will capture English, Hausa, Yoruba and Igbo fake news keywords as a means of increasing its relevance to the local population and local social media sites. With this tool in hand, law enforcement and communication authorities will be on high alert and able to bring down offensive social media web forums, and thus reduce potential political, ethnic and tribal conflict. After all, prevention is better than cure when it comes to civil war and internal conflicts.
1.6 Scope and Limitations of the Study
This work is on the Performance Evaluation of Fake News Detection Using Machine Learning Algorithm. The study is limited to social media websites and blogs because they are the most popular media for disseminating fake news in Nigeria.