Preliminary Literature Review - PowerPoint Presentation

Uploaded On 2020-10-06

Presentation Transcript

Slide1

Preliminary Literature Review

Currently, bot detection is approached through data analytics, supervised learning, and unsupervised learning.

High cost of data collection; time-consuming

Advances in AI have made bot-controlled accounts more common

BACKGROUND & MOTIVATION

Politicians use bots on Twitter to run propaganda campaigns.

Restaurant owners use bots to give their own restaurants higher ratings and post more positive comments on Yelp.

Bot activity not only wastes people's time but can also seriously harm their interests.

WE WOULD LIKE TO

Develop a less expensive process for detecting whether an account or a tweet is bot-controlled

Help users save time and be aware of fake information

Slide2

Dataset

Dataset 1: Twitter

From a group of researchers from the Indian Institute of Technology

The same dataset they use for their research on the application of Contextual LSTM models

Already well preprocessed

Includes genuine accounts and different types of social spambots

Also contains some metadata to improve the analysis

Dataset 2: Facebook

From a group of researchers from Harvard

Web page addresses (URLs) that have been shared on Facebook

URLs are included if shared by at least 20 unique accounts and shared publicly at least once

Starting January 1, 2017 and ending about a month before the present day

Columns included: Id; text; source; user_id; truncated; in_reply_to_status_id; in_reply_to_user_id; in_reply_to_screen_name; retweeted_status_id; geo; place…

Slide3

Methodology

Data processing

clean data

handle missing values
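
The data processing step could look roughly like the sketch below. It is only an illustration: the function names, the whitespace normalization, and the choice of 0 as a placeholder for a missing in_reply_to_status_id are assumptions, not steps taken from the slides. Only the column names text and in_reply_to_status_id come from the dataset description.

```python
import re


def clean_text(text):
    """Collapse runs of whitespace and trim the ends (assumed cleaning rule)."""
    return re.sub(r"\s+", " ", text).strip()


def clean_rows(rows):
    """Drop rows without usable text and fill a missing reply id.

    `rows` is a list of dicts keyed by column name. Using 0 as the
    placeholder for a missing in_reply_to_status_id is an assumption.
    """
    cleaned = []
    for row in rows:
        text = row.get("text")
        if text is None or not text.strip():
            continue  # a record with no text cannot be tokenized later
        cleaned.append({
            "text": clean_text(text),
            "in_reply_to_status_id": row.get("in_reply_to_status_id") or 0,
        })
    return cleaned
```

Rows with empty text are dropped rather than filled in, since there is nothing to tokenize in them; whether missing metadata should be filled or dropped depends on how the model uses it.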

Tokenizer

slice the sentences into words

form a sequence of tokens for each piece of data
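
A minimal sketch of the tokenizer step, using only the Python standard library. The regular expression, the reserved `<pad>` and `<unk>` entries, and the fixed `max_len` padding are assumptions made for illustration; the slides do not say which tokenizer is used.

```python
import re
from collections import Counter

PAD_ID = 0  # assumed id for padding positions
UNK_ID = 1  # assumed id for words not in the vocabulary


def tokenize(sentence):
    """Slice a sentence into lowercase word tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def build_vocab(token_lists):
    """Map each word to an integer id, most frequent words first."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    vocab = {"<pad>": PAD_ID, "<unk>": UNK_ID}
    for tok, _ in counts.most_common():
        vocab[tok] = len(vocab)
    return vocab


def encode(tokens, vocab, max_len):
    """Turn tokens into a fixed-length list of ids, truncating or padding."""
    ids = [vocab.get(tok, UNK_ID) for tok in tokens][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))
```

For example, after `vocab = build_vocab([tokenize(t) for t in texts])`, each piece of data becomes `encode(tokenize(text), vocab, max_len)`, which is the fixed-length form a sequence model typically expects.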

Pre-trained model

use a pre-trained model to explore which model we should use

Neural Network

Use an LSTM, a recurrent neural network commonly used in NLP, to train on the data
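
To make the LSTM step concrete, the sketch below runs the forward pass of a single LSTM cell over a sequence in plain Python. It is not the project's model: the parameter layout, the helper names, and the random initialization are assumptions, and training (backpropagation), the embedding layer, and the final bot/genuine classifier are left out. In practice a deep learning framework would supply all of these.

```python
import math
import random

GATES = ("input", "forget", "output", "candidate")


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, bias, vector):
    """Return weights @ vector + bias using plain Python lists."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def init_params(input_dim, hidden_dim, seed=0):
    """Small random weights and zero biases for each gate (assumed init)."""
    rng = random.Random(seed)
    cols = input_dim + hidden_dim
    return {
        gate: ([[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                for _ in range(hidden_dim)], [0.0] * hidden_dim)
        for gate in GATES
    }


def lstm_step(params, x, h, c):
    """One time step: gates are computed from [x, h], then c and h updated."""
    z = x + h  # list concatenation of input and previous hidden state
    i = [sigmoid(v) for v in affine(*params["input"], z)]
    f = [sigmoid(v) for v in affine(*params["forget"], z)]
    o = [sigmoid(v) for v in affine(*params["output"], z)]
    g = [math.tanh(v) for v in affine(*params["candidate"], z)]
    c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
    return h_new, c_new


def encode_sequence(params, embedded_tokens, hidden_dim):
    """Run the cell over a sequence of token vectors; return the last h."""
    h = [0.0] * hidden_dim
    c = [0.0] * hidden_dim
    for x in embedded_tokens:
        h, c = lstm_step(params, x, h, c)
    return h
```

The final hidden state summarizes the whole token sequence; a classifier layer on top of it would produce the bot-or-genuine prediction.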