Study Guide for IITM BS Privacy and Security in Online Social Media

Posted on 6-May-2023 by Kunal Chaturvedi in IITM BS

This is a heavily research- and application-oriented course with many practical examples from the field of Social Computing. The best advice is to set aside time and go through every lecture carefully. Don't just read the slides.

Weekly Study Guide

Week 1

This week is an introduction to the course, with basic statistics about the internet such as how much each social media platform is used daily.

Important Points to Remember from an Exam Point of View

4 Vs of Social Media : Volume, Veracity, Variety and Velocity
Important Social Media Names to Remember: Parler and Gab (Controversial), Whisper (Anonymous), Mastodon (Twitter Alternative), Blind (Corporate Anonymous Social Media)
Important Events (Implications): Indian Elections, Arab Spring, UK Riots, Mumbai Blasts, Capitol Riots, Hurricane Sandy
Graph Centrality Measures (VERY VERY IMPORTANT): Degree Centrality, Betweenness Centrality, Closeness Centrality (https://youtu.be/Cziv7Xfl8Zs)
Other Graph Properties: Global/Local Clustering Coefficient, Homophily (Very Important), Exponential Law, Power Law
Research Papers: Go through the lectures for the findings and implications of the research papers discussed.
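
Since the centrality measures and the clustering coefficient are worth computing by hand, here is a minimal sketch in plain Python that can be used to check manual answers. The five-node graph is made up for illustration, closeness is taken as (n - 1) divided by the sum of shortest-path distances on a connected graph, and betweenness is left unnormalised (a count over unordered node pairs). The lectures may use a different normalisation, so treat the exact conventions here as assumptions.

```python
from collections import deque

# Toy undirected graph as an adjacency list (made-up data for illustration).
graph = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"B", "E"},
    "E": {"D"},
}


def degree_centrality(g):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(g)
    return {v: len(nbrs) / (n - 1) for v, nbrs in g.items()}


def bfs_distances(g, source):
    """Shortest-path lengths (in hops) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in g[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def closeness_centrality(g):
    """(n - 1) / sum of distances to all other nodes; assumes a connected graph."""
    n = len(g)
    return {v: (n - 1) / sum(bfs_distances(g, v).values()) for v in g}


def betweenness_centrality(g):
    """Unnormalised betweenness using Brandes' method for unweighted graphs."""
    score = {v: 0.0 for v in g}
    for s in g:
        # BFS from s, counting shortest paths (sigma) and recording predecessors.
        order = []
        preds = {v: [] for v in g}
        sigma = {v: 0 for v in g}
        dist = {v: -1 for v in g}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for w in g[u]:
                if dist[w] < 0:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    sigma[w] += sigma[u]
                    preds[w].append(u)
        # Walk back from the farthest nodes, accumulating path dependencies.
        delta = {v: 0.0 for v in g}
        for w in reversed(order):
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every unordered pair was counted once from each endpoint.
    return {v: c / 2 for v, c in score.items()}


def local_clustering(g, v):
    """Fraction of pairs of v's neighbours that are themselves connected."""
    nbrs = list(g[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1 for i in range(k) for j in range(i + 1, k) if nbrs[j] in g[nbrs[i]]
    )
    return links / (k * (k - 1) / 2)


print(degree_centrality(graph))
print(closeness_centrality(graph))
print(betweenness_centrality(graph))
print({v: local_clustering(graph, v) for v in graph})
```

In this toy graph, B has the highest degree and closeness, and B and D are the only nodes with non-zero betweenness because every shortest path to E has to pass through them.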

Week 2

This week is heavily practice-oriented, and coding questions from it are asked in the exams. To be able to write the code in the exams, start practicing what is shown in the lectures.

Important Libraries Discussed

Twitter Data Collection using Tweepy (Python wrapper for the Twitter API). Also keep an eye on the Streaming API and Filter API of the Twitter API as used through Tweepy (discussed in the lectures).
Reddit Data Collection using Pushshift.
Some other APIs discussed: Instagram Graph API, Facebook API etc.
RECOMMENDATION: TRY OUT THE EXAMPLES ON YOUR OWN TO GET A GOOD GRIP ON THE PROCESS.
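
As a starting point for practice, here is a rough sketch of a keyword search with Tweepy. It assumes Tweepy v4 with the Twitter API v2 `Client` interface and an account whose access level permits recent search; the lectures may instead use the older `tweepy.API` interface, so check which one is taught. The helper names `build_query` and `fetch_recent_tweets` and the environment variable `TWITTER_BEARER_TOKEN` are made up for this example.

```python
import os


def build_query(keywords, lang="en", exclude_retweets=True):
    """Build a Twitter API v2 search query string from a list of keywords."""
    # Phrases containing spaces are quoted so they match as exact phrases.
    terms = " OR ".join(f'"{k}"' if " " in k else k for k in keywords)
    query = f"({terms})"
    if lang:
        query += f" lang:{lang}"
    if exclude_retweets:
        query += " -is:retweet"
    return query


def fetch_recent_tweets(bearer_token, query, max_results=10):
    """Return (id, text) pairs for recent tweets matching the query."""
    import tweepy  # third-party package: pip install tweepy

    client = tweepy.Client(bearer_token=bearer_token)
    response = client.search_recent_tweets(
        query=query,
        max_results=max_results,  # the endpoint accepts values from 10 to 100
        tweet_fields=["created_at", "lang"],
    )
    # response.data is None when nothing matches the query.
    return [(tweet.id, tweet.text) for tweet in (response.data or [])]


# Only call the API when a token is available (hypothetical variable name).
token = os.environ.get("TWITTER_BEARER_TOKEN")
if token:
    query = build_query(["#privacy", "data breach"])
    for tweet_id, text in fetch_recent_tweets(token, query):
        print(tweet_id, text)
```

Keeping the query construction separate from the API call makes it easy to practice writing queries on paper, which is closer to what an exam question can ask.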

Week 3

This week is dedicated to NLP pipelines and social media text processing, including feature extraction etc.

Important Points Discussed

Artefacts in Social Media Text: Hashtags, Mentions, URLs etc.
NLP Pipeline : Go through the lectures to see how nltk, spaCy, gensim, etc. are used and how a classifier (Logistic Regression) is built.
VERY IMPORTANT : TF-IDF and Bag of Words Representations of sentences. Do this manually as well for the exam.
RECOMMENDATION 1: GO THROUGH THE WHOLE COLAB NOTEBOOK MENTIONED IN THE LECTURES AT LEAST ONCE.
RECOMMENDATION 2: Perform a full social media text classification, right from data collection to the classification algorithm, for the best understanding of all the above methods.
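
To check manual Bag of Words and TF-IDF calculations, a small sketch in plain Python can help. The three example posts are made up, and the weighting used here is one common textbook variant: term frequency as count divided by document length, and IDF as the natural log of N / df. The lectures may define it differently, and libraries such as scikit-learn apply smoothing and normalisation by default, so the numbers will not match theirs exactly.

```python
import math
import re
from collections import Counter

# Made-up posts containing typical social media artefacts.
docs = [
    "Breaking: #floods hit the city http://example.com",
    "City floods again, says @mayor",
    "The mayor visits the city",
]

URL_RE = re.compile(r"https?://\S+")
TOKEN_RE = re.compile(r"[#@]?\w+")


def tokenize(text):
    """Lowercase, drop URLs, and keep hashtags and mentions as single tokens."""
    text = URL_RE.sub(" ", text.lower())
    return TOKEN_RE.findall(text)


def bag_of_words(token_lists):
    """Return the sorted vocabulary and one count vector per document."""
    vocab = sorted({t for tokens in token_lists for t in tokens})
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append([counts[t] for t in vocab])
    return vocab, vectors


def tf_idf(token_lists):
    """Return one {term: weight} dict per document using tf * ln(N / df)."""
    n_docs = len(token_lists)
    # Document frequency: in how many documents each term appears.
    df = Counter(t for tokens in token_lists for t in set(tokens))
    weights = []
    for tokens in token_lists:
        counts = Counter(tokens)
        weights.append(
            {
                t: (c / len(tokens)) * math.log(n_docs / df[t])
                for t, c in counts.items()
            }
        )
    return weights


tokens = [tokenize(d) for d in docs]
vocab, bow = bag_of_words(tokens)
print(vocab)
for vector in bow:
    print(vector)
for weights in tf_idf(tokens):
    print(weights)
```

Note that "city" appears in all three posts, so its IDF is ln(3/3) = 0 and it carries no weight, while "#floods" and "floods" are counted as different tokens unless the hashtag symbol is stripped during preprocessing.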

Week 4

This week deals with fake news and its implications.

Important Points Discussed

Fake News Examples and Implications : Sushant Singh Rajput Case, 5G Fake News, etc. Refer to the lectures to study each case. Pinpointed factual details about these cases are not asked in the quizzes, but they are good to know for related questions.
Types of Fake News (VERY IMPORTANT): Satire, Parody, Manipulated Content, False Context, Imposter Content, Deep Fakes etc. These classifications are asked in the exams.
Fake News Concepts: Partisanship, Crowd Sourcing, Digital Illiteracy
Psychological Phenomena (VERY IMPORTANT): Cognitive Bias, Echo Chamber Effect, Implied Truth Effect, Confirmation Bias etc.
Spotting Fake News (VERY IMPORTANT) : Steps to verify fake news, Check the Author, Check the Source, Check the Grammar etc.

Week 5

Research Overview of Fake News Detection Methods : Who is the target and who is the attacker; refer to the lectures.
Text Based Fake News Detection : Word2Vec, GloVe, BERT, Sentence BERT, Doc2Vec. VERY IMPORTANT: UNDERSTAND THE DIFFERENCE BETWEEN WORD2VEC, TF-IDF, etc.
Image Based Fake News Detection : Visual Clarity Scores and EXIF Data
Multi-modal Methods VERY IMPORTANT, GO THROUGH LECTURES: Event Adversarial Neural Networks (IMPORTANT RESEARCH PAPER)
FAKE NEWS ON TWITTER : SPOTFAKE (IMPORTANT), FAKING SANDY (IMPORTANT RESEARCH PAPER)
FAKE NEWS ON WHATSAPP: WHATSAPP MONITOR/TIPLINES (IMPORTANT), Categories of Misinformation.
FAKE NEWS DATASETS (IMPORTANT): LIAR/LIAR-PLUS, FA-KES, NELA-GT, MediaEval 2015/2016 Dataset, FakeNewsNet, FactDrill. Go through lectures and understand the features of each dataset.
OTHER FAKE NEWS APPROACHES (IMPORTANT): TweetCred, Facebook Inspector, WhatsFarzi, CredEye.
Drawbacks of BirdWatch by Twitter, analyzed by the HawkEye Paper
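
To make the difference between count-based representations (Bag of Words, TF-IDF) and dense embeddings (Word2Vec and similar) concrete, here is a toy sketch. The three-dimensional vectors are invented by hand purely for illustration and do not come from any trained model; real embeddings have far more dimensions and are learned from data. Averaging word vectors to get a sentence vector is used here only as a simple baseline.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Two short texts with similar meaning but no words in common.
s1 = ["fake", "story"]
s2 = ["false", "report"]

# Count-based view: one dimension per vocabulary word.
vocab = sorted(set(s1 + s2))


def count_vector(tokens):
    return [tokens.count(t) for t in vocab]


# Hypothetical hand-made "embeddings" standing in for Word2Vec output.
toy_embeddings = {
    "fake": [0.9, 0.1, 0.0],
    "false": [0.8, 0.2, 0.1],
    "story": [0.1, 0.9, 0.2],
    "report": [0.2, 0.8, 0.3],
}


def mean_vector(tokens):
    """Average the word vectors of a text, dimension by dimension."""
    vectors = [toy_embeddings[t] for t in tokens]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


sparse_sim = cosine(count_vector(s1), count_vector(s2))
dense_sim = cosine(mean_vector(s1), mean_vector(s2))
print("count-based similarity:", sparse_sim)
print("embedding similarity:", dense_sim)
```

With no shared words, the count vectors are orthogonal and their similarity is zero, whereas the dense vectors place related words close together and give a high similarity. That is the core reason embedding-based detectors can generalise across paraphrases where count-based ones cannot.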

Week 6

This week is about policing, focusing on the types of requests made by the public, the types of responses from the public, and the types of replies from the police. Go through the lectures thoroughly for this week.

Go through the first lecture for an idea on all the examples of police activity on Twitter for different cities.
What are the outcomes of publicizing police interactions? (Appreciation of Police, Quick Resolution etc.) Refer to the lectures.
Privacy Policy and how Privacy Policies look on Different OSMs. (VERY VERY IMPORTANT)
Personally Identifiable Information (PII) (VERY IMPORTANT): Look out for these research papers and techniques: FreeSWITCH, SocioCaller. GO THROUGH THE LECTURES FOR ALL THE DISCUSSIONS ON RESEARCH PAPERS, AS THOSE ARE VERY IMPORTANT FROM THE EXAM'S PERSPECTIVE.
IDENTITY RESOLUTION (VERY IMPORTANT): How to do Identity Resolution (Nudging Nemo and other Identity Resolution Algorithms) and determining Linkability Scores.
RECOMMENDATION: It's important to understand how these algorithms work as discussed by the professor in the lectures, especially the flowcharts, which come as questions in the exam.