🔍 Title: A Learning-Based Approach for Real-Time Emotion Classification of Tweets
🧠 1. What Is the Problem Being Solved?
People post millions of tweets every day expressing feelings, emotions, and opinions. Detecting emotions in tweets
helps in:
Monitoring public mood
Understanding customer sentiment
Improving mental health tools
Assisting governments/brands during events (e.g., elections, disasters, product launches)
🤖 2. What Is a Learning-Based Approach?
A learning-based approach means using machine learning algorithms to train models on labeled tweet data and
classify new tweets based on that training.
Steps:
1. Collect tweets
2. Preprocess the tweets (cleaning text)
3. Extract features (e.g., word embeddings)
4. Train a classifier (e.g., SVM, LSTM, BERT)
5. Predict emotions (e.g., joy, sadness, anger, fear)
📱 3. Why Real-Time?
Real-time classification means analyzing tweets as they are posted, not hours or days later. It's important for:
Emergency response
Trending topic analysis
Real-time marketing
Crisis detection
To achieve this, we need: 1) fast algorithms, 2) a streaming architecture, and 3) efficient preprocessing pipelines.
💡 4. Where Does Social Network Analysis Fit In?
SNA can enhance emotion classification in several ways:
📊 a. Network Context:
Analyzing who is connected to whom
Understanding influence and emotion spread across the network
For example: 1) Emotions can spread through social ties. 2) SNA helps identify influential users whose emotions
affect many others.
🌐 b. Hashtag & Mention Graphs: Tweets using the same hashtag or mentioning the same user can be linked
Graphs can show communities with similar emotions
🧩 c. Emotion Propagation: Predict emotions of unlabeled tweets by seeing emotions in their connected neighbors
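A minimal sketch of emotion propagation, assuming a simple majority vote over labeled neighbors (one of several possible schemes). The graph, tweet IDs, and the function name `propagate_labels` are made up for illustration.

```python
from collections import Counter


def propagate_labels(graph, known, max_rounds=10):
    """Give each unlabeled node the most common label among its labeled neighbors."""
    labels = dict(known)
    for _ in range(max_rounds):
        updates = {}
        for node, neighbors in graph.items():
            if node in labels:
                continue
            votes = Counter(labels[n] for n in neighbors if n in labels)
            if votes:
                updates[node] = votes.most_common(1)[0][0]
        if not updates:
            break  # nothing new was labeled in this round
        labels.update(updates)
    return labels


# Hypothetical tweet graph: edges link tweets sharing a hashtag or mention.
graph = {
    "t1": ["t2", "t3"],
    "t2": ["t1", "t3"],
    "t3": ["t1", "t2", "t4"],
    "t4": ["t3", "t5"],
    "t5": ["t4"],
}
known = {"t1": "joy", "t2": "joy", "t5": "anger"}
labels = propagate_labels(graph, known)
print(labels["t3"], labels["t4"])
```

Here t3 takes "joy" from its two labeled neighbors, and t4 takes "anger" from t5. Real systems usually weight edges and combine this with a text classifier.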
🧪 5. Machine Learning Models Used
Model: Description
Naive Bayes: Simple and fast; a good baseline for text
SVM: Works well in high-dimensional feature spaces like TF-IDF
LSTM/GRU: Captures dependencies between words across the token sequence of a tweet
BERT: Pretrained language model that captures the meaning of words in context
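As a rough illustration of the simplest model in the list, here is a tiny multinomial Naive Bayes with add-one smoothing. The class name `TinyNaiveBayes` and the four training tweets are invented for this sketch; a real system would train on a large labeled corpus.

```python
import math
from collections import Counter, defaultdict


class TinyNaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # word frequencies per class
        self.vocab = set()
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up, already tokenized training tweets.
train_docs = [
    ["love", "this", "day"],
    ["so", "happy", "today"],
    ["i", "hate", "mondays"],
    ["this", "is", "awful"],
]
train_labels = ["joy", "joy", "anger", "anger"]

model = TinyNaiveBayes().fit(train_docs, train_labels)
print(model.predict(["happy", "day"]))
print(model.predict(["awful", "mondays"]))
```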
🧹 6. Preprocessing of Tweets
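Remove URLs and mentions; remove or normalize hashtags and emojis (they often carry emotional signal, so some
pipelines convert them to tokens instead of deleting them)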
Lowercase everything
Tokenize and remove stopwords
Convert text into vectors using:
o TF-IDF
o Word2Vec
o BERT embeddings
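The cleaning steps above can be sketched with regular expressions. This is only one possible pipeline: the stopword list is a tiny placeholder, and keeping the hashtag word (instead of deleting it) is a choice made for this example.

```python
import re

# Tiny illustrative stopword list (an assumption; real pipelines use larger lists).
STOPWORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "i", "it", "this", "so"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#(\w+)")
NON_WORD_RE = re.compile(r"[^a-z0-9\s]")


def clean_tweet(text: str) -> list[str]:
    """Lowercase, strip URLs and mentions, keep hashtag words, tokenize, drop stopwords."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = HASHTAG_RE.sub(r"\1", text)  # "#happy" becomes "happy"
    text = NON_WORD_RE.sub(" ", text)   # drops punctuation and emojis
    return [tok for tok in text.split() if tok not in STOPWORDS]


print(clean_tweet("I LOVE this!! 😍 #happy @bob https://t.co/xyz"))
# → ['love', 'happy']
```

The resulting tokens would then be turned into vectors with TF-IDF, Word2Vec, or BERT embeddings.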
📈 7. Evaluation Metrics: To evaluate the model:
Accuracy: % of correct predictions
Precision, Recall, F1-score
Confusion Matrix
ROC-AUC for multi-label emotion detection
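To make these metrics concrete, here is a small sketch that computes accuracy and per-class precision, recall, and F1 from two label lists. The labels are invented, and the helper name `per_class_metrics` is just for this example.

```python
def per_class_metrics(y_true, y_pred, label):
    """Return (precision, recall, f1) for one emotion class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up gold labels and predictions for six tweets.
y_true = ["joy", "joy", "anger", "fear", "anger", "joy"]
y_pred = ["joy", "anger", "anger", "fear", "joy", "joy"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision, recall, f1 = per_class_metrics(y_true, y_pred, "joy")
print(round(accuracy, 3), round(precision, 3), round(recall, 3), round(f1, 3))
```

For "joy" there are 2 true positives, 1 false positive, and 1 false negative, so precision, recall, and F1 are all 2/3, while accuracy is 4/6.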
🧠 8. Emotions Usually Detected
Some common emotion categories in tweets: 1) Joy 2) Sadness 3) Anger 4) Fear 5) Surprise 6) Love 7) Neutral
🌟 9. Applications
Mental health monitoring (detect suicidal or depressed tweets)
Brand sentiment tracking
Disaster response (detect panic, fear)
Election campaigns (joy/anger analysis on politicians)
Social bot detection (unusual emotion spikes)
"A New Linguistic Approach to Assess the Opinion of Users in Social Network Environments"
🧠 1. What Is This Topic About?
This topic focuses on understanding user opinions (like support, disagreement, satisfaction, etc.) from the language
they use in social networks (like Twitter, Facebook, Reddit, etc.). Instead of just using machine learning or basic
sentiment analysis, this paper proposes a linguistic approach — using the structure, grammar, and meaning of
language to better understand opinions in online posts.
📌 2. Keywords to Understand
Term: Meaning
Linguistic Approach: Uses language structure, grammar, syntax, and semantics to analyze text. Goes deeper than
word matching.
Opinion Mining: Finding and analyzing people's opinions, sentiments, and emotions in text.
Social Network Environments: Platforms like Twitter, Facebook, Instagram, YouTube, where users share their
thoughts and interact.
🔍 3. Why a Linguistic Approach?
Most traditional methods:
Use keyword-based sentiment analysis
Focus on positive, negative, neutral
But those are limited because:
People don’t always express opinions directly (“Yeah, right!” is often sarcastic)
Tweets/posts are short, noisy, and informal
Emotions/opinions can be hidden in grammar or contextual clues
So a linguistic approach tries to:
Understand how sentences are built
Analyze modifiers, intensifiers, negations
Detect sarcasm, irony, or subtlety
Use natural language rules, not just machine learning
4. What Techniques Are Involved?
a. Syntax Parsing
Analyzes sentence structure (subject, verb, object)
Helps identify who is expressing what opinion
Example: "I don’t think this movie is bad." → negation ("don’t") of a negative word ("bad") → mildly positive opinion
b. Semantic Analysis
Understands meaning of words in context
c. Lexical Resources
Use dictionaries like:
o SentiWordNet (words with sentiment scores)
o LIWC (Linguistic Inquiry and Word Count)
o WordNet for synonyms, antonyms
d. Discourse Analysis
Studies how multiple sentences together form an opinion
Tracks pronouns ("he", "they") and logical connectors ("but", "however")
e. Sarcasm & Irony Detection
Important in social networks
Example: "Just great! I missed the bus AND it's raining." → sarcastic, negative
🌐 5. How Is It Used in Social Network Analysis?
✅ a. User-Level Opinion Profiling
Build opinion profiles for each user (positive/negative tone, topic preferences)
✅ b. Influence Detection
People who use strong/opinionated language may be influencers
Combine with SNA metrics like centrality to find powerful voices
✅ c. Community Sentiment Analysis
Analyze collective opinion of a group (e.g., fans of a celebrity or voters)
🧠 6. Example Workflow
Tweet: "I absolutely love the new iPhone, though it’s super expensive."
→ Syntax parsing: “absolutely love” = strong positive, “super expensive” = negative cost aspect
→ Semantic score = +0.7 (overall positive opinion)
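A rule-based scorer in this spirit might look like the sketch below. The lexicon values, intensifier weights, and the averaging rule are all invented for illustration, so the score it prints will not match the +0.7 above; negation handling is deliberately crude (a negator flips the next opinion word it meets).

```python
# Hypothetical mini-lexicon; real systems use resources such as SentiWordNet.
LEXICON = {"love": 0.8, "great": 0.7, "good": 0.5,
           "bad": -0.5, "expensive": -0.4, "hate": -0.8}
INTENSIFIERS = {"absolutely": 1.5, "super": 1.5, "very": 1.3}
NEGATORS = {"not", "never", "no", "don't", "don’t"}


def score_opinion(text):
    """Average the polarity of opinion words, applying intensifiers and negation."""
    tokens = text.lower().replace(",", " ").replace(".", " ").split()
    total, hits = 0.0, 0
    weight, negate = 1.0, False
    for tok in tokens:
        if tok in INTENSIFIERS:
            weight *= INTENSIFIERS[tok]
        elif tok in NEGATORS:
            negate = True
        elif tok in LEXICON:
            value = LEXICON[tok] * weight
            total += -value if negate else value
            hits += 1
            weight, negate = 1.0, False  # modifiers apply to one opinion word only
    return total / hits if hits else 0.0


print(score_opinion("I absolutely love the new iPhone, though it's super expensive."))
print(score_opinion("I don't think this movie is bad."))
```

With these toy weights the iPhone tweet scores about +0.3 (strong positive "love", weaker negative "expensive"), and the negated "bad" comes out at +0.5.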
📊 7. Advantages Over Traditional Methods
Traditional Approach vs. Linguistic Approach
Word-based only vs. context-based
Can miss sarcasm vs. can detect irony
Shallow analysis vs. deep structure
Less explainable vs. more interpretable
🧭 8. Real-World Applications
Brand monitoring: Understanding detailed customer feedback
Political sentiment analysis: Tracking voter opinion linguistically
Health opinion mining: Patients' feedback on treatments
Fake news detection: Analyzing how opinion is framed linguistically
🧠 Topic: Explaining Scientific and Technical Emergence Forecasting
As your Social Network Analysis (SNA) professor, I will break this down into easy, structured, and logical segments
so you fully understand this advanced concept.
🔍 1. What Is Scientific and Technical Emergence?
Emergence refers to the rise of new ideas, technologies, or scientific trends that are likely to become important in the
future.
🎯 2. What Is Emergence Forecasting?
Emergence forecasting is the process of:
Identifying, tracking, and predicting
Which scientific or technological ideas
Will grow fast, gain attention, and create real-world impact
This helps:
Policymakers
Research institutes
Investors
Innovators
Make strategic decisions early.
🧪 3. Where Does This Apply?
Research trend detection (e.g., in arXiv, IEEE)
Funding allocation (which areas to invest in)
Patent analysis (new tech innovations)
Horizon scanning (anticipating risks or opportunities)
Innovation management (what tech to adopt next)
🔗 4. How Is Social Network Analysis Involved?
Social Network Analysis is key in forecasting emergence. Here’s how:
a. Co-authorship Networks
Who is publishing with whom?
Emerging ideas often come from new collaborations
b. Citation Networks
Which papers are being cited frequently and recently?
Sudden citation growth = possible emergence
c. Keyword Co-occurrence Networks
Track which keywords (e.g., “blockchain + AI”) are appearing together more often
d. Innovation Diffusion
SNA shows how ideas spread across communities
Early spreaders (influential nodes) are innovation drivers
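The keyword co-occurrence idea can be sketched by counting keyword pairs per year. The paper list below is made up, and `cooccurrence_by_year` is a hypothetical helper name.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_by_year(papers):
    """Count how often each pair of keywords appears together, per year."""
    counts = {}
    for year, keywords in papers:
        pairs = combinations(sorted(set(keywords)), 2)
        counts.setdefault(year, Counter()).update(pairs)
    return counts


# Invented (year, keywords) records standing in for real bibliographic data.
papers = [
    (2021, ["blockchain", "security"]),
    (2022, ["blockchain", "ai"]),
    (2023, ["ai", "blockchain", "healthcare"]),
    (2023, ["blockchain", "ai"]),
]

counts = cooccurrence_by_year(papers)
pair = ("ai", "blockchain")  # pairs are stored in alphabetical order
trend = {year: c[pair] for year, c in sorted(counts.items())}
print(trend)
```

A pair whose count rises year over year (here 0, then 1, then 2) is a candidate emerging combination.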
🧭 5. Forecasting Techniques Used
Technique: Description
Bibliometric Analysis: Analyzes scientific articles (keywords, citations) to find trends
Text Mining & NLP: Extracts emerging terms from titles, abstracts, and patents
Time Series Analysis: Tracks growth in usage over time
Topic Modeling: Groups related documents (e.g., using LDA) to discover hot topics
Machine Learning: Predicts which topics will grow based on past data
🧱 6. Indicators of Emergence
To forecast emergence, look for patterns like:
Indicator: Meaning
Burst in Publications: Sudden spike in number of papers
Growth in Citations: More attention from researchers
New Terms Appearing: Novel keywords or jargon
Cross-Disciplinary Spread: Topic moving between different fields
Global Interest: Topic being discussed worldwide
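One simple way to operationalize a "burst in publications" is a year-over-year growth threshold. The thresholds and counts below are arbitrary assumptions for illustration; published burst-detection methods are more elaborate.

```python
def find_bursts(yearly_counts, min_growth=1.0, min_count=5):
    """Return (year, growth) where the count grew by at least min_growth (1.0 = +100%)."""
    years = sorted(yearly_counts)
    bursts = []
    for prev, curr in zip(years, years[1:]):
        before, after = yearly_counts[prev], yearly_counts[curr]
        if before > 0 and after >= min_count:
            growth = (after - before) / before
            if growth >= min_growth:
                bursts.append((curr, growth))
    return bursts


# Made-up paper counts for one topic.
yearly_counts = {2018: 4, 2019: 5, 2020: 6, 2021: 15, 2022: 18}
bursts = find_bursts(yearly_counts)
print(bursts)
# → [(2021, 1.5)]
```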
7. Example: Detecting the Rise of AI in Healthcare
Step-by-step:
1. Collect papers from 2010 to 2024 in medicine.
2. Analyze keyword trends: “deep learning,” “medical imaging,” “AI diagnosis.”
3. Build a co-author and citation network.
4. Observe sudden increase in:
o Papers on “AI + radiology”
o International collaborations
o Patent filings in “AI health diagnostics”
5. Forecast: This area is an emerging domain likely to grow in funding and startups.
🧠 8. Why Is This Important?
Governments can plan future R&D funding.
Companies can invest early in disruptive tech.
Researchers can focus on high-impact fields.
Society can prepare for new tech impacts (e.g., ethics of AI or privacy in biotech).
📌 9. Challenges in Emergence Forecasting: 1) Noisy data 2) Lags in publication 3) Language ambiguity
4) Prediction errors
📘 Topic: Social Network Analysis for Biometric Template Protection
As your Social Network Analysis (SNA) professor, I’ll explain this advanced topic in simple, clear, and organized
steps so you understand how SNA and biometric security are connected.
🔐 1. What Is Biometric Template Protection?
Biometric templates are digital representations of biometric traits like:
Fingerprints
Iris patterns
Face features
Voiceprints
These templates are sensitive and private. Unlike passwords, they can't be changed if leaked or stolen: you can't
change your fingerprint! So, biometric template protection ensures:
Data is stored securely
It can’t be reverse-engineered
It resists spoofing, tampering, and unauthorized access
🌐 2. Where Does Social Network Analysis Come In?
Social Network Analysis (SNA) is used in this context to:
Analyze relationships between biometric data and users/devices
Detect patterns in access, similarity, or potential attacks
Model and secure interactions in biometric systems (especially multi-user or cloud-based)
💡 3. Conceptual Overview: Biometric SNA
Let’s build a network where:
Nodes = biometric templates (or users)
Edges = similarities or shared patterns (e.g., matching traits, shared access points, attack vectors)
By analyzing this graph, we can:
Detect anomalies (e.g., one user linked to multiple templates = spoofing)
Identify clusters of reused or correlated data
Visualize leakage or compromise paths
🧠 4. Key Applications of SNA in Template Protection
Application Area: Description
Template Similarity Networks: Build networks based on similarity scores to detect duplicates or fakes
Attack Path Prediction: Use SNA to trace possible attack vectors in multi-biometric systems
Access Pattern Monitoring: Who is accessing what, how often, and from where
Spoof Detection: Identify outliers or unusual connections that suggest template tampering
User-Device Graphs: Map users and devices to detect cloning or identity fraud
📊 5. Techniques Used
SNA Technique: Use in Biometric Protection
Community Detection: Identify user groups with common biometric behavior
Centrality Measures (Degree, Betweenness): Spot suspicious templates involved in many matches
Anomaly Detection: Flag irregular connections between users and templates
Clustering Coefficients: Detect dense clusters of fake identities (Sybil attacks)
Temporal SNA: Track changes in the network over time to spot misuse
🔍 6. Example Scenario
Imagine a cloud-based fingerprint authentication system used by employees.
Build a user-template graph
Run SNA to find:
o Multiple templates tied to one user: Possible spoofing
o One template reused by many users: Data leakage
o A node with high betweenness centrality: Gateway to attack
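The first two checks in this scenario can be sketched on a user-template graph stored as a list of edges. The enrollment records, thresholds, and the function name `flag_anomalies` are assumptions for this example only.

```python
from collections import defaultdict


def flag_anomalies(enrollments, max_templates_per_user=1, max_users_per_template=1):
    """Flag users with too many templates and templates shared by too many users."""
    templates_of = defaultdict(set)  # user -> templates
    users_of = defaultdict(set)      # template -> users
    for user, template in enrollments:
        templates_of[user].add(template)
        users_of[template].add(user)
    suspicious_users = sorted(
        u for u, ts in templates_of.items() if len(ts) > max_templates_per_user
    )
    shared_templates = sorted(
        t for t, us in users_of.items() if len(us) > max_users_per_template
    )
    return suspicious_users, shared_templates


# Hypothetical (user, template) edges from an authentication log.
enrollments = [
    ("alice", "T1"),
    ("bob", "T2"),
    ("carol", "T3"),
    ("carol", "T4"),  # one user, two templates: possible spoofing
    ("dave", "T2"),   # template T2 used by two users: possible leakage
]

suspicious_users, shared_templates = flag_anomalies(enrollments)
print(suspicious_users, shared_templates)
# → ['carol'] ['T2']
```

A flagged node is only a lead for investigation; legitimate re-enrollment can also produce multiple templates per user.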
🧬 7. Linking SNA with Biometric Template Protection Techniques
Protection Method: SNA Support
Cancelable Biometrics: Track how changes affect relationships in the network
Template Watermarking: Detect tampering by observing trust flows in the network
Multimodal Biometric Fusion: Use SNA to analyze relationships between different modalities (e.g., face + voice)
Template Revocability: Use temporal SNA to track the effectiveness of template updates over time
🚨 8. Advantages of Using SNA
Early warning for suspicious activity
Better visualization of template misuse or fraud
Improved decision-making for revoking or updating templates
Insight into trust/distrust relationships in biometric systems
⚠️ 9. Challenges
Challenge: Description
Data Privacy: Graphs must not leak sensitive biometric traits
Scalability: Large biometric systems produce huge graphs
False Positives: SNA may flag legitimate users as threats
Dynamic Environments: Templates and users change over time, so the graph must be updated too
Unit-1
📘 Introduction to Web, Its Limitations, Semantic Web, and the Emergence of the Social Web
🌐 1. Introduction to the Web
The World Wide Web (WWW) is a system of interlinked hypertext documents and multimedia content accessed via
the Internet using web browsers.
Key Components:
HTML: HyperText Markup Language for designing web pages.
HTTP: Protocol used to access web pages.
URL: Unique address for every web page.
Web Browser: Interface to access the web (e.g., Chrome, Firefox).
🚫 2. Limitations of the Current Web (Web 1.0 and Web 2.0)
Limitation: Explanation
Lack of Meaning: Web content is designed for humans, not machines, so machines can't understand its meaning.
Information Overload: Too much information, but poor filtering and personalization.
No Interlinking of Data: Pages are linked, but data is not, which makes knowledge integration difficult.
Static Content (Web 1.0): Early web pages were read-only; no interaction.
Privacy and Trust Issues (Web 2.0): User-generated content raises concerns over fake profiles, misinformation,
and surveillance.
Difficult for Automated Agents: Machines can't process or reason about web content intelligently.
3. Development of the Semantic Web (Web 3.0)
The Semantic Web, proposed by Tim Berners-Lee, is an extension of the current web where information is given
well-defined meaning, making it easier for computers to process, share, and reuse.
📌 Goals:
Make machine-understandable data
Enable intelligent search and automation
Support semantic reasoning (logic-based)
Technologies Used:
Technology: Purpose
RDF (Resource Description Framework): Basic data representation format
OWL (Web Ontology Language): Describes relationships and rules
SPARQL: Query language for semantic data
Ontology: Defines domain knowledge (e.g., diseases, people, books)
✅ Benefits:
Intelligent agents (e.g., smart assistants)
Personalized services
Better search results
Interoperability between systems
🤝 4. Emergence of the Social Web (Web 2.0)
The Social Web refers to the interactive, collaborative, user-driven web, where people create, share, and connect
through platforms.
🔑 Features:
User-generated content (blogs, tweets, reviews)
Social networking (Facebook, Twitter, LinkedIn)
Collaboration tools (Wikipedia, GitHub)
Rich interactivity (AJAX, APIs, mobile apps)
Reputation and trust systems (likes, followers, ratings)
📱 Examples:
Platform: Function
Facebook: Social networking
Twitter: Microblogging
Instagram: Photo sharing
YouTube: Video sharing
Reddit: Social news and forums
🧩 5. Differences: Semantic Web vs. Social Web
Aspect: Semantic Web vs. Social Web
Purpose: Machine-readable content vs. human interaction and content sharing
Focus: Data meaning and logic vs. social interaction and connectivity
Example: Ontologies, linked data vs. social media, wikis
Goal: Automation and intelligent agents vs. community building and user engagement
📌 6. Importance in Social Network Analysis
Semantic Web helps structure social data (e.g., interests, relationships) in meaningful formats.
Social Web provides raw user-generated content (posts, likes, comments) for SNA.
Combined, they power intelligent social platforms (like recommendation systems, fake news detection,
behavior analysis).
📘 Statistical Properties of Social Networks – Network Analysis – Development of Social Network Analysis
🌐 1. What Is a Social Network?
A social network is a structure made up of:
Nodes (vertices): individuals, users, or entities
Edges (links): relationships or interactions between them (friendship, following, messaging, etc.)
Think of Facebook: each person is a node, and each friendship is an edge.
📊 2. Statistical Properties of Social Networks
These are mathematical characteristics that help understand the structure and behavior of networks.
a) Degree
Definition: Number of connections a node has.
Types:
o In-degree: Number of incoming links (e.g., followers on Twitter)
o Out-degree: Number of outgoing links (e.g., how many you follow)
Significance: Identifies influencers or active users
b) Average Path Length
Definition: Average number of steps it takes to go from one node to another.
Shorter path = better information flow
Example: In LinkedIn, you are usually 2–3 steps away from any professional.
c) Clustering Coefficient
Definition: Measures how likely a node’s neighbors are to be connected to each other.
High clustering = tight-knit community
Example: In WhatsApp groups, many users know each other.
d) Network Diameter
Definition: The longest shortest path between any two nodes.
Helps identify network size and reachability.
e) Centrality Measures
Used to find important or influential nodes:
Type: Meaning
Degree Centrality: Nodes with most connections
Betweenness Centrality: Nodes that act as bridges in the network
Closeness Centrality: Nodes closest to all others
Eigenvector Centrality: Nodes connected to other important nodes (Google's PageRank is a variant of this idea)
f) Density
Definition: Measures how many edges exist compared to the maximum possible.
High density = highly connected network
g) Homophily: People tend to connect with similar people (same age, interest, opinion)
h) Power Law Distribution: In social networks, a few nodes have many connections (hubs) while most have few;
the degree distribution follows a power law.
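Several of these properties can be computed by hand on a small graph. The sketch below uses a made-up five-node undirected network and plain Python; it assumes the graph is connected (otherwise average path length and diameter need extra care).

```python
from collections import deque
from itertools import combinations

# Made-up undirected friendship network.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]
graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)


def bfs_distances(graph, source):
    """Shortest-path length (in hops) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist


def local_clustering(graph, node):
    """Fraction of pairs of neighbors that are themselves connected."""
    nbrs = graph[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in graph[a])
    return 2 * links / (k * (k - 1))


n = len(graph)
degree = {node: len(nbrs) for node, nbrs in graph.items()}
density = 2 * len(edges) / (n * (n - 1))
avg_clustering = sum(local_clustering(graph, v) for v in graph) / n
path_lengths = [d for s in graph for t, d in bfs_distances(graph, s).items() if t != s]
avg_path_length = sum(path_lengths) / len(path_lengths)
diameter = max(path_lengths)

print(degree["C"], density, round(avg_clustering, 3), avg_path_length, diameter)
```

Node C has degree 3, density is 5 of 10 possible edges (0.5), the average path length is 1.7, and the diameter is 3 (from A or B to E).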
📈 3. Network Analysis
Network analysis uses mathematical and visual techniques to study structure, behavior, and relationships in a
network.
Steps in Network Analysis:
1. Data Collection (e.g., user tweets, likes, follows)
2. Graph Construction (nodes and edges)
3. Metric Calculation (degree, centrality, etc.)
4. Visualization (using tools like Gephi, NetworkX, Cytoscape)
5. Interpretation (find influencers, communities, anomalies)
🧠 4. Development of Social Network Analysis (SNA)
🔙 Early History:
Began in sociology and anthropology (1930s–1950s)
Studied group behavior, family ties, and friendship circles
Jacob Moreno introduced sociograms (early network maps)
📚 Mid 20th Century:
SNA became mathematical using graph theory
Used in psychology, communication, political science
💻 Recent Years:
With digital platforms (Facebook, Twitter, LinkedIn):
Huge data available from social media
Used for influencer detection, sentiment analysis, spread of information, fake news detection
🌍 Applications Today:
Area: Use
Marketing: Influencer targeting, viral campaigns
Epidemiology: Disease spread tracking (e.g., COVID-19)
Security: Detecting terrorist/fraud networks
Politics: Analyzing opinion clusters
Education: Student collaboration patterns
📘 Key Concepts and Measures in Network Analysis – Discussion Networks – Blogs and Online Communities –
Web-Based Networks
1️⃣ Key Concepts and Measures in Network Analysis
These are the foundational ideas and metrics used to understand how networks function and behave.
🔑 Key Concepts:
Concept: Description
Node (Vertex): An individual entity (e.g., person, user, page)
Edge (Link): A connection between nodes (e.g., friendship, comment, retweet)
Directed/Undirected: Directional (A → B) or mutual (A — B) relationships
Path: A sequence of nodes connected by edges
Graph: A visual or mathematical representation of the network
Subgraph: A smaller portion of the network
Component: A maximal group of nodes in which every node can reach every other by some path
Bridge: An edge whose removal increases the number of components
📏 Key Measures:
Measure: Explanation (Purpose)
Degree Centrality: Number of connections a node has (measures popularity)
Closeness Centrality: How fast a node can reach others (measures reachability)
Betweenness Centrality: Number of times a node lies on the shortest path between other nodes (finds bridges or
brokers)
Eigenvector Centrality: Influence of a node based on connected nodes' importance (identifies power users)
Density: Ratio of actual to possible edges (measures network cohesion)
Clustering Coefficient: How connected a node’s neighbors are to each other (identifies communities)
Average Path Length: Average steps between nodes (measures network efficiency)
Diameter: Longest shortest path (indicates size of the network)
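Three of the centrality measures can be sketched in plain Python. Betweenness here follows Brandes' algorithm for unweighted graphs; normalization conventions differ between tools, so the raw numbers may not match what a given library reports. The graph is made up.

```python
from collections import deque

# Made-up undirected network stored as adjacency sets.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
n = len(graph)

# Degree centrality: share of the other nodes a node is connected to.
degree_centrality = {v: len(nbrs) / (n - 1) for v, nbrs in graph.items()}


def bfs_distances(graph, source):
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


# Closeness centrality: (n - 1) divided by the sum of distances to all others.
closeness = {v: (n - 1) / sum(bfs_distances(graph, v).values()) for v in graph}


def betweenness(graph):
    """Unnormalized betweenness for an undirected, unweighted graph (Brandes)."""
    bc = {v: 0.0 for v in graph}
    for s in graph:
        stack, queue = [], deque([s])
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: b / 2 for v, b in bc.items()}  # each pair was counted twice


bc = betweenness(graph)
print(degree_centrality["C"], closeness["C"], bc["C"], bc["D"])
```

C sits on the shortest paths for four node pairs (A-D, A-E, B-D, B-E) and D for three, which matches the idea of "bridge" nodes.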
2️⃣ Discussion Networks
These networks represent how people interact through discussions—common in forums, Q&A platforms, or online
groups.
🧠 Properties:
Nodes = users
Edges = reply, mention, or quote
Often threaded (tree-like structures)
Time-based evolution: discussions grow and decay
🔍 Use in SNA:
Track active users
Identify discussion leaders
Monitor topic diffusion (how discussions spread)
Detect polarization or agreement groups
3️⃣ Blogs and Online Communities
✍️Blogs:
Personal or topical web pages where individuals share opinions, stories, or information.
In SNA:
Nodes = blog authors or blog pages
Edges = comments, links, references
Blogosphere is a complex network of interlinked blogs
Uses: 1) Analyze influencer bloggers 2) Study opinion leaders 3) Track topic propagation
👨👩👧👦 Online Communities: Groups formed around common interests or goals on platforms like:
Reddit, Quora, Discord, Stack Exchange, Facebook Groups
SNA Applications:
Detect community leaders
Understand group dynamics
Track engagement patterns
Study emergence of trust or conflict
4️⃣ Web-Based Networks
Type (Example): Nodes; Edges
Hyperlink Network (websites linking to each other): Nodes = web pages; Edges = hyperlinks
Search Engine Graph (Google index): Nodes = keywords/pages; Edges = relevance
Social Media Graph (Twitter, Facebook): Nodes = users; Edges = follows, likes, shares
Citation Networks (Google Scholar, ResearchGate): Nodes = papers; Edges = citations
🔍 SNA Applications:
PageRank algorithm (used by Google)
Discover important websites
Identify influential users/content
Detect spam or fake link farms
Map information flow online
Unit-2
🔍 Visualizing Online Social Networks: Visualization helps reveal patterns, communities, influencers, and
interactions in social networks. It's crucial for understanding the structure and flow of information.
📚 1. A Taxonomy of Visualizations: A taxonomy means classification. Visualizations of social networks can be
grouped based on structure, size, and layout:
Category: Description
Node-Link Diagrams: Nodes as points, links as lines (most common)
Matrix-Based Representations: Nodes in rows/columns, cells indicate connections
Hybrid Representations: Combine node-link and matrix for complex networks
Timeline/Temporal: Shows evolution of relationships over time
Geo-Spatial: Maps social connections to real-world geography
Semantic-based: Shows meaning/contexts between entities using ontologies
📈 2. Graph Representation: Social networks are represented as graphs:
G = (V, E), where: 1) V = Nodes (users, posts, pages) 2) E = Edges (follows, likes, mentions)
Types: 1) Directed: Follower-followee 2) Undirected: Mutual friends 3) Weighted: Edges carry a strength value
(e.g., number of interactions)
⭐ 3. Centrality: Measures the importance of a node:
Type: Meaning
Degree Centrality: Most connected
Closeness Centrality: Fastest to reach others
Betweenness Centrality: Connects different parts
Eigenvector Centrality: Connected to other important nodes
🧩 4. Clustering: Clustering refers to grouping similar nodes: 1) Communities: Densely connected groups
2) Methods: Modularity maximization (e.g., the Louvain method), K-Means (for node feature vectors)
Visualized by different colors or shapes of node groups.
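Modularity, the score that methods like Louvain try to maximize, can be computed directly for a given grouping. This sketch uses the standard formula Q = sum over communities of (internal edges / m) minus (degree sum / 2m) squared, on a made-up graph of two triangles joined by one edge; it only scores a partition and does not search for the best one.

```python
def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    internal = {}    # edges with both ends inside the community
    degree_sum = {}  # total degree of the community's nodes
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in degree_sum.items()
    )


# Two triangles (0-1-2 and 3-4-5) connected by the edge 2-3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "left", 1: "left", 2: "left", 3: "right", 4: "right", 5: "right"}
one_group = {node: "all" for node in range(6)}

q_two = modularity(edges, two_groups)
q_one = modularity(edges, one_group)
print(round(q_two, 4), q_one)
```

Splitting along the bridge gives Q = 5/14 (about 0.357), while lumping everything into one community gives 0, so the two-community view is preferred.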
🔵 5. Node-Edge Diagrams: This is the classic graph layout: 1) Nodes: Circles or icons 2) Edges: Lines (with arrows
if directed) 3) Node size = centrality 4) Edge thickness = interaction strength
Used in tools like: 1) Gephi 2) Cytoscape 3) Graphviz 4) NetworkX (Python)
🔳 6. Matrix-Based Representations: Represented as adjacency matrix
Rows and columns = nodes
Cell (i, j) = 1 if connection exists, else 0
Useful for: 1. Dense graphs 2. Pattern recognition 3. Machine learning models
Drawback: Less intuitive than graphs for humans.
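Building the adjacency matrix from an edge list takes only a few lines. The node names and edges are made up, and `adjacency_matrix` is a helper written for this example.

```python
def adjacency_matrix(nodes, edges, directed=False):
    """Return a list-of-lists matrix with 1 where an edge exists, else 0."""
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for u, v in edges:
        matrix[index[u]][index[v]] = 1
        if not directed:
            matrix[index[v]][index[u]] = 1  # mirror the entry for mutual ties
    return matrix


nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("A", "C"), ("C", "D")]
matrix = adjacency_matrix(nodes, edges)
for name, row in zip(nodes, matrix):
    print(name, row)

degrees = [sum(row) for row in matrix]  # row sums give node degrees
```

For an undirected graph the matrix is symmetric, and each row sum equals that node's degree.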
🔗 7. Node-Link Diagrams: 1. Most intuitive 2. Visual map of how individuals are connected
3. Issues: Gets cluttered for large networks
Use in:
Facebook friendship maps
LinkedIn connection graphs
🧬 8. Hybrid Representations
Combine matrix + node-link:
Useful when analyzing both structure and connection strength
Example: Start with matrix for computation, switch to node-link for visualization
📊 9. Modeling and Aggregating Social Network Data
Aggregation = Combining data across multiple users or platforms to get summary networks.
Modeling involves: 1. Dynamic graphs 2. Multiplex networks 3. Heterogeneous networks
Tools used: NetworkX, SNAP, Neo4j, Apache Spark GraphFrames
🎲 10. Random Walks and Their Applications
A random walk is a process where a walker starts at a node and repeatedly moves to a randomly chosen neighbor
of its current node.
Applications: 1. PageRank (used by Google) 2. Community detection 3. Recommendation systems
4. Sampling large graphs
Mathematically, it models how information or influence spreads.
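The random-walk idea behind PageRank can be sketched with power iteration: each node's score is the long-run chance that a random surfer is there. The damping value 0.85 is the commonly quoted default, the follow graph is invented, and every followed user is assumed to also appear as a key in the dictionary.

```python
def pagerank(graph, damping=0.85, iterations=100):
    """Power-iteration PageRank on a dict mapping node -> list of out-links."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # With probability (1 - damping) the walker jumps to a random node.
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            out_links = graph[v]
            if out_links:
                share = damping * rank[v] / len(out_links)
                for w in out_links:
                    new_rank[w] += share
            else:
                # Dangling node: spread its score evenly over all nodes.
                for w in nodes:
                    new_rank[w] += damping * rank[v] / n
        rank = new_rank
    return rank


# Hypothetical "who follows whom" graph.
follows = {
    "ann": ["cat"],
    "bob": ["cat"],
    "cat": ["ann"],
    "dan": ["cat", "bob"],
}
rank = pagerank(follows)
top_user = max(rank, key=rank.get)
print(top_user, round(sum(rank.values()), 6))
```

"cat" ends up on top because three accounts point to her, and "ann" follows closely because cat's only link goes to her.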
☁️11. Use of Hadoop and MapReduce
When networks are too large for a single machine (like Facebook’s graph), we use distributed computing.
Hadoop: A framework to store and process big data.
MapReduce: A programming model for parallel processing.
Use in SNA:
Compute centrality across billions of nodes
Aggregate social media data
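Detect trending communities or influencers with frequent batch jobs (MapReduce is batch-oriented, so true
real-time detection usually needs a streaming engine on top)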
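The MapReduce pattern can be imitated in plain Python to show the idea. This is not the Hadoop API, just the same three phases (map, shuffle, reduce) run on one machine; the function names and edge list are made up for the example.

```python
from collections import defaultdict


def map_edge(edge):
    """Map phase: emit (node, 1) for both endpoints of an edge."""
    u, v = edge
    yield (u, 1)
    yield (v, 1)


def shuffle(pairs):
    """Shuffle phase: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_degree(node, values):
    """Reduce phase: sum the counts to get the node's degree."""
    return node, sum(values)


edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
mapped = [pair for edge in edges for pair in map_edge(edge)]
degrees = dict(reduce_degree(k, v) for k, v in shuffle(mapped).items())
print(degrees)
# → {'A': 2, 'B': 2, 'C': 3, 'D': 1}
```

On a real cluster the map and reduce calls run in parallel on different machines, and the framework handles the shuffle.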
🌐 12. Ontological Representation of Social Individuals and Relationships
Ontology = a formal structure that defines: 1. Entities 2. Relationships 3. Properties
Used in Semantic Web: 1. RDF 2. OWL 3. SPARQL
Helps in: 1. Making social media machine-readable 2. Enabling smart recommendation systems
3. Linking user data across platforms
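The triple idea behind RDF can be illustrated with plain Python tuples of (subject, predicate, object). This is only a toy model: it is not RDF syntax or SPARQL, and the names, predicates, and helper functions are invented for the example.

```python
# Toy "knowledge base" of social facts as (subject, predicate, object) triples.
triples = [
    ("alice", "type", "Person"),
    ("bob", "type", "Person"),
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "memberOf", "chess_club"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return triples matching the given pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for s, p, o in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


def friends_of_friends(triples, person):
    """People reachable in exactly two 'knows' steps, excluding direct friends."""
    direct = {o for _, _, o in match(triples, subject=person, predicate="knows")}
    second = {
        o for f in direct for _, _, o in match(triples, subject=f, predicate="knows")
    }
    return sorted(second - direct - {person})


people = sorted(s for s, _, _ in match(triples, predicate="type", obj="Person"))
print(people)
print(friends_of_friends(triples, "alice"))
```

Pattern matching with wildcards is the same basic idea a SPARQL query expresses, and it is what lets a recommender suggest "carol" to "alice".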