Sentiment analysis is the automated process of tagging text according to its emotional tone—positive, negative, or neutral. It enables companies to analyze data at scale, uncover insights, and automate workflows.
In the past, sentiment analysis was reserved for researchers and machine learning engineers with deep NLP expertise. But recent advances from the AI community have democratized access, allowing anyone to perform sentiment analysis with just a few lines of code.
This guide covers everything you need to get started:
- What is sentiment analysis?
- Using pre-trained models with Python
- Building your own model
- Analyzing tweets
What is Sentiment Analysis?
Sentiment analysis is a natural language processing technique that identifies the polarity of text. The most common approach labels data as positive, negative, or neutral. For example:
- "dear @verizonsupport your service is straight 💩 in dallas..." → Negative
- "@verizonsupport ive sent you a dm" → Neutral
- "thanks to michelle et al at @verizonsupport who helped push my no-show-phone problem along..." → Positive
This technique processes data in real time, enabling you to analyze thousands of tweets, reviews, or support tickets automatically. Common applications include:
- Monitoring social media mentions to compare brand sentiment against competitors.
- Extracting insights from surveys and product reviews.
- Flagging angry support tickets in real time to prevent churn.
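As a concrete illustration of the last use case, flagging can be a simple filter over model predictions. This is a minimal sketch: the flag_angry_tickets helper and the 0.9 threshold are hypothetical, and it assumes predictions in the {'label', 'score'} format returned by the transformers sentiment-analysis pipeline.

```python
def flag_angry_tickets(tickets, predictions, threshold=0.9):
    """Return tickets predicted NEGATIVE with high confidence.

    tickets: list of ticket texts.
    predictions: one {'label': ..., 'score': ...} dict per ticket,
    as returned by a transformers sentiment-analysis pipeline.
    """
    return [
        ticket
        for ticket, pred in zip(tickets, predictions)
        if pred["label"] == "NEGATIVE" and pred["score"] >= threshold
    ]

# Illustrative tickets and predictions (not real model output)
tickets = ["My order never arrived and support ignores me!", "Thanks, issue resolved."]
predictions = [
    {"label": "NEGATIVE", "score": 0.98},
    {"label": "POSITIVE", "score": 0.99},
]
print(flag_angry_tickets(tickets, predictions))
# → ['My order never arrived and support ignores me!']
```

In production, the predictions list would come from running the pipeline over incoming tickets on a schedule or a stream.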
Using Pre-trained Models with Python
You can tap into state-of-the-art sentiment analysis models from the Hugging Face Hub with just five lines of code. The Hub hosts over 215 public sentiment models, many with browser-based demos.
pip install -q transformers
from transformers import pipeline
sentiment_pipeline = pipeline("sentiment-analysis")
data = ["I love you", "I hate you"]
sentiment_pipeline(data)
This uses the default sentiment model and returns:
[{'label': 'POSITIVE', 'score': 0.9998},
{'label': 'NEGATIVE', 'score': 0.9991}]
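Each result pairs a label with a confidence score between 0 and 1, and results come back in input order, so they can be zipped back to the inputs. A minimal sketch using the hard-coded output shown above:

```python
data = ["I love you", "I hate you"]
# The pipeline's output for `data`, copied from above
results = [
    {"label": "POSITIVE", "score": 0.9998},
    {"label": "NEGATIVE", "score": 0.9991},
]

# Pair each input text with its predicted label
labeled = {text: result["label"] for text, result in zip(data, results)}
for text, result in zip(data, results):
    print(f"{text!r} -> {result['label']} ({result['score']:.2%})")
# 'I love you' -> POSITIVE (99.98%)
# 'I hate you' -> NEGATIVE (99.91%)
```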
To use a model tailored for tweets, specify its ID:
specific_model = pipeline(model="finiteautomata/bertweet-base-sentiment-analysis")
specific_model(data)
Other recommended models, fine-tuned for specific domains and languages, can be browsed on the Hugging Face Hub.
Building Your Own Model
You can fine-tune a pre-trained model on your own dataset using Python or AutoNLP. Fine-tuning adapts the model to specific domains (e.g., product reviews). Here's a minimalist fine-tuning example:
from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments
training_args = TrainingArguments(output_dir="./results", num_train_epochs=3)
trainer = Trainer(
    model=AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=3),
    args=training_args,
    train_dataset=train_dataset,  # tokenized datasets prepared beforehand,
    eval_dataset=eval_dataset,    # e.g. with AutoTokenizer and datasets.map
)
trainer.train()
AutoNLP provides a no-code alternative: upload your dataset, and it automatically trains and deploys a model.
Analyzing Tweets
To analyze tweets, follow these steps:
- Install dependencies: pip install tweepy transformers
- Set up Twitter API credentials (consumer key, consumer secret, access token, access token secret).
- Search for tweets using Tweepy:
import tweepy
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
tweets = api.search(q="@VerizonSupport", count=100)  # in Tweepy v4+, this method is api.search_tweets
- Run sentiment analysis on the tweet texts using the pipeline from earlier.
- Explore the results: compute aggregate sentiment, visualize distributions, or identify trends.
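The last two steps can be sketched with the standard library alone. The predictions below are illustrative stand-ins for what the sentiment pipeline would return for the fetched tweets, in its {'label', 'score'} format:

```python
from collections import Counter

# One prediction per tweet, as returned by the sentiment pipeline (illustrative values)
predictions = [
    {"label": "NEGATIVE", "score": 0.95},
    {"label": "NEGATIVE", "score": 0.88},
    {"label": "POSITIVE", "score": 0.97},
]

# Aggregate label counts and report the sentiment distribution
counts = Counter(pred["label"] for pred in predictions)
total = sum(counts.values())
for label, count in counts.most_common():
    print(f"{label}: {count} ({count / total:.0%})")
# NEGATIVE: 2 (67%)
# POSITIVE: 1 (33%)
```

The same counts can feed a bar chart or a time series of sentiment per day to surface trends.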
Wrapping Up
You now have the tools to perform sentiment analysis in Python, from using pre-trained models to building custom ones and analyzing live Twitter data. The Hugging Face ecosystem makes it easy to get started with minimal code.
For a hands-on notebook, check out the Colab demo.