Plotly - Basics

Posted on 2023-08-03 Edited on 2024-01-27

If you work with massive datasets, you can use a library like Plotly, which is designed to handle them well.

Before Doing Data Visualization

Find Datasets

Prepare the Environment

Install packages
- Plotly
- Pandas
- JupyterLab

$ pip install plotly==5.15.0
$ pip install pandas
$ pip install jupyterlab
$ jupyter lab

Load Dataset

📘 Download World Earthquake Data From 1906-2022

head()
- Shows the first n rows (5 by default)
tail()
- The counterpart of head()
- Shows the last n rows (5 by default) of the dataframe object
info()
- Prints out a concise summary of the dataframe, including information about the index, data types, columns, non-null values, and memory usage
describe()
- Generates descriptive statistics, including those that summarize the central tendency, dispersion, and shape of the dataset’s distribution

import pandas as pd

df = pd.read_csv('data.csv')

# Shows the first 5 rows
print(df.head())

# Shows the last 5 rows
print(df.tail())

# Concise summary of the dataframe
# (info() prints directly and returns None, so no print() is needed)
df.info()

# Descriptive statistics
print(df.describe())

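# Extract the year from `time` into a new `Year` column
df['Year'] = pd.to_datetime(df['time']).dt.year

# Take the text after the last comma in `place` as a new `Country` column
# (strip() removes the space that follows the comma)
df["Country"] = df["place"].str.split(pat=',', expand=False).str.get(-1).str.strip()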

After collection, most data requires some degree of cleaning or reformatting before it can be analyzed or used to create visualizations.
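
As a rough illustration of what such cleaning can look like, here is a minimal sketch with pandas. The sample rows are made up for the example and only assume the `time`, `place`, and `mag` columns used above; the specific steps (dropping duplicates, coercing bad dates, dropping rows with missing values) are one reasonable choice, not the only one.

```python
import pandas as pd

# Hypothetical sample rows that mimic the `time`, `place`, and `mag` columns
raw = pd.DataFrame({
    "time": ["2020-01-28T19:10:24Z", "2021-07-29T06:15:49Z",
             "2021-07-29T06:15:49Z", "not a date"],
    "place": ["123 km NNW of Lucea, Jamaica", "104 km SE of Perryville, Alaska",
              "104 km SE of Perryville, Alaska", "somewhere"],
    "mag": [7.7, 8.2, 8.2, None],
})

# Remove exact duplicate rows
clean = raw.drop_duplicates().copy()

# Parse timestamps; values that cannot be parsed become NaT instead of raising
clean["time"] = pd.to_datetime(clean["time"], errors="coerce", utc=True)

# Drop rows that lack a usable timestamp or magnitude
clean = clean.dropna(subset=["time", "mag"])

# Derive the helper columns used in the charts
clean["Year"] = clean["time"].dt.year
clean["Country"] = clean["place"].str.split(",").str.get(-1).str.strip()

print(clean[["Year", "Country", "mag"]])
```

Whether to drop or fill missing values depends on the dataset, so treat the `dropna` call as an assumption to revisit.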

Getting Started - Plotly

Line Charts

Line charts are used to convey changes over time.

import plotly.express as px

df = px.data.gapminder().query("country=='Canada'")
fig = px.line(df, x="year", y="lifeExp", title='Life expectancy in Canada')
fig.show()

import plotly.express as px

df = px.data.gapminder().query("continent=='Oceania'")
fig = px.line(df, x="year", y="lifeExp", color='country')
fig.show()

Histogram

Use a histogram to visualize the frequency distribution of a single quantitative variable.
The values are grouped into bins, and the height of each bar shows how many observations fall into that bin.

import plotly.express as px

df = px.data.tips()
fig = px.histogram(df, x="total_bill")
fig.show()

Bar Charts

A bar chart is a graphical representation of categorical data: each bar stands for one category, and its height shows the value for that category.

import plotly.express as px

long_df = px.data.medals_long()
fig = px.bar(long_df, x="nation", y="count", color="medal", title="Long-Form Input")
fig.show()

Scatter Plots

To highlight the relationship or correlation between two variables (e.g. marketing spend and revenue, or hours of weekly exercise and cardiovascular fitness), use a scatter plot: it shows at a glance whether one variable tends to increase or decrease as the other increases.

import pandas as pd
import plotly.express as px

df = pd.read_csv('data.csv')
df['Year'] = pd.to_datetime(df['time']).dt.year
fig = px.scatter(df, x="Year", y="mag")
fig.show()

Pie Chart

import plotly.express as px

df = px.data.tips()
fig = px.pie(df, values='tip', names='day')
fig.show()

Maps

import pandas as pd
import plotly.express as px

df = pd.read_csv('data.csv')

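# Draw a density map of earthquakes with magnitude >= 7
# ("open-street-map" needs no token; the Stamen styles now require a Stadia Maps API key)
fig = px.density_mapbox(df.query("mag >= 7"), lat='latitude', lon='longitude', z='mag', radius=10,
                        center=dict(lat=0, lon=180), zoom=0, mapbox_style="open-street-map")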
fig.show()

import pandas as pd
import plotly.express as px

df = pd.read_csv('data.csv')

# Pre-process `place`: keep the text after the last comma as `Country`
# (strip() removes the space that follows the comma)
df["Country"] = df["place"].str.split(pat=',', expand=False).str.get(-1).str.strip()

fig = px.scatter_mapbox(df.query("mag >= 7"), lat="latitude", lon="longitude", 
                        hover_name="Country", hover_data=["mag", "depth"],
                        color_discrete_sequence=["red"], zoom=3, height=300)
fig.update_layout(mapbox_style="open-street-map")
fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
fig.show()
