Developing Prediction App with Streamlit
Streamlit is a Python library for data scientists, data analysts, and other developers who do not focus on web development but need an interface.
Contents:
- Why Streamlit?
- Streamlit 101
- Models development
- App development for models
- Model comparison
- Selecting and plotting the best model
Why Streamlit?
- It can be used to turn data science projects into web applications. For example, say we want to run a prediction model. We have written all the AI code, but steps such as deployment and building an API still remain. With Streamlit, you can reduce all of this to the press of a “Predict” button. If we think of the .py code we wrote as the backend, Streamlit gives us the opportunity to build its frontend.
- We can make data visualization dashboards. Wouldn’t it be nice to create your own BI tool this way?
- Streamlit is not limited to AI, ML, and DL models. Let’s say we do a simple EDA and find some notable insights. Now it is time to explain them to stakeholders, but our workspace is code hell. Streamlit lets us show our non-technical colleagues the final output in an understandable way.
- Beyond that, it is up to your imagination. In short, almost anything done with Python can be presented as a dynamic web page.
Let’s start by creating the env
1- Create an empty env
conda create --name streamlit_env
2- Activate it
conda activate streamlit_env
3- Install the prerequisites
pip install streamlit
4- Switch to the project directory and run streamlit.py. (For now, our screen will be blank.)
conda activate streamlit_env
cd pred_w_streamlit
streamlit run streamlit.py
Streamlit 101: Basic Commands
Before we get started, let’s get a little familiar with the main commands.
- Adding title
# Import module
import streamlit as st

# Add title
st.title("Hello, let's build an app!")
Output:
- Write: st.write() works like Python’s print function.
st.write("We will predict something and show it in the app.")
st.write(data["Date"].head())
Output:
- Adding images:
from PIL import Image

img = Image.open("st_logo.png")
st.image(img, width=300)
Output:
- Adding radio button:
status = st.radio("Select animal: ", ("Cat", "Dog"))

# Conditional statement to print Cat or Dog
if status == "Cat":
    st.success("Cat")
else:
    st.success("Dog")
Output:
In addition to these, we can add other widgets such as a slider, button, select box, sidebar, date input, and color picker.
For more examples, you can check the Streamlit cheat sheet.
Note: Once you save a change in the .py file, Streamlit detects it and offers a Rerun option in the upper right corner of the web page, so you can see the changes instantly.
That’s enough information for now. In order to understand what we can do in practice, we will set up and visualize simple models with randomly prepared data.
The models will be Linear Regression, XGB Regressor and LGBM Regressor, respectively. They will predict the next 20 days by looking at historical data. We will then compare and visualize the performances of these 3 models.
Note: Our focus here is to understand the capabilities of Streamlit, not model development.
1- Read data and prepare models
- Let’s start by importing the libraries.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import plotly_express as px
import streamlit as st
from sklearn import preprocessing
from sklearn.linear_model import LinearRegression
from xgboost import XGBRegressor
from xgboost import plot_importance
from sklearn.preprocessing import MinMaxScaler
import lightgbm as lgb
from lightgbm import LGBMRegressor
from sklearn.metrics import r2_score
from sklearn.metrics import mean_squared_error
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import GridSearchCV
- Let’s read the data and perform the preprocessing steps. I do not focus much on these stages because the main topic of this article is Streamlit.
# Read data from csv.
data = pd.read_csv('/Users/sk/pred_w_streamlit/streamlit_data.csv', sep=',')

# Change data type of date column. (object to datetime)
data['Date'] = pd.to_datetime(data['Date'], format='%Y-%m-%d')

# Sort data by date column.
data.sort_values(by=['Date'], inplace=True)

# Create new time columns.
data['day_of_week'] = data['Date'].dt.dayofweek
data['day_of_month'] = data['Date'].dt.day
data['month'] = data['Date'].dt.month
# Series.dt.week was removed in pandas 2.0; isocalendar().week replaces it.
data['week_of_year'] = data['Date'].dt.isocalendar().week.astype(int)
# Season index: Dec-Feb -> 1, Mar-May -> 2, Jun-Aug -> 3, Sep-Nov -> 4
data['season'] = (data['Date'].dt.month % 12 + 3) // 3

# Encode col1, col2, col3 variables.
le = preprocessing.LabelEncoder()
data['col1'] = le.fit_transform(data['col1'])
data['col2'] = le.fit_transform(data['col2'])
data['col3'] = le.fit_transform(data['col3'])
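To see what the time features look like, here is a small sketch on four hypothetical dates (one per season, not taken from the real dataset). It applies the same day-of-week and season expressions used above.

```python
import pandas as pd

# Hypothetical dates, chosen only to illustrate the feature values.
demo = pd.DataFrame(
    {"Date": pd.to_datetime(["2021-01-15", "2021-04-15", "2021-07-15", "2021-12-15"])}
)

# Monday is 0 and Sunday is 6.
demo["day_of_week"] = demo["Date"].dt.dayofweek
demo["month"] = demo["Date"].dt.month
# Dec, Jan, Feb -> 1; Mar to May -> 2; Jun to Aug -> 3; Sep to Nov -> 4
demo["season"] = (demo["Date"].dt.month % 12 + 3) // 3

print(demo)
```

The `% 12` step maps December to 0, so it lands in the same season bucket as January and February.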
- Define helper functions for reporting and plotting
def report_metric(pred, test, model_name):
    # Creates a report with the MAE, RMSE and R2 metrics and returns it as a df
    mae = mean_absolute_error(test, pred)
    mse = mean_squared_error(test, pred)
    rmse = np.sqrt(mse)
    r2 = r2_score(test, pred)
    metric_data = {'Metric': ['MAE', 'RMSE', 'R2'], model_name: [mae, rmse, r2]}
    metric_df = pd.DataFrame(metric_data)
    return metric_df


def plot_preds(data_date, test_date, target, pred):
    # Plots prediction vs real
    fig = plt.figure(figsize=(20, 10))
    plt.plot(data_date, target, label='Real')
    plt.plot(test_date, pred, label='Pred')
    plt.legend()
    st.pyplot(fig)
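To make the three metrics concrete, here is a small worked sketch that computes them by hand with NumPy. The actual and predicted values are made up for illustration and have nothing to do with our dataset.

```python
import numpy as np

# Hypothetical actual and predicted values, for illustration only.
y_true = np.array([10.0, 12.0, 14.0, 16.0])
y_pred = np.array([11.0, 12.0, 13.0, 18.0])

errors = y_true - y_pred  # [-1, 0, 1, -2]

# MAE: mean of the absolute errors.
mae = np.mean(np.abs(errors))

# RMSE: square root of the mean squared error.
rmse = np.sqrt(np.mean(errors ** 2))

# R2: 1 minus (residual sum of squares / total sum of squares).
ss_res = np.sum(errors ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

print(mae, rmse, r2)
```

Lower MAE and RMSE mean smaller errors, while an R2 closer to 1 means the predictions explain more of the variance in the target.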
- Before starting, determine the test period for all models and split the data into train and test sets.
# Split train test and define test period.
test_period = -20
test = data[test_period:]
train = data[:test_period]
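The negative slice keeps the split chronological: the last 20 rows become the test set and everything before them is used for training. A minimal sketch on a hypothetical 30-day frame (the `demo` data is made up) shows the effect:

```python
import pandas as pd

# Hypothetical daily data that is already sorted by date.
demo = pd.DataFrame({
    "Date": pd.date_range("2021-01-01", periods=30, freq="D"),
    "target": range(30),
})

test_period = -20
test = demo[test_period:]   # last 20 rows
train = demo[:test_period]  # everything before the last 20 rows

print(len(train), len(test))
# Every training date is earlier than every test date.
print(train["Date"].max() < test["Date"].min())
```

This only works as intended because the data was sorted by date first, which is why the sort step comes before the split.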
- Let’s prepare the models that we will run and visualize:
### Prepare for model 1 Linear Regressor
x_trainm1 = train[["col1", "col2", "col3", "day_of_week", "day_of_month", "month", "week_of_year", "season"]]
y_trainm1 = train[["target"]]
x_testm1 = test[["col1", "col2", "col3", "day_of_week", "day_of_month", "month", "week_of_year", "season"]]
y_testm1 = test[["target"]]
lr = LinearRegression()
lr.fit(x_trainm1, y_trainm1)
m1pred = lr.predict(x_testm1)
metric1 = report_metric(m1pred, y_testm1, "Linear Regression")

### Prepare for model 2 XGB Regressor
x_trainm2 = train[["col1", "col2", "col3", "day_of_week", "day_of_month", "month", "week_of_year", "season"]]
y_trainm2 = train[["target"]]
x_testm2 = test[["col1", "col2", "col3", "day_of_week", "day_of_month", "month", "week_of_year", "season"]]
y_testm2 = test[["target"]]
xgb = XGBRegressor(n_estimators=1000, learning_rate=0.05)
# Fit the model
xgb.fit(x_trainm2, y_trainm2)
# Get prediction
m2pred = xgb.predict(x_testm2)
metric2 = report_metric(m2pred, y_testm2, "XGB Regression")

### Prepare for model 3 LGBM Regressor
x_trainm3 = train[["col1", "col3", "day_of_week", "day_of_month"]]
y_trainm3 = train[["target"]]
x_testm3 = test[["col1", "col3", "day_of_week", "day_of_month"]]
y_testm3 = test[["target"]]
# Fit scaler on training data
norm = MinMaxScaler().fit(x_trainm3)
# Transform training data
x_train_normm3 = pd.DataFrame(norm.transform(x_trainm3))
# Transform testing data
x_test_normm3 = pd.DataFrame(norm.transform(x_testm3))
# The parameters below were set to the best values found during tuning.
lgb_tune = LGBMRegressor(learning_rate=0.1, max_depth=2, min_child_samples=25,
                         n_estimators=100, num_leaves=31)
lgb_tune.fit(x_train_normm3, y_trainm3)
m3pred = lgb_tune.predict(x_test_normm3)
metric3 = report_metric(m3pred, y_testm3, "LGBM Regression")
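For the third model, the scaler is fitted on the training data only and then reused for the test data. The sketch below reproduces that idea by hand with NumPy on made-up numbers. It mirrors what MinMaxScaler does with its default 0 to 1 range, but it is an illustration rather than the library code.

```python
import numpy as np

# Hypothetical feature matrices: 3 training rows and 1 test row, 2 columns.
x_train = np.array([[1.0, 10.0], [3.0, 20.0], [5.0, 30.0]])
x_test = np.array([[2.0, 40.0]])

# "Fit": learn the per-column minimum and maximum from the training data only.
col_min = x_train.min(axis=0)
col_max = x_train.max(axis=0)


def scale(x):
    # "Transform": apply the training statistics to any data.
    return (x - col_min) / (col_max - col_min)


x_train_scaled = scale(x_train)
x_test_scaled = scale(x_test)

print(x_train_scaled)
print(x_test_scaled)
```

Note that the second test value ends up above 1, because 40 is larger than anything seen during training. This is expected when the scaler is fitted on the training set alone.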
2- Develop an app to visualize models
- Let’s add a sidebar
# Create a page dropdown
page = st.sidebar.selectbox("""Hello there! I'll guide you! Please select a model""",
                            ["Main Page",
                             "Linear Regressor",
                             "XGB Regressor",
                             "LGBM Regressor",
                             "Compare Models"])
Output:
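In this article we route the selected page with an if/elif chain. As an alternative, the same routing can be written as a dictionary that maps page names to functions. The sketch below is deliberately Streamlit-free so it runs anywhere: the page functions are hypothetical and only return strings, whereas in the real app they would call st.title, st.write, and so on.

```python
# Hypothetical page functions; in the app each would render Streamlit elements.
def main_page():
    return "Main Page: overview and sales chart"


def linear_page():
    return "Linear Regressor: metrics and plot for model 1"


def compare_page():
    return "Compare Models: metrics of all models side by side"


# Map each sidebar option to the function that renders it.
PAGES = {
    "Main Page": main_page,
    "Linear Regressor": linear_page,
    "Compare Models": compare_page,
}


def render(page):
    # Look up the selected page and call its function.
    return PAGES[page]()


print(render("Linear Regressor"))
```

In the app, the options could come from `list(PAGES)` in the `st.sidebar.selectbox` call, so adding a page means adding one function and one dictionary entry.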
- Let’s design what will be shown on the main page when “Main Page” is selected from the sidebar.
if page == "Main Page":
    ### INFO
    st.title("Hello, welcome to the sales predictor!")
    st.write("""
This application predicts sales for the next 20 days with 3 different models
# Sales drivers used in prediction:
- Date: date format time feature
- col1: categorical feature
- col2: second categorical feature
- col3: third categorical feature
- target: target variable to be predicted
""")
    st.write("Let's plot sales data!")
    st.line_chart(data[["Date", "target"]].set_index("Date"))
Output
- Let’s design what will be displayed when “Linear Regressor” is selected from the sidebar. We have already defined the model and generated its predictions above. Now it is time to visualize them.
elif page == "Linear Regressor":
    # Base model, it uses linear regression.
    st.title("Model 1: ")
    st.write("Model 1 works with linear regression as base model.")
    st.write("The columns it used are: col1, col2, col3, "
             "day_of_week, day_of_month, month, week_of_year, season")
    st.write(metric1)
    st.markdown("### Real vs Pred. Plot for 1. Model")
    plot_preds(data["Date"], test["Date"], data["target"], m1pred)
Output:
- Let’s continue with the other models. The second model is the XGB Regressor.
elif page == "XGB Regressor":
    # Model 2
    st.title("Model 2: ")
    st.write("Model 2 works with XGB Regressor.")
    st.write("The columns it used are: col1, col2, col3, "
             "day_of_week, day_of_month, month, week_of_year, season")
    st.write(metric2)
    st.markdown("### Real vs Pred. Plot for 2. Model")
    plot_preds(data["Date"], test["Date"], data["target"], m2pred)
Output:
- Our last model is the LGBM Regressor.
elif page == "LGBM Regressor":
    # Model 3
    st.title("Model 3: ")
    st.write("Model 3 works with LGBM Regressor.")
    st.write("The columns it used are: col1, col3, day_of_week, day_of_month")
    st.write(metric3)
    st.markdown("### Real vs Pred. Plot for 3. Model")
    plot_preds(data["Date"], test["Date"], data["target"], m3pred)
Output:
- We have created 3 pages for the three models. Now we may want to compare them in the app. Let’s show the performances of the models on one page and plot the best model.
elif page == "Compare Models":
    # Compare models.
    st.title("Compare Models: ")
    all_metrics = metric1.copy()
    all_metrics["XGB Regression"] = metric2["XGB Regression"].copy()
    all_metrics["LGBM Regression"] = metric3["LGBM Regression"].copy()
    st.write(all_metrics)

    # Best Model
    st.title("Best Model /XGB Regressor: ")
    st.write("Let's plot the best model's predictions in detail.")

    # Plot best model results.
    plot_preds(test["Date"], test["Date"], test["target"], m2pred)

    # Show row-based best result and real
    st.write("Best Model Predictions vs Real")
    best_pred = pd.DataFrame(test[["target"]].copy())
    best_pred["pred"] = m2pred
    st.write(best_pred)
Output:
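On the comparison page we named XGB Regressor as the best model by hand. If you prefer to pick it programmatically, one possible approach is to look up the model with the lowest RMSE in the metrics table. The numbers below are made up to keep the sketch self-contained. They are not the results of our models.

```python
import pandas as pd

# Made-up metric values in the same layout as all_metrics; not real results.
all_metrics = pd.DataFrame({
    "Metric": ["MAE", "RMSE", "R2"],
    "Linear Regression": [12.0, 15.0, 0.60],
    "XGB Regression": [8.0, 10.0, 0.82],
    "LGBM Regression": [9.0, 11.5, 0.78],
})

# Use the metric names as the index so each row can be looked up by name.
scores = all_metrics.set_index("Metric")

# The best model is the column with the smallest RMSE.
best_model = scores.loc["RMSE"].idxmin()

print(best_model)
```

Choosing by RMSE is just one option. The same lookup works with `scores.loc["MAE"].idxmin()` or `scores.loc["R2"].idxmax()`.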
- We can also add links and emojis.
link = 'Made by [Sengul Karaderili](https://github.com/Sk1613/)'
st.markdown(link, unsafe_allow_html=True)
st.write("💛")
Output:
Let’s see the final product:
https://github.com/Sk1613/prediction_w_streamlit/blob/main/streamlit-app.mov
You can use this repo to access the full code.
Thank you for reading. You can contact me if you have anything to add or any feedback.