Understanding the local electricity market

📖 Background You work for an energy company in Australia. Your company builds solar panel arrays and then sells the energy they produce to industrial customers. The company wants to expand to the city of Melbourne in the state of Victoria.

Prices and demand for electricity change every day. Customers pay for the energy received using a formula based on the local energy market's daily price.

Your company's pricing committee wants your team to estimate energy prices for the next 12-18 months to use those prices as the basis for contract negotiations.

In addition, the VP of strategy is researching investing in storage capacity (i.e., batteries) as a new source of revenue. The plan is to store some of the energy produced by the solar panels when pricing conditions are unfavorable and sell it by the next day on the open market if the prices are higher.

💾 The data

You have access to over five years of energy price and demand data (source):

"date" - from January 1, 2015, to October 6, 2020.
"demand" - daily electricity demand in MWh.
"price" - recommended retail price in AUD/MWh.
"demand_pos_price" - total daily demand at a positive price in MWh.
"price_positive" - average positive price, weighted by the corresponding intraday demand in AUD/MWh.
"demand_neg_price" - total daily demand at a negative price in MWh.
"price_negative" - average negative price, weighted by the corresponding intraday demand in AUD/MWh.
"frac_neg_price" - the fraction of the day when the demand traded at a negative price.
"min_temperature" - minimum temperature during the day in Celsius.
"max_temperature" - maximum temperature during the day in Celsius.
"solar_exposure" - total daily sunlight energy in MJ/m^2.
"rainfall" - daily rainfall in mm.
"school_day" - "Y" if that day was a school day, "N" otherwise.
"holiday" - "Y" if the day was a state or national holiday, "N" otherwise.

Note: The price was negative during some intraday intervals, so energy producers were paying buyers rather than vice-versa.

import numpy as np 
import pandas as pd 
import seaborn as sns
import matplotlib.pyplot as plt
import xgboost as xgb
from xgboost import plot_importance, plot_tree
from sklearn.metrics import mean_squared_error, mean_absolute_error
from sklearn.model_selection import RandomizedSearchCV
# --- Set defaults of the notebook ---
sns.set(font="'Source Code Pro', monospace")
plt.rcParams["font.family"] = "'Source Code Pro', monospace"
import warnings
warnings.filterwarnings("ignore")
# Color Palettes
treasure_colors = ["#703728", "#c86b25", "#dc9555", "#fed56f", "#c89a37"]
pirate_colors = ["#010307", "#395461", "#449FAF", "#B1F4FC", 
                 "#F4D499", "#835211"]

df_electricity = pd.read_csv('./data/energy_demand.csv', parse_dates=['date'],index_col =["date"])
df_electricity.head()

💪 Competition challenge

Create a report that covers the following:

How do energy prices change throughout the year? Are there any patterns by season or month of the year?
Build a forecast of daily energy prices the company can use as the basis of its financial planning.
Provide guidance on how much revenue the energy storage venture could generate per year using retail prices and a 70MWh storage system.

⌛️ Exploratory Data Analysis!

color_pal = ["#F8766D", "#D39200", "#93AA00", "#00BA38", "#00C19F", "#00B9E3", "#619CFF", "#DB72FB"]
_ = df_electricity["price"].plot(style='.', figsize=(15,5), color=color_pal[0], title='Daily electricity price in Victoria from January 2015 to October 2020')
plt.ylim(0,300)
plt.ylabel("Price")

Text(0, 0.5, 'Price')

Insights 📍 : The distribution of daily electricity price

Based on the plot above show:

The distribution of daily electricity price in Victoria from January 2015 to October 2020 .
There is a significant electricity price change especially in 2016 to 2020.

(df_electricity.describe()
 .style
 .highlight_max(axis=0,color="#c07fef")
 .highlight_min(axis=0,color="#00FF00")
 .set_caption("Statistics of Electricity in Australia for 2015-2020"))

fig, ax = plt.subplots(4, 1, figsize = (15, 20))
ax[0].tick_params(axis='both', which='major', labelsize=13)
ax[1].tick_params(axis='both', which='major', labelsize=13)
ax[2].tick_params(axis='both', which='major', labelsize=13)
ax[3].tick_params(axis='both', which='major', labelsize=13)


df_electricity.reset_index().plot(x='date', y='demand_pos_price', lw = 1, ax = ax[0])
ax[0].set_title("Daily electricity demand in Victoria from January 2015 to October 2020")
ax[0].set_ylabel("Demand Positive Price [MWh]")
ax[0].get_legend().remove()


df_electricity.reset_index().plot(x='date', y='demand_neg_price', lw = 1, color='red', ax = ax[1])
ax[1].set_ylabel("Demand Negative Price [MWh]")
ax[1].get_legend().remove()


df_electricity.reset_index().plot(x='date', y='price_positive', lw = 1, color='red', ax = ax[2])
ax[2].set_ylabel("Price Positive [AUD$/MWh]")
ax[2].get_legend().remove()


df_electricity.reset_index().plot(x='date', y='price_negative', lw = 1, color='red', ax = ax[3])
ax[3].set_ylabel("Price Negative [AUD$/MWh]")
ax[3].get_legend().remove()

Insights 📍 : Daily electricity demand in Victoria from January 2015 to October 2020

Based on the plot above show:

Demand positive price shows some seasonality based on plot shown above</span>. Maybe this feature may benefit our model.
Demand negative price We can see there is a significant difference electricity price through maximum price which is very far from the mean and standard deviation.
Positive price mostly with small values except for some values that is greater than 1000.
Price negative ocurred in each year with variation of values.

Make a correlation plot

def heatmap(x, y, size):
    fig, ax = plt.subplots()
    
    # Mapping from column names to integer coordinates
    x_labels = [v for v in sorted(x.unique())]
    y_labels = [v for v in sorted(y.unique())]
    x_to_num = {p[1]:p[0] for p in enumerate(x_labels)} 
    y_to_num = {p[1]:p[0] for p in enumerate(y_labels)} 
    
    size_scale = 500
    ax.scatter(
        x=x.map(x_to_num), # Use mapping for x
        y=y.map(y_to_num), # Use mapping for y
        s=size * size_scale, # Vector of square sizes, proportional to size parameter
        marker='s' # Use square as scatterplot marker
    )
    
    # Show column labels on the axes
    ax.set_xticks([x_to_num[v] for v in x_labels])
    ax.set_xticklabels(x_labels, rotation=45, horizontalalignment='right')
    ax.set_yticks([y_to_num[v] for v in y_labels])
    ax.set_yticklabels(y_labels)
    
data = df_electricity
columns = df_electricity.columns
corr = data[columns].corr()
corr = pd.melt(corr.reset_index(), id_vars='index') # Unpivot the dataframe, so we can get pair of arrays for x and y
corr.columns = ['x', 'y', 'value']
heatmap(
    x=corr['x'],
    y=corr['y'],
    size=corr['value'].abs()
)

Insights 📍 : Correlation Matrix

Based on the plot above show:

Price target has higher correlation with price_positive, demand, demand_pos_price and frac_negative_price.

plt.figure(figsize=(18, 12))
for i, (combi, df) in enumerate(df_electricity.groupby([df_electricity.index.year])):
    ax = plt.subplot(2, 3, i+1, ymargin=0.5)
    #print(df.num_sold.values.shape, df.num_sold.values)
    ax.plot(df.price)
    ax.set_title(combi)
    plt.xticks(rotation=45)
plt.tight_layout(h_pad=3.0)
plt.suptitle('Energy prices change for 2015-2020', y=1.03)
plt.show()

Insights 📍 : Electricity price change

Based on the plot above show:

There is a significant increase of electricity price in December 2015 after a stable price during January-September.> 2. The electricity price tends to be stable in range 50-150 AUD except in February where there is an increase of energy price.> 2. For 2016, significant increase of price occured in January and March followed by stable price during April-December> 3. For 2018-2019, There is a significant increase of electricity price around January-February.

plt.figure(figsize=(18, 12))
for i, (combi, df) in enumerate(df_electricity.groupby([df_electricity.index.year,"holiday"])):
    ax = plt.subplot(3,4, i+1, ymargin=0.5)
    #print(df.num_sold.values.shape, df.num_sold.values)
    ax.plot(df.price)
    ax.set_title(combi)
    plt.xticks(rotation=45)
plt.suptitle('Energy prices change for 2015-2020 on holiday Season', y=1.03)

Text(0.5, 1.03, 'Energy prices change for 2015-2020 on holiday Season')

Insights 📍 : Electricity prices change for 2015-2020 on holiday Season

Based on the plot above show:

There is an unstable price pattern of energy price in holiday season.> 2. For annual Non-Holiday Season, There is an increase of energy price in the beginning of the year

plt.figure(figsize=(18, 12))
for i, (combi, df) in enumerate(df_electricity.groupby([df_electricity.index.year,"school_day"])):
    ax = plt.subplot(3,4, i+1, ymargin=0.5)
    #print(df.num_sold.values.shape, df.num_sold.values)
    ax.plot(df.price)
    ax.set_title(combi)
    plt.xticks(rotation=40)
plt.suptitle('Energy prices change for 2015-2020 on School Season', y=1.03)

Text(0.5, 1.03, 'Energy prices change for 2015-2020 on School Season')

Insights 📍 : Electricity price change for 2015-2020 on School Season

Based on the plot above show:

There is a significant increase of electricity price in the end of 2015 on school season> 2. There is a similar pattern of the increase of electricity price on school season in February-March in 2018, 2019, 2020> 2. For annual non-holiday Season, There is an increase of electricity price in the beginning of the year in 2016, 2018, 2019

Splitting the data based on time

split_date = '01-Jan-2019'
df_train = df_electricity.loc[df_electricity.index <= split_date].copy()
df_test = df_electricity.loc[df_electricity.index > split_date].copy()

def create_features(df, label=None):
    """
    Creates time series features from datetime index
    """
    df['date'] = df.index
    df['hour'] = df['date'].dt.hour
    df['dayofweek'] = df['date'].dt.dayofweek
    df['quarter'] = df['date'].dt.quarter
    df['month'] = df['date'].dt.month
    df['year'] = df['date'].dt.year
    df['dayofyear'] = df['date'].dt.dayofyear
    df['dayofmonth'] = df['date'].dt.day
    df['weekofyear'] = df['date'].dt.isocalendar().week.astype(np.int64) 
    df["school_day"] = pd.get_dummies(df["school_day"])
    df["school_day"] = pd.get_dummies(df["school_day"])
    df['price_7_days_lag'] = df['price'].shift(7)
    df['price_15_days_lag'] = df['price'].shift(15)
    df['price_30_days_lag'] = df['price'].shift(30)
    df['price_7_days_mean'] = df['price'].rolling(window = 7).mean()
    df['price_15_days_mean'] = df['price'].rolling(window = 15).mean()
    df['price_30_days_mean'] = df['price'].rolling(window = 30).mean()
    df['price_7_days_std'] = df['price'].rolling(window = 7).std()
    df['price_15_days_std'] = df['price'].rolling(window = 15).std()
    df['price_30_days_std'] = df['price'].rolling(window = 30).std()
    df['price_7_days_max'] = df['price'].rolling(window = 7).max()
    df['price_15_days_max'] = df['price'].rolling(window = 15).max()
    df['price_30_days_max'] = df['price'].rolling(window = 30).max()
    df['price_7_days_min'] = df['price'].rolling(window = 7).min()
    df['price_15_days_min'] = df['price'].rolling(window = 15).min()
    df['price_30_days_min'] = df['price'].rolling(window = 30).min()
    
 
   
    cols = ['hour','dayofweek','dayofyear','quarter','month','year','dayofmonth','weekofyear','price_positive','demand_pos_price','demand','demand_neg_price','price_negative','frac_neg_price']
    #cols = ['hour','dayofweek','quarter','month','year','dayofyear','dayofmonth','weekofyear','demand_pos_price',]
    
    for d in ('7', '15', '30'):
        for c in ('lag', 'mean', 'std', 'max', 'min'):
            cols.append(f'price_{d}_days_{c}')
    X = df[cols]
    if label:
        y = df[label]
        return X, y
    return X

X_train, y_train = create_features(df_train, label='price')
X_test, y_test = create_features(df_test, label='price')

reg = xgb.XGBRegressor(n_estimators=100,eta=0.1)
fitted_xgb_model = reg.fit(X_train, y_train,
        eval_set=[(X_train, y_train), (X_test, y_test)],
        early_stopping_rounds=50,
       verbose=True)

[0]	validation_0-rmse:78.10894	validation_1-rmse:232.43314
[1]	validation_0-rmse:71.00056	validation_1-rmse:226.20828
[2]	validation_0-rmse:64.59680	validation_1-rmse:220.67780
[3]	validation_0-rmse:58.83662	validation_1-rmse:215.75482
[4]	validation_0-rmse:53.66943	validation_1-rmse:211.30708
[5]	validation_0-rmse:49.02540	validation_1-rmse:207.36438
[6]	validation_0-rmse:44.84807	validation_1-rmse:203.70424
[7]	validation_0-rmse:41.09291	validation_1-rmse:200.41937
[8]	validation_0-rmse:37.70780	validation_1-rmse:197.01787
[9]	validation_0-rmse:34.63074	validation_1-rmse:193.86409
[10]	validation_0-rmse:31.85113	validation_1-rmse:190.97063
[11]	validation_0-rmse:29.33829	validation_1-rmse:188.28787
[12]	validation_0-rmse:27.05840	validation_1-rmse:185.79985
[13]	validation_0-rmse:24.98889	validation_1-rmse:183.49159
[14]	validation_0-rmse:23.10552	validation_1-rmse:181.34113
[15]	validation_0-rmse:21.39121	validation_1-rmse:179.33682
[16]	validation_0-rmse:19.83474	validation_1-rmse:178.55098
[17]	validation_0-rmse:18.41598	validation_1-rmse:177.83699
[18]	validation_0-rmse:17.11959	validation_1-rmse:177.18620
[19]	validation_0-rmse:15.93410	validation_1-rmse:176.59129
[20]	validation_0-rmse:14.84714	validation_1-rmse:176.04675
[21]	validation_0-rmse:13.85011	validation_1-rmse:175.54684
[22]	validation_0-rmse:12.93431	validation_1-rmse:175.08743
[23]	validation_0-rmse:12.09328	validation_1-rmse:174.66530
[24]	validation_0-rmse:11.32025	validation_1-rmse:173.82794
[25]	validation_0-rmse:10.60783	validation_1-rmse:173.03954
[26]	validation_0-rmse:9.95008	validation_1-rmse:172.29640
[27]	validation_0-rmse:9.34216	validation_1-rmse:171.59558
[28]	validation_0-rmse:8.77918	validation_1-rmse:170.93398
[29]	validation_0-rmse:8.25816	validation_1-rmse:170.30927
[30]	validation_0-rmse:7.77358	validation_1-rmse:169.71977
[31]	validation_0-rmse:7.32364	validation_1-rmse:169.16220
[32]	validation_0-rmse:6.90481	validation_1-rmse:168.63498
[33]	validation_0-rmse:6.51409	validation_1-rmse:168.13646
[34]	validation_0-rmse:6.14908	validation_1-rmse:167.66449
[35]	validation_0-rmse:5.80786	validation_1-rmse:167.21773
[36]	validation_0-rmse:5.48851	validation_1-rmse:166.79488
[37]	validation_0-rmse:5.18929	validation_1-rmse:166.39388
[38]	validation_0-rmse:4.90825	validation_1-rmse:166.01436
[39]	validation_0-rmse:4.64466	validation_1-rmse:165.65491
[40]	validation_0-rmse:4.39683	validation_1-rmse:165.31377
[41]	validation_0-rmse:4.16372	validation_1-rmse:164.99037
[42]	validation_0-rmse:3.94428	validation_1-rmse:164.68390
[43]	validation_0-rmse:3.73747	validation_1-rmse:164.39325
[44]	validation_0-rmse:3.54249	validation_1-rmse:164.11774
[45]	validation_0-rmse:3.35867	validation_1-rmse:163.85646
[46]	validation_0-rmse:3.18519	validation_1-rmse:163.60869
[47]	validation_0-rmse:3.02139	validation_1-rmse:163.37366
[48]	validation_0-rmse:2.86655	validation_1-rmse:163.15067
[49]	validation_0-rmse:2.72040	validation_1-rmse:162.93970
[50]	validation_0-rmse:2.58236	validation_1-rmse:162.73958
[51]	validation_0-rmse:2.45181	validation_1-rmse:162.54944
[52]	validation_0-rmse:2.32823	validation_1-rmse:162.36912
[53]	validation_0-rmse:2.21122	validation_1-rmse:162.19797
[54]	validation_0-rmse:2.10046	validation_1-rmse:162.03554
[55]	validation_0-rmse:1.99561	validation_1-rmse:161.88142
[56]	validation_0-rmse:1.89622	validation_1-rmse:161.73517
[57]	validation_0-rmse:1.80210	validation_1-rmse:161.59613
[58]	validation_0-rmse:1.71306	validation_1-rmse:161.46413
[59]	validation_0-rmse:1.62860	validation_1-rmse:161.33887
[60]	validation_0-rmse:1.54868	validation_1-rmse:161.21994
[61]	validation_0-rmse:1.47296	validation_1-rmse:161.10706
[62]	validation_0-rmse:1.40104	validation_1-rmse:160.99974
[63]	validation_0-rmse:1.33303	validation_1-rmse:160.89801
[64]	validation_0-rmse:1.26847	validation_1-rmse:160.80133
[65]	validation_0-rmse:1.20725	validation_1-rmse:160.70961
[66]	validation_0-rmse:1.14925	validation_1-rmse:160.62248
[67]	validation_0-rmse:1.09428	validation_1-rmse:160.53976
[68]	validation_0-rmse:1.04232	validation_1-rmse:160.46124
[69]	validation_0-rmse:0.99303	validation_1-rmse:160.38670
[70]	validation_0-rmse:0.94642	validation_1-rmse:160.31592
[71]	validation_0-rmse:0.90220	validation_1-rmse:160.24872
[72]	validation_0-rmse:0.86030	validation_1-rmse:160.18489
[73]	validation_0-rmse:0.82075	validation_1-rmse:160.12427
[74]	validation_0-rmse:0.78316	validation_1-rmse:160.06668
[75]	validation_0-rmse:0.74781	validation_1-rmse:160.01205
[76]	validation_0-rmse:0.71458	validation_1-rmse:159.96014
[77]	validation_0-rmse:0.68281	validation_1-rmse:159.91087
[78]	validation_0-rmse:0.65270	validation_1-rmse:159.86403
[79]	validation_0-rmse:0.62447	validation_1-rmse:159.81957
[80]	validation_0-rmse:0.59772	validation_1-rmse:159.77733
[81]	validation_0-rmse:0.57273	validation_1-rmse:159.73723
[82]	validation_0-rmse:0.54923	validation_1-rmse:159.69914
[83]	validation_0-rmse:0.52638	validation_1-rmse:159.66295
[84]	validation_0-rmse:0.50531	validation_1-rmse:159.62856
[85]	validation_0-rmse:0.48540	validation_1-rmse:159.59593
[86]	validation_0-rmse:0.46696	validation_1-rmse:159.56493
[87]	validation_0-rmse:0.44928	validation_1-rmse:159.53546
[88]	validation_0-rmse:0.43296	validation_1-rmse:159.50746
[89]	validation_0-rmse:0.41758	validation_1-rmse:159.48086
[90]	validation_0-rmse:0.40334	validation_1-rmse:159.45570
[91]	validation_0-rmse:0.38988	validation_1-rmse:159.43173
[92]	validation_0-rmse:0.37735	validation_1-rmse:159.40901
[93]	validation_0-rmse:0.36581	validation_1-rmse:159.38736
[94]	validation_0-rmse:0.35467	validation_1-rmse:159.36688
[95]	validation_0-rmse:0.34369	validation_1-rmse:159.34740
[96]	validation_0-rmse:0.33422	validation_1-rmse:159.32890
[97]	validation_0-rmse:0.32500	validation_1-rmse:159.31133
[98]	validation_0-rmse:0.31654	validation_1-rmse:159.29463
[99]	validation_0-rmse:0.30846	validation_1-rmse:159.27872

plt.plot(fitted_xgb_model.evals_result()['validation_0']
['rmse'])
plt.plot(fitted_xgb_model.evals_result()['validation_1']
['rmse'])
plt.ylabel('RMSE', fontsize=14)
plt.xlabel('Price', fontsize=14)
plt.legend(['Train', 'Val'], loc='upper right')

<matplotlib.legend.Legend at 0x7f0267b412e0>

We can see the comparison between training rmse and validation rmse score where the is an underfitting problem in our case where maybe due to outlier electricity price in 2019-2020

feature_important = reg.get_booster().get_score(importance_type='weight')
keys = list(feature_important.keys())
values = list(feature_important.values())

data = pd.DataFrame(data=values, index=keys, columns=["score"]).sort_values(by = "score", ascending=False)
data.nlargest(10, columns="score").plot(kind='barh', figsize = (20,10)) ## plot top 10 features

<AxesSubplot:>

Based on the feature importances extracted from the model, price_positive feature, demand_neg_price, dayofweek,dayofyear and dayofyear show the most 5th of feature importances.

df_test['price_prediction'] = reg.predict(X_test)
price_all = pd.concat([df_test, df_train], sort=False)

_ = price_all[['price','price_prediction']].plot(figsize=(15, 5))
plt.ylim(0,1000)

(0.0, 1000.0)

f, ax = plt.subplots(1)
f.set_figheight(5)
f.set_figwidth(15)
_ = price_all[['price','price_prediction']].plot(ax=ax,style=['-','.'])
ax.set_xbound(lower='01-05-2019', upper='02-10-2020')
ax.set_ylim(0, 2000)
plot = plt.suptitle('May 2019 Forecast vs Actuals')

Insights 📍 : Prediction error price between real price and prediction price

Our model is generalizable for the next 12-18 months based on the plot above. In Machine Learninig, It is a mandatory to make our model to be generalizable as possible so that it can predict the real/actual value in real situation. There is a few errors around february and in the beginining of January 2020. If it can predict the same value for actual demand, it is a overfitting problem. We have to take into account this case to make better prediction that can mimic the actual value with fewer errors.

mean_squared_error(y_true=df_test['price'],y_pred=df_test['price_prediction'])

25369.714397883905

mean_absolute_error(y_true=df_test['price'],y_pred=df_test['price_prediction'])

10.73348890658493

df_test['error'] = df_test['price'] - df_test['price_prediction']
df_test['abs_error'] = df_test['error'].apply(np.abs)
error_by_day = df_test.groupby(['year','month','dayofmonth']).mean()[['price','price_prediction','error','abs_error']]

(error_by_day.sort_values('error', ascending=True).head(10)
.style
.bar(subset="error",color="#c07fef",vmax=error_by_day.error.quantile(0.95))
.highlight_max(axis=0,color="#fef70c")
)

(error_by_day.sort_values('abs_error', ascending=False).head(10)
.style
.bar(subset="abs_error",color="#c07fef",vmax=error_by_day.abs_error.quantile(0.95))
.highlight_max(axis=0,color="#fef70c")
)

Insights 📍 : Notice anything about the over forecasted days?

#1 worst day - February 2nd, 2020.
#6 worst day - March 25, 2019.

Looks like our model influenced by outliers .

(error_by_day.sort_values('abs_error', ascending=True).head(10)
.style
.bar(subset="abs_error",color="#c07fef",vmax=error_by_day.abs_error.quantile(0.95))
.highlight_max(axis=0,color="#fef70c")
)

Revenue generated per year using retail prices and a 70MWh storage system by the energy storage venture

predicton_vs_Actual = df_test.rename(columns={"price":"PRICE","price_prediction":"PRICE_PREDICTION"})
predicton_vs_Actual.head()

(predicton_vs_Actual.sample(5)
 .style
 .background_gradient(axis=0,subset=["PRICE","PRICE_PREDICTION"],cmap="Blues")
 .set_caption("Comparison of Real price and Prediction price")
)

(predicton_vs_Actual.describe()
 .style
 .highlight_max(axis=0,color="#c07fef")
 .highlight_min(axis=0,color="#00FF00")
 .set_caption("Price for the next 12-18 months dataset")
)

train_df_result =df_train.groupby("year")["price","demand_pos_price","demand_neg_price","frac_neg_price","price_negative","demand"].agg(["mean","max","min"]).stack()
(train_df_result
 .style
 .background_gradient(axis=0,subset=["demand","price"],cmap="Blues")
 .set_caption("Comparison of Real Demand and Prediction Demand")
)

test_df_result =df_test.groupby("year")["price","demand_pos_price","demand_neg_price","frac_neg_price","price_negative","demand","price_prediction"].agg(["mean","max","min"]).stack()
(test_df_result
 .style
 .background_gradient(axis=0,subset=["price","price_prediction"],cmap="Blues")
 .set_caption("Comparison of Real Demand and Prediction Demand")
)

test_df_result =df_test.groupby("year")["price","demand_pos_price","demand_neg_price","frac_neg_price","price_negative","demand","price_prediction"].agg(["mean","max","min"]).stack()
(test_df_result
 .style
 .background_gradient(axis=0,subset=["price","price_prediction"],cmap="Blues")
 .set_caption("Comparison of Real Demand and Prediction Demand")
)

df_electricity["total_price"] = df_electricity["demand"]*df_electricity["price"]

df_electricity.groupby(df_electricity.index.year)["price"].agg(["mean","max","min"]).unstack()

      date
mean  2015      35.068136
      2016      50.094252
      2017      94.740161
      2018      94.648823
      2019     117.281370
      2020      61.266055
max   2015     188.086125
      2016     545.737820
      2017     213.339432
      2018    1210.137920
      2019    4549.645105
      2020    2809.437516
min   2015      13.279841
      2016       6.869135
      2017      19.865114
      2018      14.673588
      2019      19.170951
      2020      -6.076028
dtype: float64

demand_negative_price = df_electricity[df_electricity["price_negative"] < 0]
demand_negative_price

demand_positive_price = df_electricity[df_electricity["price_negative"] >= 0]
demand_positive_price

demand_negative= (demand_negative_price.groupby(demand_negative_price.index.year)["total_price"].agg(["sum","max","min"])).reset_index()
demand_negative

This is the condition when the demand for negative price per year. You can see on 2020 where the minimum price show a negative value.

demand_positive = demand_positive_price.groupby(demand_positive_price.index.year)["total_price"].agg(["sum","max","min"]).reset_index()
demand_positive

This is the condition when the demand for positive price per year. You can see on per year where the minimum price show a an appreciation from the market that is shown on the positive electiricity price.

result = demand_negative.merge(demand_positive,on="date",suffixes =["_demand_pos","_demand_neg"])
result

result["profit"] =result["sum_demand_pos"]-result["sum_demand_neg"]
result

We then substract the result when the demand is positive and demand is negative based on the total price feature that we have been created. Most of the years shows a good result.

(result.sort_values("profit",ascending=False)
 .style
 .background_gradient(axis=0,subset=["profit"],cmap="Blues")
 .set_caption("Revenue Generated per year in the daily electiricity in Victoria Australia")
)

This is the result of revenue generated per year. We can see a good revenue generated per year. If you find this notebook usefull, Please show your appreciation by upvoting this notebook. Thank you.

References

Here is a few references for creating advanced visualization using matplotlib ¹.

1. You can see a lot of data visualization plots here Data Visualization with Python!↩

2. Towards Data Science Article by Shiu Tang Li !↩

3. TPSJAN22-01 EDA which makes sense !↩

	demand	price	demand_pos_price	price_positive	demand_neg_price	price_negative	frac_neg_price	min_temperature	max_temperature	solar_exposure	rainfall	school_day	holiday
date
2015-01-01	99635.030	25.633696	97319.240	26.415953	2315.790	-7.240000	0.020833	13.3	26.9	23.6	0.0	N	Y
2015-01-02	129606.010	33.138988	121082.015	38.837661	8523.995	-47.809777	0.062500	15.4	38.8	26.8	0.0	N	N
2015-01-03	142300.540	34.564855	142300.540	34.564855	0.000	0.000000	0.000000	20.0	38.2	26.5	0.0	N	N
2015-01-04	104330.715	25.005560	104330.715	25.005560	0.000	0.000000	0.000000	16.3	21.4	25.2	4.2	N	N
2015-01-05	118132.200	26.724176	118132.200	26.724176	0.000	0.000000	0.000000	15.0	22.0	30.7	0.0	N	N

	demand	price	demand_pos_price	price_positive	demand_neg_price	price_negative	frac_neg_price	min_temperature	max_temperature	solar_exposure	rainfall
count	2106.000000	2106.000000	2106.000000	2106.000000	2106.000000	2106.000000	2106.000000	2106.000000	2106.000000	2105.000000	2103.000000
mean	120035.476503	76.079554	119252.305055	76.553847	783.171448	-2.686052	0.008547	11.582289	20.413200	14.743373	1.505944
std	13747.993761	130.246805	14818.631319	130.114184	3578.920686	19.485432	0.039963	4.313711	6.288693	7.945527	4.307897
min	85094.375000	-6.076028	41988.240000	13.568986	0.000000	-342.220000	0.000000	0.600000	9.000000	0.700000	0.000000
25%	109963.650000	38.707040	109246.250000	39.117361	0.000000	0.000000	0.000000	8.500000	15.525000	8.200000	0.000000
50%	119585.912500	66.596738	119148.082500	66.869058	0.000000	0.000000	0.000000	11.300000	19.100000	12.700000	0.000000
75%	130436.006250	95.075012	130119.477500	95.130181	0.000000	0.000000	0.000000	14.600000	23.900000	20.700000	0.800000
max	170653.840000	4549.645105	170653.840000	4549.645105	57597.595000	0.000000	0.625000	28.000000	43.500000	33.300000	54.600000

			price	price_prediction	error	abs_error
year	month	dayofmonth
2019	1	22	278.777743	350.801331	-72.023587	72.023587
2019	12	30	295.829202	342.720764	-46.891562	46.891562
2020	1	30	1044.447303	1078.613281	-34.165979	34.165979
	10	2	-6.076028	21.710295	-27.786323	27.786323
	10	3	-1.983471	21.918243	-23.901714	23.901714
	1	23	-1.761423	19.056923	-20.818346	20.818346
2019	2	2	240.954524	257.686920	-16.732396	16.732396
2019	8	31	79.347198	91.819000	-12.471802	12.471802
2020	8	30	9.421019	20.588486	-11.167467	11.167467
2020	1	4	28.042231	38.049507	-10.007276	10.007276

			price	price_prediction	error	abs_error
year	month	dayofmonth
2019	1	24	4549.645105	1078.613281	3471.031824	3471.031824
2020	1	31	2809.437516	812.882507	1996.555008	1996.555008
2019	3	1	1284.799876	812.882507	471.917369	471.917369
	1	25	906.437232	645.617493	260.819740	260.819740
	1	22	278.777743	350.801331	-72.023587	72.023587
	12	30	295.829202	342.720764	-46.891562	46.891562
2020	1	30	1044.447303	1078.613281	-34.165979	34.165979
	10	2	-6.076028	21.710295	-27.786323	27.786323
	10	3	-1.983471	21.918243	-23.901714	23.901714
2019	8	13	267.347650	246.398193	20.949457	20.949457

			price	price_prediction	error	abs_error
year	month	dayofmonth
2019	7	29	95.873753	95.874405	-0.000652	0.000652
2019	8	20	61.926547	61.927246	-0.000699	0.000699
2020	8	6	69.433709	69.434418	-0.000708	0.000708
	9	9	37.015434	37.016582	-0.001149	0.001149
	8	15	52.515043	52.512608	0.002436	0.002436
2019	10	20	87.533212	87.530640	0.002573	0.002573
2020	4	2	61.537020	61.540165	-0.003145	0.003145
2020	4	14	40.838554	40.842064	-0.003510	0.003510
2019	6	9	69.561306	69.557457	0.003849	0.003849
2020	3	23	44.639439	44.643665	-0.004226	0.004226

	demand	PRICE	demand_pos_price	price_positive	demand_neg_price	price_negative	frac_neg_price	min_temperature	max_temperature	solar_exposure	...	price_30_days_std	price_7_days_max	price_15_days_max	price_30_days_max	price_7_days_min	price_15_days_min	price_30_days_min	PRICE_PREDICTION	error	abs_error
date
2019-01-02	106470.675	92.202011	106470.675	92.202011	0.0	0.0	0.0	18.4	22.2	26.3	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	92.063560	0.138451	0.138451
2019-01-03	118789.605	127.380303	118789.605	127.380303	0.0	0.0	0.0	15.9	29.5	27.6	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	126.187721	1.192581	1.192581
2019-01-04	133288.460	121.020997	133288.460	121.020997	0.0	0.0	0.0	18.0	42.6	27.4	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	121.093582	-0.072585	0.072585
2019-01-05	97262.790	83.493520	97262.790	83.493520	0.0	0.0	0.0	17.4	21.2	12.9	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	83.704323	-0.210802	0.210802
2019-01-06	93606.215	65.766407	93606.215	65.766407	0.0	0.0	0.0	14.6	22.1	30.9	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	65.620796	0.145611	0.145611

	demand	PRICE	demand_pos_price	price_positive	demand_neg_price	price_negative	frac_neg_price	min_temperature	max_temperature	solar_exposure	rainfall	school_day	holiday	date	hour	dayofweek	quarter	month	year	dayofyear	dayofmonth	weekofyear	price_7_days_lag	price_15_days_lag	price_30_days_lag	price_7_days_mean	price_15_days_mean	price_30_days_mean	price_7_days_std	price_15_days_std	price_30_days_std	price_7_days_max	price_15_days_max	price_30_days_max	price_7_days_min	price_15_days_min	price_30_days_min	PRICE_PREDICTION	error	abs_error
date
2019-10-17 00:00:00	116690.040000	92.498044	116690.040000	92.498044	0.000000	0.000000	0.000000	9.500000	15.200000	9.600000	4.400000	1	N	2019-10-17 00:00:00	0	3	4	10	2019	290	17	42	167.981136	108.661677	149.599176	107.013610	111.958930	115.165759	14.385468	26.376231	28.347460	123.261328	167.981136	192.485035	90.164236	54.720115	54.720115	92.161629	0.336415	0.336415
2020-02-28 00:00:00	110006.970000	50.306353	110006.970000	50.306353	0.000000	0.000000	0.000000	13.100000	20.100000	17.600000	0.000000	1	N	2020-02-28 00:00:00	0	4	1	2	2020	59	28	9	48.658203	79.018494	78.272342	50.750555	54.706821	174.695236	6.434453	9.322584	529.975511	60.554702	79.502644	2809.437516	42.236094	42.236094	14.235635	50.108658	0.197695	0.197695
2019-05-26 00:00:00	106153.430000	63.903082	106153.430000	63.903082	0.000000	0.000000	0.000000	9.500000	15.100000	8.600000	6.600000	1	N	2019-05-26 00:00:00	0	6	2	5	2019	146	26	21	76.294168	87.036533	60.918383	90.270987	89.690794	95.045620	17.449081	13.195141	13.561886	115.894456	115.894456	121.029642	63.903082	63.903082	63.903082	63.706802	0.196280	0.196280
2019-03-19 00:00:00	123290.835000	113.644774	123290.835000	113.644774	0.000000	0.000000	0.000000	16.900000	23.400000	12.700000	0.000000	1	N	2019-03-19 00:00:00	0	1	1	3	2019	78	19	12	76.971548	135.206447	110.489593	110.805143	100.527496	148.568061	11.182552	17.085113	215.716053	126.785957	126.785957	1284.799876	95.251188	74.374932	74.374932	113.493477	0.151297	0.151297
2019-04-02 00:00:00	114927.205000	119.397551	114927.205000	119.397551	0.000000	0.000000	0.000000	7.400000	25.900000	16.600000	0.000000	1	N	2019-04-02 00:00:00	0	1	2	4	2019	92	2	14	95.567894	111.149795	153.784296	105.271113	105.146212	103.555577	17.997891	15.698139	17.232821	131.333531	131.333531	135.206447	83.989538	80.292927	74.374932	118.808243	0.589308	0.589308

	demand	PRICE	demand_pos_price	price_positive	demand_neg_price	price_negative	frac_neg_price	min_temperature	max_temperature	solar_exposure	rainfall	school_day	hour	dayofweek	quarter	month	year	dayofyear	dayofmonth	weekofyear	price_7_days_lag	price_15_days_lag	price_30_days_lag	price_7_days_mean	price_15_days_mean	price_30_days_mean	price_7_days_std	price_15_days_std	price_30_days_std	price_7_days_max	price_15_days_max	price_30_days_max	price_7_days_min	price_15_days_min	price_30_days_min	PRICE_PREDICTION	error	abs_error
count	644.000000	644.000000	644.000000	644.000000	644.000000	644.000000	644.000000	644.000000	644.000000	644.000000	644.000000	644.000000	644.000000	644.000000	644.000000	644.000000	644.000000	644.000000	644.000000	644.000000	637.000000	629.000000	614.000000	638.000000	630.000000	615.000000	638.000000	630.000000	615.000000	638.000000	630.000000	615.000000	638.000000	630.000000	615.000000	644.000000	644.000000	644.000000
mean	117643.164884	92.987010	115960.088610	93.920500	1683.076273	-2.628288	0.018763	11.258851	20.047516	14.121429	1.571118	0.579193	0.000000	3.000000	2.310559	5.920807	2019.434783	164.804348	15.633540	24.015528	93.691424	94.312974	95.825203	93.212968	93.538990	91.768173	54.150249	72.852308	87.392116	200.741524	322.505861	494.818502	52.455595	43.836577	36.683321	83.745338	9.241676	10.733489
std	13719.536422	221.045838	16058.351649	220.815258	5468.830434	14.501710	0.061677	4.178613	6.275934	7.527379	4.622587	0.494072	0.000000	2.001555	1.038818	3.204041	0.496114	97.726483	8.830309	13.916315	222.139909	223.477412	225.979776	105.016925	78.052549	55.971449	204.237108	204.005225	183.989880	555.624866	798.388712	1010.230907	26.400018	23.519784	21.738763	83.260063	159.133995	159.040192
min	86891.230000	-6.076028	41988.240000	14.558266	0.000000	-304.150000	0.000000	1.700000	9.600000	1.300000	0.000000	0.000000	0.000000	0.000000	1.000000	1.000000	2019.000000	1.000000	1.000000	1.000000	-1.761423	-1.761423	-1.761423	23.314900	26.399112	30.011994	2.643165	5.364451	8.586690	33.031411	37.881752	53.063441	-6.076028	-6.076028	-6.076028	14.941895	-72.023587	0.000652
25%	107860.791250	47.951427	106688.311250	48.534078	0.000000	0.000000	0.000000	8.300000	15.300000	8.200000	0.000000	0.000000	0.000000	1.000000	1.000000	3.000000	2019.000000	81.750000	8.000000	12.000000	48.336165	48.542669	49.852245	48.849650	50.822983	52.430808	10.068998	12.243217	15.547400	64.270729	75.818450	103.916022	31.208768	21.755296	18.875436	47.338225	-0.289771	0.100342
50%	115552.020000	70.942903	115005.677500	72.009555	0.000000	0.000000	0.000000	11.000000	18.700000	11.700000	0.000000	1.000000	0.000000	3.000000	2.000000	6.000000	2019.000000	162.000000	16.000000	24.000000	72.065599	72.815311	74.405598	75.313011	83.410650	87.232646	15.213009	18.972156	23.018682	100.676262	125.099031	134.747615	49.910197	44.176312	42.328652	71.584919	-0.011184	0.268921
75%	127321.842500	101.439833	127115.777500	101.665942	0.000000	0.000000	0.000000	13.800000	23.200000	19.300000	1.050000	1.000000	0.000000	5.000000	3.000000	8.000000	2020.000000	242.250000	23.000000	35.000000	101.944171	102.608515	103.262500	103.681588	105.118976	105.963637	24.736949	32.092289	38.685609	134.488422	162.491995	243.810160	75.512742	63.903082	52.525320	101.335741	0.233059	0.588213
max	170653.840000	4549.645105	170653.840000	4549.645105	57597.595000	0.000000	0.625000	25.100000	43.500000	32.000000	54.600000	1.000000	0.000000	6.000000	4.000000	12.000000	2020.000000	365.000000	31.000000	52.000000	4549.645105	4549.645105	4549.645105	902.337815	506.831433	318.015562	1663.267896	1140.727552	817.222368	4549.645105	4549.645105	4549.645105	118.261945	89.427805	80.292927	1078.613281	3471.031824	3471.031824

		price	demand_pos_price	demand_neg_price	frac_neg_price	price_negative	demand
year
2015	mean	35.068136	124491.479493	170.690384	0.001598	-2.305297	124662.169877
	max	188.086125	158052.890000	19480.250000	0.229167	0.000000	158052.890000
	min	13.279841	84331.030000	0.000000	0.000000	-318.660000	95093.295000
2016	mean	50.094252	121028.504249	961.034508	0.010303	-4.438947	121989.538757
	max	545.737820	160285.015000	29110.575000	0.333333	0.000000	160285.015000
	min	6.869135	65215.145000	0.000000	0.000000	-289.190000	90227.480000
2017	mean	94.740161	118910.605411	42.443274	0.000400	-2.460516	118953.048685
	max	213.339432	154632.335000	4666.465000	0.041667	0.000000	155060.610000
	min	19.865114	85094.375000	0.000000	0.000000	-342.220000	85094.375000
2018	mean	94.648823	118438.166507	372.398726	0.003881	-1.643921	118810.565233
	max	1210.137920	165070.595000	22839.285000	0.250000	0.000000	165070.595000
	min	14.673588	73658.870000	0.000000	0.000000	-157.920000	88903.065000
2019	mean	78.560979	98933.060000	0.000000	0.000000	0.000000	98933.060000
	max	78.560979	98933.060000	0.000000	0.000000	0.000000	98933.060000
	min	78.560979	98933.060000	0.000000	0.000000	0.000000	98933.060000

	demand	price	demand_pos_price	price_positive	demand_neg_price	price_negative	frac_neg_price	min_temperature	max_temperature	solar_exposure	rainfall	school_day	holiday	total_price
date
2015-01-01	99635.030	25.633696	97319.240	26.415953	2315.790	-7.240000	0.020833	13.3	26.9	23.6	0.0	N	Y	2.554014e+06
2015-01-02	129606.010	33.138988	121082.015	38.837661	8523.995	-47.809777	0.062500	15.4	38.8	26.8	0.0	N	N	4.295012e+06
2015-01-07	153514.820	48.312309	149498.715	49.639712	4016.105	-1.100000	0.020833	18.9	37.4	20.7	0.0	N	N	7.416655e+06
2015-01-18	97728.750	17.008681	95473.965	20.911790	2254.785	-148.260000	0.020833	15.3	19.5	23.4	0.0	N	N	1.662237e+06
2015-02-13	136070.620	18.736971	133078.540	26.322857	2992.080	-318.660000	0.020833	16.1	32.4	14.9	0.0	Y	N	2.549551e+06
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
2020-10-01	106641.790	34.654671	95349.610	41.651658	11292.180	-24.426925	0.125000	9.4	19.5	21.2	1.8	N	N	3.695636e+06
2020-10-02	99585.835	-6.076028	41988.240	26.980251	57597.595	-30.173823	0.625000	12.8	26.0	22.0	0.0	N	N	-6.050864e+05
2020-10-03	92277.025	-1.983471	44133.510	32.438156	48143.515	-33.538025	0.583333	17.4	29.4	19.8	0.0	N	N	-1.830288e+05
2020-10-04	94081.565	25.008614	88580.995	26.571687	5500.570	-0.163066	0.062500	13.5	29.5	8.4	0.0	N	N	2.352850e+06
2020-10-05	113610.030	36.764701	106587.375	39.616015	7022.655	-6.511550	0.083333	9.1	12.7	7.3	12.8	N	N	4.176839e+06