Data Analysis in Daily Life — Find My Electricity Provider

Craig Chen
5 min read · Feb 23, 2020


Photo by Sven Mieke on Unsplash

Data analysis has become hugely popular over the past decade. Companies love it because it helps them solve many difficult problems. Since it is so powerful, let’s apply it to an everyday scenario.

In this post, I will walk through the six steps I followed to solve an everyday problem with data analysis techniques.

  1. Define a Problem: What is the problem? What is the goal?
  2. Collect Data: Gather information
  3. Pre-process Data: Extract the most essential pieces of information
  4. Build Models: Use the data to solve the problem
  5. Visualization: Visualize the results to understand the whole picture
  6. Result: Make a decision

Let’s get started!

Scenario: Last year I was about to move to a new place, and I had to find an electricity provider for my apartment.

Step 1: Define a Problem

My goal is to find the best electricity plan for the coming year. That sounds reasonable, since everyone wants “the best” service. But what does “the best” mean? The term is too general and varies from person to person. Some people care about the stability of the supply, some want better customer service, and others worry about the price. To get to the core of my problem, I have to break it down into clearer problem statements.

After some thought, I listed my concerns in order of priority:

  1. 12-Month Plan
  2. Low-Cost Plan at around 500 kWh
  3. Great Customer Service

These three points define “my best” plan. First, I want a 12-month electricity plan to match my lease. Second, I put my emphasis on a low bill; I do not want to spend a lot of money on electricity. Third, I would like to sign up with a company that has great customer service, so that communication goes smoothly whenever an issue comes up.

Step 2: Collect Data

After identifying the problem, the next step is to find data. This is the most important step in the process, for two reasons: (a) the more accurate the data one collects, the better the result, because precise data yields results closer to the real-life situation; and (b) with more data, I can take more options into consideration and pick the one with the best value for me. In my case, I needed to find as many of the electricity plans available on the market as possible. I used this website: ChooseTexasPower.org (FYI, I think it only works in Texas) to get current market prices and to see which companies provide electricity service in my area. (This turned out not to be enough in my scenario, and I will explain why later on.)

In this step, I collected the raw data without any processing, because I did not want to lose any valuable information. In my case, I kept the EFL (Electricity Facts Label) file for each plan, which presents comprehensive information about the service.

Step 3: Pre-processing Data

What should we do with the collected data? We extract the most important information from it. The data collected in the previous step may not be in a convenient format for further processing; for example, it may be a PDF file with a lot of unrelated information. Therefore, I have to identify which information in the PDF files is useful and important, based on my goals from Step 1.

Based on goal 1, I filtered out the plans that are not 12-month plans. Then I copied the important cost information from each EFL file into a spreadsheet, which makes it easy to see the differences between plans.
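The filtering step can be sketched in a few lines of Python. This is only an illustration: the `Plan` fields, the `keep_term` helper, and every number below are made up for the example and are not taken from any real EFL file.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    name: str           # hypothetical plan name
    term_months: int    # contract length listed on the EFL
    base_charge: float  # fixed monthly fee in USD (assumed field)
    energy_rate: float  # energy charge in USD per kWh (assumed field)


# Made-up rows standing in for values copied out of EFL files.
raw_plans = [
    Plan("Plan A", 12, 9.95, 0.085),
    Plan("Plan B", 24, 0.00, 0.110),
    Plan("Plan C", 12, 4.95, 0.095),
    Plan("Plan D", 6, 0.00, 0.120),
]


def keep_term(plans, months=12):
    """Return only the plans whose contract term matches the lease."""
    return [p for p in plans if p.term_months == months]


candidates = keep_term(raw_plans)
for p in candidates:
    print(p.name, p.term_months, p.base_charge, p.energy_rate)
```

Keeping the raw list untouched and filtering into a new list mirrors the idea from Step 2: nothing collected is thrown away, it is just set aside.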

Some important information captured from the EFL examples. * Data comes from other searches and calculations outside the EFL document. ** Value is calculated from the EFL file.

Step 4: Build Models

With the data prepared, it is time to use it to solve the problem. In this step, I want to find the lowest-cost plan, per goal 2. I turned each plan in the spreadsheet into a Python function, which makes it easy to get the real cost for a given electricity usage.
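Here is a minimal sketch of what such a function could look like. The fee structure (a fixed base charge, a per-kWh rate, and an optional bill credit once usage reaches a threshold) is an assumption about what an EFL might list, and `make_plan`, the plan names, and all the numbers are hypothetical.

```python
def make_plan(base_charge, rate_per_kwh, credit=0.0, credit_threshold=None):
    """Build a cost function: monthly bill in USD for a given usage in kWh."""

    def monthly_cost(usage_kwh):
        cost = base_charge + rate_per_kwh * usage_kwh
        # Some plans refund a flat credit once usage reaches a threshold.
        if credit_threshold is not None and usage_kwh >= credit_threshold:
            cost -= credit
        return round(cost, 2)

    return monthly_cost


# Hypothetical plans; none of these numbers come from a real EFL.
plan_a = make_plan(base_charge=9.95, rate_per_kwh=0.085)
plan_b = make_plan(base_charge=0.0, rate_per_kwh=0.12,
                   credit=30.0, credit_threshold=1000)

for usage in (500, 1000):
    print(usage, plan_a(usage), plan_b(usage))
```

Writing each plan as a function keeps the pricing rules in one place, so the same function can be called for any usage value later on.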

Step 5: Visualization

In some cases, working function outputs alone are not enough, because it is hard to find insight in a large amount of raw output without visualization. Visualizing the data gives us a better way to understand the whole picture.

I made a simple line chart with usage on the x-axis and cost on the y-axis to see the cost differences between plans at a glance. Although the chart is very simple, it provides enough information and comparison for me to make a decision in the next step.
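A chart like this could be produced roughly as follows. This is a sketch, not the original code: the two plans and their prices are invented, `cost_table` and `plot_costs` are hypothetical helper names, and the plotting part assumes matplotlib is installed.

```python
def cost_table(plans, usages):
    """Map each plan name to its list of monthly costs over the usage grid."""
    return {name: [fn(u) for u in usages] for name, fn in plans.items()}


def plot_costs(usages, table, path="plan_costs.png"):
    """Draw one line per plan and save the chart (requires matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for name, costs in table.items():
        ax.plot(usages, costs, label=name)
    ax.set_xlabel("Monthly usage (kWh)")
    ax.set_ylabel("Monthly cost (USD)")
    ax.legend()
    fig.savefig(path)
    plt.close(fig)


# Hypothetical plans with made-up prices.
plans = {
    "Plan A": lambda kwh: 9.95 + 0.085 * kwh,
    "Plan B": lambda kwh: 0.12 * kwh,
}
usages = list(range(0, 1001, 50))  # 0 to 1000 kWh in 50 kWh steps
table = cost_table(plans, usages)

# plot_costs(usages, table)  # uncomment to write plan_costs.png
```

The raw numbers live in `table`, and the chart is just a second view of the same values, which matches the two panels described in the figure caption.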

(From left to right) (a) Raw numbers output (b) Plot based on numbers from (a)

Step 6: Result

With the support of the chart, I can easily decide which plan fits me best. In previous years, I generally consumed around 400–700 kWh per month. Therefore, I can narrow the usage range from 0–1000 kWh to 400–700 kWh to get a more precise insight.
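One way to turn the narrowed range into a decision is to average each plan's cost over that range and rank the plans. The sketch below assumes an evenly spread usage between 400 and 700 kWh, which is a simplification; the plans, prices, and the `average_cost` helper are all hypothetical.

```python
from statistics import mean

# Hypothetical plans with made-up prices; Plan B includes an assumed
# 30 USD bill credit once usage reaches 1000 kWh.
plans = {
    "Plan A": lambda kwh: 9.95 + 0.085 * kwh,
    "Plan B": lambda kwh: 0.12 * kwh - (30.0 if kwh >= 1000 else 0.0),
}


def average_cost(cost_fn, low=400, high=700, step=50):
    """Average monthly cost over an evenly spaced usage range (inclusive)."""
    return mean(cost_fn(u) for u in range(low, high + 1, step))


ranking = sorted(plans, key=lambda name: average_cost(plans[name]))
best = ranking[0]

for name in ranking:
    print(name, round(average_cost(plans[name]), 2))
```

With these invented numbers, Plan B is the cheaper of the two at 1000 kWh thanks to its credit, but Plan A wins once the comparison is limited to 400–700 kWh. That is the kind of difference the narrower range is meant to expose.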

Besides the cost analysis, I also researched the customer service of the electricity companies to assess their quality, which addresses my goal 3.

Finally, all the analyses are finished. I have a better understanding of each plan, so I can make a better-informed choice of electricity provider.

Afterword

After deciding on a plan, I went to that company’s website and started to register for service. There I found another plan, listed on the company’s own website, that was better than the one I had chosen. (LOL) Here is the visualization result:

Clearly, the new plan I found on the company’s website is the best among my candidates in terms of price. This is solid evidence that I should have considered more data sources when collecting the data. I cannot emphasize this enough: the more accurate data one collects, the better the result one gets.

Thanks for reading. I hope you enjoyed this article. If you liked it and you are looking for or switching your electricity provider in Texas, check out my referral link.

Happy Coding!

Written by Craig Chen

Quantitative Analyst / Data Scientist
