Title: Initial Data Insights Report on ‘Sales Data Sample’

Introduction

The purpose of this report is to provide an initial overview of, and first insights from, the ‘Sales Data Sample’ data set on Kaggle. The scope covers first-glance observations and the most significant patterns and trends.

Data Overview

The data set was sourced from Kaggle (https://www.kaggle.com/datasets/kyanyoga/sample-sales-data). It contains 2,824 rows, including the column header row (so 2,823 records), and 25 columns. The data spans 2003 to 2005 and covers 19 different countries.
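
The figures above can be reproduced with a few lines of plain Python. The sketch below is illustrative only: the sample rows are made up, and the column names YEAR_ID and COUNTRY are assumed to match the headers in the Kaggle file.

```python
import csv
import io

# Tiny made-up sample shaped like the Kaggle file (3 of the 25 columns).
SAMPLE_CSV = """ORDERNUMBER,YEAR_ID,COUNTRY
10107,2003,USA
10121,2003,France
10134,2004,France
10145,2005,Norway
"""


def overview(csv_file):
    """Return record count, column count, year range and number of countries."""
    reader = csv.DictReader(csv_file)
    rows = list(reader)  # the header row is consumed by DictReader, not counted
    years = [int(row["YEAR_ID"]) for row in rows]
    countries = {row["COUNTRY"] for row in rows}
    return {
        "records": len(rows),
        "columns": len(reader.fieldnames),
        "first_year": min(years),
        "last_year": max(years),
        "countries": len(countries),
    }


summary = overview(io.StringIO(SAMPLE_CSV))
print(summary)
```

To run this on the downloaded file, pass an open file handle instead of the sample, for example open("sales_data_sample.csv", newline="", encoding="latin-1"). Both the file name and the encoding are assumptions here and may need adjusting.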

Initial Insights from the Data

The data shows that most of the orders have been shipped. This suggests that the business is fulfilling its orders without significant delay.
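
To put a number on "most", the share of each order status can be counted. This is a minimal sketch with made-up status values; it assumes the status is held in a column named STATUS with one value per row.

```python
from collections import Counter

# Made-up STATUS values for illustration only.
statuses = ["Shipped", "Shipped", "Shipped", "Cancelled", "On Hold", "Shipped"]


def status_shares(values):
    """Map each status to its share of all rows, as a fraction between 0 and 1."""
    counts = Counter(values)
    total = len(values)
    return {status: count / total for status, count in counts.items()}


shares = status_shares(statuses)
# Print the statuses from most to least common.
for status, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{status}: {share:.1%}")
```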

The total quantity ordered was approximately 99,000 units, averaging about 35 units per order line (each row of the data set is one line of an order). The maximum quantity on a single line was 97 and the minimum was 6.

Total sales were approximately 10 million, averaging 3,553.89 per order line. The maximum and minimum sales values were 14,082.80 and 482.13, respectively.
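
Totals, averages and extremes like these come from simple column summaries. The sketch below shows one way to compute them; the rows are made up, and QUANTITYORDERED and SALES are assumed to be the relevant column names.

```python
from statistics import mean

# Made-up rows; values are strings, as they would be when read from a CSV file.
rows = [
    {"QUANTITYORDERED": "30", "SALES": "2871.00"},
    {"QUANTITYORDERED": "34", "SALES": "2765.90"},
    {"QUANTITYORDERED": "41", "SALES": "3884.34"},
]


def describe(rows, column):
    """Return total, mean, minimum and maximum of a numeric column."""
    values = [float(row[column]) for row in rows]
    return {
        "total": sum(values),
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
    }


quantity_stats = describe(rows, "QUANTITYORDERED")
sales_stats = describe(rows, "SALES")
print(quantity_stats)
print(sales_stats)
```

Note that the mean here is taken over rows, so it is an average per row of the file rather than per distinct order number.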

There are seven product lines: Classic Cars, Vintage Cars, Motorcycles, Trucks and Buses, Planes, Ships, and Trains. Classic Cars had the highest quantity ordered and Trains the lowest. Sales value followed the same pattern: highest for Classic Cars and lowest for Trains.
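
A comparison like this amounts to grouping rows by product line and summing. The following sketch uses made-up numbers and assumes the columns are named PRODUCTLINE, QUANTITYORDERED and SALES.

```python
from collections import defaultdict

# Made-up rows for illustration; the real data set has seven product lines.
rows = [
    {"PRODUCTLINE": "Classic Cars", "QUANTITYORDERED": 40, "SALES": 5000.0},
    {"PRODUCTLINE": "Trains", "QUANTITYORDERED": 20, "SALES": 1500.0},
    {"PRODUCTLINE": "Classic Cars", "QUANTITYORDERED": 35, "SALES": 4200.0},
    {"PRODUCTLINE": "Motorcycles", "QUANTITYORDERED": 30, "SALES": 3100.0},
]


def totals_by(rows, key):
    """Sum quantity and sales for each distinct value of the given column."""
    totals = defaultdict(lambda: {"quantity": 0, "sales": 0.0})
    for row in rows:
        totals[row[key]]["quantity"] += row["QUANTITYORDERED"]
        totals[row[key]]["sales"] += row["SALES"]
    return dict(totals)


by_line = totals_by(rows, "PRODUCTLINE")
# Product lines ordered from highest to lowest total sales.
ranked = sorted(by_line, key=lambda line: by_line[line]["sales"], reverse=True)
print(ranked)
```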

Over the three years, both quantities ordered and overall sales value peaked in 2004 and fell to their lowest in 2005, well below the 2003 values.
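
The size of these year-to-year movements can be expressed as a percentage change. The sketch below shows the calculation on made-up yearly totals; they are not the figures from the data set.

```python
# Made-up yearly sales totals, for illustration only.
yearly_sales = {2003: 100.0, 2004: 130.0, 2005: 50.0}


def year_over_year(totals):
    """Return the percentage change of each year relative to the year before."""
    years = sorted(totals)
    return {
        current: (totals[current] - totals[previous]) / totals[previous] * 100
        for previous, current in zip(years, years[1:])
    }


changes = year_over_year(yearly_sales)
for year, change in changes.items():
    print(f"{year}: {change:+.1f}% versus the previous year")
```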

Data Quality

Overall, the formatting within each column looks largely consistent, and only a few columns have missing values at first glance. The following issues were spotted:

  • The format in the ‘PHONE’ column is inconsistent
  • Postal code format in the ‘POSTALCODE’ column is also inconsistent
  • The ‘STATE’ column has many blank cells
  • The cells in ‘ADDRESSLINE2’ are mostly blank
  • The column headers run words together with no separator (for example, ‘ADDRESSLINE2’). An underscore between words would make them easier to read
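
The checks listed above can be sketched in plain Python as follows. The sample values are made up, and the RENAMES mapping is hypothetical: the words in a joined header cannot be split automatically, so the new names have to be chosen by hand.

```python
import re
from collections import Counter

# Made-up rows for illustration only.
rows = [
    {"PHONE": "2125557818", "POSTALCODE": "10022", "STATE": "NY", "ADDRESSLINE2": ""},
    {"PHONE": "26.47.1555", "POSTALCODE": "51100", "STATE": "", "ADDRESSLINE2": ""},
    {"PHONE": "+47 2267 3215", "POSTALCODE": "N 0106", "STATE": "", "ADDRESSLINE2": "Level 3"},
]

# Hypothetical header mapping that adds underscores between words.
RENAMES = {"ADDRESSLINE2": "ADDRESS_LINE_2", "POSTALCODE": "POSTAL_CODE"}


def blank_counts(rows):
    """Count blank cells per column (columns with no blanks are omitted)."""
    counts = Counter()
    for row in rows:
        for column, value in row.items():
            if not value.strip():
                counts[column] += 1
    return dict(counts)


def shape(value):
    """Reduce a value to its pattern: digits become 9, letters become A."""
    return re.sub(r"[A-Za-z]", "A", re.sub(r"\d", "9", value))


def rename(header):
    """Return the underscore-separated name if one is defined, else the original."""
    return RENAMES.get(header, header)


blanks = blank_counts(rows)
phone_shapes = Counter(shape(row["PHONE"]) for row in rows)
postal_shapes = Counter(shape(row["POSTALCODE"]) for row in rows)
print(blanks)
print(phone_shapes)
print(postal_shapes)
```

A column with one dominant shape is consistently formatted; many different shapes, as in the made-up PHONE values here, point to mixed formats.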


Conclusion

Most orders were successfully shipped, which points to efficient order management. Approximately 99,000 units were ordered, averaging about 35 units per order line. Total sales amounted to around 10 million, averaging 3,553.89 per order line. Classic Cars led in both quantity ordered and sales value, while Trains performed worst on both measures. Quantities ordered and sales peaked in 2004 and were at their lowest in 2005 by a significant margin.

Overall, data formatting is consistent, with inconsistencies limited to the 'PHONE' and 'POSTALCODE' columns. The 'STATE' column has many blank entries and 'ADDRESSLINE2' is mostly empty. Headers run words together with no separator, and underscores between words would make them clearer.


Next Steps for Further Analysis

  • Handle missing values and enforce consistent formatting throughout the data set.
  • Investigate monthly and quarterly trends in sales and quantities ordered.
  • Analyze the data geographically, by country.
  • Explore the profitability and popularity of each product line.
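
As a starting point for the monthly and quarterly trend analysis, sales can be grouped by period. This is a minimal sketch on made-up rows; it assumes dates are stored in a column named ORDERDATE in month/day/year form with a time part, which should be checked against the actual file.

```python
from collections import defaultdict
from datetime import datetime

# Made-up rows; the date format is an assumption about the real file.
rows = [
    {"ORDERDATE": "2/24/2003 0:00", "SALES": 2871.0},
    {"ORDERDATE": "5/7/2003 0:00", "SALES": 2765.9},
    {"ORDERDATE": "2/3/2004 0:00", "SALES": 1000.0},
    {"ORDERDATE": "2/20/2004 0:00", "SALES": 500.0},
]


def sales_by_period(rows, period):
    """Sum sales per (year, month) or (year, quarter), sorted chronologically."""
    totals = defaultdict(float)
    for row in rows:
        date = datetime.strptime(row["ORDERDATE"], "%m/%d/%Y %H:%M")
        if period == "month":
            key = (date.year, date.month)
        elif period == "quarter":
            key = (date.year, (date.month - 1) // 3 + 1)
        else:
            raise ValueError("period must be 'month' or 'quarter'")
        totals[key] += row["SALES"]
    return dict(sorted(totals.items()))


monthly = sales_by_period(rows, "month")
quarterly = sales_by_period(rows, "quarter")
print(monthly)
print(quarterly)
```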







Thanks for reading!
