跳至正文

专业代写SQL作业assignment Visio, PostgreSQL and DataGrip

 

以下是小组作业案例分享,主题是Collect And Analyze Financial Data

Goal of the project代写:

We are going to create an Entity-Relationship Diagram from the financial data. We will load the CSV files to a PostgreSQL database代写. Then we will alter the database to add primary keys for each table and foreign keys as appropriate. Indexes for suitable and important columns in each table will be created.  We will find out which products got at least a 4-point rating for each product category in 2018 and which ones received a lower review in the previous year. And we will calculate the average review score for products that did not get delivered. Products which have a high review but has low sales will be also found out.

扫一扫又不会怀孕,扫一扫,作业无烦恼。

Methodology:

We mainly use Visio, PostgreSQL and DataGrip代写.

 

Data collection:

  • Customers

○ 99,441 observations

○ 5 variables

  • Geolocation

○ 1,000,163 observations

○ 5 citation variables

  • Order Items

○ 112,650 observations

○ 7 citation variables

  • OrderPayments

○ 103,886 observations

○ 5 citation variables

  • OrderReviews

○ 100, 000 observations

○ 7 citation variables

  • Orders

○ 99,441 observations

○ 8 citation variables

  • ProductCategoryName Translation

○ 71 observations

○ 2 citation variables

  • Products

○ 32,951 observations

○ 9 citation variables

  • Sellers

○ 3,095 observations

○ 4 citation variables

 

Data set analysis and presentation:

  After gathering the data we need, we ran the analysis listed below:

  1. We used Visio to create an Entity-Relationship Diagram from the dataof 9 tables.
  2. We created 9 tables in PostgreSQL database, then used copy command to load the CSV data into these tables.
  3. Then we addedprimary keys for each table. Some import foreign keys were also added.
  4. Then we created indexes for suitable columns in tables, which we think are
  5. Next, we calculated higher review score of products in 2018 and lower review score of products in 2017 for each product category: Firstly, we used Table Products to join Table Order_Items whose join condition was product_id. Secondly, Table Order_Items was used to join Table Order_Reviews whose join condition was order_id. With three tablescorrelation, we could get review’ score of each product category. Thirdly, Table Products was used to join Table Product_Category_Name_Translation to translate category name in English, whose join condition was product_category_name. Fourthly, the where condition were review_answer_timestamp like ‘2018%’ and review_score >=4 to get higher review score of 2018, review_answer_timestamp like ‘2017%’ and review_score <4 to get lower review score of 2017. In the end, we grouped by columns product_category_name_english, product_id, review_score and ordered by review_score, so the review scores of products in each category could be presented.
  6. We calculated the average review score for products that did not get delivered:  Firstly, wed needed to find out the types of order status in Table Orders, and we found that ‘delivered’ status covered the value of  ‘delivered’,’invoiced’ and ‘approved’ in column order_status, which would be used in where condition next. Secondly, we used Table Order_Reviews to join Table Orders whose condition was order_id. Thirdly, the where condition was that order_status was not in the value of ‘delivered’,’invoiced’ and ‘approved’. Lastly, we used the avg function to calculate the average review score which is 1.76.
  7. We found out products which have a high review but has low sales: Firstly, we used Table Products to join Table Order_Items whose join condition was product_id. Secondly, Table Order_Items was used to join Table Order_Reviews whose join condition was order_id. So with three tables correlation, we could get review’ score of each product category. Thirdly, Table Products was used to join Table Product_Category_Name_Translation to translate category name in English, whose join condition was product_category_name. Fourthly, we used count function to calculate the sales and avg function to calculate the score. In the end, we ordered by avg_review_score with descending and sales with ascending, then we found out products of cds_dvds_musicals and fashion_childrens_clothes have a high review but has low sales.

 

Conclusion:

  1. The average review score for products that did not get delivered is 1.76.
  2. Products of cds_dvds_musicals and fashion_childrens_clothes have a high review but has low sales.