Option #1: Northwind Data Mining and Statistical Analysis Project – Planning

The objective of the Portfolio Project is to mine data from a data warehouse that contains data from the Northwind database you constructed during the installation of PostgreSQL. (I have already installed PostgreSQL and seeded my database with Microsoft's Northwind sample data. I will also include a Word file with the steps for seeding a database with Northwind data, just in case.)

Summary of Tasks for the Portfolio Project

Data Warehouse:

· Create a data warehouse database, including the fact and dimension tables (star schema).

· Create the schema for each table.

· Populate the tables using either ETL (Pentaho) or SQL (PostgreSQL).
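
As a rough illustration of the data warehouse tasks above, here is a minimal sketch of one possible star schema: a single fact table surrounded by three dimension tables. Every table and column name here (dim_customer, dim_product, dim_date, fact_sales, and so on) is an assumption made for illustration and is not prescribed by the assignment. The sketch uses Python's built-in sqlite3 module as a stand-in so it runs without a server; for the actual project the same DDL would be adapted to PostgreSQL. The rows loaded are placeholders, not real Northwind data.

```python
import sqlite3

# Hypothetical star schema: all table and column names are illustrative
# assumptions, not part of the assignment or of the Northwind database.
STAR_SCHEMA_DDL = """
CREATE TABLE dim_customer (
    customer_key INTEGER PRIMARY KEY,  -- surrogate key
    customer_id  TEXT NOT NULL,        -- business key from the source system
    company_name TEXT,
    country      TEXT
);
CREATE TABLE dim_product (
    product_key   INTEGER PRIMARY KEY,
    product_id    INTEGER NOT NULL,
    product_name  TEXT,
    category_name TEXT
);
CREATE TABLE dim_date (
    date_key  INTEGER PRIMARY KEY,     -- e.g. a YYYYMMDD integer
    full_date TEXT NOT NULL,
    year      INTEGER,
    month     INTEGER
);
CREATE TABLE fact_sales (
    customer_key INTEGER NOT NULL REFERENCES dim_customer (customer_key),
    product_key  INTEGER NOT NULL REFERENCES dim_product (product_key),
    date_key     INTEGER NOT NULL REFERENCES dim_date (date_key),
    quantity     INTEGER NOT NULL,
    unit_price   REAL NOT NULL,
    discount     REAL NOT NULL DEFAULT 0
);
"""


def build_warehouse(conn):
    """Create the fact and dimension tables on the given connection."""
    conn.executescript(STAR_SCHEMA_DDL)
    return conn


def load_placeholder_rows(conn):
    """Insert a few made-up rows so the schema can be queried."""
    conn.execute(
        "INSERT INTO dim_customer VALUES (1, 'CUST1', 'Placeholder Co', 'Nowhere')"
    )
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?, ?)",
        [(1, 101, "Sample Product A", "Category X"),
         (2, 102, "Sample Product B", "Category Y")],
    )
    conn.execute("INSERT INTO dim_date VALUES (20000101, '2000-01-01', 2000, 1)")
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?, ?)",
        [(1, 1, 20000101, 10, 2.0, 0.0),
         (1, 2, 20000101, 3, 5.0, 0.0)],
    )
    conn.commit()


def revenue_by_category(conn):
    """Join the fact table to the product dimension and total revenue per category."""
    query = """
        SELECT p.category_name,
               SUM(f.quantity * f.unit_price * (1 - f.discount)) AS revenue
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.category_name
        ORDER BY p.category_name
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    warehouse = build_warehouse(sqlite3.connect(":memory:"))
    load_placeholder_rows(warehouse)
    print(revenue_by_category(warehouse))
```

The fact table holds only keys and numeric measures, while descriptive attributes live in the dimensions, which is what lets a query such as revenue_by_category group sales by any dimension attribute with a single join.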

Preprocessing for SAS:

· Extract data from the data warehouse, creating a file for input into SAS. The format of the file is your choice. Ensure SAS University Edition accepts your selected format.
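
One way the extraction step might look, assuming CSV is the chosen format (CSV is a common choice that SAS can import, but confirm it against your SAS University Edition setup). The sketch below writes the result of any query to a CSV file with a header row. The function name export_query_to_csv, the sales_extract table, and its columns are hypothetical, and sqlite3 again stands in for PostgreSQL so the example runs on its own; with a PostgreSQL driver that follows the Python DB-API, the same cursor calls should apply.

```python
import csv
import io
import sqlite3


def export_query_to_csv(conn, query, out_file):
    """Write the rows returned by `query` to `out_file` as CSV.

    The first line is a header built from the column names.
    Returns the number of data rows written.
    """
    cursor = conn.cursor()
    cursor.execute(query)
    writer = csv.writer(out_file, lineterminator="\n")
    writer.writerow([column[0] for column in cursor.description])
    row_count = 0
    for row in cursor:
        writer.writerow(row)
        row_count += 1
    return row_count


def demo():
    """Build a tiny placeholder table and return its CSV export as text."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical extract table with made-up rows, for illustration only.
    conn.execute(
        "CREATE TABLE sales_extract (category_name TEXT, quantity INTEGER, unit_price REAL)"
    )
    conn.executemany(
        "INSERT INTO sales_extract VALUES (?, ?, ?)",
        [("Category X", 10, 2.0), ("Category Y", 3, 5.0)],
    )
    buffer = io.StringIO()
    export_query_to_csv(
        conn, "SELECT * FROM sales_extract ORDER BY category_name", buffer
    )
    return buffer.getvalue()


if __name__ == "__main__":
    print(demo())
```

To write to disk instead of an in-memory buffer, pass a file opened with open("extract.csv", "w", newline="") as the out_file argument; the file name is only an example.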

Statistical Analysis Using SAS:

· Import data created in the preprocessing step.

· Conduct statistical analysis using the appropriate statistics from each category:

o Summary statistics

o Classification

o Clustering

o Association

· Prepare an analysis report.

Milestone Deliverables:

· A detailed plan, including the tasks, activities, and software requirements

· A brief description of any challenges you might face in completing the Portfolio Project

Your paper must meet the following requirements:

· Be 2-3 pages in length, not including the cover and references pages.

