General
The Business Analytics Special Project is a capstone-like experience in which students demonstrate a broad knowledge of Business Analytics by undertaking hands-on projects with realistic data.
The project charter is a formal document that describes the project in its entirety, guiding project development.
To define all the components essential to the successful development of the project, a Project Charter document must be developed. Make sure that all relevant details regarding the scope, requirements, and risks of the project are well documented. The final version of this document should also be validated by the client. Your Project Charter should include the following sections:
The data dictionary is a document that allows anyone to easily understand a dataset. In this case, the dataset is the one used in your PBL project. The document must present a detailed description of the information contained in the shared database, identifying the variables/features, their description, and their source, among others. If the database consists of several tables, you should also detail how they are related.
• Variable name
• Type
• Description
• Example
• # of distinct values (if applicable)
• Source
• Confidentiality
• Min. value (if applicable)
• Max. value (if applicable)
• Mean (if applicable)
• Volume (# of entries)
• % missing values
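As a rough illustration of how the computable fields listed above could be derived, here is a minimal sketch using only the Python standard library. The function names, the toy dataset, and the rule that `None` or an empty string counts as missing are all assumptions made for this example. Description, source, and confidentiality cannot be computed from the data and still have to be filled in by the team.

```python
from statistics import mean


def describe_variable(name, values, description="", source="unknown",
                      confidentiality="unknown"):
    """Build one data-dictionary row for a single column.

    `values` is the list of raw cell values; None or "" counts as missing
    (an assumption of this sketch). The description, source and
    confidentiality defaults are placeholders to be completed by hand.
    """
    present = [v for v in values if v is not None and v != ""]
    is_numeric = bool(present) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    n_missing = len(values) - len(present)
    return {
        "variable": name,
        "type": "numeric" if is_numeric else "categorical",
        "description": description,
        "example": present[0] if present else None,
        "distinct_values": len(set(present)),
        "source": source,
        "confidentiality": confidentiality,
        "min": min(present) if is_numeric else None,
        "max": max(present) if is_numeric else None,
        "mean": mean(present) if is_numeric else None,
        "volume": len(values),
        "pct_missing": 100 * n_missing / len(values) if values else 0.0,
    }


def build_data_dictionary(columns):
    """`columns` maps each variable name to its list of values."""
    return [describe_variable(name, values) for name, values in columns.items()]


# Hypothetical toy dataset, for illustration only.
toy = {
    "age": [34, 51, None, 28],
    "segment": ["retail", "retail", "corporate", ""],
}
dictionary = build_data_dictionary(toy)
for row in dictionary:
    print(row)
```

With several related tables, the same function could be run once per table, with the relationships between tables documented separately.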
The exploratory data analysis helps guide data validation and gives the project teams a better understanding of the data and the problem by discovering patterns, spotting anomalies, and checking assumptions with the help of summary statistics and graphical representations.
The exploratory data analysis helps to understand all the variables, their formats and examples, and contextualize them within the business and through time. This step also allows the team to validate that the dataset selected for analysis meets the established objective, as well as other data quality requirements. The exploratory data analysis should contain univariate and multivariate analysis:
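To make the distinction between univariate and multivariate analysis concrete, the sketch below computes a numeric summary, a frequency table for a categorical variable, and a Pearson correlation between two numeric variables. It uses only the Python standard library. All function names and values are hypothetical, and a real analysis would also include graphical representations.

```python
from collections import Counter
from math import sqrt
from statistics import mean, median, stdev


def summarize_numeric(values):
    """Univariate summary of a numeric variable.

    Assumes missing values were already removed and len(values) >= 2.
    """
    return {
        "count": len(values),
        "mean": mean(values),
        "median": median(values),
        "stdev": stdev(values),
        "min": min(values),
        "max": max(values),
    }


def frequency_table(values):
    """Univariate summary of a categorical variable: share of each level."""
    counts = Counter(values)
    return {level: n / len(values) for level, n in counts.most_common()}


def pearson(xs, ys):
    """Bivariate check: Pearson correlation of two equally long numeric lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical values, for illustration only.
spend = [10.0, 20.0, 30.0, 40.0]
visits = [1, 2, 3, 4]
channel = ["web", "store", "web", "web"]

print(summarize_numeric(spend))
print(frequency_table(channel))
print(pearson(spend, visits))
```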
Description:
The modelling roadmap will specify the modelling dataset, the models to be trained, and the models' validation and evaluation.
The modelling roadmap serves as an agreement on how the modelling will be developed. It should detail and justify all technical aspects related to modelling, such as:
To say that a machine learning model is good or bad, we need to compare it with the existing practice. The baseline model represents the “business-as-usual” of the task. If no machine learning model were implemented, how would the task be performed? With what performance? It is important to establish a baseline in order to understand how much your model would improve the performance of the task.
A baseline model adapted to the problem at hand should be defined, justified, and implemented, and then evaluated using the metric selected for model performance.
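As one possible illustration, the sketch below implements a majority-class baseline and evaluates it with accuracy. It assumes a classification task and accuracy as the selected metric, neither of which is prescribed here. The class name, function name, and labels are hypothetical. In a real project the baseline should mirror how the task is currently performed, which may be a business rule rather than a constant prediction.

```python
from collections import Counter


class MajorityClassBaseline:
    """Always predicts the most frequent label seen during training."""

    def fit(self, labels):
        self.prediction_ = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, n_rows):
        return [self.prediction_] * n_rows


def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels, for illustration only.
y_train = ["stay", "stay", "churn", "stay"]
y_test = ["stay", "churn", "stay", "stay", "churn"]

baseline = MajorityClassBaseline().fit(y_train)
baseline_accuracy = accuracy(y_test, baseline.predict(len(y_test)))
print(f"baseline accuracy: {baseline_accuracy:.2f}")
```

Any trained model can then be scored with the same `accuracy` function on the same test set, so that the improvement over the baseline is directly comparable.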
Description:
The pipeline consists of a sequence of scripts that perform all the tasks needed for modelling and evaluation of model performance.
A code repository that is well-organized, clean (e.g., all branches merged), and well-documented (i.e., it has a README with instructions to run the code, and its functions are described) should be delivered. The repository should have no data or passwords committed to it.
The pipeline should allow the user to automatically train one or more models, starting from the raw data, and then to visualize the results of the trained model(s). The more automated and flexible the pipeline is, the better: for example, it could allow the user to select and run multiple algorithms with different hyperparameters and compare their results, or to pick different train and test sets or different sets of features.
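The flexibility described above can be sketched with a small registry of algorithms and a list of experiments, each pairing an algorithm with its hyperparameters. This is a minimal standard-library sketch, not a prescribed design: the toy "models", the `ALGORITHMS` registry, the `run_pipeline` function, the mean-absolute-error metric, and the data are all assumptions made for illustration. A real pipeline would start from the raw data files and use actual learning algorithms.

```python
from statistics import mean, median


def train_constant(y_train, statistic="mean"):
    """Toy 'model': predicts one constant, chosen by the `statistic` hyperparameter."""
    value = mean(y_train) if statistic == "mean" else median(y_train)
    return lambda n_rows: [value] * n_rows


def train_recent_mean(y_train, window=3):
    """Toy 'model': predicts the mean of the last `window` training targets."""
    value = mean(y_train[-window:])
    return lambda n_rows: [value] * n_rows


# Registry mapping an algorithm name to its training function.
ALGORITHMS = {"constant": train_constant, "recent_mean": train_recent_mean}


def mean_absolute_error(y_true, y_pred):
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def run_pipeline(targets, experiments, test_size=2):
    """Split the data, train every configured model, and return sorted scores."""
    y_train, y_test = targets[:-test_size], targets[-test_size:]
    results = []
    for algorithm, params in experiments:
        model = ALGORITHMS[algorithm](y_train, **params)
        score = mean_absolute_error(y_test, model(len(y_test)))
        results.append({"algorithm": algorithm, "params": params, "mae": score})
    # Lowest error first, so the best configuration is easy to spot.
    return sorted(results, key=lambda r: r["mae"])


# Hypothetical targets and experiment grid, for illustration only.
targets = [10, 12, 11, 15, 16, 18, 17]
experiments = [
    ("constant", {"statistic": "mean"}),
    ("constant", {"statistic": "median"}),
    ("recent_mean", {"window": 2}),
]
results = run_pipeline(targets, experiments)
for result in results:
    print(result)
```

Keeping the experiment list separate from the training code means a new algorithm or hyperparameter setting can be added without touching the rest of the pipeline. The `test_size` argument plays the same role for the train/test split.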
Your project’s pipeline should include scripts that perform the following tasks:
Output:
GitHub repository
Please assess the work of your colleagues by using the following criteria. Your feedback will be considered in assigning the individual grades. Please try to be as honest and fair as possible in your assessment.