Analytics Data Model

From Datonis

Revision as of 13:12, 21 May 2020

Data Model

The Data Model is divided into 3 sections:

  1. Productivity Data Model
  2. CBM Data Model
  3. Quality Data Model

Productivity Data Model

MINT productivity data, that is, data about performance, is available across two time dimensions: data rolled up at the shift level for shift-level productivity analysis, and at the hour level for more granular hourly trends.

Productivity data at a shift level

Here is a diagram that represents the data model. 

Workcenter Shift Tables.png

The main tables are:

Workcentershiftfact: <Please work with Amit on the naming convention>

All productivity data is centered around a workcenter, which is the machine performing the task. This table contains the primary shift-level productivity parameters.

<Link to detailed documentation where you should describe every column and the information it contains for all tables. Use this as an opportunity to rename columns that sound confusing>

Workcentershiftdowntimereasonsfact: Downtimes represent durations when the machine was non-operational. This table contains information about the downtimes for a workcenter during a shift, and the reasons for them where available.

And so on.

This is followed by examples of how these tables can be used to build actual reports.

Here is an example of how you can use the data model to analyse workcenter performance across shifts.

<Show an actual working example>
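Until the final example is written, here is a minimal sketch of a shift-comparison query. All table and column names (`workcentershiftfact`, `shift_name`, `availability`, `performance`) are illustrative assumptions, not the final schema; the sample runs against an in-memory SQLite database standing in for the warehouse.

```python
import sqlite3

# Hypothetical miniature of the Workcentershiftfact table; real column
# names must follow the agreed naming convention.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE workcentershiftfact (
        workcenter_name TEXT,
        shift_name      TEXT,
        working_day     TEXT,
        availability    REAL,   -- fraction of scheduled time the machine ran
        performance     REAL    -- speed relative to ideal cycle time
    )
""")
conn.executemany(
    "INSERT INTO workcentershiftfact VALUES (?, ?, ?, ?, ?)",
    [
        ("CNC-01", "Shift A", "2020-05-18", 0.90, 0.80),
        ("CNC-01", "Shift B", "2020-05-18", 0.70, 0.75),
        ("CNC-01", "Shift A", "2020-05-19", 0.86, 0.82),
        ("CNC-01", "Shift B", "2020-05-19", 0.74, 0.77),
    ],
)

# Roll productivity up per shift to compare shifts against each other.
rows = conn.execute("""
    SELECT shift_name,
           ROUND(AVG(availability), 2) AS avg_availability,
           ROUND(AVG(performance), 2)  AS avg_performance
    FROM workcentershiftfact
    GROUP BY shift_name
    ORDER BY shift_name
""").fetchall()

for shift, avail, perf in rows:
    print(f"{shift}: availability={avail}, performance={perf}")
```

The same shape of query, with a different groupby, covers cell-level or workcenter-level comparisons.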

Here is an example of how you can use the data model to analyse workcenter performance for a day.

<Show an actual working example>
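As a placeholder, a sketch of the single-day analysis: filter the shift-level rows to one working day and aggregate per workcenter. The row layout and column names here are hypothetical, not the final Workcentershiftfact schema.

```python
# Hypothetical shift-level rows pulled from Workcentershiftfact.
shift_rows = [
    # (workcenter, shift, working_day, produced_qty, planned_qty)
    ("CNC-01", "Shift A", "2020-05-18", 450, 500),
    ("CNC-01", "Shift B", "2020-05-18", 430, 500),
    ("CNC-02", "Shift A", "2020-05-18", 380, 400),
    ("CNC-01", "Shift A", "2020-05-19", 470, 500),  # different day, excluded
]

day = "2020-05-18"
totals = {}
for wc, shift, wd, produced, planned in shift_rows:
    if wd != day:
        continue  # restrict the report to the chosen working day
    prod_sum, plan_sum = totals.get(wc, (0, 0))
    totals[wc] = (prod_sum + produced, plan_sum + planned)

# Attainment = produced / planned across all shifts of the day.
attainment = {wc: round(p / q, 2) for wc, (p, q) in totals.items()}
print(attainment)
```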

Productivity data at hour level

1) Slot_data: This contains the production timeline data, which includes the uptime, downtime, part bookings, downtime reason bookings, etc., which are used to compute operational dashboards. This can be used to create slot-level reports, e.g. performance per Part per Slot or Part-changeover analysis.
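A minimal sketch of the Part-changeover analysis mentioned above, over hypothetical Slot_data rows (the tuple layout and part names are illustrative assumptions):

```python
# Each slot records which part was being produced on a machine.
slots = [
    # (machine, slot_start, part)
    ("CNC-01", "08:00", "PART-A"),
    ("CNC-01", "09:00", "PART-A"),
    ("CNC-01", "10:00", "PART-B"),  # changeover A -> B
    ("CNC-01", "11:00", "PART-B"),
    ("CNC-01", "12:00", "PART-A"),  # changeover B -> A
]

# A changeover occurs whenever consecutive slots book different parts.
changeovers = [
    (prev[1], prev[2], cur[2])
    for prev, cur in zip(slots, slots[1:])
    if prev[2] != cur[2]
]
print(changeovers)
```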

2) Faas_data: This contains data rolled up at a coarser granularity, i.e. Shift or Day level and Part and Operation level. This can be considered analytics data, and most of the multi-day reports in the current MINT application run on this source.
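The roll-up described above can be sketched as follows; the slot-level quantities and key structure are hypothetical, the point is only the grouping from slots up to shift level.

```python
# Hypothetical slot-level quantities rolled up to (machine, shift),
# as the Faas_data source does for multi-day reports.
slot_rows = [
    # (machine, shift, qty)
    ("CNC-01", "Shift A", 40),
    ("CNC-01", "Shift A", 55),
    ("CNC-01", "Shift B", 35),
]

rollup = {}
for machine, shift, qty in slot_rows:
    rollup[(machine, shift)] = rollup.get((machine, shift), 0) + qty
print(rollup)
```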

3) CBM/Quality/Production data: This contains hourly statistical data like Cp/Cpk, min, avg, max, etc., which are essential for condition-based monitoring of the selected parameters. This can be used to create hour-level reports like First-hour output, Points-out-of-limits-per-hour trend, Hourly Part Energy, etc.
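A sketch of the hourly min/avg/max roll-up described above, using hypothetical sensor readings (a made-up spindle-temperature parameter) grouped by hour:

```python
import statistics

# Raw readings of a monitored parameter, keyed by hour (illustrative).
readings_by_hour = {
    "08:00": [61.0, 63.5, 62.5],
    "09:00": [64.0, 66.0, 71.0],
}

# Roll each hour's readings into the min/avg/max stored per hour.
hourly_stats = {
    hour: {
        "min": min(values),
        "avg": round(statistics.mean(values), 2),
        "max": max(values),
    }
    for hour, values in readings_by_hour.items()
}
print(hourly_stats)
```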

4) Traced Object: This data model of a traced object is completely use-case-specific, and it only has a few common fields like machine_key, part_key, workorder_key, from, to, etc. There are currently no roll-ups happening on this data. This is a very custom data model and is currently out of the scope of the star schema model proposed below, as we attempt to define a generic data model that is applicable to most of the MINT user data.

Star Schema of the MINT Analytics Data Model

Faas Data

This is further divided into 2 models:

WorkcenterShiftFact

This model is used to create most of the Productivity-related KPIs and Reports in existing MINT. The main dimensions (groupby) of this model are Workcenter, Shift and Working day (Calendar Date).

Note - Any categorical column can be used as a dimension.

The proposed star schema has 6 main fact tables:

● WorkcenterShiftFact: Captures all the productivity parameters that can be used to calculate OEE, availability, performance, energy consumption, etc.

● WorkcenterShiftDownTimeReasonsFact: Captures details of all the downtime incidents occurring on a workcenter on a given day.

● WorkcenterShiftRejectionsFact: Captures details of all the rejections on a workcenter.

● WorkcenterShiftOperatorsFact: Captures operators who have worked on a workcenter on a given day.

● WorkcenterShiftProductivityDownTimeReasonsFact: Captures details of all the downtime incidents occurring on a workcenter when it is running on a given day.

● WorkcenterShiftProductionParametersFact: Captures details of all the production parameters recorded on a workcenter while it is running on a given day.
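To illustrate how two of these fact tables might be combined (a sketch only; the column names are assumptions pending the final schema), the following joins a day's output from WorkcenterShiftFact with its downtime reasons, using an in-memory SQLite database:

```python
import sqlite3

# Hypothetical minimal versions of two proposed fact tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE WorkcenterShiftFact (
        workcenter TEXT, working_day TEXT, production_quantity INTEGER);
    CREATE TABLE WorkcenterShiftDownTimeReasonsFact (
        workcenter TEXT, working_day TEXT,
        reason TEXT, downtime_minutes INTEGER);
    INSERT INTO WorkcenterShiftFact VALUES
        ('CNC-01', '2020-05-18', 480);
    INSERT INTO WorkcenterShiftDownTimeReasonsFact VALUES
        ('CNC-01', '2020-05-18', 'Tool change', 25),
        ('CNC-01', '2020-05-18', 'No material', 40);
""")

# Relate a day's production quantity to its total booked downtime.
row = conn.execute("""
    SELECT f.workcenter, f.production_quantity,
           SUM(d.downtime_minutes) AS total_downtime
    FROM WorkcenterShiftFact f
    JOIN WorkcenterShiftDownTimeReasonsFact d
      ON d.workcenter = f.workcenter AND d.working_day = f.working_day
    GROUP BY f.workcenter, f.working_day
""").fetchone()
print(row)
```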

This is the most insightful data model for overall productivity metrics and will let us create a lot of reports on a daily/weekly/monthly basis. This will also allow reports like QOQ comparison of cells/workcenters/shifts on productivity, quality and performance parameters.

Note - It is possible for the user to select any categorical column mentioned in the table as a groupby to the chart being created (e.g. machine_name, shift_name, category_code, etc).

Along with the groupby, the user will have to choose a numeric column from the table to view the details. This column will have an aggregation associated with it (e.g. if you select production_quantity; options like MIN, MAX, AVG, COUNT, etc will be shown) to be selected by the user.
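The groupby-plus-aggregation selection described above can be sketched as a small dispatch table; the column names (`shift_name`, `production_quantity`) are illustrative, matching the examples in the text rather than a fixed schema.

```python
# Sample rows standing in for a fact table.
rows = [
    {"shift_name": "Shift A", "production_quantity": 120},
    {"shift_name": "Shift A", "production_quantity": 100},
    {"shift_name": "Shift B", "production_quantity": 90},
]

# The aggregations offered to the user for the chosen numeric column.
aggregations = {
    "MIN": min, "MAX": max,
    "AVG": lambda v: sum(v) / len(v),
    "COUNT": len,
}

def aggregate(rows, groupby, column, agg):
    """Group rows by a categorical column, then aggregate a numeric one."""
    groups = {}
    for r in rows:
        groups.setdefault(r[groupby], []).append(r[column])
    return {k: aggregations[agg](v) for k, v in groups.items()}

print(aggregate(rows, "shift_name", "production_quantity", "MAX"))
```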

Additionally, the user can create custom metrics and/or KPIs to add business meaning to the charts (e.g. If the user wants to see specific energy consumption, he can create a metric by taking the ratio of electricity and production_quantity).
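The specific-energy metric from the example above amounts to a simple ratio of two columns; the field names here are illustrative stand-ins.

```python
# One fact-table row with hypothetical column names.
row = {"machine_name": "CNC-01",
       "electricity_kwh": 300.0,
       "production_quantity": 600}

# Custom metric: energy consumed per part produced.
specific_energy = row["electricity_kwh"] / row["production_quantity"]
print(f"{row['machine_name']}: {specific_energy} kWh per part")
```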

WorkcenterPartRoutingShiftFact

While the WorkcenterShiftFact model above only provides roll-ups to the machine level, this model adds 2 more dimensions: Part and Routing.

Part can be compared to an SKU, Batch, Product (depending on the nomenclature each industry uses).

Routing is a set of operations that a Part goes through in its lifecycle from Raw Material to Finished Good (e.g. In a bottling plant, each Part (bottle in this case) goes through Cleaning -> Filling -> Capping -> Sticker Printing -> Packing. These 5 are operations and the complete flow is called routing).
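The bottling-plant routing from the example can be pictured as an ordered list of operations that a part flows through; the helper function below is purely illustrative.

```python
# The routing from the text: the ordered operations a bottle goes through.
routing = ["Cleaning", "Filling", "Capping", "Sticker Printing", "Packing"]

def next_operation(routing, current):
    """Return the operation that follows `current`, or None at the end."""
    i = routing.index(current)
    return routing[i + 1] if i + 1 < len(routing) else None

print(next_operation(routing, "Capping"))
```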

Workcenter Part Routing Shift Tables.png

The fact tables in the star schema for this model are:

● WorkcenterPartRoutingDailyFact: Contains all the parameters (production, rejection, energy) for a part and routing for a day/shift on the machine on which it was produced.

● WorkcenterPartRoutingDownTimeReasonsDailyFact: Contains all the downtime reason details for each row in the part routing model.

● WorkcenterPartRoutingRejectionsDailyFact: Contains all the rejection details for each row in the part routing model.

This data model will allow for part-level analytics reports w.r.t. the machines on which the parts are produced, including performance monitoring, specific energy consumption, Part rejections, etc.
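As a hedged sketch of one such report, the following computes a part-level rejection percentage by joining the proposed WorkcenterPartRoutingDailyFact and WorkcenterPartRoutingRejectionsDailyFact tables (columns are hypothetical) in an in-memory SQLite database:

```python
import sqlite3

# Hypothetical minimal versions of the part-routing fact tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE WorkcenterPartRoutingDailyFact (
        workcenter TEXT, part TEXT, working_day TEXT, produced INTEGER);
    CREATE TABLE WorkcenterPartRoutingRejectionsDailyFact (
        workcenter TEXT, part TEXT, working_day TEXT,
        reason TEXT, rejected INTEGER);
    INSERT INTO WorkcenterPartRoutingDailyFact VALUES
        ('CNC-01', 'PART-A', '2020-05-18', 500);
    INSERT INTO WorkcenterPartRoutingRejectionsDailyFact VALUES
        ('CNC-01', 'PART-A', '2020-05-18', 'Dimension out of spec', 15),
        ('CNC-01', 'PART-A', '2020-05-18', 'Surface defect', 5);
""")

# Rejection percentage per part per day on the producing machine.
row = conn.execute("""
    SELECT f.part, f.produced, SUM(r.rejected) AS rejected,
           ROUND(100.0 * SUM(r.rejected) / f.produced, 1) AS rejection_pct
    FROM WorkcenterPartRoutingDailyFact f
    JOIN WorkcenterPartRoutingRejectionsDailyFact r
      ON r.workcenter = f.workcenter AND r.part = f.part
     AND r.working_day = f.working_day
    GROUP BY f.part, f.working_day
""").fetchone()
print(row)
```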