Data Requirements and Collection

The stage of the data science methodology that covers identifying, sourcing, understanding, and preparing the data required for further analysis.

Data Requirements

  • Exclude data that may skew the results from the sample dataset
  • Identify the correct data content, formats, and sources required to support the selected analytical approach
  • Initial data collection steps may lead to revising the requirements, depending on the availability of data
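
As a minimal sketch of how requirements could be written down and checked against a first sample, the snippet below counts how many records fail each requirement. The field names, types, and the churn-study framing are hypothetical and chosen only for illustration; they are not part of the methodology itself.

```python
# Hypothetical requirements for an illustrative customer-churn study:
# each required field is mapped to the Python type it is expected to hold.
REQUIRED_FIELDS = {
    "customer_id": str,
    "signup_date": str,
    "monthly_spend": float,
}


def check_requirements(records, required=REQUIRED_FIELDS):
    """Count, per required field, the records that lack it or hold the wrong type."""
    problems = {name: 0 for name in required}
    for record in records:
        for name, expected_type in required.items():
            # A missing key yields None, which never matches the expected type.
            if not isinstance(record.get(name), expected_type):
                problems[name] += 1
    return problems


sample = [
    {"customer_id": "c1", "signup_date": "2021-01-04", "monthly_spend": 19.9},
    {"customer_id": "c2", "signup_date": "2021-02-11"},  # spend not available
]
report = check_requirements(sample)
print(report)
```

A non-zero count for a field is a signal either to look for another source or to revise the requirement.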

Data Collection

  • Gather available data related to the case under study
  • Data that is not yet available can be deferred and incorporated later
  • Prepare data systematically and meticulously to ensure the right quantity and quality
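
One way to picture the "gather what is available, defer the rest" step is a small collection loop that records which sources could not be loaded yet. This is only a sketch: the `collect` helper, the loader functions, and the use of `LookupError` to signal an unavailable source are all assumptions made for the example.

```python
def collect(sources):
    """Load every available source and list the ones to retry later.

    `sources` maps a source name to a zero-argument loader function.
    A loader signals "not available yet" by raising LookupError
    (a convention assumed for this sketch).
    """
    collected, deferred = {}, []
    for name, load in sources.items():
        try:
            collected[name] = load()
        except LookupError:
            deferred.append(name)
    return collected, deferred


def load_orders():
    # Hypothetical source that is available now.
    return [{"order_id": 1, "amount": 25.0}]


def load_surveys():
    # Hypothetical source that is not ready yet.
    raise LookupError("survey export not ready")


collected, deferred = collect({"orders": load_orders, "surveys": load_surveys})
print(sorted(collected), deferred)
```

Keeping the deferred names in a list makes it easy to rerun `collect` on just those sources once they become available.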

Data Understanding

  • Use descriptive statistics and visualization tools to assess whether the data is fit for the analysis
  • Assess data quality issues such as missing data and other anomalies
  • Eliminate redundant data and prepare for the next stage of analysis
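
The checks above can be sketched with the Python standard library alone. The rows, column names, and helper functions below are made up for illustration; a real project would more likely use a dataframe library, which is not assumed here.

```python
import statistics

# Illustrative rows; None marks a missing value.
rows = [
    {"id": 1, "age": 34, "city": "Lyon"},
    {"id": 2, "age": None, "city": "Oslo"},
    {"id": 1, "age": 34, "city": "Lyon"},  # exact duplicate of the first row
    {"id": 3, "age": 51, "city": None},
]


def missing_counts(rows):
    """Count missing (None) values per column."""
    counts = {}
    for row in rows:
        for key, value in row.items():
            counts[key] = counts.get(key, 0) + (1 if value is None else 0)
    return counts


def drop_duplicates(rows):
    """Keep the first occurrence of each fully identical row."""
    seen, unique = set(), []
    for row in rows:
        # Sort by column name so key order does not affect the comparison.
        fingerprint = tuple(sorted(row.items(), key=lambda item: item[0]))
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(row)
    return unique


missing = missing_counts(rows)
unique_rows = drop_duplicates(rows)
ages = [row["age"] for row in unique_rows if row["age"] is not None]
mean_age = statistics.mean(ages)
print(missing, len(unique_rows), mean_age)
```

Deduplicating before computing statistics keeps repeated rows from weighting the summary.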