Posts

Showing posts from September, 2019

Step 2 in Data Science - Prepare

We divide data preparation into two steps based on the nature of the activity. The first step involves literally looking at the data to understand its nature, what it means, its quality, and its format. It often takes a preliminary analysis of the data, or of samples of the data, to understand this, which is why this first step is exploratory. Once we know more about the data through exploratory analysis, the next step is pre-processing the data for analysis. Pre-processing includes cleaning the data, subsetting or filtering it, and creating data that programs can read and understand, either by modeling the raw data into a more defined data model or by packaging it in a specific data format. If multiple datasets are involved, this step also includes integrating data from different sources or streams.
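As a rough illustration of the two steps, here is a minimal sketch using only the Python standard library. The sample data, the column names (station, date, temp_c, region), and the helper functions load, explore, and preprocess are all hypothetical, made up for this example rather than taken from any particular dataset or tool.

```python
import csv
import io

# Hypothetical raw data: readings with one duplicate row and one missing value.
RAW_READINGS = """station,date,temp_c
S1,2019-09-01,21.5
S1,2019-09-01,21.5
S2,2019-09-01,
"""

# Hypothetical second source, used to show integration.
RAW_STATIONS = """station,region
S1,north
S2,south
"""


def load(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def explore(rows):
    """Explore: report row count, column names, and missing values per column."""
    columns = list(rows[0].keys()) if rows else []
    missing = {c: sum(1 for r in rows if r[c] == "") for c in columns}
    return {"rows": len(rows), "columns": columns, "missing": missing}


def preprocess(rows, lookup):
    """Pre-process: drop duplicates and rows without a temperature,
    convert the temperature to a float, and attach the region from the
    second source."""
    region_by_station = {r["station"]: r["region"] for r in lookup}
    seen = set()
    cleaned = []
    for r in rows:
        key = (r["station"], r["date"], r["temp_c"])
        if r["temp_c"] == "" or key in seen:
            continue
        seen.add(key)
        cleaned.append({
            "station": r["station"],
            "date": r["date"],
            "temp_c": float(r["temp_c"]),
            "region": region_by_station.get(r["station"]),
        })
    return cleaned


readings = load(RAW_READINGS)
stations = load(RAW_STATIONS)
summary = explore(readings)
prepared = preprocess(readings, stations)
print(summary)
print(prepared)
```

The exploration pass only describes the data; nothing is changed until the pre-processing pass, which is where cleaning, type conversion, and integration happen.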

Step 1 in Data Science - Acquire the Data

Acquire includes anything that helps us retrieve data: finding, accessing, acquiring, and moving it. It includes identifying and obtaining authenticated access to all related data, transporting the data from different sources, and finding ways to subset and match the data to regions or times of interest. Sometimes we refer to this kind of subsetting by region or time as a geospatial query.
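To make the idea of subsetting by region and time concrete, here is a minimal sketch in plain Python. The records, the field names (lat, lon, day), and the subset function are hypothetical; a real project would more likely send a query like this to a database or data service than filter a list in memory.

```python
from datetime import date

# Hypothetical records, each with a location and an observation date.
RECORDS = [
    {"id": "a", "lat": 32.7, "lon": -117.2, "day": date(2019, 9, 1)},
    {"id": "b", "lat": 40.7, "lon": -74.0, "day": date(2019, 9, 2)},
    {"id": "c", "lat": 33.0, "lon": -117.0, "day": date(2019, 10, 5)},
]


def subset(records, south, north, west, east, start, end):
    """Keep records inside a latitude/longitude bounding box and a date range."""
    return [
        r for r in records
        if south <= r["lat"] <= north
        and west <= r["lon"] <= east
        and start <= r["day"] <= end
    ]


# Region of interest: a small box; time of interest: September 2019.
selected = subset(
    RECORDS,
    south=32.0, north=34.0, west=-118.0, east=-116.0,
    start=date(2019, 9, 1), end=date(2019, 9, 30),
)
print([r["id"] for r in selected])
```

Record c falls inside the bounding box but outside the date range, so only record a is kept.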