Data modeling and analysis is the process of analyzing data in support of business and research decisions. It consists of examining data that has been collected and determining an appropriate model for it, using classical or modern statistical and data-analytic techniques. In practice, it involves a professional data modeler working closely with business or research stakeholders, as well as with the potential users of the result.
Data models are built during the analysis phase of a project to give readily interpretable results or to forecast future events. Data modelers often use multiple models to view the same data to ensure that all perspectives have been included and to double-check results. There are several different approaches to data modeling, including:
Classical Modeling – Modeling with response and predictor variables, like much economic data.
Categorical Data Modeling – Modeling data that falls into categories and is measured in counts, like astrological data.
Classification and Cluster Modeling – Modeling to predict the class of a future observation or to see patterns in a data set.
Classical Modeling – Modeling by regression. The purpose of a classical data model is to describe how a response variable changes when the predictor variables change, usually in a linear way. This is useful for understanding underlying processes. Classical models often assume that the errors are normally distributed.
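As a minimal sketch of the classical approach, the following Python fits a straight line to one predictor by ordinary least squares. The function name fit_simple_linear and the ad_spend and sales figures are hypothetical, chosen only for illustration. Real analyses would usually rely on a statistics package and would also check the residuals against the normality assumption.

```python
from statistics import mean


def fit_simple_linear(x, y):
    """Ordinary least squares fit of y = intercept + slope * x (one predictor)."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x, and sum of cross-products of x and y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical illustration data (not taken from any real study).
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]   # predictor variable
sales = [2.1, 3.9, 6.2, 7.8, 10.1]     # response variable

intercept, slope = fit_simple_linear(ad_spend, sales)

# Residuals are the part of the response the line does not explain;
# these are the "errors" that a classical model often assumes are normal.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(ad_spend, sales)]

print(f"intercept={intercept:.3f}, slope={slope:.3f}")
```

The slope estimates how much the response changes per unit change in the predictor, which is the "how a response variable changes" question that the classical model is meant to answer.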
Categorical Data Modeling – Modeling when all the data falls into categories. It incorporates a probability model for the observations: a categorical data model might use the binomial, multinomial, Poisson, or negative binomial distribution to model the data. Some methods of modeling for categorical variables include log-linear modeling (which models contingency tables), correspondence analysis, and logistic regression (when the predictor variables are also categorical).
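To make the contingency-table idea concrete, here is a small Python sketch of the simplest log-linear model for a two-way table, the independence model. Its fitted count for each cell is the row total times the column total divided by the grand total, and the Pearson chi-square statistic measures how far the observed counts depart from that fit. The function name independence_fit and the counts are hypothetical.

```python
def independence_fit(table):
    """Fit the independence model to a two-way table of counts.

    Returns the expected counts, the Pearson chi-square statistic,
    and its degrees of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / total for c in col_totals] for r in row_totals]

    # Pearson chi-square: sum over cells of (observed - expected)^2 / expected.
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, chi2, dof


# Hypothetical 2x2 table of counts: rows are two groups, columns are two outcomes.
counts = [[30, 10],
          [20, 40]]

expected, chi2, dof = independence_fit(counts)
print(expected, round(chi2, 3), dof)
```

A large chi-square value relative to its degrees of freedom suggests that the row and column categories are associated, in which case a richer log-linear model with an interaction term would be considered.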
Classification and Cluster Modeling – Using data to build rules that can classify later observations, or clustering the observations into groups to gain insight into which types of observations are alike. Some methods of classification and clustering are principal component analysis, correspondence analysis, k-means clustering, and the building of decision trees.
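Of the methods listed, k-means clustering is perhaps the easiest to sketch. The Python below is a bare-bones version of the usual iterate-until-stable procedure: assign each point to its nearest centroid, then move each centroid to the mean of its members. It assumes a naive initialization (the first k points become the starting centroids), which is a simplification for illustration only. The function names and the sample points are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, max_iterations=100):
    """Cluster points into k groups; returns (centroids, labels)."""
    # Assumption: naive initialization using the first k points as centroids.
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(max_iterations):
        # Assignment step: label each point with the index of its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                # Keep the old centroid if no point was assigned to it.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break  # centroids stopped moving, so the clustering is stable
        centroids = new_centroids
    return centroids, labels


# Hypothetical observations forming two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]

centroids, labels = k_means(points, k=2)
print(labels)
```

The resulting labels group together the observations that are alike, and a later observation can be classified by assigning it to the nearest centroid.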