Basic modeling course

Goals:
An introduction to using SPSS Modeler.

Techniques presented:
Introduction to data quality, data manipulation, and data preparation; an overview of graph nodes; and advanced data processing topics such as sampling, handling missing data, and file manipulation.

Prerequisites:
Basic familiarity with PCs, Windows, and the most common office productivity tools is required.
Introduction
- Introduction to Modeler
- Methodological overview and CRISP-DM
- Using the GUI and the help system
- Modeler clients and servers
- Working files: saving and organizing them in projects
Using nodes and analysis
- Reading data files from different sources (database, text, Excel, and other proprietary formats)
- Field types: automatic detection and manual setting
- Introduction to data quality:
  - Identifying and handling invalid values
  - Diagnostic tools for detecting outliers and/or other data anomalies
- Introduction to data manipulation and data preparation
  - Basic operations on records and fields
  - Sorting fields and records
  - Generating new fields
  - Using the CLEM language and the Expression Builder
  - Multiple ways of creating fields
- Overview of graph nodes and their use for data analysis and exploration, including interactive mode
  - Study of univariate distributions and bivariate relationships in the data
Advanced data processing issues
- Simultaneous use of multiple data sources
  - Aggregations
  - Data source merges
  - Data source appends
- Using SuperNodes
- Sampling
  - Basic sampling techniques
  - Sampling for more efficient execution
  - Separation of data sources based on samples
  - Data caching
  - Data partitioning
- Management of missing data
  - Setting blank values
  - Using global values for substitutions
  - Automatic checks for null and out-of-range values
  - Advice on managing nulls
- Working with dates
  - General format options and two-digit years
  - Reading and transforming data
  - Applying formulas to multiple fields
- Working with string data
  - Advanced examples of string manipulation
- Working with sequential data
  - Counts and states
  - Sequential data functions in Modeler
- File manipulation
  - Using aggregation
  - Simple and conditional file transposition
  - Creating flag fields (indicator functions) from categorical data
- Notes on the types of joins and SQL optimization
Introduction to modeling
- Overview of modeling techniques
  - Neural networks
  - Decision trees
  - Statistical prediction models
  - Linear and logistic regression
  - Clustering techniques
  - Dimensionality reduction techniques (principal component analysis)
  - Association rules
- Using a neural network
- Understanding the logic of neural networks
- Using decision trees in basic and interactive mode
- Combining and comparing different models
- Example of clustering
- Examples of using association rules