Training MOD

Basic modeling course

Goals

An introduction to using SPSS Modeler.

Techniques presented

Introduction to data quality, data manipulation, and data preparation; overview of graph nodes; advanced data processing topics such as sampling, handling missing data, and file manipulation.

Prerequisites

Basic familiarity with PCs, Windows, and the most common office productivity tools is required.

 

Introduction

  • Introduction to Modeler

  • Methodology overview and CRISP-DM

  • Using the GUI and help

  • Modeler Clients and Servers

  • Working with files: saving and organizing work in projects

 

Using nodes and analysis

  • Reading data files from different sources (databases, text, Excel, and other proprietary formats)

  • Field types: automatic detection and manual setting of the type

  • Introduction to data quality:

    • Identifying and handling invalid values

    • Diagnostic tools to recognize outliers and/or other data anomalies

  • Introduction to data manipulation and data preparation

    • Basic operations on records and fields

    • Sorting fields and records

    • Generation of new fields

    • Using the CLEM language and the Expression Builder

    • Multiple ways of creating fields

  • Overview of graph nodes and their use for data analysis and exploration, including interactive mode

  • Study of univariate distributions and bivariate relationships between data

 

Advanced data processing issues

  • Simultaneous use of multiple data sources

    • Aggregations

    • Data source merges

    • Appends

  • Using SuperNodes

  • Sampling

    • Basic sampling techniques

    • Sampling for more efficient execution

    • Separation of data sources based on samples

    • Data caching

    • Data partitioning

  • Management of missing data

    • Setting blank values

    • Using global values for substitutions

    • Automatic checks for null and out-of-range values

    • Advice on managing nulls

  • Working with dates

    • General format options and two-digit dates

    • Data readings and transformations

    • Applying formulas to multiple fields

  • Working with string data

    • Advanced examples of string manipulation

  • Working with sequential data

    • Counts and status

    • Sequential data functions in Modeler

  • File manipulation

    • Using aggregation

    • Pure and conditional transpositions of files

    • Creation of flag fields/indicator functions from categorical data

  • Notes on the types of joins and SQL optimization

 

Introduction to modeling

  • Overview of modeling techniques

    • Neural networks

    • Decision trees

    • Statistical prediction models

    • Linear and logistic regression

    • Clustering techniques

    • Dimensionality reduction techniques (principal component analysis)

    • Association rules

  • Using a neural network

  • Understanding the logic of neural networks

  • Use of decision trees in basic and interactive mode

  • Combining and comparing different models

  • Example of clustering

  • Examples of using association rules
