
  • acquisition modeling
    • pilot campaign
    • profiling of high-value customers
  • AIC see Akaike Information Criterion
  • Akaike Information Criterion
  • association (affinity) models
    • apriori and FP-growth algorithms
    • market basket analysis
    • rule’s confidence
    • sequence algorithms
  • Bagging (Bootstrap aggregation)
    • Decision Tree model
    • in IBM SPSS Modeler
    • in RapidMiner
  • balancing approach
    • balance factor
    • cross-selling
    • in Data Mining for Excel
    • disproportionate stratified sampling
    • in IBM SPSS Modeler
    • oversampling
    • in RapidMiner
  • Bayesian belief networks
    • CPT see conditional probability table
    • IBM SPSS Modeler
    • Microsoft Naïve Bayes
    • Naïve Bayes models
    • parent variables
    • RapidMiner Naïve Bayes
    • Tree Augmented Naïve Bayes models
  • Bayesian Information Criterion
  • Bayesian networks
  • behavioral segmentation methodology
    • business objective, definition
    • cluster modeling see cluster modeling, identification of segments
    • CRISP-DM methodology, phases
    • customer segmentation
    • data exploration and validation
    • data integration and aggregation
    • data transformations and enrichment
    • deployment of segmentation solution
      • customer scoring model
      • design and delivery of differentiated strategies
      • distribution of segmentation information
    • input set reduction
    • investigation of data sources
    • modeling process design
    • revealed segments
      • marketing research information
      • optimal cluster solution and labeling segments
      • profiling of
      • technical evaluation of clustering solution
    • selecting data
  • BIC see Bayesian Information Criterion
  • BIRCH algorithm
  • boosting
    • Adaptive Boosting (AdaBoost)
    • in IBM SPSS Modeler
    • in RapidMiner
  • Bootstrap validation method
  • Call Detail Records
  • candidate model
    • CHAID see CHAID model
    • churn model
    • ensemble model
    • Gains chart
  • candidate models, performance measures
  • CART see classification and regression trees
  • CDRs see Call Detail Records
  • C5.0/C4.5
    • information gain ratio index
      • of child nodes
      • Information or Entropy of node
      • 2-level C5.0 Decision Tree
      • of root node
      • split on profession
    • predictors, handling of
  • CHAID model
    • and C5.0
    • chi-square test see chi-square test, CHAID
    • churn scores
    • and ensemble model
    • handling of predictors
    • parameter
    • right branches
    • in rules format
    • spending frequency
    • in tree format
  • chi-square test, CHAID
    • cross-tabulation of target
    • Decision Tree algorithm
    • IBM SPSS Modeler CART
      • advanced options
      • basics options
      • parameters
      • stopping criteria
    • IBM SPSS Modeler C5.0 Decision Trees
    • IBM SPSS Modeler CHAID Decision Trees
      • advanced options
      • 2-level
      • parameters
      • stopping rules
    • independence hypothesis
    • Microsoft Decision Trees parameters
    • Pearson chi-square statistic
    • predictors for split
    • p-value or observed significance level
    • RapidMiner Decision Trees
      • parameters
      • recursive partitioning
  • classification algorithms
    • Bayesian belief networks
    • Bayesian networks
    • chi-square test, CHAID
    • data mining algorithms
    • Decision Tree models
    • Gini index, CART
    • information gain ratio index, C5.0/C4.5
    • Naïve Bayesian networks
    • support vector machines
  • classification and regression trees
    • Gini index
    • handling of predictors
  • classification modeling methodology
    • acquisition modeling
    • business understanding and process design
    • combining models
    • CRISP-DM phases
    • cross-selling modeling
    • in Data Mining for Excel
    • data preparation and enrichment
    • deep-selling modeling
    • direct marketing campaigns
    • form of supervised modeling
    • in IBM SPSS Modeler
    • likelihood of prediction
    • meta-modeling
    • model deployment
    • model evaluation
    • product campaigns, optimization
    • in RapidMiner
    • up-selling modeling
    • voluntary churn modeling
  • classification or propensity models
    • Bayesian networks
    • decision rules
    • Decision Trees
    • logistic regression
    • neural networks
    • support vector machine
  • class weighting
    • class-imbalanced modeling file
    • in IBM SPSS Modeler
    • in RapidMiner
  • clustering algorithms
  • cluster modeling, identification of segments
    • agreement level
    • in Data Mining for Excel
    • in IBM SPSS Modeler
    • profiling
      • cluster centroid
      • cluster separation
      • in Data Mining for Excel
      • effective marketing strategies, development
      • in IBM SPSS Modeler
      • in RapidMiner
      • table of cluster centers
    • in RapidMiner
    • revealed segments
      • cohesion of clusters
      • descriptive statistics and technical measures
      • in IBM SPSS Modeler
      • in RapidMiner
      • separation of clusters
  • cluster models
    • agglomerative or hierarchical
    • data preparation
      • data miners of organization
      • data reduction technique
      • explained variance
      • interpretation results
      • principal components analysis model
      • rotated component matrix
    • expectation maximization clustering
    • identifying segments
      • cluster models, comparison
      • parameter settings
      • revealed clusters, distribution
      • Silhouette measure
    • K-means
    • K-medoids
    • Kohonen network/self-organizing map
    • profiling
      • behavioral profile
      • distribution of factors for Cluster
      • profiling chart
      • profiling of clusters
      • structures
      • table of centroids
    • RapidMiner process
    • TwoStep cluster
  • conditional probability table
    • of gender input attribute
    • probabilities of output
    • of profession input attribute
    • of SMS calls input attribute
    • of voice calls input attribute
  • confusion matrix and accuracy measures
    • in Data Mining for Excel
    • error rate
    • F-measure
    • in IBM SPSS Modeler
    • misclassification or coincidence matrix
    • Performance operator
    • Precision measure
    • in RapidMiner
    • Recall measure
    • ROC curve
    • sensitivity and specificity
  • CPT see conditional probability table
  • CRISP-DM see Cross Industry Standard Process for Data Mining process model
  • CRM see customer relationship management
  • Cross Industry Standard Process for Data Mining process model
    • business understanding
    • data preparation
    • data understanding
    • deployment
    • evaluation
    • modeling
    • phases
  • Cross or n-fold validation method
    • in Data Mining for Excel
    • modeling dataset
    • n iterations
    • in RapidMiner
  • cross-selling modeling
    • browsing the model
      • C5.0 model
      • CHAID model
      • CPTs
      • FLAG_GROCERY attribute
      • Gains chart
      • gains chart of ensemble model
      • performance metrics for individual models
      • response rate
      • ROI chart
      • TAN
    • campaign list
      • Modeler deployment stream
      • scored customers, estimated fields
    • Data Mining for Excel
      • Accuracy Chart wizard
      • campaign response
      • classification algorithm and parameters
      • Classification Matrix wizard
      • Classify Wizard
      • confusion matrix
      • cumulative percentage of responders
      • Decision Tree model
      • dependency network of BDE tree
      • Gains charts for two Decision Tree models
      • model deployment
      • Split (Holdout) validation
      • validation dataset
      • validation of model performance
    • development of
    • mining approach
    • modeling procedure
      • IBM SPSS Modeler procedure
      • setting roles of attributes
      • Split (Holdout) validation
      • test and loading campaign responses
      • training
    • parameters
    • pilot campaign
    • product uptake
    • profiling of owners
  • customer relationship management
    • customer development
    • customer satisfaction
    • data mining
  • customer scoring model
    • in Data Mining for Excel
    • Decision Tree
    • deployment procedure
    • in IBM SPSS Modeler
    • in RapidMiner
  • customer segmentation
    • behavioral
    • definition
    • loyalty based
    • needs/attitudinal
    • propensity based
    • sociodemographical
    • value based
  • customers grouping, value segmentation
    • binning node
    • binning procedure
    • Data Audit node
    • high-value customers
    • investigation of characteristics
    • medium- and low-value customers
    • quantiles
    • regrouping quantiles into value segments
    • RFM segmentation
    • segmentation bands selected by retailer
    • and total purchase amount
  • data dictionary
    • card level, voluntary churn model
    • closing date and reason
    • demographical input data
    • of modeling file
    • time periods, model training phase
    • transactional input data
    • usage attributes
  • data enrichment
    • customer signature
    • data reduction algorithm
    • feature selection
    • informative KPIs
    • naming of attributes
  • data exploration
    • assessment of data quality
    • categorical attributes
    • continuous (range) attributes
    • tool of IBM SPSS Modeler
  • data integration and aggregation
  • data management procedure, churn model
    • from cards to customers
      • cards’ closing dates
      • filtering out cards
      • flagging cards
      • IBM SPSS Modeler node
      • individual card records, grouping
      • initial card usage data
      • open at end, observation period
      • total number of transactions, calculation
    • enrichment, customer data
      • balances
      • deltas, spending
      • limit ratios
      • monthly average number, transactions
      • spending amount, monthly average
      • spending frequency
      • spending recency
      • tenure, customer
      • trends, card ownership
    • modeling population and target field
      • defined
      • latency period
      • in scoring phase
      • selection
      • short-term churners
  • data mining
    • algorithms
    • CRISP-DM
    • CRM strategy
    • customer life cycle management
    • customer segmentation
    • datamart
    • direct marketing campaigns
    • market basket and sequence analysis
    • marketing reference table
    • personalized customer handling
    • required data per industry
    • supervised models
    • unsupervised models
  • Data Mining for Excel
    • balancing approach
    • churn model
      • accuracy and error rate
      • approaches
      • churners and nonchurners, cumulative distribution
      • classification algorithm
      • Classify Wizard
      • confusion matrix
      • Decision Tree model
      • mining structure, storing
      • Query wizard
      • scored customers and model derived estimates
      • Split (Holdout) validation
      • validation, performance
    • classification modeling methodology
    • cluster modeling
    • confusion matrix and accuracy measures
    • Cross or n-fold validation method
    • in cross-selling model see cross-selling modeling
    • customer scoring model
    • Gains/Response/Lift charts
    • K-means
    • Naïve Bayesian networks
      • Classify Wizard
      • Dependency network
    • receiver operating characteristic curves
    • scoring customers
    • Split (Holdout) validation method
  • data preparation procedure
    • aggregating at customer level
    • aggregating at transaction level
      • adding demographics using merge node
      • at customer level
      • invoice level
    • categorizing transactions into time zones
    • classification modeling, tasks
    • customer level, usage aspects
    • data exploration and validation
    • data integration and aggregation
    • data transformations and enrichment
    • enrich customer information
      • average basket size
      • basket diversity
      • customer tenure
      • deriving new fields
      • flags of product groups
      • frequency
      • monetary value
      • monthly average purchase amount
      • ratio of transactions
      • recency
      • relative spending
    • IBM SPSS Modeler Derive nodes
    • imbalanced outcomes
    • initial transactional data
    • investigation of data sources
    • Modeler datetime_weekday() function
    • pivoting transactional data
      • payment type
      • series of Restructure nodes
    • selecting data sources
    • validation techniques
  • data transformations
    • event outcome period
    • label or target attribute
    • optimal discretization or binning
  • data validation process
  • Decision Tree model
    • algorithms
      • attribute selection method
      • classes of target attribute
      • decision rules
      • handle predictors
      • root node and
      • user-specified terminating criteria
    • with bagging
    • Bagging (Bootstrap aggregation)
    • churn model
    • classification algorithms
    • classification or propensity models
    • cross-selling model
    • customer scoring model
    • “divide-and-conquer” procedure
    • handling of predictors
      • binary splits
      • C5.0/C4.5 and CHAID
      • Classification and Regression Trees
    • IBM SPSS Modeler C5.0
    • IBM SPSS Modeler CHAID
    • Microsoft Decision Trees parameters
    • modeling dataset
    • Random Forests
    • RapidMiner
    • supervised segmentation
    • tree pruning
    • using terminating criteria
      • prepruning or forward pruning
      • split-and-grow procedure
  • deep-selling modeling
    • pilot campaign
    • profiling of customers
    • usage increase
  • dimensionality reduction models
  • direct marketing campaigns
  • eigenvalue (or latent root) criterion
    • of components
    • percentage of variance/information
    • z-score method
  • EM see Expectation Maximization clustering
  • estimation (regression) models
    • linear or nonlinear functions
    • ordinary least squares regression
  • Expectation Maximization clustering
  • factor analysis
  • feature selection (field screening)
  • Gains/Response/Lift charts
    • binary classification problem
    • churn model
    • creation of charts
    • cumulative Lift or Index chart
    • in Data Mining for Excel
    • Evaluation Modeler node
    • Gains Chart
    • in IBM SPSS Modeler
    • Kolmogorov–Smirnov statistic
    • performance measures
    • in RapidMiner
    • Response chart
  • Gini index, CART
    • child nodes
    • distribution of target classes
    • IBM SPSS Modeler CART
    • purity improvement
    • of root node
    • splits and predictors
    • voice and SMS usage
  • IBM SPSS Modeler
    • Bagging (Bootstrap aggregation)
    • balancing approach
    • Bayesian belief networks
    • boosting
    • CART
    • C5.0 Decision Trees
    • CHAID Decision Trees
    • classification modeling methodology
    • class weighting
    • cluster modeling
    • confusion matrix and accuracy measures
    • cross-selling modeling
    • customer scoring model
    • data exploration
    • derive nodes
    • Gains/Response/Lift charts
    • K-means
    • mobile telephony
    • principal components analysis
    • profiling
    • receiver operating characteristic curves
    • revealed segments
    • scoring customers
    • Split (Holdout) validation method
    • stream (procedure), churn modeling
      • Auto-Classifier node
      • Balance node
      • derived fields/candidate predictors
      • description
      • initial and balanced distribution
      • modeling steps
      • parameters
      • Split (Holdout) validation and Partition node
      • Three Decision Tree and SVM model
      • Type node, setting
      • undersampling
    • support vector machines
    • Tree Augmented Naïve Bayesian network
    • TwoStep
  • ICA see independent component analysis
  • imbalanced outcome distribution
    • balancing
    • class weighting
  • independent component analysis
  • K-means, clustering algorithms
    • Bayesian Information Criterion
    • centroid-based partitioning technique
    • centroids of identified clusters
    • in Data Mining for Excel
    • Euclidean distance
    • IBM SPSS Modeler
    • K-medoids
    • Modeler’s
    • RapidMiner K-means and K-medoids cluster
  • K-medoids
  • Kohonen network/self-organizing map
  • Kolmogorov–Smirnov statistic
  • KS see Kolmogorov–Smirnov statistic
  • logistic regression
  • market basket analysis
  • marketing reference table
    • aggregations/group by
    • deltas
    • derive
    • filtering of records
    • flag fields
    • joins
    • ratios (proportions)
    • restructure/pivoting
    • sums/averages
  • maximum marginal hyperplane
  • meta-modeling or ensemble modeling
  • mining approach
    • cross-selling model
      • data and predictors
      • modeling population and level of data
      • target population and attribute
      • time periods and historical information
    • and data model
      • cross-selling campaign
      • pilot campaign approach
      • product possession approach
      • product uptake approach
    • voluntary churn propensity model
      • data sources and predictors, selection
      • modeling population and data level
      • target population and churn definition
      • time periods and historical information
  • mining datamart
    • marketing reference table
    • of mobile telephony operator
    • of retail banking
    • of retailers
  • MMH see maximum marginal hyperplane
  • mobile telephony
    • behavioral segmentation
    • Call Detail Records
    • clustering, IBM SPSS Modeler procedure
    • core segments
    • high-level quality services
    • modeling steps
    • organization’s mining datamart
    • segmentation fields
    • SMS and MMS messages
  • model deployment
    • churn propensities
      • defined, churn
      • ensemble model
      • propensity-based segmentation
      • scored customers, sample
      • voluntary churn model scoring procedure
      • $XF-CHURN field
    • direct marketing campaigns
      • procedure and results
      • scoring customers, marketing campaign
  • model evaluation procedure
    • accuracy measures and confusion matrices
    • gains, response, and lift charts
    • precampaign model validation
    • profit/ROI charts
    • RapidMiner modeling process
      • confidence(T) field
      • model deployment
      • ROC curve
      • Split Validation operator
    • ROC curve
    • test-control groups
  • modeling process design
    • behavioral segmentation methodology
      • determining segmentation level
      • selecting observation window
      • selecting segmentation population
      • selection of appropriate segmentation criteria
    • classification modeling methodology
      • defining modeling population
      • determining modeling (analysis) level
      • target event and population
      • time frames
  • Naïve Bayesian networks
    • Apply Model operator
    • Attribute Characteristics
    • Attribute Profiles output
    • Data Mining for Excel
    • conditional probability
    • normal distribution assumption
    • RapidMiner process
  • Naïve Bayes model
    • Create Lift Chart operator
    • with Laplace correction
    • ROC curve
  • PCA see principal components analysis
  • principal components analysis
    • clustering
    • component scores
      • IBM SPSS Modeler
      • RapidMiner
    • components to extract
      • behavioral fields
      • eigenvalue (or latent root) criterion
      • interpretability and business meaning
      • pairwise correlation coefficients
      • percentage of variance criterion
      • scree test criterion
    • data reduction
    • linear correlation between continuous measures
    • meaning of component
      • interpretation process
      • Modeler
      • in RapidMiner
      • rotation techniques
    • model
    • reduction of dimensionality
  • Random Forests
    • Decision Tree models
    • in RapidMiner
  • RapidMiner modeling process
    • Attribute operator
    • Bagging (Bootstrap aggregation)
    • balancing approach
    • boosting
    • chi-square test, CHAID
    • classification modeling methodology
    • class weighting
    • cluster modeling
    • confusion matrix
    • Cross or n-fold validation method
    • cross-selling model
    • customer scoring model
    • Decision Tree model with bagging
    • Gains/Response/Lift charts
    • K-means and K-medoids cluster
    • model evaluation procedure
    • Naïve Bayes model
    • predictors
    • principal components analysis
    • profiling
    • Random Forests
    • receiver operating characteristic curves
    • retail case study see retail case study, RapidMiner
    • revealed segments
    • scoring customers
    • Set Role settings
    • Split (Holdout) validation method
    • SVM models
    • value segmentation and RFM cells analysis
  • receiver operating characteristic curves
    • area under the curve measure
    • confusion matrix and accuracy measures
    • Gains chart
    • Gini index
    • in IBM SPSS Modeler
    • model evaluation
    • Naïve Bayes model
    • performance of model
    • Profit/ROI charts
      • customers
      • in Data Mining for Excel
      • in IBM SPSS Modeler
      • marketers
    • in RapidMiner
    • sensitivity
  • recency, frequency, and monetary analysis
    • cell segmentation procedure
      • data preparation phase
      • distribution
    • clustering model
    • components
    • cross-selling models
    • grouping (binning) of customers
    • indicators, construction of
    • monitoring consuming behaviors
    • quintiles, grouping customers
    • in retail industry
    • scatter plot
  • regression models see estimation (regression) models
  • retail case study, RapidMiner
    • cross-selling model
    • Decision Tree model with bagging
      • bagging operator
      • parameter settings
      • in tree format
    • performance of model
      • confusion matrix
      • ROC curve
    • scoring customers
      • model deployment process
      • prediction fields
    • Split (Holdout) validation
    • value segmentation and RFM cells analysis
  • RFM see recency, frequency, and monetary analysis
  • ROC see receiver operating characteristic curves
  • scoring customers, marketing campaign
    • binary classification problems
    • Create Threshold operator
    • in Data Mining for Excel
    • Gains/Profit/ROC charts and tables
    • in IBM SPSS Modeler
    • probabilistic classifiers
    • propensity segmentation
    • in RapidMiner
  • scree test criterion
  • segmentation algorithms
    • clustering algorithms
      • with K-means
      • with TwoStep
  • sequence algorithms
  • Split (Holdout) validation method
    • churn modeling
    • cross-selling modeling
    • in Data Mining for Excel
    • distribution of target attribute
    • in IBM SPSS Modeler
    • model training
    • performance metrics
    • random sampling
    • in RapidMiner
    • retail case study
  • supervised modeling
    • classification or propensity models
    • estimation (regression) models
    • feature selection (field screening)
  • support vector machines
    • linearly inseparable data
      • IBM SPSS Modeler
      • Kernel functions
      • Polynomial transformation
      • RapidMiner SVM models
    • linearly separable data
      • linear discriminant function
      • maximum marginal hyperplane
      • separating hyperplane
    • nonlinear mappings for classification
  • SVM see support vector machines
  • TAN see Tree Augmented Naïve Bayesian network
  • telecommunications, segmentation application
    • data dictionary and segmentation fields
    • data preparation procedure
    • mobile telephony
    • modeling procedure
      • identifying segments with cluster model
      • preparing data for clustering
      • profiling and understanding clusters
      • segmentation deployment
    • segmentation procedure
      • deciding level
      • dimensions
      • population, mobile telephony core segments
      • time frames and historical information analyzed
    • using RapidMiner and K-means cluster
      • Cluster Distance Performance operator
      • clustering with K-means algorithm
      • Euclidean distance
      • K-means parameter settings
      • mobile telephony segments
      • PCA algorithm
      • profile of clusters
      • variance/information, by components
  • test-control groups
    • direct marketing campaign
    • Model Holdout group
    • Random Holdout group
    • recorded response rate, cross-selling campaign
  • time frames
    • in churn model
    • customer profiles
    • event outcome period
    • latency period
    • multiple time frames
    • observation (historical) period
    • potential voluntary churners, identification of
    • validation phase
  • Tree Augmented Naïve Bayesian network
    • IBM SPSS Modeler
    • structure
    • training dataset
  • TwoStep cluster
    • Akaike Information Criterion
    • Bayesian Information Criterion
    • IBM SPSS Modeler
    • preclusters
  • unsupervised models
    • association (affinity) and sequence models
    • cluster models
    • dimensionality reduction models
    • record screening models
  • up-selling modeling
    • pilot campaign
    • product upgrade
    • profiling of premium product owners
  • validation techniques
    • Bootstrap validation
    • Cross or n-fold validation
    • Split (Holdout) validation method
  • value segmentation
    • and cross-selling in retail
      • data dictionary
      • data preparation procedure
      • exploration and marketing usage
      • grouping customers see customers grouping, value segmentation
      • mining approach
      • modeling procedure
      • predictive accuracy of classifiers
      • recency, frequency, and monetary analysis
      • retail case study using RapidMiner
      • transactional data
    • and RFM cells analysis
      • discretization of numeric fields
      • Map operator, value segments
      • MONETARY attribute
      • relevant binned attributes
      • total purchase amount
