Artificial Intelligence (AI) is a term commonly used to cover a huge range of techniques within the analytics field (or is that data science?). This article presents the original definitions of various analytical terms from the perspective of a data analyst. You might want to read the blog article on Hypothesis-Based vs. Discovery-Driven Analytics first.
AI was originally defined as a concept to complement “Natural Intelligence”, the measurable cognitive ability of organisms to infer knowledge from their experience. AI is therefore the ability of a computational system to provide feedback that appears to come from an intelligent organism, one that depends on a wide knowledge of its environment. In fact, the “Turing Test” devised by Alan Turing in the 1950s was designed to assess the ability of a computer to exhibit intelligent behaviour and therefore AI.
Given that AI is a rather specific part of data analytics and very difficult to achieve, we should probably ask what the aim of data analysis is. The common goal of analytics is the extraction of information from data and subsequently the acquisition of knowledge, via either hypothesis-based analysis or discovery methods. From this we then predict new information based on these data models and, additionally, the error and likelihood of correctness of this prediction.
I would also note the difference between a functional outcome (the perception of intelligence, for example) and a technology used to achieve it. AI and Machine Learning (ML) are both measures of functional outcome, while, for example, Bayes' method is a technique that can be used within a tool-set that attempts to achieve this.
AI: The ability to predict or behave based on learned experience, where learned experience is a wider analysis of the environment well beyond the immediate prediction. Synonymous with natural intelligence “but without the deception”, and tested by the Turing test.
ML (Machine Learning, not to be confused with ML as maximum likelihood) is the creation of predictive models from a training set of parameters; the model is then used to predict outcomes from new data, or to predict an improved data model. The tools used in ML are based on statistical/numerical techniques and require that the training set is a good sample of the model being predicted. ML can be successful where results are interpolated and is usually unsuccessful where the outcome is extrapolated from the training set; that is, the model is likely to produce a sensible result where the predicted result is not an outlier in the data. Where it is difficult to identify or extract the reason for a prediction (for example, in neural networks or Hidden Markov Models) we must be very careful to validate results, as the wrong metric may have been learnt. The classic story is of an early neural network built to identify tanks in military photo-reconnaissance images that instead learnt to identify rainy days, because all the training photos of tanks were taken on rainy days and the photos without tanks were sunny.
ML methods can take categorical or numerical data as input; in the latter case it is useful to apply interpolative models such as Bayesian statistics where the data are normally distributed (or other models suited to the type of data). Where the model parameters are a mix of categorical and numerical, we apply the appropriate model to each parameter.
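As a minimal sketch of the mixed-parameter case, a naive Bayes classifier can treat a categorical parameter with frequency counts and a numerical one with a normal density; the feature names and training data below are invented purely for illustration:

```python
import math
from collections import defaultdict

# Hypothetical training set: each sample is (colour, size, label).
# "colour" is categorical, "size" is numerical.
train = [
    ("red", 1.0, "A"), ("red", 1.2, "A"), ("blue", 0.9, "A"),
    ("blue", 5.0, "B"), ("blue", 5.5, "B"), ("red", 4.8, "B"),
]

def fit(samples):
    """Per class: categorical frequencies and normal parameters for size."""
    by_label = defaultdict(list)
    for colour, size, label in samples:
        by_label[label].append((colour, size))
    model = {}
    for label, rows in by_label.items():
        colours = [c for c, _ in rows]
        sizes = [s for _, s in rows]
        mean = sum(sizes) / len(sizes)
        var = sum((s - mean) ** 2 for s in sizes) / len(sizes)
        model[label] = {
            "prior": len(rows) / len(samples),
            # Laplace smoothing over the two known colour values
            "colour": {c: (colours.count(c) + 1) / (len(colours) + 2)
                       for c in {"red", "blue"}},
            "mean": mean,
            "var": var,
        }
    return model

def predict(model, colour, size):
    """Return the label maximising prior * P(colour|label) * N(size; mean, var)."""
    def log_post(p):
        density = (math.exp(-(size - p["mean"]) ** 2 / (2 * p["var"]))
                   / math.sqrt(2 * math.pi * p["var"]))
        return math.log(p["prior"]) + math.log(p["colour"][colour]) + math.log(density)
    return max(model, key=lambda label: log_post(model[label]))

model = fit(train)
print(predict(model, "red", 1.1))   # a small red sample
print(predict(model, "blue", 5.2))  # a large blue sample
```

Note that the prediction is an interpolation within the training data; a query far outside the observed size range would still return one of the two labels, however meaningless.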
Maximum Likelihood is a statistical optimisation technique: the aim is to estimate the parameters of a statistical model by maximising the likelihood of the observed data under it; the model may be Bernoulli, normal or Poisson, whichever best matches the data. It is often used as an unbiased objective within a non-linear optimisation method.
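A small worked example: for a normal model the maximum-likelihood estimates have a closed form (the sample mean and the biased sample variance), and we can check that the log-likelihood really does drop when a parameter is perturbed away from the estimate:

```python
import math
import random

def mle_normal(xs):
    """Closed-form ML estimates for a normal model."""
    mu = sum(xs) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / len(xs)  # note: /n, not /(n-1)
    return mu, var

def log_lik(xs, mu, var):
    """Normal log-likelihood of the data under (mu, var)."""
    return sum(-0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
               for x in xs)

random.seed(1)
data = [random.gauss(10.0, 2.0) for _ in range(10_000)]
mu, var = mle_normal(data)
print(mu, var)  # close to the true values 10 and 4

# The likelihood at the MLE beats any perturbed estimate.
assert log_lik(data, mu, var) >= log_lik(data, mu + 0.5, var)
```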
GA (Genetic Algorithms) is an optimisation method analogous to natural chromosome replication, where learning proceeds by a mutation model (with a probabilistic likelihood of each mutation type), region swapping (large-scale mix and match), and duplication and deletion of regions. It is generally used in biology for sequences but can be applied to almost anything; it is a bit of a fashion statement, since it is slow to converge.
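A toy sketch of the idea on the classic OneMax problem (maximise the number of 1s in a bit string), with truncation selection, single-point crossover as the region swap, and point mutation:

```python
import random

random.seed(0)

LENGTH, POP, GENERATIONS = 30, 40, 60

def fitness(bits):
    # OneMax: fitness is simply the number of 1s.
    return sum(bits)

def crossover(a, b):
    # Swap a region between two parents at a random cut point.
    cut = random.randrange(1, LENGTH)
    return a[:cut] + b[cut:]

def mutate(bits, rate=0.02):
    # Flip each bit with a small probability.
    return [1 - b if random.random() < rate else b for b in bits]

pop = [[random.randint(0, 1) for _ in range(LENGTH)] for _ in range(POP)]
for _ in range(GENERATIONS):
    pop.sort(key=fitness, reverse=True)
    parents = pop[: POP // 2]  # truncation selection keeps the top half
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(POP - len(parents))]
    pop = parents + children

best = max(pop, key=fitness)
print(fitness(best))  # at or near LENGTH after convergence
```

Even on this trivial fitness landscape the population takes tens of generations to converge, which illustrates the slow-convergence point above.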
Neural Networks: Neural networks are the best known of the ML methods: the creation of predictive models based on networks, with a decision-output likelihood at each node derived from that node's inputs. They are modelled on the concept of neurons in organisms, which take signals from other neurons and, based on previously learnt response conditions, stimulate connected neurons. Computationally, each neuron's “learnt” firing state is defined from a training set of input parameters and their known outcomes.
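A minimal sketch of a single such neuron, a perceptron whose weights are learnt from a training set of inputs with known outcomes; here it learns the logical AND function:

```python
# Training set: inputs with their known outcomes (logical AND).
train = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

w = [0.0, 0.0]   # connection weights, adjusted during learning
bias = 0.0
rate = 0.1       # learning rate

def fire(x):
    # The neuron fires when the weighted sum of inputs crosses the threshold.
    return 1 if w[0] * x[0] + w[1] * x[1] + bias > 0 else 0

for _ in range(20):                      # a few passes over the training set
    for x, target in train:
        error = target - fire(x)         # perceptron learning rule
        w = [wi + rate * error * xi for wi, xi in zip(w, x)]
        bias += rate * error

print([fire(x) for x, _ in train])  # [0, 0, 0, 1]
```

A real network chains many such nodes in layers, which is exactly what makes the learnt metric hard to inspect, as discussed above.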
Self Organising Maps
SOM: Self-organising maps are a sort of sub-set of neural networks, introduced by Kohonen, that learn an unsupervised, low-dimensional map of the input data.
Markov Models/Hidden Markov Models
MM and HMM (Markov and Hidden Markov Models) are network predictive models based on likelihoods attached to the links rather than conditions at the nodes. The number of nodes equals the number of variable positions in the analysis, and “hidden” refers to models in which the underlying states are not directly observable. It is quite fashionable to use them to create predictive models for alignments and for sequence fragments such as active sites or CDR regions.
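A minimal sketch of the likelihoods-on-links idea for a plain (non-hidden) Markov model, with invented two-state transition probabilities; a sequence is scored by the product of the link probabilities along its path:

```python
import math

# Transition likelihoods live on the links between states, not the nodes.
transitions = {
    "A": {"A": 0.6, "B": 0.4},
    "B": {"A": 0.3, "B": 0.7},
}
start = {"A": 0.5, "B": 0.5}

def log_prob(seq):
    # Log-probability of a path: start probability plus each link traversed.
    p = math.log(start[seq[0]])
    for prev, nxt in zip(seq, seq[1:]):
        p += math.log(transitions[prev][nxt])
    return p

# Under this model, runs of one state are more likely than alternation.
print(log_prob("AAAA"), log_prob("ABAB"))
```

An HMM adds a layer of unobservable states above this, each emitting the observed symbols with its own probabilities, which is what makes the learnt model hard to interpret.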
Hierarchy methods: A wide range of methods, such as clique analysis, often used for ontological model prediction. Although these create predictive models, they are not usually considered machine learning.
Semantic web: Not actually ML, but it comes under hierarchy-based model descriptions: usually triplets that define ontologies of mappings. The triplet relationship model is a powerful description, but it suffered significant performance issues with large data and problems defining the relationships in a rich ontology.
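A minimal sketch of the triplet idea: facts stored as (subject, predicate, object) tuples and queried with wildcards. The names here are purely illustrative:

```python
# A tiny in-memory triple store; every fact is (subject, predicate, object).
triples = [
    ("aspirin", "is_a", "drug"),
    ("aspirin", "inhibits", "COX-1"),
    ("COX-1", "is_a", "enzyme"),
]

def query(s=None, p=None, o=None):
    # None acts as a wildcard on any of the three positions.
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

print(query(s="aspirin"))           # everything known about aspirin
print(query(p="is_a", o="enzyme"))  # all known enzymes
```

The performance problem mentioned above follows directly from this shape: every non-trivial question becomes a join across many such wildcard scans.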
Simple Numerical Analysis
The creation of models as parametric equations that are an optimised fit to a set of data, such as curve fitting. These can be iterative (e.g., fitting a Gaussian) or directly solvable (e.g., linear regression). Although they create predictive models in the wider sense, they are usually not considered machine learning, but they are often part of the tool-set of a wider machine-learning technology.
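For example, ordinary least-squares fitting of a straight line is directly solvable in closed form, with no iteration required (the data points below are illustrative):

```python
# Closed-form least-squares fit of y = slope * x + intercept.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.0, 8.8]   # roughly y = 2x + 1

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
# slope = covariance(x, y) / variance(x); intercept follows from the means.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x
print(slope, intercept)
```

An iterative fit (a Gaussian peak, say) has the same structure but no closed-form solution, so the parameters must be refined numerically.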
The Human Element
An important issue, often overlooked when considering the search for information and knowledge, is that humans are the “natural intelligence”. They have the ability to see patterns in data, to take the next step, and to bring a wider experience and knowledge to the problem at hand. An expert user should be able to see outside the box, and therefore discovery tools must be designed holistically with the user of those tools. This is where good interactive and intuitive graphics come into their own, especially when integrated with the analytical tool-set, providing the option to take the user on a journey of discovery.
Dotmatics has developed the program Vortex, which provides an interactive toolset of mathematical analytical tools: SOMs, maximum likelihood, high-dimensional set theory, HMMs, Bayes theory with classifiers and Gaussian interpolation, clique analysis, PCA, K-means, and SAR. It also includes chemical scientific knowledge: sub/super-graph chemical analysis and five biological sequence-alignment algorithms. Integrated with these is a multitude of interactive graphical representations to enable knowledge discovery. Critical to this is the integration of error analysis, either via probabilistic measures or free-residual tests, to aid the user in the quest for knowledge.
It is quite clear that AI is an important part of the lexicon in science and beyond. The term is now used generically for data analysis, but I hope this short article provides some insight from the coal face of data analytics and into the subtlety of the mathematical ideas. It should also be noted that the Turing test, the final arbiter of AI, has still not really been passed in full.