Tuesday, 5 June 2018

TechforTheTechie - on AI: A brief discussion on Artificial Intelligence, Bias and Subjectivity

When we think of artificial intelligence, the image that commonly comes to mind is one of humanoid robot overlords realising that the worst thing ever to happen to mankind is mankind, and deciding that we must be enslaved to save ourselves from ourselves.
You know... Terminator / Skynet - type visions of the future.

In fact, AI and Machine Learning are more commonplace than we think and affect our lives every day.

When you get a recommendation on a pair of shoes or the next gadget to purchase on eBay or Amazon, that's not done by a person sat behind a screen poring through troves of data about you. That's AI in action.
When you get a friend recommendation on Facebook, a suggested Instagram post you might like, that next YouTube video or Netflix series, that's all AI.
Now let's get a bit more serious.
The decision taken by an HR team, based on algorithmic filtering of hundreds or thousands of applicants, as to who should be screened out at the first stage of the application process for a highly in-demand job? ...could be based on AI algorithms (and in fact, these days, increasingly is!)
The decision as to whether you qualify for that loan? ...the interest rate you get on your mortgage? ...again, potentially (and in most cases highly likely to be) influenced by AI.
So, if an artificially intelligent system is advising on serious, life-impacting decisions from which we then make our final decisions, how can we trust that the 'decisions' made by our AI are free from bias... racial? gender? wealth class? religious?
Are the decisions being made 'Ethical'? Equitable? Fair?
Can the AI give us fully explainable reasons as to how it arrives at its decisions? This is an area that gets even more complex when you talk about neural networks and deep learning applications.
Unintended bias may be even more difficult to identify.
If a machine learning algorithm, say, a supervised learning technique such as a Bayesian linear regression model, is used to predict how likely a criminal is to re-offend, how can we tell whether those predictions are free from bias? If the predictions are deemed 'accurate' or 'correct' against the dataset used to train the ML model, can we be certain that the same relationships will hold in a new dataset and still give us the 'expected' results?
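To make that concrete, here is a minimal, purely illustrative sketch (in Python with scikit-learn, on entirely synthetic data with hypothetical feature names). Both groups in the toy dataset re-offend at exactly the same rate, but one group's recorded "prior arrests" are inflated by historical over-policing; a Bayesian linear regression trained on that data quietly reproduces the disparity, even though the group label itself is never used as a feature.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge

rng = np.random.default_rng(0)
n = 5000

# Synthetic population: two groups with the SAME true re-offence rate
group = rng.integers(0, 2, n)            # 0 = group A, 1 = group B (hypothetical labels)
reoffend = rng.binomial(1, 0.3, n)       # identical base rate for both groups

# Biased measurement: group B accumulates extra *recorded* arrests regardless of behaviour
prior_arrests = reoffend * rng.poisson(2, n) + group * rng.poisson(2, n)

# Train on the proxy feature only -- the protected attribute itself is never a model input
model = BayesianRidge().fit(prior_arrests.reshape(-1, 1), reoffend)
risk = model.predict(prior_arrests.reshape(-1, 1))

print("mean predicted risk, group A:", risk[group == 0].mean())
print("mean predicted risk, group B:", risk[group == 1].mean())
# Group B scores noticeably higher, despite both groups re-offending at the same true rate.
```

The bias lives in the data, not in the algorithm: the model has done exactly what it was asked to do.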
You will need to define and redefine what counts as a "correct" outcome in this and other contexts where artificial intelligence and machine learning are applied. There is a common phrase used by us data scientists: "correlation does not necessarily imply causation". Unfortunately, in certain forms of supervised and unsupervised machine learning, algorithmic logic and neural networks will make assumptions that hold true for the datasets used to train the model, but may not always be "correct" in every real-life example. In short, the wrong 'causal' relationships between data dimensions may be assumed to be "correct".
For example, the decision as to whether a first-time offender is likely to re-offend is based on multiple data points, in some cases and jurisdictions including their "credit score"! (Bearing in mind that not everyone has a credit score, this could be very bad news for any demographic caught on the wrong side of such a law enforcement system... recent immigrants, for instance.)
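Here is another illustrative sketch of that exact failure mode (again synthetic data and hypothetical numbers; the credit-score-to-re-offence correlation is assumed purely for the sake of the example). Everyone in the training population has a credit score, and lower scores happen to correlate with re-offending; someone with no credit history at all, carelessly encoded as a score of zero, then inherits the worst possible prediction.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge

rng = np.random.default_rng(1)
n = 5000

# Training population: everyone has a credit score, and (in this synthetic world)
# lower scores merely CORRELATE with re-offending -- they do not cause it
credit_score = rng.normal(650, 80, n)
reoffend = rng.binomial(1, 1 / (1 + np.exp((credit_score - 600) / 50)))

model = BayesianRidge().fit(credit_score.reshape(-1, 1), reoffend)

typical = model.predict(np.array([[650.0]]))      # an applicant with an ordinary score
no_history = model.predict(np.array([[0.0]]))     # "no credit score" naively encoded as 0
print("predicted risk, typical applicant: ", typical[0])
print("predicted risk, no credit history:", no_history[0])
# The second prediction is far higher (it can even exceed 1), not because the person is
# riskier, but because the model extrapolates a non-causal correlation well outside the
# range of anything it was trained on.
```

The model isn't "wrong" on its training data; it is extrapolating a non-causal correlation onto people the training data never represented.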
When "machines" make decisions on our behalf (or become our 'wise' consultants), it is imperative that we understand the assumptions used to arrive at what appear to be "correct" decisions.
In many cases, the data we feed our machine learning models (unfortunately) carries some form of hidden, inherent bias. In other cases, a key factor with genuine causal impact may have been overlooked entirely and never included in the AI/ML model at the outset. (You know, "unknown unknowns".)
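One more sketch, of that omitted-variable problem (entirely synthetic, with hypothetical variable names). In this toy world, stable housing is the real causal driver of the outcome but is never collected; a merely correlated feature, neighbourhood, makes it into the model instead, and the model happily pins the effect on it.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge

rng = np.random.default_rng(2)
n = 5000

stable_housing = rng.binomial(1, 0.5, n)                        # the real causal factor (never collected)
neighbourhood = rng.binomial(1, 0.2 + 0.6 * stable_housing)     # merely correlated with it
outcome = rng.binomial(1, 0.5 - 0.3 * stable_housing)           # driven by housing, not neighbourhood

# Only the correlated feature is available to the model
model = BayesianRidge().fit(neighbourhood.reshape(-1, 1), outcome)
print("learned weight on neighbourhood:", model.coef_[0])
# The weight is clearly non-zero even though neighbourhood has no causal effect here;
# the model has silently absorbed the influence of a variable nobody thought to include.
```

Nothing in the usual training metrics is likely to flag this; the model looks perfectly well behaved until someone asks the causal question.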
I won't ramble on... I'll simply say that as AI models learn, relearn and constantly redefine the parameters of what they term "correct" outcomes and predictions, it is of utmost importance that we keep up and ensure it is all happening within the realms of what is fair and equitable to all, and, not least, within applicable data laws and regulations.