Reading the Financial Robot's Mind
Personification aside...
I've used the catchphrase "Reading the Robot's Mind" to point out the need to manage the societal impact of automated pattern recognition systems and artificial intelligence (AI). One of the larger societal impacts of such systems comes when an AI is the decision maker for financial outcomes that affect people's lives. I was glad to attend several of the informative sessions at the recent Washington DC Financial Technology Week (#dcfintechweek2020) event, and I wanted to share some of the takeaways that I gathered.

As you likely already know, machine learning (ML) must be validated before it is put into live practice, and being able to assess the robustness of the ML model is of paramount importance. Most developers use a standard "in-sample/out-of-sample" testing approach for such an assessment, often combined with "data quality validation" and "outcome monitoring against a benchmark." Nevertheless, more and more professionals are insisting on deeper validity measures, including "explainability" and "transparency," or, as I like to put it, reading the robot's mind.
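Before getting to those deeper measures, here is a minimal sketch of that standard validation workflow, with an out-of-sample check against a simple benchmark. The dataset, model, and benchmark choice below are purely illustrative assumptions, not a specific system discussed at the conference.

```python
# A minimal sketch of "in-sample/out-of-sample" validation with an outcome
# check against a simple benchmark. Data, model, and benchmark are assumptions.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(42)
X = rng.normal(size=(500, 4))              # hypothetical applicant features
y = (X[:, 0] - X[:, 2] > 0).astype(int)    # hypothetical approve/deny label

# Hold out data the model never sees during training (the out-of-sample set).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

model = LogisticRegression().fit(X_train, y_train)
benchmark = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)

print("in-sample accuracy:    ", accuracy_score(y_train, model.predict(X_train)))
print("out-of-sample accuracy:", accuracy_score(y_test, model.predict(X_test)))
print("benchmark accuracy:    ", accuracy_score(y_test, benchmark.predict(X_test)))
```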
Legislation in the United States has not ignored this issue.
Lawmakers realize their constituents can be greatly impacted by the accuracy of these systems and by intentional or unintentional discrimination, and they naturally want all possible improvements to be implemented. For example, the Equal Credit Opportunity Act requires that companies that deny loan applications be able to demonstrate, upon request, the reasons for the denial. Similarly, the Fair Credit Reporting Act requires disclosure of the key factors affecting the score given to an individual. If a software program robotically makes these loan-acceptance and credit-scoring decisions, then that robot must be transparent about its models and results, and it must be able to explain its decision-making process in such a way that it can be interpreted in a court of law (run by humans). Human jurors will also ask even tougher questions, such as "Are there less discriminatory alternatives?", to which AI developers must be able to respond meaningfully.
Besides attending conference sessions related to AI and ML, I also had the chance to speak one-on-one with attendees (which was set up virtually, and in a very streamlined way, if I may give kudos to the organizers and their software). In one such conversation, I learned about a document recently created by the National Institute of Standards and Technology (NIST) of the U.S. Commerce Department. This draft publication, "NISTIR 8312," discusses the Four Principles of Explainable Artificial Intelligence and is a fascinating read.
In a nutshell
The document outlines the four principles of explainable AI as:
- Explanation: The robot delivers the evidence or reasons it made the decision.
- Meaningful to Humans: The robot makes the explanation understandable to users.
- Accuracy: The robot's explanation is indeed how the decision was made.
- Self-Limiting: The robot only makes a decision when it can do so with sufficient confidence (a minimal sketch of this idea follows below).
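To make that fourth principle concrete, here is a minimal sketch of a self-limiting decision function that abstains whenever the model's confidence falls below a threshold. The classifier, features, and threshold are illustrative assumptions, not anything prescribed by NISTIR 8312.

```python
# Minimal sketch of the "self-limiting" principle: the robot abstains whenever
# its confidence falls below a chosen threshold. All values are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

def decide_or_abstain(model, x, threshold=0.8):
    """Return a decision only when the model is sufficiently confident."""
    proba = model.predict_proba(x.reshape(1, -1))[0]
    best = int(np.argmax(proba))
    if proba[best] < threshold:
        return None, proba[best]      # abstain: outside the robot's confident scope
    return model.classes_[best], proba[best]

# Toy training data (hypothetical credit features): income, debt ratio.
X = np.array([[80, 0.2], [30, 0.9], [60, 0.4], [25, 0.8]])
y = np.array([1, 0, 1, 0])            # 1 = approve, 0 = deny
clf = LogisticRegression().fit(X, y)

decision, confidence = decide_or_abstain(clf, np.array([50, 0.5]))
print(decision, round(confidence, 2))  # may print "None ..." for a borderline applicant
```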
Here are those four principles, but now posed as questions comparing robots to humans:
- Should robots be able to explain themselves better than humans can? The answer is yes. People, it turns out, become worse and worse at explaining themselves as they gain expertise, because their decision-making processes become more automatic and less conscious. This is great for humans competing in an evolutionary "survival of the fittest" for efficiency, speed, and scarce resources. It is not, however, great for robots.
- Should robot explanations be more meaningful than human explanations? The answer is also yes. Expert humans face a continuous challenge in explaining themselves in a meaningful way to other people. The paper uses the example of forensic scientists explaining themselves to jurors in a trial, and the large gap in understanding among these laypeople.
- Should robot explanations be more accurate than human explanations? Yes again! The paper points out that it is well documented that people report their reasoning for decisions in a way that does not reliably reflect accurate or meaningful introspection. Humans literally fabricate reasons for their decisions!
- Finally, should robots self-limit, providing decisions only within their scope of knowledge, in ways that are better than humans do? Yet again, the answer is yes. The well-known Dunning-Kruger effect is a great example of how most humans inaccurately estimate their own abilities and knowledge.
In conclusion
This is not an easy problem to solve. I believe more work is needed in this area, and I am personally working on the following solution paths:
- Focus on self-explainable models such as decision trees and nearest neighbor. Use deep learning algorithms only for relevant feature extraction, but leave the decision making itself very explainable (see the first sketch after this list).
- Create prototypes, or exemplary instances, that demonstrate what the robot "thinks of" when it is making a decision. This is tied to the item above and involves autoencoders (essentially, trained CODECs) that can not only learn important features and encode them into a smaller space, as a hashing algorithm would, but also use those features to recreate exemplary instances that are meaningful to a user (see the second sketch after this list).
- Accept being less "efficient" in the training and running (interpretation) of robot ML. Use a few extra compute cycles to create the above, and give every robot a mind-reading window we can use to control the negative societal impacts of AI.
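For the first path, here is a minimal sketch in which a small neural network serves only as a feature extractor, and the final decision is made by a shallow, human-readable decision tree. The data, network size, and feature names are illustrative assumptions.

```python
# Minimal sketch: a neural network does feature extraction, but a shallow
# decision tree makes (and explains) the final decision. Data are synthetic.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))                  # hypothetical applicant features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # hypothetical approve/deny label

# 1. Deep model used only as a feature extractor (one hidden ReLU layer here).
net = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0).fit(X, y)
hidden = np.maximum(0, X @ net.coefs_[0] + net.intercepts_[0])   # hidden activations

# 2. Shallow, readable decision maker trained on the learned features.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(hidden, y)

# 3. The "mind-reading window": a rule list a human reviewer can inspect.
print(export_text(tree, feature_names=[f"feat_{i}" for i in range(hidden.shape[1])]))
```

The printed rule list is what a reviewer (or a court) can inspect, while the neural network stays confined to representation learning.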
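For the second path, here is a minimal sketch of prototype generation with an autoencoder: the network learns a compact code, and decoding the average code for each class produces an exemplary instance the robot could show a user. Again, the data, layer sizes, and labels are illustrative assumptions.

```python
# Minimal sketch: an autoencoder learns a compact code, and decoding each
# class's average code yields a prototype instance. Data are synthetic.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 6))                  # hypothetical applicant features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # hypothetical approve/deny label

# Single-hidden-layer autoencoder: train the network to reconstruct its input.
ae = MLPRegressor(hidden_layer_sizes=(2,), max_iter=5000, random_state=1).fit(X, X)

def encode(x):
    return np.maximum(0, x @ ae.coefs_[0] + ae.intercepts_[0])   # ReLU encoder half

def decode(code):
    return code @ ae.coefs_[1] + ae.intercepts_[1]               # linear decoder half

for label in (0, 1):
    centroid = encode(X[y == label]).mean(axis=0)   # average code for this class
    prototype = decode(centroid)                    # decoded exemplary instance
    print(f"class {label} prototype:", np.round(prototype, 2))
```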
P.S. Image courtesy of Wikimedia.org and Gufosowa